From 994672f5ddb5786dd18ff5740d95843429106286 Mon Sep 17 00:00:00 2001 From: Pavel Vergeev Date: Thu, 13 Jan 2022 03:53:48 +0300 Subject: [PATCH 001/531] fix 404 for sorting link (#41) * fix 404 for sorting link * replace relative link with an absolute one is more explicit and less bug-prone * revert d7e6203 reverts "replace relative link with an absolute one" * remove the automatic "published: true" clause --- content/russian/cs/complexity/asymptotic.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/russian/cs/complexity/asymptotic.md b/content/russian/cs/complexity/asymptotic.md index c8c27c73..f0a43d16 100644 --- a/content/russian/cs/complexity/asymptotic.md +++ b/content/russian/cs/complexity/asymptotic.md @@ -18,7 +18,7 @@ weight: 2 При этом важно не просто считать строчки, а ещё учитывать, как реализованы некоторые отдельные вещи в самом языке. Например, в питоне срезы массива (`array[3:10]`) копируют этот массив, то есть этот срез работает за 7 элементарных действий. А `swap`, например, можно реализовать за 3 присваивания. -**Упражнение.** Попробуйте посчитать точное число *сравнений* и *присваиваний* в [сортировках](../sorting) пузырьком, выбором, вставками и подсчетом в худшем случае. Это должна быть какая-то формула, зависящая от $n$ — длины массива. +**Упражнение.** Попробуйте посчитать точное число *сравнений* и *присваиваний* в [сортировках](../../sorting) пузырьком, выбором, вставками и подсчетом в худшем случае. Это должна быть какая-то формула, зависящая от $n$ — длины массива. Чтобы учесть вообще все элементарные операции, ещё надо посчитать, например, сколько раз прибавилась единичка внутри цикла `for`. А ещё, например, строчка `n = len(array)` — это тоже действие. Поэтому даже посчитав их, не сразу очевидно, какой из этих алгоритмов работает быстрее — сравнивать формулы сложно. 
Хочется придумать способ упростить эти формулы так, чтобы From c9c2ec9f364d08db4e5ce1746aebf91a3ed32be1 Mon Sep 17 00:00:00 2001 From: Pavel Vergeev Date: Thu, 13 Jan 2022 18:39:27 +0300 Subject: [PATCH 002/531] fix a typo in big o notation definition --- content/russian/cs/complexity/asymptotic.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/russian/cs/complexity/asymptotic.md b/content/russian/cs/complexity/asymptotic.md index f0a43d16..b46e27ee 100644 --- a/content/russian/cs/complexity/asymptotic.md +++ b/content/russian/cs/complexity/asymptotic.md @@ -1,6 +1,7 @@ --- title: Асимптотический анализ weight: 2 +published: true --- Часто бывает полезно оценить, сколько времени работает алгоритм. Конечно, можно его просто реализовать и запустить, но тут возникают проблемы: @@ -28,7 +29,7 @@ weight: 2 Для этого придумали О-нотацию — асимптотическое время работы вместо точного (часто его ещё называют просто *асимптотикой*). -**Определение.** Пусть $f(n)$ — это какая-то функция. Говорят, что функция $g(n) = O(f(n))$, если существует такие константы $c$ и $n_0$, что $g(n) < c \cdot g(n)$ для всех $n \geq n_0$. +**Определение.** Пусть $f(n)$ — это какая-то функция. Говорят, что функция $g(n) = O(f(n))$, если существует такие константы $c$ и $n_0$, что $g(n) < c \cdot f(n)$ для всех $n \geq n_0$. 
Например: From 390a68e2b50e3f49cd7b869f9325913588435664 Mon Sep 17 00:00:00 2001 From: romanpovol <97195569+romanpovol@users.noreply.github.com> Date: Thu, 13 Jan 2022 20:15:50 +0300 Subject: [PATCH 003/531] Update bridges.md --- content/russian/cs/graph-traversals/bridges.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/content/russian/cs/graph-traversals/bridges.md b/content/russian/cs/graph-traversals/bridges.md index 662fafdb..4fa8bcf0 100644 --- a/content/russian/cs/graph-traversals/bridges.md +++ b/content/russian/cs/graph-traversals/bridges.md @@ -1,6 +1,7 @@ --- title: Мосты и точки сочленения weight: 6 +published: true --- **Определение.** *Мостом* называется ребро, при удалении которого связный неориентированный граф становится несвязным. @@ -79,6 +80,7 @@ void dfs(int v, int p = -1) { void dfs(int v, int p = -1) { used[v] = 1; d[v] = h[v] = (p == -1 ? 0 : h[p] + 1); + int children = 0; for (int u : g[v]) { if (u != p) { if (used[u]) @@ -90,10 +92,11 @@ void dfs(int v, int p = -1) { // v -- точка сочленения // (это условие может выполниться много раз для разных детей) } + children++; } } } - if (p == -1 && g[v].size() > 1) { + if (p == -1 && children > 1) { // v -- корень и точка сочленения } } From 1d1376632dc1ad0adbac37b7366fff0063120466 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 13 Jan 2022 20:51:32 +0300 Subject: [PATCH 004/531] comment about the count of children of a tree root --- content/russian/cs/graph-traversals/bridges.md | 1 + 1 file changed, 1 insertion(+) diff --git a/content/russian/cs/graph-traversals/bridges.md b/content/russian/cs/graph-traversals/bridges.md index 662fafdb..c2a0cd30 100644 --- a/content/russian/cs/graph-traversals/bridges.md +++ b/content/russian/cs/graph-traversals/bridges.md @@ -93,6 +93,7 @@ void dfs(int v, int p = -1) { } } } + // если v -- корень, то число детей — это просто количество смежных ей вершин if (p == -1 && g[v].size() > 1) { // v -- корень и точка сочленения } From 
d807f7aa0e6fd717c1b07b1fc8d7f4bdc7a5b6f8 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 13 Jan 2022 20:52:03 +0300 Subject: [PATCH 005/531] mdash unicode issues --- content/russian/cs/graph-traversals/bridges.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/russian/cs/graph-traversals/bridges.md b/content/russian/cs/graph-traversals/bridges.md index c2a0cd30..73426044 100644 --- a/content/russian/cs/graph-traversals/bridges.md +++ b/content/russian/cs/graph-traversals/bridges.md @@ -93,7 +93,7 @@ void dfs(int v, int p = -1) { } } } - // если v -- корень, то число детей — это просто количество смежных ей вершин + // если v -- корень, то число детей -- это просто количество смежных ей вершин if (p == -1 && g[v].size() > 1) { // v -- корень и точка сочленения } From cccf0ee2d9f055c6a517990837709e219aa1902a Mon Sep 17 00:00:00 2001 From: Ihor Chovpan <67230858+chopikus@users.noreply.github.com> Date: Tue, 18 Jan 2022 20:07:41 +0300 Subject: [PATCH 006/531] =?UTF-8?q?=D0=98=D1=81=D0=BF=D1=80=D0=B0=D0=B2?= =?UTF-8?q?=D0=BB=D0=B5=D0=BD=D0=BE=20=D0=B4=D0=BE=D0=BA=D0=B0=D0=B7=D0=B0?= =?UTF-8?q?=D1=82=D0=B5=D0=BB=D1=8C=D1=81=D1=82=D0=B2=D0=BE=20=D0=BE=D0=B6?= =?UTF-8?q?=D0=B8=D0=B4=D0=B0=D0=B5=D0=BC=D0=BE=D0=B9=20=D0=BB=D0=BE=D0=B3?= =?UTF-8?q?=D0=B0=D1=80=D0=B8=D1=84=D0=BC=D0=B8=D1=87=D0=B5=D1=81=D0=BA?= =?UTF-8?q?=D0=BE=D0=B9=20=D0=B3=D0=BB=D1=83=D0=B1=D0=B8=D0=BD=D1=8B=20?= =?UTF-8?q?=D0=B2=D0=B5=D1=80=D1=88=D0=B8=D0=BD=D1=8B=20=D0=B2=20=D0=B4?= =?UTF-8?q?=D0=B5=D0=BA=D0=B0=D1=80=D1=82=D0=BE=D0=B2=D0=BE=D0=BC=20=D0=B4?= =?UTF-8?q?=D0=B5=D1=80=D0=B5=D0=B2=D0=B5.?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- content/russian/cs/tree-structures/treap.md | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/content/russian/cs/tree-structures/treap.md b/content/russian/cs/tree-structures/treap.md index 561280a5..667f561c 100644 --- 
a/content/russian/cs/tree-structures/treap.md +++ b/content/russian/cs/tree-structures/treap.md @@ -1,14 +1,15 @@ --- title: Декартово дерево authors: -- Сергей Слотин -date: 2021-08-20 -created: "2018" + - Сергей Слотин +date: {} +created: '2018' prerequisites: -- . -- ../basic-structures/heap -- /math/probability/expectation + - . + - ../basic-structures/heap + - /math/probability/expectation weight: 1 +published: true --- Рене Декарт (фр. *René Descartes*) — великий французский математик и философ XVII века. @@ -88,7 +89,7 @@ $$ Теперь, чтобы найти матожидание глубины, эти вероятности надо просуммировать: $$ -E[d_i] = \sum_{j \neq i} p(j, i) = \sum_{j \neq i} \frac{1}{|i-j|+1} \leq \sum_{i=1}^n \frac{2}{n} = O(\log n) +E[d_i] = \sum_{j \neq i} p(j, i) = \sum_{j \neq i} \frac{1}{|i-j|+1} = \sum_{j < i} \frac{1}{i-j} + \sum_{j > i} \frac{1}{j-i} \leq 2 \cdot (\sum_{k=1}^n \frac{1}{k}) = O(\log n) $$ Перед последним переходом мы получили сумму гармонического ряда. From 1ef31e97b016c7f221605d363d828d0103af1145 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 22 Jan 2022 12:55:51 +0300 Subject: [PATCH 007/531] make equation more precise --- content/russian/cs/tree-structures/treap.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/content/russian/cs/tree-structures/treap.md b/content/russian/cs/tree-structures/treap.md index 667f561c..d5aa96a8 100644 --- a/content/russian/cs/tree-structures/treap.md +++ b/content/russian/cs/tree-structures/treap.md @@ -2,7 +2,7 @@ title: Декартово дерево authors: - Сергей Слотин -date: {} +date: 2022-01-22 created: '2018' prerequisites: - . 
@@ -89,7 +89,11 @@ $$ Теперь, чтобы найти матожидание глубины, эти вероятности надо просуммировать: $$ -E[d_i] = \sum_{j \neq i} p(j, i) = \sum_{j \neq i} \frac{1}{|i-j|+1} = \sum_{j < i} \frac{1}{i-j} + \sum_{j > i} \frac{1}{j-i} \leq 2 \cdot (\sum_{k=1}^n \frac{1}{k}) = O(\log n) +E[d_i] = \sum_{j \neq i} p(j, i) + = \sum_{j \neq i} \frac{1}{|i-j|+1} + = \sum_{j < i} \frac{1}{i -j + 1} + \sum_{j > i} \frac{1}{j - i + 1} + \leq 2 \cdot (\sum_{k=2}^n \frac{1}{k}) + = O(\log n) $$ Перед последним переходом мы получили сумму гармонического ряда. From 042417ebad92f39646f3aed3e69b1620a0ad495c Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 22 Jan 2022 13:05:26 +0300 Subject: [PATCH 008/531] increase horizontal padding --- themes/algorithmica/assets/style.sass | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/themes/algorithmica/assets/style.sass b/themes/algorithmica/assets/style.sass index 9b05bd35..a30cfe62 100644 --- a/themes/algorithmica/assets/style.sass +++ b/themes/algorithmica/assets/style.sass @@ -257,7 +257,7 @@ main min-width: 500px max-width: 850px margin: auto - padding: 6px 12px + padding: 6px 18px // so that the footer is stuck to bottom even if the page is short: min-height: calc(100vh - 168px) From 12d25fde1cc733ae3e04cba91f2824125de675ca Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 22 Jan 2022 13:37:10 +0300 Subject: [PATCH 009/531] fix alt+arrow hotkey conflict --- themes/algorithmica/layouts/partials/head.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/themes/algorithmica/layouts/partials/head.html b/themes/algorithmica/layouts/partials/head.html index 55c6d380..f87a8873 100644 --- a/themes/algorithmica/layouts/partials/head.html +++ b/themes/algorithmica/layouts/partials/head.html @@ -56,8 +56,8 @@ menu.classList.add('scrolled') } }) - // onkeypress didn't work with arrows for some reasons window.addEventListener('keydown', function(e) { + if (e.altKey) { return } if (e.key == 'ArrowLeft') { 
document.getElementById('prev-article').click() } else if (e.key == 'ArrowRight') { From f3e57a02f62fa5f2ea1a52af0e5b55d353544efd Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 22 Jan 2022 15:21:55 +0300 Subject: [PATCH 010/531] simplify treap height formula --- content/russian/cs/tree-structures/treap.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/russian/cs/tree-structures/treap.md b/content/russian/cs/tree-structures/treap.md index d5aa96a8..b5fdf764 100644 --- a/content/russian/cs/tree-structures/treap.md +++ b/content/russian/cs/tree-structures/treap.md @@ -91,8 +91,8 @@ $$ $$ E[d_i] = \sum_{j \neq i} p(j, i) = \sum_{j \neq i} \frac{1}{|i-j|+1} - = \sum_{j < i} \frac{1}{i -j + 1} + \sum_{j > i} \frac{1}{j - i + 1} - \leq 2 \cdot (\sum_{k=2}^n \frac{1}{k}) + = \sum_{j < i} \frac{1}{i - j} + \sum_{j > i} \frac{1}{j - i} + \leq 2 \cdot \sum_{k=2}^n \frac{1}{k} = O(\log n) $$ From 302140349e61bd54a105c5cd47b03b29c7f8dee8 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 22 Jan 2022 15:43:25 +0300 Subject: [PATCH 011/531] update blog list --- content/english/hpc/_index.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index 6c2b4af3..793f4df1 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -88,7 +88,7 @@ This work is largely based on blog posts, research papers, conference talks and - [Agner Fog](https://agner.org/optimize/) - [Daniel Lemire](https://lemire.me/en/#publications) - [Andrei Alexandrescu](https://erdani.com/index.php/about/) -- Chandler Carruth +- [Chandler Carruth](https://twitter.com/chandlerc1024) - [Wojciech Muła](http://0x80.pl/articles/index.html) - [Malte Skarupke](https://probablydance.com/) - [Travis Downs](https://travisdowns.github.io/) @@ -105,9 +105,8 @@ This work is largely based on blog posts, research papers, conference talks and - [Edmond 
Chow](https://www.cc.gatech.edu/~echow/) - [Peter Cordes](https://stackoverflow.com/users/224132/peter-cordes) - [ridiculous_fish](https://ridiculousfish.com/blog/) -- Kazushige Goto -- Matt Kulukundis -- Oleksandr Bacherikov +- [Geoff Langdale](https://branchfree.org/) +- [Matt Kulukundis](https://twitter.com/JuvHarlequinKFM) Volume: 300-400 pages Release date: early 2022 From d30f92a8937d5bec299397a76c64ea5c958cac5e Mon Sep 17 00:00:00 2001 From: Alex <8746137+rationalex@users.noreply.github.com> Date: Sun, 23 Jan 2022 15:13:40 +0300 Subject: [PATCH 012/531] =?UTF-8?q?=D0=A3=D1=82=D0=BE=D1=87=D0=BD=D0=B5?= =?UTF-8?q?=D0=BD=D0=B8=D0=B5=20=D0=BF=D1=80=D0=BE=20=D1=82=D0=BE,=20?= =?UTF-8?q?=D0=B3=D0=B4=D0=B5=20=D1=80=D0=B5=D0=B4=D0=B0=D0=BA=D1=82=D0=B8?= =?UTF-8?q?=D1=80=D0=BE=D0=B2=D0=B0=D1=82=D1=8C=20=D1=81=D1=82=D0=B0=D1=82?= =?UTF-8?q?=D1=8C=D0=B8.?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- content/russian/contributing.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/content/russian/contributing.md b/content/russian/contributing.md index e7fb62ee..d006892d 100644 --- a/content/russian/contributing.md +++ b/content/russian/contributing.md @@ -1,9 +1,10 @@ --- title: Как добавлять и редактировать статьи authors: -- Сергей Слотин -date: 2021-09-30 + - Сергей Слотин +date: {} hideSidebar: true +published: true --- Неполные гайдлайны, которые постепенно будут пополняться. @@ -14,7 +15,7 @@ hideSidebar: true ### Если у меня маленькая правка -Нужно нажать на кнопку с карандашом сверху справа. Откроется интерфейс prose.io, в котором нужно залогиниться через github, после чего можно редактировать markdown-исходник страницы. +Если вы читаете этот текст на github'е, то откройте его на [алгоритмике](https://ru.algorithmica.org/contributing/). Затем нужно нажать на кнопку с карандашом сверху справа. 
Откроется интерфейс prose.io, в котором нужно залогиниться через github, после чего можно редактировать markdown-исходник страницы. При первом сохранении автоматически создастся ветка и pull request от вашего имени, и при дальнейших он будет обновляться. Когда закончили, оставьте как есть — кто-нибудь придет и апрувнет. From 5057e72a3b6d4f147f4a81d95ea17b35374f4550 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 23 Jan 2022 16:07:22 +0300 Subject: [PATCH 013/531] edits to the contribution guide --- content/russian/contributing.md | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/content/russian/contributing.md b/content/russian/contributing.md index d006892d..c33a6b1f 100644 --- a/content/russian/contributing.md +++ b/content/russian/contributing.md @@ -2,7 +2,7 @@ title: Как добавлять и редактировать статьи authors: - Сергей Слотин -date: {} +date: 2021-01-23 hideSidebar: true published: true --- @@ -15,9 +15,7 @@ published: true ### Если у меня маленькая правка -Если вы читаете этот текст на github'е, то откройте его на [алгоритмике](https://ru.algorithmica.org/contributing/). Затем нужно нажать на кнопку с карандашом сверху справа. Откроется интерфейс prose.io, в котором нужно залогиниться через github, после чего можно редактировать markdown-исходник страницы. - -При первом сохранении автоматически создастся ветка и pull request от вашего имени, и при дальнейших он будет обновляться. Когда закончили, оставьте как есть — кто-нибудь придет и апрувнет. +На любой странице сайта можно нажать кнопку с карандашом сверху справа. Откроется интерфейс prose.io, в котором нужно залогиниться через GitHub, после чего можно редактировать markdown-исходник страницы. При первом сохранении автоматически создастся ветка и pull request от вашего имени, и при дальнейших он будет обновляться. Когда закончили, оставьте как есть — кто-нибудь придет и апрувнет. 
Полного preview там нет — осторожнее с правкой сложных формул, если не уверены в корректности. @@ -25,9 +23,7 @@ published: true ### Если у меня большая правка -Для чего-либо серьёзного рекомендуется счекаутить репозиторий и поднять сайт локально. - -Это можно сделать так (предполагается, что вы знакомы с работой в терминале): +Для чего-либо серьёзного рекомендуется счекаутить репозиторий и поднять сайт локально. Это можно сделать так (предполагается, что вы знакомы с работой в терминале): 1. [Поставить Hugo](https://gohugo.io/getting-started/installing/): скорее всего одно из `sudo apt-get install hugo`, `sudo pacman -Syu hugo`, `brew install hugo` или `choco install hugo -confirm` в зависимости от системы. 2. Форкнуть репозиторий и сделать `git clone https://github.com/$USERNAME/algorithmica.git`. @@ -57,7 +53,7 @@ published: true [Гайд по синтаксису](https://www.markdownguide.org/basic-syntax/). -Помимо основного синтаксиса, поддерживаются ещё таблицы, блоки кода, strikethrough, latex (через один или два `$`) и tikz (через две `@`). +Помимо основного синтаксиса, поддерживаются ещё таблицы, блоки кода, strikethrough, latex-формулы (через один или два `$`) и tikz-диаграммы (через две `@`). ### Front matter @@ -74,7 +70,7 @@ published: true ## Правила русского языка -Ревьюер всё равно поправит, но пожалуйста, имейте в виду: +Ревьюер всё равно поправит, но, пожалуйста, имейте в виду: 1. Кавычки: « и ». 2. [Дефисы, минусы и тире](https://www.artlebedev.ru/kovodstvo/sections/97/): -, $a-b$ (через latex) и —. 
From a03aea643722246b75700bc5ceb0898f7060cfc9 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 23 Jan 2022 16:46:32 +0300 Subject: [PATCH 014/531] 128-bit integer addition --- content/english/hpc/arithmetic/integer.md | 27 ++++++++++++++++++++--- 1 file changed, 24 insertions(+), 3 deletions(-) diff --git a/content/english/hpc/arithmetic/integer.md b/content/english/hpc/arithmetic/integer.md index 8fd1059a..e9531a03 100644 --- a/content/english/hpc/arithmetic/integer.md +++ b/content/english/hpc/arithmetic/integer.md @@ -95,7 +95,9 @@ Big-endian is also more "natural" — this is how we write binary numbers on pap Sometimes we need to multiply two 64-bit integers to get a 128-bit integer — that usually serves as a temporary value and e. g. reduced by modulo right away. -There are no 128-bit registers to hold the result of such multiplication, but `mul` instruction can operate in a manner [similar to division](/hpc/analyzing-performance/gcd/), by multiplying whatever is stored in `rax` by its operand and [writing the result](https://gcc.godbolt.org/z/4Gfxhs84Y) into two registers — the lower 64 bits of the result will go into `rdx`, and `rax` will have the higher 64 bits. Some languages have a special type to support such an operation: +There are no 128-bit registers to hold the result of such multiplication, but `mul` instruction can operate in a manner [similar to division](/hpc/analyzing-performance/gcd/), by multiplying whatever is stored in `rax` by its operand and [writing the result](https://gcc.godbolt.org/z/4Gfxhs84Y) into two registers — the lower 64 bits of the result will go into `rdx`, and `rax` will have the higher 64 bits. + +Some compilers have a separate type supporting this operation. 
In GCC and Clang it is available as `__int128`: ```cpp void prod(int64_t a, int64_t b, __int128 *c) { @@ -103,7 +105,7 @@ void prod(int64_t a, int64_t b, __int128 *c) { } ``` -For all purposes other than multiplication, 128-bit integers are just bundled as two registers. This makes it too weird to have a full-fledged 128-bit type, so the support for it is limited. The typical use for this type is to get either the lower or the higher part of the multiplication and forget about it: +Its typical use case is to immediately extract either the lower or the higher part of the multiplication and forget about it: ```c++ __int128_t x = 1; @@ -111,4 +113,23 @@ int64_t hi = x >> 64; int64_t lo = (int64_t) x; // will be just truncated ``` -Other platforms provide similar mechanisms for dealing with longer-than-word multiplication. For example, arm has `mulhi` and `mullo` instruction, returning lower and higher parts of the multiplication, and x86 SIMD extensions have similar 32-bit instructions. +For all purposes other than multiplication, 128-bit integers are just bundled as two registers. This makes it too weird to have a full-fledged 128-bit type, so the support for it is limited, other than for basic arithmetic operations. For example: + +```c++ +__int128_t add(__int128_t a, __int128_t b) { + return a + b; +} +``` + +is compiled into: + +```nasm +add: + mov rax, rdi + add rax, rdx ; this sets the carry flag in case of an overflow + adc rsi, rcx ; +1 if the carry flag is set + mov rdx, rsi + ret +``` + +Other platforms provide similar mechanisms for dealing with longer-than-word multiplication. For example, Arm has `mulhi` and `mullo` instructions, returning lower and higher parts of the multiplication, and x86 [SIMD extensions](/hpc/simd) have similar 32-bit instructions. 
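The 128-bit operations this patch describes can be sketched in portable-looking C++ as follows. This is only an illustration under stated assumptions: it relies on the `unsigned __int128` extension of GCC and Clang, and the names `Wide`, `mul_wide`, and `add_wide` are hypothetical helpers that do not appear in the patched article.

```cpp
#include <cstdint>

// Both halves of a 128-bit value (hypothetical type, for illustration only).
struct Wide {
    uint64_t hi; // upper 64 bits
    uint64_t lo; // lower 64 bits
};

// Hypothetical helper: widen one operand so the multiplication is carried
// out in 128 bits, then split the product into its two 64-bit halves.
Wide mul_wide(uint64_t a, uint64_t b) {
    unsigned __int128 p = (unsigned __int128) a * b;
    return { (uint64_t) (p >> 64), (uint64_t) p };
}

// Hypothetical helper: 128-bit addition assembled from 64-bit halves,
// mirroring the add/adc pair in the assembly listing of the patch.
Wide add_wide(Wide x, Wide y) {
    uint64_t lo = x.lo + y.lo;   // wraps around on overflow
    uint64_t carry = lo < x.lo;  // 1 if the low halves overflowed
    return { x.hi + y.hi + carry, lo };
}
```

Writing the carry out by hand is only meant to show what the `adc` instruction does; in practice, adding two `__int128` values directly, as the patch does, is simpler and lets the compiler pick the instructions.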
From fa728f3acd8ced1c4ee81495f7680ccf4cd17543 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 23 Jan 2022 17:26:39 +0300 Subject: [PATCH 015/531] instrumentation edits --- .../english/hpc/profiling/instrumentation.md | 20 ++++++------------- 1 file changed, 6 insertions(+), 14 deletions(-) diff --git a/content/english/hpc/profiling/instrumentation.md b/content/english/hpc/profiling/instrumentation.md index 8ca71f99..bdf3392b 100644 --- a/content/english/hpc/profiling/instrumentation.md +++ b/content/english/hpc/profiling/instrumentation.md @@ -14,9 +14,9 @@ float seconds = float(clock() - start) / CLOCKS_PER_SEC; printf("do_something() took %.4f", seconds); ``` -One nuance here is that you can't measure the execution time of particularly quick functions this way. The `clock` function returns the current timestamp in microseconds ($10^{-6}$), and it does so by waiting to the nearest ceiled microsecond — so it basically takes up to 1000ns to complete, which is an eternity in the world of low-level optimization. +One nuance here is that you can't measure the execution time of particularly quick functions this way because the `clock` function returns the current timestamp in microseconds ($10^{-6}$) and also by itself takes up to a few hundred nanoseconds to complete. All other time-related utilities similarly have at least microsecond granularity, which is an eternity in the world of low-level optimization. -As a workaround, you can invoke the function repeatedly in a loop, time the whole thing once, and then divide the total time by the number of iterations. You also need to ensure nothing gets cached or affected by similar side effects. This is a rather tedious way of doing profiling, especially if you are interested in multiple small sections of the program. 
+To achieve higher precision, you can invoke the function repeatedly in a loop, time the whole thing once, and then divide the total time by the number of iterations: ```cpp #include @@ -28,28 +28,20 @@ int main() { clock_t start = clock(); for (int i = 0; i < N; i++) - clock(); + clock(); // benchmarking the clock function itself float duration = float(clock() - start) / CLOCKS_PER_SEC; - printf("%.2fns\n", 1e9 * duration / N); + printf("%.2fns per iteration\n", 1e9 * duration / N); return 0; } ``` - +You also need to ensure that nothing gets cached, optimized away by the compiler, or affected by similar side effects. This is a separate and highly complicated topic that we will discuss in more detail at [the end of the chapter](../benchmarking). ### Event Sampling -Instrumentation can also be used for collecting other types of info that can give useful insights about the performance of a particular algorithm. For example: +Instrumentation can also be used to collect other types of information that can give useful insights about the performance of a particular algorithm. For example: - for a hash function, we are interested in the average length of its input; - for a binary tree, we care about its size and height; From f84d0154f6853d7051f7d96f0933d922d4c3e47b Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 23 Jan 2022 17:58:05 +0300 Subject: [PATCH 016/531] statistical profiling --- content/english/hpc/profiling/events.md | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/content/english/hpc/profiling/events.md b/content/english/hpc/profiling/events.md index c531ed28..90525eda 100644 --- a/content/english/hpc/profiling/events.md +++ b/content/english/hpc/profiling/events.md @@ -3,9 +3,15 @@ title: Statistical Profiling weight: 2 --- -Another, less invasive approach to profiling is to interrupt the execution of a program at random intervals and look where the instruction pointer is. 
The number of times the pointer stopped in each function's block would be roughly proportional to the total time spent executing these functions. You can also get some other useful information this way, like finding out which functions are called by which functions by inspecting the call stack. +[Instrumentation](../instrumentation) is a rather tedious way of doing profiling, especially if you are interested in multiple small sections of the program. And even if it can be partially automated by the tooling, it still won't help you gather some fine-grained statistics because of its inherent overhead. -This could in principle be done by just running a program with `gdb` and `ctrl+c`'ing it at random intervals, but modern CPUs and operating systems provide special utilities for this type of profiling. Hardware *performance counters* are special registers built into microprocessors that can store the counts of certain hardware-related activities. They are cheap to add on a microchip, as they are basically just binary counters with an activation wire connected to them. +Another, less invasive approach to profiling is to interrupt the execution of a program at random intervals and look where the instruction pointer is. The number of times the pointer stopped in each function's block would be roughly proportional to the total time spent executing these functions. You can also get some other useful information this way, like finding out which functions are called by which functions by inspecting [the call stack](/hpc/architecture/functions). + +This could, in principle, be done by just running a program with `gdb` and `ctrl+c`'ing it at random intervals but modern CPUs and operating systems provide special utilities for this type of profiling. + +### Hardware Events + +Hardware *performance counters* are special registers built into microprocessors that can store the counts of certain hardware-related activities. 
They are cheap to add on a microchip, as they are basically just binary counters with an activation wire connected to them. Each performance counter is connected to a large subset of circuitry and can be configured to be incremented on a particular hardware event, such as a branch mispredict or a cache miss. You can reset a counter at the start of a program, run it, and output its stored value at the end, and it will be equal to the exact number of times a certain event has been triggered throughout the execution. @@ -15,9 +21,9 @@ Overall, event-driven statistical profiling is usually the most effective and ea ### Profiling with perf -There are many profilers and other performance analysis tools. The one we will mostly rely on in this book is [perf](https://perf.wiki.kernel.org/), which is a statistical profiler available in the Linux kernel. On non-Linux systems, you can use [VTune](https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html#gs.cuc0ks) from Intel, which provides roughly the same functionality for our purposes. It is available for free, although it is a proprietary software for which you need to refresh a community license every 90 days, while perf is free as in freedom. +Performance analysis tools that rely on the event sampling techniques described above are called *statistical profilers*. There are many of them, but the one we will mainly use in this book is [perf](https://perf.wiki.kernel.org/), which is a statistical profiler shipped with the Linux kernel. On non-Linux systems, you can use [VTune](https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/vtune-profiler.html#gs.cuc0ks) from Intel, which provides roughly the same functionality for our purposes. It is available for free, although it is proprietary, and you need to refresh your community license every 90 days, while perf is free as in freedom. 
-Perf is a command-line application that generates reports based on live execution of programs. It does not need the source and can profile a very wide range of applications, even those that involve multiple processes and interaction with the operating system. +Perf is a command-line application that generates reports based on the live execution of programs. It does not need the source and can profile a very wide range of applications, even those that involve multiple processes and interaction with the operating system. For explanation purposes, I have written a small program that creates an array of a million random integers, sorts it, and then does a million binary searches on it: @@ -38,7 +44,7 @@ int query() { } ``` -After compiling it (`g++ -O3 -march=native example.cc -o run`), we can run it with `perf stat ./run`, which outputs the counts of basic performance events during the execution: +After compiling it (`g++ -O3 -march=native example.cc -o run`), we can run it with `perf stat ./run`, which outputs the counts of basic performance events during its execution: ```yaml Performance counter stats for './run': @@ -60,7 +66,7 @@ After compiling it (`g++ -O3 -march=native example.cc -o run`), we can run it wi 0.000000000 seconds sys ``` -You can see that the execution took 0.53 seconds, or 852M cycles at effective 1.32 GHz clock rate, over which 479M instructions were executed. There were also a total of 122.7M branches, and 15.7% of them were mispredicted. +You can see that the execution took 0.53 seconds or 852M cycles at an effective 1.32 GHz clock rate, over which 479M instructions were executed. There were also 122.7M branches, and 15.7% of them were mispredicted. You can get a list of all supported events with `perf list`, and then specify a list of specific events you want with the `-e` option. 
For example, for diagnosing binary search, we mostly care about cache misses: @@ -73,7 +79,7 @@ You can get a list of all supported events with `perf list`, and then specify a By itself, `perf stat` simply sets up performance counters for the whole program. It can tell you the total number of branch mispredictions, but it won't tell you *where* they are happening, let alone *why* they are happening. -To try the stop-the-world approach we talked about initially, we need to use `perf record `, which records profiling data and dumps it as a `perf.data` file, and then call `perf report` to inspect it. I highly advise you to go and try it yourselves because the last command is interactive and colorful, but for those that can't do it right now, I'll try to describe it the best I can. +To try the stop-the-world approach we discussed previously, we need to use `perf record `, which records profiling data and dumps it as a `perf.data` file, and then call `perf report` to inspect it. I highly advise you to go and try it yourselves because the last command is interactive and colorful, but for those that can't do it right now, I'll try to describe it the best I can. When you call `perf report`, it first displays a `top`-like interactive report that tells you which functions are taking how much time: @@ -116,8 +122,8 @@ Next, you can "zoom in" on any of these functions, and, among others things, it │ ↑ jne 20 ``` -On the left column, you can see the fraction of times the instruction pointer stopped on a specific line. Because of intricacies such as pipelining and out-of-order execution, "now" is not a well-defined concept in modern CPUs, so the data is slightly inaccurate as the instruction pointer drifts a little bit forward. But it is still useful: here we spend ~65% of the time on the jump instruction because it has a comparison operator before it, indicating that the control flow waits there for this comparison to be decided. 
+On the left column is the fraction of times that the instruction pointer stopped on a specific line. You can see that we spend ~65% of the time on the jump instruction because it has a comparison operator before it, indicating that the control flow waits there for this comparison to be decided. -At the individual cycle level, we need something more precise. +Because of intricacies such as [pipelining](/hpc/pipelining) and out-of-order execution, "now" is not a well-defined concept in modern CPUs, so the data is slightly inaccurate as the instruction pointer drifts a little bit forward. The instruction-level data is still useful, but at the individual cycle level, we need to switch to [something more precise](../simulation). From 2b6b86c31b773f2cbc9123bfb1659ed929e70be6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 23 Jan 2022 18:26:12 +0300 Subject: [PATCH 017/531] mcas --- content/english/hpc/profiling/mca.md | 30 ++++++++++++++++++---------- 1 file changed, 19 insertions(+), 11 deletions(-) diff --git a/content/english/hpc/profiling/mca.md b/content/english/hpc/profiling/mca.md index 8f89fe54..f9d52cec 100644 --- a/content/english/hpc/profiling/mca.md +++ b/content/english/hpc/profiling/mca.md @@ -3,13 +3,15 @@ title: Machine Code Analyzers weight: 4 --- -The second category is *machine code analyzers*. These are programs that take assembly code and simulate its execution on a particular microarchitecture using information available to compilers, and output the latency and throughput of the whole snippet, as well as cycle-perfect utilization of various resources in a CPU. +The last approach to profiling is not to gather the data by actually running the program but to analyze what should happen by *simulating* it with specialized tools. There are many subcategories of such profilers, differing in which aspect of computation is simulated, but the one we are going to focus on in this section is *machine code analyzers*. 
-There are many of them, but I personally prefer `llvm-mca`, which you can probably install via a package manager together with `clang`. You can also access it through a new web-based tool called [UICA](https://uica.uops.info). +A machine code analyzer is a program that takes a small snippet of assembly code and simulates its execution on a particular microarchitecture using information available to compilers, and outputs the latency and throughput of the whole block, as well as cycle-perfect utilization of various resources within the CPU. -### Machine Code Analyzers +### Using `llvm-mca` -What machine code analyzers do is they run a set number of iterations of a given assembly snippet and compute statistics about the resource usage of each instruction, which is useful for finding out where the bottleneck is. +There are many different machine code analyzers, but I personally prefer `llvm-mca`, which you can probably install via a package manager together with `clang`. You can also access it through a new web-based tool called [UICA](https://uica.uops.info). + +What `llvm-mca` does is it runs a set number of iterations of a given assembly snippet and computes statistics about the resource usage of each instruction, which is useful for finding out where the bottleneck is. We will consider the array sum as our simple example: @@ -21,7 +23,7 @@ loop: jne loop ```` -Here is its analysis with `llvm-mca` on Skylake. You are not going to understand much, but that's fine for now. +Here is its analysis with `llvm-mca` for the Skylake microarchitecture: ```yaml Iterations: 100 @@ -37,9 +39,9 @@ Block RThroughput: 0.8 First, it outputs general information about the loop and the hardware: -- It "ran" the loop 100 times, executing 400 instructions in total in 108 cycles, which is the same as executing $\frac{400}{108} \approx 3.7$ instructions per cycle ("IPC") on average. -- The CPU is theoretically capable of executing up to 6 instructions per cycle ("dispatch width"). 
-- Each cycle in theory can be executed in 0.8 cycles on average ("block reciprocal throughput"). +- It "ran" the loop 100 times, executing 400 instructions in total in 108 cycles, which is the same as executing $\frac{400}{108} \approx 3.7$ [instructions per cycle](/hpc/complexity/hardware) on average (IPC). +- The CPU is theoretically capable of executing up to 6 instructions per cycle ([dispatch width](/hpc/architecture/layout)). +- Each iteration in theory can be executed in 0.8 cycles on average ([block reciprocal throughput](/hpc/pipelining/tables)). - The "uOps" here are the micro-operations that CPU splits each instruction into (e. g. fused load-add is composed of two uOps). Then it proceeds to give information about each individual instruction: @@ -60,11 +62,11 @@ Instruction Info: 1 1 0.50 jne -11 ``` -There is nothing there that there isn't in the instruction tables: +There is nothing there that there isn't in the [instruction tables](/hpc/pipelining/tables): - how many uOps each instruction is split into; -- how many cycles each instruction takes to complete ("latency"); -- how many cycles each instruction takes to complete in the amortized sense ("reciprocal throughput"), considering that several copies of it can be executed simultaneously. +- how many cycles each instruction takes to complete (latency); +- how many cycles each instruction takes to complete in the amortized sense (reciprocal throughput), considering that several copies of it can be executed simultaneously. Then it outputs probably the most important part — which instructions are executing when and where: @@ -77,6 +79,12 @@ Resource pressure by instruction: - - 0.99 - - - - - 0.01 - jne -11 ``` +As the contention for execution ports causes [structural hazards](/hpc/pipelining/hazards), ports often become the bottleneck for throughput-oriented loops, and this chart helps diagnose why. 
It does not give you a cycle-perfect Gantt chart or something like that, but it gives you the aggregate statistics of the execution ports used for each instruction, which lets you find which one is overloaded. + + From d5780573b181a0b27c6bfed0616b69625d5ff616 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 23 Jan 2022 19:01:37 +0300 Subject: [PATCH 018/531] benchmarking ramblings --- content/english/hpc/profiling/benchmarking.md | 187 +++++++++++++++--- 1 file changed, 164 insertions(+), 23 deletions(-) diff --git a/content/english/hpc/profiling/benchmarking.md b/content/english/hpc/profiling/benchmarking.md index dcc96fa0..585c6939 100644 --- a/content/english/hpc/profiling/benchmarking.md +++ b/content/english/hpc/profiling/benchmarking.md @@ -3,32 +3,189 @@ title: Benchmarking weight: 6 --- -Also, make the dataset as representing of your real use case as possible, and reach an agreement with people on the procedure of benchmarking. This is especially important for data processing algorithms and data structures: most sorting algorithms perform differently depending on the input, hash tables perform differently with different distributions of keys. +(This is an early draft. Don't read it.) + +Performance cycle is implementing, running and collecting metrics, and finding where the bottleneck is. The shorter this cycle is, the better. + +If you do it correctly, working on improving performance should resemble a loop: + +1. run the program and collect metrics, +2. figure out where the bottleneck is, +3. remove the bottleneck and go to step 1. + +The shorter this loop is, the faster you will iterate. + +Faster — and accurate — as possible. + +### Managing Experiments + +This isn't the universally best approach, but this is what I do. 
For something smaller, you may use this: + +```c++ + +``` + +Here are some hints on how to set up your environment to achieve this (you can find many examples in the [code repo](https://github.com/sslotin/ahm-code) for this book): + +- Separate all testing and analytics code from the implementation of the algorithm itself, and also different implementations from each other. In C/C++, you can do this by creating a single header file (e. g. `matmul.hh`) with a function interface and the code for its benchmarking in `main`, and many implementation files for each algorithm version (`v1.cc`, `v2.cc`, etc.) that all include that single header file. +- To speed up builds and reruns, create a Makefile or just a bunch of small scripts that calculate the statistics you may need. +- To speed up high-level analytics, create a Jupyter notebook where you put small scripts and do all the plots. You can also put build scripts there if you feel like it. https://github.com/google/benchmark +Using C-style global defines instead of `const int`. + +```c++ +#include +#include + +const int N = 1e6; + +#ifndef N +#define N (1<<20) +#endif +int main(int argc, char* argv[]) { + int n = (argc > 1 ? atoi(argv[1]) : N); + int m = (argc > 2 ? atoi(argv[2]) : 1<<20); + + clock_t start = clock(); + + for (int i = 0; i < N; i++) + clock(); + + float duration = float(clock() - start) / CLOCKS_PER_SEC; + printf("%.2fns\n", 1e9 * duration / N); + + return 0; +} +``` + +```c++ +#include + +#ifndef N +#define N (1<<20) +#endif + +void prepare(int *a, int n); +int lower_bound(int x); + +int main(int argc, char* argv[]) { + int n = (argc > 1 ? atoi(argv[1]) : N); + int m = (argc > 2 ? 
atoi(argv[2]) : 1<<20); + + int *a = new int[n]; + int *q = new int[m]; + + /* + for (int i = 0; i < n; i++) + a[i] = i; + for (int i = 0; i < m; i++) + q[i] = rand() % n; + */ + + for (int i = 0; i < n; i++) + a[i] = rand(); + for (int i = 0; i < m; i++) + q[i] = rand(); + + a[0] = RAND_MAX; + std::sort(a, a + n); + + prepare(a, n); + + int checksum = 0; + clock_t start = clock(); + + /* + for (int i = 0; i < m; i++) { + int x = lower_bound(q[i]); + int y = *std::lower_bound(a, a + n, q[i]); + if (x != y) { + std::cout << q[i] << " " << x << " " << y << std::endl; + //for (int j = 0; j < n; j++) + // if (abs(a[j] - q[i]) <= 2) + // std::cout << a[j] << std::endl; + //return 0; + } + } + */ + + /* + int last = 0; + + for (int i = 0; i < m; i++) { + last = lower_bound(q[i] ^ last); + checksum ^= last; + } + */ + + for (int i = 0; i < m; i++) + checksum ^= lower_bound(q[i]); + + float seconds = float(clock() - start) / CLOCKS_PER_SEC; + + //printf("%.4f s total time\n", seconds); + printf("%.2f ns per query\n", 1e9 * seconds / m); + printf("%d\n", checksum); + + return 0; +} + +``` + +Similarly in header files, e. g. for data structures that share the construction stage. + +You might want to do something more complicated for performance-critical production code, but you are just prototyping, don't over-engineer it go with the simplest approach. + +It is also helpful to include and either read from the standard input or (if you are multiple ) + + ### Measuring the Right Thing +Also, make the dataset as representing of your real use case as possible, and reach an agreement with people on the procedure of benchmarking. This is especially important for data processing algorithms and data structures: most sorting algorithms perform differently depending on the input, hash tables perform differently with different distributions of keys. 
+ Interleaving Similar to how Americans report pre-tax salary, Americans use non-PPP-adjusted stats, attention-seeking startups report revenue instead of profit, performance engineers report the best version of benchmark if not stated otherwise. I have never seen people do that though. It makes most difference when comparing branchy and branch-free algorithms. +```c++ +for (int i = 0; i < m; i++) + q[i] = rand(); + +int checksum = 0; + +for (int i = 0; i < m; i++) + checksum ^= lower_bound(q[i]); +``` + +```c++ +for (int i = 0; i < m; i++) + checksum ^= lower_bound(checksum ^ q[i]); +``` + +The best way to measure something is to plug it into real application. + +You also may want to mark checksums as `volatile` to prevent the compiler from [optimizing too much](/hpc/cpu-cache/latency). + +When your algorithm only writes data and doesn't calculate any sort of checksum, you can use `__sync_synchronize()`, which acts as a memory fence to prevent the compiler from optimizing between iterations. + People report things they like to report and leave out the things they don't. -### Noise Mitigation +Use random numbers. Not 1,2,3,4 because of branch prediction issues. You also better generate them ahead of time and use a fixed seed between invocations to minimize the noise and make the benchmark reproducible, which will help in debugging. -Frequency scaling +To put numbers in perspective, use statistics like "ns per query" or "cycles per byte" instead of wall clock whenever it is applicable. When you start to approach very high levels of performance, it makes sense to calculate what the theoretically maximal performance is and start thinking about your algorithm performance as a fraction of it. -I use a [convenient GNOME shell extension](https://extensions.gnome.org/extension/1082/cpufreq/) that has a separate button to do it. 
+### Noise Mitigation Since we are guiding our optimization by experiments, it is important to account for side effects and external noise in them, especially when reporting results to someone else: - Unless you are expecting a 2x kind of improvement, treat microbenchmarking the same way as A/B testing. When you run a program on a laptop for under a second, a ±5% fluctuation in performance is normal, so if you want to revert or keep a potential +1% improvement, run it until you reach statistical significance, by calculating variances and p-values. - Make sure there are no cold start effects due to cache. I usually solve this by making one cold test run where I check correctness of the algorithm, and then run it many times over for benchmarking (without checking correctness). - If you benchmark a CPU-intensive algorithm, measure its performance in cycles using `perf stat`: this way it will be independent of clock frequency, fluctuations of which are usually the main source of noise. -- Otherwise, set core frequency to the level you expect it to be and make sure nothing interferes with it. On Linux you can do it with `cpupower` (e. g. `sudo cpupower frequency-set -g powersave` to put it to minimum or `sudo cpupower frequency-set -g ondemand` to enable turbo boost). +- Otherwise, set core frequency to the level you expect it to be and make sure nothing interferes with it. On Linux you can do it with `cpupower` (e. g. `sudo cpupower frequency-set -g powersave` to put it to minimum or `sudo cpupower frequency-set -g ondemand` to enable turbo boost). I use a [convenient GNOME shell extension](https://extensions.gnome.org/extension/1082/cpufreq/) that has a separate button to do it. When running benchmarks, always quiesce the system: @@ -39,22 +196,6 @@ When running benchmarks, always quiesce the system: It is very easy to get skewed results without doing anything obviously wrong. 
Even a program's name can affect its speed: the executable's name ends up in an environment variable, environment variables end up on the call stack, and so the length of the name affects stack alignment, which can result in data accesses slowing down due to crossing cache line or memory page boundaries. -https://www.cs.huji.ac.il/~feit/exp/related.html - -### Managing Experiments - -Performance cycle is implementing, running and collecting metrics, and finding where the bottleneck is. The shorter this cycle is, the better. - -If you do it correctly, working on improving performance should resemble a loop: - -1. run the program and collect metrics, -2. figure out where the bottleneck is, -3. remove the bottleneck and go to step 1. - -The shorter this loop is, the faster you will iterate. Here are some hints on how to set up your environment to achieve this (you can find many examples in the [code repo](https://github.com/sslotin/ahm-code) for this book): - -- Separate all testing and analytics code from the implementation of the algorithm itself, and also different implementations from each other. In C/C++, you can do this by creating a single header file (e. g. `matmul.hh`) with a function interface and the code for its benchmarking in `main`, and many implementation files for each algorithm version (`v1.cc`, `v2.cc`, etc.) that all include that single header file. -- To speed up builds and reruns, create a Makefile or just a bunch of small scripts that calculate the statistics you may need. -- To speed up high-level analytics, create a Jupyter notebook where you put small scripts and do all the plots. You can also put build scripts there if you feel like it. -- To put numbers in perspective, use statistics like "ns per query" or "cycles per byte" instead of wall clock whenever it is applicable. 
When you start to approach very high levels of performance, it makes sense to calculate what the theoretically maximal performance is and start thinking about your algorithm performance as a fraction of it. +### Further Reading +If you are interested, you can explore this comprehensive [list of experimental computer science resources](https://www.cs.huji.ac.il/~feit/exp/related.html) by Dror Feitelson, perhaps starting with "[Producing Wrong Data Without Doing Anything Obviously Wrong](http://eecs.northwestern.edu/~robby/courses/322-2013-spring/mytkowicz-wrong-data.pdf)" by Todd Mytkowicz et al. From 67ac65879d77c01086538782b4eb1fefc0f91f79 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 23 Jan 2022 19:14:59 +0300 Subject: [PATCH 019/531] reorganize benchmarking --- content/english/hpc/profiling/benchmarking.md | 116 ++++++++++-------- 1 file changed, 67 insertions(+), 49 deletions(-) diff --git a/content/english/hpc/profiling/benchmarking.md b/content/english/hpc/profiling/benchmarking.md index 585c6939..d423d28e 100644 --- a/content/english/hpc/profiling/benchmarking.md +++ b/content/english/hpc/profiling/benchmarking.md @@ -1,10 +1,9 @@ --- title: Benchmarking weight: 6 +draft: true --- -(This is an early draft. Don't read it.) - Performance cycle is implementing, running and collecting metrics, and finding where the bottleneck is. The shorter this cycle is, the better. If you do it correctly, working on improving performance should resemble a loop: @@ -17,23 +16,22 @@ The shorter this loop is, the faster you will iterate. Faster — and accurate — as possible. -### Managing Experiments +## Managing Experiments -This isn't the universally best approach, but this is what I do. For something smaller, you may use this: +Here are some hints on how to set up your environment to achieve this (you can find many examples in the [code repo](https://github.com/sslotin/ahm-code) for this book): -```c++ +Using C-style global defines instead of `const int`. 
-``` +Similarly in header files, e. g. for data structures that share the construction stage. -Here are some hints on how to set up your environment to achieve this (you can find many examples in the [code repo](https://github.com/sslotin/ahm-code) for this book): +You might want to do something more complicated for performance-critical production code, but you are just prototyping, don't over-engineer it go with the simplest approach. -- Separate all testing and analytics code from the implementation of the algorithm itself, and also different implementations from each other. In C/C++, you can do this by creating a single header file (e. g. `matmul.hh`) with a function interface and the code for its benchmarking in `main`, and many implementation files for each algorithm version (`v1.cc`, `v2.cc`, etc.) that all include that single header file. -- To speed up builds and reruns, create a Makefile or just a bunch of small scripts that calculate the statistics you may need. -- To speed up high-level analytics, create a Jupyter notebook where you put small scripts and do all the plots. You can also put build scripts there if you feel like it. +It is also helpful to include and either read from the standard input or (if you are multiple ) -https://github.com/google/benchmark +### Writing Code + +Separate all testing and analytics code from the implementation of the algorithm itself, and also different implementations from each other. In C/C++, you can do this by creating a single header file (e. g. `matmul.hh`) with a function interface and the code for its benchmarking in `main`, and many implementation files for each algorithm version (`v1.cc`, `v2.cc`, etc.) that all include that single header file. -Using C-style global defines instead of `const int`. 
```c++ #include @@ -77,13 +75,6 @@ int main(int argc, char* argv[]) { int *a = new int[n]; int *q = new int[m]; - /* - for (int i = 0; i < n; i++) - a[i] = i; - for (int i = 0; i < m; i++) - q[i] = rand() % n; - */ - for (int i = 0; i < n; i++) a[i] = rand(); for (int i = 0; i < m; i++) @@ -97,35 +88,11 @@ int main(int argc, char* argv[]) { int checksum = 0; clock_t start = clock(); - /* - for (int i = 0; i < m; i++) { - int x = lower_bound(q[i]); - int y = *std::lower_bound(a, a + n, q[i]); - if (x != y) { - std::cout << q[i] << " " << x << " " << y << std::endl; - //for (int j = 0; j < n; j++) - // if (abs(a[j] - q[i]) <= 2) - // std::cout << a[j] << std::endl; - //return 0; - } - } - */ - - /* - int last = 0; - - for (int i = 0; i < m; i++) { - last = lower_bound(q[i] ^ last); - checksum ^= last; - } - */ - for (int i = 0; i < m; i++) checksum ^= lower_bound(q[i]); float seconds = float(clock() - start) / CLOCKS_PER_SEC; - //printf("%.4f s total time\n", seconds); printf("%.2f ns per query\n", 1e9 * seconds / m); printf("%d\n", checksum); @@ -134,14 +101,65 @@ int main(int argc, char* argv[]) { ``` -Similarly in header files, e. g. for data structures that share the construction stage. +take advantage of compile-time constants. If it is not needed, read parameters from the command line. -You might want to do something more complicated for performance-critical production code, but you are just prototyping, don't over-engineer it go with the simplest approach. +### Makefiles -It is also helpful to include and either read from the standard input or (if you are multiple ) +- To speed up builds and reruns, create a Makefile or just a bunch of small scripts that calculate the statistics you may need. 
+ + +```c++ +compile = g++ -std=c++17 -O3 -march=native -Wall + +%: %.cc gcd.hh + $(compile) $< -o $@ + +%.s: %.cc gcd.hh + $(compile) -S -fverbose-asm $< -o $@ + +%.run: % + @./$< + +.PHONY: %.run +``` + +### Jupyter Notebooks + +- To speed up high-level analytics, create a Jupyter notebook where you put small scripts and do all the plots. You can also put build scripts there if you feel like it. + +### Benchmarking Inside C++ + +Less overhead, and it lets you run more experiments. + +For C++ specifically, https://github.com/google/benchmark +You need to install it. May make sense for your use case, not only if you work for Google. +opinionated towards a particular way of doing things + +Some languages also have embedded facilities for benchmarking. Props to Julia and IPython team. + +This isn't the universally best approach, but this is what I do. For something smaller, you may use this: + +```c++ +void timeit(int (*f)(int, int)) { + clock_t start = clock(); + + volatile int checksum = 0; + + for (int i = 0; i < k; i++) + for (int j = 0; j < n; j++) + checksum += f(a[j], b[j]); + + float seconds = float(clock() - start) / CLOCKS_PER_SEC; + + printf("%.2f ns per call\n", 1e9 * seconds / n / k); + + cout << double(clock() - start) / CLOCKS_PER_SEC << endl; +} +``` +Then call it from `main` using several different implementations. -### Measuring the Right Thing +## Measuring the Right Thing Also, make the dataset as representing of your real use case as possible, and reach an agreement with people on the procedure of benchmarking. This is especially important for data processing algorithms and data structures: most sorting algorithms perform differently depending on the input, hash tables perform differently with different distributions of keys. @@ -178,7 +196,7 @@ Use random numbers. Not 1,2,3,4 because of branch prediction issues. 
You also be To put numbers in perspective, use statistics like "ns per query" or "cycles per byte" instead of wall clock whenever it is applicable. When you start to approach very high levels of performance, it makes sense to calculate what the theoretically maximal performance is and start thinking about your algorithm performance as a fraction of it. -### Noise Mitigation +## Reducing Noise Since we are guiding our optimization by experiments, it is important to account for side effects and external noise in them, especially when reporting results to someone else: @@ -196,6 +214,6 @@ When running benchmarks, always quiesce the system: It is very easy to get skewed results without doing anything obviously wrong. Even a program's name can affect its speed: the executable's name ends up in an environment variable, environment variables end up on the call stack, and so the length of the name affects stack alignment, which can result in data accesses slowing down due to crossing cache line or memory page boundaries. -### Further Reading +## Further Reading If you are interested, you can explore this comprehensive [list of experimental computer science resources](https://www.cs.huji.ac.il/~feit/exp/related.html) by Dror Feitelson, perhaps starting with "[Producing Wrong Data Without Doing Anything Obviously Wrong](http://eecs.northwestern.edu/~robby/courses/322-2013-spring/mytkowicz-wrong-data.pdf)" by Todd Mytkowicz et al. 
From 6b38a383a01d4b460f0ee5feca73780dc679f141 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 24 Jan 2022 13:51:08 +0300 Subject: [PATCH 020/531] perf flame graphs note --- content/english/hpc/profiling/events.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/profiling/events.md b/content/english/hpc/profiling/events.md index 90525eda..71ae9cd3 100644 --- a/content/english/hpc/profiling/events.md +++ b/content/english/hpc/profiling/events.md @@ -93,7 +93,7 @@ Overhead Command Shared Object Symbol 0.80% run libc-2.33.so [.] rand ``` -Note that, for each function, just its *overhead* is listed and not the total running time (e. g. `setup` includes `std::__introsort_loop` but only its own overhead is accounted as 3.43%). You also need to account for possible inlining, which is apparently what happened with `std::lower_bound` here. Perf also tracks shared libraries (like `libc`) and, in general, any other spawned processes: if you want, you can launch a web browser with perf and see what's happening inside. +Note that, for each function, just its *overhead* is listed and not the total running time (e. g. `setup` includes `std::__introsort_loop` but only its own overhead is accounted as 3.43%). There are tools for constructing [flame graphs](https://www.brendangregg.com/flamegraphs.html) out of perf reports to make them clearer. You also need to account for possible inlining, which is apparently what happened with `std::lower_bound` here. Perf also tracks shared libraries (like `libc`) and, in general, any other spawned processes: if you want, you can launch a web browser with perf and see what's happening inside. Next, you can "zoom in" on any of these functions, and, among other things, it will offer to show you its disassembly with an associated heatmap. 
For example, here is the assembly for `query`: From bf3ac3b1f4167250575584aa2a2e3076ec3247b9 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 24 Jan 2022 13:57:30 +0300 Subject: [PATCH 021/531] update hpc index page --- content/english/hpc/_index.md | 243 ++++++++++++++-------------------- 1 file changed, 97 insertions(+), 146 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index 793f4df1..1864f155 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -21,163 +21,46 @@ The first part covers the basics of computer architecture and optimization of si It walks through the main CPU optimization topics such as caching, SIMD and pipelining, and provides brief examples in C++, followed by large case studies where we usually achieve a significant speedup over some STL algorithm or data structure. -``` -0. Why Go Beyond Big O -1. Analyzing Performance - 1.1. Computer Architecture & Assembly - 1.2. Negotiating with Compilers - 1.3. Profiling - 1.4. Binary GCD <- 2x faster std::gcd -2. Bit Hacks and Arithmetic - 2.1. Floating-Point Arithmetic - 2.2. Numerical Methods - 2.3. Integer Arithmetic - 2.4. Bit Manipulation - 2.5. Modular Arithmetic - 2.6. Finite Fields - 2.7. Cryptography, Hashing and PRNG - 2.8. Integer Factorization - 2.9. Bignum Arithmetic and the Karatsuba Algorithm - 2.10. Fast Fourier Transform -3. Memory - 3.1. External Memory Model - 3.2. Cache Locality - 3.3. Sublinear Algorithms - 3.4. RAM & CPU Caches - 3.5. Memory Management - 3.6. Layouts for Binary Search <- 5x faster std::lower_bound - 3.7. Implicit Data Structures <- 7x faster segment trees - 3.8. Hash Tables <- 5x faster std::unordered_map -4. SIMD Parallelism - 4.1. Intrinsics and Vector Extensions - 4.2. (Auto-)Vectorization - 4.3. SSE & AVX Cookbook - 4.4. Argmin with SIMD - 4.5. Logistic Regression - 4.6. Bitmaps - 4.7. String Searching <- ?x faster strstr - 4.8. Parsing Integers <- 2x faster scanf("%d") - 4.9. 
Sorting <- 8x faster std::sort -5. Instruction-Level Parallelism - 5.1. Pipelining and Hazards - 5.2. Throughput Computing <- 2x faster std::accumulate - 5.3. µOps & Scheduling - 5.4. Theoretical Performance Limits - 5.5. Matrix Multiplication <- 100x faster gemm -6. Summary -``` - -Among cool things that we will speed up: - -- 2x faster GCD (compared to `std::gcd`) -- 5x faster binary search (compared to `std::lower_bound`) -- 7x faster segment trees -- 5x faster hash tables (compared to `std::unordered_map`) -- ~~?x faster popcount~~ -- 2x faster parsing series of integers (compared to `scanf`) -- ?x faster sorting (compared to `std::sort`) -- 2x faster sum (compared to `std::accumulate`) -- 100x faster matrix multiplication (compared to "for-for-for") -- optimal word-size integer factorization (~0.4ms per 60-bit integer) -- optimal Karatsuba Algorithm -- optimal FFT -- argmin at the speed of memory - -This work is largely based on blog posts, research papers, conference talks and other work authored by a lot of people: - -- [Agner Fog](https://agner.org/optimize/) -- [Daniel Lemire](https://lemire.me/en/#publications) -- [Andrei Alexandrescu](https://erdani.com/index.php/about/) -- [Chandler Carruth](https://twitter.com/chandlerc1024) -- [Wojciech Muła](http://0x80.pl/articles/index.html) -- [Malte Skarupke](https://probablydance.com/) -- [Travis Downs](https://travisdowns.github.io/) -- [Brendan Gregg](https://www.brendangregg.com/blog/index.html) -- [Andreas Abel](http://embedded.cs.uni-saarland.de/abel.php) -- [Jakob Kogler](https://cp-algorithms.com/) -- [Igor Ostrovsky](http://igoro.com/) -- [Steven Pigeon](https://hbfs.wordpress.com/) -- [Denis Bakhvalov](https://easyperf.net/notes/) -- [Paul Khuong](https://pvk.ca/) -- [Pat Morin](https://cglab.ca/~morin/) -- [Victor Eijkhout](https://www.tacc.utexas.edu/about/directory/victor-eijkhout) -- [Robert van de Geijn](https://www.cs.utexas.edu/~rvdg/) -- [Edmond Chow](https://www.cc.gatech.edu/~echow/) -- [Peter 
Cordes](https://stackoverflow.com/users/224132/peter-cordes) -- [ridiculous_fish](https://ridiculousfish.com/blog/) -- [Geoff Langdale](https://branchfree.org/) -- [Matt Kulukundis](https://twitter.com/JuvHarlequinKFM) - -Volume: 300-400 pages -Release date: early 2022 - -### Part II: Parallel Algorithms - -Concurrency, models of parallelism, green threads and runtimes, cache coherence, synchronization primitives, OpenMP, reductions, scans, list ranking and graph algorithms, lock-free data structures, heterogeneous computing, CUDA, kernels, warps, blocks, matrix multiplication and sorting. - -Volume: 150-200 pages -Release date: late 2022 / 2023? - -### Part III: Distributed Computing - -Communication-constrained algorithms, message passing, actor model, partitioning, MapReduce, consistency and reliability at scale, storage, compression, scheduling and cloud computing, distributed deep learning. - -Release date: ??? - -### Part IV: Compilers and Domain-Specific Architectures - -LLVM IR, main optimization techniques from the dragon book, JIT-compilation, Cython, JAX, Numba, Julia, OpenCL, DPC++ and oneAPI, XLA, FPGAs and Verilog, ASICs, TPUs and other AI accelerators. - -Release date: ??? - -### Disclaimer: Technology Choices - -The examples in this book use C++, GCC, x86-64, CUDA and Spark, although the underlying principles we aim to convey are not specific to them. - -To clear my conscience, I'm not happy with any of these choices: these technologies just happen to be the most widespread and stable at the moment, and thus more helpful for the reader. I would have respectively picked C / Rust, LLVM, arm, OpenCL and Dask; maybe there will be a 2nd edition in which some of the tech stack is changed. 
- -### Planned New ToC - -Halfway through the book I've realized that very long (>10 pages) articles is perhaps not the best format for the web, and it would be better to increase the number of chapters and split the articles into smaller (5-8 pages) posts each covering one particular technique, so that the book can generate more readership with better Google rankings and referrals to specific topics. - -I have something like this in mind: +Planned table of contents: ``` -0. Preface: Why Go Beyond Big O -1. Computer Models +0. Preface +1. Complexity Models 1.1. Modern Hardware - 1.2. The "Speed" of Programming Languages - 1.3. The Relevance of Algorithmic Programming + 1.2. Programming Languages + 1.3. Models of Computation + 1.4. Levels of Optimization 2. Computer Architecture - 1.1. Introduction to Assembly - 1.2. Control Flow - 1.3. Loop Unrolling - 1.4. Operation Fusion - 1.5. Functions and Recursion - 1.6. Inlining - 1.7. Indirect Branching + 1.1. Instruction Set Architectures + 1.2. Assembly Language + 1.2. Loops and Conditionals + 1.3. Functions and Recursion + 1.4. Indirect Branching + 1.5. Interrupts and System Calls + 1.6. Machine Code Layout 3. Instruction-Level Parallelism - 3.1. Pipelining and Hazards - 3.2. Branchless Computing - 3.3. Throughput Computing - 3.4. µOps & Scheduling - 3.5. Theoretical Performance Limits + 3.1. Pipeline Hazards + 3.2. The Cost of Branching + 3.3. Branchless Programming + 3.4. Instruction Tables + 3.5. Instruction Scheduling + 3.6. Throughput Computing + 3.7. Theoretical Performance Limits 4. Compilation - 4.1. Negotiating with Compilers - 4.2. Stitching Programs Together + 4.1. Stages of Compilation + 4.2. Flags and Targets 4.3. Situational Optimizations - 4.4. Contracts and Undefined Behavior - 4.5. Memory Aliasing - 4.6. Arithmetic Optimizations - 4.7. Code Layout - 4.8. Compile-Time Computation + 4.4. Contracts Programming + 4.5. Non-Zero-Cost Abstractions + 4.6. Compile-Time Computation + 4.7. 
Arithmetic Optimizations + 4.8. What Compilers Can and Can't Do 5. Profiling 5.1. Instrumentation 5.2. Statistical Profiling 5.3. Program Simulation 5.4. Machine Code Analyzers - 5.5. Reducing Noise - 5.6. Benchmarking + 5.5. Benchmarking 6. Arithmetic 6.1. Floating-Point Numbers 6.2. Interval Arithmetic @@ -246,4 +129,72 @@ I have something like this in mind: (12.7. Probabilistic Filters) ``` -I will probably start refactoring once I'm done with the original plan, but it may start morphing before that. +Among cool things that we will speed up: + +- 2x faster GCD (compared to `std::gcd`) +- 5x faster binary search (compared to `std::lower_bound`) +- 7x faster segment trees +- 5x faster hash tables (compared to `std::unordered_map`) +- ~~?x faster popcount~~ +- 2x faster parsing series of integers (compared to `scanf`) +- ?x faster sorting (compared to `std::sort`) +- 2x faster sum (compared to `std::accumulate`) +- 100x faster matrix multiplication (compared to "for-for-for") +- optimal word-size integer factorization (~0.4ms per 60-bit integer) +- optimal Karatsuba Algorithm +- optimal FFT +- argmin at the speed of memory + +This work is largely based on blog posts, research papers, conference talks and other work authored by a lot of people: + +- [Agner Fog](https://agner.org/optimize/) +- [Daniel Lemire](https://lemire.me/en/#publications) +- [Andrei Alexandrescu](https://erdani.com/index.php/about/) +- [Chandler Carruth](https://twitter.com/chandlerc1024) +- [Wojciech Muła](http://0x80.pl/articles/index.html) +- [Malte Skarupke](https://probablydance.com/) +- [Travis Downs](https://travisdowns.github.io/) +- [Brendan Gregg](https://www.brendangregg.com/blog/index.html) +- [Andreas Abel](http://embedded.cs.uni-saarland.de/abel.php) +- [Jakob Kogler](https://cp-algorithms.com/) +- [Igor Ostrovsky](http://igoro.com/) +- [Steven Pigeon](https://hbfs.wordpress.com/) +- [Denis Bakhvalov](https://easyperf.net/notes/) +- [Paul Khuong](https://pvk.ca/) +- [Pat 
Morin](https://cglab.ca/~morin/) +- [Victor Eijkhout](https://www.tacc.utexas.edu/about/directory/victor-eijkhout) +- [Robert van de Geijn](https://www.cs.utexas.edu/~rvdg/) +- [Edmond Chow](https://www.cc.gatech.edu/~echow/) +- [Peter Cordes](https://stackoverflow.com/users/224132/peter-cordes) +- [Geoff Langdale](https://branchfree.org/) +- [Matt Kulukundis](https://twitter.com/JuvHarlequinKFM) +- [ridiculous_fish](https://ridiculousfish.com/blog/) +- [Creel](https://www.youtube.com/c/WhatsACreel) + +Volume: 300-400 pages +Release date: early 2022 + +### Part II: Parallel Algorithms + +Concurrency, models of parallelism, green threads and runtimes, cache coherence, synchronization primitives, OpenMP, reductions, scans, list ranking and graph algorithms, lock-free data structures, heterogeneous computing, CUDA, kernels, warps, blocks, matrix multiplication and sorting. + +Volume: 150-200 pages +Release date: late 2022 / 2023? + +### Part III: Distributed Computing + +Communication-constrained algorithms, message passing, actor model, partitioning, MapReduce, consistency and reliability at scale, storage, compression, scheduling and cloud computing, distributed deep learning. + +Release date: ??? + +### Part IV: Compilers and Domain-Specific Architectures + +LLVM IR, main optimization techniques from the dragon book, JIT-compilation, Cython, JAX, Numba, Julia, OpenCL, DPC++ and oneAPI, XLA, FPGAs and Verilog, ASICs, TPUs and other AI accelerators. + +Release date: ??? + +### Disclaimer: Technology Choices + +The examples in this book use C++, GCC, x86-64, CUDA and Spark, although the underlying principles we aim to convey are not specific to them. + +To clear my conscience, I'm not happy with any of these choices: these technologies just happen to be the most widespread and stable at the moment, and thus more helpful for the reader. 
I would have respectively picked C / Rust, LLVM, arm, OpenCL and Dask; maybe there will be a 2nd edition in which some of the tech stack is changed. From 57c5da1d307e079f61458acf62231345e12d2c54 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 25 Jan 2022 14:13:36 +0300 Subject: [PATCH 022/531] program simulation --- content/english/hpc/profiling/mca.md | 4 +- content/english/hpc/profiling/simulation.md | 111 +++++++++++++++++++- 2 files changed, 109 insertions(+), 6 deletions(-) diff --git a/content/english/hpc/profiling/mca.md b/content/english/hpc/profiling/mca.md index f9d52cec..7d20adf3 100644 --- a/content/english/hpc/profiling/mca.md +++ b/content/english/hpc/profiling/mca.md @@ -3,9 +3,7 @@ title: Machine Code Analyzers weight: 4 --- -The last approach to profiling is not to gather the data by actually running the program but to analyze what should happen by *simulating* it with specialized tools. There are many subcategories of such profilers, differing in which aspect of computation is simulated, but the one we are going to focus on in this section is *machine code analyzers*. - -A machine code analyzer is a program that takes a small snippet of assembly code and simulates its execution on a particular microarchitecture using information available to compilers, and outputs the latency and throughput of the whole block, as well as cycle-perfect utilization of various resources within the CPU. +A *machine code analyzer* is a program that takes a small snippet of assembly code and [simulates](../simulation) its execution on a particular microarchitecture using information available to compilers, and outputs the latency and throughput of the whole block, as well as cycle-perfect utilization of various resources within the CPU. 
### Using `llvm-mca` diff --git a/content/english/hpc/profiling/simulation.md b/content/english/hpc/profiling/simulation.md index a8026761..2f6c6dc6 100644 --- a/content/english/hpc/profiling/simulation.md +++ b/content/english/hpc/profiling/simulation.md @@ -1,9 +1,114 @@ --- -title: Simulation +title: Program Simulation weight: 3 -draft: true --- +The last approach to profiling (or rather a group of them) is not to gather the data by actually running the program but to analyze what should happen by *simulating* it with specialized tools. + + + +There are many subcategories of such profilers, differing in which aspect of computation is simulated. In this article, we are going to focus on [caching](/hpc/cpu-cache) and [branch prediction](/hpc/pipelining/branching), and use [Cachegrind](https://valgrind.org/docs/manual/cg-manual.html) for that, which is a profiling-oriented part of [Valgrind](https://valgrind.org/), a well-established tool for memory leak detection and memory debugging in general. + +### Profiling with Cachegrind + +Cachegrind essentially inspects the binary for "interesting" instructions — that perform memory reads / writes and conditional / indirect jumps — and replaces them with code that simulates corresponding hardware operations using software data structures. 
It therefore doesn't need access to the source code and can work with already compiled programs, and can be run on any program like this: + +```bash +valgrind --tool=cachegrind --branch-sim=yes ./run +# also simulate branch prediction ^ ^ any command, not necessarily one process +``` + +It instruments all involved binaries, runs them, and outputs a summary similar to [perf stat](../events): + +``` +I refs: 483,664,426 +I1 misses: 1,858 +LLi misses: 1,788 +I1 miss rate: 0.00% +LLi miss rate: 0.00% + +D refs: 115,204,359 (88,016,970 rd + 27,187,389 wr) +D1 misses: 9,722,664 ( 9,656,463 rd + 66,201 wr) +LLd misses: 72,587 ( 8,496 rd + 64,091 wr) +D1 miss rate: 8.4% ( 11.0% + 0.2% ) +LLd miss rate: 0.1% ( 0.0% + 0.2% ) + +LL refs: 9,724,522 ( 9,658,321 rd + 66,201 wr) +LL misses: 74,375 ( 10,284 rd + 64,091 wr) +LL miss rate: 0.0% ( 0.0% + 0.2% ) + +Branches: 90,575,071 (88,569,738 cond + 2,005,333 ind) +Mispredicts: 19,922,564 (19,921,919 cond + 645 ind) +Mispred rate: 22.0% ( 22.5% + 0.0% ) +``` + +We've fed Cachegrind exactly the same example code as in [the previous section](../events): we create an array of a million random integers, sort it, and then perform a million binary searches on it. Cachegrind shows roughly the same numbers as perf does, except that perf's measured numbers of memory reads and branches are slightly inflated due to [speculative execution](/hpc/pipelining): they really happen in hardware and thus increment hardware counters, but are discarded and don't affect actual performance, and are thus ignored in the simulation. + +Cachegrind only models the first (`D1` for data, `I1` for instructions) and the last (`LL`, unified) levels of cache, the characteristics of which are inferred from the system. It doesn't limit you in any way as you can also set them from the command line, e. g. to model the L2 cache: `--LL=,,`. + +It seems like it only slowed down our program so far and hasn't provided us any information that `perf stat` couldn't.
To get more out of it than just the summary info, we can inspect a special file with profiling info, which it dumps by default in the same directory named as `cachegrind.out.`. It is human-readable, but is expected to be read via the `cg_annotate` command: + +```bash +cg_annotate cachegrind.out.4159404 --show=Dr,D1mr,DLmr,Bc,Bcm +# ^ we are only interested in data reads and branches +``` + +First it shows the parameters that were used during the run, including the characteristics of the cache system: + +``` +I1 cache: 32768 B, 64 B, 8-way associative +D1 cache: 32768 B, 64 B, 8-way associative +LL cache: 8388608 B, 64 B, direct-mapped +``` + +It didn't get the L3 cache quite right: it is not unified (8M in total, but a single core only sees 4M) and also 16-way associative, but we will ignore that for now. + +Next, it outputs a per-function summary similar to `perf report`: + +``` +Dr D1mr DLmr Bc Bcm file:function +-------------------------------------------------------------------------------- +19,951,476 8,985,458 3 41,902,938 11,005,530 ???:query() +24,832,125 585,982 65 24,712,356 7,689,480 ???:void std::__introsort_loop<...> +16,000,000 60 3 9,935,484 129,044 ???:random_r +18,000,000 2 1 6,000,000 1 ???:random + 4,690,248 61,999 17 5,690,241 1,081,230 ???:setup() + 2,000,000 0 0 0 0 ???:rand +``` + +You can see there are a lot of branch mispredicts in the sorting stage, and also a lot of both L1 cache misses and branch mispredicts during binary searching. We couldn't get this information with perf: it would only tell us these counts for the whole program. + +Another great feature that Cachegrind has is the line-by-line annotation of source code. For that, you need to compile the program with debug information (`-g`) and either explicitly tell `cg_annotate` which source files to annotate or just pass the `--auto=yes` option so that it annotates everything it can reach (including the standard library source code).
+ +The whole source-to-analysis process would therefore go like this: + +```bash +g++ -O3 -g sort-and-search.cc -o run +valgrind --tool=cachegrind --branch-sim=yes --cachegrind-out-file=cachegrind.out ./run +cg_annotate cachegrind.out --auto=yes --show=Dr,D1mr,DLmr,Bc,Bcm +``` + +Since the glibc implementations are not the most readable, for exposition purposes, we replace `lower_bound` with our own binary search, which will be annotated like this: + +```c++ +Dr D1mr DLmr Bc Bcm + . . . . . int binary_search(int x) { + 0 0 0 0 0 int l = 0, r = n - 1; + 0 0 0 20,951,468 1,031,609 while (l < r) { + 0 0 0 0 0 int m = (l + r) / 2; +19,951,468 8,991,917 63 19,951,468 9,973,904 if (a[m] >= x) + . . . . . r = m; + . . . . . else + 0 0 0 0 0 l = m + 1; + . . . . . } + . . . . . return l; + . . . . . } +``` + +Unfortunately, Cachegrind only tracks memory accesses and branches. When the bottleneck is caused by something else, we need [other simulation tools](../mca). From 960564933f0d8431e4aa799c695bc0137e415e85 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 25 Jan 2022 19:12:06 +0300 Subject: [PATCH 023/531] hpc benchmarking --- content/english/hpc/profiling/benchmarking.md | 254 ++++++++---------- 1 file changed, 119 insertions(+), 135 deletions(-) diff --git a/content/english/hpc/profiling/benchmarking.md b/content/english/hpc/profiling/benchmarking.md index d423d28e..58bb8a0b 100644 --- a/content/english/hpc/profiling/benchmarking.md +++ b/content/english/hpc/profiling/benchmarking.md @@ -1,219 +1,203 @@ --- title: Benchmarking weight: 6 -draft: true --- + -Using C-style global defines instead of `const int`. 
+Most good software engineering practices in one way or another address the issue of making *development cycles* faster: you want to compile your software faster (build systems), catch bugs as soon as possible (static analysis, continuous integration), release as soon as the new version is ready (continuous deployment), and react to user feedback without much delay (agile development). -Similarly in header files, e. g. for data structures that share the construction stage. +Performance engineering is not different, and if you do it correctly, it should also resemble a cycle: -You might want to do something more complicated for performance-critical production code, but you are just prototyping, don't over-engineer it go with the simplest approach. +1. Run the program and collect metrics. +2. Figure out where the bottleneck is. +3. Remove the bottleneck and go to step 1. -It is also helpful to include and either read from the standard input or (if you are multiple ) +In this section, we will talk about benchmarking and discuss some practical techniques that make this cycle shorter and help you iterate faster. Most of the advice comes from working on this book, so you can find many real examples of described setups in the [code repository](https://github.com/sslotin/ahm-code) for this book. -### Writing Code +### Benchmarking Inside C++ -Separate all testing and analytics code from the implementation of the algorithm itself, and also different implementations from each other. In C/C++, you can do this by creating a single header file (e. g. `matmul.hh`) with a function interface and the code for its benchmarking in `main`, and many implementation files for each algorithm version (`v1.cc`, `v2.cc`, etc.) that all include that single header file. +There are several approaches to writing benchmarking code. 
Perhaps the most popular one is to include several same-language implementations you want to compare in one file, separately invoke them from the `main` function, and calculate all the metrics you want in the same source file. +The disadvantage of this method is that you need to write a lot of boilerplate code and duplicate it for each implementation, but it can be partially neutralized with metaprogramming. For example, when you are benchmarking multiple [gcd](/hpc/algorithms/gcd) implementations, you can reduce benchmarking code considerably with this higher-order function: ```c++ -#include -#include +const int N = 1e6, T = 1e9 / N; +int a[N], b[N]; -const int N = 1e6; +void timeit(int (*f)(int, int)) { + clock_t start = clock(); -#ifndef N -#define N (1<<20) -#endif -int main(int argc, char* argv[]) { - int n = (argc > 1 ? atoi(argv[1]) : N); - int m = (argc > 2 ? atoi(argv[2]) : 1<<20); + int checksum = 0; - clock_t start = clock(); + for (int t = 0; t < T; t++) + for (int i = 0; i < N; i++) + checksum ^= f(a[i], b[i]); + + float seconds = float(clock() - start) / CLOCKS_PER_SEC; - for (int i = 0; i < N; i++) - clock(); + printf("checksum: %d\n", checksum); + printf("%.2f ns per call\n", 1e9 * seconds / N / T); +} - float duration = float(clock() - start) / CLOCKS_PER_SEC; - printf("%.2fns\n", 1e9 * duration / N); +int main() { + for (int i = 0; i < N; i++) + a[i] = rand(), b[i] = rand(); + + timeit(std::gcd); + timeit(my_gcd); + timeit(my_another_gcd); + // ... return 0; } ``` -```c++ -#include +This is a very low-overhead method that lets you run more experiments and [get more accurate results](../noise) from them. You still have to perform some repeated actions, but they can be largely automated with frameworks, [Google benchmark library](https://github.com/google/benchmark) being the most popular choice for C++.
Some programming languages also have handy built-in tools for benchmarking: special mention here goes to [Python's timeit function](https://docs.python.org/3/library/timeit.html) and [Julia's @benchmark macro](https://github.com/JuliaCI/BenchmarkTools.jl). -#ifndef N -#define N (1<<20) -#endif +Although *efficient* in terms of execution speed, C and C++ are not the most *productive* languages, especially when it comes to analytics. When your algorithm depends on some parameters such as the input size, and you need to collect more than just one data point from each implementation, you really want to integrate your benchmarking code with the outside environment and analyze the results using something else. -void prepare(int *a, int n); -int lower_bound(int x); +### Splitting Up Implementations -int main(int argc, char* argv[]) { - int n = (argc > 1 ? atoi(argv[1]) : N); - int m = (argc > 2 ? atoi(argv[2]) : 1<<20); +One way to improve modularity and reusability is to separate all testing and analytics code from the actual implementation of the algorithm, and also make it so that different versions are implemented in separate files, but have the same interface. - int *a = new int[n]; - int *q = new int[m]; +In C/C++, you can do this by creating a single header file (e. g.
`gcd.hh`) with a function interface and all its benchmarking code in `main`: - for (int i = 0; i < n; i++) - a[i] = rand(); - for (int i = 0; i < m; i++) - q[i] = rand(); +```c++ +int gcd(int a, int b); // to be implemented - a[0] = RAND_MAX; - std::sort(a, a + n); +// for data structures, you also need to create a setup function +// (unless the same preprocessing step for all versions would suffice) - prepare(a, n); +int main() { + const int N = 1e6, T = 1e9 / N; + int a[N], b[N]; + // careful: local arrays are allocated on the stack and may cause stack overflow + // for large arrays, allocate with "new" or create a global array + + for (int i = 0; i < N; i++) + a[i] = rand(), b[i] = rand(); int checksum = 0; - clock_t start = clock(); - for (int i = 0; i < m; i++) - checksum ^= lower_bound(q[i]); + clock_t start = clock(); + for (int t = 0; t < T; t++) + for (int i = 0; i < N; i++) + checksum += gcd(a[i], b[i]); + float seconds = float(clock() - start) / CLOCKS_PER_SEC; - printf("%.2f ns per query\n", 1e9 * seconds / m); printf("%d\n", checksum); + printf("%.2f ns per call\n", 1e9 * seconds / N / T); return 0; } - ``` -take advantage of compile-time constants. If it is not needed, read parameters from the command line. - -### Makefiles - -- To speed up builds and reruns, create a Makefile or just a bunch of small scripts that calculate the statistics you may need. - +Then you create many implementation files for each algorithm version (e. g.
`v1.cc`, `v2.cc` and so on, or some meaningful names if applicable) that all include that single header file: ```c++ -compile = g++ -std=c++17 -O3 -march=native -Wall - -%: %.cc gcd.hh - $(compile) $< -o $@ - -%.s: %.cc gcd.hh - $(compile) -S -fverbose-asm $< -o $@ - -%.run: % - @./$< +#include "gcd.hh" -.PHONY: %.run +int gcd(int a, int b) { + if (b == 0) + return a; + else + return gcd(b, a % b); +} ``` -### Jupyter Notebooks - -- To speed up high-level analytics, create a Jupyter notebook where you put small scripts and do all the plots. You can also put build scripts there if you feel like it. - -### Benchmarking Inside C++ - -Less overhead, and it lets you run more experiments. - -For C++ specifically, https://github.com/google/benchmark -You need to install it. May make sense for your use case, not only if you work for Google. -opinionated towards a particular way of doing things - -Some languages also have embedded facilities for benchmarking. Props to Julia and IPython team. - -This isn't the universally best approach, but this is what I do. For something smaller, you may use this: +The whole purpose of doing this is to be able to benchmark a specific algorithm version from the command line without touching any source code files. For this purpose, you may also want to expose any parameters that it may have — for example, by parsing them from the command line arguments: ```c++ -void timeit(int (*f)(int, int)) { - clock_t start = clock(); - - volatile int checksum = 0; - - for (int i = 0; i < k; i++) - for (int j = 0; j < n; j++) - checksum += f(a[j], b[j]); - - float seconds = float(clock() - start) / CLOCKS_PER_SEC; +int main(int argc, char* argv[]) { + int N = (argc > 1 ? atoi(argv[1]) : 1e6); + const int T = 1e9 / N; - printf("%.2f ns per call\n", 1e9 * seconds / n / k); - - cout << double(clock() - start) / CLOCKS_PER_SEC << endl; + // ... } ``` -Then call it from `main` using several different implementations. 
+Another way to do it is to use C-style global defines and then pass them with the `-D N=...` flag during compilation: -## Measuring the Right Thing - -Also, make the dataset as representing of your real use case as possible, and reach an agreement with people on the procedure of benchmarking. This is especially important for data processing algorithms and data structures: most sorting algorithms perform differently depending on the input, hash tables perform differently with different distributions of keys. +```c++ +#ifndef N +#define N 1000000 +#endif -Interleaving +const int T = 1e9 / N; +``` -Similar to how Americans report pre-tax salary, Americans use non-PPP-adjusted stats, attention-seeking startups report revenue instead of profit, performance engineers report the best version of benchmark if not stated otherwise. +This way you can make use of compile-time constants, which may be very beneficial for performance of some algorithms, at the expense having to re-build the program each time you want to change the parameter, which considerably increases the time you need to collect metrics across a range of parameter values. -I have never seen people do that though. It makes most difference when comparing branchy and branch-free algorithms. +### Makefiles -```c++ -for (int i = 0; i < m; i++) - q[i] = rand(); + -int checksum = 0; +Splitting up source files allows you to speed up compilation using a caching build system such as [Make](https://en.wikipedia.org/wiki/Make_(software)). -for (int i = 0; i < m; i++) - checksum ^= lower_bound(q[i]); -``` +I usually carry a version of this Makefile across my projects: ```c++ -for (int i = 0; i < m; i++) - checksum ^= lower_bound(checksum ^ q[i]); -``` +compile = g++ -std=c++17 -O3 -march=native -Wall + +%: %.cc gcd.hh + $(compile) $< -o $@ -The best way to measure something is to plug it into real application. 
+%.s: %.cc gcd.hh + $(compile) -S -fverbose-asm $< -o $@ -You also may want to mark checksums as `volatile` to prevent the compiler from [optimizing too much](/hpc/cpu-cache/latency). +%.run: % + @./$< -When your algorithm only writes data and doesn't calculate any sort of checksum, you can use `__sync_synchronize()`, which acts as a memory fence to prevent the compiler from optimizing between iterations. +.PHONY: %.run +``` -People report things they like to report and leave out the things they don't. +You can now compile `example.cc` with `make example`, and automatically run it with `make example.run`. -Use random numbers. Not 1,2,3,4 because of branch prediction issues. You also better generate them ahead of time and use a fixed seed between invocations to minimize the noise and make the benchmark reproducible, which will help in debugging. +You can also add scripts for calculating statistics in the Makefile. -To put numbers in perspective, use statistics like "ns per query" or "cycles per byte" instead of wall clock whenever it is applicable. When you start to approach very high levels of performance, it makes sense to calculate what the theoretically maximal performance is and start thinking about your algorithm performance as a fraction of it. +### Jupyter Notebooks -## Reducing Noise +To speed up high-level analytics, you can create a Jupyter notebook where you put all your scripts and do all the plots. -Since we are guiding our optimization by experiments, it is important to account for side effects and external noise in them, especially when reporting results to someone else: +It is convenient to add a wrapper for benchmarking an implementation, which just returns a scalar result: -- Unless you are expecting a 2x kind of improvement, treat microbenchmarking the same way as A/B testing. 
When you run a program on a laptop for under a second, a ±5% fluctuation in performance is normal, so if you want to revert or keep a potential +1% improvement, run it until you reach a statistical significance, by calculating variances and p-values. -- Make sure there are no cold start effects due to cache. I usually solve this by making one cold test run where I check correctness of the algorithm, and then run it many times over for benchmarking (without checking correctness). -- If you benchmark a CPU-intensive algorithm, measure its performance in cycles using `perf stat`: this way it will be independent of clock frequency, fluctuations fo which is usually the main source of noise. -- Otherwise, set core frequency to the level you expect it to be and make sure nothing interferes with it. On Linux you can do it with `cpupower` (e. g. `sudo cpupower frequency-set -g powersave` to put it to minimum or `sudo cpupower frequency-set -g ondemand` to enable turbo boost). I use a [convenient GNOME shell extension](https://extensions.gnome.org/extension/1082/cpufreq/) that has a separate button to do it. +```python +def bench(source, n=2**20): + !make -s {source} + if _exit_code != 0: + raise Exception("Compilation failed") + res = !./{source} {n} {q} + duration = float(res[0].split()[0]) + return duration +``` -When running benchmarks, always quiesce the system: +Then you can use it to write clean analytics code: -- make sure no other jobs are running, -- turn turbo boost and hyper-threading off, -- turn off network and don't fiddle with the mouse, -- attach jobs to specific cores. +```python +ns = list(int(1.17**k) for k in range(30, 60)) +baseline = [bench('std_lower_bound', n=n) for n in ns] +results = [bench('my_binary_search', n=n) for n in ns] -It is very easy to get skewed results without doing anything obviously wrong. 
Even a program's name can affect its speed: the executable's name ends up in an environment variable, environment variables end up on the call stack, and so the length of the name affects stack alignment, which can result in data accesses slowing down due to crossing cache line or memory page boundaries. +# plotting relative speedup for different array sizes +import matplotlib.pyplot as plt -## Further Reading +plt.plot(ns, [x / y for x, y in zip(baseline, results)]) +plt.show() +``` -In you are interested, you can explore this comprehensive [list of experimental computer science resources](https://www.cs.huji.ac.il/w~feit/exp/related.html) by Dror Feitelson, perhaps starting with "[Producing Wrong Data Without Doing Anything Obviously Wrong](http://eecs.northwestern.edu/~robby/courses/322-2013-spring/mytkowicz-wrong-data.pdf)" by Todd Mytkowicz et al. +Once established, this workflow makes you iterate much faster and just focus on optimizing the algorithm itself. From ba65f4f0ae7f1eb780e02c736549251a54228ce2 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 25 Jan 2022 19:30:54 +0300 Subject: [PATCH 024/531] benchmarking edits --- content/english/hpc/profiling/benchmarking.md | 20 +++---------------- 1 file changed, 3 insertions(+), 17 deletions(-) diff --git a/content/english/hpc/profiling/benchmarking.md b/content/english/hpc/profiling/benchmarking.md index 58bb8a0b..7357f451 100644 --- a/content/english/hpc/profiling/benchmarking.md +++ b/content/english/hpc/profiling/benchmarking.md @@ -3,23 +3,9 @@ title: Benchmarking weight: 6 --- - - Most good software engineering practices in one way or another address the issue of making *development cycles* faster: you want to compile your software faster (build systems), catch bugs as soon as possible (static analysis, continuous integration), release as soon as the new version is ready (continuous deployment), and react to user feedback without much delay (agile development). 
-Performance engineering is not different, and if you do it correctly, it should also resemble a cycle: +Performance engineering is not different. If you do it correctly, it should also resemble a cycle: 1. Run the program and collect metrics. 2. Figure out where the bottleneck is. @@ -141,7 +127,7 @@ Another way to do it is to use C-style global defines and then pass them with th const int T = 1e9 / N; ``` -This way you can make use of compile-time constants, which may be very beneficial for performance of some algorithms, at the expense having to re-build the program each time you want to change the parameter, which considerably increases the time you need to collect metrics across a range of parameter values. +This way you can make use of compile-time constants, which may be very beneficial for the performance of some algorithms, at the expense of having to re-build the program each time you want to change the parameter, which considerably increases the time you need to collect metrics across a range of parameter values. ### Makefiles @@ -168,7 +154,7 @@ compile = g++ -std=c++17 -O3 -march=native -Wall You can now compile `example.cc` with `make example`, and automatically run it with `make example.run`. -You can also add scripts for calculating statistics in the Makefile. +You can also add scripts for calculating statistics in the Makefile, or incorporate it with `perf stat` calls to make profiling automatic. 
### Jupyter Notebooks From 006231b6b7e4266b4c2421ad93e6ac3dcc395df4 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 25 Jan 2022 21:35:52 +0300 Subject: [PATCH 025/531] removing noise --- content/english/hpc/profiling/noise.md | 144 +++++++++++++++++++++++++ 1 file changed, 144 insertions(+) create mode 100644 content/english/hpc/profiling/noise.md diff --git a/content/english/hpc/profiling/noise.md b/content/english/hpc/profiling/noise.md new file mode 100644 index 00000000..46d7d788 --- /dev/null +++ b/content/english/hpc/profiling/noise.md @@ -0,0 +1,144 @@ +--- +title: Getting Accurate Results +weight: 10 +--- + +It is not uncommon for there to be two library algorithm implementations, each maintaining its own benchmarking code, and each claiming to be faster than the other. This confuses everyone involved, especially the users, who have to somehow choose between the two. + +Situations like these are usually not caused by fraudulent actions by their authors; they just have different definitions of what "faster" means, and indeed, defining and using just one performance metric is often very problematic. + +### Measuring the Right Thing + +There are many things that can introduce bias into benchmarks. + +**Differing datasets.** There are many algorithms whose performance somehow depends on the dataset distribution. In order to define, for example, what the fastest sorting, shortest path, or binary search algorithms are, you have to fix the dataset on which the algorithm is run. + +This sometimes applies even to algorithms that process a single piece of input.
For example, it is not a good idea to feed GCD implementations sequential numbers because it makes branches very predictable:
+
+```c++
+// don't do this
+int checksum = 0;
+
+for (int a = 0; a < 1000; a++)
+    for (int b = 0; b < 1000; b++)
+        checksum ^= gcd(a, b);
+```
+
+However, if we sample numbers from the same range randomly, branch prediction becomes much harder, and the benchmark takes longer, even though the inputs come from the same range of values and only their order is different:
+
+```c++
+int a[1000], b[1000];
+
+for (int i = 0; i < 1000; i++)
+    a[i] = rand() % 1000, b[i] = rand() % 1000;
+
+int checksum = 0;
+
+for (int t = 0; t < 1000; t++)
+    for (int i = 0; i < 1000; i++)
+        checksum += gcd(a[i], b[i]);
+```
+
+
+Although the most logical choice in most cases is to just sample data uniformly at random, many real-world applications have distributions that are far from uniform, so you can't pick just one. In general, a good benchmark should be application-specific and use a dataset that is as representative of your real use case as possible.
+
+
+
+**Multiple objectives.** Some algorithm design problems have more than one key objective. For example, hash tables, in addition to being highly dependent on the distribution of keys, also need to carefully balance:
+
+- memory usage,
+- latency of add query,
+- latency of positive membership query,
+- latency of negative membership query.
+
+The only way to choose between hash table implementations is to try and put multiple variants into the application.
+
+**Latency vs Throughput.** Another aspect that people often overlook is that the execution time can be defined in more than one way, even for a single query.
+
+When you write code like this:
+
+```c++
+for (int i = 0; i < N; i++)
+    q[i] = rand();
+
+int checksum = 0;
+
+for (int i = 0; i < N; i++)
+    checksum ^= lower_bound(q[i]);
+```
+
+and then time the whole thing and divide it by the number of iterations, you are actually measuring the *throughput* of the query: how many operations it can process per unit of time. The resulting time per query is usually less than the time it actually takes to process one query on its own, because the CPU overlaps the execution of independent iterations.
+
+To measure actual *latency*, you need to introduce a dependency between the invocations:
+
+```c++
+for (int i = 0; i < N; i++)
+    checksum ^= lower_bound(checksum ^ q[i]);
+```
+
+This usually makes the most difference in algorithms prone to pipeline stalls, e. g. when comparing branchy and branch-free algorithms.
+
+**Cold cache.** Another source of bias is the *cold cache effect*, when memory reads initially take longer because the required data is not in cache yet.
+
+This is solved by making a *warm-up run* before starting measurements:
+
+```c++
+// warm-up run
+
+volatile int checksum = 0;
+
+for (int i = 0; i < N; i++)
+    checksum ^= lower_bound(q[i]);
+
+
+// actual run
+
+clock_t start = clock();
+checksum = 0;
+
+for (int i = 0; i < N; i++)
+    checksum ^= lower_bound(q[i]);
+```
+
+It is also sometimes convenient to combine the warm-up run with answer validation, if it is more complicated than just computing some sort of checksum.
+
+**Over-optimization.** Sometimes the benchmark is outright erroneous because the compiler just optimized the benchmarked code away. To prevent the compiler from cutting corners, you need to add checksums and either print them somewhere or add the `volatile` qualifier, which also prevents any sort of interleaving of loop iterations.
+
+For algorithms that only write data, you can use the `__sync_synchronize()` intrinsic to add a memory fence and prevent the compiler from accumulating updates.
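The points above (throughput versus latency, the warm-up run, and keeping a checksum alive) can be combined into one small harness. What follows is only a minimal sketch under stated assumptions, not the benchmarking code used by the author: `lower_bound_index`, `Timings`, and `measure` are hypothetical names introduced for illustration, the query under test is assumed to be a plain binary search over a sorted array, and `std::chrono::steady_clock` is used for timing instead of `clock()`.

```c++
#include <algorithm>
#include <cassert>
#include <chrono>
#include <random>
#include <vector>

// Hypothetical query under test: index of the first element >= x in a sorted array.
int lower_bound_index(const std::vector<int>& a, int x) {
    int l = 0, r = (int) a.size(); // the answer always lies in [l, r]
    while (l < r) {
        int m = l + (r - l) / 2;
        if (a[m] < x)
            l = m + 1;
        else
            r = m;
    }
    return l;
}

struct Timings {
    double throughput_ns; // average time per query when queries are independent
    double latency_ns;    // average time per query when each one depends on the previous result
    int checksum;         // returned so that the compiler cannot discard the measured work
};

// Assumes 0 < n < 2^30 and num_queries > 0.
Timings measure(int n, int num_queries, unsigned seed = 42) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> dist(0, (1 << 30) - 1);

    std::vector<int> a(n), q(num_queries);
    for (int& x : a) x = dist(rng);
    for (int& x : q) x = dist(rng);
    std::sort(a.begin(), a.end());

    // Warm-up run, combined with answer validation against the standard library
    int warm = 0;
    for (int x : q) {
        int expected = (int) (std::lower_bound(a.begin(), a.end(), x) - a.begin());
        int got = lower_bound_index(a, x);
        assert(got == expected);
        warm ^= got;
    }

    using steady = std::chrono::steady_clock;

    // Throughput: the queries are independent, so the CPU may overlap them
    int independent = 0;
    auto t0 = steady::now();
    for (int x : q)
        independent ^= lower_bound_index(a, x);
    auto t1 = steady::now();

    // Latency: each query value depends on the previous answer
    int chained = 0;
    for (int x : q)
        chained ^= lower_bound_index(a, chained ^ x);
    auto t2 = steady::now();

    std::chrono::duration<double, std::nano> d1 = t1 - t0, d2 = t2 - t1;
    return {d1.count() / num_queries, d2.count() / num_queries, warm ^ independent ^ chained};
}
```

Calling something like `measure(1 << 20, 1 << 20)` and printing all three fields gives two per-query times to compare. Printing the checksum matters as much as the timings, since it is the only thing forcing the compiler to keep the loops. The absolute numbers will still depend on the machine and on the sources of noise discussed in the next section.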
+
+### Reducing Noise
+
+
+
+The issues we've described produce *bias* in measurements: they consistently give an advantage to one algorithm over the other. There are other types of possible problems with benchmarking that result in either unpredictable skews or just completely random noise, thus increasing *variance*.
+
+These types of issues are caused by side effects and some sort of external noise, mostly due to noisy neighbors and CPU frequency scaling:
+
+- If you benchmark a compute-bound algorithm, measure its performance in cycles using `perf stat`: this way it will be independent of clock frequency, fluctuations of which are usually the main source of noise.
+- Otherwise, set the core frequency to what you expect it to be and make sure nothing interferes with it. On Linux you can do it with `cpupower` (e. g. `sudo cpupower frequency-set -g powersave` to put it to minimum or `sudo cpupower frequency-set -g ondemand` to enable turbo boost). I use a [convenient GNOME shell extension](https://extensions.gnome.org/extension/1082/cpufreq/) that has a separate button to do it.
+- If applicable, turn hyper-threading off and attach jobs to specific cores. Make sure no other jobs are running on the system, turn off networking and try not to fiddle with the mouse.
+
+You can't remove noise and bias completely. Even a program's name can affect its speed: the executable's name ends up in an environment variable, environment variables end up on the call stack, and so the length of the name affects stack alignment, which can result in data accesses slowing down due to crossing cache line or memory page boundaries.
+
+It is important to account for the noise when guiding optimizations and especially when reporting results to someone else. Unless you are expecting a 2x kind of improvement, treat all microbenchmarks the same way as A/B testing.
+
+When you run a program on a laptop for under a second, a ±5% fluctuation in performance is completely normal.
So, if you want to decide whether to revert or keep a potential +1% improvement, run it until you reach statistical significance, which you can determine by calculating variances and p-values. + +### Further Reading + +Interested readers can explore this comprehensive [list of experimental computer science resources](https://www.cs.huji.ac.il/w~feit/exp/related.html) by Dror Feitelson, perhaps starting with "[Producing Wrong Data Without Doing Anything Obviously Wrong](http://eecs.northwestern.edu/~robby/courses/322-2013-spring/mytkowicz-wrong-data.pdf)" by Todd Mytkowicz et al. From 88ba97641e153b9714d22aee8d87e7054a19fe44 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 25 Jan 2022 23:11:30 +0300 Subject: [PATCH 026/531] arithmetic intro --- content/english/hpc/arithmetic/_index.md | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/_index.md b/content/english/hpc/arithmetic/_index.md index 307ea536..686d15ab 100644 --- a/content/english/hpc/arithmetic/_index.md +++ b/content/english/hpc/arithmetic/_index.md @@ -3,6 +3,14 @@ title: Arithmetic weight: 6 --- -As we have seen in [the previous chapter](../analyzing-performance/gcd), knowing darker corners of the instruction set can be very fruitful, especially in the case of CISC platforms like x86, which currently has [somewhere between 1000 and 4000](https://stefanheule.com/blog/how-many-x86-64-instructions-are-there-anyway/) distinct instructions, depending on how you count. +As we repeatedly demonstrate throughout this book, knowing darker corners of the instruction set can be very fruitful, especially in the case of [CISC](/hpc/architecture/isa) platforms like x86, which currently has [somewhere between 1000 and 4000](https://stefanheule.com/blog/how-many-x86-64-instructions-are-there-anyway/) distinct instructions, depending on how you count. 
+
+Most of these instructions are related to arithmetic, and using them all efficiently to optimize arithmetic operations requires a great deal of knowledge, skill, and creativity. Therefore, in this chapter, we will discuss number representations and their use in numerical algorithms.
+
+

From bb615a2ce1e42a8b18d3452c35ccadc922fc6347 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Tue, 25 Jan 2022 23:11:37 +0300
Subject: [PATCH 027/531] split float

---
 content/english/hpc/arithmetic/float.md | 102 -----------------------
 content/english/hpc/arithmetic/ieee.md | 104 ++++++++++++++++++++++++
 2 files changed, 104 insertions(+), 102 deletions(-)
 create mode 100644 content/english/hpc/arithmetic/ieee.md

diff --git a/content/english/hpc/arithmetic/float.md b/content/english/hpc/arithmetic/float.md
index 2fe80f4c..31591f02 100644
--- a/content/english/hpc/arithmetic/float.md
+++ b/content/english/hpc/arithmetic/float.md
@@ -190,105 +190,3 @@ fp operator*(fp a, fp b) {
 Many applications that require higher levels of precision use software floating-point arithmetic in a similar fashion. Buf of course, you don't want to execute a sequence 10 or so instructions that this code compiles to each time you want to multiply two real numbers, so floating-point arithmetic is implemented in hardware — often in separate coprocessors due to its complexity.
 
 The FPU of x86 (often referred to as x87) has separate registers and its own tiny instruction set that supports memory operations, basic arithmetic, trigonometry and some common operations such logarithm, exponent and square root.
-
-## IEEE 754 Floats
-
-When we designed our DIY floating-point type, we omitted quite a lot of important little details:
-
-- How many bits do we dedicate for the mantissa and the exponent?
-- Does a "0" sign bit means "+" or is it the other way around?
-- How are these bits stored in memory?
-- How do we represent 0?
-- How exactly does rounding happen?
-- What happens if we divide by zero?
-- What happens if we take a square root of a negative number? -- What happens if increment the largest representable number? -- Can we somehow detect if one of the above three happened? - -Most of the early computers didn't have floating-point arithmetic, and when vendors started adding floating-point coprocessors, they had slightly different vision for what answers to those questions should be. Diverse implementations made it difficult to use floating-point arithmetic reliably and portably — particularly for people developing compilers. - -In 1985, the Institute of Electrical and Electronics Engineers published a standard (called [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754)) that provided a formal specification of how floating-point numbers should work, which was quickly adopted by the vendors and is now used in virtually all general-purpose computers. - -### Float Formats - -Similar to our handmade float implementation, hardware floats use one bit for sign and a variable number of bits for exponent and mantissa. For example, the standard 32-bit `float` encoding uses the first (highest) bit for sign, the next 8 bits for exponent, and the 23 remaining bits for mantissa. - -![](../img/float.svg) - -One of the reasons why they are stored in this exact order is so that it would be easier to compare and sort them: you can simply use largely the same comparator circuit as for [unsigned integers](../integer) — except for maybe flipping the bits in the case of negative numbers. - -IEEE 754 and a few consequent standards define not one, but *several* representations that differ in sizes, most notably: - -| Type | Sign | Exponent | Mantissa | Total bits | Approx. 
decimal digits | -|----------:|------|----------|----------|------------|------------------------| -| single | 1 | 8 | 23 | 32 | ~7.2 | -| double | 1 | 11 | 52 | 64 | ~15.9 | -| half | 1 | 5 | 10 | 16 | ~3.3 | -| extended | 1 | 15 | 64 | 80 | ~19.2 | -| quadruple | 1 | 15 | 112 | 128 | ~34.0 | -| bfloat16 | 1 | 8 | 7 | 16 | ~2.3 | - -Their availability ranges from chip to chip: - -- Most CPUs support single- and double-precision — which is what `float` and `double` types refer to in C. -- Extended formats are exclusive to x86, and are available in C as the `long double` type, which falls back to double precision on arm. The choice of 64 bits for mantissa is so that every `long long` integer can be represented exactly. There is also a 40-bit format that similarly allocates 32 mantissa bits. -- Quadruple as well as the 256-bit "octuple" formats are only used for specific scientific computations and are not supported by general-purpose hardware. -- Half-precision arithmetic only supports a small subset of operations, and is generally used for machine learning applications, especially neural networks, because they tend to do a large amount of calculation, but don't require a high level of precision. -- Half-precision is being gradually replaced by bfloat, which trades off 3 mantissa bits to have the same range as single-precision, enabling interoperability with it. It is mostly being adopted by specialized hardware: TPUs, FGPAs and GPUs. The name stands for "[Brain](https://en.wikipedia.org/wiki/Google_Brain) float". - -Lower precision types need less memory bandwidth to move them around and usually take less cycles to operate on (e. g. the division instruction may take $x$, $y$, or $z$ cycles depending on the type), which is why they are preferred when error tolerance allows it. 
- -Deep learning, emerging as a very popular and computationally-intensive field, created a huge demand for low-precision matrix multiplication, which led to manufacturers developing separate hardware or at least adding specialized instructions that support these types of computations — most notably, Google developing a custom chip called TPU (*tensor processing unit*) that specializes on multiplying 128-by-128 bfloat matrices, and NVIDIA adding "tensor cores", capable of performing 4-by-4 matrix multiplication in one go, to all their newer GPUs. - -Apart from their sizes, most of behavior is exactly the same between all floating-point types, which we will now clarify. - -## Handling Corner Cases - -The default way integer arithmetic deals with corner cases such as division by zero is to crash. - -Sometimes a software crash in turn causes a real, physical one. In 1996, the maiden flight of the [Ariane 5](https://en.wikipedia.org/wiki/Ariane_5) (the space launch vehicle that ESA uses to lift stuff into low Earth orbit) ended in [a catastrophic explosion](https://www.youtube.com/watch?v=gp_D8r-2hwk) due to the policy of aborting computation on arithmetic error, which in this case was a floating-point to integer conversion overflow, that led to the navigation system thinking that it was off course and making a large correction, eventually causing the disintegration of a $1B rocket. - -There is a way to gracefully handle corner cases such like these: hardware interrupts. When an exception occurs, CPU: - -- interrupts the execution of a program; -- packs every all relevant information into a data structure called "interrupt vector"; -- passes it to the operating system, which in turn either calls the handling code if it exists (the "try-except" block) or terminates the program otherwise. 
- -This is a complex mechanism that deserves an article of its own, but since this is a book about performance, the only thing you need to know is that they are quite slow and not desirable in a real-time systems such as navigating rockets. - -### NaNs and Infinities - -Floating-point arithmetic often deals with noisy, real-world data, and exceptions there are much more common than in the integer case. For this reason, the default behavior is different. Instead of crashing, the result is substituted with a special value without interrupting the executing, unless the programmer explicitly wants to. - -The first type of such values are the two infinities: a positive and a negative one. They are generated if the result of an operation can't fit within in the representable range, and they are treated as such in arithmetic. - -$$ -\begin{aligned} - -∞ < x &< ∞ -\\ ∞ + x &= ∞ -\\ x ÷ ∞ &= 0 -\end{aligned} -$$ - -What happens if we, say, divide a value by zero? Should it be a negative or a positive infinity? This case in actually unambiguous because, somewhat less intuitively, there are also two zeros: a positive and a negative one. - -$$ - \frac{1}{+0} = +∞ -\;\;\;\; \frac{1}{-0} = -∞ -$$ - -Zeros are encoded by setting all bits to zero, except for the sign bit in the negative case. Infinities are encoded by setting all their exponent bits to one and all mantissa bits to zero, but the sign bit distinguishing between a positive and a negative infinity. - -The other type is the "not-a-number” (NaN), which is generated as the result of mathematically incorrect operations: - -$$ -\log(-1),\; \arccos(1.01),\; ∞ − ∞,\; −∞ + ∞,\; 0 × ∞,\; 0 ÷ 0,\; ∞ ÷ ∞ -$$ - -There are two types of NaNs: a signalling NaN and a quiet NaN. A signalling NaN raises an exception flag, which may or may not cause an immediate hardware interrupt because on FPU configuration, while a quiet NaN just propagates through almost every arithmetic operation, resulting in more NaNs. 
- -Both NaNs are encoded as all their exponent set to ones and the mantissa part being everything other than all zeroes (to distinguish them from infinities). - -## Further Reading - -If you are so inclined, you can read the classic "[What Every Computer Scientist Should Know About Floating-Point Arithmetic](https://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf)" (1991) and [the paper introducing Grisu3](https://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf), the current state-of-the art for printing floating-point numbers. diff --git a/content/english/hpc/arithmetic/ieee.md b/content/english/hpc/arithmetic/ieee.md new file mode 100644 index 00000000..ac0b173d --- /dev/null +++ b/content/english/hpc/arithmetic/ieee.md @@ -0,0 +1,104 @@ +--- +title: IEEE 754 Floats +weight: 2 +--- + +When we designed our DIY floating-point type, we omitted quite a lot of important little details: + +- How many bits do we dedicate for the mantissa and the exponent? +- Does a "0" sign bit means "+" or is it the other way around? +- How are these bits stored in memory? +- How do we represent 0? +- How exactly does rounding happen? +- What happens if we divide by zero? +- What happens if we take a square root of a negative number? +- What happens if increment the largest representable number? +- Can we somehow detect if one of the above three happened? + +Most of the early computers didn't have floating-point arithmetic, and when vendors started adding floating-point coprocessors, they had slightly different vision for what answers to those questions should be. Diverse implementations made it difficult to use floating-point arithmetic reliably and portably — particularly for people developing compilers. 
+ +In 1985, the Institute of Electrical and Electronics Engineers published a standard (called [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754)) that provided a formal specification of how floating-point numbers should work, which was quickly adopted by the vendors and is now used in virtually all general-purpose computers. + +### Float Formats + +Similar to our handmade float implementation, hardware floats use one bit for sign and a variable number of bits for exponent and mantissa. For example, the standard 32-bit `float` encoding uses the first (highest) bit for sign, the next 8 bits for exponent, and the 23 remaining bits for mantissa. + +![](../img/float.svg) + +One of the reasons why they are stored in this exact order is so that it would be easier to compare and sort them: you can simply use largely the same comparator circuit as for [unsigned integers](../integer) — except for maybe flipping the bits in the case of negative numbers. + +IEEE 754 and a few consequent standards define not one, but *several* representations that differ in sizes, most notably: + +| Type | Sign | Exponent | Mantissa | Total bits | Approx. decimal digits | +|----------:|------|----------|----------|------------|------------------------| +| single | 1 | 8 | 23 | 32 | ~7.2 | +| double | 1 | 11 | 52 | 64 | ~15.9 | +| half | 1 | 5 | 10 | 16 | ~3.3 | +| extended | 1 | 15 | 64 | 80 | ~19.2 | +| quadruple | 1 | 15 | 112 | 128 | ~34.0 | +| bfloat16 | 1 | 8 | 7 | 16 | ~2.3 | + +Their availability ranges from chip to chip: + +- Most CPUs support single- and double-precision — which is what `float` and `double` types refer to in C. +- Extended formats are exclusive to x86, and are available in C as the `long double` type, which falls back to double precision on arm. The choice of 64 bits for mantissa is so that every `long long` integer can be represented exactly. There is also a 40-bit format that similarly allocates 32 mantissa bits. 
+- Quadruple as well as the 256-bit "octuple" formats are only used for specific scientific computations and are not supported by general-purpose hardware. +- Half-precision arithmetic only supports a small subset of operations, and is generally used for machine learning applications, especially neural networks, because they tend to do a large amount of calculation, but don't require a high level of precision. +- Half-precision is being gradually replaced by bfloat, which trades off 3 mantissa bits to have the same range as single-precision, enabling interoperability with it. It is mostly being adopted by specialized hardware: TPUs, FGPAs and GPUs. The name stands for "[Brain](https://en.wikipedia.org/wiki/Google_Brain) float". + +Lower precision types need less memory bandwidth to move them around and usually take less cycles to operate on (e. g. the division instruction may take $x$, $y$, or $z$ cycles depending on the type), which is why they are preferred when error tolerance allows it. + +Deep learning, emerging as a very popular and computationally-intensive field, created a huge demand for low-precision matrix multiplication, which led to manufacturers developing separate hardware or at least adding specialized instructions that support these types of computations — most notably, Google developing a custom chip called TPU (*tensor processing unit*) that specializes on multiplying 128-by-128 bfloat matrices, and NVIDIA adding "tensor cores", capable of performing 4-by-4 matrix multiplication in one go, to all their newer GPUs. + +Apart from their sizes, most of behavior is exactly the same between all floating-point types, which we will now clarify. + +## Handling Corner Cases + +The default way integer arithmetic deals with corner cases such as division by zero is to crash. + +Sometimes a software crash in turn causes a real, physical one. 
In 1996, the maiden flight of the [Ariane 5](https://en.wikipedia.org/wiki/Ariane_5) (the space launch vehicle that ESA uses to lift stuff into low Earth orbit) ended in [a catastrophic explosion](https://www.youtube.com/watch?v=gp_D8r-2hwk) due to the policy of aborting computation on arithmetic error, which in this case was a floating-point to integer conversion overflow, that led to the navigation system thinking that it was off course and making a large correction, eventually causing the disintegration of a $1B rocket. + +There is a way to gracefully handle corner cases such like these: hardware interrupts. When an exception occurs, CPU: + +- interrupts the execution of a program; +- packs every all relevant information into a data structure called "interrupt vector"; +- passes it to the operating system, which in turn either calls the handling code if it exists (the "try-except" block) or terminates the program otherwise. + +This is a complex mechanism that deserves an article of its own, but since this is a book about performance, the only thing you need to know is that they are quite slow and not desirable in a real-time systems such as navigating rockets. + +### NaNs and Infinities + +Floating-point arithmetic often deals with noisy, real-world data, and exceptions there are much more common than in the integer case. For this reason, the default behavior is different. Instead of crashing, the result is substituted with a special value without interrupting the executing, unless the programmer explicitly wants to. + +The first type of such values are the two infinities: a positive and a negative one. They are generated if the result of an operation can't fit within in the representable range, and they are treated as such in arithmetic. + +$$ +\begin{aligned} + -∞ < x &< ∞ +\\ ∞ + x &= ∞ +\\ x ÷ ∞ &= 0 +\end{aligned} +$$ + +What happens if we, say, divide a value by zero? Should it be a negative or a positive infinity? 
This case in actually unambiguous because, somewhat less intuitively, there are also two zeros: a positive and a negative one. + +$$ + \frac{1}{+0} = +∞ +\;\;\;\; \frac{1}{-0} = -∞ +$$ + +Zeros are encoded by setting all bits to zero, except for the sign bit in the negative case. Infinities are encoded by setting all their exponent bits to one and all mantissa bits to zero, but the sign bit distinguishing between a positive and a negative infinity. + +The other type is the "not-a-number” (NaN), which is generated as the result of mathematically incorrect operations: + +$$ +\log(-1),\; \arccos(1.01),\; ∞ − ∞,\; −∞ + ∞,\; 0 × ∞,\; 0 ÷ 0,\; ∞ ÷ ∞ +$$ + +There are two types of NaNs: a signalling NaN and a quiet NaN. A signalling NaN raises an exception flag, which may or may not cause an immediate hardware interrupt because on FPU configuration, while a quiet NaN just propagates through almost every arithmetic operation, resulting in more NaNs. + +Both NaNs are encoded as all their exponent set to ones and the mantissa part being everything other than all zeroes (to distinguish them from infinities). + +## Further Reading + +If you are so inclined, you can read the classic "[What Every Computer Scientist Should Know About Floating-Point Arithmetic](https://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf)" (1991) and [the paper introducing Grisu3](https://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf), the current state-of-the art for printing floating-point numbers. 
From 9d44476feaf8719a1dbe252b9114ba285d5c084f Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 25 Jan 2022 23:33:41 +0300 Subject: [PATCH 028/531] float edits --- content/english/hpc/arithmetic/float.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/english/hpc/arithmetic/float.md b/content/english/hpc/arithmetic/float.md index 31591f02..7feb2769 100644 --- a/content/english/hpc/arithmetic/float.md +++ b/content/english/hpc/arithmetic/float.md @@ -127,7 +127,7 @@ $$ and in 28 other ways that don't overflow the mantissa. -This can be problematic for some applications, such as comparisons or hashing. To fix this, we can *normalize* these representations using a certain convention. In decimal, the [standard form](https://en.wikipedia.org/wiki/Scientific_notation) is to always put the comma after the first digit (`6.022e23`), and for binary we can do the same: +This can be problematic for some applications, such as comparisons or hashing. To fix this, we can *normalize* these representations using a certain convention. In decimal, the [standard form](https://en.wikipedia.org/wiki/Scientific_notation) is to always put the comma after the first digit (`6.022e23`), and for binary, we can do the same: $$ 42 = 10101_2 = 1.0101_2 \times 2^5 @@ -187,6 +187,6 @@ fp operator*(fp a, fp b) { } ``` -Many applications that require higher levels of precision use software floating-point arithmetic in a similar fashion. Buf of course, you don't want to execute a sequence 10 or so instructions that this code compiles to each time you want to multiply two real numbers, so floating-point arithmetic is implemented in hardware — often in separate coprocessors due to its complexity. +Many applications that require higher levels of precision use software floating-point arithmetic in a similar fashion. 
But of course, you don't want to execute a sequence of 10 or so instructions that this code compiles to each time you want to multiply two real numbers, so on modern CPUs, floating-point arithmetic is implemented in hardware — usually as separate coprocessors due to its complexity. -The FPU of x86 (often referred to as x87) has separate registers and its own tiny instruction set that supports memory operations, basic arithmetic, trigonometry and some common operations such logarithm, exponent and square root. +The *floating-point unit* of x86 (often referred to as x87) has separate registers and its own tiny instruction set that supports memory operations, basic arithmetic, trigonometry, and some common operations such as logarithm, exponent, and square root. To make these operations properly work together, some additional details of floating-point number representation need to be clarified — which we will do in [the next section](../ieee). From 00c96fa3afca3e407ce8d057ace8bf07d633addf Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 25 Jan 2022 23:45:53 +0300 Subject: [PATCH 029/531] ieee float edits --- content/english/hpc/arithmetic/ieee.md | 36 +++++++++++++------------- 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/content/english/hpc/arithmetic/ieee.md b/content/english/hpc/arithmetic/ieee.md index ac0b173d..edc9ce5a 100644 --- a/content/english/hpc/arithmetic/ieee.md +++ b/content/english/hpc/arithmetic/ieee.md @@ -3,25 +3,25 @@ title: IEEE 754 Floats weight: 2 --- -When we designed our DIY floating-point type, we omitted quite a lot of important little details: +When we designed our [DIY floating-point type](../float), we omitted quite a lot of important little details: - How many bits do we dedicate for the mantissa and the exponent? -- Does a "0" sign bit means "+" or is it the other way around? +- Does a "0" sign bit mean "+", or is it the other way around? - How are these bits stored in memory? - How do we represent 0? 
- How exactly does rounding happen? - What happens if we divide by zero? -- What happens if we take a square root of a negative number? -- What happens if increment the largest representable number? +- What happens if we take the square root of a negative number? +- What happens if we increment the largest representable number? - Can we somehow detect if one of the above three happened? -Most of the early computers didn't have floating-point arithmetic, and when vendors started adding floating-point coprocessors, they had slightly different vision for what answers to those questions should be. Diverse implementations made it difficult to use floating-point arithmetic reliably and portably — particularly for people developing compilers. +Most of the early computers didn't have floating-point arithmetic, and when vendors started adding floating-point coprocessors, they had slightly different visions for what answers to those questions should be. Diverse implementations made it difficult to use floating-point arithmetic reliably and portably — particularly for people developing compilers. In 1985, the Institute of Electrical and Electronics Engineers published a standard (called [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754)) that provided a formal specification of how floating-point numbers should work, which was quickly adopted by the vendors and is now used in virtually all general-purpose computers. -### Float Formats +## Float Formats -Similar to our handmade float implementation, hardware floats use one bit for sign and a variable number of bits for exponent and mantissa. For example, the standard 32-bit `float` encoding uses the first (highest) bit for sign, the next 8 bits for exponent, and the 23 remaining bits for mantissa. +Similar to our handmade float implementation, hardware floats use one bit for sign and a variable number of bits for the exponent and the mantissa parts. 
For example, the standard 32-bit `float` encoding uses the first (highest) bit for sign, the next 8 bits for the exponent, and the 23 remaining bits for the mantissa. ![](../img/float.svg) @@ -43,34 +43,34 @@ Their availability ranges from chip to chip: - Most CPUs support single- and double-precision — which is what `float` and `double` types refer to in C. - Extended formats are exclusive to x86, and are available in C as the `long double` type, which falls back to double precision on arm. The choice of 64 bits for mantissa is so that every `long long` integer can be represented exactly. There is also a 40-bit format that similarly allocates 32 mantissa bits. - Quadruple as well as the 256-bit "octuple" formats are only used for specific scientific computations and are not supported by general-purpose hardware. -- Half-precision arithmetic only supports a small subset of operations, and is generally used for machine learning applications, especially neural networks, because they tend to do a large amount of calculation, but don't require a high level of precision. -- Half-precision is being gradually replaced by bfloat, which trades off 3 mantissa bits to have the same range as single-precision, enabling interoperability with it. It is mostly being adopted by specialized hardware: TPUs, FGPAs and GPUs. The name stands for "[Brain](https://en.wikipedia.org/wiki/Google_Brain) float". +- Half-precision arithmetic only supports a small subset of operations and is generally used for machine learning applications, especially neural networks, because they tend to do a large amount of calculation, but don't require a high level of precision. +- Half-precision is being gradually replaced by bfloat, which trades off 3 mantissa bits to have the same range as single-precision, enabling interoperability with it. It is mostly being adopted by specialized hardware: TPUs, FGPAs, and GPUs. The name stands for "[Brain](https://en.wikipedia.org/wiki/Google_Brain) float". 
-Lower precision types need less memory bandwidth to move them around and usually take less cycles to operate on (e. g. the division instruction may take $x$, $y$, or $z$ cycles depending on the type), which is why they are preferred when error tolerance allows it. +Lower precision types need less memory bandwidth to move them around and usually take fewer cycles to operate on (e. g. the division instruction may take $x$, $y$, or $z$ cycles depending on the type), which is why they are preferred when error tolerance allows it. Deep learning, emerging as a very popular and computationally-intensive field, created a huge demand for low-precision matrix multiplication, which led to manufacturers developing separate hardware or at least adding specialized instructions that support these types of computations — most notably, Google developing a custom chip called TPU (*tensor processing unit*) that specializes on multiplying 128-by-128 bfloat matrices, and NVIDIA adding "tensor cores", capable of performing 4-by-4 matrix multiplication in one go, to all their newer GPUs. -Apart from their sizes, most of behavior is exactly the same between all floating-point types, which we will now clarify. +Apart from their sizes, most of the behavior is exactly the same between all floating-point types, which we will now clarify. ## Handling Corner Cases The default way integer arithmetic deals with corner cases such as division by zero is to crash. -Sometimes a software crash in turn causes a real, physical one. 
In 1996, the maiden flight of the [Ariane 5](https://en.wikipedia.org/wiki/Ariane_5) (the space launch vehicle that ESA uses to lift stuff into low Earth orbit) ended in [a catastrophic explosion](https://www.youtube.com/watch?v=gp_D8r-2hwk) due to the policy of aborting computation on arithmetic error, which in this case was a floating-point to integer conversion overflow, that led to the navigation system thinking that it was off course and making a large correction, eventually causing the disintegration of a $1B rocket. +Sometimes a software crash, in turn, causes a real, physical one. In 1996, the maiden flight of the [Ariane 5](https://en.wikipedia.org/wiki/Ariane_5) (the space launch vehicle that ESA uses to lift stuff into low Earth orbit) ended in [a catastrophic explosion](https://www.youtube.com/watch?v=gp_D8r-2hwk) due to the policy of aborting computation on arithmetic error, which in this case was a floating-point to integer conversion overflow, that led to the navigation system thinking that it was off course and making a large correction, eventually causing the disintegration of a $1B rocket. -There is a way to gracefully handle corner cases such like these: hardware interrupts. When an exception occurs, CPU: +There is a way to gracefully handle corner cases like these: hardware interrupts. When an exception occurs, CPU: - interrupts the execution of a program; - packs every all relevant information into a data structure called "interrupt vector"; - passes it to the operating system, which in turn either calls the handling code if it exists (the "try-except" block) or terminates the program otherwise. -This is a complex mechanism that deserves an article of its own, but since this is a book about performance, the only thing you need to know is that they are quite slow and not desirable in a real-time systems such as navigating rockets. 
+This is a complex mechanism that deserves an article of its own, but since this is a book about performance, the only thing you need to know is that interrupts are quite slow and not desirable in real-time systems such as rocket navigation. ### NaNs and Infinities Floating-point arithmetic often deals with noisy, real-world data, and exceptions there are much more common than in the integer case. For this reason, the default behavior is different. Instead of crashing, the result is substituted with a special value without interrupting the execution, unless the programmer explicitly requests otherwise. -The first type of such values are the two infinities: a positive and a negative one. They are generated if the result of an operation can't fit within in the representable range, and they are treated as such in arithmetic. +The first type of such values is the two infinities: a positive and a negative one. They are generated if the result of an operation can't fit within the representable range, and they are treated as such in arithmetic. $$ \begin{aligned} @@ -80,14 +80,14 @@ $$ \end{aligned} $$ -What happens if we, say, divide a value by zero? Should it be a negative or a positive infinity? This case in actually unambiguous because, somewhat less intuitively, there are also two zeros: a positive and a negative one. +What happens if we, say, divide a value by zero? Should it be a negative or a positive infinity? This case is actually unambiguous because, somewhat less intuitively, there are also two zeros: a positive and a negative one. $$ \frac{1}{+0} = +∞ \;\;\;\; \frac{1}{-0} = -∞ $$ -Zeros are encoded by setting all bits to zero, except for the sign bit in the negative case. Infinities are encoded by setting all their exponent bits to one and all mantissa bits to zero, but the sign bit distinguishing between a positive and a negative infinity. +Zeros are encoded by setting all bits to zero, except for the sign bit in the negative case. 
Infinities are encoded by setting all their exponent bits to one and all mantissa bits to zero, with the sign bit distinguishing between positive and negative infinity. The other type is the "not-a-number” (NaN), which is generated as the result of mathematically incorrect operations: @@ -95,7 +95,7 @@ $$ \log(-1),\; \arccos(1.01),\; ∞ − ∞,\; −∞ + ∞,\; 0 × ∞,\; 0 ÷ 0,\; ∞ ÷ ∞ $$ -There are two types of NaNs: a signalling NaN and a quiet NaN. A signalling NaN raises an exception flag, which may or may not cause an immediate hardware interrupt because on FPU configuration, while a quiet NaN just propagates through almost every arithmetic operation, resulting in more NaNs. +There are two types of NaNs: a *signaling NaN* and a *quiet NaN*. A signaling NaN raises an exception flag, which may or may not cause immediate hardware interrupt depending on the FPU configuration, while a quiet NaN just propagates through almost every arithmetic operation, resulting in more NaNs. Both NaNs are encoded as all their exponent set to ones and the mantissa part being everything other than all zeroes (to distinguish them from infinities). 
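The special values and encodings described above can be checked directly. Below is a minimal sketch in C++, assuming that `float` is the 32-bit IEEE 754 format and that the code is compiled without `-ffast-math`; the helper names (`float_bits`, `overflow`, `divide_by_zero`, `inf_minus_inf`) are made up for this illustration and are not part of any library.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Reinterpret the bits of a float as a 32-bit unsigned integer
// (assumption: float is the IEEE 754 single-precision format).
uint32_t float_bits(float x) {
    static_assert(sizeof(float) == sizeof(uint32_t), "expects a 32-bit float");
    uint32_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

// The product does not fit into the representable range, so it becomes +inf.
// The volatile qualifier keeps the compiler from folding the expression.
float overflow() {
    volatile float big = std::numeric_limits<float>::max();
    return big * 2.0f;
}

// Dividing by a signed zero yields an infinity of the matching sign.
float divide_by_zero(float zero) {
    volatile float one = 1.0f;
    return one / zero;
}

// inf - inf is mathematically undefined, so the result is a NaN.
float inf_minus_inf() {
    volatile float inf = std::numeric_limits<float>::infinity();
    return inf - inf;
}
```

For example, `divide_by_zero(-0.0f)` should give negative infinity, and `float_bits(inf_minus_inf())` should have all eight exponent bits set together with a nonzero mantissa, which is what distinguishes a NaN from an infinity.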
From fe986928c52699740891d39c736ec40082aaf30e Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 25 Jan 2022 23:46:24 +0300 Subject: [PATCH 030/531] typo --- content/english/hpc/arithmetic/ieee.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/ieee.md b/content/english/hpc/arithmetic/ieee.md index edc9ce5a..451c295e 100644 --- a/content/english/hpc/arithmetic/ieee.md +++ b/content/english/hpc/arithmetic/ieee.md @@ -101,4 +101,4 @@ Both NaNs are encoded as all their exponent set to ones and the mantissa part be ## Further Reading -If you are so inclined, you can read the classic "[What Every Computer Scientist Should Know About Floating-Point Arithmetic](https://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf)" (1991) and [the paper introducing Grisu3](https://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf), the current state-of-the art for printing floating-point numbers. +If you are so inclined, you can read the classic "[What Every Computer Scientist Should Know About Floating-Point Arithmetic](https://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf)" (1991) and [the paper introducing Grisu3](https://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf), the current state-of-the-art for printing floating-point numbers. 
From f963850e1239653df5f82b41bde31d23586b030d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 25 Jan 2022 23:56:14 +0300 Subject: [PATCH 031/531] rounding errors edits --- content/english/hpc/arithmetic/errors.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/content/english/hpc/arithmetic/errors.md b/content/english/hpc/arithmetic/errors.md index d40276ab..45e2a3b2 100644 --- a/content/english/hpc/arithmetic/errors.md +++ b/content/english/hpc/arithmetic/errors.md @@ -7,27 +7,27 @@ The way rounding works in hardware floats is remarkably simple: it occurs if and Apart from the default mode (also known as Banker's rounding), you can [set](https://www.cplusplus.com/reference/cfenv/fesetround/) other rounding logic with 4 more modes: -* round to nearest, with ties always rounding "away" from zero; +* round to nearest, with perfect ties always rounding "away" from zero; * round up (toward $+∞$; negative results thus round toward zero); * round down (toward $-∞$; negative results thus round away from zero); -* round toward zero (truncation of the binary result). +* round toward zero (a truncation of the binary result). -The alternative rounding modes are also useful in diagnosing numerical instability. If the results of a subroutine vary substantially between rounding to the positive and negative infinities, then it indicates susceptibility to round-off errors. Is a better test than switching all computations to a lower precision and checking whether the result changed by too much, because the default rounding to nearest results in the right "expected" value given enough averaging: statistically, half of the time they are rounding up and the other are rounding down, so they cancel each other. +The alternative rounding modes are also useful in diagnosing numerical instability. 
If the results of a subroutine vary substantially between rounding to the positive and negative infinities, then it indicates susceptibility to round-off errors. This is a better test than switching all computations to a lower precision and checking whether the result changed by too much, because the default rounding to nearest yields the right "expected" value given enough averaging: statistically, half of the time the results are rounded up and the other half they are rounded down, so the errors cancel each other out. Note that while most operations with real numbers are commutative and associative, their rounding errors are not: even the result of $(x+y+z)$ depends on the order of summation. Compilers are not allowed to produce non-spec-compliant results, so this disables some potential optimizations that involve rearranging operands. You can disable this strict compliance with the `-ffast-math` flag in GCC and Clang, although you need to be aware that this lets compilers sometimes choose less precise computation paths. -It seems surprising to expect this guarantee from hardware that performs complex calculations such as natural logarithms and square roots, but this is it: you guaranteed to get the highest precision possible from all operations. This makes it remarkably easy to analyze round-off errors, as we will see in a bit. +It seems surprising to expect this guarantee from hardware that performs complex calculations such as natural logarithms and square roots, but this is it: you are guaranteed to get the highest precision possible from all operations. This makes it remarkably easy to analyze round-off errors, as we will see in a bit. 
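The rounding-mode diagnostic described above can be sketched as follows. This is a minimal illustration rather than a robust tool: it assumes that the platform supports the `FE_UPWARD` and `FE_DOWNWARD` modes from `<cfenv>`, and it uses a `volatile` accumulator so that the compiler does not fold or move the additions across the `fesetround` calls (GCC and Clang do not fully model the floating-point environment by default). The function names are hypothetical.

```cpp
#include <cfenv>
#include <vector>

// Sum the array under the given rounding mode (e.g. FE_UPWARD or FE_DOWNWARD),
// restoring the previous mode afterwards.
float sum_with_rounding(const std::vector<float>& a, int mode) {
    const int old_mode = std::fegetround();
    std::fesetround(mode);
    volatile float s = 0;  // volatile: every addition is actually performed, in order
    for (float x : a)
        s = s + x;
    std::fesetround(old_mode);
    return s;
}

// The gap between the two directed-rounding results:
// a large gap relative to the sum hints at susceptibility to round-off errors.
float rounding_spread(const std::vector<float>& a) {
    return sum_with_rounding(a, FE_UPWARD) - sum_with_rounding(a, FE_DOWNWARD);
}
```

If every intermediate sum is exactly representable, the spread is zero; otherwise, the two results bracket the exact sum, and the width of that bracket gives a rough idea of how much the rounding could have affected the answer.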
## Measuring and Mitigating Errors There are two natural ways to measure computational errors: -* The engineers who create hardware or spec-compliant exact software are concerned with *units in the last place* (ulps), which is the distance between two numbers in terms of how many representable numbers can fit between the precise real value and the actual result of computation. +* The engineers who create hardware or spec-compliant exact software are concerned with *units in the last place* (ulps), which is the distance between two numbers in terms of how many representable numbers can fit between the precise real value and the actual result of the computation. * People that are working on numerical algorithms care about *relative precision*, which is the absolute value of the approximation error divided by the real answer: $|\frac{v-v'}{v}|$. In either case, the usual tactic to analyze errors is to assume the worst case and simply bound them. -If you perform a single basic arithmetic operation, then the worst thing that can happen is the result rounding to the nearest representable number, meaning that the error in this case does not exceed 0.5 ulps. To reason about relative errors the same way, we can define a number $\epsilon$ called *machine epsilon*, equal to the difference between $1$ and the next representable value (which should be equal to 2 to the negative power of however many bits are dedicated to mantissa). +If you perform a single basic arithmetic operation, then the worst thing that can happen is the result rounding to the nearest representable number, meaning that the error does not exceed 0.5 ulps. To reason about relative errors the same way, we can define a number $\epsilon$ called *machine epsilon*, equal to the difference between $1$ and the next representable value (which should be equal to 2 to the negative power of however many bits are dedicated to mantissa). 
This means that if after a single arithmetic operation you get result $x$, then the real value is somewhere in the range @@ -60,7 +60,7 @@ for (int i = 0; i < n; i++) x *= a[i]; ``` -After the first multiplication, the value of $x$ relative to the value of the real product is bounded by $(1 + \epsilon)$, and after each additional multiplication this upper bound is multiplied by another $(1 + \epsilon)$. By induction, after $n$ multiplications, the computed value is bound by $(1 + \epsilon)^n = 1 + n \epsilon + O(\epsilon^2)$ and a similar lower bound. +After the first multiplication, the value of $x$ relative to the value of the real product is bounded by $(1 + \epsilon)$, and after each additional multiplication, this upper bound is multiplied by another $(1 + \epsilon)$. By induction, after $n$ multiplications, the computed value is bound by $(1 + \epsilon)^n = 1 + n \epsilon + O(\epsilon^2)$ and a similar lower bound. This implies that the relative error is $O(n \epsilon)$, which is sort of okay, because usually $n \ll \frac{1}{\epsilon}$. @@ -90,7 +90,7 @@ $$ If $x$ and $y$ are close in magnitude, the error will be $O(\epsilon \cdot |x|)$. -Under direct computation, the subtraction "magnifies" the errors of the squaring. But this can be fixed by instead using the following formula: +Under direct computation, the subtraction "magnifies" the errors of squaring. But this can be fixed by instead using the following formula: $$ f(x, y) = x^2 - y^2 = (x + y) \cdot (x - y) @@ -100,7 +100,7 @@ In this one, it is easy to show that the error is be bound by $\epsilon \cdot |x ### Kahan Summation -From previous example, we can see that long chains of operations are not a problem, but adding and subtracting numbers of different magnitude is. The general approach to dealing with such problems is to try to keep big numbers with big numbers and low numbers with low numbers. 
+From the previous example, we can see that long chains of operations are not a problem, but adding and subtracting numbers of different magnitude is. The general approach to dealing with such problems is to try to keep big numbers with big numbers and low numbers with low numbers. Consider the standard summation algorithm: @@ -141,7 +141,7 @@ for (int i = 0; i < n; i++) { This trick is known as *Kahan summation*. Its relative error is bounded by $2 \epsilon + O(n \epsilon^2)$: the first term comes from the very last summation, and the second term is due to the fact that we work with less-than-epsilon errors on each step. -Of course, a more general approach would be to switch to a more precise data type, like `double`, either way effectively squaring the machine epsilon. It can sort of be scaled by bundling two `double` variable together ne for storing the value, and another for its non-representable errors, so that they actually represent $a+b$. This approach is known as *double-double* arithmetic, and can be similarly generalized to define quad-double and higher precision arithmetic. +Of course, a more general approach would be to switch to a more precise data type, like `double`, either way effectively squaring the machine epsilon. It can sort of be scaled by bundling two `double` variables together: one for storing the value, and another for its non-representable errors, so that they actually represent $a+b$. This approach is known as *double-double* arithmetic, and it can be similarly generalized to define quad-double and higher precision arithmetic. + +When you fetch anything from memory, there is always some non-zero latency before the data arrives. Moreover, the request doesn't go directly to its ultimate storage location, but it first goes through an incredibly complex system of address translation units and caching layers designed to both help in memory management and reduce the latency. 
+ +Therefore, the only correct answer to this question is "it depends" — primarily on where the operands are stored: + +- If the data is stored in the main memory (RAM), it will take around ~100ns, or about 200 cycles, to fetch it, and then another 200 cycles to write it back. +- If it was accessed recently, it is probably *cached* and will take less than that to fetch, depending on how long ago it was accessed — it could be ~50 cycles for the slowest layer of cache and around 4-5 cycles for the fastest. +- But it could also be stored on some type of *external memory* such as a hard drive, and in this case, it will take around 5ms, or roughly $10^7$ cycles (!) to access it. + +Such high variance of memory performance is caused by the fact that memory hardware doesn't follow the same [laws of silicon scaling](/hpc/complexity/hardware) as CPU chips do. Memory is still improving through other means, but if 50 years ago memory timings were roughly on the same scale with the instruction latencies, nowadays they lag far behind. + +To be less of a limiting factor, modern memory systems are becoming increasingly [hierarchical](hierarchy), where the lower layers trade off some of their capacity for reduced latency. As these characteristics may change in the orders of magnitude between the layers — especially in the case of external memory types — it became crucial for many memory-intensive algorithms to optimize their IO operations before anything else. + +This prompted the creation of a new cost model, called the *external memory model*, whose only primitive operations are block reads and writes, and everything else has zero cost as long as it only involves data stored in a limited-sized local memory. It spawned an exciting new field of *external memory algorithms*, which we will study in this chapter. 
+ + From 3b7a106b6550953b23de7b9c364e905af6eab0aa Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 26 Jan 2022 18:26:56 +0300 Subject: [PATCH 043/531] memory hierarchy --- content/english/hpc/external-memory/_index.md | 2 +- .../english/hpc/external-memory/hierarchy.md | 32 ++++++++++--------- 2 files changed, 18 insertions(+), 16 deletions(-) diff --git a/content/english/hpc/external-memory/_index.md b/content/english/hpc/external-memory/_index.md index d8bacb87..576db8a7 100644 --- a/content/english/hpc/external-memory/_index.md +++ b/content/english/hpc/external-memory/_index.md @@ -29,7 +29,7 @@ Therefore, the only correct answer to this question is "it depends" — primaril Such high variance of memory performance is caused by the fact that memory hardware doesn't follow the same [laws of silicon scaling](/hpc/complexity/hardware) as CPU chips do. Memory is still improving through other means, but if 50 years ago memory timings were roughly on the same scale with the instruction latencies, nowadays they lag far behind. -To be less of a limiting factor, modern memory systems are becoming increasingly [hierarchical](hierarchy), where the lower layers trade off some of their capacity for reduced latency. As these characteristics may change in the orders of magnitude between the layers — especially in the case of external memory types — it became crucial for many memory-intensive algorithms to optimize their IO operations before anything else. +To be less of a limiting factor, modern memory systems are becoming increasingly [hierarchical](hierarchy), where the higher layers trade off some of their capacity for reduced latency. As these characteristics may change in the orders of magnitude between the layers — especially in the case of external memory types — it became crucial for many memory-intensive algorithms to optimize their IO operations before anything else. 
This prompted the creation of a new cost model, called the *external memory model*, whose only primitive operations are block reads and writes, and everything else has zero cost as long as it only involves data stored in a limited-sized local memory. It spawned an exciting new field of *external memory algorithms*, which we will study in this chapter. diff --git a/content/english/hpc/external-memory/hierarchy.md b/content/english/hpc/external-memory/hierarchy.md index 249457cc..c40934a1 100644 --- a/content/english/hpc/external-memory/hierarchy.md +++ b/content/english/hpc/external-memory/hierarchy.md @@ -3,9 +3,7 @@ title: Memory Hierarchy weight: 1 --- -## Memory Hierarchy - -Modern computer memory is hierarchical. It consists of multiple *cache layers* of varying speed and size, where *upper* levels typically store most frequently accessed data from *lower* levels to reduce latency. Each new level is usually an order of magnitude faster, but also smaller and/or more expensive. +Modern computer memory is highly hierarchical. It consists of multiple *cache layers* of varying speed and size, where *higher* levels typically store most frequently accessed data from *lower* levels to reduce latency: each next level is usually an order of magnitude faster, but also smaller and/or more expensive. ![](../img/hierarchy.png) @@ -17,7 +15,7 @@ From this perspective, each type of memory has a few important characteristics: - *block size* $B$; - *latency*, that is, how much time it takes to fetch one byte; - *bandwidth*, which may be higher than just the block size times latency, meaning that IO operations can "overlap"; -- *cost* in the amortized sense, including the price for chip, its energy requirements, maintenance and so on. +- *cost* in the amortized sense, including the price for the chip, its energy requirements, maintenance, and so on. 
Here is an approximate comparison table for commodity hardware in 2021: @@ -31,41 +29,45 @@ Here is an approximate comparison table for commodity hardware in 2021: | HDD | TBs | - | 10ms | 1G/s | 0.04 | | S3 | $\infty$ | - | 150ms | $\infty$ | 0.02[^S3] | -Of course, in reality there are many specifics about each type of memory, which we will now go through. +In reality, there are many specifics about each type of memory, which we will now go through. -[^pricing]: Pricing information is taken from Google Cloud Platform. -[^S3]: Cloud storage typically has multiple tiers, becoming progressively cheaper if you access the data less frequently. +[^pricing]: Pricing information is taken from the [Google Cloud Platform](https://cloud.google.com/products/calculator?skip_cache=true). +[^S3]: Cloud storage typically has [multiple tiers](https://aws.amazon.com/s3/storage-classes/), becoming progressively cheaper if you access the data less frequently. ### Volatile Memory -Everything up to the RAM level is called *volatile memory*, because it does not persist data in case of a power shortage and other disasters. It is fast, which is why it is used to store temporary data while the computer is powered. +Everything up to the RAM level is called *volatile memory* because it does not persist data in case of a power shortage and other disasters. It is fast, which is why it is used to store temporary data while the computer is powered. From fastest to slowest: -- **CPU registers**, which are the zero-time access data cells CPU uses to store all its intermediate values, can also be thought of as a memory type. There is only a very limited number of them (e. g. 16 "general purpose" ones), and in some cases you may want to use all of them for performance reasons. -- **CPU caches.** Modern CPUs have multiple layers of cache (L1, L2, often L3, and rarely even L4). The lowest layer is shared between cores and is usually scaled with the their number (e. g. 
a 10-core CPU should have around 10M of L3 cache). +- **CPU registers**, which are the zero-time access data cells the CPU uses to store all its intermediate values, can also be thought of as a memory type. There is only a limited number of them (e. g. 16 "general purpose" ones), and in some cases, you may want to use all of them for performance reasons. +- **CPU caches.** Modern CPUs have multiple layers of cache (L1, L2, often L3, and rarely even L4). The lowest layer is shared between cores and is usually scaled with their number (e. g. a 10-core CPU should have around 10M of L3 cache). - **Random access memory,** which is the first scalable type of memory: nowadays you can rent machines with half a terabyte of RAM on the public clouds. This is the one where most of your working data is supposed to be stored. The CPU cache system has an important concept of a *cache line*, which is the basic unit of data transfer between the CPU and the RAM. The size of a cache line is 64 bytes on most architectures, meaning that all main memory is divided into blocks of 64 bytes, and whenever you request (read or write) a single byte, you are also fetching all its 63 cache line neighbors whether you want them or not. -Caching on the CPU level happens automatically based on the last access times of cache lines. When accessed, the contents of a cache line are emplaced onto the lowest cache layer, and then gradually evicted to a higher levels unless accessed again in time. The programmer can't control this process explicitly, but it is worthwhile to study how it works in detail, which we will do [later](cpu-cache) in this chapter. +Caching on the CPU level happens automatically based on the last access times of cache lines. When accessed, the contents of a cache line are emplaced onto the lowest cache layer and then gradually evicted to higher levels unless accessed again in time. 
The programmer can't control this process explicitly, but it is worthwhile to study how it works in detail, which we will do [in the next chapter](/hpc/cpu-cache). + + + ### Non-Volatile Memory -While the data cells in CPU caches and the RAM only gently store just a few electrons (that periodically leak and need to be periodically refreshed), the data cells in *non-volatile memory* types store hundreds of them. This lets the data to be persisted for prolonged periods of time without power, but comes at the cost of performance and durability — because when you have more electrons, you also have more opportunities for them colliding with silicon atoms. +While the data cells in CPU caches and the RAM only gently store just a few electrons (that periodically leak and need to be periodically refreshed), the data cells in *non-volatile memory* types store hundreds of them. This lets the data be persisted for prolonged periods of time without power, but it comes at the cost of performance and durability — because when you have more electrons, you also have more opportunities for them to collide with silicon atoms. There are many ways to store data in a persistent way, but these are the main ones from a programmer's perspective: -- **Solid state drives.** These have relatively low latency on the order of 0.1ms ($10^5$ ns), but they also have high cost, amplified by the fact that they have limited lifespans as each cell can only be written to a limited number of times. This is what mobile devices and most laptops use, because they are compact and have no moving parts. +- **Solid state drives.** These have relatively low latency on the order of 0.1ms ($10^5$ ns), but they also have a high cost, amplified by the fact that they have limited lifespans as each cell can only be written to a limited number of times. This is what mobile devices and most laptops use because they are compact and have no moving parts. 
- **Hard disk drives** are unusual because they are actually [rotating physical disks](https://www.youtube.com/watch?v=3owqvmMf6No&feature=emb_title) with a read/write head attached to them. To read a memory location, you need to wait until the disk rotates to the right position and then very precisely move the head to it. This results in some very weird access patterns where reading one byte randomly may take the same time as reading the next 1MB of data — which is usually on the order of milliseconds. Since this is the only part of a computer, except for the cooling system, that has mechanically moving parts, hard disks break quite often (with the average lifespan of ~3 years for a data center HDD). -- **Network-attached storage**, which is the practice of using other networked devices to store data on them. There are two distinctive types. The first one is the Network File System (NFS), which is a protocol for mounting other computer's file system over the network. The other is API-based distributed storage systems, most famously [Amazon S3](https://aws.amazon.com/s3/), that are backed by a fleet of storage-optimized machines of a public cloud, typically using cheap HDDs or some [more exotic](https://aws.amazon.com/storagegateway/vtl/) storage types internally. While NFS can can sometimes work even faster than HDD if located in the same data center, object storage in the public cloud usually has latencies of 50-100ms. They are typically highly distributed and replicated for better availability. +- **Network-attached storage**, which is the practice of using other networked devices to store data on them. There are two distinctive types. The first one is the Network File System (NFS), which is a protocol for mounting the file system of another computer over the network. 
The other type is API-based distributed storage systems, most famously [Amazon S3](https://aws.amazon.com/s3/), that are backed by a fleet of storage-optimized machines of a public cloud, typically using cheap HDDs or some [more exotic](https://aws.amazon.com/storagegateway/vtl/) storage types internally. While NFS can sometimes work even faster than HDD if it is located in the same data center, object storage in the public cloud usually has latencies of 50-100ms. They are typically highly distributed and replicated for better availability. Since SSDs and HDDs are noticeably slower than RAM, everything on or below this level is usually called *external memory*. -Unlike the CPU caches, external memory can be explicitly controlled. This is useful in many cases, but most programmers just want to abstract away from it and use it as an extension of the main memory, and operating systems have the capability to do so by the virtue of *memory paging*. +Unlike the CPU caches, external memory can be explicitly controlled. This is useful in many cases, but most programmers just want to abstract away from it and use it as an extension of the main memory, and operating systems have the capability to do so by means of [virtual memory](../virtual). From b010428ecf934a303dfe573988488f38e1ab16ce Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 26 Jan 2022 20:23:44 +0300 Subject: [PATCH 044/531] virtual memory --- .../english/hpc/external-memory/virtual.md | 49 ++++++++++++++++--- 1 file changed, 41 insertions(+), 8 deletions(-) diff --git a/content/english/hpc/external-memory/virtual.md b/content/english/hpc/external-memory/virtual.md index dbfa1594..5a294993 100644 --- a/content/english/hpc/external-memory/virtual.md +++ b/content/english/hpc/external-memory/virtual.md @@ -3,17 +3,50 @@ title: Virtual Memory weight: 2 --- -Modern operating systems give every process the impression that it is working with large, contiguous sections of memory, called *virtual memory*. 
Physically, the memory allocated to each process may be dispersed across different areas of physical memory, or may have been moved to another storage such as SSD or HDD. +Early operating systems gave every process the freedom of reading and modifying any memory region they want, including those allocated for other processes. While this keeps things simple, it also poses some problems: -Do achieve this, the address space of the virtual memory is divided into *pages* (typically 4KB in size), and the memory system maintains a separate hardware data structure called *page table*, which points to where the data is physically stored for each page. When a process requests access to data in its memory, the operating system maps the virtual address to the physical address through the page table and forwards the read/write request to where that data is actually stored. +- What if one of the processes is buggy or outright malicious? How do we prevent it from modifying the memory allocated for other processes while still keeping inter-process communication through memory possible? +- How do we deal with memory fragmentation? Say, we have 4MB of memory, process A allocates the first 1MB for itself, then process B claims the next 2MB, then A terminates and releases its memory, and then process C comes and asks for a contiguous 2MB region — and can't get it because we only have two separate 1MB slices. Restarting process B or somehow stopping it and shifting all its data and pointers by one megabyte doesn't seem like a good solution. +- How do we access non-RAM memory types? How do we plug a flash drive and read a specific file from it? -Since the address translation needs to be done for each memory request, this process is also cached with what's called *translation lookaside buffer* (TLB), which is just a very small cache for physical page addresses. When it doesn't hit, you essentially pay double the cost of a memory access. 
For this reason, some operating systems have support for larger pages (~2MB). +These problems are not that critical for some specialized computer systems such as GPUs, where you typically solve just one task at a time and have full control over the computation, but they are absolutely essential for modern multitasking operating systems — and they solve all these problems with a technique called *virtual memory*. -![From John Bell\'s OS course at University of Illinois](../img/virtual-memory.jpg) +### Memory Paging -This mechanism allows using external memory quite transparently. Operating systems have two basic mechanisms: +Virtual memory gives each process the impression that it fully controls a contiguous region of memory, which in reality may be mapped to multiple smaller blocks of the physical memory — which includes both the main memory (RAM) and external memory (HDD, SSD). -- *Swap files*, which let the operating system automatically use parts of an SDD or an HDD as an extension of RAM when there is not enough real RAM. -- [Memory mapping](https://en.wikipedia.org/wiki/Mmap), which lets you open a file a use its contents as if they were in the main memory. +![](../img/virtual-memory.jpg) -This essentially turns your RAM into "L4 cache" for the external memory, which is a good way to reason about it. +To achieve this, the memory address space is divided into *pages* (typically 4KB in size), which are the base units of memory that the programs can request from the operating system. The memory system maintains a special hardware data structure called the *page table*, which contains the mappings of virtual page addresses to the physical ones. When a process accesses data using its virtual memory address, the memory system calculates its page number (by right-shifting it by $12$ if $4096=2^{12}$ is the page size), looks up in the page table what its physical address is, and forwards the read or write request to where that data is actually stored.
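The translation step described above can be sketched in a few lines of C++. This is only a toy model under stated assumptions: the page size is fixed at $4096 = 2^{12}$ bytes, and the `PageTable` type and the `translate` function are hypothetical names introduced for illustration (real page tables are multi-level hardware structures, not hash maps).

```cpp
#include <cstdint>
#include <unordered_map>

// Assumed page size of 4096 = 2^12 bytes, as in the text
constexpr unsigned PAGE_BITS = 12;
constexpr uint64_t PAGE_SIZE = 1ull << PAGE_BITS;

// Hypothetical toy page table: virtual page number -> physical page number
using PageTable = std::unordered_map<uint64_t, uint64_t>;

// Translates a virtual address into a physical one.
// Returns false if the page is not mapped (a "page fault" in this toy model).
bool translate(const PageTable &table, uint64_t vaddr, uint64_t &paddr) {
    uint64_t page = vaddr >> PAGE_BITS;         // virtual page number
    uint64_t offset = vaddr & (PAGE_SIZE - 1);  // position inside the page
    auto it = table.find(page);
    if (it == table.end())
        return false;
    // the offset inside the page is the same in both address spaces
    paddr = (it->second << PAGE_BITS) | offset;
    return true;
}
```

For instance, with a made-up table that maps virtual page 1 to physical page 3, the virtual address `0x1234` (page 1, offset `0x234`) translates to `0x3234`.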
+ +Since the address translation needs to be done for each memory request, and the number of memory pages itself may be large (e.g. 16G RAM / 4K page size = 4M pages), address translation poses a difficult problem in itself. One way to speed it up is to use a special cache for the page table itself called the *translation lookaside buffer* (TLB), and the other is to increase the page size so that the total number of memory pages is made smaller at the cost of reduced granularity. + + + +### Mapping External Memory + +The mechanism of virtual memory also allows using external memory types quite transparently. Modern operating systems support [memory mapping](https://en.wikipedia.org/wiki/Mmap), which lets you open a file and use its contents as if they were in the main memory: + +```c++ +// open a file containing 1024 random integers for reading and writing +int fd = open("input.bin", O_RDWR); +// map 4096 bytes of it into memory, allowing reads and writes and writing changes back to the file +int* data = (int*) mmap(0, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); +// sort it as if it were a normal integer array +std::sort(data, data + 1024); +// changes are eventually propagated to the file +``` + +Here we map a 4K file, which can fit entirely on a just a single memory page, but when we open larger files, its reads will be done lazily when we request a certain page, and its writes will be buffered and committed to the file system when the operating decides to (usually on the program termination or when the system runs out of RAM). + +A technique that has the same operating principle, but reverse intention is the *swap file*, which let the operating system automatically use parts of an SDD or an HDD as an extension of the main memory when there is not enough real RAM. This lets the systems that run out of memory just terribly slow down instead of crashing.
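Returning to the memory-mapping snippet: it omits headers, error handling, and cleanup. A more complete sketch might look like the following, assuming a POSIX system. The function name `sort_mapped_file` is hypothetical, and the explicit `msync` call is an addition that is not in the original snippet.

```cpp
#include <algorithm>
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Sorts `count` ints stored in the file at `path` in place through a memory mapping.
// Returns false if any of the system calls fails.
bool sort_mapped_file(const char *path, size_t count) {
    int fd = open(path, O_RDWR);
    if (fd == -1)
        return false;
    size_t bytes = count * sizeof(int);
    // MAP_SHARED makes the writes visible in the underlying file
    void *ptr = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED) {
        close(fd);
        return false;
    }
    int *data = static_cast<int *>(ptr);
    std::sort(data, data + count);
    msync(data, bytes, MS_SYNC);  // ask the OS to flush the changes now
    munmap(data, bytes);
    close(fd);
    return true;
}
```

Without the `msync` call the result would be the same in the end, only the moment when the changes reach the file would be left to the operating system.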
+ +This seamless integration of the main and external memory basically turns RAM into "L4 cache" for the external memory, which is a good way to think about it from the algorithm design perspective. From e3f02ae812afbe3b014fa6a8d1697a7d5e37eb8c Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 26 Jan 2022 20:25:53 +0300 Subject: [PATCH 045/531] virtual memory edits --- content/english/hpc/external-memory/virtual.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/english/hpc/external-memory/virtual.md b/content/english/hpc/external-memory/virtual.md index 5a294993..fc3f8eed 100644 --- a/content/english/hpc/external-memory/virtual.md +++ b/content/english/hpc/external-memory/virtual.md @@ -45,8 +45,8 @@ std::sort(data, data + 1024); // changes are eventually propagated to the file ``` -Here we map a 4K file, which can fit entirely on a just a single memory page, but when we open larger files, its reads will be done lazily when we request a certain page, and its writes will be buffered and committed to the file system when the operating decides to (usually on the program termination or when the system runs out of RAM). +Here we map a 4K file, which can fit entirely on just a single memory page, but when we open larger files, their reads will be done lazily when we request a certain page, and their writes will be buffered and committed to the file system when the operating system decides to (usually on the program termination or when the system runs out of RAM). -A technique that has the same operating principle, but reverse intention is the *swap file*, which let the operating system automatically use parts of an SDD or an HDD as an extension of the main memory when there is not enough real RAM. This lets the systems that run out of memory just terribly slow down instead of crashing.
+A technique that has the same operating principle, but the reverse intention is the *swap file*, which lets the operating system automatically use parts of an SSD or an HDD as an extension of the main memory when there is not enough real RAM. This lets the systems that run out of memory just terribly slow down instead of crashing. -This seamless integration of the main and external memory basically turns RAM into "L4 cache" for the external memory, which is a good way to think about it from the algorithm design perspective. +This seamless integration of the main and external memory essentially turns RAM into an "L4 cache" for the external memory, which is a convenient way to think about it from the algorithm design perspective. From cd49f730fdbe7f431802af9b812418f4c44b33e8 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 26 Jan 2022 20:42:00 +0300 Subject: [PATCH 046/531] external memory model --- content/english/hpc/external-memory/model.md | 49 ++++++++++++++----- .../english/hpc/external-memory/sorting.md | 2 + 2 files changed, 39 insertions(+), 12 deletions(-) diff --git a/content/english/hpc/external-memory/model.md b/content/english/hpc/external-memory/model.md index 0b0b33f2..841a5600 100644 --- a/content/english/hpc/external-memory/model.md +++ b/content/english/hpc/external-memory/model.md @@ -1,28 +1,30 @@ --- -title: Cache-Aware Model +title: External Memory Model weight: 3 --- -To reason about performance of memory-bound algorithms, we need to develop a cost model that is more sensitive to expensive block IO operations, but is not too rigorous to still be useful. +To reason about the performance of memory-bound algorithms, we need to develop a cost model that is more sensitive to expensive block IO operations but is not too rigorous to still be useful. -In the standard RAM model, we ignore the fact that primitive operations take unequal time to complete.
Most importantly, it does not differentiate between operations on different types of memory, equating a read from RAM taking ~50ns in real-time with a read from HDD taking ~5ms, or about a $10^5$ times as much. +### Cache-Aware Model -Similar in spirit, in *external memory model*, we simply ignore every operation that is not an I/O operation. More specifically, we consider one level of cache hierarchy and assume the following about the hardware and the problem: +In the [standard RAM model](/hpc/complexity), we ignore the fact that primitive operations take unequal time to complete. Most importantly, it does not differentiate between operations on different types of memory, equating a read from RAM taking ~50ns in real-time with a read from HDD taking ~5ms, or about a $10^5$ times as much. + +Similar in spirit, in the *external memory model*, we simply ignore every operation that is not an I/O operation. More specifically, we consider one level of cache hierarchy and assume the following about the hardware and the problem: - The size of the dataset is $N$, and it is all stored in *external* memory, which we can read and write in blocks of $B$ elements in a unit time (reading a whole block and just one element takes the same time). - We can store $M$ elements in *internal* memory, meaning that we can store up to $\left \lfloor \frac{M}{B} \right \rfloor$ blocks. -- We only care about I/O operations: any computations done in-between reads and writes are free. +- We only care about I/O operations: any computations done in-between the reads and the writes are free. - We additionally assume $N \gg M \gg B$. -In this model, we measure performance of the algorithm in terms of its high-level *I/O operations*, or *IOPS* — that is, the total number of blocks read or written to external memory during execution. 
+In this model, we measure the performance of an algorithm in terms of its high-level *I/O operations*, or *IOPS* — that is, the total number of blocks read or written to external memory during execution. We will mostly focus on the case where the internal memory is RAM and external memory is SSD or HDD, although the underlying analysis techniques that we will develop are applicable to any layer in the cache hierarchy. Under these settings, reasonable block size $B$ is about 1MB, internal memory size $M$ is usually a few gigabytes, and $N$ is up to a few terabytes. -## Array Scan +### Array Scan -As a simple example, when we calculate the sum of array by iterating through it one element at a time, we implicitly load it by chunks of $O(B)$ elements and, in terms of external memory model, process these chunks one by one: +As a simple example, when we calculate the sum of an array by iterating through it one element at a time, we implicitly load it by chunks of $O(B)$ elements and, in terms of the external memory model, process these chunks one by one: $$ \underbrace{a_1, a_2, a_3,} _ {B_1} @@ -31,14 +33,37 @@ $$ \underbrace{a_{n-3}, a_{n-2}, a_{n-1}} _ {B_{m-1}} $$ -Thus, in external memory model, the complexity of summation and other linear array scans is +Thus, in the external memory model, the complexity of summation and other linear array scans is $$ SCAN(N) \stackrel{\text{def}}{=} O\left(\left \lceil \frac{N}{B} \right \rceil \right) \; \text{IOPS} $$ -Note that, in most cases, operating systems do this automatically. Even when the data is just redirected to the standard input from a normal file, the operating system buffers its stream and reads it in blocks of ~4KB (by default). 
+You can implement external array scan explicitly like this: + +```c++ +FILE *input = fopen("input.bin", "rb"); + +const int M = 1024; +int buffer[M], sum = 0; + +// while the file is not fully processed +while (true) { + // read up to M of 4-byte elements from the input stream + int n = fread(buffer, 4, M, input); + // ^ the number of elements that were actually read + + // if we can't read any more elements, finish + if (n == 0) + break; + + // sum elements in-memory + for (int i = 0; i < n; i++) + sum += buffer[i]; +} - +fclose(input); +printf("%d\n", sum); +``` -Now, let's slowly build up more complex things. The goal of this article is to eventually get to *external sorting* and its interesting applications. It will be based on the standard merge sort, so we need to derive a few of its primitives first. +Note that, in most cases, operating systems do this buffering automatically. Even when the data is just redirected to the standard input from a normal file, the operating system buffers its stream and reads it in blocks of ~4KB (by default). diff --git a/content/english/hpc/external-memory/sorting.md b/content/english/hpc/external-memory/sorting.md index 55caafa6..b3c2e5a1 100644 --- a/content/english/hpc/external-memory/sorting.md +++ b/content/english/hpc/external-memory/sorting.md @@ -3,6 +3,8 @@ title: External Sorting weight: 4 --- +Now, let's slowly build up more complex things. The goal of this article is to eventually get to *external sorting* and its interesting applications. It will be based on the standard merge sort, so we need to derive a few of its primitives first. + ## Merge **Problem:** given two sorted arrays $a$ and $b$ of lengths $N$ and $M$, produce a single sorted array $c$ of length $N + M$ containing all of their elements. 
From 1c8498df918112fb0a5836622b27b7e6db122585 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 26 Jan 2022 20:46:09 +0300 Subject: [PATCH 047/531] fixing inaccuracies --- content/english/hpc/external-memory/hierarchy.md | 2 +- content/english/hpc/external-memory/list-ranking.md | 5 +---- 2 files changed, 2 insertions(+), 5 deletions(-) diff --git a/content/english/hpc/external-memory/hierarchy.md b/content/english/hpc/external-memory/hierarchy.md index c40934a1..894eeeb0 100644 --- a/content/english/hpc/external-memory/hierarchy.md +++ b/content/english/hpc/external-memory/hierarchy.md @@ -21,7 +21,7 @@ Here is an approximate comparison table for commodity hardware in 2021: | Type | $M$ | $B$ | Latency | Bandwidth | $/GB/mo[^pricing] | |:-----|:---------|-----|---------|-----------|:------------------| -| L1 | 10K | 64B | 0.5ns | 80G/s | - | +| L1 | 10K | 64B | 2ns | 80G/s | - | | L2 | 100K | 64B | 5ns | 40G/s | - | | L3 | 1M/core | 64B | 20ns | 20G/s | - | | RAM | GBs | 64B | 100ns | 10G/s | 1.5 | diff --git a/content/english/hpc/external-memory/list-ranking.md b/content/english/hpc/external-memory/list-ranking.md index 033a65e6..fdd6a921 100644 --- a/content/english/hpc/external-memory/list-ranking.md +++ b/content/english/hpc/external-memory/list-ranking.md @@ -3,10 +3,7 @@ title: List Ranking weight: 5 --- - -## List Ranking - -Now we are going to use external sorting and joining to solve a problem that seems useless, but is actually a very important primitive many graph algorithms in external memory as well as in parallel computing, so bear with me. +Now we are going to use [external sorting](../sorting) and [joining](../sorting#joining) to solve a problem that seems useless, but is actually a very important primitive for many graph algorithms in external memory as well as in parallel computing, so bear with me. **Problem.** Given a linked list, compute *rank* of each element, equal to its distance from the front element.
From c334ce9777fcbcf754db94b74c6bfcfc4ae78f69 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 27 Jan 2022 16:51:03 +0300 Subject: [PATCH 048/531] external sorting edits --- content/english/hpc/external-memory/_index.md | 2 +- .../english/hpc/external-memory/hierarchy.md | 2 +- content/english/hpc/external-memory/model.md | 2 +- .../english/hpc/external-memory/sorting.md | 84 ++++++++++--------- 4 files changed, 47 insertions(+), 43 deletions(-) diff --git a/content/english/hpc/external-memory/_index.md b/content/english/hpc/external-memory/_index.md index 576db8a7..d1b25df6 100644 --- a/content/english/hpc/external-memory/_index.md +++ b/content/english/hpc/external-memory/_index.md @@ -29,7 +29,7 @@ Therefore, the only correct answer to this question is "it depends" — primaril Such high variance of memory performance is caused by the fact that memory hardware doesn't follow the same [laws of silicon scaling](/hpc/complexity/hardware) as CPU chips do. Memory is still improving through other means, but if 50 years ago memory timings were roughly on the same scale with the instruction latencies, nowadays they lag far behind. -To be less of a limiting factor, modern memory systems are becoming increasingly [hierarchical](hierarchy), where the higher layers trade off some of their capacity for reduced latency. As these characteristics may change in the orders of magnitude between the layers — especially in the case of external memory types — it became crucial for many memory-intensive algorithms to optimize their IO operations before anything else. +To be less of a limiting factor, modern memory systems are becoming increasingly [hierarchical](hierarchy), where the higher layers trade off some of their capacity for reduced latency. 
As these characteristics may change in the orders of magnitude between the layers — especially in the case of external memory types — it became crucial for many memory-intensive algorithms to optimize their I/O operations before anything else. This prompted the creation of a new cost model, called the *external memory model*, whose only primitive operations are block reads and writes, and everything else has zero cost as long as it only involves data stored in a limited-sized local memory. It spawned an exciting new field of *external memory algorithms*, which we will study in this chapter. diff --git a/content/english/hpc/external-memory/hierarchy.md b/content/english/hpc/external-memory/hierarchy.md index 894eeeb0..35670da9 100644 --- a/content/english/hpc/external-memory/hierarchy.md +++ b/content/english/hpc/external-memory/hierarchy.md @@ -14,7 +14,7 @@ From this perspective, each type of memory has a few important characteristics: - *total size* $M$; - *block size* $B$; - *latency*, that is, how much time it takes to fetch one byte; -- *bandwidth*, which may be higher than just the block size times latency, meaning that IO operations can "overlap"; +- *bandwidth*, which may be higher than just the block size times latency, meaning that I/O operations can "overlap"; - *cost* in the amortized sense, including the price for the chip, its energy requirements, maintenance, and so on. Here is an approximate comparison table for commodity hardware in 2021: diff --git a/content/english/hpc/external-memory/model.md b/content/english/hpc/external-memory/model.md index 841a5600..35cba4ea 100644 --- a/content/english/hpc/external-memory/model.md +++ b/content/english/hpc/external-memory/model.md @@ -3,7 +3,7 @@ title: External Memory Model weight: 3 --- -To reason about the performance of memory-bound algorithms, we need to develop a cost model that is more sensitive to expensive block IO operations but is not too rigorous to still be useful. 
+To reason about the performance of memory-bound algorithms, we need to develop a cost model that is more sensitive to expensive block I/O operations but is not too rigorous to still be useful. ### Cache-Aware Model diff --git a/content/english/hpc/external-memory/sorting.md b/content/english/hpc/external-memory/sorting.md index b3c2e5a1..43745d4a 100644 --- a/content/english/hpc/external-memory/sorting.md +++ b/content/english/hpc/external-memory/sorting.md @@ -3,13 +3,15 @@ title: External Sorting weight: 4 --- -Now, let's slowly build up more complex things. The goal of this article is to eventually get to *external sorting* and its interesting applications. It will be based on the standard merge sort, so we need to derive a few of its primitives first. +Now, let's try to design some actually useful algorithms for the new [external memory model](../model). Our goal in this section is to slowly build up more complex things and eventually get to *external sorting* and its interesting applications. -## Merge +The algorithm will be based on the standard merge sorting algorithm, so we need to derive its main primitive first. -**Problem:** given two sorted arrays $a$ and $b$ of lengths $N$ and $M$, produce a single sorted array $c$ of length $N + M$ containing all of their elements. +### Merge -The standard technique using two pointers looks like this: +**Problem.** Given two sorted arrays $a$ and $b$ of lengths $N$ and $M$, produce a single sorted array $c$ of length $N + M$ containing all of their elements. + +The standard two-pointer technique for merging sorted arrays looks like this: ```cpp void merge(int *a, int *b, int *c, int n, int m) { @@ -27,24 +29,24 @@ In terms of memory operations, we just linearly read all elements of $a$ and $b$ So far the examples have been simple, and their analysis doesn't differ too much from the RAM model, except that we divide the final answer by the block size $B$. But here is a case where this is not so. 
-**K-way merging.** Consider the modification of this algorithm where we need to merge not just two arrays, but $k$ arrays of total size $N$ — by likewise looking at $k$ values, choosing the minimum between them, writing it into $c$ and incrementing one of the iterators. +**K-way merging.** Consider the modification of this algorithm where we need to merge not just two arrays, but $k$ arrays of total size $N$ — by likewise looking at $k$ values, choosing the minimum between them, writing it into $c$, and incrementing one of the iterators. -In the standard RAM model, the asymptotic complexity would be multiplied $k$, since we would need to do $O(k)$ comparisons to fill each next element. But in external memory model, since everything we do in-memory doesn't cost us anything, its asymptotic complexity would not change as long as we can fit $(k+1)$ full blocks in memory, that is, if $k = O(\frac{M}{B})$. +In the standard RAM model, the asymptotic complexity would be multiplied by $k$, since we would need to perform $O(k)$ comparisons to fill each next element. But in the external memory model, since everything we do in-memory doesn't cost us anything, its asymptotic complexity would not change as long as we can fit $(k+1)$ full blocks in memory, that is, if $k = O(\frac{M}{B})$. -Remember the $M \gg B$ assumption? If we have $M \geq B^{1+ε}$ for $\epsilon > 0$, then we can fit any sub-polynomial amount of blocks in memory, certainly including $O(\frac{M}{B})$. This condition is called *tall cache assumption*, and it is usually required in many other external memory algorithms. +Remember [the $M \gg B$ assumption](../model) when we introduced the computational model? If we have $M \geq B^{1+\epsilon}$ for $\epsilon > 0$, then we can fit any sub-polynomial number of blocks in memory, certainly including $O(\frac{M}{B})$. This condition is called the *tall cache assumption*, and it is usually required in many other external memory algorithms.
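The k-way merging procedure described above can be sketched in memory as follows. This is a minimal illustration rather than the external-memory version: the function name `kway_merge` and the use of `std::vector` are assumptions made for the example, and it finds each minimum by scanning all $k$ current heads, which is exactly the $O(k)$ comparisons per element from the RAM-model analysis.

```cpp
#include <cstddef>
#include <vector>

// Merges k sorted arrays by scanning the k current heads for the minimum:
// O(k) comparisons per output element, O(N * k) in total for N elements.
std::vector<int> kway_merge(const std::vector<std::vector<int>> &arrays) {
    std::vector<size_t> pos(arrays.size(), 0);  // one iterator per array
    std::vector<int> result;
    while (true) {
        int best = -1;  // index of the array with the smallest current head
        for (size_t i = 0; i < arrays.size(); i++) {
            if (pos[i] == arrays[i].size())
                continue;  // this array is exhausted
            if (best == -1 || arrays[i][pos[i]] < arrays[best][pos[best]])
                best = (int) i;
        }
        if (best == -1)
            break;  // all arrays are exhausted
        result.push_back(arrays[best][pos[best]++]);
    }
    return result;
}
```

In the external memory setting, each `arrays[i]` would be replaced by a block-sized buffer that is refilled from disk when it runs out, which is why the number of I/O operations does not depend on how the minimum is found.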
-## Merge Sorting +### Merge Sorting -The "normal" complexity the standard mergesort algorithm is $O(N \log_2 N)$: on each of its $O(\log_2 N)$ "layers", the algorithms need to go through all $N$ elements in total and merge them in linear time. +The "normal" complexity of the standard mergesort algorithm is $O(N \log_2 N)$: on each of its $O(\log_2 N)$ "layers", the algorithms need to go through all $N$ elements in total and merge them in linear time. -In external memory model, when we read a block of size $M$, we can sort its elements "for free", since they are already in memory. This way we can split the arrays into $O(\frac{N}{M})$ blocks of consecutive elements and sort them separately as the base step, and only then merge them. +In the external memory model, when we read a block of size $M$, we can sort its elements "for free", since they are already in memory. This way we can split the arrays into $O(\frac{N}{M})$ blocks of consecutive elements and sort them separately as the base step, and only then merge them. ![](../img/k-way.png) This effectively means that, in terms of IO operations, the first $O(\log M)$ layers of mergesort are free, and there are only $O(\log_2 \frac{N}{B})$ non-zero-cost layers, each mergeable in $O(\frac{N}{B})$ IOPS in total. This brings total I/O complexity to $$ -O(\frac{N}{B} \log_2 \frac{N}{M}) +O\left(\frac{N}{B} \log_2 \frac{N}{M}\right) $$ This is quite fast. If we have 1GB of memory and 10GB of data, this essentially means that we need a little bit more than 3 times the effort than just reading the data to sort it. Interestingly enough, we can do better. @@ -55,21 +57,19 @@ Half of a page ago we have learned that in the external memory model, we can mer Let's sort each block of size $M$ in-memory just as we did before, but during each merge stage, we will split sorted blocks not just in pairs to be merged, but take as many blocks we can fit into our memory during a k-way merge. 
This way the height of the merge tree would be greatly reduced, while each layer would still be done in $O(\frac{N}{B})$ IOPS. -How many sorted arrays can we merge at once? Exactly $k = \frac{M}{B}$, since we need memory for one block for each array. Since the total amount of layers will be reduced to $\log_{\frac{M}{B}} \frac{N}{M}$, the whole complexity will be reduced to +How many sorted arrays can we merge at once? Exactly $k = \frac{M}{B}$, since we need memory for one block for each array. Since the total amount of layers will be reduced to $\log_{\frac{M}{B}} \frac{N}{M}$, the total complexity will be reduced to $$ SORT(N) \stackrel{\text{def}}{=} O\left(\frac{N}{B} \log_{\frac{M}{B}} \frac{N}{M} \right) $$ -Note that, in our example, we have 10GB of data, 1GB of memory, and the block size is around 1MB for HDD. This makes $\frac{M}{B} = 1000$ and $\frac{N}{M} = 10$, and so the logarithm is less than one (namely, $\log_{1000} 10 = \frac{1}{3}$). Of course, we can't sort an array faster than reading it, so this analysis applies to the cases when we have very large dataset, small memory, and/or large block sizes, which happens in real life nowadays. +Note that, in our example, we have 10GB of data, 1GB of memory, and the block size is around 1MB for HDD. This makes $\frac{M}{B} = 1000$ and $\frac{N}{M} = 10$, and so the logarithm is less than one (namely, $\log_{1000} 10 = \frac{1}{3}$). Of course, we can't sort an array faster than reading it, so this analysis applies to the cases when we have a very large dataset, small memory, and/or large block sizes, which rarely happens in real life these days. ### Practical Implementation -Under more realistic constraints, instead of using $\log_{\frac{M}{B}} \frac{N}{M}$ layers, we can do just two: one for sorting data in blocks of $M$, and another one for merging all of them at once. With a gigabyte of RAM and a block size of 1MB, this would be enough to sort arrays up to a terabyte in size. 
- -This way we would essentially just loop around our dataset twice. THe bandwidth of HDDs can be quite high, and we wouldn't want to stall it, so we need a slightly faster way to merge $k$ arrays than by finding minimum with $O(k)$ comparisons — namely, we can maintain for $k$ elements, and extract minimum elements from it in a manner almost identical to heapsort. +Under more realistic constraints, instead of using $\log_{\frac{M}{B}} \frac{N}{M}$ layers, we can use just two: one for sorting data in blocks of $M$ elements, and another one for merging all of them at once. This way, from the I/O operations perspective, we just loop around our dataset twice. And with a gigabyte of RAM and a block size of 1MB, this way can sort arrays up to a terabyte in size. -Here is the first phase looks in C++: +Here is how the first phase looks in C++. This program opens a multi-gigabyte binary file with unsorted integers, reads it in blocks of 256MB, sorts them in memory, and then writes them back in files named `part-000.bin`, `part-001.bin`, `part-002.bin`, and so on: ```cpp const int B = (1<<20) / 4; // 1 MB blocks of integers @@ -85,7 +85,7 @@ while (true) { if (n == 0) break; - // sort in-memory + // sort a block in-memory std::sort(part, part + n); char fpart[sizeof "part-999.bin"]; @@ -104,76 +104,80 @@ while (true) { fclose(input); ``` -This would create many arrays named `part-000.bin`, `part-001.bin`, `part-002.bin` and so on. +What is left now is to merge them together. The bandwidth of modern HDDs can be quite high, and there may be a lot of parts to merge, so the I/O efficiency of this stage is not our only concern: we also need a faster way to merge $k$ arrays than by finding minima with $O(k)$ comparisons. We can do that in $O(\log k)$ time per element if we maintain a min-heap for these $k$ elements, in a manner almost identical to heapsort. -What is left now is to merge them together. 
First we create the an array for storing pointers to current elements of all block, their separate buffers, and a priority queue, that we populate with their first elements: +Here is how to implement it. First, we need to initialize some variables: ```cpp -std::priority_queue< std::pair<int, int> > q; - const int nparts = parts.size(); -auto buffers = new int[nparts][B]; -int outbuffer[B]; -std::vector<int> l(nparts), r(nparts); +// the min-heap itself (element + part number); std::greater makes top() the smallest pair +std::priority_queue< std::pair<int, int>, std::vector< std::pair<int, int> >, std::greater<> > q; +auto buffers = new int[nparts][B]; // buffers for each part +int *l = new int[nparts], // # of already processed buffer elements + *r = new int[nparts]; // buffer size (in case it isn't full) + +// now we fill the buffer for each part and add their first elements to the heap for (int part = 0; part < nparts; part++) { + l[part] = 1; // if the element is in the heap, we also consider it "processed" r[part] = fread(buffers[part], 4, B, parts[part]); q.push({buffers[part][0], part}); - l[part] = 1; } ``` -Now we need to populate the result file until it is full, carefully writing it and reading new batches of elements when needed: +Now we just need to pop elements from the heap into the result file until it is empty, carefully writing and reading elements in batches: ```cpp FILE *output = fopen("output.bin", "w"); -int buffered = 0; + +int outbuffer[B]; // the output buffer +int buffered = 0; // number of elements in it while (!q.empty()) { auto [key, part] = q.top(); q.pop(); + // write the minimum to the output buffer outbuffer[buffered++] = key; + // check if it needs to be committed to the file if (buffered == B) { fwrite(outbuffer, 4, B, output); buffered = 0; } + // fetch a new block of that part if needed if (l[part] == r[part]) { r[part] = fread(buffers[part], 4, B, parts[part]); l[part] = 0; } + // read a new element from that part unless we've already processed all of it if (l[part] < r[part]) { q.push({buffers[part][l[part]], part}); l[part]++; } } +// write what's left
of the output buffer fwrite(outbuffer, 4, buffered, output); +//clean up delete[] buffers; for (FILE *file : parts) fclose(file); - fclose(output); ``` -This implementation is not particularly effective or safe-looking (well, this is basically C), but is a good educational example of how to work with low-level memory APIs. +This implementation is not particularly effective or safe-looking (well, this is basically plain C), but is a good educational example of how to work with low-level memory APIs. -## Joining +### Joining -Sorting by mainly used not by itself, but as an intermediate step for other operations. One important real-world use case for external sorting is joining (as in "SQL join"), used in databases and other data processing applications. +Sorting is mainly used not by itself, but as an intermediate step for other operations. One important real-world use case of external sorting is joining (as in "SQL join"), used in databases and other data processing applications. **Problem.** Given two lists of tuples $(x_i, a_{x_i})$ and $(y_i, b_{y_i})$, output a list $(k, a_{x_k}, b_{y_k})$ such that $x_k = y_k$ -The optimal solution would be to sort the two lists and then use the standard two-pointer technique to merge them. The I/O complexity here would be the same as sorting, and just $O(\frac{N}{B})$ if the arrays are already sorted. - -This is why most data processing applications (databases, MapReduce systems) like to keep their tables at least partially sorted. - -### Other Implementations +The optimal solution would be to sort the two lists and then use the standard two-pointer technique to merge them. The I/O complexity here would be the same as sorting, and just $O(\frac{N}{B})$ if the arrays are already sorted. This is why most data processing applications (databases, MapReduce systems) like to keep their tables at least partially sorted. 
-Note that this analysis is only applicable in external memory setting — that is, if you don't have the memory to fit entire dataset. In the real world, it is important to consider alternative methods. +**Other approaches.** Note that this analysis is only applicable in the external memory setting — that is, if you don't have the memory to read the entire dataset. In the real world, alternative methods may be faster. The simplest of them is probably *hash join*, which goes something like this: @@ -185,6 +189,6 @@ def join(a, b): yield d[x] ``` -In external memory, joining two lists with a hash table would be unfeasible, as it would involve doing $O(M)$ entire block reads. +In external memory, joining two lists with a hash table would be unfeasible, as it would involve doing $O(M)$ block reads, even though only one element is used in each of them. -Another way is to use alternative sorting algorithms such as radix sort. In particular, radix sort would work in $O(\frac{N}{B} \cdot w)$ if enough memory is available to maintain a buffer possible key, which could be beneficial in the case of small keys and large datasets +Another method is to use alternative sorting algorithms such as radix sort. In particular, radix sort would work in $O(\frac{N}{B} \cdot w)$ block reads if enough memory is available to maintain buffers for all possible keys, and it could be faster in the case of small keys and large datasets. 
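The sort-and-merge join described in the patch above can be sketched in memory as follows. This is only a minimal illustration under stated assumptions: keys are unique within each list, and the names `Row` and `merge_join` are hypothetical and do not appear in the article. In the external memory setting, the two `std::sort` calls would be replaced by external merge sorts, and the two-pointer scan would read both inputs block by block.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// (key, value) row; a hypothetical type used only for this sketch
using Row = std::pair<int, int>;

// Sort-merge join: returns (key, value_from_a, value_from_b) for every key
// present in both lists. Assumes keys are unique within each list.
std::vector<std::tuple<int, int, int>> merge_join(std::vector<Row> a, std::vector<Row> b) {
    // step 1: sort both lists by key (pairs compare by first element first)
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());

    // step 2: scan both sorted lists with two pointers
    std::vector<std::tuple<int, int, int>> result;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i].first < b[j].first) {
            i++; // key of a[i] cannot appear in the rest of b
        } else if (a[i].first > b[j].first) {
            j++; // key of b[j] cannot appear in the rest of a
        } else {
            result.emplace_back(a[i].first, a[i].second, b[j].second);
            i++;
            j++;
        }
    }
    return result;
}
```

Both pointers only move forward, so the scan reads each list once, which is where the $O(\frac{N}{B})$ bound for already sorted inputs comes from.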
From f85062ffa6259cf80858ff18d8815d67b6574712 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 27 Jan 2022 16:54:39 +0300 Subject: [PATCH 049/531] decrease katex font size --- content/english/hpc/external-memory/sorting.md | 6 +++--- themes/algorithmica/assets/style.sass | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/content/english/hpc/external-memory/sorting.md b/content/english/hpc/external-memory/sorting.md index 43745d4a..6ac13ae0 100644 --- a/content/english/hpc/external-memory/sorting.md +++ b/content/english/hpc/external-memory/sorting.md @@ -29,7 +29,7 @@ In terms of memory operations, we just linearly read all elements of $a$ and $b$ So far the examples have been simple, and their analysis doesn't differ too much from the RAM model, except that we divide the final answer by the block size $B$. But here is a case where this is not so. -**K-way merging.** Consider the modification of this algorithm where we need to merge not just two arrays, but $k$ arrays of total size $N$ — by likewise looking at $k$ values, choosing the minimum between them, writing it into $c$, and incrementing one of the iterators. +**$k$-way merging.** Consider the modification of this algorithm where we need to merge not just two arrays, but $k$ arrays of total size $N$ — by likewise looking at $k$ values, choosing the minimum between them, writing it into $c$, and incrementing one of the iterators. In the standard RAM model, the asymptotic complexity would be multiplied by $k$, since we would need to perform $O(k)$ comparisons to fill each next element. But in the external memory model, since everything we do in-memory doesn't cost us anything, its asymptotic complexity would not change as long as we can fit $(k+1)$ full blocks in memory, that is, if $k = O(\frac{M}{B})$. @@ -51,11 +51,11 @@ $$ This is quite fast. 
If we have 1GB of memory and 10GB of data, this essentially means that we need a little more than 3 times the effort of just reading the data to sort it. Interestingly enough, we can do better. -### K-way Mergesort +### $k$-way Mergesort Half a page ago, we learned that in the external memory model, we can merge $k$ arrays just as easily as two arrays — at the cost of reading them. Why don't we apply this fact here? -Let's sort each block of size $M$ in-memory just as we did before, but during each merge stage, we will split sorted blocks not just in pairs to be merged, but take as many blocks we can fit into our memory during a k-way merge. This way the height of the merge tree would be greatly reduced, while each layer would still be done in $O(\frac{N}{B})$ IOPS. +Let's sort each block of size $M$ in-memory just as we did before, but during each merge stage, we will split sorted blocks not just in pairs to be merged, but take as many blocks as we can fit into our memory during a $k$-way merge. This way the height of the merge tree would be greatly reduced, while each layer would still be done in $O(\frac{N}{B})$ I/O operations. How many sorted arrays can we merge at once? Exactly $k = \frac{M}{B}$, since we need memory for one block for each array. 
Since the total amount of layers will be reduced to $\log_{\frac{M}{B}} \frac{N}{M}$, the total complexity will be reduced to diff --git a/themes/algorithmica/assets/style.sass b/themes/algorithmica/assets/style.sass index a30cfe62..5ec69255 100644 --- a/themes/algorithmica/assets/style.sass +++ b/themes/algorithmica/assets/style.sass @@ -394,7 +394,7 @@ footer font-family: $font-interface .katex - font-size: 1.15em !important + font-size: 1.1em !important /* headers */ h1, h2, h3, h4, h5, h6 From d738bf498de9bb9dbcad0bc421b42d20d1d70c74 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 27 Jan 2022 17:48:58 +0300 Subject: [PATCH 050/531] list ranking edits --- .../hpc/external-memory/list-ranking.md | 34 +++++++++++-------- 1 file changed, 20 insertions(+), 14 deletions(-) diff --git a/content/english/hpc/external-memory/list-ranking.md b/content/english/hpc/external-memory/list-ranking.md index fdd6a921..6a043588 100644 --- a/content/english/hpc/external-memory/list-ranking.md +++ b/content/english/hpc/external-memory/list-ranking.md @@ -3,26 +3,26 @@ title: List Ranking weight: 5 --- -Now we are going to use [external sorting](../sorting) and [joining](../sorting#joining) to solve a problem that seems useless, but is actually a very important primitive many graph algorithms in external memory as well as in parallel computing, so bear with me. +In this section, we will apply [external sorting](../sorting) and [joining](../sorting#joining) to solve a problem that seems useless on the surface but is actually a key primitive used in a large number of external memory and parallel algorithms. -**Problem.** Given a linked list, compute *rank* of each element, equal to its distance from the front element. +**Problem.** Given a singly-linked list, compute the *rank* of each element, equal to its distance from the *last* element. 
-![](../img/list-ranking.png) +![Example input and output for the list ranking problem](../img/list-ranking.png) -The problem is easily solvable in RAM model, but it is nontrivial how to solve this in external memory. Since our data is stored so chaotically, we can't simply traverse the list by querying each new element. +This problem can be trivially solved in the RAM model: you just traverse the entire list with a counter. But this pointer jumping wouldn't work well in the external memory setting because the list nodes are stored arbitrarily, and in the worst case, reading each new node may require reading a new block. ### Algorithm -Consider a slightly more general version of the problem. Now, each element has a *weight* $w_i$, and for each element we need to compute the sum of weights of all preceding elements instead of just its rank. To solve the initial problem, we can just set all weights equal to 1. +Consider a slightly more general version of the problem. Now, each element has a *weight* $w_i$, and for each element, we need to compute the sum of the weights of all its preceding elements instead of just its rank. To solve the initial problem, we can just set all weights equal to 1. -Now, the key idea of the algorithm is to remove some fraction of elements, recursively solve the problem, and then use it to reconstruct the answer for the initial problem. +The main idea of the algorithm is to remove some fraction of elements, recursively solve the problem, and then use these weight-ranks to reconstruct the answer for the initial problem — which is the tricky part. -Consider some three consecutive elements: $x$, $y$ and $z$. Assume that we deleted $y$ and solved the problem for the remaining list, which included $x$ and $z$, and now we need to restore the answer for the original triplet. 
The weight of $x$ would be correct as it is, but we need to calculate the answer for $y$ and adjust it for $z$, namely: +Consider some three consecutive elements $x$, $y$ and $z$. Assume that we deleted $y$ and solved the problem for the remaining list, which included $x$ and $z$, and now we need to restore the answer for the original triplet. The weight of $x$ would be correct as it is, but we need to calculate the answer for $y$ and adjust it for $z$, namely: - $w_y' = w_y + w_x$ - $w_z' = w_z + w_y + w_x$ -Now, we can just delete, say, first element, solve the problem recursively, and recalculate weights for the original array. But, unfortunately, it would work in quadratic time, because to make the update, we would need to know where its neighbors are, and since we can't hold the entire array in memory, we would need to scan it each time. +Now, we can just delete, say, the first element, solve the problem recursively, and recalculate weights for the original array. But, unfortunately, it would work in quadratic time, because to make the update, we would need to know where its neighbors are, and since we can't hold the entire array in memory, we would need to scan it each time. Therefore, on each step, we want to remove as many elements as possible. But we also have a constraint: we can't remove two consecutive elements because then merging results wouldn't be that simple. @@ -32,23 +32,29 @@ $$ T(N) = T\left(\frac{3}{4} N\right) = O(N) $$ -The only tricky part here is how to implement the merge step in external memory. +The only tricky part here is how to implement the merge step in external memory. 
To do it efficiently, we need to maintain our list in the following form: -To do it efficiently, we need to maintain our list in the following form: - List of tuples $(i, j)$ indicating that element $j$ follows after element $i$ - List of tuples $(i, w_i)$ indicating that element $i$ currently has weight $w_i$ - A list of deleted elements Now, to restore the answer after randomly deleting some elements and recursively solving the smaller problem, we need to iterate over all lists using three pointers looking for deleted elements, and for each such element, we will write $(j, w_i)$ to a separate table, which would signify that before the recursive step we need to add $w_i$ to $j$. We can then join this new table with the initial weights and add these additional weights to them. -After coming back from recursion, we need to update weights for the deleted elements, which we can do with the same technique, iterating over reversed connections instead of direct ones. +After coming back from the recursion, we need to update weights for the deleted elements, which we can do with the same technique, iterating over reversed connections instead of direct ones. -I/O complexity of this algorithm with therefore be the same as joining, namely $SORT(N)$. +The I/O complexity of this algorithm will therefore be the same as joining, namely $SORT(N) = O\left(\frac{N}{B} \log_{\frac{M}{B}} \frac{N}{M} \right)$. ### Applications List ranking is especially useful in graph algorithms. -For example, we can obtain the euler tour of a tree in external memory by constructing a linked list where, for each edge, we add two copies of it, one for each direction. Then we can apply the list ranking algorithm and get the position of each node which will be the same as its number (*tin*) in the euler tour. 
+For example, we can obtain the Euler tour of a tree in external memory by constructing a linked list from the tree that corresponds to its Euler tour and then applying the list ranking algorithm — the rank of each node will be the same as its index $tin_v$ in the Euler tour. To construct this list, we need to: -Exactly same approach cay be applied to parallel algorithms, but we will cover that more deeply later. +- split each undirected tree edge into two directed ones; +- duplicate the parent node for each up-edge (because list nodes can only have one incoming edge, but we visit some tree vertices multiple times); +- route each such node either to the "next sibling", if it has one, or otherwise to its own parent; +- and then finally break the resulting cycle at the root. + +This general technique is called *tree contraction*, and it serves as the basis for a large number of tree algorithms. + +Exactly the same approach can be applied to parallel algorithms, and we will cover that much more deeply in part 2. From 6f4374ff9b1e00e26c74e28c96aa0e8cd9ede963 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 27 Jan 2022 18:35:52 +0300 Subject: [PATCH 051/531] eviction policies edits --- .../english/hpc/external-memory/policies.md | 54 ++++++++++--------- 1 file changed, 29 insertions(+), 25 deletions(-) diff --git a/content/english/hpc/external-memory/policies.md b/content/english/hpc/external-memory/policies.md index ef6da591..1ff0e724 100644 --- a/content/english/hpc/external-memory/policies.md +++ b/content/english/hpc/external-memory/policies.md @@ -3,56 +3,60 @@ title: Eviction Policies weight: 6 --- -## Caching Strategies +You can control the I/O operations of your program manually, but most of the time people just rely on automatic buffering and caching, either due to laziness or because of limitations of the computing environment. -When you run out of inner memory to store your data, you need to delete one block to make space for a new one. 
Since caching usually happens in the background, you need a concrete rule for deciding which data to retain in the cache, called *eviction policy*. +But automatic caching comes with its own challenges. When a program runs out of working memory to store its intermediate data, it needs to get rid of one block to make space for a new one. A concrete rule for deciding which data to retain in the cache in case of conflicts is called an *eviction policy*. This rule can be arbitrary, but there are several popular choices: - First in first out (FIFO): simply evict the earliest added block, without any regard to how often it was accessed before (the same way as a FIFO queue). - Least recently used (LRU): evict the block that has not been accessed for the longest period of time. - Last in first out (LIFO) and most recently used (MRU): the opposite of the previous two. It seems harmful to delete the hottest blocks, but there are scenarios where these policies are optimal, such as repeatedly looping around a file in a cycle. -- Least-frequently used (LFU): counts how often each block has been requested, and discards the one used least often. There are variations that account for changing access patterns over time, such as using a time window to only consider the last $n$ accesses, or using exponential averaging to give recent accesses more weight. +- Least-frequently used (LFU): counts how often each block has been requested and discards the one used least often. Some variations also account for changing access patterns over time, such as using a time window to only consider the last $n$ accesses or using exponential averaging to give recent accesses more weight. - Random replacement (RR): discard a block randomly. The advantage is that it does not need to maintain any data structures with block information. -There is a natural trade-off between the accuracy of eviction policies and the additional overhead due to the complexity of their implementations. 
For a CPU cache, you need a simple policy that can be easily implemented in hardware with almost zero latency, while in more slow-paced and plannable settings such as Netflix deciding in which data centers to store their movies or Google Drive optimizing where to store user data, it makes sense to use more complex policies, possibly involving machine learning to predict when the data is going to be accessed next. +There is a natural trade-off between the accuracy of eviction policies and the additional overhead due to the complexity of their implementations. For a CPU cache, you need a simple policy that can be easily implemented in hardware with next-to-zero latency, while in more slow-paced and plannable settings such as Netflix deciding in which data centers to store their movies or Google Drive optimizing where to store user data, it makes sense to use more complex policies, possibly involving some machine learning to predict when the data is going to be accessed next. -### Implementing Caching +### Optimal Caching -This is not always a trivial task to find the right block to evict in a reasonable time. While CPU caches are implemented in hardware (usually as a variation of LRU), higher-level eviction policies have to rely on software to store certain statistics about the blocks and maintain data structures on top of them to speed up the process. +Apart from the aforementioned strategies, there is also the theoretical *optimal policy*, denoted as $OPT$ or $MIN$, which determines, for a given sequence of queries, which blocks should be retained to minimize the total number of cache misses. -For example, let's think about what it takes to implement an LRU cache. Assume we are storing some moderately large objects — say, we need to develop a cache for a database, there both the requests and replies are medium-sized strings in some SQL dialect, so the overhead of our structure is small, but non-negligible. 
+These decisions can be made using a simple greedy approach called *Bélády's algorithm*: we always evict the *latest-to-be-used* block (the one whose next access is furthest in the future), and it can be shown by contradiction that doing so is always one of the optimal solutions. The downside of this method is that you either need to have these queries in advance or somehow be able to predict the future. - +The good thing is that, in terms of asymptotic complexity, it doesn't really matter which particular method is used. [Sleator & Tarjan showed](https://www.cs.cmu.edu/~sleator/papers/amortized-efficiency.pdf) that in most cases, the performance of popular policies such as $LRU$ differs from $OPT$ just by a constant factor. -First of all, we need a hash table to find the data itself. Since we are working with large variable-length strings, it makes sense to use a hash of the query as the key and a pointer to the heap-allocated result string as the value. +**Theorem.** Let $LRU_M$ and $OPT_M$ denote the number of blocks a computer with $M$ internal memory would need to access while executing the same algorithm following the least recently used cache replacement policy and the theoretical minimum respectively. Then: -To implement the LRU logic, the simplest approach would be to create a queue where we put the current time and IDs/keys of objects when we access them, and also store for each object when was the last time it was accessed (not necessarily as a timestamp — any increasing counter will suffice). +$$ +LRU_M \leq 2 \cdot OPT_{M/2} +$$ + +The main idea of the proof is to consider the worst case scenario. For LRU it would be the repeating series of $\frac{M}{B}$ distinct blocks: each block is new and so LRU has 100% cache misses. Meanwhile, $OPT_{M/2}$ would be able to cache half of them (but not more, because it only has half the memory). 
Thus $LRU_M$ needs to fetch double the number of blocks that $OPT_{M/2}$ does, which is basically what is expressed in the inequality, and anything better for $LRU$ would only weaken it. + +![Dimmed are the blocks cached by OPT (but not cached by LRU)](../img/opt.png) -Now, when we need to free up space, we can find the least recently used object by popping elements from the front of the queue — but we can't just delete them, because it may be that they were accessed again since their record was added to the queue. So we need to check if the timestamp when we put them in queue matches the timestamp when they were last accessed, and only then free up the memory. +This is a very relieving result. It means that, at least in terms of asymptotic I/O complexity, you can just assume that the eviction policy is either LRU or OPT — whichever is easier for you — do complexity analysis with it, and the result you get will normally transfer to any other reasonable cache replacement policy. -The only problem here is that we add an entry to the queue each time a block is accessed, and only remove entries when we have a cache miss and start popping them off from the front until we have a match. This may lead to the queue overflowing, and to counter this, instead of adding an entry and forgetting about it, we can move it to the end of the queue on a cache hit right away. +### Implementing Caching -To support this, we need to implement the queue over a doubly linked list and store a pointer to the block's node in the queue in the hash table. Then, when we have a cache hit, we follow the pointer and remove the node from the linked list in constant time, and add a newer node to the end of the queue. This way, at any point in time, there would be exactly as many nodes in the queue as we have objects, and the memory overhead will be guaranteed to be constant per cache entry. + -As an exercise, try to think about ways to implement other caching strategies. 
It is quite fun, I assure you. +It is not always trivial to find the right block to evict in a reasonable time. While CPU caches are implemented in hardware (usually as some variation of LRU), higher-level eviction policies have to rely on software to store certain statistics about the blocks and maintain data structures on top of them to speed up the process. -### Optimal Caching +Let's think about what we need to implement an LRU cache. Assume we are storing some moderately large objects — say, we need to develop a cache for a database, where both the requests and replies are medium-sized strings in some SQL dialect, so the overhead of our structure is small but non-negligible. -Apart from aforementioned strategies, there is also what's called *Bélády algorithm*, often denoted as $OPT$ or $MIN$, which determined which blocks should be retained in the *optimal* policy for a given sequence of queries. + -The way it achieves it is simple: we can always greedily keep the *latest-to-be-used* block, and it can be shown by contradiction that doing so is always one of the optimal solutions. The downside of this method is that you either need to have these queries in advance or somehow be able to predict the future. +First of all, we need a hash table to find the data itself. Since we are working with large variable-length strings, it makes sense to use the hash of the query as the key and a pointer to a heap-allocated result string as the value. -But the good thing is that, in terms of asymptotic complexity, it doesn't really matter which particular method is used. [Sleator & Tarjan showed](https://www.cs.cmu.edu/~sleator/papers/amortized-efficiency.pdf) that in most cases, the performance of popular policies such as $LRU$ differs from $OPT$ just by a constant factor. 
+To implement the LRU logic, the simplest approach would be to create a queue where we put the current time and IDs/keys of objects when we access them, and also store when each object was accessed the last time (not necessarily as a timestamp — any increasing counter will suffice). -**Theorem.** Let $LRU_M$ and $OPT_M$ denote the number of blocks a computer with $M$ internal memory would need to access while executing the same algorithm following the least recently used cache replacement policy and the theoretical minimum respectively. Then +Now, when we need to free up space, we can find the least recently used object by popping elements from the front of the queue. We can't just delete them, because it may be that they were accessed again since their record was added to the queue. So we need to check if the time of when we put them in queue matches the time of when they were last accessed, and only then free up the memory. -$$ -LRU_M \leq 2 \cdot OPT_{M/2} -$$ +The only remaining issue here is that we add an entry to the queue each time a block is accessed, and only remove entries when we have a cache miss and start popping them off from the front until we have a match. This may lead to the queue overflowing, and to mitigate this, instead of adding an entry and forgetting about it, we can move it to the end of the queue on a cache hit right away. -The main idea of the proof is to consider the "worst case" scenario. For LRU it would be the repeating series of $\frac{M}{B}$ distinct blocks: each block is new and so LRU has 100% cache misses. Meanwhile, $OPT_{M/2}$ would be able to cache half of them (but not more, because it only has half the memory). Thus $LRU_M$ needs to fetch double the number of blocks that $OPT_{M/2}$ does, which is basically what is expressed in the inequality, and anything better for $LRU$ would only weaken it. 
+To support this, we need to implement the queue over a doubly-linked list and store a pointer to the block's node in the queue in the hash table. Then, when we have a cache hit, we follow the pointer and remove the node from the linked list in constant time, and add a newer node to the end of the queue. This way, at any point in time, there would be exactly as many nodes in the queue as we have objects, and the memory overhead will be guaranteed to be constant per cache entry. -![Dimmed are the blocks cached by OPT (but note cached by LRU)](../img/opt.png) +As an exercise, try to think about ways to implement other caching strategies. -This is a very relieving result. It means that, at least in terms of asymptotic I/O complexity, you can just assume that the eviction policy is either LRU or OPT — whichever is easier for you — do complexity analysis with it, and the result you get will normally transfer to any other reasonable cache replacement policy. + From 6a7296ff000525691ca709709d26f4e63b2f8669 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 27 Jan 2022 18:52:11 +0300 Subject: [PATCH 052/531] cache-oblivious algorithms edits --- .../english/hpc/external-memory/oblivious.md | 38 +++++++++---------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/content/english/hpc/external-memory/oblivious.md b/content/english/hpc/external-memory/oblivious.md index eec77cca..5e4650b2 100644 --- a/content/english/hpc/external-memory/oblivious.md +++ b/content/english/hpc/external-memory/oblivious.md @@ -3,18 +3,18 @@ title: Cache-Oblivious Algorithms weight: 7 --- -In the context of cache hierarchies, there are two types of efficient [external memory](../external) algorithms: +In the context of the [external memory model](../model), there are two types of efficient algorithms: - *Cache-aware* algorithms that are efficient for *known* $B$ and $M$. - *Cache-oblivious* algorithms that are efficient for *any* $B$ and $M$. 
-For example, external merge sort is cache-aware, but not cache-oblivious: we need to know memory characteristics of the system, namely the ratio of available memory to the block size, to find the right $k$ to do k-way merge sort. +For example, [external merge sorting](../sorting) is a cache-aware, but not cache-oblivious algorithm: we need to know the memory characteristics of the system, namely the ratio of available memory to the block size, to find the right $k$ to perform $k$-way merge sort. -Cache-oblivious algorithms are interesting because they automatically become optimal for all memory levels in the cache hierarchy, and not just the one for which they were specifically tuned. In this article we will consider some of their applications in matrix calculations. +Cache-oblivious algorithms are interesting because they automatically become optimal for all memory levels in the cache hierarchy, and not just the one for which they were specifically tuned. In this article, we consider some of their applications in matrix calculations. -## Matrix Transpose +## Matrix Transposition -Assume we have a square matrix $A$ of size $N \times N$ and we need to transpose it. The naive by-definition approach would go something like this: +Assume we have a square matrix $A$ of size $N \times N$, and we need to transpose it. The naive by-definition approach would go something like this: ```cpp for (int i = 0; i < n; i++) @@ -24,11 +24,11 @@ for (int i = 0; i < n; i++) Here we used a single pointer to the beginning of the memory region instead of a 2d array to be more explicit about its memory operations. -The I/O complexity of this code is $O(N^2)$ because the writes are not sequential. If you try to swap the iteration variables it is going to be the the other way around, but the result will be the same. +The I/O complexity of this code is $O(N^2)$ because the writes are not sequential. 
If you try to swap the iteration variables, it will be the other way around, but the result is going to be the same. ### Algorithm -The *cache-oblivious* way relies on the following block matrix identity: +The *cache-oblivious* algorithm relies on the following block matrix identity: $$ \begin{pmatrix} @@ -47,7 +47,7 @@ It lets us solve the problem recursively using a divide-and-conquer approach: 2. Transpose each one recursively. 3. Combine results by swapping the corner result matrices. -Implementing D&C on matrices is a bit more complex than on arrays, but the main idea is the same. Instead of copying submatrices explicitly, we want to use "views" into them, and also switch to the naive method when the data starts fitting in L1 cache (or pick something small like $32 \times 32$ if you don't know it in advance). We also need to carefully handle the case when we have odd $n$ and thus can't split the matrix into 4 equal submatrices. +Implementing D&C on matrices is a bit more complex than on arrays, but the main idea is the same. Instead of copying submatrices explicitly, we want to use "views" into them, and also switch to the naive method when the data starts fitting in the L1 cache (or pick something small like $32 \times 32$ if you don't know it in advance). We also need to carefully handle the case when we have odd $n$ and thus can't split the matrix into 4 equal submatrices. ```cpp void transpose(int *a, int n, int N) { @@ -96,9 +96,9 @@ for (int i = 0; i < n; i++) c[i * n + j] += a[i * n + k] * b[k * n + j]; ``` -It needs to access $O(N^3)$ blocks in total as each scalar multiplication needs a new block read. +It needs to access $O(N^3)$ blocks in total as each scalar multiplication needs a separate block read. 
-Many people know that one good optimization is to transpose transpose $B$ first: +One well-known optimization is to transpose $B$ first: ```cpp for (int i = 0; i < n; i++) @@ -112,13 +112,13 @@ for (int i = 0; i < n; i++) c[i * n + j] += a[i * n + k] * b[j * n + k]; // <- note the indices ``` -Regardless of whether the transpose is done naively or with the cache-oblivious method we just developed, the matrix multiplication with one of the matrices transposed would work in $O(N^3/B + N^2)$ as all memory accesses are now sequential. +Whether the transpose is done naively or with the cache-oblivious method we previously developed, the matrix multiplication with one of the matrices transposed would work in $O(N^3/B + N^2)$ as all memory accesses are now sequential. -It seems like we can't do better, but turns out we can. +It seems like we can't do better, but it turns out we can. ### Algorithm -Cache-oblivious matrix multiplication involves essentially the same trick. We need to divide the data until it fits into lowest cache (i. e. $N^2 \leq M$). For matrix multiplication, this equates to using this formula: +Cache-oblivious matrix multiplication relies on essentially the same trick as the transposition. We need to divide the data until it fits into lowest cache (i. e. $N^2 \leq M$). For matrix multiplication, this equates to using this formula: $$ \begin{pmatrix} @@ -133,7 +133,7 @@ A_{21} B_{11} + A_{22} B_{21} & A_{21} B_{12} + A_{22} B_{22}\\ \end{pmatrix} $$ -It is slightly harder to implement though, as we now have 8 recursive matrix multiplications: +It is slightly harder to implement though because we now have a total of 8 recursive matrix multiplications: ```cpp void matmul(const float *a, const float *b, float *c, int n, int N) { @@ -198,11 +198,11 @@ $$ T(N) = O\left(\frac{(\sqrt{M})^2}{B} \cdot \left(\frac{N}{\sqrt M}\right)^3\right) = O\left(\frac{N^3}{B\sqrt{M}}\right) $$ -This is better than just $O(\frac{N^3}{B})$ by quite a lot. 
+This is better than just $O(\frac{N^3}{B})$ and by quite a lot. ### Strassen Algorithm -In a spirit similar to the Karatsuba algorithm, matrix multiplication can be decomposed in a way that involves 7 matrix multiplications of size $\frac{n}{2}$, and master theorem tells us the such divide-and-conquer algorithm would work in $O(n^{\log_2 7}) \approx O(n^{2.81})$ time and a similar asymptotic in external memory model. +In a spirit similar to the Karatsuba algorithm, matrix multiplication can be decomposed in a way that involves 7 matrix multiplications of size $\frac{n}{2}$, and the master theorem tells us that such divide-and-conquer algorithm would work in $O(n^{\log_2 7}) \approx O(n^{2.81})$ time and a similar asymptotic in the external memory model. This technique, known as the Strassen algorithm, similarly splits each matrix into 4: @@ -221,7 +221,7 @@ B_{21} & B_{22} \\ \end{pmatrix} $$ -It then computes intermediate products of the $\frac{N}{2} \times \frac{N}{2}$ matrices and combines them to get matrix $C$: +But then it computes intermediate products of the $\frac{N}{2} \times \frac{N}{2}$ matrices and combines them to get matrix $C$: $$ \begin{aligned} @@ -237,10 +237,10 @@ $$ You can verify these formulas with simple substitution if you feel like it. -As far as we know, none of the mainstream optimized linear algebra libraries use the Strassen algorithm, although there are some prototype implementations that become efficient for matrices larger than 4000 or so. +As far as I know, none of the mainstream optimized linear algebra libraries use the Strassen algorithm, although there are some prototype implementations that are efficient for matrices larger than 4000 or so. This technique can and actually has been extended multiple times to reduce the asymptotic even further by considering more submatrix products. As of 2020, current world record is $O(n^{2.3728596})$. 
Whether you can multiply matrices in $O(n^2)$ or at least $O(n^2 \log^k n)$ time is an open problem. ## Further Reading -[Cache-Oblivious Algorithms and Data Structures](https://erikdemaine.org/papers/BRICS2002/paper.pdf) by Erik Demaine. +For a solid theoretical viewpoint, consider reading [Cache-Oblivious Algorithms and Data Structures](https://erikdemaine.org/papers/BRICS2002/paper.pdf) by Erik Demaine. From d0dd961d7571d4364e0c71a7a45a66dc5a744fa6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 27 Jan 2022 20:15:39 +0300 Subject: [PATCH 053/531] data locality edits --- .../hpc/data-structures/binary-search.md | 4 +- .../english/hpc/external-memory/locality.md | 82 ++++++++++--------- 2 files changed, 47 insertions(+), 39 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index efc09c2f..7ce64718 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -75,9 +75,9 @@ If compiler is successful in piercing through the abstractions, it compiles to r ### Temporal Locality -When we find lower bound of $x$ in a sorted array by binary searching, the main problem is that its memory accesses pattern is neither temporary nor spacially local. +When we find lower bound of $x$ in a sorted array by binary searching, the main problem is that its memory accesses pattern is neither temporary nor spatially local. -For example, element $\lfloor \frac n 2 \rfloor$ is accessed very often (every search) and element $\lfloor \frac n 2 \rfloor + 1$ is not, while they are probably occupying the same cache line. In general, only the first 3-5 reads are temporary local and only the last 3-4 reads are spacially local, and the rest are just random memory accesses. 
+For example, element $\lfloor \frac n 2 \rfloor$ is accessed very often (every search) and element $\lfloor \frac n 2 \rfloor + 1$ is not, while they are probably occupying the same cache line. In general, only the first 3-5 reads are temporary local and only the last 3-4 reads are spatially local, and the rest are just random memory accesses. ![](../img/binary-heat.png) diff --git a/content/english/hpc/external-memory/locality.md b/content/english/hpc/external-memory/locality.md index d7ea4af9..a50620ae 100644 --- a/content/english/hpc/external-memory/locality.md +++ b/content/english/hpc/external-memory/locality.md @@ -1,49 +1,57 @@ --- -title: Spacial and Temporal Locality +title: Spatial and Temporal Locality weight: 8 --- + -## Data Locality +To precisely assess the performance of an algorithm in terms of its memory operations, we need to take into account multiple characteristics of the cache system: the number of cache layers, the [memory and block sizes](../hierarchy) of each layer, the exact [strategy](../policies) used for cache eviction by each layer, and sometimes even the details of the [memory paging](../virtual) mechanism. -Abstracting away from the minor details of the cache system helps a lot when designing algorithms. Instead of calculating theoretical cache hit rates, it often makes more sense to reason about cache performance in more abstract qualitative terms. +Abstracting away from all these minor details helps a lot in the first stages of designing algorithms. Instead of calculating theoretical cache hit rates, it often makes more sense to reason about cache performance in more qualitative terms. -We can talk about the degree of cache reuse primarily in two ways: + + +In this context, we can talk about the degree of cache reuse primarily in two ways: - *Temporal locality* refers to the repeated access of the same data within a relatively small time duration, such that the data likely remains cached between the requests. 
-- *Spacial locality* refers to the use of elements relatively close to each other in terms of their memory locations, such that they are likely fetched in the same memory block. +- *Spatial locality* refers to the use of elements relatively close to each other in terms of their memory locations, such that they are likely fetched in the same memory block. -In other words, temporal locality is when it is likely that this same memory location will soon be requested again, while spacial locality is when it is likely that a nearby location will be requested right after. +In other words, temporal locality is when it is likely that this same memory location will soon be requested again, while spatial locality is when it is likely that a nearby location will be requested right after. -We will now go through some examples to show how these concepts can help in optimization. +In this section, we will do some case studies to show how these high-level concepts can help in practical optimization. -### Depth-First and Breadth-First +### Depth-First vs. Breadth-First -Consider a divide-and-conquer algorithm such as merge sort. There are two approaches to implementing it: +Consider a divide-and-conquer algorithm such as merge sorting. There are two approaches to implementing it: -- We can implement it recursively, or "depth-first", the way it is normally implemented: sort the left half, sort the right half, and then merge the results. +- We can implement it recursively, or "depth-first", the way it is normally implemented: sort the left half, sort the right half and then merge the results. - We can implement it iteratively, or "breadth-first": do the lowest "layer" first, looping through the entire dataset and comparing odd elements with even elements, then merge the first two elements with the second two elements, the third two elements with the fourth two elements and so on. 
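As a rough illustration of the two approaches listed above, here is a minimal sketch of both in C++. The function names (`sort_range`, `merge_sort_recursive`, `merge_sort_iterative`) and the use of `std::vector` with `std::merge` are assumptions made for this example and are not taken from the article's code.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Depth-first: sort the half-interval [l, r) recursively, using buf as scratch space.
void sort_range(std::vector<int> &a, std::vector<int> &buf, std::size_t l, std::size_t r) {
    if (r - l <= 1)
        return;
    std::size_t m = l + (r - l) / 2;
    sort_range(a, buf, l, m); // sort the left half
    sort_range(a, buf, m, r); // sort the right half
    // merge the two sorted halves into buf, then copy the result back
    std::merge(a.begin() + l, a.begin() + m, a.begin() + m, a.begin() + r, buf.begin() + l);
    std::copy(buf.begin() + l, buf.begin() + r, a.begin() + l);
}

void merge_sort_recursive(std::vector<int> &a) {
    std::vector<int> buf(a.size());
    sort_range(a, buf, 0, a.size());
}

// Breadth-first: process one "layer" at a time, merging runs of length len
// into runs of length 2 * len; every layer is a full pass over the array.
void merge_sort_iterative(std::vector<int> &a) {
    std::size_t n = a.size();
    std::vector<int> buf(n);
    for (std::size_t len = 1; len < n; len *= 2) {
        for (std::size_t l = 0; l < n; l += 2 * len) {
            std::size_t m = std::min(l + len, n), r = std::min(l + 2 * len, n);
            std::merge(a.begin() + l, a.begin() + m, a.begin() + m, a.begin() + r, buf.begin() + l);
        }
        a.swap(buf); // the merged layer becomes the input of the next one
    }
}
```

Both functions produce the same sorted array; they differ only in the order in which the segments are visited, which is what the locality argument is about.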
It seems like the second approach is more cumbersome, but faster — because recursion is always slow, right? -But this is not the case for this and many similar divide-and-conquer algorithms. Although the iterative approach has the advantage of only doing sequential I/O, the recursive approach has much better temporal locality: when a segment fully fits into cache, it stays there for all lower layers of recursion, resulting in better access times later on. +Generally, recursion is [indeed slow](/hpc/architecture/functions), but this is not the case for this and many similar divide-and-conquer algorithms. Although the iterative approach has the advantage of only doing sequential I/O, the recursive approach has much better temporal locality: when a segment fully fits into the cache, it stays there for all lower layers of recursion, resulting in better access times later on. -In fact, since we only need $O(\log \frac{N}{M})$ layers until this happens, we would only need to read $O(\frac{N}{B} \log \frac{N}{M})$ blocks in total, while in the iterative approach the entire array will be read from scratch $O(\log N)$ times no matter what. The results in the speedup of $O(\frac{\log N}{\log N - \log M})$, which may be up to an order of magnitude. +In fact, since we only need to split the array $O(\log \frac{N}{M})$ times until this happens, we would only need to read $O(\frac{N}{B} \log \frac{N}{M})$ blocks in total, while in the iterative approach the entire array will be read from scratch $O(\log N)$ times no matter what. This results in the speedup of $O(\frac{\log N}{\log N - \log M})$, which may be up to an order of magnitude. -In practice, there is still some overhead associated with the recursion, and for of this reason, it makes sense to use hybrid algorithms where we don't go all the way down to the base case and instead switch to the iterative code on lower levels of recursion. 
+In practice, there is still some overhead associated with the recursion, and for this reason, it makes sense to use hybrid algorithms where we don't go all the way down to the base case and instead switch to the iterative code on the lower levels of recursion. ### Dynamic Programming -A similar reasoning can be applied to the implementations of dynamic programming algorithms. - -Consider the classic knapsack problem, where we got $n$ items with integer costs $c_i$, and we need to pick a subset of items with maximum total cost that does not exceed a given constant $w$. +Similar reasoning can be applied to the implementations of dynamic programming algorithms, although it leads to the opposite conclusion. Consider the classic knapsack problem, where we are given $n$ items with integer costs $c_i$, and we need to pick a subset of items with the maximum total cost that does not exceed a given constant $w$. -The way to solve it is to introduce the state of dynamic $f[i, k]$, which corresponds to the maximum total cost less than $k$ can be achieved having already considered and excluded the first $i$ items. It can be updated in $O(1)$ time per entry, by either taking or not taking the $i$-th item and using further states of the dynamic to compute the optimal decision for each state. +The way to solve it is to introduce the *state* $f[i, k]$, which corresponds to the maximum total cost not exceeding $k$ that can be achieved having already considered and excluded the first $i$ items. The state can be updated in $O(1)$ time per entry if we consider either taking or not taking the $i$-th item and use further states of the dynamic to compute the optimal decision for each state.
-Python has a handy `lru_cache` decorator for implementing it with memoized recursion: +Python has a handy `lru_cache` decorator, which can be used for implementing it with memoized recursion: ```python @lru_cache @@ -55,9 +63,9 @@ def f(i, k): return max(f(i + 1, k), c[i] + f(i + 1, k - w[i])) ``` -When computing $f[n, w]$, the recursion may possibly visit $O(n \cdot w)$ different states, which is asymptotically efficient, but rather slow in reality. Even after nullifying the overhead of Python recursion and the hash table queries required for the LRU cache to work, it would still be slow because it does random I/O throughout most of the execution. +When computing $f[n, w]$, the recursion may visit up to $O(n \cdot w)$ different states, which is asymptotically efficient, but rather slow in reality. Even after nullifying the overhead of Python recursion and all the hash table queries required for the LRU cache to work, it would still be slow because it does random I/O throughout most of the execution. -What we can do instead is to create a 2d array for the dynamic and replace memoized recursion with a nice nested loop like this: +What we can do instead is to create a two-dimensional array for the dynamic and replace the recursion with a nice nested loop like this: ```cpp int f[N + 1][W + 1]; @@ -67,9 +75,9 @@ for (int i = n - 1; i >= 0; i++) f[i][k] = w[i] > k ? f[i + 1][k] : max(f[i + 1][k], c[i] + f[i + 1][k - w[i]]); ``` -Notice that we are only using the previous layer of the dynamic to calculate the next one. This means that if we can store one layer in cache, we would only need to write $O(\frac{n \cdot w}{B})$ blocks in external memory. +Notice that we are only using the previous layer of the dynamic to calculate the next one. This means that if we can store one layer in the cache, we would only need to write $O(\frac{n \cdot w}{B})$ blocks in external memory. 
-Moreover, if we only need the answer, we don't actually have to store the whole 2d array, but only the last layer. This lets us use just $O(w)$ memory by maintaining a single array of $w$ values. To simplify the code, we can slightly change the dynamic to store a binary value: whether it is possible to get the sum of exactly $k$ using the items that we have consider. This dynamic is even faster to compute: +Moreover, if we only need the answer, we don't actually have to store the whole 2d array but only the last layer. This lets us use just $O(w)$ memory by maintaining a single array of $w$ values. To simplify the code, we can slightly change the dynamic to store a binary value: whether it is possible to get the sum of exactly $k$ using the items that we have already considered. This dynamic is even faster to compute: ```cpp bool f[W + 1] = {}; // this zero-fills the array @@ -88,11 +96,11 @@ for (int i = 0; i < n; i++) b |= b << c[i]; ``` -Surprisingly, there is still some room for improvement, and we will come back ot this problem later. +Surprisingly, there is still some room for improvement, and we will come back to this problem later. ### Sparse Table -*Sparse table* is a *static* data structure often used for solving static RMQ problem and computing any similar *idempotent reductions* in general. It can be formally defined as a 2d array of size $\log n \times n$: +*Sparse table* is a *static* data structure that is often used for solving the *static RMQ* problem and computing any similar *idempotent range reductions* in general. It can be formally defined as a two-dimensional array of size $\log n \times n$: $$ t[k][i] = \min \{ a_i, a_{i+1}, \ldots, a_{i+2^k-1} \} @@ -100,7 +108,7 @@ $$ In plain English: we store the minimum on each segment whose length is a power of two. 
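To make the definition above concrete, here is a minimal self-contained sketch of a sparse table for range minimum. The `SparseTable` struct, its per-level `std::vector` layout, and the loop that computes the logarithm are assumptions chosen for this example; the article's own code uses a plain two-dimensional array and the GCC-specific `__lg` instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct SparseTable {
    // t[k][i] = min(a[i], a[i + 1], ..., a[i + 2^k - 1])
    std::vector<std::vector<int>> t;

    explicit SparseTable(const std::vector<int> &a) {
        std::size_t n = a.size();
        t.push_back(a); // level 0: segments of length 1
        for (std::size_t k = 1; (std::size_t(1) << k) <= n; k++) {
            std::size_t len = std::size_t(1) << k;
            t.emplace_back(n - len + 1); // one entry per segment that fits in the array
            for (std::size_t i = 0; i + len <= n; i++)
                t[k][i] = std::min(t[k - 1][i], t[k - 1][i + len / 2]);
        }
    }

    // minimum on the half-interval [l, r); requires l < r <= n
    int rmq(std::size_t l, std::size_t r) const {
        std::size_t k = 0;
        while ((std::size_t(2) << k) <= r - l)
            k++; // k = floor(log2(r - l)), computed portably instead of with __lg
        // two possibly overlapping segments of length 2^k cover [l, r)
        return std::min(t[k][l], t[k][r - (std::size_t(1) << k)]);
    }
};
```

For the array `{3, 1, 4, 1, 5, 9, 2, 6}`, a query such as `rmq(4, 6)` looks at the single level-1 entry covering the elements 5 and 9. The order in which the levels are filled here (all of level $k-1$ before level $k$) is only one of several possible iteration orders.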
-Such array can be used for calculating minima on arbitrary segments in constant time, because for each segment there are two possibly overlapping segments whose sizes is the same power of two, the union of which gives the whole segment. +Such array can be used for calculating minima on arbitrary segments in constant time because for each segment we can always find two possibly overlapping segments whose sizes are the same power of two, the union of which gives the whole segment. ![](../img/sparse-table.png) @@ -113,9 +121,9 @@ int rmq(int l, int r) { // half-interval [l; r) } ``` -The `__lg` function is an intrinsic available in GCC that calculates binary logarithm of a number rounded down. Internally it uses already familiar `clz` ("count leading zeros") instruction and subtracts this count from 32 in case of a 32-bit integer, and thus takes just a few cycles. +The `__lg` function is an intrinsic available in GCC that calculates the binary logarithm of a number rounded down. Internally it uses the `clz` ("count leading zeros") instruction and subtracts this count from 32 (in case of a 32-bit integer), and thus takes just a few cycles. -The reason why I bring it up in this article is because there are multiple alternative ways it can be built, with different performance in terms of I/O operations. In general, sparse table can be built in $O(n \log n)$ time in dynamic programming fashion by iterating in the order of increasing $i$ or $k$ and applying the following identity: +The reason why I bring it up in this article is that there are multiple alternative ways it can be built, with different performances in terms of I/O operations. 
In general, a sparse table can be built in $O(n \log n)$ time in dynamic programming fashion by iterating in the order of increasing $i$ or $k$ and applying the following identity: $$ t[k][i] = \min(t[k-1][i], t[k-1][i+2^{k-1}]) @@ -135,17 +143,17 @@ for (int l = 0; l < logn - 1; l++) This is the only combination of the memory layout and the iteration order that results in beautiful linear passes that work ~3x faster. As an exercise, consider the other three variants and think about *why* they are slower. -### Array-of-Structs and Struct-of-Arrays +### Array-of-Structs vs. Struct-of-Arrays -Suppose you want to implement a binary tree and store its fields in separate arrays like this: +Suppose that you want to implement a binary tree and store its fields in separate arrays like this: ```cpp int left_child[maxn], right_child[maxn], key[maxn], size[maxn]; ``` -Such memory layout, when we store each field separately from others, is called *struct-of-arrays* (SoA). In most cases, when implementing tree operations, you access a node and shortly after request all or most of its data. If these fields are stored separately, this would mean that they are also located in different memory blocks. It some of the requested fields are cached while others are not, you would still have to wait for the data in the lowest layer of cache to arrive. +Such a memory layout, when we store each field separately from the others, is called *struct-of-arrays* (SoA). In most cases, when implementing tree operations, you access a node and then, shortly after, all or most of its internal data. If these fields are stored separately, this would mean that they are also located in different memory blocks. If some of the requested fields happen to be cached while the others are not, you would still have to wait for the slowest of them to be fetched.
-In contrast, if it was instead stored as an array-of-structs (AoS), you would need ~4 times less block reads as all the data of the node is stored in the same block and fetched at once: +In contrast, if it was instead stored as an array-of-structs (AoS), you would need ~4 times fewer block reads as all the data of a node is stored in the same block and fetched at once: ```cpp struct Node { @@ -155,11 +163,11 @@ struct Node { Node t[maxn]; ``` -So the AoS layout is beneficial for data structures, but SoA still has good uses: while it is worse for searching, it is much better for linear scanning. +The AoS layout is usually preferred for data structures, but SoA still has good uses: while it is worse for searching, it is much better for linear scanning. -This difference in design is important in data processing applications. For example, databases can be either row-based or columnar: +This difference in design is important in data processing applications. For example, databases can be either *row-* or *column-oriented* (also called *columnar*): -- *Row-based* storage formats are used when you need to search for a limited amount of objects in a large dataset, and fetch all or most of their fields. -- *Columnar* storage formats are used for big data processing and analytics, where you need to scan through everything anyway to calculate certain statistics. +- *Row-oriented* storage formats are used when you need to search for a limited amount of objects in a large dataset and fetch all or most of their fields. Examples: PostgreSQL, MongoDB. +- *Columnar* storage formats are used for big data processing and analytics, where you need to scan through everything anyway to calculate certain statistics. Examples: ClickHouse, Hbase. -Columnar formats have an additional advantage that you can only read the fields that you need, as different fields are stored in separate external memory regions. 
+Columnar formats have the additional advantage that you can only read the fields that you need, as different fields are stored in separate external memory regions. From 0b55cc23cc378e78ef844d5e9cf230a2bf5feecb Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 27 Jan 2022 20:18:45 +0300 Subject: [PATCH 054/531] change wording --- content/english/hpc/external-memory/locality.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/external-memory/locality.md b/content/english/hpc/external-memory/locality.md index a50620ae..8607506d 100644 --- a/content/english/hpc/external-memory/locality.md +++ b/content/english/hpc/external-memory/locality.md @@ -123,7 +123,7 @@ int rmq(int l, int r) { // half-interval [l; r) The `__lg` function is an intrinsic available in GCC that calculates the binary logarithm of a number rounded down. Internally it uses the `clz` ("count leading zeros") instruction and subtracts this count from 32 (in case of a 32-bit integer), and thus takes just a few cycles. -The reason why I bring it up in this article is that there are multiple alternative ways it can be built, with different performances in terms of I/O operations. In general, a sparse table can be built in $O(n \log n)$ time in dynamic programming fashion by iterating in the order of increasing $i$ or $k$ and applying the following identity: +The reason why I bring it up in this article is that there are multiple alternative ways it can be built, with different efficiencies in terms of memory operations. 
In general, a sparse table can be built in $O(n \log n)$ time in dynamic programming fashion by iterating in the order of increasing $i$ or $k$ and applying the following identity: $$ t[k][i] = \min(t[k-1][i], t[k-1][i+2^{k-1}]) From 97bf189a728603edea6bf7ac66d7ce3058f4cda6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 28 Jan 2022 15:38:35 +0300 Subject: [PATCH 055/531] reorganize cpu-cache --- content/english/hpc/cpu-cache/_index.md | 28 +- content/english/hpc/cpu-cache/aos-soa.md | 49 + .../english/hpc/cpu-cache/associativity.md | 11 +- content/english/hpc/cpu-cache/bandwidth.md | 10 +- content/english/hpc/cpu-cache/cache-lines.md | 7 +- .../english/hpc/cpu-cache/hw-prefetching.md | 3 + content/english/hpc/cpu-cache/img/aos-soa.svg | 1268 +++++++++++++++ .../hpc/cpu-cache/img/permutation-mlp.svg | 1353 +++++++++++++++++ .../hpc/cpu-cache/img/permutation-padded.svg | 1278 ++++++++++++++++ content/english/hpc/cpu-cache/latency.md | 2 +- content/english/hpc/cpu-cache/mlp.md | 38 +- content/english/hpc/cpu-cache/paging.md | 3 + 12 files changed, 4029 insertions(+), 21 deletions(-) create mode 100644 content/english/hpc/cpu-cache/aos-soa.md create mode 100644 content/english/hpc/cpu-cache/img/aos-soa.svg create mode 100644 content/english/hpc/cpu-cache/img/permutation-mlp.svg create mode 100644 content/english/hpc/cpu-cache/img/permutation-padded.svg diff --git a/content/english/hpc/cpu-cache/_index.md b/content/english/hpc/cpu-cache/_index.md index 9bde5517..c3563b74 100644 --- a/content/english/hpc/cpu-cache/_index.md +++ b/content/english/hpc/cpu-cache/_index.md @@ -11,22 +11,9 @@ At this level, we can no longer simply ignore either all arithmetic or memory op To do so, instead of digging ourselves in Intel spec sheets filled with theoretically possible performance metrics, we will estimate these parameters experimentally: by running small benchmark programs that perform access patterns that may realistically occur in real code. 
-### Recall: CPU Caches - -If you jumped to this page straight from Google or just forgot what [we've been doing](../), here is a brief summary of how memory operations work in CPUs: - -- In-between CPU registers and RAM, there is a hierarchy of *caches* that exist to speed up access to frequently used data: "lower" layers are faster, but more expensive and therefore smaller in size. -- Caches are physically a part of CPU. Accessing them takes a fixed amount of time in CPU cycles, so their real access time is proportional to the clock rate. On the contrary, RAM is a separate chip with its own clock rate. Its latencies are therefore better measured in nanoseconds, and not cycles. -- The CPU cache system operates on *cache lines*, which is the basic unit of data transfer between the CPU and the RAM. The size of a cache line is 64 bytes on most architectures, meaning that all main memory is divided into blocks of 64 bytes, and whenever you request (read or write) a single byte, you are also fetching all its 63 cache line neighbors whether your want them or not. -- Memory requests can overlap in time: while you wait for a read request to complete, you can sand a few others, which will be executed concurrently. In some contexts that allow for many concurrent I/O operations it therefore makes more sense to talk abound memory *bandwidth* than *latency*. -- Taking advantage of this free concurrency, it is often beneficial to *prefetch* data that you will likely be accessing soon, if you know its location. You can do this explicitly by using a separate instruction or just by accessing any byte in its cache line, but the most frequent patterns, such as linearly iterating forward or backward over an array, prefetching is already handled by hardware. -- Caching is done transparently; when there isn't enough space to fit a new cache line, the least recently used one automatically gets evicted to the next, slower layer of cache hierarchy. 
The programmer can't control this process explicitly. -- Since implementing "find the oldest among million cache lines" in hardware is unfeasible, each cache layer is split in a number of small "sets", each covering a certain subset of memory locations. *Associativity* is the size of these sets, or, in other terms, how many different "cells" of cache each data location can be mapped to. Higher associativity allows more efficient utilization of cache. -- There are other types of cache inside CPUs that are used for things other than data. The most important for us are *instruction cache* (I-cache), which is used to speed up the fetching of machine code from memory, and *translation lookaside buffer* (TLB), which is used to store physical locations of virtual memory pages, which is instrumental to the efficiency of virtual memory. +### System Setup -The last few points may be a bit hand-wavy, but don't worry: they will become clear as we go along with the experiments and demonstrate it all in action. - -**Setup.** As before, I will be running these experiments on [Ryzen 7 4700U](https://en.wikichip.org/wiki/amd/ryzen_7/4700u), which is a "Zen 2" CPU whose cache-related specs are as follows: +As before, I will be running these experiments on [Ryzen 7 4700U](https://en.wikichip.org/wiki/amd/ryzen_7/4700u), which is a "Zen 2" CPU whose cache-related specs are as follows: - 8 physical cores (without hyper-threading) clocked at 2GHz[^boost]; - 512K of 8-way set associative L1 cache, half of which is instruction cache — meaning 32K per core; @@ -42,6 +29,15 @@ Due to difficulties in [refraining compiler from cheating](..//hpc/analyzing-per I am not going to turn off frequency boosting or silence other programs while doing these benchmarks. The goal is to get realistic values, like when optimizing a video game. +There are more thorough [measurements for Zen 2](https://www.7-cpu.com/cpu/Zen2.html). 
+ + diff --git a/content/english/hpc/cpu-cache/aos-soa.md b/content/english/hpc/cpu-cache/aos-soa.md new file mode 100644 index 00000000..2935c464 --- /dev/null +++ b/content/english/hpc/cpu-cache/aos-soa.md @@ -0,0 +1,49 @@ +--- +title: AoS and SoA +weight: 4 +--- + +Exploit [spatial locality](/hpc/external-memory/locality). + +Let's modify the pointer chasing code so that the next pointer needs to be computed using a variable number of fields. We can either place them in separate arrays, or in the same array. + +The first approach, struct + +```c++ +const int M = N / D; // # of memory accesses +int p[M], q[M][D]; + +iota(p, p + M, 0); +random_shuffle(p, p + M); + +int k = p[M - 1]; + +for (int i = 0; i < M; i++) + q[k][0] = p[i]; + + for (int j = 1; j < D; j++) + q[i][0] ^= (q[j][i] = rand()); + + k = q[k][0]; +} + +for (int i = 0; i < M; i++) { + int x = 0; + for (int j = 0; j < D; j++) + x ^= q[k][j]; + k = x; +} +``` + +Transpose the array and also swap indices in all its accesses: + +```c++ +int q[D][M]; +// ^--^ +``` + +![](../img/aos-soa.svg) + +Running a bit forward: the boosts at powers of two for AoS are due to SIMD, and dips in SoA are due to cache associativity. + + diff --git a/content/english/hpc/cpu-cache/associativity.md b/content/english/hpc/cpu-cache/associativity.md index f1cdd77e..eab0a51c 100644 --- a/content/english/hpc/cpu-cache/associativity.md +++ b/content/english/hpc/cpu-cache/associativity.md @@ -3,7 +3,14 @@ title: Cache Associativity weight: 8 --- -If you looked carefully, you could notice patterns while inspecting the dots below the graph in the previous experiment. These are not just noise: certain step sizes indeed perform much worse than their neighbors. +- Since implementing "find the oldest among million cache lines" in hardware is unfeasible, each cache layer is split in a number of small "sets", each covering a certain subset of memory locations. 
*Associativity* is the size of these sets, or, in other terms, how many different "cells" of cache each data location can be mapped to. Higher associativity allows more efficient utilization of cache. + + +If you looked carefully, you could notice patterns while inspecting the dots below the graph in the [previous experiment](../paging): + +![](../img/strides-hugepages.svg) + +These are not just noise: certain step sizes indeed perform much worse than their neighbors. For example, the stride of 256 corresponding to this loop: @@ -27,6 +34,8 @@ This is not just a single specific bad value: it is the same for all indices tha This effect is due to a feature called *cache associativity*, and an interesting artifact of how CPU caches are implemented in hardware. +### Hardware Caching + When studying memory theoretically using the external memory model, we discussed different ways one can [implement caching policies](/hpc/memory/locality/) in software, and went into detail on particular case of a simple but effective strategy, LRU, which required some non-trivial data manipulation. In the context of hardware, such scheme is called *fully associative cache*. ![Fully associative cache](../img/cache2.png) diff --git a/content/english/hpc/cpu-cache/bandwidth.md b/content/english/hpc/cpu-cache/bandwidth.md index ad04186b..03b9eab9 100644 --- a/content/english/hpc/cpu-cache/bandwidth.md +++ b/content/english/hpc/cpu-cache/bandwidth.md @@ -1,8 +1,13 @@ --- title: Memory Bandwidth -weight: 2 +weight: 1 --- +- In-between CPU registers and RAM, there is a hierarchy of *caches* that exist to speed up access to frequently used data: "lower" layers are faster, but more expensive and therefore smaller in size. + +- Caching is done transparently; when there isn't enough space to fit a new cache line, the least recently used one automatically gets evicted to the next, slower layer of cache hierarchy. The programmer can't control this process explicitly. 
+ + For many algorithms, memory bandwidth is the most important characteristic of the cache system. Coincidentally, it is also the easiest to measure. For our benchmark, let's create an array and linearly iterate over it $K$ times, incrementing its values: @@ -27,6 +32,9 @@ All CPU cache layers are placed on the same microchip as the processor, so bandw To reduce noise, we will run all the remaining benchmarks at plain 2GHz — but the lesson to retain here is that the relative performance of different approaches or decisions between algorithm designs may depend on the clock frequency — unless when we are working with datasets that either fit in cache entirely. +Caches are physically a part of CPU. Accessing them takes a fixed amount of time in CPU cycles, so their real access time is proportional to the clock rate. On the contrary, RAM is a separate chip with its own clock rate. Its latencies are therefore better measured in nanoseconds, and not cycles. + + **Exercise: theoretical peak performance.** By the way, assuming infinite bandwidth, what would the throughput of that loop be? How to verify that the 14 GFLOPS figure is the CPU limit and not L1 peak bandwidth? For that we need to look a bit closer at how the processor will execute the loop. diff --git a/content/english/hpc/cpu-cache/cache-lines.md b/content/english/hpc/cpu-cache/cache-lines.md index 80772983..bd5b4c84 100644 --- a/content/english/hpc/cpu-cache/cache-lines.md +++ b/content/english/hpc/cpu-cache/cache-lines.md @@ -1,8 +1,11 @@ --- title: Cache Lines -weight: 4 +weight: 3 --- +- The CPU cache system operates on *cache lines*, which are the basic unit of data transfer between the CPU and the RAM. The size of a cache line is 64 bytes on most architectures, meaning that all main memory is divided into blocks of 64 bytes, and whenever you request (read or write) a single byte, you are also fetching all its 63 cache line neighbors whether you want them or not.
+ + The most important feature of the memory system is that it deals with cache lines, and not individual bytes. To demonstrate this, let's add "step" parameter to our loop — we will now increment every $D$-th element: @@ -26,3 +29,5 @@ When we change the step parameter to 8, the graphs equalize: ![](../img/strided2.svg) The important lesson is to count the number of cache lines to fetch when analyzing memory-bound algorithms, and not the total count of memory accesses. This becomes increasingly important with larger problem sizes. + +![](../img/permutation-padded.svg) diff --git a/content/english/hpc/cpu-cache/hw-prefetching.md b/content/english/hpc/cpu-cache/hw-prefetching.md index 0c836011..1894b180 100644 --- a/content/english/hpc/cpu-cache/hw-prefetching.md +++ b/content/english/hpc/cpu-cache/hw-prefetching.md @@ -3,6 +3,9 @@ title: Hardware Prefetching weight: 5 --- +- Taking advantage of this free concurrency, it is often beneficial to *prefetch* data that you will likely be accessing soon, if you know its location. You can do this explicitly by using a separate instruction or just by accessing any byte in its cache line, but for the most frequent patterns, such as linearly iterating forward or backward over an array, prefetching is already handled by hardware. + + In the bandwidth benchmark, we iterated over array and fetched its elements. Although separately each memory read in that case is not different from the fetch in pointer chasing, they run much faster because they can are overlapped: and in fact, CPU issues read requests in advance without waiting for the old ones to complete, so that the results come about the same time as the CPU needs them. In fact, this sometimes works even when we are not sure which instruction is going to be executed next.
Consider the following example:
diff --git a/content/english/hpc/cpu-cache/img/aos-soa.svg b/content/english/hpc/cpu-cache/img/aos-soa.svg new file mode 100644 index 00000000..3e23dec6 --- /dev/null +++ b/content/english/hpc/cpu-cache/img/aos-soa.svg @@ -0,0 +1,1268 @@ [SVG markup omitted]
diff --git a/content/english/hpc/cpu-cache/img/permutation-mlp.svg b/content/english/hpc/cpu-cache/img/permutation-mlp.svg new file mode 100644 index 00000000..75056a8a --- /dev/null +++ b/content/english/hpc/cpu-cache/img/permutation-mlp.svg @@ -0,0 +1,1353 @@ [SVG markup omitted]
diff --git a/content/english/hpc/cpu-cache/img/permutation-padded.svg b/content/english/hpc/cpu-cache/img/permutation-padded.svg new file mode 100644 index 00000000..c3dae3be --- /dev/null +++ b/content/english/hpc/cpu-cache/img/permutation-padded.svg @@ -0,0 +1,1278 @@ [SVG markup omitted]
diff --git a/content/english/hpc/cpu-cache/latency.md b/content/english/hpc/cpu-cache/latency.md index 37867121..b8a2f800 100644 --- a/content/english/hpc/cpu-cache/latency.md +++ b/content/english/hpc/cpu-cache/latency.md @@ -1,6 +1,6 @@ --- title: Memory Latency -weight: 1 +weight: 2 --- Despite bandwidth — how many data one can load — is a more complicated concept, it is much easier to observe and measure than latency — how much time it takes to load one cache line.
diff --git a/content/english/hpc/cpu-cache/mlp.md b/content/english/hpc/cpu-cache/mlp.md index 52c331c5..b41ff8bf 100644 --- a/content/english/hpc/cpu-cache/mlp.md +++ b/content/english/hpc/cpu-cache/mlp.md @@ -1,7 +1,41 @@ --- title: Memory-Level Parallelism weight: 3 -draft: true --- -... +- Memory requests can overlap in time: while you wait for a read request to complete, you can send a few others, which will be executed concurrently. In some contexts that allow for many concurrent I/O operations it therefore makes more sense to talk about memory *bandwidth* than *latency*. + +```c++ +const int M = N / D; +int p[M], q[D][M]; + +for (int d = 0; d < D; d++) { + iota(p, p + M, 0); + random_shuffle(p, p + M); + k[d] = p[M - 1]; + for (int i = 0; i < M; i++) + k[d] = q[d][k[d]] = p[i]; +} + +for (int i = 0; i < M; i++) + for (int d = 0; d < D; d++) + k[d] = q[d][k[d]]; +``` + +![](../img/permutation-mlp.svg) + +There is a conflict over registers: + +```nasm +dec edx +movsx rdi, DWORD PTR q[0+rdi*4] +movsx rsi, DWORD PTR q[1048576+rsi*4] +movsx rcx, DWORD PTR q[2097152+rcx*4] +movsx rax, DWORD PTR q[3145728+rax*4] +jne .L9 +``` + +```nasm +mov edx, DWORD PTR q[0+rdx*4] +mov DWORD PTR [rbp-128+rax*4], edx +``` diff --git a/content/english/hpc/cpu-cache/paging.md b/content/english/hpc/cpu-cache/paging.md index f0141b81..27cbb512 100644 --- a/content/english/hpc/cpu-cache/paging.md +++ b/content/english/hpc/cpu-cache/paging.md @@ -3,6 +3,9 @@ title: Memory Paging weight: 7 --- +- There are other types of cache inside CPUs that are used for things other than data. The most important for us are the *instruction cache* (I-cache), which is used to speed up the fetching of machine code from memory, and the *translation lookaside buffer* (TLB), which is used to store physical locations of virtual memory pages, which is instrumental to the efficiency of virtual memory. + + Let's consider other possible values of $D$ and try to measure loop performance.
Since for values larger than 16 we will skip some cache lines altogether, requiring less memory reads and fewer cache, we change the size of the array so that the total number of cache lines fetched is constant. ```cpp From 50f0ac3fc0b5a36ebbe3d911896236df362c1b20 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 28 Jan 2022 21:05:31 +0300 Subject: [PATCH 056/531] maximum vertex cover edits --- .../russian/cs/matching/matching-problems.md | 27 +++++++++---------- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/content/russian/cs/matching/matching-problems.md b/content/russian/cs/matching/matching-problems.md index 12a99f79..cedfe69d 100644 --- a/content/russian/cs/matching/matching-problems.md +++ b/content/russian/cs/matching/matching-problems.md @@ -4,13 +4,14 @@ weight: 3 authors: - Сергей Слотин - Максим Иванов +date: 2022-01-28 --- The matching algorithm itself is nowhere near as hard as reducing problems to it. Let's start with simple examples. -## Cubes +### Cubes We are given $n$ cubes, each with 6 faces, and a letter is written on every face. We are also given a word $s$, and each letter of $s$ has to be matched with its own cube so that we can rotate that cube and get the letter we need. @@ -26,7 +27,7 @@ authors: By the definition of a matching, no cube gets assigned to several letters, and since our matching is maximum, we cover as many letters as possible. -## Dominoes +### Dominoes There is a rectangular $n \times m$ board with some cells removed. We need to place as many dominoes ($1 \times 2$ rectangles) on the board as possible, with the condition that nothing may lie on top of the removed cells. @@ -34,7 +35,7 @@ authors: The answer is a maximum matching in this graph. With Kuhn's algorithm the running time is $O(n^2 m^2)$, because we have $O(nm)$ vertices and edges.
-## Path Cover of a DAG +### Path Cover of a DAG Let's look at a harder problem whose solution is difficult to come up with on your own. @@ -54,21 +55,19 @@ authors: We can now reduce the problem to finding a maximum matching in the bipartite graph $H$. Once this matching is found, we have to convert it into a set of paths in $G$. This is done by a trivial algorithm: take $a_1$, see which $b_k$ it is connected to, look at $a_k$, and so on. Some vertices may remain unsaturated; in that case, zero-length paths from each of these vertices have to be added to the answer. -## Minimum Vertex Cover +### Minimum Vertex Cover -**Problem**. We are given a graph. A *vertex cover* is a set of vertices such that every edge of the graph is incident to at least one vertex of the set. We need to find a vertex cover of the smallest size. +**Problem**. A *vertex cover* of a graph is a set of vertices such that every edge of the graph is incident to at least one vertex of the set. We need to find a vertex cover of the smallest size. -It should be noted that in the general case this is a very hard problem, but for bipartite graphs it has a fairly simple solution. +In the general case this is an NP-complete problem, but for bipartite graphs it has a fairly simple solution. -**Theorem**. $\mid V_{min} \mid \le \mid M \mid$, where $V_{min}$ is a minimum vertex cover and $M$ is a maximum matching. - -**Proof**. $\mid V_{min} \mid \ge \mid M \mid$, since $M$ is a set of independent edges. Now we give an algorithm that builds a vertex cover of size $\mid M \mid$. Obviously, it will be minimum. +Let $V_{min}$ denote a minimum vertex cover and $M$ a maximum matching in the graph. We immediately note that $|V_{min}| \ge |M|$, because $M$ is a set of independent edges. Now we give an algorithm that builds a vertex cover of size exactly $|M|$. Obviously, it will be minimum. **Algorithm**.
Mentally orient the edges of the graph: direct the edges of $M$ from the right part to the left one and all the others from left to right, then run a depth-first search from every vertex of the left part that is not covered by $M$. ![](https://neerc.ifmo.ru/wiki/images/4/4c/Bipartdfs_right.jpg) -Note that the graph has split into several sets: $L^+, L^-, R^+, R^-$, where the "plus" sets are the sets of vertices visited during the traversal. In a graph of this form there are no edges $L^+ \rightarrow R^-$, $L^- \leftarrow R^+$ for obvious reasons. There are no edges $L^+ \leftarrow R^-$ either, because otherwise the matching $M$ would not be maximum: it could be extended with edges of this type. +Note that the graph has split into several sets: $L^+, L^-, R^+, R^-$, where the “plus” sets are the sets of vertices visited during the traversal. In a graph of this form there are no edges $L^+ \rightarrow R^-$ and $L^- \leftarrow R^+$ for obvious reasons. There are no edges $L^+ \leftarrow R^-$ either, because otherwise the matching $M$ would not be maximum: it could be extended with edges of this type. $$ L^- \cup R^+ = V_{min} @@ -78,12 +77,10 @@ $$ **Exercise**. Think about how this can be applied to the problem of finding a [maximum independent set](https://neerc.ifmo.ru/wiki/index.php?title=%D0%A1%D0%B2%D1%8F%D0%B7%D1%8C_%D0%B2%D0%B5%D1%80%D1%88%D0%B8%D0%BD%D0%BD%D0%BE%D0%B3%D0%BE_%D0%BF%D0%BE%D0%BA%D1%80%D1%8B%D1%82%D0%B8%D1%8F_%D0%B8_%D0%BD%D0%B5%D0%B7%D0%B0%D0%B2%D0%B8%D1%81%D0%B8%D0%BC%D0%BE%D0%B3%D0%BE_%D0%BC%D0%BD%D0%BE%D0%B6%D0%B5%D1%81%D1%82%D0%B2%D0%B0). -## Minimum-Weight Matching +### Minimum-Weight Matching Suppose the vertices of the left part have some weights, and we need to pick a maximum matching of minimum weight. -It turns out that we can simply sort the vertices of the left part by weight and try to add them to the matching in that order using the standard Kuhn's algorithm.
- -For a proof of this fact, the reader can read about the [Rado-Edmonds greedy algorithm](/cs/greedy/matroid), of which this modification of Kuhn's algorithm is a special case. +It turns out that we can simply sort the vertices of the left part by weight and try to add them to the matching in that order using the standard Kuhn's algorithm. For a proof of this fact, the reader can read about the [Rado-Edmonds greedy algorithm](/cs/greedy/matroid), of which this modification of Kuhn's algorithm is a special case. -[The analogous problem](/cs/flows/mincost-maxflow) where the *edges* have weights is easiest to solve by reducing it to minimum-cost flow. +The analogous problem where the *edges* have weights is easiest to solve by reducing it to finding a [minimum-cost flow](/cs/flows/mincost-maxflow). From 094ca7477c0128b0c0be6d63fcc67d79d863ad1b Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 28 Jan 2022 23:32:16 +0300 Subject: [PATCH 057/531] cpu-cache intro --- content/english/hpc/cpu-cache/_index.md | 35 +- content/english/hpc/cpu-cache/img/aos-soa.svg | 511 ++++++++++++------ content/english/hpc/cpu-cache/sharing.md | 2 +- 3 files changed, 373 insertions(+), 175 deletions(-) diff --git a/content/english/hpc/cpu-cache/_index.md b/content/english/hpc/cpu-cache/_index.md index c3563b74..fff0d8a9 100644 --- a/content/english/hpc/cpu-cache/_index.md +++ b/content/english/hpc/cpu-cache/_index.md @@ -3,33 +3,38 @@ title: RAM & CPU Caches weight: 9 --- -In the previous chapter, we studied computer memory from theoretical standpoint, using the [external memory model](../external-memory) to estimate performance of memory-bound algorithms. +In the [previous chapter](../external-memory), we studied computer memory from a theoretical standpoint, using the [external memory model](../external-memory/model) to estimate the performance of memory-bound algorithms.
-While it is more or less accurate for computations involving HDDs and network storage, where in-memory arithmetic is negligibly fast compared to external I/O operations, it becomes erroneous on lower levels in the cache hierarchy, where the costs of these operations become comparable. +While it is more or less accurate for computations involving HDDs and network storage, where in-memory arithmetic is negligibly fast compared to the external I/O operations, it is too imprecise for lower levels in the cache hierarchy, where the costs of these operations become comparable. + +To perform more fine-grained optimization of in-memory algorithms, we have to start taking into account the many specific details of the CPU cache system. And instead of studying loads of boring Intel documents with dry specs and theoretically achievable limits, we will estimate these parameters experimentally by running numerous small benchmark programs with access patterns that resemble the ones that often occur in practical code. + + + -### System Setup +### Experimental Setup -As before, I will be running these experiments on [Ryzen 7 4700U](https://en.wikichip.org/wiki/amd/ryzen_7/4700u), which is a "Zen 2" CPU whose cache-related specs are as follows: +As before, I will be running all experiments on Ryzen 7 4700U, which is a "Zen 2" CPU with the following main cache-related specs: -- 8 physical cores (without hyper-threading) clocked at 2GHz[^boost]; -- 512K of 8-way set associative L1 cache, half of which is instruction cache — meaning 32K per core; -- 4M of 8-way set associative L2 cache, or 512K per core; -- 8M of 16-way set associative L3 cache, *shared* between 8 cores (4M actually); -- 16G of DDR4 RAM @ 2667MHz. 
+- 8 physical cores (without hyper-threading) clocked at 2GHz (and 4.1GHz in boost mode — [which we disable](/hpc/profiling/noise)); +- 256K of 8-way set associative L1 data cache or 32K per core; +- 4M of 8-way set associative L2 cache or 512K per core; +- 8M of 16-way set associative L3 cache, [shared](sharing) between 8 cores; +- 16GB (2x8G) of DDR4 RAM @ 2667MHz. -[^boost]: Although the CPU can be clocked at 4.1GHz in boost mode, we will perform most experiments at 2GHz to reduce noise — so keep in mind that in realistic applications the numbers can be multiplied by 2. +You can compare it with your own hardware by running `dmidecode -t cache` or `lshw -class memory` on Linux or by installing [CPU-Z](https://en.wikipedia.org/wiki/CPU-Z) on Windows. You can also find additional details about the CPU on [WikiChip](https://en.wikichip.org/wiki/amd/ryzen_7/4700u) and [7-CPU](https://www.7-cpu.com/cpu/Zen2.html). -You can compare it with your own hardware by running `dmidecode -t cache` or `lshw -class memory` on Linux or just looking it up on WikiChip. + -There are more thorough [measurements for Zen 2](https://www.7-cpu.com/cpu/Zen2.html). +Due to difficulties in [refraining the compiler from cheating](/hpc/profiling/noise/), the code snippets in this article are slightly simplified for exposition purposes. Check the [code repository](https://github.com/sslotin/amh-code/tree/main/cpu-cache) if you want to reproduce them yourself. 
[remaining hunk lines and the img/aos-soa.svg diff omitted: SVG markup not recoverable]
diff --git a/content/english/hpc/cpu-cache/sharing.md b/content/english/hpc/cpu-cache/sharing.md index c9341822..98a02eff 100644 --- a/content/english/hpc/cpu-cache/sharing.md +++ b/content/english/hpc/cpu-cache/sharing.md @@ -1,5 +1,5 @@ --- -title: Memory Sharing +title: Cache and Memory Sharing weight: 4 ---
From ea4315be5155051861e2160dd4f7161351d44ea5 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 28 Jan 2022 23:56:50 +0300 Subject: [PATCH 058/531] reorganize cpu-cache --- content/english/hpc/cpu-cache/aos-soa.md | 49 - .../english/hpc/cpu-cache/associativity.md | 4 +- content/english/hpc/cpu-cache/bandwidth.md | 41 +- .../english/hpc/cpu-cache/img/directional.svg | 1390 +++++++++++++++++ content/english/hpc/cpu-cache/mlp.md | 47 + content/english/hpc/pipelining/limits.md | 13 + 6 files changed, 1476 insertions(+), 68 deletions(-) delete mode 100644 content/english/hpc/cpu-cache/aos-soa.md create mode 100644 content/english/hpc/cpu-cache/img/directional.svg diff --git
a/content/english/hpc/cpu-cache/aos-soa.md +++ /dev/null @@ -1,49 +0,0 @@ ---- -title: AoS and SoA -weight: 4 ---- - -Exploit [spatial locality](/hpc/external-memory/locality). - -Let's modify the pointer chasing code so that the next pointer needs to be computed using a variable number of fields. We can either place them in separate arrays, or in the same array. - -The first approach, struct - -```c++ -const int M = N / D; // # of memory accesses -int p[M], q[M][D]; - -iota(p, p + M, 0); -random_shuffle(p, p + M); - -int k = p[M - 1]; - -for (int i = 0; i < M; i++) - q[k][0] = p[i]; - - for (int j = 1; j < D; j++) - q[i][0] ^= (q[j][i] = rand()); - - k = q[k][0]; -} - -for (int i = 0; i < M; i++) { - int x = 0; - for (int j = 0; j < D; j++) - x ^= q[k][j]; - k = x; -} -``` - -Transpose the array and also swap indices in all its accesses: - -```c++ -int q[D][M]; -// ^--^ -``` - -![](../img/aos-soa.svg) - -Running a bit forward: the boosts at powers of two for AoS are due to SIMD, and dips in SoA are due to cache associativity. - - diff --git a/content/english/hpc/cpu-cache/associativity.md b/content/english/hpc/cpu-cache/associativity.md index eab0a51c..001f9016 100644 --- a/content/english/hpc/cpu-cache/associativity.md +++ b/content/english/hpc/cpu-cache/associativity.md @@ -6,9 +6,7 @@ weight: 8 - Since implementing "find the oldest among million cache lines" in hardware is unfeasible, each cache layer is split in a number of small "sets", each covering a certain subset of memory locations. *Associativity* is the size of these sets, or, in other terms, how many different "cells" of cache each data location can be mapped to. Higher associativity allows more efficient utilization of cache. 
-If you looked carefully, you could notice patterns while inspecting the dots below the graph in the [previous experiment](../paging): - -![](../img/strides-hugepages.svg) +If you looked carefully, you could notice patterns while inspecting the dots below the graph in the [previous experiment](../paging). These are not just noise: certain step sizes indeed perform much worse than their neighbors. diff --git a/content/english/hpc/cpu-cache/bandwidth.md b/content/english/hpc/cpu-cache/bandwidth.md index 03b9eab9..f16091ca 100644 --- a/content/english/hpc/cpu-cache/bandwidth.md +++ b/content/english/hpc/cpu-cache/bandwidth.md @@ -3,14 +3,18 @@ title: Memory Bandwidth weight: 1 --- -- In-between CPU registers and RAM, there is a hierarchy of *caches* that exist to speed up access to frequently used data: "lower" layers are faster, but more expensive and therefore smaller in size. +On the data path between the CPU registers and the RAM, there is a hierarchy of *caches* that exist to speed up access to frequently used data: the layers closer to the processor are are faster, but also smaller in size. The word "faster" here means two things: + +- The time between the moment when a read (or write) is initiated and the moment when it is (latency). +- The number of - Caching is done transparently; when there isn't enough space to fit a new cache line, the least recently used one automatically gets evicted to the next, slower layer of cache hierarchy. The programmer can't control this process explicitly. +--> -For many algorithms, memory bandwidth is the most important characteristic of the cache system. Coincidentally, it is also the easiest to measure. +For many algorithms, *memory bandwidth* is the most important characteristic of the cache system. And at the same time, it is also the easiest to measure. 
-For our benchmark, let's create an array and linearly iterate over it $K$ times, incrementing its values: +For our experiment, we create an array and iterate over it $K$ times incrementing its values: ```cpp int a[N]; @@ -26,26 +30,31 @@ Changing $N$ and adjusting $K$ so that the total number of cells accessed remain You can clearly see the sizes of the cache layers on this graph. When the whole array fits into the lowest layer of cache, the program is bottlenecked by CPU rather than L1 cache bandwidth. As the the array becomes larger, overhead becomes smaller, and the performance approaches this theoretical maximum. But then it drops: first to ~12 GFLOPS when it exceeds L1 cache, and then gradually to about 2.1 GFLOPS when it can no longer fit in L3. -All CPU cache layers are placed on the same microchip as the processor, so bandwidth, latency, all its other characteristics scale with the clock frequency. RAM, on the other side, lives on its own clock, and its characteristics remain constant. This can be seen on these graphs if we run the same benchmark while turning frequency boost on: +### Directional Access -![](../img/boost.svg) +Only read: -To reduce noise, we will run all the remaining benchmarks at plain 2GHz — but the lesson to retain here is that the relative performance of different approaches or decisions between algorithm designs may depend on the clock frequency — unless when we are working with datasets that either fit in cache entirely. +```c++ +for (int i = 0; i < N; i++) + s += a[i]; +``` -Caches are physically a part of CPU. Accessing them takes a fixed amount of time in CPU cycles, so their real access time is proportional to the clock rate. On the contrary, RAM is a separate chip with its own clock rate. Its latencies are therefore better measured in nanoseconds, and not cycles. 
+Only write: +```c++ +// same as memset(a, 0, sizeof a); +for (int i = 0; i < N; i++) + a[i] = 0; +``` - +![](../img/directional.svg) -**Exercise: theoretical peak performance.** By the way, assuming infinite bandwidth, what would the throughput of that loop be? How to verify that the 14 GFLOPS figure is the CPU limit and not L1 peak bandwidth? For that we need to look a bit closer at how the processor will execute the loop. +### Frequency Scaling -Incrementing an array can be done with SIMD; when compiled, it uses just two operations per 8 elements — performing the read-fused addition and writing the result back: +All CPU cache layers are placed on the same microchip as the processor, so bandwidth, latency, all its other characteristics scale with the clock frequency. RAM, on the other side, lives on its own clock, and its characteristics remain constant. This can be seen on these graphs if we run the same benchmark while turning frequency boost on: -```asm -vpaddd ymm0, ymm1, YMMWORD PTR [rax] -vmovdqa YMMWORD PTR [rax], ymm0 -``` +![](../img/boost.svg) -This computation is bottlenecked by the write, which has a throughput of 1. This means that we can theoretically increment and write back 8 values per cycle on average, yielding the performance of 2 GHz × 8 = 16 GFLOPS (or 32.8 in boost mode), which is fairly close to what we observed. +To reduce noise, we will run all the remaining benchmarks at plain 2GHz — but the lesson to retain here is that the relative performance of different approaches or decisions between algorithm designs may depend on the clock frequency — unless when we are working with datasets that either fit in cache entirely. -On all modern architectures, you can typically assume that you won't ever be bottlenecked by the throughput of L1 cache, but rather by the read/write execution ports or the arithmetic. 
In these extreme cases, it may be beneficial to store some data in registers without touching any of the memory, which we will cover later in the book. +Caches are physically a part of CPU. Accessing them takes a fixed amount of time in CPU cycles, so their real access time is proportional to the clock rate. On the contrary, RAM is a separate chip with its own clock rate. Its latencies are therefore better measured in nanoseconds, and not cycles.
diff --git a/content/english/hpc/cpu-cache/img/directional.svg b/content/english/hpc/cpu-cache/img/directional.svg new file mode 100644 index 00000000..7a07815c --- /dev/null +++ b/content/english/hpc/cpu-cache/img/directional.svg @@ -0,0 +1,1390 @@ [SVG markup omitted]
diff --git a/content/english/hpc/cpu-cache/mlp.md b/content/english/hpc/cpu-cache/mlp.md index b41ff8bf..d54ec60a 100644 --- a/content/english/hpc/cpu-cache/mlp.md +++ b/content/english/hpc/cpu-cache/mlp.md @@ -39,3 +39,50 @@ jne .L9 mov edx, DWORD PTR q[0+rdx*4] mov DWORD PTR [rbp-128+rax*4], edx ``` + +### AoS and SoA + +Exploit [spatial
locality](/hpc/external-memory/locality). + +Let's modify the pointer chasing code so that the next pointer needs to be computed using a variable number of fields. We can either place them in separate arrays, or in the same array. + +The first approach, struct + +```c++ +const int M = N / D; // # of memory accesses +int p[M], q[M][D]; + +iota(p, p + M, 0); +random_shuffle(p, p + M); + +int k = p[M - 1]; + +for (int i = 0; i < M; i++) + q[k][0] = p[i]; + + for (int j = 1; j < D; j++) + q[i][0] ^= (q[j][i] = rand()); + + k = q[k][0]; +} + +for (int i = 0; i < M; i++) { + int x = 0; + for (int j = 0; j < D; j++) + x ^= q[k][j]; + k = x; +} +``` + +Transpose the array and also swap indices in all its accesses: + +```c++ +int q[D][M]; +// ^--^ +``` + +![](../img/aos-soa.svg) + +Running a bit forward: the boosts at powers of two for AoS are due to SIMD, and dips in SoA are due to cache associativity. + + \ No newline at end of file diff --git a/content/english/hpc/pipelining/limits.md b/content/english/hpc/pipelining/limits.md index d4679825..0e9526e2 100644 --- a/content/english/hpc/pipelining/limits.md +++ b/content/english/hpc/pipelining/limits.md @@ -39,3 +39,16 @@ However hard you try, you can't make the latency lower than the slowest memory r There is an FMA instruction. This is the number usually reported. + +**Exercise: theoretical peak performance.** By the way, assuming infinite bandwidth, what would the throughput of that loop be? How to verify that the 14 GFLOPS figure is the CPU limit and not L1 peak bandwidth? For that we need to look a bit closer at how the processor will execute the loop. + +Incrementing an array can be done with SIMD; when compiled, it uses just two operations per 8 elements — performing the read-fused addition and writing the result back: + +```asm +vpaddd ymm0, ymm1, YMMWORD PTR [rax] +vmovdqa YMMWORD PTR [rax], ymm0 +``` + +This computation is bottlenecked by the write, which has a throughput of 1. 
This means that we can theoretically increment and write back 8 values per cycle on average, yielding the performance of 2 GHz × 8 = 16 GFLOPS (or 32.8 in boost mode), which is fairly close to what we observed. + +On all modern architectures, you can typically assume that you won't ever be bottlenecked by the throughput of L1 cache, but rather by the read/write execution ports or the arithmetic. In these extreme cases, it may be beneficial to store some data in registers without touching any of the memory, which we will cover later in the book. From db1827d00af4f20049b1fad627220ced19d48b10 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 29 Jan 2022 20:13:40 +0300 Subject: [PATCH 059/531] memory bandwidth --- content/english/hpc/cpu-cache/bandwidth.md | 78 +- .../english/hpc/cpu-cache/img/directional.svg | 374 ++--- .../hpc/cpu-cache/img/non-temporal.svg | 1480 +++++++++++++++++ .../hpc/simd/{loading.md => moving.md} | 0 4 files changed, 1726 insertions(+), 206 deletions(-) create mode 100644 content/english/hpc/cpu-cache/img/non-temporal.svg rename content/english/hpc/simd/{loading.md => moving.md} (100%) diff --git a/content/english/hpc/cpu-cache/bandwidth.md b/content/english/hpc/cpu-cache/bandwidth.md index f16091ca..d0c39ffc 100644 --- a/content/english/hpc/cpu-cache/bandwidth.md +++ b/content/english/hpc/cpu-cache/bandwidth.md @@ -3,18 +3,14 @@ title: Memory Bandwidth weight: 1 --- -On the data path between the CPU registers and the RAM, there is a hierarchy of *caches* that exist to speed up access to frequently used data: the layers closer to the processor are are faster, but also smaller in size. The word "faster" here means two things: +On the data path between the CPU registers and the RAM, there is a hierarchy of *caches* that exist to speed up access to frequently used data: the layers closer to the processor are faster but also smaller in size. 
The word "faster" here applies to two closely related but separate timings: -- The time between the moment when a read (or write) is initiated and the moment when it is (latency). -- The number of +- The delay between the moment when a read or a write is initiated and when the data arrives (*latency*). +- The number of memory operations that can be processed per unit of time (*bandwidth*). -- Caching is done transparently; when there isn't enough space to fit a new cache line, the least recently used one automatically gets evicted to the next, slower layer of cache hierarchy. The programmer can't control this process explicitly. +For many algorithms, memory bandwidth is the most important characteristic of the cache system. And at the same time, it is also the easiest to measure, so we are going to start with it. ---> - -For many algorithms, *memory bandwidth* is the most important characteristic of the cache system. And at the same time, it is also the easiest to measure. - -For our experiment, we create an array and iterate over it $K$ times incrementing its values: +For our experiment, we create an array and iterate over it $K$ times, incrementing its values: ```cpp int a[N]; @@ -24,37 +20,81 @@ for (int t = 0; t < K; t++) a[i]++; ``` -Changing $N$ and adjusting $K$ so that the total number of cells accessed remains roughly constant, and normalizing the timings as "operations per second", we get the following results: +Changing $N$ and adjusting $K$ so that the total number of array cells accessed remains roughly constant and expressing the total time in "operations per second", we get a graph like this: ![Dotted vertical lines are cache layer sizes](../img/inc.svg) -You can clearly see the sizes of the cache layers on this graph. When the whole array fits into the lowest layer of cache, the program is bottlenecked by CPU rather than L1 cache bandwidth. 
As the the array becomes larger, overhead becomes smaller, and the performance approaches this theoretical maximum. But then it drops: first to ~12 GFLOPS when it exceeds L1 cache, and then gradually to about 2.1 GFLOPS when it can no longer fit in L3. +You can clearly see the cache sizes on this graph: + +- When the whole array fits into the lowest layer of cache, the program is bottlenecked by the CPU rather than the L1 cache bandwidth. As the array becomes larger, the overhead associated with the first iterations of the loop becomes smaller, and the performance gets closer to its theoretical maximum of 16 GFLOPS. +- But then the performance drops: first to 12-13 GFLOPS when it exceeds the L1 cache, and then gradually to about 2 GFLOPS when it can no longer fit in the L3 cache. + +This situation is typical for many lightweight loops. + +### Frequency Scaling + +All CPU cache layers are placed on the same microchip as the processor, so their bandwidth, latency, and all other characteristics scale with the clock frequency. The RAM, on the other hand, lives on its own fixed clock, and its characteristics remain constant. We can observe this by re-running the same benchmark with turbo boost on: + +![](../img/boost.svg) + +This detail comes into play when comparing algorithm implementations. Unless the dataset fits entirely in the cache, the relative performance of the two implementations may differ depending on the CPU clock rate because the RAM remains unaffected by it, while everything else scales with it. + +For this reason, it is [advised](/hpc/profiling/noise) to keep the clock rate fixed, and as the turbo boost isn't stable enough, we run most of the benchmarks in this book at plain 2GHz. ### Directional Access -Only read: +On each iteration, we need to fetch a value, increment it, and then write it back — so this loop simultaneously performs both reads and writes during its execution.
For many applications we only need to do one of them, so let's try to measure one-directional bandwidth. + +An array sum would only require memory reads: ```c++ for (int i = 0; i < N; i++) s += a[i]; ``` -Only write: +And zeroing an array or filling it with any other value would only require memory writes: ```c++ -// same as memset(a, 0, sizeof a); for (int i = 0; i < N; i++) a[i] = 0; ``` +Both loops are trivially vectorized by the compiler, and the second one is actually replaced with `memset`, so the CPU is also not the bottleneck here, except when the array fits into the L1 cache. + ![](../img/directional.svg) -### Frequency Scaling +The reason why unidirectional and bidirectional memory accesses would perform differently is that they share the cache and memory buses and other CPU facilities. In the case of RAM, this causes a twofold difference in performance between the pure read and simultaneous read and write scenarios because the memory controller has to switch between the modes on the one-way memory bus, thus halving the bandwidth. The performance drop is less severe for the L2 cache: the bottleneck here is not the cache bus, so the incrementing loop loses by only ~15%. -All CPU cache layers are placed on the same microchip as the processor, so bandwidth, latency, all its other characteristics scale with the clock frequency. RAM, on the other side, lives on its own clock, and its characteristics remain constant. This can be seen on these graphs if we run the same benchmark while turning frequency boost on: +There is one interesting anomaly on the graph, namely that the write-only loop performs the same as the read-and-write one when the array hits the L3 cache and the RAM. This is because the CPU moves the data to the highest level of cache on each access, whether it is a read or a write — which is typically a good optimization, as in many use cases we will be needing it soon. 
When reading data, this isn't a problem, as the data travels through the cache hierarchy anyway, but when writing, this causes another implicit read to be dispatched right after a write — thus requiring twice the bus bandwidth. -![](../img/boost.svg) +### Bypassing the Cache + +We can prevent the CPU from prefetching the data that we have just written by using *non-temporal* memory accesses. To do this, we need to re-implement the zeroing loop more directly without relying on compiler vectorization. + +Ignoring a few special cases, `memset` and auto-vectorized assignment loops just [move](/hpc/simd/moving) 32-byte blocks of data with [SIMD instructions](/hpc/simd) under the hood: + +```c++ +const __m256i zeros = _mm256_set1_epi32(0); + +for (int i = 0; i + 7 < N; i += 8) + _mm256_store_si256((__m256i*) &a[i], zeros); +``` + +We can replace the usual vector store intrinsic with a *non-temporal* one: + +```c++ +const __m256i zeros = _mm256_set1_epi32(0); + +for (int i = 0; i + 7 < N; i += 8) + _mm256_stream_si256((__m256i*) &a[i], zeros); +``` + +Non-temporal memory reads or writes are a way to tell the CPU that we won't be needing the data that we have just accessed in the future, so there is no need to read the data back after a write. + +![](../img/non-temporal.svg) + +On the one hand, if the array is small enough to fit into the cache, and we actually access it a short time later, this has a negative effect because we have to read it entirely from the RAM (or, in this case, we have to *write* it into the RAM instead of using a locally cached version). And on the other, this prevents read-backs and lets us use the memory bus more efficiently.
-To reduce noise, we will run all the remaining benchmarks at plain 2GHz — but the lesson to retain here is that the relative performance of different approaches or decisions between algorithm designs may depend on the clock frequency — unless when we are working with datasets that either fit in cache entirely. +In fact, the performance increase in the case of the RAM is even more than 2x. My theory here is that it is because the memory controller doesn't have to switch modes this way and also because the instruction sequence becomes simpler, and the CPU can handle more pending memory operations — after all, a [single core can't saturate the memory bandwidth](../sharing). -Caches are physically a part of CPU. Accessing them takes a fixed amount of time in CPU cycles, so their real access time is proportional to the clock rate. On the contrary, RAM is a separate chip with its own clock rate. Its latencies are therefore better measured in nanoseconds, and not cycles. +The same technique generalizes to `memcpy`: it also just moves 32-byte blocks with SIMD load/store instructions, and it can be similarly made non-temporal, increasing the throughput twofold for large arrays. There is also a non-temporal load instruction (`_mm256_stream_load_si256`) for when you want to *read* without polluting cache (e. g. when you don't need the original array after a `memcpy`, but will need some data that you had accessed before calling it). 
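
The `memcpy` generalization described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not how any particular `memcpy` is implemented: `stream_copy` is a hypothetical helper name, it uses 128-bit SSE2 intrinsics instead of the 256-bit AVX2 ones from the text so that it compiles on x86-64 without extra flags, and it falls back to a plain `memcpy` on other targets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

#if defined(__SSE2__)
#include <emmintrin.h> // SSE2: _mm_load_si128, _mm_stream_si128, _mm_sfence
#endif

// Hypothetical helper: copies n bytes from src to dst using non-temporal stores.
// Assumes both pointers are 16-byte aligned and n is a multiple of 16.
void stream_copy(void *dst, const void *src, std::size_t n) {
    assert(reinterpret_cast<std::uintptr_t>(dst) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(src) % 16 == 0);
    assert(n % 16 == 0);
#if defined(__SSE2__)
    auto *d = static_cast<__m128i*>(dst);
    auto *s = static_cast<const __m128i*>(src);
    for (std::size_t i = 0; i < n / 16; i++) {
        __m128i block = _mm_load_si128(s + i); // regular aligned load
        _mm_stream_si128(d + i, block);        // store that bypasses the cache
    }
    _mm_sfence(); // order the streamed stores before any later stores
#else
    std::memcpy(dst, src, n); // portable fallback without cache bypassing
#endif
}
```

As with the zeroing loop, this is only expected to pay off for arrays that are much larger than the cache and are not read again shortly after the copy.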
diff --git a/content/english/hpc/cpu-cache/img/directional.svg b/content/english/hpc/cpu-cache/img/directional.svg index 7a07815c..9badafe2 100644 --- a/content/english/hpc/cpu-cache/img/directional.svg +++ b/content/english/hpc/cpu-cache/img/directional.svg [SVG markup changes omitted] diff --git a/content/english/hpc/cpu-cache/img/non-temporal.svg b/content/english/hpc/cpu-cache/img/non-temporal.svg new file mode 100644 index 00000000..78ca301e --- /dev/null +++ b/content/english/hpc/cpu-cache/img/non-temporal.svg @@ -0,0 +1,1480 @@ [1480 lines of SVG markup omitted] diff --git a/content/english/hpc/simd/loading.md b/content/english/hpc/simd/moving.md similarity index 100% rename from content/english/hpc/simd/loading.md rename to content/english/hpc/simd/moving.md From 9c316c6e1cab89638f4048591586f51e0d0e205e Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 30 Jan 2022 03:12:04 +0300 Subject: [PATCH 060/531] latency draft --- content/english/hpc/cpu-cache/cache-lines.md | 18 ++++++ content/english/hpc/cpu-cache/latency.md | 63 ++++++++++++++------ 2 files changed, 64 insertions(+), 17 deletions(-) diff --git a/content/english/hpc/cpu-cache/cache-lines.md b/content/english/hpc/cpu-cache/cache-lines.md index bd5b4c84..ae8bd0d3 100644 --- a/content/english/hpc/cpu-cache/cache-lines.md +++ b/content/english/hpc/cpu-cache/cache-lines.md @@ -31,3 +31,21 @@ When we change the step parameter to 8, the graphs equalize: The important lesson is to count the number of cache lines to fetch when analyzing memory-bound algorithms, and not the total count of memory accesses. This becomes increasingly important with larger problem sizes. ![](../img/permutation-padded.svg) + +### Other Blocks + +Pages + +RAM rows. 3-4 nanosecond increase. + + + + diff --git a/content/english/hpc/cpu-cache/latency.md b/content/english/hpc/cpu-cache/latency.md index b8a2f800..3e7ec790 100644 --- a/content/english/hpc/cpu-cache/latency.md +++ b/content/english/hpc/cpu-cache/latency.md @@ -3,18 +3,19 @@ title: Memory Latency weight: 2 --- -Despite bandwidth — how many data one can load — is a more complicated concept, it is much easier to observe and measure than latency — how much time it takes to load one cache line.
+Despite that [bandwidth](../bandwidth) is a more complicated concept, it is much easier to observe and measure than latency: you can simply execute a long series of independent read or write queries, and the scheduler, having access to them in advance, reorders and overlaps them, hiding their latency and maximizing the total throughput. -Measuring memory bandwidth is easy because the CPU can simply queue up multiple iterations of data-parallel loops like the one above. The scheduler gets access to the needed memory locations far in advance and can dispatch read requests in a way that will overlap all memory operations, hiding the latency. - -To measure latency, we need to design an experiment where the CPU can't cheat by knowing the memory location in advance. We can do this like this: generate a random permutation of size $n$ that corresponds a full cycle, and then repeatedly follow the permutation. +To measure *latency*, we need to design an experiment where the CPU can't cheat by knowing the memory locations we are going to request in advance. One way to ensure this is to generate a random permutation of size $N$ that corresponds to a full cycle, and then repeatedly follow the permutation: ```cpp int p[N], q[N]; +// generating a random permutation iota(p, p + N, 0); random_shuffle(p, p + N); +// this permutation may contain multiple cycles, +// so instead we use it to construct another permutation with a single cycle int k = p[N - 1]; for (int i = 0; i < N; i++) k = q[k] = p[i]; @@ -24,30 +25,58 @@ for (int t = 0; t < K; t++) k = q[k]; ``` -This performance anti-pattern is known as *pointer chasing*, and it very frequent in software, especially written in high-level languages. Iterating an array this way is considerably slower. +Compared to linear iteration, it is *much* slower — by multiple orders of magnitude — to visit all elements of an array this way. 
Not only does it make [SIMD](/hpc/simd) impossible, but it also [stalls the pipeline](/hpc/pipelining), creating a large traffic jam of instructions, all waiting for a single piece of data to be fetched from the memory. + +This performance anti-pattern is known as *pointer chasing*, and it is very frequent in data structures, especially those written in high-level languages that use lots of heap-allocated objects and pointers to them necessary for dynamic typing. + +![](../img/latency-throughput.svg) + +When talking about latency, it makes more sense to use cycles or nanoseconds rather than throughput units. So we will replace this graph with its reciprocal: ![](../img/permutation-latency.svg) -When speaking of latency, it makes more sense to use cycles or nanoseconds rather than bandwidth units. So we will replace this graph with its reciprocal: +Note that the cliffs on both graphs aren't as distinctive as they were for the bandwidth. This is because we still have some chance of hitting the previous layer of cache even if the array can't fit into it entirely. More formally, if there are $k$ levels in the cache hierarchy with sizes $s_i$ and latencies $l_i$, then their expected latency will be -![](../img/latency-throughput.svg) +$$ +E[L] = \frac{ + s_1 \cdot l_1 + + (s_2 - s_1) \cdot l_2 +% + (s_3 - s_2) \cdot l_3 + + \ldots + + (N - s_k) \cdot l_{RAM} + }{N} +$$ -It is generally *much* slower — by multiple orders of magnitude — to iterate an array this way. Not only because it makes SIMD practically impossible, but also because it stalls the pipeline a lot. +instead of just being equal to the slowest access. -### Latency of RAM and TLB +Since the sizes and latencies of adjacent layers typically differ by almost an order of magnitude, the graph of reciprocal latency, as we increase the size of the array, should roughly look as if it were composed of a few transposed and scaled hyperbolas.
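
To make the expected-latency formula above more tangible, here is a minimal sketch that evaluates it numerically. The function name `expected_latency` and all sizes and latencies used with it are hypothetical and are not measurements from this machine; the only assumption beyond the formula itself is that sizes are clamped to $N$ so that it also works for arrays smaller than some cache layer.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Evaluates E[L] = (s_1 * l_1 + (s_2 - s_1) * l_2 + ... + (N - s_k) * l_RAM) / N.
// `sizes` must be increasing and use the same units as `n`.
double expected_latency(double n,
                        const std::vector<double> &sizes,
                        const std::vector<double> &latencies,
                        double l_ram) {
    double total = 0;   // sum of (portion of the array) * (latency of its layer)
    double covered = 0; // part of the array already attributed to faster layers
    for (std::size_t i = 0; i < sizes.size(); i++) {
        double upto = std::min(n, sizes[i]); // a layer can't hold more than the array
        if (upto > covered) {
            total += (upto - covered) * latencies[i];
            covered = upto;
        }
    }
    total += (n - covered) * l_ram; // whatever doesn't fit in any cache is in the RAM
    return total / n;
}
```

For example, with two hypothetical layers of sizes 32 and 512 and latencies 1 and 10, plus a RAM latency of 100, an array of size 1024 gives $(32 \cdot 1 + 480 \cdot 10 + 512 \cdot 100) / 1024 \approx 54.7$, which is well below the RAM latency even though half of the array doesn't fit in the cache.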
-Similar to bandwidth, the latency of CPU cache scales with its clock frequency, while the RAM lives on its own fixed-frequency clock, and its performance is therefore usually measured in nanoseconds. We can observe this difference if we change the frequency by turning turbo boost on. +we aren't exactly measuring latency in this experiment. -![](../img/permutation-boost.svg) +$$ +E[L] = \frac{N \cdot l_{last} - C}{N} = l_{last} - \frac{C}{N} +$$ -The graph starts making a bit more sense if we look at the relative speedup instead. +$$ +\frac{1}{l_{last} - \frac{C}{N}} += \frac{N}{N \cdot l_{last} - C} +$$ -![](../img/permutation-boost-speedup.svg) + -Actually, TLB misses may stall memory reads for the same reason. The TLB cache is called "lookaside" because the lookup can happen independently from normal data cache lookups. L1 and L2 caches on the other side are private to the core, and so they can store virtual addresses and be queried concurrently with TLB — after fetching a cache line, its tag is used to restore the physical address, which is then checked against the concurrently fetched TLB entry. This trick does not work for shared memory however, because their bandwidth is limited, and dispatching read queries there for no reason is not a good idea in general. So we can observe a similar effect in L3 and RAM reads when the page does not fit L1 TLB and L2 TLB respectively. +### Frequency Scaling -For sparse reads, it often makes sense to increase page size, which improves the latency. +Similar to bandwidth, the latency of all CPU caches proportionally scales with its clock frequency, while the RAM does not. We can also observe this difference if we change the frequency by turning turbo boost on. + +![](../img/permutation-boost.svg) + +The graph starts making a more sense if we plot it as a relative speedup. 
+ +![](../img/permutation-boost-speedup.svg) -It is possible, but quite tedious to also construct an experiment actually measuring all this — so you will have to take my word on that one. +You would expect 2x rates for array sizes that fit into the CPU cache entirely, but then roughly equal rates for arrays stored in the RAM. But this is not quite what is happening: there is a small, fixed-latency delay on the lower-clocked run even for RAM accesses. This happens because the CPU first has to check its cache before dispatching a read query to the main memory — to save RAM bandwidth for other processes that potentially need it. From 10cafe7cd161e2241649535fe37ab4e1bccca04e Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 30 Jan 2022 12:56:01 +0300 Subject: [PATCH 061/531] pointer chasing with hugepages --- .../cpu-cache/img/permutation-hugepages.svg | 1184 +++++++++++++++++ content/english/hpc/cpu-cache/paging.md | 22 + 2 files changed, 1206 insertions(+) create mode 100644 content/english/hpc/cpu-cache/img/permutation-hugepages.svg diff --git a/content/english/hpc/cpu-cache/img/permutation-hugepages.svg b/content/english/hpc/cpu-cache/img/permutation-hugepages.svg new file mode 100644 index 00000000..2a3feb42 --- /dev/null +++ b/content/english/hpc/cpu-cache/img/permutation-hugepages.svg @@ -0,0 +1,1184 @@ [1184 lines of SVG markup omitted]
diff --git a/content/english/hpc/cpu-cache/paging.md b/content/english/hpc/cpu-cache/paging.md index 27cbb512..339de0c9 100644 --- a/content/english/hpc/cpu-cache/paging.md +++ b/content/english/hpc/cpu-cache/paging.md @@ -37,3 +37,25 @@ This flattens the curve: Typical size of a page is 4KB, but it can be up to 1G or so for large databases, but enabling it by default is not a good idea as scenarios when we have a VPS with 256M of RAM and more than 256 processes are not uncommon. Typical page sizes are 4K, 2M and 1G (e. g. allowing for 256K, 128M, 64G memory regions to be stored in a 64-entry L1 TLB respectively). + +You can only request a memory region to be allocated using huge pages if it has the corresponding alignment. + +```c++ +#include <cstdlib> +#include <sys/mman.h> + +void *ptr = std::aligned_alloc(page_size, array_size); +madvise(ptr, array_size, MADV_HUGEPAGE); +``` + +On Windows, the allocation and the huge page request are fused into a single call: + +```c++ +#include "memoryapi.h" + +void *ptr = VirtualAlloc(NULL, array_size, + MEM_RESERVE | MEM_COMMIT | MEM_LARGE_PAGES, PAGE_READWRITE); +``` + +In both cases, `array_size` should be a multiple of `page_size`. + +![](../img/permutation-hugepages.svg) From c2e638c19530f7f29d43f14fd07c86ccb34bf4a9 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 30 Jan 2022 13:12:07 +0300 Subject: [PATCH 062/531] cpu-cache acknowledgements --- content/english/hpc/cpu-cache/_index.md | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/content/english/hpc/cpu-cache/_index.md b/content/english/hpc/cpu-cache/_index.md index fff0d8a9..484a39dc 100644 --- a/content/english/hpc/cpu-cache/_index.md +++ b/content/english/hpc/cpu-cache/_index.md @@ -26,7 +26,7 @@ As before, I will be running all experiments on Ryzen 7 4700U, which is a "Zen 2 - 8M of 16-way set associative L3 cache, [shared](sharing) between 8 cores; - 16GB (2x8G) of DDR4 RAM @ 2667MHz.
-You can compare it with your own hardware by running `dmidecode -t cache` or `lshw -class memory` on Linux or by installing [CPU-Z](https://en.wikipedia.org/wiki/CPU-Z) on Windows. You can also find additional details about the CPU on [WikiChip](https://en.wikichip.org/wiki/amd/ryzen_7/4700u) and [7-CPU](https://www.7-cpu.com/cpu/Zen2.html). +You can compare it with your own hardware by running `dmidecode -t cache` or `lshw -class memory` on Linux or by installing [CPU-Z](https://en.wikipedia.org/wiki/CPU-Z) on Windows. You can also find additional details about the CPU on [WikiChip](https://en.wikichip.org/wiki/amd/ryzen_7/4700u) and [7-CPU](https://www.7-cpu.com/cpu/Zen2.html). Not all conclusions will generalize to every CPU platform in existence. From 2837ecf6b654f3f75abbeca9cd6e2f34acbf643d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 30 Jan 2022 13:48:44 +0300 Subject: [PATCH 063/531] memory latency edits --- content/english/hpc/cpu-cache/latency.md | 31 +++++++++++++++--------- 1 file changed, 20 insertions(+), 11 deletions(-) diff --git a/content/english/hpc/cpu-cache/latency.md b/content/english/hpc/cpu-cache/latency.md index 3e7ec790..b7a43827 100644 --- a/content/english/hpc/cpu-cache/latency.md +++ b/content/english/hpc/cpu-cache/latency.md @@ -5,7 +5,7 @@ weight: 2 Despite that [bandwidth](../bandwidth) is a more complicated concept, it is much easier to observe and measure than latency: you can simply execute a long series of independent read or write queries, and the scheduler, having access to them in advance, reorders and overlaps them, hiding their latency and maximizing the total throughput. -To measure *latency*, we need to design an experiment where the CPU can't cheat by knowing the memory locations we are going to request in advance. 
One way to ensure this is to generate a random permutation of size $N$ that corresponds to a full cycle, and then repeatedly follow the permutation: +To measure *latency*, we need to design an experiment where the CPU can't cheat by knowing the memory locations we will request in advance. One way to ensure this is to generate a random permutation of size $N$ that corresponds to a cycle and then repeatedly follow the permutation: ```cpp int p[N], q[N]; @@ -31,11 +31,11 @@ This performance anti-pattern is known as *pointer chasing*, and it is very freq ![](../img/latency-throughput.svg) -When talking about latency, it makes more sense to use cycles or nanoseconds rather than throughput units. So we will replace this graph with its reciprocal: +When talking about latency, it makes more sense to use cycles or nanoseconds rather than throughput units, so we replace this graph with its reciprocal: ![](../img/permutation-latency.svg) -Note that the cliffs on both graphs aren't as distinctive as they were for the bandwidth. This is because we still have some chance of hitting the previous layer of cache even if the array can't fit into it entirely. More formally, there are $k$ levels in the cache hierarchy with sizes $s_i$ and latencies $l_i$, then their expected latency will be +Note that the cliffs on both graphs aren't as distinctive as they were for the bandwidth. This is because we still have some chance of hitting the previous layer of cache even if the array can't fit into it entirely. More formally, if there are $k$ levels in the cache hierarchy with sizes $s_i$ and latencies $l_i$, then, instead of being equal to the slowest access, their expected latency will be: $$ E[L] = \frac{ @@ -47,21 +47,28 @@ E[L] = \frac{ }{N} $$ -instead of just being equal to the slowest access. - -Since sizes and latencies typically differ by almost an order of magnitude. When we increase the size of the array. 
So the graph of reciprocal latency should roughly look like if it was composed of a few transposed and scaled hyperbolas. - -we aren't exactly measuring latency in this experiment. +If we abstract away from all that happens before the slowest cache layer, we can reduce the formula to just this: $$ E[L] = \frac{N \cdot l_{last} - C}{N} = l_{last} - \frac{C}{N} $$ +As $N$ increases, the expected latency slowly approaches $l_{last}$, and if you squint hard enough, the graph of the throughput (reciprocal latency) should roughly look as if it were composed of a few transposed and scaled hyperbolas: + $$ -\frac{1}{l_{last} - \frac{C}{N}} -= \frac{N}{N \cdot l_{last} - C} +\begin{aligned} +E[L]^{-1} &= \frac{1}{l_{last} - \frac{C}{N}} +\\ &= \frac{N}{N \cdot l_{last} - C} +\\ &= \frac{1}{l_{last}} \cdot \frac{N + \frac{C}{l_{last}} - \frac{C}{l_{last}}}{N - \frac{C}{l_{last}}} +\\ &= \frac{1}{l_{last}} \cdot \left(\frac{1}{N \cdot \frac{l_{last}}{C} - 1} + 1\right) +\\ &= \frac{1}{k \cdot (x - x_0)} + y_0 +\end{aligned} $$ +To get the actual latency numbers, we can iteratively apply the first formula to deduce $l_1$, then $l_2$, and so on. Or just look at the values right before the cliff — they should be within 10-15% of the true latency. + +There are more direct ways to measure latency, including the use of [non-temporal reads](../bandwidth), but this benchmark is more representative of practical access patterns. + +![](../img/permutation-padded.svg) +The important practical lesson when designing and analyzing memory-bound algorithms is to count the number of cache lines accessed and not just the total count of memory reads and writes.
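
As a rough illustration of that lesson, the sketch below counts both quantities for a strided pass over an array. The names `AccessCount` and `count_accesses` are hypothetical, and the model assumes 64-byte cache lines and an array that starts exactly at a cache line boundary.

```cpp
#include <cstddef>

struct AccessCount {
    std::size_t accesses;    // number of elements read
    std::size_t cache_lines; // number of distinct cache lines they belong to
};

// Visits every `step`-th element (step > 0) of an array of `n` elements,
// each `elem_size` bytes long, and counts the distinct cache lines touched.
AccessCount count_accesses(std::size_t n, std::size_t step,
                           std::size_t elem_size = sizeof(int),
                           std::size_t line_size = 64) {
    AccessCount c{0, 0};
    std::size_t last_line = static_cast<std::size_t>(-1); // no line visited yet
    for (std::size_t i = 0; i < n; i += step) {
        std::size_t line = i * elem_size / line_size; // index of the cache line
        c.accesses++;
        if (line != last_line) { // indices only grow, so a new index is a new line
            c.cache_lines++;
            last_line = line;
        }
    }
    return c;
}
```

With 4-byte elements, steps of 1, 8, and 16 over 1024 elements perform 1024, 128, and 64 reads respectively but all touch the same 64 cache lines, so in a memory-bound setting they would be expected to cost about the same.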
diff --git a/content/english/hpc/cpu-cache/paging.md b/content/english/hpc/cpu-cache/paging.md index 339de0c9..f4583ee7 100644 --- a/content/english/hpc/cpu-cache/paging.md +++ b/content/english/hpc/cpu-cache/paging.md @@ -59,3 +59,22 @@ void *ptr = VirtualAlloc(NULL, array_size, In both cases, `array_size` should be a multiple of `page_size`. ![](../img/permutation-hugepages.svg) + + +### Other Blocks + +Pages + +RAM rows. 3-4 nanosecond increase. + + + + From d4ab444f94228879445f0bd269baa6b24c1a6dfe Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 30 Jan 2022 14:52:41 +0300 Subject: [PATCH 067/531] alignas isn't necessary for padded int --- content/english/hpc/cpu-cache/cache-lines.md | 2 +- content/english/hpc/cpu-cache/mlp.md | 4 +++- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/cpu-cache/cache-lines.md b/content/english/hpc/cpu-cache/cache-lines.md index 1b9c6e12..4ba63632 100644 --- a/content/english/hpc/cpu-cache/cache-lines.md +++ b/content/english/hpc/cpu-cache/cache-lines.md @@ -32,7 +32,7 @@ struct padded_int { int padding[15]; }; -alignas(64) padded_int q[N / 16]; +padded_int q[N / 16]; // constructing a cycle from a random permutation // ... diff --git a/content/english/hpc/cpu-cache/mlp.md b/content/english/hpc/cpu-cache/mlp.md index d54ec60a..ec8e9cac 100644 --- a/content/english/hpc/cpu-cache/mlp.md +++ b/content/english/hpc/cpu-cache/mlp.md @@ -1,8 +1,10 @@ --- title: Memory-Level Parallelism -weight: 3 +weight: 4 --- +The bandwidth benchmark works because you can simply execute a long series of independent read or write queries, and the scheduler, having access to them in advance, reorders and overlaps them, hiding their latency and maximizing the total throughput. + - Memory requests can overlap in time: while you wait for a read request to complete, you can send a few others, which will be executed concurrently.
In some contexts that allow for many concurrent I/O operations it therefore makes more sense to talk about memory *bandwidth* than *latency*. ```c++ From 579ad824da5df1f1b04568586290a6c480c786c2 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 30 Jan 2022 21:52:46 +0300 Subject: [PATCH 068/531] memory sharing --- .../hpc/cpu-cache/img/parallel-bandwidth.svg | 1146 +++++++++++++++++ content/english/hpc/cpu-cache/sharing.md | 47 +- 2 files changed, 1175 insertions(+), 18 deletions(-) create mode 100644 content/english/hpc/cpu-cache/img/parallel-bandwidth.svg diff --git a/content/english/hpc/cpu-cache/img/parallel-bandwidth.svg b/content/english/hpc/cpu-cache/img/parallel-bandwidth.svg new file mode 100644 index 00000000..11ff2798 --- /dev/null +++ b/content/english/hpc/cpu-cache/img/parallel-bandwidth.svg @@ -0,0 +1,1146 @@ [1146 lines of SVG markup omitted] diff --git a/content/english/hpc/cpu-cache/sharing.md b/content/english/hpc/cpu-cache/sharing.md index 98a02eff..da5b93ea 100644 --- a/content/english/hpc/cpu-cache/sharing.md +++ b/content/english/hpc/cpu-cache/sharing.md @@ -1,46 +1,57 @@ --- -title: Cache and Memory Sharing +title: Memory Sharing weight: 4 --- -Starting from a certain level in the hierarchy, cache
becomes shared between different cores. This limits the size and bandwidth of the cache, reducing performance in case of parallel algorithms or just noisy neighbors. +Starting at some level of the hierarchy, the cache becomes *shared* between different cores. This reduces the total die area and lets you add more cores on a single chip but also poses some "noisy neighbor" problems as it limits the effective cache size and bandwidth available to a single execution thread. -On my machine, there is actually not 4M, but 8M of L3 cache, but it is shared between groups of 4 cores so that each core "sees" only 4M that is shared with 3 other cores — and, of course, all the cores have uniform access to RAM. There may be more complex situations, especially in the case of multi-socket and NUMA architectures. The "topology" of the cache system can be retrieved with the `lstopo` utility. +On most CPUs, only the last layer of cache is shared, and not always in a uniform manner. On my machine, there are 8 physical cores, and the size of the L3 cache is 8M, but it is split into two halves: two groups of 4 cores have access to their own 4M region of the L3 cache, and not all of it. -![Cache hierarchy scheme generated by lstopo command on Linux](../img/lstopo.png) +There are even more complex topologies, where accessing certain regions of memory takes non-constant time, different for each core (which is [sometimes unintended](https://randomascii.wordpress.com/2022/01/12/5-5-mm-in-1-25-nanoseconds/)). Such architectural feature is called *non-uniform memory access* (NUMA), and it is the case for multi-socket systems that have several separate CPU chips installed. -This has some very important implications for certain parallel algorithms: +On Linux, the topology of the memory system can be retrieved with `lstopo`: -- If and algorithm is memory-bound, then it doesn't matter how much cores you add, as it will be bottlenecked by the RAM bandwidth. 
-- On non-uniform architectures, it matters which cores are running which execution threads. +![Cache hierarchy of my Ryzen 7 4700U generated by lstopo](../img/lstopo.png) -To show this, we can run the same benchmarks in parallel. Instead of changing source code to run multiple threads, we can make use of GNU parallel. Due to the asymmetry `taskset` to manage CPU affinity and set them to the first "half" of cores (to temporary ignore the second issue). +This has some important implications for parallel algorithms: the performance of multi-threaded memory accesses depends on which cores are running which execution threads. To demonstrate this, we will run the [bandwidth benchmarks](../bandwidth) in parallel. + +### CPU Affinity + +Instead of modifying the source code to run on multiple threads, we can simply run multiple identical processes with [GNU parallel](https://www.gnu.org/software/parallel/). To control which cores are executing them, we set their *processor affinity* with `taskset`. This combined command runs 4 processes that can run on the first 4 cores of the CPU: ```bash parallel taskset -c 0,1,2,3 ./run ::: {0..3} ``` -You can now see that the L3 effects diminishes with more cores competing for it, and after falling into the RAM region the total performance remains constant. +Here is what we get when we change the number of processes running simultaneously: ![](../img/parallel.svg) -TODO: note about RAM +You can now see that the performance decreases with more processes when the array exceeds the L2 cache (which is private to each core), as the cores start competing for the shared L3 cached and the RAM. -This asymmetry makes it important to manage where exactly different threads should be running. By default, the operating systems knows nothing about affinity, so it assigns threads to cores arbitrarily and dynamically during execution, based on core load and job priority, and settings of the scheduler. 
This can be affected directly, which is what we did with `taskset` to restrict the available cores to the first half that share the same 4M region of L3. +We specifically set all processes to run on the first 4 cores because they have a unified L3 cache. If some of the processes were to be scheduled on the other half of the cores, there would be less contention for the L3 cache. The operating system doesn't [monitor](/hpc/profiling/events) such activities — what a process does is its own private business — so by default, it assigns threads to cores arbitrarily during execution, without caring about cache affinity and only taking into account the core load. -Let's add another 2-thread run, but now with running on cores in different 4-core groups that don't share L3 cache: +Let's run another benchmark, but now with pinning the processes to different 4-core groups that don't share L3 cache: ```bash -parallel taskset -c 0,1 ./run ::: {0..1} -parallel taskset -c 0,4 ./run ::: {0..1} +parallel taskset -c 0,1 ./run ::: {0..1} # L3 cache sharing +parallel taskset -c 0,4 ./run ::: {0..1} # no L3 cache sharing ``` -You can see that it performs better — as if there were twice as much L3 cache available. +It performs better — as if there were twice as much L3 cache and RAM bandwidth available: ![](../img/affinity.svg) -These issues are especially tricky when benchmarking and is usually the largest source of noise in real-world applications. +These issues are especially tricky when benchmarking and are a huge source of noise when timing parallel applications. + +### Saturating Bandwidth + +When looking at the RAM section of the first graph, it may seem that with more cores, the per-process throughput goes ½, ⅓, ¼, and so on, and the total bandwidth remains constant. But this isn't quite true: the contention hurts, but a single CPU core usually can't saturate all of the RAM bandwidth. 
+
+If we plot it more carefully, we see that the total bandwidth actually increases with the number of cores — although not proportionally, and eventually approaches its theoretical maximum of ~42.4 GB/s:
+
+![](../img/parallel-bandwidth.svg)
-Non-uniform memory access, RAM paging
+Note that we still specify processor affinity: the $k$-threaded run uses the first $k$ cores. This is why we have such a huge performance increase when switching from 4 cores to 5: you can have more RAM bandwidth if the requests go through separate L3 caches.
-https://randomascii.wordpress.com/2022/01/12/5-5-mm-in-1-25-nanoseconds/
+In general, to achieve maximum bandwidth, you should always split the threads of an application symmetrically.
From 22420deb86888f10d3033cc94b1410dbfc43c15a Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Sun, 30 Jan 2022 22:57:45 +0300
Subject: [PATCH 069/531] aos and soa graphs

---
 .../hpc/cpu-cache/img/aos-soa-padded.svg      | 1371 +++++++++++++++++
 content/english/hpc/cpu-cache/img/aos-soa.svg |  240 +--
 content/english/hpc/cpu-cache/mlp.md          |   15 +-
 3 files changed, 1406 insertions(+), 220 deletions(-)
 create mode 100644 content/english/hpc/cpu-cache/img/aos-soa-padded.svg

diff --git a/content/english/hpc/cpu-cache/img/aos-soa-padded.svg b/content/english/hpc/cpu-cache/img/aos-soa-padded.svg
new file mode 100644
index 00000000..e5132965
--- /dev/null
+++ b/content/english/hpc/cpu-cache/img/aos-soa-padded.svg
@@ -0,0 +1,1371 @@
[SVG markup omitted: 1371 added lines]
diff --git a/content/english/hpc/cpu-cache/img/aos-soa.svg b/content/english/hpc/cpu-cache/img/aos-soa.svg
index c744599c..14219dd5 100644
--- a/content/english/hpc/cpu-cache/img/aos-soa.svg
+++ b/content/english/hpc/cpu-cache/img/aos-soa.svg
[SVG markup hunks omitted]
diff --git a/content/english/hpc/cpu-cache/mlp.md b/content/english/hpc/cpu-cache/mlp.md
index ec8e9cac..733a70b0 100644
--- a/content/english/hpc/cpu-cache/mlp.md
+++ b/content/english/hpc/cpu-cache/mlp.md
@@ -3,9 +3,18 @@ title: Memory-Level Parallelism
 weight: 4
 ---
 
+The fundamental reason why [linear iteration](../bandwidth) is so much faster than [pointer jumping](../latency) is that the CPU knows which memory locations it needs to fetch first and sends the corresponding memory requests far in advance, successfully hiding the latencies of these individual requests.
+
+Exploring this idea further, the memory system supports a large but finite number of concurrent I/O operations.
To find this limit, we can modify our pointer chasing benchmark + + + ```c++ const int M = N / D; @@ -85,6 +94,6 @@ int q[D][M]; ![](../img/aos-soa.svg) -Running a bit forward: the boosts at powers of two for AoS are due to SIMD, and dips in SoA are due to cache associativity. +Running a bit forward: the spikes at powers of two for AoS are due to SIMD, and dips in SoA are due to cache associativity. - \ No newline at end of file +![](../img/aos-soa-padded.svg) From 7c901c52df2c5c0d4e77e6fcf083983ab1724763 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 30 Jan 2022 23:06:39 +0300 Subject: [PATCH 070/531] ram-specific timings draft --- content/english/hpc/cpu-cache/latency.md | 2 +- content/english/hpc/cpu-cache/mlp.md | 17 +++++++++++++++++ 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/cpu-cache/latency.md b/content/english/hpc/cpu-cache/latency.md index 5cddffa5..4f787595 100644 --- a/content/english/hpc/cpu-cache/latency.md +++ b/content/english/hpc/cpu-cache/latency.md @@ -92,4 +92,4 @@ The graph starts making more sense if we plot it as a relative speedup. You would expect 2x rates for array sizes that fit into CPU cache entirely, but then roughly equal for arrays stored in RAM. But this is not quite what is happening: there is a small, fixed-latency delay on lower clocked run even for RAM accesses. This happens because the CPU first has to check its cache before dispatching a read query to the main memory — to save RAM bandwidth for other processes that potentially need it. -Memory latency is also slightly affected by some details of the virtual memory implementation and RAM-specific timings, which we will discuss [later](../paging). +Memory latency is also slightly affected by some details of the [virtual memory implementation](../paging) and [RAM-specific timings](../mlp), which we will discuss later. 
diff --git a/content/english/hpc/cpu-cache/mlp.md b/content/english/hpc/cpu-cache/mlp.md index 733a70b0..34a5c5de 100644 --- a/content/english/hpc/cpu-cache/mlp.md +++ b/content/english/hpc/cpu-cache/mlp.md @@ -96,4 +96,21 @@ int q[D][M]; Running a bit forward: the spikes at powers of two for AoS are due to SIMD, and dips in SoA are due to cache associativity. +### RAM-Specific Timings + ![](../img/aos-soa-padded.svg) + +```c++ +struct padded_int { + int val; + int padding[15]; +}; + +padded_int q[M][D]; +``` + +The rest of the core is the same: the only difference is that they require a separate cache line access. + +This is only specific to RAM: on array sizes that fit in cache, the benchmark is actually worse because the [cache sharing is worse](../cache-lines). + +RAM timings. From 7d12453d6afd04d15a97f0b082be8a651227654d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 31 Jan 2022 22:22:03 +0300 Subject: [PATCH 071/531] vector of vectors is a bad matrix abstraction --- .../english/hpc/compilation/abstractions.md | 59 +++++++++++++++++++ 1 file changed, 59 insertions(+) diff --git a/content/english/hpc/compilation/abstractions.md b/content/english/hpc/compilation/abstractions.md index a11026b7..004b809b 100644 --- a/content/english/hpc/compilation/abstractions.md +++ b/content/english/hpc/compilation/abstractions.md @@ -27,3 +27,62 @@ Usually it isn't that hard to rewrite a small program so that it is more straigh Object-oriented and especially functional languages have some very hard-to-pierce abstractions like these. For this reason, people often prefer to write performance critical software (interpreters, runtimes, databases) in a style closer to C rather than higher-level languages. Thick-bearded C/assembly programmers. + +### Memory + +Pointer chasing. 
+
+```c++
+typedef vector< vector<int> > matrix;
+matrix a(n, vector<int>(n, 0));
+
+int val = a[i][j];
+```
+
+This is up to twice as slow: you first need to fetch
+
+```c++
+int *a = new int[n * n];
+memset(a, 0, 4 * n * n);
+
+int val = a[i * n + j];
+```
+
+You can write a wrapper if you really want an abstraction:
+
+```c++
+template <typename T>
+struct Matrix {
+    int x, y, n, N;
+    T* data;
+    T* operator[](int i) { return data + (x + i) * N + y; }
+};
+```
+
+For example, the [cache-oblivious transposition](/hpc/external-memory/oblivious) would go like this:
+
+```c++
+Matrix subset(int _x, int _y, int _n) { return {_x, _y, _n, N, data}; }
+
+Matrix transpose() {
+    if (n <= 32) {
+        for (int i = 0; i < n; i++)
+            for (int j = 0; j < i; j++)
+                swap((*this)[j][i], (*this)[i][j]);
+    } else {
+        auto A = subset(x, y, n / 2).transpose();
+        auto B = subset(x + n / 2, y, n / 2).transpose();
+        auto C = subset(x, y + n / 2, n / 2).transpose();
+        auto D = subset(x + n / 2, y + n / 2, n / 2).transpose();
+        for (int i = 0; i < n / 2; i++)
+            for (int j = 0; j < n / 2; j++)
+                swap(B[i][j], C[i][j]);
+    }
+
+    return *this;
+}
+```
+
+I personally prefer to write low-level code, because it is easier to optimize.
+
+Is it cleaner? Don't think so.
From 36ec4bdde3a76e6f74c8a1c18355ed71fa05086b Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 31 Jan 2022 22:33:07 +0300 Subject: [PATCH 072/531] bandwidth limit --- content/english/hpc/cpu-cache/bandwidth.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/cpu-cache/bandwidth.md b/content/english/hpc/cpu-cache/bandwidth.md index d0c39ffc..8a39f862 100644 --- a/content/english/hpc/cpu-cache/bandwidth.md +++ b/content/english/hpc/cpu-cache/bandwidth.md @@ -95,6 +95,12 @@ Non-temporal memory reads or writes are a way to tell the CPU that we won't be n On the one hand, if the array is small enough to fit into the cache, and we actually access it some short time after, this has a negative effect because we have to read entirely it from the RAM (or, in this case, we have to *write* it into the RAM instead of using a locally cached version). And on the other, this prevents read-backs and lets us use the memory bus more efficiently. -In fact, the performance increase in the case of the RAM is even more than 2x. My theory here is that it is because the memory controller doesn't have to switch modes this way and also because the instruction sequence becomes simpler, and the CPU can handle more pending memory operations — after all, a [single core can't saturate the memory bandwidth](../sharing). +In fact, the performance increase in the case of the RAM is even more than 2x and faster than the read-only benchmark. The best explanation I have is that it is because: + +- the memory controller doesn't have to switch the bus between read and write modes this way; +- the instruction sequence becomes simpler, allowing for more pending memory instructions; +- and, perhaps most importantly, the cache system can simply "fire and forget" non-temporal write requests, while for reads it needs to remember what to do with the data once it arrives — similar to connection handles in networking software. 
+ +Also, for these reasons, a single CPU core usually [can't fully saturate the memory bandwidth](../sharing). The same technique generalizes to `memcpy`: it also just moves 32-byte blocks with SIMD load/store instructions, and it can be similarly made non-temporal, increasing the throughput twofold for large arrays. There is also a non-temporal load instruction (`_mm256_stream_load_si256`) for when you want to *read* without polluting cache (e. g. when you don't need the original array after a `memcpy`, but will need some data that you had accessed before calling it). From 4f5a6155143d894f5c833727d054a9df7ade0233 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 31 Jan 2022 23:13:15 +0300 Subject: [PATCH 073/531] reorganize cpu-cache chapter --- content/english/hpc/_index.md | 40 +- content/english/hpc/cpu-cache/alignment.md | 2 +- content/english/hpc/cpu-cache/aos-soa.md | 83 + .../english/hpc/cpu-cache/associativity.md | 2 +- .../english/hpc/cpu-cache/hw-prefetching.md | 2 +- .../hpc/cpu-cache/img/aos-soa-padded-n.svg | 1330 +++++++++++++++++ content/english/hpc/cpu-cache/img/ram.png | Bin 0 -> 71073 bytes .../hpc/cpu-cache/img/soa-hugepages.svg | 1261 ++++++++++++++++ content/english/hpc/cpu-cache/mlp.md | 68 +- content/english/hpc/cpu-cache/packing.md | 4 +- content/english/hpc/cpu-cache/paging.md | 2 +- content/english/hpc/cpu-cache/pointers.md | 2 +- content/english/hpc/cpu-cache/sharing.md | 2 +- .../english/hpc/cpu-cache/sw-prefetching.md | 2 +- 14 files changed, 2708 insertions(+), 92 deletions(-) create mode 100644 content/english/hpc/cpu-cache/aos-soa.md create mode 100644 content/english/hpc/cpu-cache/img/aos-soa-padded-n.svg create mode 100644 content/english/hpc/cpu-cache/img/ram.png create mode 100644 content/english/hpc/cpu-cache/img/soa-hugepages.svg diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index 1864f155..b46bdeca 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -79,26 
+79,30 @@ Planned table of contents: 7.6. Hashing 7.7. Random Number Generation 8. External Memory - 8.1. External Sorting - 8.2. List Ranking - 8.3. Eviction Policies - 8.4. Data Locality - 8.5. Cache Blocking - 8.6. Cache-Oblivious Algorithms -(8.7. B-Trees) -(8.8. Sublinear Algorithms) + 8.1. Memory Hierarchy + 8.2. Virtual Memory + 8.3. External Memory Model + 8.4. External Sorting + 8.5. List Ranking + 8.6. Eviction Policies + 8.7. Cache-Oblivious Algorithms + 8.8. Spacial and Temporal Locality +(8.9. B-Trees) +(8.10. Sublinear Algorithms) 9. RAM & CPU Caches 9.1. Memory Bandwidth - 9.2. Cache Lines and Memory Alignment - 9.3. Bit Fields and Packing - 9.4. Memory Paging - 9.5. Cache Associativity - 9.6. Memory Latency - 9.7. Memory-Level Parallelism - 9.8. Prefetching - 9.9. Pointers and Their Alternatives -(9.10. Memory Management) -(9.11. memcpy and memset) + 9.2. Memory Latency + 9.3. Cache Lines + 9.4. Data Alignment + 9.5. Structure Packing + 9.6. Pointer Alternatives + 9.7. Cache Associativity + 9.8. Memory Paging + 9.9. Memory-Level Parallelism + 9.10. Hardware Prefetching + 9.11. Software Prefetching + 9.12. AoS and SoA +(9.13. Memory Management) 10. SIMD Parallelism 10.1. Using SIMD in C/C++ 10.2. Reductions diff --git a/content/english/hpc/cpu-cache/alignment.md b/content/english/hpc/cpu-cache/alignment.md index 9c7a68c4..44724710 100644 --- a/content/english/hpc/cpu-cache/alignment.md +++ b/content/english/hpc/cpu-cache/alignment.md @@ -1,6 +1,6 @@ --- title: Data Alignment -weight: 9 +weight: 4 --- The fact that the memory is split into cache lines has huge implications on data structure layout. If you need to retrieve a certain atomic object, such as a 32-bit integer, you want to have it all located in a single cache line: both because hardware stitching results together takes precious transistor space and because retrieving 2 cache lines is slow and increases memory bandwidth. The "natural" alignment of `int` is 4 bytes. 
diff --git a/content/english/hpc/cpu-cache/aos-soa.md b/content/english/hpc/cpu-cache/aos-soa.md
new file mode 100644
index 00000000..cf91db88
--- /dev/null
+++ b/content/english/hpc/cpu-cache/aos-soa.md
@@ -0,0 +1,83 @@
+---
+title: AoS and SoA
+weight: 12
+---
+
+Exploit [spatial locality](/hpc/external-memory/locality).
+
+Let's modify the pointer chasing code so that the next pointer needs to be computed using a variable number of fields. We can either place them in separate arrays, or in the same array.
+
+The first approach, struct
+
+```c++
+const int M = N / D; // # of memory accesses
+int p[M], q[M][D];
+
+iota(p, p + M, 0);
+random_shuffle(p, p + M);
+
+int k = p[M - 1];
+
+for (int i = 0; i < M; i++) {
+    q[k][0] = p[i];
+
+    for (int j = 1; j < D; j++)
+        q[k][0] ^= (q[k][j] = rand());
+
+    k = p[i];
+}
+
+for (int i = 0; i < M; i++) {
+    int x = 0;
+    for (int j = 0; j < D; j++)
+        x ^= q[k][j];
+    k = x;
+}
+```
+
+Transpose the array and also swap indices in all its accesses:
+
+```c++
+int q[D][M];
+//    ^--^
+```
+
+![](../img/aos-soa.svg)
+
+Running a bit forward: the spikes at powers of two for AoS are due to SIMD, and dips in SoA are due to cache associativity.
+
+### RAM-Specific Timings
+
+![](../img/ram.png)
+
+```c++
+struct padded_int {
+    int val;
+    int padding[15];
+};
+
+const int M = N / D / 16;
+padded_int q[M][D];
+```
+
+![](../img/aos-soa-padded.svg)
+
+![](../img/aos-soa-padded-n.svg)
+
+The rest of the code is the same: the only difference is that they require a separate cache line access.
+
+This is only specific to RAM: on array sizes that fit in cache, the benchmark is actually worse because the [cache sharing is worse](../cache-lines).
+
+RAM timings.
+
+This isn't about $D$ being equal to 64 but about $\lfloor \frac{N}{D} \rfloor$ being a large power of two.
+
+TODO fix D and change N
+
+### Temporary Storage Contention
+
+We can turn on hugepages, and they make it 10 times worse (notice the logarithmic scale):
+
+![](../img/soa-hugepages.svg)
+
+This is a rare example where hugepages actually worsen performance. Usually they improve the latency by 10-15%, but here they make it 10x worse.
diff --git a/content/english/hpc/cpu-cache/associativity.md b/content/english/hpc/cpu-cache/associativity.md
index 001f9016..fd85a8eb 100644
--- a/content/english/hpc/cpu-cache/associativity.md
+++ b/content/english/hpc/cpu-cache/associativity.md
@@ -1,6 +1,6 @@
 ---
 title: Cache Associativity
-weight: 8
+weight: 7
 ---
 
 - Since implementing "find the oldest among million cache lines" in hardware is unfeasible, each cache layer is split in a number of small "sets", each covering a certain subset of memory locations. *Associativity* is the size of these sets, or, in other terms, how many different "cells" of cache each data location can be mapped to. Higher associativity allows more efficient utilization of cache.
diff --git a/content/english/hpc/cpu-cache/hw-prefetching.md b/content/english/hpc/cpu-cache/hw-prefetching.md
index 1894b180..b6374c26 100644
--- a/content/english/hpc/cpu-cache/hw-prefetching.md
+++ b/content/english/hpc/cpu-cache/hw-prefetching.md
@@ -1,6 +1,6 @@
 ---
 title: Hardware Prefetching
-weight: 5
+weight: 10
 ---
 
 - Taking advantage of this free concurrency, it is often beneficial to *prefetch* data that you will likely be accessing soon, if you know its location. You can do this explicitly by using a separate instruction or just by accessing any byte in its cache line, but the most frequent patterns, such as linearly iterating forward or backward over an array, prefetching is already handled by hardware.
diff --git a/content/english/hpc/cpu-cache/img/aos-soa-padded-n.svg b/content/english/hpc/cpu-cache/img/aos-soa-padded-n.svg
new file mode 100644
index 00000000..2c554a2c
--- /dev/null
+++ b/content/english/hpc/cpu-cache/img/aos-soa-padded-n.svg
@@ -0,0 +1,1330 @@
[SVG markup omitted: 1330 added lines]
diff --git a/content/english/hpc/cpu-cache/img/ram.png b/content/english/hpc/cpu-cache/img/ram.png
new file mode 100644
index 0000000000000000000000000000000000000000..b45661844a479be40d6f6a91e90d6a5bd8ee7ee5
GIT binary patch
literal 71073
[base85-encoded PNG data omitted]
z$D9i^VtOpsv{TmY<^D?rW)YGn!}(Qv**4po9K}GDl~;Xrx$qnjivqO+g`HcI7-gd47l-=`1T=@dH?$eF zo@f$>?Si&h?ig@dPdsp(IlsI-Ht2jI_48-)`H?uihVV+(it))6!E`)S=@Pp+nV1FD zcMk`v?usqxC!h125PjeuS(=iaR-PnQ^;H1&*K~8Qk{>+>`vOA8_|!gh4nP?zn~^MK zC8a1jV${g?8;lKo?n7X7f>7E+$+aKe6lQpRHZ?nbf=_`@W1}*SZJwOxcuK7)xq|M} zdS-lkGL$RourXEF#tAVsjScck|^&=Dt_gk#Ue!LXh(Rc>z{QbKd}V0U%zJ$CICEPNJ)wg5){A8<&f~M1>OXNvYi!$9w&e!Fumt$g=u0))NP8l63^&0gry>gV-r0o(v($I> z<>k2DRd_F^@oBl!uKABxZM^7Nuw9&8T3(83G*rA$a0En=*x#)}|J6@Q7jqGJKkjb# zY(;PO>WsQtskb)Sqgcv61MTuOlgs}=3@USz^l$QU)Voa*$9Rv0vI9~xVH-EvC{rFd zFf%K|49-s>5qa~w;C%eha)k+q{b!yN^B!5ozFbl_ zBo7PXZoDHdPpi?N3|Gc;`20dXWCS~@eo#oO^AI&R1FsO^_i*rZAN;83<6kUMtk!Kc zhp1Mo_il<3I6FH7>MW7jn=@W9c6tYv`l_I~I0}rb{NT9X4p!8LBRQ<>d#UFvS2`mj zM@srJ;oUQ(&J-nvi$}iywv?W;ZuFf$es5N!j!Z`6^a8^+kyV}ED%N_dU+|>Fa)6GY zGuNjo(>HyQFB?+nPrlH!c$RM?OAii>-O`G*9Za&(dY0<}-$XUi}L9aF$;Lt#MESv-ajZP8T54GnVM+Av)rXDdu+D0}FcbTWGkXP(MXTqd$VfumcCjse9_;u z`@~tJg{lstfh_+Cs?SHyDKE}QQ9^2__~Q#jW7;?Uw|dw!{Q^?mDw@S<7!&n`orlvb z_(F%U2t#p2F1o@`XC!u&VxVM-0iO8hx(8}-ol3H96yBda>-cIk1As9O_#YP7yFmRM zHJ<0XEh>yMW{ESVaqWALRj$rAXjKtSY2h2;c^;r$t-K7 z!Nh!8sgkGj#*Iy22>=QN)25#InvzG)7k-E=l(-V)Y#xoP^3G;4gK5zkERN)# zeHjdeki|q%*-856h}nMus9CJ^dH=DSY3i4aZ*6z)Umwk5(lhRCzh`_izNQ>_5c@i> zu)O^0rVlI?$?n`qxlY5w53L50N;5exG$*NC)vwZpUeVW~&q=1nbLQpUrK{s+o^-QG z^>~?06U2?f`4FC0MjQS-#{AXGpsQBrY^G32fSHofSnGxnGP|tb=|h@{zn6;mocBP^BVdxPk;Q9piy0L*2x?= ztL&G75OZ&hby8N`@At12)p8f6?jgsb_Ye`s|Do=!qO$Cwc40(7QV=PnkuCu#k?wA! 
zOF&RckZvgj1SAclySt^OrKKB0Ktw=N@}Kv+zwh)P`*goYjPc^b{oHG<8CT4OdNi6T z_i{h7oHVox|65skXsN#J?D%-6)qCTotqk{rGZ+$iP)-1r`1G4SOGrvo&;JwG?O zFV6qpOTI}}{^)hwY#HwFA6X4pJn8j4ee%>VBE%7j% z)YLDg<*J?ZBPbJ@TTUsr42UIFemx+G9(Q|vOYG@*iKoLrK+l^ZD8=HA@s5n{I>t@R zgqK%Y&*08{vjsI5)yuTaM=$NW`YoT7hO+ALslT2&@iVaFYhGmj3Ni$d(fYj61*OGf zq*jH_i0SIk^r}Uz&qklcTtA2i7_ z9y`OJte*OWsGnOtx1s?O$){@@ka6y%`jaP>4nB7zAht^96d?t&kKbCLe)mJRm}QG|M7Y zIX?=33B&<(6RKKpT!+)mra&$kuYUNmUa|HbAdT&Sa&Wc@2Mw5P-@_bE?drhY#R);Z z-+V&n56=33-*ZJfp%Od+z)&S$8yuZr@3b7>AbXDpS__M-}m3fR7KozY_Car?l2_{j+1e?L4F1 zH=Gq25D;L)KGjK*Dci<8Ve^(D(dOIudF+RXe1SwVzE2(lksOKAkzN+K(s705op~As zG}w7ldDzO2tln4q`G2nDKf`@+T^S?Ud}99}?t{+{y|3KWR>>W+BIn0# z3R2ZCT4`6rYy&f{C*aQ0p6kh;V61p}Ff4V4*Jk+Xto$n5fX6#8vd|bSr53qTw5o@T zuJiI))x>tE`WdD7Q`EK=Z|Nwq$i5}JHNT%@KTpz(O^aL3k)%zhP`PtYvA*uz?Bx=P zSY`&>Pz8G^9`ztFOsx*9?hMOQynF(2`FS+)u{l4V3KOaF=!=f97;~4&@8oHZKN=n? zbF1Y5S{b~}@i{q+gQ914p}2Pltzs|N?)&#b9o~bywNRAGy2qf29>1t$(Dyg)0;QZ=t38m@r}0*|4bub z2y~TPPrhRM{p&Y%h7A@gPp|wo!Mt$~7_Gm6yN5xq7)L7=%JpKY)y8Sw>wRj1PIXJ~>!v`|o3SPz76 zrfQ!1bqtAsbx))rJlI)Gm8NS0aBZz(oQ+8Ln!vovZ7~2@>v!96psB~b%loFQOBRT- zKXdhnNnj-W5ZIus`uc|zDA#!T_?!Ttk`4Mo5I9|FEmuzcY0}hROj}ZN za&q!~w}m2dUvB(PE4l9P8W!zAC#_^RbBV_gxBr-di6{WQPm%@4-h!~btHKT~P@1ln zDs?vE_;Kl)yZhq644>^@(7W!d6Q-zXcmI@>3wk-KeHk|@Q9>oG@R6-fZ5&0+EGyc` zrJ3|MxuPEX0|NSh4k5(#9Wg&2Z4Qa!!`?Wimw+PxRWCsRV3dYpIdJ4ie2DF}q;C~H zSqj>yA8E`x@VGUEL%c{?#4$ai_^yA=Yo-YEtF zGysQTzni)m;dk-r@_F*}p0d?3mWTusr>B+rz#+Em0Rj8syN29Ii@!f2qCpy91e8{N zv`yq%^z-LaRj3Ak4kj!4`ii8iJHa;9fSkD24x`fmbp@KO6|j6|9UWpmK0f5Rfk~0& zyM7oI0Sp}i@p8VmXbrfA6!EaDQhfd@_#I?Yj|>mDZj5A>vqvCoI|Aq#IyNN$P)59X z{BP&#(y^rZ^f?lUia+J5By}P5Ay9S{fiz1n?d1^ytapUsT@Sy0PJMME2cAO^%-G^9 zVCSKmu%jq;nqU|SNN~Q-gd;w8cDDH7*t9f|KwIZAPhC!EGNa0mKHHM?iiS^I@p?5n zb}G+&88Rj!Bh9+oD%gD%K^tY_xkem%ld_M>`!{GjtscE1CnwJ@D*AyayWTz+d{Zz==1!RCOhL1qfj6JkAo{yYP#nByQgm(lX_IaR~zHN99O4aW7i9JxwA=@K;_Y8Ltfi3gG4#)V>F| zI$q;j1MGaTH5uJa<^2{py1#xs2R>CTUA^_#7kQyEs<}wJG!5J9&LcN=e6e$`vz-=y 
z`W1lR5oQq3!_BApOt|wn$!zMcjWKC6C`(0od2yhYbQXsiYDNqviCr5&~kOw@&yp*^WWBD7Hxs( z2qHcQiN|iL>=ht1a!kEJ4GeUzS8Wwib=Fbp@LmL`|01q{J{D2)-3gPI1coj+H zFzh1MHER3BXN!P3)BaZ%h+P_S79;k+D`5N76`kOq?Zcjhvsn)i^apMW48X+9fulf7 zJC%b%5lDnLNDw{U$b=n-n4cFQX(F7u-^G@xmcIU-dTDPSnMnukh^hr4VVfm!L{D-0 z{m8rvxWB0}HK%@K_mNUlF&^SO?1eAG)bK>`xBJTMla3et#VWEIN|dauthBEx!Qjj; zu6yOvN!iy!m|AK-^@fS<5+2&|AA~FEo&S?Aeeu#b#GV48LX~dnD57))6 z?yxWb1?a*n>O%C*@x(4{UGBS6ix)`Ytol75Y6DHi1n(Ju2qeR38XGt1e>M}&nHp}K zej1VBrVwJ@1DX6QBIg0d7ZHl?`ldp`56BPf{Ts-N5EB#g3JC1?v6Ui(Aefu(%dqu; zy*v&m(TH;Z`f}BomjrL(kaE%|t3Ci48FXAI)z#IXLE^%0Si|0@0}(AovjVnz8Eh}G zD+0I#QWHd_fq{4TE5LpjcHzsv#&YoyIyyc5)#<4h@UtBtP_F~w)3yH?2KdIAI^jEK zxYVMbTP_bv&wnt9(IzNzd+eGaFA^9IvX5`Pohe=u1iG+>y}wU`R#zgW_40w^x8&}O z`j-(-oFz&^%g(>Um{t@Oc#=`G-#x`yx~b;#KvuX$6Nj1dI_6q+z1{57`36@+zk<+0 zfM1ps&3!8~@6ZFD1ZkQ#hSROaiG(C4aa9=Oz~L4Ft5gQ8+e;`53DB;Q3piN9G9v#Y zESpTin7hx4e9^F?6Np9v5Ib(X>AN6npx$eU@k$vc89`1TQ>qUk z3j<)vvIkVggYvl;;qOzqQD;z+UC@7>WI}STbXt`~f;CX~o6JiFn8=w#)O49sIiaB# z2ssZ00)R7Fis=4wZ^<W&<%atesZX-7Oe>0QKT?qHV-Vcx_6*7nmAq^liMnTB|G*>UA zh|Y0c)5ujph@pFsXnCyq;(4_E)jIJ}Em`g->N2MV59~n3%cZ`0Q24kMqZB1dN+`yx z_wU;07k_)-rgB|d*6TFF3#Gi9e4x(KIaO})9}nx`kW*7rQ)GunP=9c^{{cAnI;VSx zA`@}v>5rqaFkGA+Tf^z5aPFavi6|4Pausj1b75|Simo1HO?`?(Q#AvKH3DXT9v&W| zXpl)T#<3Kd>S<}+=9oCh+HByAjQmgJ1wtG!nR@I(7%>fRP@$ow#FU;p~B0TFXQX$ z%ged@LWoP{?e$5>`R$l-Fwrs=uOUn*QVM}qNv;>waAz3OS4Raqvhq)dQb4D)2K4DF zBCmn;|Ko7d)Eb1yxMW-budd&4?!Xh+MW(zWB8vZ!TOl$HM4SNnn8i%pOowM30j!;` zzxf^#Umi`R3hwn=mkq_UXzOSXkIpIQ5s3k@@~834f%7ujk;m@Go1xQDyi`J=4L8^p zK;+$k?E46rJT~&nNR2s6IuP<(7|>CcHS*waLgd!ZUg?_ zcTqUwKLA^`K0nz*TIAnAiHsO!TCBUg@Y))X+-r3WGEYi%5qZ-LZ&xEc#P8yi`}ucf z*(i!$h$Cd8K7ywI`gA>@bVjJm@$qq_5P-8yI#pFcEV&B@eh-e7?46{otu16K98@RF zv-Se;!bvy`zP7%)o|!CAe4`K|G@@h!EeMimWucstdVSOCM^6k=)kF9U#Uqy$N{+(# z*A{f+0sQ?xU-nfe0Gfxt^g2Ek$U9k1* z9*%wa?{7bad)KqiZe4fYQauQTGL74<9W8Vs`p2WimT({1?okG%a277J{R;lnP8G5m z@|lX;p|o-oPH2!a5V0pOKfi2-L^Kxc#3guCD~PS%Cnp~~8_nvm60qzcfT!#NzWW=r 
zdzq!-;o;1J9=0V?JJNoTd4T(y{|K7Yd$MoMzUOI-?2Ev&BBTzIHsE=o`nnEDW~5!w zu&jDZjfGg}X?7~ z>klBML3pd14jdd78OgSWydwqmFT_+H=z%Jp{G!|vn$G!>;wrL5bX<^{|=Acrt$~pCzCO0<7g$IS|5ZTR0SB=aLghcTT zY70cHmYeD)1Z^ZIQdWXo-5RnJ5@n&zKq?317eU!3Pk?r?J^LOC_{o+_S0t?h&(#Io z{Rc6BQKSmfgW#2~UC^E67sU0IGux?W%^6X2?yb7Z^sPvfFe@zFCrO~hAX~Dirom# zhBpvK9)Ma8c^?o!Kk`^%gRF+L(x-E*z=n2wZjK^MFq;T!2}=M|5SIZwya1t`m#(gg z3SRDKXKXdG3tnJ#O<{Cf4g|Xy5^7WBjrM_QJO0zFrM4gi#yDXp7Drx+*-o%S;Sa=p zL>H~^0;C;;wucn>>@GR^G@OHbIG+{NPdC#^Q5x_!zf%;OPMRb@I)uUp1u58H{`|H) z3&PB&^`xu&Nhb0(8pZ36ROHCHaGPGMcIt5eJ_niwCrG1moq>m~7&5x%GBW7Mk^qUn zh!hM!@ZTo|3pqq6HQrrBbV32tCsG{1R)6wy$GoHNN30T}#C`}(!me<1xkdMXFhH** zsLa9i#`Ezl>W$T~hg$hW$Q&FH=K!zG${M4gq0t6ewHG$cH~7k+Y%zj~8VG0yRSU!2 zb7YZ`|0jC(8^$vtQ3^=_zDLtcGYPyemc$X!^$XDV9-l}&_$a^hPt;*k6a$Zppw+?1_S#;EK!`zVNO%5>vp%E`*H2 z?1|K5CYl04J|f)!{pj=f)?=ipg3|XKJR>V``Xe77fc%jRjO_Z__q0j5Z_u@6%t9Fz zW5Wt43KX@!;U6G1qeluU$BPw`Ad#b~AdO7fJ&3krurKOYq5aX((HVKT(wT86{al{} z#F{^l!&BCAGa`5abR$R%1Cr8+QP8>el{*%aQdDeu*{mY0CMF79ieOlKQkw; z8mRCw@TtB*T}WK*37Zp<`kBC%wyCILA5z@3{9i3V1Xw?^16k-;t-R?X5WXKS#cTdgAMjgUcoCJ{Gfq=^qgk1`~9&%GWS9SJE#Nd zL5Y!ZUm0YQ~p1CJ{-;3;^Kz(OsxSh2*aGYCzZ%J2#W=nxi)M?`tU$csV zqd@eHrmf>6m$>qb4~?wmvmPO+fOAmzeO*?bw#I%nLzL$ERYQ(@t@V13w)Q*4?1*P~ z-DqAtv!Ujos|pw$iWpqYxhkFaKmAz6&c<3&GCN`s=oZCx>hpwOy{67r!!JE zRf(FqR3Ps`dv|TS?KLt-DeJL`hzB%%rlgBaWbUh*lj(I>>j=RP+BzmoRFL+6=kLs+ z>c*WV6}i9B{7IQHM)NNm?Z1d}5pc6O1Q&*Pd3_C4aa|6lgKLKJITys2PQ6Wu;(~&3 zz|mm_Q(Uj~W2y_$i>tZ>1^-SQmM!8Sx0cOUJvH{&A(HXx+AK1x39Pf@x9<-Y==M3) z%J^7EQE9KPus1(q#+8D>SG;ewY)66|g$XppT170ZGM_98L`Pz`6KtLVT=pcFZK}@i zdaq?{cvc69-Fvdem}&D0N@OuLaF?pLYgY5$PR-Dc43(0|oz{OJ@&qqavZ~|L@yJG< zgAD2QMqA2z`8Yybdo&A)ejG^yvO`bju+zh$S@zPb?4805#|}KwEcji!?6+C%papAa zxI47p7wgl-^oIr$snnp_5=MlEbhmPto{MlE9-!-~rDbO`gpY@09R51rA6iF{l-FYC zw5v+AxO-^B20pg^`KyAm+IKGPg{;a)uSdu8^ofiozWeLj!d^zipKTDJ!Wndet&9Jm zVQfq1wTZ~{+oht8hWd`Kjb5(Bx;dvaD|BUt-syD)6YJsYJ)_f!t4<1DyE_w@>d(jc zmu!6Q`*PM0&=9MTlb3~b79>#rU4jwNrqpNm`YO`Y(;RwKeJ*AfpVo>TrPVl^5|vFi 
zi)*pJcrA6jH{now@-jc34%TZ(=G=2Qf`TL9AW)IiD9W&9<+j1Ok3kknj0_vq8L>Y} zHTm%Vm!A3DkVF!py);)!iqcYrMNs>}u?^sn$nN~yyn|@f00=!J!VcNE>xygrnTlK0 z<=)CNNwLf`qooeUJrii|po$50ri)n$zOA|h)+L`Rup{BqN>`a-WZ)5&<)5n$>G6^` z+)G{{pJ^D|jH0GipHVy|_2+OfnY?l=q!gjqoUFolF@?3s+Q>TF&7pU>wWesDx9TVCv>pWkqeZ@iwMEiXA5MsH5G`+G694PUa^ zcg>wlz??8~&Fr4dR9TFE^hL~jQ;!(#)h_O5J5Bty8nvvg+P8SgaFZERC`}uqZih)% ztr~SNK3i_^yYNCE+AyLc7$aFr%aSvJ^YsV}!iCPuR<7&l0B`oP+yE-M!O?G93 z(&yT&Q|}ti_s2R~hK3!>aumH^{^7tiBgD6}&?+k)kl`9D3v;S9UUOyFom?6^iAtiiErR$gbf?v`&wU7LjXtiFVKKfnU&=RH;EU}c| zT8=zOYl_s`HTEw@WJnP*0buzW`#9CyOc$(@tx*`rx9*v`woT?m3z$V~js4$$s5Kmd z8?8Ye%U73O_e-uGYT=6k4kyK#WO30={c^{+W$om4%h#{HjY&%`L2YHmZ+0-hC;la7 zd;Cr{7b=O&oSaaExJRmFjSRiyhl>?O%+<04XxiG^ztypl@kFlQEAo0MB(JU4DHy8I zxZM@<-lRSAc&;P5`S41dwal)7jqRgf-!q@fjm-RQEhV$~gf_Dif#E>8otX%xi1AYE zN_~ViN1QPTbAD6w#0&r%CIBF{6Qi;F+9vKS8G!vyGDuGm!^0oSp!=3If&+ z8oTzVo7`qvpMq;NEvx)WVX@6%v6*nMzeBCHlq}85GrM5Y?|Mt#y|H&Y1{=099=U*~ zp`ntNQce7eTgoVnZDt#|UXv+3Tf&5DL>(NbKr0<8pU6ja}`r0{X z98iCn{UBi{!nRr)d|Suk;{LYm*BW!U>)79VihRq!?#mN0R@{>XwaW;FxMP}+imPjX zX&8TWMb^IOU+yb4hqbrknMJ;?9WqzMkWFC_^B0O5uPC+p~qC(o?( zNl@E#2Swip+h+Q-=jRt=@k5 zOcZ1uh?8Q(*6?Ks;)Mj8N+9AQwCa3w=;kW5lT+!7+?hamv|MxQqNJpxe`O!3`ikJb zM&2i{ee>44dW z5VA@eQ|Jr~2&nHmp>K3CMy+M6-*)Y6Ta7VURzEy6lDzI%p!R{suo>fy&?_Kig9wK9 zd$phT5DN!y6EU4~rp)<@i8B(Ow_F|DV|+hkh^RYACJ z!QP?Y3`P{e>BnkE2^L4cmKfmZw@O}Q+8qD!>YNFUkX~jngs`7O8l#j+$USO?Ni zP&_h}B0y7yjsnVnK$@w7yH5Wy)bYnz=~!zH>XIZ6RaI^lcnwZ z>M7Ri_zH5ygVh7}yy1BBpF;SZRUjCqf2|5hvV_Es)?wq#V?SoVF#j25e*k^J*aX2>~^)b z@2Qzjl(L@S91&LWdn|N8N%zpan??23sUU!^qFW6PuLETpcU_uVv{_}KqA4kTU?X5Q zCz4+}Vz$Bgfoi&cF9?NBH@4FrZq4urp&%h-P(oKvpiAz3H$*H?ujL`}ri{ZmMb65V zuQ~pQZ?q+=iV}Z>t^4+`k>l&~j0BRj*{s+icQS%m|1@trp)6OZBkqhE#BIvP9RwYY zx}7kIrPBInNrGg0*?3!nA`6HBepHOm4ZdN$9)G;-!O>*D-BHG(s;C%dKO1wh7(1&k zcHQ9JX3Z6)d-b1H4NNI^w|>21+Fr>a**_1bDix#2d z*!}AG=ObE?^0aa~uczGaD)BV`&wD5L!e&FA?ixZ&o_B9cW*v&&p14XDlPqyIZ^~+ zzFKVvPLbnWr@aU;UHpslRBYG#TwRg?HN3Nz$nn0(;MxOg9`jlSd0M9s@sMkG&=eI3 z+XKa2T`|z?W+jw=PcZ3&!02K4Z`bl 
zY%c#GCU!!@J=Z)9kG3v-^@Q&`XEiLgwU?d_gST;Rw4Wzb^WQNSP-2o5flN5d9Sr$>F!zD;Uj z_20+1cViYV1Ux?^mSR=Np}QUxNkDR&N!e`Rm+SfFvYdX?bH0!q_Gqya0w$_U&&$1o z2NW-VlWv;o1Yt^2XC74IY&>Po$)Ano6vheBm%9EN1CcAh78hS|u z^@^-S4oNgj^7!`0otB?wjB^zqJ!9|Bcz0j@qav-OLUVnHk=E@95_ZeEsgW;7Gi>Ka zN3(Z$U!dq3`*OGjVccR3DWl9Pd&jV7%5By(RYy46oL~^|cDt)ojXC5V!!!0~9^bpu zIZut`XquiRi=!EvjDuTJgPAA~Mr68)biQ1#>i#uexVhLO7PV@Zm|D=ZniY>gx7V&? zNE@!YPRB&mWD4-|0@)5I--xHn=g*#>@shVY;5LhFKpnS_%#(kL*WL=HWz#LO1n6zM zCV%YS#%0m%B@791SO)$@L*b;x=FoxyR)mm(;-;ovB*@&`>@RG#`1m^2@UK{50ge2Ere|MV08iQs27_KNR*6L&ws4mlWUuOm@ulsk&%}VTZlxMSFcH8%H zi?-zYs$&#U317`;TYm=s2ZEr3rHnO6&xPmf5n~Qw*oGl*!Juf156kf2OrHRR4!B&V z2VjfKlkS=Ly!FYp#2f(P<*&EZq!slUOWv08xYXcNzx>9H2vO{h%xR^|zv%p~1~!AU z@pn7W&0~M6ziypw+Ik=tJObzr6I0gs7n2;+j$IdzZ)N}J!jN)rL;5#u?D%$kRJb6i zrE%TwFSUQ5#Vp$Bb-87Oo1S-Ko$Z9N&Zu}2jJqP`^o!%J?4wa1bjI>0QomEpeqo#| z*b3ekbHj{yjY8Wu_%p?oO77YbrRvP;8XB*5&J2BViFD_pR^!KS%#AuVpg4fesFzTs zSp2$6Gro5UQjBcdIIW$Dg74XgvsR_Pe8so6(wDC}VAv`RH}p^NEvT|-(Gu-%W6(;_ z`~%l;>uK#GOF>qdI__sek|}-x)(yy#M#%`YXyfa9Wn)6d3+ZM+z)e zSG{{-#L9z}g}aNI8vVFtCMu6$eqV(bNXCC7XL)T_P+C8bx28ej)l72EN9mjVr=@jUbpD6Kvus< zuL_=Qd)(r)GPgL_9Q-YdU;-G9Z#2olmj7MDyDVQBTJo3##pC_KH*E8ZtDLH1$9VI2 z5bUrttD4YxQ->_DUTsZTRz74@{w!WnA0pk(@V}MTnpQc&qpETp!6?8CW(ko7r2ahqzZQCbPMfi_k-o9t@ zua5t&GJxBhz)B4(^u0SP9zfLPeO+cEl#I)w;wLjJ4aHxTw#aI0(;E?7K(hU*(?Y&w zg6+kM8liup;3OsRv$M>cW}&2`wU@^bb-t}& zC(P$%O7^;&jvLX4H9+t1Uq-p}^pT9Cc-85`wSssa32cHy;|d%s1A_{>Ec4rC>ZhZz>8;$gigNBBCS@X0VWV7GcaXyFX@ z!AFK}w-BAk#qr`gGKp#WUu`_UIm}`}WLeQXMvepAu;pM(jpKg(&!N;IP?Jpq_ryUk}GH+n1yzosH$7)H2#x2Y1* z+I3G5i6Zk$fa%ZRpsLM23SidH>$SJqQ2ii&N2<2Ff;~8v*^0JkC4Ah8f?{^txie39 zB4u6XZ~$rlvO*hvzvc|eR2p&#HhHy`l`6FSL~ecbHlV&8uEro<=?f-{NHF(4F!%j3 z*w>t`$$u(zu-Cj3Cr_VI^JrG>&}r}Y&fd%`NOi54Y*MvMsrP9{H#3ypWkm4A$Ev%r z8we?pg~B5S69C(lXV;xrkq*Njg_1eh>dUix<*b|BjRg!w9D5tmD%WzyHs#-L_I4_B zVQid=QVh*epbdJEeF;7j|9Y%t?SMw;y;eiR$nAwH{nQmJ8>}tc!(PIThcHu>) zxEC>svp=#d10+N{+&+qIl- zNvgTEn?~Y4esPn{WK{YJIyjAAI2&U7)sEl4qFi<)L2#p6k3$`zbreh0jh3*XeMu-F 
zbK@(n4XsVFnYq_IGr3sXH9q3S=3}@%|H*bPv3Ib{qEh6v?6Tg~&@jCG`TYG=lH9@3 z8CKiVFX@fufs=Uu)wg5~nD>exZXwa+|Ki7VO1qb6Y zFiivoFoFDz-EkF9@(3wL$d(rSGNs(p2|H>Cs_6rV4d*^niA5ISK*-rfq zs=)^gQXNDf1X_{VxsNH$Nik zy84%T6}Hu2X#C?4ro=6S>B|B@5-YxGp@N7B5VOPBi7uE3rpOo)?{cyv1=B zng508mYesJcU*p59yBp$gL6bLvAPrc21ZA27_t@A+Fi@Bsj9~lvf{7;#p!ecBv*tK zq)D;aM6fnapEg)_Cpxcz3>Sz7L;kSpL6dnBV1L2vPJpJ7|A~VVN(h+w2vTrc-g+P0 zPc^5z6P7<5VP0Y_zO4i?sSybJL5<~5ZftDQy)O)GV}c?y9zyfh)#Bf=ln?n_8RHs} z_a`zkyp1J=1@%gQ#=05{#cWr3?NEbw>7el>L+X$mNqW>A^Ne#AdhE`bRormmV0Oz^ za0DfsBl^W-1$sh{lvDcDX%{&hCrfE8#QF6q5Nsun6$?xDoH6b^|-gg$r^TARVf_m3>gh zCjU?H`NLJi;AaG=SD-g&8)lkHJ#IavsW7<4IPn$5r3BwL9m;9Qv)Zh%eWA1~C@AO) zr?#!N@J5qBAv&}9@_AsQZgMumm4$p2Rlr3K>1w>&%QjFCu7|8nd89;J;TUyGn7M8) z&CCR5>25Wk*n6*!{=pO(f%P=@-lohP()rTOWpJUZqa(BKfa9y;Rm~a9%;sjxn1n!{ zWqzh#@F8yZ$N<-K2aC@BqT-R>6mxdpsg;krxzQJUoQB2jpuK@EJSq1MBtdl7a0JKJ zMWL9-$k)wQW4@fT%9y7dE;kz~zqi*qQ|#or;gMMP$U>>mvq?+i26__Rw@^VyfHMu> zA5%w>nn&&6e!pnuFh~{f^O5UJdL$VkdoK%j{?2MVKTk8v8cY!Q zvc*S;PbUEr?xUj1{#s*CcVY8iWh>B%Sx6;oA$9$=st7M3X~`C_BvGqr+CwDj+kgxt zTI~Tyu zB*ibIEE>xGOD*_Mbjt;(QlLOg;=*+@CZ+v)lY<#aNyGbG z{TEcBR{CI+%8HN4+ZA?vU1NYn3+WyOdJXoY@v*wPQrw=Z3+Zr4#GXV?XVziD%VOwUuSDbQ(fO zUD?80bV+Cg28xoBl@f+(_H17mVOo_ZYh!;wo#O8)`(D$Wa&o9pRznjrtC?}Sll#Q~ ztb)a~`ZQTo-Jy|nc(O@~u2q5LEcfzkD)2N}j(aD;UsFqixCxJ5k*M=ih}GAGbVD=t z$DveePU`dL)piROx#k@n2lv&|?A{O8$Y-(j*6_wS+kL&sScGQQ>GD*g@r}IgKM5cf zXmC-rnH3m}+QsRxoQ!WUwBk!aC8ts6K*&POtn-~$k|v&blbnHGv1jAhQ1V%EaHP5! 
zOPNwPD@_$z`h@O-op4}+4|Na444U??i9WP-LdvYG5?yhTq(sG!fND^Sn*kJi+IQ*EpML>kd zcvt6CH}O(19j~t%Q%XK7j(6QTON)hKdFjk#y`K*AIx8!%s(!xQas~>?Sj0~f zzDjA_wCng`!b8B4PLOd@uTb4vayujZ&&cS&#_`nY>7h#34wx`IFEKIE3YZrtB8Ks8 zzF8?e)*(n#{eXX6dUGbCON4`p10*FfwuXMTds7A<*c2HO4Vp@&X5XWvKk*NqVA9Tx zDlllOxqeT*E-FM9^IbaKt}yF^XM`oQKQ;3#hj_^v@Q|;^OR2&gLW97nv5E4&;F!l(&$QUZYh5PG&H{e+u2S*y6ks}AbKvngz}?hG}JpJ{Wrp5F*Rh4KWZ0PmJ?z!^u1-)*dSaREXZI~w+oNZN}6*N zXAUPRQh$@dpd_wvNhG7THC;9E(PeB=3bGJ*WmjOBOw4PlA8#yLdQ3oHTUJ!Dh4xO- z?s-LmN6hw|>GVf*@i*BrW}A(dxYG3AhQ3jfB>oG(Er^SoyZs;A{lPx3I$b<^l#3F7 z&xX}9Kd&e)J3mH}?A;5>V0*d1TGufUUm?KG9WTy;?bK==aEw65b+bn^RzCZkgt+K# zxqDB#Uiw!?C8e!iFhsL!_R+-CI^0 zCrvsAGv& zx)8Y*q3wcyGe#8 zYxC}vTPJ&hy=e75DV{i_yQ$#4yqa?(7dzebUvA(mSxsaY0o<7cHJ*L?4w|MC8crIS zqa`g32@U6~vZ}2i%}y*$oAhjHqG(Hvir)Al%S03<{cL{z`WPa+2FCDfKQcV}GFlU9 z*~CuXt*4%|S(~_UzclgtnC-~E{L=dmR=Xstegmp6X~B=Dnwa*Aw}nWGe&krj70aVa z#{bbF+R-v24oTPSw9pQwA&a#j)1w7V)?^(_S@JpJPFdFgtWz!oF2)sdCzyc?E2duqY{VK zB@Vpa@fT0857{~Y(ei=IPnTk{ZQmE1bXONgt9|5tgKS_N7EQ{m@JH8F#AGPcEq=0H zpd+aHs*DVVRenF%mTJR)r>WWF?Q^i;eXu#18oc#)g4}V*fFD3ftDar+sZAmivK34m zFKi%3JeTCK;5W6`k%%@fR7sW{MbnKID5uizzi1 ztdge4ZOBY9%LwG{HJ1LLue^6GUbu4H*l|Zec$~}rQm$v`C6)!OJ*231vy7?zh|-4# zJcaLQ9q;Q)@+^uVOK=qAW54%T(BkpjF{+?<-p+h_HSpf?4?3u`)jV;B`R7fG(Yz zvT#*R@#GAC+Xe2-u$ax3y4^xG5qzFOGW`j9stcV$LqljSX$uT9`Hfgv2t&%yG+A#7 z@6zN@xd}=v0n0#AX145ed9vFfeDyW%UBA|Ni`WBIk%Uz>p%_2c3m5?8zUUdf)hrg* zn>ri!(=L&T+8a0P$?v*7qZiXwziW1yv$Pbn+pGtb?mL}vwEWnvO_XR)r%V2MzGKuT z0h1xYH^#!tXMrC3-o)pX*uvGLpBMY7dc?%?L=XITrH&r{Fe~a~1(CWujkp1)i zUHnDY8+9ZIZBuIObBnsvxW5)GKI#X>g{xH{D2?c*lw(a;Ikhci#FT9IIelNhN?N-+ zIJ!o2o%3u#`KH(}c@V(JwO8n!jmVvE-&fmR6l=U$)>`wyxkvk{&msHK!(DMZnzaT? 
z5qv|v6Y~Iz*=_$n*@CrHi?1yv@jCS$1^rI|sU*Jcv{m~?@2#3QT$?w!(lR#NR_jh4 zlU@-2vB~xPc7Y8Z<~d5Y)5ch-SodcSLIG1>Jln~l^c;UN$owBq|J}PIY>5f@{_y?_ z${S%%F28u5pEHR)eGem;Cu*C~%`p=@eOsE6h#gxx-Th9aG{MVEdrNY!N5tJ}1*VKm zuPtVk*N3cM9TB-N?veZPm<#JX##37Kh-=*W5;Owyj#dD6n1m2=>& z#oO`t z-s@lYT*0!%oiK1{**>14|Jl=VGxO&e%>9%&hmiuGVZKlVSdLbh!xU~CF({DZm7EW* zQ5>BIc`kmuY7JdJL?-e3?Cq~}a|!mG4GCSR3bjc3MdMr}^-) zpU@;Bq< z8TsOmG^v|yFNrh?KcYX6V2{>yuk${ur4$n|K5+Ez4N!hR>)HNyN2RMZftfo}ov@Jb z5vNdUh4G$_i1{7U_Qiz!M#>z9hK2U+ALFh`!cUp{xeQq|3kNt7NA~6)UUTi^qr>?b z9UDxu%dHR|AFE5@jf+F_r*8;6V9m$HR}uq znzK+7J$Z%C<2YzHN>3Ykn?!8=;pfYRG0#rwou!cbLU)VtWNl5{`&d37UE(jX+<0sC za|xHT>3lZWDWX+duhyJHt;nQzcF_y_{=?Hw!R8K|%FgBc!k_;&a(qn)%$r{7CpGEm zH$8oV47A6?iL^8jthnivf(7JxT5Bd0G)Qc^I|KxzyOHkPba#g!-QArc-Q8W% z4H8?rLAw6M^Zwt7M|iO3nsLXt$5o}aECw~glwlwdA!KCUN{Fuf z8wmw7-;G9~^S5Z3fe*N#bg31j$D%obsJ6^EN+~8nJ!eXS{1_Sj$4!|=jswK z*ovS6IOHz`J@IDWM-!^ApkX=KDLa^g@Y3y}?3 z$zGWn|BiJfCCJSbR8Xl&sa5C?<)|~56)qPlRgu2Q0QBsK^K@Pka?a0Qp9OgD9#>m# zJH0f!Oc~>X1iAlt5M6r``}jP=jpq5iFpaO8K;H_SqeZa@w-V9)oC(J)1bWfN&gEf^ z2JwmFSZrKQXjO+7zi*!jC%Nqrue(jGjZ>93o)$r`;VBnm`fIZE*45qv5Uw9SDn2ou z`dOVXJU-Ytt${BdbpSa%NujbkUuv3WQi!(ad|2GJ0O6RL0x-}?&@Bv{xQ(GR3`HmKz}e13?~z$q4tL~7Ovao^433%AtgA&(EVt@K#-?BYQ&$T-saWN9( z1GngO0h?1}`a`s2pO5 z>Wff;sm825IBDA+w@1{R_%KCcF}I!gd2%}n0F@{qiF6u`yVO4C6;aHmQ%L<(vvlw5 zs~t+NIh>Z@+MTc3<}2xVKjc)$aEa4Pv0In(%&_r%irs#|Sn{o6j-@nD@94)IMuTub zZZDwPsX@T7!6m{217=C}y(|P^$|)X`n040y3R>W)AEcm_RxuAkZ_vtO36E{S=ZaUu zFNn>2=|Mfu`}`8ucGj9v<$Yna&GNSq;L^MadEaK)n=2z0Xy+}HxT1L+ZRC7jn^I0? 
zKjEn%L}I1#@|D-;I;I$Su~V_9?!8})xBreonGOBlhvA~?9E{jDT;pfvkxc5%Wx1bi z`sxb1iq5|3VH*y%-d!r@R|tu`=_ex@+s)E%X^;d9S<#IO731sE-Dub2ev!N19r$$B z4SP!(1dmZzU!Pu}O*{ec6$@>m7LQaj$^3MP5~>zJRSi!-N^FI<+<^`iVSj{-0op_TH&oc0?{BQDjdX5*S$t0fozeb|~8!L%(BR__PBe@4Df{9LP#TifhGwcRlI%`zzqR zpWSAz@6G$ieJ11Xq{HH_*m_LSAE~PBio18(Gtpk|X9XNs_Hz4BOw?YHbqcs3Kg`a1 z+&V(8Chxxhvu*<4Fz6X#$~B@f;Xod95mJ3vv{={&7||Rp&!4&sr=Nm&Z(ANA?)*_| zf$X_Kfd1!3-;>1av@4L z9?%Y%`8ueHeKD0U&~7gmwt?TH3dq-9%JeV7J}?{niR2D69PV)YGuE)E)7R0}@{t^S zMk9E!YKUyZ{- z!3|<66lvgPhkbk5XY*4%3nv5yO&y2?#d&4*$oc}GPmAw->4vfq@u1+-%y@|f?1nih zlA{af&C%WYlPseW^!(UlvYP}xJ&ko;B+&cCQYtrWUf*62>p$WEQCe^{QbP9(<@<{r zX1!iJK%?II?|yR66f{vL&z~0Kcc`DnXGeq^K@Ji%lF~xQu99}LMz)~zhmc2+pkR6r zfvfm#cc0H5y*i)H^#8^*J)u>rSh>jiY-~7;I}EWF_=TEcIrfuE*k7c)CX_Kfb1wS5 zYT08oqm*RcFhz5pas@vGqrrdl|v={S+HaEBxV)@8Q=p4lg48)=gn^F%pBEA{mb% zi?f|r5!4b^9(aWMnnIodhqWgn*Yz!VCCb4w*5huTH4j1jm*<`~^jRtVF z=#9d`I|f!=l4F%exI~&7^4yS{l_%(STX%q=^EI2<_ee>Jwk`(kc#g;(dYR`LMf>{% zgPK(8)S^Cl7)Im*7n@TK3i3b0dQ)p_WLiooEbzN=T#(WFgUj46V0J*%AyA5AqIzH) z?=xZ_n!q_OAt5;j9y*|9qlrHECSG*GUFQ<#IG>infmN+iM=XHbLYPrezThp4Ep(6u zg}^1zDS~#y@%$YL#fR{buW!*gwP=P&Cz`kqh(RRs%1Q|sL9G5q{;UW#rK=b-PFkKsly`TYPlJr75Y9D!HwcF!cs4td-Vikf1(>KO5Yk?wA&qlJn-U^msC1bHv^ zIp~y^EwR+1gBd!{$OB2FFcRZs>kp=0bdM2Ru6lBV#CR|;Dcz;F%_F3h|GeHbIjzU| zc1QLIg~uJ}6vubTC(>M$7u0JcEgi6IZCNs6B-rd2$;!7<*+Maxf1RZ2kW)ySt8RO8 zjx+pfX>y9O%5-$cG!aEUcjRJJp{SHQo0z1)3mx5iZp0C6~8uWO$nL>5*!<@;Li$7bLdbJXj z*d+!-gGj@`fw442=OBOichPWG3{oIgw*}ey1rp?m>aZj3`d;`X5)vtLI)8;<92JAO z!cu_o10DeZxxKD9XEcjO)%2oziC$Tannt_U_iHQyeDb@_m0|F<{qwpet%zE@B&A_%0<^dg;vEIZ_AOP6%Q#3|#n)?i z>5!8;<1G9i28{n+t40}X1(kpNk|+e>C-P04h6on5igyf!6z*~}V9hQa-;aP?Nsh>i znibry70Es}^9>84&XJTPJ6Mv_``44d>}M=cZ;@Ys}! 
z3*TI@XUyYGe{6A-^_a(V6)IiW)AO%rO=)7NB~Z*j;ONoKs#Me+vH7w*`{O3H^A_ z0d*k%?)WeNCcS)Ljb^RUm0bn6mf@1tWXH(OFo+KT`I|Nz>4EHIwTVl3Zd}E`dOI2I z6HjHG1j3w9<5N?|^~fpn3J1_9n{JH=w6u&l7sB)jQ&W$TrVvHxyarM^Wnf9wp5g|` zxubtHXnyy0UB;oMT+k|K<*LkRYf?#u<`1_Anem!)Z`clLWXxleX$v!pE_AXsNB>yB zw_(UWn?}-Z&W=FCsF-CK9W*UFtS%;WDTor6PDAhTw&+yB0M~X>TQm+RT=XX#OEiDgGLuF{}!N5o3l3`kLe405(!H?^NSCpranE?v!2w%3mcRR zQPVbPMNQXCTemli4F6QhXgkm@C7x9cabg@k%RpcDP2!wTH{K1jydI@i{9c=-{^z31 zSevpKaTFH_^f^5(?RGCCFee%obtU55p(M~b!7v-pRtIZV74GPO^WCZkU?t$*ccdSq zqRBnm=o^)Cn02Zo}k(TOo-ox$nLUZ2J)fjr`62v zw_gFnI}WGa-#f$P(NJ*-pA_pk9Ke50IZNqa=u3Ptt$@3@cZf$CVLU|zFNHgsWZ2736<^O<6sx!>rE4UW;wj?Cum>d(G6 zr<^nI#kNMNkV)R7a_gNvQ|D%Lrvx6Ia;W;20X%;!O|~R(3^)9w%NGs`N@?CB8b&or zS7`(i48nm?ak8@h>k|0{2G=@}dM$@ff1XVP#%=XVzD*>lYGl z^(Z~JUMqwIgpKQ=DnZ#3o)xT`UfaF`!N`~`I8E)fbG2Nr;6L%TFe{G-VPPdaU0&!= z%fF+Nd;U_;MZ8b{`4w|@ohQKZ!+}vg5zbyKBt9IB$5Yo3wWtdGcKz1#UYmujFMQa9 z(d6D$Q(NVoBg|^9b2fegx;$$6{jUFAMmqZ>IL|&ln!Hi0&6N`1a)2cvr1?RUGe2MH zKo#eRG<>d0)C<6<>qRbKmCj) z2$3r4>N8G-a^WX~+4?^0L^-Y?WJ(5U;j-mirH_&cxTM4Dx~tn5=o)!VjY_sQBRaP6 z43grZ-a6K5y*L?qP1M)zI*%HLSBHi0IMa=T8i9;TU|EoW%ooPdvxxmz%Hu$<2E0BE zZRtk!EZf_f>hAh8un^rz^IvPjK-=2_#*(%TvNqQY;)Gbt=Rf-gX!S5cdwdyMAuj30 z^khv!Aw^1>*0a0L8r1g=0V`aL)3E6PySh$YSP*%KfjYKaDvffAez{7EEPs+jL_!Gu0ITu*jcN z3<}ekk0msMc`rE`VP-r39H&}+MbMsnoITPgA#OWP*STF4(glnnj zn-M~|$8y$aa?uDs?KD}`M(#;GU5&R(N~NL+1M(ZRdxB<_=FWsF+_CcaQdkgpA)M~% z2?_=T3Z<6s-S{%!O7j~i!!QF}JEBb;@uXWMcXks}dL4lA{jJASu05+sgfy`sEq@gv zlA6KodwvoN_Vl?U4l1~hZuPW5!lUJkP#aaT6!=z**?Tj;!Ib;0;;jDiQ!iO@ae50$ zrPL3-a>^Pwmy8lTEA+#0=$K8o7=)cI)@dEXA%&^jVysE7!$;$aDrgMTjkr;d%}7?u zcdF@#HsMl^gCoKzU1h8K2zcDXjH;nb5=k`Wx|&~WW952K9d$=6=C~BAxiv8{cjWH) z?o;<;HFa%4TAb6*I8So<>SvLuU>jiOu%gMf~($HA`gRP)N(mWlay=-%9xo{ND&Q# zzo6}hpi^134m({Sx8;;JQZ*V1I(bnrlp5Xu=G7JlsnPw(u~+t`5J*wsv7O% zGL+v&vsk*^s(!L@mi-uST0`4>KoMh@Pzgh?$XZIq-hB5$(I-_xK8Ev~g0-_F#4(~= z*W;cxlDUHWy{*AvQTda&(3=$yt^WBe+vopg=p99 zk8AR$tl$>coVLIl(1yI;LaC?n{Erc1L#wF?8_EQi)RNpjfD8ALPnm~dMJa-l3dP&4 
zONLF=h$sK;q}!(w)uj)6s0=}x$e~2&%)GZx%hNw2SWPsiyq`uHJy|QLc=Wfe`0XzE zSfn>4e?UTY2If3fk)ejm9Lv{}@N1(+*Zx~}XkJzq)4}z7mnzqYcO?P5=g_KVc1;3W zhqQ28%kioU1$adn-k#|hQLpfL*3{c>HQvho(aCw|oRQu5mnR2DWi%)v&wo&*7@H}O_oj5|`gzTS2YqzryZ=4V84KzZ#o%lIigGX{< zWc1D<^grwLszQ8metiC;X4au+8;E&xjDmSrX?!9^@TwU%3|98zPeZw zaiPp~_q<yhfg0h?cEd#0*8N!)J#a4eY-8b{HwvLS3IlLggfn;X5rx(HIx7i{A#ee?B(&p{-RkbpCjD)A|QoXzX0bu#n1VY^(ot#!KzHJ z%H(-pr?(3c4<~_Isa)Z3sMt7AF;t0eb*!R{Sep*q+aRbW|BIJY{YBCqG(~Ik8hs=j zZ;afhiYCFgl9^AOf!8&0OAyQEgpT*wH=$_svdF=2Y zy)GJ!SekftG3lT(EODzuC9Y9YDNF{m5?ZOuZ>4IX5Sp>sB+Rs%Z~StP;TmxG{tNID zlDs>)vsn7vHf*WpBI#7)+SMQ4oQ;*L;)@ZLgsT{B>KolFMe62-cfyRLCLHODETr}I zqiSm48?-*|NaBMlA``X!7w{KdSO(*ZSv&Q`XvMT*%P>~qKg=K%CglP}yj+HQaS5&y z-7~Uzi;P)U$F~iG`LT#>(ro`kCa@2-7 zX?t4pjG>Xh&0GmelU1-T7-^v^Q5rB-?vWK1YL#u@@f*g+=TGI&TE-5sV^3JK1$rDk z*bJ8okGi4bvfS#>MvE3 zngxhbg;L>rJiwZ97z7sLQHvIkR_-a6E`&jFyR^itqU4newJF(fvZ!dddmI{cD}5e^ zQ-41!dN62*9D1QxBpO6f*@*pHJmj*?-mEY490AM7etCOJQ6fqaRQ&;j6D1xIL0_pV zn)Hk$rV5$?k9yCwL?S4{&ORnmqQ9tois7WEPZl|3J5QtY1JB0V>d zLXHCnS0Sve4sZepq0TVTy(FpC0;I;Q+S+`d6xQuX-$??IeL~?}0Fn}Z;!#{bqA+bO z-z`Js8D@z^Ct*at^oN8{_QJtf^QgUh4U4d-iAIe$qx0tqf*9!{gQ*~D5^FcyQ)<3(p!)lUg{KYHDZ7eYIyk(FYR#Q_CUR4IY z2A;dBQTqSxAK7ipdsg~d7Cc&P+f7UY5u=Ya94hT}oKn?UW3l1tTY3cOx4dfgos?`8 znmj)kV|zG9Q|k*Y=_}tw$b$=EqmODWcchsftE$jrqD`EVN4%NYP}!27ecJC6l~bGS zNgbF(kkY*V%JNsa);Em!T5sLq{otUDEOAqALnb<lXGo&v-HS)e4pu~SUJ%5&al>@fl9`S z#Q1b4Z3$Hdr;WYkOTwA|h@Tahg{N4r0zH}|RGTp%kF?W%E2w>Qa}Qjd)*z{2+kPPa zoBN32Mx#CSOULy|Ar1)815=suru_;^KZ`5I-=c}+h_qfBo=?&J?Joi95&Pr|ZS^ku z?BeRkx4%57?PC4R=Y8Q1;=4%OHOUGsSX%RXGVjUBhd*(xMiHm}Br&93mbVU%`{#{^ z`pkIR8+N1y&d%f_&pxrHmFw*D%`n#J^o!3WUyIg&$ei%M>jVf-X03!>DGlmW4PidO z|MHKS#&0>vI6QH#?#*vUs!)?1_z3l$z#_mhYgo(3O0fZ7+$3=8d(nHR(bYdau&Df^ zAc=qXOwsj)iv8&vWw?ZYzAGUG8cp)u2(~c6^gLYmA;;JnpPsgnT~6%evzt7xG)U&9 zel`Ong1}1xltrS%6S$qsfWjgVtK`9ywfNi?+q$*7bFpg7T*-VgZ`K&&%Q3_vQ`h;~OrdF#Bg`d%MIsL+Z?(saKsZ*biot!w5R` z+u;CTRAAld%j;9nRm+*e?bU#jz}WsMb@Snj%>BfK0A!%b6T1tr(9-O5q#Ppxe9l@9 
z#QDzw*x>QSP|fj$i4K5w(RnC}fJPJQmH`IFnmQqb<5^!&84auGz(Q-@^ZE|(zA)M#)m<&iV0ukm!jmQn*Uap-BZjBVQ|BeQMMW$ zOK2W%i*Cb#)F})RQCK@MHdN&Y`qQ14y2B!4*TkANZXn0q%ob2t-umNpxL9V;;C2MeF~e(H=RK?p5C03%1AHH;OOQrTlI;3n3%XAjwa_~bZqq(8aH z{DL8a%AML5!<)JO>1<-$AG*^A5FWC*F-Z{lumgV4ruUGs=DYj|^B~eQ_el%MBFTin zjprLT?}N{K7d>^8rY`~SV(~+bZ11LB7<4arQTYC{M0?+l)eMrEG~#xfXP9*|L$}cD zvlwQ8csj~}xYy}_3HlS_+R^f=?<%$d9<^$1Uxc}6vV9w&o9s>7BQ6;^8P)44+Pbp1 zqOXjxb>&2koJz6kGG%cFlop_KIJ@cRRfMf>wBRFf*Ph@~s;$*s%RmPD}L zlhHzj_R!_~*KueWll!I(KMKjw6M}>4mCCPe7Zi6Y{5_Zh* z#lO+=j}3+E@4NQLz#Hx;lftbR5_rfIh`h9M>FGwqMV=u{qsX2S^72^FW{^N{R*?|FxUP z<}7D-?Q1(L<9`cXD1ZSIv4OP0`R|E?WQ_a?zr9k<1JA(BP=Kyq;^@Zukn<--Z^ z4e+R6CN@~iPI-Y>SD3&O_y+-K#V8|HaIj6xa2Z3N;}5x@N9SV?xmU`>7yj(z2Q3y& zTgC;V7xablz&tl0&UZ zQ{;6M3LYT=PQrxM9P@Omai2J0>j>+0~yLi0kSd3#$o>HEm)^c&PpviT0s9;XrWXFQ&@#Y7QsPFZuJ3mYUESsO>H zl?4@Gbwa+gO4jb@gA zm@daV2?-{!Baz2^fPas`3%pa`I)_@fK92t}A*z|Kk5A225~6AF`DcfOH!Jq~Pi61x zBj1IB#|*JEo|USRgrEfwswxXRKPMy1cz;$e;80s3yJ?BiLp_n>8*XX@`l!=KnTjx& zW}3s=4v~uSOZbN(uz~lcW1n|_?;cP3B-hEmeq{o95nM-*oNjlN4>-rt6@Nm|(1J*N zuUH1Zm8vwKDWGn85y8U3iaLz9>*BD~ui8Fst9Kb4#8a}Te|!jm*Q|A>eZ9W6JD)0u zIo}^m0y*9p&OJV&N#`Xh;08%jkGcG9C8{vROca)udKDBP2no3WvN|k>I{#jbs3`Oy zRv<#*)HXm)k`@H0R^}oY)Iih~s)`}hs9-=Jso1$eL=L1cv*ncyI3IDXglMCAc5#2d z3`=|)H8eMmrIC_!kmslouJ&;5nEiNo>kOBxT7lv@VYcA>MnC5BZHq=~WiqswP%nH$D$WUGRo4C}h1FJWhltZ~)= z7V<6Rv*ctEUWkT-At>xzCzY+iAxhlWoUyL!$Du4|3_vFFe$DBYFYW7u zgGPfV#}_?dq}wv7*&jM`p&Um zJ7Y_BBQLm=h%V_Z-;Ep7XB~9nxr&u4jG}t>t&{SPJP=4j^FwLEn*mN7;xBC8jUlaw z#Q0r}A0fo6ee%b~{a7Pk?3xv^WB@#`h_!UxQ-sI~o$0p_v~hXgb+bU%>0^ zQ3t@K+5+wp;=da3$;rWKlvdV(`mdLt)*Nni%@OrDR{2LHr!ed(dXo?&;_-jjV+n_u zJ-;aAD7-WQ4f|LR`Rd*Q4Nq8^V}$=I8{Sw#MEQ@^P&>xl_T?l$Rd6kWfNuSEwQ4K2 zqsFDlE7_PhQ*FOKjf9Ej5i2ej0q7=2dz_`2otM_}kI!#B0^GEYFIRgLJ4sdElwIvx z@JtWmC7Pby&?B%$XrhIk`HIV<JbT!z(3hQ<>- z?UwJrHk}VFJubS&owaC!k|X+r*TTM@Ck@^QA3jw-P5Kp^Fm%?JTSZ>3ZmjjD5x;Up z+-S;LepCx(&<$cWub_6XtSd}m+mLRD`+n5Qp23(ny8LrQGTD)@KhV4$d1u2R>>&7m 
z57cwu#H&89c>;wPLRt%UVm>QpFd7P&K;qozHtqPJJ{T%cDa}O?7=P>WBR}wAt$w_tUO=*Y5VDH`OYf7Bz#<7WxxJg1>i~*~39z!^tWavMm`l&q`w^O=BGlh9ipAQW6RuKb#yl5-d#T#7PNR;LoGOpDkc@JS@s*8=TCLu=XXyxV zkzIo5Pkicn`B;7whj4!MjErj6?W7V%ml#G>B}WLRAG1Tn=V7t?KP>>Bf?tOC_rb;l zHczFCCGwc++wFmV9!hK+unW#yww6`__@DX_p!tLE(kf7^*yKSOH0QL{7z+M)gs!x$ z4r4l*6$9C~MTf z@mQ#00}h%OcaVLq-j6Wt-P8@i;srvC=V^fH1dPxaE?PWvn&#?)Ey#3zUmTEQq^cA5 z_p?s0=doFZ)==bcRGr`T`4um0OpbHch32PrU%LY=8{SEB1en4^Kr5i~hO^nG6DwQp zNA*rT`*w6hE~Okml8YV8K8P>5^X=$tjMG;hBT?yOj(8#9xS{m2ey;0;6s)}Sj%OM; ztApYac(@=El#EoC?H^~{Q3SsB*q`6S*UeYMbQ_2Xl}iU}8F}WuNq_Pbn^inJ2Bb*= zh^1vw@1SYuB_A+W6?4Wd3FbLmsyr)=wz0WCA%z^+DTgu;F`jz|vQik11D3R(+1SF- zH`TmBi{!RJ2v%v_>_6hJ`{oW^=;pPOU%CNJ#6MJI7ONYE!gx^!+5C?cETU5Htr(*r zn-4_MSjZ?J{aLw2g-X)5J`()LotvE8uT2b$!WlIG_3R^R5qNYzJQrJt2tZWcPt?>z zI!pOC2IJM4=9QC%83Bpv-?E8BoI;%7;Na9cduWTysj~;(7VCFOFZ|^VKrF;rx!T?N zcrp~cWf;!|K-~ui2j7?<(^>$tC`meqLJANZ0YIQxYtGU+D=uktv9u(UpMQp0ua6Fn zj!F(y5&A;Iomo~$Frvn>-Hi06%=ABT>c0teR6HUIC|a?VE?{S9@;O!vZl*|Vzt-5H zPRl-nXcV{8#NUKI=c$MOzdlAbYCWCFeaUNwI`E2Q*P5e)E<`kO7DplNcCl#aW%$tI zcn%6NT_XFrC|N-@jo7&tH;8ChHjtT@Ch8#l8M#!YtY0y7fwsX?5H|>cc4}~wi>UH! 
z)>_QnSCHOP@roDN#R^4Qeaf20iSvMKr4~Ui4aGb@GlJwS7WwC+q;o8!^F)~$GwFs+7s+h*mPUs-{|_MaPqstvI&7C2rM3av~@b_ z;#Tzw9y7CKre!HEd(40z!K`I4&7?>PtN+>69+#23b)=r(!H%gQWif%YQ3H1MJzX7H zPIQ8qwQJ_FO*@xeq>waso_vjo*oEIPY|xdH){$!#j2J<8AR{x+PKiZh5>O=6fHtYH zA*NsPXV5GDIU9D$@6x2{9@$p4w;VFuWRQh`I7iQ4<$e+2ewW?$IRXiWJ{()Tc#rKA zK-C!n)-hv-A%u;p9d|b-DGtD-)Yjt*&SF@8D6y^=BXzd(j9SArKrM*|KzcQcdJ&Dk z{v@Iq1dX4THY}?NEEpHiy+iQF#&NdYaRRAP+*|Vf?LUq?VTk7(3&=L@S(W7G|Fd9B zIvfov6;DjxN_KgfA9@-b5GeV9CxO0xgso!lnfUehVI=_GjATLb=fD}dW>pMSW$U_4 z+eIj-Q8FS-YuT-zc%&7Sn&N3gDG>VWV)EigD@4GMyc3;~;DV-P81=g@DZ;z!tN#SJ zo2^cUpJ<`BLu$V9@LW}0Z{(uNbH1*sk8BeD&?C)P5^dr9mwJ0-_?e3bUR~E~yiQW8 zJPo+K5k?Vf>(TcGC;dd5>P!dg&grP z8=`YPSPVi{zJ5wS^UDFC=06m?FB2dXODp4ajbTT8cey{B4|X%~K)B{94J}6jw1*#2 zQGtrF-jLjv4cig3mC;odWV3tu(-u=Xi`d^$}eFL}k@2J`qht83z|BwE8b{CHZqB<#d4WQ`9KN3kYi?yJt* zju=4wgqkX8AIWN?r1o@5<@)a6iWe1^+NJWSam=-m7B~k1u6{`Qi#9?>$pY)7t({Hv z(Ij>5a>%9Yk(s<_(ffDSw!KM&*PcO_wl>6H=GDSVYwDtw<*WMmkdCcytNWv);x%kY z^J#jo+Br3$n*k;%_eDP@AY2_rkWDXsv&DR4@4WGDkUxHGsQ4<3u+0U}HiJ3juuAt% z^CcNDH*;LIX{rHO#BZFU!o;7vee9844z60`JYnjfAsc1iy5&NxipG&|(|YN}nB}I@ zw7{A~gF!=So4&{O^%fPYe}9RFVNgmZ(y-#~Ah!eogiv<5Rr()q@TZrDs@HT~*X5u$ z4)yf3qPT>_w3ZWK>UlLKf_eeC(B+61OjEH%iNj7_tt69E>pT4QAkwnRr^#0a*bCHW^n>WycYi&JV0Q!AS%iw2iB09EhPj$1vzvKAmD1I;(k z5Ig=yfFn%=pp(b4y-vfF!nJK%f-DqmgqkUhDS(tVJV59c+p$bGM z#sj+}1{hNWXyrf%xaRwK0&<=bij0i32Mnd4s{t(B8-mfYL2L5+?SJd5nBOm&*l+nk zvAkCKKR9k3@Y(opTP7QREJv`3WzLkAWCN{2VxVAX^*&P{z$1B|Sm{H6%4{F5NQD|i z&dw8)_;G0{^^2X}SNjh8F+4D!Zr<6BU^WFvb$b9#k^%unXK@FxOot@oJW;%1SxT~8 z@r4lp3?J{qBVpMh>I@E z0Qz<1Lh>CF%)-XTWZR!>Cg<#?U0{K?lye@i6tOPmt9-G+0I%43(<$c>q-FIR5J5jY zJWOddsXt)Tzaps{Yz@Lvm7mhXOhLGIleKLPZ;qF}0j&0ri|!kU_l+|Rpy(x|fjPQA zz*Nl@;0JgGunyAge6KRzppPf@==!e@i`)QQ(qqC(uMaRt1k&x!-ZzLcz1{^mCyfE)nkdffH;B1vn}poNTr;y8$Bb`G#olhr#8RRJqd zy`J}wo`aw^G%9}YaP07)GFyB5*rxU!Mv#3sbiadak9|y5OF4yD z*7g_cWiFS-`Q^dgL!7@jjW#g!HBJR{Tfm9pOH6MM`X=3gI0xi^JyTQPyaqCr!cFVK z5^D+fbwLX8g>uI7{P)Jm44MI~#T^`j0B_Nf(|V~JAj_^S04!s|Dfu8478ao3S+7;$ 
diff --git a/content/english/hpc/cpu-cache/mlp.md b/content/english/hpc/cpu-cache/mlp.md
index 34a5c5de..8fb7fdca 100644
--- a/content/english/hpc/cpu-cache/mlp.md
+++ b/content/english/hpc/cpu-cache/mlp.md
@@ -1,8 +1,10 @@
 ---
 title: Memory-Level Parallelism
-weight: 4
+weight: 9
 ---
 
+On perfectly pipelined systems, it would be equal to the latency-bandwidth product. But this isn't quite true: we measured the latency of the RAM to be around 150ns, and its peak bandwidth to be around 40 GB/s, which if we just divided
+
 The fundamental reason why [linear iteration](../bandwidth) is so much faster than [pointer jumping](../latency) is that the CPU knows which memory locations it needs to fetch first and sends the corresponding memory requests far in advance, successfully hiding the latencies of these individual requests.
 
 Exploring this idea further, the memory system supports a large but finite number of concurrent I/O operations.
To find this limit, we can modify our pointer chasing benchmark
@@ -50,67 +52,3 @@ jne .L9
     mov edx, DWORD PTR q[0+rdx*4]
     mov DWORD PTR [rbp-128+rax*4], edx
 ```
-
-### AoS and SoA
-
-Exploit [spatial locality](/hpc/external-memory/locality).
-
-Let's modify the pointer chasing code so that the next pointer needs to be computed using a variable number of fields. We can either place them in separate arrays, or in the same array.
-
-The first approach, struct
-
-```c++
-const int M = N / D; // # of memory accesses
-int p[M], q[M][D];
-
-iota(p, p + M, 0);
-random_shuffle(p, p + M);
-
-int k = p[M - 1];
-
-for (int i = 0; i < M; i++)
-    q[k][0] = p[i];
-
-    for (int j = 1; j < D; j++)
-        q[i][0] ^= (q[j][i] = rand());
-
-    k = q[k][0];
-}
-
-for (int i = 0; i < M; i++) {
-    int x = 0;
-    for (int j = 0; j < D; j++)
-        x ^= q[k][j];
-    k = x;
-}
-```
-
-Transpose the array and also swap indices in all its accesses:
-
-```c++
-int q[D][M];
-// ^--^
-```
-
-![](../img/aos-soa.svg)
-
-Running a bit forward: the spikes at powers of two for AoS are due to SIMD, and dips in SoA are due to cache associativity.
-
-### RAM-Specific Timings
-
-![](../img/aos-soa-padded.svg)
-
-```c++
-struct padded_int {
-    int val;
-    int padding[15];
-};
-
-padded_int q[M][D];
-```
-
-The rest of the core is the same: the only difference is that they require a separate cache line access.
-
-This is only specific to RAM: on array sizes that fit in cache, the benchmark is actually worse because the [cache sharing is worse](../cache-lines).
-
-RAM timings.
diff --git a/content/english/hpc/cpu-cache/packing.md b/content/english/hpc/cpu-cache/packing.md
index 201f0eac..6aeb0ebb 100644
--- a/content/english/hpc/cpu-cache/packing.md
+++ b/content/english/hpc/cpu-cache/packing.md
@@ -1,6 +1,6 @@
 ---
-title: Data Packing
-weight: 10
+title: Structure Packing
+weight: 5
 ---
 
 If you know what you are doing, you can turn disable padding and instead pack you data structure as tight as possible.
This is done
diff --git a/content/english/hpc/cpu-cache/paging.md b/content/english/hpc/cpu-cache/paging.md
index f4583ee7..2312c838 100644
--- a/content/english/hpc/cpu-cache/paging.md
+++ b/content/english/hpc/cpu-cache/paging.md
@@ -1,6 +1,6 @@
 ---
 title: Memory Paging
-weight: 7
+weight: 8
 ---
 
 - There are other types of cache inside CPUs that are used for things other than data. The most important for us are *instruction cache* (I-cache), which is used to speed up the fetching of machine code from memory, and *translation lookaside buffer* (TLB), which is used to store physical locations of virtual memory pages, which is instrumental to the efficiency of virtual memory.
diff --git a/content/english/hpc/cpu-cache/pointers.md b/content/english/hpc/cpu-cache/pointers.md
index 5b46d97d..8c5ededd 100644
--- a/content/english/hpc/cpu-cache/pointers.md
+++ b/content/english/hpc/cpu-cache/pointers.md
@@ -1,6 +1,6 @@
 ---
 title: Pointer Alternatives
-weight: 12
+weight: 6
 ---
diff --git a/content/english/hpc/cpu-cache/sharing.md b/content/english/hpc/cpu-cache/sharing.md
index da5b93ea..f3d3e23f 100644
--- a/content/english/hpc/cpu-cache/sharing.md
+++ b/content/english/hpc/cpu-cache/sharing.md
@@ -1,6 +1,6 @@
 ---
 title: Memory Sharing
-weight: 4
+weight: 3
 ---
 
 Starting at some level of the hierarchy, the cache becomes *shared* between different cores. This reduces the total die area and lets you add more cores on a single chip but also poses some "noisy neighbor" problems as it limits the effective cache size and bandwidth available to a single execution thread.
diff --git a/content/english/hpc/cpu-cache/sw-prefetching.md b/content/english/hpc/cpu-cache/sw-prefetching.md
index 2db5ce19..15da22f8 100644
--- a/content/english/hpc/cpu-cache/sw-prefetching.md
+++ b/content/english/hpc/cpu-cache/sw-prefetching.md
@@ -1,6 +1,6 @@
 ---
 title: Software Prefetching
-weight: 6
+weight: 11
 ---
 
 Sometimes the hardware can't figure out what to prefetch next by itself, and in this case, we need to point it explicitly.

From 634f56ba034a070b6fb45705b0ef38f9d6bbb4a3 Mon Sep 17 00:00:00 2001
From: Ivan Beletskiy <63805710+ibeletskiy@users.noreply.github.com>
Date: Tue, 1 Feb 2022 12:34:27 +0300
Subject: [PATCH 074/531] Update segments.md

---
 content/russian/cs/geometry-basic/segments.md | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/content/russian/cs/geometry-basic/segments.md b/content/russian/cs/geometry-basic/segments.md
index 80734b25..7201388e 100644
--- a/content/russian/cs/geometry-basic/segments.md
+++ b/content/russian/cs/geometry-basic/segments.md
@@ -1,6 +1,7 @@
 ---
 title: Lines and Segments
 weight: 3
+published: true
 ---
 
 A segment can be defined by its two endpoints, in either order, because unlike a vector it is not oriented.
@@ -79,11 +80,11 @@ $$
 If the line is instead given by two points, we can do this:
 
 $$
-\rho(P, L(A, B)) = \frac{ \overrightarrow{PA} \cdot
-\overrightarrow{PB}}{|\overrightarrow{(A, B)}|}
+\rho(P, L(A, B)) = \frac{|\overrightarrow{PA} \cdot
+\overrightarrow{PB}|}{|\overrightarrow{(A, B)}|}
 $$
 
-Note that the denominator contains the dot product.
+Note that the numerator contains the pseudo-scalar product.
 
 ### Intersection Point of Lines

From 4deb57bacf8f9cd12eafefb41bfeb56ee9a1f7df Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Tue, 1 Feb 2022 14:24:06 +0300
Subject: [PATCH 075/531] fix point-to-line distance formula

---
 content/russian/cs/geometry-basic/segments.md | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/content/russian/cs/geometry-basic/segments.md b/content/russian/cs/geometry-basic/segments.md
index 7201388e..2b686bbc 100644
--- a/content/russian/cs/geometry-basic/segments.md
+++ b/content/russian/cs/geometry-basic/segments.md
@@ -77,14 +77,21 @@ $$
 
 You can think of this formula as the dot product of the point vector and the normalized ($\frac{1}{\sqrt{A^2+B^2}}$) normal vector, which geometrically equals the projection of the point onto it.
 
-If the line is instead given by two points, we can do this:
+If the line is instead given by two points, we can express the height from the formula for the area of a triangle:
 
 $$
-\rho(P, L(A, B)) = \frac{|\overrightarrow{PA} \cdot
-\overrightarrow{PB}|}{|\overrightarrow{(A, B)}|}
+A = \frac{1}{2} bh
+\implies
+h = \frac{2A}{b}
 $$
 
-Note that the numerator contains the pseudo-scalar product.
+And compute this height like so:
+
+$$
+\rho(P, L(A, B)) = \frac{|\overrightarrow{PA} \times \overrightarrow{PB}|}{|\overrightarrow{AB}|}
+$$
+
+Note that the numerator contains the [cross product](../products): we used the fact that its absolute value equals twice the area of the triangle $\triangle PAB$.
 
 ### Intersection Point of Lines

From babf04ecc5c8f275ee39c1eb0e5fa327d0912089 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Tue, 1 Feb 2022 19:58:40 +0300
Subject: [PATCH 076/531] alignment

---
 content/english/hpc/cpu-cache/alignment.md | 114 ++++++++++++++++++---
 1 file changed, 101 insertions(+), 13 deletions(-)

diff --git a/content/english/hpc/cpu-cache/alignment.md b/content/english/hpc/cpu-cache/alignment.md
index 44724710..e3cf4f32 100644
--- a/content/english/hpc/cpu-cache/alignment.md
+++ b/content/english/hpc/cpu-cache/alignment.md
@@ -3,11 +3,48 @@ title: Data Alignment
 weight: 4
 ---
 
-The fact that the memory is split into cache lines has huge implications on data structure layout. If you need to retrieve a certain atomic object, such as a 32-bit integer, you want to have it all located in a single cache line: both because hardware stitching results together takes precious transistor space and because retrieving 2 cache lines is slow and increases memory bandwidth. The "natural" alignment of `int` is 4 bytes.
+The fact that the memory is partitioned into 64B [cache lines](../cache-lines) makes it difficult to operate on data words that cross a cache line boundary. When you need to retrieve some primitive type, such as a 32-bit integer, you really want to have it located on a single cache line — both because retrieving two cache lines requires more memory bandwidth and stitching the results in hardware requires precious transistor space.
 
-For this reason, in C and most other programming languages structures by default pad structures with blank bytes in order to insure that every data member will not be split by a cache line boundary.
Instead of playing a complex tetris game and rearranging its members, it simply pads each element so that the alignment of the next one matches its "natural" one. In addition, the data structure as a whole may be padded with a final unnamed member to allow each member of an array of structures to be properly aligned.
+This aspect heavily influences algorithm designs and how compilers choose the memory layout of data structures.
 
-Consider the following toy example:
+### Aligned Allocation
+
+By default, when you allocate an array of some primitive type, you are guaranteed that the addresses of all elements are a multiple of their size, which ensures that they only span a single cache line. For example, you are guaranteed the address of the first and every other element of an `int` array is a multiple of 4 bytes (`sizeof(int)`).
+
+Sometimes you need to ensure that this minimum alignment is higher. For example, many [SIMD](/hpc/simd) applications read and write data in blocks of 32 bytes, and it is [crucial for performance](/hpc/simd/moving) that these 32 bytes belong to the same cache line. In such cases, you can use the `alignas` specifier when defining a static array variable:
+
+```c++
+alignas(32) float a[n];
+```
+
+To allocate a memory-aligned array dynamically, you can use `std::aligned_alloc`, which takes the alignment value and the size of an array in bytes and returns a pointer to the allocated memory — just like the `new` operator does:
+
+```c++
+void *a = std::aligned_alloc(32, 4 * n);
+```
+
+You can also align memory to sizes [larger than the cache line](../paging). The only restriction is that the size parameter must be an integral multiple of alignment.
+
+You can also use the `alignas` specifier when defining a `struct`:
+
+```c++
+struct alignas(64) Data {
+    // ...
+};
+```
+
+Whenever an instance of `Data` is allocated, it will be at the beginning of a cache line.
The downside is that the effective size of the structure will be rounded up to the nearest multiple of 64 bytes. This has to be done so that, e.g., when allocating an array of `Data`, not just the first element is properly aligned.
+
+### Structure Alignment
+
+This issue becomes more complicated when we need to allocate a group of non-uniform elements, which is the case for structures. Instead of playing Tetris trying to rearrange the members of a `struct` so that each of them is within a single cache line — which isn't always possible as the structure itself doesn't have to be placed on the start of a cache line — most C/C++ compilers also rely on the mechanism of memory alignment.
+
+Structure alignment similarly ensures that the addresses of all its member primitive types (`char`, `int`, `float*`, etc.) are multiples of their sizes, which automatically guarantees that each of them only spans one cache line. It achieves that by:
+
+- *padding*, if necessary, each structure member with a variable number of blank bytes to satisfy the alignment requirement of the next member;
+- setting the alignment requirement of the structure itself to the maximum of the alignment requirements of its member types, so that when an array of the structure type is allocated or it is used as a member type in another structure, the alignment requirements of all its primitive types are satisfied.
+
+For better understanding, consider the following toy example:
 
 ```cpp
 struct Data {
@@ -18,29 +55,80 @@ struct Data {
 };
 ```
 
-When stored succinctly, it needs a total of $1 + 2 + 4 + 1 = 8$ bytes per instance, but doing so raises a few issues. Assuming that the whole structure has alignment of 4 (its largest member, `int`), `a` is fine, but `b`, `c` and `d` are not aligned.
+When stored succinctly, this structure needs a total of $1 + 2 + 4 + 1 = 8$ bytes per instance, but even assuming that the whole structure has the alignment of 4 bytes (its largest member, `int`), only `a` will be fine, while `b`, `c` and `d` are not size-aligned and potentially cross a cache line boundary.
 
-To fix this, compiler inserts unnamed members so that each next unaligned member gets to its alignment:
+To fix this, the compiler inserts some unnamed members so that each next member gets the right minimum alignment:
 
 ```cpp
 struct Data {
     char a;     // 1 byte
-    char x[1];  // 1 byte for the following 'short' to be aligned on a 2 byte boundary
+    char x[1];  // 1 byte for the following "short" to be aligned on a 2-byte boundary
     short b;    // 2 bytes
-    int c;      // 4 bytes - largest structure member
+    int c;      // 4 bytes (largest member, setting the alignment of the whole structure)
     char d;     // 1 byte
     char y[3];  // 3 bytes to make total size of the structure 12 bytes (divisible by 4)
 };
+
+// sizeof(Data) = 12
+// alignof(Data) = alignof(int) = sizeof(int) = 4
 ```
 
-Padding is only inserted when a structure member is followed by a member with a larger alignment requirement or at the end of the structure. By changing the ordering of members in a structure, it is possible to change the amount of padding required to maintain alignment. For example, if members are sorted by descending alignment requirements a minimal amount of padding is required. The minimal amount of padding required is always less than the largest alignment in the structure. Computing the maximum amount of padding required is more complicated, but is always less than the sum of the alignment requirements for all members minus twice the sum of the alignment requirements for the least aligned half of the structure members.
+This potentially wastes space but saves a lot of CPU cycles. This trade-off is mostly beneficial, so structure alignment is enabled by default in most compilers.
 
-By default, when you allocate an array, the only guarantee about its alignment you get is that none of its elements are split by a cache line. For an array of `int`, this means that it gets the alignment of 4 bytes (`sizeof int`), which lets you load exactly one cache line when reading any element.
+### Optimizing Member Order
 
-Alignment requirements can be declared not only for the data type, but for a particular variable. The typical use cases are allocating something the beginning of a 64-byte cache line, 32-byte SIMD block or a 4K memory page.
+Padding is only inserted before a not-yet-aligned member or at the end of the structure. By changing the ordering of members in a structure, it is possible to change the required amount of padding bytes and the total size of the structure.
 
-```cpp
-alignas(64) float a[n];
+In the previous example, we could reorder the structure members like this:
+
+```c++
+struct Data {
+    int c;
+    short b;
+    char a;
+    char d;
+};
+```
+
+Now, each of them is aligned without any padding, and the size of the structure is just 8 bytes. It seems stupid that the size of a structure and consequently its performance depends on the order of definition of its members, but this is required for binary compatibility.
+
+As a rule of thumb, place your type definitions from largest data types to smallest — this greedy algorithm is guaranteed to work unless you have some non-power-of-two type sizes such as the [12-byte](/hpc/arithmetic/ieee-754#float-formats) `long double`.
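Editorial sketch, not part of the patch: the effect of member order described above can be checked by letting the compiler report the layout. The struct names `Unordered` and `Ordered` are hypothetical, and the asserted numbers assume a common ABI (for example x86-64) where `short` is 2 bytes and `int` is 4 bytes, both naturally aligned.

```cpp
#include <cassert>
#include <cstddef>  // offsetof

// Members in the original declaration order: a, b, c, d.
struct Unordered {
    char a;   // offset 0, followed by 1 padding byte
    short b;  // offset 2
    int c;    // offset 4
    char d;   // offset 8, followed by 3 padding bytes
};

// The same members sorted from the largest type to the smallest.
struct Ordered {
    int c;    // offset 0
    short b;  // offset 4
    char a;   // offset 6
    char d;   // offset 7
};

// Assumption: sizeof(short) == 2, sizeof(int) == 4, natural alignment.
static_assert(sizeof(Unordered) == 12, "padded layout takes 12 bytes");
static_assert(alignof(Unordered) == alignof(int), "alignment of the largest member");
static_assert(offsetof(Unordered, b) == 2, "1 padding byte before b");
static_assert(offsetof(Unordered, d) == 8, "d follows the 4-byte int");
static_assert(sizeof(Ordered) == 8, "no padding is needed after reordering");
```

If any of these assumptions does not hold on a given platform, the corresponding `static_assert` fails at compile time, which is the point of checking instead of guessing.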
+
+

From 17fe6fcf1622155ec1a7f01f2c3fb9cae847d4cf Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Tue, 1 Feb 2022 20:31:17 +0300
Subject: [PATCH 077/531] packing and bit fields

---
 content/english/hpc/cpu-cache/packing.md | 28 +++++++++++++++---------
 1 file changed, 18 insertions(+), 10 deletions(-)

diff --git a/content/english/hpc/cpu-cache/packing.md b/content/english/hpc/cpu-cache/packing.md
index 6aeb0ebb..e3dd1d39 100644
--- a/content/english/hpc/cpu-cache/packing.md
+++ b/content/english/hpc/cpu-cache/packing.md
@@ -3,20 +3,22 @@ title: Structure Packing
 weight: 5
 ---
 
-If you know what you are doing, you can turn disable padding and instead pack you data structure as tight as possible. This is done
+If you know what you are doing, you can disable [structure padding](../alignment) and pack your data as tight as possible.
 
-When loading it though, the
+You have to ask the compiler to do it, as such functionality is not a part of either the C or the C++ standard yet. In GCC and Clang, this is done with the `packed` attribute:
 
 ```cpp
 struct __attribute__ ((packed)) Data {
-    char a;
-    short b;
-    int c;
-    char d;
+    long long a;
+    bool b;
 };
 ```
 
-This is a less standardized feature, but you can also use it with *bit fields* to members of less than fixed size.
+This makes the instances of `Data` take just 9 bytes instead of the 16 required by alignment, at the cost of possibly fetching two cache lines to read its elements.
+
+### Bit fields
+
+You can also use packing along with *bit fields*, which allow you to explicitly fix the size of a member in bits:
 
 ```cpp
 struct __attribute__ ((packed)) Data {
@@ -25,7 +27,11 @@ struct __attribute__ ((packed)) Data {
 };
 ```
 
-The structure takes 4 bytes when packed and 8 bytes when padded. This feature is not so widespread because CPUs don't have 3-byte arithmetic and has to do some inefficient conversion during loading:
+This structure takes 4 bytes when packed and 8 bytes when padded.
The number of bits a member has doesn't have to be a multiple of 8, and neither does the total structure size. In an array of `Data`, the neighboring elements will be "merged" in the case of a non-whole number of bytes. It also allows you to set a width that exceeds the base type, which acts as padding — although it throws a warning in the process.
+
+
+
+This feature is not so widespread because CPUs don't have 3-byte arithmetic or things like that and have to do some inefficient byte-by-byte conversion during loading:
 
 ```cpp
 int load(char *p) {
@@ -34,7 +40,9 @@ int load(char *p) {
 }
 ```
 
-This can be optimized by loading a 4-byte `int` and then using a mask to discard its highest bits.
+The overhead is even larger when there is a non-whole byte — it needs to be handled with a shift and an and-mask.
+
+This procedure can be optimized by loading a 4-byte `int` and then using a mask to discard its highest bits.
 
 ```cpp
 int load(int *p) {
@@ -43,4 +51,4 @@ int load(int *p) {
 }
 ```
 
-Compilers usually don't do that, because this is not technically legal sometimes: may not own that 4th byte, and won't let you load it even if you are discarding it.
+Compilers usually don't do that because this is not technically always legal: that 4th byte may be on a memory page that you don't own, so the operating system won't let you load it even if you are going to discard it right away.
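Editorial sketch, not part of the patch: the byte-by-byte conversion mentioned above can be written portably as shown below. The bodies of the `load` functions are elided in the diff, so this is only an illustration of the idea, not the author's code. The helper names `load24` and `store24` are hypothetical, and the little-endian byte order is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: assemble a 24-bit unsigned value from three
// consecutive bytes, least significant byte first. Only the three bytes
// that belong to the value are touched, so it never reads a 4th byte
// that may lie outside the accessible memory.
inline std::uint32_t load24(const unsigned char *p) {
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16);
}

// Hypothetical helper: write the low 24 bits of x into three bytes.
inline void store24(unsigned char *p, std::uint32_t x) {
    p[0] = static_cast<unsigned char>(x & 0xFF);
    p[1] = static_cast<unsigned char>((x >> 8) & 0xFF);
    p[2] = static_cast<unsigned char>((x >> 16) & 0xFF);
}
```

The price of staying within bounds is three separate byte loads plus shifts, which is the overhead the text refers to.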

From c2499339e84426b4ff05ecca0a9087533db92123 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Tue, 1 Feb 2022 22:32:00 +0300
Subject: [PATCH 078/531] pointer alternatives

---
 content/english/hpc/cpu-cache/pointers.md | 40 ++++++++++++++++++-----
 1 file changed, 31 insertions(+), 9 deletions(-)

diff --git a/content/english/hpc/cpu-cache/pointers.md b/content/english/hpc/cpu-cache/pointers.md
index 8c5ededd..799ff212 100644
--- a/content/english/hpc/cpu-cache/pointers.md
+++ b/content/english/hpc/cpu-cache/pointers.md
@@ -3,10 +3,26 @@ title: Pointer Alternatives
 weight: 6
 ---
 
+In the [pointer chasing benchmark](../latency), for simplicity, we didn't use actual pointers, but integer indices relative to a base address:
 
-Memory addressing operator is fused on x86, so `k = q[k]` folds into one terse `mov rax, DWORD PTR q[0+rax*4]` instruction, although it does a multiplication by 4 and an addition under the hood. Although fully fused, These additional computations actually add some delay to memory operations, and in fact the latency of L1 fetch is 4 or 5 cycles — the latter being the case if we need to perform complex computation of address. For this reason, the permutation benchmark measures 3ns or 6 cycles per fetch: 5 for the read (including +1 for address computation) and 1 to move the result to the right register.
+```c++
+for (int i = 0; i < N; i++)
+    k = q[k];
+```
+
+[The memory addressing operator](/hpc/architecture/assembly#addressing-modes) on x86 is fused with the address computation, so the `k = q[k]` line folds into just a single terse instruction that also does multiplication by 4 and addition under the hood:
+
+```nasm
+mov rax, DWORD PTR q[0+rax*4]
+```
+
+Although fully fused, these additional computations add some delay to memory operations. The latency of an L1 fetch is either 4 or 5 cycles — the latter being the case if we need to perform a complex computation of the address.
For this reason, the permutation benchmark measures 3ns or 6 cycles per jump: 4+1 for the read and address computation and another one to move the result to the right register. -We can make our benchmark run slightly faster if we replace "fake pointers" — indices — with actual pointers. There are some syntactical issues in getting "pointer to pointer to pointer…" constructions to work, so instead we will define a struct type that just wraps a pointers to its own kind — this is how most pointer chasing works anyway: +### Pointers + +We can make our benchmark run slightly faster if we replace "fake pointers" — indices — with actual pointers. + +There are some syntactical issues in getting "pointer to pointer to pointer…" constructions to work, so instead we will define a struct that just wraps a pointer to its own type — this is how most pointer chasing works anyway: ```cpp struct node { node* ptr; }; @@ -24,23 +40,27 @@ for (int i = 0; i < N; i++) k = k->ptr; ``` -This code now runs in 2ns / 4 cycles for arrays that fit in L1 cache. Why not 4+1=5? Because Zen 2 [has an interesting feature](https://www.agner.org/forum/viewtopic.php?t=41) that allows zero-latency reuse of data accessed just by address, so the "move" here is transparent, resulting in whole 2 cycles saved. +This code now runs in 2ns / 4 cycles for arrays that fit in the L1 cache. Why not 4+1=5? Because Zen 2 [has an interesting feature](https://www.agner.org/forum/viewtopic.php?t=41) that allows zero-latency reuse of data accessed just by address, so the "move" here is transparent, resulting in two whole cycles saved. -Unfortunately, there is a problem with it on 64-bit systems as the pointers become twice as large, making the array spill out of cache much sooner compared to using a 32-bit index. Graph looks like if it was shifted by one power of two to the left — exactly like it should.
+Unfortunately, there is a problem with it on 64-bit systems as the pointers become twice as large, making the array spill out of cache much sooner compared to using a 32-bit index. The latency-versus-size graph looks as if it were shifted by one power of two to the left, exactly as it should: ![](../img/permutation-p64.svg) -This problem is mitigated by switching to 32-bit mode. You need to go [through some trouble](https://askubuntu.com/questions/91909/trouble-compiling-a-32-bit-binary-on-a-64-bit-machine) getting 32-bit libs to get this running on a computer made in this century, but this is justified by the result — unless you also need to interoperate with 64-bit software or access more than 4G or RAM. +This problem is mitigated by switching to the 32-bit mode: ![](../img/permutation-p32.svg) -The fact that on larger problem sizes the performance is bottlenecked by memory rather than CPU lets us to try something even more stranger: using less than 4 bytes for storing indices. This can be done with bit fields: +You need to go [through some trouble](https://askubuntu.com/questions/91909/trouble-compiling-a-32-bit-binary-on-a-64-bit-machine) getting 32-bit libs to get this running on a computer made in this century, but this shouldn't pose other problems unless you need to interoperate with 64-bit software or access more than 4G of RAM. + +### Bit Fields + +The fact that on larger problem sizes the performance is bottlenecked by memory rather than CPU lets us try something even stranger: we can use fewer than 4 bytes for storing indices. This can be done with [bit fields](../packing#bit-fields): ```cpp struct __attribute__ ((packed)) node { int idx : 24; }; ``` -You don't need to do anything other than defining a structure for the bit field. The CPU does truncation by itself.
+You don't need to do anything other than define a structure for the bit field — the compiler handles the 3-byte integer all by itself: ```cpp int k = p[N - 1]; @@ -52,7 +72,7 @@ for (int i = 0; i < N; i++) { k = q[k].idx; ``` -This measures at 6.5ns in the L1 cache, but the conversion procedure chosen by the compiler is suboptimal: it is done by loading 3 bytes, which is not optimal. Instead, we could just load a 4-byte integer and truncate it ourselves (we also need to add one more element to the `q` array to ensure we own that extra one byte of memory): +This code measures at 6.5ns for the L1 cache. There is some room for improvement as the default conversion procedure chosen by the compiler is suboptimal. We could manually load a 4-byte integer and truncate it ourselves (we also need to add one more element to the `q` array to ensure we own that extra one byte of memory): ```cpp k = *((int*) (q + k)); @@ -63,4 +83,6 @@ It now runs in 4ns, and produces the following graph: ![](../img/permutation-bf-custom.svg) -In short: for something very small, use pointers; for something very large, use bit fields. +If you zoom close enough ([the graph is an svg](../img/permutation-bf-custom.svg)), you'll see that the pointers win on very small arrays, then starting from around the L2-L3 cache boundary our custom bit fields take over, and for very large arrays it doesn't matter because we never hit cache anyway. + +This isn't something that can give you a 5x improvement, but it's still something to try when all the other resources are exhausted.
From a0a045819dc038e160ac2798bbea617bf5506fd1 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 1 Feb 2022 22:32:56 +0300 Subject: [PATCH 079/531] change wording --- content/english/hpc/cpu-cache/pointers.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/cpu-cache/pointers.md b/content/english/hpc/cpu-cache/pointers.md index 799ff212..aa68180c 100644 --- a/content/english/hpc/cpu-cache/pointers.md +++ b/content/english/hpc/cpu-cache/pointers.md @@ -85,4 +85,4 @@ It now runs in 4ns, and produces the following graph: If you zoom close enough ([the graph is an svg](../img/permutation-bf-custom.svg)), you'll see that the pointers win on very small arrays, then starting from around the L2-L3 cache boundary our custom bit fields take over, and for very large arrays it doesn't matter because we never hit cache anyway. -This isn't something that can give you a 5x improvement, but it's still something to try when all the other resources are exhausted. +This isn't a kind of optimization that can give you a 5x improvement, but it's still something to try when all the other resources are exhausted. 
From 1378ac924c0b041299b52bbba1a53c98ba2ac2fd Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 2 Feb 2022 03:46:07 +0300 Subject: [PATCH 080/531] note about float zero folding --- content/english/hpc/arithmetic/ieee-754.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/arithmetic/ieee-754.md b/content/english/hpc/arithmetic/ieee-754.md index 451c295e..9c708ffe 100644 --- a/content/english/hpc/arithmetic/ieee-754.md +++ b/content/english/hpc/arithmetic/ieee-754.md @@ -66,7 +66,7 @@ There is a way to gracefully handle corner cases like these: hardware interrupts This is a complex mechanism that deserves an article of its own, but since this is a book about performance, the only thing you need to know is that they are quite slow and not desirable in real-time systems such as navigating rockets. -### NaNs and Infinities +### NaNs, Zeros and Infinities Floating-point arithmetic often deals with noisy, real-world data, and exceptions there are much more common than in the integer case. For this reason, the default behavior is different. Instead of crashing, the result is substituted with a special value without interrupting the executing, unless the programmer explicitly wants to. @@ -83,10 +83,12 @@ $$ What happens if we, say, divide a value by zero? Should it be a negative or a positive infinity? This case is actually unambiguous because, somewhat less intuitively, there are also two zeros: a positive and a negative one. $$ - \frac{1}{+0} = +∞ + \frac{1}{+0} = +∞ \;\;\;\; \frac{1}{-0} = -∞ $$ +Fun fact: `x + 0.0` can't be folded to `x`, but `x + (-0.0)` can, so the negative zero is a better initializer value than the positive zero as it is more likely to be optimized away by the compiler. The reason why `+0.0` doesn't work is that IEEE says that `+0.0 + -0.0 == +0.0`, so it will give a wrong answer for `x = -0.0`. 
The presence of two zeros frequently causes headaches like this; the good news is that you can pass `-fno-signed-zeros` to the compiler if you want to disable this behavior. + Zeros are encoded by setting all bits to zero, except for the sign bit in the negative case. Infinities are encoded by setting all their exponent bits to one and all mantissa bits to zero, with the sign bit distinguishing between positive and negative infinity. The other type is the "not-a-number” (NaN), which is generated as the result of mathematically incorrect operations: From b8f36a4646c0cc956a3d608ad94ae6bd4770a581 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 2 Feb 2022 03:58:20 +0300 Subject: [PATCH 081/531] writing down notes before I forget them --- content/english/hpc/cpu-cache/associativity.md | 8 ++++++++ content/english/hpc/cpu-cache/sw-prefetching.md | 6 +++--- content/english/hpc/simd/masking.md | 2 ++ content/english/hpc/simd/moving.md | 6 ++++++ content/english/hpc/simd/{permutation.md => shuffing.md} | 0 5 files changed, 19 insertions(+), 3 deletions(-) rename content/english/hpc/simd/{permutation.md => shuffing.md} (100%) diff --git a/content/english/hpc/cpu-cache/associativity.md b/content/english/hpc/cpu-cache/associativity.md index fd85a8eb..7bd52524 100644 --- a/content/english/hpc/cpu-cache/associativity.md +++ b/content/english/hpc/cpu-cache/associativity.md @@ -67,3 +67,11 @@ This issue arises with remarkable frequency in all types of algorithms that love Inside these sets, cache operates simply as LRU. Instead of storing time, you just store counters: the later an element was accessed, the lower its counter is. In hardware, you need to maintain $n$ counters of $\log_2 n$ bits each. When a cell is accessed, its counter becomes $(n-1)$ (maximum possible), and the others that are larger need to be decremented by one. Then to kick out an element you need to find the counter with zero and replace it, and then decrement everyone else's counters.
Cost is you need to store more of them (1 more bit per element), and that you need more energy to update all of them. So the practical trade-off is to limit these groups. + +Unfortunately, we use power of two array sizes all the time: + +- it is easy to calculate modulo a power of two +- it is easy to calculate the jump value +- sometime we use divide-and-conquer algorithms, which rely on splitting tasks in two + +Fortunately, the solution is usually just to add some arbitrary number to the array size. Although sometimes this is much less trivial. diff --git a/content/english/hpc/cpu-cache/sw-prefetching.md b/content/english/hpc/cpu-cache/sw-prefetching.md index 15da22f8..6f5eac06 100644 --- a/content/english/hpc/cpu-cache/sw-prefetching.md +++ b/content/english/hpc/cpu-cache/sw-prefetching.md @@ -47,6 +47,6 @@ Managing issues such as integer overflow, we can cut latency down arbitrarily cl ![](../img/sw-prefetch-others.svg) - \ No newline at end of file +Hardware prefetching activates and deactivates automatically. Software isn't and will block the pipeline. + +Prefetching can be to particular levels. diff --git a/content/english/hpc/simd/masking.md b/content/english/hpc/simd/masking.md index 489d2ff2..b1ee4704 100644 --- a/content/english/hpc/simd/masking.md +++ b/content/english/hpc/simd/masking.md @@ -81,3 +81,5 @@ void binpow_simd() { ``` This implementation now works in 0.7 seconds, or 13.5 times faster, and there is still ample room for improvement. + + diff --git a/content/english/hpc/simd/moving.md b/content/english/hpc/simd/moving.md index 5d1d75ba..3322d94c 100644 --- a/content/english/hpc/simd/moving.md +++ b/content/english/hpc/simd/moving.md @@ -49,8 +49,14 @@ For allocating an array dynamically, we can use `std::aligned_alloc` which takes On most modern architectures, the `loadu` / `storeu` intrinsics should be equally as fast as `load` / `store` given that in both cases the blocks only intersect one cache line. 
The advantage of the latter is that they can act as free assertions that all reads and writes are aligned. It is worth noting that the GCC vector extensions always assume aligned memory reads and writes. Memory alignment issues is also one of the reasons why compilers can't always autovectorize efficiently. + diff --git a/content/english/hpc/simd/permutation.md b/content/english/hpc/simd/shuffing.md similarity index 100% rename from content/english/hpc/simd/permutation.md rename to content/english/hpc/simd/shuffing.md From 351a293f52eb98ee2c5fd48f88a5e1582b51f20d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 2 Feb 2022 04:10:19 +0300 Subject: [PATCH 082/531] delete simd cookbook --- content/english/hpc/simd/cookbook.md | 17 ----------------- content/english/hpc/simd/shuffing.md | 4 ++++ 2 files changed, 4 insertions(+), 17 deletions(-) delete mode 100644 content/english/hpc/simd/cookbook.md diff --git a/content/english/hpc/simd/cookbook.md b/content/english/hpc/simd/cookbook.md deleted file mode 100644 index 90a3e8f4..00000000 --- a/content/english/hpc/simd/cookbook.md +++ /dev/null @@ -1,17 +0,0 @@ ---- -title: SSE & AVX Cookbook -weight: 11 -draft: true ---- - -## Constexpr - -## Popcnt - -### Naive - -### 8-bit lookup - -### gather - -### pshufb diff --git a/content/english/hpc/simd/shuffing.md b/content/english/hpc/simd/shuffing.md index 711aba60..508a3aeb 100644 --- a/content/english/hpc/simd/shuffing.md +++ b/content/english/hpc/simd/shuffing.md @@ -12,3 +12,7 @@ Masking is the most widely used technique for data manipulation, but there are m - AVX512 has similar "scatter" instructions that write data non-sequentially, using either indices or [a mask](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=compress&expand=4754,4479&techs=AVX_512). You can very efficiently "filter" an array this way using a predicate. 
The last two, gather and scatter, turn SIMD into proper parallel programming model, where most operations can be executed independently in terms of their memory locations. This is a huge deal: many AVX512-specific algorithms have been developed recently owning to these new instructions, and not just having twice as many SIMD lanes. + + + + From 9c17582faf9acf067e7ee384434d6c6f83ecc6ed Mon Sep 17 00:00:00 2001 From: AlexXan312 <62149707+AlexXan312@users.noreply.github.com> Date: Wed, 2 Feb 2022 15:42:37 +0300 Subject: [PATCH 083/531] fix dp divide and conquer --- content/russian/cs/layer-optimizations/divide-and-conquer.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/russian/cs/layer-optimizations/divide-and-conquer.md b/content/russian/cs/layer-optimizations/divide-and-conquer.md index a7731f49..61a7304a 100644 --- a/content/russian/cs/layer-optimizations/divide-and-conquer.md +++ b/content/russian/cs/layer-optimizations/divide-and-conquer.md @@ -19,10 +19,10 @@ $$ Конкретно в задаче покрытия точек отрезками, можно заметить следующее: $$ -opt[i, j] \leq opt[i, j+1] +opt[i, j] \leq opt[i+1, j] $$ -Интуиция такая: если у нас появился дополнительный отрезок, то последний отрезок нам не выгодно делать больше, а скорее наоборот его нужно «сжать». +Интуиция такая: когда мы сдвигаем i вправо, то точка, с которой может начинаться последняя группа, не может уменьшаться. 
### Идея From b41d0c1c8dff381f4dbdde90753524854c68d684 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 2 Feb 2022 15:58:32 +0300 Subject: [PATCH 084/531] move memory management and update toc --- content/english/hpc/_index.md | 35 +- .../hpc/cpu-cache/img/strides-small.svg | 1463 +++++++++++++++++ .../english/hpc/cpu-cache/img/strides-two.svg | 1294 --------------- .../management.md | 0 content/english/hpc/simd/masking.md | 4 +- content/english/hpc/simd/moving.md | 2 +- 6 files changed, 1486 insertions(+), 1312 deletions(-) create mode 100644 content/english/hpc/cpu-cache/img/strides-small.svg delete mode 100644 content/english/hpc/cpu-cache/img/strides-two.svg rename content/english/hpc/{cpu-cache => external-memory}/management.md (100%) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index b46bdeca..d1b31dd6 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -50,7 +50,7 @@ Planned table of contents: 4.1. Stages of Compilation 4.2. Flags and Targets 4.3. Situational Optimizations - 4.4. Contracts Programming + 4.4. Contract Programming 4.5. Non-Zero-Cost Abstractions 4.6. Compile-Time Computation 4.7. Arithmetic Optimizations @@ -61,6 +61,7 @@ Planned table of contents: 5.3. Program Simulation 5.4. Machine Code Analyzers 5.5. Benchmarking + 5.6. Getting Accurate Results 6. Arithmetic 6.1. Floating-Point Numbers 6.2. Interval Arithmetic @@ -89,26 +90,28 @@ Planned table of contents: 8.8. Spacial and Temporal Locality (8.9. B-Trees) (8.10. Sublinear Algorithms) +(9.13. Memory Management) 9. RAM & CPU Caches 9.1. Memory Bandwidth 9.2. Memory Latency 9.3. Cache Lines - 9.4. Data Alignment - 9.5. Structure Packing - 9.6. Pointer Alternatives - 9.7. Cache Associativity - 9.8. Memory Paging - 9.9. Memory-Level Parallelism - 9.10. Hardware Prefetching - 9.11. Software Prefetching - 9.12. AoS and SoA -(9.13. Memory Management) + 9.4. Memory Sharing + 9.5. Data Alignment + 9.6. Structure Packing + 9.7. 
Pointer Alternatives + 9.8. Cache Associativity + 9.9. Memory Paging + 9.10. Memory-Level Parallelism + 9.11. Hardware Prefetching + 9.12. Software Prefetching + 9.13. AoS and SoA 10. SIMD Parallelism - 10.1. Using SIMD in C/C++ - 10.2. Reductions - 10.3. Auto-Vectorization - 10.4. Data Twiddling - 10.5. SSE & AVX Cookbook + 10.1. Intrinsics and Vector Types + 10.2. Loading and Writing Data + 10.3. Sums and Other Reductions + 10.4. Masking and Blending + 10.5. In-Register Shuffles + 10.6. Auto-Vectorization 11. Algorithm Case Studies 11.1. Binary GCD (11.2. Prime Number Sieves) diff --git a/content/english/hpc/cpu-cache/img/strides-small.svg b/content/english/hpc/cpu-cache/img/strides-small.svg new file mode 100644 index 00000000..0b76d21d --- /dev/null +++ b/content/english/hpc/cpu-cache/img/strides-small.svg @@ -0,0 +1,1463 @@ [SVG markup omitted]
diff --git a/content/english/hpc/cpu-cache/img/strides-two.svg b/content/english/hpc/cpu-cache/img/strides-two.svg deleted file mode 100644 index 8200a958..00000000 --- a/content/english/hpc/cpu-cache/img/strides-two.svg +++ /dev/null @@ -1,1294 +0,0 @@ [SVG markup omitted] diff --git a/content/english/hpc/cpu-cache/management.md b/content/english/hpc/external-memory/management.md similarity index 100% rename from content/english/hpc/cpu-cache/management.md rename to content/english/hpc/external-memory/management.md diff --git a/content/english/hpc/simd/masking.md b/content/english/hpc/simd/masking.md index b1ee4704..4ef4026a 100644 --- a/content/english/hpc/simd/masking.md +++
b/content/english/hpc/simd/masking.md @@ -82,4 +82,6 @@ void binpow_simd() { This implementation now works in 0.7 seconds, or 13.5 times faster, and there is still ample room for improvement. - + + + diff --git a/content/english/hpc/simd/moving.md b/content/english/hpc/simd/moving.md index 3322d94c..69162c65 100644 --- a/content/english/hpc/simd/moving.md +++ b/content/english/hpc/simd/moving.md @@ -55,7 +55,7 @@ On most modern architectures, the `loadu` / `storeu` intrinsics should be equall MMX was originally used the integer (64-bit mantissa) part of a 80-bit float. -Gather, scatter, non-temporal load and store +(Gather, scatter?), non-temporal load and store Extracting and broadcasting From d3cf14d799dec95832b3fd98c18bb7a4dbccd5d4 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 2 Feb 2022 21:31:27 +0300 Subject: [PATCH 085/531] cache associativity --- .../english/hpc/cpu-cache/associativity.md | 95 ++++++++++++------- 1 file changed, 60 insertions(+), 35 deletions(-) diff --git a/content/english/hpc/cpu-cache/associativity.md b/content/english/hpc/cpu-cache/associativity.md index 7bd52524..4d47fd3e 100644 --- a/content/english/hpc/cpu-cache/associativity.md +++ b/content/english/hpc/cpu-cache/associativity.md @@ -3,75 +3,100 @@ title: Cache Associativity weight: 7 --- -- Since implementing "find the oldest among million cache lines" in hardware is unfeasible, each cache layer is split in a number of small "sets", each covering a certain subset of memory locations. *Associativity* is the size of these sets, or, in other terms, how many different "cells" of cache each data location can be mapped to. Higher associativity allows more efficient utilization of cache. - - -If you looked carefully, you could notice patterns while inspecting the dots below the graph in the [previous experiment](../paging). - -These are not just noise: certain step sizes indeed perform much worse than their neighbors. 
- -For example, the stride of 256 corresponding to this loop: +Consider a [strided incrementing loop](../cache-lines) over an array of size $N=2^{21}$ with a fixed step size of 256: ```cpp for (int i = 0; i < N; i += 256) a[i]++; ``` -and this one +And then this one, with the step size of 257: ```cpp for (int i = 0; i < N; i += 257) a[i]++; ``` -differ by more than 10x: 256 runs at 0.067 while 257 runs at 0.751. +Which one will be faster to finish? There are several considerations: + +- At first, you think that there shouldn't be much difference, or maybe that the second loop is $\frac{257}{256}$ times faster or so because it does fewer iterations in total. +- Then you recall that 256 is a nice round number, which may have something to do with [SIMD](/hpc/simd) or the memory system, so maybe the first one is faster. + +But the right answer is very counterintuitive: the second loop is faster — and by a factor of 10. + +This isn't just a single bad step size. The performance degrades for all indices that are multiples of large powers of two: + +![](../img/strides-small.svg) + +There is no vectorization or anything, and the two loops produce the same assembly except for the step size. This effect is due only to the memory system, in particular to a feature called *cache associativity* which is a peculiar artifact of how CPU caches are implemented in hardware. -This is not just a single specific bad value: it is the same for all indices that are multiple of large powers of two, and it continues much further to the right. +### Hardware Caches -![](../img/strides-two.svg) +When we were studying the memory system [theoretically](/hpc/external-memory), we discussed different ways one can [implement cache eviction policies](/hpc/external-memory/policies/) in software. One particular strategy we focused on was the *least recently used* (LRU) policy, which is simple and effective but still requires some non-trivial data manipulation. 
-This effect is due to a feature called *cache associativity*, and an interesting artifact of how CPU caches are implemented in hardware. +In the context of hardware, such a scheme is called a *fully associative cache*: we have $M$ cells, each capable of holding a cache line corresponding to any of the $N$ total memory locations, and in case of contention, the one not accessed the longest gets kicked out and replaced with the new one. -### Hardware Caching +![Fully associative cache](../img/cache1.png) -When studying memory theoretically using the external memory model, we discussed different ways one can [implement caching policies](/hpc/memory/locality/) in software, and went into detail on particular case of a simple but effective strategy, LRU, which required some non-trivial data manipulation. In the context of hardware, such scheme is called *fully associative cache*. +The problem with a fully associative cache is that implementing the "find the oldest cache line among millions" operation is hard in software and simply unfeasible in hardware. You can make a fully associative cache that has 16 entries or so, but managing hundreds of cache lines already becomes either prohibitively expensive or so slow that it's not worth it. -![Fully associative cache](../img/cache2.png) +We can resort to another, much simpler approach: just map each block of 64 bytes in RAM to a single cache line which it can occupy. Say, if we have 4096 blocks in memory and 64 cache lines for them, then each cache line at any time stores the contents of one of $\frac{4096}{64} = 64$ different blocks. -The problem with it is that implementing something like that is prohibitively expensive. In hardware, you can implement something when you have 16 entries or so, but it becomes unfeasible when it comes to storing and managing hundreds of cache lines.
+![Direct-mapped cache](../img/cache2.png) -We can resort to another, much simpler approach: we could just map each block of 64 bytes in RAM to a cache line which it can possibly occupy. Say if in we have 4096 blocks in memory and 64 cache lines for them, this means that each cache line at any time stores the value of one of $\frac{4096}{64} = 64$ different blocks, along with a "tag" information which helps identifying which block it is. +A direct-mapped cache is easy to implement, and it doesn't require storing any additional meta-information associated with a cache line except its tag (the actual memory location of a cached block). The disadvantage is that the entries can be kicked out too quickly — for example, when bouncing between two addresses that map to the same cache line — leading to lower overall cache utilization. -Simply speaking, the CPU just maintains these cells containing data, and when reading any cell from the main memory the CPU first looks it up in the cache, and if it contains the data, it reads it, and otherwise goes to a higher cache level until it reaches main memory. Simple and beautiful. +For that reason, we settle for something in-between direct-mapped and fully associative caches: the *set-associative cache*. It splits the address space into equal groups which separately act as small fully-associative caches. -![Direct-mapped cache](../img/cache1.png) +![Set-associative cache (2-way associative)](../img/cache3.png) -Direct-mapped cache is easy to implement, but the problem with it is that the entries can be kicked out way too quickly, leading to lower cache utilization. In fact, we could just bounce between two addresses, leaving +*Associativity* is the size of these sets, or, in other words, how many different cache lines each data block can be mapped to. Higher associativity allows for more efficient utilization of cache but also increases the cost. 
-For that reason, we settle for something in-between direct-mapped and fully associative cache: the *set-associative cache*. It splits addresses into groups which separately act as small fully-associative cache. +For example, on [my CPU](https://en.wikichip.org/wiki/amd/ryzen_7/4700u), the L3 cache is 16-way set-associative, and there are 4MB available to a single core. This means that there are in total $\frac{2^{22}}{2^{6}} = 2^{16}$ cache lines, which are split into $\frac{2^{16}}{16} = 2^{12}$ groups, each acting as a fully associative cache of their own $(\frac{1}{2^{12}})$-th fraction of the RAM. -![Set-associative cache](../img/cache3.png) +Most other CPU caches are also set-associative, including the non-data ones such as the instruction cache and the TLB. The exceptions are small specialized caches that only house 64 or fewer entries — these are usually fully associative. -*Associativity* is the size of such sets — for example 16 meaning that this way we would need to wait at least 16 reads for an entry to get kicked out. Different cache layers may have different associativity. Most CPU caches are set-associative, unless we are talking about small specialized ones that only house 64 or less entries and can get by with fully-associative schemes. +### Address Translation -If we implemented cache in software, we would compute some hash function to use as key. In hardware, we can't really do that because e. g. for L1 cache 4 or 5 cycles is all we got, and even taking a modulo takes 10-15 cycles, let alone something cryptographically secure. Therefore, hardware takes a different approach and calculates this address based on the address. It takes the address, and reinterprets it in three parts: +There is only one ambiguity remaining: how exactly the cache line mapping is done. 
-![](../img/address.png) +If we implemented set-associative cache in software, we would compute some hash function of the memory block address and then use its value as the cache line index. In hardware, we can't really do that because it is too slow: for example, for the L1 cache, the latency requirement is 4 or 5 cycles, and even [taking a modulo](/hpc/arithmetic/division) takes around 10-15 cycles, let alone something more sophisticated. -The last part is used for determining the cache line it is mapped to. All addresses with the same "middle" part will therefore map to the same set. +Instead, the hardware uses the lazy approach. It takes the memory address that needs to be accessed and splits it into three parts — from lower bits to higher: -Now, where were we? Oh yes, the reason why iterating with strides of 256 has such a terrible slowdown. This because they all map to the same set, and effectively the size of the cache (and all below it) shrinks by 256/16=16. No longer being able to reside in L2, it spills all the way to the order-of-magnitude slower RAM, which causes the expected slowdown. +- *offset* — the index of the word within a 64B cache line ($\log_2 64 = 6$ bits); +- *index* — the index of the cache line itself (the next $12$ bits as there are $2^{12}$ cache lines in the L3 cache); +- *tag* — the rest of the memory address to tell the memory blocks stored in the cache lines apart. -This issue arises with remarkable frequency in all types of algorithms that love powers of two. Luckily, this behavior is more of an anomaly than some that needs to be dealt with. The solution is usually simple: avoid iterating in powers of two, using different sizer on 2d arrays or inserting "holes" in the memory layout. +In other words, all memory addresses with the same "middle" part map to the same set. 
+ +![Address composition for a 64-entry 2-way set associative cache](../img/address.png) + +This makes the cache system simpler and cheaper to implement, but also makes it susceptible to certain access patterns. + +### Pathological Mappings + +Now, where were we? Oh, yes: the reason why iteration with strides of 256 causes such a terrible slowdown. + +When we jump over 256 integers, the pointer always increments by $1024 = 2^{10}$, and the last 10 bits remain the same. Since the cache system uses the lower 6 bits for the offset and the next 12 for the cache line index, we are essentially using just $2^{12 - (10 - 6)} = 2^8$ different cache lines in the L3 cache instead of $2^{12}$, which has the effect of shrinking our L3 cache by a factor of $2^4 = 16$. The array stops fitting into the L3 cache ($N=2^21$) spills into the order-of-magnitude slower RAM, which causes the performance to decrease. + + -Unfortunately, we use power of two array sizes all the time: +Performance issues caused by cache associativity effects arise with remarkable frequency in algorithms because, for multiple reasons, programmers just love using powers of two when indexing arrays: -- it is easy to calculate modulo a power of two -- it is easy to calculate the jump value -- sometime we use divide-and-conquer algorithms, which rely on splitting tasks in two +- It is easier to calculate the address for multi-dimensional array accesses if the last dimension is a power of two, as it only requires a binary shift instead of a multiplication. +- It is easier to calculate modulo a power of two, as it can be done with a single bitwise AND. +- It is convenient and often even necessary to use power-of-two problem sizes in divide-and-conquer algorithms. +- It is the smallest integer exponent, so using the sequence of increasing powers of two as problem sizes are a popular choice when benchmarking memory-bound algorithms. 
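To make the effect of power-of-two strides concrete, here is a small model that counts how many distinct sets a strided walk touches. It is a sketch, not a benchmark: it assumes 4-byte integers, an array starting at address zero, and the same 6-bit offset and 12-bit index as above. The helper name `distinct_sets` is made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical helper: number of distinct cache sets touched when reading
// `count` 4-byte integers spaced `stride` elements apart, assuming the array
// starts at address 0, with 6 offset bits and 12 index bits.
std::size_t distinct_sets(std::size_t stride, std::size_t count) {
    std::set<std::uint64_t> seen;
    for (std::size_t i = 0; i < count; i++) {
        std::uint64_t addr = std::uint64_t(i) * stride * 4; // byte address
        seen.insert((addr >> 6) & ((1u << 12) - 1));        // set index
    }
    return seen.size();
}
```

Under these assumptions, `distinct_sets(256, 8192)` stays at $2^8 = 256$, while a slightly padded stride such as 257 spreads the same number of accesses over more sets, which is why inserting "holes" in the layout helps.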
-Fortunately, the solution is usually just to add some arbitrary number to the array size. Although sometimes this is much less trivial. +Luckily, such issues are more of an anomaly rather than serious problems. The solution is usually simple: avoid iterating in powers of two, make the last dimensions of multi-dimensional arrays a slightly different size, or use any other method to insert "holes" in the memory layout. From 724fcd885d4000c57fd4a08f9257a1abab364825 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 3 Feb 2022 02:26:48 +0300 Subject: [PATCH 086/531] cache line -> cache line set --- content/english/hpc/cpu-cache/associativity.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/cpu-cache/associativity.md b/content/english/hpc/cpu-cache/associativity.md index 4d47fd3e..79d27445 100644 --- a/content/english/hpc/cpu-cache/associativity.md +++ b/content/english/hpc/cpu-cache/associativity.md @@ -65,7 +65,7 @@ If we implemented set-associative cache in software, we would compute some hash Instead, the hardware uses the lazy approach. It takes the memory address that needs to be accessed and splits it into three parts — from lower bits to higher: - *offset* — the index of the word within a 64B cache line ($\log_2 64 = 6$ bits); -- *index* — the index of the cache line itself (the next $12$ bits as there are $2^{12}$ cache lines in the L3 cache); +- *index* — the index of the cache line set (the next $12$ bits as there are $2^{12}$ cache lines in the L3 cache); - *tag* — the rest of the memory address to tell the memory blocks stored in the cache lines apart. In other words, all memory addresses with the same "middle" part map to the same set. @@ -78,7 +78,7 @@ This makes the cache system simpler and cheaper to implement, but also makes it Now, where were we? Oh, yes: the reason why iteration with strides of 256 causes such a terrible slowdown. 
-When we jump over 256 integers, the pointer always increments by $1024 = 2^{10}$, and the last 10 bits remain the same. Since the cache system uses the lower 6 bits for the offset and the next 12 for the cache line index, we are essentially using just $2^{12 - (10 - 6)} = 2^8$ different cache lines in the L3 cache instead of $2^{12}$, which has the effect of shrinking our L3 cache by a factor of $2^4 = 16$. The array stops fitting into the L3 cache ($N=2^21$) spills into the order-of-magnitude slower RAM, which causes the performance to decrease. +When we jump over 256 integers, the pointer always increments by $1024 = 2^{10}$, and the last 10 bits remain the same. Since the cache system uses the lower 6 bits for the offset and the next 12 for the cache line index, we are essentially using just $2^{12 - (10 - 6)} = 2^8$ different sets in the L3 cache instead of $2^{12}$, which has the effect of shrinking our L3 cache by a factor of $2^4 = 16$. The array ($N=2^{21}$) stops fitting into the L3 cache and spills into the order-of-magnitude slower RAM, which causes the performance to decrease. diff --git a/content/english/hpc/external-memory/virtual.md b/content/english/hpc/external-memory/virtual.md index fc3f8eed..aa5a84ed 100644 --- a/content/english/hpc/external-memory/virtual.md +++ b/content/english/hpc/external-memory/virtual.md @@ -19,7 +19,7 @@ Virtual memory gives each process the impression that it fully controls a contig To achieve this, the memory address space is divided into *pages* (typically 4KB in size), which are the base units of memory that the programs can request from the operating system. The memory system maintains a special hardware data structure called the *page table*, which contains the mappings of virtual page addresses to the physical ones.
When a process accesses data using its virtual memory address, the memory system calculates its page number (by right-shifting it by $12$ if $4096=2^{12}$ is the page size), looks up its physical address in the page table, and forwards the read or write request to where that data is actually stored. -Since the address translation needs to be done for each memory request, and the number of memory pages itself may be large (e. g. 16G RAM / 4K page size = 4M pages), address translation poses a difficult problem in itself. One way to speed it up is to use a special cache for the page table itself called *translation lookaside buffer* (TLB), and the other is to increase the page size so that the total number of memory pages is made smaller at the cost of reduced granularity. +Since the address translation needs to be done for each memory request, and the number of memory pages itself may be large (e. g. 16G RAM / 4K page size = 4M pages), address translation poses a difficult problem in itself. One way to speed it up is to use a special cache for the page table itself called *translation lookaside buffer* (TLB), and the other is to [increase the page size](/hpc/cpu-cache/paging) so that the total number of memory pages is made smaller at the cost of reduced granularity. + Exploit [spatial locality](/hpc/external-memory/locality). Let's modify the pointer chasing code so that the next pointer needs to be computed using a variable number of fields. We can either place them in separate arrays, or in the same array.
diff --git a/content/english/hpc/cpu-cache/img/latency-bandwidth.svg b/content/english/hpc/cpu-cache/img/latency-bandwidth.svg new file mode 100644 index 00000000..3f70b11a --- /dev/null +++ b/content/english/hpc/cpu-cache/img/latency-bandwidth.svg @@ -0,0 +1,1074 @@ +[SVG markup of latency-bandwidth.svg, 1074 lines, omitted] diff --git a/content/english/hpc/cpu-cache/mlp.md b/content/english/hpc/cpu-cache/mlp.md index 8fb7fdca..e13cdd60 100644 --- a/content/english/hpc/cpu-cache/mlp.md +++ b/content/english/hpc/cpu-cache/mlp.md @@ -3,7 +3,9 @@ title: Memory-Level Parallelism weight: 9 --- -On perfectly pipelined systems, it would be equal to the latency-bandwidth product. But this isn't quite true: we measured the latency of the RAM to be around 150ns, and its peak bandwidth to be around 40 GB/s, which if we just divided +On perfectly pipelined systems, it would be equal to the latency-bandwidth product. But this isn't quite true: we measured the latency of the RAM to be around 150ns, and its peak bandwidth to be around 40 GB/s, which if we just divided.
+ +![](../img/latency-bandwidth.svg) The fundamental reason why [linear iteration](../bandwidth) is so much faster than [pointer jumping](../latency) is that the CPU knows which memory locations it needs to fetch first and sends the corresponding memory requests far in advance, successfully hiding the latencies of these individual requests. From 74584a8284bc358d1e55b0811f4d0085a1875dcb Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 3 Feb 2022 15:06:19 +0300 Subject: [PATCH 089/531] reorder sections in cpu-cache --- content/english/hpc/cpu-cache/alignment.md | 2 +- content/english/hpc/cpu-cache/aos-soa.md | 2 +- content/english/hpc/cpu-cache/associativity.md | 2 +- content/english/hpc/cpu-cache/hw-prefetching.md | 2 +- content/english/hpc/cpu-cache/mlp.md | 8 ++++---- content/english/hpc/cpu-cache/packing.md | 2 +- content/english/hpc/cpu-cache/paging.md | 2 +- content/english/hpc/cpu-cache/pointers.md | 2 +- content/english/hpc/cpu-cache/sw-prefetching.md | 2 +- 9 files changed, 12 insertions(+), 12 deletions(-) diff --git a/content/english/hpc/cpu-cache/alignment.md b/content/english/hpc/cpu-cache/alignment.md index e3cf4f32..9a2a2ded 100644 --- a/content/english/hpc/cpu-cache/alignment.md +++ b/content/english/hpc/cpu-cache/alignment.md @@ -1,6 +1,6 @@ --- title: Data Alignment -weight: 4 +weight: 8 --- The fact that the memory is partitioned into 64B [cache lines](../cache-lines) makes it difficult to operate on data words that cross a cache line boundary. When you need to retrieve some primitive type, such as a 32-bit integer, you really want to have it located on a single cache line — both because retrieving two cache lines requires more memory bandwidth and stitching the results in hardware requires precious transistor space. 
diff --git a/content/english/hpc/cpu-cache/aos-soa.md b/content/english/hpc/cpu-cache/aos-soa.md index 7529100b..2c94a3b5 100644 --- a/content/english/hpc/cpu-cache/aos-soa.md +++ b/content/english/hpc/cpu-cache/aos-soa.md @@ -1,6 +1,6 @@ --- title: AoS and SoA -weight: 12 +weight: 13 --- diff --git a/content/english/hpc/cpu-cache/packing.md b/content/english/hpc/cpu-cache/packing.md index e3dd1d39..a0601ddc 100644 --- a/content/english/hpc/cpu-cache/packing.md +++ b/content/english/hpc/cpu-cache/packing.md @@ -1,6 +1,6 @@ --- title: Structure Packing -weight: 5 +weight: 9 --- If you know what you are doing, you can disable [structure padding](../alignment) and pack your data as tight as possible. diff --git a/content/english/hpc/cpu-cache/paging.md b/content/english/hpc/cpu-cache/paging.md index 706eada8..fad39a54 100644 --- a/content/english/hpc/cpu-cache/paging.md +++ b/content/english/hpc/cpu-cache/paging.md @@ -1,6 +1,6 @@ --- title: Memory Paging -weight: 8 +weight: 12 --- Consider [yet again](../associativity) the strided incrementing loop: diff --git a/content/english/hpc/cpu-cache/pointers.md b/content/english/hpc/cpu-cache/pointers.md index aa68180c..b0adcc5d 100644 --- a/content/english/hpc/cpu-cache/pointers.md +++ b/content/english/hpc/cpu-cache/pointers.md @@ -1,6 +1,6 @@ --- title: Pointer Alternatives -weight: 6 +weight: 10 --- In the [pointer chasing benchmark](../latency), for simplicity, we didn't use actual pointers, but integer indices relative to a base address: diff --git a/content/english/hpc/cpu-cache/sw-prefetching.md b/content/english/hpc/cpu-cache/sw-prefetching.md index 6f5eac06..092d3c4a 100644 --- a/content/english/hpc/cpu-cache/sw-prefetching.md +++ b/content/english/hpc/cpu-cache/sw-prefetching.md @@ -1,6 +1,6 @@ --- title: Software Prefetching -weight: 11 +weight: 7 --- Sometimes the hardware can't figure out what to prefetch next by itself, and in this case, we need to point it explicitly. 
From a274a82dc6572294e715bc3e8043c363bf5a06ea Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 3 Feb 2022 15:56:26 +0300 Subject: [PATCH 090/531] mlp edits --- content/english/hpc/cpu-cache/mlp.md | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/content/english/hpc/cpu-cache/mlp.md b/content/english/hpc/cpu-cache/mlp.md index 25979a52..c0622e37 100644 --- a/content/english/hpc/cpu-cache/mlp.md +++ b/content/english/hpc/cpu-cache/mlp.md @@ -5,20 +5,17 @@ weight: 5 Memory requests can overlap in time: while you wait for a read request to complete, you can send a few others, which will be executed concurrently with it. This is the reason why [linear iteration](../bandwidth) is so much faster than [pointer jumping](../latency): the CPU knows which memory locations it needs to fetch next and sends memory requests far ahead of time. -On perfectly pipelined systems, it would be equal to the latency-bandwidth product. But this isn't quite true: we measured the latency of the RAM to be around 150ns, and its peak bandwidth to be around 40 GB/s, which if we just divided. +The number of concurrent memory operations is large but limited, and it is different for different types of memory. When designing algorithms and especially data structures, you may want to know this number, as it limits the amount of parallelism your computation can achieve. -![](../img/latency-bandwidth.svg) - -Exploring this idea further, the memory system supports a large but finite number of concurrent I/O operations.
To find this limit, we can modify our pointer chasing benchmark +To find this limit theoretically for a specific memory type, you can multiply its latency (time to fetch a cache line) by its bandwidth (number of cache lines fetched per second), which gives you the average number of memory operations in progress: - +### Direct Experiment +Let's try to measure available memory parallelism more directly by modifying our pointer chasing benchmark so that we loop around $D$ separate cycles in parallel instead of just one: ```c++ const int M = N / D; @@ -37,9 +34,11 @@ for (int i = 0; i < M; i++) k[d] = q[d][k[d]]; ``` +Fixing the sum of the cycle lengths constant at a few select sizes and trying different $D$, we get slightly different results: + ![](../img/permutation-mlp.svg) -There is a conflict over registers: +The L2 cache run is limited by ~6 concurrent operations, as predicted, but larger memory types all max out between 13 and 17. You can't make use of more memory lanes as there is a conflict over logical registers. When the number of lanes is fewer than the number of registers, you can issue just one read instruction per lane: ```nasm dec edx @@ -50,7 +49,11 @@ movsx rax, DWORD PTR q[3145728+rax*4] jne .L9 ``` +But when it is over ~15, you have to use temporary memory storage: + ```nasm mov edx, DWORD PTR q[0+rdx*4] mov DWORD PTR [rbp-128+rax*4], edx ``` + +You don't always get to the maximum possible level of memory parallelism, but for most applications, a dozen concurrent requests are more than enough. 
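The latency-bandwidth product mentioned earlier in this section is simple enough to write down as code. The sketch below is only an illustration: the function name `requests_in_flight` is hypothetical, the 64-byte line size is an assumption, and the sample figures (around 150 ns and 40 GB/s for RAM) are the ones quoted in the earlier wording of this page.

```cpp
// Estimate of the average number of cache lines in flight:
// latency multiplied by bandwidth, expressed in cache lines.
// 1 GB/s equals 1 byte per nanosecond, so the units cancel directly.
constexpr double requests_in_flight(double latency_ns,
                                    double bandwidth_gb_per_s,
                                    double line_bytes = 64.0) {
    return latency_ns * bandwidth_gb_per_s / line_bytes;
}
```

For the RAM figures above, this gives roughly 94 lines in flight. It is a theoretical estimate only, and the direct experiment in the text reports a lower practical limit.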
From ed0239165a8ec98e5a150b63b198d5e33181ffcd Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 3 Feb 2022 16:10:31 +0300 Subject: [PATCH 091/531] merge hardware and software prefetching --- .../english/hpc/cpu-cache/hw-prefetching.md | 44 -------- content/english/hpc/cpu-cache/prefetching.md | 104 ++++++++++++++++++ .../english/hpc/cpu-cache/sw-prefetching.md | 52 --------- 3 files changed, 104 insertions(+), 96 deletions(-) delete mode 100644 content/english/hpc/cpu-cache/hw-prefetching.md create mode 100644 content/english/hpc/cpu-cache/prefetching.md delete mode 100644 content/english/hpc/cpu-cache/sw-prefetching.md diff --git a/content/english/hpc/cpu-cache/hw-prefetching.md b/content/english/hpc/cpu-cache/hw-prefetching.md deleted file mode 100644 index 8ef43b30..00000000 --- a/content/english/hpc/cpu-cache/hw-prefetching.md +++ /dev/null @@ -1,44 +0,0 @@ ---- -title: Hardware Prefetching -weight: 6 ---- - -- Taking advantage of this free concurrency, it is often beneficial to *prefetch* data that you will likely be accessing soon, if you know its location. You can do this explicitly by using a separate instruction or just by accessing any byte in its cache line, but the most frequent patterns, such as linearly iterating forward or backward over an array, prefetching is already handled by hardware. - - -In the bandwidth benchmark, we iterated over array and fetched its elements. Although separately each memory read in that case is not different from the fetch in pointer chasing, they run much faster because they can are overlapped: and in fact, CPU issues read requests in advance without waiting for the old ones to complete, so that the results come about the same time as the CPU needs them. - -In fact, this sometimes works even when we are not sure which instruction is going to be executed next. 
Consider the following example: - -```cpp -bool cond = some_long_memory_operation(); - -if (cond) - do_this_fast_operation(); -else - do_that_fast_operation(); -``` - -What most modern CPUs do is they start evaluating one (most likely) branch without waiting for the condition to be computed. If they are right, then you will progress faster, and if they are wrong, the worst thing will happen is they discard some useless computation. This includes memory operations too, including cache system — because, well, we wait for a hundred cycles anyway, why not evaluate at least one of the branches ahead of time. By the way, this is what Meltdown was all about. - -This general technique of hiding latency with bandwidth is called *prefetching* — and it can be either implicit or explicit. CPU automatically running ahead in the pipeline is just one way to use it. Hardware can figure out even without looking at the future instructions, and just by analyzing memory access patterns. Hiding latency is crucial — it is pretty much the single most important idea we keep coming back to in this book. Apart from having a very large pipeline and using the fact that scheduler can look ahead in it, modern memory controllers can detect simple patterns such as iterating backwards, forwards, including using constant small-ish strides. - -Here is how to test it: we now generate our permutation in a way that makes us load consecutive cache lines, but we fetch elements in random order inside the cache lines. - -```cpp -int p[15], q[N]; - -iota(p, p + 15, 1); - -for (int i = 0; i + 16 < N; i += 16) { - random_shuffle(p, p + 15); - int k = i; - for (int j = 0; j < 15; j++) - k = q[k] = i + p[j]; - q[k] = i + 16; -} -``` - -The latency here remains constant at 3ns regardless (or whatever is the latency of pointers / bit fields implementation). - -Hardware prefetching is usually powerful enough for most cases. 
You can iterate over multiple arrays, sometimes with small strides, or load just small amounts. It is as intelligent and detrimental to performance as branch prediction. diff --git a/content/english/hpc/cpu-cache/prefetching.md b/content/english/hpc/cpu-cache/prefetching.md new file mode 100644 index 00000000..84044d33 --- /dev/null +++ b/content/english/hpc/cpu-cache/prefetching.md @@ -0,0 +1,104 @@ +--- +title: Prefetching +weight: 6 +--- + +Taking advantage of the [free concurrency](../mlp) available in memory hardware, it can be beneficial to *prefetch* data that you are likely to be accessing next, if you can predict its location. + +You can do this either explicitly by using a separate instruction or just by accessing any byte in its cache line, but the most frequent patterns, such as linearly iterating forward or backward over an array, prefetching is already handled by hardware. + +### Hardware Prefetching + +In the bandwidth benchmark, we iterated over array and fetched its elements. Although separately each memory read in that case is not different from the fetch in pointer chasing, they run much faster because they can are overlapped: and in fact, CPU issues read requests in advance without waiting for the old ones to complete, so that the results come about the same time as the CPU needs them. + +This general technique of hiding latency with bandwidth is called *prefetching* — and it can be either implicit or explicit. CPU automatically running ahead in the pipeline is just one way to use it. Hardware can figure out even without looking at the future instructions, and just by analyzing memory access patterns. Hiding latency is crucial — it is pretty much the single most important idea we keep coming back to in this book. Apart from having a very large pipeline and using the fact that scheduler can look ahead in it, modern memory controllers can detect simple patterns such as iterating backwards, forwards, including using constant small-ish strides. 
+ +Here is how to test it: we now generate our permutation in a way that makes us load consecutive cache lines, but we fetch elements in random order inside the cache lines. + +```cpp +int p[15], q[N]; + +iota(p, p + 15, 1); + +for (int i = 0; i + 16 < N; i += 16) { + random_shuffle(p, p + 15); + int k = i; + for (int j = 0; j < 15; j++) + k = q[k] = i + p[j]; + q[k] = i + 16; +} +``` + +The latency here remains constant at 3ns regardless (or whatever is the latency of pointers / bit fields implementation). + +Hardware prefetching is usually powerful enough for most cases. You can iterate over multiple arrays, sometimes with small strides, or load just small amounts. It is as intelligent and detrimental to performance as branch prediction. + +### Software Prefetching + +Sometimes the hardware can't figure out what to prefetch next by itself, and in this case, we need to point it explicitly. + +The easiest thing is to just use any byte in the cache line as an operand, but CPUs have an explicit instruction to just "lift" a cache line without doing anything with it. As far as I know, this instruction is not a part of the C/C++ standard or any other language, but is widely available in compilers. + +It turned out it is non-trivial to design such a permutation case that simultaneously loops around all the array, can't be predicted by hardware prefetching but the next address is easily computable in order to do prefetching. + +Luckily, LCG can be used. It is a known property that if ..., then the period will be exactly $n$. So, we will modify our algorithm so that the permutation is generated by LCG, using current index as the state: + +```cpp +const int n = find_prime(N); + +for (int i = 0; i < n; i++) + q[i] = (2 * i + 1) % n; +``` + +Running it, the performance is the same as with the fully random permutation. 
But now we have the capability of peeking a bit ahead: + +```cpp +int k = 0; + +for (int t = 0; t < K; t++) { + for (int i = 0; i < n; i++) { + __builtin_prefetch(&q[(2 * k + 1) % n]); + k = q[k]; + } +} +``` + +It is almost 2 times faster, as we expected. + +![](../img/sw-prefetch.svg) + +Interestingly, we can cut it arbitrarily close (to the cost of computing the next index — [modulo is expensive](../arithmetic/integer)). + +One can show that in order to load $k$-th element ahead, we can do this: + +```cpp +__builtin_prefetch(&q[((1 << D) * k + (1 << D) - 1) % n]); +``` + +Managing issues such as integer overflow, we can cut latency down arbitrarily close to just calculating the address using the formula. + +![](../img/sw-prefetch-others.svg) + +Hardware prefetching activates and deactivates automatically. Software isn't and will block the pipeline. + +Prefetching can be to particular levels. + + + diff --git a/content/english/hpc/cpu-cache/sw-prefetching.md b/content/english/hpc/cpu-cache/sw-prefetching.md deleted file mode 100644 index 092d3c4a..00000000 --- a/content/english/hpc/cpu-cache/sw-prefetching.md +++ /dev/null @@ -1,52 +0,0 @@ ---- -title: Software Prefetching -weight: 7 ---- - -Sometimes the hardware can't figure out what to prefetch next by itself, and in this case, we need to point it explicitly. - -The easiest thing is to just use any byte in the cache line as an operand, but CPUs have an explicit instruction to just "lift" a cache line without doing anything with it. As far as I know, this instruction is not a part of the C/C++ standard or any other language, but is widely available in compilers. - -It turned out it is non-trivial to design such a permutation case that simultaneously loops around all the array, can't be predicted by hardware prefetching but the next address is easily computable in order to do prefetching. - -Luckily, LCG can be used. It is a known property that if ..., then the period will be exactly $n$. 
So, we will modify our algorithm so that the permutation is generated by LCG, using current index as the state: - -```cpp -const int n = find_prime(N); - -for (int i = 0; i < n; i++) - q[i] = (2 * i + 1) % n; -``` - -Running it, the performance is the same as with the fully random permutation. But now we have the capability of peeking a bit ahead: - -```cpp -int k = 0; - -for (int t = 0; t < K; t++) { - for (int i = 0; i < n; i++) { - __builtin_prefetch(&q[(2 * k + 1) % n]); - k = q[k]; - } -} -``` - -It is almost 2 times faster, as we expected. - -![](../img/sw-prefetch.svg) - -Interestingly, we can cut it arbitrarily close (to the cost of computing the next index — [modulo is expensive](../arithmetic/integer)). - -One can show that in order to load $k$-th element ahead, we can do this: - -```cpp -__builtin_prefetch(&q[((1 << D) * k + (1 << D) - 1) % n]); -``` - -Managing issues such as integer overflow, we can cut latency down arbitrarily close to just calculating the address using the formula. - -![](../img/sw-prefetch-others.svg) - -Hardware prefetching activates and deactivates automatically. Software isn't and will block the pipeline. - -Prefetching can be to particular levels. 
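The closed-form lookahead used in the prefetch above can be checked against the permutation itself. The following is a minimal sketch, assuming `n` is a prime as in the text and a GCC or Clang compiler for `__builtin_prefetch`; the helper names `build_permutation`, `lookahead` and `walk` are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Builds the permutation from the text: q[i] = (2 * i + 1) % n.
std::vector<int> build_permutation(int n) {
    std::vector<int> q(n);
    for (int i = 0; i < n; i++)
        q[i] = int((2LL * i + 1) % n);
    return q;
}

// Index reached after following the permutation d steps from k,
// in closed form: (2^d * k + 2^d - 1) mod n.
// 64-bit arithmetic avoids overflow of the product for d < 32.
int lookahead(int k, int d, int n) {
    std::int64_t p = std::int64_t(1) << d;
    return int((p * k + p - 1) % n);
}

// Follows the permutation for `steps` iterations starting from 0,
// prefetching the element that will be needed d steps later.
// The prefetch is only a hint and does not change the result.
int walk(const std::vector<int>& q, int steps, int d) {
    int n = int(q.size());
    int k = 0;
    for (int i = 0; i < steps; i++) {
        __builtin_prefetch(&q[lookahead(k, d, n)]);
        k = q[k];
    }
    return k;
}
```

For example, with `n = 1009` and `d = 3`, `lookahead(5, 3, 1009)` agrees with applying `q` three times starting from 5. The modulo in `lookahead` is the cost the text refers to when it says the latency can only be cut down to the cost of computing the next index.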
From e23374b924266425a041f257ecf634219308fe06 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 3 Feb 2022 16:14:04 +0300 Subject: [PATCH 092/531] merge alignment and packing --- content/english/hpc/cpu-cache/alignment.md | 54 +++++++++++++++++++++- content/english/hpc/cpu-cache/packing.md | 54 ---------------------- 2 files changed, 53 insertions(+), 55 deletions(-) delete mode 100644 content/english/hpc/cpu-cache/packing.md diff --git a/content/english/hpc/cpu-cache/alignment.md b/content/english/hpc/cpu-cache/alignment.md index 9a2a2ded..0a31368c 100644 --- a/content/english/hpc/cpu-cache/alignment.md +++ b/content/english/hpc/cpu-cache/alignment.md @@ -1,5 +1,5 @@ --- -title: Data Alignment +title: Alignment and Packing weight: 8 --- @@ -132,3 +132,55 @@ struct NodeG { ``` --> + +### Structure Packing + +If you know what you are doing, you can disable structure padding and pack your data as tightly as possible. + +You have to ask the compiler to do it, as such functionality is part of neither the C nor the C++ standard yet. In GCC and Clang, this is done with the `packed` attribute: + +```cpp +struct __attribute__ ((packed)) Data { + long long a; + bool b; +}; +``` + +This makes the instances of `Data` take just 9 bytes instead of the 16 required by alignment, at the cost of possibly fetching two cache lines to read its elements. + +### Bit fields + +You can also use packing along with *bit fields*, which allow you to explicitly fix the size of a member in bits: + +```cpp +struct __attribute__ ((packed)) Data { + char a; // 1 byte + int b : 24; // 3 bytes +}; +``` + +This structure takes 4 bytes when packed and 8 bytes when padded. The number of bits a member has doesn't have to be a multiple of 8, and neither does the total structure size. In an array of `Data`, the neighboring elements will be "merged" in the case of a non-whole number of bytes.
It also allows you to set a width that exceeds the base type, which acts as padding — although it throws a warning in the process. + + + +This feature is not so widespread because CPUs don't have 3-byte arithmetic or things like that and has to do some inefficient byte-by-byte conversion during loading: + +```cpp +int load(char *p) { + char x = p[0], y = p[1], z = p[2]; + return (x << 16) + (y << 8) + z; +} +``` + +The overhead is even larger when there is a non-whole byte — it needs to be handled with a shift and an and-mask. + +This procedure can be optimized by loading a 4-byte `int` and then using a mask to discard its highest bits. + +```cpp +int load(int *p) { + int x = *p; + return x & ((1<<24) - 1); +} +``` + +Compilers usually don't do that because this is not technically always legal: that 4th byte may be on a memory page that you don't own, so the operating system won't let you load it even if you are going to discard it right away. diff --git a/content/english/hpc/cpu-cache/packing.md b/content/english/hpc/cpu-cache/packing.md deleted file mode 100644 index a0601ddc..00000000 --- a/content/english/hpc/cpu-cache/packing.md +++ /dev/null @@ -1,54 +0,0 @@ ---- -title: Structure Packing -weight: 9 ---- - -If you know what you are doing, you can disable [structure padding](../alignment) and pack your data as tight as possible. - -You have to ask the compiler to do it, as such functionality is not a part of neither C nor C++ standard yet. In GCC and Clang, this is done with the `packed` attribute: - -```cpp -struct __attribute__ ((packed)) Data { - long long a; - bool b; -}; -``` - -This makes the instances of `Data` take just 9 bytes instead of the 16 required by alignment, at the cost of possibly fetching two cache lines to reads its elements. 
- -### Bit fields - -You can also use packing along with *bit fields*, which allow you to explicitly fix the size of a member in bits: - -```cpp -struct __attribute__ ((packed)) Data { - char a; // 1 byte - int b : 24; // 3 bytes -}; -``` - -This structure takes 4 bytes when packed and 8 bytes when padded. The number of bits a member has doesn't have to be a multiple of 8, and neither does the total structure size. In an array of `Data`, the neighboring elements will be "merged" in the case of a non-whole number of bytes. It also allows you to set a width that exceeds the base type, which acts as padding — although it throws a warning in the process. - - - -This feature is not so widespread because CPUs don't have 3-byte arithmetic or things like that and has to do some inefficient byte-by-byte conversion during loading: - -```cpp -int load(char *p) { - char x = p[0], y = p[1], z = p[2]; - return (x << 16) + (y << 8) + z; -} -``` - -The overhead is even larger when there is a non-whole byte — it needs to be handled with a shift and an and-mask. - -This procedure can be optimized by loading a 4-byte `int` and then using a mask to discard its highest bits. - -```cpp -int load(int *p) { - int x = *p; - return x & ((1<<24) - 1); -} -``` - -Compilers usually don't do that because this is not technically always legal: that 4th byte may be on a memory page that you don't own, so the operating system won't let you load it even if you are going to discard it right away. 
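The size claims about packing can be checked at compile time. The sketch below assumes GCC or Clang (for the `packed` attribute) and uses the hypothetical names `Padded`, `Packed` and `load24`. The 3-byte load is a little-endian variant written for this illustration, not the byte order used in the snippet above; it reads exactly three bytes, so it never touches memory beyond the field.

```cpp
#include <cstdint>

// Same members as `Data` in the text, with default alignment.
struct Padded {
    long long a;
    bool b;
};

// Same members with padding disabled (GCC/Clang attribute).
struct __attribute__((packed)) Packed {
    long long a;
    bool b;
};

static_assert(sizeof(Packed) == 9, "8 bytes for a plus 1 byte for b");
static_assert(sizeof(Padded) > sizeof(Packed), "padding adds trailing bytes");

// Hypothetical helper: assembles a 24-bit value from three bytes,
// lowest byte first. Unsigned bytes avoid sign-extension surprises.
inline std::uint32_t load24(const unsigned char* p) {
    return std::uint32_t(p[0])
         | (std::uint32_t(p[1]) << 8)
         | (std::uint32_t(p[2]) << 16);
}
```

Whether a compiler turns `load24` into a single wide load and a mask is up to the optimizer. Writing it byte by byte keeps it legal regardless of what follows the field in memory.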
From 04da34f11568fae9ed8d4a8da13e453683c1314d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 3 Feb 2022 21:07:06 +0300 Subject: [PATCH 093/531] prefetching --- content/english/hpc/cpu-cache/prefetching.md | 58 +++++++++++++------- 1 file changed, 37 insertions(+), 21 deletions(-) diff --git a/content/english/hpc/cpu-cache/prefetching.md b/content/english/hpc/cpu-cache/prefetching.md index 84044d33..8ccdea6b 100644 --- a/content/english/hpc/cpu-cache/prefetching.md +++ b/content/english/hpc/cpu-cache/prefetching.md @@ -3,17 +3,18 @@ title: Prefetching weight: 6 --- -Taking advantage of the [free concurrency](../mlp) available in memory hardware, it can be beneficial to *prefetch* data that you are likely to be accessing next, if you can predict its location. +Taking advantage of the [free concurrency](../mlp) available in memory hardware, it can be beneficial to *prefetch* data that is likely to be accessed next if its location can be predicted. This is easy to do when there are no [data or control hazards](/hpc/pipelining/hazards) in the pipeline and the CPU can just run ahead of the instruction stream and execute memory operations out of order. -You can do this either explicitly by using a separate instruction or just by accessing any byte in its cache line, but the most frequent patterns, such as linearly iterating forward or backward over an array, prefetching is already handled by hardware. +But sometimes the memory locations aren't in the instruction stream, and yet they can still be predicted with high probability. In these cases, they can be prefetched by other means: -### Hardware Prefetching +- Explicitly, by separately reading the next data word or any of the bytes in the same cache line, so that it is lifted in the cache hierarchy. +- Implicitly, by using simple access patterns such as linear iteration, which are detectable by the memory hardware that can start prefetching automatically.
-In the bandwidth benchmark, we iterated over array and fetched its elements. Although separately each memory read in that case is not different from the fetch in pointer chasing, they run much faster because they can are overlapped: and in fact, CPU issues read requests in advance without waiting for the old ones to complete, so that the results come about the same time as the CPU needs them. +Hiding memory latency is crucial for achieving performance, so in this section, we will look into prefetching techniques. -This general technique of hiding latency with bandwidth is called *prefetching* — and it can be either implicit or explicit. CPU automatically running ahead in the pipeline is just one way to use it. Hardware can figure out even without looking at the future instructions, and just by analyzing memory access patterns. Hiding latency is crucial — it is pretty much the single most important idea we keep coming back to in this book. Apart from having a very large pipeline and using the fact that scheduler can look ahead in it, modern memory controllers can detect simple patterns such as iterating backwards, forwards, including using constant small-ish strides. +### Hardware Prefetching -Here is how to test it: we now generate our permutation in a way that makes us load consecutive cache lines, but we fetch elements in random order inside the cache lines. +Let's modify the [pointer chasing](../latency) benchmark to show the effect of hardware prefetching. Now, we generate our permutation in a way that makes the CPU request consecutive cache lines when iterating over the permutation, but still accessing the elements inside a cache line in random order: ```cpp int p[15], q[N]; @@ -29,28 +30,30 @@ for (int i = 0; i + 16 < N; i += 16) { } ``` -The latency here remains constant at 3ns regardless (or whatever is the latency of pointers / bit fields implementation). +There is no point in making a graph because the latency is flat: 3ns regardless of the array size. 
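Since the diff only shows a fragment of the generating code, here is one way such a permutation might be built. This is a sketch under stated assumptions, not the benchmark's actual source: it assumes 16 `int`s per 64-byte cache line and a buffer that starts on a cache line boundary (which `std::vector` does not guarantee), and the name `block_permutation` is invented for the example.

```cpp
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

// Builds q so that following k = q[k] from 0 walks blocks of 16 ints in order,
// while the 16 elements inside each block are visited in random order
std::vector<int> block_permutation(int n_blocks, unsigned seed) {
    const int B = 16;                  // ints per cache line (assumption)
    std::vector<int> q(n_blocks * B);
    std::vector<int> p(B - 1);
    std::iota(p.begin(), p.end(), 1);  // offsets 1..15 within a block
    std::mt19937 rng(seed);
    for (int b = 0; b < n_blocks; b++) {
        int base = b * B;
        std::shuffle(p.begin(), p.end(), rng);
        int k = base;                  // each block is entered at its first element
        for (int off : p) {
            q[k] = base + off;
            k = base + off;
        }
        q[k] = (b + 1 < n_blocks) ? base + B : 0; // go to the next block or wrap around
    }
    return q;
}
```

Chasing pointers through `q` touches consecutive cache lines, which is the kind of pattern a hardware prefetcher can pick up, even though the order inside each line is random.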
Even though the instruction scheduler still can't tell what we are going to fetch next, the memory prefetcher can detect a pattern just by looking at the memory accesses and start loading the next cache line ahead of time, leveling out its latency. -Hardware prefetching is usually powerful enough for most cases. You can iterate over multiple arrays, sometimes with small strides, or load just small amounts. It is as intelligent and detrimental to performance as branch prediction. +Hardware prefetching is usually powerful enough for most cases, but it only detects simple patterns. You can iterate forward and backward over multiple arrays in parallel, perhaps with small-to-medium strides, but that's about it. For anything more complex, the prefetcher won't figure out what's happening, and we need to help it out ourselves. ### Software Prefetching -Sometimes the hardware can't figure out what to prefetch next by itself, and in this case, we need to point it explicitly. +The simplest way to do software prefetching is to load any byte in the cache line with the `mov` or any other memory instruction, but CPUs have a separate `prefetch` instruction that lifts a cache line without doing anything with it. This instruction isn't a part of the C or C++ standard, but is available in most compilers as the `__builtin_prefetch` intrinsic: -The easiest thing is to just use any byte in the cache line as an operand, but CPUs have an explicit instruction to just "lift" a cache line without doing anything with it. As far as I know, this instruction is not a part of the C/C++ standard or any other language, but is widely available in compilers. +```c++ +__builtin_prefetch(&a[k]); +``` -It turned out it is non-trivial to design such a permutation case that simultaneously loops around all the array, can't be predicted by hardware prefetching but the next address is easily computable in order to do prefetching. +It's quite hard to come up with a *simple* example when it can be useful. 
To make the pointer chasing benchmark benefit from software prefetching, we need to construct a permutation that at the same time loops around the whole array, can't be predicted by hardware prefetcher, and has easily computable next addresses. -Luckily, LCG can be used. It is a known property that if ..., then the period will be exactly $n$. So, we will modify our algorithm so that the permutation is generated by LCG, using current index as the state: +Luckily, the [linear congruential generator](https://en.wikipedia.org/wiki/Linear_congruential_generator) has the property that if the modulus $n$ is a prime number, then the period of the generator will be exactly $n$. So we get all the properties we need if we use a permutation generated by the LCG with the current index as its state: ```cpp -const int n = find_prime(N); +const int n = find_prime(N); // largest prime not exceeding N for (int i = 0; i < n; i++) q[i] = (2 * i + 1) % n; ``` -Running it, the performance is the same as with the fully random permutation. But now we have the capability of peeking a bit ahead: +When we run it, the performance matches a normal random permutation. But now we get the ability to peek ahead: ```cpp int k = 0; @@ -63,29 +66,42 @@ for (int t = 0; t < K; t++) { } ``` -It is almost 2 times faster, as we expected. +There is some overhead to computing the next address, but for arrays large enough, it is almost two times faster: ![](../img/sw-prefetch.svg) -Interestingly, we can cut it arbitrarily close (to the cost of computing the next index — [modulo is expensive](../arithmetic/integer)). 
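One caveat that is easy to check numerically: `2 * i + 1` maps `n - 1` to itself, so the cycle starting at 0 can contain at most `n - 1` indices, and it reaches that length only when 2 is a primitive root modulo `n`. The helper below (`cycle_length` is a hypothetical name, not part of the benchmark) walks the cycle for a given odd modulus so that a candidate `n` can be validated before it is used.

```cpp
#include <cstdint>

// Length of the cycle through 0 in the permutation q[i] = (2 * i + 1) % n.
// n must be odd: for even n the value never returns to 0 and the loop would not stop.
int64_t cycle_length(int64_t n) {
    int64_t k = 0, steps = 0;
    do {
        k = (2 * k + 1) % n;
        steps++;
    } while (k != 0);
    return steps;
}
```

For example, the walk takes 10 steps for `n = 11` but only 3 steps for `n = 7`, so not every prime yields a cycle that covers the whole array.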
+Interestingly, we can prefetch more than just two elements ahead, making use of this pattern in the LCG function: + +$$ +\begin{aligned} + f(x) &= 2 \cdot x + 1 +\\ f^2(x) &= 4 \cdot x + 2 + 1 +\\ f^3(x) &= 8 \cdot x + 4 + 2 + 1 +\\ &\ldots +\\ f^k(x) &= 2^k \cdot x + (2^k - 1) +\end{aligned} +$$ -One can show that in order to load $k$-th element ahead, we can do this: +Hence, in order to load `D` elements ahead, we can do this: ```cpp __builtin_prefetch(&q[((1 << D) * k + (1 << D) - 1) % n]); ``` -Managing issues such as integer overflow, we can cut latency down arbitrarily close to just calculating the address using the formula. +Ignoring some issues such as the integer overflow, this way we can reduce the latency arbitrarily close to the cost of computing the next index (which in this case is dominated by the [modulo operation](/hpc/arithmetic/division)). ![](../img/sw-prefetch-others.svg) -Hardware prefetching activates and deactivates automatically. Software isn't and will block the pipeline. - -Prefetching can be to particular levels. +Note that this is an artificial example, and you actually fail more often than not when trying to insert software prefetching into practical programs. This is largely due to the fact that you need to issue a separate memory instruction that may compete for resources with the others. At the same time, hardware prefetching is 100% harmless as it only activates when the memory and cache buses are not busy. +You can also specify a specific level of cache the data needs to be brought to when doing software prefetching — when you aren't sure if you will be using it and don't want to kick out what is already in the L1 cache. You can use it with the `_mm_prefetch` intrinsic, which takes an integer value as the second parameter, specifying the cache level. This is useful in combination with [non-temporal loads and stores](../bandwidth#bypassing-the-cache). - -Exploit [spatial locality](/hpc/external-memory/locality). 
- -Let's modify the pointer chasing code so that the next pointer needs to be computed using a variable number of fields. We can either place them in separate arrays, or in the same array. - -The first approach, struct +The first approach will locate these fields together as the rows of a two-dimensional array. We will refer to this variant as *array of structures* (AoS): ```c++ const int M = N / D; // # of memory accesses @@ -67,20 +35,28 @@ for (int i = 0; i < M; i++) { } ``` -Transpose the array and also swap indices in all its accesses: +And the second approach will place them in separately. The laziest way to do this is to transpose the two-dimensional array `q` and swap the indices in all its subsequent accesses: ```c++ int q[D][M]; // ^--^ ``` +By analogy, we call this variant *structure of arrays* (SoA). Obviously, for large $D$'s, it performs much worse: + ![](../img/aos-soa.svg) -Running a bit forward: the spikes at powers of two for AoS are due to SIMD, and dips in SoA are due to cache associativity. +The performance of both variants grows linearly with $D$, but AoS needs to fetch up to 16 times fewer total cache lines as the data is stored sequentially. Even when $D=64$, the additional time it takes to process the other 63 values is less than the latency of the first fetch. + +You can also see the spikes at the powers of two. AoS performs slightly better because it can compute [horizontal xor-sum](/hpc/simd/reduction) faster with SIMD. In contrast, SoA performs much worse, but this isn't about $D$ being a power of two, but about $\lfloor N / D \rfloor$, the size of the second dimension, being a power of two, which in turn causes a pretty complicated [cache associativity](../associativity) effect. 
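To make the two layouts concrete, here is a small sketch that stores the same logical table of `M` records with `D` fields each in both orders and computes the xor of one record's fields. Flat `std::vector` storage and the function names are assumptions of this example; the benchmark itself uses the two-dimensional arrays `q[M][D]` and `q[D][M]`.

```cpp
#include <vector>

// AoS: the D fields of record i are adjacent in memory
int xor_fields_aos(const std::vector<int>& q, int D, int i) {
    int x = 0;
    for (int j = 0; j < D; j++)
        x ^= q[i * D + j]; // consecutive addresses
    return x;
}

// SoA: field j of every record is stored in its own contiguous run of M ints
int xor_fields_soa(const std::vector<int>& q, int M, int D, int i) {
    int x = 0;
    for (int j = 0; j < D; j++)
        x ^= q[j * M + i]; // addresses that are M ints apart
    return x;
}
```

Both functions return the same value for the same logical data. The only difference is the stride between the accessed addresses: one `int` in the first case and `M` `int`s in the second, which is why the second one touches a separate cache line for every field once `M` is large.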
+ +Even though $N=2^{23}$ and the array is too big to fit into the L3 cache, to process some number of elements from different cache lines in parallel, you still need to store them somewhere temporarily — you can't simply use registers as there aren't enough of them. When `N / D` is a power of two and we are iterating over the array `q[D][N / D]` along the first index, all memory addresses will map to the same cache line, making many of them be re-fetched from upper layers of cache. ### RAM-Specific Timings -![](../img/ram.png) +Let's do the same with the [padded int](../cache-lines) + +Теперь мы добавили паддинг в AoS так, что каждый элемент теперь окружен 15 какими-то бесполезными элементами, а в остальном они все так же используются для подсчета ксор-суммы D чисел: ```c++ struct padded_int { @@ -92,6 +68,21 @@ const int M = N / D / 16; padded_int q[M][D]; ``` + +4D. На уровне RAM интереснее. Казалось бы, AoS-padded должна работать так же, как SoA: мы и там, и там загружаем 63 кэш-линий. Однако здесь играет роль то, как работает сама RAM. + +Все данные в ней физически хранятся в виде двумерного массива конденсаторов, разделенного на строки и столбцы. Чтобы прочитать ячейку из него, нужно выполнить одно, два или три действия: + +1. Прочитать содержимое строки в специальный временный буфер (row buffer). +2. Выбрать и собственно прочитать (или записать) в нем нужную ячейку. +3. И, опционально, записать данные из буфера обратно в строку массива — потому что чтение разрежает конденсаторы, и их нужно зарядить обратно. Этот шаг нужно делать только в том случае, если следующий доступ в память относится к какой-то другой строке. + +Эти три шага занимают примерно одинаковое время. В AoS-padded все элементы хотя и распологаются в разных кэш-линиях, но эти линии соседние, и они с большой вероятностью окажутся в одной строчке в RAM, и первый и третий шаг можно проигнорировать. 
Поэтому суммарно все эти запросы отработают за втрое меньшее время (плюс задержка одного чтения) + +![](../img/ram.png) + +4C: когда мы падим инты, но оставляем суммарный размер массива таким же, ничего с точки зрения запросов к памяти и вычислений меняться не должно. Единственый нюанс: с паженными интами не происходит случайного шеринга кэша, как с обычными (когда мы загружаем какой-то инт, его соседи по кэш-линии тоже попадают в кэш), поэтому есть небольшое замедление. + ![](../img/aos-soa-padded.svg) ![](../img/aos-soa-padded-n.svg) @@ -113,3 +104,13 @@ We can turn on hugepages, and they make it 10 times worse (notice the logarithmi ![](../img/soa-hugepages.svg) This is a rare example where hugepages actually worsen performance. Usually they the latency by 10-15%, but here they make it 10x worse. + +4F: Когда мы включаем большие страницы, задержка немного уменьшается — так же, как и в оригинальном бенчмарке задержки с D=1 + +4G: L1/L2 уровни кэша приватные для каждого ядра, и поэтому для простоты, чтобы не делать отдельно трансляцию адресов, для них везде используются виртуальные адреса, а не физические. На уровне L3 и RAM уже используются реальные, потому что иначе синхронизироваться никак не получится. + +Когда мы используем 4K страницы, они размазываются по физической памяти довольно произвольным образом, и проблема описанная в 4E смягчается: все (физические) адреса имеют одинаковый остаток по модулю 4K, а не N/D. Когда мы запрашиваем именно большие страницы, они мапаются в последовательные же страницы в физической памяти, и поэтому этот лимит на максимальный alignment возрастает с 4K до 2M, и кэшам становится совсем плохо. 
+ +^ так что здесь ещё есть такой рандомный фактор, в зависимости от того, где операционная система страницы разместит + +Это единственный известный мне пример, когда увеличение размера страницы ухудшает производительность, тем более в 10 раз From df115636ea24caf957bcbe596cd2d9383f5cc676 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 4 Feb 2022 18:21:16 +0300 Subject: [PATCH 096/531] ram-specific timings --- content/english/hpc/cpu-cache/aos-soa.md | 45 ++++++++++++------------ 1 file changed, 22 insertions(+), 23 deletions(-) diff --git a/content/english/hpc/cpu-cache/aos-soa.md b/content/english/hpc/cpu-cache/aos-soa.md index 271d1aa1..65c0ce32 100644 --- a/content/english/hpc/cpu-cache/aos-soa.md +++ b/content/english/hpc/cpu-cache/aos-soa.md @@ -7,6 +7,8 @@ It is often beneficial to group together the data you need to fetch at the same To demonstrate the potential effect of doing this, we modify the [pointer chasing](../latency) benchmark so that the next pointer is computed using not one, but a variable number of fields ($D$). +### Experiment + The first approach will locate these fields together as the rows of a two-dimensional array. We will refer to this variant as *array of structures* (AoS): ```c++ @@ -48,15 +50,11 @@ By analogy, we call this variant *structure of arrays* (SoA). Obviously, for lar The performance of both variants grows linearly with $D$, but AoS needs to fetch up to 16 times fewer total cache lines as the data is stored sequentially. Even when $D=64$, the additional time it takes to process the other 63 values is less than the latency of the first fetch. -You can also see the spikes at the powers of two. AoS performs slightly better because it can compute [horizontal xor-sum](/hpc/simd/reduction) faster with SIMD. 
In contrast, SoA performs much worse, but this isn't about $D$ being a power of two, but about $\lfloor N / D \rfloor$, the size of the second dimension, being a power of two, which in turn causes a pretty complicated [cache associativity](../associativity) effect. - -Even though $N=2^{23}$ and the array is too big to fit into the L3 cache, to process some number of elements from different cache lines in parallel, you still need to store them somewhere temporarily — you can't simply use registers as there aren't enough of them. When `N / D` is a power of two and we are iterating over the array `q[D][N / D]` along the first index, all memory addresses will map to the same cache line, making many of them be re-fetched from upper layers of cache. - -### RAM-Specific Timings +You can also see the spikes at the powers of two. AoS performs slightly better because it can compute [horizontal xor-sum](/hpc/simd/reduction) faster with SIMD. In contrast, SoA performs much worse, but this isn't about $D$, but about $\lfloor N / D \rfloor$, the size of the second dimension, being a large power of two: this causes a pretty complicated [cache associativity](../associativity) effect which we will come back to later. -Let's do the same with the [padded int](../cache-lines) +### Padded AoS -Теперь мы добавили паддинг в AoS так, что каждый элемент теперь окружен 15 какими-то бесполезными элементами, а в остальном они все так же используются для подсчета ксор-суммы D чисел: +As long as we are fetching the same number of cache lines, it doesn't matter where they are located, right? Let's test it and switch to [padded integers](../cache-lines) in the AoS code: ```c++ struct padded_int { @@ -68,36 +66,37 @@ const int M = N / D / 16; padded_int q[M][D]; ``` +Other than that, we are still calculating the xor-sum of $D$ padded integers. We fetch exactly $D$ cache lines, but this time sequentially. The running time shouldn't be different from SoA, but this isn't what happens: -4D. 
На уровне RAM интереснее. Казалось бы, AoS-padded должна работать так же, как SoA: мы и там, и там загружаем 63 кэш-линий. Однако здесь играет роль то, как работает сама RAM. +![](../img/aos-soa-padded.svg) -Все данные в ней физически хранятся в виде двумерного массива конденсаторов, разделенного на строки и столбцы. Чтобы прочитать ячейку из него, нужно выполнить одно, два или три действия: +The running time is about ⅓ lower for $D=63$, but this only applies to arrays that exceed the L3 cache. If we fix $D$ and change $N$, it becomes clear that the padded version actually performs slightly worse on smaller arrays because there less random [cache sharing](../cache-lines): -1. Прочитать содержимое строки в специальный временный буфер (row buffer). -2. Выбрать и собственно прочитать (или записать) в нем нужную ячейку. -3. И, опционально, записать данные из буфера обратно в строку массива — потому что чтение разрежает конденсаторы, и их нужно зарядить обратно. Этот шаг нужно делать только в том случае, если следующий доступ в память относится к какой-то другой строке. +![](../img/aos-soa-padded-n.svg) -Эти три шага занимают примерно одинаковое время. В AoS-padded все элементы хотя и распологаются в разных кэш-линиях, но эти линии соседние, и они с большой вероятностью окажутся в одной строчке в RAM, и первый и третий шаг можно проигнорировать. Поэтому суммарно все эти запросы отработают за втрое меньшее время (плюс задержка одного чтения) +As the performance on smaller arrays sizes is not affected, this clearly has something to do with how RAM works. -![](../img/ram.png) +### RAM-Specific Timings -4C: когда мы падим инты, но оставляем суммарный размер массива таким же, ничего с точки зрения запросов к памяти и вычислений меняться не должно. Единственый нюанс: с паженными интами не происходит случайного шеринга кэша, как с обычными (когда мы загружаем какой-то инт, его соседи по кэш-линии тоже попадают в кэш), поэтому есть небольшое замедление. 
+From the performance analysis point of view, all data in RAM is physically stored in a two-dimensional array of tiny capacitor cells, which is split in rows and columns. To read or write any cell, you need to perform one, two, or three actions: -![](../img/aos-soa-padded.svg) +1. Read the contents of a row in a *row buffer*, which temporarily discharges the capacitors. +2. Read or write a specific column in this buffer. +3. Write the contents of a row buffer back into the capacitors, so that the data is preserved, and the row buffer can be used for other memory accesses. -![](../img/aos-soa-padded-n.svg) +Here is the punchline: you don't have to perform steps 1 and 3 between two memory accesses that correspond to the same row — you can just use the row buffer as a temporary cache. These three actions take roughly the same time, so this optimization makes long sequences of row-local accesses run thrice as fast compared to dispersed access patterns. -The rest of the core is the same: the only difference is that they require a separate cache line access. +![](../img/ram.png) -This is only specific to RAM: on array sizes that fit in cache, the benchmark is actually worse because the [cache sharing is worse](../cache-lines). +The size of the row differs depending on the hardware, but it is usually somewhere between 1024 and 8192 bytes. So even though the padded AoS benchmark places each element in its own cache line, they are still very likely to be on the same RAM row, and the whole read sequence runs in roughly ⅓ of the time plus the latency of the first memory access. -RAM timings. +### Temporary Storage -This isn't about $D$ being equal to 64 but about $\lfloor \frac{N}{D} \rfloor$ being a large power of two. +Let's discuss the spikes in the graphs in more detail. -TODO fix D and change N +This isn't about $D$ being equal to 64 but about $\lfloor \frac{N}{D} \rfloor$ being a large power of two. 
-### Temporary Storage Contention +Even though $N=2^{23}$ and the array is too big to fit into the L3 cache, to process some number of elements from different cache lines in parallel, you still need to store them somewhere temporarily — you can't simply use registers as there aren't enough of them. When `N / D` is a power of two and we are iterating over the array `q[D][N / D]` along the first index, all memory addresses will map to the same cache line, making many of them be re-fetched from upper layers of cache. We can turn on hugepages, and they make it 10 times worse (notice the logarithmic scale): From b5484b68fb28217c3f531c4400fb07a916ca16a3 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 4 Feb 2022 18:58:22 +0300 Subject: [PATCH 097/531] aos hugepages --- content/english/hpc/cpu-cache/aos-soa.md | 52 +++++++++++------------- 1 file changed, 23 insertions(+), 29 deletions(-) diff --git a/content/english/hpc/cpu-cache/aos-soa.md b/content/english/hpc/cpu-cache/aos-soa.md index 65c0ce32..048271db 100644 --- a/content/english/hpc/cpu-cache/aos-soa.md +++ b/content/english/hpc/cpu-cache/aos-soa.md @@ -37,7 +37,7 @@ for (int i = 0; i < M; i++) { } ``` -And the second approach will place them in separately. The laziest way to do this is to transpose the two-dimensional array `q` and swap the indices in all its subsequent accesses: +And in the second approach, we will place them separately. The laziest way to do this is to transpose the two-dimensional array `q` and swap the indices in all its subsequent accesses: ```c++ int q[D][M]; @@ -50,7 +50,25 @@ By analogy, we call this variant *structure of arrays* (SoA). Obviously, for lar The performance of both variants grows linearly with $D$, but AoS needs to fetch up to 16 times fewer total cache lines as the data is stored sequentially. Even when $D=64$, the additional time it takes to process the other 63 values is less than the latency of the first fetch. -You can also see the spikes at the powers of two. 
AoS performs slightly better because it can compute [horizontal xor-sum](/hpc/simd/reduction) faster with SIMD. In contrast, SoA performs much worse, but this isn't about $D$, but about $\lfloor N / D \rfloor$, the size of the second dimension, being a large power of two: this causes a pretty complicated [cache associativity](../associativity) effect which we will come back to later. +You can also see the spikes at the powers of two. AoS performs slightly better because it can compute [horizontal xor-sum](/hpc/simd/reduction) faster with SIMD. In contrast, SoA performs much worse, but this isn't about $D$, but about $\lfloor N / D \rfloor$, the size of the second dimension, being a large power of two: this causes a pretty complicated [cache associativity](../associativity) effect. + +### Temporary Storage Contention + +At first, it seems like there shouldn't be any cache issues as $N=2^{23}$ and the array is just too big to fit into the L3 cache in the first place. The nuance is that to process a number of elements from different memory locations in parallel, you still need some space to store them temporarily. You can't simply use registers as there aren't enough of them, so they need to be stored in the cache even though in just a microsecond you won't be needing them. + +Therefore, when `N / D` is a large power of two, and we are iterating over the array `q[D][N / D]` along the first index, some of the memory addresses we temporarily need will map to the same cache line — and as there isn't enough space there, many of them will have to be re-fetched from the upper layers of the memory hierarchy. + +Here is another head-scratcher: if we enable [huge pages](../paging), it expectedly makes the total latency 10-15% lower for most values of $D$, but for $D=64$, it makes things ten times worse: + +![Note the logarithmic scale](../img/soa-hugepages.svg) + +I doubt that even the engineers who design memory controllers can explain what's happening right off the bat. 
+ +In short, the difference is because, unlike the L1/L2 caches that are private to each core, the L3 cache has to use *physical* memory addresses instead of *virtual* ones for synchronization between different cores sharing the cache. + +When we are using 4K memory pages, the virtual addresses get somewhat arbitrarily dispersed over the physical memory, which makes the cache associativity problem less severe: the physical addresses will have the same remainder modulo 4K bytes, and not `N / D` as for the virtual addresses. When we specifically require huge pages, this maximum alignment limit increases to 2M, and the cache lines receive much more contention. + +This is the only example I know when enabling huge pages makes performance worse, let alone by a factor of ten. ### Padded AoS @@ -70,7 +88,7 @@ Other than that, we are still calculating the xor-sum of $D$ padded integers. We ![](../img/aos-soa-padded.svg) -The running time is about ⅓ lower for $D=63$, but this only applies to arrays that exceed the L3 cache. If we fix $D$ and change $N$, it becomes clear that the padded version actually performs slightly worse on smaller arrays because there less random [cache sharing](../cache-lines): +The running time is about ⅓ lower for $D=63$, but this only applies to arrays that exceed the L3 cache. If we fix $D$ and change $N$, you can see that the padded version performs slightly worse on smaller arrays because there are less opportunities for random [cache sharing](../cache-lines): ![](../img/aos-soa-padded-n.svg) @@ -78,7 +96,7 @@ As the performance on smaller arrays sizes is not affected, this clearly has som ### RAM-Specific Timings -From the performance analysis point of view, all data in RAM is physically stored in a two-dimensional array of tiny capacitor cells, which is split in rows and columns. 
To read or write any cell, you need to perform one, two, or three actions: +From the performance analysis point of view, all data in RAM is physically stored in a two-dimensional array of tiny capacitor cells, which is split into rows and columns. To read or write any cell, you need to perform one, two, or three actions: 1. Read the contents of a row in a *row buffer*, which temporarily discharges the capacitors. 2. Read or write a specific column in this buffer. @@ -88,28 +106,4 @@ Here is the punchline: you don't have to perform steps 1 and 3 between two memor ![](../img/ram.png) -The size of the row differs depending on the hardware, but it is usually somewhere between 1024 and 8192 bytes. So even though the padded AoS benchmark places each element in its own cache line, they are still very likely to be on the same RAM row, and the whole read sequence runs in roughly ⅓ of the time plus the latency of the first memory access. - -### Temporary Storage - -Let's discuss the spikes in the graphs in more detail. - -This isn't about $D$ being equal to 64 but about $\lfloor \frac{N}{D} \rfloor$ being a large power of two. - -Even though $N=2^{23}$ and the array is too big to fit into the L3 cache, to process some number of elements from different cache lines in parallel, you still need to store them somewhere temporarily — you can't simply use registers as there aren't enough of them. When `N / D` is a power of two and we are iterating over the array `q[D][N / D]` along the first index, all memory addresses will map to the same cache line, making many of them be re-fetched from upper layers of cache. - -We can turn on hugepages, and they make it 10 times worse (notice the logarithmic scale): - -![](../img/soa-hugepages.svg) - -This is a rare example where hugepages actually worsen performance. Usually they the latency by 10-15%, but here they make it 10x worse. 
- -4F: Когда мы включаем большие страницы, задержка немного уменьшается — так же, как и в оригинальном бенчмарке задержки с D=1 - -4G: L1/L2 уровни кэша приватные для каждого ядра, и поэтому для простоты, чтобы не делать отдельно трансляцию адресов, для них везде используются виртуальные адреса, а не физические. На уровне L3 и RAM уже используются реальные, потому что иначе синхронизироваться никак не получится. - -Когда мы используем 4K страницы, они размазываются по физической памяти довольно произвольным образом, и проблема описанная в 4E смягчается: все (физические) адреса имеют одинаковый остаток по модулю 4K, а не N/D. Когда мы запрашиваем именно большие страницы, они мапаются в последовательные же страницы в физической памяти, и поэтому этот лимит на максимальный alignment возрастает с 4K до 2M, и кэшам становится совсем плохо. - -^ так что здесь ещё есть такой рандомный фактор, в зависимости от того, где операционная система страницы разместит - -Это единственный известный мне пример, когда увеличение размера страницы ухудшает производительность, тем более в 10 раз +The size of the row differs depending on the hardware, but it is usually somewhere between 1024 and 8192 bytes. So even though the padded AoS benchmark places each element in a separate cache line, they are still very likely to be on the same RAM row, and the whole read sequence runs in roughly ⅓ of the time plus the latency of the first memory access. From 16045d2d943979f5c3b960bf53b3284abdd1ea8b Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 4 Feb 2022 19:02:52 +0300 Subject: [PATCH 098/531] grammar --- content/english/hpc/arithmetic/rsqrt.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/arithmetic/rsqrt.md b/content/english/hpc/arithmetic/rsqrt.md index f49514d9..1882fa26 100644 --- a/content/english/hpc/arithmetic/rsqrt.md +++ b/content/english/hpc/arithmetic/rsqrt.md @@ -99,7 +99,7 @@ $$ Cool. Now, where were we? 
Oh, yes, we wanted to calculate the inverse square root. -### Approximating Result +### Approximating the Result To calculate $y = \frac{1}{\sqrt x}$ using the identity $\log_2 y = - \frac{1}{2} \log_2 x$, we can plug it into our approximation formula and get @@ -132,7 +132,7 @@ $$ f'(y) = - \frac{2}{y^3} \implies y_{i+1} = y_{i} (\frac{3}{2} - \frac{x}{2} y_i^2) = \frac{y_i (3 - x y_i^2)}{2} $$ -which is written in code as +which is written in the code as ```cpp x2 = number * 0.5F; From 755cffa3fda87936f3484b9c510989ab128fbc43 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 4 Feb 2022 19:06:25 +0300 Subject: [PATCH 099/531] rsqrt grammar --- content/english/hpc/arithmetic/rsqrt.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/content/english/hpc/arithmetic/rsqrt.md b/content/english/hpc/arithmetic/rsqrt.md index 1882fa26..4aae6dcf 100644 --- a/content/english/hpc/arithmetic/rsqrt.md +++ b/content/english/hpc/arithmetic/rsqrt.md @@ -37,7 +37,7 @@ float Q_rsqrt(float number) { We will go through what it does step by step, but first, we need to take a small detour. -### Calculating Approximate Logarithm +### Approximate Logarithm Before computers (or at least affordable calculators) became an everyday thing, people computed multiplication and related operations using logarithm tables — by looking up the logarithms of $a$ and $b$, adding them, and then finding the inverse logarithm of the result. @@ -51,9 +51,9 @@ $$ \log \frac{1}{\sqrt x} = - \frac{1}{2} \log x $$ -The fast inverse square root is based on this identity, and so it needs to calculate the logarithm of $x$ very quickly. Turns out, it can be approximated by just reinterpreting a 32-bit `float` as integer. +The fast inverse square root is based on this identity, and so it needs to calculate the logarithm of $x$ very quickly. Turns out, it can be approximated by just reinterpreting a 32-bit `float` as an integer. 
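A rough sketch of that reinterpretation, under the assumption of IEEE 754 single precision (8 exponent bits with a bias of 127 followed by 23 mantissa bits): dividing the integer value of the bits by $2^{23}$ and subtracting the bias gives the exponent plus the fractional mantissa, which equals $\log_2 x$ at powers of two and slightly underestimates it in between. `approx_log2` is an illustrative name, and no correction constant is applied here.

```cpp
#include <cstdint>
#include <cstring>

// Approximates log2(x) for positive, normal floats by reading the bit pattern
// as an integer: the exponent ends up in the integer part, the mantissa in the fraction
double approx_log2(float x) {
    uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits); // well-defined way to reinterpret the bits
    return bits / 8388608.0 - 127.0;     // divide by 2^23, then remove the exponent bias
}
```

For instance, `approx_log2(3.0f)` evaluates to 1.5, while the true value of $\log_2 3$ is about 1.585.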

-[Recall](../float), floating-point numbers sequentially store the sign bit (equal to zero for positive values, which is our case), exponent $e_x$ and mantissa $m_x$, which corresponds to
+[Recall](../float) that floating-point numbers sequentially store the sign bit (equal to zero for positive values, which is our case), exponent $e_x$ and mantissa $m_x$, which corresponds to

 $$
 x = 2^{e_x} \cdot (1 + m_x)
@@ -65,13 +65,13 @@ $$
 \log_2 x = e_x + \log_2 (1 + m_x)
 $$

-Since $m_x \in [0, 1)$, the logarithm on the right hand side can be approximated by
+Since $m_x \in [0, 1)$, the logarithm on the right-hand side can be approximated by

 $$
 \log_2 (1 + m_x) \approx m_x
 $$

-The approximation is exact at both ends of the intervals, but to account for average case we need to shift it by a small constant $\sigma$, therefore
+The approximation is exact at both ends of the intervals, but to account for the average case we need to shift it by a small constant $\sigma$, therefore

 $$
 \log_2 x = e_x + \log_2 (1 + m_x) \approx e_x + m_x + \sigma
@@ -87,7 +87,7 @@ I_x &= L(e_x + B + m_x)
 \end{aligned}
 $$

-When you tune $\sigma$ to minimize them mean square error, this results in a surprisingly accurate approximation.
+When you tune $\sigma$ to minimize the mean square error, this results in a surprisingly accurate approximation.

 ![](../img/approx.svg)

@@ -126,7 +126,7 @@ We reinterpret `y` as an integer in the first line, and then it plug into the fo

 ### Iterating with Newton's Method

-What we have next is a couple hand-coded iterations of Newton's method with $f(y) = \frac{1}{y^2} - x$ and a very good initial value. It's update rule is
+What we have next is a couple hand-coded iterations of Newton's method with $f(y) = \frac{1}{y^2} - x$ and a very good initial value. Its update rule is

 $$
 f'(y) = - \frac{2}{y^3} \implies y_{i+1} = y_{i} (\frac{3}{2} - \frac{x}{2} y_i^2) = \frac{y_i (3 - x y_i^2)}{2}
@@ -141,6 +141,6 @@ y = y * ( threehalfs - ( x2 * y * y ) );

 The initial approximation is so good that just one iteration was enough for game development purposes. It falls within 99.8% of the correct answer after just the first iteration and can be reiterated further to improve accuracy — which is what is done in the hardware: [the x86 instruction](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=3037,3009,5135,4870,4870,4872,4875,833,879,874,849,848,6715,4845,6046,3853,288,6570,6527,6527,90,7307,6385,5993&text=rsqrt&techs=AVX,AVX2) does a few of them and guarantees a relative error of no more than $1.5 \times 2^{-12}$.

-## Further Reading
+### Further Reading

 [Wikipedia article of fast inverse square root](https://en.wikipedia.org/wiki/Fast_inverse_square_root#Floating-point_representation).

From a989d853a22f405aea018719106182273355b7ba Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Fri, 4 Feb 2022 19:07:35 +0300
Subject: [PATCH 100/531] typos

---
 content/english/hpc/arithmetic/rsqrt.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/english/hpc/arithmetic/rsqrt.md b/content/english/hpc/arithmetic/rsqrt.md
index 4aae6dcf..06659136 100644
--- a/content/english/hpc/arithmetic/rsqrt.md
+++ b/content/english/hpc/arithmetic/rsqrt.md
@@ -115,7 +115,7 @@ $$
 I_y \approx \frac{3}{2} L (B - \sigma) - \frac{1}{2} I_x
 $$

-It turns out, we don't even need to calculate logarithm in the first place: the formula above is just a constant minus the half of integer reinterpretation of $x$. It is written in the code as:
+It turns out, we don't even need to calculate the logarithm in the first place: the formula above is just a constant minus half the integer reinterpretation of $x$. It is written in the code as:

 ```cpp
 i = * ( long * ) &y;
@@ -143,4 +143,4 @@ The initial approximation is so good that just one iteration was enough for game

 ### Further Reading

-[Wikipedia article of fast inverse square root](https://en.wikipedia.org/wiki/Fast_inverse_square_root#Floating-point_representation).
+[Wikipedia article on fast inverse square root](https://en.wikipedia.org/wiki/Fast_inverse_square_root#Floating-point_representation).

From 3fd354b258d78f306c9801f58430f9b6d84ddaef Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Sat, 5 Feb 2022 07:24:46 +0300
Subject: [PATCH 101/531] simd graphs and code

---
 content/english/hpc/simd/img/filter.svg       | 1401 +++++++++++++++++
 .../english/hpc/simd/img/gather-scatter.png   | Bin 0 -> 31350 bytes
 content/english/hpc/simd/img/gather.svg       | 1233 +++++++++++++++
 content/english/hpc/simd/masking.md           | 180 ++-
 content/english/hpc/simd/moving.md            | 47 +-
 content/english/hpc/simd/shuffing.md          | 170 +-
 6 files changed, 2969 insertions(+), 62 deletions(-)
 create mode 100644 content/english/hpc/simd/img/filter.svg
 create mode 100644 content/english/hpc/simd/img/gather-scatter.png
 create mode 100644 content/english/hpc/simd/img/gather.svg

diff --git a/content/english/hpc/simd/img/filter.svg b/content/english/hpc/simd/img/filter.svg
new file mode 100644
index 00000000..99422714
--- /dev/null
+++ b/content/english/hpc/simd/img/filter.svg
@@ -0,0 +1,1401 @@
[1401 added lines of SVG markup omitted]
diff --git a/content/english/hpc/simd/img/gather-scatter.png b/content/english/hpc/simd/img/gather-scatter.png
new file mode 100644
index 0000000000000000000000000000000000000000..91aae2c7164f8a891d7beacd85c8a69478d6590c
GIT binary patch
literal 31350
[base85-encoded PNG data omitted]

literal 0
HcmV?d00001

diff --git a/content/english/hpc/simd/img/gather.svg b/content/english/hpc/simd/img/gather.svg
new file mode 100644
index 00000000..5f35484f
--- /dev/null
+++ b/content/english/hpc/simd/img/gather.svg
@@ -0,0 +1,1233 @@
[1233 added lines of SVG markup omitted]
diff --git a/content/english/hpc/simd/masking.md b/content/english/hpc/simd/masking.md
index 4ef4026a..05f632db 100644
--- a/content/english/hpc/simd/masking.md
+++ b/content/english/hpc/simd/masking.md
@@ -14,74 +14,158 @@ While using the elementwise instructions is easy, the largest challenge with SIM

 SIMD has no easy way to do branching, because the control flow should be the same for all elements in a vector. To overcome this limitation, we can "mask" operations that should only be performed on a subset of elements, in a way similar to how a [conditional move](/hpc/analyzing-performance/assembly) is executed.

-Consider the following problem: for some reason, we need to raise $10^8$ random integers to some random powers.
+```c++
+for (int i = 0; i < N; i++)
+    a[i] = rand() % 100;
+
+for (int i = 0; i < N; i++)
+    s += (a[i] < 50 ? a[i] : 0);
+```
+
+```c++
+const reg c = _mm256_set1_epi32(49);
+const reg z = _mm256_setzero_si256();
+reg s = _mm256_setzero_si256();
+
+for (int i = 0; i < N; i += 8) {
+    reg x = _mm256_load_si256( (reg*) &a[i] );
+    reg mask = _mm256_cmpgt_epi32(x, c);
+    x = _mm256_blendv_epi8(x, z, mask);
+    s = _mm256_add_epi32(s, x);
+}
+```
+
+```c++
+const reg c = _mm256_set1_epi32(50);
+reg s = _mm256_setzero_si256();
+
+for (int i = 0; i < N; i += 8) {
+    reg x = _mm256_load_si256( (reg*) &a[i] );
+    reg mask = _mm256_cmpgt_epi32(c, x);
+    x = _mm256_and_si256(x, mask);
+    s = _mm256_add_epi32(s, x);
+}
+```

 ```c++
-const int n = 1e8;
-alignas(32) unsigned bases[n], results[n], powers[n];
+vec *v = (vec*) a;
+vec s = {};
+
+for (int i = 0; i < N / 8; i++)
+    s += (v[i] < 50 ? v[i] : 0);
+```
+
+
+```nasm
+vpcmpeqd ymm0, ymm1, YMMWORD PTR a[0+rdx*4]
+vptest ymm0, ymm0
+je .L2
 ```

-In SSE/AVX, [doing modular reduction](/hpc/arithmetic/integer) is even more complicated than in the scalar case (e. g. SSE has no integer division in the first place), so we will perform all operations modulo $2^{32}$ by naturally overflowing an `unsigned int`.
+```nasm
+vpcmpeqd ymm0, ymm1, YMMWORD PTR a[0+rdx*4]
+vmovmskps eax, ymm0
+test eax, eax
+je .L9
+```

-We'd normally do it by exponentiation by squaring:
+### Searching

 ```c++
-void binpow_simple() {
-    for (int i = 0; i < n; i++) {
-        unsigned a = bases[i], p = powers[i];
-
-        unsigned res = 1;
-        while (p > 0) {
-            if (p & 1)
-                res = (res * a);
-            a = (a * a);
-            p >>= 1;
-        }
+int find(int x) {
+    for (int i = 0; i < N; i++)
+        if (a[i] == x)
+            return i;
+    return -1;
+}
+```

-        results[i] = res;
+```c++
+int find(int needle) {
+    reg x = _mm256_set1_epi32(needle);
+
+    for (int i = 0; i < N; i += 8) {
+        reg y = _mm256_load_si256( (reg*) &a[i] );
+        reg m = _mm256_cmpeq_epi32(x, y);
+        int mask = _mm256_movemask_ps((__m256) m);
+        if (mask != 0)
+            return i + __builtin_ctz(mask);
     }
+
+    return -1;
 }
 ```

-This code runs in 9.47 seconds.
+```c++
+int find(int needle) {
+    reg x = _mm256_set1_epi32(needle);
+
+    for (int i = 0; i < N; i += 8) {
+        reg y = _mm256_load_si256( (reg*) &a[i] );
+        reg m = _mm256_cmpeq_epi32(x, y);
+        if (!_mm256_testz_si256(m, m)) {
+            int mask = _mm256_movemask_ps((__m256) m);
+            return i + __builtin_ctz(mask);
+        }
+    }
+
+    return -1;
+}
+```

-To vectorize it, we can first split the arrays `a` and `p` into groups of 8 elements, and then run exponentiation by squaring on them for 32 iterations (the maximum for any 32-bit power), masking the elements that need to be squared:
+### Counting Values

 ```c++
-typedef __m256i reg;
-
-void binpow_simd() {
-    const reg ones = _mm256_set_epi32(1, 1, 1, 1, 1, 1, 1, 1);
-    for (int i = 0; i < n; i += 8) {
-        reg a = _mm256_load_si256((__m256i*) &bases[i]);
-        reg p = _mm256_load_si256((__m256i*) &powers[i]);
-        reg res = ones;
-
-        // in fact, there will not be a cycle here:
-        // the compiler should unroll it in 32 separate blocks of operations
-        for (int l = 0; l < 32; l++) {
-            // instead of explicit branching, calculate a "multiplier" for every element:
-            // it is either 1 or a, depending on the lowest bit of p
-
-            // masks of elements that should be multiplied by a:
-            reg mask = _mm256_cmpeq_epi32(_mm256_and_si256(p, ones), ones);
-            // now we blend a vector of ones and a vector of a using this mask:
-            reg mul = _mm256_blendv_epi8(ones, a, mask);
-            // res *= mul:
-            res = _mm256_mullo_epi32(res, mul);
-            // a *= a:
-            a = _mm256_mullo_epi32(a, a);
-            // p >>= 1:
-            p = _mm256_srli_epi32(p, 1);
-        }
+int count(int needle) {
+    int cnt = 0;
+    for (int i = 0; i < N; i++)
+        cnt += (a[i] == needle);
+    return cnt;
+}
+
+```
+
+```c++
+const reg ones = _mm256_set1_epi32(1);

-        _mm256_store_si256((__m256i*) &results[i], res);
+int count(int needle) {
+    reg x = _mm256_set1_epi32(needle);
+    reg s = _mm256_setzero_si256();
+
+    for (int i = 0; i < N; i += 8) {
+        reg y = _mm256_load_si256( (reg*) &a[i] );
+        reg m = _mm256_cmpeq_epi32(x, y);
+        m = _mm256_and_si256(m, ones);
+        s = _mm256_add_epi32(s, m);
     }
+
+    return hsum(s);
 }
+
 ```

-This implementation now works in 0.7 seconds, or 13.5 times faster, and there is still ample room for improvement.
+```c++
+int count(int needle) {
+    reg x = _mm256_set1_epi32(needle);
+    reg s1 = _mm256_setzero_si256();
+    reg s2 = _mm256_setzero_si256();
+
+    for (int i = 0; i < N; i += 16) {
+        reg y1 = _mm256_load_si256( (reg*) &a[i] );
+        reg y2 = _mm256_load_si256( (reg*) &a[i + 8] );
+        reg m1 = _mm256_cmpeq_epi32(x, y1);
+        reg m2 = _mm256_cmpeq_epi32(x, y2);
+        s1 = _mm256_add_epi32(s1, m1);
+        s2 = _mm256_add_epi32(s2, m2);
+    }
+
+    s1 = _mm256_add_epi32(s1, s2);

-
+    return -hsum(s1);
+}
+
+```
+
+
diff --git a/content/english/hpc/simd/moving.md b/content/english/hpc/simd/moving.md
index 69162c65..51502be9 100644
--- a/content/english/hpc/simd/moving.md
+++ b/content/english/hpc/simd/moving.md
@@ -49,14 +49,51 @@ For allocating an array dynamically, we can use `std::aligned_alloc` which takes

 On most modern architectures, the `loadu` / `storeu` intrinsics should be equally as fast as `load` / `store` given that in both cases the blocks only intersect one cache line. The advantage of the latter is that they can act as free assertions that all reads and writes are aligned. It is worth noting that the GCC vector extensions always assume aligned memory reads and writes. Memory alignment issues is also one of the reasons why compilers can't always autovectorize efficiently.

-
+```nasm
+mov eax, 42
+vmovd xmm0, eax
+vpbroadcastd ymm0, xmm0
+```
+
+You can [broadcast](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=6331,5160,588&techs=AVX,AVX2&text=broadcast) a single value to a vector from a register or a memory location.
+
+### Non-Blocked Reads
+
+Since AVX2, you can use "gather" instructions that load data non-sequentially using arbitrary array indices. These don't work 8 times faster though and are usually limited by memory rather than CPU, but they are still helpful for stuff like sparse linear algebra.
+
+![](../img/gather-scatter.png)
+
+AVX512 has similar "scatter" instructions that write data non-sequentially, using either indices or [a mask](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=compress&expand=4754,4479&techs=AVX_512). You can very efficiently "filter" an array this way using a predicate.
+
+```c++
+int a[N], q[Q];
+
+for (int i = 0; i < Q; i++)
+    checksum += a[q[i]];
+```
+
+```c++
+reg s = _mm256_setzero_si256();
+
+for (int i = 0; i < Q; i += 8) {
+    reg idx = _mm256_load_si256( (reg*) &q[i] );
+    reg x = _mm256_i32gather_epi32(a, idx, 4);
+    s = _mm256_add_epi32(s, x);
+}
+```
+
+Maybe move it to shuffling anyway?
+
+![](../img/gather.svg)
+
+The last two, gather and scatter, turn SIMD into a proper parallel programming model, where most operations can be executed independently in terms of their memory locations. This is a huge deal: many AVX512-specific algorithms have been developed recently owing to these new instructions, and not just having twice as many SIMD lanes.
diff --git a/content/english/hpc/simd/shuffing.md b/content/english/hpc/simd/shuffing.md index 508a3aeb..ed0f3dd6 100644 --- a/content/english/hpc/simd/shuffing.md +++ b/content/english/hpc/simd/shuffing.md @@ -5,14 +5,166 @@ weight: 6 Masking is the most widely used technique for data manipulation, but there are many other handy SIMD features that we will later use in this chapter: -- You can [broadcast](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=6331,5160,588&techs=AVX,AVX2&text=broadcast) a single value to a vector from a register or a memory location. -- You can [permute](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=permute&techs=AVX,AVX2&expand=6331,5160) data inside a register almost arbitrarily. -- We can create tiny lookup tables with [pshufb](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=pshuf&techs=AVX,AVX2&expand=6331) instruction. This is useful when you have some logic that isn't implemented in SSE, and this operation is so instrumental in some algorithms that [Wojciech Muła](http://0x80.pl/) — the guy who came up with a half of the algorithms described in this chapter — took it as his [Twitter handle](https://twitter.com/pshufb) -- Since AVX2, you can use "gather" instructions that load data non-sequentially using arbitrary array indices. These don't work 8 times faster though and are usually limited by memory rather than CPU, but they are still helpful for stuff like sparse linear algebra. -- AVX512 has similar "scatter" instructions that write data non-sequentially, using either indices or [a mask](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=compress&expand=4754,4479&techs=AVX_512). You can very efficiently "filter" an array this way using a predicate. +### Permutations and Lookup Tables -The last two, gather and scatter, turn SIMD into proper parallel programming model, where most operations can be executed independently in terms of their memory locations. 
This is a huge deal: many AVX512-specific algorithms have been developed recently owning to these new instructions, and not just having twice as many SIMD lanes. +You can [permute](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=permute&techs=AVX,AVX2&expand=6331,5160) data inside a register almost arbitrarily. - - - +```c++ +int filter() { + int k = 0; + + for (int i = 0; i < N; i++) + if (a[i] < P) + b[k++] = a[i]; + + return k; +} +``` + +```c++ + +struct Precalc { + alignas(64) int permutation[256][8]; + + constexpr Precalc() : permutation{} { + for (int m = 0; m < 256; m++) { + int k = 0; + for (int i = 0; i < 8; i++) + if (m >> i & 1) + permutation[m][k++] = i; + } + } +}; + +constexpr Precalc T; + +const reg p = _mm256_set1_epi32(P); + +int filter() { + int k = 0; + + for (int i = 0; i < N; i += 8) { + reg x = _mm256_load_si256( (reg*) &a[i] ); + + reg m = _mm256_cmpgt_epi32(p, x); + int mask = _mm256_movemask_ps((__m256) m); + reg permutation = _mm256_load_si256( (reg*) &T.permutation[mask] ); + + x = _mm256_permutevar8x32_epi32(x, permutation); + _mm256_storeu_si256((reg*) &b[k], x); + + k += __builtin_popcount(mask); + } + + return k; +} + +``` + +![](../img/filter.svg) + +### Shuffles and Popcount + +We can create tiny lookup tables with [pshufb](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=pshuf&techs=AVX,AVX2&expand=6331) instruction. This is useful when you have some logic that isn't implemented in SSE, and this operation is so instrumental in some algorithms that [Wojciech Muła](http://0x80.pl/) — the guy who came up with a half of the algorithms described in this chapter — took it as his [Twitter handle](https://twitter.com/pshufb). 
+ +2 GFLOPS: + +```c++ +int popcnt() { + int res = 0; + for (int i = 0; i < N; i++) + res += __builtin_popcount(a[i]); + return res; +} +``` + +4 GFLOPS: + +```c++ +int popcnt() { + long long *b = (long long*) a; + int res = 0; + for (int i = 0; i < N / 2; i++) + res += __builtin_popcountl(b[i]); + return res; +} +``` + +0.49 GFLOPS (0.66 when switching to 16-bit and unsigned short). + +```c++ +struct Precalc { + alignas(64) char counts[256]; + + constexpr Precalc() : counts{} { + for (int i = 0; i < 256; i++) + counts[i] = __builtin_popcount(i); + } +}; + +constexpr Precalc P; + +int popcnt() { + auto b = (unsigned char*) a; // char is signed by default + int res = 0; + for (int i = 0; i < 4 * N; i++) + res += P.counts[b[i]]; + return res; +} +``` + +```c++ +const reg lookup = _mm256_setr_epi8( + /* 0 */ 0, /* 1 */ 1, /* 2 */ 1, /* 3 */ 2, + /* 4 */ 1, /* 5 */ 2, /* 6 */ 2, /* 7 */ 3, + /* 8 */ 1, /* 9 */ 2, /* a */ 2, /* b */ 3, + /* c */ 2, /* d */ 3, /* e */ 3, /* f */ 4, + + /* 0 */ 0, /* 1 */ 1, /* 2 */ 1, /* 3 */ 2, + /* 4 */ 1, /* 5 */ 2, /* 6 */ 2, /* 7 */ 3, + /* 8 */ 1, /* 9 */ 2, /* a */ 2, /* b */ 3, + /* c */ 2, /* d */ 3, /* e */ 3, /* f */ 4 +); + +const reg low_mask = _mm256_set1_epi8(0x0f); + +const int block_size = (255 / 8) * 8; + +int popcnt() { + int k = 0; + + reg t = _mm256_setzero_si256(); + + for (; k + block_size < N; k += block_size) { + reg s = _mm256_setzero_si256(); + + for (int i = 0; i < block_size; i += 8) { + reg x = _mm256_load_si256( (reg*) &a[k + i] ); + + reg l = _mm256_and_si256(x, low_mask); + reg h = _mm256_and_si256(_mm256_srli_epi16(x, 4), low_mask); + + reg pl = _mm256_shuffle_epi8(lookup, l); + reg ph = _mm256_shuffle_epi8(lookup, h); + + s = _mm256_add_epi8(s, pl); + s = _mm256_add_epi8(s, ph); + } + + t = _mm256_add_epi64(t, _mm256_sad_epu8(s, _mm256_setzero_si256())); + } + + int res = hsum(t); + + while (k < N) + res += __builtin_popcount(a[k++]); + + return res; +} +``` + +Another way is through gather, but that is 
too slow. + +https://github.com/WojciechMula/sse-popcount + +https://arxiv.org/pdf/1611.07612.pdf for the state-of-the-art. From 9120e5aaee9bf96a622135ced59cd3404f8e7da6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 5 Feb 2022 07:32:33 +0300 Subject: [PATCH 102/531] simd primitives timings --- content/english/hpc/simd/masking.md | 20 ++++++++++++++------ content/english/hpc/simd/shuffing.md | 5 ++++- 2 files changed, 18 insertions(+), 7 deletions(-) diff --git a/content/english/hpc/simd/masking.md b/content/english/hpc/simd/masking.md index 05f632db..03b3e65e 100644 --- a/content/english/hpc/simd/masking.md +++ b/content/english/hpc/simd/masking.md @@ -69,8 +69,12 @@ test eax, eax je .L9 ``` +All at around 13 GFLOPS, and the compiler can handle vectorization by itself. Let's move on to more complex examples that can't be auto-vectorized. + ### Searching +4.4: + ```c++ int find(int x) { for (int i = 0; i < N; i++) @@ -80,6 +84,8 @@ int find(int x) { } ``` +19.63, ~5 times faster: + ```c++ int find(int needle) { reg x = _mm256_set1_epi32(needle); @@ -96,6 +102,8 @@ int find(int needle) { } ``` +A slightly faster alternative: + ```c++ int find(int needle) { reg x = _mm256_set1_epi32(needle); @@ -115,6 +123,8 @@ int find(int needle) { ### Counting Values +15 GFLOPS: + ```c++ int count(int needle) { int cnt = 0; @@ -122,9 +132,10 @@ int count(int needle) { cnt += (a[i] == needle); return cnt; } - ``` +Also 15 GFLOPS: + ```c++ const reg ones = _mm256_set1_epi32(1); @@ -144,6 +155,8 @@ int count(int needle) { ``` +The trick that the compiler couldn't find is to notice that all ones is minus one. 
So we can use it as the negative count, achieving 22 GFLOPS: + ```c++ int count(int needle) { reg x = _mm256_set1_epi32(needle); @@ -163,9 +176,4 @@ int count(int needle) { return -hsum(s1); } - ``` - - - - diff --git a/content/english/hpc/simd/shuffing.md b/content/english/hpc/simd/shuffing.md index ed0f3dd6..3264ce42 100644 --- a/content/english/hpc/simd/shuffing.md +++ b/content/english/hpc/simd/shuffing.md @@ -21,8 +21,9 @@ int filter() { } ``` -```c++ +6-7x faster: +```c++ struct Precalc { alignas(64) int permutation[256][8]; @@ -113,6 +114,8 @@ int popcnt() { } ``` +7.5-8 GFLOPS: + ```c++ const reg lookup = _mm256_setr_epi8( /* 0 */ 0, /* 1 */ 1, /* 2 */ 1, /* 3 */ 2, From 51e49d3d79f3196775709cab7bb6ab69649b8bd6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 5 Feb 2022 19:29:15 +0300 Subject: [PATCH 103/531] simd intro edits --- content/english/hpc/simd/_index.md | 16 +- content/english/hpc/simd/_pres.md | 401 ----------------------------- 2 files changed, 8 insertions(+), 409 deletions(-) delete mode 100644 content/english/hpc/simd/_pres.md diff --git a/content/english/hpc/simd/_index.md b/content/english/hpc/simd/_index.md index 883fb0ab..988e83e8 100644 --- a/content/english/hpc/simd/_index.md +++ b/content/english/hpc/simd/_index.md @@ -27,22 +27,22 @@ Now, let's add the following magic directive in the very beginning: // ...the rest is the same as before ``` -Compiled and run in the exact same environment, it now finishes in 1.24 seconds. This is almost twice as fast, and we didn't change a single line of code or the optimization level. +When compiled and run in the same environment, it finishes in 1.24 seconds. This is almost twice as fast, and we didn't change a single line of code or the optimization level. -What happened here is we provided a little bit of info about the computer on which this code is supposed to be run. Specifically, we told the compiler that the target CPU supports an extension to x86 instruction set called "AVX2". 
AVX2 is one of the many so-called "SIMD extensions" for x86. These extensions include instructions that operate on special registers capable of holding 128, 256, or even 512 bits of data using the "single instruction, multiple data" (SIMD) approach. Instead of working with a single scalar value, SIMD instructions divide the data in registers into blocks of 8, 16, 32, or 64 bits and perform the same operation on them in parallel, yielding a proportional increase in performance[^power]. +What happened here is we provided a little bit of info about the computer on which this code is supposed to be run. Specifically, we told the compiler that the target CPU supports an extension to the x86 instruction set called "AVX2". AVX2 is one of the many so-called "SIMD extensions" for x86. These extensions include instructions that operate on special registers capable of holding 128, 256, or even 512 bits of data using the "single instruction, multiple data" (SIMD) approach. Instead of working with a single scalar value, SIMD instructions divide the data in registers into blocks of 8, 16, 32, or 64 bits and perform the same operation on them in parallel, yielding a proportional increase in performance[^power]. -[^power]: On some CPUs, especially heavy SIMD instructions consume more energy and thus [require downclocking](https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/) in order to balance off the total power consumption, so the real time speedup is not always proportional. +[^power]: On some CPUs, especially heavy SIMD instructions consume more energy and thus [require downclocking](https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/) to balance off the total power consumption, so the real-time speedup is not always proportional. ![](img/simd.png) -These extensions are relatively new, and their support in CPUs has been implemented gradually while maintaining backwards compatibility[^avx512]. 
Apart from adding more specialized instructions, the most important difference between them is the introduction of progressively wider registers. +These extensions are relatively new, and their support in CPUs has been implemented gradually while maintaining backward compatibility[^avx512]. Apart from adding more specialized instructions, the most important difference between them is the introduction of progressively wider registers. -In particular, AVX2 has instructions for working with 256-bit registers, while by default GCC assumes that nothing past the 128-bit SSE2 is enabled. Hence, after telling the optimizer that it can use instructions that add 8 integers at once instead of 4, the performance was increased twofold. +In particular, AVX2 has instructions for working with 256-bit registers, while by default, GCC assumes that nothing past the 128-bit SSE2 is enabled. Hence, after telling the optimizer that it can use instructions that add 8 integers at once instead of 4, the performance was increased twofold. -[^avx512]: Starting with AVX512, backwards compatibility is no longer maintained: there are many different "flavours" tailored to specific needs such as data compression, encryption or machine learning. +[^avx512]: Starting with AVX512, backward compatibility is no longer maintained: there are many different "flavors" tailored to specific needs such as data compression, encryption, or machine learning. ![](img/intel-extensions.webp) -Compilers often do a good job rewriting simple loops with SIMD instructions, like in the case above. This optimization is called *autovectorization*, and it is the preferred way to use SIMD. +Compilers often do a good job rewriting simple loops with SIMD instructions, like in the case above. This optimization is called [auto-vectorization](auto-vectorization), and it is the preferred way to use SIMD. -The problem is, it only works with certain types of loops, and even then it often yields suboptimal results. 
To understand its limitations, we need to get our hands dirty and explore this technology on a lower level, which is what we will do in this chapter. +The problem is that it only works with certain types of loops, and even then it often yields suboptimal results. To understand its limitations, we need to get our hands dirty and explore this technology on a lower level, which is what we are going to do in this chapter. diff --git a/content/english/hpc/simd/_pres.md b/content/english/hpc/simd/_pres.md deleted file mode 100644 index 3e3b11fe..00000000 --- a/content/english/hpc/simd/_pres.md +++ /dev/null @@ -1,401 +0,0 @@ ---- -title: SIMD Instructions -draft: true ---- - - -## Recall: Superscalar Processors - -* Any instruction execution takes multiple steps -* To hide latency, everything is pipelined -* You can get CPI < 1 if you have more than one of each execution unit -* Performance engineering is basically about avoiding pipeline stalls - -![](https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Superscalarpipeline.svg/2880px-Superscalarpipeline.svg.png =450x) - ---- - -## Single Instruction, Multple Data - -![](https://upload.wikimedia.org/wikipedia/commons/thumb/2/21/SIMD.svg/1200px-SIMD.svg.png =450x) - -Instructions that perform the same operation on multiple data points -(blocks of 128, 256 or 512 bits, also called *vectors*) - ----- - -![](https://i0.wp.com/www.urtech.ca/wp-content/uploads/2017/11/Intel-mmx-sse-sse2-avx-AVX-512.png =500x) - -Backwards-compatible up until AVX-512 - -(x86 specific; ARM and others have similar instruction sets) - ----- - -You can check compatibility during runtime: - -```cpp -cout << __builtin_cpu_supports("sse") << endl; -cout << __builtin_cpu_supports("sse2") << endl; -cout << __builtin_cpu_supports("avx") << endl; -cout << __builtin_cpu_supports("avx2") << endl; -cout << __builtin_cpu_supports("avx512f") << endl; -``` - -...or call `cat /proc/cpuinfo` and see CPU flags along with other info - ---- - -## How to Use 
SIMD - -Converting a program from scalar to vector one is called *vectorization*, -which can be achieved using a combination of: - -* x86 assembly -* **C/C++ intrinsics** -* Vector types -* SIMD libraries -* **Auto-vectorization** - -Later are simpler, former are more flexible - ----- - -### Intel Intrinsics Guide - -![](https://i.imgur.com/ZIzDidV.png =600x) - -Because nobody likes to write assembly - -https://software.intel.com/sites/landingpage/IntrinsicsGuide/ - ----- - -All C++ intrinsics can be included with `x86intrin.h` - -```cpp -#pragma GCC target("avx2") -#pragma GCC optimize("O3") - -#include -#include - -using namespace std; -``` - -You can also drop pragmas and compile with `-O3 -march=native` instead - ---- - -## The A+B Problem - -```cpp -const int n = 1e5; -int a[n], b[n], c[n]; - -for (int t = 0; t < 100000; t++) - for (int i = 0; i < n; i++) - c[i] = a[i] + b[i]; -``` - -Twice as fast (!) if you compile with AVX instruction set -(i. e. add `#pragma GCC target("avx2")` or `-march=native`) - ----- - -## What Actually Happens - -```cpp -double a[100], b[100], c[100]; - -for (int i = 0; i < 100; i += 4) { - // load two 256-bit arrays into their respective registers - __m256d x = _mm256_loadu_pd(&a[i]); - __m256d y = _mm256_loadu_pd(&b[i]); - // - 256 is the block size - // - d stands for "double" - // - pd stands for "packed double" - - // perform addition - __m256d z = _mm256_add_pd(x, y); - // write the result back into memory - _mm256_storeu_pd(&c[i], z); -} - -``` - -(I didn't come up with the op naming, don't blame me) - ----- - -### More examples - -* `_mm_add_epi16`: adds two 16-bit extended packed integers (128/16=8 short ints) -* `_mm256_acos_pd`: computes acos of 256/64=4 doubles -* `_mm256_broadcast_sd`: creates 4 copies of a number in a "normal" register -* `_mm256_ceil_pd`: rounds double up to nearest int -* `_mm256_cmpeq_epi32`: compares 8+8 packed ints and returns a (vector) mask that contains ones for elements that are equal -* 
`_mm256_blendv_ps`: blends elements from either one vector or another according to a mask (vectorized cmov, could be used to replace `if`) - ----- - -### Vector Types - -For some reason, C++ intrinsics have explicit typing, for example on AVX: -* `__m256` means float and only instructions ending with "ps" work -* `__m256d` means double and only instructions ending with "pd" work -* `__m256i` means different integers and only instructions ending with "epi/epu" wor - -You can freely convert between them with C-style casting - ----- - -Also, compiles have their own vector types: - -```cpp -typedef float float8_t __attribute__ (( vector_size (8 * sizeof(float)) )); -float8_t v; -float first_element = v[0]; // you can index them as arrays -float8_t v_squared = v * v; // you can use a subset of normal C operations -float8_t v_doubled = _mm256_movemask_ps(v); // all C++ instrinsics work too -``` - -Note that this is a GCC feature; it will probably be standartized in C++ someday - -https://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Vector-Extensions.html - ---- - -## Data Alignment - -The main disadvantage of SIMD is that you need to get data in vectors first - -(and sometimes preprocessing is not worth the trouble) - - ----- - -![](https://i.imgur.com/TBRhLew.png =600x) - ----- - -![](https://i.imgur.com/WNH9eCc.png =600x) - ----- - -![](https://i.imgur.com/SsDwG6D.png =600x) - ----- - -For arrays, you have two options: - -1. Pad them with neutal elements (e. g. zeros) -2. 
Break loop on last block and proceed normally - -Humans prefer #1, compilers prefer #2 - - ---- - -## Reductions - -* Calculating A+B is easy, because there are no data dependencies -* Calculating array sum is different: you need an accumulator from previous step -* But we can calculate $B$ partial sums $\{i+kB\}$ for each $i - -![](https://lh3.googleusercontent.com/proxy/ovyDHaTtBkntJLFOok2m17fYS0ROX0BBy-x4jG1CsYKInNRZvDMQyG-j-DOpRHR6jhYVvX2mWBLZHi2SoDwWLJ4LhofzScPtkFxko6tlYWcFyBttn7gIy0BiWWlvkIcl6BZbRBjCR5_wdniz6sIKTr1rpN7M_whxvd0IrUGpXGwI7PwKxwLslF_h9Zv8gbstlV--dyc) - - -This trick works with any other commutative operator - - ----- - -Explicitly using C++ intrinsics: - -```cpp -int sum(int a[], int n) { - int res = 0; - - // we will store 8 partial sums here - __m256i x = _mm256_setzero_si256(); - for (int i = 0; i + 8 < n; i += 8) { - __m256i y = _mm256_loadu_si256((__m256i*) &a[i]); - // add all 8 new numbers at once to their partial sums - x = _mm256_add_epi32(x, y); - } - - // sum 8 elements in our vector ("horizontal sum") - int *b = (int*) &x; - for (int i = 0; i < 8; i++) - res += b[i]; - - // add what's left of the array in case n % 8 != 0 - for (int i = (n / 8) * 8; i < n; i++) - res += a[i]; - - return res; -} -``` - -(Don't implement it yourself, compilers are smart enough to vectorize) - ----- - -![](https://www.codeproject.com/KB/cpp/874396/Fig1.jpg) - -Horizontal addition could be implemented a bit faster - ---- - -## Memory Alignment - -There are two ways to read / write a SIMD block from memory: - -* `load` / `store` that segfault when the block doesn't fit a single cache line -* `loadu` / `storeu` that always work but are slower ("u" stands for unaligned) - -When you can enforce aligned reads, always use the first one - ----- - -Assuming that both arrays are initially aligned: - -```cpp -void aplusb_unaligned() { - for (int i = 3; i + 7 < n; i += 8) { - __m256i x = _mm256_loadu_si256((__m256i*) &a[i]); - __m256i y = 
_mm256_loadu_si256((__m256i*) &b[i]); - __m256i z = _mm256_add_epi32(x, y); - _mm256_storeu_si256((__m256i*) &c[i], z); - } -} -``` - -...will be 30% slower than this: - -```cpp -void aplusb_aligned() { - for (int i = 0; i < n; i += 8) { - __m256i x = _mm256_load_si256((__m256i*) &a[i]); - __m256i y = _mm256_load_si256((__m256i*) &b[i]); - __m256i z = _mm256_add_epi32(x, y); - _mm256_store_si256((__m256i*) &c[i], z); - } -} -``` - -In unaligned version, half of reads will be the "bad" ones requesting two cache lines - ----- - -So always ask compiler to align memory for you: - -```cpp -alignas(32) float a[n]; - -for (int i = 0; i < n; i += 8) { - __m256 x = _mm256_load_ps(&a[i]); - // ... -} -``` - -(This is also why compilers can't always auto-vectorize efficiently) - - ---- - -## Loop Unrolling - -Simple loops often have some overhead from iterating: - -```cpp -for (int i = 1; i < n; i++) - a[i] = (i % b[i]); -``` - -It is often benefitial to "unroll" them like this: - -```cpp -int i; -for (i = 1; i < n - 3; i += 4) { - a[i] = (i % b[i]); - a[i + 1] = ((i + 1) % b[i + 1]); - a[i + 2] = ((i + 2) % b[i + 2]); - a[i + 3] = ((i + 3) % b[i + 3]); -} - -for (; i < n; i++) - a[i] = (i % b[i]); -``` - -There are trade-offs to it, and compilers are sometimes wrong -Use `#pragma unroll` and `-unroll-loops` to hint compiler what to do - ---- - -## More on Pipelining - -![](https://uops.info/pipeline.png =300x) - -https://uops.info - ----- - -For example, in Sandy Bridge family there are 6 execution ports: -* Ports 0, 1, 5 are for arithmetic and logic operations (ALU) -* Ports 2, 3 are for memory reads -* Port 4 is for memory write - -You can lookup them up in instruction tables -and see figure out which one is the bottleneck - ---- - -## SIMD + ILP - -* As all instructions, SIMD operations can be pipelined too -* To leverage it, we need to create opportunities for instruction-level parallelism -* A+B is fine, but array sum still has dependency on the previous vector -* Apply 
the same trick: calculate partial sums, but using multiple registers - ----- - -For example, instead of this: - -```cpp -s += a0; -s += a1; -s += a2; -s += a3; -... -``` - -...we split it between accumulators and utilize ILP: - -```cpp -s0 += a0; -s1 += a1; -s0 += a2; -s1 += a3; -... -s = s0 + s1; -``` - ---- - -## Practical Tips - -* Compile to assembly: `g++ -S ...` (or go to godbolt.org) -* See which loops get autovectorized: `g++ -fopt-info-vec-optimized ...` -* Typedefs can be handy: `typedef __m256i reg` -* You can use bitsets to "print" a SIMD register: - -```cpp -template -void print(T var) { - unsigned *val = (unsigned*) &var; - for (int i = 0; i < 4; i++) - cout << bitset<32>(val[i]) << " "; - cout << endl; -} \ No newline at end of file From 3d60c7504e4379633dbf2276cf1b29e67662fff8 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 5 Feb 2022 21:11:28 +0300 Subject: [PATCH 104/531] simd intrinsics edits --- .../english/hpc/simd/auto-vectorization.md | 5 ++ content/english/hpc/simd/intrinsics.md | 65 +++++++------------ content/english/hpc/simd/moving.md | 19 ++++++ 3 files changed, 48 insertions(+), 41 deletions(-) diff --git a/content/english/hpc/simd/auto-vectorization.md b/content/english/hpc/simd/auto-vectorization.md index 154244e1..d7cca47c 100644 --- a/content/english/hpc/simd/auto-vectorization.md +++ b/content/english/hpc/simd/auto-vectorization.md @@ -48,3 +48,8 @@ for (int i = 0; i < n; i++) There are [many other ways](https://software.intel.com/sites/default/files/m/4/8/8/2/a/31848-CompilerAutovectorizationGuide.pdf) of hinting compiler what we meant exactly, but in especially complex cases — when inside the loop there are a lot of branches or some functions are called — it is easier to go down to the intrinsics level and write it yourself. `std::assume_aligned`, specifiers. 
This is useful for SIMD instructions that need memory alignment guarantees + + +First of all, it is very useful to check if vectorization happened the way you intended by [compiling it to assembly](/hpc/compilation/stages) and taking a close look at the emitted instructions that start with "v". + +Also, if you specify the `-fopt-info-vec-optimized` flag, then the compiler will directly indicate where auto-vectorization is happening and what SIMD width is being used. If you swap `optimized` for `missed` or `all`, you may also get reasons why it is not happening in other places. diff --git a/content/english/hpc/simd/intrinsics.md b/content/english/hpc/simd/intrinsics.md index 88d2c861..d1147da5 100644 --- a/content/english/hpc/simd/intrinsics.md +++ b/content/english/hpc/simd/intrinsics.md @@ -6,15 +6,15 @@ weight: 1 The most low-level way to use SIMD is to use the assembly vector instructions directly — they aren't different from their scalar equivalents at all — but we are not going to do that. Instead, we will use *intrinsic* functions mapping to these instructions that are available in modern C/C++ compilers. -In this section, we will go through the basics of their syntax, and in the rest of this chapter we will use them extensively to do things that are actually interesting. +In this section, we will go through the basics of their syntax, and in the rest of this chapter, we will use them extensively to do things that are actually interesting. ## Setup To use x86 intrinsics, we need to do a little groundwork. -First, we need to determine which extensions are supported by the hardware. On Linux, you can call `cat /proc/cpuinfo`, and on other platforms you'd better go to [WikiChip](https://en.wikichip.org/wiki/WikiChip) and look it up there using the name of the CPU. In either case, there should be a `flags` section that lists the codes of all supported vector extensions. +First, we need to determine which extensions are supported by the hardware. 
On Linux, you can call `cat /proc/cpuinfo`, and on other platforms, you'd better go to [WikiChip](https://en.wikichip.org/wiki/WikiChip) and look it up there using the name of the CPU. In either case, there should be a `flags` section that lists the codes of all supported vector extensions. -There is also a special [CPUID](https://en.wikipedia.org/wiki/CPUID) assembly instruction that lets you query various information about the CPU, including the support of particular vector extensions. It is primarily used to get such information in runtime in order to avoid distributing a separate binary for each microarchitecture. Its output information is returned very densely in the form of feature masks, so compilers provide built-in methods to make sense of it. Here is an example: +There is also a special [CPUID](https://en.wikipedia.org/wiki/CPUID) assembly instruction that lets you query various information about the CPU, including the support of particular vector extensions. It is primarily used to get such information in runtime and avoid distributing a separate binary for each microarchitecture. Its output information is returned very densely in the form of feature masks, so compilers provide built-in methods to make sense of it. Here is an example: ```c++ #include <iostream> @@ -31,9 +31,9 @@ int main() { } ``` -Second, we need to include a header file that contains the subset of intrinsics we need. Similar to `<bits/stdc++.h>` in GCC, there is `<x86intrin.h>` header that contains all of them, so we will just use that. +Second, we need to include a header file that contains the subset of intrinsics we need. Similar to `<bits/stdc++.h>` in GCC, there is the `<x86intrin.h>` header that contains all of them, so we will just use that. -And last, we need to tell the compiler that the target CPU actually supports these extensions. This can be done either with `#pragma GCC target(...)` [as we did before](../), or with `-march=...` flag in the compiler options. +And last, we need to [tell the compiler](/hpc/compilation/flags) that the target CPU actually supports these extensions. This can be done either with `#pragma GCC target(...)` [as we did before](../), or with the `-march=...` flag in the compiler options. 
If you are compiling and running the code on the same machine, you can set `-march=native` to auto-detect the microarchitecture. +And last, we need to [tell the compiler](/hpc/compilation/flags) that the target CPU actually supports these extensions. This can be done either with `#pragma GCC target(...)` [as we did before](../), or with the `-march=...` flag in the compiler options. If you are compiling and running the code on the same machine, you can set `-march=native` to auto-detect the microarchitecture. In all further code examples, assume that they begin with these lines: @@ -47,9 +47,9 @@ In all further code examples, assume that they begin with these lines: using namespace std; ``` -We will focus on AVX2 and the previous SIMD extensions in this chapter, which should be available on 95% of all desktop and server computers, although the general principles transfer on AVX512, Arm Neon and other SIMD architectures just as well. +We will focus on AVX2 and the previous SIMD extensions in this chapter, which should be available on 95% of all desktop and server computers, although the general principles transfer on AVX512, Arm Neon, and other SIMD architectures just as well. -## SIMD Registers +### SIMD Registers The most notable distinction between SIMD extensions is the support for wider registers: @@ -69,7 +69,7 @@ C/C++ compilers implement special *vector types* that refer to the data stored i Registers themselves can hold data of any kind: these types are only used for type checking. To convert a variable to another type, you can do it the same way you would convert any other type, and it won't cost you anything. -## SIMD Intrinsics +### SIMD Intrinsics *Intrinsics* are just C-style functions that do something with these vector data types, usually by simply calling the associated assembly instruction. 
@@ -95,17 +95,14 @@ for (int i = 0; i < 100; i += 4) { The main challenge of using SIMD is getting the data into contiguous fixed-sized blocks suitable for loading into registers. In the code above, we may in general have a problem if the length of the array is not divisible by the block size. There are two common solutions to this: -1. We can "overshoot" by iterating over the last incomplete segment either way. To make sure sure we don't segfault by trying to read from or write to a memory region we don't own, we need to pad the arrays to the nearest block size (typically with some "neutral" element, e. g. zero). +1. We can "overshoot" by iterating over the last incomplete segment either way. To make sure we don't segfault by trying to read from or write to a memory region we don't own, we need to pad the arrays to the nearest block size (typically with some "neutral" element, e. g. zero). 2. Make one iteration less and write a little loop in the end that calculates the remainder normally (with scalar operations). -Humans prefer #1, because it is simpler and results in less code. Compilers prefer #2, because they don't really have another legal option. +Humans prefer #1 because it is simpler and results in less code, and compilers prefer #2 because they don't really have another legal option. ### Instruction References -are all generated by cats walking on keyboards. -If I'm wrong, explain this: punpcklqdq - -Most SIMD intrinsics follow a naming convention similar to `_mm<size>_<action>_<type>`, and are relatively self-explanatory once you get used to the assembly naming conventions. +Most SIMD intrinsics follow a naming convention similar to `_mm<size>_<action>_<type>` and correspond to a single similar-looking assembly instruction. 
Their become relatively self-explanatory once you get used to the assembly naming conventions, although sometimes it does seem like their names were generated by cats walking on keyboards (explain this: [punpcklqdq](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=3037,3009,4870,4870,4872,4875,833,879,874,849,848,6715,4845,6046,3853,288,6570,6527,6527,90,7307,6385,5993,2692,6946,6949,5456,6938,5456,1021,3007,514,518,4875,7253,7183,3892,5135,5260,5259,6385,3915,4027,3873,7401&techs=AVX,AVX2&text=punpcklqdq)). Here are a few more examples, just so that you get the gist of it: @@ -120,9 +117,7 @@ As you may have guessed, there is a combinatorially very large number of intrins The Intel reference is useful when you know that a specific instruction exists and just want to look up its name or performance info. When you don't know whether it exists, this [cheat sheet](https://db.in.tum.de/~finis/x86%20intrinsics%20cheat%20sheet%20v1.0.pdf) may do a better job. -### Instruction Selection - -Note that compilers do not necessarily pick the exact instruction that you specify. Similar to the scalar `c = a + b` we [discussed before](/hpc/analyzing-performance/assembly), there is a fused vector addition instruction too, so instead of using 2+1+1=4 instructions per loop cycle, compiler [rewrites the code above](https://godbolt.org/z/dMz8E5Ye8) with blocks of 3 instructions like this: +**Instruction selection.** Note that compilers do not necessarily pick the exact instruction that you specify. 
Similar to the scalar `c = a + b` we [discussed before](/hpc/analyzing-performance/assembly), there is a fused vector addition instruction too, so instead of using 2+1+1=4 instructions per loop cycle, compiler [rewrites the code above](https://godbolt.org/z/dMz8E5Ye8) with blocks of 3 instructions like this: ```nasm vmovapd ymm1, YMMWORD PTR a[rax] @@ -130,17 +125,25 @@ vaddpd ymm0, ymm1, YMMWORD PTR b[rax] vmovapd YMMWORD PTR c[rax], ymm0 ``` -Also, some of the intrinsics are not direct instructions, but short sequences of instructions. One example is the `extract` group of instructions, which are used to get individual elements out of vectors (e. g. `_mm256_extract_epi32(x, 0)` returns the first element out of 8-integer vector); it is quite slow (~5 cycles) to move data between "normal" and SIMD registers in general. +Sometimes, although quite rarely, this compiler interference makes things worse, so it is always a good idea to [check the assembly](/hpc/compilation/stages) and take a closer look at the emitted vector instructions (they usually start with a "v"). + +Also, some of the intrinsics don't map to a single instruction but a short sequence of them (as a convenient shortcut). + + + +### GCC Vector Extensions + +If you feel like the design of C intrinsics is terrible, you are not alone. are all generated by cats walking on keyboards. I've spent hundreds of hours writing SIMD code and reading the Intel Intrinsics Guide, and I still can't remember whether I need to type `_mm256` or `__m256`. Intrinsics are not only hard to use but also neither portable nor maintainable. In good software, you don't want to maintain different procedures for each CPU: you want to implement it just once, in an architecture-agnostic way. One day, compiler engineers from the GNU Project thought the same way and developed a way to define your own vector types that feel more like arrays with some operators overloaded to match the relevant instructions. 
-In GCC, here is how you can define vector of 8 integers packed into a 256-bit (32-byte) register: +In GCC, here is how you can define a vector of 8 integers packed into a 256-bit (32-byte) register: ```c++ typedef int v8si __attribute__ (( vector_size(32) )); @@ -179,23 +182,3 @@ for (int i = 0; i < 100/4; i++) ``` As you can see, vector extensions are much cleaner compared to the nightmare we have with intrinsic functions. But some things that we may want to do are just not expressible with native C++ constructs, so we will still need intrinsics. Luckily, this is not an exclusive choice, because vector types support zero-cost conversion to the `_mm` types and back. We will, however, try to avoid doing so as much as possible and stick to vector extensions when we can. - -## Tips - -First of all, it is very useful to check if vectorization happened the way you intended by [compiling it to assembly](/hpc/analyzing-performance/compilation) and taking a close look at the emitted instructions that start with "v". - -Also, if you specify the `-fopt-info-vec-optimized` flag, then compiler will directly indicate where autovectorization is happening and what SIMD width is being used. If you swap `optimized` for `missed` or `all`, you may also get reasons why it is not happening in other places. - -When using SIMD manually, it helps to print out contents of vector registers for debug purposes. You can do so by converting a vector variable into an array and then into a bitset: - -```c++ -template <typename T> -void print(T var) { - unsigned *val = (unsigned*) &var; - for (int i = 0; i < 4; i++) - cout << bitset<32>(val[i]) << " "; - cout << endl; -} -``` - -In this particular case, it outputs 4 groups of 32 bits of a 128-bit wide vector. 
diff --git a/content/english/hpc/simd/moving.md b/content/english/hpc/simd/moving.md index 51502be9..b5f70d03 100644 --- a/content/english/hpc/simd/moving.md +++ b/content/english/hpc/simd/moving.md @@ -67,6 +67,25 @@ vpbroadcastd ymm0, xmm0 You can [broadcast](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=6331,5160,588&techs=AVX,AVX2&text=broadcast) a single value to a vector from a register or a memory location. +Also, some of the intrinsics are not direct instructions, but short sequences of instructions. One example is the `extract` group of instructions, which are used to get individual elements out of vectors (e. g. `_mm256_extract_epi32(x, 0)` returns the first element out of 8-integer vector); it is quite slow (~5 cycles) to move data between "normal" and SIMD registers in general. + +### Tips + +When using SIMD manually, it helps to print out contents of vector registers for debug purposes. You can do so by converting a vector variable into an array and then into a bitset: + +```c++ +template <typename T> +void print(T var) { + unsigned *val = (unsigned*) &var; + for (int i = 0; i < 4; i++) + cout << bitset<32>(val[i]) << " "; + cout << endl; +} +``` + +In this particular case, it outputs 4 groups of 32 bits of a 128-bit wide vector. + + ### Non-Blocked Reads Since AVX2, you can use "gather" instructions that load data non-sequentially using arbitrary array indices. These don't work 8 times faster though and are usually limited by memory rather than CPU, but they are still helpful for stuff like sparse linear algebra. 
From f1338ddfd6ba85df56c9df79e5584c58334b0062 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 5 Feb 2022 21:14:31 +0300 Subject: [PATCH 105/531] typo --- content/english/hpc/simd/intrinsics.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/simd/intrinsics.md b/content/english/hpc/simd/intrinsics.md index d1147da5..9a0ee437 100644 --- a/content/english/hpc/simd/intrinsics.md +++ b/content/english/hpc/simd/intrinsics.md @@ -102,7 +102,7 @@ Humans prefer #1 because it is simpler and results in less code, and compilers p ### Instruction References -Most SIMD intrinsics follow a naming convention similar to `_mm<size>_<action>_<type>` and correspond to a single similar-looking assembly instruction. Their become relatively self-explanatory once you get used to the assembly naming conventions, although sometimes it does seem like their names were generated by cats walking on keyboards (explain this: [punpcklqdq](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=3037,3009,4870,4870,4872,4875,833,879,874,849,848,6715,4845,6046,3853,288,6570,6527,6527,90,7307,6385,5993,2692,6946,6949,5456,6938,5456,1021,3007,514,518,4875,7253,7183,3892,5135,5260,5259,6385,3915,4027,3873,7401&techs=AVX,AVX2&text=punpcklqdq)). +Most SIMD intrinsics follow a naming convention similar to `_mm<size>_<action>_<type>` and correspond to a single analogously named assembly instruction. They become relatively self-explanatory once you get used to the assembly naming conventions, although sometimes it does seem like their names were generated by cats walking on keyboards (explain this: [punpcklqdq](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=3037,3009,4870,4870,4872,4875,833,879,874,849,848,6715,4845,6046,3853,288,6570,6527,6527,90,7307,6385,5993,2692,6946,6949,5456,6938,5456,1021,3007,514,518,4875,7253,7183,3892,5135,5260,5259,6385,3915,4027,3873,7401&techs=AVX,AVX2&text=punpcklqdq)). 
Here are a few more examples, just so that you get the gist of it: From 116e805db0afbf82dcb093f49de7a590a6d0a00e Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 6 Feb 2022 03:18:29 +0300 Subject: [PATCH 106/531] loading and writing data --- .../english/hpc/simd/img/gather-scatter.png | Bin 31350 -> 66745 bytes content/english/hpc/simd/intrinsics.md | 2 +- content/english/hpc/simd/masking.md | 2 + content/english/hpc/simd/moving.md | 123 ++++++++++++++---- content/english/hpc/simd/shuffing.md | 2 + 5 files changed, 100 insertions(+), 29 deletions(-) diff --git a/content/english/hpc/simd/img/gather-scatter.png b/content/english/hpc/simd/img/gather-scatter.png index 91aae2c7164f8a891d7beacd85c8a69478d6590c..a9f829fed2954e7de0fc72f8d1d7533bc152fffc 100644 GIT binary patch literal 66745 zcmd3Mbx@qm^XDc62p-%mxVy_H5FofL?(Xgm2?;I<8r*`r>mtG3S==2KcfHN~et%ad zbyZjQ*KO54`#k&1^mO-3&vf^v!;}=H(2$9c0RRA+^yg11000~a0C-jS76JB8u5-Hs z?B|V(n6&C!*vI#+X(;TO&{aay735&<>S5$;2C%Squrp(EF>y9Cvv;v{a6N%*7lk!q z{M$&}+04k*%E6vO)ymEc0CKaZVCSWfG&82)Was3hVCNL%<`-n=ppXa2=05Ju000yK z=}#Y3J=2bsJzQ~S7CX*PuH{d_u|LK+-h(Z1m?;?FGOE0J)%n$!O;Z`OCpbJj^egP0q{(o+BaH*5A{%sPf{=-z}pT=(~wLiu({OtyNw1oeEZCqQ9 z{^2iyvZFuF8J-SkxcNjHA@4^y{e+~rTvUM?A(Y=7IsPG@Rwc)y?(H7Fy|F^7M;gvQ zBwTYTq!B{#jUTO_9r)iF*-yr@0Oxw;Ej=GFGLCS{71jk`${EhiUFeYp|HCUtnv-VK zTRoMu&Vyhb`bU>^kgz0+dS-_5AF=q;sX;J_Mztf_9-}4)8I$TNW~{5`ZCPy$awm)% zTJrr8WYFm8!#85D>$K6j5M60F80To9D3s5{|0v06cEOoG^~X0dRvVzt3GE-D7JJFY zR`EsM`j`AkUv5D-1V!?sS_6=p za!CI+`NZ^@ljc9OQbtj8hjiv(IZYkx^=?DmgY=K?NbqpLeWd{|Qq>9pliU$Q^@e2) zh0XpYk)VX^NXmhp(LB3J$JbGSzzT%~sD(HOV#f*9#v`k^%m)SwNAUw3vB zxx>#gJ*{SAF3~5e2}2wEPF$xc+%d`cbo^W&k{YudP$Aufr##a%AUu>^vi{BVRVK?j zX&xfhJ#0xs8T5yUA;;RYT(ws4a+CXqH&v4KEcCnE@3C=dKNMKFz8DI6T3e&+;Bs2X z+FcSwMQ8g_?U@EvPdcyk$NjsyHctB4+d+&`_;D?{Dn&$37aPThNlKQ^ZniIp={fP~ zwsWWWd9!f|Ts0>#1_EgY>!lsKeQZA_NTaq$`pq9Vz>x+SgV)ZudYg}-JoRe1+P(Ty zVHUw}FfZBI{0h~qr>(gnR#*EXD>wy=U)L!N{gyQ$#Dhv^6-0kvgM*HR?Nohi^DOGL 
z=Hb{40t>g7G~2mL{aqd#z)#-!3xOeHA81zv2m(5E<9ow<;+o`U!rWXQHuewJJ{|7Q z5R~FQ50}ew7c(D~RL2?Cf=pcOEMHW03qId_&TCfjraN%iWz9IvtPw~`-c9gi#md85 z^hdv4f1gdMe~HAGqnzG2YRXyV6o8vxH(8xxpAHvVrExLz6PYpk0fhGUZ0GA+@nt3< zA$R;ZW2?9ijyD%u)$aCa6-+Wrw={Vvyw=BW+Fp^Rkwc@^;SDWLGK{ZZrY2^^-a3U# zbIu?L`yx6$U@QIr%#}5L!X?|eP6BT2Ew#BixjE)soe#~~p=Bw4`$^8l;(ACjT=^DG zh>wh>u8E|(D{65RBM;nwU_IP=NeVb!Lo6N|}XlkKD z4YEg|#Xv1^h$og&W>}nLDyk$Urdzs7`)q%F>}7k)fgJQ+#4|t-VR1;&j-Ikpbjj6- zo)pZDHe}07E>=6w47AKClY$TNK;6yG(7z?(E>r~|i*NXOIk+LWC)GP`!fX$fa0k+B zYxbRlf&PyWWBmP3!m^WbchQJ1+Hr}66?SBaIslx zuDTovKg+RMHM7km9O<9&XV^C{I8P+i*e;)JB@^Qc*2d1gW9uFB%4FE}5X)2HIM^rW zfXfSkxV#jK?h*HZ4;tcw8|?&yemO}%7ToMVE?q9fXp$mpGR{ApiT7n_amCq_Za(`D%Q8gAk5Bmsyjz(i+tC~s`~<}RV&bwwCF2(j^vo2fK&F_4KV#~w1LX-&NJiDD zZ`9V_*TMm+wAviN=a($4o)8o(j9u{TyV5btWNaw~@50NQNvLJbTsr?__TAWKyZRd0 z(dhTEm9zuQe(1(i7-Y86(d+5C;Iu_Tw%rrMH6`)zrQ!&WDfM~S_7;^)Tj%RSU&r@! zzcV}9PF%6jdRh^XT~@0jE{5D-FEKlA!H-RWz-@!&kG{+IODDUW4ngS_c+8z3jQA6y z(c!ZtlS`xhi!P@M1(VTIFKT_E2&q`HLjKAWzIJH~IjI2p$z$!iw>B%if^1=|QjK^vB_e}x0cj?hS7;Rqb zPWO>SeX#@3Fo6b?=2&2p&84P$Z?t)5tj@T}ddI;{llK-6^Ygle{pYU<7zTZcG$LMC zP#-6*-80#1163VFQ1yvVKCt#H@ea%S9}4=F)?a*WLuRL2njOI-$nGuJ=F*NArj9;T z)`(eO{18!YX=~gKWm_}@%&%w823rV^UOjG#?U!;~3qP)vN)0UE*2GhcZUh^vB7kJtT!(W(XMHM{my147g&ryHS^IubnP< zG@73?W-K-YIZc#D_erZVj@s>t{}>#$5n|+M(A_ksuJH!Z63){eo&6k1%Apu_O(Bzz_1^6ZmBX}NGXTKSFR;Kb%?>M5l5 zHG)PnSL$VXbNz=SfF9dnN5XD!6_78;0NK`)u+EZ^1!p?pm;fr_Cc*F!ao|4>b-yMn z%L-V1_xg~}s-cE2H?3Rr%y+KLDIgY&&kto?HF>_A*ZDAklz;4HZrc!)fCfh3-~Uay z;fnKaHHqVlB?-F#KT1;KhMo=mMD))!Z?a{7?@HvtW~dJe4YB7wk{D&#Li7_3Fo=?+ z?%PI0mnUaYWS_ex<@;ZYizQlho!|1xGH=F7Fi5jM=tn^#n>^Iso~@o6kV7`HIyo?$ zl$4cl;PdiBD=ZWPCAu0Pa{cB2pkwz$Q9;6hvF3H>Vx{9psHR_#kA{7l1$9K98`b?m zh*q;PnuNNLlC--w?oK$lG`c;viJ>EBrUkBv!pPdUj}+)AA^rvV}} zAZY@R)@Hy|zQ7xdUmcSeKqDP%OyqX~quX*|$}c+DQM%5DT+C12vc$`4Pd8spzygXT z8#i`@M`g7a1L1JK$z|;8v2R&pF1FGfsU)OI`Vl{f|7O#)b+@J4XP@$EH9+p83@b?JPO7*;6McdSZhkt+JYUIJ_$n&p~gOk?Zj{ zFM5mZu>O0#0xb45gf{Iv7;YlaCFDKXy9R>qlV1&;LUz}+`z!QC{TGUAXpfC 
zSMDFA#l%0>72|}#wzj_h1#MA|;D$t~=HLNnYkKk&mJHVwlPaP5{a~^$rTbiZps@eZD#{~Oj*IutP2tvzKdJ}o@=PNn- z**-OkNbfM^P;J~9N8KTl-5Qgb7h$~dB zNJ#?2Oiw|;s-0~^*8vye?;Gy$EFke`QOYs8CCp7qF%C#k7wo_6>sY~F!;qJAx4!xk zbacd-_@ouu=JT3d0XiHa4mW?dt1K#N+e?^Km!ec8eXu5Vb>tgejmHRjwA^3zfXnv1;#cP|30)poigq&7YTg}W)A1S*z zfC0JUF*1q+B@vH^priVq)36su;d1SO;-4tVq zdOyEf=@-d-(edHxBr)qeO`V^0qE-ZYnZ(J{==+LfU+h*fj{Lb)aH~mte5O6I2)?+k zBr`18c#u%!*bXDvDbv&8bLERDaA^I$*p+j8y*@;F!AC2f+-pAs&#Wzw`9_hy^-PcV zu{mUy{4rUAL-rSs1NPLa&hQVzx6mV`9p)9kBQkUUOS*>9mssW9+i7&5|GK|DIbT{< z-9uBkiyePn$jjQqytTe&OCog;Q2O~e-ppFr=0n9)`Ub``4tq0KTLGPb)WzJBV)@N@pE+l)NqA*Ipm{jg2$a6hcsh`6cEf2nxn^U6Hg9;Beg9crqU~)* zRb};}^(9fbz4;sA*F!Rmk6*q&K3{q+(7ef^Z{Nh5PXFTfG*SGDIo@<92OU{(9j=Un ze5codZKHXyxpNQZ5o@8MKuW@q-l;Id@v{xH#gIN zF77i>QHi-(0G}F`HY~?jwl+54In`zYK;LF-{S!;zn!%RR+!!(CG9LUUCp$PN66^Ki z4Mnv$hUhZ!K;gy7=3$qjlhlK?oD5)KK>V}9@y8EmfMfp~k?U2F_?0%^@M_MgW_YZF zRd-Nv-gatHrQqHy%h!m(>f4gv*S!oLaP*A0=zJJtBCDA~vx!PZ^)>E_93~;RE+!4g z`pb3*LBt@0knzr_U>{-}cpUETkb^T8l~>xms8O^lhMOP6hj=+<*2E0>a$aBZGm@&N zkj{~yFI{?neY$+JLBTm0Q?l6fab)2B2xs<$Z9By@!2Vc$;PAn^V=UF;-djamsdwVM zhC8{M_I%?fsa)~lFX-&CvZyw1L|Mp)F5$lH^^SvP2Am7<_6Jgs4Vx4-#=caa@^W(O zd?lu+-ld=)xo-E~++4F+m0Y_6mdVPBM8b%Ik|o>DaPnX{W{k^NnbwhJT~LYs;vkKa z{v%6IM-SJ^Ms7jhS{{6CP$TrG$KA!LL40b(C!6w(@h>FD-uouy^5uS(?w1$jP7@Eb zf_?Qi)kT4!&(v5EDj`;XWpR<8wp zvT|P=@(Ovu*@tu-&E8<*(&wuV#7Bld)b(OF}o>IfVHU*6=f zZU&e57)VLdj8;@;p!Y1&fUYGsjqB$KnXcS@09Lw# zzx+0QOYY4_g$3N180eYdG-oOlQ*Vlht1B@BNwceS#uhJ~oT4iR3eht`%5ZSWT+=}ifBwOhNZpl{Y$LITw>utcAKlP^ z-=bB+C+aO=J;ZS77~2$_W8e?P(X4mJZ8;+5amjpb3T9Ap zukI^!^FMiKd$9Z#qx-h%)aI5J%Z!wZmq$`Cl3vcQ00pRK`%>=r(%X_>m!N>%$7ISj zx?R5e*N2xs-*8`^YDo*Z0z%(Y@>V7fUPZ@ds`b*v==agZkTo#6Qxm!)e*AE{ws_}y zK&LDGf-6?;A2qL0H;!WESfE$%DH2WU-~-lZ5b#jnikoOjr0%i#f?YQmfOGmVwoh;g zH%<=unR|m~>{#LAbz^I%isJdF6?{h zG>P@_fUAPA*IX3t4(V@pwm_Rv2k-b(Xj-?ShoZarmJ(7Oc-U#38vcQ&A$cl#)^e@- z4Tg2NtY+r%L7W#)nA0mcFhyXO4(IKup-#`{K$9}#^KIU~L@IOtM~xJKCKk zU^;$Ax*kM~e`@797D-PrvS%%>f2yKgW}yf4>q%DNU5}rET|w8MqP`5o4dT91NITV5 
zCsWf!!h#(wEc9)H!*hZznJk72ztD<$GsImK2#*~pl8)cd);qj>Z8#RL*yf|D2A-5_ z4wbgpZ{Jx&_xL?X9X8fuf1o?o5{&FLv4}$>2wLEHNLQowBB;D16g#as;q)0Ma=I{h~b~**>-&zOvaoUm>N{fK>)6On(am=bmRlc2fI6_ zkjrVYOY4qfih@7ksUd{9BjNI|N_lxt253Y|3Xd_Avul^aN5KRk`*UW;UnAe|1c1T{ z5J=DwzpDkR3?LF0-x;OmUx&B#D1@P@6RP24Y|$e1WLJ7g@mfi_x?!}waZy8qKkpc7 zH_EMEdX1yJyy3Go5~{^^amyo;!aDo#V^2SG-n9BU!WA&cTe<;RB1A>8^Kw zqT?}XKoza}b_7sr6o0Lf`RsT}1cm`QC*KlP$`XY9#TtcBp`Rh>4_xE+R{*02+4iy? zt#-4T>RZZ%6lzd`D)UY0LxhhueICZ~AMBxk?Xc~rq|vp(;1@)$O(@T>XUpAvo0FfH ze~&`O3ms)q-8Xmg5k;@uRMvK9*|?;PY|>v%mjJFLUVTC2De*pit4sS!0=}3*r;C(n zKNr2{Dtsi3ABm91Ad@E{>YCFo5Q!;1xksYD{|2G7lh%|SoTjy{DH)&5 zryV3vTFKi}E-a&ZWhJAYoWQOfik9DZ!yocgun!8T=IAkOi?$_g#Rv?I7we&sGQ#+| zy$Z^+%ABc7CbJp03dlalgKxtX-{;tGW%THN(5EDrv93(Hb6-hswM5_EbV#Z?Ew5U- zS?WXewnx=x8CkjI8@%8>8{Lq?}DlDR!q(xLmNFU-LcejSxKOuUPNheBQ|zZ-&z z;@5)kE+^!N|5@uyg>vTs$fO zDvsQ5Vn=;u2ky3L4U-M%NzHBE8Z&7*^EXFA7;<$RQ#}vWw3U?9pNK|58T9gUqm|PB z)vQ(Ewwb@qL4A;2s@hxvz(QOdto5(BI4*U(mwwL+OJiKDUFP}iR~*TayGW)G_qDX#B$s12jPqFuk;cvN01o?d zOz=uf;kUP(el0)JbW?H8AUeYV7TlnG)UEE8s*PXwhQOWGmUy^t?E5u8-9%aqFVT&+ zW+3dan1B%MK`V0LJ?wm4F2!?8BkE6)tbiY7Y}(M%>b1E2`^A;=BLY2tHPTQM#@T)x z)mx;P&Sk17jgz+TPo%ATB37&>*gj5Ax8{%irsMWL2U6&} zjidK4SnWlbh9+V7$rVVX20d*H;Jj+;(e7~>Iy|69=5hXs|5~VKtpMu}-St{ngZ`@+ z3Np>s6ci8LfylbPa5wV1cP(PVr=lpNwzwyt$ZGAL<5-qpm^dC8B1XTiBVe@clN^FVp-5Y_QAC zdnMZX)Ge0@-1~G)-QRZHgeT1_H8l2}hG>cUQ2=@?Gnh5w#JcGX%-J4M0IQ<|kE^Yw zfUiDL7M=1JOMYFk zPU2j|H;A>4;#nYpIj7UxwvBpYvx|(l1~-bl#);oNcJh=w;55~OI*6)_HU|T%wa#10 zlw1`Y3*j2w&?(4z>HFg+N5qU*M&;<+k+`aHgF_8^#|P}Af>yG9lgDAN)n6Vt#5WEf zeib)5{GbD7Ua&6WEqPAfzrcsoglpDt1x5IH*s48Qja5)kXG;yiFMhsWk^km#3~}z{ zjrFih9lBmsO^K1T=vZ?SYflADt0p+9@i`#4IxJ6BXKLM(I;;wv<4Dnb2@odNtaZz z29H(ub6`6VqAS!rlbseX$?`A6J1QD02O}t1);l$_cEFyJ2PtIV<1VoC6-(P!rT;aQ zeg#dsC7g1UdZx# zvjwE&J~*qf+)I@aQdjZ*4L}3hCIior{mg2u_x8+y@-GjeHM0UpKg-cS!o((&vv>Uu z@ccg`{K0oHUGbX`UTN2p^ABeFfc`<=YhzFy-QL264g7al&7U7dTfHD^<>=ynbtv;s zufO#EXT17f$zf{xU%_oBA{TxVOKuqQ|F7y({xVol6&s87H7Mx6aC^&O*kEbBNA5y` 
z>7z47|Dg4u=Kmue{(q$ZFV+7`X8!+fYyn30e=-(mKOS>4LUnh(vEK^1QU7V=>^zl^ zst&$+MSQbk?BYBH)BQL88d#-6H=z|`O{~;e?@fzhXFam!`8aA~sfZ_X{fzcsplSG$ zL9r@#ctAc0x>m$6rPX>5P^aT#`CAOYmHJoTKV{~xhG@}u6I%p-aD8SNqIIXOzW7w1 z)GUbiS5c4nunFk4Ds62g2?z-IU*_-Nz*<;DM2A%~;m;D12Io;7%kP)!ADwWqpJik= z21626%$&GcQlEq%Xo|E93}L^APRtDw^5*5qp0>?cWs8>;CVL>KSEH@X`Mm7wgI{#L z$26v6`k7=G|BPtRa5!msL%%*|5*QfpF@zoB*#V6na7)N-Gy|@3S5+o96cW#7M3Q)? zt!_Q~V%zmz2HD^NbHz-a)*BS&Wpb22Ga3_wnerD?gDE_BFt9HBc#f|jjX>;KG19K~ zsp#ZHu)cf`;2}@a%kq-+2Lt6ee-*gG_E+SzB3PlvMtg_8{A{P+Ed_H-w|mcFk$x@~%YO1RrpHSI*|ox@ z$cBttVP`?fmsY)1GKudqk_E#Z-0|xkpBZ?3xg40O$aOVt@NjB2u2e-t(LdF-6?SDqLy1ia)y|-S*JNkXHDs4SrR4?MoKrX7%ZuXrBggevBvNaWBIQGUq zz~a(|x0p$JG_oy?A8&=J9wM-5d4yF8um*DPxZ{T&?{j*p_*pd?nD~ zrQ*=WsC*prnXii=b$o9hXpWG3FZtRK^>A<-EHa-Citc$$5CheKNIsY~q!fjRQo@Ip z=3{Vc9Z#F7%+*}^z&jawXCCqS}Lwze$#2CP40>j&t` z4&0C~XdC#{OxjRw!hGNS;(hyRLXS!p1X{A5{0cS zWw6?oai}W{{bhGQcx5>*I_#D3*_`eu_~AW86yM2XRG|(X_8O8qh>}gN<(rgxawEM! z@ukO`KERgW#PhVcNLraan+BiGO|uehI725?zj@qD*?UnbrVWxZ$nZopUI zay4a8V=5W2^y;&bb)Wu=_6095mGUblrT;b{T6cIb$?**+gev!uWN3RQ01EL`a%t&TPN zh~6$D(H9SS)UYu9!f%(({{8zK>~QC8cN7WxV$TndDC#s~+x{TQkvOR$BO7x%DY46) zt_f$bK+~+O(ws~hK$=%u8&9+d^gD@$LY7Nlmqay1_N(K(vOD&_GM#KtT1A$ zy8V)Eo92h<@9%$$=KTX5+rZ*TA_`f;pd+MWX>8nmMt60pqCU-lPGA0JJBOffcxBkm zqfN45z2+B3F}`3N}0%`Z3lEcIMiQQtFZ>YlfWkhd}doNtf zIqr4&Xm85&0_JVDgU&d>-^Ztt$95<$zqfc%L30`PuH&+ZHop8UirPhnwSy zAySaC!_M7oK1U<`cG|^#*ZJkkYySQ%6yvjt!+yGumpdPgb^|s8r_v`%P&-Al!*0Nn zz&~|o?H}`CHHQbV;>?j$UOS=~;q$i>6BB)9Mbu_eM8SLRpKg5u!70O34DtJ7d*iHG z_K)R3=S#W2ceKNxv7w5~g`zD*--3z=r!`5}$UZb&|J*7#?md`_-s>8w{pO-{yIFt~EG7Nu~+-*zV9cOF^(BvsXNWmD^6g0$nlN6`AW|2-aLITe!=y|_*%^T z;}U(P+D~ta@0=y1TfNAJl#{2Q4f+7w9E@@}=Kg%ixa0Jo|*;HLGV- ztsL(iC|rW!boP_ra8uO2{qp!r->lBa@BDm~eN2glw-t$^+W)vCi2t}FM<;PbHL~~N z`>wvsLAVaWl8qHr{tPwcZ5e?+0UJR&4^7;n{~YD%&sT47W@cuF(}j|%tE&@HBDf?S z9ocKlN35H&-m$)O9gxd-TDLE6DJ4(f=}R&Zp{^lt7<)esr!sq5R?{7}VPS?BiNTdHD%&$;# z3@iPu1+(fkv)j(G+c$0FEiEnS{@3g1zb<0Ft01WpoYch;en2OFKofPabp$C`Axb;X 
zS^D8l&0VSKn0-6He#6XF{vlz8S6FYFH?XKTx~0<%M3Vts6(LbqI@D%`AaI8#05BlX zmU-j8!l||J7=7ie1?#?%?^EA)G!{NWn{QSNt2dxKDc2>%cAkx=jUM#@H@Z*HNy(+R zjV(Kk1u8bTYwL&FY8roqHOVt4zCkvHg%I9K)OtuS0_-JP zemP0!$S&xgZhlOvLpzAewhA(&?x+_!`H>Ed9(nAhyGi|uB{_+$|JdeJ?*Eec=-V*V z=0pCJGJDYM{5Np@tMUIW{Q3Le-&m<}D}$kBIl5~d087$ks}zk}K0l#(gy&?@BLl46 zF1(C3IbS6Gbg!agZUVv%SGzAA4qldh{!_5cf0dRfr^Sq?qM~y9{BSZbFpyncjRlC3 z42p@tfG#%d{y|}*78JZEcevP$0h+%&KLK?8@3t?Gmmj`ymS|M$Knwz&AZORm#Uxoa zFixGeytFjZ#qO9?nL3xzpEnyD8w%PtO&$oOF=CyK5~|&up1;)R9jCy5U2}A<&&G+6_OcjBMgm*$hCo#e}dJmOf{KH;`}S|D1w4?{`tj) z*FjN%W|K3GR1BG2ePpL01t(`Lu|w-mm`!c$?|V(!nVXq$_}#lQGBN@*Ys}GL-1b^> z^D^AP+BZmgU_e(<%KAd@2?;R zUPo1Z&&niSAs7y^a(VgroHnzi?JOrysQiZyfFxaKq>G)AuEoU%@;a)2Cgt0$#{ze< zGe61E3OJQgJXlFdDX^tQ=;zO$k;Ht_ok573>+2u={YC#Ou%x2xJr`G;6q(nzj*buc zikbeCb`C48$>HJ1Fbi>SNso*~ot&K9o2%M)%>nz|m~Zw*$tN;|iy0YF(#9cqczC>q zgHt4EhSb?Wg0n+j+$JhB9i;Et7gG%RZ85I!BQPQiKAI_187`Np9+bPrkJ`85|v!z<% z#>U2RJ|V}Ao9M7_H0@PMNl7xh(Q794Qp34QBj3|bM7m2$fuTmH&9_KMR%*+zWe}e& z9>Uw2A?(FjR8%B#=^bwmcmd%a6PCt~oV0eZjztjgoclgrl}G{Z-KO7t7GK-QFf4GKfwR&<4C9xr{)i2tcz@Ay7k6e{{PEJO5a+O%{ z^`2K1^M$*xLeERuEk?y=JGw@UnkjCz}2jo{epr2h6kyuBgmJ5X@!&&sB7ZuDXU7oGIw{Ua z%^!U3E)0ZHHAPN;JQVY|urNs@lCqDD%j+VkLaH5S4Kh%%hS)gw(DM{L#&j|HfD7w>Vb5>KbWb}olbDmDuN-tZiuRl4(wtkT z*2u9HL!~YX&J%j(O}7&_O2yecv6cmhFl=%1$Y@uOA`|Q0^w42xdhEII51W$7tR>$6-+(VE=^5299nh^)^H{3?v^x}PV+?=2T9~|tzlv*is=&Ady@*%)%gh5b;1Q0 zIB*M!2&Us%NElrr<%<{aN21Gi-55;VZR~uDp!4?dIBsngdGgJM4U;{f5aN>KX`e-> zZ}+^cv2H(&wU(V{`T%6r|jXl zDE(}E%5$<@T*?=fMCh0h?(-g2L<`q5Lf2PKe%cfzALy>WI-K7_@2RCnI(D>=_UoL5 zDUM0-dZUE~ahN@RM9U}=;?s`*1^3FQo`x>*lSpz>9?47!ETOtSse+FcQ(`DwqM0)8 zHo5H$_W|zXhi;Z>=i3#zlWVw`(G?0o>J@zqKoM;!WhnOBX-%T%<7or~-2|3 z2yK7`E-r5Ca;#M*KcZy&JS>;@3N+lp&!X_?BE!Vq-b>-YkPsM+Gqve_2{ z%!8(f5v}UbjJ`(~GW)Kjs;XgJJ(JP1C;TT&5w!!-6&0RR4^}CR^t& zQ_QvzyN8lmbsL%!-#B^g)P_rs;(o!juNU{?3R}21QJvZfMH@ji`395sZ`QMmDfze( zzMd0(h_cXRcKS%x*kr}7R_}x_Z6$yLNuEp6mnb5QJeH(I3o|y%VOt5nRay#`w?QMj z^&(~%QlO!#z!J)Gx}jlZ4Q2;YNuA5v%?XY?@9c|Eg(;r)K$brwaF{UJ^niI>k;}|RF0S;$g{V%4VRR^G 
zTs3+gJ*$_BKhd315|s#T=Ebo02$Uy3d3|(b3v)OCdn;g{2(F7KVwipUUQS4TfQpC% zGBPscQn--y$T*2-pQ>C z!wDQX1|aeeGTWtmezoUnzvqfKc!0$KC~5jC6Nk&@5W{Ov+VRm!fiZ*=xrklM(K)1p zLx26#T`~PITgO7U#baHH;yeEXqH|x75M!OwIammSdF%2GT9be2tXo*k}U~a+l%##7k@JWO{f55y6xba-67|;rnr(~hb=9f;W zXTcgtZ)9fZE9@7Kp*mZ|kQEI`LRHEypb5bRMy+s8Vb838`dwZSks5&(ZG$-=n+9G$D#g~n7Nm^MyUnhXcO z#s62{!VocdS;ndk8@)6ydF!r$PLMHjDTi^q8S|hbj~y5xLl>3lap6kEAQyud8hT38 zq1~espjzwI629?48TIxJ0RvmHv@Lc9w{3L^=MGL>lsiOdtEMlGxHNQ+ze!nue#aC>a}S?rsbhFZAsMUSA!) z3eaKvZmb@eqU9%kocN&G&y|^9avo6!6@95qQ^#9~_c4t=AifE?f0xU7G3c;uw^E+N zuSDYQ3%5+q#LcJMtz7hhgJ2?P>X^8s;OzRNMM-u)VY^oWA)klhrT6zg;$HkE29hV3 zfelr`rCn-ScMtPLi{E5I8ZtGY^r%sY+3N_%oHdo2X{ihK(wqsgv0`3c0!Btgu7|VG zZ&%yf+qO%MRqKSxeTRz;Y#baMbj)lF3`U(_;Q^NiGbIDfqv=9ZH5Q8M75XDEcas04 z^@5!MmCklEfXCxwA zv_e9uF#qoUa=$B-R3y+3e48 zB`Ya*D0hujj2KS)kykbu5^217mDYz+0x`yrb-q64%R8fZ{|Dbk4Rc#1%iadd7NV=Rc3wGcyU(s)*~vm?0|_^%kG= zW(^&=vILnVmy1s?F*+3;XnFE*QZF))Ff7OS)hr`3zDfBQk3$-u9Y-KJb901zUXhCG zs+|wGRAV>W_YpaSx*nT#sUfd2@vJk5-2E3*y4FSd5K=Esu!nxE7~$aMS24DiDKa0$ z(E>(CselPXknR7U*H+;un zy4&z1k>*Y2FDlnc5x4--`W`#!EN3VD2ff3%sM{93m+yVF=W5D=sZ3t)p8Y2?x0PqU zanE;z+Yxo|laMdwljQE+yvZ&e%g`(%@v!~0X2I1vPS_WM&BSnN|`-|F3 zRAs3M1n5epk~yg|s>N9i9-5h(bHcJE=;WeHw%{;qx@y<`N$f<2<>mxf;34RC&>i>+ zE~WJ1`Z^y}pirt0P7(3B{*j(856i764ES@K^QOMcxct_J1us*!oNBP^MrEO1i#t!t zKQO(E7c|qodoe2F@&JUc@%%P5}r)_ZOnBMM-4yRjnVtQVnhh$cT$3q{K;ia%Ki4>(8bfbUo7`cg|&RaFf8Kj|n_*{w6fR42JOsp#0!W{pOD;3ze@6OzVt2GXC1g zJ!*9Up=6ZMFqEs#z7hhE(<1r{t3Z6$vN_n$G!bgF6j)Ek^O>Lq7#2wdo01VQ0P_NUvO6?V`~8;H1~Y ziZyz-bfV}S>!QEwvU;*upIm*ub0X(MCYpXVvUGD}4$##jZ4CIrrns^{)_h~lK-4RX zbB=p!*{d_<{I;grUk3c5boTWQ*%^Z^~`*f)v;>$l24Md zwzvO1syw8)-hNZKuc(EA@Iccp+Cr0B-~ZVXanOphpSa&bl1x%C9eo6282|d}YI{P8 zJTGs1(W43n8D6K}7M49*M~ul+FM;{6iUcGi-K(pafq{WF&#OyIX=X!-RZg3dgM)+p zgM&ng>{kmm6&$b)&ShoiVhp$oBL&Xd%qu9^JcEn4l9Q9GtP#2qj8H34$Hc=!NJBSx zxH;K{fevjLo^N;>9OoUgvYX4OW5F>ms z9Iz^KY`PeXF(xE-k=-5}Uyb+B1l-=YIPM@Jb!!ruf6V*%P?-^ed;I*|s99XZR>G;I zofu2WhWl4CT7Xm?R}5G|$fC709RNBi9cJgM>V&q(fXj`BTy`4Yd|S6# 
zR-7qXm>S_*ur4A{#m3tC5@N%KW(P*_%buT@=<4fh@#3BTP5|TzFAjb<(Z|4*SS^?3 z<*4bP5iu#|8+>RkTpc+F>^v8R!xM7YypZ)_G*5Qio>13)KR;RodJ z%xeBhG2Y$z9x?*ZOV=wzs}GeCx9MRhegEFcy^ie3h>?>M3m+d}TU)z-c$k`p=c@F@ z!`XQQmZbt7+SJw6k?=dNNp-1uc<|-rlasOK<*b~XoQ+qr zIu+xDenUe;hpm(&Q&SgTVFlW2r(pR=w&`NEl3-rw%)aaN^z>;>=2PV5rK`T@I{bSo zDmE|iwTYn&q~}>w#-Spsz1P59^40`NwU6%Vh&KAqbQMcvFRKRMkPk#f4So^*K3$`< z4%;;#=5sdrg_d6^@Th?QGN0`9eQaArYn_zrYe`e${UCBPBH9(V{TDD;V1bpT`Gomd z{pvD{^QvQE0U7%q?fE!C!6EH>4S-i~&p&5vS z{HcpD|3i_A(XpYEcU1X(d&O^N<|+WUcNY&8fw6t(qms^avwrk{2`T~- z(xD&<(%m2^pfpHIDILxE6h{Z799QND|V0t5fpNcwr?^XT#m*~QYk zllonVUnx{oQA<_k|M<~#t=@~~=Oq17Oe+^M!h0`Lp>|UB=-1}b z_O8SpDB_H1X=zbWF|X$q-yd8QbrjhgG=0zW7x;L;9(S?(!}uNgtxgWQH;<{QZ=j;0 z^6>IzW@kUkRM6Z%P6fguDPj@$F0PF=`_(}qbMuY6YY(_SN5{lqmc-!Wgl_r9cpbs-GcQY8#!LoRDLs98@sp|M3zJSu@}2p|ybcSqOOGs3AE}wetUfmo zLlvp7G;3%dMlmoUM(J#~{ciiHz1jCX2!-c%pKAMP&+)F)eQR1^UzrMv8au|X6p0^u zIi5XnuYa-K>BQJh(&=NUZbPb|*oW?n<5RnjcRY4=b7wEPJd-&8C;#g!5v`YFoV`y9 zP%jsdH5Bj3olkU9QdE_7j0EsU|Hj4KDf32gvN8y`Nj@8=nxftDxvcsct+PYw+e44I z)WApY&pFuGe_R}hRC4FOujN{F&R8Nj$L|x|9o)GQ?^B_QsW`_zv9!-lqOiMbY+-HL zFjgPfQQe!SmOkG3IyoWEDQ^f6X%SgG`SVg%X{4OXcX~NCVXQ^*!zrxGw z%=U`Dwf3?Hn^tYZ6=lvxlD3a=$Glebb4VA{cAuTAYO$7BQBc<L7GB;Wg z4@tz?L!anCv5319R5*9<_nue$bym7Jtyy~n{3vADQsJ#N<) zK_3<#z6I>I^-LhA=AY7(-$h2fD&;mbBPHewGVaH{T(@67*`;?m7ox;GKX#1JI1%7F z+VV%$nvc$?e{0O^iX#!rr!XS^?>~vWm!U zlG;i&`C3v=!H$YCSH7*4xwltV)zd5wNKL564#)Oah>DE63~%$kpXh79O{_*Nc}clu zv#=sFlZQq?jy-_1vB;)RQG0GKQ0)oR(*c8i{=R{tt})h8bNuCBg5v>=&+HR4t>*o* zBW51GIy!Zr)vI@D)Jp%Ov$heKcoxkW)CWJ_+mrf}N#oUTi`~uI&fcXFDXSch8`h{! 
z%Pov1+|91dmuw#u7vo(!bi%@OJMKnY z+bp7U{4jb`{PWoDbF+goIJ zmh|+DsXR?-Lc>hFmXygoNhHE99yJIWl&I1lGHmb*FDAg&P4Z)krf-BA7*2%+jDK*#H9n31*&RlJEuFH zg{~)dLF9tcW@Zd8YkI%gRYsKpccppw>%wJ^W1aB0gu8oSw7J&nq@%ip^&nv{Q{(xq zi(h|sf(xsOnn-0%LwOSp7gV)0YR)SX8+Q`*>jn31i0<}3vd$E^kn;Ba7TGk+LSz!i zT$)hsbln!N!5A^|>@N1`i^P*NKTNuikPvgdPTE8}>27GqZAe=1gWE8a*`|V3Ef}qsDX+22a}$u>2K$aDo)(D znl4)+>lGUUR#sNk2h-IR6|vB~d3dU8lYXmhGOI3+`M`7@?<@x zAfLJaR>5Mr;2F)O36~von70?oivP)o)7t1JZT>z!H#p2jpTB)e2ilkW`A#B8yn!Jh zcifKJw6=G9Wk3}2D=cIcJX?~<%*gmzTl=0$BH}h_N_lzA@bIhl_I97(;O4rz>B<>V z5fLnWeAFkPX05F?MLTWjNJ>_(G0d;61^W871iP&c6)t>kQYp2#4}?(Bk00BBYl7RJ z(ENK>3!;6~x<}G+zs}E2hJKh58yOjaGWW5mNtEL)mvfT4xcIGtf&#i%MTUk_VW3a} z+AS(7x(dM zj*`$FJwANc+}b+o=GdI*AF#JGh(OEHn}(BaR8(@x%3`Lb^x~;jac0xB{~l|TV4vyw z!<*aNb-#b3^YZf2T~v7N6M~No2Lw-8jVsX8(a3hj zoj3G`zUy6@s7Oprr82KRddOIFx{VohR3j`boROVP@@>y0f=-pi=7c;ZA}S>0XLpL2 zTxQ3_+?=~fP1h@Uslif9N~phy`2_{F!^6j7=Kmhcv(y3Q=1U7F@(1wctE;OwZrv*X zauO94HMzPPP`HC>Ug^Z}=g%KOo*-Y}n`C5U4h|0LT?KS>bPT2XuNV2;1Io&{LAk?s zsg%fUY-qTqtgNiAq0#BA_D@%BXk5Dn1;^HYtCSt-tWA4_7o&4xN%bOA8j7zqNhf+4ThQ)Nl%`c&s@-FFj!~s(p zAk>846#8kZxtZL(oiUU- z<4JQFHhv~nDT_)_Fj1#D$Z~U1XnA=#ip!G1($W%P<{23oSJ&1oEi9lc)DF)1$$bNt z6{5?6ao4PBPYFH?hyiw${42kH|DOBJIq|i)ID93qCPB4OpBX#>5a0|BL|+Iyuc55}EeNJ}^MDMYlOF_S!ib*KxwGr`Wp^u*PbUr|v}K|!Ibx3_+B@}7r>$L-s< z?|{1J;J}52h2`MvOkAPP?6+vc14A^I!{)t5G6`Sv@`6((B3TSNNZ!7Eds|5iK828w z@G1M@(Gj!zxl3|#@@*v#0ymhZ5DP*>V`Blx*B&VL_nXjezPu8`MAcAcSp2Hbn4Kul z>-^jegOsNMq-dDJE}fm7sXMO~x5sgZhK1qA#l;0^bIK{X9RfpWwXCy}iBd z0R8l+MaJzzS=m$sYgWJU@$vCsjk_RHML>b%+iJjRk6~|n&F(X0J#V@Hi!%5QK7Mo6 z!8omQhG%IV1CQk1-d=(JZz6@q{TBTX0)93%c?Am`{~FZ0!MqG5)M|B*SvrnO2)b`a zM+d9{;q1q3XT9M4#<-wNyqavqLRBjk&;3OXUBtw4_PdypLq!CL!>~3 zLX(n`Qn&3_L=rYXp9wzU_ABqw@9?xLC1#}6)zzbJCu@TvBg7RC?l%SaV<0!&xv4ul z^O~BP!ejLG^*z(kp@a|V8yJ{bTzoDsj}25VxHaAwIa;oLZyZePT{1K@yvN8G3^Xhe zF|lfqA(li09rL081XOrGfB%-w&PNbt;CoQYRe$m9nWxao9c5{lbT_IF$N-t}?e)C` z~LznM8XBQv=^w6=qqXVGvt#gd5vq$VaN9s?8% z6DJ!rom2JRjz@EmsYyY1l;0sWkLh!AG6e(~H7#ulR0K7(kqfiOZ~eW!QGwz)JUN+} 
zo1>aCCdtap)d!CNiJ)8DyLXp9fBqcBtm7q@W;{1P|J`JO0fu}FTU#ms;a|T#h(_V^ z%TRcHezxIm4OvF!w0Iu?f#K?BPlA5Pg8L0OzVsU}t*@t*l;D3976JzS=;#O+AOEG6 zmW+!FU!g(gLuO_ya&mH=NZNoePzSz$XMt9Jfh-KdoUfHkn00yGoyY;liWiJn)u-DM zF!CP(UGzd$Hl)@B3tuLs8@wrhAxL4T0Idee$I{!_aDk(?=gUk5U48w}(a|kXKbAH& z44_aoG&I~HBJxX5e}sX7!NtR)4SlmWx9NRE1m2%KtxiZ1#E*kK7ZX$0m$m***~j1k zOx@Zaf!iup+ImtY5c35Hd{F9%D#xefeX0+g)B*yS~jacDmogAjI879i%=gQpAdE|9d*dZA3W9i{2NoT~Vh!+RqZak(beffe)XF>`34Z@wZ0fXu zmCj*QFsArJaOK-?Xo9uR!p25-ywVX#dc*`rOG~R0Nteq6o#0iDXp-#ab{pf)T%?v2 z#H-cGS_!e5QAx>8p8J`9ATcM4PSb6~R|E%A&^3d(YS!1y{^V-3fiVvdQBpx0 zmbSttCb*hW2f$5wLBS_Vv7nI1NHIml{w@KP_wV1ALJ3)4@9?~dV_|O(5TkC8%AfX2 zSy}YKgnJ^?Z-|#)aopBXY)05n3Pdx(b_L5%_T%l)htN=*t*tF&%gD$Wl%>T&j*p$( zPDUnJpae$Ht37aZb3fg5n~Y`68=dUQv;~sMc0hl@Myqs7)xES~DU1SVJhKg5eYQDk224G24xBkrz z04A;&GlH#*Kx24hWGMg5Rm94Jn026KFal%{JyW%_x8DIsxdSlcjEXQk-QEQrvl)l^6J&}?5t;G zBtFvPQBYbHKgxa0%=FExKK28}8n9BdfGZCT4GruHVa~&F@y2S`lfepm#!CRU8k(C~ z!E6?dV))>3xF6@Zg9i|EWGq;B211$|N^ zZvgpmY;5c(R)e7IVuI@(o^y=VhaW(%V1gYZbDYIW(fkOj1=00VPOcUdvX=ICBsfo} z@oSij2Z{4Q2SY~pHpUuV@Wo)i{dkD=_hr>1pf@3@21RWM`Vb@_6lp9%!WZi5B>VgO zTwGj}laueDvI3^KNP!Wf=HhI_WxM6ksKZPkVwi!>NtpAdkuW|!9?|KMf$_nkM}-bs z+I98ysj7bOO+j#xisjgvk2ME2>w&Y}%`L&d?^~J-B`+fAGSf_#n_3}c2bwHSi@$qrd9EntUOXzq(B;3Lfo&aOO z;0H(D-ZkoL0hq94)29o$3u+2-yM*ak>LY|IoMfhM#e|? 
z?p=Z!4c#>g3~vxAPs!^x;bp-bM+02{eLA{;*;(B*Xu=e3WTd2|F6T!JZzrnb7Z(lF zq~pbHY}oGKzu)uTEJ{I95s-t7MH_udzRi++v>DB{zuOyd2LcXu7;?wR)HJoAz;ZDK z3O*sL{vA?MDcEADBTWqU>PySZ$)%-H9;ld~LqnfRO5QG+aN~uctcd{~UJ`7ZfSn%D z()uSPkl`_EwzChYQ}$l+7J?RUUb%$=;_ay8LOfvA8!4ek$b8**Q5d^xlUx78Ddzm0^`EFE2+-f5_M?EiH`{ug{-9)YQ}zNL-b+zd8zk zgo;Vl06z?WABJ@y1%)ultCd&Sh|rOVt7~NsaT;#(-<^s0_r01h)OCr$s68|^g!Eo{ z`48dYcfr-52f8v~#})w<<-wmnFQKEOuMQXcgK-oK2d5rKhL2gB!&)h;pN)gVYJ2vD zz}fCg1OY*R6a;KI>({DBL_#tPpic)Jr>AS??)8n0je}G--&p+HR8{Ohenyq?R&@T0W#?|lqU5X zXF|>bt_Q@E}TqLeE4< zr>E4^B))%t831TTM#epUepkbpVHkf{&Bys*dg9jpb0zR0AnpBqJNVeziSvC<;5=aD zS@Sv8{MK&_TMv~KL^XQQIr0HN1E@4w?Ge2TK2AG34ov>tI|#0X6A(=dFK&1M(1%WN z`VPJkG8Gvfpr!5Z?_-mYNIrXZ4FVnl84S!Hz`6V5#}7(yBY@@b__K3!a;mDn{{Cn%1yV3G%L(^$!*B#{Y1JAx z{*qBEG=F@?h+;!+$-g-)te*Dc2gm-#L?;-9Yr)DL2=9u9NB4TECl#rb-%%Dw1de8L zfFgo(I`OSQRe1&n2Ny^j3&EyO2D_h9f;`k^60pdk48_>jS6U;rPEf|j+`E@ip#+OO+^ zeG?ESi|H^M^nx3=ZF!6ahSzUt;2f&o|o#hQ`LzkOL3y z-+v!#UQGyaUMKQF!`(-J*$V>ES2=iqq74ilA>&I?k*uPk2oycw#ymYeLqbBH2n*Mh zY8B-vJO|S<4h+EDY9> zrF-9q6-Z!D`~AzS09oVW7$69eT=OCNS~@xqH_rIDCa?oA5}|;l zn~Rs1xMK7TXyZ$Zi<7|YRh}QO93CHU15}I1$e`AV-1k61K|zq?M78VNvGTh>dm%$o zZu# zml+rsN=r*$>*(mLTIH1%6|r%c4kNG+h6hqPIXNhvL~wNHt5Q}=YpXV(R#@5j-?7b* zxHoUy_&GG>s4jK2Z)MK~2E@PI`N?V_tY%Qw4!%zCzlFbTXkwBA&=JO5V`F1r$b^$* zySlqU66FUz6G2~h@3QNcf1?Q(hp835hC)z~+`zy9IVpoc0qyj@{=UAD*jNcoO)_8} z;p3r&CSy5&H06SP2Eo_iV9XwAl^tN`r;_!RllWUrn47{QBXe$Rl(u$qL0g43b#V&r|GWSIs_K9*4uFwPM(}LcN02jBz6H=sBN}7nJ6Nf_%I#ag zTyxl|qV#rk{X9FE==5i-{sjYA0F(FNA_Z!T6s8fYW>R|?Z{DuJ5hh&;i8d_f>vNLMdwVMbI_Bn`u#7T*zd&rEJZ@EUcNYY~Bh)iS&Ns7N}VC^np;mM#sj=_J>V;0|QCBlf-kBLZO?$aTFtCV*ukH zpe_MgVoBR4URhb0fskqJ>kIQHbGjiWCWfrF4qog(oD6)9+t=Bw~OKD;5n=idmXp&|A1B@WU`N=kbF z{1JuCfN2GG4&7AO1fC|0s0|oNo5eezOaNzT(JRII3D!0_{Zif1u!Oicm)$A>>>m8I zb-$u=0pMe}^dL}UoNDBs48nJ|+l!1{&_Ufo{uUKsfk_J{_q<8~h8?>6K>1h@^cDV8b z{-b7KK$tc*PEKn;b;yYcx?SZC4=>NE+wT0a2SXLf{IEsc_pPLL~Vom z*a^SPSFJn={!8N&c{Md+I7y}l*$e{>WT}j{HU(I0?$glRgjNqsGN>_S$t(Gwdv-u6 
zfRver=Jx;?I56*|Wbc*ZLNEZKcrht>afFFbA5f@3DF{AvE#cweIRrwuE{5HtbnVB8 z=TCo zhTV60LZ_gp7@3)+!T1ZORuPFKfhlZ1{AX-yE3fmcByQ3=+a;YEPX ze{Z}FZ5eKUkV7VVOH1Z|4UyTX=Yfl>D~Cx(r>@=(Y=$I^{cutaF8jDi zPM#=N@MAduC<1}g&8K#2!;R4YRDmjh(g{p)aE8&lPW}TDXa>z>&bOhX>O@jD(fl3D zJqZTjc_2Lc`upVo`~ioB=m{7YJwP2oIfkS~&dtuuSPM1N(bLa?!5!l74Uk@yjyo#( zuWOkSTf+O{?@CK!0Pg^4rK6|U2uK&27#Y^srEqa7WW0b;4JjKbY0+K;YX34y*u`5Z zOC+WHOSd6VZLn3=i(U6&3NiwO3hDcgkMOIgI0+^b%Nu}w1kH6aTv_uzoC}Q9a*G!6 z#|!*Ms|XVzY$-r7gb5Q1B6$O?=oBy^r&9(LC1m=9ln=)gIGp@YpL<8y?_bsMLF*eD zdI4Gi6e3u7Z6hOK*f2FUHDREfTwCKWB9DlVA5%TZ$QD z!g*Fp1(cV}PXDQ~OiVdS{?IYg)4hSFF*Y%2X=|H-^alU}XI_1wIv{d1Fm?kO#wwOg zz)OHU15sR{*MtBN>oyOUr#| zk#KCXXLwi~PPuZ9T3m&8(+JWxpaIbE5OEsj>_IB65a7Cy$rtJ^q{(G8e1?X$HZ*+v z$T!hJ0K1wII|Rs=Wivs%qGK^^%sl_S3_&eSQ-dYul|v;^pUv# z4=ZSU{zEI!G8}-ScZ6R@pty8?*K4Duqk8}^33UJD?Cdk3u_YoIs00Muo@oVk8?r%V zK;#dA4~u|?ro74ZyOTwf*%aVXRHPz8)j+z~?cqC516==Aw)>*dQ1}qc0;nC-QYh2@ z5a*zCA~Pe*KL9U5It6Z#0vv*vn3zzS5v(#Y;{1$VziY_`*(o4MzGY?k1_eEYEP(}p zzr)~6^yO8~ya%ShJ(x=$B{+QO>gu9ZC?nYTFZ=@Jb8>PJ%_%#Z4z#_p_41X{ttMi4 z1R+q9`};)zs=yS1bQ;i(K(z@B3?_)iU0w^#pXL|#PIYaB>cmTpitG2b#f$G0CzAa0Qi;Qmwc1I8_WDYjWuH@+V zpd2D8h)~(^_CTjVwbTdJ8eyo+%!1&%x_PJ8K+ywQAMK+sGH(BdjfKIudbS+i#c}E* zz5`4%a9X%7!)Wrp=)?c&i_UXsTi}vQ|vaKrC;&UmUpq zopT_c2x#$iqW>%9%(*!^fxxqWiil`}MN)y2$P$X!|`c(zUun4n+=U_o5=oK@R?`8F+r-5@Y>ymvIZR!PDM92HqLzx9)cPHQ3bsfrjca%Ofb6ZzJ7fP z=5-LLo~x)3fJ1@ghte9D4j>4i^U;8c3N;MLcn~+PT)EQ0jh+wh256KN^lPAsFQ?>T z;iUNg_u`hKIZ6oi0nkbYp+T%+Y}*7ONn<~)#;J-7y1=fSnwo^Z{zG7QbpgX6G)rHf z0=1wN4RTj>^Ja+U<#jwdkR3Hj)$|)7-|OsKxYCXf{aH*_`yN0-EL6sps^rwUsDtI zojVq)I!48>fMw&hUB(4|7myt`Y%*kX=kQPEtp05{S%COD0HXDyy|Slj;{0!mr=1bW z900^sL;=vSANvS1WJE*+3h@OPIOYKn0eAsvuo(~$Vpj&$0CMV$Avql@YoWhcGNP+j zILph2)p33$xpQjo^+Tf)!w_f%t_apnUky z3v4RWGc!6}@0h^`0uvzg*09J(7%TjNLIFGpF+a1hA@ltC4M+eW3N<$^A>@H>CV-Lv z$}E_-z-@H}9&mPcb`x5*^+LQo#>g=P^YMQfd`A!}9_v4o38EIj+@-TU+kyxZqC-FN zrMiqATuo@2*D);V4niD z@r9HW6fRbyQn&!n~)scB_DLru;rVog<7ZCHroMwm+h!Vd{Eh_N8?@C6# z%lW_z>?@wq>6D;>AMGs6EG!tT_ILq>1fUK$1(KFi<*dAUkaHnL 
z2Yd`31W5gx3q4B_i57{VAqX!Zr1YoiK0Xr_4f5lUg+D-g43CWuoE+_$n2-Um0_Elb zu%}Safq_6|z0HGz=HI^qK}-ejAlxH(3!dQGwQH{-N>^WEq5PFSKnzfqb9HrvfI-++ z#9;$T1vL&Rm4?Aojr{6r0l=?_mI%n z>)I>8qB8bgYQFj3o@lF>f>?>K>F^%hLw{`nA3g+V^yq;YJa1ofLJA@d;>!bIfKYa@ z)&Lvkpxxpl;0qi;i$1TOKFsC&-=_@%dk8m!41xp$ehj4I!K2Gn(X0!eZ()KAQFb&R zi%=bp9(hC61!oM*tQ7DYfF#KEqOi@79*7mA;;rTN|78UVD;L*uJv}OzF8~0dYiVgA z9x+fN0yKt2z!ZW2HE7lTwkv-CKKM(gKLutJ5C)9TQ2xOY5c28M`+r8m_5}V|u(^Oo zM}PI@oyh;-r6<#^K+$?cZvbS47#9Go;$Oae88TW2xYE^a1{D}qR{rE5W!dxN$Fs%o zA22zbZ+Z*ngS7$7R)Az+WMYD2f8ejVfq~HkuK;8}jOe0kcPqC3k&@up`3BL4)Lx*~ zAr!%6lv_4W^w(k2!{DQt_Xw~Q=!BujjS?19b=F?BK|vV6tp4igfL!p3i6H{Q<0d91 z`S6N(RzIjhe7pBe^8O!xXg`BnPQW(t9Ucn0L;&!|z-!KCEMCbNSb@oICbH&~idH#S z7qpaxs_(!iBbtA9c7XkaYa}GNS@n@V_%OU+tss68rW#aP@Vfv;0BxNW+-snW+_ur5 zR{sy=`Cy5q$`@EKNS{L<;NamQV`&ulX<(ECUkqVIC3!#06bRU^`-rCc<1HZ=B=B7j zrxo%8p^1qypasJnRTO-D$gP{Ww;E;hTie^AWq3h}0f!3Q!P77{MhbKWINS0|O30g7 zaVP#mc)|fH(7b^WLO>Bn?SRkp_m=y?57;u8dX@pw{`q!f&1(aLX3Ahb_^f)7u^KQE zfJFsrI(Vk8PL{9aQnIt(1uo?hm^q9<33;9w!=^#1vkZtq=r|5=DGlQa7mz2X^PDyJ!Tklz-D?dr@ zf2K8PQeVE@13(1U?MGZ(hw?Jcqkp9u?Y?hj>(ebYYr;AGgcZ2V9nVl4Hcw+5n;3N3 z50}bwsQZW81f=2LX(}#rmNVHam5B)q^=8Bf3NpB!RZVssUSM#LF#m;zri_Uy!(GRv zYn_$8fM`$_qPXF_ewbBt>2>ZlhSitmujpvr;uha8$AEXZD(&cJvqDU z@#bY-Z%?nflz}O;Uy~Jz6HmS!+Rs0fcWCj+>DJ=@`k9O&(t9 z>8VAL?RM^G(tfpdb)N+V4%k|SsATx}vFGQ#oN2Oq$DZgPb7r5MnH%QhiKMI=LXK<`!D(oaVqZKs zb2*29Jy&nceKyYbj;Q=Z+GO;>ei!E#7CV}hKiQ?P$x82cg9{Y=5a3l}(R+2J;N`R& zbvX?^!*qPfdBCfjX#@MQ&Irtjl#&uuUT0^wy}m+}tTtfr_kZ9?hsQlj=XJen){LKM zIrcp^ze;#)|MaTICfW_%+T>J#6`vBV!v@X|`NqaHMo+r(WHS{=c&&WWnw(CreDI0E zb?%q}Fp0xW6Y=Xs0cc`uMC@13&ZtpQ<*0FS@$YUfHIrK<-y`33f>RI_+_o46q3KhT ziw=JZ%6WJk)?2i&NQs+S-5pHZvwwePqH3zT4FBqE4&^@1LxPGhxvcEmos1D!JqN|Oq-KBiXdMB5&t16Z&#_i~vb=5_)56G)eINz?VF0RR4 zvRIzI+1e)V6Hnsa*aZE(wx{(LrVL|l***{Q>(;jQ;cDpb)l(~C5bMihjMvr6;sP%Q z#cft+(PDB9w6#Bc{78i-9Wi;0Omp1RODuoZEAdj-P`8ko9iD15>fu6#dwmD^MlSBv zQKoF}Zl`4xdTf7xIXg#^n)@RnJ<-{B(^pP;wKDKOG8C2l(dYejZ60UXsIKVCO39tP)rY-~!mVF1g1w0&J*KWbagXk3Na^M`K9HF8b( 
z%73csoJKu5s(j5w_TLP@Qp>lFeOFaM9u+k`P5xp((<#{lPknpPbna0HFNejkqW$Z8 zLhlQ29B)LjeY@;5vYgrDq_KS%&#qT`m{7sa^W^Do8lshNmzmg0el`kXrv{=8>6u&&pMfg579-u19;r5jW7hdZ;BFD)MQ;}MWfAL?Wi(zEu}sw zCzNe{HMI5M!4Zn^XLC3PV8C2bR6DeADzqqWu^;~{dBUu*sfCo**}x;|&-U+V`?4pj*IOJVifI~Sefu3}<}MVCp= zO`S$|J$HrClDr;E2{S$>6}l$WfRg+ZRYg{=;GT$_>{s9CVsh7(?p3`&dk`Xx-Qzql zUM+|%{3Iqy_h!4bG={Eu^uhXiRoO=SDB(*UHk;&xDx;x|#?);8u(5@vheeo{jU0JV z^)%u({dl^Bv?kUGZ)YV*C1k_Cagn@yKF%-tnuEPKSxXI-uF#!p zLU%m96P~Ae@`e;Fgmwq0(Y$GEoLHK*O5ekFNL3_{8y~ByoWc~C(lBMt7js2%j(x2! zFK{bx+;?76)y3}1JLgUX9pT=42I@EpBvOjWS@W5SUw{0uF1z{W$>c?~sAB#SC`QhFT_}`wQ{75nLLG|)5iSIM{ zrDiV%9l=@chytCHlG}#3sknDry2fqgP5HM!Q{cudn;AIXC@t?EKm&aIX1DdZXkJC@ z772ZjnY>HRw=7xK1b2=oMyK+ree*1V2iPIhDrFD-UW~|>Jw|0Up&fS^P6QiXGK=mr zy{vm(v7O}=lg1cCfuFfoD5J@fvLjNkU}>%iIfVH{WExLSa)fcWiOt*$`>Lg1yFkve zIA*7QnkmRGC=w%^YQr;~cM$$^`ML6n(VH%cJgOZ54x@D`R#vJk%ycUrG1P}?Hf}8M zlwDShR@K|Pvz>RZ;(9hGaJyEN_x@Uq;FjO^Kb#kS5h&7qZ_p#Xs_nvarHG^~OhUyC zot;umbw}GE(yzvVTRUuU{(oM80V@^q^lyKFJz85Le=8%Zj^duCkXu>!`n8Px74xxs zc_HR?;#uqullFIvCu=P(zVAHG>B~P6#`E2dc|7L%XxOo*wtDqda&P94B}HrlQE4I* zYvCJRWdpM{8yj|kSiw-^d9S9Sp@?((I185RT{x~(k^F?kv--<;6w%*MA~?{OxDD^1 zU%D-me00Jzqr7pij3S!liy}=3O`rP4lAbZWh_R8A&Lbyus#Gf*r>jv>I**ELSC6k5 zpjEjXZubYzp6mSaXSH9~ z%V^`Tyyc2ZP10kPN!VgrSUkUmZQko-FlE@x%U^ZFitTD@oSt*xg=L*;jOym-tu7f;=$j);py9jD3NkGMXk`+N2YEtld_~nuOJ-)Y+H+I8&dPo#v#0OrykQ#d*-IJ$$ zsq;kjOV46A;m?L=!YfWnbZ0caIQ^q9e*S5R$mDm;Fn>B!EJ~3So`;Sdip$YJU1}oc zh5Fj~acw~oruFa$sHs!J`tvy^UH6`dv<z|IxGX=8kWe1wJ;8Fz@lU6(w<)F~8H# zhCa`8_DY*TF>(YlJ((r=8lrDrbD*!DR+ z9wSO_%}5dyIM;OPttSir^KM+$E}>-GzTqH`TIqq}4(px%CpJ4-`czAnB*DJ*tVQg( zb9r~g)^(fwUAL^P>sm5b(~o`yb0x?MguK^$I%uxdlaZ@DwEYvq#hM~TJ8{ThV5jvb zMjy?@pG-#?f%7eblC$0k&akN)x_M6K&k(ny7 zZsA-L*Z5qYh^Ez;5@a&ugkHYW&6(xPkX7&7;@itBVg^R3V?!#;Q%3GFxO3q-*4)9p+$afavqE18sppKPMeswuvdGG3Pd?Wr zJ4r!nk-Z{ddsymCyyQDKB6>GzGoxNiM!cZn<>7oppY1#Oa6G9w-UMxKLL!_vQ}Vqw zA(oxQw=dcrErhykgqx@z=52KS%w(6q2_$XtU>@J@r)ju9J8_J?Qf5uRLppMzB|$u?7YJ+jrU@a 
zXw^p|T)8u*t2mT=<1jb35=)$IeX*ak=>@yx!0kd1outpO!gPPC3=YUC7}@o=a_< zwm2>SBTiiBxcs}-#7O*$N=dPu*N;eyAAWdD^(yU~klS(a?HfZMslLeJPWD*xiOmN# zt{iW(5)Z2NMa4JcO&MOFjh}8zz&UE5iINfw68`2hw<6>Ib-4IC6HikcBg46nT=JoW`nm!=E-+s$-LajN`d<`|;%I-TXIxENzk20Q1l9+ z#I*|E*3N5DuL}$+sYG>-WYigSPQ==`MkC={){b_4=lgU*l-Cfi8PB%%78N`}Epe#V z>nV3%qojQOCN6pPX3Ncug@SOZ!gJj##_~JCV`@YRfmoWC_0}Ryq zhqBEBgP-`Q`HkLG93HNbwyz2LiS{^~-#>Ncsk-cOxrnOOZ$+p;UZg-?#;901J+^zW z`EwR?W*jxO`-{}q_RnZ?5{r_Pd_Chv4vZY0ICW_LnoJg#N+>3kVb816KnrFUBc_de z%l1}4v3<`tWVD^4&N_Z^+WhL)0h#==VD}O?xrXf0b$+ZNd!0LHJHXr)N32CxH+L?I$K}S9DQ#J_m06NdEaocQ)R{tjiAUP}&@a9*@aTMQeDKO^p%u5QMkhO1=RCyqC;Ve& znbP%5;lTs(Pvaw>>x#<_nYhIp&3=!`@Vv5k=eg#tqaBqV>*>fdP3P9D??OU~OP5es zwOax;G!AzRB#%3_>(%1(3)wCUml`_w$)TG`rGAmyI8bFWp|fqB5#U&O8AKp_Xr>#v zty|~!=M1ICxJ4jHy|Oi|%ZZc2ZZ}P%;?8h5DcjvWe{7u5Q(|luYhG7cC9_}O50`5; zqhdyw^n4zzJ2tu$(0+0vz3T8I8nqKQwUA%#V_^=rzR2U;5^@yBp?jfsL*(Sg7n)w* zOMfnsnLykUcDV@Usub^^g<_QOP$Bxr4MpZu*4Jj{^}FM?H|?RxRr zOtaSZc^Z>{JA0!L6Dz6xiCgOG*o(m6domc>A766}E51AbQ0&v=LMpC7%(m@0ZeRSCBn5=x%trUtfcHJieYCpHq2u`jdfNAg0pDZMU((V5_hVSdfMrdO zR;*cD{L5dx9d+TI{vmgStKeR_@HFLTvr0~Bvc8?YpR30Q#atiTqF*gL@rGbW2GpsG zpLtzYG4y*b{=8|J@!G`WMQK&VWt6@*-EDfxrIqazO7X3;@|t^zBFDnqBRF>}2Y>ks zi7cuA>g_J|<<0dIR?^IwH%anE*WBkxzGW^wtm}B|Rzc&0T%GUgx4@**qRO=9V>6Q% zhTVO!R8m!!f4yikXv`#`KqGI?im|w8&s;ESU1HN1nh=bCFJFxD(tuW-=3z;>e`M6; zlwr`vYxxB_#u@bZ40@mFvz9x)+iSP@+(##nJ6CHzdJrK>^GoiJB#VKbUQle{O>AFJ zB`NhSHpQ)xkDniJJXXhf#-fNGSJhN_HjS^>DQP6ZRk%!<;IJkous+}`!dUW*uqKw9 z+~xPQAjbWdW2J+fE2{f>A8ntWA51wUL^FMkk4@B9n5dR?H@GbjA#TNjo1vTTQ0Cm` z9yY>2k49+Gd*&tGvScb)XuNbqX*D>n=x5*yJ=&1)PoBc_lFxO5D@YVCf5H#F@pdM7 ziFD(Sxy547cfL|s-aLjBu6j}qh*NbR-MK1flKl*I{s7=2Rr{@HEPO2;3e z_#HvlpS#k%2xz0tn_@`~!re{PWB+BZtz-UMqRPd{m|>yh<3j zZ&rEVjki!;Se$VNP&aJQU8DQa1e3KS$m{Ig>lbBYm>hSo#+PQVTwjPjmc$I}%z52E z^0bAkBt~e=ZYOYx>Fkmf#-_{3R;eH_hW9VUw*tyaS__0c5&aVQy%nEuWpSK!SvRDj z+Bf9FO}O8WlWPz1R!k2pt`!pe5O4idFDN@O?_{65PIYAe(ajg*m-@rhGJ11SEPurqp-2|I2S{dX*G^QmFWVPCHa5FA}Cp5-c^-Xei(!yGHcJ$yxWI3MRfdWtb 
zXUei~6?_NkwneY5^GfR}@!?HQGo;$LU3w_as^XIT?vyCQJ41OOK9iKRl{U&&U81w(bD`ZgYm~Zy{_K3Ja?pPhwPt6 zUC=xg`+YB2;W~%eZepe9he{g(w;oUS4UF`eC|lgo!iH};MoLP~(`znO8ZSC)dzaq6 z=*1cShCweCo*hsz{$5(_71p3y>>E?dbRnUi-G8oNs_|Fsr*AihyX+oIC%nr0}kt66f@`@!~sg^`{QqDSU?w8gUaX z-h!r+?rLq-!S0f|o#u(|ZTs2V9NnuPl0-Z7H}^HToQvOm2;gcOpr2DY{T<4lI8y#D zSfFj*+i%0}8s(1Lth?5^k9sQQZdqo)Tf^5K`Hl^;4S5???O0>Im1oDVzed=X;?nvJ zRoG#bTQhIAqKddU^J971o?}EmEZo-x3v1Rb-zCHRRdkQZ|nEkkP7H*k`sZhgNiBktxc$jn5VG z@&$*d7yc>?>3Ni|zx-&-W0Y`gFenNwQ*toEqi#GRvx@I=`Da1RB6aEuI@lkBg+>yk zgqGBX;RG|c@{oYTyN)MMeEhDifWmz+Y6l{3RbA9Lg*}L2MHGNX@ zhyGETZQJcqRx}h+38f+{B@&84_9!7)Nyy%# zL?|N(Nhv!OLPlAck*tieHwjtU>piY|?)US#??3LI9)7>?_qxt;oab>IolcWsr#`)C zl~}lNWc5S(?Gf9; zeKRJ`e82FPu;(Jl={Aw+`Hvf^H0SJNL=z8du?B01NGBeW=eK$pj6YxY&sp|mnc|%3 z6qgZCiV)(;c>m^>ybQ~gXJ7YapM8*Cg9<~QejouY=Fn#TRY-V0XmO>%Z_?yXxm8FP;!Ewm;s ziPhgMFXzMXPFDt{xkp#Ou-t5{?GFBZo|>f}4mM1undGBauPyhHi8=qKs6KM*X!UdW zabIDlbn#mmshcQGy=63DI2|yV;qF5v(cV$~OFf)OUlbK^AFJFoG5P4>;WN={j{@2C z<_B(ltH(S0%<*L0;JW^ovj;N&Jg|IS_egCY``cM#d57;!;%Y}{BjO4YjXH?F^tGp+nsjY)#nPsrxLgzh%WKbjP$q|NS7@VRP=?OaKn87fztyu{-~yW_b3dDFMY z3brvWq#7LGK{r9idU9J(!A~2DoCWLLRI83OuF>kJ7nC~V_OHLJL9ruF&`s~?+qV|)>!Y2||I(Y#-~Hav{QcJQ$L7~x zN9|wE=wmWkbc>{CTq~7t?1<^Z21o70oT`aH)9zHNi_0;cp4A%~J zjc@y_qBpy|$f)Z4F(VRZt68upxY5GdhVO0VDrxUlJNNI@hr*_dD^o7D%-6k3VHI^d z=Kt0v>GpjOJtyxnyXbRsLcc2)_&%&?#vO4E6T9BnW4PAg!i`r41{D(0w7-Y$pc@;u z@@3+R4-P$fknd#1_tn+xGp_Qxsb_uF)dV^xeeRyv_HEOay3{OI+dbEsZycq+a@C-= zQ>=LUm-mD92k+B;7~Uh!!eT{7W1=(hdBJ?M`Rd+cCg~Ex&N}|bYc%^O3-08SE@R7iex-`K@CU-3i%)&=`9$BkQdjOI$}tL+h8H2w0% zR4YuQdY!&yqo+Z)()ZAzo-Mrtp{euwdz2j=dD2~*mi?bs?tR@=DKHke^+vJvjGq2= z#%-1-J4|J|7d@>reu^-BYxi0{BWKIA(4+q+`*Y{|*MBraA_NS?!-Zr`6>KFQ+p`5X zsO#&WPHMZBbN9yO1g|$?A5Kw!_W~tGNplVD*}J#1E9pWnh{+w=7~=BjjzQ*;%?z?3 z?Y9LxTc_7IUTi!6CM=wSA^U!&_-L(ZW$WhTRiZ~Ylj{9vkH6mbChX+2f4_;yw$hx{ zRJXU;zI{;GY`m&5b5Z1Yrd_Pm0&R%Pgy^)huE^A5MjwH z=FTjwjalFH>O%I@JJPkDo%if$vQrBe^*2v4+eh_Pwt2@=h1XP;hUBMNDX!{Y)o#Xb z{n~|9{Y3MVYXRdGe5kE2 zEg`?3e=PugQR#hpvYf%l0DoU_6w-xHt 
z?B{FNQDCpP1R70Ug$n>MZpYop0{UWvf=M$0;xNc%2A>p6wi@iG_AU7k`*neBB4k2kSlGtz zdAymc%}}m?;VMPN%Q@SFZ2|;5c^&IHkJ3t*6vB?VXDo=FwF?d`i-JhF2`ppZliq z=i6J0QeI^G`JIxd|0J0drU9lk*QHg`vJcPaIIJY|IX5oZGM8@CRJ+0RGpM|3{co#! zF`wdfHPN*{Uw#`qF&`!VC|%`D?2<=ac z-q0^~NX=N`CNslW3}3C+mP4^7{IyB#jjycuE+jUF^Q_&e+-q&xRT+MnZSe`E{fBYe zEjQ`(c*|>5m&U`x!gQ{!x%5P5|7t#EWs@U+UVdl&{QfB z>%M`-o&{y&(aJD|+_%g}jSLM7`?@FSODp_5MV^(*lwXs-FlG10%KDw&uOR9pLQj0e zuS|=Kv?gwiqmA^)V4ACawqch>uG7cPanu0xq0H7=wpTaRCO6kE(>-|`B1Z?JnH@Ly4y0F znmcbSx6ZD<$wAfsfM%e^s>y3sr1PMS=b+c@^g?`9O#KorQBVdRJN1;i*q$$5ZtpGC z@40Zp^N2_gDHv9%*WLeNQ|^8`QDoLpcFrrNZfme++33!R4@1$(s(aZvgG{H}DF>Uq zhoC@oRyY+d8n1zH=42Sj&AFbdX z@S?FwX_3v-KCxJO;M;a#`y0)B4SieB^56NYS)aK-Fy&$6wc#gBJ>4X<9@I?^<( zUv+!#^OLspwRI7rpXyj#Wml~7JB@GN7fWUnIkY3W_RU;Yz=~D?51-@CG>(f!R^zme zIlpJlj4C(qm``b(ey~~pgv$KT<^9UaFJDKh(Z)X2y0+_CcF2_*PuH(q&$j!u%hN+= zxRg^y?0E;c+FzL;I;5zmaI{+H#z{*dKbf6ob+!q$s~nSx`zKV@D85NgiJpr!zQ3oa z_JZdEN9dtTPJ4Ys4{z|{RZw^pw_$heDusip!>5C#i?>_Ja{e-~*57WFupHsKn>s?4 zV?Sr;{ghYMLQ41Fs|v}qKHue?_3Wav_&zf=!S-_|9p5I#TO)hy z?RNXQb2lz!n;O$>`L%96y)1K5^)G{=Uov$%uY@Z?eZrV+7zIm$Q2xjj zynV}-<8C4&U*_W3@Uib2OWz#?)sUTq{2c(oFPqxybHmT_bbKZg=`Cx9H%~*1(tj7sqWg66&d{D8J7iHR{yZ zXhl(Gm6pZiMzJYt@chj3?B-juCsJ={jJoA)`4&mPtK-BoIq^taY6AHTMS`wU#!=j`;}aFQ-6rNhvmiBt9J4-#fZ|c8YfUGwtl3aTdh zMP;VV{EVjbx41vueJb!{J! z^{!fT(T+sp*t#Icp~c$*#kcR6um-z47f)jUFdtrE>`|;^UvnpEeOE2*i;&NMY<8wt zjpxjKDK!0%KzDhzw62u8Cnf0pyu%t%?#`ciRKmM{sw^At32OOZ88;~;=OXgu)p&!) 
z?0aEqZRtq$Y{?nzx#QZ;^F4*AG*m5AX1GNrwrTsFVn|$ko3%yql&izF<8l2Vmzvw> zZ(prcD8J9}<7&EL8$CX#MFPE9vvyC20AV>UCl*HqwD^u;;(j_2pXSLWqsjzt+r)2@pQ zNgZnS%b=il;5_h)-Ay`BO*ef{~Y6*5J|^-MRV?DV`W)wNg4&1hdBe|KHiR@O&{ zx>D4uE5dTG`X9Gt$V>+t9sC1_<+;F8>T~2Tc}RJPB}>JtM%TSMCXn{&>$m0~`x-QG zoyae{!v6D6`2`+h%F-hpFTd|fboNv7FD~Ae-g-hg{ujUT0lJ>A7HleOpZ8yN3#Hni z5WGX-Y54t{I%z8k?|twa{Vt!gmZMZ=MRg@v@<)C2@zBNWO~?9L}{}3 zR43-Ax>QJRpkk~_Yfx}CGdsSs`gz}}F#m()ruP?&vtMs;QRQ%S?fotXrcgCHut zJSG|diDBM4%=gW_y>RGi`T;^jQevF&PVH1g`FbG56x@~7!o9PX4HZoFm+ zeUpcKp8R(%JfsN0qI-^ysX0F!l+DF8YbcECJ$~oTD}Juq0_&UI4|;|N>4)1sP%$!I zdFuIIzj5al^W>hnoV3IhI-$BU|d0k|#YKt9q!+SLouGx!fNg z=DNSfaZ_tEmvP>KnP}&5E$5*D*KM&^*_MAAel;CV`0j8pBQB!&?7<&&7C-Yc1b^C; zO?)f&p^4thw)@<*<4OUmT7RV%v%TB3L&@pbs*`8@_jIS;G1(inJK$IT1CgvfGcRp( z(sc7vB)WV`^Hy#+1o?g5miDman*-e=BU+w{?JvNFk$Z9aqP_iz9m_l3HZPqzs_{OnXCANQ z*)t~W@t2pC?S-x>EQp8(+|i>Yxpz)gdih5~76Y`)Yldo0w>lb(1z?>kkSqB|7vctB zBr~3hS$R9{x*MOaP_MeOHe`@Bg5Q?GeJ=x1dd**1K1Ke$_>_MiWO!h>3oeh>MJhXH zJ_bt2PK-mYBeK?r0d}_F$3RYLhNX=HhaGeRLQ00dE9llBghnbxOY)oic*uV7DM+s# zU>|`y5^%QAq76Ut!uj($VAceDwfJ|F$YuJO&|rN3!2?KS=)n+|@ND3}`VL|eOw7!F zbjsknV3!{DzP^y|0M$b>Sv&W3>W(zf;=s&-VjKDGkV}JZC(&Oe2DUXX#lUygFmAAHf~KaheG>PMK@9B3O``*O zR}A#^i8K+{)jVJ9s;=EY`4ZYhAcl-#ZJsDrLQA}49vq1J0Um^(3e+$PNtvu!Zn) zK0}QBMr#%iJ`u4wP14H#_QC2p-lP8G0tlY!SV#EKywWW0p<_*!S<}D?qVeJFzkbM=%agB?lUc*1ubu4jxs&8a847 z;;j?)TWI1-J{PhfEN=WsKlBHS+S+nUdtQ$~Y#HpWet2$`z^(j8mkQ&*Yl^$jARVbd z%g87Lx)OdGXk*`!v`#_Q_d$NXTe_%J5X`(lfdelU{}Y0eEa>)7nt`}f-Al2x_b7LX zS4W6I@c=*8izpR*I>Yng$K8h&eZ;)|`U{tkni>T?FXhDle<>;tK%#bCno@^6g*)hv zS|6>i)xT0~IK30x)*T>>L2vaQnDAY{ei2?gkqU**8<-x^(X3!~J`;6|=&K0T0}}>x z$z8`y81TL%8GE}G$3Sk~7kHjfk5eQPZ2zSQb0lF^Y+jD+ALVVgoW&7nW+`Yc#* zZ*tx|gU)P%N|XXg0SBPIfO`y;C7-kVBVluE{j=r{o_8C6a^OELO1y*#m<533QP^@T z{~|OmU-RdER{bAE5NdT#!bIj1N~7I8@L=(ec>0&JvQAKxH9uNYz(g0(py(cz9=L?6 z$M+>_a&eqI83wi(TpAyi>u;}slsuH!V%%1i|3`#v`OS*IL0EO5figAaji3I1CUU!&{2lJoyVMdj>tPxVwR>JN>(M5rjVyROVk~ zmm}mr{^_K{!~w|65&l}KA7kBX)x(a{!#=PNFA;cs$%E0iWRV zwKjtwxx=2#s@I(6s&eten-gu~KLiB=qk8qji 
zmYocE)wpDCM?OaEDuNX9yVTSu2+9OBOd58#93W5IeQzt?(*TGanS~@vTuVh3$E@poL>jG#{a> zP@i}u6ybr-;LElcrNR96P;~c1P6x#=y!}d1ZgTMiGzTNN2_mO zps1j52Db`CeQ5O7#9W|ncN{wEI@`4oG3fOXL(t5TW1-a5gV+ZxowQpWd7mfZ+8IE; zN5as>K@AvrrupCZv-jP;W5)!dP%1nZv)x3VNWJ*FoFQ?ci#N3!=SCsLXp-8xgb6CXzN#@#p*$(|raR`(mAn3#56$D$Ns{vVf_+M zQdMP$LyQARk_LVzxp;3(S8eI)IOV!{0u&tbtPJZY)ciM@(NN(8UF7pEE8 z2dqKm(9Ma#GyLOEb6F!pvn1+ z$k*V54x8`U-F=1gZxkAPsC;l6n8FXiH$`SZbRp6~j~>xMX@mjBy{IDC4jhP4`m8`@O+#2f6s_c&c>)#sEYbEL`vwl;R&E1DEb^+yrT;23R_OAAvkrYgkWi82h_*ZNHozxx==R~*1hp6(epKZfsF}(ji~)}pUfZ85 z_X~8+aD`goFBAD)s5B6H8l3D>>Wf($?-qPY(fdr~XHZrz=!v|jfBs*q5_u0Y2e81A z#?m?pi?TjTl(Lo|n4p1%aD1aB<7j2LAfi75bPMsxWsHmtg2G=)y>b<@F`6GNJqgnt zvcst9MVDt;QL^*c8WpMKwV^IU+^E!RWt=zDpEZEks5cryr(0c>N&KTXq$$r`)iXS;+dOda|}N7`5Ar zF6@LL7*zcNS)~a77B8LLEC{SJ7+AKhKs#9xCwgDuS!fjisxM331G@9liwdW96FxQn$$opZcYW z*vxwRvlH#8|K57(&kj}N2EUG8E*hxLTnJd;?EduBptJtFcll%g_YW3kT@|-!8P=Qg zvdQSwMU7HSyWJ~@oVxi=M~v1OEZqypS)lFq{FBQdbqlc|hS-!h$iNWMzjW!6j(bjsJ9pQ08Pp<_;Wx!3L+Q_pXbxhA;}XpUdTx}LzA2+vV$f6Amo+QCj@RH zqpFlp|07cP4@m{-k9J&EmZ+gdKxmfpXz70y6XG^x&zDA<_82e1~2!w z4L0yr-E!W-a-qrfz@CejEcfnVR-nrbt)kJM8(E) z)5yi;ijRjO%dkXbqwc|ud^NV7=3fJcLpF-!KjQcNs5>XRtLC8coa=4%lH=P&mj>L$ z0+d1)1Nv4LrZ_d8ulCD%_+3bd$MNE}8;UoL`cyKU%jXyuzQ68eT6}9&lCf`e_)JpX zc&Vmn(nINm(ZxBj$Gf$fHecSktA5|@_AS+ub$_nE3qEfoY%AiqG7_|XU6&|)vted~ zdI|EW&~;0L)eVy1MYQs}L>&3iy8rpz7LI8Cj+mEKJuAnl`q`E(h}UCpZ{Nxj_-)V{ z@ef);L_CWqR|yF{CF-82Ac-}ZMy8q8O?qlXejEwWl|??ZC2Uacf-p571R8|werw(g zkjctLosF|bDsh3R5q+Ag4ybZ=$Fv6Q`&{h3bD+3 zWqfxPHk_{R6*GHzUQf17?!ZkZ1~(Uu*Cwtm3n#VPUh0qamY-M?d$RVc>y35yOx?yM zD-?F*E-=YoT=b63x9PgO)phvt>Co;{mUm9#TDrE*mA5Sp-w<-(m0zP!IR4_^qf6e82 zTalAfLStW*P0s{YhbB{93rgh<;s_NrSKU^?9rPkPxI;}c!#eWw*!K%utzT1QX*T%( z^b2R>x8vsi^s>MDL-YsQRzJuGMSsmH{CI04&lg>mo1!EB%(Q7vJZno6W9E9xt=GSN zQY7lzxqaxxrG@klOONy;3G4hdaOas-A#XRiuoeH`K2uQ{J~ZH`Wy@x<+n$ATRUpa0z9RvBhpTC_dg@vQRg z(Tz{G{kp^UlFPev?!&d5r89t)`*9;w-`~y_Uk9D_KlUYk!)eyuXyD z-XHqYlrMNUUi`6je^rLT{IgqT!hFH2e@2SMj~`h!+rhjzT=S){Q=gl9&kwQY>^6#2 
zPrts;=_dWkKZ04luNFPYJ$!mY@S}uhi~framONUy&uIg6wuny!RrE25I6g+55snrr zQtb_~EI4uE3FJSzp*o<#d5$v}x=@oZzI=3^<2|hQb}!U=pgL8Tc3mDlA}VfIeM#X{ zmc5$@CWC;8L`(}^ytK5mlp?yetcJH*R6_q{vzJVDj=1< zr7*BF&gNLlqx+IuUhW(I;o7=%U7E_tfpR_jw4QB38 za11-*sPMTkaP^C%-pm>}RzYG_~INtV|%_q%k6>#KmXYG#ytwV za{jxv0rpB_?{X}+S3Nmz_Ci+m-TR!Snm#&;{y0OAqptTq1+2V&|0!(zdv;fwUjo}j zC*h0US1&)+Hyo1B&QMznek~c_^ZT>vibMCEp`KM5zAQ2{x8j{b4{a)RQGU}vycaYy zq!DGHiHf8{e1Oo&{%lI)fPGU>Y^XLQo-?Jt@H0?Cjrt*k`yh8T*Aqahot!`4WWI(A zChO+CYL<(y3xupTG|8YCE4Dund6lIm>f8g3Q(s=1ub{$&d7uas0b|g+mr}>8(UF$p zxaEb5k9BtFl;jSyPwM`8yX$?P0LR#H;gpyX+pA+d=U1c z6TykjpZCT&GOmHz7+|Je$fWVw9!w8G;_gaO^!l`S?ckw9Wxbs5a>Ks-oY(SmLhk=*6{|jN$mP( z%c4#*%pEGZqOOaK{4Yt4z~AWILyzknBCAox8kl+qyhzh2(<=TiG=>{T1H@NvH()!|H|Z=qZ~y*=T-Cw>j)7-0 zbIzwn<<++Skw1}K#(0xD^_m{f^QA9PB;;Q;pZ_s*`^rzB0LovC0YT4Adj{oC_j_KU zlT&Lv)Y&N&dN^!;a)fSO*EfhCl3E8kj>RZ@b^%d`fG8VOt>L-D8d?!pu3x6lo3p&= zU$g$EnHiD6$w3DJcw(|ff`-UmKbAJgX7s%Z6z)xO-WO>96PHQ=@z8<`%}Ew-iTyql z28#$F7(&U$!G)vrIAfXuIaxN}oDF@$s-WV6 zj8dYXmeRaC=7B-;B%^M| z1Dn*hc8Jx@T8gpFh%7PLxfii47m8g#C>lotuc?Oppje4O|2a)<2i*-mbgr(?VwoH4%k-Exf)nnEY~ zH?(W=ti5%J_DPvlvYU9;%%zA3M~=xm9{JS+!PoVL7rRI07k?)HlJs?CS$Q_LQst;g zf68~~=@sA4_pE!a-pVbIuoJC(epPS{l=S+q%Tl-m^Ti9~^(Sv($V}y)eBpX~?!csK z0rP=~>vlz5*G;|)a76_>C?1exE%zlh)61m z3Fv~3ea~MemjdQ)LxVNzXi}O?V5;^c-0Y)Iy-ICMx6*}YaswaYZMUu~h&HT_sDvlM zS0Jsz%=e|XAI*JHywkWT)oQ&2t9&qT#-z$?-xp4<7pAjkocK?RWj$>&$!KIj3yD6Y ztgK82t}Z8@%K--ZLe#sjE>G-y{ACNn5r?o7Ow?Zu`G6n-mh^zYJ{0&F00vt&-B*1t z5Z~XZ0mUccwM5S@s{?&M%-*~4aNpYqDM5d9wJy-!dU_%r+Tv>$*ykPD`LMxWFdiQEIY66%K7)vBBd@&suFR|6bw%jaJFXa++O$+sTkzgwT%|Oo?PPO5CL>!Ny6GMM35ij?=Sw%^g zHw=1@dg$GtkS+)zA!xD@7rSN~Pu}=Cs2)-QbRkNzB%}hy2z+AiYGC8>=0V+HFN|HT z{j7=9)orhk{Rcdm2V0zAf;OZw@IF8ln792HiRZ`yvdE4ng5H##g{6a?S-UTN6VS#Wm|NfO_ee(TnP!^t_~z)sHHuxxyxHe$_aLp-Yha#`1zi20q!99H;sJX^pc8KTI#$8R$$yW? z7odvO0DwwnPwq4dmy-F5iE(+?4Hg^m>mFl3hlgi~#`^@YM2xRYA`Z{D@p;&o~}P#mHJiEm5Ea{3uwGBM8s1~1Iq%?6kwXW@4? 
z3cdhn!21%DPjxLZPe{%EmEEjZ6e2y_eUy0dh@#s2}Y9>dVEOv{Q2`)Q4gpYC^rC%EyhPd1*b$@;}JRk~v?(Uerne0s5hObSqU6g_3Mc-b%mVwWZ60mDo--M{! z9os@9{t0-0P9RWPK?$74vX?kBB){Sc6K}yCCB}LPl4qi8?S-D%%Ky7Xof{w&o+ygs z2{mJ4ItGXcd6<$}=rNvUz|G{`3@x^8a`E(OO|nq>g+o&>u{i+#PjUl*BkEx3zCqfy zFUIN6KrI6LatR8A&rjKrnJ$~KaBE#8#mH^kv*&j0z!vTxd*HXf`5!n`upa&bWfc|C zn5bbc_B1>kcN&2mO3mjlTeLUySF7NOaPd zf-$u>!DQ5I(IW{(Qw&~mAZ3gJo_4L4fM-TgmpOUx{otO6!Tz4PhvSn1P(-?b8J`)7kVGJ&UZzRlIr7H2i*cE|$gP|Wh)q%be z8A#xBp`l)w5<)If@6+wo!ppNgP-t`%(#$-edMp}_Y;Y}kM`r1EqgAkJyyf`kr$s|M zGdG+`F<5weC6hY3mITwYnTHDua7Bq@CUhxd&d0R=7ci;uHE?$R=LiY6tFwt}a1p~8 zn#@5k1w;h2Xt?No8Q%&86&c=a_H@U32d3$T5Jybe=v_4;`s*BkXP~1tFecB5>e)H&W zXz+L=hjo*o>V!EsQUAik06rmjIt7@Z$N>WuW`6w{5jzl54?w}t6XrjDoNP2;(gP>D z32Xa3J=Q`AubEX*Xfb+Ro1pT==DXHsjKS=o+}dm-ko&*@2^Ve%7(p!}@IDR>CN|sv`2CY)tE1S7)r_}lgHw9+3r%8j_e?=Zx||^y+%(%x)cEN=ZuYvUSiCK zL4a!u%d~Vpifdd9_~l%fN)0dTPT^{{`Z0pyVn|#iejWC zGwoW@2FT^kk-ug(P1VQRq9Lg{WuA0;|0Wx(yi(8h|3)>Y_Zo~% zkFLXt4ZLTF$0ux$qQ4JeX^M!O>S3&7J^W*9nC!FCDz8ezXgFTr84;&KTqziw6U}1c zY8yeFTxouSiEJ4Fr9gH-h`&~He>=8sP??IYf)qVL;*+NOy*eGbkC4>?dtdxKjPBu` z2TA-xNHti@@sqrZ^&lWcnAG!Np25G{k0IOY4kU^RdI!|*A8yNVOlK8F9PAJ=uIozQ zg<24IpSWR{=#mLHsaVNVKD8nq4s(FN_FklPz~@4<IOCokmMqH#H{@VdO z+Pjzp+YNs_8Y>vK*b8mzqWqV7xej|3iNXE{H8!M(8X6m;U`CmcAjTQ|Enc1lQv%Ew zHba@PZr{~$=mnEyB9tL`ns^nvcy!kxGO15#DhwzAEs~VnfCZ8>P(1IvOJi9?CFso= zARZo!p%#W{Xv#BCy1JhG8;c9sZWk?aMJR| zYS7=p$K2!URw)`%&&Z#AE@yI_kv)hat;x66s27DO`3R^2f>5xLfgU!|VCu3LxJ?B0 zhc*kL=T9mP5#b;VuRk}@2Zz36SpA4_=-4=Sk~bFSr<=(R&Ost4Gh?ML?4@kKQI=qe zg}^fhrCxH)P-Ftu%oD@@I|CLVIrB{6=~ZKX&z1uKwWQRu?{U|SoD=V3K` zK^B)kTha9(tB{xP_HS}O0_|lgGPxo1G)x+SZEc|1U5w!+_P+-dIfKjeQyW_6{9pIV+$3A&4j!mJJ=tYTDEU|qp zaaK~N{d?fwZ>9EH8(w*@(y+)looq3X zR0+Z)#z`pWFiDt@Z%08L#+|?T2s0QmGv&#TFmE88H>BQs)2dGwRaaNza_#)LG8FnM zFE0XB0BI7tioAHjWYHqWC>_GqD60Ai)URY`2dPDbImvu0ngJvrJuLq|k!44_N49XW z9JF)i@=oM@KcH9?fG)6Z2YZ&_SH=f5Rq=L!B6ml6?V)SCl&teNgJzvUpeJvSl-hs} zl=04yR7h^55guCw01$m*qrw%KzwVw3uF>{gNa0cvCs=KJOp#bIZV%Y*8nJl@xtvrin=;Kv$*x**8{_omHp=8`L{k@bF^F;V^9D{1oq6y5k_(C 
z?dy90n`da=QPg;$%Qo}IgkDHj@B9alv8Cl{9s=)EfSKU!+k)zmTLe|yE zBfy{6zUI(u0MJ4LKO7=Mp4?1grj_CoqfOc^7oTQB%@JNHN$Monao`=2T^;;RNneHn z^fLq-DjuF3LGCUxIgUlf(`3;c8%R%%o9&Ohx)AI45?n;UqM)bJM#aGQbIxiu^E||6q_86R!aDZ8kt1> zlP53ZTMrD|o=+i$ue*nwhVOaJe+c_{{A>JEE z<1Da1@;aP#2Tv3w|67gp`{;T}9{-&?qm4NHuL)Dn0Zy~y$fYxIBgW3a^+FGd#B*%x z#EcM*FJqMt4*;N!>{qex{(o`E!^nbhfLd^r;AjFP-EKT<;xUW7vVmrgY`*K#6BrBN zCL@)Y^TpTsz-Al;_6jV(NIMFD3U>Jj^4E%WMk_v<99LFQAj>tgm@Z(+-wEIJ?$6za zHw_SzL%>A#<0U&wPJFffyG{Jw;a?|>Lkx@6*pDA!_jX4qfwa@G+W`bqg_1*BR@N7% zUAS1EH^pVaHp8UJyx6eA5>#_^o=mah+Nm|(aLms={_a#_&nO|&n zEigFZCL=si#)`Vz9)Xqj?^!ohGBjkz1%)+oFu`@Py-02tm0;_cOtUtP0gFc!eck)# zIMr2EOBP1+tFhMlvm;-WWC}cwgTd0Zn{9R_!sCv(V2It_FkZNY8}=v%^W5CWIB-m` zq1}bX%3yY^t9#)a_1|C~V2)Xfsw(kM#HKa6XmpdraU2V#xMDzcNId)3J9r+)S0+#} zPBC7L_?rf@wg#(ad3@#gqML$%d5AMDATMA}*fF2Qf?G+BvAg-|#>6XLXyh;w`eS|2 z{lVY8yi8$HWw-;;Kh-Iw;C{IdExGpx;MQ-9cKsFG$-tY}r z0HD%Elp>`$$P?&yfm0iR7a%-UuNu?7_(fvAUo62evR!w)+ya_o@P z5QND-F@oVZO7m}-x2`JM$kKA5sqIDaO4`FHTBTEV?dA1o#1oQ}?F^2HGlFtdX`0z4 zX=3sg%}R3?E)A)d(15~w3U>vC5AJ?;n;h0`XIiaj@Y28!QfY14v8;k{fHV!t&ZeMp z;wFkH7mM}2sMd@vr2B-47~+otXD}hC0K#+>IwF=81^5^@9ZnlKkz`LC2mo1GMVjr@ z9e6Ga%P4i?^p@m%&PW0tczCnGLdF=WsHlcJ@?W%N+hM_sj3e0CJjn_}w%x6N;|L@^ zEPEAenRexvqDx?t33)F!>j=k8y6xM~!8`m%fF@b}wQG0qLMKuhz*7CsnvS-wa(&l+ z1Vmy&Pi(mHFFNKfU=8Ca^%$Rl>5U-Q@Fw)I-qz-aQBnJcZL{NxvN%sNQR9T&sk8ft z@1vCPCyLur&Zg^RN0Q?4@I;$jz|sgXX*f!75j}@xe*%i%V>IhTGe_z#6U_;&*eCYvzChX~A{v0{38a_4BywdC>h^S?t#2lRP`1$`vW`UJsv0YnnG zzuk4glt%K!MyxrLfl~g`$Sr(4?{UsT2TbJ1 z@&lmwNq{Kd7SQtb1Mhteb z5l5m5F+ZoI{A2yfy@t!AC4=fVzz>v5e+P%CiqFZ6s*+SxQ;Gr@GXR2{ttMp=J`@n* zsVHr35Sj8}&iL$vg}d9*&{d45Rw}N+mh^8@k3_dYic4(r zdbF8c2W&#QNB9Xb*~YFGb}r&2Ofpeop{pe=BnpF7gi1o9bbD^LS2VrHbh&YQDxw&E z2~K>gZHp<%)OeXrm|#+o8rSrpcw{G9(SDpZsRR{X&`GvDmHMo6Yz(bODt7G~&u-TE zxRARv-GbENy+xzI0Lk?evVmJ;Rt4^ShV2x+X3?N^D!5?;#YMTS!2%0neDenwL24f0 z<_swOVZKaG81jre|xjxAKcAX1ms|uJcK9D>IEW z!!7v%cRTU}+|F+N*e*2aelFNEuEw6+^E+}hlU-kN>`3B+HFE6;>sFWQOY~iwh 
z9{M0t^M1*%ur&wiD9U2B`CFLwTo$#rUFfsir8ZRJHl5MWEYlSjfV9_N+s^$=?5({sHiTctNxrwdw1n% zVnE9g_MD|PK0{8EFY@k*bHz4EEJOrkQU;a4QZQEYev?GYZ?@G5p*tHNEDm#T^!%Fk zSwuc>1#5NTBi%D1HI?Z^S}zD2Hh|E!TTt zBZs|y+dauG*^^YQt$g2Dhc;H5t>mk-oDFuqVMSRowcE9^BX0LmwFeZzZpSuXonqm7 zb6BYO$_%^Kw|QbuX(UHxj&i_fm}bQ1c@YA~lj7c91Mp8F8@=)uLL>S6kLJ36 z;LyMS@Th37tRA!m3fP3U4d`B?K^QzNF%<;&Y?o-hXp+5dW(z1ApfrHE(H~! zIZmPXTwT-J`W4ln#=!bH*34E*nW5r0HM}SLvA74f)_eyu>{(&o7%&*hk?J84`oG^2 zeNeU5w7xN@bOdih&#RkqscQZzX3k`+L+F_H3%k8s?~&7GVU!3P)#cF3;Fj1J$hxW<6K4qPvO%vW~iEMoPg z0-^d&$c63InUr&AiNOdqAL;=7L|SI7es;LH;!|TXX9y^cpue9!ecIx$N=DlF$0JRv zQJ)i`I{emYk$@7QSU!HHhitNp8#oE7iI&NsH&Cj|5p?LeYPShgY?;e0Ae@q|ivG$hN`XIr+<%ozKcu5c~;{j&p(-PI$pu+)Tzv*eE_uIC1aIxZiGjK8k|2eKQ-5C{0o@7X4$E$< zh(#t$V80dxet6)|eO>hawPCjw2VOQdn=>aQ#CxxcYJUH;NV#^0;KjE|oBUaiU$U6s z=AfG2{Q9PBmY(6^U=0ZmNte{u{?D7g?fzGfMOH%)9Q1H(Q$V8E<)IVxHxk8}D&%)~SpUUuTUex=6H?YmNzuh9)u^anG_Ya*J-aUEv{><3kG-b^9A7q_QS@0Y$ZV-N zy8(lCcg;0A@xRYfd}OyvWIn)qt1vohT=kjo?_F{VW=57R^J}9X)jS(E(NE`Ba1fVC zYsquerj+qB7%S&pWoSj;VUieC*5~E2D{`FTRbT#$DC0f)zIy%-qphsL2|m&_FF#4! zDDq612~CydEctj-f6nf^Ew5*@zFBBy#OeA+lZHh!XSUhQ2`bX!;)R!c9#tI(DtFD? 
zRQOu;4csZu#VU=XpIA2AZ*yOhD;^`T{6p*(1j#6W#I_3Fg7ozCb7peo+EVdrA>#7 zx8h^~VbH@VivzJxVTot|A`&+~2 z45e1mg#%|AQb*dQuNKlL?>k+_ckT;!%ZwJrtwH5RW@h~e_C09O@xq^fPemfb0vD~( zU`WWbXU-Q=(3l6lctP_CDFtkI%3Zu3m_%cZ1wq&Vkw5gk|KZw%FZab48jRld_$6~5 zvYB*bv%Iy&+2x&p+N02`J|(N^xLLZBF6=*c=|XI6`sIW+6^YA=Z0S=+6LmJP922tj zE>uq}eG~bAm7Qf&RA1QkY5$~3hkzp8B`F~#BhAoAOGzW$AR)+rfG~t~$xtFWqzFiZ zAR#d{Dnoa}v**+MqqbLN~`=j^zz-*xRDJ-(mYw0qfBRvn5e^3qLesfreq ze)zJoiV}1*Kc#;;tH917`Y0pJj#0Uol(jpjrDuUELLmqOJN{!iLMXd{lIOce&+vW7 zik|6(=-wX>!^PI*B?fZkx><;ZKVfRKNd=iai}EAyW4_k{6E*vs74;w2;NzC{Z`GBZ zV_xAXuxL-x{Cef>vFYG~YvEdLZy)lQl$ik!0EBrq7+lPMD@^+Un92?XUz0NZ#4#CdbhtX$Z=B}k6e=C>Kt14t)5j3CUd{X81X<8n{G0`=43`Z|bb_BsgzBcTI;$bkP4tXT=l zJ=lBWY6527l~8cyfSLzR7^vt5NYK+D<_CktEuMqblL_DcsI5ip`{wBN#Q3t-J>_^5 z&X4Q*YCAsw%PU6Qefq;qYmTk`Uqt8|mlrr~F$)7HU(?ul$U=G@aszDQ`w*pnTOcq{{G~2VrS{S-9S`vluPhTX>I^BuI?_L0 z)0_Ku-#0{jfj};oqN=o(Vvs8vEw#2sR+)LCjG*(JJ4q1c!t86E-fqejhPU@%X zLyajwr{zst)mZQ?9lkEtN!N&(l=)uI&fy7olXMb^P|eaenk&sP`m}Bhr#qMWNb$g6 zVp~3UG-DVpGBcdUrW;swe|pZRfVLmmUYPsf7E5yKj|#m8N22tJ9pdO%wRMd%m0xye zCaY{LTeAp@7Y6&C#Pa5OZQk(6!aNihCf>H1FnGRatn3O#4VrS{LjKdYt`N8MH0%gL z-ql$SH$tch(xQTPY9hU0D`OzP@#a^VFci~ zm}(a&Al`PNy}&BIj@Q>iM7}hee{(~B*wxm6;SR8rU<2DS#`pznnn}Pw4Rj=d=DxS( z+2!3xJ#OIs#dF&|+!eE}G5TClfmvFb7N!E?V=@G-zONJ2yEt;&Tnz`${QjBZN=Rrq zXyxVX<~1xkABvW=m2O(CKCSKjD6jjyzvrmef9p1<@baQ@-|*}n9fg{<51nSNrS)nd zAuK_G3wJz+b$O)NYkT2+B;GgIm7k`kZyqz8hm1T7!m`o0kCg05Psk0k8tZ_K_268+ z^*>&=#8M}-K(FRd%m!EE*ol8a__&f@BA2!FOn|%mJFaBvZIEbS_YJ`!F~BhFPa8q72++cQskLrW7FcgsqdTQBq@Rn_rqX;xxGp z3y#vlyNSIWXh;48AP}Lu>?`Yky(07wO8I*Q=dXA^!BGj5gTxu?33%`4PM%?v4crkh zrGSOXn*Cj0W-C@~5?{dLmMbCA{`v%}md((MZuqcNO(xG>qf`FITPyz8)kLSC8Utsg z=rUD+hkm6%6nuzbvcJfznC$t6ZAnGziB_5?Y%sIqbaOhfy5lCZG&dv@6rM4j;4iKy zDL{gZV$hKv@pOkRwtXGI6fw0Z&@gny1=x_?b|*)^CO{XNFvI&_VQLu|4gpZqUx45P zW;zRMCh&61;=fS}`&qrV!gL@@L(VCo%)+$B8edXUvNAMFo&Dh80HhbNgZ2x|GpIPlqoTuy?n0u~i-)6SD6YwYhgqME|$ zZK;_yMI%oQgSwoidk&y~dyUWcDI7G{s}e4aZ`OzB-B*Vr&vAySTE%y$&0l_fXERDt 
z(8#RhLpQuRh)c+{vZFMh@+X|Tu2)N3>~xcA7`IGE^hSHALCTPMKAZM9l~US8Ntq*= z6}PlYQ#mBulnCo#Qo&UBy~=&*8NSS4Q+@|tJ;RLuhzu#H;9=`)Cj-F~eW(Dg zWxCk_irHw(st#Fd{njWud3Ly`Mv=I=25Hq@d6uq0V#-HCtO0j^dAfRA*NW!z@W`9) z4k-PKsA`U7)%Z`ygD7*G4=zO1@o>JFKXpJ4+x3Q0Gx@x5?939u-z8JV zns>CE)X`1uf1p)9sqS4!1s8vJ8wY=>?4Bn#)hs;kbr*N-^>@*miBuYEoU8e|SrPv` zN*`#M)d#Od?))KOq`r$XY~6z8gJdU5Fxw9TvrNqO0a!IKBR6VvWyGwC4$N|A!;x+F z4eDkHj6Mps0s#n6U~-$GprG41Jxv2#k}4`HSsEpPH5oncK%;?qwhk=X4Jg;ctX`n= zLcwTR7I2w0gDt)7q_>@ot9}H&Fem?Srd_0=48!Qq(I4(;NV^ru?V!w7;I zU8}0^;@KC7o?KNO|69(dFuyN`A4;$XG^LKfeD2Hk%tRg% z6^J05aw6^vs}0=A z;X;fbGI)eU(>pb6t3oRKwTGT54(ca1T5u9FT0_wMYyzctzSBoU>O}V!|560XvdsTd)G^5EzZXpu*@Gm`sBph&DCr zlc#*WcO$giB^E;6O^1w-`PtKdjN7bjnz+U?4;u784CeA*uupT zF|O))CrYVmEoayg*~7oLwM!=(*oQS^-b%GGdB+o_((v=AZWWcV9d1^;j%6*Ag^ZZud|Iy8zHSUi}MSCaSIZmH;K)Cvb zn&lTQ~ zF!_ZUn)~BBaD8;IF}nus4iI+$b>#sF#H*b*t}WYQ7_JtGK{L(XJdG%yxgM8K#P|dR zIkj7txk$i8ns98_&uxR=aStF@7)=;RSQyEAZp)}fjY3JC#etZp%;`(m?QS9uk90Hq zI*L5Us6EcmH9-Yi?{CPcg|F;BN43RZw6cZbfJ+hd+avCc2Pq`Har<;eJzt_`PEmGsR_3 z;fU@EFmPAJGE-zMDsH*C$?MtxPKnjszr@kM*l1ddgjqRyMUkl!F`or3c>i!`qMqEl z;?<=`w~}!fAg^T`=~`U@_WQ; z{IfFl?1VqRjjmd4T{TO-9rnt&DJWijnPU*5L)C0^v!syk`OTXPwJLJS9{`_{tehfCT$4n!Tkq}i7zZv3j4P%YG&;{8WN0;@!&MCz$gSG zi(zslfN2A(9Y`couLH5#K;`Yat)gP9U<#1zAQX%csJd(G>th{YN;( zT{zaSwh{#|`eYV($R)sC@Y&%`ZEw8Qpz#wq7Rbh<^}ex6z^;5s{_2_OO!i9H0!>er z;#-FiHpkP3Y%ji|?3H!C+S6StusuC$3)CtwOMaXlXtTF*UVNn&&p@XbUz|YIYZm&< zzR~oHF_jW~&r+fGLqpCz1A@HR|DIVRdg~8+x!*OHNbBU^^l4Si$?+yc-jk$$^LUt_ zPiIbnit6pfTJ!qCLS!!+IeJyOtpTT&Az@cHqtzDWNvAq;W{SR`Xwk|OfBQICb2ku@ z#M@;2wz4joG<3@QMA&`SnHI)g)XDg6^6b<6sRCi*Uat;RvF*L0wzAT4D7x10f;=l- zK5zb6c6+8B+^zkiw5yhtirnqgX@?pIX`WG{+d=?o&db)yOLb*^LI(%39CRUyM_bY!`3E7Au ztJ(TY4EM`-&8W|dCx2&C50GvD^@G&Er{JQ?OfZAQkvU%v1y4Ncj=p2nC}t_#eA~Rh zcR27&?MWLXg&iOz7Fd>obqUt_zPk&R9n}Cp03@u%-sbewR5BcXr;gRWg9+}<4&0Cc zusH)-A10>_kdLiP$7T?W69iyTfC&RJ9u*EQ$k)En!|R9eIRB&iJCGn0P7=!BdOpib zLb$m8T>8AZ#qR`ZqaMng+C77{9G~$tHK)X4B%mzhbdq7{?lVGXTueP`~;Ee+~PKgWe_mz1K)dwFLZ1^*EeYvieIl__SXO1ZFsWIO(y 
zmC5gp$@eGgH~Yhcy8Cr3>~YQS=NCQWV@tejnpM;Pu#MRI{&* z+$kG-S|F)UG^06%&9}%@-bLER)9F}P(u`9k8r_#Nx9uh6Fad|^xt<-a3+UI1*O;~B ze46WvEv_MRfw0X=`F{Gz_&}!QaWfRp?G&V7&q??xjoyB(Uu!}8wPD~PqdBqDf@Z#-DIc@x*~ zvi;X#feEk1)rTrum()O$A}XMOs}+kH=r5h9XeWAFit6Q)#FnT{Lj!XeZD0vs<8PkxSAhXi0rb+8H!bV4MGK;Iu$EB1i-7%F|1`?S z%ZtG)f!gpDnD2lj>>h6FX$xRq4Co0m!)n0z7<5}3V}&uLU7oQW>enAMLNmh4X0Q<; z$27jUq=?vVF*1ziPFT9W$q@06IRdMLmYjkfSgj@XPoa6d(pUF10 zNNoSx<+rBUo#)4I{v4?)iIS%XBo|oiHWh0x8PX+<(RHB+XYABV>gN~H4L`2m#>K<> zQYx;P;B>Ji&O%i}Q`Lq7!ToO%O-rop5IG4wYq#t5I2(uY3dVz>Y7B%7&HGE0JGSce zxj+9I@fx3UQmu>;o?lwr+}L!cD{x6fI{2yI22&-uO&*oXx~fpA^$^PmKzA*UIr*5N+;&$+6pzuN zBK+0%KU7ucKPDx&{__Z?WzuDpKs*|a-^{tM(T`Nq6&3w#DDXUax-HU$J1tW$Lb&VM z_4uW(;~{B#@33u17)Ml>j?CF;UO-@pTj8pQX}gfq%v_GOMm*Ykx2eLRd%d-WFzR_Q zI?Jzm;-fo^=jny{4?peQl)?{$#$ygIs?HCMzglwBl5q3!xhzcl9bdc1F0-@;U^zEW zR>B_{rbK{+=JzHt^H352s6y?vv-1Qnc-kv|wB+2?I+QT zWmk!{xIM>mBZLpykleXK^?J!69mUR5qmCQpM==L4hRw{L>wZzU!Y1rtDI3Wb8ID6jE8LQ|SCAa9lu2nu&4_~EAuS$?xemCm?^jfQ zet&JA%(9%mQ2ZK15BczV-{qr*f1uOY4z$V=n&D3%`fWGSbFx}xc1rGHs?&uU&2~23 z#o0N&Y1*E8vI|Y=OB+INe4sJ(ZyqD6sx*P0;kA=fip!iZiDH^98#GML|3C>BH|2>KaIjFzOaX3m za(a3bWKGgDF--x=3CzrMiUYXXje%u${10PG&IH&Ws9FafhXAmu4%jFG8&!rx8Uv6m zwKruvBL{Q*zBc_kMR%EHE5zLvfVW(&t*s5{#F@vZ2M3;$6BD2F^Q{b2+PDB}!_CS0 z6QuV|j*l0#`0ld5A8n|wXKS2m4{iZYt$d*GA^^3M(W8|NaQlN9$LKdMvvm-lg`#(M z0F~G{@$=_i|Gv71?%hqN+m0_^v_bH|!R0kk6m^c^uSK~QlEU6qNJFExDQIWG`5RbR zsNB{l;Rjwz3f_a^t&nnGY3;;?5eBN-NvV%B) zwxtm8i=`kP77c8v;O`p$_wTcFpr__*P0a*wlhgsrKL>);{|0UKQGvd1P+f$aG9*1Uecg^b~sIRbdstpI{3Q;shxRv zc+>zkF6FnE2dvmM!2;k|v4R$oNA?GXhV&qiSb#mZJIVpBnwy{Bz{6v|U@qV((>AJQ zT1OgCpx|b|-U7E*UNt!VsmTj+^J+t2woI4>tfH=q%XYzBacOCJMTN7$!ClOrM|cCr z%rQ^SnC`1%gU2vjJSgV%5|)2WI91;!#L)d8%xRm$8QR!#L9^(dag-Qo=CmlddT@tS z*?LDG*X!`=v5=_{&87Lg=u7ES%ikZZX6gTHf?`h#t_sU4Q8l7A13l)3M{~B(-u4bd zZSCUs>qV)>sLt-01+Cgz_D}?fKi$s?zE->sC8VhiG!Ru-d{eXQZy4-<(0qPR9h>K8 z$H4@S>l`2N_A~3f;Ydh7>VFx4w?*n9z6WNbLgbRye+ZU zL&zBBCp8%5YQRH_PyN)wPyz_vxCiJ*uysSybsLJ7^Sf=KUDBZ7i-0)!F* 
zN{0ZEP6!D~m6C)a38CJV;NIu^zPsgJmTF7`vN>s$wcN}~>MS{?wd z*|>3y$@|Z%7`{&2&xk%%@%Xvv`IiF^ z+n$;CiynQ`epukV(dk>Sj;}W?cGRMAxS^V&YWe~mZinm4z{7EVbv(ox%6v*l#gI81 zw_ddKc*nDyNhS#NRV*o&IVljRO`i2-15gV5A&;=W{x5Ih53+js_dfsEH^J?+2S#R| z?5w^Fgw#~b@wyBSBi7}|Q$D0pT&yGO0)K(P`i|bcY?I#|D%$baY<4e$pmG1B=4T22 zLHDoTkvgb7%wf*{HP~+N|KsK?68|r01O5L^g*AvxDs&(7$6td5?Ot9(wX@dxZ$|vJ z%KwYz!*M^b-dSViU3gHw1TK(Ho=G8xS9kef= zjPr+3nz;h4(*+1O5dl|=8I&CL>Uy`dt}D#YV`aMJ0vFJebP|CjL=GbJ^2e2c1BL|jOkedFrvvX9$AYc zw-jg8wNPAIsIbQhw}Nsm5W=z!Lh2dQFSueH0d3L&n8akiv;aOscHUqoBQGSD{2_*1 zz36OAFpc7kGDTr+)>4cREj2SQ0PfkqCm!$~_Ndo~PABIe%+cj055E+6nK?hnS)G-2 z5U-roK`E%&>=GVsgfa=xM2SigApZEiphYV`Zy))J){T^i5mV|rqm7P;{3bC()k1{? zlK7sI6H=BtWrJ?bKWGD2$S|%RxY`DjD70=&&$bwH%n#9d_T(m>M2Y@It`%OApIQS& zY{T#kl)1E@&FVv>Cly$)wWeS$$DZH>Jr$pZ=K6k;we9yLKKMK$Ma&n1#6%lHKfRWR z802H}K4>V;OG?F&9~3*rkWo+l7+a%9`OBenmGh{5cr{1j{(u`ujNzvIAu{Y@V={hv zOHVEL57a3sauE#;178&`QzU+SCoxyYizjXVMbwBS+(59})|S3JBknpd5IS7rHy(R( zCjHHrFk#3Xz1BML%ED~{oA&|1zTvPsX-7jh*b|2Skm`UL#4d7{{*W_AKPqV!fHdX+ zD+BKBNREZ#aIa^_WCiKNb0AZNOvlrv^H@*p{^gcemp9_bWX|MWHOQJn7I#F#mJ6ya z245gox1VmS(-D$)>q(;K@!!JgYjT?t6ew%uVREH1-49*tA2QM z&^Jk^>{IIb#?|e>74Zd06JM#FCT>D5JzPbtaLR{xXv}v(Yo}ZTZ~od4w*dpk_I+$U zTwg4;*Fpm|TB^%L9~2_`@s>V#C(A0VIQ0Y)$-!q3un;t#m1Ja?dbEs}V;#a!jp^;XrxvK9+ZFlJg+Ydu=MDg}m{l5EI19_OO6DR;YPBE!%# z?IlQ}Vvt-p)W}_Lp~I7)L6I+rAnP_=2+>CE0rW27C>92`DjU`?AAO@t?ITT?h11fu zP#@Sv02@t$*{3Y&>N!|=5Y5*_TI$_z zTFvKVsMq)OjchdK9|9CG$p}lLE(U3B>kZ=NSUSU`5P*jqDs9Gxs z3ymBK99dRdT5gb1qS+20$NW*EjHe0Y?%DJ!^o zc>?-SE&wthS#EIrw680q31vaKkE&GpUik*2U_&)G zdlT%6;$%a*b>^G&uAnMY-$R`%3xB3$RF||5hL=_Ot{`I$n~}Z9zE91*BLxZ6@4w9W zo@g>Zxz2!G7e~&J;Rlx9Hh}&f>GZH+I@~UOZhVvRP}JP_td4#2OvcxR0iBf@UQ*rz zltyYqt>a96Rt&l2zN7R{022Wa?JU9C&rl^j9lrD64^aZxPWh4cgOVRJq&Vjv2H3*UL;c0(V%Loi)@ht*of#Kb`;#Vu~emgnhel$1Y1*MEv)Z zvZ54#+<%~yHRIhlfJNzl&Eaoe{2J^pUa)M%uNgCE@;{so(2_;Q-$Vn``-kb!d|5bq z#SA`l^M&BjJCQSba;(8Xr*ELt;cAt6THS$6=mFehE(?6Wh&|&3lXc8c!G*)ry-SWu zb&(Y)uqgALSjI{Pu6+*yL=XsrfvH}32|cB+%J_p}l8NV=$341?kmXh@q`g@#Iu$sy 
zyDnMc?9mD%=%QETtz%HJ%mdsYoC&TZ69aKB8@{x$SB zkPVs=6v2(j4XVhg5PxH?hOhN3gsouF2{t$JPM5Dk@aFm<#LGEj`(e8)dt02~co|&^ zwMdh~D6;v_VpS3+Z%iP>S}|85YF8VXAuLN9U>6DeLIg7zU%3lpbD^xYni%r*en^6s z(w6(pMYILp{DIP6^Z)kL1}^=3?Sj4716Nvkin5oFPiySe?Fw+a*93MWf$WrUlc^L; zNr}`zL7sm!4g}BlGdzRy@)@!E>8pR;p$22q;x!a|m5)B>GdWK_uU3zv*?A6^C#Es=iK$q8q0pfR zE0f@qn=PI>rqcYFI^u&3?2u&p-2|tu7}^?DFF-#K*J0A-%tb2XG~(lD$`aFdzI_K3 z6p@&(H37ux< zRW+9blWjxA?aw(a=DqlZ&BC+gLz1~~m)uZ1f?BuXl)y|LAj!gCF0}-UcZ-F3sxP+* zJ!7VHYzUfv8iXIIrjy_fNdNi5$~^@%4++X>sf@ONrg%z8BL>&9K)_O>4QVGkxvFP$&4&Ir;1uT&Rl@}(~32M^NJpofh z#Oc~&0ggZJp57F{*y#OuqPTYcxJH0dc&14BcQ+Sj4~zTkS(h7Aw-*g;QlICtDK;CN zr&FXhS1q_UPSbek#O2Jo8xj{zBJ5o~TW?)0$eRZ;DhYJd3z&X(;+9RQC5( zP|jOuA20u1Tb3h21CKhGY(DbUG_1+1lU=*yt$6@ytryz zejBOdJjzQB%%a2)x^lf@clpGRF}Ng9d&-Y_aw=M9o7} zMQ~%dbvdnqw6Z>5jKFLrnmF0}bt85d2n*z;Gl$<|^bQ`dtEZkX7effLQ_W**2{lrI zRQOn`byT#&96;QUdzbeCbF?8~^c`3Oaos~@rAU8|e3%S@aqzcxIutwleRpfL(A^BA zsU4+O*ubY&^7S5RZ4CJ{#vSB50Zw^bOnAl80WYhPXmepkHo`)TL@HKw79#M9mz!@d_d^`7yx-7x`j4|}g~uKXH>;cT zjuF2Sc^>9EZTwmFqgjNR)a{@s1Jo%tl{|xSxz*zWHS;5U^A%p;TPy)#F0u}a$Xjun zDMD~%(ND@se#B0QSbNj#T>sLMX)VDQP$P$r8=*o3HgSRlynJ0gYKxde+aj+fXb7aL z#I_1BI_gy7x%_>Lj@50<>QT@)jq_hR2YTkJH6?9RO5KjtU45!Poj*pZx$a~hEY*bw z*WsOP@pUJ>7%L3eo>8yV!*93D@KU9Kyi-(Cqfx56x7zwZD5@pm@HdiTEPUeISnLZ@ znx1;)D~)>CEsNhP@`h1>5FYFe1RFkZDLl|@f zWa>nIl)QzG1^=drUmTTsL?s@vp=-@9FP~O-*>XZ)V0!EAUHIvTC^6zZ5R(iEkjZa` z?>Ikma_>+O^m`5E5+M*OThwY+DG~dKydgV@e4Ep%%6Nh4!&Yn(&!Mu>+w+#(gXRwW zo*H&#S&&ctc?N?jw|f9zs-}j|*eFt#jp*8N=kX%d7owNcZZSuKmLktABS0z{fj^~;u2 z<-L7EA%T2%&%w7!rcQf#Nw01X^HjR1(pxgA5ibI@BIl-Shi@sQI7+umz4q10O*;iF zuZCM}URW-i%OS-0cKMw0Vlv6gG|f*K2~Ks^Lq(aPOwiTvxsLcc!@~KmYgkN?*5+ho zU3ERDEu*2xXl+P4xxW^TBPwq_i%FG@B`w9IcA({K=%F`&;ysn+_qq(Y-I z<(yP06C}ky;S-<7VzX+Lq&W9LRp%jpTKw)fqK9gyJ=MMMd7pler#h{Z7lHAaj#CJD zx=L>qCk)iYo+UQ`fv{=)c9pooGk3<+CJ>W8X<^`CzK*Nx2}yi0Py}ZL4#z3M-vQPg zbxc;R&haUeFz+#xh0opOARy3dvv8`Gqy7xIqJ_JM`2Kn^J!|>sCv0B8}E^eP7`MG zB33A!^$!^~@JzDfQQZaGE?VN55ntmn@U6!Mq{kETVk)D&8XnGk(oibDN;>CAw1U@}k*5DCC7IA*mTL>hx*q_c6 
z4?q_ko!!KBw!|m>q_h2`e=|!WSC_coyjWw)!{90A*Rp}9+P@)aKgqBT`#K)X-_fT# z8D7}n%d3-6KIv|h?q*5pcW7(_2JPH_LBCiI_ znQy_w%?61fx-{OZKg%PZH|#_^0+>~mcyg81#BpRUJoMx369iHe!A$;!#sz!Ys) z;nU9 zHI!6vj*xGzB7moyQ|aesAoS`=?Z#~O=_b@x(R+jaI|5p;QO-OJWpn?3|Wan54vj>xK%gLkWmS~56UxlKz|Ikj6}GD%7_yys*a735nU^Sk^#k1o3l@HOdoAkSr^?x- z25PElbw1R~mwd-&!=r9I@$f^1QM_${)8r=-G-c;wRDo`ef5$oU`0%l8Ilmuw2BFRw z1^GFHRWtkT$-J?SkAuu@&`ajh34>LPi`npr*OZ7xKdnd|RAu!s(X9hLmr9x35{e;L z3_^=XkJi*=JtCwU%Bi_`dkGV)b)ysS(Zsjw@oF<{mkIeo1f5xIjjj-tHX_{yQ}=(p zT8|mIYUho)1e1Oe;+FbT)*zl-thzkBJdL^j;OAvRs>kg5vj~xrTH{2dfGvEujh2Wu zKSJP9AinCxxH2tPok;5i!q!5RVzbn1v(Y5L5mKrJ@~Xf1#*^{K6Bn!=q-)msg;5*3 zZ<}v9&~@rL;XI5yIl^ulj~J*_0HqIx3>%_O)uUS9E4=W$$kqviHzA39rTB5vpVehc z+{e8(4%so2-bMM$ zE;y~I<8V<#_uIJuAKHde-1Z)cBg1Z?u?zl*-|$oSlxSrhUpG>!;DGB}zhh8W-Zg6H zj}K5i$=HO*iQXCeU;#Ft(OLN;WVqj}#C{SmT-6Gkn-WTSsVj;N8~ZM-J~bUIZFU5Z z2svZ^oLICS1mEsCvNi zdFstVM&i0CVmmA1Q+nwUbZ!!q=9W&Tt<5@?eK@?YkXP`jrn&p3P1L1N-8Oa^I+er` zZB*(oVX#9hW&AWZgd%-C1wNq&h!NmvW~OV2y3~-fge|iuz?@RE1)k~uh-5EK!qTA6 ze8&gG=v5S%7wg|_RcyL853Vq{Dn8*fmqX77()>(bufG>XY`(XBMHA6l&X=o$wYAZH zt2wFz4wb0GDx_UD9&@%wz}qDUKi4HS0v0c}Y>Fm7Ux`SeI5}0P0@=oU1rmmVeaVf7 zw?@DaYn~U(g`FVXLBWkXT+dqkzIyJ;2Ee7I^z1q9uAkY2eD zrgS1oZgr{8==}T4`s9Wy^qNLH>J zG`D&%;0$k%X@lkY6KyyD@Jv|5r@`p80!Bg_^tO%SmdhNDrRajxk4usa+DhR9REx|N!&=l6g4!`DV}wV zHz@~u2&R+%53vVLSnvLgu7ML%VUg+I9Cga1I%m#Nt0ui~uTF69JocYy#+eCRrY>*h zjmxX4U!SH}{~i`YrcAZ770Nxy{Kv^;kTl<%3Gf%F@O_eV+Qua0zWQwkVZzTAOpm{& z$i;nlyR&n9bfy1+i;fo#0^!jHtv9njs)q-dsVgh$0%TQIni=jW$FEC276M+RL{PLY zGx7bw_X25F(f<+eFabId(!VygOgmmQpV5PdM}i=jz6C1 zPRG_potSveCg5To`6gW5?(0mX3gzC(AYC=qf;{&}ds#(s>{qEszgoTjbaC}`${wN4 zWIch2nq^@4`$}>eH>Mv>ntL&0|3lYA5v+CpY)(}1|61iGnOt@6M&Kwf`rWO!eJxf* zP3bG&?$2pU=&IdGdyOpWhsQo;hD~bdmrDnOW1_3r=q*L5c<80o z66_h7UPHU;9M#^Z{Hi?*X+JK~?`iJ;l|%@7NbklmbOMUqN<@xkbWU|!jcCtx2XzN6 zwZwKy^o1L|)%Fs6=BPxw>13reU*CY&L%t8>1+<@oSz`Q#&G9(ie%D>B)!vJS^@ zu~M|3e~zl4eMD7HyWV{n4YwxJ46LvWufAB;2QTb5(3X7J8mj#;D-RXK9=CXu z-cl2vUWz?GC%w>fQlSho{5;0w9?v7YsoX=mU1nB$^@~u_Ba#4tG`|$~^-hcJ-D4Hu 
zJ>i)#*{TX8Z)^72)DTtA-uG@o9TImQuo6fh&NnaFvc%Y*TsZUKh=zvp7yW`Z*xq}iyr*=h-E#KC=eorvMr0>Eq;td_aCw}qWNRPfb9FbtN+B+g ztPEOj$0jyc_8n{~^HXlVAJ|y%8P-HQ21x>dMT+-)-y!t zYR3T1$i%=oH2V(?fE#5lK6$XX)^6Vcwh_i22WSU@TK?D@+EdsD;|PHZo>ZzAQB00r z_LfO(F2!b8mO8=p)QTFtD5^*-JtGpM>>!2S#SEHg{f@Vjt++E5Rio z5ekLa*a%lMQ(qv^g=vzeO~STEr|84P@|n?%#n)YS=e%w|aw|G+BB1vr4<6k7L%ZrU zWTO4;9STR@UY2MD9YOcw^F!f+SCpcTj>O)Q5cM^j?dYZTl6n^x&dA@8`7L&H@(j2B z$z8TAG+|^{hm;BX&bq&Vpe;o`@wwSf@_fHn#2tYf=3Kz)^+Gu?dsu?~wH^++>^7aNnm3kIBM>8B>f%A|F z;6^*4y(0r9@$5(4+I}?+fYrm-N2Zx4n?n>Y1PjJzW^ozCyDi&X*T`1ric3L0H zWCCBtRUXIdiXkDxXNR0ytnM@ZB11p!(HXjcM9gqnO6)DiG|Nej{Wgm4UXJ%gFPuRk zlyR5W#WRh5EsAz8M)rXsg+JkL{DU8E*JP4aML*bF(yBqa79~l-gFke|gB#N-(qrYL z#jc(#y1Yx!TVd(kVYBt0p(bur>9k0sxiZ8iE$5xEQ6*$JAWwNUE%OLGbC-pmF+KZ| z^U^Oh8H=Y6+jYXqUul1=JQjiDqHz$8R-f+U5Do!W5d>0vMAsFTrSP+Bqn_DS2+DNo z-;cL#vn~(&00{O6pK#gw_5OA|?!b@t`&iAj&+EHVo36;>l$Nw97zIFWmZ;4Mna}Zj z-tgd6t(IDu7oYbZXKtkgWx<@;^S!R)!^Bz=M!LDO(v3_zlJ7v-+g2Lu+8*7`>%Xv< zwF%QYPdeK0=(?LqGv@gBNzzvAWgU^EO&jZn`1rya$lC|4n6!0MV>{L+8i-FyI={-x zMx{Ewnl5>fV3A`boB3pt*L5m83DDcyPjDp?nl>sE;#oSWY(Q)eq`_OpNqaL0_vFVf zYxc_&KIzL(gQM+EFyWh78 zjcq6wJ)&_O2KW{po@e8>U06xV1J(oRxulct*!VPw5I6UxE>ZZ|#Mmpb7kh1SC23yp zU`LlCS)H>&;&VrUJsuD!745|*L+KJzjyBcv-{~7pml~GTFYWmyu;6gf%@6pn0PhjbmQ7#o)gEbkJ^Vljq?I zZd6Kqr_CQ$8X30=fA8u(ef@%*@KP$cu_hXOPIPF_i?z44Fo>bnuIQG0aSRlA| z*MoBubNo_Q-Z#z8Pb9PS_^rxVF4aC%ⅈ8hl5bePPj()gM7e)0ox25)5-c3ljCpT zIihj)0~+(Br2#(g^DRr|`y)hS;>2wMJ!!5XpLr-^0CKx5d-xjQ!hnsXC*H@7O_KBn zZG^-$`#&NX52xcGLGQv(XxrZ1ENXYrG`^a-yL?w*OuWz~E&St6-37_V@?#z6^gayJ zIPzoL0zcwKmz`6Iz1SnqB74OuTj_5a=WJh{=i;R z41UlR3Z2o`x!9<>>pwpSBYEhNAMLDY*+doYr0*|~*Xfg1&6oT>HsMgpDhs1Td!sX^ zcjDYv_b^u_2tK9~`p)&L+$*V1ZB~VGCMN*1n`%m0_0%-!coRH{^J-?b`!-BeIhgv@ z;UP~eX5zV!7d5^_{7qz!j*~B}9c@0{lC#N{+@#X_)-_H0redDBE6ajhD?8ha5h7@Q z1hdm4?KB={jb4KM*%Ft(7`xvIMZiBNpW*J(z1i{UJ5bjN>_L5gM(>u=sFj*n7a*C7TBTHY`Dy5rWa?XEFW97fjaZQR-XAP8UN;dXs&h&U<-Yxr==hareki zXdGHXnhr|AHVeC*qHMa|KPE+e#Tb;t-YU%A&ajHF;=2K8tU6!N$LU+yF~VZ#!)z=! 
z3wmNgkG$A|3F#$vwo1NgulW(W{tPc$ARK{M&#*K$M7YP$zFF-7az%jY7#_t48M#s) zS(!>-9}!j+Y{JCNqt~8I2G*b*ZNI$(R`twsANWn$)*9`E0iMM~khSphDNCXXBGzNf z)(RHz_UY>?pq-y}A5m7kbp3>y#`pKT*Z?$K4Hta^pGo(kSVp?4S#h7wpd1{!H0$Ys9nFvvrzT zDig$x_VSZ)&w(3u04!_@5lxp=ao^vOVQIMu5B{S}HaxaZc~*RWm*3~2IZQ&wFMuBj zOKdeZeb{iL#s-KV&a}=0R_Wm#M?(NMyZ=le7f51n&;7SbL4(@fkUH;%#r|oF3hT5L)DzA@^T2+%e zk-8VF#INWctB4U5n7_f916bi^9dNiMx7TI$fz-Q$YRnR4Kv+1RL)R0Cwa2OCaZ)G; zM8fMMSX4d$lw9uqTjf{){LHP>eki{~aR$+E1}^-1C9)Zl|12Fb@lv1invMZBZ$TPh zf~>g4)h9pH&e09MF9*)Kv$6ag$it1=6oB0xU_YQyGC>dtecZm9Ub2|2@I!L)wM>D_ zM1(@3+?*2adPnr$CI*99j>FV8{ol930$cLq%nIg6fs1T`i@fzqylio|$PM@mOQjY7 zM{SFCCM6$WyAl9Y%K(#jB1W%UKw>lKwKL71S60lth@PjP3%lCz0`i%YynCjkbSXW4J?Y zFVWcXnZPx!o$rLPZJJm82C;oR0pBA_kG!qxYU>6>Fv8J<*~RO57Y95~8;||s@7FTv zMEha(-yr8a_N&x~7T>P6jx8O$vvz{0azIz63<-Jk{;SI4PZ_`aKD*S&&BM%cF!#yl zg!l4gNitgAe;gQt_%b|LWHq$bc$GsQS<@j&+yG{>!0!D=+bQ=2S65Ejd${Q3 zjDClh;TTaRy>{7rrqN%fr)sHAaN@bdWs0|0p&HtSFjvA4%fV@crf)O)^!Lv zZ-(OLt%oPwocG6-)ao5PdyJ+Yo<6W^?m;DD=c9XiVxi*d1^sO=!$c=Z80wisYi$>m zR<-6Pw_gTgSRq+iKqx{T&~$!_iXSA`p3W(O0hqDvwam`6`qe=bw4fQUPn`B8s#l6K zODBU~i0owmOP#ZD{!L2Eu2c&b9bZC0B52Vg9O1Rz>W9+V4sFqUpaDc*VKX>evcEspAsnO{q4EL zSm87&hM{eY+fOBr`J_s}0)&e9gpfORWn?q|(=XmU0VAiYJ5zL@D#J#M&X>RSBT397 zCP=fMTt7(c#2JtRpdBB&3O4v&!}k7K0i8R)RW?1hZo}lT@xFlPZ~!tFf9}=nTQ}9C z8!}_&hku#!&%w9KvYMEBMzOnx8&w%hSMym=P@hrQFcwOmrIZ19jn8^c*m5WS-pVig z{CPV508YB|SU+*Kt}oOXh(}6R)L61$y#R<#MjFzVO#h9}%eLz*J~zESh>Oi7OUQYp z!l_TCifSF1raT?zsD%^gQuEq7#;m9V#OoSYJfC$>&#-fpX8)ikwaukmox-osZMaAx zsu*(=$-BVD!gKb)wV{mQY9?(KA@hBWfn`00|-`tm!W;E0<~Z@ z;Xze&H#xYB5=tkZp|vQ~Eq&7Sf3}?uRDiK6xt@eHl}`|k_INKmKBf{67o~gKHROD5 z-$EtAMZ*!`_Kej62zXXguZB`H-{M#DOY{}@130gufxSi3Y5Z8ri77+yI@1hP*7ZGkJ;7DE@tl+@0$)XoBK zdqK>6=?xO{)^v)6M?f&G@y5_~?!){Kj#W@7={H|PlN?*J#=*{tgekkPXkn+)mx3KN5_+?QsMz%u)1J>4TV*H!2TStk6eXmTh& zLENYIxJbqieVdZA_#yY-_>HnP3bp2{lIl07@Tv=5fpjm~8P$Ynh4#R*&W-5h_UFvd z2Xs1kk}j#)ZgTE#(Uoi$=A^a+Oc zU}@+D{V-XG-C9lAxa?qb@+7dzc5I|+QeXPsU3qwNCDu^eMBT9l5;5)+R>3dek1E=&rmZrLVw(a)^`|>SqMV(RtyL!u_smv#ke*GIR#wvu 
zpKn|6?%Jz>1pAx3l77UuisCVOhm<%f0IRbIw7GhE*o zJ|>-8ZNco;PJ+H(3oqRdpCFl)yI+&ZQW3)mVf-vyXy8Sgp!O~X?Udk*Cam6 z+?6av=J|B0Sc_atf9=t7z}F9_2%8SU0g?T;MP8!H^|ecOL88oBT7a9l_y=9K0iFRr z1ZIh%?Deho)~E_A`f@iO0k~9d#Mq+*z)}sLcRu+%3O+-uQeatCKJouN|Hj9V+m+8Q}#YHfj{RIoZR1D8W<>s&#VM|A2L9wFNV)#r(I%94UI0o=OrPtt<0vG zv}=d{45(YL7hL&u1TU>gZu?Z=R)PtMhQ_VOKocKw=bAx1$i-SgC+6TU`L41VvT>$Me(@9aIBOM`6~R zrM8w3-cI7K(=T_%Ew=N2yJ0G=cREkv!=0qGnAn9M1~Ts!>$n2eL<83aA*1Is()EM! zi2J&QdjAHX8%Q)oRZ7TjU1as3#UM)SY)&pmOpphjZ10ft?a<`Hs#r-LnJ% z0h^60MN7ERouqo#KHg*DUG@4+oA<;OcKEZ;{c%?+IVbN6+Iw>A_*~RU(Tn`p2rdl| ztcFH&CD-x{Kj0A!1JuB-%X>nA6N>nQC8Ww#W_=(S75L-cpHR4WG+&(Yv`A&H+jr@BE%Jey;5lwem2#V|pcK zF*wS>uQDomqHDN0c6Q2=wnl>jRSL>AIFNJ4z5JKyx_MDgG?s-3N^^YnKUFAZTR%JH zo|-(O$cXM1ueYJt@GVb<)|iJKCBw1Ios^I{L%lnYu}%Q}f@?B<=i6^yyLI{$@L~^W#AAW>@px?8_(%wxmRs+XS%9 zbFgQV3a!CwefXVFfVVAB4GF~`^>XGbgnBK2ROuqcp**UyY#Hl-pSmv~(@|$ z_5REvk9Ng6|OD1Q7xl!Y3}s)y@bVD2WRm=x8eo2 z+0IbcUabiF5e0Vwt*cs?mVrsTM-%Yt0-?wi2_A6R%n4ZC(RSHbr~BJMhZLEb3M@i! zB=eUIK~5JKfC&|uZXDR790yJ&GH>pl4rHD{<#*+G1$e<~1>ggq?M;m|==^JDc_NQj z#9xOOf>^hr9seHTU>gb;;~*Z!Yyu#4=1I~_AzhRwlQ>uenni61P>eU~HEq7Rs4!Ta zs{V6UV?i=8b(1a~|C zxiN#)dq~VxB>2>}ghT4Q>qHC|mds(XGS~Nr?gtdL0VV2;+-a5E5_7i@f%SK*crk|p z*z6|`EO*{Q5hzFN$uoK-S&vx>rBzhi&A8`V+cGow+Ob;IaZ4tA^7fEBQ;s6q+hAC! 
zc3P9|#sVr}y`tu{pm~N4d8ykeOMa+MV!9^u|MS2om;cvw$M4~k#y<&(nG-h>G7jIX z9*cDxTI-fvnV!%8)J|UDDHeG~9ASuFBBH4pVoAc{^W|V zKvOg3S+s)3>&YecgJuDC^4@DB}+L_tz`(t)!qbcl|-;&X(imK%negv%a!owDZ1M`rpR=zDRhWz*uoDyvA zHIi85m^3gHhMTRkLz98Z;Xjf7iqYhA=Lst-N?3mTGWJBw@STnH94fQ257UUP{cuvw z%mcpKDRIa}5sC3+NynvjScsEf;l7#zb>-A|#KnUSZ8u(2q3Wq4dMf#$Ev~^+%z-z4 zeDp$N5#5cdvxSn~50x1Xf7D|Y={JT@)x0C+b)`U|7+ivBL*ONH1h`&Cd9-lB&+GIP ztB=$s)?kR(UAX%jkM%?$HfJZcso&&M89k@oVQ z1~yfs9p{W!L|ZND6k^zlcQwPcHK#(g*yOuOl>aP(c9C#6X3l4mJGr;1k&*Q#KTe#T3=(f2+ zXG9T|CE;j9pj^b#%?_%0QW-(5LEP3;a9fqImPu6@?y< znVtSv$q$1>Q{?espm1`&BAt$RQhXy6N4Db~RqXrHt9sZSXIy()TDv~H?Ijp~MiGow)40lsWuLlAVynqsKl=1S?l%FYE1Pz@w z-=YF8^zZS61E=FRdY|DP_JlHJqD3y)U@K1v|#R~RXEDBU}$^Ra`-cM@X|cm=eQ zyr*uF-zUDW&FG?K<%xsDqyWPVylOhU>=-P8%w1sy?t!O&QgDcrAKXU%aRII0PODS? zf;ID^w|{O;jquwgH|1~_Z6O(6hYK$S5t5kYMB&xgfcBb=JC!-VIha=s(+(DzeH}my zuYa1sIsYuDsV(I(&x=kCAnE?7`2h4=ZzyFRrW>ngh`Meub0HhNofBtNQSh~>`sdt` zJQkM!IgWhtGWy}i7fwFhkCQZ@fC}Ecr}Z($;;H#k@71EgfU{zb-o#JfmBs^cNn{iU zr%D3384I&`KIO@s$00#*NaF$A!;pn?sS%E^U7TX6!G|hfc?=P}(~mRVtB)boo~c&2 z{6TTAu9K+cI{B_}8X{OjW~t;UM|l38iVVp`ofR1lcosqP z7pS8`HV|`0m++KJeP8g-CevP0m*?$7pxsXgKS_2YoIS@su=U12^rNtMnG>%D)_pI0 zYzt2Vj8aXI$HBb9zbsB!URGFldC2?gz4bw0l7M;KH>KwmA1bO00_T5 zE`RQ0_ZD%Eei2WR?)zeu_vat#&RnxaN_<~^1bkLS_y>!x)>^OhQczbNR&Pz19n-pc zB9LbT>dd$oyu>=6Q{?&!#nGb29<-h%oKgl(t;goQlO9$nCYb1WP@|I_AH-J^o|l6^ zh#VPx<9B2pDxjF`4`91r=`@mlWy%VZy`G<42T=!85b)6R5(_A}( z^_v77<5SWRLhVjZ{@5C`l5Z>iJ2Sv8a6oUS@MbV$>!o(}#eF#M?kLlxUcYr_1pRHu zySrIzY`tesF%#MYX`*Wyqm9-HM;&LUz* zYtB-2$mmigGf)L`SggS9XVKDAh@osGf8Tav;6|y`Ca#`0TVktnXX({g)$9@dvCUKj z+-oftJ}Sp47`Fq&%oIg(&H5eu{r>n-W?&6uPi+VpXU&ntXG=LynTF7BJfRmZ%ro<)kHh8R#y8P&rau7AVX?A_p3)VF}#hv&pCC(S2`bf4*!;_mASURWT>xoN-H|AL@; zHxIQGFIpG_m~J4a`1L4#QITe@n5VkA*yu&_nMp7*X8_fYGlM41{y;5bhy;$AH1k_k z-UDsZY>L%?V9iLP;^bO`K@ z#;x5WzAk5P4XCa6MAgkbD1I|IoiD#F?;S${HooFTd3;*=y>6xAJqe*$%gFD3F|3Fb zC}jzQr8p1d&N2@|_UR}zJF`jkT1SDo$x)XrJO>_#)i13~dfP!&U+G;{6FRW8GVIAY zdFMQ4%DV#i-vZ2RX^_4P4otswZ(>-QyvgUlUwfx#0@&RGADuq&Nx~4_R3y7+dsx?s 
zQ#GVm8sG&p*~_7Q!7xeWYfSn6Xcb)?NzV98#{K5avMcbOWW1-)r^Kcm4Yz~4@$Mdw z;%xz7d&3*bO}BG`Eh8czZ#*X&7M|K1N_O6imp_;xDm%wS z!Jse>g4aMt;`yy_T-)CUj4b?#ol{#j&`zVMmtp)X zQ7f(aS;vyn7uSdCx23knrP6@-t~Mw<`%xaCy39R<5;3GpC3Gwy(s!HRZ=&YAAkw5nI*5c0p%(>2l-@%XrAP}s2%!nm z1SwKN2t~R;0@4yXcL(*H``vr)TJQS4`wt;|?U{LIp8m|tFa4D>y@<8@QS1JWQ;Fl> zVMhQhIZwK4obzrhA{!|Yf8bni4{JlyDUY*kSdX+dNv_W*_98{KjdL%)(^ou(7d$sh9 zVZi>L@HggjnCQL_!XE|fnANW%M`$zB@uAk^Kr@(@+U6uGSDx9G%5cg*PHC|J@(6cW%yF{!0dV^}^(zt; zAEI~%v{l=n3^y9e!F{SbQcjHP-8PoUH+Apt8y1%I0XL~n_?61&)Yv)Gd~>=HvGyfR zDTurE=ClQhX6Ha6tNL(jHbhVT`DQ8gid>h^&QPt-Z&`iHyW2XW8BPH@ZTKzPq1NAE z7`~Z#IJO<(r{sNHUm1$_&C@_hXlI?6SdzETLtexyU^nRA6*H^6w1(SPEJ9Q zaNC4X^j{k1$VO6sL}geII}VywTYNi9AqxNx&H?AVQyiv-Ki-3Wh10D^fjb~{TWpFS zD{k}%u5=0XurZ%d&Y0QqJ=O_hw|3`XMd{w$e zvM`D$5x|Ukp-v|oZwMt01{mi8vuDRD!&^q zzdMvKw_V%k@3^q4DVv_M_6cxrH!oJklli+Kog%W*<%q2R1=&Hzu@BP20lR3QeVvdv z#MExaRP8~C*AkiMqERx7C8a=EyxQ8xo+kZ2P2M9`a?ZR_SoG!8DR=``vD{#dZT1KN z1j2(vr?|TX%Mule*6DcU@GCqbY5stdLekk{a`>M%(%+9MgT#UnYp=c7+mr7_NzI;} ze|4iLQ;H}szko+xf(DxAq2SIwm`TRHGH=aHXJI}!VAq~#ZngUXXFbt^ZEgOicVQ^2 zUX*dt4|j{8NWhd`%LIKcpA}xmwLU&$+)fFXnKq-mz(BNWzZQz(DQaNhK!UKV@zKFo zPoWW}ld-s{2a{-=gAH;`udw(kpn=PpjAGG>T^#Z>eQc_KLIIQ}iCb#>)tH?Z6 zYuC9%LfHm-AGtqL`tNDWP};2h4&fxx$iF%lTs3^{oB^a zd7V!2piHb(lZiT?=(KoIHTUQMb}m7xOljQkV;cES$vUN_N@3$lNwY69l-qym@4%4T zszFjU^EjN(!g6{4T1HYeh^oy^BZGR1R0%Bu7>fwIV zd@=6BjpGM}m65qzYcLY08?ps$m5#d{5?q?ja$!lQdpKSw8Ft(e3`;?9UM8}c<=y`T zilq7O=yuY#6!Z|F#^babP62YAk+F}o3(CC=_}bYuR^8o~%85P_m{ah73$BFm`pwC4 zJIj|17>{9jTZ|`fo`1{*qboM~rv!TB6n+UuGfgMnF9s-mUDC3@aN5BA=T`Etv%?#3 zmR!DeH2_GWr?z0|mF#n?;lM($XWS^oI`)9l9h*DCGBCfQw=6!cgO#T0J@2__xfN!&Vv9 zC{OQQo9;Y@Devru4lkeh54#d^Ofs%5>LOs`&L)6fxXxSA3t>Fx6{)=1T!*Ceu72%Z zH%jqO|J~{PSX|v{aigtwr$#bID)@bNtu{xh$G!@!>Bn8NMi!jOu=c&j3N1bcxd+o!o?nx>dl&mF^ zaBntq85Z9P`2%<-CQlFC=07sz#nuD>TNY7ma8X<2s7Hjvn=sD?TMsVZ-EqICGrq7} zz)^({v%`|3x1k!+jU~gZC*F;y@~C^|##Tsm<ZK>hOPv-0~KLj56p0l$9yxO%@J0lMeKQ&vu;B z;|_zlMBdh0KH#W1(;{(FQ8n?aJecEqi*BpDWKNsQktpK 
zI4L-@3Dg(#!yedob%_gGBL(O9WH4O&K{acIaV{Y|+I_TJEUqsZZ>bM;zIZ_)S#bUr z%_CF#XUiq6Ogz6_9voP~fUBGQaB;=DOCv0H(Z@$~4BLd_`r!lo_lafaHUM$vEi;TX z^QYb3YVE}QJ2e(Do3vNx?}IwscrZ<`6@!*{l+~U)rRaO+@X5s+ST|QWKBab35%k>) zJ}25=_BVEn*?j=_&=-tfx&LCmTdR{Qtij(PaVXFt5eTCb44wkf8q@QN-^|*-V~u1J zbX12>27ybnssDQPVw5JFe*43Vx+^yCx$W2t(ZmRv7!u$~G0?DyCQ=HOcb~cjGDz%J z&0sMs!4IC47nAl*Kp=w2)Q%qe@b`673!{Joa$D?WJ!rK4arTeTiq<58(?A=p7&@jp zuR`|WmffUDn|hBL=ci_2plE?V;=@dWcR($@h#rUA=8OXI@y2HP*Hq0j(k@t~%1V}r z7Kr;lZf#W0K-#5*(Fo?kii|2=XsW&Yfa}}ku42b256D1m#9usdf%D|MpIRWrn1FxW z57a;Q%uGeB9+#^c$rZiZEsq^3C2ha9Kp???2LWY?iF&xYALu!3U5w%CKrjV0PfVKP z?Kpu7hU?j9HbuwH%*Dk4e76%mJCHoa!0DDF-Xz_t7e)38fv%E9>Do?soso8Qh5)G+ zmVb&>GP~eXRER)n!L6M44GF&kpzo?UAE=4p%Z%_D$#+UF(fCYQXmvpt#jTuc6~e~B z0IJ0`_3CQg--CRJ|6nr=SNAD9kcjO&oni=h@KlFbf8q!)nd6w`wqru88%6pVU~F;M z)n$RYA)wc%n`e(;(#q~q-xA3dpQAlE2M`th20r$?1Tv!R?gC&kSgWL%BNxF39Z~#C zE4=WN6BOuk&2aXb6;;$UdOLzem!|v7vgIk2#;dhAdV1Argqa)SBYc<&;|-#JrT)z{ zv3CaFfo(6ioT?8S6YX8+gQUR&lcH@NY|sE|$NMBhw#Ri@hgot7PKKS~)4^LBT>c$x zQBJ|>8hR3zwt}KR!7su%WC3Fx<`m`rjN@|L-LR?%=>^0N7;I;9mqUx zp82QOGsq|<-T>x4nOiXH|1{dUT^o4Fsi-htU` z?^?GOj(V_2=dn&9Zn-NvZWqKqh3Xfwd$)UF+5adCX~YEFa%lY-wBczSljke7g-n}$ z_`SZaR~meg`8K1Jc&#gTAcGfw%c}A-)OzlY)xv9}jjdIj-!kZMM_2QiG9T`mVM<2& zWzvRb?c{Ci{ zK|46nQ}qNokumh>nlmxu(syzYDB&^dYtk>Vgv`xU|2B3&B5{cEcoA7!L5tCrpHQQQ z054mNJu8h=Dx>Srye0j$rv(y0{fiMm1_4n>;!wUgD~YXRs3>zo_$LcgBm;_{=X%fh zSCSFqVCI0-9Iq~p{u)SUW`$yoXCj4Jr)IzYV0ldktDa)zqorlfZy5vH zC?cm{lyMCG_RWrj2-EdFs zRv1@i_)G!D%aSj(Fn-?ZBZ#o5Yf!NQyYpPl?C&OZ;N%>WT_*~agEEybc(C=0ARMxe zTD$9mQp*ujit2CywZ?Co49pP79SP^nyJ3!`EopJXP#?ym`^Haql40xa4TIJj?HH~p#i3Njn zQ_2c&m>~JY*tH`BZQAkxI^x@5SJmCVS{R(BOW{|$9+OrAlm~#Mpi`VOtVhEJ)f#7l z?vhEj+wBr$1JftsRO_XN6P1|us#-klD9%ecu&ca}_KN1kx)eP@YMsopae6na!g9xeoVPz*30%zf86JMGa%G*EfF(|3 z(uIFT-<KO*jQb-b4}QDsLIcc)VcX_;iZ!@3ifJu=U%LmsoVKZqqwb10T4Q(LReby z@)es|#xh;z6aGLoTe8XLzhb3Y+TozPa6+efuL1lp$Ghg8m!*hIEZ$V4in*})CUr-z zf9nQ2aDtcv1E{7-AMK~>&w~$4nw@u94t(dV_qqSUn)gqXZdPWx{ofOd*abM_o>^jP0^+0Mz`D{RS445vR 
z$YpU~Ie;h!Zt!)ethT@nASjnvm^#2xI4ZWF$Spt#8Rl<7s`bdCA|nRpnLkmUJ*h^O zXM#abXpow7m7^XxT+k97u?e|#AS6*T&FH|D#y-jdX8u$S#L4|2=~?heeFG_{?0OE! zTC5!qp`S69=HkxNR;e1UvVWG)!_}2;IUVf*JJ0O>O0{g|G{?%aD6R=836dYf=bsF5 zy}w-Wjl$UKS2I^qk z!&>s3sTs%0wHaT6dCYS)AcDbvxhSB^e0dBWToYH0RoiLtYzAtQC&I&?0q(dky1niW zjsGgBlS=0zfblGdXImL*ftZp1>G%RN%D#N%Jr_s{nz5PL3&$b%cnux`KH#dj{FSH3 zvxdJ!YX6RE79Tn6-C%4HP>Ca_?Mc{Q_EKd|afo%;hfjtBB_5pya&+wysl)FD+;_%G zq(pj6a* zeaBkoL38K7nlUhdu~Ng#-eohaKNK1hMVgdlPhMve3#a-FkDs|MYCu;G@QCYNxK&}+ zd){PY3gHJ_+1RS{k>)Ci>2=k@%#YthsLhWg2np#CZ;e=`fh|R>NTEPjd+V}C#YLNo zm6iOh8JB)&@}cHhZnm8xY6e~vCsnEER6Cz^`!Z6QQxqmTM$9-R1BSbzRAp`1OUL_N zNiTdQT?xpTH|}`^?8yPDp-7ADW%;heH&{|SR$Wl|A(k9Q40S~hF*HoQok0c*bG9Qv zZc$Yr^bGBhpidD=>%MnzEROe zE3!kouI?vxiLa#X3L4p{Dr)-nMvwk*v-s}?_pM1yAhXi=OWPhA>9JqX^*s3vze_e` z%6j}6>T$}YtyjM#)YBfd^sLOj_?{rFm`=01Yu<7nE*taHqg_PA<@b8i z%7mFmscM_%F*?A~vh!I>u(Al)kvM8u`HCVnpWc){{0z6ei@n+qk4Q)yg&D!B4-L~C zmN(=|R{^WI?eye&zw|IqRBi4272962P6XZQOC%}+7%`;aI(*ba;R*KYAykoQDwduD z^r`XD%$@2R&PeqHpO)t0HmvQwS60cSA}HZ034mvwyGjzJiO`7VF;*!Vzw~qr z?%iiadH@`lh#8O_K1HDb9HLDXP(SsHP5W=Jb)|(?)8!bki1ZPB@;>l}@XylYH7roo zBc1DbEys&X{V2_l2ma6Rb{>eNUaG)-zH_=+c1Rn#*KIo#juY;Yj` zMlaN~@KT1jA3mo&T%Q#0S$UEP4m4VS}xxlOmG z-PNp*d&!$Gy)lA9YD|qkk83;$QgQU>S(*tntKMV+T-_l7p>!}_3@K=xjQM*|vucut zlENucbldGa&I8N+YTcPJw>i+fQX(cbq8NigQym?!&V=mdzTN5lQ5Y_|aCy&D&meFg zS1(`b6dYWJ4qD=i*McY$a2oZbCjnj>q_h7l(M=qhJ zqgN0f-0u9L4j=`j6|=4y?_J=DfuDr}JmC0{AhhQPAWly=@!4RtR?#6y#K_EIMAR*r~lD|=HCG9Yyu!51H%i zBw1c}Yfl9pAF7Tz`pc|2hPDVnE@J9vq!L;Xe;CdIJaNlR@jB)9%6o!zPS`^-2e}@> z0K&x-`Kp*Iwm{xl%J9il&3`?!v0y*wDWjq9)X2G3z6dO=W;auy%fak)PSvU$&k;Gm zz<*oSMai4%Wq8#PB(jFUtiS_6luVFWj!k>|gcZzA*~LL+HpW@0<>JdGWP5$r^wTY- z1J?(n&BG>le}w+~0vvt5_5SPmOROE7erxf=6VVaJ3Zj~gU2{x>`C1~7jxerr-HRA| zN{oEK;67OX;PEc)(ZTRa#wD-AW$)qZKZg6H1CO`+3`9*@jL%Q6`Xo7i#y_kN@DNiE z+vxIa>Nr20mM`kTUkcSy)_-oIs@5sxBh>gYQev7QBvY~7szHfv7pCF5oR~Hc z=BqEYF&__WYX!WiNA!7(t>1zLoO%oHUn!*lga9lD344{c4iXpb5gl2*ueO%G(U&9(Qe72h8!vq+oicDA7r=fn%Q)!2W4+ADk}!6ilS7J}1va$U(^d!kE; 
zk*Kl$(yDO3lq%q=*UplHBfE$AB zh)qgendkML;u7(>X(<0t^lXED966>1&lG~2lFM{)9m6kQHWjG>zViCA0z=&~n+bEJ^ls#Hw9JTP;)JFTsUE=x~txQtau2?PjeM`MW7-lLLL%Qg8C zUut__fzpxeag#Dk#>+HJD(i~G;f_OnItb!a7mYJHT&LO%#u)ON$J_FKr7$s4<9}E! zf4rUVb_qi`FP|QJq~lia@Ytf&E6>4lTHhi=&-LdKWb*Cm)uOrBXCZ*dF<=O{_1IEj zv%E;!j`WT;_4ggo&<@Wtc)1P#LUU=k+qUI*#tfpTlDZ|#YbC^;j$8-~`;BkccZzcQ z+MB5T=WeJVJVCD<_Y$^CXSmbcv2QVS0(_A6oM2(CQ8xb9X`gV#i&yU_w!datbbi<3xe@(hhOAOM{ItDskE}cvt*Zh3CY3QK+@hI>MBs<0#x*3;i z7wTTG6-Vf8(#rLFq&au{RhuJ4E&WQg zo?ZObG6BmlBK(XQ(~&)MYh$;_YC6v~LvMp$OaV~%i)+3tbKkfL*_Ek;XL)Zt={!Um zWUM)Byj!88E#0JwYckB_m}x=fozkzdc314==Ua}2=MJE%yEqPaX~-IdG=btCl?J!` z_sY#Ei04qF>Ex2J7Nb!Q`sK*C!NHe9e6m6l;Jg)+7U;3gt2Dd0?N(@PTNuM5TE49X z=6JrQV)y4&6XtV(V-8DQy%ui_rbIm>WQ zBVYO=|J~xrNXEyrTj?$M@2`=UTk)gskcs--@(ODF1C?1rwL)SM`s;8NPpmuKyvmSS z*B;Qo71F&RiPn z@nv%v^l(R_vPOEg>(Kn;;rHz}Qgh!a+vLlL#m8~aJR8sp_r)=gt~^iU^~=%?gzi53 ztGmrV9t_*fg@z4W0oJ94{OfhJh&16|W@K?a-$*N$hOho|VGT9m;L>|UdC#fWMWu1^ zP7=j6_u1Yn16~|YLuin24dYyOoD^h65GW_bMZ3~!UI>Un!f=6-W=}Qbcp+iF)M0Xr zO}#Dh^wHAClgWuXtIb z6Yc6S)rMLW#daKvt>>D8gd_c?L^0e>r!dTj!DAO~b6?-4di{E)I@d+h0n}~VZN`g} znF^zmz1K=5^eO`T`3NRA02;QI;%qk?hCFYY`?_43twlt(#o@<_tYXb^bQdvn_C#xh z!VtVBDn=OEgWOxa=vvidtg<*n9Sen*p-u)a}kPJtv{?=;cJxq?!S32`R7O1YRRHXqOw8L z9{!9B0=$k2)1;|!qKtkdH6j#OxX;!!b6wMCdV212_JYegA&1#p6ulIR^YZ;hwQbj6 zyT3l$p|*xz&0x*k$;3_c^#XsN69V)Cd>4OORO%_0pJR9n-#s^wHfi~ATTC7myuvl& zqq~#6|M5COz&$>B1*2#GSDs};$ar<~*oF;qE)*xH{+PUAM!aVnFjFcLW=54|h-orlh+eMb~If_YGPkYvdL<^NcP@V3!dH3NzBw^dWyV@K(kxa zp{O|I|Xw7!fzYn^2O^y?j;Ma!_>Glq9HucRI?xp!~s5DO70$p6y zI(m>*P?_x6tvIoJc3xfKsLtI-3h2{9`+M>~b^?+p#qeqC}bqh9|PL3CgIHW{x0 zx2c;Q5Slpu36TZen2DBdxv_)0kDI)Vux%zAFSWWlbH097tX*f4o3&4p{R^?n<5x+6 zQ~MjD|BndRR{AFAwXjj~Mg)5DeC2HRLW8$&{Iyod&jr)e;oeq44pz55b*x`S6L2^> z6p!?C=ztjf?|8jR9dS2NmlmMJ_a)t+Reel=FN*n1SO#(p?YQu~qx3w+YD0gHlAR*E zfr-r=6L>w0Ay`=VqvS8O`fO8bEE-Ko_`B= zH`7zR{MP^_j^@GfIT03?R}I>@M;|C@G37hp#ySVPw_EXdorK=l4zOQE=&%_Y?EL!_ zkLc*Wff>!&uBNWCO0hAyQI>o2nnXjz&~=vYly31IPcj7OOK&l^aJb1V`2FaF 
z%IM1_5pEY^$m|G~X-E^3`Mzg|(3$ik_{BZf96is1H z@LLC6ULr{&3Z!>NM+!Ws6O{kUVmTOIJbUL~Lvd>XdXf-_3?tV2toK0*iSE=@k%c+H zGIbmAlaOKlj}#6_iIOC+(ieh&7yj=|hS;P9q5e^`gff>hmoEFh6xo=v-XmaG;vo0; zHqU3heCa!x<$1i$j`_|Jk_Fshz57_{Z;`zPi0skEoQ}EiWZ4=&aDn`-y3h$7z;*aC z+~>Ez6QKW-L=?DOp8Vt_odli%GrbEyxB$~IXBMLaet9>S;bcVMrO_wFD$#6U=XM3s zei`5vH9$823=mWX0MC)M5V3v$Uq}1^!t(zgkAR+txzj)Wk`e#^PamC7c0%Cq7978b z{ck@ynbOJ3{_Dj5vq$)m_bSnX0R3hT3YlQXQ6ivD2oaNnnh2P4=j6LNa993+{QacU k|E<6O*7p;lf|ib`=EJ$Vl$kp=0m4Ys?rAE)70h1zFZl`R>i_@% diff --git a/content/english/hpc/simd/intrinsics.md b/content/english/hpc/simd/intrinsics.md index 9a0ee437..21a8f0da 100644 --- a/content/english/hpc/simd/intrinsics.md +++ b/content/english/hpc/simd/intrinsics.md @@ -127,7 +127,7 @@ vmovapd YMMWORD PTR c[rax], ymm0 Sometimes, although quite rarely, this compiler interference makes things worse, so it is always a good idea to [check the assembly](/hpc/compilation/stages) and take a closer look at the emitted vector instructions (they usually start with a "v"). -Also, some of the intrinsics don't map to a single instruction but a short sequence of them (as a convenient shortcut). +Also, some of the intrinsics don't map to a single instruction but a short sequence of them, as a convenient shortcut: [broadcasts and extracts](../moving#register-aliasing) are a notable example. ```c++ alignas(32) float a[n]; @@ -45,62 +58,114 @@ for (int i = 0; i < n; i += 8) { } ``` -For allocating an array dynamically, we can use `std::aligned_alloc` which takes the alignment value and the size of array in bytes, and returns a pointer to the allocated memory (just like `new` does), which should be explicitly deleted when no longer used. +The [built-in vector types](../intrinsics) already have corresponding alignment requirements and assume aligned memory reads and writes — so you are always safe when allocating an array of `v8si`, but when converting it from `int*` you have to make sure it is aligned. 
+ +Similar to the scalar case, many arithmetic instructions take memory addresses as operands — [vector addition](../intrinsics) is an example — although you can't explicitly use it as an intrinsic and have to rely on the compiler. There are also a few other instructions for reading a SIMD block from memory, notably the [non-temporal](/hpc/cpu-cache/bandwidth#bypassing-the-cache) load and store operations that don't lift accessed data in the cache hierarchy. + +### Register Aliasing + +The first SIMD extension, MMX, started quite small. It only used 64-bit vectors, which were conveniently aliased to the mantissa part of an [80-bit float](/hpc/arithmetic/ieee-754) so that there was no need to introduce a separate set of registers. As the vector size grew with later extensions, the same [register aliasing](/hpc/architecture/assembly#instructions-and-registers) mechanism used in general-purpose registers was adopted for the vector registers to maintain backward compatibility: `xmm0` is the first half (128 bits) of `ymm0`, `xmm1` is the first half of `ymm1`, and so on. + +This feature, combined with the fact that the vector registers are located in the FPU, makes moving data between them and the general-purpose registers slightly complicated. + +To **extract** a specific value from a vector, you can use `_mm256_extract_epi32` and similar intrinsics. It takes the index of the integer to be extracted as the second parameter and generates different instruction sequences depending on its value. + +If you need to extract the first element, it generates the `vmovd` instruction (for `xmm0`, the first half of the vector): + +```nasm +vmovd eax, xmm0 +``` + +For other elements of an SSE vector, it generates the possibly slightly slower `vpextrd`: + +```nasm +vpextrd eax, xmm0, 1 +``` + +To extract anything from the second half of an AVX vector, it first has to extract that second half, and then the scalar itself. 
For example, here is how it extracts the last (eighth) element, -On most modern architectures, the `loadu` / `storeu` intrinsics should be equally as fast as `load` / `store` given that in both cases the blocks only intersect one cache line. The advantage of the latter is that they can act as free assertions that all reads and writes are aligned. It is worth noting that the GCC vector extensions always assume aligned memory reads and writes. Memory alignment issues is also one of the reasons why compilers can't always autovectorize efficiently. +```nasm +vextracti128 xmm0, ymm0, 0x1 +vpextrd eax, xmm0, 3 +``` -non-temporal load and store +There is a similar `_mm256_insert_epi32` intrinsic for overwriting specific elements: + +```nasm +mov eax, 42 -### Aliasing and Broadcasts +; v = _mm256_insert_epi32(v, 42, 0); +vpinsrd xmm2, xmm0, eax, 0 +vinserti128 ymm0, ymm0, xmm2, 0x0 -Register Aliasing +; v = _mm256_insert_epi32(v, 42, 7); +vextracti128 xmm1, ymm0, 0x1 +vpinsrd xmm2, xmm1, eax, 3 +vinserti128 ymm0, ymm0, xmm2, 0x1 +``` -MMX was originally used the integer (64-bit mantissa) part of a 80-bit float. +Takeaway: moving scalar data to and from vector registers is slow, especially when this isn't the first element. -Extracting and broadcasting +Instead of modifying just one element, you can also **broadcast** a single value into all its positions: ```nasm +; __m256i v = _mm256_set1_epi32(42); mov eax, 42 vmovd xmm0, eax vpbroadcastd ymm0, xmm0 ``` -You can [broadcast](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#expand=6331,5160,588&techs=AVX,AVX2&text=broadcast) a single value to a vector from a register or a memory location. - -Also, some of the intrinsics are not direct instructions, but short sequences of instructions. One example is the `extract` group of instructions, which are used to get individual elements out of vectors (e. g. 
`_mm256_extract_epi32(x, 0)` returns the first element out of 8-integer vector); it is quite slow (~5 cycles) to move data between "normal" and SIMD registers in general. +This is a frequently used operation, so you can also use a memory location: -### Tips +```nasm +; __m256 v = _mm256_broadcast_ss(&a[i]); +vbroadcastss ymm0, DWORD PTR [rdi] +``` -When using SIMD manually, it helps to print out contents of vector registers for debug purposes. You can do so by converting a vector variable into an array and then into a bitset: +If you want to avoid all this complexity, you can just dump the vector in memory and read its values back as scalars: ```c++ -template -void print(T var) { - unsigned *val = (unsigned*) &var; - for (int i = 0; i < 4; i++) - cout << bitset<32>(val[i]) << " "; +void print(__m256i v) { + auto t = (unsigned*) &v; + for (int i = 0; i < 8; i++) + cout << bitset<32>(t[i]) << " "; cout << endl; } ``` -In this particular case, it outputs 4 groups of 32 bits of a 128-bit wide vector. +This may not be fast or technically legal (the C++ standard doesn't specify what happens when you cast data like this), but it is simple, and I frequently use this code to print out the contents of a vector during debugging. +### Non-Contiguous Load -### Non-Blocked Reads +Later SIMD extensions added special "gather" and "scatter" instructions that read/write data non-sequentially using arbitrary array indices. These don't work 8 times faster though and are usually limited by the memory rather than the CPU, but they are still helpful for certain applications such as sparse linear algebra. -Since AVX2, you can use "gather" instructions that load data non-sequentially using arbitrary array indices. These don't work 8 times faster though and are usually limited by memory rather than CPU, but they are still helpful for stuff like sparse linear algebra. +Gather is available since AVX2, and various scatter instructions are available since AVX512. 
![](../img/gather-scatter.png) -AVX512 has similar "scatter" instructions that write data non-sequentially, using either indices or [a mask](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=compress&expand=4754,4479&techs=AVX_512). You can very efficiently "filter" an array this way using a predicate. +Let's see if they work faster than scalar reads. First, we create an array of size $N$ and $Q$ random read queries: ```c++ int a[N], q[Q]; +for (int i = 0; i < N; i++) + a[i] = rand(); + for (int i = 0; i < Q; i++) - checksum += a[q[i]]; + q[i] = rand() % N; ``` +In the scalar code, we add the elements specified by the queries to a checksum one by one: + +```c++ +int s = 0; + +for (int i = 0; i < Q; i++) + s += a[q[i]]; +``` + +And in the SIMD code, we use the `gather` instruction to do that for 8 different indexes in parallel: + ```c++ reg s = _mm256_setzero_si256(); @@ -111,8 +176,10 @@ for (int i = 0; i < Q; i += 8) { } ``` -Maybe move it to shuffling anyway? +They perform roughly the same, except when the array fits into the L1 cache: ![](../img/gather.svg) -The last two, gather and scatter, turn SIMD into proper parallel programming model, where most operations can be executed independently in terms of their memory locations. This is a huge deal: many AVX512-specific algorithms have been developed recently owning to these new instructions, and not just having twice as many SIMD lanes. +The purpose of `gather` and `scatter` is not to perform memory operations faster, but to get the data into registers to perform heavy computations on them. For anything costlier than just one addition, they are hugely favorable. + +The lack of (fast) gather and scatter instructions makes SIMD programming on CPUs very different from proper parallel computing environments that support independent memory access. You always have to engineer around it and employ various ways of organizing your data sequentially so that it can be loaded into registers. 
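For reference, the gather-based checksum from the benchmark above could be written out in full roughly as follows. This is only a sketch: the function names and the use of `std::vector` are choices made for this example, the tail handling is added here, and the `target("avx2")` attribute (a GCC/Clang feature) is assumed so that the code compiles without `-mavx2`.

```c++
#include <immintrin.h>
#include <cstddef>
#include <vector>

// Scalar reference: adds a[q[i]] for every query, one element at a time.
int checksum_scalar(const std::vector<int>& a, const std::vector<int>& q) {
    int s = 0;
    for (int idx : q)
        s += a[idx];
    return s;
}

// AVX2 version: fetches 8 indexed elements per iteration with a gather.
__attribute__((target("avx2")))
int checksum_gather(const std::vector<int>& a, const std::vector<int>& q) {
    __m256i s = _mm256_setzero_si256();
    std::size_t i = 0;
    for (; i + 8 <= q.size(); i += 8) {
        // load 8 indices (unaligned load, since vector storage may not be 32-byte aligned)
        __m256i idx = _mm256_loadu_si256((const __m256i*) &q[i]);
        // read a[idx[k]] for k = 0..7; the scale of 4 is sizeof(int)
        __m256i x = _mm256_i32gather_epi32(a.data(), idx, 4);
        s = _mm256_add_epi32(s, x);
    }
    // horizontal sum of the 8 partial sums
    alignas(32) int t[8];
    _mm256_store_si256((__m256i*) t, s);
    int res = 0;
    for (int k = 0; k < 8; k++)
        res += t[k];
    // remaining queries that do not fill a whole vector
    for (; i < q.size(); i++)
        res += a[q[i]];
    return res;
}
```

Both functions should return the same value as long as the sum does not overflow; on a machine without AVX2, only the scalar version can be called.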
diff --git a/content/english/hpc/simd/shuffing.md b/content/english/hpc/simd/shuffing.md index 3264ce42..15edd9a5 100644 --- a/content/english/hpc/simd/shuffing.md +++ b/content/english/hpc/simd/shuffing.md @@ -7,6 +7,8 @@ Masking is the most widely used technique for data manipulation, but there are m ### Permutations and Lookup Tables +AVX512 has similar "scatter" instructions that write data non-sequentially, using either indices or [a mask](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=compress&expand=4754,4479&techs=AVX_512). You can very efficiently "filter" an array this way using a predicate. + You can [permute](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=permute&techs=AVX,AVX2&expand=6331,5160) data inside a register almost arbitrarily. ```c++ From 0b84978d39ae16492bce40fa0b9498aca74e955d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 6 Feb 2022 06:25:15 +0300 Subject: [PATCH 107/531] note on vector type casting --- content/english/hpc/simd/intrinsics.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/simd/intrinsics.md b/content/english/hpc/simd/intrinsics.md index 21a8f0da..d68c7088 100644 --- a/content/english/hpc/simd/intrinsics.md +++ b/content/english/hpc/simd/intrinsics.md @@ -67,7 +67,7 @@ C/C++ compilers implement special *vector types* that refer to the data stored i - 256-bit `__m256`, `__m256d`, `__m256i`; - 512-bit `__m512`, `__m512d`, `__m512i`. -Registers themselves can hold data of any kind: these types are only used for type checking. To convert a variable to another type, you can do it the same way you would convert any other type, and it won't cost you anything. +Registers themselves can hold data of any kind: these types are only used for type checking. You can convert a vector variable to another vector type the same way you would normally convert any other type, and it won't cost you anything. 
### SIMD Intrinsics @@ -113,7 +113,9 @@ Here are a few more examples, just so that you get the gist of it: - `_mm256_cmpeq_epi32`: compare 8+8 packed `int`s and return a mask that contains ones for equal element pairs. - `_mm256_blendv_ps`: pick elements from one of two vectors according to a mask. -As you may have guessed, there is a combinatorially very large number of intrinsics. A very helpful reference for x86 SIMD intrinsics is the [Intel Intrinsics Guide](https://software.intel.com/sites/landingpage/IntrinsicsGuide/), which has groupings by categories and extensions, descriptions, pseudocode, associated assembly instructions, and their latency and throughput on Intel microarchitectures. You may want to bookmark that page. +As you may have guessed, there is a combinatorially very large number of intrinsics. For some reason, there are some operations that are agnostic to the type of data stored in registers, but only take a specific vector type (usually 32-bit float) — you just have to convert to and from it to use that intrinsic. To simplify the examples in this chapter, we will mostly work with 32-bit integers (`epi32`) in 256-bit AVX2 registers. + +A very helpful reference for x86 SIMD intrinsics is the [Intel Intrinsics Guide](https://software.intel.com/sites/landingpage/IntrinsicsGuide/), which has groupings by categories and extensions, descriptions, pseudocode, associated assembly instructions, and their latency and throughput on Intel microarchitectures. You may want to bookmark that page. The Intel reference is useful when you know that a specific instruction exists and just want to look up its name or performance info. When you don't know whether it exists, this [cheat sheet](https://db.in.tum.de/~finis/x86%20intrinsics%20cheat%20sheet%20v1.0.pdf) may do a better job. 
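To illustrate the point about type-agnostic operations that only accept one vector type, here is a small sketch that applies `_mm256_movemask_ps`, which exists only for float vectors, to the result of an integer comparison. The function name `equal_mask` is made up for this example, and the `target("avx2")` attribute (GCC/Clang) is assumed so that it compiles without `-mavx2`.

```c++
#include <immintrin.h>

// Returns an 8-bit mask whose k-th bit is set if and only if a[k] == b[k].
__attribute__((target("avx2")))
int equal_mask(const int* a, const int* b) {
    __m256i x = _mm256_loadu_si256((const __m256i*) a);
    __m256i y = _mm256_loadu_si256((const __m256i*) b);
    // all ones in the lanes where the elements are equal, zeros elsewhere
    __m256i m = _mm256_cmpeq_epi32(x, y);
    // movemask collects the top bit of each 32-bit lane; the cast only
    // changes the static type and generates no instructions
    return _mm256_movemask_ps(_mm256_castsi256_ps(m));
}
```

The cast intrinsics (`_mm256_castsi256_ps`, `_mm256_castps_si256`, and so on) are the explicit way to perform the free conversion between vector types described above.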
From e2b054622bf8ffa741134f336a5365a8c81847cb Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 6 Feb 2022 07:10:56 +0300 Subject: [PATCH 108/531] masking and blending --- content/english/hpc/simd/masking.md | 131 +++++++++++++++++++++------- 1 file changed, 100 insertions(+), 31 deletions(-) diff --git a/content/english/hpc/simd/masking.md b/content/english/hpc/simd/masking.md index 959969cb..47d069b0 100644 --- a/content/english/hpc/simd/masking.md +++ b/content/english/hpc/simd/masking.md @@ -3,27 +3,41 @@ title: Masking and Blending weight: 4 --- -If you took some time to study [the reference](https://software.intel.com/sites/landingpage/IntrinsicsGuide), you may have noticed that there are essentially two major groups of vector operations: +One of the bigger challenges of SIMD programming is that its options for control flow are very limited — because the operations you apply to a vector are the same for all its elements. -1. Instructions that perform some elementwise operation (`+`, `*`, `<`, `acos`, etc.). -2. Instructions that load, store, mask, shuffle and generally move data around. - -While using the elementwise instructions is easy, the largest challenge with SIMD is getting the data in vector registers in the first place, with low enough overhead so that the whole endeavor is worthwhile. +This makes the problems that are usually trivially resolved with an `if` or any other type of branching much harder. With SIMD, they have to be dealt with by the means of various [branchless programming](/hpc/pipelining/branchless) techniques, which aren't always that straightforward to apply. ### Masking -maskmov, maskstore - -SIMD has no easy way to do branching, because the control flow should be the same for all elements in a vector. To overcome this limitation, we can "mask" operations that should only be performed on a subset of elements, in a way similar to how a [conditional move](/hpc/analyzing-performance/assembly) is executed. 
+The main way to make a computation branchless is through *predication* — computing the results of both branches and then using either some arithmetic trick or a special "conditional move" instruction: ```c++ for (int i = 0; i < N; i++) a[i] = rand() % 100; +int s = 0; + +// branch: +for (int i = 0; i < N; i++) + if (a[i] < 50) + s += a[i]; + +// no branch: +for (int i = 0; i < N; i++) + s += (a[i] < 50) * a[i]; + +// also no branch: for (int i = 0; i < N; i++) s += (a[i] < 50 ? a[i] : 0); ``` +To vectorize this loop, we are going to need two new instructions: + +- `_mm256_cmpgt_epi32`, which compares the integers in two vectors and produces a mask of all ones if the first element is more than the second and a mask of full zeros otherwise. +- `_mm256_blendv_epi8`, which blends (combines) the values of two vectors based on the provided mask. + +By masking and blending the elements of a vector so that only the selected subset of them is affected by computation, we can perform predication in a manner similar to the conditional move: + ```c++ const reg c = _mm256_set1_epi32(49); const reg z = _mm256_setzero_si256(); @@ -37,6 +51,10 @@ for (int i = 0; i < N; i += 8) { } ``` +(Minor details such as [horizontal summation and accounting for the remainder of the array](../reduction) are omitted for brevity.) + +This is how predication is usually done in SIMD, but it isn't always the most optimal approach. We can use the fact that one of the blended values is zero, and use bitwise `and` with the mask instead of blending: + ```c++ const reg c = _mm256_set1_epi32(50); reg s = _mm256_setzero_si256(); @@ -49,6 +67,15 @@ for (int i = 0; i < N; i += 8) { } ``` +This loop performs slightly faster because on this particular CPU, the vector `and` take one cycle less than `blend`. 
+ +There are several other instructions that support masks as inputs, most notably: + +- The `_mm256_blend_epi32` intrinsic is a `blend` that takes an 8-bit integer mask instead of a vector (which is why it doesn't have `v` at the end). +- The `_mm256_maskload_epi32` and `_mm256_maskstore_epi32` intrinsics that load/store a SIMD block from memory and `and` it with a mask in one go. + +We can also use predication with built-in vector types: + ```c++ vec *v = (vec*) a; vec s = {}; @@ -57,27 +84,16 @@ for (int i = 0; i < N / 8; i++) s += (v[i] < 50 ? v[i] : 0); ``` - -```nasm -vpcmpeqd ymm0, ymm1, YMMWORD PTR a[0+rdx*4] -vptest ymm0, ymm0 -je .L2 -``` - -```nasm -vpcmpeqd ymm0, ymm1, YMMWORD PTR a[0+rdx*4] -vmovmskps eax, ymm0 -test eax, eax -je .L9 -``` - -All at around 13 GFLOPS, and the compiler can handle vectorization by itself. Let's move on to more complex examples that can't be auto-vectorized. +All these versions work at around 13 GFLOPS as this example is so simple that the compiler can vectorize the loop all by itself. Let's move on to more complex examples that can't be auto-vectorized. ### Searching -4.4: +In the next example, we need find a specific value in an array and return its position: ```c++ +const int N = (1<<12); +int a[N]; + int find(int x) { for (int i = 0; i < N; i++) if (a[i] == x) @@ -86,7 +102,21 @@ int find(int x) { } ``` -19.63, ~5 times faster: +To benchmark the `find` function, we fill the array with numbers from $0$ to $(N - 1)$ and then repeatedly search for a random element: + +```c++ +for (int i = 0; i < N; i++) + a[i] = i; + +for (int t = 0; t < K; t++) + checksum ^= find(rand() % N); +``` + +The scalar versions gives ~4 GFLOPS of performance. This number includes the elements we haven't had to process, so divide this number by two in your head (the expected fraction of the elements we have to check). 
+ +To vectorize it, we need to compare a vector of its elements with the searched value for equality, producing a mask, and then somehow check if this mask is zero. If it isn't, the needed element is somewhere within this block of 8. + +To check if the mask is zero, we can use the `_mm256_movemask_ps` intrinsic, which takes the first bit of each 32-bit element in a vector and produces an 8-bit integer mask out of them. We can then check if this mask is non-zero — and if it is, also immediately get the index with the `ctz` instruction: ```c++ int find(int needle) { @@ -104,7 +134,16 @@ int find(int needle) { } ``` -A slightly faster alternative: +This version gives ~20 GFLOPS, or about 5 times faster than the scalar one. It only uses 3 instructions in the hot loop: + +```nasm +vpcmpeqd ymm0, ymm1, YMMWORD PTR a[0+rdx*4] +vmovmskps eax, ymm0 +test eax, eax +je loop +``` + +Checking if a vector is zero is a common operation, and there is an operation similar to `test` in SIMD that we can use: ```c++ int find(int needle) { @@ -123,20 +162,30 @@ int find(int needle) { } ``` +We are still using `movemask` to do `ctz` later, but the hot loop is now one instruction shorter: + +```nasm +vpcmpeqd ymm0, ymm1, YMMWORD PTR a[0+rdx*4] +vptest ymm0, ymm0 +je loop +``` + +This doesn't improve performance much on this particular architecture as the `movemask` wasn't the bottleneck. 
+ ### Counting Values -15 GFLOPS: +As the final exercise, let's find the count of a value in an array instead of just its first occurrence: ```c++ -int count(int needle) { +int count(int x) { int cnt = 0; for (int i = 0; i < N; i++) - cnt += (a[i] == needle); + cnt += (a[i] == x); return cnt; } ``` -Also 15 GFLOPS: +To vectorize it, we just need to convert the comparison mask to either one or zero per element and calculate the sum: ```c++ const reg ones = _mm256_set1_epi32(1); @@ -154,10 +203,28 @@ int count(int needle) { return hsum(s); } +``` + +Both implementations yield ~15 GFLOPS: the compiler is able to vectorize the first one by itself. +But a trick that the compiler can't find is to notice that the mask of all ones is [minus one](/hpc/arithmetic/integer) when reinterpreted as an integer. So we can skip the and-the-lowest-bit part and use the mask itself, and then just negate the final result: + +```c++ +int count(int needle) { + reg x = _mm256_set1_epi32(needle); + reg s = _mm256_setzero_si256(); + + for (int i = 0; i < N; i += 8) { + reg y = _mm256_load_si256( (reg*) &a[i] ); + reg m = _mm256_cmpeq_epi32(x, y); + s = _mm256_add_epi32(s, m); + } + + return -hsum(s); +} ``` -The trick that the compiler couldn't find is to notice that all ones is minus one. So we can use it as the negative count, achieving 22 GFLOPS: +This doesn't improve the performance in this particular architecture because the throughput is actually bottlenecked by updating `s`: there is a dependency on the previous iteration, so the loop can't proceed faster than one iteration per CPU cycle. We can make use of [instruction-level parallelism](../reduction#instruction-level-parallelism) if we split the accumulator in two: ```c++ int count(int needle) { @@ -179,3 +246,5 @@ int count(int needle) { return -hsum(s1); } ``` + +It now it gives ~22 GFLOPS of performance, which is as high as it can get. 
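The diff above only shows fragments of the final two-accumulator `count`, and `hsum` is never defined in it, so here is a self-contained sketch of how the pieces could fit together. It is an approximation under stated assumptions rather than the author's exact code: the `hsum` body is a simple guess, the array and its length are passed as parameters instead of the globals `a` and `N`, the length is assumed to be a multiple of 16, unaligned loads are used, and the `target("avx2")` attribute is GCC/Clang-specific.

```c++
#include <immintrin.h>
#include <cassert>

typedef __m256i reg;

// Assumed helper (not shown in the patch): sums the eight 32-bit lanes of a vector.
__attribute__((target("avx2")))
int hsum(reg v) {
    alignas(32) int t[8];
    _mm256_store_si256((reg*) t, v);
    int s = 0;
    for (int i = 0; i < 8; i++)
        s += t[i];
    return s;
}

// Counts the occurrences of needle in a[0..n), where n is a multiple of 16.
__attribute__((target("avx2")))
int count(const int *a, int n, int needle) {
    assert(n % 16 == 0);

    reg x = _mm256_set1_epi32(needle);
    // two independent accumulators, so consecutive additions don't wait on each other
    reg s1 = _mm256_setzero_si256();
    reg s2 = _mm256_setzero_si256();

    for (int i = 0; i < n; i += 16) {
        reg m1 = _mm256_cmpeq_epi32(x, _mm256_loadu_si256((const reg*) &a[i]));
        reg m2 = _mm256_cmpeq_epi32(x, _mm256_loadu_si256((const reg*) &a[i + 8]));
        // an all-ones lane is -1 as a signed integer, so each match subtracts one
        s1 = _mm256_add_epi32(s1, m1);
        s2 = _mm256_add_epi32(s2, m2);
    }

    s1 = _mm256_add_epi32(s1, s2);
    return -hsum(s1); // negate to turn the negative count into the actual count
}
```

Calling `count(a, n, x)` on an array whose length is not a multiple of 16 would need a scalar tail loop, which is left out here for brevity.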
From 12bc3ab21eabda10ed11cdb25988dfc363d9d566 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 6 Feb 2022 07:41:27 +0300 Subject: [PATCH 109/531] masking edits --- content/english/hpc/simd/masking.md | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/content/english/hpc/simd/masking.md b/content/english/hpc/simd/masking.md index 47d069b0..33aa38de 100644 --- a/content/english/hpc/simd/masking.md +++ b/content/english/hpc/simd/masking.md @@ -69,7 +69,7 @@ for (int i = 0; i < N; i += 8) { This loop performs slightly faster because on this particular CPU, the vector `and` take one cycle less than `blend`. -There are several other instructions that support masks as inputs, most notably: +Several other instructions support masks as inputs, most notably: - The `_mm256_blend_epi32` intrinsic is a `blend` that takes an 8-bit integer mask instead of a vector (which is why it doesn't have `v` at the end). - The `_mm256_maskload_epi32` and `_mm256_maskstore_epi32` intrinsics that load/store a SIMD block from memory and `and` it with a mask in one go. @@ -88,7 +88,7 @@ All these versions work at around 13 GFLOPS as this example is so simple that th ### Searching -In the next example, we need find a specific value in an array and return its position: +In the next example, we need to find a specific value in an array and return its position: ```c++ const int N = (1<<12); @@ -112,7 +112,7 @@ for (int t = 0; t < K; t++) checksum ^= find(rand() % N); ``` -The scalar versions gives ~4 GFLOPS of performance. This number includes the elements we haven't had to process, so divide this number by two in your head (the expected fraction of the elements we have to check). +The scalar version gives ~4 GFLOPS of performance. This number includes the elements we haven't had to process, so divide this number by two in your head (the expected fraction of the elements we have to check). 
To vectorize it, we need to compare a vector of its elements with the searched value for equality, producing a mask, and then somehow check if this mask is zero. If it isn't, the needed element is somewhere within this block of 8. @@ -134,7 +134,7 @@ int find(int needle) { } ``` -This version gives ~20 GFLOPS, or about 5 times faster than the scalar one. It only uses 3 instructions in the hot loop: +This version gives ~20 GFLOPS or about 5 times faster than the scalar one. It only uses 3 instructions in the hot loop: ```nasm vpcmpeqd ymm0, ymm1, YMMWORD PTR a[0+rdx*4] @@ -205,7 +205,7 @@ int count(int needle) { } ``` -Both implementations yield ~15 GFLOPS: the compiler is able to vectorize the first one by itself. +Both implementations yield ~15 GFLOPS: the compiler can vectorize the first one all by itself. But a trick that the compiler can't find is to notice that the mask of all ones is [minus one](/hpc/arithmetic/integer) when reinterpreted as an integer. So we can skip the and-the-lowest-bit part and use the mask itself, and then just negate the final result: @@ -247,4 +247,6 @@ int count(int needle) { } ``` -It now it gives ~22 GFLOPS of performance, which is as high as it can get. +It now gives ~22 GFLOPS of performance, which is as high as it can get. 
+ + From f7d0d3a0112f885366dcda9dfd7c5a4708d99f2f Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 6 Feb 2022 08:21:20 +0300 Subject: [PATCH 110/531] lookup tables draft --- content/english/hpc/simd/shuffing.md | 35 ++++++++++++++++++++++------ 1 file changed, 28 insertions(+), 7 deletions(-) diff --git a/content/english/hpc/simd/shuffing.md b/content/english/hpc/simd/shuffing.md index 15edd9a5..83b3f262 100644 --- a/content/english/hpc/simd/shuffing.md +++ b/content/english/hpc/simd/shuffing.md @@ -3,15 +3,19 @@ title: In-Register Shuffles weight: 6 --- -Masking is the most widely used technique for data manipulation, but there are many other handy SIMD features that we will later use in this chapter: +[Masking](../masking) lets you apply operations to only a subset of vector elements. It is a very effective and frequently used data manipulation technique, but in many cases, you need to perform more advanced operations that involve permuting values inside a vector register instead of just blending them with other vectors. -### Permutations and Lookup Tables +The problem is that adding a separate element-shuffling instruction for each possible use case in hardware is unfeasible. What we can do though is to add just one general permutation instruction that takes the indices of a permutation and produce these indices using precomputed lookup tables. -AVX512 has similar "scatter" instructions that write data non-sequentially, using either indices or [a mask](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=compress&expand=4754,4479&techs=AVX_512). You can very efficiently "filter" an array this way using a predicate. +This general idea is perhaps too abstract, so let's jump straight to examples. -You can [permute](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=permute&techs=AVX,AVX2&expand=6331,5160) data inside a register almost arbitrarily. 
+### Permutations and Lookup Tables + +One very important data processing primitive is the `filter`. It takes an array as input and writes out only the elements that satisfy a given predicate. In a single-threaded scalar case, it is trivially implemented by maintaining a counter that is incremented on each write: ```c++ +int a[N], b[N]; + int filter() { int k = 0; @@ -23,6 +27,16 @@ int filter() { } ``` +To vectorize it, we will use the `_mm256_permutevar8x32_epi32` intrinsic. It takes a vector of values and a vector of indices, and selects them correspondingly. It doesn't really permute but selects the values. + +The general idea: +- to calculate the predicate (perform the comparison and get the mask), +- use `movemask` to get a scalar 8-bit mask, +- then use a lookup use this instruction +- permute so that values are in the beginning +- write to the buffer only the element that satisfy the predicate (and maybe some garbage later) +- move pointer (by the popcnt of movemask) + 6-7x faster: ```c++ @@ -40,7 +54,11 @@ struct Precalc { }; constexpr Precalc T; +``` + +You can [permute](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=permute&techs=AVX,AVX2&expand=6331,5160) data inside a register almost arbitrarily. +```c++ const reg p = _mm256_set1_epi32(P); int filter() { @@ -61,11 +79,14 @@ int filter() { return k; } - ``` +It also doesn't depend on the value of `P`: + ![](../img/filter.svg) +AVX512 has similar "scatter" instructions that write data non-sequentially, using either indices or [a mask](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=compress&expand=4754,4479&techs=AVX_512). You can very efficiently "filter" an array this way using a predicate. + ### Shuffles and Popcount We can create tiny lookup tables with [pshufb](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=pshuf&techs=AVX,AVX2&expand=6331) instruction. 
This is useful when you have some logic that isn't implemented in SSE, and this operation is so instrumental in some algorithms that [Wojciech Muła](http://0x80.pl/) — the guy who came up with a half of the algorithms described in this chapter — took it as his [Twitter handle](https://twitter.com/pshufb). @@ -170,6 +191,6 @@ int popcnt() { Another way is through gather, but that is too slow. -https://github.com/WojciechMula/sse-popcount +### Acknowledgements -https://arxiv.org/pdf/1611.07612.pdf for the state-of-the-art. +Check out [Wojciech Muła's github repository](https://github.com/WojciechMula/sse-popcount) with different vectorized popcount implementations and his [latest paper](https://arxiv.org/pdf/1611.07612.pdf) for the detailed explanation of state-of-the-art. From 8e0ecd6c20ab5b9595fcb34aa05762e0c17be294 Mon Sep 17 00:00:00 2001 From: LokotoshKonstantin <32109396+LokotoshKonstantin@users.noreply.github.com> Date: Sun, 6 Feb 2022 15:41:14 +0300 Subject: [PATCH 111/531] Update selection.md --- content/russian/cs/sorting/selection.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/russian/cs/sorting/selection.md b/content/russian/cs/sorting/selection.md index 03491dec..78cab8c9 100644 --- a/content/russian/cs/sorting/selection.md +++ b/content/russian/cs/sorting/selection.md @@ -10,9 +10,9 @@ weight: 2 ```cpp void selection_sort(int *a, int n) { for (k = 0; k < n - 1; k++) - for (j = i + 1; j < n; j++) - if (a[i] > a[j]) - swap(a[j], a[i]); + for (j = k + 1; j < n; j++) + if (a[k] > a[j]) + swap(a[j], a[k]); } ``` From 42e198fee1ea5b54abf697d25d072ebfac4ab08c Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 6 Feb 2022 18:27:01 +0300 Subject: [PATCH 112/531] changing iteration variable name for consistency --- content/russian/cs/sorting/selection.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/content/russian/cs/sorting/selection.md b/content/russian/cs/sorting/selection.md index 
78cab8c9..2e3b1bc5 100644 --- a/content/russian/cs/sorting/selection.md +++ b/content/russian/cs/sorting/selection.md @@ -9,10 +9,10 @@ weight: 2 ```cpp void selection_sort(int *a, int n) { - for (k = 0; k < n - 1; k++) - for (j = k + 1; j < n; j++) - if (a[k] > a[j]) - swap(a[j], a[k]); + for (i = 0; i < n - 1; i++) + for (j = i + 1; j < n; j++) + if (a[i] > a[j]) + swap(a[j], a[i]); } ``` From f4df11a07487578a6fdd4fd340d2e073d019942b Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 6 Feb 2022 18:30:56 +0300 Subject: [PATCH 113/531] bugfix in selection sorting --- content/russian/cs/sorting/selection.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/content/russian/cs/sorting/selection.md b/content/russian/cs/sorting/selection.md index 2e3b1bc5..b47f2320 100644 --- a/content/russian/cs/sorting/selection.md +++ b/content/russian/cs/sorting/selection.md @@ -9,10 +9,10 @@ weight: 2 ```cpp void selection_sort(int *a, int n) { - for (i = 0; i < n - 1; i++) - for (j = i + 1; j < n; j++) - if (a[i] > a[j]) - swap(a[j], a[i]); + for (int k = 0; k < n - 1; k++) + for (j = k + 1; j < n; j++) + if (a[k] > a[j]) + swap(a[j], a[k]); } ``` From db2a9e2b09831a2ff17e7b802e3c6d7094dc543c Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 6 Feb 2022 21:23:41 +0300 Subject: [PATCH 114/531] note about low-sized data counts --- content/english/hpc/simd/masking.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/content/english/hpc/simd/masking.md b/content/english/hpc/simd/masking.md index 33aa38de..3ec8388d 100644 --- a/content/english/hpc/simd/masking.md +++ b/content/english/hpc/simd/masking.md @@ -249,4 +249,7 @@ int count(int needle) { It now gives ~22 GFLOPS of performance, which is as high as it can get. +When adapting this code for shorter data types, keep in mind that the accumulator may overflow. 
To work around this, add another accumulator of larger size and regularly stop the loop to add the values in the local accumulator to it and then reset the local accumulator. For example, for 8-bit integers, this means creating another inner loop that does $\lfloor \frac{256-1}{8} \rfloor = 15$ iterations. + + From ad1112dbcb226a92d3244dc1e12a6518079356e3 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 6 Feb 2022 22:18:45 +0300 Subject: [PATCH 115/531] char is signed by default --- content/english/hpc/arithmetic/integer.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/integer.md b/content/english/hpc/arithmetic/integer.md index 360fa9c1..74e01cca 100644 --- a/content/english/hpc/arithmetic/integer.md +++ b/content/english/hpc/arithmetic/integer.md @@ -74,7 +74,7 @@ Integers come in different sizes, but all function roughly the same. | Bits | Bytes | Signed C type | Unsigned C type | Assembly | |-----:|-------|---------------|----------------------|----------| -| 8 | 1 | `signed char` | `char` | `byte` | +| 8 | 1 | `char` | `unsigned char` | `byte` | | 16 | 2 | `short` | `unsigned short` | `word` | | 32 | 4 | `int` | `unsigned int` | `dword` | | 64 | 8 | `long long` | `unsigned long long` | `qword` | From a9b983be650694094c8d35ef982ac8d478437a0d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 6 Feb 2022 23:34:35 +0300 Subject: [PATCH 116/531] note on setr --- content/english/hpc/simd/moving.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/content/english/hpc/simd/moving.md b/content/english/hpc/simd/moving.md index e88dab35..70b71bef 100644 --- a/content/english/hpc/simd/moving.md +++ b/content/english/hpc/simd/moving.md @@ -106,6 +106,14 @@ vinserti128 ymm0, ymm0, xmm2, 0x1 Takeaway: moving scalar data to and from vector registers is slow, especially when this isn't the first element. 
+If you need to populate not just one element but the entire vector, you can use the `_mm256_setr_epi32` intrinsic: + +```c++ +__m256i iota = _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7); +``` + +The "r" here stands for "reversed" — from [the CPU point of view](/hpc/arithmetic/integer#integer-types), not for humans. There is also the `_mm256_set_epi32` (without "r") that fills the values from the opposite direction. Both are mostly used to create compile-time constants that are then fetched into the register with a block load. If your use case is filling a vector with zeros, use the `_mm256_setzero_si256` instead: it `xor`-s the register with itself. + Instead of modifying just one element, you can also **broadcast** a single value into all its positions: ```nasm From 910c1c8ad6e13b47eaf2b1d6f83fc9aae64b4a55 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 6 Feb 2022 23:45:21 +0300 Subject: [PATCH 117/531] popcounts and lookup tables --- content/english/hpc/simd/shuffing.md | 215 +++++++++++++++------------ 1 file changed, 122 insertions(+), 93 deletions(-) diff --git a/content/english/hpc/simd/shuffing.md b/content/english/hpc/simd/shuffing.md index 83b3f262..1d0b32d0 100644 --- a/content/english/hpc/simd/shuffing.md +++ b/content/english/hpc/simd/shuffing.md @@ -7,129 +7,74 @@ weight: 6 The problem is that adding a separate element-shuffling instruction for each possible use case in hardware is unfeasible. What we can do though is to add just one general permutation instruction that takes the indices of a permutation and produce these indices using precomputed lookup tables. -This general idea is perhaps too abstract, so let's jump straight to examples. +This general idea is perhaps too abstract, so let's jump straight to the examples. -### Permutations and Lookup Tables +### Shuffles and Popcount -One very important data processing primitive is the `filter`. It takes an array as input and writes out only the elements that satisfy a given predicate. 
In a single-threaded scalar case, it is trivially implemented by maintaining a counter that is incremented on each write: +*Population count*, also known as the *Hamming weight*, is the count of `1` bits in a binary string. -```c++ -int a[N], b[N]; +It is a frequently used operation, so there is a separate instruction on x86 that computes the population count of a word: -int filter() { - int k = 0; +```c++ +const int N = (1<<12); +int a[N]; +int popcnt() { + int res = 0; for (int i = 0; i < N; i++) - if (a[i] < P) - b[k++] = a[i]; - - return k; + res += __builtin_popcount(a[i]); + return res; } ``` -To vectorize it, we will use the `_mm256_permutevar8x32_epi32` intrinsic. It takes a vector of values and a vector of indices, and selects them correspondingly. It doesn't really permute but selects the values. - -The general idea: -- to calculate the predicate (perform the comparison and get the mask), -- use `movemask` to get a scalar 8-bit mask, -- then use a lookup use this instruction -- permute so that values are in the beginning -- write to the buffer only the element that satisfy the predicate (and maybe some garbage later) -- move pointer (by the popcnt of movemask) - -6-7x faster: +It also supports 64-bit integers, improving the total throughput twofold: ```c++ -struct Precalc { - alignas(64) int permutation[256][8]; - - constexpr Precalc() : permutation{} { - for (int m = 0; m < 256; m++) { - int k = 0; - for (int i = 0; i < 8; i++) - if (m >> i & 1) - permutation[m][k++] = i; - } - } -}; - -constexpr Precalc T; -``` - -You can [permute](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=permute&techs=AVX,AVX2&expand=6331,5160) data inside a register almost arbitrarily. 
- -```c++ -const reg p = _mm256_set1_epi32(P); - -int filter() { - int k = 0; - - for (int i = 0; i < N; i += 8) { - reg x = _mm256_load_si256( (reg*) &a[i] ); - - reg m = _mm256_cmpgt_epi32(p, x); - int mask = _mm256_movemask_ps((__m256) m); - reg permutation = _mm256_load_si256( (reg*) &T.permutation[mask] ); - - x = _mm256_permutevar8x32_epi32(x, permutation); - _mm256_storeu_si256((reg*) &b[k], x); - - k += __builtin_popcount(mask); - } - - return k; +int popcnt_ll() { + long long *b = (long long*) a; + int res = 0; + for (int i = 0; i < N / 2; i++) + res += __builtin_popcountl(b[i]); + return res; } ``` -It also doesn't depend on the value of `P`: - -![](../img/filter.svg) - -AVX512 has similar "scatter" instructions that write data non-sequentially, using either indices or [a mask](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=compress&expand=4754,4479&techs=AVX_512). You can very efficiently "filter" an array this way using a predicate. - -### Shuffles and Popcount +The only two instructions required are load-fused popcount and addition. They both have a high throughput, so the code processes about $8+8=16$ bytes per cycle as it is limited by the decode width of 4 on this CPU. -We can create tiny lookup tables with [pshufb](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=pshuf&techs=AVX,AVX2&expand=6331) instruction. This is useful when you have some logic that isn't implemented in SSE, and this operation is so instrumental in some algorithms that [Wojciech Muła](http://0x80.pl/) — the guy who came up with a half of the algorithms described in this chapter — took it as his [Twitter handle](https://twitter.com/pshufb). +These instructions were added to x86 CPUs around 2008 with SSE4. Let's temporarily go back in time before vectorization even became a thing and try to implement popcount by other means. 
-2 GFLOPS: +The naive way is to go through the binary string bit by bit: ```c++ +__attribute__ (( optimize("no-tree-vectorize") )) int popcnt() { int res = 0; for (int i = 0; i < N; i++) - res += __builtin_popcount(a[i]); + for (int l = 0; l < 32; l++) + res += (a[i] >> l & 1); return res; } ``` -4 GFLOPS: +As anticipated, it works just slightly faster than ⅛-th of a byte per cycle — at around 0.2. -```c++ -int popcnt() { - long long *b = (long long*) a; - int res = 0; - for (int i = 0; i < N / 2; i++) - res += __builtin_popcountl(b[i]); - return res; -} -``` - -0.49 GFLOPS (0.66 when switching to 16-bit and unsigned short). +We can try to process in bytes instead of individual bits by [precomputing](/hpc/compilation/precalc) a small 256-element *lookup table* that contains the population counts of individual bytes and then query it while iterating over raw bytes of the array: ```c++ struct Precalc { alignas(64) char counts[256]; constexpr Precalc() : counts{} { - for (int i = 0; i < 256; i++) - counts[i] = __builtin_popcount(i); + for (int m = 0; m < 256; m++) + for (int i = 0; i < 8; i++) + counts[m] += (m >> i & 1); } }; constexpr Precalc P; int popcnt() { - auto b = (unsigned char*) a; // char is signed by default + auto b = (unsigned char*) a; // careful: plain "char" is signed int res = 0; for (int i = 0; i < 4 * N; i++) res += P.counts[b[i]]; @@ -137,7 +82,13 @@ int popcnt() { } ``` -7.5-8 GFLOPS: +It now processes around 2 bytes per cycles, rising to ~2.7 if we switch to 16-bit words (`unsigned short`). + +This solution is still very slow compared the `popcnt` instruction, but now it can be vectorized. 
Instead of trying to speed it up through [gather](../moving#non-contiguous-load) instructions, we will go for another approach: make the lookup table small enough to fit inside a register and then use a special [pshufb](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=pshuf&techs=AVX,AVX2&expand=6331) instruction to look up its values in parallel. + +The original `pshufb` introduced in 128-bit SSE3 takes two registers: the lookup table containing 16 byte values and a vector of 16 4-bit indices (0 to 15), specifying which bytes to pick for each position. In 256-bit AVX2, instead of a 32-byte lookup table with awkward 5-bit indices, we have an instruction that independently the same shuffling operation over two 128-bit lanes. + +So, for our use case, we create a 16-byte lookup table with population counts for each nibble (half-byte), repeated twice: ```c++ const reg lookup = _mm256_setr_epi8( @@ -151,20 +102,22 @@ const reg lookup = _mm256_setr_epi8( /* 8 */ 1, /* 9 */ 2, /* a */ 2, /* b */ 3, /* c */ 2, /* d */ 3, /* e */ 3, /* f */ 4 ); +``` -const reg low_mask = _mm256_set1_epi8(0x0f); +Now, to compute the population count of a vector, we split each of its bytes into the lower and higher nibbles and then use this lookup table to retrieve their counts. The only thing left is to carefully sum them up: -const int block_size = (255 / 8) * 8; +```c++ +const reg low_mask = _mm256_set1_epi8(0x0f); int popcnt() { int k = 0; reg t = _mm256_setzero_si256(); - for (; k + block_size < N; k += block_size) { + for (; k + 15 < N; k += 15) { reg s = _mm256_setzero_si256(); - for (int i = 0; i < block_size; i += 8) { + for (int i = 0; i < 15; i += 8) { reg x = _mm256_load_si256( (reg*) &a[k + i] ); reg l = _mm256_and_si256(x, low_mask); @@ -189,8 +142,84 @@ int popcnt() { } ``` -Another way is through gather, but that is too slow. +This code processes around 30 bytes per cycle. 
Theoretically, the inner loop could do 32, but we have to stop it every 15 iterations because the 8-bit counters can overflow. + +The `pshufb` instruction is so instrumental in some SIMD algorithms that [Wojciech Muła](http://0x80.pl/) — the guy who came up with this algorithm — took it as his [Twitter handle](https://twitter.com/pshufb). You can calculate population counts even faster: check out his [github repository](https://github.com/WojciechMula/sse-popcount) with different vectorized popcount implementations and his [recent paper](https://arxiv.org/pdf/1611.07612.pdf) for a detailed explanation of the state-of-the-art. + +### Permutations and Lookup Tables + +One very important data processing primitive is the `filter`. It takes an array as input and writes out only the elements that satisfy a given predicate. In a single-threaded scalar case, it is trivially implemented by maintaining a counter that is incremented on each write: + +```c++ +int a[N], b[N]; + +int filter() { + int k = 0; + + for (int i = 0; i < N; i++) + if (a[i] < P) + b[k++] = a[i]; + + return k; +} +``` + +To vectorize it, we will use the `_mm256_permutevar8x32_epi32` intrinsic. It takes a vector of values and a vector of indices, and selects them correspondingly. It doesn't really permute but selects the values. 
+ +The general idea: +- to calculate the predicate (perform the comparison and get the mask), +- use `movemask` to get a scalar 8-bit mask, +- then use a lookup use this instruction +- permute so that values are in the beginning +- write to the buffer only the element that satisfy the predicate (and maybe some garbage later) +- move pointer (by the popcnt of movemask) + +6-7x faster: + +```c++ +struct Precalc { + alignas(64) int permutation[256][8]; + + constexpr Precalc() : permutation{} { + for (int m = 0; m < 256; m++) { + int k = 0; + for (int i = 0; i < 8; i++) + if (m >> i & 1) + permutation[m][k++] = i; + } + } +}; + +constexpr Precalc T; +``` + +You can [permute](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=permute&techs=AVX,AVX2&expand=6331,5160) data inside a register almost arbitrarily. + +```c++ +const reg p = _mm256_set1_epi32(P); + +int filter() { + int k = 0; + + for (int i = 0; i < N; i += 8) { + reg x = _mm256_load_si256( (reg*) &a[i] ); + + reg m = _mm256_cmpgt_epi32(p, x); + int mask = _mm256_movemask_ps((__m256) m); + reg permutation = _mm256_load_si256( (reg*) &T.permutation[mask] ); + + x = _mm256_permutevar8x32_epi32(x, permutation); + _mm256_storeu_si256((reg*) &b[k], x); + + k += __builtin_popcount(mask); + } + + return k; +} +``` + +It also doesn't depend on the value of `P`: -### Acknowledgements +![](../img/filter.svg) -Check out [Wojciech Muła's github repository](https://github.com/WojciechMula/sse-popcount) with different vectorized popcount implementations and his [latest paper](https://arxiv.org/pdf/1611.07612.pdf) for the detailed explanation of state-of-the-art. +AVX512 has similar "scatter" instructions that write data non-sequentially, using either indices or [a mask](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=compress&expand=4754,4479&techs=AVX_512). You can very efficiently "filter" an array this way using a predicate. 
From 44984d1a76a5d51bd2b85de91818db112508924d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 7 Feb 2022 00:10:38 +0300 Subject: [PATCH 118/531] filter with simd --- content/english/hpc/simd/shuffing.md | 35 +++++++++++++++------------- 1 file changed, 19 insertions(+), 16 deletions(-) diff --git a/content/english/hpc/simd/shuffing.md b/content/english/hpc/simd/shuffing.md index 1d0b32d0..64da2eb1 100644 --- a/content/english/hpc/simd/shuffing.md +++ b/content/english/hpc/simd/shuffing.md @@ -5,7 +5,7 @@ weight: 6 [Masking](../masking) lets you apply operations to only a subset of vector elements. It is a very effective and frequently used data manipulation technique, but in many cases, you need to perform more advanced operations that involve permuting values inside a vector register instead of just blending them with other vectors. -The problem is that adding a separate element-shuffling instruction for each possible use case in hardware is unfeasible. What we can do though is to add just one general permutation instruction that takes the indices of a permutation and produce these indices using precomputed lookup tables. +The problem is that adding a separate element-shuffling instruction for each possible use case in hardware is unfeasible. What we can do though is to add just one general permutation instruction that takes the indices of a permutation and produces these indices using precomputed lookup tables. This general idea is perhaps too abstract, so let's jump straight to the examples. @@ -82,9 +82,9 @@ int popcnt() { } ``` -It now processes around 2 bytes per cycles, rising to ~2.7 if we switch to 16-bit words (`unsigned short`). +It now processes around 2 bytes per cycle, rising to ~2.7 if we switch to 16-bit words (`unsigned short`). -This solution is still very slow compared the `popcnt` instruction, but now it can be vectorized. 
Instead of trying to speed it up through [gather](../moving#non-contiguous-load) instructions, we will go for another approach: make the lookup table small enough to fit inside a register and then use a special [pshufb](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=pshuf&techs=AVX,AVX2&expand=6331) instruction to look up its values in parallel. +This solution is still very slow compared to the `popcnt` instruction, but now it can be vectorized. Instead of trying to speed it up through [gather](../moving#non-contiguous-load) instructions, we will go for another approach: make the lookup table small enough to fit inside a register and then use a special [pshufb](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=pshuf&techs=AVX,AVX2&expand=6331) instruction to look up its values in parallel. The original `pshufb` introduced in 128-bit SSE3 takes two registers: the lookup table containing 16 byte values and a vector of 16 4-bit indices (0 to 15), specifying which bytes to pick for each position. In 256-bit AVX2, instead of a 32-byte lookup table with awkward 5-bit indices, we have an instruction that independently the same shuffling operation over two 128-bit lanes. @@ -148,7 +148,9 @@ The `pshufb` instruction is so instrumental in some SIMD algorithms that [Wojcie ### Permutations and Lookup Tables -One very important data processing primitive is the `filter`. It takes an array as input and writes out only the elements that satisfy a given predicate. In a single-threaded scalar case, it is trivially implemented by maintaining a counter that is incremented on each write: +Our last major example in this chapter is the `filter`. It is a very important data processing primitive that takes an array as input and writes out only the elements that satisfy a given predicate (in their original order). 
+ +In a single-threaded scalar case, it is trivially implemented by maintaining a counter that is incremented on each write: ```c++ int a[N], b[N]; @@ -164,17 +166,18 @@ int filter() { } ``` -To vectorize it, we will use the `_mm256_permutevar8x32_epi32` intrinsic. It takes a vector of values and a vector of indices, and selects them correspondingly. It doesn't really permute but selects the values. +To vectorize it, we will use the `_mm256_permutevar8x32_epi32` intrinsic. It takes a vector of values and individually selects them with a vector of indices. Despite the name, it doesn't *permute* values but just *copies* them to form a new vector: duplicates in the result are allowed. + +The general idea of our algorithm is as follows: -The general idea: -- to calculate the predicate (perform the comparison and get the mask), -- use `movemask` to get a scalar 8-bit mask, -- then use a lookup use this instruction -- permute so that values are in the beginning -- write to the buffer only the element that satisfy the predicate (and maybe some garbage later) -- move pointer (by the popcnt of movemask) +- calculate the predicate on a vector of data — in this case, this means performing the comparisons to get the mask; +- use the `movemask` instruction to get a scalar 8-bit mask; +- use this mask to index a lookup table that returns a permutation moving the elements that satisfy the predicate to the beginning of the vector (in their original order); +- use the `_mm256_permutevar8x32_epi32` intrinsic to permute the values; +- write the whole permuted vector to the buffer — it may have some trailing garbage, but its prefix is correct; +- calculate the population count of the scalar mask and move the buffer pointer by that amount. 
-6-7x faster: +First, we need to precompute the permutations: ```c++ struct Precalc { @@ -193,7 +196,7 @@ struct Precalc { constexpr Precalc T; ``` -You can [permute](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=permute&techs=AVX,AVX2&expand=6331,5160) data inside a register almost arbitrarily. +Then we can implement the algorithm itself: ```c++ const reg p = _mm256_set1_epi32(P); @@ -218,8 +221,8 @@ int filter() { } ``` -It also doesn't depend on the value of `P`: +The vectorized version takes some work to implement, but it is 6-7x faster than the scalar one (the speedup is slightly less for either low or high values of `P` as the [branch becomes predictable](/hpc/pipelining/branching)). ![](../img/filter.svg) -AVX512 has similar "scatter" instructions that write data non-sequentially, using either indices or [a mask](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=compress&expand=4754,4479&techs=AVX_512). You can very efficiently "filter" an array this way using a predicate. +This operation is considerably faster on AVX-512: it has a special "[compress](_mm512_mask_compress_epi32)" instruction that takes a vector of data and a mask and writes its unmasked elements contiguously. It makes a huge difference in algorithms that rely on various filtering subroutines. 
From 2c4e0a03eaf39fcf4909c7da9de78029e5dbe2fd Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 7 Feb 2022 01:17:04 +0300 Subject: [PATCH 119/531] auto-vectorization --- .../english/hpc/simd/auto-vectorization.md | 33 +++++++++---------- 1 file changed, 16 insertions(+), 17 deletions(-) diff --git a/content/english/hpc/simd/auto-vectorization.md b/content/english/hpc/simd/auto-vectorization.md index d7cca47c..9815cf50 100644 --- a/content/english/hpc/simd/auto-vectorization.md +++ b/content/english/hpc/simd/auto-vectorization.md @@ -3,32 +3,30 @@ title: Auto-Vectorization weight: 10 --- -Most often, SIMD is used for "embarrassingly parallel" computations: the ones where all you do is apply some elementwise function to all elements of an array and write it back somewhere else. In this setting, you don't even need to know how SIMD works: the compiler is perfectly capable of optimizing such loops by itself. All you need to know is that such optimization exists and yields a 5-10x speedup. +SIMD-parallelism is most often used for *embarrassingly parallel* computations: the kinds where all you do is apply some elementwise function to all elements of an array and write it back somewhere else. In this setting, you don't even need to know how SIMD works: the compiler is perfectly capable of optimizing such loops by itself — you just need to be aware that such optimization exists and that it usually yields a 5-10x speedup. -But most computations are not like that, and even the loops that seem straightforward to vectorize are often not optimized because of some tricky technical nuances. In this section, we will discuss how to assist the compiler in vectorization and walk through some more complicated patterns of using SIMD. +Doing nothing and relying on auto-vectorization is actually the preferred way of using SIMD. Whenever you can, you should always stick with the scalar code for its simplicity and maintainability. 
But often even the loops that seem straightforward to vectorize are not optimized because of some technical nuances. [As in many other cases](/hpc/compilation/contracts), the compiler may need some additional input from the programmer as he may know a bit more about the problem than what can be inferred from static analysis. -## Assisting Autovectorization +### Potential Problems -Of course, the preferred way of using SIMD is by the means of autovectorization. Whenever you can, you should always stick with the scalar code for its simplicity and maintainability. But, [as in many other cases](/hpc/analyzing-performance/compilation), compiler often needs some additional input from the programmer, who may know a little bit more about the problem. - -Consider the "a+b" example: +Consider the "a + b" example: ```c++ -void sum(int a[], int b[], int c[], int n) { +void sum(int *a, int *b, int *c, int n) { for (int i = 0; i < n; i++) c[i] = a[i] + b[i]; } ``` -This function can't be replaced with the vectorized variant automatically. Why? +Let's step into a compiler's shoes and think about what can go wrong when this loop is vectorized. -First, vectorization here is not always technically correct. Assuming that `a[]` and `c[]` intersect in a way that their beginnings differ by a single position — because who knows, maybe the programmer wanted to calculate the Fibonacci sequence through a convolution this way. In this case, the data in the SIMD blocks will intersect, and the observed behavior will differ from the one in the scalar case. +**Array size.** If the array size is unknown beforehand, it may be that it is too small for vectorization to be beneficial in the first place. Even if it is sufficiently large, we need to insert an additional check for the remainder of the loop to process it scalar, which would cost us a branch. -Second, we don't know anything about the alignment of these arrays, and we can lose some performance here by using unaligned instructions. 
+To eliminate these runtime checks, use array sizes that are compile-time constants, and preferably pad arrays to the nearest multiple of the SIMD block size. -On high (`-O3`) levels of optimization, when the compiler suspects that the function may be used for large cycles, it generates two implementation variants — a SIMDized and a "safe" one — and inserts runtime checks to choose between the two. +**Memory aliasing.** Even when array size issues are out of the question, vectorizing this loop is not always technically correct. For example, the arrays `a` and `c` can intersect in a way that their beginnings differ by a single position — because who knows, maybe the programmer wanted to calculate the Fibonacci sequence through a convolution this way. In this case, the data in the SIMD blocks will intersect and the observed behavior will differ from the one in the scalar case. -To avoid these runtime checks, we can tell compiler that we are sure that nothing will break. One way to do this is using the `__restrict__` keyword: +When the compiler can't prove that the function won't be used for intersecting arrays, it has to generate two implementation variants — a vectorized and a "safe" one — and insert runtime checks to choose between the two. To avoid them, we can tell the compiler that we are sure that no memory is aliased by adding the `__restrict__` keyword: ```cpp void add(int * __restrict__ a, const int * __restrict__ b, int n) { @@ -37,7 +35,7 @@ void add(int * __restrict__ a, const int * __restrict__ b, int n) { } ``` -The other, specific to SIMD, is the "ignore vector dependencies" pragma, which is the way to tell compiler that we are sure there are no dependencies between the loop iterations: +The other way, specific to SIMD, is the "ignore vector dependencies" pragma. It is a general way to inform the compiler that there are no dependencies between the loop iterations: ```c++ #pragma GCC ivdep @@ -45,11 +43,12 @@ for (int i = 0; i < n; i++) // ... 
``` -There are [many other ways](https://software.intel.com/sites/default/files/m/4/8/8/2/a/31848-CompilerAutovectorizationGuide.pdf) of hinting compiler what we meant exactly, but in especially complex cases — when inside the loop there are a lot of branches or some functions are called — it is easier to go down to the intrinsics level and write it yourself. +**Alignment.** The compiler also doesn't know anything about the alignment of these arrays and has to either process some elements at the beginning of these arrays before starting the vectorized section or potentially lose some performance by using [unaligned memory accesses](../moving). -`std::assume_aligned`, specifiers. This is useful for SIMD instructions that need memory alignment guarantees +To help the compiler eliminate this corner case, we can use the `alignas` specifier on static arrays and the `std::assume_aligned` function to mark pointers aligned. +**Checking if vectorization happened.** In either case, it is useful to check if the compiler vectorized the loop the way you intended. You can either [compile it to assembly](/hpc/compilation/stages) and look for blocks of instructions that start with a "v" or add the `-fopt-info-vec-optimized` compiler flag so that the compiler indicates where auto-vectorization is happening and what SIMD width is being used. If you swap `optimized` for `missed` or `all`, you may also get some reasoning behind why it is not happening in other places. -First of all, it is very useful to check if vectorization happened the way you intended by [compiling it to assembly](/hpc/compilation/stages) and taking a close look at the emitted instructions that start with "v". +--- -Also, if you specify the `-fopt-info-vec-optimized` flag, then the compiler will directly indicate where auto-vectorization is happening and what SIMD width is being used. If you swap `optimized` for `missed` or `all`, you may also get reasons why it is not happening in other places. 
+There are [many other ways](https://software.intel.com/sites/default/files/m/4/8/8/2/a/31848-CompilerAutovectorizationGuide.pdf) of hinting compiler what we meant exactly, but in especially complex cases — when inside the loop there are a lot of branches or some functions are called — it is easier to go down to the intrinsics level and write it yourself. From 829f22975731b218102e86af285266644f1cc979 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 7 Feb 2022 02:42:56 +0300 Subject: [PATCH 120/531] precalc --- content/english/hpc/compilation/precalc.md | 78 ++++++++++++++-------- 1 file changed, 49 insertions(+), 29 deletions(-) diff --git a/content/english/hpc/compilation/precalc.md b/content/english/hpc/compilation/precalc.md index bd496d30..6a6fd382 100644 --- a/content/english/hpc/compilation/precalc.md +++ b/content/english/hpc/compilation/precalc.md @@ -1,52 +1,70 @@ --- -title: Compile-Time Computation +title: Precomputation weight: 8 -draft: true --- -### Precalculation +When compilers can infer that a certain variable does not depend on any user-provided data, they can compute its value during compile-time and turn it into a constant by embedding it into the generated machine code. -A compiler can compute constants on its own, but it doesn't *have to*. +This optimization helps performance a lot, but it is not a part of the C++ standard, so compilers don't *have to* do that. When a compile-time computation is either hard to implement or time-intensive, they have a full legal right to pass on that opportunity. -```c++ -const int b = 4, B = (1 << b); +### Constant Expressions -// is it tight enough? -constexpr int round(int k) { - return k / B * B; // (k & ~(B - 1)); -} +In modern C++, you can mark a function as `constexpr`, and if it is called by passing constants, its value is guaranteed to be computed during compile-time: -constexpr int height(int m) { - return (m == 0 ? 
0 : height(m / B) + 1); +```c++ +constexpr int fibonacci(int n) { + if (n <= 2) + return 1; + return fibonacci(n - 1) + fibonacci(n - 2); } -constexpr int offset(int h) { - int res = 0; - int m = N; - while (h--) { - res += round(m) + B; - m /= B; +static_assert(fibonacci(10) == 55); +``` + +These functions have some restrictions like that they only call other `constexpr` functions and can't do memory allocation, but otherwise they are executed "as is". + +Note that while they don't cost anything during the run-time, they still increase compilation time, so at least remotely care about their efficiency and don't put something NP-complete in them: + +```c++ +constexpr int fibonacci(int n) { + int a = 1, b = 1; + while (n--) { + int c = a + b; + a = b; + b = c; } - return res; + return b; } +``` -constexpr int h = height(N); -alignas(64) int t[offset(h)]; -//int t[N * B / (B - 1)]; // +1? +There used to be much more limitations in earlier C++ standards, like you could not use any sort of state inside them and had to rely on recursion, so the whole process felt more like Haskell programming rather than C++. Since C++17, you can even compute static arrays using the imperative style, which is useful for precomputing lookup tables: -struct Meta { - alignas(64) int mask[B][B]; +```c++ +struct Precalc { + int isqrt[1000]; - constexpr Meta() : mask{} { - for (int k = 0; k < B; k++) - for (int i = 0; i < B; i++) - mask[k][i] = (i > k ? 
-1 : 0); + constexpr Precalc() : isqrt{} { + for (int i = 0; i < 1000; i++) + isqrt[i] = int(sqrt(i)); } }; -constexpr Meta T; +constexpr Precalc P; + +static_assert(P.isqrt[42] == 6); ``` +Note that when you call `constexpr` functions using non-constants, they compiler may or may not compute them during compile-time: + +```c++ +for (int i = 0; i < 100; i++) + cout << fibonacci(i) << endl; +``` + +In this example, even though technically we perform a constant number of iterations and call `fibonacci` with parameters known at compile-time, they are technically not compile-time constants. It's up to the compiler whether to optimize this loop or not, and for heavy computations, it often choses not to. + + From e9dcefeaf8df44ce000fdc06e7bf48a11b950641 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 7 Feb 2022 02:44:35 +0300 Subject: [PATCH 121/531] grammar --- content/english/hpc/compilation/precalc.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/english/hpc/compilation/precalc.md b/content/english/hpc/compilation/precalc.md index 6a6fd382..065f42d4 100644 --- a/content/english/hpc/compilation/precalc.md +++ b/content/english/hpc/compilation/precalc.md @@ -21,7 +21,7 @@ constexpr int fibonacci(int n) { static_assert(fibonacci(10) == 55); ``` -These functions have some restrictions like that they only call other `constexpr` functions and can't do memory allocation, but otherwise they are executed "as is". +These functions have some restrictions like that they only call other `constexpr` functions and can't do memory allocation, but otherwise, they are executed "as is". 
Note that while they don't cost anything during the run-time, they still increase compilation time, so at least remotely care about their efficiency and don't put something NP-complete in them: @@ -54,14 +54,14 @@ constexpr Precalc P; static_assert(P.isqrt[42] == 6); ``` -Note that when you call `constexpr` functions using non-constants, they compiler may or may not compute them during compile-time: +Note that when you call `constexpr` functions while passing non-constants, the compiler may or may not compute them during compile-time: ```c++ for (int i = 0; i < 100; i++) cout << fibonacci(i) << endl; ``` -In this example, even though technically we perform a constant number of iterations and call `fibonacci` with parameters known at compile-time, they are technically not compile-time constants. It's up to the compiler whether to optimize this loop or not, and for heavy computations, it often choses not to. +In this example, even though technically we perform a constant number of iterations and call `fibonacci` with parameters known at compile-time, they are technically not compile-time constants. It's up to the compiler whether to optimize this loop or not — and for heavy computations, it often chooses not to. + ### Non-Contiguous Load Later SIMD extensions added special "gather" and "scatter instructions that read/write data non-sequentially using arbitrary array indices. These don't work 8 times faster though and are usually limited by the memory rather than the CPU, but they are still helpful for certain applications such as sparse linear algebra. 
From d2f211538827af95d7a4e3dfb6e0934f9f8eca30 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 8 Feb 2022 05:18:30 +0300 Subject: [PATCH 125/531] use std namespace prefix --- content/english/hpc/simd/moving.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/simd/moving.md b/content/english/hpc/simd/moving.md index cac260d5..e2cf3035 100644 --- a/content/english/hpc/simd/moving.md +++ b/content/english/hpc/simd/moving.md @@ -157,8 +157,8 @@ If you want to avoid all this complexity, you can just dump the vector in memory void print(__m256i v) { auto t = (unsigned*) &v; for (int i = 0; i < 8; i++) - cout << bitset<32>(t[i]) << " "; - cout << endl; + std::cout << std::bitset<32>(t[i]) << " "; + std::cout << std::endl; } ``` From 40c752c05cc4867beb3c76745668581aff8eb25a Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 8 Feb 2022 06:07:30 +0300 Subject: [PATCH 126/531] note about immediates in simd intrinsics --- content/english/hpc/simd/intrinsics.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/simd/intrinsics.md b/content/english/hpc/simd/intrinsics.md index d68c7088..0b2b8d32 100644 --- a/content/english/hpc/simd/intrinsics.md +++ b/content/english/hpc/simd/intrinsics.md @@ -113,7 +113,9 @@ Here are a few more examples, just so that you get the gist of it: - `_mm256_cmpeq_epi32`: compare 8+8 packed `int`s and return a mask that contains ones for equal element pairs. - `_mm256_blendv_ps`: pick elements from one of two vectors according to a mask. -As you may have guessed, there is a combinatorially very large number of intrinsics. For some reason, there are some operations that are agnostic to the type of data stored in registers, but only take a specific vector type (usually 32-bit float) — you can just have to convert to and from it to use that intrinsic. 
To simplify the examples in this chapter, we will mostly work with 32-bit integers (`epi32`) in 256-bit AVX2 registers. +As you may have guessed, there is a combinatorially very large number of intrinsics, and in addition to that, some instructions also have immediate values — so their intrinsics require compile-time constant parameters: for example, the floating-point comparison instruction [has 32 different modifiers](https://stackoverflow.com/questions/16988199/how-to-choose-avx-compare-predicate-variants). + +For some reason, there are some operations that are agnostic to the type of data stored in registers, but only take a specific vector type (usually 32-bit float) — you just have to convert to and from it to use that intrinsic. To simplify the examples in this chapter, we will mostly work with 32-bit integers (`epi32`) in 256-bit AVX2 registers. A very helpful reference for x86 SIMD intrinsics is the [Intel Intrinsics Guide](https://software.intel.com/sites/landingpage/IntrinsicsGuide/), which has groupings by categories and extensions, descriptions, pseudocode, associated assembly instructions, and their latency and throughput on Intel microarchitectures. You may want to bookmark that page. From f4dd353ca5eaa570f22410fcd6fb12bf38b77d05 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 8 Feb 2022 06:45:38 +0300 Subject: [PATCH 127/531] update hpc index --- content/english/hpc/_index.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index d1b31dd6..a30fcf40 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -96,15 +96,13 @@ Planned table of contents: 9.2. Memory Latency 9.3. Cache Lines 9.4. Memory Sharing - 9.5. Data Alignment - 9.6. Structure Packing - 9.7. Pointer Alternatives - 9.8. Cache Associativity - 9.9. Memory Paging - 9.10. Memory-Level Parallelism - 9.11. Hardware Prefetching - 9.12. 
Software Prefetching - 9.13. AoS and SoA + 9.5. Memory-Level Parallelism + 9.6. Prefetching + 9.7. Alignment and Packing + 9.8. Pointer Alternatives + 9.9. Cache Associativity + 9.10. Memory Paging + 9.11. AoS and SoA 10. SIMD Parallelism 10.1. Intrinsics and Vector Types 10.2. Loading and Writing Data @@ -139,13 +137,14 @@ Planned table of contents: Among cool things that we will speed up: - 2x faster GCD (compared to `std::gcd`) -- 5x faster binary search (compared to `std::lower_bound`) +- 8-15x faster binary search (compared to `std::lower_bound`) - 7x faster segment trees - 5x faster hash tables (compared to `std::unordered_map`) - ~~?x faster popcount~~ - 2x faster parsing series of integers (compared to `scanf`) - ?x faster sorting (compared to `std::sort`) - 2x faster sum (compared to `std::accumulate`) +- 10x faster array searching (compared to `std::find`) - 100x faster matrix multiplication (compared to "for-for-for") - optimal word-size integer factorization (~0.4ms per 60-bit integer) - optimal Karatsuba Algorithm @@ -175,6 +174,7 @@ This work is largely based on blog posts, research papers, conference talks and - [Peter Cordes](https://stackoverflow.com/users/224132/peter-cordes) - [Geoff Langdale](https://branchfree.org/) - [Matt Kulukundis](https://twitter.com/JuvHarlequinKFM) +- [Georg Sauthoff](https://gms.tf/) - [ridiculous_fish](https://ridiculousfish.com/blog/) - [Creel](https://www.youtube.com/c/WhatsACreel) From 769cb5ada2c2d166532c0314dbdbe11b07bb2de7 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 8 Feb 2022 08:00:50 +0300 Subject: [PATCH 128/531] speed up array searching --- content/english/hpc/simd/masking.md | 98 ++++++++++++++++++++++++++++- 1 file changed, 96 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/simd/masking.md b/content/english/hpc/simd/masking.md index 3ec8388d..332597c1 100644 --- a/content/english/hpc/simd/masking.md +++ b/content/english/hpc/simd/masking.md @@ -88,7 +88,7 @@ All these versions 
work at around 13 GFLOPS as this example is so simple that th ### Searching -In the next example, we need to find a specific value in an array and return its position: +In the next example, we need to find a specific value in an array and return its position (aka `std::find`): ```c++ const int N = (1<<12); @@ -170,7 +170,101 @@ vptest ymm0, ymm0 je loop ``` -This doesn't improve performance much on this particular architecture as the `movemask` wasn't the bottleneck. +This doesn't improve performance much because both `vptest` and `vmovmskps` have a throughput of one and will bottleneck the computation regardless of anything else we do in the loop. + +To work around this limitation, we can iterate in blocks of 16 elements and combine the results of independent comparisons of two 256-bit AVX2 registers using a bitwise `or`: + +```c++ +int find(int needle) { + reg x = _mm256_set1_epi32(needle); + + for (int i = 0; i < N; i += 16) { + reg y1 = _mm256_load_si256( (reg*) &a[i] ); + reg y2 = _mm256_load_si256( (reg*) &a[i + 8] ); + reg m1 = _mm256_cmpeq_epi32(x, y1); + reg m2 = _mm256_cmpeq_epi32(x, y2); + reg m = _mm256_or_si256(m1, m2); + if (!_mm256_testz_si256(m, m)) { + int mask = (_mm256_movemask_ps((__m256) m2) << 8) + + _mm256_movemask_ps((__m256) m1); + return i + __builtin_ctz(mask); + } + } + + return -1; +} +``` + +With this obstacle removed, the performance now peaks at ~34 GFLOPS. But why not 40? Shouldn't it be twice as fast? + +Here is how one iteration of the loop looks in assembly: + +```nasm +vpcmpeqd ymm2, ymm1, YMMWORD PTR a[0+rdx*4] +vpcmpeqd ymm3, ymm1, YMMWORD PTR a[32+rdx*4] +vpor ymm0, ymm3, ymm2 +vptest ymm0, ymm0 +je loop +``` + +Every iteration, we need to execute 5 instructions. While the throughputs of all relevant execution ports allow us to do that in one cycle on average, we can't do that because the decode width of this particular CPU (Zen 2) is 4. Therefore, the performance is limited by ⅘ of what it could have been. 
+ + + +To mitigate this, we can once again double the number of SIMD blocks we process on each iteration: + +```c++ +unsigned get_mask(reg m) { + return _mm256_movemask_ps((__m256) m); +} + +reg cmp(reg x, int *p) { + reg y = _mm256_load_si256( (reg*) p ); + return _mm256_cmpeq_epi32(x, y); +} + +int find(int needle) { + reg x = _mm256_set1_epi32(needle); + + for (int i = 0; i < N; i += 32) { + reg m1 = cmp(x, &a[i]); + reg m2 = cmp(x, &a[i + 8]); + reg m3 = cmp(x, &a[i + 16]); + reg m4 = cmp(x, &a[i + 24]); + reg m12 = _mm256_or_si256(m1, m2); + reg m34 = _mm256_or_si256(m3, m4); + reg m = _mm256_or_si256(m12, m34); + if (!_mm256_testz_si256(m, m)) { + unsigned mask = (get_mask(m4) << 24) + + (get_mask(m3) << 16) + + (get_mask(m2) << 8) + + get_mask(m1); + return i + __builtin_ctz(mask); + } + } + + return -1; +} +``` + +It now shows the throughput of 43 GFLOPS — or about 10x faster than the original scalar implementation. + +Extending it to 64 values per cycle doesn't help: small arrays suffer from the overhead of all these additional `movemask`-s when we hit the condition, and larger arrays are bottlenecked by [memory bandwidth](/hpc/cpu-cache/bandwidth) anyway. ### Counting Values From 706f3b42efc654e052ea1ad8cb8f1ceac656554c Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 8 Feb 2022 22:35:09 +0300 Subject: [PATCH 129/531] horizontal minimum note --- content/english/hpc/simd/reduction.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/content/english/hpc/simd/reduction.md b/content/english/hpc/simd/reduction.md index 3dcb109c..28fb4d9c 100644 --- a/content/english/hpc/simd/reduction.md +++ b/content/english/hpc/simd/reduction.md @@ -70,6 +70,8 @@ int hsum(__m256i x) { There are [other similar instructions](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#techs=AVX,AVX2&ig_expand=3037,3009,5135,4870,4870,4872,4875,833,879,874,849,848,6715,4845&text=horizontal), e. g. 
for integer multiplication or calculating absolute differences between adjacent elements (used in image processing). +There is also one specific instruction, `_mm_minpos_epu16`, that calculates the horizontal minimum and its index among eight 16-bit integers. This is the only horizontal reduction that works in one go: all others are computed in multiple steps. + ### Instruction-Level Parallelism Our implementation matches what the compiler produces automatically, but it is actually [suboptimal](/hpc/pipelining/throughput): when we use just one accumulator, we have to wait one cycle between the loop iterations for vector addition to complete, while its throughput is 2 on this microarchitecture. From 9ee372e036206b998ee55d5b1c2aa42fe3e2ce00 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 8 Feb 2022 22:36:37 +0300 Subject: [PATCH 130/531] argmin draft --- content/english/hpc/algorithms/argmin.md | 54 +++++++++++++++++++----- 1 file changed, 43 insertions(+), 11 deletions(-) diff --git a/content/english/hpc/algorithms/argmin.md b/content/english/hpc/algorithms/argmin.md index 417495ed..85a6bd64 100644 --- a/content/english/hpc/algorithms/argmin.md +++ b/content/english/hpc/algorithms/argmin.md @@ -4,6 +4,20 @@ weight: 7 draft: true --- +Harmonic series: + +$$ +\frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \ldots + \frac{1}{n} = O(\ln(n)) +$$ + +To take a minimum value, + +It needs 5 for 100, 7 for 1000, and just 14 for $10^14$. + +SIMD extensions have a convenient `_mm256_min_epi32` that works in one cycle, so computing a mini + +Finding the *value* of a minimum + We create an array of *random* integers. ```c++ @@ -90,12 +104,22 @@ int argmin(int *a, int n) { ``` ```c++ -typedef __m256i reg; +const int B = 8; +typedef int vec __attribute__ (( vector_size(4 * B) )); + +vec min(vec x, vec y) { + return (x < y ? 
x : y); +} + +int mask(vec x) { + return _mm256_movemask_epi8((__m256i) x); +} int argmin(int *a, int n) { + vec *v = (vec*) a; + int m = INT_MAX, k = 0; - vec p, t; - t = p = m + vec{}; + vec p = m + vec{}; for (int i = 0; i < n / B; i++) { t = min(t, v[i]); @@ -128,7 +152,7 @@ int argmin() { t0 = min(t0, v[i]); t1 = min(t1, v[i + 1]); vec t = min(t0, t1); - int mask = _mm256_movemask_epi8((__m256i) (p == t)); + int mask = mask(((__m256i) (p == t)); if (mask != -1) { [[unlikely]] for (int j = B * i; j < B * i + 2 * B; j++) if (a[j] < m) @@ -141,13 +165,21 @@ int argmin() { } ``` +It drops to about 1.4 GFLOPS — almost 10 times as slow, although still on the level of scalar code. + ``` std 0.28 0.28 simple 1.58 1.94 -cmov 1.43 1.93 -hint 2.26 1.49 -index 4.36 4.36 -simdmin-single 9.26 0.53 -simdmin 14.22 1.37 -simdmin-testz 13.51 1.41 -``` \ No newline at end of file +cmov 1.44 1.94 +hint 2.26 1.5 +index 4.38 4.38 +simdmin-single 9.36 0.54 +simdmin 14.65 1.41 +simdmin-testz 13.59 1.41 +``` + +### Acknowledgements + +http://0x80.pl/notesen/2018-10-03-simd-index-of-min.html + +https://stackoverflow.com/questions/9795529/how-to-find-the-horizontal-maximum-in-a-256-bit-avx-vector Norbert P. 
and Peter Cordes From 81c77dbd659f59b4ff74a6867fad4a21b1236ce8 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 9 Feb 2022 04:43:44 +0300 Subject: [PATCH 131/531] argmin with simd --- content/english/hpc/algorithms/argmin.md | 291 +++++++++++++++-------- 1 file changed, 197 insertions(+), 94 deletions(-) diff --git a/content/english/hpc/algorithms/argmin.md b/content/english/hpc/algorithms/argmin.md index 85a6bd64..28727fe6 100644 --- a/content/english/hpc/algorithms/argmin.md +++ b/content/english/hpc/algorithms/argmin.md @@ -1,33 +1,28 @@ --- title: Argmin with SIMD weight: 7 -draft: true --- -Harmonic series: +Computing the *minimum* of an array [easily vectorizable](/hpc/simd/reduction), as it is not different from any other reduction: in AVX2, you just need to use a convenient `_mm256_min_epi32` intrinsic as the inner operation. It computes the minimum of two 8-element vectors in one cycle — even faster than in the scalar case, which requires at least a comparison and a conditional move. -$$ -\frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \ldots + \frac{1}{n} = O(\ln(n)) -$$ - -To take a minimum value, - -It needs 5 for 100, 7 for 1000, and just 14 for $10^14$. - -SIMD extensions have a convenient `_mm256_min_epi32` that works in one cycle, so computing a mini +Finding the index of that minimum element (*argmin*) is much harder, but it is still possible to compute very fast. -Finding the *value* of a minimum +### Baseline -We create an array of *random* integers. +For our benchmark, we create an array of random 32-bit integers, and then repeatedly try to find the index of the minimum among them (the first one if it isn't unique): ```c++ -const int n = (1 << 16); -alignas(32) int a[n]; +const int N = (1 << 16); +alignas(32) int a[N]; -for (int i = 0; i < n; i++) +for (int i = 0; i < N; i++) a[i] = rand(); ``` +For the sake of exposition, we assume that $N$ is a power of two. 
+ +To implement argmin in the scalar case, we just need to maintain the index instead of the minimum value: + ```c++ int argmin(int *a, int n) { int k = 0; @@ -40,6 +35,10 @@ int argmin(int *a, int n) { } ``` +It works in around 1.5 GFLOPS — meaning 1.5 values per cycle processed on average. + +Let's compare it to `std::min_element`: + ```c++ int argmin(int *a, int n) { int k = std::min_element(a, a + n) - a; @@ -47,139 +46,243 @@ int argmin(int *a, int n) { } ``` -The compiler couldn't pierce through STL's abstractions, which isn't surprising at this point. +When using the version from GCC, it gives 0.28 GFLOPS — apparently, the compiler couldn't pierce through all the abstractions. Another reminder to never use STL. + +### Vector of Indices + +The problem with vectorizing the scalar implementation is that there is a dependency between iterations. When we optimized [array sum](/hpc/simd/reduction), we faced the same problem, and we solved it by splitting the array into 8 slices, each representing a subset of its indices with the same remainder modulo 8. + +We can apply the same trick here, except that we also have to take array indices into account. When we have both the consecutive data and indices in vectors, we can process them in parallel using [predication](/hpc/pipelining/branchless) like this: ```c++ +typedef int vec __attribute__ (( vector_size(32) )); + int argmin(int *a, int n) { - int k = 0; + vec *v = (vec*) a; + + vec cur = {0, 1, 2, 3, 4, 5, 6, 7}; // indices on the current iteration + vec min = INT_MAX + vec{}; // the current minimum for each slice + vec idx; // its index (argmin) for each slice - for (int i = 0; i < n; i++) - if (a[i] < a[k]) [[unlikely]] - k = i; + for (int i = 0; i < n / 8; i++) { + vec mask = (v[i] < min); // find the slices where the minimum updated + min = (mask ? v[i] : min); // update the minimum + idx = (mask ? 
cur : idx); // update the indices + cur += 8; // increment the current indices + } - return k; + // find the argmin in the "min" array: + + int k = 0, m = min[0]; + + for (int i = 1; i < 8; i++) + if (min[i] < m) + m = min[k = i]; + + return idx[k]; // return its real index } ``` -[Optimized machine layout](/hpc/architecture/layout). +It works in around 4 GFLOPS. There is still some inter-dependency between the iterations, so we can optimize it by considering more than 8 elements per iteration and taking advantage of the [instruction-level parallelism](/hpc/simd/reduction#instruction-level-parallelism). But it won't improve the performance by a lot: on each iteration, we need a load, vector comparison, two blends, and a vector addition — that is 5 instructions in total to process 8 elements. Since the decode width of this CPU (Zen 2) is just 4, the performance will still be limited by ⅘ × 8 = 6.4 GFLOPS even if we get rid of the other bottlenecks. + +Instead, we will switch to another approach that requires fewer instructions per element. + +### Branches Aren't Scary + +When we run the scalar version, how often do we update the minimum? + +Intuition tells that, if all the values are drawn independently at random, then the event when the next element is less than all the previous ones shouldn't be frequent. More formally, the expected number of times the `a[i] < a[k]` condition is satisfied equals the sum of the harmonic series: + +$$ +\frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \ldots + \frac{1}{n} = O(\ln(n)) +$$ + +So the minimum is updated around 5 times for a hundred-element array, 7 for a thousand-element, and just 14 for a million-element array — which isn't large at all when looked at as a fraction of all is-new-minimum checks. 
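As a rough sanity check of the estimate above, here is a small sketch that compares the partial harmonic sum with the number of times the scalar loop actually updates its index on shuffled input. It is not part of the original benchmark: the names `expected_updates`, `count_updates`, and `average_updates`, the fixed seed, and the use of a random permutation of distinct values are all assumptions made for illustration.

```c++
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <vector>

// Sum 1/2 + 1/3 + ... + 1/n: the expected number of times the
// `a[i] < a[k]` condition holds for n distinct values in random order.
double expected_updates(int n) {
    assert(n >= 1);
    double s = 0;
    for (int i = 2; i <= n; i++)
        s += 1.0 / i;
    return s;
}

// Counts how many times the scalar argmin loop updates its index.
int count_updates(const std::vector<int> &a) {
    int k = 0, updates = 0;
    for (int i = 1; i < (int) a.size(); i++)
        if (a[i] < a[k]) {
            k = i;
            updates++;
        }
    return updates;
}

// Averages the update count over `trials` random permutations of 0..n-1.
double average_updates(int n, int trials, unsigned seed = 42) {
    std::mt19937 rng(seed);
    std::vector<int> a(n);
    std::iota(a.begin(), a.end(), 0);
    long long total = 0;
    for (int t = 0; t < trials; t++) {
        std::shuffle(a.begin(), a.end(), rng);
        total += count_updates(a);
    }
    return (double) total / trials;
}
```

Calling `expected_updates` for n = 100, 1000, and 10^6 gives roughly 4.2, 6.5, and 13.4, which is of the same order as the rounded figures quoted in the text. The averages from `average_updates` should land close to these values, up to sampling noise.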
+ +The compiler probably couldn't figure it out on its own, so let's [explicitly provide](/hpc/compilation/situational) this information: ```c++ int argmin(int *a, int n) { int k = 0; for (int i = 0; i < n; i++) - if (__builtin_expect_with_probability(a[i] < a[k], true, 0.5)) + if (a[i] < a[k]) [[unlikely]] k = i; return k; } ``` +The compiler [optimized the machine layout](/hpc/architecture/layout), and the CPU is now able to execute the loop at around 2 GFLOPS — a slight but sizeable improvement from 1.5 GFLOPS of the non-hinted loop. + +Here is the idea: if we are only updating the minimum a dozen or so times during the entire computation, we can ditch all the vector-blending and index updating and just maintain the minimum and regularly check if it has changed. Inside this check, we can use however slow method of updating the argmin we want because it will only be called a few times. + +To implement it with SIMD, all we need to do on each iteration is a vector load, a comparison, and a test-if-zero: + ```c++ -typedef int vec __attribute__ (( vector_size(32) )); +typedef __m256i reg; int argmin(int *a, int n) { - vec *v = (vec*) a; + int min = INT_MAX, idx = 0; - vec min = INT_MAX + vec{}; - vec idx; - - vec cur = {0, 1, 2, 3, 4, 5, 6, 7}; - - for (int i = 0; i < n / 8; i++) { - vec mask = (v[i] < min); - idx = (mask ? cur : idx); - min = (mask ? v[i] : min); - cur += B; + reg p = _mm256_set1_epi32(min); + + for (int i = 0; i < n; i += 8) { + reg y = _mm256_load_si256((reg*) &a[i]); + reg mask = _mm256_cmpgt_epi32(p, y); + if (!_mm256_testz_si256(mask, mask)) { [[unlikely]] + for (int j = i; j < i + 8; j++) + if (a[j] < min) + min = a[idx = j]; + p = _mm256_set1_epi32(min); + } } - int k = 0, m = min[0]; + return idx; +} +``` - for (int i = 1; i < 8; i++) - if (min[i] < m) - m = min[k = i]; +It already performs at ~8.5 GFLOPS, but now the loop is bottlenecked by the `testz` instruction which only has a throughput of one. 
The solution is to load two consecutive SIMD blocks and use the minimum of them so that the `testz` effectively processes 16 elements in one go: - return idx[k]; +```c++ +int argmin(int *a, int n) { + int min = INT_MAX, idx = 0; + + reg p = _mm256_set1_epi32(min); + + for (int i = 0; i < n; i += 16) { + reg y1 = _mm256_load_si256((reg*) &a[i]); + reg y2 = _mm256_load_si256((reg*) &a[i + 8]); + reg y = _mm256_min_epi32(y1, y2); + reg mask = _mm256_cmpgt_epi32(p, y); + if (!_mm256_testz_si256(mask, mask)) { [[unlikely]] + for (int j = i; j < i + 16; j++) + if (a[j] < min) + min = a[idx = j]; + p = _mm256_set1_epi32(min); + } + } + + return idx; } ``` -```c++ -const int B = 8; -typedef int vec __attribute__ (( vector_size(4 * B) )); +This version works in ~10 GFLOPS. To remove the other obstacles, we can do two things: -vec min(vec x, vec y) { - return (x < y ? x : y); -} +- Increase the block size to 32 elements to allow for more instruction-level parallelism. +- Optimize the local argmin: instead of calculating its exact location, we can just save the index of the block, and then come back at the end and find it just once. This lets us only compute the minimum on each positive check and broadcast it to a vector, which is simpler and much faster. 
-int mask(vec x) { - return _mm256_movemask_epi8((__m256i) x); -} +With these two optimizations implemented, the performance increases to a whopping ~22 GFLOPS: +```c++ int argmin(int *a, int n) { - vec *v = (vec*) a; + int min = INT_MAX, idx = 0; - int m = INT_MAX, k = 0; - vec p = m + vec{}; - - for (int i = 0; i < n / B; i++) { - t = min(t, v[i]); - int mask = _mm256_movemask_epi8((__m256i) (p == t)); - if (mask != -1) { [[unlikely]] - for (int j = B * i; j < B * i + 2 * B; j++) - if (a[j] < m) - m = a[k = j]; - t = p = m + vec{}; + reg p = _mm256_set1_epi32(min); + + for (int i = 0; i < n; i += 32) { + reg y1 = _mm256_load_si256((reg*) &a[i]); + reg y2 = _mm256_load_si256((reg*) &a[i + 8]); + reg y3 = _mm256_load_si256((reg*) &a[i + 16]); + reg y4 = _mm256_load_si256((reg*) &a[i + 24]); + y1 = _mm256_min_epi32(y1, y2); + y3 = _mm256_min_epi32(y3, y4); + y1 = _mm256_min_epi32(y1, y3); + reg mask = _mm256_cmpgt_epi32(p, y1); + if (!_mm256_testz_si256(mask, mask)) { [[unlikely]] + idx = i; + for (int j = i; j < i + 32; j++) + min = (a[j] < min ? a[j] : min); + p = _mm256_set1_epi32(min); } } + + for (int i = idx; i < idx + 31; i++) + if (a[i] == min) + return i; - return k; + return idx + 31; } ``` +This is almost as high as it can get — only computing the minimum itself works at around 24-25 GFLOPS. + +The only problem of all these branch-happy SIMD implementations is that they rely on the minimum being updated very infrequently. This is true for random input distributions, but not in the worst case. If we fill the array with the decreasing numbers, the performance of the last implementation drops to about 2.7 GFLOPS — almost 10 times as slow (although still faster than the scalar code because we only calculate the minimum on each block). + +One way to fix this is to do the same thing that the quicksort-like randomized algorithms do: just randomize the input yourself and iterate over the array in random order. 
This lets you avoid this worst-case penalty, but it is tricky to implement due to RNG- and [memory](/hpc/cpu-cache/prefetching)-related issues. There is a simpler solution. + +### Find the Minimum, Then Find the Index + +We know how to [calculate the minimum of an array](/hpc/simd/reduction) fast and how to [find an element in an array](/hpc/simd/masking#searching) fast — so why don't we just separately compute the minimum and then find it? + ```c++ -vec min(vec x, vec y) { - return (x < y ? x : y); +int argmin(int *a, int n) { + int needle = min(a, n); + int idx = find(a, n, needle); + return idx; } +``` -int argmin() { - vec *v = (vec*) a; - - int m = INT_MAX, k = 0; - vec t0, t1, p; - t0 = t1 = p = m + vec{}; - - for (int i = 0; i < n / B; i += 2) { - t0 = min(t0, v[i]); - t1 = min(t1, v[i + 1]); - vec t = min(t0, t1); - int mask = mask(((__m256i) (p == t)); - if (mask != -1) { [[unlikely]] - for (int j = B * i; j < B * i + 2 * B; j++) - if (a[j] < m) - m = a[k = j]; - t0 = t1 = p = m + vec{}; +If we implement the two subroutines optimally (check the linked articles), the performance will be ~18 GFLOPS for random arrays and ~12 GFLOPS for decreasing arrays — which makes sense as we are expected to read the array 1.5 and 2 times respectively. This isn't that bad by itself — at least we avoid the 10x worst-case performance penalty — but the problem is that this penalized performance also translates to larger arrays when we are bottlenecked by the [memory bandwidth](/hpc/cpu-cache/bandwidth) rather than the CPU. + +Luckily, we already know how to fix it. We can split the array into blocks of fixed size $B$ and compute the minimums on these blocks while also maintaining the global minimum. When the minimum on a new block is lower than the global minimum, we update it and also remember the block number of there the global minimum currently is. After we've processed the whole array, we just return to that block and process $B$ elements to find the argmin. 
+ +This way we only process $(N + B)$ elements and don't have to sacrifice neither ½ nor ⅓ of the performance: + +```c++ +const int B = 256; + +pair approx_argmin(int *a, int n) { + int res = INT_MAX, idx = 0; + for (int i = 0; i < n; i += B) { + int val = min(a + i, B); + if (val < res) { + res = val; + idx = i; } } - - return k; + return {res, idx}; +} + +int argmin(int *a, int n) { + auto [needle, base] = approx_argmin(a, n); // returns the first block of + int idx = find(a + base, B, needle); + return base + idx; } ``` -It drops to about 1.4 GFLOPS — almost 10 times as slow, although still on the level of scalar code. +This final implementation in~22 and ~19 GFLOPS for random and decreasing arrays respectively. + +The full implementation, including both `min()` and `find()`, is about 100 lines long. If you want, you [take a look at it](https://github.com/sslotin/amh-code/blob/main/argmin/combined.cc), although it's far from production-grade. + +### Summary + +Here are the results combined for all implementations: ``` -std 0.28 0.28 -simple 1.58 1.94 -cmov 1.44 1.94 -hint 2.26 1.5 -index 4.38 4.38 -simdmin-single 9.36 0.54 -simdmin 14.65 1.41 -simdmin-testz 13.59 1.41 +algorithm rand decr reason for the performance difference +----------- ----- ----- ------------------------------------------------------------- +std 0.28 0.28 +scalar 1.54 1.89 efficient branch prediction ++ hinted 1.95 0.75 wrong hint +index 4.08 4.17 +simd 8.51 1.65 scalar-based argmin on each iteration ++ ilp 10.22 1.74 ^ same ++ optimized 22.44 2.70 ^ same, but faster because there are less inter-dependencies +min+find 18.21 12.92 find() has to scan the entire array ++ blocked 22.23 19.29 we still have an optional horizontal minimum every B elements ``` +Take these results with a grain of salt: the measurements are [quite noisy](/hpc/profiling/noise), they were done for just for two input distributions, for a specific array size ($N=2^{13}$, the size of the L1 cache), for a specific 
architecture (Zen 2), and for a specific and slightly outdated compiler (GCC 9.2) — the compiler optimizations were also very fragile to little changes in the benchmarking code. + +There are also still some minor things to optimize, but the potential improvement is less than 10% so I didn't bother. One day I may pluck up courage, optimize the algorithm to the theoretical limit, handle the non-divisible-by-block-size array sizes and non-aligned memory cases, and then re-run the benchmarks properly on many architectures and with p-values and such. If someone does it before me, please [ping me back](http://sereja.me/). + ### Acknowledgements -http://0x80.pl/notesen/2018-10-03-simd-index-of-min.html +The first, index-based SIMD algorithm was [originally designed](http://0x80.pl/notesen/2018-10-03-simd-index-of-min.html) by Wojciech Muła in 2018. + + From 6304abeebca2280014c16944ecf2deb447be653f Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 9 Feb 2022 08:15:01 +0300 Subject: [PATCH 132/531] adding a todo note about shuffles that use immediates --- content/english/hpc/simd/shuffing.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/content/english/hpc/simd/shuffing.md b/content/english/hpc/simd/shuffing.md index 64da2eb1..c0714567 100644 --- a/content/english/hpc/simd/shuffing.md +++ b/content/english/hpc/simd/shuffing.md @@ -226,3 +226,13 @@ The vectorized version takes some work to implement, but it is 6-7x faster than ![](../img/filter.svg) This operation is considerably faster on AVX-512: it has a special "[compress](_mm512_mask_compress_epi32)" instruction that takes a vector of data and a mask and writes its unmasked elements contiguously. It makes a huge difference in algorithms that rely on various filtering subroutines. 
+ + From 9e17ee5d70e25a77103190996b8030f13dd5aa83 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 9 Feb 2022 22:28:22 +0300 Subject: [PATCH 133/531] argmin edits --- content/english/hpc/algorithms/argmin.md | 38 +++++++++++++----------- 1 file changed, 20 insertions(+), 18 deletions(-) diff --git a/content/english/hpc/algorithms/argmin.md b/content/english/hpc/algorithms/argmin.md index 28727fe6..536fd2c9 100644 --- a/content/english/hpc/algorithms/argmin.md +++ b/content/english/hpc/algorithms/argmin.md @@ -5,7 +5,7 @@ weight: 7 Computing the *minimum* of an array [easily vectorizable](/hpc/simd/reduction), as it is not different from any other reduction: in AVX2, you just need to use a convenient `_mm256_min_epi32` intrinsic as the inner operation. It computes the minimum of two 8-element vectors in one cycle — even faster than in the scalar case, which requires at least a comparison and a conditional move. -Finding the index of that minimum element (*argmin*) is much harder, but it is still possible to compute very fast. +Finding the *index* of that minimum element (*argmin*) is much harder, but it is still possible to vectorize very efficiently. In this section, we design an algorithm that computes the argmin (almost) at the speed of computing the minimum: ~15x faster than the naive scalar approach and ~5x faster than the [previous state-of-the-art](http://0x80.pl/notesen/2018-10-03-simd-index-of-min.html). ### Baseline @@ -19,7 +19,7 @@ for (int i = 0; i < N; i++) a[i] = rand(); ``` -For the sake of exposition, we assume that $N$ is a power of two. +For the sake of exposition, we assume that $N$ is a power of two, and run all our experiments for $N=2^{13}$ so that the [memory bandwidth](/hpc/cpu-cache/bandwidth) is not a concern. 
To implement argmin in the scalar case, we just need to maintain the index instead of the minimum value: @@ -35,7 +35,7 @@ int argmin(int *a, int n) { } ``` -It works in around 1.5 GFLOPS — meaning 1.5 values per cycle processed on average. +It works at around 1.5 GFLOPS — meaning $1.5 \cdot 10^9$ values per second processed on average, or about 0.75 values per cycle (the CPU is clocked at 2GHz). Let's compare it to `std::min_element`: @@ -46,13 +46,13 @@ int argmin(int *a, int n) { } ``` -When using the version from GCC, it gives 0.28 GFLOPS — apparently, the compiler couldn't pierce through all the abstractions. Another reminder to never use STL. +The version from GCC gives ~0.28 GFLOPS — apparently, the compiler couldn't pierce through all the abstractions. Another reminder to never use STL. ### Vector of Indices -The problem with vectorizing the scalar implementation is that there is a dependency between iterations. When we optimized [array sum](/hpc/simd/reduction), we faced the same problem, and we solved it by splitting the array into 8 slices, each representing a subset of its indices with the same remainder modulo 8. +The problem with vectorizing the scalar implementation is that there is a dependency between consequent iterations. When we optimized [array sum](/hpc/simd/reduction), we faced the same problem, and we solved it by splitting the array into 8 slices, each representing a subset of its indices with the same remainder modulo 8. We can apply the same trick here, except that we also have to take array indices into account. -We can apply the same trick here, except that we also have to take array indices into account. 
When we have both the consecutive data and indices in vectors, we can process them in parallel using [predication](/hpc/pipelining/branchless) like this: +When we have the consecutive elements and their indices in vectors, we can process them in parallel using [predication](/hpc/pipelining/branchless): ```c++ typedef int vec __attribute__ (( vector_size(32) )); @@ -83,7 +83,9 @@ int argmin(int *a, int n) { } ``` -It works in around 4 GFLOPS. There is still some inter-dependency between the iterations, so we can optimize it by considering more than 8 elements per iteration and taking advantage of the [instruction-level parallelism](/hpc/simd/reduction#instruction-level-parallelism). But it won't improve the performance by a lot: on each iteration, we need a load, vector comparison, two blends, and a vector addition — that is 5 instructions in total to process 8 elements. Since the decode width of this CPU (Zen 2) is just 4, the performance will still be limited by ⅘ × 8 = 6.4 GFLOPS even if we get rid of the other bottlenecks. +It works at around 4 GFLOPS. There is still some inter-dependency between the iterations, so we can optimize it by considering more than 8 elements per iteration and taking advantage of the [instruction-level parallelism](/hpc/simd/reduction#instruction-level-parallelism). + +It would help performance a lot, but it won't let us approach the speed of computing the minimum (~24 GFLOPS) as there is another bottleneck. On each iteration, we need a load, vector comparison, two blends, and a vector addition — that is 5 instructions in total to process 8 elements. Since the decode width of this CPU (Zen 2) is just 4, the performance will still be limited by ⅘ × 8 × 2 = 12.8 GFLOPS even if we get rid of all the other bottlenecks. Instead, we will switch to another approach that requires fewer instructions per element. 
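The decode-width bound quoted above can be spelled out as follows. This is only a back-of-the-envelope restatement: it assumes the 2 GHz clock mentioned earlier in the text and treats the decode width as the only limit, ignoring every other bottleneck.

$$
\frac{4 \text{ instructions/cycle}}{5 \text{ instructions/iteration}}
\times 8 \; \frac{\text{elements}}{\text{iteration}}
\times 2 \cdot 10^9 \; \frac{\text{cycles}}{\text{second}}
= 12.8 \cdot 10^9 \; \frac{\text{elements}}{\text{second}}
$$

That is, at most 0.8 iterations (6.4 elements) can be issued per cycle, which at 2 GHz corresponds to the 12.8 "GFLOPS" figure in the text's units.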
@@ -91,7 +93,7 @@ Instead, we will switch to another approach that requires fewer instructions per When we run the scalar version, how often do we update the minimum? -Intuition tells that, if all the values are drawn independently at random, then the event when the next element is less than all the previous ones shouldn't be frequent. More formally, the expected number of times the `a[i] < a[k]` condition is satisfied equals the sum of the harmonic series: +Intuition tells us that, if all the values are drawn independently at random, then the event when the next element is less than all the previous ones shouldn't be frequent. More precisely, it equals the reciprocal of the number of processed elements. Therefore, the expected number of times the `a[i] < a[k]` condition is satisfied equals the sum of the harmonic series: $$ \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \ldots + \frac{1}{n} = O(\ln(n)) @@ -113,7 +115,7 @@ int argmin(int *a, int n) { } ``` -The compiler [optimized the machine layout](/hpc/architecture/layout), and the CPU is now able to execute the loop at around 2 GFLOPS — a slight but sizeable improvement from 1.5 GFLOPS of the non-hinted loop. +The compiler [optimized the machine code layout](/hpc/architecture/layout), and the CPU is now able to execute the loop at around 2 GFLOPS — a slight but sizeable improvement from 1.5 GFLOPS of the non-hinted loop. Here is the idea: if we are only updating the minimum a dozen or so times during the entire computation, we can ditch all the vector-blending and index updating and just maintain the minimum and regularly check if it has changed. Inside this check, we can use however slow method of updating the argmin we want because it will only be called a few times. @@ -170,7 +172,7 @@ int argmin(int *a, int n) { This version works in ~10 GFLOPS. To remove the other obstacles, we can do two things: - Increase the block size to 32 elements to allow for more instruction-level parallelism. 
-- Optimize the local argmin: instead of calculating its exact location, we can just save the index of the block, and then come back at the end and find it just once. This lets us only compute the minimum on each positive check and broadcast it to a vector, which is simpler and much faster. +- Optimize the local argmin: instead of calculating its exact location, we can just save the index of the block and then come back at the end and find it just once. This lets us only compute the minimum on each positive check and broadcast it to a vector, which is simpler and much faster. With these two optimizations implemented, the performance increases to a whopping ~22 GFLOPS: @@ -205,11 +207,11 @@ int argmin(int *a, int n) { } ``` -This is almost as high as it can get — only computing the minimum itself works at around 24-25 GFLOPS. +This is almost as high as it can get as just computing the minimum itself works at around 24-25 GFLOPS. -The only problem of all these branch-happy SIMD implementations is that they rely on the minimum being updated very infrequently. This is true for random input distributions, but not in the worst case. If we fill the array with the decreasing numbers, the performance of the last implementation drops to about 2.7 GFLOPS — almost 10 times as slow (although still faster than the scalar code because we only calculate the minimum on each block). +The only problem of all these branch-happy SIMD implementations is that they rely on the minimum being updated very infrequently. This is true for random input distributions, but not in the worst case. If we fill the array with a sequence of decreasing numbers, the performance of the last implementation drops to about 2.7 GFLOPS — almost 10 times as slow (although still faster than the scalar code because we only calculate the minimum on each block). 
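To see why decreasing input is the worst case for the branchy versions, the following scalar sketch counts how often the slow path would be entered for a given input. It is a simplified model rather than the benchmarked code: `count_slow_path`, `decreasing`, and `increasing` are hypothetical helpers, and the block size of 32 mirrors the last SIMD variant.

```c++
#include <algorithm>
#include <climits>
#include <cstddef>
#include <vector>

// Counts how many blocks contain a value below the running minimum,
// i.e. how many times the [[unlikely]] branch would be taken.
// Assumes a.size() is a multiple of `block`.
int count_slow_path(const std::vector<int> &a, std::size_t block = 32) {
    int cur_min = INT_MAX, hits = 0;
    for (std::size_t i = 0; i < a.size(); i += block) {
        int block_min = *std::min_element(a.begin() + i, a.begin() + i + block);
        if (block_min < cur_min) {
            cur_min = block_min;
            hits++;
        }
    }
    return hits;
}

// n, n-1, ..., 1: every block introduces a new minimum.
std::vector<int> decreasing(int n) {
    std::vector<int> a(n);
    for (int i = 0; i < n; i++)
        a[i] = n - i;
    return a;
}

// 1, 2, ..., n: only the first block introduces a new minimum.
std::vector<int> increasing(int n) {
    std::vector<int> a(n);
    for (int i = 0; i < n; i++)
        a[i] = i + 1;
    return a;
}
```

For an array of 8192 decreasing values, the slow path is taken on all 256 blocks, whereas for increasing values it is taken only once. Random input falls close to the latter, which is what the fast versions rely on.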
-One way to fix this is to do the same thing that the quicksort-like randomized algorithms do: just randomize the input yourself and iterate over the array in random order. This lets you avoid this worst-case penalty, but it is tricky to implement due to RNG- and [memory](/hpc/cpu-cache/prefetching)-related issues. There is a simpler solution. +One way to fix this is to do the same thing that the quicksort-like randomized algorithms do: just shuffle the input yourself and iterate over the array in random order. This lets you avoid this worst-case penalty, but it is tricky to implement due to RNG- and [memory](/hpc/cpu-cache/prefetching)-related issues. There is a simpler solution. ### Find the Minimum, Then Find the Index @@ -223,9 +225,9 @@ int argmin(int *a, int n) { } ``` -If we implement the two subroutines optimally (check the linked articles), the performance will be ~18 GFLOPS for random arrays and ~12 GFLOPS for decreasing arrays — which makes sense as we are expected to read the array 1.5 and 2 times respectively. This isn't that bad by itself — at least we avoid the 10x worst-case performance penalty — but the problem is that this penalized performance also translates to larger arrays when we are bottlenecked by the [memory bandwidth](/hpc/cpu-cache/bandwidth) rather than the CPU. +If we implement the two subroutines optimally (check the linked articles), the performance will be ~18 GFLOPS for random arrays and ~12 GFLOPS for decreasing arrays — which makes sense as we are expected to read the array 1.5 and 2 times respectively. This isn't that bad by itself — at least we avoid the 10x worst-case performance penalty — but the problem is that this penalized performance also translates to larger arrays, when we are bottlenecked by the [memory bandwidth](/hpc/cpu-cache/bandwidth) rather than compute. -Luckily, we already know how to fix it. 
We can split the array into blocks of fixed size $B$ and compute the minimums on these blocks while also maintaining the global minimum. When the minimum on a new block is lower than the global minimum, we update it and also remember the block number of there the global minimum currently is. After we've processed the whole array, we just return to that block and process $B$ elements to find the argmin. +Luckily, we already know how to fix it. We can split the array into blocks of fixed size $B$ and compute the minima on these blocks while also maintaining the global minimum. When the minimum on a new block is lower than the global minimum, we update it and also remember the block number of where the global minimum currently is. After we've processed the entire array, we just return to that block and scan through its $B$ elements to find the argmin. This way we only process $(N + B)$ elements and don't have to sacrifice neither ½ nor ⅓ of the performance: @@ -251,9 +253,9 @@ int argmin(int *a, int n) { } ``` -This final implementation in~22 and ~19 GFLOPS for random and decreasing arrays respectively. +This results for the final implementation are ~22 and ~19 GFLOPS for random and decreasing arrays respectively. -The full implementation, including both `min()` and `find()`, is about 100 lines long. If you want, you [take a look at it](https://github.com/sslotin/amh-code/blob/main/argmin/combined.cc), although it's far from production-grade. +The full implementation, including both `min()` and `find()`, is about 100 lines long. [Take a look](https://github.com/sslotin/amh-code/blob/main/argmin/combined.cc) if you want, although it is still far from being production-grade. 
 ### Summary
@@ -275,7 +277,7 @@ min+find     18.21  12.92  find() has to scan the entire array

 Take these results with a grain of salt: the measurements are [quite noisy](/hpc/profiling/noise), they were done for just for two input distributions, for a specific array size ($N=2^{13}$, the size of the L1 cache), for a specific architecture (Zen 2), and for a specific and slightly outdated compiler (GCC 9.2) — the compiler optimizations were also very fragile to little changes in the benchmarking code.

-There are also still some minor things to optimize, but the potential improvement is less than 10% so I didn't bother. One day I may pluck up courage, optimize the algorithm to the theoretical limit, handle the non-divisible-by-block-size array sizes and non-aligned memory cases, and then re-run the benchmarks properly on many architectures and with p-values and such. If someone does it before me, please [ping me back](http://sereja.me/).
+There are also still some minor things to optimize, but the potential improvement is less than 10% so I didn't bother. One day I may pluck up the courage, optimize the algorithm to the theoretical limit, handle the non-divisible-by-block-size array sizes and non-aligned memory cases, and then re-run the benchmarks properly on many architectures, with p-values and such. In case someone does it before me, please [ping me back](http://sereja.me/).

 ### Acknowledgements

From 37738a552ea4bba589056d4b09856baa3da66a34 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Thu, 10 Feb 2022 00:10:45 +0300
Subject: [PATCH 134/531] bugfix

---
 content/english/hpc/compilation/precalc.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/english/hpc/compilation/precalc.md b/content/english/hpc/compilation/precalc.md
index 065f42d4..5dd612c8 100644
--- a/content/english/hpc/compilation/precalc.md
+++ b/content/english/hpc/compilation/precalc.md
@@ -43,7 +43,7 @@ There used to be much more limitations in earlier C++ standards, like you could
 struct Precalc {
     int isqrt[1000];

-    constexpr Meta() : reciprocal{} {
+    constexpr Precalc() : reciprocal{} {
         for (int i = 0; i < 1000; i++)
             reciprocal[i] = int(sqrt(i));
     }

From 5eeee6205a9fa34ef50eeb94bb1ca00efa795db5 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Thu, 10 Feb 2022 00:11:04 +0300
Subject: [PATCH 135/531] fix variable name

---
 content/english/hpc/compilation/precalc.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/english/hpc/compilation/precalc.md b/content/english/hpc/compilation/precalc.md
index 5dd612c8..2bec9995 100644
--- a/content/english/hpc/compilation/precalc.md
+++ b/content/english/hpc/compilation/precalc.md
@@ -43,9 +43,9 @@ There used to be much more limitations in earlier C++ standards, like you could
 struct Precalc {
     int isqrt[1000];

-    constexpr Precalc() : reciprocal{} {
+    constexpr Precalc() : isqrt{} {
         for (int i = 0; i < 1000; i++)
-            reciprocal[i] = int(sqrt(i));
+            isqrt[i] = int(sqrt(i));
     }
 };

From 97595ebf7886e2652292cd4b099e852976eee0d4 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Thu, 10 Feb 2022 02:01:27 +0300
Subject: [PATCH 136/531] fixing mula's algorithm

---
 content/english/hpc/algorithms/argmin.md | 57 ++++++++++++++----------
 1 file changed, 34 insertions(+), 23 deletions(-)

diff --git a/content/english/hpc/algorithms/argmin.md b/content/english/hpc/algorithms/argmin.md
index 536fd2c9..6bf9da96 100644
--- a/content/english/hpc/algorithms/argmin.md
+++ b/content/english/hpc/algorithms/argmin.md
@@ -5,7 +5,7 @@ weight: 7

 Computing the *minimum* of an array [easily vectorizable](/hpc/simd/reduction), as it is not different from any other reduction: in AVX2, you just need to use a convenient `_mm256_min_epi32` intrinsic as the inner operation. It computes the minimum of two 8-element vectors in one cycle — even faster than in the scalar case, which requires at least a comparison and a conditional move.

-Finding the *index* of that minimum element (*argmin*) is much harder, but it is still possible to vectorize very efficiently. In this section, we design an algorithm that computes the argmin (almost) at the speed of computing the minimum: ~15x faster than the naive scalar approach and ~5x faster than the [previous state-of-the-art](http://0x80.pl/notesen/2018-10-03-simd-index-of-min.html).
+Finding the *index* of that minimum element (*argmin*) is much harder, but it is still possible to vectorize very efficiently. In this section, we design an algorithm that computes the argmin (almost) at the speed of computing the minimum: ~15x faster than the naive scalar approach and ~2.5x faster than the [previous state-of-the-art](http://0x80.pl/notesen/2018-10-03-simd-index-of-min.html).

 ### Baseline

@@ -55,37 +55,48 @@ The problem with vectorizing the scalar implementation is that there is a depend
 When we have the consecutive elements and their indices in vectors, we can process them in parallel using [predication](/hpc/pipelining/branchless):

 ```c++
-typedef int vec __attribute__ (( vector_size(32) ));
+typedef __m256i reg;

 int argmin(int *a, int n) {
-    vec *v = (vec*) a;
-
-    vec cur = {0, 1, 2, 3, 4, 5, 6, 7}; // indices on the current iteration
-    vec min = INT_MAX + vec{};          // the current minimum for each slice
-    vec idx;                            // its index (argmin) for each slice
-
-    for (int i = 0; i < n / 8; i++) {
-        vec mask = (v[i] < min);   // find the slices where the minimum updated
-        min = (mask ? v[i] : min); // update the minimum
-        idx = (mask ? cur : idx);  // update the indices
-        cur += 8;                  // increment the current indices
+    reg cur = _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7); // indices on the current iteration
+    reg min = _mm256_set1_epi32(INT_MAX);// the current minimum for each slice
+    reg idx = _mm256_setzero_si256();                    // its index (argmin) for each slice
+
+    for (int i = 0; i < n; i += 8) {
+        // load a new SIMD block
+        reg x = _mm256_load_si256((reg*) &a[i]);
+        // find the slices where the minimum is updated
+        reg mask = _mm256_cmpgt_epi32(min, x);
+        // update the indices
+        idx = _mm256_blendv_epi8(idx, cur, mask);
+        // update the minimum (can also similarly use a "blend" here, but min is faster)
+        min = _mm256_min_epi32(x, min);
+        // update the current indices
+        const reg eight = _mm256_set1_epi32(8);
+        cur = _mm256_add_epi32(cur, eight); //
+        // can also use a "blend" here, but min is faster
     }
+
+    // find the argmin in the "min" register and return its real index
+
+    int min_arr[8], idx_arr[8];

-    // find the argmin in the "min" array:
+    _mm256_storeu_si256((reg*) min_arr, min);
+    _mm256_storeu_si256((reg*) idx_arr, idx);

-    int k = 0, m = min[0];
+    int k = 0, m = min_arr[0];

     for (int i = 1; i < 8; i++)
-        if (min[i] < m)
-            m = min[k = i];
+        if (min_arr[i] < m)
+            m = min_arr[k = i];

-    return idx[k]; // return its real index
+    return idx_arr[k];
 }
 ```

-It works at around 4 GFLOPS. There is still some inter-dependency between the iterations, so we can optimize it by considering more than 8 elements per iteration and taking advantage of the [instruction-level parallelism](/hpc/simd/reduction#instruction-level-parallelism).
+It works at around 8-8.5 GFLOPS. There is still some inter-dependency between the iterations, so we can optimize it by considering more than 8 elements per iteration and taking advantage of the [instruction-level parallelism](/hpc/simd/reduction#instruction-level-parallelism).

-It would help performance a lot, but it won't let us approach the speed of computing the minimum (~24 GFLOPS) as there is another bottleneck. On each iteration, we need a load, vector comparison, two blends, and a vector addition — that is 5 instructions in total to process 8 elements. Since the decode width of this CPU (Zen 2) is just 4, the performance will still be limited by ⅘ × 8 × 2 = 12.8 GFLOPS even if we get rid of all the other bottlenecks.
+This would help performance a lot, but not enough to match the speed of computing the minimum (~24 GFLOPS) because there is another bottleneck. On each iteration, we need a load-fused comparison, a load-fused minimum, a blend, and an addition — that is 4 instructions in total to process 8 elements. Since the decode width of this CPU (Zen 2) is just 4, the performance will still be limited by 8 × 2 = 16 GFLOPS even if we somehow got rid of all the other bottlenecks.

 Instead, we will switch to another approach that requires fewer instructions per element.

@@ -122,8 +133,6 @@ Here is the idea: if we are only updating the minimum a dozen or so times during
 To implement it with SIMD, all we need to do on each iteration is a vector load, a comparison, and a test-if-zero:

 ```c++
-typedef __m256i reg;
-
 int argmin(int *a, int n) {
     int min = INT_MAX, idx = 0;

@@ -267,7 +276,7 @@ algorithm    rand   decr   reason for the performance difference
 std          0.28   0.28
 scalar       1.54   1.89   efficient branch prediction
 + hinted     1.95   0.75   wrong hint
-index        4.08   4.17
+index        8.17   8.12
 simd         8.51   1.65   scalar-based argmin on each iteration
 + ilp        10.22  1.74   ^ same
 + optimized  22.44  2.70   ^ same, but faster because there are less inter-dependencies
@@ -283,6 +292,8 @@ There are also still some minor things to optimize, but the potential improvemen

 The first, index-based SIMD algorithm was [originally designed](http://0x80.pl/notesen/2018-10-03-simd-index-of-min.html) by Wojciech Muła in 2018.

+Thanks to Zach Wegner for [pointing out](https://twitter.com/zwegner/status/1491520929138151425) that the performance of Muła's algorithm is improved when implemented manually using intrinsics (I originally used the [GCC vector types](/hpc/simd/intrinsics/#gcc-vector-extensions)).
+

From 441d188a48afc5646661d9689810046c7915db0b Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Thu, 10 Feb 2022 03:48:20 +0300
Subject: [PATCH 138/531] fix formula

---
 content/english/hpc/cpu-cache/associativity.md | 2 +-
 content/english/hpc/simd/shuffing.md           | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/content/english/hpc/cpu-cache/associativity.md b/content/english/hpc/cpu-cache/associativity.md
index 8a0f54c0..c53e6935 100644
--- a/content/english/hpc/cpu-cache/associativity.md
+++ b/content/english/hpc/cpu-cache/associativity.md
@@ -78,7 +78,7 @@ This makes the cache system simpler and cheaper to implement, but also makes it

 Now, where were we? Oh, yes: the reason why iteration with strides of 256 causes such a terrible slowdown.

-When we jump over 256 integers, the pointer always increments by $1024 = 2^{10}$, and the last 10 bits remain the same. Since the cache system uses the lower 6 bits for the offset and the next 12 for the cache line index, we are essentially using just $2^{12 - (10 - 6)} = 2^8$ different sets in the L3 cache instead of $2^{12}$, which has the effect of shrinking our L3 cache by a factor of $2^4 = 16$. The array stops fitting into the L3 cache ($N=2^21$) spills into the order-of-magnitude slower RAM, which causes the performance to decrease.
+When we jump over 256 integers, the pointer always increments by $1024 = 2^{10}$, and the last 10 bits remain the same. Since the cache system uses the lower 6 bits for the offset and the next 12 for the cache line index, we are essentially using just $2^{12 - (10 - 6)} = 2^8$ different sets in the L3 cache instead of $2^{12}$, which has the effect of shrinking our L3 cache by a factor of $2^4 = 16$. The array stops fitting into the L3 cache ($N=2^{21}$) and spills into the order-of-magnitude slower RAM, which causes the performance to decrease.
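
The set-count arithmetic in the paragraph patched above ($2^{12 - (10 - 6)} = 2^8$) can be checked with a small sketch. It assumes exactly the bit layout the text describes (6 offset bits, then 12 set-index bits); the function name `count_touched_sets` and its parameters are made up for this illustration and are not part of the article:

```c++
#include <cassert>
#include <cstdint>
#include <set>

// Counts how many distinct cache sets are touched when stepping through
// memory with a fixed byte stride. Assumed layout (from the text): the lower
// 6 address bits are the offset, the next 12 bits select the set.
int count_touched_sets(uint64_t stride_bytes, int accesses) {
    assert(accesses > 0);
    const int offset_bits = 6, index_bits = 12;
    const uint64_t index_mask = (1ull << index_bits) - 1;
    std::set<uint64_t> sets;
    for (int i = 0; i < accesses; i++) {
        uint64_t addr = stride_bytes * i;              // address of the i-th access
        sets.insert((addr >> offset_bits) & index_mask); // its set index
    }
    return (int) sets.size();
}
```

Under these assumptions, `count_touched_sets(1024, 1 << 16)` gives 256 sets for the 256-integer stride, while a stride of one cache line, `count_touched_sets(64, 1 << 16)`, reaches all 4096.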

From ca1ba28e3998fa139fe7678acd5bad9ade442839 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Thu, 10 Feb 2022 11:08:08 +0300
Subject: [PATCH 139/531] materials for prefix sum

---
 .../hpc/algorithms/img/prefix-blocked.svg     | 1375 +++++++++++++
 .../img/prefix-interleaved-prefetch.svg       | 1672 ++++++++++++++++
 .../hpc/algorithms/img/prefix-interleaved.svg | 1593 +++++++++++++++
 .../hpc/algorithms/img/prefix-nontemporal.svg | 1766 +++++++++++++++++
 .../hpc/algorithms/img/prefix-prefetch.svg    | 1568 +++++++++++++++
 .../hpc/algorithms/img/prefix-scalar.svg      | 1184 +++++++++++
 .../hpc/algorithms/img/prefix-simd.svg        | 1262 ++++++++++++
 content/english/hpc/algorithms/prefix.md      |  188 +-
 8 files changed, 10606 insertions(+), 2 deletions(-)
 create mode 100644 content/english/hpc/algorithms/img/prefix-blocked.svg
 create mode 100644 content/english/hpc/algorithms/img/prefix-interleaved-prefetch.svg
 create mode 100644 content/english/hpc/algorithms/img/prefix-interleaved.svg
 create mode 100644 content/english/hpc/algorithms/img/prefix-nontemporal.svg
 create mode 100644 content/english/hpc/algorithms/img/prefix-prefetch.svg
 create mode 100644 content/english/hpc/algorithms/img/prefix-scalar.svg
 create mode 100644 content/english/hpc/algorithms/img/prefix-simd.svg

diff --git a/content/english/hpc/algorithms/img/prefix-blocked.svg b/content/english/hpc/algorithms/img/prefix-blocked.svg
new file mode 100644
index 00000000..a91c86e0
--- /dev/null
+++ b/content/english/hpc/algorithms/img/prefix-blocked.svg
@@ -0,0 +1,1375 @@
[1375 added lines of SVG plot markup; content not recoverable]
diff --git a/content/english/hpc/algorithms/img/prefix-interleaved-prefetch.svg b/content/english/hpc/algorithms/img/prefix-interleaved-prefetch.svg
new file mode 100644
index 00000000..672e2e42
--- /dev/null
+++ b/content/english/hpc/algorithms/img/prefix-interleaved-prefetch.svg
@@ -0,0 +1,1672 @@
[1672 added lines of SVG plot markup; content not recoverable]
diff --git a/content/english/hpc/algorithms/img/prefix-interleaved.svg b/content/english/hpc/algorithms/img/prefix-interleaved.svg
new file mode 100644
index 00000000..db3ffe10
--- /dev/null
+++ b/content/english/hpc/algorithms/img/prefix-interleaved.svg
@@ -0,0 +1,1593 @@
[1593 added lines of SVG plot markup; content not recoverable]
diff --git a/content/english/hpc/algorithms/img/prefix-nontemporal.svg b/content/english/hpc/algorithms/img/prefix-nontemporal.svg
new file mode 100644
index 00000000..b81d8ad3
--- /dev/null
+++ b/content/english/hpc/algorithms/img/prefix-nontemporal.svg
@@ -0,0 +1,1766 @@
[1766 added lines of SVG plot markup; content not recoverable]
diff --git a/content/english/hpc/algorithms/img/prefix-prefetch.svg b/content/english/hpc/algorithms/img/prefix-prefetch.svg
new file mode 100644
index 00000000..0fca7d8f
--- /dev/null
+++ b/content/english/hpc/algorithms/img/prefix-prefetch.svg
@@ -0,0 +1,1568 @@
[1568 added lines of SVG plot markup; content not recoverable]
diff --git a/content/english/hpc/algorithms/img/prefix-scalar.svg b/content/english/hpc/algorithms/img/prefix-scalar.svg
new file mode 100644
index 00000000..d5186d9f
--- /dev/null
+++ b/content/english/hpc/algorithms/img/prefix-scalar.svg
@@ -0,0 +1,1184 @@
[1184 added lines of SVG plot markup; content not recoverable]
diff --git a/content/english/hpc/algorithms/img/prefix-simd.svg b/content/english/hpc/algorithms/img/prefix-simd.svg
new file mode 100644
index 00000000..16bb82ac
--- /dev/null
+++ b/content/english/hpc/algorithms/img/prefix-simd.svg
@@ -0,0 +1,1262 @@
[1262 added lines of SVG plot markup; content not recoverable]
diff --git a/content/english/hpc/algorithms/prefix.md b/content/english/hpc/algorithms/prefix.md
index 01294046..9fa50f41 100644
--- a/content/english/hpc/algorithms/prefix.md
+++ b/content/english/hpc/algorithms/prefix.md
@@ -1,6 +1,190 @@
 ---
-title: Prefix Sum With SIMD
+title: Prefix Sum with SIMD
+weight: 8
 draft: true
 ---

-...
+We design a 2.5x faster algorithm for the prefix sum problem, also known as cumulative sum, inclusive scan, or simply scan.
+
+$$
+\begin{aligned}
+b_0 &= a_0
+\\ b_1 &= a_0 + a_1
+\\ b_2 &= a_0 + a_1 + a_2
+\\ &\ldots
+\end{aligned}
+$$
+
+In other words, the $k$-th element of the output is the sum of the first $k$ elements of the input.
+
+`{1, 2, 3, 4}` is `{1, 3, 6, 10}`
+
+### Baseline
+
+For our scalar baseline:
+
+```c++
+void prefix(int *a, int n) {
+    for (int i = 1; i < n; i++)
+        a[i] += a[i - 1];
+}
+```
+
+The compiler, of course, doesn't do extra reads and uses a separate variable:
+
+```nasm
+loop:
+    add     edx, DWORD PTR [rax]
+    mov     DWORD PTR [rax-4], edx
+    add     rax, 4
+    cmp     rax, rcx
+    jne     loop
+```
+
+When unrolling happens, there are only two instructions: fused load-add and writing the results back.
+
+`std::partial_sum` does the same. Less than 1 value per cycle because both reads and writes are needed.
+
+### Vectorization
+
+The main idea is to split the array into small blocks, calculate the prefix sums on these blocks
+
+```c++
+typedef __m128i v4i;
+
+v4i prefix(v4i x) {
+    x = _mm_add_epi32(x, _mm_slli_si128(x, 4));
+    x = _mm_add_epi32(x, _mm_slli_si128(x, 8));
+    return x;
+}
+```
+
+Essentially,
+
+This 128-bit lane separation is typical for AVX.
+
+`{1, 2, 3, 4, 5, 6, 7, 8}` is `{1, 3, 6, 10, 5, 11, 18, 26}`
+
+```c++
+void prefix(int *p) {
+    v8i x = _mm256_load_si256((v8i*) p);
+    x = _mm256_add_epi32(x, _mm256_slli_si256(x, 4));
+    x = _mm256_add_epi32(x, _mm256_slli_si256(x, 8));
+    _mm256_store_si256((v8i*) p, x);
+}
+```
+
+Not managing to come up with a more characteristic name, we are going to call it `update`:
+
+```c++
+v4i update(int *p, v4i s) {
+    v4i d = broadcast(&p[3]);
+    v4i x = _mm_load_si128((v4i*) p);
+    x = _mm_add_epi32(s, x);
+    _mm_store_si128((v4i*) p, x);
+    return _mm_add_epi32(s, d);
+}
+```
+
+```c++
+void prefix(int *a, int n) {
+    for (int i = 0; i < n; i += 8)
+        prefix(&a[i]);
+
+    v4i s = _mm_setzero_si128();
+
+    for (int i = 4; i < n; i += 4)
+        s = update(&a[i], s);
+}
+```
+
+![](../img/prefix-simd.svg)
+
+We have a problem with memory.
+
+only update: 5.8
+only prefix: 8.1
+
+### Blocking
+
+Write it in the same style as we did `update`:
+
+```c++
+const int B = 4096; // adjust block size to your L1 cache size
+
+v4i local_prefix(int *a, v4i s) {
+    for (int i = 0; i < B; i += 8)
+        prefix(&a[i]);
+
+    for (int i = 0; i < B; i += 4)
+        s = update(&a[i], s);
+
+    return s;
+}
+
+void prefix(int *a, int n) {
+    v4i s = _mm_setzero_si128();
+    for (int i = 0; i < n; i += B)
+        s = local_prefix(a + i, s);
+}
+```
+
+(You have to make sure that $N$ is a multiple of $B$, but we are going to ignore that for now.)
+
+![](../img/prefix-blocked.svg)
+
+### Continuous Loads
+
+The cache system is sitting idle when we do a second pass.
+
+One solution is to explicitly add [software prefetching](/hpc/cpu-cache/prefetching):
+
+```c++
+v4i update(int *p, v4i s) {
+    __builtin_prefetch(p + B); // <-- prefetch next block's data
+    // ...
+    return s;
+}
+```
+
+![](../img/prefix-prefetch.svg)
+
+Interleaving:
+
+```c++
+const int B = 64;
+
+void prefix(int *a, int n) {
+    v4i s = _mm_setzero_si128();
+
+    for (int i = 0; i < B; i += 8)
+        prefix(&a[i]);
+
+    for (int i = B; i < n; i += 8) {
+        prefix(&a[i]);
+        s = update(&a[i - B], s);
+        s = update(&a[i - B + 4], s);
+    }
+
+    for (int i = n - B; i < n; i += 4)
+        s = update(&a[i], s);
+}
+```
+
+![](../img/prefix-interleaved.svg)
+
+You can also combine it with prefetching, which improves performance even more.
+
+![](../img/prefix-interleaved-prefetch.svg)
+
+It is sort of "meh". There are ways to do it with permutations, but it would kill the performance of the prefix stage.
+
+The speedup may be higher for lower-precision data as the scalar code can execute at most one iteration per cycle anyway.
+
+### Other Relevant Work
+
+There is this professor at CMU named [Guy Blelloch](https://www.cs.cmu.edu/~blelloch/) who [advocated](https://www.cs.cmu.edu/~blelloch/papers/sc90.pdf) back in the 90s where the idea of for vector computers having
+
+Luckily, parallel. You can read [paper](http://www.adms-conf.org/2020-camera-ready/ADMS20_05.pdf) (AVX-512 which has — sort of — ) and [this StackOverflow discussion](https://stackoverflow.com/questions/10587598/simd-prefix-sum-on-intel-cpu) for a more general overview.
+
+Most of what I described is already known. To the best of my knowledge, the contribution of this article is the interleaving technique, which is only responsible for a ~20% performance increase.
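
The shift-and-add step from the Vectorization section of the patch above can be tried in isolation with the sketch below. It is a simplified single-pass variant, not the two-pass or interleaved algorithm from the article: it assumes an x86-64 target with SSE2, an array length that is a multiple of 4, and unaligned loads; the names `prefix4` and `prefix_sum` are hypothetical.

```c++
#include <cassert>
#include <emmintrin.h> // SSE2 intrinsics (assumes an x86 target)

typedef __m128i v4i;

// In-register inclusive prefix sum of four 32-bit integers:
// two shift-and-add steps instead of three sequential additions.
v4i prefix4(v4i x) {
    x = _mm_add_epi32(x, _mm_slli_si128(x, 4)); // {a0, a0+a1, a1+a2, a2+a3}
    x = _mm_add_epi32(x, _mm_slli_si128(x, 8)); // {a0, a0+a1, a0+a1+a2, a0+a1+a2+a3}
    return x;
}

// Hypothetical wrapper: in-place prefix sum of an array whose length n
// is assumed to be a multiple of 4.
void prefix_sum(int *a, int n) {
    assert(n % 4 == 0);
    int carry = 0; // sum of all elements before the current block
    for (int i = 0; i < n; i += 4) {
        v4i x = _mm_loadu_si128((v4i*) &a[i]);
        x = prefix4(x);                              // local prefix sums
        x = _mm_add_epi32(x, _mm_set1_epi32(carry)); // add the running total
        _mm_storeu_si128((v4i*) &a[i], x);
        carry = a[i + 3];                            // last element is the new total
    }
}
```

For the input `{1, 2, 3, 4, 5, 6, 7, 8}` this produces `{1, 3, 6, 10, 15, 21, 28, 36}`. Carrying the total through a scalar keeps the sketch short, at the cost of a serial dependency between blocks that the article's separate accumulate pass is designed to handle differently.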

From 9aa91d71577deb39487672802c8e165117238140 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Fri, 11 Feb 2022 03:22:19 +0300
Subject: [PATCH 140/531] fix comment

---
 content/english/hpc/algorithms/argmin.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/content/english/hpc/algorithms/argmin.md b/content/english/hpc/algorithms/argmin.md
index 128e45f8..ee9b32bc 100644
--- a/content/english/hpc/algorithms/argmin.md
+++ b/content/english/hpc/algorithms/argmin.md
@@ -246,6 +246,7 @@ This way we only process $(N + B)$ elements and don't have to sacrifice neither
 ```c++
 const int B = 256;

+// returns the minimum and its first block
 pair approx_argmin(int *a, int n) {
     int res = INT_MAX, idx = 0;
     for (int i = 0; i < n; i += B) {
@@ -259,7 +260,7 @@ pair approx_argmin(int *a, int n) {
 }

 int argmin(int *a, int n) {
-    auto [needle, base] = approx_argmin(a, n); // returns the first block of
+    auto [needle, base] = approx_argmin(a, n);
     int idx = find(a + base, B, needle);
     return base + idx;
 }

From df56fba672d24a5f32a5a3c17e4f2f3cd7b01634 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Fri, 11 Feb 2022 05:18:33 +0300
Subject: [PATCH 141/531] prefix sum with simd

---
 .../hpc/algorithms/img/prefix-outline.png     | Bin 0 -> 12210 bytes
 content/english/hpc/algorithms/prefix.md      | 132 +++++++++++++-----
 2 files changed, 94 insertions(+), 38 deletions(-)
 create mode 100644 content/english/hpc/algorithms/img/prefix-outline.png

diff --git a/content/english/hpc/algorithms/img/prefix-outline.png b/content/english/hpc/algorithms/img/prefix-outline.png
new file mode 100644
index 0000000000000000000000000000000000000000..66c0ba82a74c91d9a44dc7ddf87ad25110de8526
GIT binary patch
literal 12210
[base85-encoded PNG data omitted]

[...]
 ```c++
-v4i update(int *p, v4i s) {
-    v4i d = broadcast(&p[3]);
+v4i accumulate(int *p, v4i s) {
+    v4i d = (v4i) _mm_broadcast_ss((float*) &p[3]);
     v4i x = _mm_load_si128((v4i*) p);
     x = _mm_add_epi32(s, x);
     _mm_store_si128((v4i*) p, x);
@@ -86,6 +123,8 @@ v4i update(int *p, v4i s) {
 }
 ```

+With `prefix` and `accumulate` implemented, the only thing left is to glue together our two-pass algorithm:
+
 ```c++
 void prefix(int *a, int n) {
     for (int i = 0; i < n; i += 8)
@@ -94,30 +133,29 @@ void prefix(int *a, int n) {
     v4i s = _mm_setzero_si128();

     for (int i = 4; i < n; i += 4)
-        s = update(&a[i], s);
+        s = accumulate(&a[i], s);
 }
 ```

-![](../img/prefix-simd.svg)
+The algorithm already performs slightly more than twice as fast as the scalar implementation but becomes slower for large arrays that fall out of the L3 cache — roughly at half the [two-way RAM bandwidth](/hpc/cpu-cache/bandwidth) as we are reading the entire array twice.

-We have a problem with memory.
+![](../img/prefix-simd.svg)

-only update: 5.8
-only prefix: 8.1
+Another interesting data point: if we only execute the `prefix` phase, the performance would be ~8.1 GFLOPS. The `accumulate` phase is slightly slower at ~5.8 GFLOPS. Sanity check: the total performance should be $\frac{1}{ \frac{1}{5.8} + \frac{1}{8.1} } \approx 3.4$.

 ### Blocking

-Write it in the same style as we did `update`:
+So, we have a memory bandwidth problem for large arrays. We can avoid re-fetching the entire array from the RAM if we split it into blocks that fit in the cache and process them separately. All we need to pass to the next block is the sum of the previous ones, so we can design a `local_prefix` function with an interface similar to `accumulate`:

 ```c++
-const int B = 4096; // adjust block size to your L1 cache size
+const int B = 4096; // <- ideally should be slightly less or equal to the L1 cache

 v4i local_prefix(int *a, v4i s) {
     for (int i = 0; i < B; i += 8)
         prefix(&a[i]);

     for (int i = 0; i < B; i += 4)
-        s = update(&a[i], s);
+        s = accumulate(&a[i], s);

     return s;
 }
@@ -129,30 +167,38 @@ void prefix(int *a, int n) {
 }
 ```

-(You have to make sure that $N$ is a multiple of $B$, but we are going to ignore that for now.)
+(We have to make sure that $N$ is a multiple of $B$, but we are going to ignore such implementation details for now.)
+
+The blocked version performs considerably better, and not just when the array is in the RAM:

 ![](../img/prefix-blocked.svg)

+The speedup in the RAM case compared to the non-blocked implementation is only ~1.5 and not 2. This is because the memory controller is sitting idle while we iterate over the cached block for the second time instead of fetching the next one — the [hardware prefetcher](/hpc/cpu-cache/prefetching) isn't advanced enough to detect this pattern.
+
 ### Continuous Loads

-The cache system is sitting idle when we do a second pass.
+There are several ways to solve this under-utilization problem. The obvious one is to use [software prefetching](/hpc/cpu-cache/prefetching) to explicitly request the next block while we are still processing the current one.

-One solution is to explicitly add [software prefetching](/hpc/cpu-cache/prefetching):
+It is better to add prefetching to the `accumulate` phase because it is slower and less memory-intensive than `prefix`:

 ```c++
-v4i update(int *p, v4i s) {
-    __builtin_prefetch(p + B); // <-- prefetch next block's data
+v4i accumulate(int *p, v4i s) {
+    __builtin_prefetch(p + B); // <-- prefetch the next block
     // ...
return s; } ``` +The performance slightly decreases for in-cache arrays, but approaches closer to 2 GFLOPS for the in-RAM ones: + ![](../img/prefix-prefetch.svg) -Interleaving: +Another approach is to do *interleaving* of the two phases. Instead of separating and alternating between them in large blocks, we can execute the two phases concurrently, with the `accumulate` phase lagging behind by a fixed number of iterations — similar to the [CPU pipeline](/hpc/pipelining): ```c++ const int B = 64; +// ^ small sizes cause pipeline stalls +// large sizes cause cache system inefficiencies void prefix(int *a, int n) { v4i s = _mm_setzero_si128(); @@ -162,29 +208,39 @@ void prefix(int *a, int n) { for (int i = B; i < n; i += 8) { prefix(&a[i]); - s = update(&a[i - B], s); - s = update(&a[i - B + 4], s); + s = accumulate(&a[i - B], s); + s = accumulate(&a[i - B + 4], s); } for (int i = n - B; i < n; i += 4) - s = update(&a[i], s); + s = accumulate(&a[i], s); } ``` +This has more benefits: the loop progresses at a constant speed, reducing the pressure on the memory system, and the scheduler sees the instructions of both subroutines, allowing it to be more efficient at assigning instruction to execution ports. + +For these reasons the performance improves even on small arrays: + ![](../img/prefix-interleaved.svg) -You can also combine it with prefetching, which improves performance even more. +Finally, combining it with prefetching improves the performance even more: ![](../img/prefix-interleaved-prefetch.svg) -It is sort of "meh". There are ways to do it with permutations, but it would kill the performance of the prefix stage. +The total speedup we were able to achieve is between $\frac{4.2}{1.5} \approx 2.8$ for small arrays and $\frac{2.1}{1.2} \approx 1.75$ for large arrays. 
-The speedup may be higher for lower-precision data as the scalar code can execute at most one iteration per cycle anyway +The speedup may be higher for lower-precision data compared to the scalar code, as it is pretty much limited to executing one iteration per cycle regardless of the operand size, but it is still sort of "meh" when compared to some [other SIMD-based algorithms](../argmin). This is largely because there isn't a full-register byte shift in AVX that would allow the `accumulate` stage to proceed twice as fast, let alone a dedicated prefix sum instruction. -### Other Relevant Work +There is this professor at CMU, [Guy Blelloch](https://www.cs.cmu.edu/~blelloch/), who [advocated](https://www.cs.cmu.edu/~blelloch/papers/sc90.pdf) for a dedicated prefix sum hardware back in the 90s when [vector processors](https://en.wikipedia.org/wiki/Vector_processor) were still a thing. Prefix sums are very important for parallel applications, and the hardware is becoming increasingly more parallel, so maybe the CPU manufacturers will revitalize this idea and make prefix sum calculations slightly easier. -There is this professor at CMU named [Guy Blelloch](https://www.cs.cmu.edu/~blelloch/) who [advocated](https://www.cs.cmu.edu/~blelloch/papers/sc90.pdf) back in the 90s where the idea of for vector computers having + From 92944ab9001c9020ce7d01306116472599e3bcaa Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 11 Feb 2022 06:01:51 +0300 Subject: [PATCH 142/531] references for prefix sum --- content/english/hpc/algorithms/prefix.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/content/english/hpc/algorithms/prefix.md b/content/english/hpc/algorithms/prefix.md index 10f7ff35..5c79154a 100644 --- a/content/english/hpc/algorithms/prefix.md +++ b/content/english/hpc/algorithms/prefix.md @@ -231,16 +231,17 @@ The total speedup we were able to achieve is between $\frac{4.2}{1.5} \approx 2. 
The speedup may be higher for lower-precision data compared to the scalar code, as it is pretty much limited to executing one iteration per cycle regardless of the operand size, but it is still sort of "meh" when compared to some [other SIMD-based algorithms](../argmin). This is largely because there isn't a full-register byte shift in AVX that would allow the `accumulate` stage to proceed twice as fast, let alone a dedicated prefix sum instruction. -There is this professor at CMU, [Guy Blelloch](https://www.cs.cmu.edu/~blelloch/), who [advocated](https://www.cs.cmu.edu/~blelloch/papers/sc90.pdf) for a dedicated prefix sum hardware back in the 90s when [vector processors](https://en.wikipedia.org/wiki/Vector_processor) were still a thing. Prefix sums are very important for parallel applications, and the hardware is becoming increasingly more parallel, so maybe the CPU manufacturers will revitalize this idea and make prefix sum calculations slightly easier. +### Other Relevant Work - From e05f7e337465e94c9e31b3c713e39d9f742a2ce0 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 11 Feb 2022 09:26:55 +0300 Subject: [PATCH 143/531] fix compiler version --- content/english/hpc/algorithms/argmin.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/algorithms/argmin.md b/content/english/hpc/algorithms/argmin.md index ee9b32bc..87517c70 100644 --- a/content/english/hpc/algorithms/argmin.md +++ b/content/english/hpc/algorithms/argmin.md @@ -288,7 +288,7 @@ min+find 18.21 12.92 find() has to scan the entire array + blocked 22.23 19.29 we still have an optional horizontal minimum every B elements ``` -Take these results with a grain of salt: the measurements are [quite noisy](/hpc/profiling/noise), they were done for just for two input distributions, for a specific array size ($N=2^{13}$, the size of the L1 cache), for a specific architecture (Zen 2), and for a specific and slightly outdated compiler (GCC 9.2) — the compiler 
optimizations were also very fragile to little changes in the benchmarking code. +Take these results with a grain of salt: the measurements are [quite noisy](/hpc/profiling/noise), they were done for just for two input distributions, for a specific array size ($N=2^{13}$, the size of the L1 cache), for a specific architecture (Zen 2), and for a specific and slightly outdated compiler (GCC 9.3) — the compiler optimizations were also very fragile to little changes in the benchmarking code. There are also still some minor things to optimize, but the potential improvement is less than 10% so I didn't bother. One day I may pluck up the courage, optimize the algorithm to the theoretical limit, handle the non-divisible-by-block-size array sizes and non-aligned memory cases, and then re-run the benchmarks properly on many architectures, with p-values and such. In case someone does it before me, please [ping me back](http://sereja.me/). From b996a94077b90b9f692e0ce84f84f532f3e2e7d9 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 11 Feb 2022 15:08:25 +0300 Subject: [PATCH 144/531] binary search code and graphs --- content/english/hpc/data-structures/_index.md | 4 + .../hpc/data-structures/binary-search.md | 592 +++++- .../english/hpc/data-structures/filters.md | 1 + .../hpc/data-structures/img/search-all.svg | 1615 ++++++++++++++++ .../img/search-bplus-other.svg | 1493 +++++++++++++++ .../hpc/data-structures/img/search-bplus.svg | 1302 +++++++++++++ .../img/search-branchless-prefetch.svg | 1482 +++++++++++++++ .../data-structures/img/search-branchless.svg | 1269 +++++++++++++ .../img/search-btree-hugepages.svg | 1243 +++++++++++++ .../img/search-btree-optimized.svg | 1434 +++++++++++++++ .../hpc/data-structures/img/search-btree.svg | 1589 ++++++++++++++++ .../img/search-eytzinger-branchless.svg | 1254 +++++++++++++ .../img/search-eytzinger-prefetch.svg | 1618 +++++++++++++++++ .../img/search-eytzinger-small.svg | 1098 +++++++++++ 
.../data-structures/img/search-eytzinger.svg | 1439 +++++++++++++++ .../img/search-latency-bplus.svg | 1511 +++++++++++++++ .../img/search-relative-latency.svg | 1208 ++++++++++++ .../data-structures/img/search-relative.svg | 1179 ++++++++++++ .../hpc/data-structures/img/search-std.svg | 1133 ++++++++++++ 19 files changed, 22405 insertions(+), 59 deletions(-) create mode 100644 content/english/hpc/data-structures/img/search-all.svg create mode 100644 content/english/hpc/data-structures/img/search-bplus-other.svg create mode 100644 content/english/hpc/data-structures/img/search-bplus.svg create mode 100644 content/english/hpc/data-structures/img/search-branchless-prefetch.svg create mode 100644 content/english/hpc/data-structures/img/search-branchless.svg create mode 100644 content/english/hpc/data-structures/img/search-btree-hugepages.svg create mode 100644 content/english/hpc/data-structures/img/search-btree-optimized.svg create mode 100644 content/english/hpc/data-structures/img/search-btree.svg create mode 100644 content/english/hpc/data-structures/img/search-eytzinger-branchless.svg create mode 100644 content/english/hpc/data-structures/img/search-eytzinger-prefetch.svg create mode 100644 content/english/hpc/data-structures/img/search-eytzinger-small.svg create mode 100644 content/english/hpc/data-structures/img/search-eytzinger.svg create mode 100644 content/english/hpc/data-structures/img/search-latency-bplus.svg create mode 100644 content/english/hpc/data-structures/img/search-relative-latency.svg create mode 100644 content/english/hpc/data-structures/img/search-relative.svg create mode 100644 content/english/hpc/data-structures/img/search-std.svg diff --git a/content/english/hpc/data-structures/_index.md b/content/english/hpc/data-structures/_index.md index 880b3d43..9516ad3c 100644 --- a/content/english/hpc/data-structures/_index.md +++ b/content/english/hpc/data-structures/_index.md @@ -5,3 +5,7 @@ draft: true --- Optimizing data structures is different 
from optimizing algorithms. It is harder. Each new aspect multiplies the design complexity. A lot more attention needs to be attached to memory and latency-bandwidth trade-offs. + +Defining the benchmarks is [harder](/hpc/profiling/noise/). + +A brief review of the [CPU cache system](/hpc/cpu-cache) is strongly advised. diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 7ce64718..2b0eed67 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -1,85 +1,122 @@ --- -title: Binary Search +title: Searching in Sorted Arrays weight: 1 --- +![](../img/search-std.svg) + +![](../img/search-branchless.svg) + +![](../img/search-branchless-prefetch.svg) + +--- + +![](../img/search-eytzinger.svg) + +![](../img/search-eytzinger-prefetch.svg) + +![](../img/search-eytzinger-small.svg) + +![](../img/search-eytzinger-branchless.svg) + +--- + +![](../img/search-btree.svg) +![](../img/search-btree-hugepages.svg) +![](../img/search-btree-optimized.svg) + +--- + +![](../img/search-bplus.svg) +![](../img/search-relative.svg) +![](../img/search-relative-latency.svg) +![](../img/search-bplus-other.svg) +![](../img/search-latency-bplus.svg) +![](../img/search-all.svg) + The most fascinating showcases of performance engineering are not intricate 5-10% speed improvements of some databases, but multifold optimizations of some basic algorithms you can find in a textbook — the ones that are so simple that it would never even occur to try to optimize them. These kinds of optimizations are simple and instructive, and can very much be adopted elsewhere. Yet, with remarkable periodicity, these can be optimized to ridiculous levels of performance. In this article, we will focus on such an algorithm — binary search — and significantly improve its efficiency by rearranging elements of a sorted array in a more cache-friendly way. 
We will develop two versions, each achieving 4-7x speedup over the standard `std::lower_bound`, depending on the cache level and available memory bandwidth: - The first one uses what is known as *Eytzinger layout*, which is also a popular layout for other structures such as binary heaps. Our minimalistic implementation is only ~15 lines. - The second one is its generalization based on *B-tree layout*, which is more bulky. Although it uses SIMD, which technically disqualifies it from being binary search. +- Novel structure based called S-tree based on -A brief review of [CPU cache system](../cpu-cache) is strongly advised. - -## Why Binary Search is Slow +GCC sucked on all benchmarks, so we will be using Clang (10) exclusively. The CPU is a Zen 2, although the results should be transferrable to other platforms, including most Arm-based chips. -Before jumping to optimized variants, let's briefly discuss the reasons why the textbook binary search is slow in the first place. +## Binary Search -Here is the standard way of searching for the first element not less than $x$ in a sorted array of $n$ integers: +Already sorted array `t` of size `n`. -```cpp +```c++ int lower_bound(int x) { int l = 0, r = n - 1; while (l < r) { - int t = (l + r) / 2; - if (a[t] >= x) - r = t; + int m = (l + r) / 2; + if (t[m] >= x) + r = m; else - l = t + 1; + l = m + 1; } - return a[l]; + return t[l]; } ``` -Find the middle element of the search range, compare to $x$, cut the range in half. Beautiful in its simplicity. - -This is actually how `std::lower_bound` from [libstdc++](https://github.com/gcc-mirror/gcc/blob/d9375e490072d1aae73a93949aa158fcd2a27018/libstdc%2B%2B-v3/include/bits/stl_algobase.h#L1023) works: - -Metaprogramming monstrosity. 
- -```cpp -template -_ForwardIterator __lower_bound(_ForwardIterator __first, _ForwardIterator __last, - const _Tp& __val, _Compare __comp) { - typedef typename iterator_traits<_ForwardIterator>::difference_type _DistanceType; - _DistanceType __len = std::distance(__first, __last); - - while (__len > 0) { - _DistanceType __half = __len >> 1; - _ForwardIterator __middle = __first; - std::advance(__middle, __half); - if (__comp(__middle, __val)) { - __first = __middle; - ++__first; - __len = __len - __half - 1; +This is actually how `std::lower_bound` from works. Implementations from both [Clang](https://github.com/llvm-mirror/libcxx/blob/78d6a7767ed57b50122a161b91f59f19c9bd0d19/include/algorithm#L4169) and [GCC](https://github.com/gcc-mirror/gcc/blob/d9375e490072d1aae73a93949aa158fcd2a27018/libstdc%2B%2B-v3/include/bits/stl_algobase.h#L1023) use this metaprogramming monstrosity: + +```c++ +template +_LIBCPP_CONSTEXPR_AFTER_CXX17 _ForwardIterator +__lower_bound(_ForwardIterator __first, _ForwardIterator __last, const _Tp& __value_, _Compare __comp) +{ + typedef typename iterator_traits<_ForwardIterator>::difference_type difference_type; + difference_type __len = _VSTD::distance(__first, __last); + while (__len != 0) + { + difference_type __l2 = _VSTD::__half_positive(__len); + _ForwardIterator __m = __first; + _VSTD::advance(__m, __l2); + if (__comp(*__m, __value_)) + { + __first = ++__m; + __len -= __l2 + 1; } else - __len = __half; + __len = __l2; } return __first; } ``` -If compiler is successful in piercing through the abstractions, it compiles to roughly the same machine code. - -### Spacial Locality - -* First ~10 queries may be cached (frequently accessed: temporal locality) -* Last 3-4 queries may be cached (may be in the same cache line: data locality) -* But that's it. Maybe store elements in a more cache-friendly way? 
- -![](../img/binary-search.png) +If compiler is successful in piercing through the abstractions, it compiles to roughly the same machine code and yields roughly the same performance. -### Temporal Locality +We change the compiler for GCC (9.3). For some reason, it doesn't work. -When we find lower bound of $x$ in a sorted array by binary searching, the main problem is that its memory accesses pattern is neither temporary nor spatially local. - -For example, element $\lfloor \frac n 2 \rfloor$ is accessed very often (every search) and element $\lfloor \frac n 2 \rfloor + 1$ is not, while they are probably occupying the same cache line. In general, only the first 3-5 reads are temporary local and only the last 3-4 reads are spatially local, and the rest are just random memory accesses. +```c++ +int lower_bound(int x) { + int *base = t, len = n; + while (len > 1) { + int half = len / 2; + base = (base[half] < x ? &base[half] : base); + len -= half; + } + return *(base + (*base < x)); +} +``` -![](../img/binary-heat.png) +```c++ +int lower_bound(int x) { + int *base = t, len = n; + while (len > 1) { + int half = len / 2; + __builtin_prefetch(&base[(len - half) / 2]); + __builtin_prefetch(&base[half + (len - half) / 2]); + base = (base[half] < x ? &base[half] : base); + len -= half; + } + return *(base + (*base < x)); +} +``` ### Branching @@ -135,6 +172,437 @@ So, to sum up: ideally, we'd want some layout that is both blocks, and higher-or We can overcome this by enumerating and permuting array elements in a more cache-friendly way. The numeration we will use is actually half a millennium old, and chances are you already know it. +## Why Binary Search is Slow + +Before jumping to optimized variants, let's briefly discuss the reasons why the textbook binary search is slow in the first place. 
+ +Here is the standard way of searching for the first element not less than $x$ in a sorted array of $n$ integers: + +```cpp +int lower_bound(int x) { + int l = 0, r = n - 1; + while (l < r) { + int t = (l + r) / 2; + if (a[t] >= x) + r = t; + else + l = t + 1; + } + return a[l]; +} +``` + +Find the middle element of the search range, compare to $x$, cut the range in half. Beautiful in its simplicity. + +### Spacial Locality + +* First ~10 queries may be cached (frequently accessed: temporal locality) +* Last 3-4 queries may be cached (may be in the same cache line: data locality) +* But that's it. Maybe store elements in a more cache-friendly way? + +![](../img/binary-search.png) + +### Temporal Locality + +When we find lower bound of $x$ in a sorted array by binary searching, the main problem is that its memory accesses pattern is neither temporary nor spatially local. + +For example, element $\lfloor \frac n 2 \rfloor$ is accessed very often (every search) and element $\lfloor \frac n 2 \rfloor + 1$ is not, while they are probably occupying the same cache line. In general, only the first 3-5 reads are temporary local and only the last 3-4 reads are spatially local, and the rest are just random memory accesses. + +![](../img/binary-heat.png) + +## Eytzinger Layout + +```c++ +void eytzinger(int k = 1) { + static int i = 0; + if (k <= n) { + eytzinger(2 * k); + t[k] = _a[i++]; + eytzinger(2 * k + 1); + } +} + +int lower_bound(int x) { + int k = 1; + while (k <= n) + k = 2 * k + (t[k] < x); + k >>= __builtin_ffs(~k); + return t[k]; +} +``` + + +```c++ +t[0] = -1; // an element that is less than X +iters = std::__lg(n + 1); +``` + +```c++ +int lower_bound(int x) { + int k = 1; + for (int i = 0; i < iters; i++) + k = 2 * k + (t[k] < x); + int *loc = (k <= n ? 
t + k : t); + k = 2 * k + (*loc < x); + k >>= __builtin_ffs(~k); + return t[k]; +} +``` + +```c++ +t = (int*) std::aligned_alloc(64, 4 * (n + 1)); + +int lower_bound(int x) { + int k = 1; + while (k <= n) { + __builtin_prefetch(t + k * B); + k = 2 * k + (t[k] < x); + } + k >>= __builtin_ffs(~k); + return t[k]; +} +``` + +```c++ +__builtin_prefetch(t + k * B * 2); +``` + +## B-Tree Layout + +```c++ +typedef __m256i reg; + +const int B = 16; +const int INF = std::numeric_limits::max(); + +int n; +int nblocks; +int *_a; +int (*btree)[B]; + +int go(int k, int i) { return k * (B + 1) + i + 1; } + +void build(int k = 0) { + static int t = 0; + if (k < nblocks) { + for (int i = 0; i < B; i++) { + build(go(k, i)); + btree[k][i] = (t < n ? _a[t++] : INF); + } + build(go(k, B)); + } +} + +void prepare(int *a, int _n) { + n = _n; + nblocks = (n + B - 1) / B; + _a = a; + btree = (int(*)[16]) std::aligned_alloc(64, 64 * nblocks); + build(); +} + +int cmp(reg x_vec, int* y_ptr) { + reg y_vec = _mm256_load_si256((reg*) y_ptr); + reg mask = _mm256_cmpgt_epi32(x_vec, y_vec); + return _mm256_movemask_ps((__m256) mask); +} + +int lower_bound(int x) { + int k = 0, res = INF; + reg x_vec = _mm256_set1_epi32(x); + while (k < nblocks) { + int mask = ~( + cmp(x_vec, &btree[k][0]) + + (cmp(x_vec, &btree[k][8]) << 8) + ); + int i = __builtin_ffs(mask) - 1; + if (i < B) + res = btree[k][i]; + k = go(k, i); + } + return res; +} +``` + +```c++ +btree = (int(*)[16]) std::aligned_alloc(2 * 1024 * 1024, 64 * nblocks); +madvise(btree, 64 * nblocks, MADV_HUGEPAGE); +``` + +```c++ +constexpr std::pair precalc(int n) { + int s = 0, // total size + l = B, // size of next layer + h = 0; // height so far + while (s + l - B < n) { + s += l; + l *= (B + 1); + h++; + } + int r = (n - s + B - 1) / B; // remaining blocks on last layer + return {h, s / B + (r + B) / (B + 1) * (B + 1)}; +} + +const int height = precalc(N).first, nblocks = precalc(N).second; +int *_a, (*btree)[B]; +``` + +```c++ +void 
permute(int *node) { + const reg perm_mask = _mm256_set_epi32(3, 2, 1, 0, 7, 6, 5, 4); // todo: setr + reg* middle = (reg*) (node + 4); + reg x = _mm256_loadu_si256(middle); + x = _mm256_permutevar8x32_epi32(x, perm_mask); + _mm256_storeu_si256(middle, x); +} +``` + +You call `permute(btree[k])` after you've done with constructing a node. + +```c++ +unsigned rank(reg x_vec, int* y_ptr) { + reg a = _mm256_load_si256((reg*) y_ptr); + reg b = _mm256_load_si256((reg*) (y_ptr + 8)); + + reg ca = _mm256_cmpgt_epi32(a, x_vec); + reg cb = _mm256_cmpgt_epi32(b, x_vec); + + reg c = _mm256_packs_epi32(ca, cb); + int mask = _mm256_movemask_epi8(c); + + return __tzcnt_u32(mask) >> 1; +} +``` + +```c++ +const int translate[17] = { + 0, 1, 2, 3, + 8, 9, 10, 11, + 4, 5, 6, 7, + 12, 13, 14, 15, + 0 +}; + +void update(int &res, int* node, unsigned i) { + int val = node[translate[i]]; + res = (i < B ? val : res); +} +``` + +```c++ +int lower_bound(int x) { + int k = 0, res = INF; + reg x_vec = _mm256_set1_epi32(x - 1); + for (int h = 0; h < height - 1; h++) { + int *node = btree[k]; + unsigned i = rank(x_vec, node); + k = k * (B + 1) + 1; // remove + 1? + if (h > 3) + __builtin_prefetch(&btree[go(k, 0)]); + update(res, node, i); + k += i; + } + unsigned i = rank(x_vec, btree[k]); + update(res, btree[k], i); + int k2 = go(k, i); + if (height > 4) + __builtin_prefetch(&btree[go(k, 0)]); + if (go(k, 0) < nblocks) { + unsigned i = rank(x_vec, btree[k2]); + update(res, btree[k2], i); + } + return res; +} +``` + +## B+ Tree Layout + +```c++ +constexpr int blocks(int n) { + return (n + B - 1) / B; +} + +constexpr int prev_keys(int n) { + return (blocks(n) + B) / (B + 1) * B; +} + +constexpr int height(int n) { + return (n <= B ? 
1 : height(prev_keys(n)) + 1); +} + +constexpr int offset(int h) { + int k = 0, n = N; + while (h--) { + k += blocks(n) * B; + n = prev_keys(n); + } + return k; +} + +const int H = height(N), S = offset(H); + +int *btree; + +void permute(int *node) { + const reg perm_mask = _mm256_set_epi32(3, 2, 1, 0, 7, 6, 5, 4); + reg* middle = (reg*) (node + 4); + reg x = _mm256_loadu_si256(middle); + x = _mm256_permutevar8x32_epi32(x, perm_mask); + _mm256_storeu_si256(middle, x); +} + +void prepare(int *a, int n) { + const int P = 1 << 21, T = (4 * S + P - 1) / P * P; + btree = (int*) std::aligned_alloc(P, T); + madvise(btree, T, MADV_HUGEPAGE); + + for (int i = N; i < S; i++) + btree[i] = INF; + + memcpy(btree, a, 4 * N); + + for (int h = 1; h < H; h++) { + for (int i = 0; i < offset(h + 1) - offset(h); i++) { + int k = i / B, + j = i - k * B; + k = k * (B + 1) + j + 1; // compare right + // and then always to the left + for (int l = 0; l < h - 1; l++) + k *= (B + 1); + btree[offset(h) + i] = (k * B < N ? 
btree[k * B] : INF); + } + } + + for (int i = offset(1); i < S; i += B) + permute(btree + i); +} + +unsigned direct_rank(reg x, int* y) { + reg a = _mm256_load_si256((reg*) y); + reg b = _mm256_load_si256((reg*) (y + 8)); + + reg ca = _mm256_cmpgt_epi32(a, x); + reg cb = _mm256_cmpgt_epi32(b, x); + + int mb = _mm256_movemask_ps((__m256) cb); + int ma = _mm256_movemask_ps((__m256) ca); + + unsigned mask = (1 << 16); + mask |= mb << 8; + mask |= ma; + + return __tzcnt_u32(mask); +} + +unsigned permuted_rank(reg x, int* y) { + reg a = _mm256_load_si256((reg*) y); + reg b = _mm256_load_si256((reg*) (y + 8)); + + reg ca = _mm256_cmpgt_epi32(a, x); + reg cb = _mm256_cmpgt_epi32(b, x); + + reg c = _mm256_packs_epi32(ca, cb); + unsigned mask = _mm256_movemask_epi8(c); + + return __tzcnt_u32(mask)/* >> 1*/; +} + +int lower_bound(int _x) { + unsigned k = 0; + reg x = _mm256_set1_epi32(_x - 1); + for (int h = H - 1; h > 0; h--) { + unsigned i = permuted_rank(x, btree + offset(h) + k); + + //k /= B; + //k *= (B + 1) * B; + // k += (i << 3); + + k = k * (B + 1) + (i << 3); + + //if (N > (1 << 21) && h == 1) + // __builtin_prefetch(btree + k); + + //k += (i << 3); + } + unsigned i = direct_rank(x, btree + k); + return btree[k + i]; +} +``` + +### Comparisons + + +### Measuring Actual Latency + +```c++ +int last = 0; + +for (int i = 0; i < m; i++) { + last = lower_bound(q[i] ^ last); + checksum ^= last; +} +``` + +```c++ +for (int i = 0; i < m; i++) + checksum ^= lower_bound(q[i]); +``` + +### Modifications + +```c++ +void permute32(int *node) { + // a b c d 1 2 3 4 -> (a c) (b d) (1 3) (2 4) -> (a c) (1 3) (b d) (2 4) + reg x = _mm256_load_si256((reg*) (node + 8)); + reg y = _mm256_load_si256((reg*) (node + 16)); + _mm256_storeu_si256((reg*) (node + 8), y); + _mm256_storeu_si256((reg*) (node + 16), x); + permute16(node); + permute16(node + 16); +} +``` + +```c++ +unsigned cmp(reg x, int *node) { + reg y = _mm256_load_si256((reg*) node); + reg mask = _mm256_cmpgt_epi32(y, x); + 
return _mm256_movemask_ps((__m256) mask); +} + +unsigned rank32(reg x, int *node) { + unsigned mask = cmp(x, node) + | (cmp(x, node + 8) << 8) + | (cmp(x, node + 16) << 16) + | (cmp(x, node + 24) << 24); +``` + +```c++ +unsigned permuted_rank32(reg x, int *node) { + reg a = _mm256_load_si256((reg*) node); + reg b = _mm256_load_si256((reg*) (node + 8)); + reg c = _mm256_load_si256((reg*) (node + 16)); + reg d = _mm256_load_si256((reg*) (node + 24)); + + reg ca = _mm256_cmpgt_epi32(a, x); + reg cb = _mm256_cmpgt_epi32(b, x); + reg cc = _mm256_cmpgt_epi32(c, x); + reg cd = _mm256_cmpgt_epi32(d, x); + + reg cab = _mm256_packs_epi32(ca, cb); + reg ccd = _mm256_packs_epi32(cc, cd); + reg cabcd = _mm256_packs_epi16(cab, ccd); + unsigned mask = _mm256_movemask_epi8(cabcd); + + return __tzcnt_u32(mask); +} +``` + +_mm256_stream_load_si256 — on just the last iteration. + +--- + + + ## Eytzinger Layout **Michaël Eytzinger** is a 16th century Austrian nobleman known for his work on genealogy, particularily for a system for numbering ancestors called *ahnentafel* (German for "ancestor table"). @@ -335,7 +803,7 @@ They are widely used for indexing in databases, especially those that operate on To perform static binary searches, one can implement a B-tree in an implicit way, i. e. without actually storing any pointers and spending only $O(1)$ additional memory, and $k$ could be made equal to the cache line size so that each node request fetches exactly one cache line. -![](../img/btree.png) +![](../img/b-tree.png) Turns out, they have the same rate of growth but sligtly larger compute-tied constant. While the latter is explainable (our while loop only has like 5 instructions; can't outpace that), the former is surprising. @@ -450,21 +918,20 @@ This worked [particularly well](https://finance.yahoo.com/quote/NVDA/) for paral Modern hardware can do [lots of stuff](https://software.intel.com/sites/landingpage/IntrinsicsGuide) under this paradigm, leveraging *data-level parallelism*. 
For example, the simplest thing you can do on modern Intel CPUs is to: 1. load 256-bit block of ints (which is $\frac{256}{32} = 8$ ints), - 2. load another 256-bit block of ints, - 3. add them together, - 4. write the result somewhere else …and this whole transaction costs the same as loading and adding just two ints—which means we can do 8 times more work. Magic! -So, as we promised before, we will perform all $16$ comparisons to compute the index of the right child node, but we leverage SIMD instructions to do it efficiently. Just to clarify—we want to do something like this: +So, as we promised before, we will perform all $16$ comparisons to compute the index of the right child node, but we leverage SIMD instructions to do it efficiently. Just to clarify — we want to do something like this: ```cpp int mask = (1 << B); + for (int i = 0; i < B; i++) mask |= (btree[k][i] >= x) << i; + int i = __builtin_ffs(mask) - 1; // now i is the number of the correct child node ``` @@ -476,11 +943,8 @@ Actually, compiler quite often produces very optimized code that leverages these The algorithm we will implement: 1. Somewhere before the main loop, convert $x$ to a vector of $8$ copies of $x$. - 2. Load the keys stored in node into another 256-bit vector. - 3. Compare these two vectors. This returns a 256-bit mask in which pairs that compared "greater than" are marked with ones. - 4. Create a 8-bit mask out of that and return it. Then you can feed it to `__builtin_ffs`. This is how it looks using C++ intrinsics, which are basically built-in wrappers for raw assembly instructions: @@ -562,4 +1026,14 @@ Note that this implementation is very specific to the architecture. Older CPUs a ## Acknowledgements -This tutorial is loosely based on a [46-page paper](https://arxiv.org/pdf/1509.05053.pdf) by Paul-Virak Khuong and Pat Morin "Array layouts for comparison-based searching". 
+The first half of the article is loosely based on "[Array Layouts for Comparison-Based Searching](https://arxiv.org/pdf/1509.05053.pdf)" by Paul-Virak Khuong and Pat Morin. It is 46 pages long and discusses scalar binary searches in more detail, so check it out if you're interested in other approaches.
+
+This [StackOverflow answer](https://stackoverflow.com/questions/20616605/using-simd-avx-sse-for-tree-traversal) by Cory Nelson is where I took the permuted SIMD routine.
+
+The more you think about the name. "S-tree" and "S+ tree" respectively. There is an obscure data structure in computer vision. We even have more claim to it than Bayer had on B-tree: it is succinct, SIMD, my name, my surname.
+
+
diff --git a/content/english/hpc/data-structures/filters.md b/content/english/hpc/data-structures/filters.md
index 8948d345..e8d38669 100644
--- a/content/english/hpc/data-structures/filters.md
+++ b/content/english/hpc/data-structures/filters.md
@@ -1,6 +1,7 @@
 ---
 title: Probabilistic Filters
 weight: 10
+draft: true
 ---
 
 bloom filters have the inverse behavior of caches*
diff --git a/content/english/hpc/data-structures/img/search-all.svg b/content/english/hpc/data-structures/img/search-all.svg
new file mode 100644
index 00000000..66508a1f
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-all.svg
@@ -0,0 +1,1615 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-bplus-other.svg b/content/english/hpc/data-structures/img/search-bplus-other.svg
new file mode 100644
index 00000000..d3316cb6
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-bplus-other.svg
@@ -0,0 +1,1493 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-bplus.svg b/content/english/hpc/data-structures/img/search-bplus.svg
new file mode 100644
index 00000000..1081ca31
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-bplus.svg
@@ -0,0 +1,1302 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-branchless-prefetch.svg b/content/english/hpc/data-structures/img/search-branchless-prefetch.svg
new file mode 100644
index 00000000..2e3c46ef
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-branchless-prefetch.svg
@@ -0,0 +1,1482 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-branchless.svg b/content/english/hpc/data-structures/img/search-branchless.svg
new file mode 100644
index 00000000..6ddacb23
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-branchless.svg
@@ -0,0 +1,1269 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-btree-hugepages.svg b/content/english/hpc/data-structures/img/search-btree-hugepages.svg
new file mode 100644
index 00000000..87a27edc
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-btree-hugepages.svg
@@ -0,0 +1,1243 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-btree-optimized.svg b/content/english/hpc/data-structures/img/search-btree-optimized.svg
new file mode 100644
index 00000000..ea008092
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-btree-optimized.svg
@@ -0,0 +1,1434 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-btree.svg b/content/english/hpc/data-structures/img/search-btree.svg
new file mode 100644
index 00000000..580b781f
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-btree.svg
@@ -0,0 +1,1589 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-eytzinger-branchless.svg b/content/english/hpc/data-structures/img/search-eytzinger-branchless.svg
new file mode 100644
index 00000000..b0f430ce
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-eytzinger-branchless.svg
@@ -0,0 +1,1254 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-eytzinger-prefetch.svg b/content/english/hpc/data-structures/img/search-eytzinger-prefetch.svg
new file mode 100644
index 00000000..eabbab16
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-eytzinger-prefetch.svg
@@ -0,0 +1,1618 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-eytzinger-small.svg b/content/english/hpc/data-structures/img/search-eytzinger-small.svg
new file mode 100644
index 00000000..248410ae
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-eytzinger-small.svg
@@ -0,0 +1,1098 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-eytzinger.svg b/content/english/hpc/data-structures/img/search-eytzinger.svg
new file mode 100644
index 00000000..843fd491
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-eytzinger.svg
@@ -0,0 +1,1439 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-latency-bplus.svg b/content/english/hpc/data-structures/img/search-latency-bplus.svg
new file mode 100644
index 00000000..92ea4a39
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-latency-bplus.svg
@@ -0,0 +1,1511 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-relative-latency.svg b/content/english/hpc/data-structures/img/search-relative-latency.svg
new file mode 100644
index 00000000..c1480666
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-relative-latency.svg
@@ -0,0 +1,1208 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-relative.svg b/content/english/hpc/data-structures/img/search-relative.svg
new file mode 100644
index 00000000..28cec67a
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-relative.svg
@@ -0,0 +1,1179 @@
+(SVG markup not shown)
diff --git a/content/english/hpc/data-structures/img/search-std.svg b/content/english/hpc/data-structures/img/search-std.svg
new file mode 100644
index 00000000..3486e33c
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-std.svg
@@ -0,0 +1,1133 @@
+(SVG markup not shown)

From 242b77226e29d6f7e635a39b9c1bcf09aaaeb916 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Sat, 12 Feb 2022 07:04:04 +0300
Subject: [PATCH 145/531] data structures intro

---
 content/english/hpc/data-structures/_index.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/english/hpc/data-structures/_index.md b/content/english/hpc/data-structures/_index.md
index 9516ad3c..125e8f56 100644
--- a/content/english/hpc/data-structures/_index.md
+++ b/content/english/hpc/data-structures/_index.md
@@ -4,8 +4,8 @@ weight: 12
 draft: true
 ---
 
-Optimizing data structures is different from
optimizing algorithms. It is harder. Each new aspect multiplies the design complexity. A lot more attention needs to be attached to memory and latency-bandwidth trade-offs. +Optimizing data structures is different from optimizing [algorithms](/hpc/algorithms) as data structure problems have more dimensions: you may be optimizing for *throughput*, for *latency*, for *memory usage*, or any combination of those — and this complexity blows up exponentially when you need to process *multiple* query types and consider multiple query distributions. -Defining the benchmarks is [harder](/hpc/profiling/noise/). +This makes simply [defining benchmarks](/hpc/profiling/noise/) much harder, let alone the actual implementations. In this chapter, we will try to navigate all this complexity and learn how to design efficient data structures with extensive case studies. A brief review of the [CPU cache system](/hpc/cpu-cache) is strongly advised. From 854e4fe9fbf2a85c161ba2d2b8eac54d2e84c043 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 12 Feb 2022 10:22:15 +0300 Subject: [PATCH 146/531] reorganize binary search --- .../hpc/data-structures/binary-search.md | 755 ++++++------------ 1 file changed, 260 insertions(+), 495 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 2b0eed67..fdab8d62 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -3,37 +3,6 @@ title: Searching in Sorted Arrays weight: 1 --- -![](../img/search-std.svg) - -![](../img/search-branchless.svg) - -![](../img/search-branchless-prefetch.svg) - ---- - -![](../img/search-eytzinger.svg) - -![](../img/search-eytzinger-prefetch.svg) - -![](../img/search-eytzinger-small.svg) - -![](../img/search-eytzinger-branchless.svg) - ---- - -![](../img/search-btree.svg) -![](../img/search-btree-hugepages.svg) -![](../img/search-btree-optimized.svg) - ---- - 
-![](../img/search-bplus.svg)
-![](../img/search-relative.svg)
-![](../img/search-relative-latency.svg)
-![](../img/search-bplus-other.svg)
-![](../img/search-latency-bplus.svg)
-![](../img/search-all.svg)
-
 The most fascinating showcases of performance engineering are not intricate 5-10% speed improvements of some databases, but multifold optimizations of some basic algorithms you can find in a textbook — the ones that are so simple that it would never even occur to try to optimize them. These kinds of optimizations are simple and instructive, and can very much be adopted elsewhere. Yet, with remarkable periodicity, these can be optimized to ridiculous levels of performance.
 
 In this article, we will focus on such an algorithm — binary search — and significantly improve its efficiency by rearranging elements of a sorted array in a more cache-friendly way. We will develop two versions, each achieving 4-7x speedup over the standard `std::lower_bound`, depending on the cache level and available memory bandwidth:
@@ -44,10 +13,17 @@ In this article, we will focus on such an algorithm — binary search — and si
 
 GCC sucked on all benchmarks, so we will be using Clang (10) exclusively. The CPU is a Zen 2, although the results should be transferrable to other platforms, including most Arm-based chips.
 
+
+This is a follow-up to a [previous article](https://algorithmica.org/en/eytzinger) about using Eytzinger memory layout to speed up binary search. Here we use implicit (pointerless) B-trees accelerated with SIMD operations to perform search efficiently while using less memory bandwidth.
+
+It performs slightly worse on array sizes that fit lower layers of cache, but in low-bandwidth environments it can be up to 3x faster (or 7x faster than `std::lower_bound`).
+
 ## Binary Search
 
 Already sorted array `t` of size `n`.
 
+We are going to turn an array named `a` into an array named `t`.
+
 ```c++
 int lower_bound(int x) {
     int l = 0, r = n - 1;
@@ -62,6 +38,8 @@ int lower_bound(int x) {
 }
 ```
 
+![](../img/search-std.svg)
+
 This is actually how `std::lower_bound` from `<algorithm>` works. Implementations from both [Clang](https://github.com/llvm-mirror/libcxx/blob/78d6a7767ed57b50122a161b91f59f19c9bd0d19/include/algorithm#L4169) and [GCC](https://github.com/gcc-mirror/gcc/blob/d9375e490072d1aae73a93949aa158fcd2a27018/libstdc%2B%2B-v3/include/bits/stl_algobase.h#L1023) use this metaprogramming monstrosity:
 
 ```c++
@@ -145,7 +123,7 @@ Contains an "if" that is impossible to predict better than a coin flip.
 
 It's not illegal: ternary operator is replaced with something like `CMOV`
 
-```cpp
+
+
+![](../img/search-branchless.svg)
+
+![](../img/search-branchless-prefetch.svg)
+
 But this is not the largest problem. The real problem is that it waits for its operands, and the results still can't be predicted.
 
@@ -212,43 +195,99 @@ For example, element $\lfloor \frac n 2 \rfloor$ is accessed very often (every s
 
 ## Eytzinger Layout
 
+**Michaël Eytzinger** is a 16th-century Austrian nobleman known for his work on genealogy, particularly for a system for numbering ancestors called *ahnentafel* (German for "ancestor table").
+
+Ancestry mattered a lot back then, but writing down that data was expensive. *Ahnentafel* allows displaying a person's genealogy compactly, without wasting extra space by drawing diagrams.
+
+It lists a person's direct ancestors in a fixed sequence of ascent. First, the person themselves is listed as number 1, and then, recursively, for each person numbered $k$, their father is listed as $2k$ and their mother as $(2k+1)$.
+
+Here is the example for Paul I, the great-grandson of Peter I, the Great:
+
+1. Paul I
+2. Peter III (Paul's father)
+3. Catherine II (Paul's mother)
+4. Charles Frederick (Peter's father, Paul's paternal grandfather)
+5. Anna Petrovna (Peter's mother, Paul's paternal grandmother)
+6.
Christian August (Catherine's father, Paul's maternal grandfather) +7. Johanna Elisabeth (Catherine's mother, Paul's maternal grandmother) + +Apart from being compact, it has some nice properties, like that all even-numbered persons are male and all odd-numbered (possibly apart from 1) are female. + +One can also find the number of a particular ancestor only knowing the genders of their descendants. For example, Peter the Great's bloodline is Paul I → Peter III → Anna Petrovna → Peter the Great, so his number should be $((1 \times 2) \times 2 + 1) \times 2 = 10$. + +**In computer science**, this enumeration has been widely used for implicit (i. e. pointer-free) implementation of heaps, segment trees, and other binary tree structures, where instead of names it stores underlying array items. + +This is how this layout will look when applied to binary search: + +![](../img/eytzinger.png) + +You can immediately see how its temporal locality is better (in fact, theoretically optimal) as the elements closer to the root are closer to the beginning of the array, and thus are more likely to be fetched from cache. + +![](../img/eytzinger-search.png) +![](../img/eytzinger-heat.png) + +### Construction + +Here is a function that constructs Eytzinger array by traversing the original search tree. + +It takes two indexes $i$ and $k$—one in the original array and one in constructed—and recursively goes to two branches until a leaf node is reached, which could simply be checked by asserting $k \leq n$ as Eytzinger array should have same number of items. 
+ ```c++ +int a[n], b[n + 1]; // <- change name + void eytzinger(int k = 1) { - static int i = 0; + static int i = 0; // <- careful running it multiple times if (k <= n) { eytzinger(2 * k); t[k] = _a[i++]; eytzinger(2 * k + 1); } } - -int lower_bound(int x) { - int k = 1; - while (k <= n) - k = 2 * k + (t[k] < x); - k >>= __builtin_ffs(~k); - return t[k]; -} ``` +Despite being recursive, this is actually a really fast implementation as all memory reads are sequential. -```c++ -t[0] = -1; // an element that is less than X -iters = std::__lg(n + 1); +Note that the first element is left unfilled and the whole array is essentially 1-shifted. This will actually turn out to be a huge performance booster. + +## Binary search implementation + +We can now descend this array using only indices: we just start with $k=1$ and execute $k := 2k$ if we need to go left and $k := 2k + 1$ if we need to go right. We don't even need to store and recalculate binary search boundaries anymore. + +The only problem arises when we need to restore the index of the resulting element, as $k$ may end up not pointing to a leaf node. Here is an example of how that can happen: + +```python + array: 1 2 3 4 5 6 7 8 +eytzinger: 4 2 5 1 6 3 7 8 +1st range: --------------- k := 1 +2nd range: ------- k := 2*k (=2) +3rd range: --- k := 2*k + 1 (=5) +4th range: - k := 2*k + 1 (=11) ``` +Here we query array of $[1, …, 8]$ for the lower bound of $x=4$. We compare it against $4$, $2$ and $5$, and go left-right-right and end up with $k = 11$, which isn't even a valid array index. + +Note that, unless the answer is the last element of the array, we compare $x$ against it at some point, and after we learn that it is not less than $x$, we start comparing $x$ against elements to the left, and all these comparisons will evaluate true (i. e. leading to the right). Hence, the solution to restoring the resulting element is to cancel some number of right turns. 
+
+This can be done in an elegant way by observing that the right turns are recorded in the binary notation of $k$ as 1-bits, and so we just need to find the number of trailing ones in the binary notation and right-shift $k$ by exactly that amount.
+
+To do this, we can invert the number (`~k`) and call the "find first set" instruction available on most systems. In GCC, the corresponding builtin is `__builtin_ffs`.
+
 ```c++
 int lower_bound(int x) {
     int k = 1;
-    for (int i = 0; i < iters; i++)
+    while (k <= n)
         k = 2 * k + (t[k] < x);
-    int *loc = (k <= n ? t + k : t);
-    k = 2 * k + (*loc < x);
     k >>= __builtin_ffs(~k);
     return t[k];
 }
 ```
 
+![](../img/search-eytzinger.svg)
+
+### Prefetching
+
+We could prefetch more than just the node's 2 children.
+
 ```c++
 t = (int*) std::aligned_alloc(64, 4 * (n + 1));
 
@@ -267,8 +306,75 @@ int lower_bound(int x) {
         __builtin_prefetch(t + k * B * 2);
 ```
 
+![](../img/search-eytzinger-prefetch.svg)
+
+### Last branch
+
+Let's zoom in.
+
+![](../img/search-eytzinger-small.svg)
+
+```c++
+t[0] = -1; // an element that is less than X
+iters = std::__lg(n + 1);
+```
+
+```c++
+int lower_bound(int x) {
+    int k = 1;
+    for (int i = 0; i < iters; i++)
+        k = 2 * k + (t[k] < x);
+    int *loc = (k <= n ? t + k : t);
+    k = 2 * k + (*loc < x);
+    k >>= __builtin_ffs(~k);
+    return t[k];
+}
+```
+
+![](../img/search-eytzinger-branchless.svg)
+
+That was a detour.
+
 ## B-Tree Layout
 
+B-trees are basically $(k+1)$-ary trees, meaning that they store $k$ elements in each node and choose between $(k+1)$ possible branches instead of 2.
+
+They are widely used for indexing in databases, especially those that operate on-disk, because if $k$ is big, this allows large sequential memory accesses while reducing the height of the tree.
+
+To perform static binary searches, one can implement a B-tree in an implicit way, i. e.
without actually storing any pointers and spending only $O(1)$ additional memory, and $k$ could be made equal to the cache line size so that each node request fetches exactly one cache line.
+
+![](../img/b-tree.png)
+
+It turns out they have the same rate of growth but a slightly larger compute-tied constant. While the latter is explainable (our while loop only has like 5 instructions; can't outpace that), the former is surprising.
+
+Let's assume that arithmetic costs nothing and do a simple cache block analysis:
+
+* The Eytzinger binary search is supposed to be $4$ times faster if compute didn't matter, as it requests cache lines ~4 times faster on average.
+
+* The B-tree makes $\frac{\log_{17} n}{\log_2 n} = \frac{\log n}{\log 17} \frac{\log 2}{\log n} = \frac{\log 2}{\log 17} \approx 0.245$ memory accesses per each memory access of binary search, i. e. it requests ~4 times fewer cache lines.
+
+This explains why they have roughly the same slope.
+
+Note that this method, while being great for the single-threaded world, is unlikely to make its way into databases and heavy multi-threaded applications, because it sacrifices bandwidth to achieve low latency.
+
+[Part 2](https://algorithmica.org/en/b-tree) explores efficient implementation of implicit static B-trees in bandwidth-constrained environments.
+
+### B-tree layout
+
+B-trees generalize the concept of binary search trees by allowing nodes to have more than two children.
+
+Instead of a single key, a B-tree node contains up to $B$ sorted keys and may have up to $(B+1)$ children, thus reducing the tree height by a factor of $\frac{\log_2 n}{\log_B n} = \frac{\log B}{\log 2} = \log_2 B$.
+
+They were primarily developed for the purpose of managing on-disk databases, as the random access time of a disk is almost the same as the time needed to read 1MB of data sequentially, which makes the trade-off between number of comparisons and tree height beneficial.
In our implementation, we will make the size of each block equal to the cache line size, which in case of `int` is 16 elements. + +Normally, a B-tree node also stores $(B+1)$ pointers to its children, but we will only store keys and rely on pointer arithmetic, similar to the one used in Eytzinger array: + +* The root node is numbered $0$. + +* Node $k$ has $(B+1)$ child nodes numbered $\{k \cdot (B+1) + i\}$ for $i \in [1, B+1]$. + +Keys are stored in a 2d array in non-decreasing order. If the length of the initial array is not a multiple of $B$, the last block is padded with the largest value of its data type. + ```c++ typedef __m256i reg; @@ -324,11 +430,60 @@ int lower_bound(int x) { } ``` + +We can construct the B-tree similarly by traversing the search tree. + +It is correct, because each value of the initial array will be copied to a unique position in the resulting array, and the tree height is $\Theta(\log_{B+1} n)$, because $k$ is multiplied by $(B + 1)$ each time a child node is created. + +Note that this approach causes a slight imbalance: "lefter" children may have larger respective ranges. + +So, as we promised before, we will perform all $16$ comparisons to compute the index of the right child node, but we leverage SIMD instructions to do it efficiently. To clarify, we want to do something like this: + +```cpp +int mask = (1 << B); + +for (int i = 0; i < B; i++) + mask |= (btree[k][i] >= x) << i; + +int i = __builtin_ffs(mask) - 1; +// now i is the number of the correct child node +``` + + +…but ~8 times faster. + +Actually, compilers quite often produce very optimized code that leverages these instructions for certain types of loops. This is called auto-vectorization, and this is the reason why a loop that sums up an array of `short`s is faster (theoretically by a factor of two) than the same loop for `int`s: you can fit more elements in the same 256-bit block. Sadly, this is not our case, as we have loop-carried dependencies. 
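As a rough, portable illustration of the numbering, the in-order construction and the scalar mask loop described above, here is a minimal sketch. The parameters `B = 4` and `N = 23`, the array `sorted_input`, and the name `lower_bound_btree` are assumptions chosen for readability and are not taken from the patch; the text itself uses `B = 16`. It relies on the GCC/Clang builtin `__builtin_ffs`.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Hypothetical small parameters for illustration (the text uses B = 16).
constexpr int B = 4;                       // keys per node
constexpr int N = 23;                      // element count, deliberately not a multiple of B
constexpr int NBLOCKS = (N + B - 1) / B;   // number of nodes
constexpr int INF = INT_MAX;               // padding value for the last block

int sorted_input[N];                       // the original sorted array
int btree[NBLOCKS][B];                     // implicit B-tree: keys only, no pointers

// Index of the i-th child of node k, with i counted from zero (i in [0, B]).
int go(int k, int i) { return k * (B + 1) + i + 1; }

// In-order traversal: visit child i, then write key i, and finally visit the
// last child. `t` is the next unread position in sorted_input.
void build(int k, int &t) {
    if (k >= NBLOCKS) return;
    for (int i = 0; i < B; i++) {
        build(go(k, i), t);
        btree[k][i] = (t < N ? sorted_input[t++] : INF);
    }
    build(go(k, B), t);
}

// Returns the smallest stored key that is >= x, or INF if there is none.
int lower_bound_btree(int x) {
    int k = 0, res = INF;
    while (k < NBLOCKS) {
        int mask = (1 << B);               // sentinel bit: "no key in this node is >= x"
        for (int i = 0; i < B; i++)
            mask |= (btree[k][i] >= x) << i;
        int i = __builtin_ffs(mask) - 1;   // position of the first key >= x, or B
        if (i < B) res = btree[k][i];      // remember the best candidate so far
        k = go(k, i);                      // descend into the i-th child
    }
    return res;
}
```

To try it, fill `sorted_input` with sorted values, call `build(0, t)` with `int t = 0`, and compare `lower_bound_btree(x)` against `std::lower_bound` on the original array. The padding ends up in the slots visited last by the traversal, so the in-order sequence of keys stays non-decreasing.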
+ +The algorithm we will implement: + +1. Somewhere before the main loop, convert $x$ to a vector of $8$ copies of $x$. +2. Load the keys stored in the node into another 256-bit vector. +3. Compare these two vectors. This returns a 256-bit mask in which the lanes where $x$ compared "greater than" the key are filled with ones. +4. Create an 8-bit mask out of that and return it. Then you can feed it to `__builtin_ffs`. + +This is how it looks using C++ intrinsics, which are basically built-in wrappers for raw assembly instructions: + + +After that, we call this function two times (because our node size / cache line happens to be 512 bits, which is twice as big) and blend these masks together with bitwise operations. + + +That's it. This implementation should outperform even the [state-of-the-art indexes](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) used in high-performance databases, though it's mostly due to the fact that data structures used in real databases have to support fast updates while we don't. + +Note that this implementation is very specific to the architecture. Older CPUs and CPUs on mobile devices don't have 256-bit wide registers and will crash (but they likely have 128-bit SIMD so the loop can still be split in 4 parts instead of 2), non-x86 CPUs have their own instruction sets for SIMD, and some computers even have a different cache line size. 
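Since the intrinsics only run on CPUs with AVX2, it may help to see the four steps above modeled in plain C++ first. The sketch below is a scalar emulation and not the SIMD code itself: `Vec8`, `broadcast`, `load`, `cmp_mask`, and `node_rank` are hypothetical names introduced here, standing in for a 256-bit register, a broadcast, a vector load, and a compare followed by a movemask. The two 8-lane masks are combined and inverted before calling `__builtin_ffs` (a GCC/Clang builtin), with bit 16 acting as a sentinel.

```cpp
#include <cassert>
#include <cstdint>

constexpr int LANES = 8;   // a 256-bit register holds eight 32-bit ints
constexpr int B = 16;      // keys per node, i.e. two registers per node

// Scalar stand-in for a 256-bit register (hypothetical type).
struct Vec8 { int32_t lane[LANES]; };

// Step 1: make a vector of 8 copies of x.
Vec8 broadcast(int32_t x) {
    Vec8 v;
    for (int i = 0; i < LANES; i++) v.lane[i] = x;
    return v;
}

// Step 2: load 8 consecutive keys of a node.
Vec8 load(const int32_t *p) {
    Vec8 v;
    for (int i = 0; i < LANES; i++) v.lane[i] = p[i];
    return v;
}

// Steps 3 and 4: compare lane by lane and compress the result into 8 bits.
// Bit i is set if and only if x is greater than the i-th key.
unsigned cmp_mask(const Vec8 &x, const Vec8 &y) {
    unsigned m = 0;
    for (int i = 0; i < LANES; i++)
        m |= unsigned(x.lane[i] > y.lane[i]) << i;
    return m;
}

// Index of the first key >= x in a sorted 16-key node, or 16 if there is none.
int node_rank(int32_t x, const int32_t *node) {
    Vec8 xv = broadcast(x);
    unsigned lo = cmp_mask(xv, load(node));          // keys 0..7
    unsigned hi = cmp_mask(xv, load(node + LANES));  // keys 8..15
    unsigned less = lo | (hi << LANES);              // bit i set iff node[i] < x
    unsigned not_less = ~less & 0x1FFFFu;            // invert; bit 16 is the sentinel
    return __builtin_ffs(int(not_less)) - 1;
}
```

For a node holding `10, 20, ..., 160`, `node_rank(90, node)` is 8 and `node_rank(161, node)` is 16, which selects the last child. In a real AVX2 version each helper would presumably map to a single instruction, which is where the speedup over the scalar loop comes from.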
+ +![](../img/search-btree.svg) + +### Optimizations + +Enable huge pages: + ```c++ btree = (int(*)[16]) std::aligned_alloc(2 * 1024 * 1024, 64 * nblocks); madvise(btree, 64 * nblocks, MADV_HUGEPAGE); ``` +![](../img/search-btree-hugepages.svg) + ```c++ constexpr std::pair precalc(int n) { int s = 0, // total size @@ -347,18 +502,6 @@ const int height = precalc(N).first, nblocks = precalc(N).second; int *_a, (*btree)[B]; ``` -```c++ -void permute(int *node) { - const reg perm_mask = _mm256_set_epi32(3, 2, 1, 0, 7, 6, 5, 4); // todo: setr - reg* middle = (reg*) (node + 4); - reg x = _mm256_loadu_si256(middle); - x = _mm256_permutevar8x32_epi32(x, perm_mask); - _mm256_storeu_si256(middle, x); -} -``` - -You call `permute(btree[k])` after you've done with constructing a node. - ```c++ unsigned rank(reg x_vec, int* y_ptr) { reg a = _mm256_load_si256((reg*) y_ptr); @@ -374,12 +517,37 @@ unsigned rank(reg x_vec, int* y_ptr) { } ``` -```c++ -const int translate[17] = { - 0, 1, 2, 3, - 8, 9, 10, 11, - 4, 5, 6, 7, - 12, 13, 14, 15, +`packs` + +Or + + + + + +```c++ +void permute(int *node) { + const reg perm = _mm256_setr_epi32(4, 5, 6, 7, 0, 1, 2, 3); + reg* middle = (reg*) (node + 4); + reg x = _mm256_loadu_si256(middle); + x = _mm256_permutevar8x32_epi32(x, perm); + _mm256_storeu_si256(middle, x); +} +``` + +There are probably faster ways to swap middle elements, but we will leave it here. + +You call `permute(btree[k])` after you've done with constructing a node. + + +```c++ +const int translate[17] = { + 0, 1, 2, 3, + 8, 9, 10, 11, + 4, 5, 6, 7, + 12, 13, 14, 15, 0 }; @@ -397,16 +565,12 @@ int lower_bound(int x) { int *node = btree[k]; unsigned i = rank(x_vec, node); k = k * (B + 1) + 1; // remove + 1? 
- if (h > 3) - __builtin_prefetch(&btree[go(k, 0)]); update(res, node, i); k += i; } unsigned i = rank(x_vec, btree[k]); update(res, btree[k], i); int k2 = go(k, i); - if (height > 4) - __builtin_prefetch(&btree[go(k, 0)]); if (go(k, 0) < nblocks) { unsigned i = rank(x_vec, btree[k2]); update(res, btree[k2], i); @@ -415,6 +579,10 @@ int lower_bound(int x) { } ``` +All that hard work is totally worth it: + +![](../img/search-btree-optimized.svg) + ## B+ Tree Layout ```c++ @@ -529,11 +697,23 @@ int lower_bound(int _x) { } ``` -### Comparisons +![](../img/search-bplus.svg) + +Makes more sense to look at it as a relative speedup: +![](../img/search-relative.svg) ### Measuring Actual Latency +One huge asterisk we didn't disclosed. + +```c++ +for (int i = 0; i < m; i++) + checksum ^= lower_bound(q[i]); +``` + +To measure *actual* latency, we need to introduce a dependency between the iterations, so that the next one can't start before the previous finishes: + ```c++ int last = 0; @@ -543,10 +723,8 @@ for (int i = 0; i < m; i++) { } ``` -```c++ -for (int i = 0; i < m; i++) - checksum ^= lower_bound(q[i]); -``` +![](../img/search-relative-latency.svg) + ### Modifications @@ -597,432 +775,19 @@ unsigned permuted_rank32(reg x, int *node) { } ``` -_mm256_stream_load_si256 — on just the last iteration. - ---- - - - -## Eytzinger Layout - -**Michaël Eytzinger** is a 16th century Austrian nobleman known for his work on genealogy, particularily for a system for numbering ancestors called *ahnentafel* (German for "ancestor table"). - -Ancestry mattered a lot back then, but writing down that data was expensive. *Ahnentafel* allows displaying a person's genealogy compactly, without wasting extra space by drawing diagrams. - -It lists a person's direct ancestors in a fixed sequence of ascent. First, the person theirself is listed as number 1, and then, recursively, for each person numbered $k$, their father is listed as $2k$ and their mother as $(2k+1)$. 
- -Here is the example for Paul I, the great-grandson of Peter I, the Great: - -1. Paul I -2. Peter III (Paul's father) -3. Catherine II (Paul's mother) -4. Charles Frederick (Peter's father, Paul's paternal grandfather) -5. Anna Petrovna (Peter's mother, Paul's paternal grandmother) -6. Christian August (Catherine's father, Paul's maternal grandfather) -7. Johanna Elisabeth (Catherine's mother, Paul's maternal grandmother) - -Apart from being compact, it has some nice properties, like that all even-numbered persons are male and all odd-numbered (possibly apart from 1) are female. - -One can also find the number of a particular ancestor only knowing the genders of their descendants. For example, Peter the Great's bloodline is Paul I → Peter III → Anna Petrovna → Peter the Great, so his number should be $((1 \times 2) \times 2 + 1) \times 2 = 10$. - -**In computer science**, this enumeration has been widely used for implicit (i. e. pointer-free) implementation of heaps, segment trees, and other binary tree structures, where instead of names it stores underlying array items. - -This is how this layout will look when applied to binary search: - -![](../img/eytzinger.png) - -You can immediately see how its temporal locality is better (in fact, theoretically optimal) as the elements closer to the root are closer to the beginning of the array, and thus are more likely to be fetched from cache. - -![](../img/eytzinger-search.png) -![](../img/eytzinger-heat.png) - -### Construction - -Here is a function that constructs Eytzinger array by traversing the original search tree. - -It takes two indexes $i$ and $k$—one in the original array and one in constructed—and recursively goes to two branches until a leaf node is reached, which could simply be checked by asserting $k \leq n$ as Eytzinger array should have same number of items. 
- -```cpp -const int n = 1e5; -int a[n], b[n+1]; - -int eytzinger(int i = 0, int k = 1) { - if (k <= n) { - i = eytzinger(i, 2 * k); - b[k] = a[i++]; - i = eytzinger(i, 2 * k + 1); - } - return i; -} -``` - -Despite being recursive, this is actually a really fast implementation as all memory reads are sequential. - -Note that the first element is left unfilled and the whole array is essencially 1-shifted. This will actually turn out to be a huge performance booster. - -## Binary search implementation - -We can now descend this array using only indices: we just start with $k=1$ and execute $k := 2k$ if we need to go left and $k := 2k + 1$ if we need to go right. We don't even need to store and recalculate binary search boundaries anymore. - -The only problem arises when we need to restore the index of the resulting element, as $k$ may end up not pointing to a leaf node. Here is an example of how that can happen: - -```python - array: 1 2 3 4 5 6 7 8 -eytzinger: 4 2 5 1 6 3 7 8 -1st range: --------------- k := 1 -2nd range: ------- k := 2*k (=2) -3rd range: --- k := 2*k + 1 (=5) -4th range: - k := 2*k + 1 (=11) -``` - -Here we query array of $[1, …, 8]$ for the lower bound of $x=4$. We compare it against $4$, $2$ and $5$, and go left-right-right and end up with $k = 11$, which isn't even a valid array index. - -Note that, unless the answer is the last element of the array, we compare $x$ against it at some point, and after we learn that it is not less than $x$, we start comparing $x$ against elements to the left, and all these comparisons will evaluate true (i. e. leading to the right). Hence, the solution to restoring the resulting element is to cancel some number of right turns. - -This can be done in an elegant way by observing that the right turns are recorded in the binary notation of $k$ as 1-bits, and so we just need to find the number of trailing ones in the binary notation and right-shift $k$ by exactly that amount. 
- -To do this we can invert the number (`~x`) and call "find first set" instruction available on most systems. In GCC, the corresponding builtin is `__builtin_ffs`. - -```cpp -int search(int x) { -    int k = 1; -    while (k <= n) { -     if (b[k] >= x) -     k = 2 * k; -     else -     k = 2 * k + 1; -    } -    k >>= __builtin_ffs(~k); -    return b[k]; -} -``` - -Note that $k$ will be zero if binary search returned no result (i. e. all elements are less than $x$ and all turns were right-turns that got canceled). In that case, you can put a special flag in the first element of `b`. - -This is already 2-3 times faster than `std::lower_bound`, but we are not going to stop there and apply a series of small incremental improvements. - -### Branch-free - -Compiled program instructions are stored and loaded from main memory too, just as normal data. They are fetched during execution by similar mechanisms, and they have a separate instruction cache. In fact, in large applications you can sometimes remove blocks of literally unused code, and the program may run faster because of better instruction cache hit rate, but this is a topic for another article. - -To avoid performance hits caused by memory latency here, CPU loads 20-ish instructions ahead of time, but to do this it needs to know ahead of time which instructions to fetch. If a program has conditional execution (if-s, while-s, for-s) there is no option other than to take a guess. - -Branch misprediction (guessing "wrong" branch of "if") costs around 10-20 cycles. To partially negate this penalty, hardware [branch predictors](https://en.wikipedia.org/wiki/Branch_predictor) were developed. These are complex ad-hoc systems that use statistical methods—some even use simple [neural networks](https://en.wikipedia.org/wiki/Branch_predictor#Neural_branch_prediction)—to make a more accurate guess. 
- -In case of binary search, if all of our data is random, branch prediction doesn't help at all, just because it can't: all comparisons are 50-50. This is why we need to get rid of if-s and rewrite our main loop the following way: - -```cpp -while (k <= n) - k = 2 * k + (b[k] < x); -``` - -It also directly saves us from executing a few unnecessary arithmetic instructions. - -### Prefetching - -Compiler doesn't like when CPU is sitting idle while waiting for memory fetches. Sometimes it can take a guess about which cache line is going to be needed soon and fetch it ahead of time (recall that bandwidth-latency product is usually much larger than 1). - -This works well for simple access patterns, like iterating over array in increasing or decreasing order, but for something complex like what we have here it's not going to perform well. - -As we know a bit more about our problem than the compiler does, we can explicitly tell it to prefetch a cache line we need. This is done by `__builtin_prefetch` in GCC: - -```cpp -while (k <= n) { - __builtin_prefetch(b + k * block_size); - k = 2 * k + (b[k] < x); -} -``` - -Here, `block_size` equals 16, which is precisely how many ints are needed to cover a cache line. When we reference cache line at `b + k * block_size`, we are referencing $k$'s grand-grandson (`block_size` = $2 \times 2 \times 2 \times 2$, or 4 left turns) and possibly some of his neighbours in his layer (recall that indexes at the same level are just consecutive numbers). - -The whole point of doing this is that there is a good chance that we will prefetch an element that we will use later on $(i+4)$-th iteration. What chance, exactly? Well, it turns out that it is constant for each iteration. - -### Memory allignment - -Note that for each layer in the tree, except for the first 4 and possibly the last one, the number of nodes in that layer is divisible by 16, the block size. 
This means that the fraction of covered nodes on *each* iteration depends only on the position of the first offset of the array in respect to its cache line. But what is more important is that it can be made that all of $k$'s grand-grandchildren are covered by the same cache line. - -The way to achieve this is to place the first element of the array to the 1st position (0-indexed) of a cache line, or placing the array itself on the beginning of a cache line, since its first (i. e. `b[0]`) element is blank by design. This way the next $1 + 2 + 4 + 8 = 15$ elements of first 4 layers will occupy the rest of the cache line, and the rest of the array is alligned in nice 16-element blocks of nodes that share a grandpa. - -We just need to ask memory manager to allocate our array on the beginning of a cache line (by default it allocates your arrays wherever it wants), and that's it. To do this, we can use `alignas` specifier: - -```cpp -alignas(64) int b[n+1]; -``` - -This is it. Now our algorithm is constantly prefetching 4 layers / cache lines ahead of time, which is covered by the bandwidth of our RAM. This way the effective latency is reduced by a factor of 4, and we're basically trading off bandwidth for latency. - -### Complete implementation - -```cpp -#pragma GCC optimize("O3") -#include - -using namespace std; - -const int n = (1<<20); -const int block_size = 16; // = 64 / 4 = cache_line_size / sizeof(int) -alignas(64) int a[n], b[n+1]; - -int eytzinger(int i = 0, int k = 1) { - if (k <= n) { - i = eytzinger(i, 2 * k); - b[k] = a[i++]; - i = eytzinger(i, 2 * k + 1); - } - return i; -} - -int search(int x) { - int k = 1; - while (k <= n) { - __builtin_prefetch(b + k * block_size); - k = 2 * k + (b[k] < x); - } - k >>= __builtin_ffs(~k); - return k; -} -``` - -Few more things to note: - -* It works best when $n$ is a power of 2 or close to it, because otherwise the branch predictor will have a hard time figuring out whether or not to unroll the $(\log n)$-th cycle. 
- -* Its performance varies by cache size and array length, but stays >3x even on smaller arrays (<1MB). - -* Preprocessing isn't costly. It is around 1% of the cost of firing the same number of queries as the array size. - -* Modern hardware won't penalize you for prefetching cache lines that aren't yours, though this maybe be an issue for older CPUs, which can be solved by a simple `if` statement. - -* For some reason, basic binary search implementation (the very first code block in this article) is already ~20% faster than `std::sort`. - -## B-tree Layout - -B-trees are basically $(k+1)$-ary trees, meaning that they store $k$ elements in each node and choose between $(k+1)$ possible branches instead of 2. - -They are widely used for indexing in databases, especially those that operate on-disk, because if $k$ is big, this allows large sequential memory accesses while reducing the height of the tree. - -To perform static binary searches, one can implement a B-tree in an implicit way, i. e. without actually storing any pointers and spending only $O(1)$ additional memory, and $k$ could be made equal to the cache line size so that each node request fetches exactly one cache line. - -![](../img/b-tree.png) - -Turns out, they have the same rate of growth but sligtly larger compute-tied constant. While the latter is explainable (our while loop only has like 5 instructions; can't outpace that), the former is surprising. - -Let's assume that arithmetic costs nothing and do simple cache block analysis: - -* The Eytzinger binary search is supposed to be $4$ times faster if compute didn't matter, as it requests them ~4 times faster on average. - -* The B-tree makes $\frac{\log_{17} n}{\log_2 n} = \frac{\log n}{\log 17} \frac{\log 2}{\log n} = \frac{\log 2}{\log 17} \approx 0.245$ memory access per each request of binary search, i. e. it requests ~4 times less cache lines to fetch - -This explains why they have roughly the same slope. 
- -Note that this method, while being great for single-threaded world, is unlikely to make its way into database and heavy multi-threaded applications, because it sacrifices bandwidth to achieve low latency. - -[Part 2](https://algorithmica.org/en/b-tree) explores efficient implementation of implicit static B-trees in bandwidth-constrained environment. - - -## Implicit Static B-trees - -This is a follow up on a [previous article](https://algorithmica.org/en/eytzinger) about using Eytzinger memory layout to speed up binary search. Here we use implicit (pointerless) B-trees accelerated with SIMD operations to perform search efficiently while using less memory bandwidth. - -It performs slightly worse on array sizes that fit lower layers of cache, but in low-bandwidth environments it can be up to 3x faster (or 7x faster than `std::lower_bound`). - -## B-tree layout - -B-trees generalize the concept of binary search trees by allowing nodes to have more than two children. - -Instead of single key, a B-tree node contains up to $B$ sorted keys may have up to $(B+1)$ children, thus reducing the tree height in $\frac{\log_2 n}{\log_B n} = \frac{\log B}{\log 2} = \log_2 B$ times. - -They were primarily developed for the purpose of managing on-disk databases, as their random access times are almost the same as reading 1MB of data sequentially, which makes the trade-off between number of comparisons and tree height beneficial. In our implementation, we will make each the size of each block equal to the cache line size, which in case of `int` is 16 elements. - -Normally, a B-tree node also stores $(B+1)$ pointers to its children, but we will only store keys and rely on pointer arithmetic, similar to the one used in Eytzinger array: - -* The root node is numbered $0$. - -* Node $k$ has $(B+1)$ child nodes numbered $\{k \cdot (B+1) + i\}$ for $i \in [1, B]$. - -Keys are stored in a 2d array in non-decreasing order. 
If the length of the initial array is not a multiple of $B$, the last block is padded with the largest value if its data type. - -```cpp -const int nblocks = (n + B - 1) / B; -alignas(64) int btree[nblocks][B]; - -int go(int k, int i) { - return k * (B + 1) + i + 1; -} -``` - -In the code, we use zero-indexation for child nodes. - -## Construction - -We can construct B-tree similarly by traversing the search tree. - -```cpp -void build(int k = 0) { - static int t = 0; - if (k < nblocks) { - for (int i = 0; i < B; i++) { - build(go(k, i)); - btree[k][i] = (t < n ? a[t++] : INF); - } - build(go(k, B)); - } -} -``` - -It is correct, because each value of initial array will be copied to a unique position in the resulting array, and the tree height is $\Theta(\log_{B+1} n)$, because $k$ is multiplied by $(B + 1)$ each time a child node is created. - -Note that this approach causes a slight imbalance: "lefter" children may have larger respective ranges. - -## Basic Search - -Here is a short but rather inefficient implementation that we will improve later: - -```cpp -int search(int x) { - int k = 0, res = INF; - start: // the only justified usage of the goto statement - // as doing otherwise would add extra inefficiency and more code - while (k < nblocks) { - for (int i = 0; i < B; i++) { - if (btree[k][i] >= x) { - res = btree[k][i]; - k = go(k, i); - goto start; - } - } - k = go(k, B); - } - return res; -} -``` - -The issue here is that it runs a linear search on the whole array, and also that it has lots of conditionals that costs much more than just comparing integers. - -Here are some ideas to counter this: - -* We could unroll the loop so that it performs $B$ comparisons unconditionally and computes index of the right child node. - -* We could run a tiny binary search to get the right index, but there is considerable overhead to this. - -* We could code all the binary search comparisons by hand, or force compiler to do it so that there is no overhead. 
- -But we'll pick another path. We will honestly do all the comparisons, but in a very efficient way. - -## SIMD - -Back in the 90s, computer engineers discovered that you can get more bang for a buck by adding circuits that do more useful work per cycle than just trying to increase CPU clock rate which [can't continue forever](https://en.wikipedia.org/wiki/Speed_of_light). - -This worked [particularly well](https://finance.yahoo.com/quote/NVDA/) for parallelizable workloads like video game graphics where just you need to perform the same operation over some array of data. This this is how the concept of *SIMD* became a thing, which stands for *single instruction, multiple data*. - -Modern hardware can do [lots of stuff](https://software.intel.com/sites/landingpage/IntrinsicsGuide) under this paradigm, leveraging *data-level parallelism*. For example, the simplest thing you can do on modern Intel CPUs is to: - -1. load 256-bit block of ints (which is $\frac{256}{32} = 8$ ints), -2. load another 256-bit block of ints, -3. add them together, -4. write the result somewhere else - -…and this whole transaction costs the same as loading and adding just two ints—which means we can do 8 times more work. Magic! - -So, as we promised before, we will perform all $16$ comparisons to compute the index of the right child node, but we leverage SIMD instructions to do it efficiently. Just to clarify — we want to do something like this: - -```cpp -int mask = (1 << B); - -for (int i = 0; i < B; i++) - mask |= (btree[k][i] >= x) << i; - -int i = __builtin_ffs(mask) - 1; -// now i is the number of the correct child node -``` - -…but ~8 times faster. - -Actually, compiler quite often produces very optimized code that leverages these instructions for certain types of loops. 
This is called auto-vectorization, and this is the reason why a loop that sums up an array of `short`s is faster (theoretically by a factor of two) than the same loop for `int`s: you can fit more elements on the same 256-bit block. Sadly, this is not our case, as we have loop-carried dependencies. - -The algorithm we will implement: - -1. Somewhere before the main loop, convert $x$ to a vector of $8$ copies of $x$. -2. Load the keys stored in node into another 256-bit vector. -3. Compare these two vectors. This returns a 256-bit mask in which pairs that compared "greater than" are marked with ones. -4. Create a 8-bit mask out of that and return it. Then you can feed it to `__builtin_ffs`. - -This is how it looks using C++ intrinsics, which are basically built-in wrappers for raw assembly instructions: - -```cpp -// SIMD vector type names are weird and tedious to type, so we define an alias -typedef __m256i reg; - -// somewhere in the beginning of search loop: -reg x_vec = _mm256_set1_epi32(x); - -int cmp(reg x_vec, int* y_ptr) { - reg y_vec = _mm256_load_si256((reg*) y_ptr); - reg mask = _mm256_cmpgt_epi32(x_vec, y_vec); - return _mm256_movemask_ps((__m256) mask); -} -``` - -After that, we call this function two times (because our node size / cache line happens to be 512 bits, which is twice as big) and blend these masks together with bitwise operations. - -## Final Implementation - -```cpp -#pragma GCC optimize("O3") -#pragma GCC target("avx2") - -#include -#include - -using namespace std; +Another idea is to use cache more efficiently. For example, you can execute `_mm256_stream_load_si256` on just the last iteration. 
-typedef __m256i reg; +They aren't beneficial for throughput: -const int n = (1<<20), B = 16; -const int nblocks = (n + B - 1) / B; -const int INF = numeric_limits::max(); - -alignas(64) int btree[nblocks][B]; - -int go(int k, int i) { return k * (B + 1) + i + 1; } - -void build(int k = 0) { - static int t = 0; - if (k < nblocks) { - for (int i = 0; i < B; i++) { - build(go(k, i)); - btree[k][i] = (t < n ? a[t++] : INF); - } - build(go(k, B)); - } -} +![](../img/search-bplus-other.svg) -int cmp(reg x_vec, int* y_ptr) { - reg y_vec = _mm256_load_si256((reg*) y_ptr); - reg mask = _mm256_cmpgt_epi32(x_vec, y_vec); - return _mm256_movemask_ps((__m256) mask); -} +However, they perform better: -int search(int x) { - int k = 0, res = INF; - reg x_vec = _mm256_set1_epi32(x); - while (k < nblocks) { - int mask = ~( - cmp(x_vec, &btree[k][0]) + - (cmp(x_vec, &btree[k][8]) << 8) - ); - int i = __builtin_ffs(mask) - 1; - if (i < B) - res = btree[k][i]; - k = go(k, i); - } - return res; -} -``` +![](../img/search-latency-bplus.svg) -That's it. This implementation should outperform even the [state-of-the-art indexes](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) used in high-performance databases, though it's mostly due to the fact that data structures used in real databases have to support fast updates while we don't. +## Conclusions -Note that this implementation is very specific to the architecture. Older CPUs and CPUs on mobile devices don't have 256-bit wide registers and will crash (but they likely have 128-bit SIMD so the loop can still be split in 4 parts instead of 2), non-Intel CPUs have their own instruction sets for SIMD, and some computers even have different cache line size. 
+![](../img/search-all.svg) ## Acknowledgements From 09b108b63af991ed3ce759ec8a673915e47f34fc Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 12 Feb 2022 17:31:09 +0300 Subject: [PATCH 147/531] update english-language front page --- content/english/_index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/_index.md b/content/english/_index.md index 91cf875a..94b0cdf8 100644 --- a/content/english/_index.md +++ b/content/english/_index.md @@ -6,6 +6,6 @@ noToc: true Algorithmica is an open-access web book dedicated to the art and science of computing. -It is created by [Sergey Slotin](http://sereja.me/) and teachers and students of [Tinkoff Generation](https://fintech.tinkoff.ru/study/generation/) — an educational organization that trains about half of the final-stage participants of Russian Olympiad in Informatics. +It is created by [Sergey Slotin](http://sereja.me/) and the teachers and students of [Tinkoff Generation](https://fintech.tinkoff.ru/study/generation/) — an nonprofit educational organization that trains about half of the final-stage participants of the Russian Olympiad in Informatics. The English version of the website is a work in progress; the only useful thing you can find there is the continuously updated draft of [Algorithms for Modern Hardware](hpc). We are currently more focused on [the Russian version](https://ru.algorithmica.org/), which hosts various course materials that we use ourselves. 
From 77d22ae0d8de09d025c57fc180acececc9d07203 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 12 Feb 2022 17:39:23 +0300 Subject: [PATCH 148/531] "please fix bugs" message on the en front page --- content/english/_index.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/content/english/_index.md b/content/english/_index.md index 94b0cdf8..46b3642c 100644 --- a/content/english/_index.md +++ b/content/english/_index.md @@ -6,6 +6,8 @@ noToc: true Algorithmica is an open-access web book dedicated to the art and science of computing. -It is created by [Sergey Slotin](http://sereja.me/) and the teachers and students of [Tinkoff Generation](https://fintech.tinkoff.ru/study/generation/) — an nonprofit educational organization that trains about half of the final-stage participants of the Russian Olympiad in Informatics. +It is created by [Sergey Slotin](http://sereja.me/) and the teachers and students of [Tinkoff Generation](https://fintech.tinkoff.ru/study/generation/) — a nonprofit educational organization that trains about half of the final-stage participants of the Russian Olympiad in Informatics. -The English version of the website is a work in progress; the only useful thing you can find there is the continuously updated draft of [Algorithms for Modern Hardware](hpc). We are currently more focused on [the Russian version](https://ru.algorithmica.org/), which hosts various course materials that we use ourselves. +The English version of the website is a work in progress; the only useful thing you can find here is the continuously updated draft of [Algorithms for Modern Hardware](hpc). We are currently more focused on [the Russian version](https://ru.algorithmica.org/), which hosts various course materials that we use ourselves. + +If you spot an error, please create an issue on [GitHub](https://github.com/algorithmica-org/algorithmica) or, preferably, fix it right away (the pencil icon on the top-right). 
From ea7115a8d61aa20610531fbaed712edf1c7588bd Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 12 Feb 2022 17:43:02 +0300 Subject: [PATCH 149/531] change wording --- content/english/_index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/_index.md b/content/english/_index.md index 46b3642c..f319cd0e 100644 --- a/content/english/_index.md +++ b/content/english/_index.md @@ -6,7 +6,7 @@ noToc: true Algorithmica is an open-access web book dedicated to the art and science of computing. -It is created by [Sergey Slotin](http://sereja.me/) and the teachers and students of [Tinkoff Generation](https://fintech.tinkoff.ru/study/generation/) — a nonprofit educational organization that trains about half of the final-stage participants of the Russian Olympiad in Informatics. +It is created by [Sergey Slotin](http://sereja.me/) and the teachers and students of [Tinkoff Generation](https://fintech.tinkoff.ru/study/generation/) — a nonprofit educational organization that trains about half of the finalists of the Russian Olympiad in Informatics. The English version of the website is a work in progress; the only useful thing you can find here is the continuously updated draft of [Algorithms for Modern Hardware](hpc). We are currently more focused on [the Russian version](https://ru.algorithmica.org/), which hosts various course materials that we use ourselves. 
From 8692c7213d3de063e0e3dbd15f8bc8301d5149fd Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 12 Feb 2022 18:32:24 +0300 Subject: [PATCH 150/531] draft of binary search intro --- .../hpc/data-structures/binary-search.md | 92 +++++++------------ 1 file changed, 31 insertions(+), 61 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index fdab8d62..b454d435 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -3,20 +3,24 @@ title: Searching in Sorted Arrays weight: 1 --- -The most fascinating showcases of performance engineering are not intricate 5-10% speed improvements of some databases, but multifold optimizations of some basic algorithms you can find in a textbook — the ones that are so simple that it would never even occur to try to optimize them. These kinds of optimizations are simple and instructive, and can very much be adopted elsewhere. Yet, with remarkable periodicity, these can be optimized to ridiculous levels of performance. +While improving the speed of user-facing applications is the end goal of performance engineering, people don't really get excited over 5-10% improvements in some databases. Yes, this is what software engineers are paid for, but these types of optimizations tend to be too intricate and specific to the system to be generalizable to other software. -In this article, we will focus on such an algorithm — binary search — and significantly improve its efficiency by rearranging elements of a sorted array in a more cache-friendly way. We will develop two versions, each achieving 4-7x speedup over the standard `std::lower_bound`, depending on the cache level and available memory bandwidth: +Rather, the most fascinating showcases of performance engineering are multifold optimizations of textbook algorithms. 
The kinds that everybody knows and that are deemed so simple that it would never occur to anyone to try to optimize them to begin with. These types of optimizations are simple and instructive, and can very much be adopted elsewhere. And they are surprisingly not as rare as you'd think. -In this article, we will focus on such an algorithm — binary search — and significantly improve its efficiency by rearranging elements of a sorted array in a more cache-friendly way. We will develop two versions, each achieving 4-7x speedup over the standard `std::lower_bound`, depending on the cache level and available memory bandwidth: + -- The first one uses what is known as *Eytzinger layout*, which is also a popular layout for other structures such as binary heaps. Our minimalistic implementation is only ~15 lines. -- The second one is its generalization based on *B-tree layout*, which is more bulky. Although it uses SIMD, which technically disqualifies it from being binary search. -- Novel structure based called S-tree based on + -GCC sucked on all benchmarks, so we will be using Clang (10) exclusively. The CPU is a Zen 2, although the results should be transferrable to other platforms, including most Arm-based chips. +In this article, we focus on one such fundamental algorithm — binary search — and implement several algorithms that significantly improve on its performance: +- *Branchless* binary search that is up to 3x faster on *small* arrays and can act as a drop-in replacement for `std::lower_bound`. +- *Eytzinger* binary search that rearranges the elements of a sorted array in a cache-friendly way and is also 3x faster on small arrays and 2x faster on RAM-backed arrays. +- *S-tree*: an approach based on the implicit (pointer-free) B-layout accelerated with SIMD operations to perform search efficiently while using less memory bandwidth; it is ~8x faster on small arrays and 5x faster on large arrays. +- *S+ tree*: an approach similarly based on the B+ layout that is up to 15x faster on small arrays and ~7x faster on large arrays. Uses 6-7% of the array memory. -This is a follow up on a [previous article](https://algorithmica.org/en/eytzinger) about using Eytzinger memory layout to speed up binary search. 
Here we use implicit (pointerless) B-trees accelerated with SIMD operations to perform search efficiently while using less memory bandwidth. +The last two approaches use SIMD, which technically disqualifies it from being binary search. This is technically not a drop-in replacement, since it requires some preprocessing, but I can't recall a lot of scenarios where you obtain a sorted array but can't spend linear time on preprocessing. But otherwise they are effectively drop-in replacements to `std::lower_bound`. -It performs slightly worse on array sizes that fit lower layers of cache, but in low-bandwidth environments it can be up to 3x faster (or 7x faster than `std::lower_bound`). +It performs slightly worse on array sizes that fit lower layers of cache, but in low-bandwidth environments it can be up to 3x faster (or 7x faster than `std::lower_bound`). GCC sucked on all benchmarks, so we will mostly be using Clang 10. The CPU is a Zen 2, although the results should be transferrable to other platforms, including most Arm-based chips. + +This is a large article, which will turn into a multi-hour read. If you feel comfortable reading [intrinsic](/hpc/simd/intrinsics)-heavy code without any context whatsoever, you can skim through the first four implementation and jump straight to the last section. ## Binary Search @@ -24,6 +28,8 @@ Already sorted array `t` of size `n`. We are going ot create an array named `a` into array named `t`. +Here is the standard way of searching for the first element not less than $x$ in a sorted array of $n$ integers: + ```c++ int lower_bound(int x) { int l = 0, r = n - 1; @@ -68,6 +74,17 @@ __lower_bound(_ForwardIterator __first, _ForwardIterator __last, const _Tp& __va If compiler is successful in piercing through the abstractions, it compiles to roughly the same machine code and yields roughly the same performance. 
+ +Before jumping to optimized variants, let's briefly discuss the reasons why the textbook binary search is slow in the first place. + +If you [run this code with perf](/hpc/analyzing-performance/profiling/), you can see that it spends most of its time waiting for a comparison to complete, which in turn is waiting for one of its operands to be fetched from memory. Contains an "if" that is impossible to predict better than a coin flip. + + +### Branching + +It's not illegal: ternary operator is replaced with something like `CMOV` + + We change the compiler for GCC (9.3). For some reason, it doesn't work. ```c++ @@ -82,6 +99,8 @@ int lower_bound(int x) { } ``` +![](../img/search-branchless.svg) + ```c++ int lower_bound(int x) { int *base = t, len = n; @@ -96,50 +115,8 @@ int lower_bound(int x) { } ``` -### Branching - -If you [run this code with perf](/hpc/analyzing-performance/profiling/), you can see that it spends most of its time waiting for a comparison to complete, which in turn is waiting for one of its operands to be fetched from memory. - -To give an idea, the following code is only ~5% slower for $n \approx 10^6$: - -```cpp -int slightly_slower_lower_bound(int x) { - int l = 0, r = n - 1; - while (l < r) { -     volatile int s = 0; // volatile to prevent compiler from cutting this code out - for (int i = 0; i < 10; i++) - s += i; - int t = (l + r) / 2; - if (a[t] >= x) - r = t; - else - l = t + 1; - } - return a[l]; -} -``` - -Contains an "if" that is impossible to predict better than a coin flip. - -It's not illegal: ternary operator is replaced with something like `CMOV` - - - -![](../img/search-branchless.svg) - ![](../img/search-branchless-prefetch.svg) - But this is not the largest problem. The real problem is that it waits for its operands, and the results still can't be predicted. 
The running time of this (or any) algorithm is not just the "cost" of all its arithmetic operations, but rather this cost *plus* the time spent waiting for data to be fetched from memory. Thus, depending on the algorithm and problem limitations, it can be CPU-bound or memory-bound, meaning that the running time is dominated by one of its components. @@ -150,17 +127,12 @@ IMAGE HERE If array is large enough—usually around the point where it stops fitting in cache and fetches become significantly slower—the running time of binary search becomes dominated by memory fetches. - So, to sum up: ideally, we'd want some layout that is both blocks, and higher-order blocks to be placed in groups, and also to be capable. We can overcome this by enumerating and permuting array elements in a more cache-friendly way. The numeration we will use is actually half a millennium old, and chances are you already know it. ## Why Binary Search is Slow -Before jumping to optimized variants, let's briefly discuss the reasons why the textbook binary search is slow in the first place. - -Here is the standard way of searching for the first element not less than $x$ in a sorted array of $n$ integers: - ```cpp int lower_bound(int x) { int l = 0, r = n - 1; @@ -177,16 +149,14 @@ int lower_bound(int x) { Find the middle element of the search range, compare to $x$, cut the range in half. Beautiful in its simplicity. -### Spacial Locality +### Data Locality -* First ~10 queries may be cached (frequently accessed: temporal locality) -* Last 3-4 queries may be cached (may be in the same cache line: data locality) -* But that's it. Maybe store elements in a more cache-friendly way? +First ~10 queries may be cached (frequently accessed: temporal locality) +Last 3-4 queries may be cached (may be in the same cache line: data locality) +But that's it. Maybe store elements in a more cache-friendly way? 
![](../img/binary-search.png) -### Temporal Locality - When we find lower bound of $x$ in a sorted array by binary searching, the main problem is that its memory accesses pattern is neither temporary nor spatially local. For example, element $\lfloor \frac n 2 \rfloor$ is accessed very often (every search) and element $\lfloor \frac n 2 \rfloor + 1$ is not, while they are probably occupying the same cache line. In general, only the first 3-5 reads are temporary local and only the last 3-4 reads are spatially local, and the rest are just random memory accesses. From 54ed93a1c20be07e0f7fb4e4b0d7b094bf1d31c2 Mon Sep 17 00:00:00 2001 From: rgriege Date: Sat, 12 Feb 2022 11:03:05 -0600 Subject: [PATCH 151/531] Fixed grammatical mistake - missing "is" --- content/english/hpc/architecture/assembly.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/architecture/assembly.md b/content/english/hpc/architecture/assembly.md index 9a90c001..1333a92b 100644 --- a/content/english/hpc/architecture/assembly.md +++ b/content/english/hpc/architecture/assembly.md @@ -1,6 +1,7 @@ --- title: Assembly Language weight: 1 +published: true --- CPUs are controlled with *machine language*, which is just a stream of binary-encoded instructions that specify @@ -36,7 +37,7 @@ Assembly is very simple in the sense that it doesn't have many syntactical const - The `[reg]` syntax is used for "dereferencing" a pointer stored in a register, and on x86 you need to prefix it with size information (`DWORD` here means 32 bit). - The `;` sign is used for line comments, similar to `#` and `//` in other languages. -Assembly a very minimal language because it needs to be. It reflects the machine language as closely as possible, up to the point where there is almost 1:1 correspondence between machine code and assembly. 
In fact, you can turn any compiled program back into its assembly form using a process called *disassembly*[^disassembly] — although everything non-essential like comments will not be preserved. +Assembly is a very minimal language because it needs to be. It reflects the machine language as closely as possible, up to the point where there is almost 1:1 correspondence between machine code and assembly. In fact, you can turn any compiled program back into its assembly form using a process called *disassembly*[^disassembly] — although everything non-essential like comments will not be preserved. [^disassembly]: On Linux, to disassemble a compiled program, you can call `objdump -d {path-to-binary}`. From 8bfc8a8efabac2e9719c6d91722864fb6734ef1b Mon Sep 17 00:00:00 2001 From: rgriege Date: Sat, 12 Feb 2022 11:04:53 -0600 Subject: [PATCH 152/531] Fixed typo - changed "Buy" to "But" --- content/english/hpc/complexity/languages.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/complexity/languages.md b/content/english/hpc/complexity/languages.md index f719f08e..53c4b293 100644 --- a/content/english/hpc/complexity/languages.md +++ b/content/english/hpc/complexity/languages.md @@ -1,7 +1,9 @@ --- title: Programming Languages -aliases: [/hpc/analyzing-performance] +aliases: + - /hpc/analyzing-performance weight: 2 +published: true --- If you are reading this book, then somewhere on your computer science journey you had a moment when you first started to care about the efficiency of your code. @@ -92,7 +94,7 @@ This is not surprising if you consider the things that Python needs to do to fig - looks up its type, figures out that it's a `float`, and fetches the method implementing `*` operator; - does the same things for `b` and `c` and finally add-assigns the result to `c[i][j]`. 
-Granted, the interpreters of widely-used languages such as Python are well-optimized, and they can skip through some of these steps on repeated execution of the same code. Buy still, some quite significant overhead is unavoidable due to the language design. If we get rid of all this type checking and pointer chasing, perhaps we can get cycles per multiplication ratio closer to 1, or whatever the "cost" of native multiplication is? +Granted, the interpreters of widely-used languages such as Python are well-optimized, and they can skip through some of these steps on repeated execution of the same code. But still, some quite significant overhead is unavoidable due to the language design. If we get rid of all this type checking and pointer chasing, perhaps we can get cycles per multiplication ratio closer to 1, or whatever the "cost" of native multiplication is? ### Managed Languages From 183b4656110e8d01487519457320c4f31250f48a Mon Sep 17 00:00:00 2001 From: mode-six <97373506+mode-six@users.noreply.github.com> Date: Sat, 12 Feb 2022 19:00:48 +0000 Subject: [PATCH 153/531] functions.md: cpp tail-recursive factorial fix fix base case return value --- content/english/hpc/architecture/functions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/architecture/functions.md b/content/english/hpc/architecture/functions.md index 908dc2bc..ec8631f0 100644 --- a/content/english/hpc/architecture/functions.md +++ b/content/english/hpc/architecture/functions.md @@ -249,7 +249,7 @@ To make our `factorial` function tail-recursive, we can pass a "current product" ```cpp int factorial(int n, int p = 1) { if (n == 0) - return 1; + return p; return factorial(n - 1, p * n); } ``` From 97ac1f3c186b6c6024057453f1c4cbbebcf23d7d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 04:44:09 +0300 Subject: [PATCH 154/531] prefix sum notes --- content/english/hpc/algorithms/prefix.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 
deletions(-) diff --git a/content/english/hpc/algorithms/prefix.md b/content/english/hpc/algorithms/prefix.md index 5c79154a..c64477de 100644 --- a/content/english/hpc/algorithms/prefix.md +++ b/content/english/hpc/algorithms/prefix.md @@ -40,7 +40,7 @@ loop: jne loop ``` -After [unrolling](/hpc/architecture/loops) the loop, just two instructions effectively remain: the fused read-add and the write-back of the result. Theoretically, these should work at 2 GFLOPS (1 element per CPU cycle), but because the memory system has to constantly [switch](/hpc/cpu-cache/bandwidth#directional-access) between reading and writing, the actual performance is between 1.2 and 1.6 GFLOPS, depending on the array size. +After [unrolling](/hpc/architecture/loops) the loop, just two instructions effectively remain: the fused read-add and the write-back of the result. Theoretically, these should work at 2 GFLOPS (1 element per CPU cycle, by the virtue of [superscalar processing](/hpc/pipelining)), but since the memory system has to constantly [switch](/hpc/cpu-cache/bandwidth#directional-access) between reading and writing, the actual performance is between 1.2 and 1.6 GFLOPS, depending on the array size. ### Vectorization @@ -217,13 +217,13 @@ void prefix(int *a, int n) { } ``` -This has more benefits: the loop progresses at a constant speed, reducing the pressure on the memory system, and the scheduler sees the instructions of both subroutines, allowing it to be more efficient at assigning instruction to execution ports. +This has more benefits: the loop progresses at a constant speed, reducing the pressure on the memory system, and the scheduler sees the instructions of both subroutines, allowing it to be more efficient at assigning instruction to execution ports — sort of like hyper-threading, but in code. 
-For these reasons the performance improves even on small arrays: +For these reasons, the performance improves even on small arrays: ![](../img/prefix-interleaved.svg) -Finally, combining it with prefetching improves the performance even more: +And finally, it doesn't seem that we are bottlenecked by the [memory read port](/hpc/pipelining/tables/) or the [decode width](/hpc/architecture/layout/#cpu-front-end), so we can add prefetching for free, which improves the performance even more: ![](../img/prefix-interleaved-prefetch.svg) From 625ec989e4e0c40122e7ebb9ecb48d4fe725da52 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 05:15:05 +0300 Subject: [PATCH 155/531] artifacts for argmin from stl --- content/english/hpc/algorithms/argmin.md | 35 +++++++++++++++++++++++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/algorithms/argmin.md b/content/english/hpc/algorithms/argmin.md index 87517c70..f61365b3 100644 --- a/content/english/hpc/algorithms/argmin.md +++ b/content/english/hpc/algorithms/argmin.md @@ -7,7 +7,7 @@ Computing the *minimum* of an array [easily vectorizable](/hpc/simd/reduction), Finding the *index* of that minimum element (*argmin*) is much harder, but it is still possible to vectorize very efficiently. In this section, we design an algorithm that computes the argmin (almost) at the speed of computing the minimum: ~15x faster than the naive scalar approach and ~2.5x faster than the [previous state-of-the-art](http://0x80.pl/notesen/2018-10-03-simd-index-of-min.html). -### Baseline +### Scalar Baseline For our benchmark, we create an array of random 32-bit integers, and then repeatedly try to find the index of the minimum among them (the first one if it isn't unique): @@ -46,6 +46,39 @@ int argmin(int *a, int n) { } ``` + + The version from GCC gives ~0.28 GFLOPS — apparently, the compiler couldn't pierce through all the abstractions. Another reminder to never use STL. 
### Vector of Indices From 9c48bd4810ac7011602970c1e7778a0f710a7698 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 05:17:20 +0300 Subject: [PATCH 156/531] add link --- content/english/hpc/algorithms/argmin.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/content/english/hpc/algorithms/argmin.md b/content/english/hpc/algorithms/argmin.md index f61365b3..5fb1c71b 100644 --- a/content/english/hpc/algorithms/argmin.md +++ b/content/english/hpc/algorithms/argmin.md @@ -330,3 +330,9 @@ There are also still some minor things to optimize, but the potential improvemen The first, index-based SIMD algorithm was [originally designed](http://0x80.pl/notesen/2018-10-03-simd-index-of-min.html) by Wojciech Muła in 2018. Thanks to Zach Wegner for [pointing out](https://twitter.com/zwegner/status/1491520929138151425) that the performance of the Muła's algorithm is improved when implemented manually using intrinsics (I originally used the [GCC vector types](/hpc/simd/intrinsics/#gcc-vector-extensions)). + + From 98211717057e95e7d66471ca69257bacfff8627d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 06:50:15 +0300 Subject: [PATCH 157/531] update plans --- content/english/hpc/_index.md | 17 +++++++++-------- content/english/hpc/complexity/levels.md | 17 +++++++++-------- 2 files changed, 18 insertions(+), 16 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index a30fcf40..d62b50ba 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -29,7 +29,7 @@ Planned table of contents: 1.1. Modern Hardware 1.2. Programming Languages 1.3. Models of Computation - 1.4. Levels of Optimization + 1.4. When to Optimize 2. Computer Architecture 1.1. Instruction Set Architectures 1.2. Assembly Language @@ -119,15 +119,16 @@ Planned table of contents: 11.6. Fast Fourier Transform 11.7. Number-Theoretic Transform 11.8. Argmin with SIMD - 11.9. Reading and Writing Integers -(11.10. 
Reading and Writing Floats) -(11.11. String Searching) - 11.12. Sorting - 11.13. Matrix Multiplication + 11.9. Prefix Sum with SIMD + 11.10. Reading and Writing Integers +(11.11. Reading and Writing Floats) +(11.12. String Searching) + 11.13. Sorting + 11.14. Matrix Multiplication 12. Data Structure Case Studies 12.1. Binary Search - 12.2. Dynamic Prefix Sum -(12.3. Ordered Trees) + 12.2. Segment Trees +(12.3. B-Trees) (12.4. Range Minimum Query) 12.5. Hash Tables (12.6. Bitmaps) diff --git a/content/english/hpc/complexity/levels.md b/content/english/hpc/complexity/levels.md index 84838709..d0757754 100644 --- a/content/english/hpc/complexity/levels.md +++ b/content/english/hpc/complexity/levels.md @@ -1,5 +1,5 @@ --- -title: Levels of Optimization +title: When to Optimize weight: 4 draft: true --- @@ -26,16 +26,17 @@ In any case, the Big-O notation is not what companies really want. It is not abo You get especially frustrated if you had a competitive programming experience. You won't get to solve these type of problems, even if they asked them on an interview. To solve them, you need other type of qualifications. Asymptotically optimal algorithm already exists, you need to optimize the constant factor. Unfortunately, only a handful of universities teach that. -## The Hierarchy of Optimization +## The Levels of Optimization Programmers can be put in several "levels" in terms of their software optimization abilities: -1. "Newbie". Those who don't think about performance at all. They usually write in high-level languages, sometimes in declarative / functional languages. Most "programmers" stay there (and there is nothing wrong with it). -2. "Undergraduate student". Those who know about Big O notation and are familiar with basic data structures and approaches. LeetCode and CodeForces folks are there. 
This is also the requirement in getting into big companies — they have a lot of in-house software, large scale, and they are looking for people in the long term, so asking things like programming language. -3. "Graduate student". Those who know that not all operations are created equal; know other cost models such as external memory model (B-tree, external sorting), word model (bitset,) or parallel computing, but still in theory. -4. "Professional developer". Those who know actual timings of these operations. Aware that branch mispredictions are costly, memory is split into cache lines. Knows some basic SIMD techniques. -5. "Performance engineer". Know exactly what happens inside their hardware. Know the difference between latency and bandwidth, know about ports. Knows how to use SIMD and the rest of instruction set effectively. Can read assembly and use profilers. +0. "Newbie". Those who don't think about performance at all. They usually write in high-level languages, sometimes in declarative / functional languages. Most "programmers" stay there (and there is nothing wrong with it). +1. "Undergraduate student". Those who know about Big O notation and are familiar with basic data structures and approaches. LeetCode and CodeForces folks are there. This is also the requirement in getting into big companies — they have a lot of in-house software, large scale, and they are looking for people in the long term, so asking things like programming language. +2. "Graduate student". Those who know that not all operations are created equal; know other cost models such as external memory model (B-tree, external sorting), word model (bitset,) or parallel computing, but still in theory. +3. "Professional developer". Those who know actual timings of these operations. Aware that branch mispredictions are costly, memory is split into cache lines. Knows some basic SIMD techniques. +4. "Performance engineer". Know exactly what happens inside their hardware. 
Know the difference between latency and bandwidth, know about ports. Knows how to use SIMD and the rest of instruction set effectively. Can read assembly and use profilers. +5. "Intel employee". Knows microarchitecture-specific details. This is outside of the purview of normal engineers. -In this book, we expect that the average reader is somewhere around stage 2, and hopefully by the end of it will get to 5. +In this book, we expect that the average reader is somewhere around stage 1, and hopefully by the end of it will get to 4. You should also go through these levels when designing algorithms. First get it working in the first place, then select a bunch of reasonably asymptotically optimal algorithm. Then think about how they are going to work in terms of their memory operations or ability to execute in parallel (even if you consider single-threaded programs, there is still going to be plenty of parallelism inside a core, so this model is extremely ), and then proceed toward actual implementation. Avoid premature optimization, as Knuth once said. From d273b9b62bc26538777935e16317f4cf8fee67fd Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 07:05:32 +0300 Subject: [PATCH 158/531] why binary search is slow --- .../hpc/data-structures/binary-search.md | 65 +++++++++++++++---- 1 file changed, 54 insertions(+), 11 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index b454d435..f408ad08 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -22,13 +22,26 @@ It performs slightly worse on array sizes that fit lower layers of cache, but in This is a large article, which will turn into a multi-hour read. 
If you feel comfortable reading [intrinsic](/hpc/simd/intrinsics)-heavy code without any context whatsoever, you can skim through the first four implementation and jump straight to the last section. +Build up understanding gradually, but you can skip them. + ## Binary Search + + +Here is the standard way of searching for the first element not less than `x` in a sorted array `t` of `n` integers that you can find in any introductory computer science textbook: ```c++ int lower_bound(int x) { @@ -44,9 +57,11 @@ int lower_bound(int x) { } ``` -![](../img/search-std.svg) + -This is actually how `std::lower_bound` from works. Implementations from both [Clang](https://github.com/llvm-mirror/libcxx/blob/78d6a7767ed57b50122a161b91f59f19c9bd0d19/include/algorithm#L4169) and [GCC](https://github.com/gcc-mirror/gcc/blob/d9375e490072d1aae73a93949aa158fcd2a27018/libstdc%2B%2B-v3/include/bits/stl_algobase.h#L1023) use this metaprogramming monstrosity: +Find the middle element of the search range, compare it to `x`, shrink the range in half. Beautiful in its simplicity. + +A similar approach is employed by `std::lower_bound`, except that it needs to be more generic to support containers with non-random-access iterators and thus uses the first element and the size of the search interval instead of the two of its ends. Implementations from both [Clang](https://github.com/llvm-mirror/libcxx/blob/78d6a7767ed57b50122a161b91f59f19c9bd0d19/include/algorithm#L4169) and [GCC](https://github.com/gcc-mirror/gcc/blob/d9375e490072d1aae73a93949aa158fcd2a27018/libstdc%2B%2B-v3/include/bits/stl_algobase.h#L1023) use this metaprogramming monstrosity: ```c++ template @@ -72,20 +87,50 @@ __lower_bound(_ForwardIterator __first, _ForwardIterator __last, const _Tp& __va } ``` -If compiler is successful in piercing through the abstractions, it compiles to roughly the same machine code and yields roughly the same performance. 
+If compiler is successful in removing the abstractions, it compiles to roughly the same machine code and yields roughly the same performance, which [expectedly](/hpc/cpu-cache/latency) varies greatly with the array size: +![](../img/search-std.svg) -Before jumping to optimized variants, let's briefly discuss the reasons why the textbook binary search is slow in the first place. +Since most people don't implement binary search by hand, we will use `std::lower_bound` from Clang as the baseline. -If you [run this code with perf](/hpc/analyzing-performance/profiling/), you can see that it spends most of its time waiting for a comparison to complete, which in turn is waiting for one of its operands to be fetched from memory. Contains an "if" that is impossible to predict better than a coin flip. +### The Bottleneck +Before jumping to the optimized implementations, let's briefly discuss why binary search is slow in the first place. + +If you run `std::lower_bound` with [perf](/hpc/profiling/events), you'll see that it spends most of its time on a [conditional jump](/hpc/architecture/loops) instruction: + +```nasm + │35: mov %rax,%rdx + 0.52 │ sar %rdx + 0.33 │ lea (%rsi,%rdx,4),%rcx + 4.30 │ cmp (%rcx),%edi + 65.39 │ ↓ jle b0 + 0.07 │ sub %rdx,%rax + 9.32 │ lea 0x4(%rcx),%rsi + 0.06 │ dec %rax + 1.37 │ test %rax,%rax + 1.11 │ ↑ jg 35 +``` + +This [pipeline stall](/hpc/) stops the algorithm from progressing, and it is mainly caused by two [factors](/hpc/pipelining/hazards): + +- We suffer a *control hazard* because we have a branch that is impossible to predict, and the processor has to stop for 10-15 cycles to flush the pipeline. +- We suffer a *data hazard* because we have to [wait](/hpc/cpu-cache/latency) for the preceding comparison to complete — which in turn waits for one of its operands to be fetched from the memory, which may take up to hundreds of cycles, depending on where in the cache hierarchy the data is located. 
+ +Now, let's try to get rid of these obstacles one by one. + +## Removing Branches + +which in turn is waiting for one of its operands to be fetched from memory. Contains an "if" that is impossible to predict better than a coin flip. + + +Before jumping to optimized variants, let's briefly discuss the reasons why the textbook binary search is slow in the first place. -### Branching It's not illegal: ternary operator is replaced with something like `CMOV` -We change the compiler for GCC (9.3). For some reason, it doesn't work. +We change the compiler to GCC (9.3). For some reason, it doesn't work. ```c++ int lower_bound(int x) { @@ -131,7 +176,7 @@ So, to sum up: ideally, we'd want some layout that is both blocks, and higher-or We can overcome this by enumerating and permuting array elements in a more cache-friendly way. The numeration we will use is actually half a millennium old, and chances are you already know it. -## Why Binary Search is Slow +## Optimizing Layout ```cpp int lower_bound(int x) { @@ -147,8 +192,6 @@ int lower_bound(int x) { } ``` -Find the middle element of the search range, compare to $x$, cut the range in half. Beautiful in its simplicity. 
- ### Data Locality First ~10 queries may be cached (frequently accessed: temporal locality) From c94c5866163025a9a2e4516ead407b363525d40b Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 07:44:32 +0300 Subject: [PATCH 159/531] branchless binary search --- .../hpc/data-structures/binary-search.md | 61 +++++++------------ 1 file changed, 23 insertions(+), 38 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index f408ad08..5448307c 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -18,7 +18,7 @@ In this article, we focus on such fundamental algorithm — binary search — an The last two approaches use SIMD, which technically disqualifies it from being binary search. This is technically not a drop-in replacement, since it requires some preprocessing, but I can't recall a lot of scenarios where you obtain a sorted array but can't spend linear time on preprocessing. But otherwise they are effectively drop-in replacements to `std::lower_bound`. -It performs slightly worse on array sizes that fit lower layers of cache, but in low-bandwidth environments it can be up to 3x faster (or 7x faster than `std::lower_bound`). GCC sucked on all benchmarks, so we will mostly be using Clang 10. The CPU is a Zen 2, although the results should be transferrable to other platforms, including most Arm-based chips. +It performs slightly worse on array sizes that fit lower layers of cache, but in low-bandwidth environments it can be up to 3x faster (or 7x faster than `std::lower_bound`). GCC sucked on all benchmarks, so we will mostly be using Clang (10.0). The CPU is a Zen 2, although the results should be transferrable to other platforms, including most Arm-based chips. This is a large article, which will turn into a multi-hour read. 
If you feel comfortable reading [intrinsic](/hpc/simd/intrinsics)-heavy code without any context whatsoever, you can skim through the first four implementation and jump straight to the last section. @@ -114,23 +114,14 @@ If you run `std::lower_bound` with [perf](/hpc/profiling/events), you'll see tha This [pipeline stall](/hpc/) stops the algorithm from progressing, and it is mainly caused by two [factors](/hpc/pipelining/hazards): -- We suffer a *control hazard* because we have a branch that is impossible to predict, and the processor has to stop for 10-15 cycles to flush the pipeline. +- We suffer a *control hazard* because we have a branch that is impossible to predict (both queries and keys are uniformly random), and the processor has to flush the pipeline and halt for 10-15 cycles to fill it back. - We suffer a *data hazard* because we have to [wait](/hpc/cpu-cache/latency) for the preceding comparison to complete — which in turn waits for one of its operands to be fetched from the memory, which may take up to hundreds of cycles, depending on where in the cache hierarchy the data is located. Now, let's try to get rid of these obstacles one by one. ## Removing Branches -which in turn is waiting for one of its operands to be fetched from memory. Contains an "if" that is impossible to predict better than a coin flip. - - -Before jumping to optimized variants, let's briefly discuss the reasons why the textbook binary search is slow in the first place. - - -It's not illegal: ternary operator is replaced with something like `CMOV` - - -We change the compiler to GCC (9.3). For some reason, it doesn't work. +We can replace branching with [predication](/hpc/pipelining/branchless). For that, we need to adopt the STL approach and rewrite the loop using the first element and the size of the search interval instead of its first and last element. 
This way we only need to update the first element of the search interval with a `cmov` instruction and halve its size on each iteration: ```c++ int lower_bound(int x) { @@ -144,8 +135,18 @@ int lower_bound(int x) { } ``` +Note that this loop is not always equivalent to the standard binary search — it always rounds *up* the size of the search interval, so it accesses slightly different elements and may perform one comparison more than what is needed. This is done to make the number of iterations constant and remove the need for branching completely, although it does require a weird `(*base < x)` check at the end. + +This trick is very fragile to compiler optimizations. It doesn't make a difference on Clang because, for some reason, it replaces the ternary operator with a branch anyway. But it works fine on GCC (9.3), yielding a 2.5-3x improvement on small arrays: + ![](../img/search-branchless.svg) +One interesting detail is that it performs worse on large arrays. This is weird: the total delay is dominated by the RAM latency, and since it does roughly the same memory accesses, it should be the same or slightly better. + +The real question you need to ask is not why the branchless implementation is worse, but why the branchy version is better. This happens because it [speculates](/hpc/pipelining/branching/) on one of the branches and starts fetching either the left or the right key before it is confirmed that it is the right one, which effectively acts as implicit [prefetching](/hpc/cpu-cache/prefetching). + +For the branchless implementation, this doesn't happen, as `cmov` is treated like any other instruction, and the branch predictor doesn't try to peek into its operands to predict the future. 
To compensate for this, we can prefetch the data explicitly: + ```c++ int lower_bound(int x) { int *base = t, len = n; @@ -160,15 +161,17 @@ int lower_bound(int x) { } ``` +This makes the performance on large arrays roughly the same, although the graph still grows faster as the branchy version also prefetches "grandchildren", "grand-grandchildren", and so on — although the chance that the prediction is correct diminishes exponentially: + ![](../img/search-branchless-prefetch.svg) -But this is not the largest problem. The real problem is that it waits for its operands, and the results still can't be predicted. +We can also fetched ahead by more than one layer, but the number of fetches we would need will grow exponentially. Instead, we will try a different approach to optimize memory operations. -The running time of this (or any) algorithm is not just the "cost" of all its arithmetic operations, but rather this cost *plus* the time spent waiting for data to be fetched from memory. Thus, depending on the algorithm and problem limitations, it can be CPU-bound or memory-bound, meaning that the running time is dominated by one of its components. +## Optimizing the Layout -Can be fetched ahead, but there is only 50% chance we will get it right on the first layer, then 25% chance on second and so on. We could do 2, 4, 8 and so on fetches, but these would grow exponentially. +But this is not the largest problem. The real problem is that it waits for its operands, and the results still can't be predicted. -IMAGE HERE +The running time of this (or any) algorithm is not just the "cost" of all its arithmetic operations, but rather this cost *plus* the time spent waiting for data to be fetched from memory. Thus, depending on the algorithm and problem limitations, it can be CPU-bound or memory-bound, meaning that the running time is dominated by one of its components. 
If array is large enough—usually around the point where it stops fitting in cache and fetches become significantly slower—the running time of binary search becomes dominated by memory fetches. @@ -176,24 +179,6 @@ So, to sum up: ideally, we'd want some layout that is both blocks, and higher-or We can overcome this by enumerating and permuting array elements in a more cache-friendly way. The numeration we will use is actually half a millennium old, and chances are you already know it. -## Optimizing Layout - -```cpp -int lower_bound(int x) { - int l = 0, r = n - 1; - while (l < r) { - int t = (l + r) / 2; - if (a[t] >= x) - r = t; - else - l = t + 1; - } - return a[l]; -} -``` - -### Data Locality - First ~10 queries may be cached (frequently accessed: temporal locality) Last 3-4 queries may be cached (may be in the same cache line: data locality) But that's it. Maybe store elements in a more cache-friendly way? @@ -206,7 +191,7 @@ For example, element $\lfloor \frac n 2 \rfloor$ is accessed very often (every s ![](../img/binary-heat.png) -## Eytzinger Layout +### Eytzinger Layout **Michaël Eytzinger** is a 16th century Austrian nobleman known for his work on genealogy, particularily for a system for numbering ancestors called *ahnentafel* (German for "ancestor table"). @@ -262,13 +247,13 @@ Despite being recursive, this is actually a really fast implementation as all me Note that the first element is left unfilled and the whole array is essentially 1-shifted. This will actually turn out to be a huge performance booster. -## Binary search implementation +### Binary Search Implementation We can now descend this array using only indices: we just start with $k=1$ and execute $k := 2k$ if we need to go left and $k := 2k + 1$ if we need to go right. We don't even need to store and recalculate binary search boundaries anymore. The only problem arises when we need to restore the index of the resulting element, as $k$ may end up not pointing to a leaf node. 
Here is an example of how that can happen: -```python +``` array: 1 2 3 4 5 6 7 8 eytzinger: 4 2 5 1 6 3 7 8 1st range: --------------- k := 1 @@ -321,7 +306,7 @@ __builtin_prefetch(t + k * B * 2); ![](../img/search-eytzinger-prefetch.svg) -### Last branch +### Removing the Last Branch Let's zoom in. From c6b16280234d06b12277e3061e327815634a744d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 10:02:07 +0300 Subject: [PATCH 160/531] fix typo (tnx @guillaumeguy) --- content/english/hpc/pipelining/branching.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/pipelining/branching.md b/content/english/hpc/pipelining/branching.md index db008023..706796d0 100644 --- a/content/english/hpc/pipelining/branching.md +++ b/content/english/hpc/pipelining/branching.md @@ -66,7 +66,7 @@ Now, if we benchmark it for different values of `P`, we get an interesting-looki ![](../img/probabilities.svg) -It's peak is at 50-55%, as expected: branch misprediction is the most expensive thing here. This graph is asymmetrical: it takes just ~1 cycle to only check conditions that are never satisfied (`P = 0`), and ~7 cycles for the sum if the branch is always taken (`P = 7`). +It's peak is at 50-55%, as expected: branch misprediction is the most expensive thing here. This graph is asymmetrical: it takes just ~1 cycle to only check conditions that are never satisfied (`P = 0`), and ~7 cycles for the sum if the branch is always taken (`P = 100`). An interesting detail is that this graph is not unimodal: there is another local minimum at around 85-90%. We spend ~6.15 cycles per element there, or about 10-15% faster compared to when we always take the branch, accounting for the fact that we need to perform less additions. Branch misprediction stop affecting performance at this point, because it happens, not the whole instruction buffer is discarded, but only the operations that were speculatively scheduled. 
That 10-15% mispredict rate is the equilibrium point where we can see far enough in the pipeline not to stall, but save 10-15% on taking the cheaper ">=" branch. From 2489ce4b662ab47aedfcf3ebb139fe1321b7428d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 10:13:32 +0300 Subject: [PATCH 161/531] clarify parallel prefix sum pseudocode (tnx @dcepelik) --- content/english/hpc/algorithms/prefix.md | 1 + 1 file changed, 1 insertion(+) diff --git a/content/english/hpc/algorithms/prefix.md b/content/english/hpc/algorithms/prefix.md index c64477de..b503682a 100644 --- a/content/english/hpc/algorithms/prefix.md +++ b/content/english/hpc/algorithms/prefix.md @@ -54,6 +54,7 @@ Now, to compute these prefix sums locally, we are going to use another parallel ```c++ for (int l = 0; l < logn; l++) + // (atomically and in parallel): for (int i = (1 << l); i < n; i++) a[i] += a[i - (1 << l)]; ``` From f2dcfaefa02e7b5c901437792eb14d4a93e65592 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 10:41:23 +0300 Subject: [PATCH 162/531] giving argmin attribution to @mlochbaum --- content/english/hpc/_index.md | 1 + content/english/hpc/algorithms/argmin.md | 4 +++- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index d62b50ba..0c3e6f13 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -176,6 +176,7 @@ This work is largely based on blog posts, research papers, conference talks and - [Geoff Langdale](https://branchfree.org/) - [Matt Kulukundis](https://twitter.com/JuvHarlequinKFM) - [Georg Sauthoff](https://gms.tf/) +- [Marshall Lochbaum](https://mlochbaum.github.io/publications.html) - [ridiculous_fish](https://ridiculousfish.com/blog/) - [Creel](https://www.youtube.com/c/WhatsACreel) diff --git a/content/english/hpc/algorithms/argmin.md b/content/english/hpc/algorithms/argmin.md index 5fb1c71b..ccd9f140 100644 --- 
a/content/english/hpc/algorithms/argmin.md +++ b/content/english/hpc/algorithms/argmin.md @@ -5,7 +5,7 @@ weight: 7 Computing the *minimum* of an array [easily vectorizable](/hpc/simd/reduction), as it is not different from any other reduction: in AVX2, you just need to use a convenient `_mm256_min_epi32` intrinsic as the inner operation. It computes the minimum of two 8-element vectors in one cycle — even faster than in the scalar case, which requires at least a comparison and a conditional move. -Finding the *index* of that minimum element (*argmin*) is much harder, but it is still possible to vectorize very efficiently. In this section, we design an algorithm that computes the argmin (almost) at the speed of computing the minimum: ~15x faster than the naive scalar approach and ~2.5x faster than the [previous state-of-the-art](http://0x80.pl/notesen/2018-10-03-simd-index-of-min.html). +Finding the *index* of that minimum element (*argmin*) is much harder, but it is still possible to vectorize very efficiently. In this section, we design an algorithm that computes the argmin (almost) at the speed of computing the minimum and ~15x faster than the naive scalar approach. ### Scalar Baseline @@ -336,3 +336,5 @@ Thanks to Zach Wegner for [pointing out](https://twitter.com/zwegner/status/1491 Thanks to Alexander Monakov for [being meticulous](https://twitter.com/_monoid/status/1491827976438231049) and pushing me to investigate the STL version. --> + +After publication, I've discovered that [Marshall Lochbaum](https://www.aplwiki.com/wiki/Marshall_Lochbaum), the creator of [BQN](https://mlochbaum.github.io/BQN/), designed a [very similar algorithm](https://forums.dyalog.com/viewtopic.php?f=13&t=1579&sid=e2cbd69817a17a6e7b1f76c677b1f69e#p6239) while he was working on Dyalog APL in 2019. Pay more attention to the world of array programming languages! 
From ddd78b31aaa7e0e30255553c6ffd59dd3dfe0a4d Mon Sep 17 00:00:00 2001 From: Marco Date: Sun, 13 Feb 2022 03:41:57 -0600 Subject: [PATCH 163/531] Fix typo --- content/english/hpc/pipelining/branchless.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index 758300e2..fab426a4 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -3,7 +3,7 @@ title: Branchless Programming weight: 3 --- -As we established in [the pervious section](../branching), branches that can't be effectively predicted by the CPU are expensive as they may cause a long pipeline stall to fetch new instructions after a branch mispredict. In this section, we discuss the means of removing branches in the first place. +As we established in [the previous section](../branching), branches that can't be effectively predicted by the CPU are expensive as they may cause a long pipeline stall to fetch new instructions after a branch mispredict. In this section, we discuss the means of removing branches in the first place. ### Predication From 59b7244783304229106c98d9a0cb0539ed6eb128 Mon Sep 17 00:00:00 2001 From: tralik Date: Sun, 13 Feb 2022 11:17:55 +0100 Subject: [PATCH 164/531] Update scanline.md fix a typo and a duplicate --- content/russian/cs/decomposition/scanline.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/russian/cs/decomposition/scanline.md b/content/russian/cs/decomposition/scanline.md index 1b3cf993..6ea7e2e7 100644 --- a/content/russian/cs/decomposition/scanline.md +++ b/content/russian/cs/decomposition/scanline.md @@ -20,7 +20,7 @@ weight: 1 Назовем *интересными* те точки, в которых происходит смена количества отрезков, которыми она покрыта. Так как смена ответа может происходить только в интересной точке, то максимум достигается также в какой-то из интересных точек. 
Отсюда сразу следует решение за $O(n^2)$: просто перебрать все интересные точки (это будут концы заданных отрезков) и проверить для каждой по отдельности ответ. -Это решение можно улучшить. Отсортируем интересные точки по возрастанию координаты и прой по ним слева направо, поддерживая количество отрезков `cnt`, которые покрывают данную точку. Если в данной точке начинается отрезок, то надо увеличить `cnt` на единицу, а если заканчивается, то уменьшить. После этого пробуем обновить ответ на задачу текущим значением `cnt`. +Это решение можно улучшить. Отсортируем интересные точки по возрастанию координаты и пройдем по ним слева направо, поддерживая количество отрезков `cnt`, которые покрывают данную точку. Если в данной точке начинается отрезок, то надо увеличить `cnt` на единицу, а если заканчивается, то уменьшить. После этого пробуем обновить ответ на задачу текущим значением `cnt`. Как такое писать: нужно представить интересные точки в виде структур с полями «координата» и «тип» (начало / конец) и отсортировать со своим компаратором. Удобно начало отрезка обозначать +1, а конец -1, чтобы просто прибавлять к `cnt` это значение и на разбирать случае. @@ -83,7 +83,7 @@ for (event e : events) { Воспользуемся следующим приемом: сразу считаем все запросы и сохраним их, чтобы потом ответить на все сразу. Добавим точки запросов как события с новым типом 0, который будет означать, что в этой точке надо ответить на запрос, и отдельным полем для номера запроса. -Теперь аналогично отсортируем отсортируем точки интереса и пройдем по ним слева направо, поддерживая `cnt` и отвечая на запросы, когда их встретим. +Теперь аналогично отсортируем точки интереса и пройдем по ним слева направо, поддерживая `cnt` и отвечая на запросы, когда их встретим. 
```cpp struct event { From d709b73ec7285bd12b0e1f4f39c811245e251570 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 13:38:56 +0300 Subject: [PATCH 165/531] fix grammar --- content/english/hpc/architecture/assembly.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/content/english/hpc/architecture/assembly.md b/content/english/hpc/architecture/assembly.md index 1333a92b..f92ef812 100644 --- a/content/english/hpc/architecture/assembly.md +++ b/content/english/hpc/architecture/assembly.md @@ -41,7 +41,7 @@ Assembly is a very minimal language because it needs to be. It reflects the mach [^disassembly]: On Linux, to disassemble a compiled program, you can call `objdump -d {path-to-binary}`. -Note that the two snippets above are not just syntactically different. Both are optimized codes produced by a compiler, but the Arm version uses 4 instruction, while the x86 version uses 3. The `add eax, [rdi]` instruction is what's called *fused instruction* that does a load and an add in one go — this is one of the perks that the [CISC](../isa#risc-vs-cisc) approach can provide. +Note that the two snippets above are not just syntactically different. Both are optimized codes produced by a compiler, but the Arm version uses 4 instructions, while the x86 version uses 3. The `add eax, [rdi]` instruction is what's called *fused instruction* that does a load and an add in one go — this is one of the perks that the [CISC](../isa#risc-vs-cisc) approach can provide. Since there are far more differences between the architectures than just this one, from here on and for the rest of the book we will only provide examples for x86, which is probably what most of our readers will optimize for, although many of the introduced concepts will be architecture-agnostic. 
@@ -59,13 +59,13 @@ There are also 32-, 16-bit and 8-bit registers that have similar names (`rax` These are just the *general-purpose* registers that you can, with [some exceptions](../functions), use however you like in most instructions. There is also a separate set of registers for [floating-point arithmetic](/hpc/arithmetic/float), a bunch of very wide registers used in [vector extensions](/hpc/simd), and a few special ones that are needed for [control flow](../jumps), but we'll get there in time. -**Constants** are just integer or floating point values: `42`, `0x2a`, `3.14`, `6.02e23`. They are more commonly called *immediate values* because they are embedded right into the machine code. Because it may considerably increase the complexity of the instruction encoding, some instructions don't support immediate values, or allow just a fixed subset of them. In some cases you have to load a constant value into a register and then use it instead of an immediate value. +**Constants** are just integer or floating-point values: `42`, `0x2a`, `3.14`, `6.02e23`. They are more commonly called *immediate values* because they are embedded right into the machine code. Because it may considerably increase the complexity of the instruction encoding, some instructions don't support immediate values or allow just a fixed subset of them. In some cases, you have to load a constant value into a register and then use it instead of an immediate value. Apart from numeric values, there are also string constants such as `hello` or `world\n` with their own little subset of operations, but that is a somewhat obscure corner of the assembly language that we are not going to explore here. ### Moving Data -Some instructions may have the same mnemonic, but have different operand types, in which case they are considered distinct instructions as they may perform slightly different operations and take different time to execute. 
The `mov` instruction is a vivid example of that, as it comes in around 20 different forms, all related to moving data: either between the memory and registers or just between two registers. Despite the name, it doesn't *move* a value into a register, but *copies* it, preserving the original. +Some instructions may have the same mnemonic, but have different operand types, in which case they are considered distinct instructions as they may perform slightly different operations and take different times to execute. The `mov` instruction is a vivid example of that, as it comes in around 20 different forms, all related to moving data: either between the memory and registers or just between two registers. Despite the name, it doesn't *move* a value into a register, but *copies* it, preserving the original. When used to copy data between two registers, the `mov` instruction instead performs *register renaming* internally — informs the CPU that the value referred by register X is actually stored in register Y — without causing any additional delay except for maybe reading and decoding the instruction itself. For the same reason, the `xchg` instruction that swaps two registers also doesn't cost anything. @@ -89,7 +89,7 @@ Memory addressing is done with the `[]` operator, but it can do more than just r SIZE PTR [base + index * scale + displacement] ``` -where `displacement` needs to be an integer constant and `scale` can be either 2, 4, or 8. What it does is calculates the pointer `base + index * scale + displacement` and dereferences it. +where `displacement` needs to be an integer constant and `scale` can be either 2, 4, or 8. What it does is calculate the pointer `base + index * scale + displacement` and dereference it. 
@@ -117,7 +117,7 @@ There are actually multiple *assemblers* (the programs that produce machine code These syntaxes are also sometimes called *GAS* and *NASM* respectively, by the names of the two primary assemblers that use them (*GNU Assembler* and *Netwide Assembler*). -We used Intel syntax in this chapter and will continue to preferably use it for the rest of the book. For comparison, here is how the summation loop looks like in AT&T asm: +We used Intel syntax in this chapter and will continue to preferably use it for the rest of the book. For comparison, here is what the summation loop looks like in AT&T asm: ```asm loop: From b6b1715575fdd02982f117d3b6f6874f909639d7 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 13:44:01 +0300 Subject: [PATCH 166/531] grammar fixes in the first chapter --- content/english/hpc/complexity/_index.md | 4 ++-- content/english/hpc/complexity/hardware.md | 10 +++++----- content/english/hpc/complexity/languages.md | 6 +++--- 3 files changed, 10 insertions(+), 10 deletions(-) diff --git a/content/english/hpc/complexity/_index.md b/content/english/hpc/complexity/_index.md index 66c7ed96..23d4719e 100644 --- a/content/english/hpc/complexity/_index.md +++ b/content/english/hpc/complexity/_index.md @@ -17,9 +17,9 @@ To estimate the real running time of a program, you need to sum all latencies fo ![](img/cpu.png) -The clock frequency is a volatile and often unknown variable that depends on the CPU model, operating system settings, current microchip temperature, power usage of other components, and quite a few other things. In contrast, instruction latencies are static and even somewhat consistent across different CPUs when expressed in clock cycles, and so counting them instead is much more useful for analytical purposes. 
+The clock frequency is a volatile and often unknown variable that depends on the CPU model, operating system settings, current microchip temperature, power usage of other components, and quite a few other things. In contrast, instruction latencies are static and even somewhat consistent across different CPUs when expressed in clock cycles, so counting them instead is much more useful for analytical purposes. -For example, the by-definition matrix multiplication algorithm requires the total of $n^2 \cdot (n + n - 1)$ arithmetic operations: specifically, $n^3$ multiplications and $n^2 \cdot (n - 1)$ additions. If we look up the latencies for these instructions (in special documents called *instruction tables*, like [this one](https://www.agner.org/optimize/instruction_tables.pdf)), we can find that multiplication takes e. g. 3 cycles, while addition takes 1, so we need a total of $3 \cdot n^3 + n^2 \cdot (n - 1) = 4 \cdot n^3 - n^2$ clock cycles for the entire computation (bluntly ignoring everything else that needs to be done in order to "feed" these instructions with the right data). +For example, the by-definition matrix multiplication algorithm requires the total of $n^2 \cdot (n + n - 1)$ arithmetic operations: specifically, $n^3$ multiplications and $n^2 \cdot (n - 1)$ additions. If we look up the latencies for these instructions (in special documents called *instruction tables*, like [this one](https://www.agner.org/optimize/instruction_tables.pdf)), we can find that multiplication takes e. g. 3 cycles, while addition takes 1, so we need a total of $3 \cdot n^3 + n^2 \cdot (n - 1) = 4 \cdot n^3 - n^2$ clock cycles for the entire computation (bluntly ignoring everything else that needs to be done to "feed" these instructions with the right data). 
Similar to how the sum of instruction latencies can be used as a clock-independent proxy for total execution time, computational complexity can be used to quantify the intrinsic time requirements of an abstract algorithm, without relying on the choice of a specific computer. diff --git a/content/english/hpc/complexity/hardware.md b/content/english/hpc/complexity/hardware.md index 0f2a7894..eed2e56d 100644 --- a/content/english/hpc/complexity/hardware.md +++ b/content/english/hpc/complexity/hardware.md @@ -27,7 +27,7 @@ Microchips are "printed" on a slice of crystalline silicon using a process calle Consider now the "hit it with photons" part. For that, we can use a system of lenses that projects a pattern onto a much smaller area, effectively making a tiny circuit with all the desired properties. This way, the optics of the 1970s were able to fit a few thousand transistors on the size of a fingernail, which gives microchips several key advantages that macro-world computers didn't have: - higher clock rates (that were previously limited by the speed of light); -- ability to scale the production; +- the ability to scale the production; - much lower material and power usage, translating to much lower cost per unit. Apart from these immediate benefits, photolithography enabled a clear path to improve performance further: you can just make lenses stronger, which in turn would create smaller, but functionally identical devices with relatively little effort. @@ -68,11 +68,11 @@ The only way to mitigate this is to increase voltage; and to balance off power c It may come as a surprise, but the primary metric for modern CPUs is not the clock frequency, but rather "useful operations per joule", or, more practically put, "useful operations per dollar". -Thermodynamically, a computer is just a very efficient device for converting electrical power into heat. 
This heat eventually needs to be removed, and it's not straightforward to do when you are working with a millimeter-scale crystal. There are physical limits of how much power you can consume and then dissipate. +Thermodynamically, a computer is just a very efficient device for converting electrical power into heat. This heat eventually needs to be removed, and it's not straightforward to do when you are working with a millimeter-scale crystal. There are physical limits to how much power you can consume and then dissipate. -Historically, the three main variables guiding microchip designs are power, performance and area (PPA), commonly defined in watts, hertz and nanometers. Until ~2005, cost, which was mainly a function of area, and performance, used to be the most important criteria. But as battery-driven mobile devices started replacing PCs, power quickly and firmly moved up on top of the list, followed by cost and performance. +Historically, the three main variables guiding microchip designs are power, performance, and area (PPA), commonly defined in watts, hertz, and nanometers. Until ~2005, cost, which was mainly a function of area, and performance, used to be the most important criteria. But as battery-driven mobile devices started replacing PCs, power quickly and firmly moved up on top of the list, followed by cost and performance. -Leakage: interfering magnetic fields make electrons move in the directions they are not supposed to and cause unnecessary heating. It isn't bad by itself: it mitigate it you need to increase the voltage, and it won't flick any bits. But the problem is that the smaller a circuit is, the harder it is to cope with this by isolating the wires. So modern chips keep the clock frequency at a level that won't cause overheat, although physically there aren't other reasons why they shouldn't. +Leakage: interfering magnetic fields make electrons move in the directions they are not supposed to and cause unnecessary heating. 
It isn't bad by itself: to mitigate it you need to increase the voltage, and it won't flick any bits. But the problem is that the smaller a circuit is, the harder it is to cope with this by isolating the wires. So modern chips keep the clock frequency at a level that won't cause overheat, although physically there aren't other reasons why they shouldn't. --> @@ -82,7 +82,7 @@ Dennard scaling has ended, but Moore's law is not dead yet. Clock rates plateaued, but the transistor count is still increasing, allowing for the creation of new, *parallel* hardware. Instead of chasing faster cycles, CPU designs started to focus on getting more useful things done in a single cycle. Instead of getting smaller, transistors have been changing shape. -This resulted in increasingly complex architectures capable of doing dozens, hundreds, or even thousands of different things every cycle. +This resulted in increasingly complex architectures capable of doing dozens, hundreds, or even thousands of different things every cycle. ![Die shot of a Zen CPU core by AMD (~1,400,000,000 transistors)](../img/die-shot.jpg) diff --git a/content/english/hpc/complexity/languages.md b/content/english/hpc/complexity/languages.md index 53c4b293..9453c91d 100644 --- a/content/english/hpc/complexity/languages.md +++ b/content/english/hpc/complexity/languages.md @@ -34,12 +34,12 @@ These instructions — called *machine code* — are binary encoded, quirky and --> -On the lowest level, computers execute *machine code* consisting of binary-encoded *instructions* which are used to control the CPU. They are specific, quirky, and require a great deal of intellectual effort to work with, so one of the first things people did after creating computers was creating *programming languages*, which abstract away some details of how computers operate to simplify the process of programming. 
+On the lowest level, computers execute *machine code* consisting of binary-encoded *instructions* which are used to control the CPU. They are specific, quirky, and require a great deal of intellectual effort to work with, so one of the first things people did after creating computers was create *programming languages*, which abstract away some details of how computers operate to simplify the process of programming. A programming language is fundamentally just an interface. Any program written in it is just a nicer higher-level representation which still at some point needs to be transformed into the machine code to be executed on the CPU — and there are several different means of doing that: - From a programmer's perspective, there are two types of languages: *compiled*, which pre-process before executing, and *interpreted*, which are executed during runtime using a separate program called *an interpreter*. -- From a computer's perspective, there are also two types of languages: *native*, which directly execute machine code, and *managed*, which rely on some sort of *a runtime* to do it. +- From a computer's perspective, there are also two types of languages: *native*, which directly execute machine code, and *managed*, which rely on some sort of *runtime* to do it. Since running machine code in an interpreter doesn't make sense, this makes a total of three types of languages: @@ -84,7 +84,7 @@ print(duration) This code runs in 630 seconds. That's more than 10 minutes! -Let's try to put this number in perspective. The CPU that ran it has a clock frequency of 1.4GHz, meaning that it does $1.4 \cdot 10^9$ cycles per second, totaling to almost $10^{15}$ for the entire computation, and about 880 cycles per each multiplication in the innermost loop. +Let's try to put this number in perspective. 
The CPU that ran it has a clock frequency of 1.4GHz, meaning that it does $1.4 \cdot 10^9$ cycles per second, totaling to almost $10^{15}$ for the entire computation, and about 880 cycles per multiplication in the innermost loop. This is not surprising if you consider the things that Python needs to do to figure out what the programmer meant: From 3fe17502055fb891e63f20d15830d0962b7f25a7 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 15:53:23 +0300 Subject: [PATCH 167/531] binary search layout issues --- .../hpc/data-structures/binary-search.md | 54 +- .../img/search-random-relative.svg | 1178 +++++++++++++++ .../hpc/data-structures/img/search-random.svg | 1286 +++++++++++++++++ 3 files changed, 2512 insertions(+), 6 deletions(-) create mode 100644 content/english/hpc/data-structures/img/search-random-relative.svg create mode 100644 content/english/hpc/data-structures/img/search-random.svg diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 5448307c..909e0086 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -169,11 +169,41 @@ We can also fetched ahead by more than one layer, but the number of fetches we w ## Optimizing the Layout -But this is not the largest problem. The real problem is that it waits for its operands, and the results still can't be predicted. +How good is the [data locality](/hpc/external-memory/locality/) of a binary search? -The running time of this (or any) algorithm is not just the "cost" of all its arithmetic operations, but rather this cost *plus* the time spent waiting for data to be fetched from memory. Thus, depending on the algorithm and problem limitations, it can be CPU-bound or memory-bound, meaning that the running time is dominated by one of its components. 
+- *Spatial locality* seems to be good for the last 3-4 requests that are likely to be on the same [cache line](/hpc/cpu-cache/cache-lines) — but all the previous requests require huge memory jumps. +- *Temporal locality* seems to be good for the first dozen or so requests — there aren't that many different comparison sequences of this length, so we will be comparing against the same middle elements over and over, which are likely to be cached. -If array is large enough—usually around the point where it stops fitting in cache and fetches become significantly slower—the running time of binary search becomes dominated by memory fetches. +To illustrate how important the second type of cache sharing is, let's try: + +```c++ +int lower_bound(int x) { + int l = 0, r = n - 1; + while (l < r) { + int m = l + rand() % (r - l); + if (t[m] >= x) + r = m; + else + l = m + 1; + } + return t[l]; +} +``` + +The is around ~1.3x.[^limit] + +[^limit]: [algorithm](https://gist.github.com/sslotin/4b7193041b01e454615f50d237485c71). By the way, if someone who remembers calculus is reading this, try to find the limit of that. + +![](../img/search-random.svg) + +$2^{20}$ works in 360ns, while $(2^{20} + 123)$ works in ~300ns: a 20% difference. + +Another often neglected effect is that of cache associativity, which can adversely +effect binary search when the the array length is a large power of 2. In a c-way associativecache, the top $\log(n / C)$ levels of the implicit search tree must all share the same c cache lines. If $\log(n/C) > c$, this effectively means that the cache effectively has size only c. + +But it isn't very efficient: in the same hot cache line that we store element $\lfloor n/2 \rfloor$, we also store the element $\lfloor n/2 \rfloor + 1$, which is the last element fetched. + +We use data points of $\lfloor 1.17^k \rfloor$ to swipe that issue under the rug. 
So, to sum up: ideally, we'd want some layout that is both blocks, and higher-order blocks to be placed in groups, and also to be capable. @@ -247,7 +277,7 @@ Despite being recursive, this is actually a really fast implementation as all me Note that the first element is left unfilled and the whole array is essentially 1-shifted. This will actually turn out to be a huge performance booster. -### Binary Search Implementation +### Search Implementation We can now descend this array using only indices: we just start with $k=1$ and execute $k := 2k$ if we need to go left and $k := 2k + 1$ if we need to go right. We don't even need to store and recalculate binary search boundaries anymore. @@ -308,10 +338,14 @@ __builtin_prefetch(t + k * B * 2); ### Removing the Last Branch -Let's zoom in. +The finishing touch. Did you notice the bumpiness of eytzinger search? This isn't random noise — let's zoom in: ![](../img/search-eytzinger-small.svg) +There is a period of a power of two and The running time is ~10ns higher for. + +These 10ns are the mispredicted branches for arrays. The last branch, to be exact. + ```c++ t[0] = -1; // an element that is less than X iters = std::__lg(n + 1); @@ -320,18 +354,26 @@ iters = std::__lg(n + 1); ```c++ int lower_bound(int x) { int k = 1; + for (int i = 0; i < iters; i++) k = 2 * k + (t[k] < x); + int *loc = (k <= n ? t + k : t); k = 2 * k + (*loc < x); + k >>= __builtin_ffs(~k); + return t[k]; } ``` +The graph is now smooth and almost doesn't lose to the branchless binary search on small arrays: + ![](../img/search-eytzinger-branchless.svg) -That was a detour. +That was a small detour. Let's move on. + +The title of this article doesn't say "binary search". We aren't limited to fetching one element at a time and comparing it. We can do better. 
## B-Tree Layout diff --git a/content/english/hpc/data-structures/img/search-random-relative.svg b/content/english/hpc/data-structures/img/search-random-relative.svg new file mode 100644 index 00000000..2b9cfa98 --- /dev/null +++ b/content/english/hpc/data-structures/img/search-random-relative.svg @@ -0,0 +1,1178 @@ + [1178 added lines of SVG plot markup omitted] diff --git a/content/english/hpc/data-structures/img/search-random.svg b/content/english/hpc/data-structures/img/search-random.svg new file mode 100644 index 00000000..6376210b --- /dev/null +++ b/content/english/hpc/data-structures/img/search-random.svg @@ -0,0 +1,1286 @@ + [1286 added lines of SVG plot markup omitted]
From 1d5dfe655574e49154a0be12e015b70845a340f9 Mon Sep 17 00:00:00 2001 From: Zach Dingels Date: Sun, 13 Feb 2022 09:00:58 -0500 Subject: [PATCH 168/531] typos --- content/english/hpc/compilation/contracts.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/compilation/contracts.md b/content/english/hpc/compilation/contracts.md index 796a6702..66aeb5f9 100644 --- a/content/english/hpc/compilation/contracts.md +++ b/content/english/hpc/compilation/contracts.md @@ -153,7 +153,7 @@ Since each iteration of this loop is independent, it can be executed in parallel There may be a problem if the arrays `a` and `b` intersect. Consider the case when `b == a + 1`, that is, if `b` is a just a memory view of `a` starting from the second element. In this case, the next iteration depends on the previous one, and the only correct solution is execute the loop sequentially. The compiler has to check for such possibilities, even if the programmer knows they can't happen. -This is why we have `const` and `restrict` keywords. The first one enforces that that we won't modify memory with the pointer variable, and the second is a way to tell compiler that the memory is guaranteed to be not aliased. +This is why we have `const` and `restrict` keywords. The first one enforces that that we won't modify memory with the pointer variable, and the second is a way to tell compiler that the memory is guaranteed to not be aliased.
```cpp void add(int * __restrict__ a, const int * __restrict__ b, int n) { From b1788eec7315f8deadce94aedbad0f8c2cd53583 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 17:03:49 +0300 Subject: [PATCH 169/531] cache associativity edits --- content/english/hpc/cpu-cache/associativity.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/content/english/hpc/cpu-cache/associativity.md b/content/english/hpc/cpu-cache/associativity.md index c53e6935..02784a15 100644 --- a/content/english/hpc/cpu-cache/associativity.md +++ b/content/english/hpc/cpu-cache/associativity.md @@ -17,7 +17,7 @@ for (int i = 0; i < N; i += 257) a[i]++; ``` -Which one will be faster to finish? There are several considerations: +Which one will be faster to finish? There are several considerations that come to mind: - At first, you think that there shouldn't be much difference, or maybe that the second loop is $\frac{257}{256}$ times faster or so because it does fewer iterations in total. - Then you recall that 256 is a nice round number, which may have something to do with [SIMD](/hpc/simd) or the memory system, so maybe the first one is faster. @@ -28,7 +28,7 @@ This isn't just a single bad step size. The performance degrades for all indices ![The array size is normalized so that the total number of iterations is constant](../img/strides-small.svg) -There is no vectorization or anything, and the two loops produce the same assembly except for the step size. This effect is due only to the memory system, in particular to a feature called *cache associativity* which is a peculiar artifact of how CPU caches are implemented in hardware. +There is no vectorization or anything, and the two loops produce the same assembly except for the step size. This effect is due only to the memory system, in particular to a feature called *cache associativity*, which is a peculiar artifact of how CPU caches are implemented in hardware. 
### Hardware Caches @@ -38,15 +38,15 @@ In the context of hardware, such scheme is called *fully associative cache*: we ![Fully associative cache](../img/cache1.png) -The problem with fully associative cache is that implementing the "find the oldest cache line among millions" operation is hard in software and simply unfeasible in hardware. You can make a fully associative cache that has 16 entries or so, but managing hundreds of cache lines already becomes either prohibitively expensive or so slow that it's not worth it. +The problem with fully associative cache is that implementing the "find the oldest cache line among millions" operation is pretty hard to do in software and just unfeasible in hardware. You can make a fully associative cache that has 16 entries or so, but managing hundreds of cache lines already becomes either prohibitively expensive or so slow that it's not worth it. We can resort to another, much simpler approach: just map each block of 64 bytes in RAM to a single cache line which it can occupy. Say, if we have 4096 blocks in memory and 64 cache lines for them, then each cache line at any time stores the contents of one of $\frac{4096}{64} = 64$ different blocks. ![Direct-mapped cache](../img/cache2.png) -A direct-mapped cache is easy to implement, and it doesn't require storing any additional meta-information associated with a cache line except its tag (the actual memory location of a cached block). The disadvantage is that the entries can be kicked out too quickly — for example, when bouncing between two addresses that map to the same cache line — leading to lower overall cache utilization. +A direct-mapped cache is easy to implement doesn't require storing any additional meta-information associated with a cache line except its tag (the actual memory location of a cached block). 
The disadvantage is that the entries can be kicked out too quickly — for example, when bouncing between two addresses that map to the same cache line — leading to lower overall cache utilization. -For that reason, we settle for something in-between direct-mapped and fully associative caches: the *set-associative cache*. It splits the address space into equal groups which separately act as small fully-associative caches. +For that reason, we settle for something in-between direct-mapped and fully associative caches: the *set-associative cache*. It splits the address space into equal groups, which separately act as small fully-associative caches. ![Set-associative cache (2-way associative)](../img/cache3.png) @@ -66,13 +66,13 @@ Instead, the hardware uses the lazy approach. It takes the memory address that n - *offset* — the index of the word within a 64B cache line ($\log_2 64 = 6$ bits); - *index* — the index of the cache line set (the next $12$ bits as there are $2^{12}$ cache lines in the L3 cache); -- *tag* — the rest of the memory address to tell the memory blocks stored in the cache lines apart. +- *tag* — the rest of the memory address, which is used to tell the memory blocks stored in the cache lines apart. In other words, all memory addresses with the same "middle" part map to the same set. -![Address composition for a 64-entry 2-way set associative cache](../img/address.png) +![Address composition for a 64-entry 2-way set-associative cache](../img/address.png) -This makes the cache system simpler and cheaper to implement, but also makes it susceptible to certain access patterns. +This makes the cache system simpler and cheaper to implement but also susceptible to certain bad access patterns. 
### Pathological Mappings From dc824a97bf813137900139737f4cf8d1e18923c7 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 13 Feb 2022 17:18:18 +0300 Subject: [PATCH 170/531] shuffling macro note --- content/english/hpc/simd/shuffing.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/simd/shuffing.md b/content/english/hpc/simd/shuffing.md index 9c04954e..f2a2cd15 100644 --- a/content/english/hpc/simd/shuffing.md +++ b/content/english/hpc/simd/shuffing.md @@ -144,7 +144,7 @@ int popcnt() { This code processes around 30 bytes per cycle. Theoretically, the inner loop could do 32, but we have to stop it every 15 iterations because the 8-bit counters can overflow. -The `pshufb` instruction is so instrumental in some SIMD algorithms that [Wojciech Muła](http://0x80.pl/) — the guy who came up with this algorithm — took it as his [Twitter handle](https://twitter.com/pshufb). You can calculate population counts even faster: check out his [github repository](https://github.com/WojciechMula/sse-popcount) with different vectorized popcount implementations and his [recent paper](https://arxiv.org/pdf/1611.07612.pdf) for a detailed explanation of the state-of-the-art. +The `pshufb` instruction is so instrumental in some SIMD algorithms that [Wojciech Muła](http://0x80.pl/) — the guy who came up with this algorithm — took it as his [Twitter handle](https://twitter.com/pshufb). You can calculate population counts even faster: check out his [GitHub repository](https://github.com/WojciechMula/sse-popcount) with different vectorized popcount implementations and his [recent paper](https://arxiv.org/pdf/1611.07612.pdf) for a detailed explanation of the state-of-the-art. ### Permutations and Lookup Tables @@ -237,4 +237,8 @@ _mm256_permute_ps uses a mask https://stackoverflow.com/questions/9795529/how-to-find-the-horizontal-maximum-in-a-256-bit-avx-vector Norbert P. 
and Peter Cordes +_MM_SHUFFLE + +https://stackoverflow.com/questions/37088449/macro-for-generating-immediates-for-avx-shuffle-intrinsics + --> From ad589dd17062592f91a0d3b3d9f9e34b4201b214 Mon Sep 17 00:00:00 2001 From: WeetHet <43210583+WeetHet@users.noreply.github.com> Date: Mon, 14 Feb 2022 11:53:06 +0500 Subject: [PATCH 171/531] BugFix: rewrite dequeue -> deque --- content/russian/cs/shortest-paths/bfs.md | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/content/russian/cs/shortest-paths/bfs.md b/content/russian/cs/shortest-paths/bfs.md index 1893d5db..c96804c4 100644 --- a/content/russian/cs/shortest-paths/bfs.md +++ b/content/russian/cs/shortest-paths/bfs.md @@ -1,13 +1,14 @@ --- title: Поиск в ширину authors: -- Александр Гришутин -- Станислав Алексеев -- "[Максим Иванов](https://e-maxx.ru/algo/bfs)" + - Александр Гришутин + - Станислав Алексеев + - '[Максим Иванов](https://e-maxx.ru/algo/bfs)' editors: -- Сергей Слотин + - Сергей Слотин weight: 2 -date: 2021-09-30 +date: {} +published: true --- *Поиск в ширину* (англ. *breadth-first search*) — один из основных алгоритмов на графах, позволяющий находить все кратчайшие пути от заданной вершины и решать многие другие задачи. 
@@ -158,7 +158,7 @@ $$ vector<int> d(n, -1); d[s] = 0; -dequeue<int> q; +deque<int> q; q.push_back(s); while (!q.empty()) { From 90f3fccf5aa88d2845c9c0c117b8efc058d58cfd Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 14 Feb 2022 11:29:08 +0300 Subject: [PATCH 172/531] binary search layout problems --- .../hpc/data-structures/binary-search.md | 43 ++++++++----------- 1 file changed, 19 insertions(+), 24 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 909e0086..22a85e18 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -165,16 +165,20 @@ This makes the performance on large arrays roughly the same, although the graph ![](../img/search-branchless-prefetch.svg) -We can also fetched ahead by more than one layer, but the number of fetches we would need will grow exponentially. Instead, we will try a different approach to optimize memory operations. +We can also fetch ahead by more than one layer, but the number of fetches we would need will grow exponentially. Instead, we will try a different approach to optimize memory operations. ## Optimizing the Layout -How good is the [data locality](/hpc/external-memory/locality/) of a binary search? +The memory requests we perform during binary search form a very specific access pattern: + +![](../img/binary-search.png) + +How likely it is that the elements on each request are cached? How good is their [data locality](/hpc/external-memory/locality/)? - *Spatial locality* seems to be good for the last 3-4 requests that are likely to be on the same [cache line](/hpc/cpu-cache/cache-lines) — but all the previous requests require huge memory jumps.
- *Temporal locality* seems to be good for the first dozen or so requests — there aren't that many different comparison sequences of this length, so we will be comparing against the same middle elements over and over, which are likely to be cached. -To illustrate how important the second type of cache sharing is, let's try: +To illustrate how important the second type of cache sharing is, let's try to pick the element we will compare to on each iteration randomly among the elements of the search interval, instead of the middle one: ```c++ int lower_bound(int x) { @@ -190,37 +194,24 @@ int lower_bound(int x) { } ``` -The is around ~1.3x.[^limit] +Theoretically[^limit], this randomized binary search is expected to do ~1.35x more comparisons than the normal one, but in practice, the running time goes ~6x on large arrays: -[^limit]: [algorithm](https://gist.github.com/sslotin/4b7193041b01e454615f50d237485c71). By the way, if someone who remembers calculus is reading this, try to find the limit of that. +[^limit]: I wrote an [small program](https://gist.github.com/sslotin/4b7193041b01e454615f50d237485c71) for calculating the expected number of comparisons required. By the way, if someone who remembers calculus is reading this, please try to find the limit of the ratio of the number of comparisons a random binary search and a normal one needs, and share how you did that. Although probably useless, it seems like an interesting problem. ![](../img/search-random.svg) -$2^{20}$ works in 360ns, while $(2^{20} + 123)$ works in ~300ns: a 20% difference. - -Another often neglected effect is that of cache associativity, which can adversely -effect binary search when the the array length is a large power of 2. In a c-way associativecache, the top $\log(n / C)$ levels of the implicit search tree must all share the same c cache lines. If $\log(n/C) > c$, this effectively means that the cache effectively has size only c. 
+This isn't just caused by the `rand()` call being slow: you can clearly see the point on the L2-L3 boundary where memory latency outweighs the random number generation and [modulo](/hpc/arithmetic/division). The performance degrades because all of the fetched elements are likely uncached, and not just some small suffix of them. -But it isn't very efficient: in the same hot cache line that we store element $\lfloor n/2 \rfloor$, we also store the element $\lfloor n/2 \rfloor + 1$, which is the last element fetched. +Another potential negative effect is that of [cache associativity](/hpc/cpu-cache/associativity). If the array size is a multiple of a large power of two, then the indices of these "hot" elements will also be divisible by some large powers of two and map to the same cache line, kicking each other out. For example, binary searching over arrays of size $2^{20}$ takes about ~360ns per query, while searching over arrays of size $(2^{20} + 123)$ takes ~300ns — a 20% difference. There are [ways](https://en.wikipedia.org/wiki/Fibonacci_search_technique) to fix this problem, but to not get distracted from more pressing matters, we are just going to ignore it: all array sizes we use are in the form of $\lfloor 1.17^k \rfloor$ for integer $k$ so that any cache side effects are unlikely. -We use data points of $\lfloor 1.17^k \rfloor$ to swipe that issue under the rug. +The real problem with our memory layout is that it doesn't make the most efficient use of temporal locality because it groups hot and cold elements together. For example, we likely store the element $\lfloor n/2 \rfloor$, which we request the first thing on each query, in the same cache line with $\lfloor n/2 \rfloor + 1$, which we almost never request (sometimes literally never — if it is the first element in a search range of three, and it is indeed the lower bound, we just compare against the middle and deduce it has to be the first element without ever even fetching it). 
-So, to sum up: ideally, we'd want some layout that is both blocks, and higher-order blocks to be placed in groups, and also to be capable. - -We can overcome this by enumerating and permuting array elements in a more cache-friendly way. The numeration we will use is actually half a millennium old, and chances are you already know it. - -First ~10 queries may be cached (frequently accessed: temporal locality) -Last 3-4 queries may be cached (may be in the same cache line: data locality) -But that's it. Maybe store elements in a more cache-friendly way? - -![](../img/binary-search.png) - -When we find lower bound of $x$ in a sorted array by binary searching, the main problem is that its memory accesses pattern is neither temporary nor spatially local. - -For example, element $\lfloor \frac n 2 \rfloor$ is accessed very often (every search) and element $\lfloor \frac n 2 \rfloor + 1$ is not, while they are probably occupying the same cache line. In general, only the first 3-5 reads are temporary local and only the last 3-4 reads are spatially local, and the rest are just random memory accesses. +Here is the heatmap visualizing the expected frequency of comparisons for a 31-element array: ![](../img/binary-heat.png) +So, ideally, we'd want a memory layout where hot elements are grouped with hot elements, and cold elements are grouped with cold ones. And we can achieve this if we permute the elements of the array in a more cache-friendly way. The numeration we will use is actually half a millennium old, and chances are, you already know it. + ### Eytzinger Layout **Michaël Eytzinger** is a 16th century Austrian nobleman known for his work on genealogy, particularily for a system for numbering ancestors called *ahnentafel* (German for "ancestor table"). @@ -330,6 +321,10 @@ int lower_bound(int x) { } ``` +Note that the last prefetch is not needed, and may be even outside of the memory region allocated for the program. 
On most modern CPUs, invalid prefetch instructions get converted into no-ops, but on some platforms this may cause a slowdown. + +Hardware prefetching will fetch its neighbours: + ```c++ __builtin_prefetch(t + k * B * 2); ``` From 6c69e3b6b0b2a73e1d0f4d8a4d09a948b2f4c88e Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 14 Feb 2022 12:15:43 +0300 Subject: [PATCH 173/531] eytzinger layout --- .../hpc/data-structures/binary-search.md | 50 ++++++++++++------- 1 file changed, 33 insertions(+), 17 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 22a85e18..fdfb6eb1 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -214,59 +214,75 @@ So, ideally, we'd want a memory layout where hot elements are grouped with hot e ### Eytzinger Layout -**Michaël Eytzinger** is a 16th century Austrian nobleman known for his work on genealogy, particularily for a system for numbering ancestors called *ahnentafel* (German for "ancestor table"). +**Michaël Eytzinger** is a 16th century Austrian nobleman known for his work on genealogy, particularly for a system for numbering ancestors called *ahnentafel* (German for "ancestor table"). Ancestry mattered a lot back then, but writing down that data was expensive. *Ahnentafel* allows displaying a person's genealogy compactly, without wasting extra space by drawing diagrams. It lists a person's direct ancestors in a fixed sequence of ascent. First, the person theirself is listed as number 1, and then, recursively, for each person numbered $k$, their father is listed as $2k$ and their mother as $(2k+1)$. -Here is the example for Paul I, the great-grandson of Peter I, the Great: +Here is the example for [Paul I](https://en.wikipedia.org/wiki/Paul_I_of_Russia), the great-grandson of [Peter the Great](https://en.wikipedia.org/wiki/Peter_the_Great): 1. Paul I 2. 
Peter III (Paul's father) -3. Catherine II (Paul's mother) +3. [Catherine II](https://en.wikipedia.org/wiki/Catherine_the_Great) (Paul's mother) 4. Charles Frederick (Peter's father, Paul's paternal grandfather) 5. Anna Petrovna (Peter's mother, Paul's paternal grandmother) 6. Christian August (Catherine's father, Paul's maternal grandfather) 7. Johanna Elisabeth (Catherine's mother, Paul's maternal grandmother) -Apart from being compact, it has some nice properties, like that all even-numbered persons are male and all odd-numbered (possibly apart from 1) are female. +Apart from being compact, it has some nice properties, like that all even-numbered persons are male and all odd-numbered (possibly except for 1) are female. One can also find the number of a particular ancestor only knowing the genders of their descendants. For example, Peter the Great's bloodline is Paul I → Peter III → Anna Petrovna → Peter the Great, so his number should be $((1 \times 2) \times 2 + 1) \times 2 = 10$. -One can also find the number of a particular ancestor only knowing the genders of their descendants. For example, Peter the Great's bloodline is Paul I → Peter III → Anna Petrovna → Peter the Great, so his number should be $((1 \times 2) \times 2 + 1) \times 2 = 10$. +**In computer science**, this enumeration has been widely used for implicit (pointer-free) implementation of heaps, segment trees, and other binary tree structures, where instead of names it stores underlying array items. -**In computer science**, this enumeration has been widely used for implicit (i. e. pointer-free) implementation of heaps, segment trees, and other binary tree structures, where instead of names it stores underlying array items. 
- -This is how this layout will look when applied to binary search: +This is how this layout looks when applied to binary search: ![](../img/eytzinger.png) -You can immediately see how its temporal locality is better (in fact, theoretically optimal) as the elements closer to the root are closer to the beginning of the array, and thus are more likely to be fetched from cache. +When searching, we just need to start from the first element of the array, and on each iteration jump to either $2 k$ or $(2k + 1)$ depending on how the comparison went: ![](../img/eytzinger-search.png) + +You can immediately see how its temporal locality is better (and, in fact, theoretically optimal) as the elements closer to the root are closer to the beginning of the array, and thus are more likely to be fetched from cache. + ![](../img/eytzinger-heat.png) -### Construction +Another way to look at it is that we write every even-indexed element to the end of the new array, then write every even-indexed element of the remaining ones right before them, and so on, until we place the root as the first element. -Here is a function that constructs Eytzinger array by traversing the original search tree. +### Construction -It takes two indexes $i$ and $k$—one in the original array and one in constructed—and recursively goes to two branches until a leaf node is reached, which could simply be checked by asserting $k \leq n$ as Eytzinger array should have same number of items. 
+To construct the Eytzinger array, we could do this even-odd [filtering](/hpc/simd/shuffing/#permutations-and-lookup-tables) $O(\log n)$ times — and, perhaps, this is the fastest approach — but for brevity, we will instead build it by traversing the original search tree: ```c++ -int a[n], b[n + 1]; // <- change name +int a[n], t[n + 1]; // the original sorted array and the eytzinger array we build +// ^ we need one element more because of one-based indexing void eytzinger(int k = 1) { - static int i = 0; // <- careful running it multiple times + static int i = 0; // <- careful running it on multiple arrays if (k <= n) { eytzinger(2 * k); - t[k] = _a[i++]; + t[k] = a[i++]; eytzinger(2 * k + 1); } } ``` -Despite being recursive, this is actually a really fast implementation as all memory reads are sequential. +It seems complicated, but to assure its correctness, we only need three statements: + +- `k <= n`: it writes exactly $n$ elements. +- `i++`: the elements it writes are increasing elements from the original array. +- Before we write the element at node `k`, we write out all its preceding elements. + + -Note that the first element is left unfilled and the whole array is essentially 1-shifted. This will actually turn out to be a huge performance booster. +Note that the Eytzinger array is one-indexed — later this will be important for performance. You can put in the zeroth element the value that you want returned if the lower bound doesn't exist (similar to `a.end()` for `std::lower_bound`). 
### Search Implementation From c6c30a742d7ede532e08b649b26259039d7dd5b4 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 14 Feb 2022 14:18:21 +0300 Subject: [PATCH 174/531] eytzinger layout searching --- .../hpc/data-structures/binary-search.md | 86 +++++++++++++------ 1 file changed, 59 insertions(+), 27 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index fdfb6eb1..ae977d53 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -266,27 +266,25 @@ void eytzinger(int k = 1) { } ``` -It seems complicated, but to assure its correctness, we only need three statements: +This function takes the current node number `k`, recursively writes out all elements to the left of the middle of the search interval, writes out the current element we'd compare against, and then recursively writes out all the elements on the right. It seems a bit complicated, but to convince ourselves that it works, we only need three observations: -- `k <= n`: it writes exactly $n$ elements. -- `i++`: the elements it writes are increasing elements from the original array. -- Before we write the element at node `k`, we write out all its preceding elements. +- It writes exactly `n` elements, as we enter the body of `if` for each `k` from `1` to `n` just once. +- It writes out sequential elements from the original array, as it increments the `i` pointer each time. +- By the time we write the element at node `k`, we have already written all the elements to its left (exactly `i`). - +Despite being recursive, it is actually quite fast as all the memory reads are sequential and the memory writes are only in $O(\log n)$ different memory blocks at a time. Note that the Eytzinger array is one-indexed — later this will be important for performance. 
You can put in the zeroth element the value that you want returned if the lower bound doesn't exist (similar to `a.end()` for `std::lower_bound`). ### Search Implementation -We can now descend this array using only indices: we just start with $k=1$ and execute $k := 2k$ if we need to go left and $k := 2k + 1$ if we need to go right. We don't even need to store and recalculate binary search boundaries anymore. +We can now descend this array using only indices: we just start with $k=1$ and execute $k := 2k$ if we need to go left and $k := 2k + 1$ if we need to go right. We don't even need to store and recalculate the search boundaries anymore. To avoid branching, we can just do this: + +```c++ +int k = 1; +while (k <= n) + k = 2 * k + (t[k] < x); +``` The only problem arises when we need to restore the index of the resulting element, as $k$ may end up not pointing to a leaf node. Here is an example of how that can happen: @@ -299,13 +297,11 @@ eytzinger: 4 2 5 1 6 3 7 8 4th range: - k := 2*k + 1 (=11) ``` -Here we query array of $[1, …, 8]$ for the lower bound of $x=4$. We compare it against $4$, $2$ and $5$, and go left-right-right and end up with $k = 11$, which isn't even a valid array index. +Here we query the array of $[1, …, 8]$ for the lower bound of $x=4$. We compare it against $4$, $2$, and $5$, go left-right-right, and end up with $k = 11$, which isn't even a valid array index. -Note that, unless the answer is the last element of the array, we compare $x$ against it at some point, and after we learn that it is not less than $x$, we start comparing $x$ against elements to the left, and all these comparisons will evaluate true (i. e. leading to the right). Hence, the solution to restoring the resulting element is to cancel some number of right turns. 
+The trick is to notice that, unless the answer is the last element of the array, we compare $x$ against it at some point, and after we've learned that it is not less than $x$, we start comparing $x$ against elements to the left, and all these comparisons evaluate true (i. e. leading to the right). Therefore, to restore the answer, we just need to "cancel" some number of right turns. -This can be done in an elegant way by observing that the right turns are recorded in the binary notation of $k$ as 1-bits, and so we just need to find the number of trailing ones in the binary notation and right-shift $k$ by exactly that amount. - -To do this we can invert the number (`~x`) and call "find first set" instruction available on most systems. In GCC, the corresponding builtin is `__builtin_ffs`. +This can be done in an elegant way by observing that the right turns are recorded in the binary representation of $k$ as 1-bits, and so we just need to find the number of trailing ones in the binary representation and right-shift $k$ by exactly that amount. To do this, we can invert the number (`~k`) and call the "find first set" instruction: ```c++ int lower_bound(int x) { @@ -317,19 +313,53 @@ int lower_bound(int x) { } ``` +We run it, and… well, it doesn't look *that* good: + ![](../img/search-eytzinger.svg) +The latency on smaller arrays is on par with the branchless binary search implementation — which isn't surprising as it is just two lines of code — but it starts taking off much sooner. This is because now we don't get the advantage of spatial locality: the last 3-4 elements we compare against are not in the same cache line anymore. Yes, the temporal locality is better, but it is not enough compensation: the caching of the cold elements is still beneficial. + +But there is a way to make it profitable. + ### Prefetching -We could prefetch not just its 2 children. +To hide memory latency, we can use software prefetching similar to how we did for branchless binary search.
But instead of issuing two separate prefetch instructions for the left and right child nodes, we can notice that they are actually neighbors in the Eytzinger array: one has index $2 k$ and the other $(2k + 1)$, so they are likely in the same cache line, and we can use just one instruction. + +In fact, this observation extends to the grandchildren of node $k$, which are also stored sequentially: + +``` +2 * 2 * k = 4 * k +2 * 2 * k + 1 = 4 * k + 1 +2 * (2 * k + 1) = 4 * k + 2 +2 * (2 * k + 1) + 1 = 4 * k + 3 +``` + + + +So their cache line can also be fetched with one instruction. Interesting… what if we continue this, and instead of fetching direct children, we fetch ahead as many descendants as we can cram into one cache line? That would be $\frac{64}{4} = 16$ elements, our great-great-grandchildren with indices from $16k$ to $(16k + 15)$. + +Now, if we prefetch just one of these 16 elements, we will probably only get some but not all of them, as they may cross a cache line boundary. We can prefetch the first *and* the last element, but to get away with just one request, we can observe that the index of the first element, $16k$, is divisible by $16$, and therefore its memory address will be the base address of the array plus something divisible by $16 \cdot 4 = 64$, the cache line size. If the array begins on a cache line boundary, then these $16$ great-great-grandchildren are guaranteed to be on a single cache line.
+ +Therefore, we just need to [align](/hpc/cpu-cache/alignment) the array: ```c++ t = (int*) std::aligned_alloc(64, 4 * (n + 1)); +``` +And then prefetch the element indexed $16 k$ in the main loop: + +```c++ int lower_bound(int x) { int k = 1; while (k <= n) { - __builtin_prefetch(t + k * B); + __builtin_prefetch(t + k * 16); k = 2 * k + (t[k] < x); } k >>= __builtin_ffs(~k); @@ -337,16 +367,20 @@ int lower_bound(int x) { } ``` +The performance on large arrays improves 3-4x from the previous version and ~2x compared to `std::lower_bound`. Not bad for just two more lines of code: + +![](../img/search-eytzinger-prefetch.svg) + +This trick allows us to overlap 4 requests at a time. We are trading off memory bandwidth for latency. + Note that the last prefetch is not needed, and may be even outside of the memory region allocated for the program. On most modern CPUs, invalid prefetch instructions get converted into no-ops, but on some platforms this may cause a slowdown. Hardware prefetching will fetch its neighbours: ```c++ -__builtin_prefetch(t + k * B * 2); +__builtin_prefetch(t + k * 32); ``` -![](../img/search-eytzinger-prefetch.svg) - ### Removing the Last Branch The finishing touch. Did you notice the bumpiness of eytzinger search? This isn't random noise — let's zoom in: @@ -358,11 +392,9 @@ There is a period of a power of two and The running time is ~10ns higher for. These 10ns are the mispredicted branches for arrays. The last branch, to be exact.
```c++ -t[0] = -1; // an element that is less than X +t[0] = -1; // an element that is less than x iters = std::__lg(n + 1); -``` -```c++ int lower_bound(int x) { int k = 1; From 0832abc4c58048bb530b8efa108d716b02202f55 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 14 Feb 2022 14:33:07 +0300 Subject: [PATCH 175/531] dequeue -> deque --- content/russian/cs/basic-structures/{dequeue.md => deque.md} | 4 ++-- content/russian/cs/basic-structures/stack-minima.md | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) rename content/russian/cs/basic-structures/{dequeue.md => deque.md} (88%) diff --git a/content/russian/cs/basic-structures/dequeue.md b/content/russian/cs/basic-structures/deque.md similarity index 88% rename from content/russian/cs/basic-structures/dequeue.md rename to content/russian/cs/basic-structures/deque.md index 699cc5f1..fb4f800a 100644 --- a/content/russian/cs/basic-structures/dequeue.md +++ b/content/russian/cs/basic-structures/deque.md @@ -4,11 +4,11 @@ weight: 7 draft: true --- -`dequeue` is a structure that lets you work with both the front and the back +`deque` is a structure that lets you work with both the front and the back at the same time, that is, insertion and removal at both ends ``` C++ -dequeue<T> name; // a deque of type T named name +deque<T> name; // a deque of type T named name name.front(), name.back(); // references to the first and the last element, respectively name.pop_front(), name.pop_back(); // removal of the first and the last element name.push_front(x), name.push_back(x); // insertion of x at the front/back diff --git a/content/russian/cs/basic-structures/stack-minima.md b/content/russian/cs/basic-structures/stack-minima.md index 174f8c3a..3cebaf81 100644 --- a/content/russian/cs/basic-structures/stack-minima.md +++ b/content/russian/cs/basic-structures/stack-minima.md @@ -36,7 +36,7 @@ minima = st.top().second; Let's look at the implementation of the operations described above: -dequeue q; +deque q; Finding the minimum: current_minimum = q.front(); Adding an element: From
1f83831aa8f40fd79f60bcbdff20627f637824a8 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 14 Feb 2022 15:16:08 +0300 Subject: [PATCH 176/531] eytzinger analysis and optimization --- .../hpc/data-structures/binary-search.md | 24 +++++++++++-------- 1 file changed, 14 insertions(+), 10 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index ae977d53..2149696b 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -371,25 +371,25 @@ The performance on large arrays improves 3-4x from the previous version and ~2x ![](../img/search-eytzinger-prefetch.svg) -This trick allows us to overlap 4 requests at a time. We are trading off memory bandwidth for latency. - -Note that the last prefetch is not needed, and may be even outside of the memory region allocated for the program. On most modern CPUs, invalid prefetch instructions get converted into no-ops, but on some platforms this may cause a slowdown. - -Hardware prefetching will fetch its neighbours: +What we essentially do is hide the latency by prefetching 4 steps ahead; if the compute didn't matter, we would expect a ~4x speedup. We can also try to prefetch further than that, and we don't even have to use more prefetch instructions for that — we can request only the first cache line and rely on the hardware to prefetch its neighbors: ```c++ __builtin_prefetch(t + k * 32); ``` +It may or may not improve actual performance — it heavily depends on the hardware. + +Also, note that the last few prefetch requests are actually not needed, and in fact, they may even be outside of the memory region allocated for the program.
On most modern CPUs, invalid prefetch instructions get converted into no-ops, so it isn't a problem, but on some platforms this may cause a slowdown, so it may make sense, for example, to split off the last ~4 iterations from the loop to try to remove them. + ### Removing the Last Branch -The finishing touch. Did you notice the bumpiness of eytzinger search? This isn't random noise — let's zoom in: +Just the finishing touch. Did you notice the bumpiness of Eytzinger search? This isn't random noise — let's zoom in: ![](../img/search-eytzinger-small.svg) -There is a period of a power of two and The running time is ~10ns higher for. +The latency is ~10ns higher for array sizes of the form $1.5 \cdot 2^k$. These are mispredicted branches from the loop itself — the last branch, to be exact. When the array size is far from a power of two, it is hard to predict whether the loop will make $\lfloor \log_2 n \rfloor$ or $\lfloor \log_2 n \rfloor + 1$ iterations, so we have a 50% chance of suffering exactly one branch mispredict. -These 10ns are the mispredicted branches for arrays. The last branch, to be exact. +We can get rid of that last branch by always executing a constant minimum number of iterations and then using predication to optionally make the last comparison against some dummy element that is guaranteed to be less than $x$ and will be canceled: ```c++ t[0] = -1; // an element that is less than x @@ -414,12 +414,16 @@ The graph is now smooth and almost doesn't lose to the branchless binary search ![](../img/search-eytzinger-branchless.svg) -That was a small detour. Let's move on. +But that was a small detour. Let's get back to optimizing for *large* arrays. + +The prefetching technique allows us to read up to 4 elements ahead, but it doesn't really come for free — we are effectively trading off excess memory [bandwidth](/hpc/cpu-cache/bandwidth) for reduced [latency](/hpc/cpu-cache/latency).
If you run more than one instance at a time, or just any other memory-intensive computation in the background, it will significantly [affect](/hpc/cpu-cache/sharing) the performance of the benchmark. -The title of this article doesn't say "binary search". We aren't limited to fetching one element at a time and comparing it. We can do better. +We can do better — instead of fetching 4 cache lines at a time, we could fetch 4 times *fewer* cache lines. ## B-Tree Layout +The title of this article doesn't say "binary search". We aren't limited to fetching one element at a time and comparing it. + B-trees are basically $(k+1)$-ary trees, meaning that they store $k$ elements in each node and choose between $(k+1)$ possible branches instead of 2. They are widely used for indexing in databases, especially those that operate on-disk, because if $k$ is big, this allows large sequential memory accesses while reducing the height of the tree. From 600fd0432960e2ab69064f1a2e1155e260073bd9 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 14 Feb 2022 15:41:08 +0300 Subject: [PATCH 177/531] split binary search in two --- content/english/hpc/data-structures/b-tree.md | 7 + .../hpc/data-structures/binary-search.md | 468 +---------------- content/english/hpc/data-structures/s-tree.md | 475 ++++++++++++++++++ .../english/hpc/data-structures/segment.md | 2 +- 4 files changed, 484 insertions(+), 468 deletions(-) create mode 100644 content/english/hpc/data-structures/b-tree.md create mode 100644 content/english/hpc/data-structures/s-tree.md diff --git a/content/english/hpc/data-structures/b-tree.md b/content/english/hpc/data-structures/b-tree.md new file mode 100644 index 00000000..25440bd0 --- /dev/null +++ b/content/english/hpc/data-structures/b-tree.md @@ -0,0 +1,7 @@ +--- +title: Search Trees +weight: 4 +draft: true +--- + +... 
diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 2149696b..7788e72b 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -420,472 +420,6 @@ The prefetching technique allows us to read up to 4 elements ahead, but it doesn We can do better — instead of fetching 4 cache lines at a time, we could fetch 4 times *fewer* cache lines. -## B-Tree Layout - -The title of this article doesn't say "binary search". We aren't limited to fetching one element at a time and comparing it. - -B-trees are basically $(k+1)$-ary trees, meaning that they store $k$ elements in each node and choose between $(k+1)$ possible branches instead of 2. - -They are widely used for indexing in databases, especially those that operate on-disk, because if $k$ is big, this allows large sequential memory accesses while reducing the height of the tree. - -To perform static binary searches, one can implement a B-tree in an implicit way, i. e. without actually storing any pointers and spending only $O(1)$ additional memory, and $k$ could be made equal to the cache line size so that each node request fetches exactly one cache line. - -![](../img/b-tree.png) - -Turns out, they have the same rate of growth but sligtly larger compute-tied constant. While the latter is explainable (our while loop only has like 5 instructions; can't outpace that), the former is surprising. - -Let's assume that arithmetic costs nothing and do simple cache block analysis: - -* The Eytzinger binary search is supposed to be $4$ times faster if compute didn't matter, as it requests them ~4 times faster on average. - -* The B-tree makes $\frac{\log_{17} n}{\log_2 n} = \frac{\log n}{\log 17} \frac{\log 2}{\log n} = \frac{\log 2}{\log 17} \approx 0.245$ memory access per each request of binary search, i. e. 
it requests ~4 times less cache lines to fetch - -This explains why they have roughly the same slope. - -Note that this method, while being great for single-threaded world, is unlikely to make its way into database and heavy multi-threaded applications, because it sacrifices bandwidth to achieve low latency. - -[Part 2](https://algorithmica.org/en/b-tree) explores efficient implementation of implicit static B-trees in bandwidth-constrained environment. - -### B-tree layout - -B-trees generalize the concept of binary search trees by allowing nodes to have more than two children. - -Instead of single key, a B-tree node contains up to $B$ sorted keys may have up to $(B+1)$ children, thus reducing the tree height in $\frac{\log_2 n}{\log_B n} = \frac{\log B}{\log 2} = \log_2 B$ times. - -They were primarily developed for the purpose of managing on-disk databases, as their random access times are almost the same as reading 1MB of data sequentially, which makes the trade-off between number of comparisons and tree height beneficial. In our implementation, we will make each the size of each block equal to the cache line size, which in case of `int` is 16 elements. - -Normally, a B-tree node also stores $(B+1)$ pointers to its children, but we will only store keys and rely on pointer arithmetic, similar to the one used in Eytzinger array: - -* The root node is numbered $0$. - -* Node $k$ has $(B+1)$ child nodes numbered $\{k \cdot (B+1) + i\}$ for $i \in [1, B]$. - -Keys are stored in a 2d array in non-decreasing order. If the length of the initial array is not a multiple of $B$, the last block is padded with the largest value if its data type. 
- -```c++ -typedef __m256i reg; - -const int B = 16; -const int INF = std::numeric_limits::max(); - -int n; -int nblocks; -int *_a; -int (*btree)[B]; - -int go(int k, int i) { return k * (B + 1) + i + 1; } - -void build(int k = 0) { - static int t = 0; - if (k < nblocks) { - for (int i = 0; i < B; i++) { - build(go(k, i)); - btree[k][i] = (t < n ? _a[t++] : INF); - } - build(go(k, B)); - } -} - -void prepare(int *a, int _n) { - n = _n; - nblocks = (n + B - 1) / B; - _a = a; - btree = (int(*)[16]) std::aligned_alloc(64, 64 * nblocks); - build(); -} - -int cmp(reg x_vec, int* y_ptr) { - reg y_vec = _mm256_load_si256((reg*) y_ptr); - reg mask = _mm256_cmpgt_epi32(x_vec, y_vec); - return _mm256_movemask_ps((__m256) mask); -} - -int lower_bound(int x) { - int k = 0, res = INF; - reg x_vec = _mm256_set1_epi32(x); - while (k < nblocks) { - int mask = ~( - cmp(x_vec, &btree[k][0]) + - (cmp(x_vec, &btree[k][8]) << 8) - ); - int i = __builtin_ffs(mask) - 1; - if (i < B) - res = btree[k][i]; - k = go(k, i); - } - return res; -} -``` - - -We can construct B-tree similarly by traversing the search tree. - -It is correct, because each value of initial array will be copied to a unique position in the resulting array, and the tree height is $\Theta(\log_{B+1} n)$, because $k$ is multiplied by $(B + 1)$ each time a child node is created. - -Note that this approach causes a slight imbalance: "lefter" children may have larger respective ranges. - -So, as we promised before, we will perform all $16$ comparisons to compute the index of the right child node, but we leverage SIMD instructions to do it efficiently. Just to clarify — we want to do something like this: - -```cpp -int mask = (1 << B); - -for (int i = 0; i < B; i++) - mask |= (btree[k][i] >= x) << i; - -int i = __builtin_ffs(mask) - 1; -// now i is the number of the correct child node -``` - - -…but ~8 times faster. 
- -Actually, compiler quite often produces very optimized code that leverages these instructions for certain types of loops. This is called auto-vectorization, and this is the reason why a loop that sums up an array of `short`s is faster (theoretically by a factor of two) than the same loop for `int`s: you can fit more elements on the same 256-bit block. Sadly, this is not our case, as we have loop-carried dependencies. - -The algorithm we will implement: - -1. Somewhere before the main loop, convert $x$ to a vector of $8$ copies of $x$. -2. Load the keys stored in node into another 256-bit vector. -3. Compare these two vectors. This returns a 256-bit mask in which pairs that compared "greater than" are marked with ones. -4. Create a 8-bit mask out of that and return it. Then you can feed it to `__builtin_ffs`. - -This is how it looks using C++ intrinsics, which are basically built-in wrappers for raw assembly instructions: - - -After that, we call this function two times (because our node size / cache line happens to be 512 bits, which is twice as big) and blend these masks together with bitwise operations. - - -That's it. This implementation should outperform even the [state-of-the-art indexes](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) used in high-performance databases, though it's mostly due to the fact that data structures used in real databases have to support fast updates while we don't. - -Note that this implementation is very specific to the architecture. Older CPUs and CPUs on mobile devices don't have 256-bit wide registers and will crash (but they likely have 128-bit SIMD so the loop can still be split in 4 parts instead of 2), non-Intel CPUs have their own instruction sets for SIMD, and some computers even have different cache line size. 
- -![](../img/search-btree.svg) - -### Optimizations - -Enable huge pages: - -```c++ -btree = (int(*)[16]) std::aligned_alloc(2 * 1024 * 1024, 64 * nblocks); -madvise(btree, 64 * nblocks, MADV_HUGEPAGE); -``` - -![](../img/search-btree-hugepages.svg) - -```c++ -constexpr std::pair precalc(int n) { - int s = 0, // total size - l = B, // size of next layer - h = 0; // height so far - while (s + l - B < n) { - s += l; - l *= (B + 1); - h++; - } - int r = (n - s + B - 1) / B; // remaining blocks on last layer - return {h, s / B + (r + B) / (B + 1) * (B + 1)}; -} - -const int height = precalc(N).first, nblocks = precalc(N).second; -int *_a, (*btree)[B]; -``` - -```c++ -unsigned rank(reg x_vec, int* y_ptr) { - reg a = _mm256_load_si256((reg*) y_ptr); - reg b = _mm256_load_si256((reg*) (y_ptr + 8)); - - reg ca = _mm256_cmpgt_epi32(a, x_vec); - reg cb = _mm256_cmpgt_epi32(b, x_vec); - - reg c = _mm256_packs_epi32(ca, cb); - int mask = _mm256_movemask_epi8(c); - - return __tzcnt_u32(mask) >> 1; -} -``` - -`packs` - -Or - - - - - -```c++ -void permute(int *node) { - const reg perm = _mm256_setr_epi32(4, 5, 6, 7, 0, 1, 2, 3); - reg* middle = (reg*) (node + 4); - reg x = _mm256_loadu_si256(middle); - x = _mm256_permutevar8x32_epi32(x, perm); - _mm256_storeu_si256(middle, x); -} -``` - -There are probably faster ways to swap middle elements, but we will leave it here. - -You call `permute(btree[k])` after you've done with constructing a node. - - -```c++ -const int translate[17] = { - 0, 1, 2, 3, - 8, 9, 10, 11, - 4, 5, 6, 7, - 12, 13, 14, 15, - 0 -}; - -void update(int &res, int* node, unsigned i) { - int val = node[translate[i]]; - res = (i < B ? val : res); -} -``` - -```c++ -int lower_bound(int x) { - int k = 0, res = INF; - reg x_vec = _mm256_set1_epi32(x - 1); - for (int h = 0; h < height - 1; h++) { - int *node = btree[k]; - unsigned i = rank(x_vec, node); - k = k * (B + 1) + 1; // remove + 1? 
- update(res, node, i); - k += i; - } - unsigned i = rank(x_vec, btree[k]); - update(res, btree[k], i); - int k2 = go(k, i); - if (go(k, 0) < nblocks) { - unsigned i = rank(x_vec, btree[k2]); - update(res, btree[k2], i); - } - return res; -} -``` - -All that hard work is totally worth it: - -![](../img/search-btree-optimized.svg) - -## B+ Tree Layout - -```c++ -constexpr int blocks(int n) { - return (n + B - 1) / B; -} - -constexpr int prev_keys(int n) { - return (blocks(n) + B) / (B + 1) * B; -} - -constexpr int height(int n) { - return (n <= B ? 1 : height(prev_keys(n)) + 1); -} - -constexpr int offset(int h) { - int k = 0, n = N; - while (h--) { - k += blocks(n) * B; - n = prev_keys(n); - } - return k; -} - -const int H = height(N), S = offset(H); - -int *btree; - -void permute(int *node) { - const reg perm_mask = _mm256_set_epi32(3, 2, 1, 0, 7, 6, 5, 4); - reg* middle = (reg*) (node + 4); - reg x = _mm256_loadu_si256(middle); - x = _mm256_permutevar8x32_epi32(x, perm_mask); - _mm256_storeu_si256(middle, x); -} - -void prepare(int *a, int n) { - const int P = 1 << 21, T = (4 * S + P - 1) / P * P; - btree = (int*) std::aligned_alloc(P, T); - madvise(btree, T, MADV_HUGEPAGE); - - for (int i = N; i < S; i++) - btree[i] = INF; - - memcpy(btree, a, 4 * N); - - for (int h = 1; h < H; h++) { - for (int i = 0; i < offset(h + 1) - offset(h); i++) { - int k = i / B, - j = i - k * B; - k = k * (B + 1) + j + 1; // compare right - // and then always to the left - for (int l = 0; l < h - 1; l++) - k *= (B + 1); - btree[offset(h) + i] = (k * B < N ? 
btree[k * B] : INF); - } - } - - for (int i = offset(1); i < S; i += B) - permute(btree + i); -} - -unsigned direct_rank(reg x, int* y) { - reg a = _mm256_load_si256((reg*) y); - reg b = _mm256_load_si256((reg*) (y + 8)); - - reg ca = _mm256_cmpgt_epi32(a, x); - reg cb = _mm256_cmpgt_epi32(b, x); - - int mb = _mm256_movemask_ps((__m256) cb); - int ma = _mm256_movemask_ps((__m256) ca); - - unsigned mask = (1 << 16); - mask |= mb << 8; - mask |= ma; - - return __tzcnt_u32(mask); -} - -unsigned permuted_rank(reg x, int* y) { - reg a = _mm256_load_si256((reg*) y); - reg b = _mm256_load_si256((reg*) (y + 8)); - - reg ca = _mm256_cmpgt_epi32(a, x); - reg cb = _mm256_cmpgt_epi32(b, x); - - reg c = _mm256_packs_epi32(ca, cb); - unsigned mask = _mm256_movemask_epi8(c); - - return __tzcnt_u32(mask)/* >> 1*/; -} - -int lower_bound(int _x) { - unsigned k = 0; - reg x = _mm256_set1_epi32(_x - 1); - for (int h = H - 1; h > 0; h--) { - unsigned i = permuted_rank(x, btree + offset(h) + k); - - //k /= B; - //k *= (B + 1) * B; - // k += (i << 3); - - k = k * (B + 1) + (i << 3); - - //if (N > (1 << 21) && h == 1) - // __builtin_prefetch(btree + k); - - //k += (i << 3); - } - unsigned i = direct_rank(x, btree + k); - return btree[k + i]; -} -``` - -![](../img/search-bplus.svg) - -Makes more sense to look at it as a relative speedup: - -![](../img/search-relative.svg) - -### Measuring Actual Latency - -One huge asterisk we didn't disclosed. 
- -```c++ -for (int i = 0; i < m; i++) - checksum ^= lower_bound(q[i]); -``` - -To measure *actual* latency, we need to introduce a dependency between the iterations, so that the next one can't start before the previous finishes: - -```c++ -int last = 0; - -for (int i = 0; i < m; i++) { - last = lower_bound(q[i] ^ last); - checksum ^= last; -} -``` - -![](../img/search-relative-latency.svg) - - -### Modifications - -```c++ -void permute32(int *node) { - // a b c d 1 2 3 4 -> (a c) (b d) (1 3) (2 4) -> (a c) (1 3) (b d) (2 4) - reg x = _mm256_load_si256((reg*) (node + 8)); - reg y = _mm256_load_si256((reg*) (node + 16)); - _mm256_storeu_si256((reg*) (node + 8), y); - _mm256_storeu_si256((reg*) (node + 16), x); - permute16(node); - permute16(node + 16); -} -``` - -```c++ -unsigned cmp(reg x, int *node) { - reg y = _mm256_load_si256((reg*) node); - reg mask = _mm256_cmpgt_epi32(y, x); - return _mm256_movemask_ps((__m256) mask); -} - -unsigned rank32(reg x, int *node) { - unsigned mask = cmp(x, node) - | (cmp(x, node + 8) << 8) - | (cmp(x, node + 16) << 16) - | (cmp(x, node + 24) << 24); -``` - -```c++ -unsigned permuted_rank32(reg x, int *node) { - reg a = _mm256_load_si256((reg*) node); - reg b = _mm256_load_si256((reg*) (node + 8)); - reg c = _mm256_load_si256((reg*) (node + 16)); - reg d = _mm256_load_si256((reg*) (node + 24)); - - reg ca = _mm256_cmpgt_epi32(a, x); - reg cb = _mm256_cmpgt_epi32(b, x); - reg cc = _mm256_cmpgt_epi32(c, x); - reg cd = _mm256_cmpgt_epi32(d, x); - - reg cab = _mm256_packs_epi32(ca, cb); - reg ccd = _mm256_packs_epi32(cc, cd); - reg cabcd = _mm256_packs_epi16(cab, ccd); - unsigned mask = _mm256_movemask_epi8(cabcd); - - return __tzcnt_u32(mask); -} -``` - -Another idea is to use cache more efficiently. For example, you can execute `_mm256_stream_load_si256` on just the last iteration. 
- -They aren't beneficial for throughput: - -![](../img/search-bplus-other.svg) - -However, they perform better: - -![](../img/search-latency-bplus.svg) - -## Conclusions - -![](../img/search-all.svg) - ## Acknowledgements -The first half of the article is loosely based on "[Array Layouts for Comparison-Based Searching](https://arxiv.org/pdf/1509.05053.pdf)" by Paul-Virak Khuong and Pat Morin. It is 46 pages long, and discusses the scalar binary searches in more details, so check it out if you're interested in other approaches. - -This [StackOverflow answer](https://stackoverflow.com/questions/20616605/using-simd-avx-sse-for-tree-traversal) by Cory Nelson is where I took the permuted SIMD routine. - -The more you think about the name. "S-tree" and "S+ tree" respectively. There is a an obscure data structures in computer vision. We even have more claim to it than Boer had on B-tree: it is succinct, simd, my name, my surname. - - +The article is loosely based on "[Array Layouts for Comparison-Based Searching](https://arxiv.org/pdf/1509.05053.pdf)" by Paul-Virak Khuong and Pat Morin. It is 46 pages long, and discusses the scalar binary searches in more details, so check it out if you're interested in other approaches. diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md new file mode 100644 index 00000000..eda00ec2 --- /dev/null +++ b/content/english/hpc/data-structures/s-tree.md @@ -0,0 +1,475 @@ +--- +title: Static B-Trees +weight: 2 +draft: true +--- + +## B-Tree Layout + +Attentive readers could notice that the title of this article doesn't say "binary search". + +The title of this article doesn't say "binary search". We aren't limited to fetching one element at a time and comparing it. + +B-trees are basically $(k+1)$-ary trees, meaning that they store $k$ elements in each node and choose between $(k+1)$ possible branches instead of 2. 
+ +They are widely used for indexing in databases, especially those that operate on-disk, because if $k$ is big, this allows large sequential memory accesses while reducing the height of the tree. + +To perform static binary searches, one can implement a B-tree in an implicit way, i. e. without actually storing any pointers and spending only $O(1)$ additional memory, and $k$ could be made equal to the cache line size so that each node request fetches exactly one cache line. + +![](../img/b-tree.png) + +Turns out, they have the same rate of growth but sligtly larger compute-tied constant. While the latter is explainable (our while loop only has like 5 instructions; can't outpace that), the former is surprising. + +Let's assume that arithmetic costs nothing and do simple cache block analysis: + +* The Eytzinger binary search is supposed to be $4$ times faster if compute didn't matter, as it requests them ~4 times faster on average. + +* The B-tree makes $\frac{\log_{17} n}{\log_2 n} = \frac{\log n}{\log 17} \frac{\log 2}{\log n} = \frac{\log 2}{\log 17} \approx 0.245$ memory access per each request of binary search, i. e. it requests ~4 times less cache lines to fetch + +This explains why they have roughly the same slope. + +Note that this method, while being great for single-threaded world, is unlikely to make its way into database and heavy multi-threaded applications, because it sacrifices bandwidth to achieve low latency. + +[Part 2](https://algorithmica.org/en/b-tree) explores efficient implementation of implicit static B-trees in bandwidth-constrained environment. + +### B-tree layout + +B-trees generalize the concept of binary search trees by allowing nodes to have more than two children. + +Instead of single key, a B-tree node contains up to $B$ sorted keys may have up to $(B+1)$ children, thus reducing the tree height in $\frac{\log_2 n}{\log_B n} = \frac{\log B}{\log 2} = \log_2 B$ times. 
+ +They were primarily developed for the purpose of managing on-disk databases, as their random access times are almost the same as reading 1MB of data sequentially, which makes the trade-off between number of comparisons and tree height beneficial. In our implementation, we will make each the size of each block equal to the cache line size, which in case of `int` is 16 elements. + +Normally, a B-tree node also stores $(B+1)$ pointers to its children, but we will only store keys and rely on pointer arithmetic, similar to the one used in Eytzinger array: + +* The root node is numbered $0$. + +* Node $k$ has $(B+1)$ child nodes numbered $\{k \cdot (B+1) + i\}$ for $i \in [1, B]$. + +Keys are stored in a 2d array in non-decreasing order. If the length of the initial array is not a multiple of $B$, the last block is padded with the largest value if its data type. + +```c++ +typedef __m256i reg; + +const int B = 16; +const int INF = std::numeric_limits::max(); + +int n; +int nblocks; +int *_a; +int (*btree)[B]; + +int go(int k, int i) { return k * (B + 1) + i + 1; } + +void build(int k = 0) { + static int t = 0; + if (k < nblocks) { + for (int i = 0; i < B; i++) { + build(go(k, i)); + btree[k][i] = (t < n ? _a[t++] : INF); + } + build(go(k, B)); + } +} + +void prepare(int *a, int _n) { + n = _n; + nblocks = (n + B - 1) / B; + _a = a; + btree = (int(*)[16]) std::aligned_alloc(64, 64 * nblocks); + build(); +} + +int cmp(reg x_vec, int* y_ptr) { + reg y_vec = _mm256_load_si256((reg*) y_ptr); + reg mask = _mm256_cmpgt_epi32(x_vec, y_vec); + return _mm256_movemask_ps((__m256) mask); +} + +int lower_bound(int x) { + int k = 0, res = INF; + reg x_vec = _mm256_set1_epi32(x); + while (k < nblocks) { + int mask = ~( + cmp(x_vec, &btree[k][0]) + + (cmp(x_vec, &btree[k][8]) << 8) + ); + int i = __builtin_ffs(mask) - 1; + if (i < B) + res = btree[k][i]; + k = go(k, i); + } + return res; +} +``` + + +We can construct B-tree similarly by traversing the search tree. 
+ +It is correct, because each value of initial array will be copied to a unique position in the resulting array, and the tree height is $\Theta(\log_{B+1} n)$, because $k$ is multiplied by $(B + 1)$ each time a child node is created. + +Note that this approach causes a slight imbalance: "lefter" children may have larger respective ranges. + +So, as we promised before, we will perform all $16$ comparisons to compute the index of the right child node, but we leverage SIMD instructions to do it efficiently. Just to clarify — we want to do something like this: + +```cpp +int mask = (1 << B); + +for (int i = 0; i < B; i++) + mask |= (btree[k][i] >= x) << i; + +int i = __builtin_ffs(mask) - 1; +// now i is the number of the correct child node +``` + + +…but ~8 times faster. + +Actually, compiler quite often produces very optimized code that leverages these instructions for certain types of loops. This is called auto-vectorization, and this is the reason why a loop that sums up an array of `short`s is faster (theoretically by a factor of two) than the same loop for `int`s: you can fit more elements on the same 256-bit block. Sadly, this is not our case, as we have loop-carried dependencies. + +The algorithm we will implement: + +1. Somewhere before the main loop, convert $x$ to a vector of $8$ copies of $x$. +2. Load the keys stored in node into another 256-bit vector. +3. Compare these two vectors. This returns a 256-bit mask in which pairs that compared "greater than" are marked with ones. +4. Create a 8-bit mask out of that and return it. Then you can feed it to `__builtin_ffs`. + +This is how it looks using C++ intrinsics, which are basically built-in wrappers for raw assembly instructions: + + +After that, we call this function two times (because our node size / cache line happens to be 512 bits, which is twice as big) and blend these masks together with bitwise operations. + + +That's it. 
This implementation should outperform even the [state-of-the-art indexes](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) used in high-performance databases, though it's mostly due to the fact that data structures used in real databases have to support fast updates while we don't. + +Note that this implementation is very specific to the architecture. Older CPUs and CPUs on mobile devices don't have 256-bit wide registers and will crash (but they likely have 128-bit SIMD so the loop can still be split in 4 parts instead of 2), non-Intel CPUs have their own instruction sets for SIMD, and some computers even have different cache line size. + +![](../img/search-btree.svg) + +### Optimizations + +Enable huge pages: + +```c++ +btree = (int(*)[16]) std::aligned_alloc(2 * 1024 * 1024, 64 * nblocks); +madvise(btree, 64 * nblocks, MADV_HUGEPAGE); +``` + +![](../img/search-btree-hugepages.svg) + +```c++ +constexpr std::pair precalc(int n) { + int s = 0, // total size + l = B, // size of next layer + h = 0; // height so far + while (s + l - B < n) { + s += l; + l *= (B + 1); + h++; + } + int r = (n - s + B - 1) / B; // remaining blocks on last layer + return {h, s / B + (r + B) / (B + 1) * (B + 1)}; +} + +const int height = precalc(N).first, nblocks = precalc(N).second; +int *_a, (*btree)[B]; +``` + +```c++ +unsigned rank(reg x_vec, int* y_ptr) { + reg a = _mm256_load_si256((reg*) y_ptr); + reg b = _mm256_load_si256((reg*) (y_ptr + 8)); + + reg ca = _mm256_cmpgt_epi32(a, x_vec); + reg cb = _mm256_cmpgt_epi32(b, x_vec); + + reg c = _mm256_packs_epi32(ca, cb); + int mask = _mm256_movemask_epi8(c); + + return __tzcnt_u32(mask) >> 1; +} +``` + +`packs` + +Or + + + + + +```c++ +void permute(int *node) { + const reg perm = _mm256_setr_epi32(4, 5, 6, 7, 0, 1, 2, 3); + reg* middle = (reg*) (node + 4); + reg x = _mm256_loadu_si256(middle); + x = _mm256_permutevar8x32_epi32(x, perm); + _mm256_storeu_si256(middle, x); +} +``` + +There are probably faster ways to swap middle elements, but we 
will leave it here. + +You call `permute(btree[k])` after you've done with constructing a node. + + +```c++ +const int translate[17] = { + 0, 1, 2, 3, + 8, 9, 10, 11, + 4, 5, 6, 7, + 12, 13, 14, 15, + 0 +}; + +void update(int &res, int* node, unsigned i) { + int val = node[translate[i]]; + res = (i < B ? val : res); +} +``` + +```c++ +int lower_bound(int x) { + int k = 0, res = INF; + reg x_vec = _mm256_set1_epi32(x - 1); + for (int h = 0; h < height - 1; h++) { + int *node = btree[k]; + unsigned i = rank(x_vec, node); + k = k * (B + 1) + 1; // remove + 1? + update(res, node, i); + k += i; + } + unsigned i = rank(x_vec, btree[k]); + update(res, btree[k], i); + int k2 = go(k, i); + if (go(k, 0) < nblocks) { + unsigned i = rank(x_vec, btree[k2]); + update(res, btree[k2], i); + } + return res; +} +``` + +All that hard work is totally worth it: + +![](../img/search-btree-optimized.svg) + +## B+ Tree Layout + +```c++ +constexpr int blocks(int n) { + return (n + B - 1) / B; +} + +constexpr int prev_keys(int n) { + return (blocks(n) + B) / (B + 1) * B; +} + +constexpr int height(int n) { + return (n <= B ? 
1 : height(prev_keys(n)) + 1); +} + +constexpr int offset(int h) { + int k = 0, n = N; + while (h--) { + k += blocks(n) * B; + n = prev_keys(n); + } + return k; +} + +const int H = height(N), S = offset(H); + +int *btree; + +void permute(int *node) { + const reg perm_mask = _mm256_set_epi32(3, 2, 1, 0, 7, 6, 5, 4); + reg* middle = (reg*) (node + 4); + reg x = _mm256_loadu_si256(middle); + x = _mm256_permutevar8x32_epi32(x, perm_mask); + _mm256_storeu_si256(middle, x); +} + +void prepare(int *a, int n) { + const int P = 1 << 21, T = (4 * S + P - 1) / P * P; + btree = (int*) std::aligned_alloc(P, T); + madvise(btree, T, MADV_HUGEPAGE); + + for (int i = N; i < S; i++) + btree[i] = INF; + + memcpy(btree, a, 4 * N); + + for (int h = 1; h < H; h++) { + for (int i = 0; i < offset(h + 1) - offset(h); i++) { + int k = i / B, + j = i - k * B; + k = k * (B + 1) + j + 1; // compare right + // and then always to the left + for (int l = 0; l < h - 1; l++) + k *= (B + 1); + btree[offset(h) + i] = (k * B < N ? 
btree[k * B] : INF); + } + } + + for (int i = offset(1); i < S; i += B) + permute(btree + i); +} + +unsigned direct_rank(reg x, int* y) { + reg a = _mm256_load_si256((reg*) y); + reg b = _mm256_load_si256((reg*) (y + 8)); + + reg ca = _mm256_cmpgt_epi32(a, x); + reg cb = _mm256_cmpgt_epi32(b, x); + + int mb = _mm256_movemask_ps((__m256) cb); + int ma = _mm256_movemask_ps((__m256) ca); + + unsigned mask = (1 << 16); + mask |= mb << 8; + mask |= ma; + + return __tzcnt_u32(mask); +} + +unsigned permuted_rank(reg x, int* y) { + reg a = _mm256_load_si256((reg*) y); + reg b = _mm256_load_si256((reg*) (y + 8)); + + reg ca = _mm256_cmpgt_epi32(a, x); + reg cb = _mm256_cmpgt_epi32(b, x); + + reg c = _mm256_packs_epi32(ca, cb); + unsigned mask = _mm256_movemask_epi8(c); + + return __tzcnt_u32(mask)/* >> 1*/; +} + +int lower_bound(int _x) { + unsigned k = 0; + reg x = _mm256_set1_epi32(_x - 1); + for (int h = H - 1; h > 0; h--) { + unsigned i = permuted_rank(x, btree + offset(h) + k); + + //k /= B; + //k *= (B + 1) * B; + // k += (i << 3); + + k = k * (B + 1) + (i << 3); + + //if (N > (1 << 21) && h == 1) + // __builtin_prefetch(btree + k); + + //k += (i << 3); + } + unsigned i = direct_rank(x, btree + k); + return btree[k + i]; +} +``` + +![](../img/search-bplus.svg) + +Makes more sense to look at it as a relative speedup: + +![](../img/search-relative.svg) + +### Measuring Actual Latency + +One huge asterisk we didn't disclosed. 
+ +```c++ +for (int i = 0; i < m; i++) + checksum ^= lower_bound(q[i]); +``` + +To measure *actual* latency, we need to introduce a dependency between the iterations, so that the next one can't start before the previous finishes: + +```c++ +int last = 0; + +for (int i = 0; i < m; i++) { + last = lower_bound(q[i] ^ last); + checksum ^= last; +} +``` + +![](../img/search-relative-latency.svg) + + +### Modifications + +```c++ +void permute32(int *node) { + // a b c d 1 2 3 4 -> (a c) (b d) (1 3) (2 4) -> (a c) (1 3) (b d) (2 4) + reg x = _mm256_load_si256((reg*) (node + 8)); + reg y = _mm256_load_si256((reg*) (node + 16)); + _mm256_storeu_si256((reg*) (node + 8), y); + _mm256_storeu_si256((reg*) (node + 16), x); + permute16(node); + permute16(node + 16); +} +``` + +```c++ +unsigned cmp(reg x, int *node) { + reg y = _mm256_load_si256((reg*) node); + reg mask = _mm256_cmpgt_epi32(y, x); + return _mm256_movemask_ps((__m256) mask); +} + +unsigned rank32(reg x, int *node) { + unsigned mask = cmp(x, node) + | (cmp(x, node + 8) << 8) + | (cmp(x, node + 16) << 16) + | (cmp(x, node + 24) << 24); +``` + +```c++ +unsigned permuted_rank32(reg x, int *node) { + reg a = _mm256_load_si256((reg*) node); + reg b = _mm256_load_si256((reg*) (node + 8)); + reg c = _mm256_load_si256((reg*) (node + 16)); + reg d = _mm256_load_si256((reg*) (node + 24)); + + reg ca = _mm256_cmpgt_epi32(a, x); + reg cb = _mm256_cmpgt_epi32(b, x); + reg cc = _mm256_cmpgt_epi32(c, x); + reg cd = _mm256_cmpgt_epi32(d, x); + + reg cab = _mm256_packs_epi32(ca, cb); + reg ccd = _mm256_packs_epi32(cc, cd); + reg cabcd = _mm256_packs_epi16(cab, ccd); + unsigned mask = _mm256_movemask_epi8(cabcd); + + return __tzcnt_u32(mask); +} +``` + +Another idea is to use cache more efficiently. For example, you can execute `_mm256_stream_load_si256` on just the last iteration. 
+ +They aren't beneficial for throughput: + +![](../img/search-bplus-other.svg) + +However, they perform better: + +![](../img/search-latency-bplus.svg) + +## Conclusions + +![](../img/search-all.svg) + +This [StackOverflow answer](https://stackoverflow.com/questions/20616605/using-simd-avx-sse-for-tree-traversal) by Cory Nelson is where I took the permuted SIMD routine. + +The more you think about the name. "S-tree" and "S+ tree" respectively. There is a an obscure data structures in computer vision. We even have more claim to it than Boer had on B-tree: it is succinct, static, simd, my name, my surname. + + + + diff --git a/content/english/hpc/data-structures/segment.md b/content/english/hpc/data-structures/segment.md index 92c2afa0..f529bf8c 100644 --- a/content/english/hpc/data-structures/segment.md +++ b/content/english/hpc/data-structures/segment.md @@ -1,6 +1,6 @@ --- title: Segment Trees -weight: 2 +weight: 3 draft: true --- From 02e6d02980f50b52e49cbfd4bc84f7e98958275a Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 14 Feb 2022 15:43:32 +0300 Subject: [PATCH 178/531] update hpc toc --- content/english/hpc/_index.md | 13 +++++++------ .../english/hpc/data-structures/binary-search.md | 2 +- 2 files changed, 8 insertions(+), 7 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index 0c3e6f13..e2ab8f6f 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -127,12 +127,13 @@ Planned table of contents: 11.14. Matrix Multiplication 12. Data Structure Case Studies 12.1. Binary Search - 12.2. Segment Trees -(12.3. B-Trees) -(12.4. Range Minimum Query) - 12.5. Hash Tables -(12.6. Bitmaps) -(12.7. Probabilistic Filters) + 12.2. Static B-Trees + 12.3. Segment Trees +(12.4. Search Trees) +(12.5. Range Minimum Query) + 12.6. Hash Tables +(12.7. Bitmaps) +(12.8. 
Probabilistic Filters) ``` Among cool things that we will speed up: diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 7788e72b..dbc64ae0 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -1,5 +1,5 @@ --- -title: Searching in Sorted Arrays +title: Binary Search weight: 1 --- From 1a4eaf2e42a703c892eee96f3e366759f0c92684 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 14 Feb 2022 16:09:28 +0300 Subject: [PATCH 179/531] s-tree intro --- .../hpc/data-structures/binary-search.md | 8 +++--- .../hpc/data-structures/img/b-tree.jpg | Bin 0 -> 24951 bytes content/english/hpc/data-structures/s-tree.md | 25 +++++++++--------- .../{segment.md => segment-trees.md} | 0 4 files changed, 17 insertions(+), 16 deletions(-) create mode 100644 content/english/hpc/data-structures/img/b-tree.jpg rename content/english/hpc/data-structures/{segment.md => segment-trees.md} (100%) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index dbc64ae0..dfebcef3 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -13,10 +13,6 @@ In this article, we focus on such fundamental algorithm — binary search — an - *Branchless* binary search that is up to 3x faster on *small* arrays and can act as a drop-in replacement to `std::lower_bound`. - *Eytzinger* binary search that rearranges the elements of a sorted array in a cache-friendly way of is also 3x faster on small array and 2x faster on RAM-backed arrays. -- *S-tree*: an approach based on the implicit (pointer-free) B-layout accelerated with SIMD operations to perform search efficiently while using less memory bandwidth and is ~8x faster on small arrays and 5x faster on large arrays. 
-- *S+ tree*: an approach similarly based on the B+ layout and achieves up to 15x faster for small arrays and ~7x faster on large arrays. Uses 6-7% of the array memory. - -The last two approaches use SIMD, which technically disqualifies it from being binary search. This is technically not a drop-in replacement, since it requires some preprocessing, but I can't recall a lot of scenarios where you obtain a sorted array but can't spend linear time on preprocessing. But otherwise they are effectively drop-in replacements to `std::lower_bound`. It performs slightly worse on array sizes that fit lower layers of cache, but in low-bandwidth environments it can be up to 3x faster (or 7x faster than `std::lower_bound`). GCC sucked on all benchmarks, so we will mostly be using Clang (10.0). The CPU is a Zen 2, although the results should be transferrable to other platforms, including most Arm-based chips. @@ -420,6 +416,10 @@ The prefetching technique allows us to read up to 4 elements ahead, but it doesn We can do better — instead of fetching 4 cache lines at a time, we could fetch 4 times *fewer* cache lines. +Note that this method, while being great for single-threaded world, is unlikely to make its way into database and heavy multi-threaded applications, because it sacrifices bandwidth to achieve low latency. + +[Part 2](https://algorithmica.org/en/b-tree) explores efficient implementation of implicit static B-trees in bandwidth-constrained environment. + ## Acknowledgements The article is loosely based on "[Array Layouts for Comparison-Based Searching](https://arxiv.org/pdf/1509.05053.pdf)" by Paul-Virak Khuong and Pat Morin. It is 46 pages long, and discusses the scalar binary searches in more details, so check it out if you're interested in other approaches. 
diff --git a/content/english/hpc/data-structures/img/b-tree.jpg b/content/english/hpc/data-structures/img/b-tree.jpg new file mode 100644 index 0000000000000000000000000000000000000000..c0c2117cc4f6f014006044dda6901f3eea33c103 GIT binary patch literal 24951 [base85-encoded image payload omitted] literal 0 HcmV?d00001 diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index eda00ec2..098c3341 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -4,11 +4,18 @@ weight: 2 draft: true --- -## B-Tree Layout +This is a follow up on the
[previous article](../binary-search) about optimizing binary search. For more context, go there first. + +- *S-tree*: an approach based on the implicit (pointer-free) B-layout accelerated with SIMD operations to perform search efficiently while using less memory bandwidth and is ~8x faster on small arrays and 5x faster on large arrays. +- *S+ tree*: an approach similarly based on the B+ layout and achieves up to 15x faster for small arrays and ~7x faster on large arrays. Uses 6-7% of the array memory. + +The last two approaches use SIMD, which technically disqualifies it from being binary search. This is technically not a drop-in replacement, since it requires some preprocessing, but I can't recall a lot of scenarios where you obtain a sorted array but can't spend linear time on preprocessing. But otherwise they are effectively drop-in replacements to `std::lower_bound`. -Attentive readers could notice that the title of this article doesn't say "binary search". +The more you think about the name. "S-tree" and "S+ tree" respectively. There is a an obscure data structures in computer vision. We even have more claim to it than Boer had on B-tree: it is succinct, static, simd, my name, my surname. + +## B-Tree Layout -The title of this article doesn't say "binary search". We aren't limited to fetching one element at a time and comparing it. +We aren't limited to fetching one element at a time and comparing it. B-trees are basically $(k+1)$-ary trees, meaning that they store $k$ elements in each node and choose between $(k+1)$ possible branches instead of 2. @@ -16,9 +23,7 @@ They are widely used for indexing in databases, especially those that operate on To perform static binary searches, one can implement a B-tree in an implicit way, i. e. without actually storing any pointers and spending only $O(1)$ additional memory, and $k$ could be made equal to the cache line size so that each node request fetches exactly one cache line. 
-![](../img/b-tree.png) - -Turns out, they have the same rate of growth but sligtly larger compute-tied constant. While the latter is explainable (our while loop only has like 5 instructions; can't outpace that), the former is surprising. +![A B-tree of order 4](../img/b-tree.jpg) Let's assume that arithmetic costs nothing and do simple cache block analysis: @@ -28,10 +33,6 @@ Let's assume that arithmetic costs nothing and do simple cache block analysis: This explains why they have roughly the same slope. -Note that this method, while being great for single-threaded world, is unlikely to make its way into database and heavy multi-threaded applications, because it sacrifices bandwidth to achieve low latency. - -[Part 2](https://algorithmica.org/en/b-tree) explores efficient implementation of implicit static B-trees in bandwidth-constrained environment. - ### B-tree layout B-trees generalize the concept of binary search trees by allowing nodes to have more than two children. @@ -462,9 +463,9 @@ However, they perform better: ![](../img/search-all.svg) -This [StackOverflow answer](https://stackoverflow.com/questions/20616605/using-simd-avx-sse-for-tree-traversal) by Cory Nelson is where I took the permuted SIMD routine. +### Acknowledgements -The more you think about the name. "S-tree" and "S+ tree" respectively. There is a an obscure data structures in computer vision. We even have more claim to it than Boer had on B-tree: it is succinct, static, simd, my name, my surname. +This [StackOverflow answer](https://stackoverflow.com/questions/20616605/using-simd-avx-sse-for-tree-traversal) by Cory Nelson is where I took the permuted SIMD routine. 
From bd3a2d039c674fd366c2774264fb47ba5a3b3ef6 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Mon, 14 Feb 2022 21:28:04 +0300
Subject: [PATCH 182/531] note about s-trees with pointers
---
 .../img/search-set-relative-all.svg | 1444 +++++++++++++++++
 .../img/search-set-relative.svg | 1116 +++++++++++++
 content/english/hpc/data-structures/s-tree.md | 10 +
 3 files changed, 2570 insertions(+)
 create mode 100644 content/english/hpc/data-structures/img/search-set-relative-all.svg
 create mode 100644 content/english/hpc/data-structures/img/search-set-relative.svg
diff --git a/content/english/hpc/data-structures/img/search-set-relative-all.svg b/content/english/hpc/data-structures/img/search-set-relative-all.svg
new file mode 100644
index 00000000..73cbb186
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-set-relative-all.svg
@@ -0,0 +1,1444 @@
[1444 added lines of SVG markup omitted]
diff --git a/content/english/hpc/data-structures/img/search-set-relative.svg b/content/english/hpc/data-structures/img/search-set-relative.svg
new file mode 100644
index 00000000..bea8fc61
--- /dev/null
+++ b/content/english/hpc/data-structures/img/search-set-relative.svg
@@ -0,0 +1,1116 @@
[1116 added lines of SVG markup omitted]
diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md
index 098c3341..25b5fe83 100644
--- a/content/english/hpc/data-structures/s-tree.md
+++ b/content/english/hpc/data-structures/s-tree.md
@@ -463,6 +463,16 @@ However, they perform better:

 ![](../img/search-all.svg)

+It may or may not be beneficial to reverse the order in which layers are stored. I only implemented right-to-left because that was easier to code.
+
+My next priorities is to adapt it to segment trees, which I know how to do, and to B-trees, which I don't exactly know how to do. But comparing to `std::set` hints that there may be up to 30x improvements:
+
+![](../img/search-set-relative-all.svg)
+
+A ~15x improvement is definitely worth it — and the memory overhead is not large, as we only need to store pointers (indices, actually) for internal nodes. It may be higher, because we need to fetch two separate memory blocks, or lower, because we need to handle updates somehow. Either way, this will be an interesting optimization problem.
+
+The problem has more dimensions.
+ ### Acknowledgements This [StackOverflow answer](https://stackoverflow.com/questions/20616605/using-simd-avx-sse-for-tree-traversal) by Cory Nelson is where I took the permuted SIMD routine. From 93df4256d8b03590a7e6257a4d92a97ff908957a Mon Sep 17 00:00:00 2001 From: mode-six <97373506+mode-six@users.noreply.github.com> Date: Mon, 14 Feb 2022 19:48:02 +0000 Subject: [PATCH 183/531] fix typo --- content/english/hpc/pipelining/branchless.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index fab426a4..72b3e37f 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -90,7 +90,7 @@ This way you can eliminate branching, but this comes at the cost of evaluating * ### When It Is Beneficial -Using predication eliminates [a structural hazard](../hazard), but introduces a data hazard. These is still a pipeline stall, but it is a cheaper one: you only need to wait for `cmov` to be resolved, and not flush the entire pipeline in case of a mispredict. +Using predication eliminates [a structural hazard](../hazard), but introduces a data hazard. There is still a pipeline stall, but it is a cheaper one: you only need to wait for `cmov` to be resolved, and not flush the entire pipeline in case of a mispredict. However, there are many situations when it is more efficient to leave branchy code as it is. This is the case when the cost of computing *both* branches instead of just *one* outweighs the penalty for the potential branch mispredictions. 
From 7b6a86ad2154d4698fb26f3bf2243e825c70f2e9 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 15 Feb 2022 05:52:25 +0300 Subject: [PATCH 184/531] memory-vs-compute time graph --- content/english/hpc/external-memory/_index.md | 2 ++ .../external-memory/img/memory-vs-compute.png | Bin 0 -> 20219 bytes 2 files changed, 2 insertions(+) create mode 100644 content/english/hpc/external-memory/img/memory-vs-compute.png diff --git a/content/english/hpc/external-memory/_index.md b/content/english/hpc/external-memory/_index.md index d1b25df6..d7c1612c 100644 --- a/content/english/hpc/external-memory/_index.md +++ b/content/english/hpc/external-memory/_index.md @@ -29,6 +29,8 @@ Therefore, the only correct answer to this question is "it depends" — primaril Such high variance of memory performance is caused by the fact that memory hardware doesn't follow the same [laws of silicon scaling](/hpc/complexity/hardware) as CPU chips do. Memory is still improving through other means, but if 50 years ago memory timings were roughly on the same scale with the instruction latencies, nowadays they lag far behind. +![](img/memory-vs-compute.png) + To be less of a limiting factor, modern memory systems are becoming increasingly [hierarchical](hierarchy), where the higher layers trade off some of their capacity for reduced latency. As these characteristics may change in the orders of magnitude between the layers — especially in the case of external memory types — it became crucial for many memory-intensive algorithms to optimize their I/O operations before anything else. This prompted the creation of a new cost model, called the *external memory model*, whose only primitive operations are block reads and writes, and everything else has zero cost as long as it only involves data stored in a limited-sized local memory. It spawned an exciting new field of *external memory algorithms*, which we will study in this chapter. 
diff --git a/content/english/hpc/external-memory/img/memory-vs-compute.png b/content/english/hpc/external-memory/img/memory-vs-compute.png
new file mode 100644
index 0000000000000000000000000000000000000000..6f55644030d85a2f1d2220107c22b23e41664e2a
GIT binary patch
literal 20219
[base85-encoded image data omitted]

Date: Tue, 15 Feb 2022 06:56:32 +0300
Subject: [PATCH 185/531] note about binary search and cache associativity
---
 content/english/hpc/cpu-cache/associativity.md | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/content/english/hpc/cpu-cache/associativity.md b/content/english/hpc/cpu-cache/associativity.md
index 366431cc..fb5bf37c 100644
--- a/content/english/hpc/cpu-cache/associativity.md
+++ b/content/english/hpc/cpu-cache/associativity.md
@@ -78,7 +78,7 @@ This makes the cache system simpler and cheaper to implement but also susceptibl

 Now, where were we? Oh, yes: the reason why iteration with strides of 256 causes such a terrible slowdown.

-When we jump over 256 integers, the pointer always increments by $1024 = 2^{10}$, and the last 10 bits remain the same. Since the cache system uses the lower 6 bits for the offset and the next 12 for the cache line index, we are essentially using just $2^{12 - (10 - 6)} = 2^8$ different sets in the L3 cache instead of $2^{12}$, which has the effect of shrinking our L3 cache by a factor of $2^4 = 16$. The array stops fitting into the L3 cache ($N=2^{21}$) spills into the order-of-magnitude slower RAM, which causes the performance to decrease.
+When we jump over 256 integers, the pointer always increments by $1024 = 2^{10}$, and the last 10 bits remain the same.
Since the cache system uses the lower 6 bits for the offset and the next 12 for the cache line index, we are essentially using just $2^{12 - (10 - 6)} = 2^8$ different sets in the L3 cache instead of $2^{12}$, which has the effect of shrinking our L3 cache by a factor of $2^4 = 16$. The array stops fitting into the L3 cache ($N=2^{21}$) and spills into the order-of-magnitude slower RAM, which causes the performance to decrease. From 484fae0fd8c30411130053585431326addc744e0 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 15 Feb 2022 06:59:38 +0300 Subject: [PATCH 186/531] change wording --- content/english/hpc/cpu-cache/associativity.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/cpu-cache/associativity.md b/content/english/hpc/cpu-cache/associativity.md index fb5bf37c..2ccc1783 100644 --- a/content/english/hpc/cpu-cache/associativity.md +++ b/content/english/hpc/cpu-cache/associativity.md @@ -100,7 +100,7 @@ Performance issues caused by cache associativity effects arise with remarkable f - It is the smallest integer exponent, so using the sequence of increasing powers of two as problem sizes are a popular choice when benchmarking memory-bound algorithms. - Also, more natural powers of ten are by transitivity divisible by a slightly lower power of two. -This especially often applies to implicit data structures that use a fixed memory layout. For example, binary searching over arrays of size $2^{20}$ takes about ~360ns per query while searching over arrays of size $(2^{20} + 123)$ takes ~300ns. When the array size is a multiple of a large power of two, then the indices of the elements that we request on the first dozen or so iterations will also be divisible by some large powers of two — and map to the same cache line, kicking each other out and causing a ~20% performance decrease. +This especially often applies to implicit data structures that use a fixed memory layout. 
For example, binary searching over arrays of size $2^{20}$ takes about ~360ns per query while searching over arrays of size $(2^{20} + 123)$ takes ~300ns. When the array size is a multiple of a large power of two, then the indices of the "hottest" elements, the ones we likely request on the first dozen or so iterations, will also be divisible by some large powers of two and map to the same cache line — kicking each other out and causing a ~20% performance decrease.

 Luckily, such issues are more of an anomaly rather than serious problems. The solution is usually simple: avoid iterating in powers of two, make the last dimensions of multi-dimensional arrays a slightly different size or use any other method to insert "holes" in the memory layout, or create some seemingly random bijection between the array indices and the locations where the data is actually stored.

From 18546550507bfb19d993e75fec888244804a4e8b Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Tue, 15 Feb 2022 10:28:45 +0300
Subject: [PATCH 187/531] compare s-tree against abseil
---
 .../img/search-set-relative-all.svg | 175 ++++-
 .../img/search-set-relative.svg | 741 +++++++++++-------
 content/english/hpc/data-structures/s-tree.md | 8 +
 3 files changed, 603 insertions(+), 321 deletions(-)
diff --git a/content/english/hpc/data-structures/img/search-set-relative-all.svg b/content/english/hpc/data-structures/img/search-set-relative-all.svg
index 73cbb186..cd0e87a7 100644
--- a/content/english/hpc/data-structures/img/search-set-relative-all.svg
+++ b/content/english/hpc/data-structures/img/search-set-relative-all.svg
[changed SVG markup omitted]
diff --git a/content/english/hpc/data-structures/img/search-set-relative.svg b/content/english/hpc/data-structures/img/search-set-relative.svg
index bea8fc61..77a7cec0 100644
--- a/content/english/hpc/data-structures/img/search-set-relative.svg
+++ b/content/english/hpc/data-structures/img/search-set-relative.svg
[changed SVG markup omitted]
diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md
index 25b5fe83..263d03d7 100644
--- a/content/english/hpc/data-structures/s-tree.md
+++ b/content/english/hpc/data-structures/s-tree.md
@@ -465,8 +465,16 @@ However, they perform better:

 It may or may not be beneficial to reverse the order in which layers are stored. I only implemented right-to-left because that was easier to code.
+### As a Dynamic Tree + +When we compare S+ trees to `std::set` where we add the same elements and search for the same lower bounds (not counting the time it took to add them), the comparison is even more favorable: + +![](../img/search-set-relative.svg) + My next priorities is to adapt it to segment trees, which I know how to do, and to B-trees, which I don't exactly know how to do. But comparing to `std::set` hints that there may be up to 30x improvements: +`absl::btree_set`, the only widely-used B-tree implementation I know, is just slightly faster than binary search. + ![](../img/search-set-relative-all.svg) A ~15x improvement is definitely worth it — and the memory overhead is not large, as we only need to store pointers (indices, actually) for internal nodes. It may be higher, because we need to fetch two separate memory blocks, or lower, because we need to handle updates somehow. Either way, this will be an interesting optimization problem. From c5ef9af616083eb4cb989da7a95741584af45797 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 15 Feb 2022 16:21:41 +0300 Subject: [PATCH 188/531] proof for randomized binary search --- .../hpc/data-structures/binary-search.md | 99 ++++++++++++++++++- 1 file changed, 95 insertions(+), 4 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index dfebcef3..61aede58 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -190,9 +190,7 @@ int lower_bound(int x) { } ``` -Theoretically[^limit], this randomized binary search is expected to do ~1.35x more comparisons than the normal one, but in practice, the running time goes ~6x on large arrays: - -[^limit]: I wrote an [small program](https://gist.github.com/sslotin/4b7193041b01e454615f50d237485c71) for calculating the expected number of comparisons required. 
By the way, if someone who remembers calculus is reading this, please try to find the limit of the ratio of the number of comparisons a random binary search and a normal one needs, and share how you did that. Although probably useless, it seems like an interesting problem. +[Theoretically](#appendix), this randomized binary search is expected to do $2 \cdot \ln 2 \approx 1.386$ times more comparisons than the normal one, but in practice, the running time goes up ~6x on large arrays: ![](../img/search-random.svg) @@ -420,6 +418,99 @@ Note that this method, while being great for single-threaded world, is unlikely [Part 2](https://algorithmica.org/en/b-tree) explores efficient implementation of implicit static B-trees in bandwidth-constrained environment. -## Acknowledgements +### Appendix + +By the way, finding the exact expected number of comparisons for random binary search is a probably useless but interesting math problem. Try solving it yourself first! + +The way to compute it *algorithmically* is through dynamic programming. If we denote $f_n$ as the expected number of comparisons to find a random lower bound on a search interval of size $n$, it can be calculated from the previous values $f_l$ for $l < n$ by considering all the $(n - 1)$ possible splits: + +$$ +f_n = \sum_{l = 1}^{n - 1} \frac{1}{n-1} \cdot \left( f_l \cdot \frac{l}{n} + f_{n - l} \cdot \frac{n - l}{n} \right) + 1 +$$ + +Directly applying this formula gives us an $O(n^2)$ algorithm, but we can optimize it by rearranging the sum like this: + +$$ +\begin{aligned} +f_n &= \sum_{i = 1}^{n - 1} \frac{ f_i \cdot i + f_{n - i} \cdot (n - i) }{ n \cdot (n - 1) } + 1 +\\ &= \frac{2}{n \cdot (n - 1)} \cdot \sum_{i = 1}^{n - 1} f_i \cdot i + 1 +\end{aligned} +$$ + +To update $f_n$, we only need to calculate the sum of $f_i \cdot i$ for all $i < n$. 
To do that, let's introduce two new variables: + +$$ +g_n = f_n \cdot n, +\;\; +s_n = \sum_{i=1}^{n} g_i +$$ + +Now they can be sequentially calculated as: + +$$ +\begin{aligned} +g_n &= f_n \cdot n + = \frac{2}{n-1} \cdot \sum_{i = 1}^{n - 1} g_i + n + = \frac{2}{n - 1} \cdot s_{n - 1} + n +\\ s_n &= s_{n - 1} + g_n +\end{aligned} +$$ + +This way we get an $O(n)$ algorithm, but we can do even better. Let's substitute $g_n$ in the update formula for $s_n$: + +$$ +\begin{aligned} +s_n &= s_{n - 1} + \frac{2}{n - 1} \cdot s_{n - 1} + n +\\ &= (1 + \frac{2}{n - 1}) \cdot s_{n - 1} + n +\\ &= \frac{n + 1}{n - 1} \cdot s_{n - 1} + n +\end{aligned} +$$ + + + +The next trick is more complicated. We define $r_n$ like this: + +$$ +\begin{aligned} +r_n &= \frac{s_n}{n} +\\ &= \frac{1}{n} \cdot \left(\frac{n + 1}{n - 1} \cdot s_{n - 1} + n\right) +\\ &= \frac{n + 1}{n} \cdot \frac{s_{n - 1}}{n - 1} + 1 +\\ &= \left(1 + \frac{1}{n}\right) \cdot r_{n - 1} + 1 +\end{aligned} +$$ + +We can substitute it into the formula we got for $g_n$ before: + +$$ +g_n = \frac{2}{n - 1} \cdot s_{n - 1} + n = 2 \cdot r_{n - 1} + n +$$ + +Recalling that $g_n = f_n \cdot n$, we can express $r_{n - 1}$ using $f_n$: + +$$ +f_n \cdot n = 2 \cdot r_{n - 1} + n +\implies +r_{n - 1} = \frac{(f_n - 1) \cdot n}{2} +$$ + +Final step. We've just expressed $r_n$ through $r_{n - 1}$ and $r_{n - 1}$ through $f_n$. 
This lets us express $f_{n + 1}$ through $f_n$: + +$$ +\begin{aligned} +&&\quad r_n &= \left(1 + \frac{1}{n}\right) \cdot r_{n - 1} + 1 +\\ &\Rightarrow & \frac{(f_{n + 1} - 1) \cdot (n + 1)}{2} &= \left(1 + \frac{1}{n}\right) \cdot \frac{(f_n - 1) \cdot n}{2} + 1 +\\ &&&= \frac{n + 1}{2} \cdot (f_n - 1) + 1 +\\ &\Rightarrow & (f_{n + 1} - 1) &= (f_{n} - 1) + \frac{2}{n + 1} +\\ &\Rightarrow &f_{n + 1} &= f_{n} + \frac{2}{n + 1} +\\ &\Rightarrow &f_{n} &= f_{n - 1} + \frac{2}{n} +\\ &\Rightarrow &f_{n} &= \sum_{k = 2}^{n} \frac{2}{k} +\end{aligned} +$$ + +The last expression (which uses $f_1 = 0$ as the base case) is double the partial sum of the harmonic series without its first term, which is well known to grow as $\ln n$ as $n \to \infty$. Therefore, the random binary search will perform $\frac{2 \ln n}{\log_2 n} = 2 \ln 2 \approx 1.386$ times more comparisons than the normal one. + +### Acknowledgements The article is loosely based on "[Array Layouts for Comparison-Based Searching](https://arxiv.org/pdf/1509.05053.pdf)" by Paul-Virak Khuong and Pat Morin. It is 46 pages long, and discusses the scalar binary searches in more details, so check it out if you're interested in other approaches. + +Thanks to Marshall Lochbaum for [providing](https://github.com/algorithmica-org/algorithmica/issues/57) the proof for the random binary search. 
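The appendix derivation in this patch can be checked numerically. The following is a minimal sketch, not code from the article: it evaluates the linear-time recurrence for $g_n$ and $s_n$ and compares it against the closed form $f_n = \sum_{k = 2}^{n} \frac{2}{k}$. The function names `expected_comparisons`, `closed_form`, and `comparison_ratio` are made up for illustration, and the base case $f_1 = 0$ is an assumption (an interval of size one needs no comparison).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Expected number of comparisons f_n of the randomized binary search,
// computed with the prefix-sum recurrence from the appendix:
//   g_n = 2 / (n - 1) * s_{n-1} + n,   s_n = s_{n-1} + g_n,   f_n = g_n / n.
// Assumption: f_1 = 0, so g_1 = 0 and s_1 = 0.
std::vector<double> expected_comparisons(int max_n) {
    assert(max_n >= 1);
    std::vector<double> f(max_n + 1, 0.0);
    double s = 0.0; // holds s_{n-1} at the start of each iteration
    for (int n = 2; n <= max_n; n++) {
        double g = 2.0 / (n - 1) * s + n;
        f[n] = g / n;
        s += g;
    }
    return f;
}

// Closed form from the last line of the derivation: f_n = sum_{k=2}^{n} 2/k.
double closed_form(int n) {
    double sum = 0.0;
    for (int k = 2; k <= n; k++)
        sum += 2.0 / k;
    return sum;
}

// Ratio of randomized to ordinary binary search comparisons,
// taking log2(n) as the comparison count of the ordinary search.
double comparison_ratio(int n) {
    return closed_form(n) / std::log2(n);
}
```

For $n = 2^{20}$ the ratio comes out at roughly 1.34, so the limit $2 \ln 2 \approx 1.386$ is approached from below and rather slowly.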
From 3d4a723ae8ffbc70566109c93bfc1d0837586355 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 15 Feb 2022 17:21:24 +0300 Subject: [PATCH 189/531] binary search taking final shape --- .../hpc/data-structures/binary-search.md | 38 +++++++++++++------ 1 file changed, 27 insertions(+), 11 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 61aede58..5600935d 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -14,12 +14,22 @@ In this article, we focus on such fundamental algorithm — binary search — an - *Branchless* binary search that is up to 3x faster on *small* arrays and can act as a drop-in replacement to `std::lower_bound`. - *Eytzinger* binary search that rearranges the elements of a sorted array in a cache-friendly way of is also 3x faster on small array and 2x faster on RAM-backed arrays. +This is technically not a drop-in replacement, since it requires some preprocessing, but I can't recall a lot of scenarios where you obtain a sorted array but can't spend linear time on preprocessing. + +The usual disclaimer: the CPU is a [Zen 2](https://www.7-cpu.com/cpu/Zen2.html) and the RAM is a [DDR4-2666](http://localhost:1313/hpc/cpu-cache/). The compiler we will be using by default is Clang 10. The results may be slightly different on other platforms. + + + ## Binary Search + ### Removing the Last Branch Just the finishing touch. Did you notice the bumpiness of eytzinger search? This isn't random noise — let's zoom in: @@ -408,17 +430,9 @@ The graph is now smooth and almost doesn't lose to the branchless binary search ![](../img/search-eytzinger-branchless.svg) -But that was a small detour. Let's get back to optimizing for *large* arrays. +It's interesting that now GCC doesn't replace this with `cmov`, but Clang does. 1-1. 
-The prefetching technique allows us to read up to 4 elements ahead, but it doesn't really come for free — we are effectively trading off excess memory [bandwidth](/hpc/cpu-cache/bandwidth) for reduced [latency](/hpc/cpu-cache/latency). If you run more than one instance at a time, or just any other memory-intensive computation in the background, it will significantly [affect](/hpc/cpu-cache/sharing) the performance of the benchmark. - -We can do better — instead of fetching 4 cache lines at a time, we could fetch 4 times *fewer* cache lines. - -Note that this method, while being great for single-threaded world, is unlikely to make its way into database and heavy multi-threaded applications, because it sacrifices bandwidth to achieve low latency. - -[Part 2](https://algorithmica.org/en/b-tree) explores efficient implementation of implicit static B-trees in bandwidth-constrained environment. - -### Appendix +### Appendix: Random Binary Search By the way, finding the exact expected number of comparisons for random binary search is a probably useless but interesting math problem. Try solving it yourself first! @@ -514,3 +528,5 @@ The last expression is double the harmonic series, which is well known to approx The article is loosely based on "[Array Layouts for Comparison-Based Searching](https://arxiv.org/pdf/1509.05053.pdf)" by Paul-Virak Khuong and Pat Morin. It is 46 pages long, and discusses the scalar binary searches in more details, so check it out if you're interested in other approaches. Thanks to Marshall Lochbaum for [providing](https://github.com/algorithmica-org/algorithmica/issues/57) the proof for the random binary search. + +I also stole these lovely layout visualizations from some blog a long time ago, but I don't remember the name of the blog and what license they had, and inverse image search doesn't find them anymore. If you don't sue me, thank you, whoever you are! 
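The prefetching trade-off touched on in this patch is easier to follow with the whole search in one place. Below is a self-contained sketch of an Eytzinger-layout `lower_bound` with prefetching, written under several assumptions: the struct name `Eytzinger` is hypothetical, the `INT_MAX` sentinel in `t[0]` for "no element is at least `x`" is a choice made here rather than something the patch specifies, and `__builtin_prefetch` and `__builtin_ffs` are GCC/Clang builtins. It prefetches the cache line holding the elements $16k$ to $(16k + 15)$, as the article describes.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct Eytzinger {
    int n;
    int *t; // 1-indexed Eytzinger array, 64-byte aligned; t[0] is a sentinel

    explicit Eytzinger(const std::vector<int>& sorted) : n((int) sorted.size()) {
        // aligned_alloc expects the size to be a multiple of the alignment
        std::size_t bytes = (sizeof(int) * (n + 1) + 63) / 64 * 64;
        t = static_cast<int*>(std::aligned_alloc(64, bytes));
        assert(t != nullptr);
        t[0] = INT_MAX; // assumption: returned when every element is less than x
        int i = 0;
        build(sorted, i, 1);
    }
    ~Eytzinger() { std::free(t); }
    Eytzinger(const Eytzinger&) = delete;
    Eytzinger& operator=(const Eytzinger&) = delete;

    // In-order traversal of the implicit tree assigns the sorted elements
    // so that node k has children 2k and 2k + 1.
    void build(const std::vector<int>& a, int& i, int k) {
        if (k <= n) {
            build(a, i, 2 * k);
            t[k] = a[i++];
            build(a, i, 2 * k + 1);
        }
    }

    int lower_bound(int x) const {
        int k = 1;
        while (k <= n) {
            // Request the cache line with descendants 16k..16k+15 (four levels down).
            // The address is computed as an integer, since it may lie past the array.
            std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(t) + 64 * (std::uintptr_t) k;
            __builtin_prefetch(reinterpret_cast<const void*>(addr));
            k = 2 * k + (t[k] < x);
        }
        // Undo the trailing "go right" steps and the last "go left" step.
        k >>= __builtin_ffs(~k);
        return t[k];
    }
};
```

The integer address arithmetic is only there to avoid forming out-of-range pointers; the article's `__builtin_prefetch(t + k * 16)` requests the same line.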
From ab22872b1defaa645996db6c5ebf0779ebb9dd89 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 15 Feb 2022 21:17:35 +0300 Subject: [PATCH 190/531] binary search edits --- .../english/hpc/cpu-cache/associativity.md | 2 +- .../hpc/data-structures/binary-search.md | 132 ++++++++++-------- 2 files changed, 72 insertions(+), 62 deletions(-) diff --git a/content/english/hpc/cpu-cache/associativity.md b/content/english/hpc/cpu-cache/associativity.md index 2ccc1783..ee3203cf 100644 --- a/content/english/hpc/cpu-cache/associativity.md +++ b/content/english/hpc/cpu-cache/associativity.md @@ -100,7 +100,7 @@ Performance issues caused by cache associativity effects arise with remarkable f - It is the smallest integer exponent, so using the sequence of increasing powers of two as problem sizes are a popular choice when benchmarking memory-bound algorithms. - Also, more natural powers of ten are by transitivity divisible by a slightly lower power of two. -This especially often applies to implicit data structures that use a fixed memory layout. For example, binary searching over arrays of size $2^{20}$ takes about ~360ns per query while searching over arrays of size $(2^{20} + 123)$ takes ~300ns. When the array size is a multiple of a large power of two, then the indices of the "hottest" elements, the ones we likely request on the first dozen or so iterations, will also be divisible by some large powers of two and map to the same cache line — kicking each other out and causing a ~20% performance decrease. +This especially often applies to implicit data structures that use a fixed memory layout. For example, [binary searching](/hpc/data-structures/binary-search) over arrays of size $2^{20}$ takes about ~360ns per query while searching over arrays of size $(2^{20} + 123)$ takes ~300ns. 
When the array size is a multiple of a large power of two, then the indices of the "hottest" elements, the ones we likely request on the first dozen or so iterations, will also be divisible by some large powers of two and map to the same cache line — kicking each other out and causing a ~20% performance decrease. Luckily, such issues are more of an anomaly rather than serious problems. The solution is usually simple: avoid iterating in powers of two, make the last dimensions of multi-dimensional arrays a slightly different size or use any other method to insert "holes" in the memory layout, or create some seemingly random bijection between the array indices and the locations where the data is actually stored. diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 5600935d..ea37fe59 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -3,20 +3,24 @@ title: Binary Search weight: 1 --- -While improving the speed of user-facing applications is the end goal of performance engineering, people don't really get excited over 5-10% improvements in some databases. Yes, this is what software engineers are paid for, but these types of optimizations tend to be too intricate and specific to the system to be generalizable to other software. +While improving the speed of user-facing applications is the end goal of performance engineering, people don't really get excited over 5-10% improvements in some databases. Yes, this is what software engineers are paid for, but these types of optimizations tend to be too intricate and system-specific to be readily generalized to other software. -Rather, the most fascinating showcases of performance engineering are multifold optimizations of textbook algorithms. The kinds that everybody knows, and are so deemed simple that it would never occur to try to optimize them to begin with. 
These types of optimizations are simple and instructive, and can very much be adopted elsewhere. And they are surprisingly not as rare as you'd think. +Instead, the most fascinating showcases of performance engineering are multifold optimizations of textbook algorithms: the kinds that everybody knows and deemed so simple that it would never even occur to try to optimize them in the first place. These optimizations are simple and instructive and can very much be adopted elsewhere. And they are surprisingly not as rare as you'd think. -In this article, we focus on such fundamental algorithm — binary search — and implement several algorithms that significantly improve on its performance: +In this article, we focus on such fundamental algorithm — *binary search* — and implement two of its variants that are up to 4x faster than `std::lower_bound`, depending on the problem size, while being under just 15 lines of code. -- *Branchless* binary search that is up to 3x faster on *small* arrays and can act as a drop-in replacement to `std::lower_bound`. -- *Eytzinger* binary search that rearranges the elements of a sorted array in a cache-friendly way of is also 3x faster on small array and 2x faster on RAM-backed arrays. +The first algorithm achieves that by removing [branches](/hpc/pipelining/branching), and the second optimizes the memory layout to achieve better [cache system](/hpc/cpu-cache) performance. This technically disqualifies it from being a drop-in replacement for `std::lower_bound` as it needs to permute the elements of the array before it can start answering queries — but I can't recall a lot of scenarios where you obtain a sorted array but can't spend linear time on preprocessing. -This is technically not a drop-in replacement, since it requires some preprocessing, but I can't recall a lot of scenarios where you obtain a sorted array but can't spend linear time on preprocessing. 
+ + +The usual disclaimer: the CPU is a [Zen 2](https://www.7-cpu.com/cpu/Zen2.html), the RAM is a [DDR4-2666](http://localhost:1313/hpc/cpu-cache/), and the compiler we will be using by default is Clang 10. The performance on your machine may be different, so I highly encourage you to [go and test it](https://godbolt.org/z/14rd5Pnve) for yourself. -So their cache line can also be fetched with one instruction. Interesting… what if we continue this, and instead of fetching direct children, we fetch ahead as many descendants as we can cramp into one cache line? That would be $\frac{64}{4} = 16$ elements, our grand-grand-grandchildren with indices from $16k$ to $(16k + 15)$. +Their cache line can also be fetched with one instruction. Interesting… what if we continue this, and instead of fetching direct children, we fetch ahead as many descendants as we can cram into one cache line? That would be $\frac{64}{4} = 16$ elements, our grand-grand-grandchildren with indices from $16k$ to $(16k + 15)$. +Now, if we prefetch just one of these 16 elements, we will probably only get some but not all of them, as they may cross a cache line boundary. 
We can prefetch the first *and* the last element, but to get away with just one memory request, we need to notice that the index of the first element, $16k$, is divisible by $16$, so its memory address will be the base address of the array plus something divisible by $16 \cdot 4 = 64$, the cache line size. If the array begins on a cache line boundary, then these $16$ grand-grand-grandchildren elements are guaranteed to be on a single cache line, which is just what we needed. -Therefore, we just need to [align](/hpc/cpu-cache/alignment) the array: +Therefore, we only need to [align](/hpc/cpu-cache/alignment) the array: ```c++ t = (int*) std::aligned_alloc(64, 4 * (n + 1)); ``` -And then prefetch the element indexed $16 k$ in the main loop: +And then prefetch the element indexed $16 k$ on each iteration: ```c++ int lower_bound(int x) { @@ -371,23 +381,23 @@ int lower_bound(int x) { } ``` -The performance on large arrays improves 3-4x from the previous version and ~2x compared to `std::lower_bound`. Not bad for a just two more lines of code: +The performance on large arrays improves 3-4x from the previous version and ~2x compared to `std::lower_bound`. Not bad for just two more lines of code: ![](../img/search-eytzinger-prefetch.svg) -What we essentially do is we hide the latency by prefetching 4 steps ahead; if the compute didn't matter, we would expect a ~4x speedup. We can also try to prefetch further than that, and we don't even have to use more prefetch instructions for that — we can request only the first cache line and rely on the hardware to prefetch its neighbors: +Essentially, what we do here is hide the latency by prefetching four steps ahead and overlapping memory requests. Theoretically, if the compute didn't matter, we would expect a ~4x speedup, but in reality, we get a somewhat more moderate speedup. 
+ +We can also try to prefetch further than four steps ahead, and we don't even have to use more than one prefetch instruction for that: we can try to request only the first cache line and rely on the hardware to prefetch its neighbors. This trick may or may not improve actual performance, depending on the hardware: ```c++ __builtin_prefetch(t + k * 32); ``` -It may or may not improve actual performance — it heavily depends on the hardware. - -Also, note that the last few prefetch requests are actually not needed, and in fact, they may be even be outside of the memory region allocated for the program. On most modern CPUs, invalid prefetch instructions get converted into no-ops, so it isn't a problem, but on some platforms this may cause a slowdown, so it may make sense, for example, to split off the last ~4 iterations from the loop to try to remove them. +Also, note that the last few prefetch requests are actually not needed, and in fact, they may even be outside the memory region allocated for the program. On most modern CPUs, invalid prefetch instructions get converted into no-ops, so it isn't a problem, but on some platforms, this may cause a slowdown, so it may make sense, for example, to split off the last ~4 iterations from the loop to try to remove them. -The prefetching technique allows us to read up to 4 elements ahead, but it doesn't really come for free — we are effectively trading off excess memory [bandwidth](/hpc/cpu-cache/bandwidth) for reduced [latency](/hpc/cpu-cache/latency). If you run more than one instance at a time, or just any other memory-intensive computation in the background, it will significantly [affect](/hpc/cpu-cache/sharing) the performance of the benchmark. +This prefetching technique allows us to read up to four elements ahead, but it doesn't really come for free — we are effectively trading off excess memory [bandwidth](/hpc/cpu-cache/bandwidth) for reduced [latency](/hpc/cpu-cache/latency). 
If you run more than one instance at a time on separate hardware threads or just any other memory-intensive computation in the background, it will significantly [affect](/hpc/cpu-cache/sharing) the benchmark performance. -Note that this method, while being great for single-threaded world, is unlikely to make its way into database and heavy multi-threaded applications, because it sacrifices bandwidth to achieve low latency. We can do better — instead of fetching 4 cache lines at a time, we could fetch 4 times *fewer* cache lines, and in the next article we will explore that. +But we can do better. Instead of fetching four cache lines at a time, we could fetch four times *fewer* cache lines. And in the next article, we will explore the approach. -In this article, we focus on such fundamental algorithm — *binary search* — and implement two of its variants that are up to 4x faster than `std::lower_bound`, depending on the problem size, while being under just 15 lines of code. +In this article, we focus on such fundamental algorithm — *binary search* — and implement two of its variants that are, depending on the problem size, up to 4x faster than `std::lower_bound`, while being under just 15 lines of code. The first algorithm achieves that by removing [branches](/hpc/pipelining/branching), and the second optimizes the memory layout to achieve better [cache system](/hpc/cpu-cache) performance. This technically disqualifies it from being a drop-in replacement for `std::lower_bound` as it needs to permute the elements of the array before it can start answering queries — but I can't recall a lot of scenarios where you obtain a sorted array but can't spend linear time on preprocessing. @@ -535,8 +535,8 @@ The last expression is double the [harmonic series](https://en.wikipedia.org/wik ### Acknowledgements -The article is loosely based on "[Array Layouts for Comparison-Based Searching](https://arxiv.org/pdf/1509.05053.pdf)" by Paul-Virak Khuong and Pat Morin. 
It is 46 pages long and discusses these and many other approaches in more detail, so check it out if you’re interested. +The article is loosely based on "[Array Layouts for Comparison-Based Searching](https://arxiv.org/pdf/1509.05053.pdf)" by Paul-Virak Khuong and Pat Morin. It is 46 pages long and discusses these and many other (less successful) approaches in more detail. I highly recommend also checking it out — this is one of my favorite performance engineering papers. -Thanks to Marshall Lochbaum for [providing](https://github.com/algorithmica-org/algorithmica/issues/57) the proof for the random binary search. +Thanks to Marshall Lochbaum for [providing](https://github.com/algorithmica-org/algorithmica/issues/57) the proof for the random binary search. No way I could do it myself. I also stole these lovely layout visualizations from some blog a long time ago, but I don't remember the name of the blog and what license they had, and inverse image search doesn't find them anymore. If you don't sue me, thank you, whoever you are! From ee9ffc60fc4167ead9612a4046e099be68ecb7a0 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 15 Feb 2022 21:37:57 +0300 Subject: [PATCH 193/531] change wording --- content/english/hpc/data-structures/binary-search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index a90afc16..b110409f 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -11,7 +11,7 @@ Instead, the most fascinating showcases of performance engineering are multifold In this article, we focus on such fundamental algorithm — *binary search* — and implement two of its variants that are, depending on the problem size, up to 4x faster than `std::lower_bound`, while being under just 15 lines of code. 
-The first algorithm achieves that by removing [branches](/hpc/pipelining/branching), and the second optimizes the memory layout to achieve better [cache system](/hpc/cpu-cache) performance. This technically disqualifies it from being a drop-in replacement for `std::lower_bound` as it needs to permute the elements of the array before it can start answering queries — but I can't recall a lot of scenarios where you obtain a sorted array but can't spend linear time on preprocessing. +The first algorithm achieves that by removing [branches](/hpc/pipelining/branching), and the second also optimizes the memory layout to achieve better [cache system](/hpc/cpu-cache) performance. This technically disqualifies it from being a drop-in replacement for `std::lower_bound` as it needs to permute the elements of the array before it can start answering queries — but I can't recall a lot of scenarios where you obtain a sorted array but can't afford to spend linear time on preprocessing. -* The Eytzinger binary search is supposed to be $4$ times faster if compute didn't matter, as it requests them ~4 times faster on average. +As before, we are using Clang 10 targeting a Zen 2 CPU, but the relative performance improvements should approximately transfer to other platforms, including Arm-based chips. To the best of my knowledge, this is a significant improvement over all the existing [approaches](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf). + + -### B-tree layout +## B-Tree Layout B-trees generalize the concept of binary search trees by allowing nodes to have more than two children. -Instead of single key, a B-tree node contains up to $B$ sorted keys may have up to $(B+1)$ children, thus reducing the tree height in $\frac{\log_2 n}{\log_B n} = \frac{\log B}{\log 2} = \log_2 B$ times. 
+Instead of a single key, a B-tree node contains up to $B$ sorted keys and may have up to $(B + 1)$ children, thus reducing the tree height by a factor of $\frac{\log_2 n}{\log_B n} = \frac{\log B}{\log 2} = \log_2 B \approx 4$, which also means fetching about four times fewer cache lines. + +![A B-tree of order 4](../img/b-tree.jpg) They were primarily developed for the purpose of managing on-disk databases, as their random access times are almost the same as reading 1MB of data sequentially, which makes the trade-off between number of comparisons and tree height beneficial. In our implementation, we will make each the size of each block equal to the cache line size, which in case of `int` is 16 elements. +They are widely used for indexing in databases, especially those that operate on-disk, because if $B$ is big, this allows large sequential memory accesses while reducing the height of the tree. + +To perform static binary searches, one can implement a B-tree in an *implicit* way, i.e., without actually storing any pointers and spending only $O(1)$ additional memory, and the node size could be made equal to the cache line size so that each node request fetches exactly one cache line. + Normally, a B-tree node also stores $(B+1)$ pointers to its children, but we will only store keys and rely on pointer arithmetic, similar to the one used in Eytzinger array: -* The root node is numbered $0$. +- The root node is numbered $0$. +- Node $k$ has $(B+1)$ child nodes numbered $\{k \cdot (B+1) + i\}$ for $i \in [1, B + 1]$. -* Node $k$ has $(B+1)$ child nodes numbered $\{k \cdot (B+1) + i\}$ for $i \in [1, B]$. +Keys are stored in a 2d array in non-decreasing order. If the length of the initial array is not a multiple of $B$, the last block is padded with the largest value of its data type. -Keys are stored in a 2d array in non-decreasing order. If the length of the initial array is not a multiple of $B$, the last block is padded with the largest value if its data type. 
+We call this particular layout "S-tree". ```c++ -typedef __m256i reg; - const int B = 16; -const int INF = std::numeric_limits::max(); -int n; -int nblocks; -int *_a; -int (*btree)[B]; +int nblocks = (n + B - 1) / B; +int btree[nblocks][B]; +``` + +### Construction +We can construct B-tree similarly by traversing the search tree. + +```c++ int go(int k, int i) { return k * (B + 1) + i + 1; } void build(int k = 0) { @@ -67,46 +78,13 @@ void build(int k = 0) { if (k < nblocks) { for (int i = 0; i < B; i++) { build(go(k, i)); - btree[k][i] = (t < n ? _a[t++] : INF); + btree[k][i] = (t < n ? _a[t++] : INT_MAX); } build(go(k, B)); } } - -void prepare(int *a, int _n) { - n = _n; - nblocks = (n + B - 1) / B; - _a = a; - btree = (int(*)[16]) std::aligned_alloc(64, 64 * nblocks); - build(); -} - -int cmp(reg x_vec, int* y_ptr) { - reg y_vec = _mm256_load_si256((reg*) y_ptr); - reg mask = _mm256_cmpgt_epi32(x_vec, y_vec); - return _mm256_movemask_ps((__m256) mask); -} - -int lower_bound(int x) { - int k = 0, res = INF; - reg x_vec = _mm256_set1_epi32(x); - while (k < nblocks) { - int mask = ~( - cmp(x_vec, &btree[k][0]) + - (cmp(x_vec, &btree[k][8]) << 8) - ); - int i = __builtin_ffs(mask) - 1; - if (i < B) - res = btree[k][i]; - k = go(k, i); - } - return res; -} ``` - -We can construct B-tree similarly by traversing the search tree. - It is correct, because each value of initial array will be copied to a unique position in the resulting array, and the tree height is $\Theta(\log_{B+1} n)$, because $k$ is multiplied by $(B + 1)$ each time a child node is created. Note that this approach causes a slight imbalance: "lefter" children may have larger respective ranges. @@ -123,10 +101,9 @@ int i = __builtin_ffs(mask) - 1; // now i is the number of the correct child node ``` - …but ~8 times faster. -Actually, compiler quite often produces very optimized code that leverages these instructions for certain types of loops. 
This is called auto-vectorization, and this is the reason why a loop that sums up an array of `short`s is faster (theoretically by a factor of two) than the same loop for `int`s: you can fit more elements on the same 256-bit block. Sadly, this is not our case, as we have loop-carried dependencies. +Actually, compilers quite often produce very optimized code that leverages these instructions for certain types of loops. This is called auto-vectorization, and this is the reason why a loop that sums up an array of `short`s is faster (theoretically by a factor of two) than the same loop for `int`s: you can fit more elements on the same 256-bit block. Sadly, this is not our case, as we have loop-carried dependencies. The algorithm we will implement: @@ -141,7 +118,32 @@ This is how it looks using C++ intrinsics, which are basically built-in wrappers After that, we call this function two times (because our node size / cache line happens to be 512 bits, which is twice as big) and blend these masks together with bitwise operations. -That's it. This implementation should outperform even the [state-of-the-art indexes](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) used in high-performance databases, though it's mostly due to the fact that data structures used in real databases have to support fast updates while we don't. + +```c++ +typedef __m256i reg; + +int cmp(reg x_vec, int* y_ptr) { + reg y_vec = _mm256_load_si256((reg*) y_ptr); + reg mask = _mm256_cmpgt_epi32(x_vec, y_vec); + return _mm256_movemask_ps((__m256) mask); +} + +int lower_bound(int x) { + int k = 0, res = INT_MAX; + reg x_vec = _mm256_set1_epi32(x); + while (k < nblocks) { + int mask = ~( + cmp(x_vec, &btree[k][0]) + + (cmp(x_vec, &btree[k][8]) << 8) + ); + int i = __builtin_ffs(mask) - 1; + if (i < B) + res = btree[k][i]; + k = go(k, i); + } + return res; +} +``` Note that this implementation is very specific to the architecture. 
Older CPUs and CPUs on mobile devices don't have 256-bit wide registers and will crash (but they likely have 128-bit SIMD so the loop can still be split in 4 parts instead of 2), non-Intel CPUs have their own instruction sets for SIMD, and some computers even have different cache line size. @@ -158,6 +160,8 @@ madvise(btree, 64 * nblocks, MADV_HUGEPAGE); ![](../img/search-btree-hugepages.svg) +Ideally, we'd need to also enable it for [previous implementations](../binary-search). But enabling it for previous implementation doesn't make a that much difference as they have one form of prefetching or another anyway. + ```c++ constexpr std::pair precalc(int n) { int s = 0, // total size @@ -195,12 +199,6 @@ unsigned rank(reg x_vec, int* y_ptr) { Or - - - - ```c++ void permute(int *node) { const reg perm = _mm256_setr_epi32(4, 5, 6, 7, 0, 1, 2, 3); @@ -233,7 +231,7 @@ void update(int &res, int* node, unsigned i) { ```c++ int lower_bound(int x) { - int k = 0, res = INF; + int k = 0, res = INT_MAX; reg x_vec = _mm256_set1_epi32(x - 1); for (int h = 0; h < height - 1; h++) { int *node = btree[k]; @@ -299,7 +297,7 @@ void prepare(int *a, int n) { madvise(btree, T, MADV_HUGEPAGE); for (int i = N; i < S; i++) - btree[i] = INF; + btree[i] = INT_MAX; memcpy(btree, a, 4 * N); @@ -311,7 +309,7 @@ void prepare(int *a, int n) { // and then always to the left for (int l = 0; l < h - 1; l++) k *= (B + 1); - btree[offset(h) + i] = (k * B < N ? btree[k * B] : INF); + btree[offset(h) + i] = (k * B < N ? btree[k * B] : INT_MAX); } } @@ -479,6 +477,8 @@ My next priorities is to adapt it to segment trees, which I know how to do, and A ~15x improvement is definitely worth it — and the memory overhead is not large, as we only need to store pointers (indices, actually) for internal nodes. It may be higher, because we need to fetch two separate memory blocks, or lower, because we need to handle updates somehow. Either way, this will be an interesting optimization problem. +That's it. 
This implementation should outperform even the [state-of-the-art indexes](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) used in high-performance databases, though it's mostly due to the fact that data structures used in real databases have to support fast updates while we don't. + The problem has more dimensions. ### Acknowledgements From 6ad0b7584060806faa44d0ed8f52b3d23cfd0c32 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 16 Feb 2022 15:19:27 +0300 Subject: [PATCH 197/531] b-trees --- content/english/hpc/data-structures/s-tree.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index 6397f1e9..6dac0a56 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -8,8 +8,8 @@ This article is a follow-up on the [previous one](../binary-search), where we op In this article, we generalize the techniques we developed for binary search to *static B-trees* and accelerate them further using [SIMD instructions](/hpc/simd). In particular, we develop two new implicit data structures: -- The first one is based on the memory layout of a B-tree, and, depending on the array size, it is up to 8x faster than `std::lower_bound` while using the same space as the array and only requiring a permutation of its elements. -- The second one is based on the memory layout of a B+ tree, and it is up to 15x faster that `std::lower_bound` while using just 6-7% more memory — or 6-7% **of** the memory if we keep the original sorted array. +- The [first one](#b-tree-layout) is based on the memory layout of a B-tree, and, depending on the array size, it is up to 8x faster than `std::lower_bound` while using the same space as the array and only requiring a permutation of its elements. 
+- The [second one](#b-tree-layout-1) is based on the memory layout of a B+ tree, and it is up to 15x faster than `std::lower_bound` while using just 6-7% more memory — or 6-7% **of** the memory if we can keep the original sorted array. To distinguish them from B-trees — the structures with pointers, thousands to millions of elements per node, and empty spaces — we will use the names *S-tree* and *S+ tree* respectively to refer to these particular memory layouts[^name]. @@ -28,7 +28,7 @@ The last two approaches use SIMD, which technically disqualifies it from being b --> -As before, we are using Clang 10 targeting a Zen 2 CPU, but the relative performance improvements should approximately transfer to other platforms, including Arm-based chips. To the best of my knowledge, this is a significant improvement over all the existing [approaches](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf). +To the best of my knowledge, this is a significant improvement over the existing [approaches](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf). As before, we are using Clang 10 targeting a Zen 2 CPU, but the relative performance improvements should approximately transfer to other platforms, including Arm-based chips. + +This numeration automatically makes the B-tree complete or almost complete with the height of $\Theta(\log_{B + 1} n)$. If the length of the initial array is not a multiple of $B$, the last block is padded with the largest value of its data type. + ### Construction -We can construct B-tree similarly by traversing the search tree. +We can construct the B-tree similarly to how we constructed the Eytzinger array — by traversing the search tree: ```c++ -int go(int k, int i) { return k * (B + 1) + i + 1; } - void build(int k = 0) { static int t = 0; if (k < nblocks) { for (int i = 0; i < B; i++) { build(go(k, i)); - btree[k][i] = (t < n ? _a[t++] : INT_MAX); + btree[k][i] = (t < n ?
a[t++] : INT_MAX); } build(go(k, B)); } } ``` -It is correct, because each value of initial array will be copied to a unique position in the resulting array, and the tree height is $\Theta(\log_{B+1} n)$, because $k$ is multiplied by $(B + 1)$ each time a child node is created. +It is correct because each value of the initial array will be copied to a unique position in the resulting array, and the tree height is $\Theta(\log_{B+1} n)$ because $k$ is multiplied by $(B + 1)$ each time we descend into a child node. -Note that this approach causes a slight imbalance: "lefter" children may have larger respective ranges. +Note that this numeration causes a slight imbalance: left-er children may have larger subtrees, although this is only true for $O(\log_{B+1} n)$ parent nodes. -So, as we promised before, we will perform all $16$ comparisons to compute the index of the right child node, but we leverage SIMD instructions to do it efficiently. Just to clarify — we want to do something like this: +### Searches + +To find the lower bound, we need to fetch the $B$ keys in a node, find the first key $a_i$ not less than $x$, descend to the $i$-th child — and continue until we reach a leaf node. There is some variability in how to find that first key. For example, we could do a tiny internal binary search that makes $O(\log B)$ iterations, or maybe just compare each key sequentially in $O(B)$ time until we find the local lower bound, hopefully exiting from the loop a bit early. + +But we are not going to do that — because we can use [SIMD](/hpc/simd). It doesn't work well with branching, so essentially what we want to do is to compare against all $B$ elements regardless, compute a bit mask out of these comparisons, and then use the `ffs` instruction to find the bit corresponding to the first non-lesser element: ```cpp int mask = (1 << B); @@ -101,33 +107,21 @@ int i = __builtin_ffs(mask) - 1; // now i is the number of the correct child node ``` -…but ~8 times faster. 
- -Actually, compilers quite often produce very optimized code that leverages these instructions for certain types of loops. This is called auto-vectorization, and this is the reason why a loop that sums up an array of `short`s is faster (theoretically by a factor of two) than the same loop for `int`s: you can fit more elements on the same 256-bit block. Sadly, this is not our case, as we have loop-carried dependencies. - -The algorithm we will implement: - -1. Somewhere before the main loop, convert $x$ to a vector of $8$ copies of $x$. -2. Load the keys stored in node into another 256-bit vector. -3. Compare these two vectors. This returns a 256-bit mask in which pairs that compared "greater than" are marked with ones. -4. Create a 8-bit mask out of that and return it. Then you can feed it to `__builtin_ffs`. - -This is how it looks using C++ intrinsics, which are basically built-in wrappers for raw assembly instructions: - - -After that, we call this function two times (because our node size / cache line happens to be 512 bits, which is twice as big) and blend these masks together with bitwise operations. - - +Unfortunately, the compilers are not smart enough yet to auto-vectorize this code, so we need to manually use intrinsics: ```c++ typedef __m256i reg; int cmp(reg x_vec, int* y_ptr) { - reg y_vec = _mm256_load_si256((reg*) y_ptr); - reg mask = _mm256_cmpgt_epi32(x_vec, y_vec); - return _mm256_movemask_ps((__m256) mask); + reg y_vec = _mm256_load_si256((reg*) y_ptr); // load 8 sorted elements + reg mask = _mm256_cmpgt_epi32(x_vec, y_vec); // compare against the key + return _mm256_movemask_ps((__m256) mask); // extract the 8-bit mask } +``` + +This function works for 8-element vectors, which is half our block / cache line size. 
To process the entire block, we need to call it twice and then combine the masks: +```c++ int lower_bound(int x) { int k = 0, res = INT_MAX; reg x_vec = _mm256_set1_epi32(x); @@ -145,13 +139,17 @@ int lower_bound(int x) { } ``` -Note that this implementation is very specific to the architecture. Older CPUs and CPUs on mobile devices don't have 256-bit wide registers and will crash (but they likely have 128-bit SIMD so the loop can still be split in 4 parts instead of 2), non-Intel CPUs have their own instruction sets for SIMD, and some computers even have different cache line size. +To actually return the result, we'd want to just fetch `btree[k][i]` in the last node we visited, but the problem is that sometimes the local lower bound doesn't exist ($i \ge B$) because $x$ happens to be greater than all the keys in the node. In this case, we need to return the last local lower bound we actually encountered — hence we update the result as we descend down the tree using the `(i < B)` check. + +This implementation outperforms all previous binary search implementations by a huge margin: ![](../img/search-btree.svg) +This is very good — but we can optimize it even further. + ### Optimizations -Enable huge pages: +Enable [hugepages](/hpc/cpu-cache/paging): ```c++ btree = (int(*)[16]) std::aligned_alloc(2 * 1024 * 1024, 64 * nblocks); @@ -213,6 +211,7 @@ There are probably faster ways to swap middle elements, but we will leave it her You call `permute(btree[k])` after you've done with constructing a node. +There are ways to do this with bit-level trickery, but indexing a small lookup table turns out to be faster. ```c++ const int translate[17] = { @@ -481,6 +480,10 @@ That's it. This implementation should outperform even the [state-of-the-art inde The problem has more dimensions. +NEON would require some [trickery](https://github.com/WebAssembly/simd/issues/131) + +Note that this implementation is very specific to the architecture. 
Older CPUs and CPUs on mobile devices don't have 256-bit wide registers and will crash (but they likely have 128-bit SIMD so the loop can still be split in 4 parts instead of 2), non-Intel CPUs have their own instruction sets for SIMD, and some computers even have different cache line size. + ### Acknowledgements This [StackOverflow answer](https://stackoverflow.com/questions/20616605/using-simd-avx-sse-for-tree-traversal) by Cory Nelson is where I took the permuted SIMD routine. From 55dab1c6cced1e9ac72aea60acd36db684b2f7a4 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 16 Feb 2022 19:40:01 +0300 Subject: [PATCH 199/531] fix file name --- content/english/hpc/data-structures/binary-search.md | 2 +- content/english/hpc/simd/{shuffing.md => shuffling.md} | 0 2 files changed, 1 insertion(+), 1 deletion(-) rename content/english/hpc/simd/{shuffing.md => shuffling.md} (100%) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index b110409f..574b089d 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -260,7 +260,7 @@ Another way to look at it is that we write every even-indexed element to the end ### Construction -To construct the Eytzinger array, we could do this even-odd [filtering](/hpc/simd/shuffing/#permutations-and-lookup-tables) $O(\log n)$ times — and, perhaps, this is the fastest approach — but for brevity, we will instead build it by traversing the original search tree: +To construct the Eytzinger array, we could do this even-odd [filtering](/hpc/simd/shuffling/#permutations-and-lookup-tables) $O(\log n)$ times — and, perhaps, this is the fastest approach — but for brevity, we will instead build it by traversing the original search tree: ```c++ int a[n], t[n + 1]; // the original sorted array and the eytzinger array we build diff --git a/content/english/hpc/simd/shuffing.md 
b/content/english/hpc/simd/shuffling.md similarity index 100% rename from content/english/hpc/simd/shuffing.md rename to content/english/hpc/simd/shuffling.md From 30bb0fc33c9a5f5c9c6cec407268b654449c00df Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 16 Feb 2022 19:57:35 +0300 Subject: [PATCH 200/531] s-tree optimizations --- content/english/hpc/data-structures/s-tree.md | 121 +++++++++++------- 1 file changed, 78 insertions(+), 43 deletions(-) diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index 0a242193..efee29a6 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -147,18 +147,43 @@ This implementation outperforms all previous binary search implementations by a This is very good — but we can optimize it even further. -### Optimizations +### Optimization -Enable [hugepages](/hpc/cpu-cache/paging): +Before everything else, let's allocate the memory for the array on a [huge page](/hpc/cpu-cache/paging): ```c++ -btree = (int(*)[16]) std::aligned_alloc(2 * 1024 * 1024, 64 * nblocks); -madvise(btree, 64 * nblocks, MADV_HUGEPAGE); +const int P = 1 << 21; // page size in bytes (2MB) +const int T = (64 * nblocks + P - 1) / P * P; // can only allocate whole number of pages +btree = (int(*)[16]) std::aligned_alloc(P, T); +madvise(btree, T, MADV_HUGEPAGE); ``` +This slightly improves the performance on larger array sizes: + ![](../img/search-btree-hugepages.svg) -Ideally, we'd need to also enable it for [previous implementations](../binary-search). But enabling it for previous implementation doesn't make a that much difference as they have one form of prefetching or another anyway. +Ideally, we'd also need to enable hugepages for all [previous implementations](../binary-search) to make the comparison fair, but it doesn't matter that much because they all have some form of prefetching that alleviates this problem. 
+ +With that settled, let's begin real optimization. First of all, we'd want to use compile-time constants instead of variables as much as possible because it lets the compiler embed them in the machine code, unroll loops, optimize arithmetic, and do all sorts of other nice stuff for us for free. Specifically, we want to know the tree height in advance: + +```c++ +constexpr int height(int n) { + // grow the tree until its size exceeds n elements + int s = 0, // total size so far + l = B, // size of the next layer + h = 0; // height so far + while (s + l - B < n) { + s += l; + l *= (B + 1); + h++; + } + return h; +} + +const int H = height(N); +``` + + + +Next, we can find the local lower bound in nodes faster. Instead of calculating it separately for two 8-element blocks and merging masks, we can use one [packs](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=3037,4870,6715,4845,3853,90,7307,5993,2692,6946,6949,5456,6938,5456,1021,3007,514,518,7253,7183,3892,5135,5260,3915,4027,3873,7401,4376,4229,151,2324,2310,2324,4075,6130,4875,6385,5259,6385,6250,1395,7253,6452,7492,4669,4669,7253,1039,1029,4669,4707,7253,7242,848,879,848,7251,4275,879,874,849,833,6046,7250,4870,4872,4875,849,849,5144,4875,4787,4787,4787,5227,7359,7335,7392,4787,5259,5230,5223,6438,488,483,6165,6570,6554,289,6792,6554,5230,6385,5260,5259,289,288,3037,3009,590,604,5230,5259,6554,6554,5259,6547,6554,3841,5214,5229,5260,5259,7335,5259,519,1029,515,3009,3009,3011,515,6527,652,6527,6554,288,3841,5230,5259,5230,5259,305,5259,591,633,633,5259,5230,5259,5259,3017,3018,3037,3018,3017,3016,3013,5144&text=_mm256_packs_epi32&techs=AVX,AVX2) instruction before the `movemask`: + ```c++ unsigned rank(reg x_vec, int* y_ptr) { reg a = _mm256_load_si256((reg*) y_ptr); @@ -189,13 +217,14 @@ unsigned rank(reg x_vec, int* y_ptr) { reg c = _mm256_packs_epi32(ca, cb); int mask = _mm256_movemask_epi8(c); - return __tzcnt_u32(mask) >> 1; + // we need to divide the result by two 
because we call movemask_epi8 on 16-bit masks: + return __tzcnt_u32(mask) >> 1; } ``` -`packs` +This instruction converts 32-bit integers stored in two registers to 16-bit integers stored in one register — in our case, effectively joining the vector masks into one. Note that we've swapped the order of comparison — this lets us not invert the mask in the end, but we have to subtract one from the search key once in the beginning to make it correct (otherwise it works as `upper_bound`). -Or +The problem is, it does this weird interleaving where the result is written in the `a1 b1 a2 b2` order instead of the `a1 a2 b1 b2` order that we want. To correct this, we need to [permute](/hpc/simd/shuffling) the resulting vector, but instead of doing this at query time, we can just permute every node during preprocessing: ```c++ void permute(int *node) { @@ -207,11 +236,9 @@ void permute(int *node) { } ``` -There are probably faster ways to swap middle elements, but we will leave it here. - -You call `permute(btree[k])` after you've done with constructing a node. +We just call `permute(&btree[k])` right after we are done with building a node. There are probably faster ways to swap middle elements, but we will leave it here, as the preprocessing time is not important for now. -There are ways to do this with bit-level trickery, but indexing a small lookup table turns out to be faster. +This new SIMD routine is significantly faster because the extra `movemask` was slow and also blending the two masks took quite a few instructions. Unfortunately, we now can't just do the `res = btree[k][i]` update anymore because the elements are permuted.
We can solve this problem with some bit-level trickery in terms of `i`, but indexing a small lookup table turns out to be faster and also doesn't require a new branch: ```c++ const int translate[17] = { @@ -228,34 +255,42 @@ void update(int &res, int* node, unsigned i) { } ``` +Stitching it all together (and leaving out some other minor optimizations): + ```c++ int lower_bound(int x) { int k = 0, res = INT_MAX; reg x_vec = _mm256_set1_epi32(x - 1); - for (int h = 0; h < height - 1; h++) { - int *node = btree[k]; - unsigned i = rank(x_vec, node); - k = k * (B + 1) + 1; // remove + 1? - update(res, node, i); - k += i; + for (int h = 0; h < H - 1; h++) { + unsigned i = rank(x_vec, &btree[k]); + update(res, &btree[k], i); + k = go(k, i); } - unsigned i = rank(x_vec, btree[k]); - update(res, btree[k], i); - int k2 = go(k, i); - if (go(k, 0) < nblocks) { + // the last branch: + if (k < nblocks) { unsigned i = rank(x_vec, btree[k2]); - update(res, btree[k2], i); + update(res, &btree[k], i); } return res; } ``` -All that hard work is totally worth it: +All this work saved us 20% or so: ![](../img/search-btree-optimized.svg) +To progress further, we need to change the layout a little bit. + ## B+ Tree Layout +The `update` seems to be useless: 16 out of 17 times we can just read the element from the last block + +We want to get rid of branches completely — this means having a fixed-height tree. We also probably don't want to do the `update` as it turns out to be quite costly. + +B-tree layout + +We will explain the constexpr functions because this time it is important: + ```c++ constexpr int blocks(int n) { return (n + B - 1) / B; @@ -368,6 +403,8 @@ int lower_bound(int _x) { } ``` +Helping the compiler out with pointer arithmetic. + ![](../img/search-bplus.svg) Makes more sense to look at it as a relative speedup: @@ -399,6 +436,10 @@ for (int i = 0; i < m; i++) { ### Modifications + + Another idea is to use cache more efficiently. 
For example, you can execute `_mm256_stream_load_si256` on just the last iteration. They aren't beneficial for throughput: From 009c4e7dda9e8201a8bf8c5770fc9d306826e475 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 16 Feb 2022 20:03:58 +0300 Subject: [PATCH 201/531] simplify variable names --- content/english/hpc/data-structures/s-tree.md | 93 +++++-------------- 1 file changed, 25 insertions(+), 68 deletions(-) diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index efee29a6..f3e281be 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -122,13 +122,13 @@ int cmp(reg x_vec, int* y_ptr) { This function works for 8-element vectors, which is half our block / cache line size. To process the entire block, we need to call it twice and then combine the masks: ```c++ -int lower_bound(int x) { +int lower_bound(int _x) { int k = 0, res = INT_MAX; - reg x_vec = _mm256_set1_epi32(x); + reg x = _mm256_set1_epi32(_x); while (k < nblocks) { int mask = ~( - cmp(x_vec, &btree[k][0]) + - (cmp(x_vec, &btree[k][8]) << 8) + cmp(x, &btree[k][0]) + + (cmp(x, &btree[k][8]) << 8) ); int i = __builtin_ffs(mask) - 1; if (i < B) @@ -166,6 +166,8 @@ Ideally, we'd also need to enable hugepages for all [previous implementations](. With that settled, let's begin real optimization. First of all, we'd want to use compile-time constants instead of variables as much as possible because it lets the compiler embed them in the machine code, unroll loops, optimize arithmetic, and do all sorts of other nice stuff for us for free. Specifically, we want to know the tree height in advance: + + ```c++ constexpr int height(int n) { // grow the tree until its size exceeds n elements @@ -207,12 +209,12 @@ const int [height, nblocks] = precalc(N); Next, we can find the local lower bound in nodes faster. 
Instead of calculating it separately for two 8-element blocks and merging masks, we can use one [packs](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=3037,4870,6715,4845,3853,90,7307,5993,2692,6946,6949,5456,6938,5456,1021,3007,514,518,7253,7183,3892,5135,5260,3915,4027,3873,7401,4376,4229,151,2324,2310,2324,4075,6130,4875,6385,5259,6385,6250,1395,7253,6452,7492,4669,4669,7253,1039,1029,4669,4707,7253,7242,848,879,848,7251,4275,879,874,849,833,6046,7250,4870,4872,4875,849,849,5144,4875,4787,4787,4787,5227,7359,7335,7392,4787,5259,5230,5223,6438,488,483,6165,6570,6554,289,6792,6554,5230,6385,5260,5259,289,288,3037,3009,590,604,5230,5259,6554,6554,5259,6547,6554,3841,5214,5229,5260,5259,7335,5259,519,1029,515,3009,3009,3011,515,6527,652,6527,6554,288,3841,5230,5259,5230,5259,305,5259,591,633,633,5259,5230,5259,5259,3017,3018,3037,3018,3017,3016,3013,5144&text=_mm256_packs_epi32&techs=AVX,AVX2) instruction before the `movemask`: ```c++ -unsigned rank(reg x_vec, int* y_ptr) { - reg a = _mm256_load_si256((reg*) y_ptr); - reg b = _mm256_load_si256((reg*) (y_ptr + 8)); +unsigned rank(reg x, int* y) { + reg a = _mm256_load_si256((reg*) y); + reg b = _mm256_load_si256((reg*) (y + 8)); - reg ca = _mm256_cmpgt_epi32(a, x_vec); - reg cb = _mm256_cmpgt_epi32(b, x_vec); + reg ca = _mm256_cmpgt_epi32(a, x); + reg cb = _mm256_cmpgt_epi32(b, x); reg c = _mm256_packs_epi32(ca, cb); int mask = _mm256_movemask_epi8(c); @@ -258,17 +260,17 @@ void update(int &res, int* node, unsigned i) { Stitching it all together (and leaving out some other minor optimizations): ```c++ -int lower_bound(int x) { +int lower_bound(int _x) { int k = 0, res = INT_MAX; - reg x_vec = _mm256_set1_epi32(x - 1); + reg x = _mm256_set1_epi32(_x - 1); for (int h = 0; h < H - 1; h++) { - unsigned i = rank(x_vec, &btree[k]); + unsigned i = rank(x, &btree[k]); update(res, &btree[k], i); k = go(k, i); } // the last branch: if (k < nblocks) { - unsigned i = rank(x_vec, btree[k2]); + 
unsigned i = rank(x, btree[k]); update(res, &btree[k], i); } return res; @@ -314,22 +316,15 @@ constexpr int offset(int h) { } const int H = height(N), S = offset(H); +``` -int *btree; - -void permute(int *node) { - const reg perm_mask = _mm256_set_epi32(3, 2, 1, 0, 7, 6, 5, 4); - reg* middle = (reg*) (node + 4); - reg x = _mm256_loadu_si256(middle); - x = _mm256_permutevar8x32_epi32(x, perm_mask); - _mm256_storeu_si256(middle, x); -} +To be more explicit with pointer arithmetic, the tree is just a single array now: -void prepare(int *a, int n) { - const int P = 1 << 21, T = (4 * S + P - 1) / P * P; - btree = (int*) std::aligned_alloc(P, T); - madvise(btree, T, MADV_HUGEPAGE); +```c++ +int *btree; +``` +```c++ for (int i = N; i < S; i++) btree[i] = INT_MAX; @@ -350,60 +345,22 @@ void prepare(int *a, int n) { for (int i = offset(1); i < S; i += B) permute(btree + i); } +``` -unsigned direct_rank(reg x, int* y) { - reg a = _mm256_load_si256((reg*) y); - reg b = _mm256_load_si256((reg*) (y + 8)); - - reg ca = _mm256_cmpgt_epi32(a, x); - reg cb = _mm256_cmpgt_epi32(b, x); - - int mb = _mm256_movemask_ps((__m256) cb); - int ma = _mm256_movemask_ps((__m256) ca); - - unsigned mask = (1 << 16); - mask |= mb << 8; - mask |= ma; - - return __tzcnt_u32(mask); -} - -unsigned permuted_rank(reg x, int* y) { - reg a = _mm256_load_si256((reg*) y); - reg b = _mm256_load_si256((reg*) (y + 8)); - - reg ca = _mm256_cmpgt_epi32(a, x); - reg cb = _mm256_cmpgt_epi32(b, x); - - reg c = _mm256_packs_epi32(ca, cb); - unsigned mask = _mm256_movemask_epi8(c); - - return __tzcnt_u32(mask)/* >> 1*/; -} - +```c++ int lower_bound(int _x) { unsigned k = 0; reg x = _mm256_set1_epi32(_x - 1); for (int h = H - 1; h > 0; h--) { unsigned i = permuted_rank(x, btree + offset(h) + k); - - //k /= B; - //k *= (B + 1) * B; - // k += (i << 3); - - k = k * (B + 1) + (i << 3); - - //if (N > (1 << 21) && h == 1) - // __builtin_prefetch(btree + k); - - //k += (i << 3); + k = k * (B + 1) + i * B; } unsigned i = 
direct_rank(x, btree + k); return btree[k + i]; } ``` -Helping the compiler out with pointer arithmetic. + ![](../img/search-bplus.svg) From 974e3bcb9687b95f361bdb30f99988b5b6636854 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 17 Feb 2022 09:31:37 +0300 Subject: [PATCH 202/531] note about padding the array in binary search --- content/english/hpc/data-structures/binary-search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 574b089d..c60842d9 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -415,7 +415,7 @@ Just the finishing touch. Did you notice the bumpiness of the Eytzinger search? The latency is ~10ns higher for the array sizes in the form of $1.5 \cdot 2^k$. These are mispredicted branches from the loop itself — the last branch, to be exact. When the array size is far from a power of two, it is hard to predict whether the loop will make $\lfloor \log_2 n \rfloor$ or $\lfloor \log_2 n \rfloor + 1$ iterations, so we have a 50% chance to suffer exactly one branch mispredict. -We can get rid of that last branch by always executing a constant minimum number of iterations and then using predication to optionally make the last comparison against some dummy element — that is guaranteed to be less than $x$ so that its comparison will be canceled: +One way to address it is to pad the array with infinities to the closest power of two, but this wastes memory. 
Instead, we get rid of that last branch by always executing a constant minimum number of iterations and then using predication to optionally make the last comparison against some dummy element — that is guaranteed to be less than $x$ so that its comparison will be canceled: ```c++ t[0] = -1; // an element that is less than x From da7006b636fc0a91798112c676d0b487e4d5162a Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 17 Feb 2022 09:31:59 +0300 Subject: [PATCH 203/531] s-tree problems --- content/english/hpc/data-structures/s-tree.md | 39 +++++++++++++++---- 1 file changed, 31 insertions(+), 8 deletions(-) diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index f3e281be..d1a190c4 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -121,6 +121,24 @@ int cmp(reg x_vec, int* y_ptr) { This function works for 8-element vectors, which is half our block / cache line size. To process the entire block, we need to call it twice and then combine the masks: +```c++ +int mask = ~( + cmp(x, &btree[k][0]) + + (cmp(x, &btree[k][8]) << 8) +); +``` + +Now, to descend down the tree, we use `ffs` on that mask to get the correct child number and just call the `go` function we defined before: + +```c++ +int i = __builtin_ffs(mask) - 1; +k = go(k, i); +``` + +To actually return the result in the end, we'd want to just fetch `btree[k][i]` in the last node we visited, but the problem is that sometimes the local lower bound doesn't exist ($i \ge B$) because $x$ happens to be greater than all the keys in the node. 
We could, in theory, do the same thing we did for the [Eytzinger binary search](../binary-search/#search-implementation) and restore the correct element *after* we calculate the last index, but we don't have a nice bit trick this time and have to do a lot of [divisions by 17](/hpc/arithmetic/division) to compute it, which will be slow and almost certainly not worth it. + +Instead, we can just remember and return the last local lower bound we encountered when we descended the tree: + ```c++ int lower_bound(int _x) { int k = 0, res = INT_MAX; @@ -139,9 +157,7 @@ int lower_bound(int _x) { } ``` -To actually return the result, we'd want to just fetch `btree[k][i]` in the last node we visited, but the problem is that sometimes the local lower bound doesn't exist ($i \ge B$) because $x$ happens to be greater than all the keys in the node. In this case, we need to return the last local lower bound we actually encountered — hence we update the result as we descend down the tree using the `(i < B)` check. - -This implementation outperforms all previous binary search implementations by a huge margin: +This implementation outperforms all previous binary search implementations, and by a huge margin: ![](../img/search-btree.svg) @@ -257,6 +273,8 @@ void update(int &res, int* node, unsigned i) { } ``` +This `update` procedure takes some time, but it's not on the critical path between the iterations, so it doesn't affect the actual performance that much. + Stitching it all together (and leaving out some other minor optimizations): ```c++ @@ -277,17 +295,22 @@ int lower_bound(int _x) { } ``` -All this work saved us 20% or so: +All this work saved us 15-20% or so: ![](../img/search-btree-optimized.svg) -To progress further, we need to change the layout a little bit. +Doesn't feel very satisfying so far, but we will reuse these optimization ideas later. 
-## B+ Tree Layout +There are two main problems with the current implementation: + +- The `update` procedure as is quite costly, especially considering that it is likely to be useless: 16 out of 17 times we can just fetch the result from the last block. +- We do non-constant number of iterations, causing branch prediction problems similar to how it did for the [Eytzinger binary search](/binary-search/#removing-the-last-branch); you can also see it on the graph this time, but the latency bumps have a period of $2^4$. -The `update` seems to be useless: 16 out of 17 times we can just read the element from the last block +To address these problems, we need to change the layout a little bit. + +## B+ Tree Layout -We want to get rid of branches completely — this means having a fixed-height tree. We also probably don't want to do the `update` as it turns out to be quite costly. +The layout is not succinct: we need about some additional memory to store the internal nodes — about $\frac{1}{16}$-th of the original array size, to be exact. B-tree layout From 7046cb068c905a278001cc1cad7136c7691891d6 Mon Sep 17 00:00:00 2001 From: Sangwoo Joh Date: Thu, 17 Feb 2022 17:58:35 +0900 Subject: [PATCH 204/531] Update assembly.md --- content/english/hpc/architecture/assembly.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/architecture/assembly.md b/content/english/hpc/architecture/assembly.md index f92ef812..b393c39c 100644 --- a/content/english/hpc/architecture/assembly.md +++ b/content/english/hpc/architecture/assembly.md @@ -49,7 +49,7 @@ Since there are far more differences between the architectures than just this on For historical reasons, instruction mnemonics in most assembly languages are very terse. Back when people used to write assembly by hand and repeatedly wrote the same set of common instructions, one less character to type was one step away from insanity. 
-For example, `mov` is for "store/load a word", `inc` is for "increment by 1", `mul` for is "multiply", and `idiv` is for "integer division". You can look up the description of an instruction by its name in [one of x86 references](https://www.felixcloutier.com/x86/), but most instructions do what you'd think they do. +For example, `mov` is for "store/load a word", `inc` is for "increment by 1", `mul` is for "multiply", and `idiv` is for "integer division". You can look up the description of an instruction by its name in [one of x86 references](https://www.felixcloutier.com/x86/), but most instructions do what you'd think they do. Most instructions write their result into the first operand, which can also be involved in the computation like in the `add eax, [rdi]` example we saw before. Operands can be either registers, constant values, or memory locations. From a3848f284a3d1f7d90648bda1557977ac7991bf2 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 17 Feb 2022 13:09:59 +0300 Subject: [PATCH 205/531] B+ trees --- .../english/hpc/data-structures/img/bplus.png | Bin 0 -> 33697 bytes content/english/hpc/data-structures/s-tree.md | 68 ++++++++++++------ 2 files changed, 46 insertions(+), 22 deletions(-) create mode 100644 content/english/hpc/data-structures/img/bplus.png diff --git a/content/english/hpc/data-structures/img/bplus.png b/content/english/hpc/data-structures/img/bplus.png new file mode 100644 index 0000000000000000000000000000000000000000..c1090668aad2e2bcc1513f6abcaa9d50324ff88c GIT binary patch literal 33697
+ +The disadvantage is that this layout is not succinct: we need about some additional memory to store the internal nodes — about $\frac{1}{16}$-th of the original array size, to be exact — but the performance improvement will be more than worth it. + +### Implicit B+ Tree B-tree layout We will explain the constexpr functions because this time it is important: ```c++ +// number of B-element blocks in a layer with n keys constexpr int blocks(int n) { return (n + B - 1) / B; } +// number of keys on the layer pervious to one with n element constexpr int prev_keys(int n) { return (blocks(n) + B) / (B + 1) * B; } +// height of a balanced n-key B+ tree constexpr int height(int n) { return (n <= B ? 1 : height(prev_keys(n)) + 1); } +// where each layer starts constexpr int offset(int h) { int k = 0, n = N; while (h--) { @@ -341,35 +361,39 @@ constexpr int offset(int h) { const int H = height(N), S = offset(H); ``` -To be more explicit with pointer arithmetic, the tree is just a single array now: +To be more explicit with pointer arithmetic, the tree is just a single huge-page aligned array `btree` of size `S`. + + +We store in reverse order, but the nodes within a layer and data in them is still left-to-right. This is an arbitrary decision: you can do it the other way around, but it will be slightly harder to code. 
```c++ -int *btree; +memcpy(btree, a, 4 * N); + +for (int i = N; i < S; i++) + btree[i] = INT_MAX; ``` ```c++ - for (int i = N; i < S; i++) - btree[i] = INT_MAX; - - memcpy(btree, a, 4 * N); - - for (int h = 1; h < H; h++) { - for (int i = 0; i < offset(h + 1) - offset(h); i++) { - int k = i / B, - j = i - k * B; - k = k * (B + 1) + j + 1; // compare right - // and then always to the left - for (int l = 0; l < h - 1; l++) - k *= (B + 1); - btree[offset(h) + i] = (k * B < N ? btree[k * B] : INT_MAX); - } +for (int h = 1; h < H; h++) { + for (int i = 0; i < offset(h + 1) - offset(h); i++) { + int k = i / B, + j = i - k * B; + k = k * (B + 1) + j + 1; // compare right + // and then always to the left + for (int l = 0; l < h - 1; l++) + k *= (B + 1); + btree[offset(h) + i] = (k * B < N ? btree[k * B] : INT_MAX); } - - for (int i = offset(1); i < S; i += B) - permute(btree + i); } ``` +```c++ +for (int i = offset(1); i < S; i += B) + permute(btree + i); +``` + +### Searching + ```c++ int lower_bound(int _x) { unsigned k = 0; From 6ee6df9933d29c383bbca8d371fa4b8331a3e62f Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 17 Feb 2022 14:35:01 +0300 Subject: [PATCH 206/531] s+ tree construction and searching --- content/english/hpc/data-structures/s-tree.md | 52 ++++++++++++++----- 1 file changed, 38 insertions(+), 14 deletions(-) diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index 0cf13e27..545c9552 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -312,7 +312,7 @@ To address these problems, we need to change the layout a little bit. Most of the time people talk about B-trees they really mean *B+ trees*, which is a modification that distinguishes between the two types of nodes: -- *Internal nodes* store up to $B$ keys and $(B + 1)$ pointers to child nodes. The key number $i$ always equals the first key of of the $(i + 1)$-th child node. 
+- *Internal nodes* store up to $B$ keys and $(B + 1)$ pointers to child nodes. The key number $i$ is always equal to the smallest key in the subtree of the $(i + 1)$-th child node. - *Data nodes* or *leaves* store up to $B$ keys, the pointer to the next leaf node, and, optionally, an associated value for each key, if the structure is used as a key-value map. Advantages of this approach include faster search time as the internal nodes only store keys and the ability to quickly iterate over a range of entries by following next leaf node pointers, but this comes at the cost of some redundancy: we have to store copies of keys in the internal nodes. @@ -328,9 +328,9 @@ The disadvantage is that this layout is not succinct: we need about some additio ### Implicit B+ Tree -B-tree layout +To be more explicit with pointer arithmetic, we will store the entire tree in a single one-dimensional array. To minimize index computations during run-time, we will store each layer sequentially in this array and use compile-time computed offsets to address them: the keys of the node number `k` on layer `h` start with `btree[offset(h) + k * B]`, and its `i`-th child will at `btree[offset(h - 1) + (k * (B + 1) + i) * B]`. -We will explain the constexpr functions because this time it is important: +To implement all that, we need slightly more `constexpr` functions: ```c++ // number of B-element blocks in a layer with n keys @@ -348,7 +348,7 @@ constexpr int height(int n) { return (n <= B ? 
1 : height(prev_keys(n)) + 1); } -// where each layer starts +// where the layer h starts (0 is the largest) constexpr int offset(int h) { int k = 0, n = N; while (h--) { @@ -358,13 +358,18 @@ constexpr int offset(int h) { return k; } -const int H = height(N), S = offset(H); +const int H = height(N); +const int S = offset(H); // the tree size is the offset of the (non-existent) layer H + +// the tree itself is stored in a single hugepage-aligned array of size S: +int *btree; ``` -To be more explicit with pointer arithmetic, the tree is just a single huge-page aligned array `btree` of size `S`. +Note that we store the layers in reverse order, but the nodes within a layer and data in them is still left-to-right, and also the layers are numbered bottom-up: the leaves form the zeroth layer and the root is the layer `H - 1`. These are just arbitrary decisions — it is just slightly easier to implement in code. +### Construction -We store in reverse order, but the nodes within a layer and data in them is still left-to-right. This is an arbitrary decision: you can do it the other way around, but it will be slightly harder to code. +To construct the tree from a sorted array `a`, we first need to copy it into the zeroth layer and pad it with infinities: ```c++ memcpy(btree, a, 4 * N); @@ -373,27 +378,37 @@ for (int i = N; i < S; i++) btree[i] = INT_MAX; ``` +Now we build the internal nodes, layer by layer. 
For each key, we need to descend to the right of it in, always go left until we reach a leaf node, and then take its first key — it will be the smallest on the subtree: + ```c++ for (int h = 1; h < H; h++) { for (int i = 0; i < offset(h + 1) - offset(h); i++) { + // i = k * B + j int k = i / B, j = i - k * B; - k = k * (B + 1) + j + 1; // compare right + k = k * (B + 1) + j + 1; // compare to the right of the key // and then always to the left for (int l = 0; l < h - 1; l++) k *= (B + 1); + // pad the rest with infinities if the key doesn't exist btree[offset(h) + i] = (k * B < N ? btree[k * B] : INT_MAX); } } ``` +And just the finishing touch — we need to permute keys in internal nodes to search them faster: + ```c++ for (int i = offset(1); i < S; i += B) permute(btree + i); ``` +We start from `offset(1)`, and we specifically don't permute leaf nodes and leave the array in the original sorted order. The motivation is that we'd need to do this complex index translation we do in `update` if the keys were permuted, and it is on the critical path when this is the last operation, so just for this layer, we will switch to the original local lower bound procedure with mask-blending. + ### Searching +The search procedure becomes simpler than for the B-tree layout: we don't need to do `update` and execute a constant number of iterations — although the last one with some special treatment. We slightly optimize the pointer arithmetic by storing `k` already multiplied by `B`: + ```c++ int lower_bound(int _x) { unsigned k = 0; @@ -407,15 +422,23 @@ int lower_bound(int _x) { } ``` - +It is 1.5-3x faster: ![](../img/search-bplus.svg) -Makes more sense to look at it as a relative speedup: +The spikes at the end is the L1 TLB: it has 64 entries for 2. The B+ layout hits it slightly faster because of the ~7% additional memory. Unfortunately, this CPU doesn't support 1G ones. 
-![](../img/search-relative.svg) +64 * 2 = 128MB -### Measuring Actual Latency +### Comparison with `std::lower_bound` + +We've come a long way from binary search: + +![](../img/search-all.svg) + +On these scales, it makes more sense to look at the relative speedup: + +![](../img/search-relative.svg) One huge asterisk we didn't disclosed. @@ -437,6 +460,9 @@ for (int i = 0; i < m; i++) { ![](../img/search-relative-latency.svg) +A lot of the performance boost comes from removing branching and minimizing memory requests, which allows overlapping many queries — around 3 on average. + +Although nobody except maybe the HFT people ever measure actual latency and uses it for benchmarking, this is still something to take into account. ### Modifications @@ -497,8 +523,6 @@ However, they perform better: ## Conclusions -![](../img/search-all.svg) - It may or may not be beneficial to reverse the order in which layers are stored. I only implemented right-to-left because that was easier to code. ### As a Dynamic Tree From 7b888f0cb1d3a3bfbefa29bdb4713d0ef1412e73 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 17 Feb 2022 15:16:13 +0300 Subject: [PATCH 207/531] changing the graphs after they've been rendered --- .../hpc/data-structures/img/search-bplus.svg | 35 +++++++++++++++++- .../img/search-btree-hugepages.svg | 35 +++++++++++++++++- .../img/search-btree-optimized.svg | 37 +++++++++++++++++-- .../hpc/data-structures/img/search-btree.svg | 33 ++++++++++++++++- 4 files changed, 132 insertions(+), 8 deletions(-) diff --git a/content/english/hpc/data-structures/img/search-bplus.svg b/content/english/hpc/data-structures/img/search-bplus.svg index 1081ca31..9962b0e6 100644 --- a/content/english/hpc/data-structures/img/search-bplus.svg +++ b/content/english/hpc/data-structures/img/search-bplus.svg @@ -1235,9 +1235,40 @@ Q 22.953125 48.484375 18.875 42.84375 Q 14.796875 37.203125 14.796875 27.296875 z " id="DejaVuSans-100"/> + - + @@ -1282,7 +1313,7 @@ z " 
id="DejaVuSans-43"/> - + diff --git a/content/english/hpc/data-structures/img/search-btree-hugepages.svg b/content/english/hpc/data-structures/img/search-btree-hugepages.svg index 87a27edc..d36d7869 100644 --- a/content/english/hpc/data-structures/img/search-btree-hugepages.svg +++ b/content/english/hpc/data-structures/img/search-btree-hugepages.svg @@ -1097,9 +1097,40 @@ L 31.203125 23.390625 L 4.890625 23.390625 z " id="DejaVuSans-45"/> + - + @@ -1212,7 +1243,7 @@ z " id="DejaVuSans-112"/> - + diff --git a/content/english/hpc/data-structures/img/search-btree-optimized.svg b/content/english/hpc/data-structures/img/search-btree-optimized.svg index ea008092..89a8472f 100644 --- a/content/english/hpc/data-structures/img/search-btree-optimized.svg +++ b/content/english/hpc/data-structures/img/search-btree-optimized.svg @@ -1180,9 +1180,40 @@ L 31.203125 23.390625 L 4.890625 23.390625 z " id="DejaVuSans-45"/> + - + @@ -1295,7 +1326,7 @@ z " id="DejaVuSans-112"/> - + @@ -1403,7 +1434,7 @@ z " id="DejaVuSans-100"/> - + diff --git a/content/english/hpc/data-structures/img/search-btree.svg b/content/english/hpc/data-structures/img/search-btree.svg index 580b781f..7f78b3ee 100644 --- a/content/english/hpc/data-structures/img/search-btree.svg +++ b/content/english/hpc/data-structures/img/search-btree.svg @@ -1568,9 +1568,40 @@ L 31.203125 23.390625 L 4.890625 23.390625 z " id="DejaVuSans-45"/> + - + From 0854aa1640ffa62a591acb252dc8678a1898c37b Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 17 Feb 2022 15:42:02 +0300 Subject: [PATCH 208/531] real latency note --- content/english/hpc/data-structures/s-tree.md | 41 ++++++++++++++----- 1 file changed, 31 insertions(+), 10 deletions(-) diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index 545c9552..3db36ade 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -422,13 +422,11 @@ int lower_bound(int _x) 
{ } ``` -It is 1.5-3x faster: +Switching to the B+ layout more than paid off: S+ tree is is 1.5-3x faster than optimized S-tree: ![](../img/search-bplus.svg) -The spikes at the end is the L1 TLB: it has 64 entries for 2. The B+ layout hits it slightly faster because of the ~7% additional memory. Unfortunately, this CPU doesn't support 1G ones. - -64 * 2 = 128MB +The spikes at the high end of the graph are caused by the L1 TLB not being large enough: it has 64 entries, so it can handle at most 64 × 2 = 128MB of data, which is exactly what is required for storing `2^25` integers. The S+ tree hits this limit slightly sooner because of the ~7% memory overhead. ### Comparison with `std::lower_bound` @@ -440,14 +438,21 @@ On these scales, it makes more sense to look at the relative speedup: ![](../img/search-relative.svg) -One huge asterisk we didn't disclosed. +The cliffs in the beginning are because the running time of `std::lower_bound` grows smoothly with the array size, while the for an S+ tree it is locally flat and increases in discrete steps when a new layer needs to be added. 
+ +One huge asterisk we haven't discussed is that what we are measuring is not real latency, but the *reciprocal throughput* — the total time it takes to execute a lot of queries divided by the number of queries: ```c++ +clock_t start = clock(); + for (int i = 0; i < m; i++) checksum ^= lower_bound(q[i]); + +float seconds = float(clock() - start) / CLOCKS_PER_SEC; +printf("%.2f ns per query\n", 1e9 * seconds / m); ``` -To measure *actual* latency, we need to introduce a dependency between the iterations, so that the next one can't start before the previous finishes: +To measure *actual* latency, we need to introduce a dependency between the loop iterations so that the next query can't start before the previous completes: ```c++ int last = 0; @@ -458,13 +463,17 @@ for (int i = 0; i < m; i++) { } ``` +Therefore, in terms of real latency, the speedup is not that large: + ![](../img/search-relative-latency.svg) -A lot of the performance boost comes from removing branching and minimizing memory requests, which allows overlapping many queries — around 3 on average. +A lot of the performance boost of S+ tree comes from removing branching and minimizing memory requests, which allows overlapping the execution of more adjacent queries — apparently, around three on average. + + -Although nobody except maybe the HFT people ever measure actual latency and uses it for benchmarking, this is still something to take into account. +Although nobody except maybe the HFT people cares about real latency, and everybody actually measures throughput even when using the word "latency", this nuance is still something to take into account. -### Modifications +### Modifications and Possible Optimizations -To the best of my knowledge, this is a significant improvement over the existing [approaches](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf). 
As before, we are using Clang 10 targeting a Zen 2 CPU, but the relative performance improvements should approximately transfer to other platforms, including Arm-based chips. +To the best of my knowledge, this is a significant improvement over the existing [approaches](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf). As before, we are using Clang 10 targeting a Zen 2 CPU, but the performance improvements should approximately transfer to most other platforms, including Arm-based chips. Use [this single-source benchmark](https://github.com/sslotin/amh-code/blob/main/binsearch/standalone.cc) of the final implementation if you want to test it on your machine. - +This is a large article, and since this is also serves as [textbook](/hpc/) case study, my side my goal is not to develop algorithms, but to teach people how to develop algorithms — so we will be developing a lot of intermediate optimizations first. If you are already an expert, and feel comfortable reading [intrinsic](/hpc/simd/intrinsics)-heavy code with little to no context, you can jump straight to the [final implementation](#implicit-b-tree-1). ## B-Tree Layout @@ -471,7 +467,7 @@ A lot of the performance boost of S+ tree comes from removing branching and mini -Although nobody except maybe the HFT people cares about real latency, and everybody actually measures throughput even when using the word "latency", this nuance is still something to take into account. +Although nobody except maybe the HFT people cares about real latency, and everybody actually measures throughput even when using the word "latency", this nuance is still something to take into account when predicting the possible speedup in user applications. ### Modifications and Possible Optimizations @@ -518,33 +514,40 @@ unsigned rank32(reg x, int *node) { | (cmp(x, node + 24) << 24); ``` +That's it. 
This implementation should outperform even the [state-of-the-art indexes](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) used in high-performance databases, though it's mostly due to the fact that data structures used in real databases have to support fast updates while we don't. + +The problem has more dimensions. + --> -Another idea is to use cache more efficiently. For example, you can execute `_mm256_stream_load_si256` on just the last iteration. +To minimize the number of memory accesses during a query, we can increase the block size. To find the local lower bound in a 32-element node (spanning two cache lines and four AVX2 registers), we can use a [similar trick](https://github.com/sslotin/amh-code/blob/a74495a2c19dddc697f94221629c38fee09fa5ee/binsearch/bplus32.cc#L94) that uses two `packs_epi32` and one `packs_epi16` to combine masks. -They aren't beneficial for throughput: +We can also try to use cache more efficiently by controlling where each tree layer is stored in the cache hierarchy. We can do that by prefetching nodes to a [specific level](/hpc/cpu-cache/prefetching/#software-prefetching) and using [non-temporal reads](/hpc/cpu-cache/bandwidth/#bypassing-the-cache) during queries. + +I implemented these two optimizations: one with a block size of 32 and one where the last read is non-temporal. They don't improve the throughput: ![](../img/search-bplus-other.svg) -However, they perform better: +But they do, however, make the latency lower: ![](../img/search-latency-bplus.svg) -Prefetching one of the children nodes (probably the middle one), which has several more benefits: - -- The hardware prefetcher may also get some its neighbors for us if the data bus is not busy. -- This removes the TLB issues we discussed, as they will be on the same page. We hit it too when $n > 2^25$: unfortunately, this CPU doesn't have 1GB pages. 
-- The RAM may speculate +Ideas that I have not yet managed to implement but consider highly perspective are: -It may or may not be beneficial to reverse the order in which layers are stored. I only implemented right-to-left because that was easier to code. +- Make the block size non-uniform. Upper envelope. I know how to implement it with code generation, but I went for a generic solution [implemented]( +https://github.com/sslotin/amh-code/blob/main/binsearch/bplus-adaptive.cc) it using facilities of modern C++, but the compiler can't unroll a compile-time loop. +- Group a node with one or two of generations of its descendants (~300 nodes / ~5k keys) so that they are close in memory. Similar to how [FAST](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) does it with what they call hierarchical blocking. +- Optionally use prefetching of on some specific layers. The hardware prefetcher may also get some its neighbors for us if the data bus is not busy. This removes the TLB issues we discussed, as they will be on the same page. We hit it too when $n > 2^{25}$: unfortunately, this CPU doesn't have 1GB pages. The RAM may speculate. -If we only need the index, we can permuting the nodes of the last layer and use the optimized procedure. +Other minor optimizations include: -Upper envelope. +- It may or may not be beneficial to reverse the order in which layers are stored. I only implemented right-to-left because that was easier to code. +- If we only need the index, we can permuting the nodes of the last layer and use the optimized procedure. +- Implement it in assembly, as the compile-generated versions don't look the most optimal. -I would not be surprised if it is possible to make a 10-20% improvement for a total 10x speedup over `std::lower_bound` on large arrays. +With these implemented, I would not be surprised to see another 10-20% improvement — for a total of 10x speedup over `std::lower_bound` on large arrays. 
-https://github.com/sslotin/amh-code/blob/main/binsearch/bplus-adaptive.cc +Note that this implementation is very specific to the architecture. Older CPUs and CPUs on mobile devices don't have 256-bit wide registers and will crash (but they likely have 128-bit SIMD so the loop can still be split in 4 parts instead of 2), non-Intel CPUs have their own instruction sets for SIMD, and some computers even have different cache line size. NEON would require some [trickery](https://github.com/WebAssembly/simd/issues/131) ### As a Dynamic Tree @@ -560,17 +563,11 @@ My next priorities is to adapt it to segment trees, which I know how to do, and A ~15x improvement is definitely worth it — and the memory overhead is not large, as we only need to store pointers (indices, actually) for internal nodes. It may be higher, because we need to fetch two separate memory blocks, or lower, because we need to handle updates somehow. Either way, this will be an interesting optimization problem. -That's it. This implementation should outperform even the [state-of-the-art indexes](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) used in high-performance databases, though it's mostly due to the fact that data structures used in real databases have to support fast updates while we don't. - -The problem has more dimensions. - -NEON would require some [trickery](https://github.com/WebAssembly/simd/issues/131) - -Note that this implementation is very specific to the architecture. Older CPUs and CPUs on mobile devices don't have 256-bit wide registers and will crash (but they likely have 128-bit SIMD so the loop can still be split in 4 parts instead of 2), non-Intel CPUs have their own instruction sets for SIMD, and some computers even have different cache line size. +It seems possible to implement a 10-20x faster `std::set` and a 3-5x faster `absl::btree_set`. 
### Acknowledgements -This [StackOverflow answer](https://stackoverflow.com/questions/20616605/using-simd-avx-sse-for-tree-traversal) by Cory Nelson is where I took the permuted SIMD routine. +This [StackOverflow answer](https://stackoverflow.com/questions/20616605/using-simd-avx-sse-for-tree-traversal) by Cory Nelson is where I took the permuted 16-element search trick from. + +With these optimizations implemented, I would not be surprised to see another 10-30% improvement and over 10x speedup over `std::lower_bound` on large arrays for some platforms. ### As a Dynamic Tree -When we compare S+ trees to `std::set` where we add the same elements and search for the same lower bounds (not counting the time it took to add them), the comparison is even more favorable: +The comparison is even more favorable against `std::set` and other pointer-based trees. In our benchmark, we add the same elements (without measuring the time it takes to add them) and use the same lower bound queries, and the S+ tree is up to 30x better: ![](../img/search-set-relative.svg) +This suggests that we can probably use this approach to also improve on *dynamic* search trees by a huge margin. + +To validate this hypothesis, I added an array of 17 indices per each node that point to where their children should be, and used this array instead of implicit numbering. This array is separate from the tree, not aligned, isn't even on a hugepage, and the only optimization is that the first and the last pointer of a node is prefetched. + +I also added [B-tree from Abseil](https://abseil.io/blog/20190812-btree) to the comparison, which is the only widely-used B-tree implementation I know of. It performs just slightly better than `std::lower_bound`, while the S+ tree with pointers is ~15x faster for large arrays: + + + ![](../img/search-set-relative-all.svg) +Of course, this comparison is not fair, as dynamic search tree is a more high-dimensional problem. 
We'd also need to implement the update operation, which will not be that efficient, and for which we'd need to sacrifice our fanout factor. But it still seems possible to implement a 10-20x faster `std::set` and a 3-5x faster `absl::btree_set`, depending on how you define "faster" — and this is one of the next things we'll try to do. + + + ### Acknowledgements From 44e055afda9e937033bde7545137ba5ab99eb405 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 17 Feb 2022 19:17:10 +0300 Subject: [PATCH 211/531] final edits to s-tree before publishing --- content/english/hpc/algorithms/prefix.md | 2 +- .../hpc/data-structures/binary-search.md | 2 +- content/english/hpc/data-structures/s-tree.md | 94 +++++++++---------- 3 files changed, 48 insertions(+), 50 deletions(-) diff --git a/content/english/hpc/algorithms/prefix.md b/content/english/hpc/algorithms/prefix.md index b503682a..5e31570d 100644 --- a/content/english/hpc/algorithms/prefix.md +++ b/content/english/hpc/algorithms/prefix.md @@ -29,7 +29,7 @@ void prefix(int *a, int n) { } ``` -It seems like we need two reads, an add, and a write per each iteration, but of course, the compiler optimizes the extra read away and uses a register as the accumulator: +It seems like we need two reads, an add, and a write on each iteration, but of course, the compiler optimizes the extra read away and uses a register as the accumulator: ```nasm loop: diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index c60842d9..1145e926 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -397,7 +397,7 @@ Also, note that the last few prefetch requests are actually not needed, and in f This prefetching technique allows us to read up to four elements ahead, but it doesn't really come for free — we are effectively trading off excess memory [bandwidth](/hpc/cpu-cache/bandwidth) for reduced 
[latency](/hpc/cpu-cache/latency). If you run more than one instance at a time on separate hardware threads or just any other memory-intensive computation in the background, it will significantly [affect](/hpc/cpu-cache/sharing) the benchmark performance. -But we can do better. Instead of fetching four cache lines at a time, we could fetch four times *fewer* cache lines. And in the next article, we will explore the approach. +But we can do better. Instead of fetching four cache lines at a time, we could fetch four times *fewer* cache lines. And in the [next article](../s-tree), we will explore the approach. -Next, we can find the local lower bound in nodes faster. Instead of calculating it separately for two 8-element blocks and merging masks, we can use one [packs](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=3037,4870,6715,4845,3853,90,7307,5993,2692,6946,6949,5456,6938,5456,1021,3007,514,518,7253,7183,3892,5135,5260,3915,4027,3873,7401,4376,4229,151,2324,2310,2324,4075,6130,4875,6385,5259,6385,6250,1395,7253,6452,7492,4669,4669,7253,1039,1029,4669,4707,7253,7242,848,879,848,7251,4275,879,874,849,833,6046,7250,4870,4872,4875,849,849,5144,4875,4787,4787,4787,5227,7359,7335,7392,4787,5259,5230,5223,6438,488,483,6165,6570,6554,289,6792,6554,5230,6385,5260,5259,289,288,3037,3009,590,604,5230,5259,6554,6554,5259,6547,6554,3841,5214,5229,5260,5259,7335,5259,519,1029,515,3009,3009,3011,515,6527,652,6527,6554,288,3841,5230,5259,5230,5259,305,5259,591,633,633,5259,5230,5259,5259,3017,3018,3037,3018,3017,3016,3013,5144&text=_mm256_packs_epi32&techs=AVX,AVX2) instruction before the `movemask`: +Next, we can find the local lower bound in nodes faster. 
Instead of calculating it separately for two 8-element blocks and merging two 8-bit masks, we combine the vector masks using the [packs](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=3037,4870,6715,4845,3853,90,7307,5993,2692,6946,6949,5456,6938,5456,1021,3007,514,518,7253,7183,3892,5135,5260,3915,4027,3873,7401,4376,4229,151,2324,2310,2324,4075,6130,4875,6385,5259,6385,6250,1395,7253,6452,7492,4669,4669,7253,1039,1029,4669,4707,7253,7242,848,879,848,7251,4275,879,874,849,833,6046,7250,4870,4872,4875,849,849,5144,4875,4787,4787,4787,5227,7359,7335,7392,4787,5259,5230,5223,6438,488,483,6165,6570,6554,289,6792,6554,5230,6385,5260,5259,289,288,3037,3009,590,604,5230,5259,6554,6554,5259,6547,6554,3841,5214,5229,5260,5259,7335,5259,519,1029,515,3009,3009,3011,515,6527,652,6527,6554,288,3841,5230,5259,5230,5259,305,5259,591,633,633,5259,5230,5259,5259,3017,3018,3037,3018,3017,3016,3013,5144&text=_mm256_packs_epi32&techs=AVX,AVX2) instruction and readily extract it using `movemask` just once: ```c++ unsigned rank(reg x, int* y) { @@ -236,9 +235,9 @@ unsigned rank(reg x, int* y) { } ``` -This instruction converts 32-bit integers stored in two registers to 16-bit integers stored in one register — in our case, effectively joining the vector masks into one. Note that we've swapped the order of comparison — this lets us not invert the mask in the end, but we have to subtract one from the search key once in the beginning to make it correct (otherwise it works as `upper_bound`). +This instruction converts 32-bit integers stored in two registers to 16-bit integers stored in one register — in our case, effectively joining the vector masks into one. Note that we've swapped the order of comparison — this lets us not invert the mask in the end, but we have to subtract one from the search key once in the beginning to make it correct (otherwise, it works as `upper_bound`). 
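The effect of the swapped comparison can be checked without SIMD. The scalar sketch below is an illustration, not code from the patch, and `first_greater` is a hypothetical helper. It finds the first key strictly greater than a value, which is what a "key greater than x" vector mask followed by a trailing-zero count produces. Called with `x`, it behaves like `upper_bound`; called with `x - 1`, it behaves like `lower_bound` for integer keys. It assumes `x` is not `INT_MIN`, since `x - 1` would overflow.

```c++
#include <algorithm>
#include <cassert>
#include <vector>

// index of the first element strictly greater than x (or y.size() if none)
int first_greater(const std::vector<int> &y, int x) {
    int i = 0;
    while (i < (int) y.size() && !(y[i] > x))
        i++;
    return i;
}

// reference answers from the standard library, as indices into y
int std_lower(const std::vector<int> &y, int x) {
    return (int) (std::lower_bound(y.begin(), y.end(), x) - y.begin());
}

int std_upper(const std::vector<int> &y, int x) {
    return (int) (std::upper_bound(y.begin(), y.end(), x) - y.begin());
}
```

For a sorted node such as `{1, 3, 3, 5, 8}` and the query `3`, the unshifted comparison skips both copies of `3`, while the shifted one stops at the first of them.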
-The problem is, it does this weird interleaving where the result is written in the `a1 b1 a2 b2` order instead of `a1 a2 b1 b2` that want. To correct this, we need to [permute](/hpc/simd/shuffling) the resulting vector, but instead of doing this during the query time, we can just permute every node during preprocessing: +The problem is, it does this weird interleaving where the result is written in the `a1 b1 a2 b2` order instead of `a1 a2 b1 b2` that we want — many AVX2 instructions tend to do that. To correct this, we need to [permute](/hpc/simd/shuffling) the resulting vector, but instead of doing it during the query time, we can just permute every node during preprocessing: ```c++ void permute(int *node) { @@ -250,9 +249,9 @@ void permute(int *node) { } ``` -We just call `permute(&btree[k])` right after we are done with building a node. There are probably faster ways to swap middle elements, but we will leave it here, as the preprocessing time is not important for now. +Now we just call `permute(&btree[k])` right after we are done building the node. There are probably faster ways to swap the middle elements, but we will leave it here as the preprocessing time is not that important for now. -This new SIMD routine is significantly faster because the extra `movemask` was slow and also blending the two masks took quite a few instructions. Unfortunately, we now can't just do the `res = btree[k][i]` update anymore because the elements are permuted. We can solve this problem with some bit-level trickery in terms of `i`, but indexing a small lookup table turns out to be faster and also doesn't require a new branch: +This new SIMD routine is significantly faster because the extra `movemask` is slow, and also blending the two masks takes quite a few instructions. Unfortunately, we now can't just do the `res = btree[k][i]` update anymore because the elements are permuted. 
We can solve this problem with some bit-level trickery in terms of `i`, but indexing a small lookup table turns out to be faster and also doesn't require a new branch: ```c++ const int translate[17] = { @@ -295,32 +294,32 @@ All this work saved us 15-20% or so: ![](../img/search-btree-optimized.svg) -Doesn't feel very satisfying so far, but we will reuse these optimization ideas later. +It doesn't feel very satisfying so far, but we will reuse these optimization ideas later. There are two main problems with the current implementation: -- The `update` procedure as is quite costly, especially considering that it is likely to be useless: 16 out of 17 times we can just fetch the result from the last block. -- We do non-constant number of iterations, causing branch prediction problems similar to how it did for the [Eytzinger binary search](/binary-search/#removing-the-last-branch); you can also see it on the graph this time, but the latency bumps have a period of $2^4$. +- The `update` procedure is quite costly, especially considering that it is very likely going to be useless: 16 out of 17 times, we can just fetch the result from the last block. +- We do a non-constant number of iterations, causing branch prediction problems similar to how it did for the [Eytzinger binary search](/binary-search/#removing-the-last-branch); you can also see it on the graph this time, but the latency bumps have a period of $2^4$. To address these problems, we need to change the layout a little bit. ## B+ Tree Layout -Most of the time people talk about B-trees they really mean *B+ trees*, which is a modification that distinguishes between the two types of nodes: +Most of the time, when people talk about B-trees, they really mean *B+ trees*, which is a modification that distinguishes between the two types of nodes: - *Internal nodes* store up to $B$ keys and $(B + 1)$ pointers to child nodes. 
The key number $i$ is always equal to the smallest key in the subtree of the $(i + 1)$-th child node. -- *Data nodes* or *leaves* store up to $B$ keys, the pointer to the next leaf node, and, optionally, an associated value for each key, if the structure is used as a key-value map. +- *Data nodes* or *leaves* store up to $B$ keys, the pointer to the next leaf node, and, optionally, an associated value for each key — if the structure is used as a key-value map. -Advantages of this approach include faster search time as the internal nodes only store keys and the ability to quickly iterate over a range of entries by following next leaf node pointers, but this comes at the cost of some redundancy: we have to store copies of keys in the internal nodes. +The advantages of this approach include faster search time (as the internal nodes only store keys) and the ability to quickly iterate over a range of entries (by following next leaf node pointers), but this comes at the cost of some memory overhead: we have to store copies of keys in the internal nodes. ![A B+ tree of order 4](../img/bplus.png) Back to our use case, this layout can help us solve our two problems: -- Either the last node we descend into is has the local lower bound, or it is the first key of the next leaf node, so we don't need to call `update` on each iteration. +- Either the last node we descend into has the local lower bound, or it is the first key of the next leaf node, so we don't need to call `update` on each iteration. - The depth of all leaves is constant because B+ trees grow at the root and not at the leaves, which removes the need for branching. -The disadvantage is that this layout is not succinct: we need about some additional memory to store the internal nodes — about $\frac{1}{16}$-th of the original array size, to be exact — but the performance improvement will be more than worth it. 
+The disadvantage is that this layout is not succinct: we need some additional memory to store the internal nodes — about $\frac{1}{16}$-th of the original array size, to be exact — but the performance improvement will be more than worth it. ### Implicit B+ Tree @@ -357,11 +356,10 @@ constexpr int offset(int h) { const int H = height(N); const int S = offset(H); // the tree size is the offset of the (non-existent) layer H -// the tree itself is stored in a single hugepage-aligned array of size S: -int *btree; +int *btree; // the tree itself is stored in a single hugepage-aligned array of size S ``` -Note that we store the layers in reverse order, but the nodes within a layer and data in them is still left-to-right, and also the layers are numbered bottom-up: the leaves form the zeroth layer and the root is the layer `H - 1`. These are just arbitrary decisions — it is just slightly easier to implement in code. +Note that we store the layers in reverse order, but the nodes within a layer and data in them are still left-to-right, and also the layers are numbered bottom-up: the leaves form the zeroth layer, and the root is the layer `H - 1`. These are just arbitrary decisions — it is just slightly easier to implement in code. ### Construction @@ -399,15 +397,15 @@ for (int i = offset(1); i < S; i += B) permute(btree + i); ``` -We start from `offset(1)`, and we specifically don't permute leaf nodes and leave the array in the original sorted order. The motivation is that we'd need to do this complex index translation we do in `update` if the keys were permuted, and it is on the critical path when this is the last operation, so just for this layer, we will switch to the original local lower bound procedure with mask-blending. +We start from `offset(1)`, and we specifically don't permute leaf nodes and leave the array in the original sorted order. 
The motivation is that we'd need to do this complex index translation we do in `update` if the keys were permuted, and it is on the critical path when this is the last operation. So, just for this layer, we switch to the original mask-blending local lower bound procedure. ### Searching -The search procedure becomes simpler than for the B-tree layout: we don't need to do `update` and execute a constant number of iterations — although the last one with some special treatment. We slightly optimize the pointer arithmetic by storing `k` already multiplied by `B`: +The search procedure becomes simpler than for the B-tree layout: we don't need to do `update` and only execute a fixed number of iterations — although the last one with some special treatment: ```c++ int lower_bound(int _x) { - unsigned k = 0; + unsigned k = 0; // we assume k already multiplied by B to optimize pointer arithmetic reg x = _mm256_set1_epi32(_x - 1); for (int h = H - 1; h > 0; h--) { unsigned i = permuted_rank(x, btree + offset(h) + k); @@ -418,7 +416,7 @@ int lower_bound(int _x) { } ``` -Switching to the B+ layout more than paid off: S+ tree is is 1.5-3x faster than optimized S-tree: +Switching to the B+ layout more than paid off: this S+ tree is 1.5-3x faster compared to the optimized S-tree: ![](../img/search-bplus.svg) @@ -434,9 +432,9 @@ On these scales, it makes more sense to look at the relative speedup: ![](../img/search-relative.svg) -The cliffs in the beginning are because the running time of `std::lower_bound` grows smoothly with the array size, while the for an S+ tree it is locally flat and increases in discrete steps when a new layer needs to be added. +The cliffs at the beginning of the graph are because the running time of `std::lower_bound` grows smoothly with the array size, while for an S+ tree, it is locally flat and increases in discrete steps when a new layer needs to be added. 
-One huge asterisk we haven't discussed is that what we are measuring is not real latency, but the *reciprocal throughput* — the total time it takes to execute a lot of queries divided by the number of queries: +One important asterisk we haven't discussed is that what we are measuring is not real latency, but the *reciprocal throughput* — the total time it takes to execute a lot of queries divided by the number of queries: ```c++ clock_t start = clock(); @@ -448,7 +446,7 @@ float seconds = float(clock() - start) / CLOCKS_PER_SEC; printf("%.2f ns per query\n", 1e9 * seconds / m); ``` -To measure *actual* latency, we need to introduce a dependency between the loop iterations so that the next query can't start before the previous completes: +To measure *actual* latency, we need to introduce a dependency between the loop iterations so that the next query can't start before the previous one finishes: ```c++ int last = 0; @@ -459,11 +457,11 @@ for (int i = 0; i < m; i++) { } ``` -Therefore, in terms of real latency, the speedup is not that large: +In terms of real latency, the speedup is not that impressive: ![](../img/search-relative-latency.svg) -A lot of the performance boost of S+ tree comes from removing branching and minimizing memory requests, which allows overlapping the execution of more adjacent queries — apparently, around three on average. +A lot of the performance boost of the S+ tree comes from removing branching and minimizing memory requests, which allows overlapping the execution of more adjacent queries — apparently, around three on average. @@ -522,24 +520,24 @@ The problem has more dimensions. To minimize the number of memory accesses during a query, we can increase the block size. 
To find the local lower bound in a 32-element node (spanning two cache lines and four AVX2 registers), we can use a [similar trick](https://github.com/sslotin/amh-code/blob/a74495a2c19dddc697f94221629c38fee09fa5ee/binsearch/bplus32.cc#L94) that uses two `packs_epi32` and one `packs_epi16` to combine masks. -We can also try to use cache more efficiently by controlling where each tree layer is stored in the cache hierarchy. We can do that by prefetching nodes to a [specific level](/hpc/cpu-cache/prefetching/#software-prefetching) and using [non-temporal reads](/hpc/cpu-cache/bandwidth/#bypassing-the-cache) during queries. +We can also try to use the cache more efficiently by controlling where each tree layer is stored in the cache hierarchy. We can do that by prefetching nodes to a [specific level](/hpc/cpu-cache/prefetching/#software-prefetching) and using [non-temporal reads](/hpc/cpu-cache/bandwidth/#bypassing-the-cache) during queries. -I implemented these two optimizations: one with a block size of 32 and one where the last read is non-temporal. They don't improve the throughput: +I implemented these two optimizations: the one with a block size of 32 and the one where the last read is non-temporal. They don't improve the throughput: ![](../img/search-bplus-other.svg) -But they do make the latency lower: +…but they do make the latency lower: ![](../img/search-latency-bplus.svg) Ideas that I have not yet managed to implement but consider highly perspective are: -- Make the block size non-uniform. The motivation is that the slowdown from having one 32-element layer is less than from having two separate layers. Also the root is often not full, so perhaps it should have only 8 keys or even just one key. Picking the most optimal layer configuration for a given array size should to remove the spikes from the relative speedup graph and make it look more like its upper envelope. +- Make the block size non-uniform. 
The motivation is that the slowdown from having one 32-element layer is less than from having two separate layers. Also, the root is often not full, so perhaps it should have only 8 keys or even just one key. Picking the optimal layer configuration for a given array size should remove the spikes from the relative speedup graph and make it look more like its upper envelope. I know how to do it with code generation, but I went for a generic solution and tried to [implement]( https://github.com/sslotin/amh-code/blob/main/binsearch/bplus-adaptive.cc) it with the facilities of modern C++, but the compiler can't produce optimal code this way. -- Group nodes with one or two of generations of its descendants (~300 nodes / ~5k keys) so that they are close in memory, similar to what [FAST](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) calls hierarchical blocking. This reduces the severity of TLB misses and also may improve latency as the memory controller chooses to keep the [RAM row buffer](/hpc/cpu-cache/aos-soa/#ram-specific-timings) open, anticipating local reads. -- Optionally use prefetching on some specific layers. In addition to the $\frac{1}{17}$-th chance of it actually fetching the node we need, the hardware prefetcher may also get some its neighbors for us if the data bus is not busy. It also has the same TLB and row buffer effects as with blocking. +- Group nodes with one or two generations of its descendants (~300 nodes / ~5k keys) so that they are close in memory, similar to what [FAST](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) calls hierarchical blocking. This reduces the severity of TLB misses and also may improve latency as the memory controller chooses to keep the [RAM row buffer](/hpc/cpu-cache/aos-soa/#ram-specific-timings) open, anticipating local reads. +- Optionally use prefetching on some specific layers. 
Aside from to the $\frac{1}{17}$-th chance of it fetching the node we need, the hardware prefetcher may also get some of its neighbors for us if the data bus is not busy. It also has the same TLB and row buffer effects as with blocking. Other possible minor optimizations include: @@ -555,7 +553,7 @@ Mobile and some older CPUs only have 128-bit wide registers, and some high-end C --> -With these optimizations implemented, I would not be surprised to see another 10-30% improvement and over 10x speedup over `std::lower_bound` on large arrays for some platforms. +With these optimizations implemented, I wouldn't be surprised to see another 10-30% improvement and over 10x speedup over `std::lower_bound` on large arrays for some platforms. ### As a Dynamic Tree @@ -565,7 +563,7 @@ The comparison is even more favorable against `std::set` and other pointer-based This suggests that we can probably use this approach to also improve on *dynamic* search trees by a huge margin. -To validate this hypothesis, I added an array of 17 indices per each node that point to where their children should be, and used this array instead of implicit numbering. This array is separate from the tree, not aligned, isn't even on a hugepage, and the only optimization is that the first and the last pointer of a node is prefetched. +To validate this hypothesis, I added an array of 17 indices for each node that point to where their children should be and used this array instead of implicit numbering. This array is separate from the tree, not aligned, isn't even on a hugepage, and the only optimization is that the first and the last pointer of a node is prefetched. I also added [B-tree from Abseil](https://abseil.io/blog/20190812-btree) to the comparison, which is the only widely-used B-tree implementation I know of. 
It performs just slightly better than `std::lower_bound`, while the S+ tree with pointers is ~15x faster for large arrays: @@ -579,7 +577,7 @@ My next priorities is to adapt it to segment trees, which I know how to do, and ![](../img/search-set-relative-all.svg) -Of course, this comparison is not fair, as dynamic search tree is a more high-dimensional problem. We'd also need to implement the update operation, which will not be that efficient, and for which we'd need to sacrifice our fanout factor. But it still seems possible to implement a 10-20x faster `std::set` and a 3-5x faster `absl::btree_set`, depending on how you define "faster" — and this is one of the next things we'll try to do. +Of course, this comparison is not fair, as the dynamic search tree is a more high-dimensional problem. We'd also need to implement the update operation, which will not be that efficient, and for which we'd need to sacrifice our fanout factor. But it still seems possible to implement a 10-20x faster `std::set` and a 3-5x faster `absl::btree_set`, depending on how you define "faster" — and this is one of the next things we'll try to do. 
Here is the heatmap visualizing the expected frequency of comparisons for a 31-element array: From 8bf7569d58c7ae51f01447bd41ed0027e4a2ace7 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 18 Feb 2022 00:17:00 +0300 Subject: [PATCH 213/531] s-tree edits --- .../hpc/data-structures/img/search-all.svg | 188 ++++++++++++++---- content/english/hpc/data-structures/s-tree.md | 20 +- 2 files changed, 158 insertions(+), 50 deletions(-) diff --git a/content/english/hpc/data-structures/img/search-all.svg b/content/english/hpc/data-structures/img/search-all.svg index 66508a1f..e467869d 100644 --- a/content/english/hpc/data-structures/img/search-all.svg +++ b/content/english/hpc/data-structures/img/search-all.svg [SVG markup hunks for the search-all.svg plot omitted] diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index 5a3d980d..38dc6a25 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -522,7 +522,7 @@ To minimize the number of memory accesses during a query, we can increase the bl We can also try to use the cache more efficiently by controlling where each tree 
layer is stored in the cache hierarchy. We can do that by prefetching nodes to a [specific level](/hpc/cpu-cache/prefetching/#software-prefetching) and using [non-temporal reads](/hpc/cpu-cache/bandwidth/#bypassing-the-cache) during queries. -I implemented these two optimizations: the one with a block size of 32 and the one where the last read is non-temporal. They don't improve the throughput: +I implemented two versions of these optimizations: the one with a block size of 32 and the one where the last read is non-temporal. They don't improve the throughput: ![](../img/search-bplus-other.svg) @@ -532,20 +532,20 @@ I implemented these two optimizations: the one with a block size of 32 and the o Ideas that I have not yet managed to implement but consider highly perspective are: -- Make the block size non-uniform. The motivation is that the slowdown from having one 32-element layer is less than from having two separate layers. Also, the root is often not full, so perhaps it should have only 8 keys or even just one key. Picking the optimal layer configuration for a given array size should remove the spikes from the relative speedup graph and make it look more like its upper envelope. +- Make the block size non-uniform. The motivation is that the slowdown from having one 32-element layer is less than from having two separate layers. Also, the root is often not full, so perhaps sometimes it should have only 8 keys or even just one key. Picking the optimal layer configuration for a given array size should remove the spikes from the relative speedup graph and make it look more like its upper envelope. I know how to do it with code generation, but I went for a generic solution and tried to [implement]( https://github.com/sslotin/amh-code/blob/main/binsearch/bplus-adaptive.cc) it with the facilities of modern C++, but the compiler can't produce optimal code this way. 
-- Group nodes with one or two generations of its descendants (~300 nodes / ~5k keys) so that they are close in memory, similar to what [FAST](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) calls hierarchical blocking. This reduces the severity of TLB misses and also may improve latency as the memory controller chooses to keep the [RAM row buffer](/hpc/cpu-cache/aos-soa/#ram-specific-timings) open, anticipating local reads. +- Group nodes with one or two generations of its descendants (~300 nodes / ~5k keys) so that they are close in memory — in the spirit of what [FAST](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) calls hierarchical blocking. This reduces the severity of TLB misses and also may improve latency as the memory controller may choose to keep the [RAM row buffer](/hpc/cpu-cache/aos-soa/#ram-specific-timings) open, anticipating local reads. - Optionally use prefetching on some specific layers. Aside from to the $\frac{1}{17}$-th chance of it fetching the node we need, the hardware prefetcher may also get some of its neighbors for us if the data bus is not busy. It also has the same TLB and row buffer effects as with blocking. Other possible minor optimizations include: -- Permuting the nodes of the last layer also — if we only need the index and not the value. +- Permuting the nodes of the last layer as well — if we only need the index and not the value. - Reversing the order in which the layers are stored to left-to-right so that the first few layers are on the same page. - Rewriting the whole thing in assembly, as the compiler seems to struggle with pointer arithmetic. -Note that our implementation is specific to the AVX2 and may require some non-trivial changes to adapt to other platforms. It would be interesting to port it for Intel CPUs with AVX-512 and Arm CPUs with 128-bit NEON, which may require some [trickery](https://github.com/WebAssembly/simd/issues/131) to work. 
+Note that the current implementation is specific to AVX2 and may require some non-trivial changes to adapt to other platforms. It would be interesting to port it for Intel CPUs with AVX-512 and Arm CPUs with 128-bit NEON, which may require some [trickery](https://github.com/WebAssembly/simd/issues/131) to work. -The disadvantage is that this layout is not succinct: we need some additional memory to store the internal nodes — about $\frac{1}{16}$-th of the original array size, to be exact — but the performance improvement will be more than worth it. +The disadvantage is that this layout is not *succinct*: we need some additional memory to store the internal nodes — about $\frac{1}{16}$-th of the original array size, to be exact — but the performance improvement will be more than worth it. ### Implicit B+ Tree @@ -416,7 +416,7 @@ int lower_bound(int _x) { } ``` -Switching to the B+ layout more than paid off: this S+ tree is 1.5-3x faster compared to the optimized S-tree: +Switching to the B+ layout more than paid off: the S+ tree is 1.5-3x faster compared to the optimized S-tree: ![](../img/search-bplus.svg) @@ -536,7 +536,7 @@ Ideas that I have not yet managed to implement but consider highly perspective a I know how to do it with code generation, but I went for a generic solution and tried to [implement]( https://github.com/sslotin/amh-code/blob/main/binsearch/bplus-adaptive.cc) it with the facilities of modern C++, but the compiler can't produce optimal code this way. -- Group nodes with one or two generations of its descendants (~300 nodes / ~5k keys) so that they are close in memory — in the spirit of what [FAST](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) calls hierarchical blocking. This reduces the severity of TLB misses and also may improve latency as the memory controller may choose to keep the [RAM row buffer](/hpc/cpu-cache/aos-soa/#ram-specific-timings) open, anticipating local reads. 
+- Group nodes with one or two generations of its descendants (~300 nodes / ~5k keys) so that they are close in memory — in the spirit of what [FAST](http://kaldewey.com/pubs/FAST__SIGMOD10.pdf) calls hierarchical blocking. This reduces the severity of TLB misses and also may improve the latency as the memory controller may choose to keep the [RAM row buffer](/hpc/cpu-cache/aos-soa/#ram-specific-timings) open, anticipating local reads. - Optionally use prefetching on some specific layers. Aside from to the $\frac{1}{17}$-th chance of it fetching the node we need, the hardware prefetcher may also get some of its neighbors for us if the data bus is not busy. It also has the same TLB and row buffer effects as with blocking. Other possible minor optimizations include: @@ -544,7 +544,7 @@ Other possible minor optimizations include: - Permuting the nodes of the last layer as well — if we only need the index and not the value. - Reversing the order in which the layers are stored to left-to-right so that the first few layers are on the same page. - Rewriting the whole thing in assembly, as the compiler seems to struggle with pointer arithmetic. -- Using [blending](/hpc/simd/masking) instead of `packs`: you can odd-even shuffle node keys (`[1 3 5 7] [2 4 6 8]`), compare against the search key, and then blend the low 16 bits of the first register mask with the high 16 bits of the second. Blending is slightly faster on many architectures, and it may also help to alternate between packing and blending for as they use different subsets of ports. (Thanks to Const-me from HackerNews for [suggesting](https://news.ycombinator.com/item?id=30381912) it.) +- Using [blending](/hpc/simd/masking) instead of `packs`: you can odd-even shuffle node keys (`[1 3 5 7] [2 4 6 8]`), compare against the search key, and then blend the low 16 bits of the first register mask with the high 16 bits of the second. 
Blending is slightly faster on many architectures, and it may also help to alternate between packing and blending as they use different subsets of ports. (Thanks to Const-me from HackerNews for [suggesting](https://news.ycombinator.com/item?id=30381912) it.) Note that the current implementation is specific to AVX2 and may require some non-trivial changes to adapt to other platforms. It would be interesting to port it for Intel CPUs with AVX-512 and Arm CPUs with 128-bit NEON, which may require some [trickery](https://github.com/WebAssembly/simd/issues/131) to work. From 6b248ea29c863062bdbc7d3e242857591a1b4eea Mon Sep 17 00:00:00 2001 From: Scott Wang Date: Fri, 18 Feb 2022 17:35:34 +0800 Subject: [PATCH 217/531] Fix typo --- content/english/hpc/pipelining/branchless.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index 72b3e37f..9717bfd4 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -63,7 +63,7 @@ So the code above is actually closer to using a ternary operator like this: ```c++ for (int i = 0; i < N; i++) - s += (a[i] < 50 : a[i] : 0); + s += (a[i] < 50 ? a[i] : 0); ``` Both variants are optimized by the compiler and produce the following assembly: From 82fa2eca7dd37a7403a762b6073d603fb757e5b1 Mon Sep 17 00:00:00 2001 From: Christopher Sahnwaldt Date: Fri, 18 Feb 2022 11:45:46 +0100 Subject: [PATCH 218/531] s-tree.md: fix index formula The code below is correct, the math was off. 
--- content/english/hpc/data-structures/s-tree.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index 193e0292..78663b63 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -48,7 +48,7 @@ Storing and fetching pointers in a B-tree node wastes precious cache space and d One of the ways to achieve this is by generalizing the [Eytzinger numeration](../binary-search#eytzinger-layout) to $(B + 1)$-ary trees: - The root node is numbered $0$. -- Node $k$ has $(B + 1)$ child nodes numbered $\\{k \cdot (B+1) + i\\}$ for $i \in [1, B]$. +- Node $k$ has $(B + 1)$ child nodes numbered $\\{k \cdot (B+1) + i + 1\\}$ for $i \in [0, B]$. This way, we can only use $O(1)$ additional memory by allocating one large two-dimensional array of keys and relying on index arithmetic to locate children nodes in the tree: From 50d1efb2f6eb8f6d71481e2870910c3b897b28f2 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 18 Feb 2022 21:38:39 +0300 Subject: [PATCH 219/531] fix b-tree definition --- content/english/hpc/data-structures/s-tree.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index 78663b63..37492ea4 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -33,11 +33,11 @@ This is a long article, and since it also serves as a [textbook](/hpc/) case stu ## B-Tree Layout -B-trees generalize the concept of binary search trees by allowing nodes to have more than two children. 
Instead of a single key, a node of a B-tree of order $k$ can contain up to $B = (k - 1)$ keys that are stored in sorted order and up to $k$ pointers to child nodes, each satisfying the property that all keys in the subtrees of the first $i$ children are not greater than the $i$-th key in the parent node. +B-trees generalize the concept of binary search trees by allowing nodes to have more than two children. Instead of a single key, a node of a B-tree of order $k$ can contain up to $B = (k - 1)$ keys stored in sorted order and up to $k$ pointers to child nodes. Each child $i$ satisfies the property that all keys in its subtree are between keys $i$ and $(i + 1)$ in the parent node (using 0-based numbering for children and 1-based numbering for keys). ![A B-tree of order 4](../img/b-tree.jpg) -The main advantage of this approach is that it reduces the tree height by $\frac{\log_2 n}{\log_k n} = \frac{\log k}{\log 2} = \log_2 k$ times while fetching each node still takes roughly the same time — as long it fits into a single [memory block](/hpc/external-memory/hierarchy/). +The main advantage of this approach is that it reduces the tree height by $\frac{\log_2 n}{\log_k n} = \frac{\log k}{\log 2} = \log_2 k$ times, while fetching each node still takes roughly the same time — as long it fits into a single [memory block](/hpc/external-memory/hierarchy/). B-trees were primarily developed for the purpose of managing on-disk databases, where the latency of randomly fetching a single byte is comparable with the time it takes to read the next 1MB of data sequentially. For our use case, we will be using the block size of $B = 16$ elements — or $64$ bytes, the size of the cache line — which makes the tree height and the total number of cache line fetches per query $\log_2 17 \approx 4$ times smaller compared to the binary search. 
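As a rough sanity check of the height reduction described in the hunk above, the following sketch counts how many levels a tree needs to hold a given number of keys. The helper `levels` is hypothetical (it is not part of the patch or of the article's code), and it assumes every node is completely full, which real B-trees do not guarantee.

```c++
// Levels needed to hold n keys when every node stores `keys` keys
// and has (keys + 1) children; keys = 1 is a plain binary tree.
// Assumes all nodes are full, so this is a lower bound on the height.
constexpr int levels(long long n, int keys) {
    int h = 0;
    long long capacity = 0;  // keys that fit into the first h levels
    long long nodes = 1;     // number of nodes on the next level
    while (capacity < n) {
        capacity += nodes * keys;
        nodes *= keys + 1;
        h++;
    }
    return h;
}

// For 2^20 keys: 21 levels for a binary tree versus 5 for B = 16,
// a ratio of about 4.2, close to log2(17) (roughly 4.09).
static_assert(levels(1 << 20, 1) == 21, "binary tree height");
static_assert(levels(1 << 20, 16) == 5, "17-ary tree height");
```

The checks run at compile time, so the block needs no `main` to be verified; the exact ratio depends on the array size because heights only change in whole levels.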
@@ -48,7 +48,7 @@ Storing and fetching pointers in a B-tree node wastes precious cache space and d One of the ways to achieve this is by generalizing the [Eytzinger numeration](../binary-search#eytzinger-layout) to $(B + 1)$-ary trees: - The root node is numbered $0$. -- Node $k$ has $(B + 1)$ child nodes numbered $\\{k \cdot (B+1) + i + 1\\}$ for $i \in [0, B]$. +- Node $k$ has $(B + 1)$ child nodes numbered $\\{k \cdot (B + 1) + i + 1\\}$ for $i \in [0, B]$. This way, we can only use $O(1)$ additional memory by allocating one large two-dimensional array of keys and relying on index arithmetic to locate children nodes in the tree: From 1910d0713e783a763dec0d6f0caa45891a219a41 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 18 Feb 2022 21:43:54 +0300 Subject: [PATCH 220/531] change framing --- content/english/hpc/data-structures/s-tree.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index 37492ea4..05be6252 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -33,7 +33,7 @@ This is a long article, and since it also serves as a [textbook](/hpc/) case stu ## B-Tree Layout -B-trees generalize the concept of binary search trees by allowing nodes to have more than two children. Instead of a single key, a node of a B-tree of order $k$ can contain up to $B = (k - 1)$ keys stored in sorted order and up to $k$ pointers to child nodes. Each child $i$ satisfies the property that all keys in its subtree are between keys $i$ and $(i + 1)$ in the parent node (using 0-based numbering for children and 1-based numbering for keys). +B-trees generalize the concept of binary search trees by allowing nodes to have more than two children. Instead of a single key, a node of a B-tree of order $k$ can contain up to $B = (k - 1)$ keys stored in sorted order and up to $k$ pointers to child nodes. 
Each child $i$ satisfies the property that all keys in its subtree are between keys $(i - 1)$ and $i$ of the parent node (if they exist). ![A B-tree of order 4](../img/b-tree.jpg) From 0b739a8796f475261858cb2691ef642e5f67163e Mon Sep 17 00:00:00 2001 From: Robert Klein Date: Fri, 18 Feb 2022 16:31:30 -0500 Subject: [PATCH 221/531] Fixed pointer chasing link --- content/english/hpc/complexity/languages.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/complexity/languages.md b/content/english/hpc/complexity/languages.md index 9453c91d..70defee9 100644 --- a/content/english/hpc/complexity/languages.md +++ b/content/english/hpc/complexity/languages.md @@ -12,7 +12,7 @@ Mine was in high school, when I realized that making websites and doing *useful* I didn't know much about computer architecture to answer this question. But I also didn't need the right answer — I needed a rule of thumb. My thought process was: "2-3GHz means 2 to 3 billion instructions executed every second, and in a simple loop that does something with array elements, I also need to increment loop counter, check end-of-loop condition, do array indexing and stuff like that, so let's add room for 3-5 more instructions for every useful one" and ended up with using $5 \cdot 10^8$ as an estimate. None of these statements are true, but counting how many operations my algorithm needed and dividing it by this number was a good rule of thumb for my use case. -The real answer, of course, is much more complicated and highly dependent on what kind of "operation" you have in mind. It can be as low as $10^7$ for things like [pointer chasing](/hpc/memory/latency) and as high as $10^{11}$ for [SIMD-accelerated](/hpc/simd) linear algebra. To demonstrate these striking differences, we will use the case study of matrix multiplication implemented in different languages — and dig deeper into how computers execute them. 
+The real answer, of course, is much more complicated and highly dependent on what kind of "operation" you have in mind. It can be as low as $10^7$ for things like [pointer chasing](/hpc/cpu-cache/latency) and as high as $10^{11}$ for [SIMD-accelerated](/hpc/simd) linear algebra. To demonstrate these striking differences, we will use the case study of matrix multiplication implemented in different languages — and dig deeper into how computers execute them. -## Measuring and Mitigating Errors +One of the uses for the alternative rounding modes is for diagnosing numerical instability. If the results of an algorithm substantially vary when switching between rounding to the positive and negative infinities, it indicates susceptibility to round-off errors. + +This test is better than switching all computations to lower precision and checking whether the result changed by too much. The default rounding-to-nearest converges to the correct “expected” value given enough averaging: statistically, half of the time, they are rounding up, and the other half, they are rounding down — so they cancel each other. + +### Measuring Errors + +It seems surprising to expect this guarantee from hardware that performs complex calculations such as natural logarithms and square roots, but this is it: you are guaranteed to get the highest precision possible from all operations. This makes it remarkably easy to analyze round-off errors, as we will see in a bit. There are two natural ways to measure computational errors: @@ -183,4 +216,3 @@ The tricky part is the "shortest possible". It can be solved by printing digits How many decimal digits do we need to print a `float`? 
--> - From 58c9f45a2bce09709d845724cdf0f9ab203a713a Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 20 Feb 2022 22:07:11 +0300 Subject: [PATCH 230/531] note about floating-point binary search --- content/english/hpc/data-structures/s-tree.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index 05be6252..454e674d 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -235,7 +235,9 @@ unsigned rank(reg x, int* y) { } ``` -This instruction converts 32-bit integers stored in two registers to 16-bit integers stored in one register — in our case, effectively joining the vector masks into one. Note that we've swapped the order of comparison — this lets us not invert the mask in the end, but we have to subtract one from the search key once in the beginning to make it correct (otherwise, it works as `upper_bound`). +This instruction converts 32-bit integers stored in two registers to 16-bit integers stored in one register — in our case, effectively joining the vector masks into one. Note that we've swapped the order of comparison — this lets us not invert the mask in the end, but we have to subtract[^float] one from the search key once in the beginning to make it correct (otherwise, it works as `upper_bound`). + +[^float]: If you need to work with [floating-point](/hpc/arithmetic/float) keys, consider whether `upper_bound` will suffice — because if you need `lower_bound` specifically, then subtracting one or the machine epsilon from the search key doesn't work: you need to [get the previous representable number](https://stackoverflow.com/questions/10160079/how-to-find-nearest-next-previous-double-value-numeric-limitsepsilon-for-give) instead. 
Aside from some corner cases, this essentially means reinterpreting its bits as an integer, subtracting one, and reinterpreting it back as a float (which magically works because of how [IEEE-754 floating-point numbers](/hpc/arithmetic/ieee-754) are stored in memory). The problem is, it does this weird interleaving where the result is written in the `a1 b1 a2 b2` order instead of `a1 a2 b1 b2` that we want — many AVX2 instructions tend to do that. To correct this, we need to [permute](/hpc/simd/shuffling) the resulting vector, but instead of doing it during the query time, we can just permute every node during preprocessing: From dc025a0c061c4319f5dce88a89ad2ba095633423 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 21 Feb 2022 00:31:59 +0300 Subject: [PATCH 231/531] long double is 10 bytes --- content/english/hpc/cpu-cache/alignment.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/cpu-cache/alignment.md b/content/english/hpc/cpu-cache/alignment.md index 0a31368c..ee703dc9 100644 --- a/content/english/hpc/cpu-cache/alignment.md +++ b/content/english/hpc/cpu-cache/alignment.md @@ -92,7 +92,7 @@ struct Data { Now, each of them is aligned without any padding, and the size of the structure is just 8 bytes. It seems stupid that the size of a structure and consequently its performance depends on the order of definition of its members, but this is required for binary compatibility. -As a rule of thumb, place your type definitions from largest data types to smallest — this greedy algorithm is guaranteed to work unless you have some non-power-of-two type sizes such as the [12-byte](/hpc/arithmetic/ieee-754#float-formats) `long double`. +As a rule of thumb, place your type definitions from largest data types to smallest — this greedy algorithm is guaranteed to work unless you have some weird non-power-of-two type sizes such as the [10-byte](/hpc/arithmetic/ieee-754#float-formats) `long double`. 
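The largest-to-smallest rule of thumb above can be checked with a small sketch. The struct names `Unordered` and `Sorted` are hypothetical and chosen only for illustration, and the offsets in the comments assume a mainstream ABI where `int32_t` is 4-byte aligned and `int16_t` is 2-byte aligned:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical example: members declared in an arbitrary order.
// The compiler inserts padding so that each field starts at a
// multiple of its own alignment.
struct Unordered {
    char    a; // offset 0, followed by 1 byte of padding
    int16_t b; // offset 2
    int32_t c; // offset 4
    char    d; // offset 8, followed by 3 bytes of tail padding
};

// The same members, sorted from the largest type to the smallest.
struct Sorted {
    int32_t c; // offset 0
    int16_t b; // offset 4
    char    a; // offset 6
    char    d; // offset 7, no padding needed anywhere
};

// Reordering never makes the structure larger here.
static_assert(sizeof(Sorted) <= sizeof(Unordered),
              "sorting members by size should not add padding");
static_assert(offsetof(Sorted, d) == 7,
              "all members of Sorted are packed back to back");
```

On typical 64-bit targets, `sizeof(Unordered)` comes out as 12 and `sizeof(Sorted)` as 8, although the exact numbers are ultimately up to the platform ABI.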
+For example, if you call `fesetround(FE_UPWARD)` before running the loop above, it outputs not $2^{24}$, and not even $2^{25}$, but $67108864 = 2^{26}$. This happens because when we get to $2^{24}$, $(x + 1)$ starts rounding to the next nearest representable number $(x + 2)$, and we reach $2^{25}$ in half the time, and after that, $(x + 1)$ rounds up to $(x+4)$, and we start going four times as fast. One of the uses for the alternative rounding modes is for diagnosing numerical instability. If the results of an algorithm substantially vary when switching between rounding to the positive and negative infinities, it indicates susceptibility to round-off errors. From 77c0647c333d27fdbf5ddec6640a024939da80ee Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 22 Feb 2022 19:50:28 +0300 Subject: [PATCH 235/531] change wording in russian segtree --- content/russian/cs/segment-tree/lazy-initialization.md | 8 ++++---- content/russian/cs/sequences/compression.md | 1 + content/russian/cs/tree-structures/treap.md | 2 +- 3 files changed, 6 insertions(+), 5 deletions(-) diff --git a/content/russian/cs/segment-tree/lazy-initialization.md b/content/russian/cs/segment-tree/lazy-initialization.md index 3e789b30..d8a4bd49 100644 --- a/content/russian/cs/segment-tree/lazy-initialization.md +++ b/content/russian/cs/segment-tree/lazy-initialization.md @@ -6,9 +6,9 @@ prerequisites: - lazy-propagation --- -Consider our favorite problem of sums over subsegments, but now all indices lie not within $10^6$, but $10^9$ or even $10^{18}$. +Consider our favorite problem of sums over subsegments, but now all indices lie not within $10^5$ or $10^6$, but go up to $10^9$ or even $10^{18}$. -All the asymptotics still suit us: +All the asymptotics still more or less suit us: $$ \log_2 10^6 \approx 20 @@ -16,9 +16,9 @@ $$ \\ \log_2 10^{18} \approx 60 $$ -except for the construction stage, which runs in time linear in $n$. 
+The only problem is the construction stage, which takes time and memory linear in $n$. -We can solve this problem as follows: let's give up on explicitly creating all the tree nodes at the very beginning. Initially, we create only the root, and we create the remaining nodes on the fly, when something non-default needs to be written into them, as in lazy propagation. +It can be solved by giving up on explicitly creating all the tree nodes at the very beginning. Initially, we create only the root, and we create the remaining nodes on the fly, when something non-default needs to be written into them, as in [lazy propagation](../lazy-propagation): ```cpp struct Segtree { diff --git a/content/russian/cs/sequences/compression.md b/content/russian/cs/sequences/compression.md index ffd0bd79..332011b3 100644 --- a/content/russian/cs/sequences/compression.md +++ b/content/russian/cs/sequences/compression.md @@ -3,6 +3,7 @@ title: Coordinate compression authors: - Sergey Slotin weight: -1 +draft: true --- diff --git a/content/russian/cs/tree-structures/treap.md b/content/russian/cs/tree-structures/treap.md index b5fdf764..dd3417dd 100644 --- a/content/russian/cs/tree-structures/treap.md +++ b/content/russian/cs/tree-structures/treap.md @@ -14,7 +14,7 @@ published: true René Descartes (French: *René Descartes*) was a great French mathematician and philosopher of the 17th century. -René Descartes is not the creator of the Cartesian tree, but he is the creator of the Cartesian coordinate system, which we all know and love. +René Descartes is not the creator of the Cartesian tree; however, he is the creator of the Cartesian coordinate system, which we all know and love. 
The Cartesian tree itself is defined and built as follows: From 7b1eca96cd6307be4fec81bc60900803f5fa8b34 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 22 Feb 2022 19:52:15 +0300 Subject: [PATCH 236/531] bugfix in graham scan (tnx smirnov maxim) --- content/russian/cs/convex-hulls/graham.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/russian/cs/convex-hulls/graham.md b/content/russian/cs/convex-hulls/graham.md index 9736c034..49138c7e 100644 --- a/content/russian/cs/convex-hulls/graham.md +++ b/content/russian/cs/convex-hulls/graham.md @@ -23,7 +23,7 @@ vector<r> graham(vector<r> points) { // sort the points by polar angle sort(points.begin(), points.end(), [&](r a, r b){ - return (a - p) ^ (b - p) > 0; + return ((a - p0) ^ (b - p0)) > 0; }); vector<r> hull; From 0af7e491a42100d9e3f9934948eee5c71861fe19 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 22 Feb 2022 21:54:28 +0300 Subject: [PATCH 237/531] code for segment trees --- .../hpc/data-structures/segment-trees.md | 293 ++++++++++++++---- 1 file changed, 234 insertions(+), 59 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index f529bf8c..27b35e81 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -8,10 +8,16 @@ The lessons we learned from studying layouts for binary search can be applied to Most of the examples in this section are about optimizing some algorithms that are either included in the standard library or take under 10 lines of code to implement naively, but we will start off with a slightly more obscure example. +There are many things segment trees can do, including persistent structures and computational geometry. But for most of this article, we will focus on the dynamic (as opposed to static) prefix sum problem. + A segment tree is a data structure that stores information about array segments. 
It is a static tree of degree two, and here is what this means: Segment trees are used for windowing queries or range queries in general, either by themselves or as part of a larger algorithm. They are very rarely mentioned in the scientific literature, because they are relatively novel (invented around 2000), and *asymptotically* they don't do anything that any other binary tree can't, but they are the dominant structure in the world of competitive programming because of their performance and ease of implementation. +Enable [hugepages](/hpc/cpu-cache/paging) system-wide and forget about it. + +Functional programming, e. g. for implementing persistent arrays and derived structures. + Segment trees are built recursively: build a tree for the left and right halves and merge the results to get the root. ```cpp @@ -19,27 +25,19 @@ void add(int k, int x); // 0-based indexing int sum(int k); // sum of elements indexed [0, k] ``` -## Segment Trees - -* Static tree data structure used for storing information about array segments -* Popular in competitive programming, very rarely used in real life -* -* Many different implementations possible +A static tree data structure used for storing information about array segments. It is popular in competitive programming but very rarely used in real life. Many different implementations are possible, which we will explore in this article. 
![](https://i.stack.imgur.com/xeIcl.png) ----- +## Pointer-Based Implementation -### Pointer-Based +If you were at an "Introduction to OOP" class, you would probably implement a segment tree like this: -* Actually really good in terms of SWE practices, but terrible in terms of performance -* Pointer chasing, 4 unnecessary metadata fields, recursion, branching - -```cpp +```c++ struct segtree { int lb, rb; int s = 0; - segtree *l = 0, *r = 0; + segtree *l = nullptr, *r = nullptr; segtree(int lb, int rb) : lb(lb), rb(rb) { if (lb + 1 < rb) { @@ -51,7 +49,7 @@ struct segtree { void add(int k, int x) { s += x; - if (l) { + if (l != nullptr) { if (k < l->rb) l->add(k, x); else @@ -59,7 +57,7 @@ struct segtree { } } - int sum(int k) { // [0, k) + int sum(int k) { if (rb <= k) return s; if (lb >= k) @@ -69,41 +67,80 @@ struct segtree { }; ``` ----- +It takes 4+4+4+8+8=28 bytes, although they get padded to 32 for [memory alignment](/hpc/cpu-cache/alignment) reasons. -### Implicit (Recursive) +Actually really good in terms of SWE practices, but terrible in terms of performance +Pointer chasing, 4 unnecessary metadata fields, recursion, branching -* Eytzinger-like layout: $2k$ is the left child and $2k+1$ is the right child -* Wasted memory, recursion, branching +## Implicit Segment Trees -```cpp +Eytzinger-like layout: $2k$ is the left child and $2k+1$ is the right child. 
+ +```c++ int t[4 * N]; -void _add(int k, int x, int v = 1, int l = 0, int r = N) { +void add(int k, int x, int v = 1, int l = 0, int r = N) { t[v] += x; if (l + 1 < r) { int m = (l + r) / 2; if (k < m) - _add(k, x, 2 * v, l, m); + add(k, x, 2 * v, l, m); else - _add(k, x, 2 * v + 1, m, r); + add(k, x, 2 * v + 1, m, r); } } -int _sum(int k, int v = 1, int l = 0, int r = N) { - if (l > k) +int sum(int k, int v = 1, int l = 0, int r = N) { + if (l >= k) return 0; - if (r - 1 <= k) + if (r <= k) return t[v]; int m = (l + r) / 2; - return _sum(k, 2 * v, l, m) - + _sum(k, 2 * v + 1, m, r); + return sum(k, 2 * v, l, m) + + sum(k, 2 * v + 1, m, r); } ``` -### Implicit (Iterative) +Still have wasted memory. -```cpp +### Iterative Implementation + +```c++ +void add(int k, int x) { + int v = 1, l = 0, r = N; + while (l + 1 < r) { + t[v] += x; + v <<= 1; + int m = (l + r) >> 1; + if (k < m) + r = m; + else + l = m, v++; + } + t[v] += x; +} + +int sum(int k) { + int v = 1, l = 0, r = N, s = 0; + while (true) { + int m = (l + r) >> 1; + v <<= 1; + if (k >= m) { + s += t[v++]; + if (k == m) + break; + l = m; + } else { + r = m; + } + } + return s; +} +``` + +### Implicit (Bottom-Up) + +```c++ void add(int k, int x) { int v = 1, l = 0, r = N; while (l + 1 < r) { @@ -138,44 +175,96 @@ int sum(int k) { return s; } ``` + ### Implicit (Bottom-up) * Different layout: leaf nodes are numbered $n$ to $(2n - 1)$, "parent" is $\lfloor k/2 \rfloor$ * Minimum possible amount of memory * Fully iterative and no branching (pipelinize-able reads!) 
-```cpp -int n, t[2*maxn]; +```c++ +int t[2 * N]; -void build() { - for (int i = n-1; i > 0; i--) - t[i] = max(t[i<<1], t[i<<1|1]); +void add(int k, int x) { + k += N; + while (k != 0) { + t[k] += x; + k >>= 1; + } } -void upd(int k, int x) { - k += n; - t[k] = x; - while (k > 1) { - t[k>>1] = max(t[k], t[k^1]); - k >>= 1; +int sum(int k) { + int res = 0; + k += N - 1; + while (k != 0) { + if (~k & 1) + res += t[k]; + k = (k - 1) >> 1; } + return res; } +``` + +### Arbitrary Array Sizes -int rmq(int l, int r) { - int ans = 0; - l += n, r += n; +```c++ +int sum(int r) { + r += N - 1; + int l = N, s = 0; while (l <= r) { - if (l&1) ans = max(ans, t[l++]); - if (!(r&1)) ans = max(ans, t[r--]); + if ( l & 1) s += t[l++]; + if (~r & 1) s += t[r--]; l >>= 1, r >>= 1; } - return ans; + return s; } ``` -https://codeforces.com/blog/entry/18051 +Magically, it just works ---- +```c++ +const int last_layer = 1 << __lg(2 * N - 1); + +int leaf(int k) { + k += last_layer; + k -= (k >= 2 * N) * N; + return k; +} + +```c++ +void add(int k, int x) { + k = leaf(k); + while (k != 0) { + t[k] += x; + k >>= 1; + } +} + +int sum(int k) { + k = leaf(k - 1); + int s = 0; + while (k != 0) { + if (~k & 1) + s += t[k]; + k = (k - 1) >> 1; + } + return s; +} +``` + +### Branchless + +```c++ +int sum(int k) { + k = leaf(k - 1); + int s = 0; + while (k != 0) { + s += ((k & 1) == 0) * t[k]; // simplify? 
+ k = (k - 1) >> 1; + } + return s; +} +``` ## Fenwick trees @@ -185,30 +274,116 @@ https://codeforces.com/blog/entry/18051 then both query and update would only require updating $O(\log n)$ different $t$'s ```cpp -int t[maxn]; +int t[N + 1]; -// calculate sum on prefix: -int sum(int r) { +void add(int k, int x) { + for (k += 1; k <= N; k += k & -k) + t[k] += x; +} + +int sum(int k) { int res = 0; - for (; r > 0; r -= r & -r) - res += t[r]; + for (; k != 0; k &= k - 1) // k -= k & -k + res += t[k]; return res; } +``` +```c++ // how you can use it to calculate sums on subsegments: int sum (int l, int r) { return sum(r) - sum(l-1); } +``` + +Can't be more optimal because of pipelining and implicit prefetching + +```c++ +inline constexpr int hole(int k) { + return k + (k >> 10); +} + +int t[hole(N) + 1]; -// updates necessary t's: void add(int k, int x) { - for (; k <= n; k += k & -k) - t[k] += x; + for (k += 1; k <= N; k += k & -k) + t[hole(k)] += x; +} + +int sum(int k) { + int res = 0; + for (; k != 0; k &= k - 1) + res += t[hole(k)]; + return res; } ``` -Can't be more optimal because of pipelining and implicit prefetching +### Wide Segment Trees + +```c++ +const int b = 4, B = (1 << b); + +constexpr int height(int n) { + return (n <= B ? 1 : height(n / B) + 1); +} + +constexpr int offset(int h) { + int s = 0, n = N; + while (h--) { + s += (n + B - 1) / B * B; + n /= B; + } + return s; +} + +constexpr int H = height(N); +alignas(64) int t[offset(H)]; +``` + +```c++ +int sum(int k) { + int res = 0; + for (int h = 0; h < H; h++) + res += t[offset(h) + (k >> (h * b))]; + return res; +} +``` + +```c++ +struct Precalc { + alignas(64) int mask[B][B]; + + constexpr Precalc() : mask{} { + for (int k = 0; k < B; k++) + for (int i = 0; i < B; i++) + mask[k][i] = (i > k ? 
-1 : 0); + } +}; + +constexpr Precalc T; + +typedef int vec __attribute__ (( vector_size(32) )); + +constexpr int round(int k) { + return k & ~(B - 1); // = k / B * B +} + +void add(int k, int _x) { + vec x = _x + vec{}; + for (int h = 0; h < H; h++) { + auto l = (vec*) &t[offset(h) + round(k)]; + auto m = (vec*) T.mask[k % B]; + for (int i = 0; i < B / 8; i++) + l[i] += x & m[i]; + k >>= b; + } +} +``` + +Wide Fenwick trees make little sense. The speed of Fenwick trees comes from rapidly iterating over just the elements we need. ## Further Reading -This article is loosely based on "[Practical Trade-Offs for the Prefix-Sum Problem](https://arxiv.org/pdf/2006.14552.pdf)" by Giulio Ermanno Pibiri and Rossano Venturini. +"[Efficient and easy segment trees](https://codeforces.com/blog/entry/18051)" by Oleksandr Bacherikov + +This article is loosely based on "[Practical Trade-Offs for the Prefix-Sum Problem](https://arxiv.org/pdf/2006.14552.pdf)" by Giulio Ermanno Pibiri and Rossano Venturini. It has some more detailed discussions, as well as some other implementations or branchless top-down segment tree and why b-ary Fenwick tree is not a good idea. Intermediate structures we've skipped here. From cae1b287d44450afe41c571f6f94c00fb1684e0d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 22 Feb 2022 23:19:46 +0300 Subject: [PATCH 238/531] contract programming edits --- content/english/hpc/compilation/contracts.md | 56 ++++++++++---------- 1 file changed, 29 insertions(+), 27 deletions(-) diff --git a/content/english/hpc/compilation/contracts.md b/content/english/hpc/compilation/contracts.md index 66aeb5f9..367911df 100644 --- a/content/english/hpc/compilation/contracts.md +++ b/content/english/hpc/compilation/contracts.md @@ -3,11 +3,9 @@ title: Contract Programming weight: 6 --- -In "safe" languages like Java and Rust, you normally have well-defined behavior for every possible operation and every possible input. 
There are some things that are *under-defined*, like the order of keys in a hash table, but these are usually some minor details left to implementation for potential performance gains in the future. +In "safe" languages like Java and Rust, you normally have well-defined behavior for every possible operation and every possible input. There are some things that are *under-defined*, like the order of keys in a hash table or the growth factor of an `std::vector`, but these are usually some minor details that are left up to implementation for potential performance gains in the future. -In contrast, C and C++ take the concept of undefined behavior to another level. Certain operations don't cause an error during compilation or runtime but are just not *allowed* — in the sense of there being a *contract* between the programmer and the compiler, that in case of undefined behavior the compiler can do literally anything, including formatting your hard drive. - -But compiler engineers are not interested in formatting your hard drive of blowing up your monitor. Instead, undefined behavior is used to guarantee a lack of corner cases and help optimization. +In contrast, C and C++ take the concept of undefined behavior to another level. Certain operations don't cause an error during compilation or runtime but are just not *allowed* — in the sense of there being a *contract* between the programmer and the compiler, that in case of undefined behavior, the compiler is legally allowed to do literally anything, including blowing up your monitor or formatting your hard drive. But compiler engineers are not interested in doing that. Instead, undefined behavior is used to guarantee a lack of corner cases and help optimization. 
### Why Undefined Behavior Exists @@ -15,15 +13,13 @@ There are two major groups of actions that cause undefined behavior: - Operations that are almost certainly unintentional bugs, like dividing by zero, dereferencing a null pointer, or reading from uninitialized memory. You want to catch these as soon as possible during testing, so crashing or having some non-deterministic behavior is better than having them always do a fixed fallback action such as returning zero. - You can compile and run a program with *sanitizers* to catch undefined behavior early. In GCC and Clang, you can use the `-fsanitize=undefined` flag, and some operations that frequently cause UB will be instrumented to detect it at runtime. - -- Operations that have slightly different observable behavior on different platforms. For example, the result of left-shifting an integer by more than 31 bits is undefined, because the relevant instructions are implemented differently on Arm and x86 CPUs. If you standardize one specific behavior, then all programs compiled for the other platform will have to spend a few more cycles checking for that edge case, so it is best to leave it either undefined. + You can compile and run a program with *sanitizers* to catch undefined behavior early. In GCC and Clang, you can use the `-fsanitize=undefined` flag, and some operations that are notorious for causing UB will be instrumented to detect it at runtime. - Sometimes, when there is a legitimate use case for some platform-specific behavior, it can be left *implementation-defined* instead of being undefined. For example, the result of right-shifting a [negative integer](/hpc/arithmetic/integer) depends on the platform: it either shifts in zeros or ones (e. g. right shifting `11010110 = -42` by one may mean either `01101011 = 107` or `11101011 = -21`, both use cases being realistic). +- Operations that have slightly different observable behavior on different platforms. 
For example, the result of left-shifting an integer by more than 31 bits is undefined, because the instruction that does it is implemented differently on Arm and x86 CPUs. If you standardize one specific behavior, then all programs compiled for the other platform will have to spend a few more cycles checking for that edge case, so it is best to leave it undefined. -Designating something as undefined instead of implementation-defined behavior also helps compilers in optimization. + Sometimes, when there is a legitimate use case for some platform-specific behavior, instead of declaring it undefined, it can be left *implementation-defined*. For example, the result of right-shifting a [negative integer](/hpc/arithmetic/integer) depends on the platform: it either shifts in zeros or ones (e. g. right shifting `11010110 = -42` by one may mean either `01101011 = 107` or `11101011 = -21`, both use cases being realistic). -Consider the case of signed integer overflow. On almost all architectures, [signed integers](/hpc/arithmetic/integer) overflow the same way as unsigned ones, with `INT_MAX + 1 == INT_MIN`, but yet this is undefined behavior in the C/C++ standard. This is very much intentional: if you disallow signed integer overflow, then `(x + 1) > x` is guaranteed to be always true for `int`, but not for `unsigned int`, because `(x + 1)` may overflow. For signed types, this lets compilers optimize such checks away. +Designating something as undefined instead of implementation-defined behavior also helps compilers in optimization. Consider the case of signed integer overflow. On almost all architectures, [signed integers](/hpc/arithmetic/integer) overflow the same way as unsigned ones, with `INT_MAX + 1 == INT_MIN`, and yet, this is undefined behavior according to the C++ standard. This is very much intentional: if you disallow signed integer overflow, then `(x + 1) > x` is guaranteed to be always true for `int`, but not for `unsigned int`, because `(x + 1)` may overflow. 
For signed types, this lets compilers optimize such checks away. As a more naturally occurring example, consider the case of a loop with an integer control variable. Modern C++ and languages like Rust are advocating for using an unsigned integer (`size_t` / `usize`), while C programmers stubbornly keep using `int`. To understand why, consider the following `for` loop: @@ -33,7 +29,7 @@ for (unsigned int i = 0; i < n; i++) { } ``` -How many times does this loop execute? There are technically two valid answers: $n$ and infinity, the second being the case if $n$ exceeds $2^{32}$ so that $i$ keeps resetting to zero every $2^{32}$ iterations. While the former is probably the one assumed by the programmer, to comply with the language spec, the compiler still has to insert additional runtime checks and consider the two cases, which should be optimized differently. Meanwhile, the `int` version would make exactly $n$ iterations, because the very possibility of a signed overflow is defined out of existence. +How many times does this loop execute? There are technically two valid answers: $n$ and infinity, the second being the case if $n$ exceeds $2^{32}$ so that $i$ keeps resetting to zero every $2^{32}$ iterations. While the former is probably the one assumed by the programmer, to comply with the language spec, the compiler still has to insert additional runtime checks and consider the two cases, which should be optimized differently. Meanwhile, the `int` version would make exactly $n$ iterations because the very possibility of a signed overflow is defined out of existence. ### Removing Corner Cases @@ -49,13 +45,13 @@ T at(size_t k) { } ``` -Interestingly, these checks are rarely actually executed during runtime, because the compiler can often prove, during compilation time, that each access will be within bounds. 
For example, when iterating in a `for` loop from 1 to the array size and indexing $i$-th element on each step, nothing illegal can possibly happen, so the bounds checks can be safely optimized away. +Interestingly, these checks are rarely actually executed during runtime because the compiler can often prove — during compile-time — that each access will be within bounds. For example, when iterating in a `for` loop from 1 to the array size and indexing $i$-th element on each step, nothing illegal can possibly happen, so the bounds checks can be safely optimized away. ### Assumptions -When the compiler can't prove the inexistence of corner cases, but you can, this additional information can be provided using the mechanism of undefined behavior. +When the compiler can't prove the inexistence of corner cases, but *you* can, this additional information can be provided using the mechanism of undefined behavior. -Clang has a helpful `__builtin_assume` function where you can put a statement that is guaranteed to be true, and the compiler will use this assumption in optimization. In GCC you can do the same with `__builtin_unreachable`: +Clang has a helpful `__builtin_assume` function where you can put a statement that is guaranteed to be true, and the compiler will use this assumption in optimization. In GCC, you can do the same with `__builtin_unreachable`: ```cpp void assume(bool pred) { @@ -64,9 +60,9 @@ void assume(bool pred) { } ``` -For instance, you can put `assume(k < vector.size())` before `at` in the example above, and then the bounds check should be optimized away. +For instance, you can put `assume(k < vector.size())` before `at` in the example above, and then the bounds check will be optimized away. -It is also quite useful to combine `assume` with `assert` and `static_assert` to find bugs: you can use the same function to check preconditions in the debug build, and then use them to improve performance in the production build. 
+It is also quite useful to combine `assume` with `assert` and `static_assert` to find bugs: you can use the same function to check preconditions in the debug build and then use them to improve performance in the production build. -For integer arithmetic, this is different, because the results always have to be exact. Consider the case of division by 2: +For integer arithmetic, this is different because the results always have to be exact. Consider the case of division by 2: ```cpp unsigned div_unsigned(unsigned x) { @@ -103,7 +105,7 @@ A widely known optimization is to replace it with a single right shift (`x >> 1` shr eax ``` -This is certainly correct for all *positive* numbers. But what about the general case? +This is certainly correct for all *positive* numbers, but what about the general case? ```cpp int div_signed(int x) { @@ -125,7 +127,7 @@ add eax, ebx ; add 1 to the value if it is negative to ensure rounding toward sar eax ; this one shifts in sign bits ``` -But the positive case is clearly what was intended. Here we can also use the `assume` mechanism to exclude that corner case: +When only the positive case is what was intended, we can also use the `assume` mechanism to eliminate the possibility of negative `x` and avoid handling this corner case: ```cpp int div_assume(int x) { @@ -151,9 +153,9 @@ void add(int *a, int *b, int n) { Since each iteration of this loop is independent, it can be executed in parallel and [vectorized](/hpc/simd). But is it, technically? -There may be a problem if the arrays `a` and `b` intersect. Consider the case when `b == a + 1`, that is, if `b` is a just a memory view of `a` starting from the second element. In this case, the next iteration depends on the previous one, and the only correct solution is execute the loop sequentially. The compiler has to check for such possibilities, even if the programmer knows they can't happen. +There may be a problem if the arrays `a` and `b` intersect. 
Consider the case when `b == a + 1`, that is, if `b` is just a memory view of `a` starting from its second element. In this case, the next iteration depends on the previous one, and the only correct solution is to execute the loop sequentially. The compiler has to check for such possibilities, even if the programmer knows they can't happen. -This is why we have `const` and `restrict` keywords. The first one enforces that that we won't modify memory with the pointer variable, and the second is a way to tell compiler that the memory is guaranteed to not be aliased. +This is why we have `const` and `restrict` keywords. The first one enforces that we won't modify memory with the pointer variable, and the second is a way to tell the compiler that the memory is guaranteed to not be aliased. ```cpp void add(int * __restrict__ a, const int * __restrict__ b, int n) { @@ -166,9 +168,9 @@ These keywords are also a good idea to use by themselves for the purpose of self ### C++20 Contracts -Contract programming is an underused, but very powerful technique. +Contract programming is an underused but powerful technique. -Design-by-contract actually made it into the C++20 standard in the form of [contract sattributes](http://www.hellenico.gr/cpp/w/cpp/language/attributes/contract.html), which are functionally equivalent to our hand-made, compiler-specific `assume`: +Design-by-contract actually made it into the C++20 standard in the form of [contract attributes](http://www.hellenico.gr/cpp/w/cpp/language/attributes/contract.html), which are functionally equivalent to our hand-made, compiler-specific `assume`: ```c++ T at(size_t k) [[ expects: k < n ]] { @@ -178,7 +180,7 @@ T at(size_t k) [[ expects: k < n ]] { There are 3 types of attributes — `expects`, `ensures`, and `assert` — respectively used for specifying pre- and post-conditions in functions and general assertions that can be put anywhere in the program. 
-Unfortunately, this exciting new feature is not yet implemented in any major C++ compiler, but maybe around 2022-2023 we will be able to write code like this: +Unfortunately, this exciting new feature is not yet implemented in any major C++ compiler — but maybe around 2022-2023, we will be able to write code like this: ```c++ bool is_power_of_two(int m) { @@ -198,4 +200,4 @@ int mod_power_of_two(int x, int m) Some forms of contract programming are also available in other performance-oriented languages such as [Rust](https://docs.rs/contracts/latest/contracts/) and [D](https://dlang.org/spec/contracts.html). -A general and language-agnostic advice is to always inspect the assembly that the compiler produced, and if it is not what you were hoping for, try to think about corner cases that may be limiting the compiler from optimizing it. +A general and language-agnostic advice is to always [inspect the assembly](../stages) that the compiler produced, and if it is not what you were hoping for, try to think about corner cases that may be limiting the compiler from optimizing it. From 06a97a377a4c62edc620e69077d49829c7869370 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 23 Feb 2022 00:29:49 +0300 Subject: [PATCH 239/531] contracts are not in C++20 --- content/english/hpc/compilation/contracts.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/content/english/hpc/compilation/contracts.md b/content/english/hpc/compilation/contracts.md index 367911df..1cd3cf7e 100644 --- a/content/english/hpc/compilation/contracts.md +++ b/content/english/hpc/compilation/contracts.md @@ -166,11 +166,11 @@ void add(int * __restrict__ a, const int * __restrict__ b, int n) { These keywords are also a good idea to use by themselves for the purpose of self-documenting. -### C++20 Contracts +### C++ Contracts -Contract programming is an underused but powerful technique. +Contract programming is an underused but very powerful technique. 
-Design-by-contract actually made it into the C++20 standard in the form of [contract attributes](http://www.hellenico.gr/cpp/w/cpp/language/attributes/contract.html), which are functionally equivalent to our hand-made, compiler-specific `assume`: +There is a late-stage proposal to add design-by-contract into the C++ standard in the form of [contract attributes](http://www.hellenico.gr/cpp/w/cpp/language/attributes/contract.html), which are functionally equivalent to our hand-made, compiler-specific `assume`: ```c++ T at(size_t k) [[ expects: k < n ]] { @@ -180,7 +180,7 @@ T at(size_t k) [[ expects: k < n ]] { There are 3 types of attributes — `expects`, `ensures`, and `assert` — respectively used for specifying pre- and post-conditions in functions and general assertions that can be put anywhere in the program. -Unfortunately, this exciting new feature is not yet implemented in any major C++ compiler — but maybe around 2022-2023, we will be able to write code like this: +Unfortunately, this exciting new feature is [not yet finally standardized](https://www.reddit.com/r/cpp/comments/cmk7ek/what_happened_to_c20_contracts/), let alone implemented in a major C++ compiler. 
But maybe, in a few years, we will be able to write code like this: ```c++ bool is_power_of_two(int m) { From 1da6362baadffef503d0537accdbcdbaf3a3267d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 23 Feb 2022 11:15:57 +0300 Subject: [PATCH 240/531] contract programming notes --- content/english/hpc/compilation/contracts.md | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/compilation/contracts.md b/content/english/hpc/compilation/contracts.md index 1cd3cf7e..cedf20dd 100644 --- a/content/english/hpc/compilation/contracts.md +++ b/content/english/hpc/compilation/contracts.md @@ -116,7 +116,9 @@ int div_signed(int x) { If `x` is negative, then simply shifting doesn't work — regardless of whether shifting is done in zeros or sign bits: - If we shift in zeros, we get a non-negative result (the sign bit is zero). -- If we shift in sign bits, then rounding will happen towards negative infinity instead of zero (`-5 / 2` will be equal to `-3` instead of `-2`). +- If we shift in sign bits, then rounding will happen towards negative infinity instead of zero (`-5 / 2` will be equal to `-3` instead of `-2`)[^python]. + +[^python]: Fun fact: in Python, integer-dividing a negative number for some reason floors the result, so that `-5 // 2 = -3` and equivalent to `-5 >> 1 = -3`. I doubt that Guido van Rossum had this optimization in mind when initially designing the language, but, theoretically, a [JIT-compiled](/hpc/complexity/languages/#compiled-languages) Python program with many divisions by two may be faster than an analogous C++ program. So, for the general case, we have to insert some crutches to make it work: @@ -136,6 +138,8 @@ int div_assume(int x) { } ``` +Although in this particular case, perhaps the best syntax to express that we only expect non-negative numbers is to use an unsigned integer type. 
+ Because of nuances like this, it is often beneficial to expand the algebra in intermediate functions and manually simplify arithmetic yourself rather than relying on the compiler to do it. ### Memory Aliasing From 81fc33b9077434f49ecb6e5e8ff421b8d538064a Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 23 Feb 2022 17:54:10 +0300 Subject: [PATCH 241/531] segment tree graphs --- .../data-structures/img/segtree-bottomup.svg | 1551 ++++++++++++ .../img/segtree-branchless.svg | 1854 ++++++++++++++ .../img/segtree-fenwick-holes.svg | 2006 +++++++++++++++ .../data-structures/img/segtree-fenwick.svg | 1790 +++++++++++++ .../data-structures/img/segtree-iterative.svg | 1509 +++++++++++ .../data-structures/img/segtree-pointers.svg | 1369 ++++++++++ .../img/segtree-popular-relative.svg | 2099 ++++++++++++++++ .../data-structures/img/segtree-popular.svg | 2220 +++++++++++++++++ .../img/segtree-simd-others.svg | 1992 +++++++++++++++ .../hpc/data-structures/img/segtree-simd.svg | 1948 +++++++++++++++ .../data-structures/img/segtree-topdown.svg | 1627 ++++++++++++ .../hpc/data-structures/segment-trees.md | 76 +- 12 files changed, 19997 insertions(+), 44 deletions(-) create mode 100644 content/english/hpc/data-structures/img/segtree-bottomup.svg create mode 100644 content/english/hpc/data-structures/img/segtree-branchless.svg create mode 100644 content/english/hpc/data-structures/img/segtree-fenwick-holes.svg create mode 100644 content/english/hpc/data-structures/img/segtree-fenwick.svg create mode 100644 content/english/hpc/data-structures/img/segtree-iterative.svg create mode 100644 content/english/hpc/data-structures/img/segtree-pointers.svg create mode 100644 content/english/hpc/data-structures/img/segtree-popular-relative.svg create mode 100644 content/english/hpc/data-structures/img/segtree-popular.svg create mode 100644 content/english/hpc/data-structures/img/segtree-simd-others.svg create mode 100644 content/english/hpc/data-structures/img/segtree-simd.svg create 
mode 100644 content/english/hpc/data-structures/img/segtree-topdown.svg
diff --git a/content/english/hpc/data-structures/img/segtree-bottomup.svg b/content/english/hpc/data-structures/img/segtree-bottomup.svg new file mode 100644 index 00000000..44e7a12e --- /dev/null +++ b/content/english/hpc/data-structures/img/segtree-bottomup.svg @@ -0,0 +1,1551 @@
diff --git a/content/english/hpc/data-structures/img/segtree-branchless.svg b/content/english/hpc/data-structures/img/segtree-branchless.svg new file mode 100644 index 00000000..b149567a --- /dev/null +++ b/content/english/hpc/data-structures/img/segtree-branchless.svg @@ -0,0 +1,1854 @@
diff --git a/content/english/hpc/data-structures/img/segtree-fenwick-holes.svg b/content/english/hpc/data-structures/img/segtree-fenwick-holes.svg new file mode 100644 index 00000000..acb4c1e0 --- /dev/null +++ b/content/english/hpc/data-structures/img/segtree-fenwick-holes.svg @@ -0,0 +1,2006 @@
diff --git a/content/english/hpc/data-structures/img/segtree-fenwick.svg b/content/english/hpc/data-structures/img/segtree-fenwick.svg new file mode 100644 index 00000000..666a83c7 --- /dev/null +++ b/content/english/hpc/data-structures/img/segtree-fenwick.svg @@ -0,0 +1,1790 @@
diff --git a/content/english/hpc/data-structures/img/segtree-iterative.svg b/content/english/hpc/data-structures/img/segtree-iterative.svg new file mode 100644 index 00000000..9e501f2e --- /dev/null +++ b/content/english/hpc/data-structures/img/segtree-iterative.svg @@ -0,0 +1,1509 @@
diff --git a/content/english/hpc/data-structures/img/segtree-pointers.svg b/content/english/hpc/data-structures/img/segtree-pointers.svg new file mode 100644 index 00000000..1e713ef6 --- /dev/null +++ b/content/english/hpc/data-structures/img/segtree-pointers.svg @@ -0,0 +1,1369 @@
diff --git a/content/english/hpc/data-structures/img/segtree-popular-relative.svg b/content/english/hpc/data-structures/img/segtree-popular-relative.svg new file mode 100644 index 00000000..458fec35 --- /dev/null +++ b/content/english/hpc/data-structures/img/segtree-popular-relative.svg @@ -0,0 +1,2099 @@
diff --git a/content/english/hpc/data-structures/img/segtree-popular.svg b/content/english/hpc/data-structures/img/segtree-popular.svg new file mode 100644 index 00000000..9b650dd9 --- /dev/null +++ b/content/english/hpc/data-structures/img/segtree-popular.svg @@ -0,0 +1,2220 @@
diff --git a/content/english/hpc/data-structures/img/segtree-simd-others.svg b/content/english/hpc/data-structures/img/segtree-simd-others.svg new file mode 100644 index 00000000..c2054dbf --- /dev/null +++ b/content/english/hpc/data-structures/img/segtree-simd-others.svg @@ -0,0 +1,1992 @@
diff --git a/content/english/hpc/data-structures/img/segtree-simd.svg b/content/english/hpc/data-structures/img/segtree-simd.svg new file mode 100644 index 00000000..f71538d7 --- /dev/null +++ b/content/english/hpc/data-structures/img/segtree-simd.svg @@ -0,0 +1,1948 @@
diff --git a/content/english/hpc/data-structures/img/segtree-topdown.svg b/content/english/hpc/data-structures/img/segtree-topdown.svg new file mode 100644 index 00000000..96239db0 --- /dev/null +++ b/content/english/hpc/data-structures/img/segtree-topdown.svg @@ -0,0 +1,1627 @@
diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index 27b35e81..61d6ee53 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -67,6 +67,8 @@ struct segtree { }; ``` +![](../img/segtree-pointers.svg) + It takes 4+4+4+8+8=28 bytes, although they get padded to 32 for [memory alignment](/hpc/cpu-cache/alignment) reasons. Actually really good in terms of SWE practices, but terrible in terms of performance @@ -101,6 +103,8 @@ int sum(int k, int v = 1, int l = 0, int r = N) { } ``` +![](../img/segtree-topdown.svg) + Still have wasted memory.
### Iterative Implementation @@ -138,43 +142,7 @@ int sum(int k) { } ``` -### Implicit (Bottom-Up) - -```c++ -void add(int k, int x) { - int v = 1, l = 0, r = N; - while (l + 1 < r) { - t[v] += x; - int m = (l + r) / 2; - if (k < m) - v = 2 * v, r = m; - else - v = 2 * v + 1, l = m; - } - t[v] += x; -} - -int sum(int k) { - if (k == N - 1) - return t[1]; - int v = 1, l = 0, r = n; - int s = 0; - while (l < r) { - int m = (l + r) / 2; - v *= 2; - if (k < m) { - if (k == m - 1) - return s + t[v]; - r = m; - } else { - s += t[v]; - v++; - l = m; - } - } - return s; -} -``` +![](../img/segtree-iterative.svg) ### Implicit (Bottom-up) @@ -198,14 +166,16 @@ int sum(int k) { k += N - 1; while (k != 0) { if (~k & 1) - res += t[k]; - k = (k - 1) >> 1; + res += t[k--]; + k = k >> 1; } return res; } ``` -### Arbitrary Array Sizes +![](../img/segtree-bottomup.svg) + +### Arbitrarily-Sized Arrays ```c++ int sum(int r) { @@ -245,14 +215,14 @@ int sum(int k) { int s = 0; while (k != 0) { if (~k & 1) - s += t[k]; - k = (k - 1) >> 1; + s += t[k--]; + k >>= 1; } return s; } ``` -### Branchless +Branchless ```c++ int sum(int k) { @@ -266,6 +236,8 @@ int sum(int k) { } ``` +![](../img/segtree-branchless.svg) + ## Fenwick trees * Structure used to calculate prefix sums and similar operations @@ -296,6 +268,8 @@ int sum (int l, int r) { } ``` +![](../img/segtree-fenwick.svg) + Can't be more optimal because of pipelining and implicit prefetching ```c++ @@ -318,6 +292,8 @@ int sum(int k) { } ``` +![](../img/segtree-fenwick-holes.svg) + ### Wide Segment Trees ```c++ @@ -380,9 +356,21 @@ void add(int k, int _x) { } ``` +![](../img/segtree-simd.svg) + Wide Fenwick trees make little sense. The speed of Fenwick trees comes from rapidly iterating over just the elements we need. 
-## Further Reading +Unlike [S-trees](../s-tree), you can easily change block size: + +![](../img/segtree-simd-others.svg) + +### Comparison + +![](../img/segtree-popular.svg) + +![](../img/segtree-popular-relative.svg) + +### Acknowledgements "[Efficient and easy segment trees](https://codeforces.com/blog/entry/18051)" by Oleksandr Bacherikov From 5af3ae3763be2952f5027c6b867f90f13fb9ae57 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 23 Feb 2022 18:01:59 +0300 Subject: [PATCH 242/531] increase max figure size --- themes/algorithmica/assets/style.sass | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/themes/algorithmica/assets/style.sass b/themes/algorithmica/assets/style.sass index f4953156..49c6c025 100644 --- a/themes/algorithmica/assets/style.sass +++ b/themes/algorithmica/assets/style.sass @@ -278,7 +278,7 @@ article img dispaly: block max-width: 90% - max-height: 400px + max-height: 500px margin-bottom: 4px figcaption From ee324c0b891ee819b9f6076d3b35cc91d72f6c03 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 23 Feb 2022 19:11:57 +0300 Subject: [PATCH 243/531] segment tree illustrations --- .../hpc/data-structures/img/fenwick-sum.png | Bin 0 -> 43016 bytes .../hpc/data-structures/img/fenwick-update.png | Bin 0 -> 39961 bytes .../hpc/data-structures/img/segtree-layout.png | Bin 0 -> 52243 bytes .../hpc/data-structures/img/segtree-path.png | Bin 0 -> 44722 bytes .../hpc/data-structures/img/segtree-wide.png | Bin 0 -> 13306 bytes .../hpc/data-structures/segment-trees.md | 9 ++++++++- 6 files changed, 8 insertions(+), 1 deletion(-) create mode 100644 content/english/hpc/data-structures/img/fenwick-sum.png create mode 100644 content/english/hpc/data-structures/img/fenwick-update.png create mode 100644 content/english/hpc/data-structures/img/segtree-layout.png create mode 100644 content/english/hpc/data-structures/img/segtree-path.png create mode 100644 content/english/hpc/data-structures/img/segtree-wide.png diff --git 
a/content/english/hpc/data-structures/img/fenwick-sum.png b/content/english/hpc/data-structures/img/fenwick-sum.png new file mode 100644 index 0000000000000000000000000000000000000000..9c298aaf9557ecf05df0124cf88fd7ec5449121a GIT binary patch literal 43016
zin+(y>g3Z39CUg4)bY?f3i-s-cBK5=dcd2POR6alFTJk4KA+FkWpLo9Z#M)ine}*NAaoZvg_}npOw#yHFr|Ric0@!l-j=c!0M}qUGUGe<^{(!tk11R z=iYf2CsKo+i3<^qSR<0_Vd=}HZ_EnxqHtuMSE*j z#e`A6E`o8imF%1*ZhiShy5aNq=WR6>@`U#`J1VPiEzaBS;ZP$oiK!+lDQkB4i>L2M z5R3S~_TRV@K0xQ$Y)W@Q7tA7k%1GFI%0$NZ9mic1AT}Gn~7K z@E5K=?CxQ|8{87w^3GC(*NMG0Bm>4bS#lGGY9&7tNwr6$j~I1t{-$GFn)_WY`T4Iw z4C{1^}4{E7US;XiR;v=RWzH(z-KuQ6tsuis?72ARs5L@%QMuRC}wz(|(`Ntd(~9{(G|d9iHyz z;v39ii~AVs^lx|D4sV_|NE6gEr<=+*KO5WlbWJC{8twNN^LO3?#8Q2B7zW|j48LAh zoQOrI)6aN}Q^I)c^UD|ZKU-|ap4bj8NuRP1-z$;m`dXGF%<;nB{`owHR@1}VohsYE zdlC=#0v%6yq`5j#Yd+W8g^oy<3if;n&rahf3R{s2C%+ z+D4toqdPxgm~L+<>dkF4LQx{kk+RoHU$BkcES`|`mP@BR+m ztPR&WaVH@LX01K_%GU1L&CV0nIKn*+f@8Gi_NOIU7geQ0u7==?lNROVy-*kZwYu6= z=cm{~$op>ee%2(1K3B@sGa+u<7vXY7!WY>oR4%T^)|{~-I&5dhleO_<6kBT(i)f$r zSMcw0mBjjm%n)$7F0VCaPyEd_4LdlVD5oT}dSUIyWus>e*NX;$IFsS zD-HMQ>O}Jz>ZWigN2y{T7DPs0E!M4S18s1eEbxKM# zwQmolFP%e`l>ZmwmoGQqJ+`qXUoj+2M2(K1V`aTRO%Jw3dBB41K)nI7&8~0v>|LL% zK|z3pMZ3vI#AXo)OmUw>Lco$hOB6c;o|YOhDTo7{-bDPJlV!;hG)zPQ8U{XmSx{m2g~%;T zl?i5Dw=lZo-Km*|XV*c^G4nkIvS7GY#m#{z+KHx3{bE->{FpiJayRk<`Me%}24t{~hb)_(!0>6`)FqUAq*cG$2wIvO# z#aA)*IUrVmdgKL-T5H`8csSsf!N8=VLg0J6+v>g9h=HHGqo^q1y`SJMHt^&+^Me-P z8i%PvWv-y!`6_lzMgb2DUt8U@{-Dt7TXmnF9n21VeEQwm<;6t*=xFz-p})bBJN}2F z2W1qaB?j#tv%a~Nj9TMkRG=SCEPZvx(0Go}}FZ&B0a|efn zsAXQAgSlS#*$yKTNhIaA zIrd(Upb=92A1$Rl-doMBG%eAuBZ5N##zL;@q}bpSDXq`eokWKnx%O0t5kk+@0WN8&gL$e%lL$3|>W!YqrpcUo=!mlN@|Svv(@;s^4F?qp%aOTLWpiVHeKM@Szh6;P z2mCKUug!t4?yje&FxWu1vLva__)2;&}0`5E+ zFeX-Mj)OjaUaTVkE`+PJwCQYxm5`c9-`LvvdcL9(rdk0AM@SPwBMLcbR(3Woh%8S{ zS$5LlOuO3Jo};wr*E-&rp0`XDarwoANC6J;^E3@HaR(?od-JtH1Ja}~ z{MyYFARuo8V^wgx2fjcn?bo-G@-V(q>Ukhj)Xhy!P3;;314DnE%WuUKaFqfrMGn#f zs`TF;qNQ5z11?dI4P2VIFc`~gNR>e#Xob^{0wY%O;90FcmyfS+x!ITa8m)zVMR+F= zSArkqKAZ>Q0)!)P8*h**c^&7Zk)Z@ia_%!LEc!S1rgiw$s3Bxjl!vwRK?nnjZD>J- ztW`llmm=#2Hobwx?zq>$Y4ZS4sgcbTY z09%2cVzUFD8^kr5BrbyjRn2OKfGC~>5{9qm7B$xB1Ns~5a$ z`h=Egb{?D3iqe2}5izCacw`>()NjB*@KT{eR$jh>S0i<8qAF^<(YvO;J#P?nsMnyw zcu|{nMv~wnav8fr(+AM^`k?NZ-`GP+ 
zgAs9YrH{pB#nz@jC68Cz={3eW{r(2MwS94$_#rBI4PS#Rt5?~qPM?d<|J)}hFE8(j zlV)~V*$WZ36=FWnSGvHFH+V=xL(>OZ%4<+Sc|g}4Ba<_M&fnUazZ*>})ihYWyljco*KkNVJZ-AD&27djm@N8I0-BFv}I zy9e}KC)9VuP$LDqW;q)rDQPIOH9}$b+&umYi~vF zzuqW1KcBVfWJ3%AVHkD1-=(Ed&L(Esg!7iSR(9#^MtB5 z5_JBBwk)_tz|geirD>YYU&p3*O%u{mGlP*dpsGs1*3NF2T0U`YZ4Dfk`IUbJ-92bJkPgn}{NWF2vsF@+;}tcDF-v_Xp8urVRvmv#<1#?XYEW zn~;{abV!8)OusyU>(+qZ9TWr{2X)Z|C|IPyjPj8t`1c$^D}5*Gh7EftV{5ynHz@(# z^gksIl;MDql9D$J5G>?{tb7A3G5HjQ*aanBifRxgSq0Wb8h4RVVvsn}*B8CLzn{a# zNv|8qRvtqIaTOZ_{|>f&WP}1(Kxib|t!coSf$(uzTolkUHaOno-GCnrD~2BhtDL4e zRy$8YX_ulvM}e&zM5(M8Tr+_>Ml)7)`Jsq44byB2=YKUZQ*J0#?)u&bYumFjgUeVj zE~H?016LfnVPbK3L1nNd7nHAZ?LLq}J|fH4Oyk?}k5A)}DRV!XqQQ6n?jTO!etNM12*cyOwhtYzyI##Du5lg0KvM=8dtXI?BsxN zvc}=@A2P^oa|Dk#O^n#cJ0lx$M(((R_ErU%eT*vrd`^YX$)e2@A0z@6ST zJ>59Cxj8@I>T_{+&EA3_+T>*(s>?qHqV`(Sk?B^ zY#w_&Ah;uu9^hCxknrp8-U=oGC<2xk6FYI(-})5mrmjlJf;?+jgfFNL;gOM`75dv$ zl^z~?j+oM~sRK0`Ng?b=M{XM(`5!3Qhf@mwHNAZ`A|EeU5AKZOBJdHUmM}!(NRWa zB{8BO+zud!Fi1$86`#SFH_+(q0c{0&bEv2j0|7c>3Sc*MT(Xvx+8u?$O5{d4Iq?EK z*&S)9gk1ue2SxhtAdxz?mmoh99+3&zQ|whDaMYSm1h0f&-$W zaNO6%P+(-#>elRlNy7hbO$G!GAtBnEGTbuWklkY;{Lx`TTOd2vWv zn+%FxKbSy9z86D%1;I!Lhi{0eod>1>-55JIxY6+W+qW)=E1X-mZfTjYv$3^7p4hmh z|NLoI)raW+_I{X*WcfrLubv<8(L?ko@h zI@B^98XgWQE#;OxS>w~{X3Ant=s+#2sCW+B3F#bPTnu(Mjs)i-@KDYJYRBHBrKT=U zaI#qhD+cVwJ9Rc@fRNzU$tTM6vV(!0Fsc)K{cUP$aZ=nvlGih`3@`JRrckA4n|zV? 
zN2;Q~GPAR@!{rP1C2fEVV|6aMZ6~?e+4P*8%B-eBjB;{vwZ13S^5DfSGwH;J3O!t# zuFWA27d+tlhEDj39i&wbW75!PfD_>PDT?Vq-(m=af#)&V#isdgYz1;CF+;R#Sz?`9ZoQHTWZe`=JMfT`Cx% z83tBYw=n5S%*jzX>sxR&Gm{?xAAl@X53<|vRjucCCh`Jk9zwZX?jM8m|9N$JdG#IB&0w^SOm(klkmC^>46Hu*QPg#-E7O)awUG+Ec^2B)0FkU zth(flX*AhAF6>?b5;*$|4@VJ$5gLqgk%(n|s z4(Kg`jrp+x@xp)cQLCSXtv=5`Ob^`E(Rm*H<5Ssh8H%2rw438|XonoHpKl~xNN9(Y z3)dUp;<9iFJZ0dg7N`6x=@Mk2ydhVq_TWLZ49*4%j!mMr@{v&U} zC~Hp`DW;~#DQ_jFrf*W$+gU(I78>o!H+mT}jv9}BHq(Tevaz_<}3r$V6Ov*bbYry>~- z>-PB}e$dFzg{Y;%GWSVE*Jrg<&qJQ7;UYeFrt{sKyD`90q<4WM$h{t@_e0)v`i!XR zj4ODEWUnJ$l-1Ys`a$GZBWp~Ac;(raFTb}F$qicKLNWeg4Nn_BzsT}C7-)jbkzDQV zHUz3B7LlX5->GSvHUlczjXkw{G`#IQ^F##d*@r3K%F0+GtWTn|U?c4BvTafE)#Dd^lhIw+PYr*&YR`wM8%ZF6g~|sc~vw2bA_H~!Fl`FzzaGiNy^lQuuQx-wB@mu zff=%A<2r;W(tk|#8(*WccazSc%;^e-HTsFhtCQ$Gd`qaDS(l!1J;p-Vl5YXM@u*H% z0B`&_)^aH;x9NT5aN+8ql^t7`{rMF!f;4kuX&Us+`molDbfuSXlF)o}wH$|*;6dHq zG0!J&&e*ay_z?bnch4wR`6gF{)%7P*YmqU}{<_?!Q2$uxXuXuR9~J1=d(c0=K(~MD zdWHH|RcE&EMP~cZu%}@<+I!6TNv-)8-yEqZCTR-ns|T=|V=e7aiODEr2$l-->X(I2 z2tlA{RA{b6Z`OAS-ppt%o%G$+TWlwIV?6oYrL1|9+ww;JwfRloxVyf~ABi)4^E_o% zR8FZ-Fu2;Sj^oIFXKg5DR6h$TOQcleaiTwdU3lvnu@qicAp5VB@}KO&nm%%Q`%}N@ zTrUJ%X@&_bY_BRMC`1Pb+^wB3AxiRDDmae{+{W12dm9%!u*#b`%RjYThbra$sqtZn z3)g&qL^qGX*VaFrg}naN+#~4nbV`{U?*+7tg-^S7<$vGNQ=MwZF`&-MZ^mjE!7y4p z8AutgA_-013^~KNm4VmjdYx0&zd9>z=f7!lYx$9f?P)rEr_}0y19>auLkDN zxyKHz`S1CH0{cAwoxBc*QpI`gMG=d`^_r*_n~`!}HM3nf9dS zL03XUdbyJ>0h!!%dwW~bR@z_Aoccc>SbhmTAQz)-(8F$K6@OdGl{~oJ@#mh?}EAnsK`lV+#8n zjPOyPCui1Z-t}X=e>ZDiXjb;^(){roEbWBbq-_Rw2HYe*NTHl!u|3?}NIUtslv=H`m|Jmb;ksVsBd7h}JO3 zQVR!HHr?#kY<4>qB_kdP~R`=n8E0LK}`Dog(-UPMU#QUm<3s=>p=T@+1t-g1-$kn{_boQ zOf8z5S?LtIWB+$)GvJh`@9z_|ARS6x|6}J*md`t9%?W``wRSXpPq94S6d=XjG30yH#9x z4?nsW=@afwdbZ}qJnqbMT%D4VKIru9et{N%_V&rl@}?_=%7Rj>TSb7#E&Nh!#(+e8 z1kD5G-3&vs{`ZAti1)Mi1I$ye?1`c$cL$-25{AlCB}|>7-D09$I4CKVx-&nq*G{}y zjTzw+OyoKE_w`T|xa@%j)uD__WE_DR18!EBndXVUmhi5YS3%hau^_O0pvXE|*y~nK zu{qCQL0V2%M7X-6?GJnLcM}BZ`C_%s=y()YxJx5%JWy^8Q9dEym3~u7`qNlw6!BHm 
z|9P67#_QAYYfD0utDcV>y@UDSi@UPN!m>pEWSM-{5*4G*(YP#f4>I>hXr z^?R%Z+K=43c0@KV_{b#l|YMLX5lCg4)3O~@#l(l7(1Hk{Qy)@Snna3f!&!1eI5go@HB(b_O-^+*{!k7ET%V2}yK+4k&IO?vz=; z=KQE+EEoS8r5s(m*zLqgkLknC1(zI!z3>kg^6)2WZDvk9zj~jtU7wZ+cjzwemdfOK zr+2{V{^Yynw9$emm6S6YR@*??f@90P`Du4VCI+91 z3=dW|OKdNyQq}E~gn|tHHv&1cCi>(`xCQr5Oo)(^+e#^8oCFD>*nC!ea$N15P zuf8TNA?f2l(=gdy@X7sbeU7hGR+^~1c)qOH zjdhoFvjrqwHajnmRvmHL8xi3tH;hC%I}LHDW;o>As@h*D|L*^N0}@5QmyGpjbagP0MKj&7<) zLpJ@_lWne?xEmN54!hsftMwQOpz+{r`DhbIuvV~A{JX7V0IiWq2QPv!+om$T(cC+BBN0l$6OuY!la=w`!Do#xx1 zOExIA2#tXVOBotBpNbf=eF*!Q?{xVp}~mz3Xn2-Cb*cYhAMSSzZ^W)z5mQN=H| zxggmfui>ncaqGZ!PvlLDLc7hDKs1&|Ik<2$#bRH8`4v`g5@%4W5dxu6qE8MSSBPS^ z*Pklqx_2k{y04$e$`z%u>Qpf1JWRE&T1SQya-N1&7gsB%xvdOgrHZ;$OzeZW8U_~O z58!Oqn#{R;S{Sy2lan6@mV$D9;J7t+r(thIk4(flOX1I2FRTHn7r|S$3`eEoB3bRPgxw1*<052atF1 zUcGwMIA0@>&R^_47Rc zJv5|2CG3cB7U2v$7X?2u&$Asf%PL*ur2xp*yvL<9K+nrOeh|*se~Vh>-mPM{qhV6G zZKYOWc7KGIF<&-K8^TIYj{+mO zd5VIrVgk#y51Te*CETcwfL}rY7nCXCVuencn!U`Euuq@FfZUd~QAuDB1XDZ6)r24N z8OY%l6p|Dg;q5zyy3L z`rkQ?j%{wkg3W!v49lTJUW(<=uO+muq5$I8axe?4_@gqmImq65nuST0>k;Z5aOZ#& zerrQMnKB8vLod@A<@fQdfEg(#TrwU61h25`G9ldj(&?T@rS`5qSfE;iVg@&g($pya zjJD07^bS#}j2Dtt`v^xL~Ta048EM-K8AH~`10 zU~(1P#4spl0x_QwG3 z4Uo0%&CKrSf@eYgQF>3Rm^{}=2Lv^+2G#OZX`vvH=vk$MsS}_m8UZJoTwaa@&isY> zr^Lz-Ri&(6C7v?M9(LKWu31Oh^XM0r2ixEC(J*8Ld`KZ}C>nP)bEJ7j^YvZ+t6kQ| zhkHF2H>$44FK7e#f(6sp01nmv;h22!f+6jy2L&*Aa)5yyZ)Tnt12!sg0i4d^nmU6y~qu)-;H*1_mv3LE)J29N8~ATNws1)4BD1b!Bo0mhhVX!eJRM3>dE zNN|jyLLu7;nOfUf9lt2T&UnIhi`YsiIVLTQM?q8mt>>QOAq)o>nmym@V$Yla%|fA> z($~qB;7r$AIU1p4{2YbIGi!hDTb5i-gmF24SW#tBCRbqlAvBwmbk1xgo&_v1wSocg zq0y_Kw+@u`1v<+FNNkuMX$5dj9iAmnBbSwKyOB{*g#i{HKA6EFZ>_A1zrVfA4R-;o zs2}we+tMhLK^QSy|JBC~`+5N&Rec{{u40fr_L%fHBL0TLVZf>Zi8} z>j^TM4HLQyx*pkNWw3k*Y-N9kDC9gjMiEn8U^acNRYV0wJ*o2+??&||r4E?ZKJhn9 zfKA2&BbY%Fr;%{Nfrle&Ys-f?z}Wt*odhD;RGu|p4vgBY%{pTk;h}`UEUJbE$&y&B zYF+i{AjD(`cx4I?Ujh{-fk1zbf6 z&1P;x1{M>mR%#dlJpFEg*79CvqyjSq58AlM^4S`g7MOz9o&52Ug%M_RON)sxmC@cl 
zfc_Iu6y+Pd7loJ8TWWHHN9S!oaU)4;4pSnUg`4}6?Yxw33yc<4)3HJvBzjw39)5Vx z!>7WhS6SG2J5J6MzTnjM7sqr}?HPVoq2%AWHD$=WJ~+3CWZ!ItN{F|%CkJqJ#9IWO z5Ksl{x(%AIHCUmHGJ)0IyWq6@r~l46sGOjxD(@7>c|y(wE`Fk#1zNmyJ7XR&S=sp3 z{1X~>LpG#JdskPV<9R@rO(5#xlim!M%|a{^Fvx6G-{<5oQ}7!`AXYdq4u-CAWTg_e zY+MeJj=gqz$$P`M;{wnhVbB6>Bfe~wLsb7dbr?uQLUKl1TN}wvu$;NSt!{&H3L?J& z78SAmKq4a0TNUaWrak!~SZdjFxNdH4u#6KSiB$E7&8Nf5gsq~}QECOkmup;Fg~$NV z;qk$0;dDsQj(2V~wWu{ZBhQ3hX0b8IXq*+IKe=89;&rV@BIj2>BrERst97W(|a- z3#DxjbAPC@2;n_Q(5xNqvFh;fuonttDe3aO=^iTC8#fr^x=P)*9)T#3Vt>a#U%&Lx z02d6U$rNcZ1AzkEGauM=SjQx;8AFu=OQZ5MQ|<5W8pB#Q6~1?0O1dFq-daOumEHr= z=2Q6cK+y+)Bf+49+>H)8J$}9_C`kBPJ35e^3gw6#V~0ZE9iIPawIuos;Q!1BXs1qIx|SKvHVXbEwHm-rOE(A=I5NSJbjzyA3h z-U!tcZwj7Tm8}89Zz@n1ArnNf76rTy-6-*~t>jr?S09; zmEfn4Q}(eEc@RmT%7FX;;bafj2#fsu)YKUmvKbWI$i2%2Nx&w-=7Si_ueafW>av$> zWG+GEAOSktEUhBle?m0R4`c~IMIHnaE!iwJV~CwFmWBnAHk*XP*5gi0-N@(YPD|mr z8ehM$&3eJeJgh9q5f%g1@9lNj-_0V$d%By|D2R!Dq2MmCO(i2(&D67+g~YWBLx7uyNbYLe7per;7F~? zK!O1iiSU+SLOt_TsLDekBjsUH#3m*NLFH=p$ z9O;Z-j!j4x0TZ`K#sJX|lIxc!XHM=5ec$W@I&Gd}$)cbHq=~l^c}!^iQNZZ9Jxo@t zZ7Q`7!oPH;LN^kmw9Mk--yeO|nGmijf9}5SuiF-g5Ms)Ea z`)RI#8FFMQ9cr_ij*dbIJtGmM17vmEu>z)Ejl(py7siA#=r7f3ogN}CDCY?2?$dTW zfZbDlykTpXQFQO{Gvo>f7>!3VfpW4CG$R)=uQ%2`*(|I-f+xrJgWTMN9jGQxDP_*zoOHYr4C-XLn z5q_(CYX8|jRQotZT2@=j?j9aF;PD~)UihY=&9{7Ym_rs?qTKtsv;}5}L9K~A4#oG9uAh_7Hs8Xo{Je_(!*)(z%TUlr7{5OSG@$^*&& zH&e1Kfm*Ek$tK$p<%8w%26|}aMZa&C-;RSt+X9t?%~x!}Rqz5C9m6W_U$E8)=(_)Q zzT@ZRe=^pYlrWXku_@p_FnHaFU=!79=9y z1c!5)U zDvqX~AG$&{2Zn+X@b}G-mOykS=Z}2>^fq>zmAEHRZwCl459Z#K%>a6{8N|-8OF?V- zuG5%m^;dc~swdn9L-7-bHou-$Dv1Myie!)~q)I?|T| z8U8kQAUX%eu}K13a2i&CA_(prmVbyKqP!@xdv!b9+t$OSCQmYO62Yqy5(jd>07)RK z0EQJjLSTH^R5PSPfxTqDe3=w}pxq+V3p-tvT^Cye1l?ESd2}@1ZOd_*0%sfzU;cle zuple&=ClZw@vDR_K(o~W3l-hLum`>U4KTpkJS?oH^{J!)H%&+xh=>((D@Bn3D438j zVCbljx+^545!3;bsw?)gy?{HD*$>k4UH9_i}vG} zO24l~yMybkCbVgkAVxvydS-;=K}%%`PcI`gFgO^JpGr?X4s?JqU93k)p$DbbFR+M! 
zwyN*7COfF`=&=2-|KSO^fGlF@vmueUNd(;o+PA5@8wq;Qp}y#|v>sChCCiMX_+YUC zA^`thcl8X&;vah;{iynW!bL?h?LxFV58XPiR>N0_Bj4WKI}dVPbd_y-_sFMcBt_Z& z41*1N{lBHCZWcn~B;^+g{-J@OUd+17rPC3^lGBG%qfY?6mBg+3*3l}1TBjxmRD#f5 zD5p!@ix*wK1X)9H;}Smh-F!919?kcNKm|I);dcglm#HGNRpJi*dyX>skwkmdoeXKN zKyo%QI~#~JS77-0d4p&qCl+lgm~khdPLqRdl%n)$^?&qMa#llz>&G?Z%$X2_0B*{d zw}7T1>2TIEP?qt>Xe*31H$y#)z-Q3N$Up2)10PUur=frm4Ov}XD%~pENZ{GOZg{ml zpGt*w8{~{5i?hD7qX@D{D0kIB&#t~|&)SyCfCE*i03|*|%kIr_YP&jLsCW_K2x@Jn zi?nXNojPB?X5)5A#jMY*+-FPt@8&oXZb7v+1&tU|cR&tFo^x<{0vy9@BS01Pt^s^b8rV>FT92|b&jWiJDQ2!S(Rz%O>^NCmSB;zAe?2T#qjnVuV= z_bL^`<{*3`(*2$2j$*Z}GtgZH!{3xgZIHm5{a|Rfy}J7IeK0^gjm829@B(!?nINsP zQ1Zsn=*mbHAyjmYkUS<}asYx-HkfT@KR8=BQnF?XAskugWA015>OV+l-8G=I19fRA zxXcQsaKD4(T+Ckty6{@`sC8pA+?K=aNAf{IoTrf8a-e=LaelZ8Wt*~s%A32d{Q#tB z0=4>-4T7^WGq>Qq25^A~-RK3S2VzG9k|_{xcin5+oM<`=?aT$xWS4C1!&4X$Q}2T& z@f@V2`(It4ptE26#`)gOvJc{*(E+*)Tb-*4EVz1)%mRk1V6a7s-}t=ldZVRqv!$?X zu12T?ctbBBQpM3)=ku@*$%_n3^RflA+xii?K?(j>BBpeq(YtiGKi zlI|Ail7nLE0j+eEtsSvH3*(MprO=JjC~2;Fb6^^(oimO>@9vTX9pSCtt)b5HxSDN$Pel@j^=~v zhS@_)eV{=Eo&Wd65duQO`A6S#Q$(z;HN5_;JNG3$9NJ`o_nxXD=-p(u8!VjM+~<9c zwk`su3aW<sz^7FJ zSfujJT^`ECp`o4JdCg5Zwm65QYH2QG-W#j*CcZn~fkd)}D!(^*k)}e)f1OdaL?=OgU-gS5ia2i=@Hj2-0lnTF*T^sF6H&)F2=jn`6t=!km15x z?5yXbAB5sjZ=da${iL5?LnrwLi<|eo^`RFgw0EC5R+uTo{7B-DVy*!ryE13^c- z-z}pxvDM6%8TGa(_9OQ(BX2jLDU$6i4(@w8e~T3&J-Q=? 
z2do{6=OvSPbVweFOxupcM-AVL>tp|AQ+MD%nf+Dyqc#TeZ-KBkk^!@1|}JC&JlGP`i;d6Bw6PeG$nIh_E6{FYdd!f)jr~5 z{PmZ#dgX~y1l1J`uUoV&!x;T#CxWTdF{|G4(czoH#=U7Dl)f16iVrB5Xw0uyuOHtJ zIHC;dn>~^386Ujr%SZbBOt^U@ceyau{H)zKBXwki}}UIZ%K65McM7 zS+DqW|7rn_7Ze=Hr4|WWv(pD+@9~js_!wke`FXQxnkv2T^ff1$>)TX#{0I&#%wO#Y-qH;bh|I-f4oAt`ot z27Ab2JZzwzV?NzK;C2t)wYX5`@t)=!@3`L2>1@PnVJ$gnM6!6y)P0j@iSFiIr7$15 zao~E`xa#*rE=9RBv8IK2nW1uJpOib#E1>cxyIBuy%O`%3v9_q?yonJj4vD?5ZC`vW z9o8Bi$B3_;f5+FgzZgeie)Lf`(Fu=T^JGty2}h>bkPy`=CQSPHo!7cK^_^aAQ*QvS z@%)|wiWLT6L?KLv6=)Ti0vG|1I3p)#s5Aae3HLu7ukK`ik-#qFR1~JFr|Eb`vhHC- z+dSq&n7QiKC_yG_J^VEHcGBCZ>v&ymO9F{sa9fZ*gWd=K#%}xN&ZjgPCznE>n2eij zBxwrT&FVXSU2UwIJcx~*uc{Dx?Iz6M{#AI#-~iX-QYD)`zY@KEAs z4(>D#rz8n{7?6 zB#K`@eq0E>UElVg2qW0SI~2UN%r_#{Ho$_JDSQo0O?lupqICeb zdzb52)9Q{ zgjKaKHwKSA>OA<;KJ{oQ_DGnAO#Kt_vLUI;h;U1)?YHu^fO>8c*_YSzYj;4a>;xN&ghHfsNq7;T<>H{qCn z!x;g#qKiA7~ivX=ZO$lEw3b&xhs?mO+%)?Pdjb&M=5;7cU88-q-k5} zaASa{wf-u`$+Gq(JW7Y7k_P+jZQF>4`m{)|HCFhY#%$*!F*>7r5mAZ`H~2`$RT{DG z{N{?-|D{Ot$t%RGTw!wlS4YG%%Gkq$D$3qXZhh+p%7Zyq-_`J))YkV|bOGbbbT-rV z^FBF+m#GwO|Fj2Q##JJJG~M76;U7(Eu;M`rEiuy7jJZu;KuVAP1u&ad9TYrnBaiF= zp^~?@J?AWLnt}pDYsc(KjtpKreKn;8_M0qEL(F93&*Ti6Z^bE~ruejb*(&3$i_z!r z-!48zpYz3Esn}<-UP`jRQib+=i+wGloQ}Sn@b>2yOo>Eb?r6TIH7!)ANhCS(%ixmp z2czQ}{Veavx36gMB;Gr!61bkA1{p&FEhYY*gMP<=XVlOgGv)vytH&^$3*6OTY7o3KL5oj0C$*{LVvB&I5NTS>y>In>X9=@L|ET0l|FIC}Gk01&J9Aq1K* zVFcx>Rk1onVsNs5fM08sks7$+!eEDGoK31S8%)m)frGF-IGzn0BtRdK4gIFp&sQR? 
zt*s~&oAUFu!;?09fToaXbEusIH6GT-!=yGTZ`@U+UX2(c%_1ZFwzuAxa!`5qwMNg0 z57^&njTDPK6i$&655GJnT{ZY|ex`M|>{TEFq!;|b9sI2g;CbeWbV=rj1$ySTl9Jo3 z6?DUOA2xN2pQDx8aQqr2Fd1L&LVGz}ZbA4?cS)*R5FGIV>uUmxwD9d!R6evii-Sd0 z70blis+rd!6(Xscj0H-gHf|xKd%&x;OSInxdoh0)UjYIgj!0>P-<+77G+D2NmkO5J zVbFZ*%)ZJ-e8^zx{rOc77vGZb$jCM@@+AaV<8!8TL!-F_xI(L!7+wu8k4^fU1%aMy z92y!z`Z=J;8InHr0g^wczCjKg7|gwAo3@1&<|wJcY%c+g*i-(7!)MN9rsjZDH-oh* z(z(FTRy^!ylX|j&2?~h9oZ%0_g!D*fsg>$O1%oXxrWZkX^#|9;+1~}d@MfE!r-kmU zMdIUaWL6wTJf{Hk!m3v#!W6b1CIa4~71R_0F2B*SBpWtRNAK=VN2`BucSf2v-8uMe z@RwEs+k54RIlN^!cShf6@G98Q0YxSwZ%quO`UKE$NTuxrcFJ&D0nYem8ez&4y7$UO z#0s%%mPVZkAc7!>PuS4hb-?VNpwllkz$D6}+D73Mdh5MB`ohraD;$E9`0swf2sn{! zSR`zVsdU$|>(HA66@cz2yL~uDp@Q@a&95zT%1{C@gb7{s+Y9%D5vCcsz)2UrGBUXO zIha#6fdT`1AN3A>WtHaw7=H^wBliz<094u0Y8N(dO&LCFhAa=A_zEQd_lzfj)+k*RUqi)ccbn}^2h+M6;mcm+x zf14$cV3iSdN0QT%m$fnh=rNb3E0fdL*wz*ZRT1=VSy}};9Sow4n`L=w9BO|h8sYO^ zdZT@oh1K`{7;p-25IyW)aO$-bhEVjZ$@r;k)>Hj2^E)@kcEA8nl!lfTL7T>QczgC> z0|Gk)gIQ?Bt|a(LeZF}YiSqz6-QGlA7Ob#fU^W0&|Ltg7h~AZYOmcH_OO6?fDy;5Jr;8s9^}eHe)X>Qv+PUBH!4YCt9;E*MdHvfbZ7VcWpQw)T;7LSt$QLr;fNH522!SrZ zArS5eK7k63T)8S+8UnF(?WgN_8Ki<JqvBNmX*z-SOPKm8EZ4#us2QP8XE zVQBv*nQ(7Xa2#p1MLfH~y|46rA6l>v@O@-YahDIl|D9s=;Dir5rwqOnGA#=h)O7(ycq}KzY`WQJOYp;v;YFhi!Mc-vKjs;OR$NLZ2v)otOJ0$Vbbv&f)haC zf=Gc-`EVWIe{ljh#{iI9Gg@XNiF5)ukp;mC7W(sI@azf8cP9P=vv9P8FuIl7y5d%fbm+7^G46Y5j+@Ks9Cd`MDr+ zWe8I>USSC^$JN~240(+zvI4dqhM0sz^I!NBcF_c+dWj&G^?JLT>$5X6avgMxm`{%) z+QHx=Ot*xi&GMywjpMJe5ATte3Rbnii|cd)E13dUA_i@jKz9So~g3h}q1Dp(>g8T}kMSvyIe*FA7z_7#uwkk*>V!(eG79P~0 z)?+2Tqv$2#+z@h13=jlWA@2hOMP;@>;nzL`U_}PgP7pF)LOz7|4+xcMFoJ4Beuw=d zixV6M`wZ4Ha3}ny>|NfSOa*v(tq|T0rwo8w#GCfC0@NxQYPl~I+V|IXaR)x7@VTYx>B)Q{ou~izPp)bLFPP7; zce+!r5CRpCkn`|_7SZ9aA?cHZhK3@c#8?2a&Htx31J)e2#ghxL&M*CwuHYQCF1SXz z)RSjRsoPA<%y8US8>~HFD8<4gT#E!ib5!cQq=n=uHnyMV+Q`ubus{0-zz z%Ux%4*x`&4+c=Cp2*0u8rlU7Q{C*rrP+sj~np&4-E(n@J4pVZ)Nhr7;Jrs#>@85vj zgd<7Hk-!V|ZM7E{F=N0m7?Hfp-wLyo{|zAv$#}q@4E2=cof#DRb4*NeU+5C8M*z`# 
zCYQ_(yIF-(&@$u_Qb0oqo>F3kk#MW+r_^ho$M9rzivYYGM)h=L$Y#CRYo(YtEw+WO z6JbJ&@eX`EAA6b6^Z+FY_fyrc<2GIe(T^%J2yl=ouknqvJ4BcuNw960SlDt0sU)J7 zRNrVjm>*2pdo)-WT3sy$N6M`v6p)52L#jsTFPN%F+$#nh!UB3Ov&WJd-VxRSBB+|z z7^8m2#?wo zH4P_Xx(P8{nF5<#u3$hmuOCSXcQT}Wn~aU&XY01PYvVtlcRv}vyd-3QREei@gWXK6 zMZ-m!g!5yA5fP?sgEK}WnhaO3ES#i%C75m~C&+5wowZzL8e@L+#C)}%^K=b}B}M*P z!@zmdPZ$1nDeFkm+uoQq^enP2;3x&-#-|ua;(>abn4BD_IvrqLa3Oj@R2E$W+KU-M zzal(=9P|9sr-=M0Lhha-2-gcg5=RE{ca*)9I~>yvF5MY#nU|z1C1|7hH?x9^50Y8I zD9H9a?!-bj@~V(373uilco!s4BD)`?%Sb^BEOxc~78g_r6vF02!k+lG2YDunGY-QC zRLC-xWdVJt#$O?GD*@Qm=78%K5_bTq1M{SIxEMgry`aDx6vSg2YP_EFG$)7jx0e=A z5j3Q+FM<4L=HZDzq9Y=|fvFXwCTcj^rbagT-X5crR2sm`j;qX!jOUN-_F8bCuImbC z+a<``C%I2-COb@JF;U#fOljThH-XA-0<@wKa~}#*xv%1oI{hY=Eo8j1as~epDfT?$ z_+G?F31pO(sZ=<$J$_|hc2TXfF4X1+;0&qs9097OcP}q%Mv(4=5eh;z( zCYU%uZV{oOM)e6UIPg}wb^NmJ8Deenwm!@c~-lVd@GYpR~AEFjS%t1yHFvjt{j zP>}s+OsMt^9jAv0GoC6$rxDxQ-7oFoh zp&Hq8zp8SPlj*en#k?*9C|#A&8iWv_0;-Pi5Mj<-flfIanAqC`B?Btiey)*si4gQ! zz|9zxscq1Wp*uK9s6tsOrjcS80_bRlB?`iWtn8Wp9v;R%Q2!GiAcU-FKN>2I1S4u_ zdJ+(F6DI@f%BM-B8QN7 zF#RcC#bOQ90=poBC=!jSLU1n-;KArPOd+7)AivQ^BP7(w+Xg-!6!UF>6rq8G18O)M z5p{mcCk76V3dA6wS8mx#pL%JD=$H^m2BRKSSm zxc$zG0rLLh;##)4E9)e^55>Stry16*s3Pd|<(?t^D6rwB$Q!6}oJDkU@G1*H*4r~5 zw$TpVJWMUXW36LL0jI*#?T(x=iM;u5paKOT0UmT0+%Eaw?QR0vutKw#AyEILDs;QI z(xf95&RSSlz(sEsY4u+o#y#ECzaxi_K=WG&CnkZAhy}9sM%zUrpkCiXqX72}ISk1b zAfjdbcjrb>`T|*tlwNYe&@p5Wa>M#Uj)W6)$cAJPs?Qg!|7G=$F_&?;DqCFNf#P$R zO915ul3v1MVVStOyTjEJhv)T>uL6w^69&EaF5?o8{-o<}tdgH>)`a^t!ByB;|5O}J zO&HKQLfANa3aI(5oGnH593&E1J;U{5C?&Z5Ql((v>jfV9`1JG(6}lG!Wt%_>0ZAP; zLD+Am|B=xn6*v++4L^VG>hjfiJtImFdvxR|`UP3|rAGQ3k}Aw%!-=o!3hypuZLYMN9}H`7Tt=^Y6M(lEZoGhH3h+>lrIz zIkH4K`a=BgF~}q_85smcTAw7auOQWP-+~h&s)7eZ4(ddxB>-?A1O#kh?zQP7(0sws zoe!Wsf&GN!TlldG@To*j27q8-9fim|;aXcN4s}p>HiN4@vSDCn-)*JPgcGljvp97+ zI<-|*31G?qO8@)vii+Xxk1sG5B6M{u%bh!SghBs?BoYwIA~ONVUID)p`^S%w8VY0DQw{cL*K_*pCDQ|UyukGSeC{(o4stALJ?B0B7+}*o zdi;js_(raVLu&L+5AW(;r?e%@DMlPgCMGWKpx1cDb9Dq{VmXk|Aly;(98gS1jw45O 
zz?>622S?aKwKxR^??quy3}#8RL57(kYV#4ACIssSU;uuhJK&}f;+WEgqcYE!bLnx;VGj#CQgeDXeK}( zWDZ*&TC)HcbVN{>Qj?aL^mL>F^M%g^L%G1ebtgWZ6fEQ}-C8G3;3eAs$p~d3S_N$& zp++D`IALuHP+z2yf-+tg_(RwV;~5Yi0vpPC|r@i zg={`8mZ6q^wEz-+Iw0Y1fm7#Be>V(`g0Q?auUv4K%RS1CxX(ioon;nHWNO#CvI8#> zPWE00MLoi$fIVjDU*z*PdkfRz*ayjffltI<6KW}V_C!o6f452}W`=e*D*(_qj4bp{ zW*6ZJ%i_e+{4XRc0QH{c5AnTXRYXq;FgSo6ka)?@_uj+cS{UuVbrD_&PD45o#LQ_;YOiRjne`NNGB=t zx~!lUKs@8Pw2LD^n}mESsGQBUo3~~+G zZpxW(w2_g~?*6{6NW2Ji+F;P+n9L74(#HOXI`gwnf$Ty}xQ4>8c&G2d+>%q3-Z6Zzf)7v@*}8et+8c{ik&4f2;!}#vuRNl?Bx? z@+T!gI!;%ToFksT=_jgp7YC<R^(W@_%qn%xDKYjW3K!YTQ8$@Vt1Ea? zsr-z9^F766($S8wo8#Fkkw{_*a{(&%ko3Wjb9qB%$2X;wX?~EU_#CM`XkHCL(?$l1 zFDP~M1rsU+W#3@E7`wTilK1o|i8|Pbjj!ciE_Tm$CFcP#-M&-DC~i^X3azK_f72dO zUr7uMun8~Szfa{zm&^3tQ$QieMUFKuZ#Cw-*QbpS!q-|CM9w=1*$W4WI=CO38GJP4 zIW*`0ihrrvTQBfcR-9nI31_PJ4gR&^o?1U%6Y1$jX7QNV$gi{truu8tJ<^{(OJ7L_ zfs`q^>b1}9Ip3_~PmIN6CGQ6H7x?tgy(Ot+{yp#I_7YR*?T0>-KWI#C&-L;b@}4vN zXu75Fp8Cb84O8d7nv11VOS9F+8nUm3rY6UCUbxwDQQ*6uzg0R?A}^8oVT!UsT>U^| z^Z!yuWxZchhj-0_Y_w!%s(oee`9a5p&H(HDN83&LzA9=zs4@*0=|Mc7jFNdS?c?q3 zbMzx(Xb6|Sen1udJfAdz&;>X1n>CXsb*g!vDAskN_*=Os86{<99{r49i@*ebo--x8LwVJ15GG39#u~OH?p*+lsG3c}gsUHOjhb!StOv6Jm zx(pp=&%XDe6M5=58D63@4<&bgJj$}_&S>60V7pHAHn6;ZoK*j#7mKxzrBGFqC&=$~ zPz(VtphJznBvti-mq5`B(4$Fxew!)5f?)&$)&8slF{h~B!4#0)~rURm>wTX0l z-g8l&PbPedk`YMeIdH94X4l{4M04bBkOr-E^L(WMEe;=w2o$kLVcK(eoDJK=kG2hY zM`0MDGm7+;$Og&bIcpL+yprK6in=5i-hr86JQgome?JGNXYk#gK7A^C@_r+zfstobSNm7_N(KXtUyPUStKQy< zG@`e>-Pzr6Hkc-g0sNR>{-S>JPLmQ{leMN|{fYrcVRG;*onMcnX252UkD=+|L5RQy zGqcH~64CS8myLYrA&Bat7+~1G=+~Y`()s0rN@7`dWjFWVcMfl%-KRnsa1wWdxe{){ z@!Z~ryA{)9N1j@1g8SJZvpkOq03H8oMLGHBjrtWhdxe@EMFb&UyV~fp7+Zwb`zWGX zeib6jbbHq9RJGzc%E`|29SJK_vXj!e!hR<)>9WVGYE8ma-Bw%NSY;2SckE!Aj?J^`UC+BlHuqnY zifX<>kKfqzYO~bDJ;_JmsoXP(2&1lD*ICsoI9d%0e zCzLbRpXG}IdJV^ZSCjqPx~#9H%lHuZ{8qg#6TN7E?4x<_&F*yHs}EvXOQE7X1jZMA z?mA!y(b8u5tr+=a$#usTlu@9JWqkA?Mm z?;lJ1e5vsujQ)0@VD$LZK zoWJKGXW5HEOObvfc;07ldgX&t{#mXjBTjDnajrAyx+KsyPT}8AZwsft@V~@KP2V5q 
z!aFj+>ZVF+@}pNUYm+QrpNkR~({Uxys&aDQS`BN;5}`1-xjWxe>rs866!r8LMrEen zHs0bRl{MSf_J;zFmlk`9{iPq!-#F^})F;b)HYU(tGuSM{6BQNdncDU3+|Nk8r}KK% zZhufh_pasZRN+Y-pS%+MULt?>^bK+L9Y{R>qZIKzf+r8p-fO!Ey&Ud1CNAc*fZbbz z{ZyG|560QIQbQkvSDEp*yyN+s6l4r#8nW*w=pT}b4cL2K>&@bvKex&*>9C-{cP;aL6)RN0mT^;0LbZmlAd4-z!^P zI?j^QX?PpMn%>;?%_rzG7DnfQ^ar+B2TQg&%Y&&;q8Ss@7}NL%H5bX;zPObcsVDko zMpNAs6jVAIn!hK*_s9K{bYrEYwe$B*(KC-C=bp@S#C0){;tR}d*3A3$OH$S^3Sf^% z%I?0XcWc-)w-*0^xqN%W0Y`cZvv1rdYW(Bj#nh9LH2>6R@MNhDdYm3}Yv~t8-&B!T zeN-1U-jc|$c%2Uu)(C@Nhj$p&yoFg*C1GL}A6Dt}&~-8F@e77{nWZYv>He5<>aNwB zZyVOP{>9###-}^$On&VhdsNqoV@o%zKAd*h^zm`DHI&u=GO}yxa*if>lUKLuc1KF& z_=}pnse}gZR~;Qbtuf@b&&y}^go$nB&I!Ca3@v3Sg8xt50L} zkQ6oqk}GRg3MxdL5Y5`I$jyXW4>}yr;$~X}Onj0&DzjDRkGHsn^GPow@AGPdzW%hVd5uJJy=>xmp6~^)3jFG zUVd>&@EmE7%8i-W@!svZJCvfA_l@#ow=+}WRmnB*KNEEN$>hY(zj8F#&VJ_wd4>qm z&y>CWn4e=U&7|JQ!S0xxfSpbeZEqgWWkC1wk8pM6qTflij`vEb!OrW~fmK%3##=w% z3p8DHqy(Q0FOD%!!U|e(@da|3l?hbjC~*|ixRh}QU6S`P*>tf}%MaF{YN{B&dXAzD zQbcizKbrOtj%~~&tR8<74>cNmnbWdZJH;QD?S0q1VJXGG3rQhOb0cuI*991Za`otwS;^!QBW z^0S;9YpL0v!|x3(hQ6+FuV2%oC9wH^_FY=AyQ&aMDBebz=qFc^qNgkYnVnmcCPgs& zr85&3P^ugYuY4u?n--Hddy=GU^lrwhbsB#g;k~iJ*`K(e&{mt{pdi&JSA_Ep>vc`yz^qp_Zc}7y>^#PJX=K z%+LRLLO5af0QEhdmYDpaJ`tAwV}9PxtQX(4k@p)tH1{_6STgxcz1T82moqCQ?6z+5 z!<*8YWTP54HHpoMb>z0?x07owXO7dq2*nTY{QMzpeBL=~zVrI?s-Df!dGD3&(ep{G zW{uyfdT-8+(8bGDHiy>?jr%{po)O-1Uya8OK&g?FA@jV$S| z?xpKqlBmanq;p>4#6`2HfOEU%nheEcA&>8JR7g25G^CkHuEl+~a?42_)4EX@h`()M z)C_dGAczGcJlif*zGbuQYdP<4VB@q~^GMF;J&Ia+tR?8mh%AAO($R&~Gbgf)`Kn&r zxAeHe)biC{RDTUhjg(aH;&}2$`gUDN_>P|O?|?B0?qaJ)^feV1Qpjf+mev=DFW{Yz zw_o~6`>Nuw&@kU@t((T@`LV^-F-sdvnZ_e#Jt8-qLJ>2%*p?raoMCUvkFS=$~Qr;Dx zdK7EWbX1D3SxQ@{`yitB3pZ=u`}HqrYWM`B{#ZY$IBF`LtEe8Oqcwi!PArssof+`p zxWDB@we>}{HplIFOJwQ2Di3a`)BV{4;kr1HOIOMmd)DbS8ry>9US>B85ilPl8opzi zVaL-s;JZL86z6A0F}Tm2sdeIP8X6E__RevTWAR%du1ML|i}lAZE2T5xzsAEpOH6Ump&}>s-a_5s9*jchUP8w8_%( z?vY!$jW>cVgU6NKZ5D^f_R>ZiZWNJnkM4!-Q&Hbt@$i2o%qVy0Y&!6|q5shL!^6?9 zC!BAbn(|=x3GqAqaCw*6jLQdE@AdB&tXmp*JS$Nh`@Z~cFC)TiLZ 
z6|3)d3Dmkt=6tQw9$$XdWa#pOLP=d5^%{#Dx2I-OwDS`|NXR$&cqb7*)gXfxBgwjn ziTt;6sporo_w*k2yj%11ulxNpBuvw;_j@T$ZQ@WSYObvrD^wrTb z=6LMU$M#>3&Rx9{M`0lNA~0lU{+kTj74O9{!b>wu*OXW*q-1!H*zJ4`TXcV2iMij! z0RNWqT={r+tXcO!>ygdFsSw+WbiF#|{QwdVcqYN9vo$h5Lg0C&BoGUu;?l)Lq>(=D7->X5WoZ@~Zs_4HP*Ur9+x=fgLM(hOw` zC^4KNOe(H&4|sLd#{*F~gJDK^U~_9L(AO8!P7V!gHHqci7D?j4>$v&V&dajy6UgOApeK=sLk){pe0u+jnPAca92pqgM zuy2PSlK;_T9V~PX-HZiJp9fnIP{#N4;vVsuZf5;vnU%@ZBHRIZ`!s<$!K+ub9-1y*FG7|bsG7bk77#N70mkz2QR`HV21|Rxcxg;PPz6c)>Jx2~l2f*oEA+&jTU|I$4 zP>4f37~Pw8yk>%n!wD$Ae%wpp*9eP_4u(%+nI}iVx$r@t(T3BiD!h%X{#g;va{KfOjI!Ns!w{lyel=I7`0`tS74Fm1@7{G#NivJ5Ud zRoZ3uM;9bLlS`8#Cp;ifjnS%Q8r}0EbfG7Z-|Y@AbzR1M&xUOp>|V&Tz_Yz7@szov zhk@dYy`2SC&2ZK`wf4iX#>b9-oGKDNtEaNf+Z}G0Q5MS^btIJM5xL(8w7ojgc=xyQ z9qyaW$!D{aY-X`7E{DD|?7Ben{fND|ZB!P97_x2UrhX?_3-37j=E^Zqj1kDI^cjpS0cP34f0rksjkT_{tur{<_aTDE95E zf_WFtc*FT~Q8bEfh1fPfM_M!r9nhK=UPQD^HsVn3{60e+JD2pvtKj;V1f%uog@TZJ zZ!C*DK~q8P7p|sv%PDDZ?(*M^X!Py9|8DUI_5+DwDW_3Z*6OvryPhdxIP@RpgGg6< zT8rfpY^67JI5deWpAZ`^^_?|qsE-gT+q|3IID0Q{{b4i1=PiSf*cv?nr&g{}qSi0> zRMSh(X}((Z|M^lLy)9j|mfAy&-iem|jlM^2*ckS@d`(4;>zk>?6F)9T`oVK2{7g(= zrZ*zR?#JSOZX>uCihapK)Oe#cA>@%H&Ig-Zwb^*hytLN#0X+v{9vavQVn>^zp41u9 z4@g4uqxR_Rx(NbD3R%%YxtuP_&m;YlH7S&Gd6f}bJKwdTTAUk7?0-qu`i55Fxl{R-&o+0T5Bgjh+uN3UWlh8MqtA)e zQL{QiFRJ^E-mmpMWg&|N6e@*gv_=nm{t9g$<^%i9#dAv)CSs{Im~^TV%as!9dGn1A zA_(hfE|{PPDTe){*J4*nxce1 zq)oBIU*P(g>E)Y?&95jMmMJNsO0gIiHp)3m-5!#?{QAs%=*#q56W4cDi7mZ#SJPsn zX3kugVDRPAtB!o3LTrEFc|Hxld-BhivYYiTjYsx_#vaX=OKJn%Eh*=4bjUG3-So){ zuHlWpcDBrIt_mqaCxDu2%8$ z1y?h%0c>LsYY3WC!8aZ==h**V*!H^o5_)Q(u*d1gh*G+*ek$?vYoBo)Z(#kXcqen+ zoAFQmxz}aby$sxcxDR9cX||b`W!}(xv?+Bxiu~S$@!XVg@cT(hyN@8B=QXVJ&n%|h zdiCtTnj9=Rt==W03u;JlZp1r}5>QY~(o!P8SFCf#8D$a3ykQ$G`i1Ic>aFaT#maL5 z3r#-Pj^A=j(mAGiya<;25^{+U9YT8+6+hb2^Qocvr(i*;*VA~uAgdVfKio8SI~Z)R zFT~5H?=npoI?{#3OHMW@T(F9ceJ)l@&C?dO6|#Xn*fSC}N1xs{Jg32Zf-yZ-Ub@zI zk#(NfaR947G1TFbiex_{set>Ql=4nRJN|cv``!AH_h=(pqwc9HwSFMOc(4&h8>RkO zO!umcejOfBQAp7PI==d+aXuo~J7!fdQ0=rMIRi)CVx;`WFH(tISQ&2$fsyF-%Ck 
zO7Q;#cL<30#k%c-^nY@zh8jSVaUA}?Ibq)9Vw^fF&Q->0$DWXcl!t{-ZeNc?!3rL) zy@k4nXX;pHKn(M;@6j~I1D9}fX%eho(+08j~v%M4*6 zt%m^s0P2iXq$w<5WmR#m#l#$@202Jem(VR~jY2UlY}}4CpRe&rf9V|T0Rp1-uEd&s zmyw!4`3u|t00$^ZL_t(?4X4h>Ln(HoM>H8Yy!IG6e>4N_rMCkC4D8|c&J+xA?~QIv zF2Q%G7nYxUD0Zp}O8m9wN5nhYAoajTESok2FL!L;?;F-?<+y_5Rk!zF$3>d$Y(M(kXg zEd09a5FBQ1K(pK&6qKo+?R|Y8@lTDlF~C0?(D)l zpJ!sp-|?_LJP#8`j6~WW5%7CK0tJM;Ej|d^3#I>FEd4YZ9tg+Lyl(iSmlU>($_xyR zU}zwN?B5*-)MNx>-hn)WQbYyJ#|Gtp@RR>=o!ysMKZfWwZ4H8^X^_5e20TApivGh7 zV~UN~{i{L7QGfhw?}M+KDa?WLR(NsZr1HOe^+EA@5B%r5 z<5)6ivIL&3C`L?dIy#J=i$x#xgrP7REBX$BZ=g5MebTkTKTU^2S_X%X?O;`uj{-FT z&>%mr0J1h6;bJKEAZ;X+aR{EX0?E_2V9m&upo|oxiuCY6jbygy?%o<@Ss8dJh38jD z3B%@=h!0$b#fz8V`}3*DI2(ZFn?l7-rj==P2UseKP$ZEPOh(2=kZY*o{xm~$_Hc!1 zb_TLVc(9=+!mWMyq09@fyGZ0xJpiB#p&PzJk@v^&YHovGAAEZq1N9C1Qf- zV@1-N_Y7n?X@j3~8h@g0Zn28mhR%*{}mVx->&}MwU)KO{d3Y=7UFLm~aO- z^V9&!aQ$W~`i~m{2T2_b0%ZynG%6*Ol6o#liO6+xvA^@D_`<6NWGF`1y0!Q{UkX(s z04PzWc+_CUxD%U(_WcK-r6i70TVl+oTd^}Z7{S4TSoDfBT6itOR=@FLC#z73dzT}T ztSCRY%ZQFeLC+V_zo|41?#z4lLD$SU#1@z5%gfAxd7lC3DxpjEKq(@&{|vJ+S!;05ti?%$B3i zz;3$ve9|d_?(P4OyytAC!@HN$^!K*Ytul>nKk`|q^XS5kB{ad^j!f)&lg}4FiJj>UmvpH>8rl9; zN>}$RC2uzyvhF#F{6iDO4pdHvqX2JLa(#0H{kZQ7dVT0*3cOh?cB+4=^XT;F@970i2$W_BUEF>PHS>7K|z~Mc#Wcqs)e5*wN@p3QGj>T`1yOk-~ z^_fav9KQAFJCu7Nm=;gyK~^Sq)W^q<4&PQtB%gx97kH9!Q-|{XN2``(W6_T0hUt5r zsiuvJPt!cNPBdoKLHcdqYMMG?5^cJaujg!3YYk9(bssJE_Mm3gJ!q=`@02L$IbJnm zPl(w`-tAoJt&In1-)9qP=#;H=vsB;Pa`oe1of|=Zuf0MuHXNXzw!BZHrthFQ9krbS z5$XIIVM*F;#1+`WwX=i7UM3)A8VE{)%vk>Tbh%oshFmV4(@vU_48%s?Lz$f`+}ld? 
zMF-NQL{`iV+%2+3XHOScOYdqfO44rQR*D9#J>1}EBBisifF>^vR}u`-wO2b>h#f|o zk~Bo!yoXXt7kIcfm)5*jl8H2h4O&QX&}mbWj_A0Duy^eYDZVR5sx*QUBr}Hm-yI0l za=CQoX-x^zV{ap|%m!WD+llw0gT_jkjhO54D6;JY&-ONwdM=<`Dbk{EAq6er;no7W zy6avqMJ!{CF~%5U4M}2mr!vMEV~jCYJ1$}wV~jDzSfj&5EMtr@#u#gKxQJzpF~%5U ujSd&Fj4{R-W318PB9<}67-NhzI{ycAp+Hlyshn~E0000OP~ubw))i3q}Dx0RcvS5i!0SVge$JcePaHg$M4T zP>iVC@-o_9X=~p+^^jD2|%ebu@XSqtSsDg@J=!p-1|3|F~T?3=<9SHsk(AT`kX||IEH+n{GIa~f zy|yg;4i{6@uabuozf-ERA8T8is^quoB`@w)Sd&GryHBH>G6FPYiFPR-lz}|~ za`N`XBTrq`mx>dqhCf^^+9vL%3t4Ni%IW=__-rAxMloJ`30ruO_@`u7{<^p`^SvO( z8zH{4!&EmgxVc7>Uq0y-)|@K5+VOB$ZE=wNi|-^HA@*GhwJB3Ny|RGVjk&;)GHUaynXLZo`0Ktbg2Cd(OomzuL)6xoAB-Sp`7uL1s#;oG%|2>r~i@eRv z7N^=dK}@ZqiO+W!b>dw%Cj!-NCdqdSEs5Q_U*=s42r9g&midX61viL;QaN*NxC~td zIRV~h0p0}Y`#LhxLxc_XoE+|g81uBx%`z6;{rKA{t5NFjZoG{VjS`(5=%z^|FD_o# zTFTc~S{aL|B(6~VuS3k)=`-SUT9;GO)}=m5-Rs7Zp6*&$9$iy>@3S)3>RKEl-^w`#r33)-S-_Iz5VCoKT{RV|82JP z{cb|c0m--YezYIoKW0eHy^GwjiEzTP++q__{PYWp{W|Ky)`8`*TJ6iN6NL&Fhe39I z(v`<*x-kl5mmBv7Suv5V$&v9m=cOZRj^{jg@5r3;o~>&A(r42`$I)?e$lk{Ax?QSN zs5` z*)W+vW{1oFfk*%bPh3q$V8odEB0k9rDHUx6omX$p8+g^b@9qC$`{z6(SA;`yYd359 zGmdMlGOO||ZhBH#__B{)D;?_nRfjyN{9-U37ve=aSO1x6o~-%5)BSG;oD2&B8}5-m z-E6<0p06EF%rg5iiCsl7?(ZiP@d${Ch23#;yZ$Ymt!-pvWQgfmRAXb~_(n9Q49N~1 z0dfawX=P>nuiw61WMFVwR*xEM@OCq9@}su4wx;R+7o1)sJbOk?chigH+O=y+w{Ocl zc)(?0Y3Xu)=HJoP#VjOLH9A2a-FfgMNlm@jP$<2_*2X4+NiMvlw^w=9<)1+%r!nHz zx-Aj&+-uOE<78r*&n>lcaER{jzvJcY&BVrr4PPx5bI|$khC0j~$@m%?8ny1fG#}sB zF|yKOPLw9WyhgW!5-4QFJLOsVnU4t0y+s@asvC_^P9> zoJ%@UV#1k_7akG8!pW&(S(Kg4^7ZT2H@bx`tK-~Pu3VAkNon3ZDsA|K@8|FDajZ^o8X&_Oq<`aI)uDLNiQ|?lRQl+B5aaso@Dk=)Qel?*;QC{BmuI%bq zA&!%ilb5e=JA8I{cv!@F4waRi9i84iGeeJoL)%OGQPcwzl@4_ulUn5D;)a{IiJu(fU2fl5~)Jz)o>< z6ScOs;t~^gC3EUUr4NpeUxM%*qRpXD^YGi|@-Pb_tuUdvxj7#s0`?5QVDIgpGP#=O zKl@*xf)t`JWjl`*$$d&z?TbD!;e;bKuq06SfaZS7blGy-RSVu-K?3 zJR}4I1y7GN+30)Y5|44@_R=7wcxiGnh0p$PqI~VBdyPJ=U0v9y&ChC5^Nss6a)yR9 zD`SNb41p&XOPkNcP-friJR+l_f?vI&eXEx7Ixrnl)vrHn*Dj^}J*8i_{rnmfD`?2b 
zviU!K{YuAe`2MQ8&hE}mXmPRQV;8m8H(Osz_%bGabY27=^+nqza0qy^-a!yMhX;tre;L@PSD};xf(!dim0OjlHf~ z&+s_^rYJEsTy%N~pPb?96<8gOB`Yc3O$iACZQWNlobKJfFq)HdW$jx%C2X|Fqcy5g z`&|Odi_x9SBe?`Qn%Cei2^k5JyaEEG0RaJ#F)^X?%&jkl9QYt9N6KfN@Dp6(lJ@Zt zgFSK_`axlag{1YoJD4&p-|9Vup6WQ(n;?0$udh(LVr70FjesDU@>AmT=i#Tv`*Lha z^F4_yJp%*W!X=Q>WjT^9hTh2J=@jZFmOFi|y?EWaU;5D_VZ>YM=m^!+2GJKjrHkX3 znwskC>odY5-F+!|?a|qv4tykdXyEq(l`xAPHtsQnE+R+;eB`GPHpqxu3a36PNy!g1SJU>4_Rm6q)(Z&oBit4&GVbjqH zXRH;zeU(^XV_|c1b8C`b(7zmA4Bl8`+FUYhDRlk1)?bI4es8D6Xy7WmDAlarI7T9*aV0kTV%!p+;m=V48Gnz{H*$G^p6sQ6`9 zC)dB!EPV5vRal>{x(E&?AkxRzuTM4X;}E$V|9EiuSiT> zJVnf4AX}TmWo?ol69>m~W2Vv2v!A2?+qcV_`Q&hVw(xvUmNW~;^c#JIE^+Axef??- zo(#u8VW4wYO|84ZdmDx7OX0l{1t}PBuT9 zuZ1kF(&hpw8CkT%J6Or#?ZS8OZ1Tx|9vpbZ&`Z$VilakA^{0uF!FNW;Ghh3YmaCG& z%)>+6)YL>JWUrmLFlsNsqPCtwr^5^VuDT7MX^~(3} z-_^4fi5`5bqwqi4@tzljaAiBwAPVQ!*48$+u#oLxtf+`RUF{gXI#JrJ{eU;*?+zD` zlai7O`V`GH`5RfLefS{y68z*%#dLuVr*^)k!V*7TaM0$abI*MN*yV+pM&HiY3(L!q zU{P`$$yss{#8w3le*XN)%*RL4_nbR~5qEhsp9~_A&Kgbw%51#IK%FC*O8n7-#E+b) ztK|3{ot=&DF}xvv)0ocHV8*$vg>D+hLdZEnUCLKT--G8%S4?LqL^Jyqt@J{@O9IaD z;!b#8UYH?t+r9 zka?u+Vj&AWYA7z|KHA^70MYQ_x4PDm5lx%&lRtgDR5u>dX=-Y!zf--uHeE&F)Uc)a zEebHk8?LmO`FMef6jBBNTt5@O|=8`>JvJ!jB(6(pavquM3Hx3EGY1bV%rW zunG%bs`uPbbav+Zlp$rh=Ecd$x%2xQEf{d4&u(yRY^>@6e5^HsAt2JJ=@1(=^iGZ8 z!BhnnBsS$Ir5DM`L-X>kLMU9^4-^vqTT3WyxVpJj?dcEn^bAq0E-&XL775#rF6T-O zY3m{}S54|PbZ*WpP0XWUa_iEiOUpBj;t=?b_STIa7UwVk0OB(XQmv;ZhfrPZ_$7RQHju%v{R@+qa!vT!!mLvm5@UY+hC&Astv^+t|l830)|!JZ3C)^dK`KwgpZ>C+4oS zs%ji|cyJ&M0lg=IiPc^$my?pmI7pkL6B4bNrDX_mcGdyPIa!`ZuOiaPsqOWOuK?#{|1xym*mmP-%_zR7XusEqf|hoA!fo9LudZ zX|^Qu^Rv^Q{(i2#G~Hsu#f_-3S{F+R`9lU7lFd#^Q?uV+KY-J1Lj1kM77`KBtzznS zytnSP|IJgQ{*mL?25&ORIb-G>b>08&`lh&h1rn4{mw};SmQ~+#2uIJkjoPNCr%@Uy zahmxZ<(A!$s5kp~n;D@XoOFW}wVgEn_U@|{CgrzOJ#{cb*?*thZcR_*?dO8M)KeilMxUJD5w9xy|lQH6=Lu`blmHF;J zycuv(JL#5|p3d*Stge)+d2nz*)qo-U@FCB09+N0mmFLpMx(jgs?h8^UK}Wl*DTmWg z#o7*MVf_oS_iC+*_|1`uAn)nNfq~9}S2u)&F_qNRCa8=cFu%9^teD+hpwA^N{iBWv 
zZrt9^l=KmTTa54iZwlWys9?cX2_e=HUlyYp7(AbA#en>X9(aCwL!S8}Qldd@OMf6l zfU2|}2!>Nh<+tRx^9TR0ooZ)(Ty&LJRJ=54z58deujkXJajMm|wIU4zu%(h#&X_va z^=TonSNIQr0!s0j)>bqqU_-%3UpY19SLUee1st#6h@rlY4L$$?Oc-K#7>`Na^4d`Z z@nu=?Wpk+fA>3qt6{x=2wfhR3F^k#C%F1rNyg-{ns7v_4q{}Nee5tAKN_jb1STG=CLqHhSaPR!r{A2hY?c^&$Vc_(5X;j|GNJtq&kfH+ioK*0mL2|NWL^e_Lwj3=<=*;E`0wEI8OM^jynX9 zyndrDS9TtxAmHKQQOza`s=h$2^8*a@&(1Hc|5U4Bp5rUq+Z`7CGd`=_V4#Fld@WZ! zk4D(wfuXK%hfs<_!2eg*ZZuR4!4(BAo4Z(?(?|)vA_Da%SpzNAt9lqEle!3 zcUZXZ2tZeNK~0Tk*lKwJr=6YMWQ`L;a7c*D&a%djpFc18s=iZApM#552TV8q8vb-t`1(3R8t$eGgaH z5hYz{9QEvcidGOe%?Lbu0vQ)69F*?f<=&(7^z=j;IXJ1!&CMaXXN#*)Tq02oY9Mnb zr}&NLb9%XO!fX59pFe-TNJ4@_9h__ra&|;a-wXX)HU+J%J&2_*geJhyw@BD_NIKgw zIVmX#ElW7yUjRi)erT@V^Auh!(#Tfa+}=i-$U@y>j?QRQvT_*2cqXXV?5k+N&=0n{ z8HV#TSQ`7A|2v36o*C^+TYQ4t??oDzX|aT^4o7m=BqTv_;aIR_D#n#4aL>^2a9PMm z;C+$N(VTm9Ha1HqIaQ8ds*{=R2DYJTjiC|3?~c2Sf$D#C{FTs<#XW#xi{u$^scR4p~2djzF%mNNIXYqycKs}^WcPBrjSh%=$+rE4$m#bh~RkuuO7z1DZXq)UW=fB@DR>m9*nNkL7G2URB=T=6Sj@}F~a zL60$Tz)Zdm+?Wmh;JxT~T|ZNOE<0Yb4J;aka%w&m2J^_=U-!FkSu7M{Py44&#NZ{# z*5dQM-la^ir1eAk!a+Sf253(C$ET-hUcY_~nHFN;tNXC94<4(R!0{j_I73jg=+?{E z#QN{0B^RIR?ZpRwd<<3r-SZu5oN`#`oive)5CLq5-k=2> z%;PldO*>#eVZC}47nb4d>Di6L;AfUV)k}c%_ld&*O3_H@{%t+GkuWSYt^1+2R=d~Z zSvV;rWkg265gyWG`yaWNFK}~lArT&8R>Sruaf&7#ExY?xR_0wX)RVO?EGVR;hd*Su zR#(PGM~48QF!K4c3-}{`z10JagQw3TBMD_;x4~ZEmQ_OUOQF72tT1d2I91-g?P0m~ zUr`dAnq4&wGJe3k*Y)ey-G7Z>K_!iWO~|L;W~~J70;Ol|qjj1qqDz*mrm;Uc z0xGkfAJF7d+<5qv&z(x4qO!8R;|baP^0Muai1KI4Dgd*Za|p;)+**@w`e$UBBUK0as7DcnZZElTf=0F?kJAu%aQi)VroQwD-v zM~6IAs0G@?%LowAN?*t;WAqNo@~z`(Mdo6aE-j=aV-u5I;f-{0Pcv)la4144McKr} z!~*+t{UuBCv^k9H9}z(b!Ryz}0)89MAf@=|QT2?)XCcWIl8f_GV$DH8D7lEt*0whE zN}Iv)k5$?n6j@nWP*uHs`_?|DShtu4P?MzdRt|t9YwPMNy6r`Kzwm#gDl(}b2Ygvy zS~}<++2qq66*KVZm_II^01*Oi1$$5#0CjBP`fwv6lj@9-xWRawe2-5vhhkJMz!yAEA+5E~lZU z&i9DTDl9Bij30vo?nG&jX;`r(Dj~qegy;*vZ*@=rc6A^~*IM~BG!$UfdwYA>QO_gq zJb`luC^_pt7v6B1!9=CLdgZWQZom93FfwunNpu=mn|oiJETLaPih6j2#E+Ixcxs~o 
zCV}rb{hr^QN?u;^F>>V-tNE^`-ANA~G|`V+Jr5@-uSN2{Y{Eg+<7#9g5en6OBiRbaC@q#B=FSdXxkK%dIl_Ie{in_^} z)|!y|?Bn$fm!I384xBbPu$Y@)#^5>dS}5twW9f0K*nU0L*Cy(4zY}h2cOR^*rHsxoJg?Lll5ey2gm|dRl`0Lb zr(y4(8KZ)}|7JB-Ey8DK=R{eEh>V3QSkRG(}}(=q5!!&JC!=AOrf&P8kaBQMUY z%!4J)`pi@Oe)>6j-$RkC+>V5eguJb*Y}Btqj@#RnF;hjenku31&8Au?b_W#79!#T^HoIQR82lPjsK#UQt>Ck5eGh}l zl`M-vRt};sVzKJo<8-=DWv#T_^kVMS(6pOyqj?ClHGK(op77*9A`(Xh<*Lwn>D0%> z_5~=qTU>ec;6u=JEyebUkQ6A0G)sBx-+t?;IzE3JMr~96pwlwu z-hnV3q0i^k?6H260~bS;RJqvDg}(BYJO2H<8zCIwgenN<^ZwyC!%{Z^;L&{W3hJfI zk8k7LA)pJvwm>~HDfii+IWT@1T^&2yx64PNiuxxct1}1azfF>M&+wyApO?Lc@sAyC4Lx< zG`1?2Np$}f!O%df()bJObh)9PPn)XTe82yr1z0&{b)DXq{UR1_7;hv9SPC=%s~)P7 z?%)!-E8c#>c4TQ2YpqY?ykBAcVBA~L`SCs@R;P0M`y(^8oudP_n>r~!jI-Lby`23x z$4d-ip~Sb`eaU!2{@O6>da==k_9zBf=bbN)mA;4(Sw5$eqd`f>)h^riNOW8LPJJGE z>dMUK?{Vo`*hY5pi(1K<(^>cT4etrFF~%(id8Ou7H;Ct(a?oB1mdSU0)X{&%k4{ie zJsjsDEScK0?F67CZo5{K!p0{pm#dx6p8MN&Bf(KnbI@`s?=bRJ1I8?o$H&rpx##61 zA-VY!kPVu(vi9aW~!%^LQvLTWA z>!+yt8l0@cb(04NUN;0>)hs)MQJ9$63w1Y%mTI`#+KAYRE(>2#Ej91@nwdvQ8+>X$ z)tRCGace6nGB1bx#@2f3RIQLbG(xtx*RMKz&zdV8x!+ow$!;++{jNiM5p^-mC$`&c z!pyii5#xPjKM}MGvI{*p4e;*-prqIkUx^RjFL&vu8i^6gkly(>x_+Yj=YmTS$$t*gs08N)5NZ$R0s zwzD=xCNXOtPGf?~4#x7LPcQog|vy2FgSx`hdxk*_5Z#P-|jnrB;1`zSA_TM#$vu*}V0y*Bkh zKG#gQ=K5Q|5o#^TYknOVWRK`&&X-psy$iy-V@mlle+CvGb_&h5oLqL?TJ941s&eDj z;0OPqS|7XdF12*`;Fo|rX^U)bxR-H$)jxqW!IiRyuvf%h2FYt zNr#)a{neV_ZESoxY&&!K3#&{WJ&VfV3t1LMqx&n7N4#!4iK5fnW31fZb0=Z7SsbM+6{YT``I@B!{D;-4!k z_|PUmvw=cE4G+wBLSo`$ND#QPt;P3%mIdznp4R{{^6gVo)F4F?dom1dX?w(liH}u1 zB1q9i`s7DgvYh2ex#canh&M>T2-sF**X;84)1;)WgDp#aL&LVt&c!f1T|d8@fRfT( zzT8LmiV@fAV3X;=g9qx3bq}jEW1l>cfi}e4GW%6AA@qKVadi1f4p!FI2zv!+;8WES zJyPa*g=TLAfMaLJ2|V4U1Q0efAt8aXMDG3j_Z@(aYXE#gdGB*y9{~`_;VCI8CGEM& z_W>5g#mAQg!0Pcvp*F`&??0HpsbV$pl0KP(^}B6nIjBLk8CqD#+20>EZCT*X#m&tR zO>qqE4ORAu&+NIHcsFj`_}b`8S^sFAV^6`%%#0BrK6DI>U2^kZ2S-QDP{;#H6Hcrk zr>RK_dj_(zq8d>HWQzgTcMJ^;WxR3)6Uf0^j~+cbna*@Y|Fr|H!Fvn}|+?1fXB8%|=+id0QwR4N!CaFRtSNQgujB`U|iQK)FOjM3x1| 
zqilgW0C0o{Ab;TIZI+#t5Z#DIZc2K3EYc54*>M|NA%rEcF$i2dw>jUgUujJQR3}1Y zlaK@fc>q)bjoZ(D{vkE#WD^ji0Ki-Bv8s!b4#6gJhAli<5Ed3rmGG^8oK@%rXgUDy z_$dANwuzrOM(*GIdp~XCq`3!pxz|tPMC7gAGHvEzrUN;){-Y ztD4@s!kHZ1P_PQD3uB_vX7l;!1sNF`tgbt+gzvIDT)&G|SencLL8X8^s<6Vx+dI={ z@HO(?05POA$MJy?0JYE#(9WO261c4b9FFtWcZIdt=8Q2t4$82?60oRtU`a?wNj1$( zxC%>i+gA$h5E_|6*RHTAj6%m_V3ocMBz~x(rl-FtO6qtFDlie`-k4`Xf2nuH;aAy_ zJq@z9u&oN6@iyp0NvNpus#adUD;m?YTzhppTu*y!ti2rrRTgmS1D16q(9f>0lsT~{ zXT#U3xM{^L>1m4j+SadRY5Dsdph^jQ^M(;wM=+3@D8}=?nVEaQ^hcI?Z(Gj~XWwG2 zyJu^gHw2_TS`g!<=rB>YpSZxf(JTP>$B815JlSm5?Dji7@{lRkO*y(*@RgYU=2a1K zaRL55E(}P$e(~bkC~QoNzaQ{Sxi1%B%j$^hWv2kZK}iEn#OwU!xjbk; zz?&m&`^N(zd?l5z%<7JG2VIGEnn!!PwPsdK*u1}8K#{i(P2`}Bms_i4S)ZNPl} zGV#_iAmF?^CUl{&G>3q-xbCOz05QX4w?>xq1o6XUZ70hzQPPuD4tg(2 zJpTN+J^$lll33K8ePPGxD`0pXL+0eSXNxhQ%ht%hIh!)p+gs3*R2mWz!Uw3UIne%X z5NdWw)ovfxU3$mmZ%W|st***>zAa3^G$O`mzNv&$uY^lyL&IdppSvH-Qb|>n=d%#U z(+c?g0r2H*-3)<6**)%q-QBlviHL;UsSK^|-^ZPqnej4}O3un|Nq5L4Z2 zn95UF;>}}R+p@w*!6_S?x$0lv4W7U?8~_ma>5`FY$@M9*2WD1P9=iJ7G;H zBRUf|i<&fDz?CAOkg-4i@So#P2FHI98Kr5(UF7`N$&IQXT+ZRxnVU!`!L=yYe>ZXE^lE8=!|OjKO;Pg9 zkp9z61}C3+?v72KKa7=-Y#ed;%)>nL;pvVQD8343jw+t(Kp>f$pHG~f@PHT~3mDU< zCGVwlPxtZtsxl}gKrr*`q)xjDIAQg-O(mVz;+smtOd5T3-c4K=o((*2{B+cGX%CD{ zz>!&FCg#wE6>!GouOBM!Es^5|Gg5JQvt^AGa>1kSXm|{`&9`AS9IX@r;nW4lSFK-5 z2ZYmVr&)T)#N!6AX%K?=9mb1J&c94f&i(#YZ!D=Uiywym`}c1obs8d?2halUrfP5kr=dan4EABfPe~*;(P2)F{HOW&WHR@e0QtKO{`C2g4>ItB=TK*p}BeFUZy z{`cc!?!W5)x8b}WaTvs4CQC|AW<|6I2Xi48F!l5%(AV8ih{Lpj*gNd=J2e93+>xRA z=da-8&EbrK^|>ygaOvsk+omdQc4^wB&QDlSBoJ`JK%D5B?f{UnEcnWYPcp1Y&z@ar zIR0IK80ZGXIVDI!GHVVe0NkE)n(69#YiuM`Y*48PqV>f`{{BgQi?R@SuuyFvt?2bA z(R!zf2T_Y-LbTTci3>^WYJ``=haoTKSbF;#`0gxc4yPb{ADoe48n)7BsC%@sYMAzC z?bt9CYAPgVxMoD$TRUtj0ZItrR@`UAcNBJp_pA7zvSAVx{b0N_^bWax5;Q`g@lt2C z#!FWT7#J9s1O-Qj%x#ks6Pf<2h{i+HpiaKe?Es+gP^FC;5c=9LtBN}Le;7`*^}0=P z3rL0=t{UolZ`A2bCGVFQ<%qj}paa>IPx@!U*SDXz_m56gT)CHPAD8 zz7QTl?9d;oUR*@2{9zodi!h-6*^gByoy3m4DS7V|B{?MHAcn@p)enZY3#=0`Tr#C@ 
z73eC8GWpufK((WGj%09TNPxoAiR`oxd@xB;U0rS1z1T?fA~7+uW>R_K*XMVrAh2tT z?;TID;}k!`A5kb^=oNv&-xYRpcNat&7w{qGZqRcuU%4XxuK5c%Vm4@jkSFe|-|+VF z$#R$|@p*Lv^kdZ2)KGLq=yQ#Wl)y}h^#Dy{rc=Q+tacC+nEwj9Lnw=S%Cr?OeS$TS` z#0C**0ax%H?9Tp5nPqpJkh2TWz^u-3Z)}>HG@_^3 zUWfv${_O17>F-@(WJV%U>Ap6@4~H+++)&rs18=~=wNRW)hIGMpt&pPOhvOgb5929F z=lP#DIWZ3DOPaARCQ4#qVNpybgem`q0UHswpE2sbH^Patl}rOqt7d8xc=C&ixEh@pJ_#FYZ++aD-@HNV z!X*>;-uD92FA<8zWTv+~_!DTGDA0^b+rM&@QdiSVfu8321EHTy`LC>JDO9b=_x$V0 zmY!B5(q+VkpT1iNk$hN&hPwYhG78KgGtUd&rEIqA*$j)A;onR7F~IvjqWd45d^(e2 zX?-W=>ldEz-4d&C*ljBkJ)L8|HrgS1%C_e;8LUFYZZ^+Ixx9KK-FM*z=6XF^t1PC1 zHOFf<)VlTAg%A@iwEHYS{08Mq(3oiZi=XRyRINW=w?)q#A~^9D|@)y11APgEDV$B9G7%IFmk3(i+?Ri zt^E6HuGj7gdjgM0RGMpac(2Nz$D+&i0tA-2C2a+~m)egrF7R7)vYzZK*3xVR7E8}Q zJUi(2p!>PZ?~N5S+7~!Cm`aCY)}Iz@`S67OYZ@0SZ`0Y7(fV@!X)~sEbmsq)#grNJz`U zeABitDwcZM&es3u84Bgf>mUAbf3{`lh|On0?_T#5Cv_RadC`ph;xO8}J#TC4g5I0u zeVUTjPlCSQ!*@WHxwg|bd6ez6;2G=e$lb;G`rIPg-d23ef$mR_LZgAjiC(G3g2o+z zn>cF05y`kgEGyz)n7Kvqj8FB3uabDI-2Cp)5aFZCj2cs)ouYBMKra_&qo$e9E+yB@ z-_7SW8{oHWpfI?#_l{9NHh9CcVpGN6BR%Uw7c;Y-2zd79=@f|O?qU9t7=w#wNC8ocfl%dreQ9b zB02&FHqEU)agIYa{sQ+_*F(f~4PW1U8gF5giWBZx`1TwX`xM_ZU@}Z=myOZ+j%6$k z#@M%vveO)lw^(s9A#FIZi`D&N5pNRH=f{p!e!SlZtM(`&qP@^aUjI5{KBX$%*+tP; z@OL1tM9q4O$ORQt%V*Q(_G()rZIu)1=(q7*35g6A?NjXPtDGVp!X2gBawp}cE(b%Z zegavP;Q@5Pw>Sne3nC)Q&x0|-)h-8#`8_&SJG#bYvQ56hy@rSLb2oo6)?kuiQGRY~rQ+b1Tm!a9`DUV^dOWdh;SVN>fhH@=tYve*M0dd!FX9 z@NJ3sK{m(W?UxezBjuYo?_-v$-*2_0E??%URfegX~(h^M3z4qOm!!*xH};{>|x^Qw!W~v z7Ed`7LDNN=dBCAmSikB0+_ z`oRB_b5V6#Fz?U1?N3j;%d5JzxH9U{mnhIju2(RXXpjkYqRe8)$rLk_{(!7jWyIT$ zn%ZEnll^gHqrBy_yPOQ~tJQA(Sfsl8r|AX#~rH%dX_mg{TKV?)7nlN)`2=5W4` zs3dYgR}S??us*1CP`|ZO%hTQ7`lbtd3v`}LmuQ_#oS^No9WQ#3**??@>i&&#>J6u; zJbBkd=LHUn$}{d)dA}EaOdUJ_5=YG(W}sBN&USq@ayJ+SZ|tT#D0p>Xv9eTl2DUp< ziAspZIc{nTce{qtuDv8k##zi}zL*fshmml+i5hprRS^5rs~P@*d-2z=gud&EyT;Lv zzlOf5rFr#YKV<&XN4}f3UvV(L=>Os^V+P>6H!lM>HtGdNf!=_kMOz@z(CDvU>cWl~ zybgE`bsqA?%LF1BM{AOKb#53n{bE@gMn1{`KWbcdB0_5SoMzcOtD^Gx%Ck^ud`K4bkzWidj_$Y{Vy|5q7hF5xPH7bv!LMHe%bK 
z?q6cENhNsraP~sYU~@MvyTqwbAKArnCsxUGPVCRVU*_I(5ti~Ck+Ya^qsAoYjwR?W z@Mg3*Fm3-)%l|X}4%dt_Sl0DolZWK<0B^d2D@H}BheQwbR27cfRyUp~_l#x?=4V}Z z3OfwEW!XQbQhTv9d}+iV*O_Nj-}HNQkf~T^Yy^+grUw5Dp%;&vuiLbYoHXlg+3WCN zCW);ayv9e<;jLekA)vstrkPC~@qICPz-o-8g~U1Yeg z85X3a(@!Y5wG$)fzj|Vw{U(j>qs&gQ5TzZ=)qij;a&DMQ6J8$(!J@wsvtIq))jn<_ z?Io=k;Ru>bvd`vNYw2Xg00azL{ z3rqWuxwaA$6Vr1;*_Quk0YqGXL?S&vifOy~1QU8+ zDzO;9q8tq?;G06A<@)8J%2sw?8^P)+Dfip>p;LuXS|3MpQ{y{bFzWT`s^viRH72Ik zv$;*0gwHN&%B3q+0mCR0?$3-~YC;G5jj(xU7m6*}<<3FXbq z-XLBEqDI2Y=*daYcG8Mra_Y-~&@e#czRmcCc`vxEo7G8g%9G z>?joIWaHxEcpI#NU@0mpGPBBk^>zt*engR{T9WGk92NpeCI%kcW`kyWR<>9p8@qoX z`fOY8DtS*&k2HulJxERA>wtlwqoZRaP4-_(2U#K@g2*Ho&M+DPhJct3f2_JQ05p#a zL{G{9=c~Y9oy$2Q_x9VrA}&q~$lKnY)fWz363m zGO%k~Au7_D>d26fkp1Y;$K4>CFTx-Q)VG<)j)L(pYOF*hq~o`j*aHzUs2vg9;oKKd zZ3AOu>NNm*T#Js#i%2`FYAK|M;GuW!^b>UZ!9*OqVFH0GqE6qm0o)i1hVUI!)#FyF zV%I_LiAWV&gR#PiJAVN9OAO2KolgZds3qtZQ{gU6r@C%>RZECM5oon;(AD?SeE}^! z!e$lfl|DZ-^w!kUg6`OTuTB37X(r(OF!=IPwd9=&Af%svMT7%FPY@Uwn0=Nbr9*ee z>|-7XkqH4OlLML(wl?AEQ*=;)1F`>kC{yPuSdpp!E(OXAKu7Gj@H-#Dn(J@b9z-1&?1%ZJ&y1TPM7Yw>am_tnE zmn&tMDgfy7vO82|=RZ zu(Mz*5hXmz4A7CXswyGpXA~+kzXk4`?Y@J-&v4?_t0$m^N9bL6Xk;pnT8u6UrU@LE zNjSJ{Y;5ErHf6g`YBh6}Dbgf-fk~&?P$3ey7R-oDDtOf? 
znQj0&g8(LyIzaFPw^tUd80PUquiJcj15gWV)81lV%KZ25S(W!PDTYDtTJCq~ zN_bg}9DpZQNnX(N0$izh{Co_K)*Q6kJ5Gl$1VAJXYHmzSOtYy9E4YfE|00N+!|LmO zMHTn|06YRy#~9khhGRpS+9F^AFrAAHAR6Iid>xq5`p5{=%z$blU=hq(z;J{G4B)|R z9yi}ZuwY~y0@!X68k*i3CKVJ85^qr^H->wzU3M<4^1#`$;&4b zrH~}d0NZW>CYj1-jt;EMjPHg7ay%^F14fhtH;*x|Cryk@m0RLKn4Mo(2m!<%&Q*X3 zAQvG}sD=|#xBWWO_~i&R#skmA6i2LmZE4_|>$m*t%f z78cCH55MBhjiE1Fd3^7{g9)$ahZ~E^k>90&f&=bPwu@97l=U!^{z)9$JA#DeoS&Z` zCSAw`p1*;aa}qMLcJXko+_!Hn0Ag;iOu!Gu*JrK;-@Wrv5KqE)Pu|s4py}W{0V0`3 z#NorUFja5~=4T)bDT1N*rL}|B1N_YD%vY4O;Gh}2M7KzvNt>&69o`=Rfokz`NW(mY zKoOVkm|!4$eg~TZPD?P9Fftvr@WRbFCKae{N}GR>Xx#brGwAK^gaF?=5;^f*8PBH1#dD z4j2wpL83K^FW|)MlN~P8QP5kG1lPX4zV;%6%AtHM_M3rcbV$ts8+&i%JH>$~@wZSs zGUc20D)*-9J$Wh4Nwd@(Hh#vRfLLS+nH2#`RLl4zk$KyH_O?Vusx9X8XDTp$gM>q% zv07SMP{?o^3>Lo8=1^%Y26uqynGF$Iq2z)J{;ketci=T3J*9gRY7F%>zlYIee2{}Q zSoVxWUSzrQ&9lUy5*uE9Fn>8)y{oG$59S&395vBYdi!T*8DO;v#>R9o?goOAao5c1Tm(77 zM{D|MpI2}=&hU7h;!Zvfz6vENS4UG(W`(H?7_(`GRFU}VRrL9B^Labe5%X(n#wF`N zrJ7;(dnQjKJ6zgsxBXS;IRs*_ZYG~AjEtzqj~|!8R0O=B!@HX)bonAi7uCY$;GT9k z)?t_d{|ciDPzNGvC`Pp}8u{AJj~*$4%Ye>dxX!}?ZKSDp734?o?h%=4cAe(aJ=&jx z8Sw$9M|jmrUQpoyBgit9Q3Rq9a{%uTGfp__V2s`p1~v}XYi9X8e;dnGzkm^t&`?a6 zF2}2o$oy98f&p)+0&R)V#?^{!&3u@c%+=F+5cwMx8r4bF4jk;(LcA0+WCVzvVWp+K zFspzy)l^e+0|Zxi7cN{dWNdf_QK>+$G%hFSicEB%Y!osdhK$o;V`BqHrU(_N_bp~v z?n_zesAq|ZXkhio3@m6Qkg5QPEN1}K}ktbNu@&skrpLIN|cgL=@gJgIt8SZl13>>r9qIC zl9JAEJna2@-}8RgIp;e3wYQ7)u-2Soj=0A??wRx=&xzGHFdSDqLj#GcLj^&V>efq82jS-h=MEALnUJ#iRFadE`$5YE31L8}qjWnW zr~98?9Rme(m&6nlS1ppZ882+guGYJ0)*d?-%ve>|L z+%}r9g`YbDkqE}s5RDfaA*;M-AO_KmcOt$*@(ppd z%4w4s>L+Bf8R&F6zaw6Ef%^?6=S1~gt|{Bt-1OEh8ZzHhkvH6uI~5ZXgQ4oS+xSe< zsSnT!a_^^#he26013m(>jAjVSef93H55^|J@xq`nGBFG@WFg;8W$u2Jn4X8+uvkvk z;l`E<4il@)L!9kAq*`KGA;)OVds7Ih^4&rK%DA3yK?}(>E zBkop}buJy99xnkqm$$XVq=UqMu@4OxQ)bXmgz2?vvSRq3jh!7o7}bRoN#=$A+LoMt z{{A(i&UGm-t<2(W(yL5ayheNY>W!KPpvJNI^YhKC_mcyWR01X@!Dr6ya{q;L%3n6~ zE+mG(R#&@NU+amvY~z4`&zVULybcZ>yawXX)9$_o7w=1i67)-J#>K~%fj4ZxSRBx} 
zj3MdRUoI%Fng@Rlb7bg@lb37qALeF2m6P!?+w$P6Ctw58?5DtbIdvIO4)({V#Udpo zo#fk=JZ(P%?HxmF>kof-YaG|ZVRUY(_FY*b6tfn?AG#qCs{Z`B;fajjn=O8WP`*i~ zz?W3UV2n-so3oKIF@9h)h;#z9?Dg!Not+H_@*W_Qi?GGRT(;5Rh~bQbqDK=%T*yBc z=z&mDVw`L1N(X&O5Qxjcjy5M1K3os&oW6IAF`w)41qGdFm6)_2$l zem3br-};3ta9P1IgQL#qWH>nc*JsP)l9!N6gNONFd^YefVKIF2^V!$UAia|Uj}9$! zf0(fS8-9foK5Vg;b=r)+_^Y*vpmVp~31p)GxNTBE9sYIOfB-?xyFe#Z0IaxaI6Ae) z@>t4DM5<7dU%eOa6&i~7U|d;Vz7Zs6()!H|3=O@JRROA7-(H>5COX6>QingoP+0*i zlm_ZpU&FEXmIp6^ng|iQP6txp%(X{4ZO-`R{i%oFhrh?m$9ET!&k()LK2TeL!Ck1E zUk6@PFa+!MzF|3pQN$@zqZp9%XAUUPY*m+q; z?W1B02co71FV4`(yfW>INAr{PrWP>W2v=i+_{ybUuNS414YE+%i~Sd#^&}G)Dc1xk zA;gRp2p`p^0P-{fw<*|7<=(x}TZWSe*%H##-1a?$OBjV&n3_7Z(q^^=#120=v}8y1 zP>2~qh0e^vY%~f;8>BncpHh>PLm{+~AJrp>91t~vaDs4{)S?p-g8KSYAB<*7) z!Qj&D9=jB=fu`K+^mL{~LGxyjV6!ZjHTT1%TqN=sjE97UeFbs=i%6pD?y3%OrqX1? zXlgUt?}IAZ{S;A!JxQXwead=S%}~XV19=;=f^1;kE4k2V!>lgwSmGV=GLSuKf>nXA z-wCb~BI7jFZh!?K6S9+h@|brqxxhC_;zW{j7+ggDE3goz88(w%YZd91|Z4v0kpL^OWH<9*dB^$)7=BxTERGVtRG9dW^cx zb9SgC0TE3_J2D#^{ExcJMy`qe>Pw3hK`UglzM za}V%Ucv7UCgdzy(^XL1mp@c?Ip>?XSs#naW-6meWdd;`a_5zl%I#0!&qbOY}zzF;^ z)6mdJpX*N*d6LT3{l7f0TOX}Pqu_5qTJ0?NhOT_=+snN$Z0D!q&cQ6R zG%t&E(tb=YgHT_K>v~9^;zexfk}1|( z3wMQTf5`JhNSkLw7?t+Q)OH!8@C@d@UyvYt5nNKO^k%b##=h)^SE8dlGaDGb+wSNy zfbr((vWfCHQg0F@z3*vF8Kt{h`BH=H>}cwY7eFv7n(u9{=D-dm59RAPb*Th?7?8iN zLA0EulOZwdx_4d8hwsJ43nyU9cMr3ul&l(TTzKu@Vv^sD)Rrc#hAL2{{6YcxVKY2n3rCHq~#w z-=k?HivOE)!}PGtTstw6b634&TAQr<_D{|&ezYn8o^4HupzMY5j-$lviXL*fP2aLVeraN$j6ZE!o|wvEy|XGp;c`imF3;`_)P8Fm50HdW&@>{#Mj5jA;5jTX04IN zn;*RJZmgwSvqOX~-%Bvkl6OJHf75g4bvel$Ytdcq9od&0k1KtN4${#zNdyDMc^R%? 
zbv5}Vb0r`B6**pXI7p47H#kc{Nw4oy>|$Z@^748;*byIjrc3DHJh8*`NmK^7RCrKM zcZtaI+1j|b@a$VJ>6nr9&h(~gHc7nmyc-GL6-G@q!qb5=%m$tg{30>_+S*hEXaDEE z-X};>Nkyg6;#H?uH5fg#h|FueT${KJ8zAg?DhhD(7X-w~%{7LR|9?|`1=Wn*xaev3 zV+yr0{>Yc^5}KjM8>w-6yi3QI=J>H<&m@!yNZbOvzWw-+9h-{-)N^UbW8j4IO8;1= zFxnfurkXp|(lKb<-x_sB>HUwF9zl`UTI#rqd-vwnZ?Gi6XJhi)hp)ws0@0cI=?BV%E&3Z@c(o4!Ku)S?v z57lN%%L2^|=7)ml|8eDrQwFEb%(;Dmgvl4j3i4eSS25?h-97EBewBcUdt$`V*QD+YnA|;jvRSGP(TJ%-r z5)hvGC;k4D@D@A9oTb~JyGIV>1c+6;fd2Dxri1=Q=DO%*YoR%r5vBecVm$H^bkLF@ zg3>!OK0a?p4B);1LIx~jJgC)f1Mf1}Z5=Xhge-h_ig0b*>N^kz4c|}K)0&sF|WEY1i>lRih-znjf;VZ~g0!XcbWY(FH=dL@9 z=>dAoIr`iBXn&s(#=!s@gw!Jnv>E|XHn6tt5y2%MfrNMylzdu-hGHGxqJs{IKLn#+ zfOpkq6447A3*`e1rDVsqlCXh<4DQ>+AXe)n7l{GsqI3kk7^2K$ zXkg%F{v&5ITneW@vfA$V4E%vwVAn)oo!^yo9ItzrGU2zF@+K z88Odb2%^sI9UXU}YUwL6l$x|TPh>ikK@jR3T+8<&zZh0ef*2ysYh3NfzQuxlC~pIx z=k^z#4($v?v1gr1_i=*<9U!Ri*Xf`-bgi39C3_4BYuO&F{9~v#cGstRktmzzMUEC$ zzlF3&{)JtkPrMHtSSL8OP^KZ`y>P^^!S4J3EZhXJ94ZNr<9!O{{1W74$oE*M%JNA= z*Ti^A=YB&_&^J3fn;<$S0!|&65P447CrUbq0^*1lg-T?3+YtB#`3b5EA~Tdrc(S=_hO8P*`iH-`&+^|3;At>L-`I zTPjQn2P09zVLf7!ffM~Wg#`T*T9Vs801od5C5uVEKzd=P4TDG(CzdgHXj7@OWr=*K zghSIIZqn4=j?>fAlTo37Sx7J+Q4gz;s!7i%9RVcxSVxSET<8KciuhQ;cMWm$W&ULT z!`~4D$iRg}=itxxllM$^f`^L8-nhe_Gg`Q+H!kB89i9J61})fRtBAA=0@Oj)i{$Im z)6?pWwj@+kIMDb_B-F%$je>6%x2mF9`C`f+NaS;#TUWC8An8Z2nScVhK{ux23H(IP;e60E@- zp7gIh18zy^xHf?R+91INjY-l^${Yrmg77{XcUC0rfAXMU5PH}|=l@PZOB(|Aiu7WT z*TNs5aS|mSafku*F9*dd1fsXFo}kzTVRjH2+Al!@8%%}(1P!Txwmt_XFLZ^WDG&~6 zDa_!d(c%Odz+pj!)hqshc9Ha%LS_b{iv)#cBP=GOkqv0Da~q>DqWlJSy#)`34uZ|n zch>(L6@h24A?HU0RYP-rR6y3tV4O4&jtZ#jL7eA?l$9+;5k5&uq)P*wa#q835wD3Q z+wH;r{s-Sz3JE~I2wKClbW#;CdSYN?)GUJ4FYy06E8i4`?ew!noAXuKK=KCm+UkFt z6sy~Pa8l$A5s5}IZ*8hp0n=@bkQyB5{`+!s$(m;4baFAIH zzBw1!v|@LNwzXG?#0Fv3c#_zp04m?wD`X0tC&3_z4Kpwhagn29^Y2mF3LeQER&R_Y z{`a6X2cQ{;ZW1uvEG;bsJC}ieFl0MWS+N0Hh`>|-7VW?rAf~4e2TH+KxEe^>vc<>8 z%b#A8P;)`VSpf`S5ec{j(rmEbe(&DVA!h~562@^qS`LRhlBuEf;aE`z-Z)Hz9S56U zu{xS}fL#IgaBycD02#16t~~|Gco@~32lIj+K@|5Bw8rl4u;5{S&_bpHrC>1I+rq+% 
z6;7BaKn*yad~5IQ3=Y;}rs6PAQs}BK27A8cwqJ68oGk)ccrafKCjSV~ZbY0K{AX`g zPR{p-T{ui-a+p@^h@+A}pKk*pDa28Y)i`hp(%>WM(QGTeUV{hpb&8NE4w(NWAIodp z5w$dA#aa0*hq(Uhm;g%h?=b=7_8&*B+npjt``^a|@U-*epyVy9e_k6;=78Ees0Ja3 zLxr!|b_|lzNYmtbaWm1l$E1c(^2`80jbs=g*bNQRPB3j~VuZ2%?>o9<^Mqk{=J$mk5r^4#MP#{)`V z5R{3t9?QweA@%>8=Sx1hxz`}Mg`&|)Moz9}VBjsdu4f!tA3_1CgChn0khuU|DE0Ga z5u_C_B8vvfQmt>fdH8{P8ubdo)(#G`(~?@m7MmSt1nS?^iu@5P^&QOXI79P zh*8%A+{8O946zGADEiJF!$(-Q21NT6$}*Vu#DGtPY$oKU7(?k&*ptM8DTe@IL$%9I zNl6KL?Es+Rlf)k(43M~6ko>%hqz({LWmc1-^1Tqp2}UXT5J%o`C-v+&fL{24EQ|og zpSvnue0tG%&6jy;!j5Zrz`cPA1HheBRbVN=zUKp%Y5#Ap1o%91jIc&UZ7>;-3j78p z6UU?P4?Qe6G>D5hFJ`3*U}%cz=FK=jut!Hm40 zf5tP+LM)zQ>z4Hb9>$d-&tA#x&H4TH;)fe6d3X0;HFIFFSloKb;@kccdL@y`$z9+{ zrNEB|$V#Q^Ked561&j-XLQyIsiPOITNqrl1#+}+2UCbc+l~+;%za5L<@8Cj_Lj+d( z666SAE3-8VHs|B?gU51G$VzajM%}%Lp9JvbCS4%IeSP!EH>u;-ELD3;m^U9@xTAx4 zBbn<>;#(Z^p8G7w{zVde(hY;73vZq*lr+#Or{_wCeqFyvF;zx~%A-;`)SCdxXTN8Q zId5#N4AG|raG&Ro@)0Sajj>)FINz`xq|4Vh-;Cza>S_wiz6*T&13BH0i@?^f127E9 zC2T}NU?5J%>08;(Ymni?JTJ0a!0drm!aq)68ae_0l$#@N3q`yh-!P%$N_HVyE7P@& z?^a@0@RY_hz9rJ!+5qX15<0&<0J<-$QME1c0Mtk4a6iDhl?5Dc#+f5DXT;?XB6mF z^IK2Wc8Eaf_ckO2|I7p~n7E`v{k~F@s@>g&b~=Lp^neFHg9nTG!+k=x6ST9LW(Wk3 z976x<-)xti;gMQte>H$>B6ZZ>gVGGGwax{l2W})~p%jMg5!Cw*r)(c9`hK<6Fl-(u z`uG2pOShh|IbQd@61$cdwwo*mFy_`rKsN=Sk0DMa01#wDV`GGw0q0Ts`;Qs~+#YS? 
zqNS#W4Ds%O&`yq*#?YE~{-hZ1;L%SlyneSWg)c6$zjOx#c=r-+0jIso*_Xvv9ntrm zqw`+OhirZ`obeIddGO!y8bYtk^M@~RM`q%Mg(M%f$-;p`d__BmS(AMdTMCaMJ$D^y zkbSa}4?%mrKaV)vE6@-jvX=wqi^@fm$XaOT$ z?H`W;LKg5RG=MC2UYOw(LGTEHlnW_@kRYDYrZDG+t9?-ka#_H|p`zh{Ml-MkAifO& zk?764cfb$o7Xa~*5$I&DQ~469T?gh5Km-Uk2(EZJkIgXM`A=*nqo=>V6yOHhGGx%w zg7np&>vyBSpI=jNZv<59S@A(c%=HNZ&x{agV(>lrI-7Ja(uqgNG1-bSP%F{_ya#&z zwP&YCwV7ngh<|``1&V_v5k)jeYXSBl+tD)CAuoAxofoeQvznd=(JWwA9p4`vW(ZoQW z1roHV;{-Akfb`HQ$G;^kycNzY0-pwH7XV|%Nsfse_4A7iMb{ML zFA!-q!qZ~Ff>#%Ay0gq1D{%-3@8K}n%3Ub|EU5$pBhsK?M`FiEyWjT@ebi*Zh@-YY zet=f6QiVPrKffeoq0-MEu3ysjlSDH%hKvV?iM$=WQ5S+C*#~%zLM)1dDp8?YzqG8Z z83NP!ua5v9G4Qm=8#zQ-%#dwlR9yR6Y8$D#dA^yxZI{$ z1UDa*=M_@|eYxYkhBNZxC1c5pI5>^)nIONF*a&A<0N_ z5+0Xf0Qh_rLDw4^Qb^!{vj&z88;wxWmbE7DAdMOvtHbT3D)%EKDJH}RQYI9_h_S#V zJj+SJZ{Y3%JK`$<$bk7Eq@4%j14rn)FtOeU0S*G`dAwrL7z0E_Dj)Byj! z#>ASnhsl*+1zq-y;{&^YBNH@T;d6jIaS-&dz|QP{0Ve|JCNmJ#($W}#aSe`E2U2}# zmO__Eod}pBxGT_<_yVmzBvt`0;~0P<#A?ldTwR^4LzV)Gi*100AW#WZOSfQb9LeB8 z4F*X?5r9(_jnfc_%AgYp?_}L^Q+sA%f$QgU{VwzMbBjhRO1o|d(<1*iS>cK(gcu7A zt_tAT-togAehjVz2q&%Q@#qr%nY`Y)BkAkx>mvfB5>_tuZ3H*F1iQrCn}2bT^F@ZZ z`X6u@fNbK^e&?nIHN4&&0lUk-&}();bRq&JW

pGUv-bnMo0RxGVp$ zb~UvU0A&AHYkw!|Oaf=_+}dAEiO>`QlC0)L0PNt>v35u&{Cs^4KpzaTYEWN>Ygt0htvZ~p1so}u$uito6}25F z!hQ)K|0|rK^Gu(-gd_)W1q6BU*96Lo16VguWJp+MGFD=V)QJ0Vmq~z1aRUPr6Ow0e z(z%O{beb$6yYd?0BS0+-gAtg32K@`|`FpYe6c-LD6k?9gpF<+?KjDgVgNE<{6Z`O% z#dOFE|3i)8H9$e$1=#}lUa0WMPC`n-BY-7CxckSF&oiAn+v)(&IzdSU=o^$oEb|^; z$HsKG1GpC5(N+Mi0ed?y*$^`}BE(ugxvb6c>lL69bS3<26QN|@u8NLfm;dAgf zcoonxFwJ>X#Vjd8Zl&0F$N3Alai`S?U0gw z3$y|DhdBn|bO5N9luuYXAwy?Zx_9pj7#Nb~fJzTUW$(kB;^Ls5<&u$+3G%q+Zm!y}Re}Bx2OuIq4250>KNM`G zLcnwIVSwmP+TbEUHE>b@x5sc+HwO3yj2I0HQnc@NiBV~4`7l<<(|+Y1z?6s<)5zEh zz&Sw!B~Lv+igBD&?a32*?6rza6Q?r+qW;;a)Gfz2MiZP{1mQVLYM~d@khqT zTN)d^9Fr?~TQ`7>3jh$#P_d}%E*=DDk}QVybu9o5Fr96aWHS*!=-1qw`r31}vt3RI zs!@7U0zl^n(Dsp~M$l7|p_S|Mu&^4hhIXU@FF2v$Y6W4Mr`gQH(-Dv37|*z+UkN-;@bT3nRd8Tfc<~cr^u3vEqdtIAlYqnndoS z!CVjur3d|loilE7D5k!EusR%8pYy=?uo1>FkX|7|MhC4pVt52%rjW@2=_7hVas_h7 z^8pR>T~287w8C{M1|aSOl^=XSAhlco&~XC*2&0KAo7!2jUo$g4aE}pkyAqwS{V%%T z+^`CMaO6#n$bp86Z~*l^4p@N#j(ju#ZKa%?oFdozbsOO>GNlh*-MzD`t9ZbZ>M_Ud z?1hUpxI5f{@d(%Syq!A*HTnyv_{2oT>$G0O-)<<_Orbf|Wksm}WsgyW6hmR*TWZFY zuU4ohD*2HX;?O+H8?=Cb?CQeUSU!%{75X*RiwDn8ScrCMm|0y7MFx5RM72H0nk=3> ze-B){V01~lCMe{eLI;Rc=Lxg4=6PBJL99#?apdso|BhTDv%;@CALqUC{8H(mEwh0d zX`N?+m&#>_5xpR@jjiSTg?R77J2FLnO+C{wVLGg$97=HxP9D$}6*3QDcor&cDH55= zk*V-y@Oc4|x+}A6rR(2(E)MBGd(A^>l(<=I! 
zK&d0|zE5@aPn$t&AfaI{%C`7L;~ah@uV8m?EVhFY^Lu=4Uww&+>F7!#=S{hd@wYFx zcTdTSLLF)u1lA1Bmbq~h)vqj7(|iJ@*a)%Z=|Sq;laZ~7xWu^hk&5pO4p$>u1FjzY z`CDZjBEF~b0K3Ggo1KLk-F$hk&v>SHamh$Cqk&vXc7UmHayjBlmx-vBNanDLL`FbC zw%H%cT-kxiY;8LsG=KTmW9hV4Pq$e0?oE!~G_SPja}|@c#BUpzyiQaQ(1Sl7Y))Hz zrMDe5x)kKJ`vsNSbwJxfL-l&fW^l_{Q`!@U%rSg9CC~kxW_6%gDTxwcTeDZb;lSLZ zQ8AUTmvg3Ca-!QX0`W*8Xd+=+)CE2Aq$wLjvpgZ#yay& ze@^FD{c2PM5(+95qM_xe&w+^_GAinOW33|j=ePcnxPZ3OtA9M6rRzA)k6bkd`j;^( z9;Sct|FS$8Z_UB8YI~o=mT*$}IsPzg_{n9eZ!~V(*q`%zjJa+mf@FgC;AQ9Su)Vik zP_$8_hT!O{wJI463#k}i7#}dXiu*{&?8O)@&qCQE{#4o33zcOUpbt7_;`uSZ*}|z=_jnK-|>TGa}e(5~9J@fSvmRCYzV(c2>4W7}{^lt68+v)eZslL^n-8>qp z!J>8Gf5+FV-5L8L!MKIqYP&aj1ZKxZvh?epJ-75|)s+4H_I&~zKR+}Bn3U64+KQ3Z zWO+1JEi7P5-*f+I>>nB%J3IK6adBGxFS^2>uDE(O*8kjm%hwt&ys9!Z)|qfpa)!yd zHWK?xpCd*Q%S#LZZuK)ybCTln?uS_ALWA82&DN;#+@2|k=Es$Alb<0fzM#O8Z=Gvl zLPfaMf5@`i)Zdi1Rgla2vBbiF1=i=thdvG5lE7-CLJ8T-d{H~~+`WGoUsUKAeQNbg zlHT}U_lMrYcpYKavI{};D{*iI5;f`-ByIdJu59*cYgX>4Q@rmnzDL?SQf_x?gtF*!nfcL+@=y|Mts+eU6cmxMw`q+-#UA}gs&*IO9dwp3CrJ_=48Y9! 
zUAna!0D}oes1az=YXu9+xTTQ)V4!dhrx<{$7|W|jyEKFBAeixv{$7r-!!j>w70QzA z!_OZtJ=7`BF&IZ& zyK|uD8;m{ix?iRiD7Jz$|7c9LXp%@yUC}EepDyDT8(-=hY$kf~Rf?dwctSch>_!)= z@gu$XR-S(@WiOLx5_!GYyD06H+FYkMY7Fi!H+f`3oq=bx*`qKOp)nuW+cW?Z%g0E}!prr#( zWj5u`-(C^5>pfrNrs&$_EMHAW@SM>d5X)URGK%iE7}4uWJzJhgy=h6Mt3&10Yr{SB zx3K+`=xcIj%~f0dTn+SHu2CF~HgU6OmME&>rnII%#Fd8J_f3yKGpILxVZONte`LhW?S%z+5R0w`E#A>l;H;tScsUo9O4eG*xw1#7*1zHpTI_ z#K+?TYJxKdYVO?Pwrf|hzMd9Qb=B~+{9HRbbzF$fmFNF}Hk^nlOczO=IW1RYSZT|2 z`a+76%gT)=1pOzkyJfCcDmir=pWT2hRk*O<#Wm3!kvT@8>y*1i%r;AfG_r$}ZP+`j zX9pW|c6y&YQpiUcGUDyjE2Y<0K6ZF`B}L}*dQL6xDBro-Gr0L=nQtPvTc#LG_D+q& z!pQ-KRII!IjYm%YF?Zb6=AL00=t(z3x$~e$Qk%Z4?40@``|{O?(6{B0frBXg&f5`p zt}%#>?lCaNj%FyHUQ22w8PfEZ4|ETFEBg2&8%l#HmT)=H^rseMx$@1Ag{_wLO8tU- z^&|GBB(2^11UFb!eHB$OFRpO4f`j*nvoxJ>JA;C zKa<#NT1XGWrDOf&@zczx^!X3Wfy*LvDK`$zXbcA(uJ3X$d0CQJSiHCgPCfNDgVT}3YcmE`=Ox~kc|L8}LPo>k5*BV!AP)PLcJqT)tJj3p91ibdar9pyrFR$c zLk*3hJU8$`s@bbH`SRRi#$BfP%I1A;O5UvwaxMmOoS!0-t&;`fi>)UXu2PD;3*ma$ zxnDo?71Uu^--%MC|0`r%GQXuNTK$ox<;dMHV%N zg8g!(BfpuSEbyVQcx{)wQv+_wj7+e$?;GL^kJ8-Xw#d*@L#HqtkhSC*6~X?)H8pw- zOy1+D4E1FAdHT^c(un4k$kByo$2%H~v0aP7)*VTumWR0|T4Z~bMv8uCwKgN_4nn(K zYiu(-$~4jpx>%{4y_?Hwygo|3K~$5i6z!oc-APMk0`n?v4x2CTtxu45jK4C}m7~@B zY>)r`suZu|KTQ>*;vFZA#&1mv`)YjomJLJB z^$$cw3kmn{>pT>B6n}+wbo4;~L3w0Cws-zJIx7}l3vDvewu<--OZ{6C!v3$lCwM-2 zW@8vn{+imbA22OkTFE$i(PjJX0*cq$_%q*r_KU|~=x+z672d3$nV)RR>E>G;=#|<& z&`KOo&niFW!s4wgf4zBvA76d#;%!Pw!_K*m`0699vWLYpont!nq z^11|2*5~i^(6g_N*iOWa1-ksS7?m*3&B@qn7CruG#oyX977{}1Iz8=UqfM1{efP!J z=>I`h4Wha8dSJl**p)rLsiz0;j_|Pub)ZaZW~tTpTV29V6tv==K?laC(l3RibZcmE zkM=fF&{17|_WK6*?VD}W0+RaI#jUi5&*J+fhAJuX{*=!* zU@EyT$bEpD?d6cISR9M~ebk0qnf}l1vpC!kay2oV%NA`xU84W-Gho z*Dsly>$G>O);VD7uZ4``c)s;iNyh4@yg;m#i(P-(s6OYkB>$~b(to+zo@PS#l$0Vd z{7dr-i@O&Y_cjf=b_nj!I2B4VeiE&AA|?-)d+#|m^kpW)lS|LV!*8K@nCf2h-Qv2E z)lu58D$0>FRdw0S@W&g@!ygvMK5>fM6rlSlXb7$k&|6ms*D4BDmRfb8$Diz9JG@1U z)-E$~8nZ#5DBBUvo$8(t6YF1Ch*#@u9_aa#;*;mwl24VAF3lmHFK&EOlG6?6)D;xq 
zS$Ofa+WLP_g^pO+#qU%sDc8I*{k5TCcp*ERDYpsR%+%D!^xb&N^qK}nZf$3NPzXmbyTzqkZ+5T;S%%( z=Z6Wy-rX0C?Z&m94GN#dJ)bO3=sr3X7_!MD{G714PV>5&P_tZRRkv)Ec=Tyc?q>%B zJMG28RGsw&jf8x^ODlpFte>Ma4|j-_Y6z^X2bdQOtu}Ty1xk$cUwAyoFn%=Tt}_H5 z%NlQ7!jf!7tv8lBX@qaDxV7xYDf_+lm5MJbjTGgcf)NoV7a1xH^zZAmy!xZtW=e42 zgRe``{#|LKQij-+ZbXmb_MWlnw&sxm){uS~w?gf(_h3mIwud2u#|CE7OkMOJ-6L$; z@9vC`=(l=xBNKC17hn9;PAbkR+xa|H-XS(t-b9`FL66c*Ya=l$de}u5sa#e~T~Ya0Ei~u ztDWS{@>G{f6ODV^@s_#7Ry#3sYB!i*Z<}}Q);X5oIUc5O)AT#)Jbk;5*OZ2ia?eWp z=AszZqY5t+l%vN&p)4zU>9bqHPq##!)90mIE?`Lt^Dri&L@o& zGA*{{w%Tit&fb2rwY`x&8O2M;PjtOEsZ+0!A4>M}{UoiCgdcGuVQrY7XC~T)M>_rz zuRguayu5uE3!QEAHJ)HK7w5ITcb=Vw#RU4BZ54$_WnGw7e7;rIN~eskNs^Be?RPiE zFkdZqky9LqmfL$4_^A@|_syA>Rub0v&o9iCZ9T4T7H#+9YZWvvT^&1kxEgxKy<7XH zKYY6?X?3gm8`q3^N8C%w<-1s+`6II>N~>LdmaB$iNg}0gCx$wgMl#5MGFV8+T_QP# zkHl$*R-j-+X#Aw!eFEd^J+z}>qLYcwipPwv;m8E9YY7IGa$PO8CnPW`G{G#i_QVi$ zl#Z?l7k)0umJ=S-+w@1bST7wxpb_SZvh2>5(e@kMGPCb4?fLSS%Qq z)YxG3ah{=WCZcFyH@j;1`NbZ0KTHcj%LtmM2ao6;vV~102;_LFaXf@#jq6>1;p_6D zVea=gWXCO-v0Or^lM~$neR9RmW_4BjPtteo)Faq9(q)w=nH0k<&l8u;li=nd>YA}Z z5`EHtG1kfl|25_HU!ws?^B!qN>HIfY|GzN3j~{mVo_3OfhKoiQ*H>gx0>ub8IovWl zot%^A{C5Ej8O>!$-~s)C#scG`emMr4`+qBrtpE~@ns9^*Eih55&Nn4Skp7AGOdc zvqWW$7tf`-LW|xP_?7TbL^Eh)kBYZ-(q-~7bn!v(q^Ml=9K-Y_Yx`&J^!oNE^eZLU z1eY6OEX$t$Jr|X6U~sTcU7Z*rD1(6b=ht>(0L35m@L^=iLp@v|Hz1lcFk|2($F6?w z9v=Ltmgj71#t)eZLX+zb%+eVeU->RDq6DJ#h#&wWUZB>ocQE7Uyt_&bc=dH-Yp^kSlROr#n3SyK7qXz!5nL zMj5;U0s@AHFNOYV70gfVNh$x2a4s-K&TWU8^Z#Z=|8L0fdC#b3%xUtQKI z>3RR12VY0-Xeq}eEH_andl+DAU3)nBU^4a@KA zxk48ia2DY3IefY~j78mAl|kbk%fR~6)?Wi>wWaeT`D5Bg-gj(cxq{Q+ zDS$~Rk>C~Sz+*JvxI&b9aJJhc#?{GxGRyGH07C&&h9D>}EuDodxCmQFIe&*LKOekK zcjqgykJp2LOA?FPs|t|E@Mm*5@RSceR=T9Br>xUW>cf1u^MEmQmba_6cb$;nX;6dy z(Rjk%r}b z$aAVjRq56%yM$z2Vb4mr)$KT}*!aM_qd>a@cx=c!NgV3}Uoz}Q$w`Sxo=vopEs#t-N>Rx)!NB$dqhX-pkc=e8`Vhuom$4 zD?hu_(HZIc;+yRK%88E*LT^r6rDNy(Hx9p@UUq)U!7w+(A(zUYYD~MpC(0K1drCa4 zf1wj~Z;o#K%fxh(D4#9&v#^ZQBbxig@r_Z*M5iAOu1kHpxz+WC-0hj=%@N-)uG=C( 
zfn#Bo`=S)ee`iA4DT5@mjE!y10xqWszsn}opZPhE#J?Juvp|t2Cii`r?y`dN{FkM_ zdYqAF(A^c1tZ*?O!G8x*T#6Tf0m4A@oQ$8 z(kE5bq2S+{DPN^514_#&!!=^mmCy^fLU&S`V%!v(+bS1F?z4AjU2tsYS!xW;>XsJG zxa@jDcG95{nh^bUC7?p!@BypF5VyJr;nTlU&;DZnHZ>4$D=X!ux>Xo9(Y(yJx&9Gv z_2_6OqqHSILvGkG*qV3ZqCZVs$T=>t zwyf$uU2_=(tU9>3GeN6FT2Z?xMfPW zMq~Gz(Qw#5HF`qXxV3d&DW+2y?)5hU&>OvUhvuKImsPR_& zlJ_Tt9~1Gj+#Ilef3RD|EByO)xFGcTl!=slnuGswjM=#JEI0XVUbNViW6Z?eX%@Om z#b+jrzh$Z4xri$~l93kqUcSK4E-yZNY`%cax7WHHb?zl=vtgxy1AQ1!Qbu1+v_yLT|Re2 zoUKOQG7RNjp4Bg^DfxKat8Z{Pq_&7%uN?2ol>e?3|9!K^ON-nO7R|odAO1EoAMU5k z&%@lS((lfcW_2Ike~@bTntz?q$Li#2_C?yI!|tb#BpP47b`Kf0ai7B_IU21m8MG&0 ze6W8YOyJ-Vv$1SH7sROC#cfa@+57zDISJ_u?`I5Nw_}{+N8?UsLe;0rCWQ%YyRSSK z#!KXiuD1vicdB&}Vq)nE-`xIm>(^Xf_Q$k7j+T?y{?v)ap`=Xh21e<-VkHLk`xn2> ziZjoe(m#tx`BkHEynbdv%7S}Z5!({cZB!PK+?0JW@zWK17mO2hpQZj|vP&r!pKflB z+ZI@Sv0pY8@30Q}R-~fynJqXs75B=tQ~pHRniM9sL}ym7uR)UL9DOKOFXa@Iv+Si; zy(!-)xv_YEl@8g4SV&Wjgxqnb!B+c*B68fOav%ua;&jkvJ4}C*7~Au4Xb963`x*^@Vu){#37Ds6+<+R)N>Xx6-xR>v`YJXR3VoHJalUF?`+h`;P`=bvImc zO`0BI@)6TlBje^CYP%HLTNNb1ij1nN-X#SwFH#QS1XyKjzE<-bq-P$B|SvG^<|rbry}C#FeLqD>p|rmmZTu#g)uSqTC$3 z##MT9bE76xP)E&CE=z4kMh(@1ywiO#gOaSo~gR5+6|3(IOaic=ml+ z`l`a$RowL!OTLELYWL{KI%;{2>Be9co1_|l&H1P5)ZFsAV)HSwqRBBfeUDb?nw{2b z8l+w;U*2=~+aqY}^DIQCkKGy7s~hwtEn4E zr1QTI%#eNUTA)|aDosjP9z@D2lVz>j`T49X`q7xXS@>sHPK#~Fl9SE1N5;kFa-0{J zg}<+ETw3)vHYd)e@DrWBV6ZTQ?aGT!{kqsE$|6)w=bdVEUW0hRXC)#5(pf5(6}!_{ zlPr~GE~|?~!bxEwlh%1Th`2pN-~=-Uma~`53U$|8c&k)ovY=C=Z~fB?eQMn zpU7f+E)X(xo2#v|Uw{=X&o%$*p;tHc^`d(4BMsbfTC@M0Wub^X9yKb!>hjCq6;i~*7j8~m3 z{#Wb_xplIMq_&Xqj+%#Yx75kvrUyPVL~!OSKD(tb^MvYlO*!)i zFAHgY+u@n^EBG;|6d$T89NK;;8&YQ+xQeIL`4$E}0@}OpN}I=0z&j5KpJ&|(dU+)% z*&h8Wg{cyzcH_>RS>Mteiu@s8h5J)hh3T5_znWOIqm+|_&PY2vl$-IJmhmdjeECQ0L2dO})I=0bzQQ}6 znnhjQn{cnq{51S-!oKUT2Tkub_9(r@5X z#7m4~++B57IkwTx+Gl)3T~W}&jO!?ayZ(D9L^NbV*W>;t`4D{43s+{pUc=rC*~`2Z zl)s)r;!^$eh7s}CYn36+k&(l~rx%>`o6a(B&n=H`7-ov-pK92z^WQ7nUrigKqhoRXgBMWdynM>eZOADaFP zcU0r~(!%eY#uigWnPw{zGy5@y;tyroeH_=k7^#EXYuqkmTHZt#?%2D6MaY>G#nq!< 
zoKvLFD1IGR^pOeM+*Y>eB9T`^z>al%i7k2o-~#{rY)#1+cI$Bb*AJnrvH_^-{qsZ5 lF@T{76ua}E`Scl2)aM%)%uYyzT2b($Aah^(gQS7a{|6*zST6to literal 0 HcmV?d00001 diff --git a/content/english/hpc/data-structures/img/segtree-layout.png b/content/english/hpc/data-structures/img/segtree-layout.png new file mode 100644 index 0000000000000000000000000000000000000000..30635299d759919b9dc75dae08725c5bf1dea5b7 GIT binary patch literal 52243 zcmY(r2RxST-#@NNDk7=K$c&JcgvchdsALw|q(rilJ(5vEWRoJ98KI0QBQv9rlqfSQ z3H{&4{XEa>_5a!NJnqUU7g;&#E=lEV6ulcS~JRSRcJ%S%^n zoLpu%)XL&T{NzQ7&XyNkY@IH#>)Se7Qk=EC$Sx|zu3~AylTW(}BQqM1=12>C;iMvHrQad=3r{9r;Gj z+Gi5~=QFyv>8JxA=I1MGX>GT&v(q&(+0J^KGEJAt?>I%8ZvRxN?tzVT@+vBuhpODH z-#qqhdGki$e_ndN&_GvD&p#qU(|Wduyjh8~iMIIsb(thNUsl!Itc`(NcXxDju>bPC zdiClyHrah*R;@Q;WA%7{7Jb(ts6 z68E=jX=yzjaKL-Go*Rp`d0}??bmN_G-|Q||7+a+t8vL4V#6d5;S&MQdJwGgr+CYF2 z|Mv9uzBDha{9O1jH}_?Cj?Vnzq@MnnGwGF;iF9&W&jhQ_nwZ=g92}&XE_nW2hGEyP zw1*F=xVgCx@bfF2JGb}5i4(eK&RjZ!cPc3ngX|$V0U&^pDRL1Vu(_Q!YOF9+Q%C`{3CpA3HNt)wH!$oSj9Nykz&Wu%zYZZ&y}U zZf*RQ{5Y5AV9Z{P{GU@HLG!0)XD=NB~pYM22=FIF?y0nrKhKh;` z+*W<~?n8S-dR*QwV!JTxloDXv>g?=nZvHU$xDEe-L`K<^+o+E__wI>pqR?})Hxd*S zyezrTz`)@6*|Wd%DT}vn-+n+qKnWKww&cVgpkzML`y0qaUK4LRW`6({m1D?;qJgchS-k&z=JHg?kpgX{MpE)j@*!(W6HZIU1)=?;@K( zSNBFl#I_JBW1(ltqYK~cJCEv~JsXS!hF+rLj-a^v5$X%p4ft@WWhd7}~%LiZ~Mz3A=T6&V?M+1dGKP0boJ+c@A)O-oDp z#0lz{*w`xz<63v_++pY9+L)no>t;d0q0gT`OWE8Wq~uV*jm->IHQfQUj2k2*CW@|a z>gi!nQBjGychC3xH6~`}lluA%+vt_K4J!fqP z!`Ri`O}4#}kx_R4z`)CQ?-;YQvrPt_4Y8*`d@#mkioYNK@&!A7FC!zPu$Y)Xwt@KJ z!)6DK2fH_L{4hVf%dvpEyr@W&;36+CCF<(w)2HXd5)$^}v0<0fvfggmLz~FNj%R%y zg@R226p>d{%t+W_xBTp(i%a?Ldo83FaHc0xKRRo678r#x4-5A z@sDFmxv|igT}aEy+Jtp{6_WAf z$(o9MXUqiPFDNhX^Ln=>uQmHkBaaJ|-(yR?nmc{!)J;?p@Ob}-hzQ) zOVz7aD%nqVA2JOY%J}m6v!S?0=|zgzFabWcgULdPAyh5*4yoGM99VdOXZSAPC^(u^ zRRM)EwV?Ar=4G6q(OeUo?~S)#H!RJ{&wnqdHGdt$Z6egF5QF}*Pu6RXwA+l5p&`>v ze}9onA0(Il{?%}xcz~`1a3AZ*)oX5VSJBnoG5$7%cK7bxp**(lYFHQ|CBnNa>aqOUqN=yu!n6UX!>&w>J*-6my@#BNbzh`fM{CIwGX2^)+ 
z=8dMY{X9II>FDS}ZrxG_zM?Ojx3+Ggvg76Dm6VdI>*#peVsc0+;L|7b%$yuCZb89Y z^ARYYdeqXBCr{=U7XzxQJcEN)>FB7pZlz;myB!d)F*7Smr0!u^S=@YU{Qdh!-apFC zbv2OVMpbM;MMlTO+zh5=)>&FwdePVS zJ7bXvsKhBIwiAuy`r-s#OiT>fmb^L-#JBSL?nV*sSx25QunjxF#>PhYo$s+@$M!HW z)nThySy`cCZ}|Di2M_ldDzNVmzpv5jVs!d+Af75B3VoZ znO247uFLF`xxS-z<;PYmSD{TO$HvW@%l>XC0t%}}?t97RyRy#O+FHPgwtU*GWxf_gbFJnfvkMW?I?>Yqx^E=%J(tl$18MKR6bS4Y-k#k`tI# zQ&Z!(<`}539}9nZb=lRp)`#Wf$&!ZES9CKJMiu!x{5wY-~R#=Z3Rq&z4tK)(sCon6D{m zta`s|)VbpN%903vI6fvRDdOhM^#+E9buVA8qoJXBT2y2%Qydf&6k$V8wUKl@(T&6`1sWHI~`!4eu?54nd?o* z)o>^{I_AfxdVquC^-(D)Z|`7no*Lv?DTLLJ3Ktv}7nl0mp%z`&k(+M2Fy3ZRetG}G z@7r=~A&2eXmAHR=IX3_F@~Fbf&pxwO&u~V;x}fdMvdu4B9+O&2oz|;AL5V}3bJQgYUo{5i+_A9o2`>8cSj6B|;ZF_G*=@tatiPtAC|Xy6l_8BLxh zs_DNzefqQmScRHtXl#_n4QML;{MP$eZ#y&gd;~k=&%=U(+t>Sj%>Q=s)T#AY6|6J} z|CpnQxH#GL5IuVi1!ZM%3kV3L=H*3Tx1h#8&W|;bn*+$rFU?zk6cJ5;t2A$HZ`U>t z@^ecoER5cA%rO^BZbji}XGaa@fsQ5|$_U^kr1bOGuNhCD>N<1?{74Ed19+V;9x(s= z=8N;*3#SbXNG-owxQC^*e*2*Mr|daHL(x9B?(fe*fl;gxA$FDCKR$19adkCf(8uSp zIbJse@8-N=MBfOeRIux`)-0bw+74`m=C^NEO-&!fr_$dAK{>MXbh`P43+o&mr)huq z2z;r}>{jZo52hU*9eo+lFLh0epR%BDT}&kC!YqSi6BAp3ntz z^RB7=LY4Bh{O1S#(VGH-gU3)=z^}&gD`Uc5OP7HKp8%y$9`!H$HtbEi=-{w2?1){N z{_tVQmhK(9q62mHAISN(?|b&H=xgY>J^)u_Awx--A$vbR5C-J*-iwoeX*m>3O<6h9 zQGDT8{G~sCE;L8&-y?Hf7%-u0W_FwO*Gku|Km-K`2Z6l2y!dso;o)1cO9$*m0u=TW z#vefAD0s-MhT;CfuDB^Th~n0joH7PzOf;+1^z`*;&LNSJuRuq4?A*BlG=)b-k1Gl_ zmI{T4v_0Ap1+p%XM$x(gK==g@tYI{R$5cZw$BsT!l&)T=717 z^ytL5Zwh$y>-}#0(Nf-E*_nDyADwD=K#yw&6)7|o&CnBecKqeWm&9nnGVUsEKZh+H z(ahN9jqRX}o5st}U%2pcTcoRt%W?qYv^L$HC@m}$NlH#8U0qUIx_)3_FDg6vl7oXz z2-VvM$LvpPf=_)!e*?hZtgb$`Z{NQ7{JVEghi_)$z+EVZoAb-bu?-9iObl0hhF?8@ zbYD9qiy>c@eHUMP?vp1pmo8mGU!kDTd?2&e^Y@JRA^Vq!hwYy_e>pKei{dnY6|r}A zs4tQSl*7zxeiK@ck)|HM!X4S9khvdUG^hgW&d9ZlUY=Y1J!0|Z4j=uN$(_`J{m-1( ze3nKjBlaDox9@x6_`723moH!Xt38jDIlkWvhS7Te2(87d2tRZPkW4%}e^5Afetv2^ zPqYV$;CfrD3I@x@omt!Nq2uZvxOl4mT~CkrPQ!FZtZ!a^{^<1dEuYopG_zVCdOCr8 zqT#3d3pZ3swW?OB0g5bUV9I35YA6L>rz>>$k}2b>{zQXf8C 
zdNt*4ECgoy{CWn}K>>jP$f^Yj$7SK(o}N3+R|K#LDz1LL{JEL#LzNrt3tv-X=+td2 zQVi?Xt()j8V1o;=z`~&bIu`z}hX{kOUlk$SU>`U#iZL-U5v$<RDArY`NM~#3Rdi%G*krM3fonrzyPfn zmcU-8_=!>!)V+jajyLE3Jc=TVh#I{A4W^U1sdcA_kL!!KWAjmf*33+9@& zUERg;+fK*6tHP4<9Wa-#hQk?&tw<(HaJYxf8W!=y)l26 zcTV!kC#B+lg#!*y02yZsOecCD@BZ-NL&Lb&mW}H#T3e4n1ri#3=|j06iry-{rL}dN z+}aXnwq^oH^js+(O`bvF`W)@#y1`O=5ywICP+C^jV;%u8={|ja9lJ-?OLDyJelS|- zD3D+B>i9MePEJC~;bDlHziMfDY%;DTj@F8`2G1+M`juwpL&b(&yLK&p@1s==+9t27 z+uF|0+fqfsN2|@czEltU%&F- z{RHVzQc?NTo-Ey3pIcN!f9=|}h~sp}lBDcs0wSJyCeR@AXf7m=%R%YglwzloywE(~4*eDzv9CH($oaN=^H|j@^gA!t&MhJfc zRy^L%vyUc6uigpWuR&Dd)vH$uIy$$vXt-Ryd;^T_fs8wc)wCm%0_Wgv>c+!|2b7eR$?84K%X854 zh!~!_x)fL|kqe)vrnX5-R|#=a7^XjezHiHxE#?Z3K+lPhaP_KxZ0t@b?SysJ6CN#Z z-v$Q;ZW1&2a#<3-fGR$wsJM9Vg*&maV5b|bJ`TA2O$*8kgViT*Zf;&YXMB>Y^}DXV zzEX;;_s8k!?M_Zk*F8MSq-a5Jlhj#bK;9Y}lrl3jVF#UU<@_==^@00IW26FCcA_jL zm^Hb^ty{yb%-;+SF5j|2t;^&bF}R=w4Hh!u*~;eFY5kOf8D%YqNO z93V%7U=edW+2xQl47T)DEe{Jj8zHf#eW zjoL*Y1qjnKc?N!ef6tCVIKP4e7k%$u@L1DrP62^!Fq*(0f>Kh_a&rzVo2i?9+EBoj3&c*5XyTNS9 z)=^McU+fLuSW;RFlO-5rNnTZz5^hv|iEa1CZ{G%AtB(3Y7ZH+h_u$#fU*7D37_T21 zN~Tk1Wo31I(DKfu?ICa*98|bz?9aki;t>HO^{|JePqY8{oiC zHW!f7XxIakw5;#|s>@it_46&N0F{^;q`9f17*FmGsRbY^Clr+|O1 z_CkSHpuUHP)G0l^#+a3mu&k^vl=H23xuZmooGAL~HB5pJ4@xq@UU*LOr&Ps$(p?@|!I{mm| zFgiB29=~|%GI=VrVHK+v5ENwozRa<|;woRG2-gE!% zirz|Gmh?FoO<<8X%E}}Fam0)b2naae|MUVNMiG3_+`=L{D(VJI86+Vu1_sV)?vRv} z6y6~3cl-v5dJJnrOaYMD_ysNczEAIuE5U;^SL$lJe*}$&7Onz^u<(ut4<1a5(yHIF zQE_Rdr>Bp4@PLLUxPG4)EiJLbbLvHdLn8y5>plPenm!Vw_qwG;EZlP|+P~+oZ3zhp z#3~r6@$P))WJb($2zj=Z=ag^`GXs6+KCUB>itGmEC+dF!rbYAX*P~EUA)%r2ST7hR z@te*JI7pz~i`aCq6FLS=FLryZK`gGjce!~o0gAe(PyZa8gEn z6-qp?hF7n`@_+NbiCj)gshuT(6G&iV`=7cx1%moBGc&F3PWJZi$F0wIf8k1uL(YOf z*V9xRc>mtb!^4*^Sur>)yVSnVyfJJS#hOvL?czjdyo|f3$Ioxvo!5Y=9goi*vKk7; zQ#8k4(L0WLAN5(0%+^T}vMObgyv*+NZ`Cc-{YIjKQu*c4_2ccy;c&2oXQ8Go{;jPR zY+vwTPESu4M)xPtR_*F9qx*ufaH0*0t+ux(9rbr#7$*j<^tGwu;O%(dNURZT3waLj zDJn!1bgZnAKqujwHsh@c#H0W-0rl{!{kO(^=FFLRIbYeEH*XFv&2e%7JjSL5icwlk 
zz)+FvbLRnDkg?l5e@zhF)L1JCWwg=YeSi|!Ku zKmXC=$D?1qY(gFI6#7?GxZf8H@{(mEln+;TWxRDS)?x#YryeZfy(6m;Bni6pp0oxi z+Gz3fFK%dS^tOF*j(+f<{bKyiERWrR(#wF*?(T*BzBKx_CI-PCK7QDGtsiAU-T_ih zB{N4>P0--?@1SDmAjDxg43IIK=sIaC`o`vsU^RMd-*uI6SnVlca(0C39jZ;0SRmZ_$$ zej(BfmH1;}!R#+f=b#_Zy$Pc7{hjHuNe|!jACOb!6%``GR#*mTBMs-@vU@<+6Sw5T zV+1P+anA(VA9-_i@l)g0@nK$m5N&|N&Yh>)+uOS;PW_{t5blIav$O3>ZAz4m{LJ$~9v4OQ(%(GuhJ z+Uzh4{_S21ZAUbVf@u?go(=;=o6wH}VQD{*^$shvXfhvu?mjV4{MEkpAOkfW3(GCE z+N5KijA(vTTeisKioLwN8X~vfZmLgH4Bo-SL=6hFg6o-^o3oi6vUdaNstdV?4Zm%V zFa>~x@R(Dl=r4`b-0JE&b=}?lC2U(LQgpk`6hJd}Ny$CK!^5)<(O>4TlP;v4B3Lo5zm){Jw5tKvffPp*8UzsCIe;FM#b&v`B#Y9%nWGG$Q{ ze2k!FJZPqj(Fc9P_nC=8T11A|F zfEBWcSF~~~6u!Q`d1kdW8bWh_rA}yQT>0@?ecv$;27tw@rx)taTUiCItayER?!FVO zE#h8%Q&Ua?cTz|cJZj==0#6BF458x`wtREvx70I4PeNuj5x9g;AYO1m#cNi6Xp~n~ zJLUega|G%&8iwglq?QS@qdG0R;ivZ*FT-hLrj7G9zsk{@|Fp!`JZ9u*bEoSW0rH$Ao)uSQDi zAtrC>;IM7$)~!VA*VU!>ivQC3s)itw1BM6DTTM+vLoDF+#E7jcLz+t(BFc_1>C3LJ zn}X}3%Tv)!6oI5@0hfP%JN^9n{CYSLq^vj7?4yF46k7JX@FWHo(6I}*6<|L){9HiP zLSBK=9{u{YZ+w%fa#|kVvk?phxz#!hm*B}kEZ4&jgGC=Rm`@u+{0ZRx@7WR6ix;_& zjwQ^ggB1{-wvXm(ehMAdm^@h2P=X~Z4Ec|rKGo&zg+DRC9r63ra54>@79gmjug^L! 
z81nD3o168TnWvt>fg-g4>tms_}_6tCShj2tEbIxe);jGg{;4X({Kdgg)e+}{`v}b zfbEd(`)>_Fw4$0?NW=IQ*IiiMxt2q~J;VzIcX(R&7kk zGsXsrttDM1cKn(d3LhLa)JzcFw3A2kdVbD~D<*h8!h~&%WRWKCAYfenRD;7XSA-j;*@=Ecdmee&%pTZ@q z$LvF4#7@0`Mhh2gV|Zg7{8-5U0ILoA(sV=O|&xB26bjok-}gyvWzp)s+fHPfUpS@0rWW%19unsksd$ zH9t41h))dX-hA1`W$MmRK@#(VA(F}_7KnqyeL7vBHob}Gw1pA_J)bBOTm!$BFJ(&u zunsbuMrtBLz9HSce=Jcc00ROa%CBrdMZ>_}j9?PQ-W2licLk--HZp+Zu^1wv5BDS; zoy?r$o5@$oK1^cq@TdrM=T=r$#)r_i9l7(v?>Nhgy1=blDAvI`PGyvfO-NA3fXm{_ zpR1u|)%JZt*^UzTwXj|3^J@jUxVWOw*Sl#k3RMZM6Z|`a8(pm}K`eNZnIiG%Ri61j zUZJIkf|OvDm3aMeHmvbF3TKK54{z_DJst5P76*D+DRC6Ank-S0Si;TlBLNw`Mm3=BFF#H`g0Fr3Jn`JIK; zreTxS#LnbJ-}>?ESN4Vapp#&5%_t4fUTy34q_o6DrqB{z2I@&HimW--~>JlmPX zLr~KycyhDGImg*k(L1^j0jQgpB=hJz(1Zz*&V1xj$8+}u_-2Swrfi_E7}(fGY@2Ff z@Ilv1bfhVkK+jRr(C{DthYQG!<*N4hnJ!}Ss-r7Q!xjR^1@ik%YilMNxSL%8Qf}=O zSuHRq9vJ$Co~@6S&CF7sKY#uZ&leI+8IU9n7k3bDGSa9MizKD&LdCuA=!k-^XVaan zrEhF3&?#X^X1CBXG)H{?3L=dPM}Ve=_rirXzk3<)Q-=5`-jLU0&V~e}{p9NOE5& z>nupu5>K{~O@7DWhg~W0S@mENK2MSR=ur&fynN$|7}z_ZEnlu(+g&9CqUM5@y^K!! 
zy0de>v@Z;+85j`I0sqg?yR0Ng_dq7d)bhexIa};hau0}27JJW{b%N99!f{UiHCHcx zfZ-IfO!yDhP@lITv3WP-(o^g!7{|0hV zus{Tj$|CCO>qnN4-0dnXsmjR(I(m3`XiD{iFrU%;d1}|opLeb}$lVEo8cAhp zgQyE^8X6?%o5CA!Vx$H63fL+CJQsViI$SEBztcwu+l}SXs{1|SJM!pY4J>KjwLiRw zayDQ(4p6J4s%nAX8Ps^WS7rV}UGUfZ8D|DrH%H!@9B1GkLR{{t4;4Fs;1*~6ePQb0 zWB$WWW{JF^uUgN&{?Madk9esRc1q200KYpYD$t0saU0w7jG)J0$?mH3gN&2qIU~#WZ z50;*(R!1)eW5A5V=6fxHc-G55zX!vjc=`5ifnr@Xc093?6@zK^!xvj!{#g$yLo{J1 z*D*K?FR*BPoHWtRu(TbuzP@093I`ZIjn)VMN2ZT?Y?PLk#xw~ta$b_(V$2N2(wh|@ ziujQylV$k>64GJI%O1oq1r;OD5E<2u9XoCy+d>3U(oLV}rAESp`OhZR9;Lm?36L~2 z7#Ser2cZ6V+v|_7DQ(qIRZ?Q-;8-6W9gTUtMz}9wI&!NFh#3Fz7B5tv0A}wr+`DU6 zW8UnM^DiifxPUh{J0%EKUI)c?<@byUNk6~|f-&VJ#@mMbug`H(bKgFl|HkU=gEj*- z&@$7%187q8_uI0;_CkN{N4|{MQJX>yMi(xe%h1=JSZ9z9#s4rfMgJ`EM2yybiO_Q; zpfF3rjB6MYS{@B%Q!_HUkJhQCtQ-g}$vty&N)u^d@-fdi$$d1pSdT52eszNC(Rv`0 zfc|9zCclWtc-$9hmpB}%qLPwKgifUzW);Rljsr!3@CI$6yjZ@u3knHkyMBKc4WXV8 zA0Jg!C0%GBKutxJ$|8MD7fK@d=aO?i$KW*27`p}Gq`#Zmp-VWuQ!Zd zw)yhr?w#mpc}Vq4bi*hlM&7>$tX;-lZ4dQK0Gv;LsJKe@GSiVu`;JOU#ehMcGcxK1 z!L@+};xg4Qazb0%$NF~XeKie@401s^IlTB!zVi3$fLel1K*8p&t{8HE*n}mXzb|1P zlo2^ZBvec@@Q>`Cu*k?6-P^{Paho0}4u<`zg=WLR#I*eGj2qw#GjW+ezxQ=RZ>yg^ zeHTmm-^Z`5xJ$YJ;PU=E{p|DjxC)-}=h;~uxqmBSL(kn~A;Z1>EDiVSDt?8u zDEgf{ZQ;p^L%5vM*LUdMYH4k)3X3}f8@5`nWc}@?3{-9x=-lhx-n;PjZPCz6eE<36 zQf6)eYbSlfu;}6z1od>`jp-1_UC3ehjvoC}-ENON#iLtxnJ4l zY^T1NnGQ;?byIRw)ZDAtmAQIaamzOhp}#G`pfu%>x*VEcHp@y)y+Ld|iKF{yj(IPa z488X@`!zeF1OH637TPQnHm{xuAoKILZzoU#1lLD={~han_pU33Tca6CfWIWG6B>p8 zW?EKF0J;mB5ThI?y6l^~2O0M4QQN$E^CI}?bBUgD@ZXDeb|*-h3Z7i5QHwEd8$cth zCDjWTE=+fA?)0S{-MU*y7aczL;lrrX(!*kx2P<9eg9DVSx5J`Ot+V4sr< zgGJ}^@x?l+%O5G)Qshz)gQ_D8BE>=|kg|%(DMdwprDc8w>Rdc1A=Bq-khjv_^NqVN zIyvb;et#Yvl}8uJK;)roXqc#^=reK`9|~ed?~Pt~(cD}CN@%g|$VdnFL)D#*?{Cmd zVsPoI;E=osRr`XT<0@i>jLYZRN1Fg7B_MoEBIX-`58#=!t1Am{7a`b%uE@IX!Y58Z zM~Gh?*Tl_UJb(T;8eckaQt#x+PS4)9uM2!aLXiY7Q1hJN49+etneZSxA# zRE1?D))n22d0PMXi^Of{>gmbEpiJnmEX-Jds3feRVQQ4DWrG5sp0{_$VeSp&v((5l zhDSyw2_zl}Jgm&*quDg^P3R{PT 
zhX)hp7_{qvL^6|>3vIJP&b)IkGjl8&lUAa5CG!|>T97KC^O^_%2fT@Mq)#2nXg+%-2o3a|ML6LZA-H%qeQm772wsorkR zqb`8V zWqRJeff~L*c$w3PHs+@}@cu>5p5-7V??93T zakE_g?d$w?G%;^--Gt^t!ZtTH1~Lj6t6^Xu2c@@VI`mE)uo9guKKz$s-T+bBfk8nh zp36NxlW)E}Ha$HgL>mEL;M+iPPCt9PE!Df=I@EHbf+LjohwRzv>+?6^F}&{Yk0%Wa zjVzz%?h@(@X#UgR^SCM8Z;V>nergC6^k_lYVgs&(z)cCkOsd8`q3VSSMrLLWB$>q4 zzQ!z!snu?b=e~aP=39PUm#Ojr1_+3I%_Acz*c#-SV<;vc3>6^A-9HwoSfYdEi;I&} zF)=ZbfX{E5Vk})2rif$U&5~E5QYYb~6Dl5^k->>Y(?bAE>Jp`Iigp}l;~ZDUj^*Ux z2_lPwIi`G?=55=zzeXFzr@nsw{=UzjFS|XK=F%DjI?>c-Mrw87ps_={z~0V)>Z(2a zBl4N8w1`M_ws!Ifh*%0t+wK)Ksl2nc{QZ#O)AQ>PiNpe#wlhmakg~zSirJ~;|1fL< zcStw`7$*l>PS?A4*`&=m)vl_5cFp$q?B2Dj6B;c(Q<$t82#c3~+J|W^M0=}{BRvIX zwPzG9&j~ORS|nsrxoBGJW@W`g_|N9;OiJkTSX+u^lzZj{uL2DH@<~Xt1B!__Gv6$8`_Z&L@gUa^-;iHgFR1r;K90go9- za-RQMz;ZGQ7#t_8)6sj93M7QkIskW<`2kbd!(xZ5;5k5W9*Vx^F2m~B|I`m+$_D#* z+KL6P3fJVfZ(W$#7_qythN+~%8gFTw7>EHgL#EmZuzqhzqoWFQTM^Tn9GC(l*#4gY z1GPcXh(lQX4ih|d;6N(ad?zYJT}$g0Sd0o#qXc>=?A9&EXMa>3_a8od7kN5V#+m$f z9Z*d!_<9h#SxEf4LF}a)mN39{oL4(36+IPgF&%bc8Q;`jM;ix+OdQDhPeZr1Mu5r! 
z^=@CA`JgVhGPhypwS2%1X=45T{WQBvV$$JK!>!*xzMom#W(T%o7rLg@v`pUP$Cz>y zWf49 z6))^pA-r42cX$}DeXDv%-sgb)TN;oFQ zQ#m-^aK=0pu?SE#QgF!bQ?{T2h{^lo=TGmkixcf!Qc{U<+X-Z}Z4OTVObw%kG~CeO zxG)&H9Oyb=*GWo>lZhC~IF82BQ|tThNYEWw$RP~+6Iy^F4khS}5-@ApeGwiYXLK zl_Qo0j~&y4kT042|A;0$dNs&naYEjolGZ23+t&yO4Aqefknrl7(MZ_a zuiX&r>Z+wG4Ac&dbRSUjKe}gCY;l(k+l*+OPV)`BAI!^4^_{5PEJmi zT&UmtZKPI?NK_h{Y#d97W;|r7iP$eQJ6k2iXPJ+HI|inWWNOWD9K4YC&zkQ z6$_b;JX$DLnU9-05WdSqw~m~ngd6F02v?gsJEP$fen?xTrw%N!Xrdu~78O%!Bn`F9 z1yYGSv}rq#=5u4ZE=u^jWbLxdD4F0_@O&{eirE?mqal5rwg|8rPDHHU|Xi8);f ze<8R%1tWQxkTKn`azh)g&6QUd8_M!7Amw7H(|6 z?Ax=&JlYl(obDbTqOL$~*Kb|{zc{c*6R5V|z{RPcV`SnM)6&xDD_-EVpe-zF;M|gS zGAH{EstMnKX z+2LgHRvd}&!%*K}c^#o$fgd+fQbJdq3EzL+F_-h`(FEA9XMZd%a$!e#|NTW=a1ger z_}x#@Z_6$XodBjV?AURj8_ar`pA)&y2$2nkgbM{vv@aa}6Pf># z$zf>tk>LQ**9{E;gp+}@eE#%Ffs_HxS`u@(Sf~#!G?(ag%wH3e8l9A^(_oQhi|VnE z0z3&Evg^k0$Q~g5e6YmU1-1jt&@8AuVGl$e0DW)m6$^sIsB~#|JIe`6<4l36AYb z*njjpG;|9Fn>t~YPQnUd`+022F}mCa1l61|=Jc5}K8Nm&;4HHOsCCr&*|mG%Bcj)Y z#T5T<7T}2gjT<-Io>;?z(maOt#ik_n#qA4H0rMk>r%s0+sxyS?gKShBh^1ZVg7fD| z7#uqe$&rD%=J8dW&EavdgAq3^ku%{4C6M8_ffC21Qg(5?H-#dX35-m=Z2%2z6~I5}SD zF`|5tY$@9s(TJLRGBU@%VTc^%FMLIEA9|AAI2~FD!bW9;!|d+vf2{mjFk?kFFUFy))HPNYrwo?!HVRZcEo=~FSp|oa`wbxGLEZZb{ z0K>k0t$P${SY?)comuZRD3fmc##j?GnZ5;;aAOJwrmls%WBAE(nQ7m?l`|uP0s;WP zMlWCZPqe^a-{v@mvseT^J4z31LSl z^5ggK9E4dklTCkju(E1{I>C^8@@(xAJQuW~NPsY*&d{$$3@WH*aU6jH=1%>Xyol45 zt6*V07@=^!awQELvUJk8?e*(4#Adpf^C4L(8XSouXIrsIO}H_mlx!ZnI1t-`Yq7xz8H5F-TfwzzYAxP} zCzUMYUX*ZT9uP^|KfDAS3sHsP#nLr`&3OtB@pW5Ue*KKS$tKWRv;yMUMByR`GGbRw zpj=t#+l8Q6P}Snz%Pw6V9pp$8H7)I|v$|HLsIkyJfV2(9fdq#wz&(il8q8OdKinGWArqHjF+o?38s(4u)4^3%1z$8?;Kc-&Jbx=7FE( znB!lCfWX`pm&B1zAlTDgGpKR(*?+}u!zZ=4o!I`xsNEZ`(&r27cQt7&GJhE~R0@=z zvg{jvSJdAUX6|-&%KYW9tD}K@UihxbE_JHc-otVRMeJL+Mok5rWWQdfQJ+bkZT*no zKAs@m_$KTwABAP%r6H-Jva)QX*7PUSHi+2u93VCruG&X0!x(al7o)N8w{sQxzK)Nd zf~}5po*3vTi97dALzNV;&q-Wt)?MMy5O^?3I6Zo2AQS;4X_AhEC29B^ zpWTiO^}@uGOT+*OJWb5w@7}a|Gdd6(gb%qom^+$jN7BGaNXP=yJqt(Rot+y3^Ok~x zgJX+}#V{-J-Qo 
zuh;|>e$saz9nus5IL7A5S3B7Np{Oh=oMy3SX2kf#Eu5o&jhG+t&k5KDD?)=}LhgTl za0?ndz47R5fj05!?B59lUfXp#2xI7-Fi9UezAqzf3Vu)y7^H8X{g!jCiP{fj`0^!z zdU*kRu?$9%dHM;g{l_b&fw;vja-!Za0Z zgB9lQ{C4r=UEJ|01Wb|%BLeiHHn`c(V`GX4?3MJ(U7y1OhUi$jpIr5Kw4RbYvs}HL zH?B7z5s7+7phUb$IBuyWSC?Rc#bBpBcGBg41DGcU*oUE^NemF(u{YRCO+5j#Til^v z7!}2XEHzQqTWYBMiocje64XD2etDUh?^Cn5M%}-!fo7BeY;87qI*nuqSN-?WyG_qM zeufbT5kd`v=&A^#KTl4ogCB_FY$aAX;!Ra=s1gVA8cx{Cb|W>%kT(?W``ZPB#IV}q zFsUr;oPTDs>~ZkQ5DB3}@DgTZthChi$~qW2Di05X&DwJfY~C_o{KxzfMFVM`GrMA z`Q0uaJ$$&d!SEr#(V4!WU|qDD60Q+j;M}8F-e|Rl$b7rJ;CF82y@K)$qG35bnv+Q~w^%PpD+Vwo7gD1l$!!mErjd1(j6O)5O zf1tTM^z#i56aS3(0Whi!TrZ;w!4`S|#?E)(z{69+>$7rlx=~oKp(zlj{`u#8>-ZH! z&BUzfDzzrxUoiclBKGQ6XrL7IS~YAaVkR*ib^d(B_m{Ed0#0c$VGW^`S!cowPQdjT;&H>gv^l!8E}5}yKWVVfle9NbZ`-EdKk$8|wsvt44bJyj zxW~ee9}l70NuqO4$Rx&H3oRgYGqgZ#QUgR>3bI&CnXslLNxS zQ8?XtKwMludkrDctEVE*Nzg8+>h%JoOM1isUkXl@#jzC^&^Ce;L{CD?MO6kFstwL79V*M=ehy%uN{iRB;3tg-cbiCkQ}}A(oiDeXd8?lz@yr5 zrf%&49o0r+)R5HzlAgk<+aSElV>4KtA5$eVV-+V|$&<#>7<{MJpElOE`owM*eC32~ z))F4~0CikQh)TpBc}x-K1)WR!)J+Y@5Om|>5*00KxK?deIo7xf^p&_o(9K^i-5@7` zZ6APxUB0H^Wp2)aIfgFiBz}f=kQ$PS5K9y;l{m&-$gQEX6)|)dYz6bg+BS^MW3(Y0 zGW?Z#uz^6`3&|u~yu?i5{sJP&o`*PEw#z9DQR0BPmWm!2 zgTM?sACr80v9-GkYS&I61pmwNRuGvPF;mEZ8P3IIz5Q`25VOpu;6*eK4J8w<+k3a~ z5cyUYJX-C_7v8YDl4`fasByW>d^q|Ir~VO~pG1PBFU33k@8Gt2@;z=gFj~MG^&V{( z16U&GaG`qypTG>2H&z?-iG$LEmm|YugoePXOJSbS1=P++-dE{%1GH zOznMFwv?UV+^{JKYr(_RFEBlZL%&}%bvU9sNGyl#QS9hlTvU{4zXL+zM4lYxw{t>5 zs3yz0|JOt;@HufDu_}oxaj>5>^pf*0*MoD#R=lVL#=BsoD68%*GF7XLK1QIrm>;H>u_>6zVR&an5P6dQWsn=e#_NPDD_;pv_06lv*RlQ zW!KkMjg@ z^E7-J$0NNQ8nQ#6^lL+{8xp(qy-xlBS0MY2`k*mB=BQ&PNF2|te>c``II${=gDR7= zpoc3$IXWSY=EN_8*ODc*85<*Z{1#$|9D5}=FIwO{r#!-5!E$Y`tGkJ2a{^*A4IRn* z+50un&Lm8^5&1}LaC~<`gYY)8<``H%nqpUk8D)rBYgc~;bb5%M&mTV?2SRk^TZfm? 
z3rVSdhDV!;c6|Dckg-Dy$OcT3Sk`T(D{o;l;VhKig|D;d9b|w77Ef0)$3bSi5pJ@l z;;MC5UmvfhrzhzmB>aMWs$o)9XN31nhBnFCVdmk=uQSzD(P~`Y-9WX<*=zjej~J5G zB@j;#{%zHksBXREa;`g0JK4Q269fVgv!wQZaXbG8d=i>zPpSRcNy8uMwfv_ak%5@+ zsvOUKRug$NlWte!3jFD$4<9}d z=mU*x4m)GHu>8{=z27BNa3ws&lb)5e-OdDkjL-y}11EG$dqjruE1cF6pf;vW$lW3s z$RJh`h8b%s=>C#^N@mND{iFZAtQf|tC&}YSPwxhslcX?;H+5m%Bv!KZn}tckEFWi%(j!KRcAbzp z1a6m$fLV?4@+wGi*2wS4?%-}oFFpuUHQ1Y9STgY4N++hix#E~=oW&9wo170ZU;^7u zS71HyTnMElbp=ndrX|SvfBhrOW0By$qEDTRo`fve)sYL&vT%fl7o_~|+vBhC^vLWJ z3GmMq1;DfUPl==3O+NpQd+?5ZyMQyRWMJVG%Vr94kMN}S<7l|(k#1^cB7-x~G|rbVr{djxcN;mNi6laQ0B5w5q^IaZ{~I(UjR|RafmPU- zzGebcuzvwmu0(kv3og1goG5_%wpm;G10JDFq$Q44c*Qz)-TMT+EsigdN%;N%8U6&! zmAvsB%<}>EqMtoG0w@Z{VYOA0_I;n$qlU>yI841*d>!d67U{U+mp5VTbYq~FxZg-~ zG=gbZmiZs-JgD6bt1c_h3mLV;#&cmXIrWf99NkhCg4Iy(9>pLLE*{^Opl`kq{KjCL z@jwLT0hhdd0&}JuK9rwKJPVg9>A?a0o7)`hc3 z$KlYHTznnlbnQednjP4;+W5c;>{RX_Wa1sRpc0Kl5F_1wA6tgINH1Xj!)_{ zh-CXkD=QTUl20z*$`PGny-xzXo12<~!EI_Po;@^+!CY3Nh(%DVU<;<{iGyann(v8I zHl+>&F&Ne2c zS_GTx+D@CJNMEDtcTaL{tJg)vWZ70Yf1%#Aaif}?&V7knV$+8jaStI)V}+vWf{fw0 zpa*x$&b^u&t+#w4$H_rP&r!E0(FAfaU47SC*W41Wgbt#){cwYDtUIo&-3wj)u;bvJ zSCK3&?yA=?tc~bDqr>9ue6U zEhDqYNJ1I+`Mkd0>;B!p|L)_sj_de*s*m39*X#LwjPr4x=L69NVoe2np zX2?NOe6D;NL6h|EXs1jObA!ZW%LAOW>j5Q40Ldr~Ux zvGW)kbgun5G*o9r^Bmo%$(Gvg==vhU)aAZy+dW~p~yTTecl+e^}{CLL&@`f(>`Kj!Kq|e6jBj^ll z%}yfU#8|o8;*z#w6pmp*h33ES;3xw#H7~Dy-Yc4DisqpmLrF3RBcBZ6Yv8gSSoNI% zGSbbO9}z?Z=`6dD$BTbL6rWW^Hw-L>x2I!<4uScrTjqQ)l?^R!=MCz5CnuK1|z{eV0GXDPxW5(rU4 zdxQaG2k?`N0VW@J%D6>AVW4R9ooJ{tj4EY-3}SG4jKnoujrkVk2;d%@cZwg!iA%m) z4rZs2eu-*w^&RA7@&chKv6B?PK>{W6#2PB<##m|;Bzj~7iQu&5e`g+tsTQ*!!17DWxHhSpHP=2EC_Y@&e}j=mx)A_8bBA45m`hScHYo_KjtByqEy#4bb# zAaGdWW}^Eu%#a3GG#R*9fsBw+xs_Z!=szj#?W%Z{EclXY$fg#@K0)L9fC!2}gCe(= zOb=6!J+a=xIADasN}RULbSm5=?(LLVONkP*<-j~7X}TP}gb2F#g}*qVgFp+QqSxlX zheE-5?ngvTw@QpSBTAG99>oO(wRoKg_?`A;TrY22bM@=bmB4_(z?+A-FQErgAB)4x zz8`G^t-(LI6%d&n1+vXhaBKE4%fjauFC+j}r~*dPGX)K{XD!I|%PE1lf1P2ueVh6ZzoZBNuw>>ek`&$-!gj 
zDO$xJAetP8ur-pMAR5Fq0?whP2&#}zYV&&pfwAU;deRgnnGQ$+L`h5`rW3$80zmJs zYHFgz7^w=+5+0jX=cFpkAaw8nwz~W1Q5HO-lNiQJbU~R{jS4guye8OjB$7%4w~UvM z&ji2SQ)smnngS72D}a-jK}flgg_uXQ?Vzizs})&*NUR+pM7gly0;YsBAQh?iDe^Mu zmpX1lJQjQ(Ux4?8J2@~lmGkiDzMP!&p z={f?VG5_c!1PHQ-Thfe8(4R+VrZP%=FKaW*yucd;5GLL9*7v=7(E zx1B-Lw(X_}fCrqlU=$Gho}-`)fy+%&lx?E~vIST}w`-Ii@o!;7mLo!62)p;WfnD?( z@0A#m0%-G{gm9o5(LysMLjkJEB>Yj@Bg%4jARR{+4HF*~6aQcnR$^g}0!8HLQFES8 z_ULAuXTGh6w(KF>zAYQpm3Xeo@G@;gA7)}@b#YP}?Y1+}avUPG;TL=z@ga%D62z4J z@7J&d%Hgmxg9)*1vMMz>nG5d;1E;XFyt#RKw|#t0z5MB=x~~p(s2VWERvJFyLJ|}6 zHvRReW~LTBsyBi8(nNa!$i&CTGpGC4s)~#J0R{wV5&t_G$SQ~h7#{i^*aM&C^@c+* zf^w74USx^03JTt&jk&JrTUaOx@U9y8@a$$w#A*a`NZ2z}|2>S3=D%^sKwtmi4?DiS zdwB!|6iu>#!;-SU=po+cnzgr&fB&b+B_qROn#T&(ZOj$s8&$h@?rbW%XTpjfIsgn~ zOK?3b$h4Y@*ZTUfUC7Y99UE)#xSNrYm^%UwLBETSTZ13FALA-<*zMIo`IG)na{yt7 z1c=5J%{!3MA=KMt%t&Bfm8PA;;Vc#RFQLWT&zuGwCuR^}8@Aa6Sr}wcE&}rA2mB!k#B>sfQ+4rCYlfR6iTM=J4S$U+Tw@;S&!G`J zj%Xjd%tE1KKS~Giv;p%jg&kDI-vNISS2Pq2 z*JXBVrz-D)vHe6}B`5HVALo-DSVB6C2vVgn*t_1PLM+r{U)af4(T8M`V#aH9s_`CGl#~%P# zKU?Jlp|R`x60kp_^#kCMhw#?w=vA|Sn&kg&u7`lJ5Tjo-N7WdoO1vi{bkKH!5xY&m zh>Nk#GAw69V|vIU?M=+#Jn*2D0G$+Uz$M#+A3;rlUf)4-*VBc}cGAQBAW8^sxcVM9 z@g2=)N=Z$n+UICmV)#m9iFpbwl*El_+@oh9kKrDbY84zBN~0PjlRcYjX=r06m6OV4m+4&_6M8m;Cq?jW1SUqm4*0fZ=!@$ZhNARPHSBD67HO zaS(!k@`iQZ3j1LwP>Pi72j<(TGDHveL=E(hokze!`Rw(dU)&nnKvgROd+Fy7=eWQ& zj3|En9y1z;tVE9xbT^|U(jFh4KLUb>H^6B76uxb=*MYH8yf9cywGxQ-)7|01x_vv% zQ~SS1B0hHSBfp?)<}2*Ul=^=i`!uy<=l?&qAWcn!#X9xee{RHouSV1HpQ7piy(Z)r z&+kZ!-ShvwJ8Ao=d}H^VyT;EXyRquw%9{Ow@w=CnXwDr_e{h&folE^enabHWTbkY} zC@RQjm%O82c=+J4j=paCgqzm-jrJN$GwGVX{5R!&JNbm&cJv)KeR$vjvts&NDZx<*t##+Fu{}_C z7>Z=Pzd?|Nk-{`IJoIQoc=;jj6W0_D9U4B8QA77^)ZQh)N#ZZ_$-ABMr|!J13q5O6 zz!b?$wZkm@$k4}g*|ioO=KI%eGE=?gzW=aLep4vzzDt!oDT&3}j2l1vjL_%$SX}OX zwDRitHTJVX!<=W9Po9pt{kyCwL-qLMbw+IUZV}w3r1oCmUYy0)*m-JwC(j?6Xs;gL%k%m$@Nc8O|J5C z?jPk8(>xQIfsseaI_0ZU2Ud>IQ2FP)Zc*{oeeCqYxJfuc+?{W&;+cD=dy;li=sK@m zYp~6nAE>D>U%1h4_x#_nqLcLA)9>V*kGVX(bwj&H?^sepad+5m2DR}9)5bEJfS%b+ 
zx?-#W?o+gyTN~X2=6L)~d++h|9L|zvqg&ZJ#Y^E8&YMWJwzJDLEJ-%KaYjcp+K1JQ zef!I{%ngjDVoP`ZE`{>sycHEV*UKJbbeOH>b8Ee{+Wn4)k+yW5IfC_V)xo)|LkE=T zChnNM$bWeD67>P$onHNYf4&X1rf--iWPM@3@wg24nfJcoXKrXZ75NuAXkYcr`R2eM zSJ)kyYrddjDYp8-(}yxVQt(UUO=OnOLpjIxa|I?=bi~FWt#j0yN-na%Fq55n3CvI6s;`t`(=Pv^% zzEuXtU2=c1ht-K+&u{~Cl9YP{WB=UZ9BXfx5R2T#XMH2Ro9Y|69P)U+sVH6fy5UsK z1S4aoM0L>53+pydMC{~}Ag9Y{ zb7H1qd%}fdg;%zFjhxrJ^liZTtoYb=mgJvy@YO)tkxQG_ww=E%P`NX!gh$6gHDIQ} z_36RNgZY2wUyn3STK2SU8Fkuu`{oOuvCrx>oa>wp^8eaM{Vum^rHE$t4yO&$9_=TW zCrrjF7I{m>X#GMIwleH^R@rhqyl6fz+2Q?aMXQAO1-%_~M@h zl0=HbIeLc-Pa7{tD_wne^5Dz2;W|YIRHrUABqV*H8H(9hSMe{rJ?$Jzula}9r44%R z=lqikWb>!ExDRcP@2!-}moSoSQZ@nvohgFH0Bsy;-E-bbVWrNoOBd!3?d`=5g9s z(>_JhkNiALWwfz@LVknSZA~eK1(_ztvyAeJYA-6;yCxG3b`G~Lxctos zcOMn!?n@b1p1rJdmi?Z7j7|}Aw2$y5H^p6L&6VDgG&z?HY%RtvNF9uqKd$`o@}@V6 zA8+OluM^MZ+9SL~t^E9>V11hNj)$L9Hs9JOCY0YCn!oQ#Wlz(oqgQiGb8T~m?(-qY zJi0Plr1`-#Y~~>Q4G-x%>$8H2=pKka`scg)z=QU%r~5r+?ahhtYIGx_yL^je4eH)T zXQV!rQkd`Vd3yKHm*0Z^R?q8h8Yst_CviM<-PPx??Vi3t5X&bv%0{kjGsc!_R2+R& zTQ?ik*b2>2wyZnJSV%{opkc@9DckN>bi<@Jaq9b;=CRKu@kT2f4)rB7eE1o6$4{4y zGJHz2>dgC$PyHuoT3e}0+>K>TADG%LvG`QolxhE_UF%hJOeEp{0m?}ak2QrETE4|i z%%|>(P4VS+hSnyX5e|!4o$9{elPh2?t!sH@_uz-ka+lXVRoV2Oqu@kaOUbLFluwOy zExAdJ6P@fWH$*8vQ*=sPc1jhAQuZ)2(u8+pM6x%{99HC-cUg1%5B>8m^mJpxYJ&O_ zIIJ0#?G~bcY2MwrI8zq+#P(L&q)9-q-ccTY8QNC|4vlqXW`7Gwtzeh>onxXnwyUmh zc~hk5s|4*`yN)ky<}}_|t{plvCeNg)&@{^WPCowNa>u97`Eq9a5655dX2_ZG9B6v* zp`H5pS}wT@bS7!Fm-ZX{s_A|Rp1wWH(a@ROER;RNyqnEL)*DUgbTh2 z#me0JSr&LiXwK{ItDak;8&A0FdWXMzBk<&EwNs?c-j7V-+bg(Sk18_u?DMf7SKRSO z`{tq=i`mcZdbuX9%s%d}d<-uC37UIM-IN|yDa})1&g@f|*j@L0e$%Em6?Uwt8GjS9 z^Qcwp=_|4W#jh`|cd}Ku{HMF;xQA=24VyroL8M0N$26K@qvlLncD+l}l%X$6TP+;@ zOiwdhp-8dz^wb2<>>k~GGnV>w%h;M={-e1soI_u|2&XmjXQBkMn#!Aho&IfXVx_Zi zdT3yye%Iu^1e>>OEE8Iz_*m0jw@`gjuABW(X`E|xfg&}gr&Hravs;Xfv3rBNf01uG z(~fD0Xl1R&m(HyICs+MbDq?5OIGQwgwYAmsv7bpA_iAa(9;Okf$lzIVn?7uNX>fXS zLvP$#S=n(%R5$N9O$;sv%myM5$p$O6TFX1=wnon36#S0QI(0UV^r%*?FN8t(no 
zPuv@h+Z?NYbY4JUULv2dSormGR;pO(Qyn*d@vM8xu~t@^KEQPIV2*2P?tzkfEy3o- zuYZIGWLoY>H)RB_mb}x{ok9`Z0wZ{Jr$R} z{`mr>%IAK<=5I!oBbSA)U0f~-;>nTT?hyZ4_-l57D#NyW*H;RpqjI(GF4B0OI#q5; ze`aOVL{u)d?rhD$SNGR=on9~-} z5*}q!Q>ZtJ-TU^ae#6y^v0g2~e&0O3>2*GL7uY>J`J4L8qe;J!o(R?pibe++_w4Xb z-TeAf+SM)2xjzcm{o5;cvV5R2iG#A6A;j-}ZI!(L%q$G_y$^b@mNRZY<5o2(Bh+ql zcVlAhzuJ=#zpi#}NfL{%4RHF`o_bFB?vYoa7mWZzs}3#|H=}+W8w78`)7{0L%*~KdpFyOmkoB8 zuL{*OHPj8hEq2J&bKFT|m{nK8Vi3mevh9-fyVuhZMVoTo*aVr#EaxqZwQZO(Lr~=3 z{lYMfgFEhLRM3s+E<0o8@S5%qx>r*a@z^{VjxTL{rJ{Kv)m@PN;>$L*mw9{l%rSYy z=TWK1t+R}JI{eU#bu9cgyT5If{nF{@9xt{AQ?D?1x4*aC^^cD8ieuHZp}X<^>`+;1 zx|)~_jvk431_>Qfd%laOY}N_f$3-#8@`6 z=gzumL8_U5@kwE|_B$RL81*OzC)%ZFKKp65M(bz9TNzENvPi4XSGdpJYtbIF*&6OQ zTiI5Zz1M{zYW&HRTZ`e>sgO7gH7d&P<5?f2r^izqHt2t3r1r92uE|fV5}_Dvuc{+zw`F$dDb18Yvgir zFTK6G)Masmr6~Wl$YFNf)6}`vN9YH#C?ZpN>0NB1JJ{%K*MB|S@now`$sf+bcQ};Lbe?}q|dFy`_I3!p- zHS(x&9scHGX0a0RdrL+QK^PA)XdgW zi}!5WKW$$ayeP%GG%h;49AY!t(s3jI#~^od{B{P;)pI9a#}03pq#A0FmNQqN_B(Hp zJN+VlfLkcXzEgi^+;xe?J;HhQzl272erQ>t*_E^8$9eupZfJbKr*Nllz0=IJ>myvP zH=0*6RJ{)~7m?pJEwkj6er(FPXR2l8iq{SO0rR&0cMWtRB7>KIgnYS`a4Wz$&w<;# z@`pElFqQHCw{pdg7!}rCOmKX&sq*UAvdmX@y%n4Gu;(3X%4m1cq{LaBqSG7O$i*nS za7(1P^LD=i>xVr@tFx`|hmrK~pulz4iJBdpUPm#R<=U9H#q5L}vtb zO-@H~u9Iu;AFq9zeR=Tt;3T!{0FT?-Fv{r*Is!aWnIlmAiK`rzBN ztlhujc3Cv$g+C9;d$e8V%F$q&X0QDlG9ziY>8Iz;n>4I?*$6(WwjbXV{6~Agc=v^> zKU!PR3_q9A?3H*MY4-VFQMH!D(aO9>Yj65mcys!-_=JY7v6@b>xVrdpy{?^o{+(%S z+I-fk<4ej#9trvjxemqtf0CcqOuO8fxXN`^YUjty26f(rr>dUkZ(RPgmD7M#Hv74# z$IF}LiE~evG9G(!D!$%%vfOO8qw@j7^$Byqp1-tALxZgkHi{cMF^&}FuTfc|V`sTD z5aReNX-kKvQvK%I9XewAd7C%N<@MxG)4A;lJ#p!z12R-+&}Y(uVidl_O8{{HF4sbJ#C}{oPGBg|9WaWK%qHvMqIL|T1kbIq3I3U}G92rp}Twkkhuw@kBxy=CXUTz;0?sipHCTo6uka(?nO_|x2%XTFE#dN%9N z|G3M7)oD+g+zfLD4m#6sp7XEy||%Xqj$%#gb_AX{`$8lk3+5s|(xK zSEQ>|ajrA$byV$0Up1$+*!CWFE#EaNd7`7KZfgaft2w_q@J0WpvCeyn)W8|38B9xTn+l*K*F6J}yyoe$_eWn;a}I-%rs| zK#MZ`V=YC7%XFOC*~um_Na(_WSbzKYr2G@Jzm2UgQ9<|CqqF1G_Tww5bX9fV`ozpX z**WHVSH!JRDw8sPvNLp^d+qKMTE(oBy_2fW$t)@iESopOL-3AI?2g}*4TJFm@xN>K 
z#;^Y1J+=oUNbq?t$u-@kK`FIZxFw7p`kzIU#{(x;l#27)nAtLs%=Hy!S{!eDVEJceogQ?4HQ@utoWqbZlG-OXG^6+7pK9 zrxw=q?D&eGbgFfp2UKl4j5Q-@fZrG;ez(xZTw)mJfet;mpr{{MTlc zzG&{JS$mtuLnvMh!Y`MP+>DU1ZIe*Blw3T1=!fX0xu;)##CzI^759~GHvAW@L+>pj zw0%X7`gXkM*3V8~s*Z%*o4!aDHof&%v`rnWOcn2!Do%kHJfEJ){+Xh^ljJRNAV|`W zhhgf}B{_?UwI?2O9baB=P%S2MqRo77uKV1Yp(BTmf2p^y6!)P066LSs{99(yL{xH; z`s~AvCgls{~9!6STe>lL2yOV(bhH=`re&BE*?=mb1v?=eg{FSwgXzkU3ky&|!IRY;)kZs)O_~0DcCf12@`V$<`8DRm3-2QX0&KkG1>A0LXFr%d9-ns5 zg4%8Js(h!{5uHzI-brtixDWVepFVqeXRmcmT}|nQyxd45iN)u%izTJYDqqUtg8#^j zuVLaAnm=AuF%i%E3H-dHWTq#i7tQPR5l=wz-c^>A^zS&tEbqmRXE=l&&M9<#sS z-DfPY^q=0_c-!u-;(t*yB4x{l5jGV{byHhjIfRDGdhL+Li=B zjH85Mx@vF5tu7gF0o^>sa;WjX*}BInd_LR4MbGke9x<)ku=Y4cx)=5>>qO}W#{>4hzWLbli%)v{qpkl>WN;9Q;O@G0*>zvd*nnz<9}$IJ(ev$Y07FECfFqL15Bb2hGcAt*XLnX!Q6u2K% zSTkNw-%bIWZ$Fo6@z01eu!3H*8-`@{*HqJuR2w1q+CTxJen0E6kHrQn^Q#?Tm@zFE z{rbE}<2bxJv~T{XqZ`f2gOgKdnu>fui|3=f!?rEH zN%as+ftV;@&u_T@IzRL$kqrH`e~=t?N)qZKOa>qw$hGILTOA2L$ys zPoV&L^f${*HeNe0?r~Yzpvc z8ms!0xmiKTWpupu!)FHC?Ed{*Jua|2cNgzprfOR&Bdcj=lzeDbmBYUCEw%k@2m@6^ zv}J#qb;*T+F+I-YYu$O?+6Ds5R2SAp{CZioTk+n(UqRc=SvvW|mJT_(JnCIIk=*or z_pdk}0h96C;?kUne2ekToGi(X&!yGoXKCxKcXaskMYauOc8w-wWNo)D=;-%QE6G)( z^=nIP$=KGEA#B7i6f+89Z``Ayq7in^Xa2H1>G_|S zW>JrUo3XyPR4+E~d;i~25E++_JdykVF%K&0diFna)&KoZ&7R(j{y)Q3{9+2-JLT~I zYZUeH_FJKbRLne~aT~x6q5()3mLnP!{-+Ztw)u{ z!Cs-#D=XEBkriPOURB^1m8m5V?(rlE*~zw@7&HQ4s+Rii4vfwIqth>aC`|1JM21%51H z)*nCFzf!S+pJl~=(fyEPQL+OtKXq9x*A@_9jzi0RJa1~F8;TR~BgibAAl4q4n! 
zmf*MCS_F0j-!cf(D_C*!O=%nfp>vf7`=7v4YAr-KT2CR1gg}~Uc)XbczvQ+1+^*@% z=70r4+4sD`!$XjR_J9wFi)V|#DG-0pFVH7C*4JB&*m)4vQp!Au|JPeHIt&s+13YdN z-1gpx#^|5Sz_C~;vy&Udz!30zh|B+gE7YnC;0}Weaz0`F@S#I?@LPd%Pa50#kK zCo$Eo)}EFt0wqJg%o&?HIKT=Iu91ZB&^OSYe)RNzxd4G1pJy@4vQ2^Jj_-Gp5JW2W zV$cuOEhK8{3voeDp~l9k*$o|$2BdFWcI~2v<`a=Xz)=Mztq{(wS5#Ci^IYXXs3GRU zFWYnlS$MGI7yKjIkZ%gS{~-~9GeVZ^K&r$X#HlK73Ze%+m?AJh-~S(e31S;rVF_ME zBA)mH3{9Jqo}A@92`fG5q#kB8WI#GQ1FsY?r?$d)XxLCij(EGm>4FpX<2)FD)74_% z%npd=as>wqKj2IeAAA(*SNEJa04+2=8$Ap{@%^wx8iIjSU>JeIg;Dq#uv)Na@r}DV zbTWkY@B;($#SNZt0)teqe{?i8>;p(L*Wu&zE=zAaD7KpX9$+BXzF6^yv-xMfUv-so!B_KE$^8L}@^ye61 zSAPeVRp7ujmmkGTgRc}NwBeMo@ootFA9g|lv;!Bz_|hefksE|xLzcLGEnSb-2Pw0D z`in;}X9D8@#$tZzGnpGeG=rJmSTKIfZV@f{!=LiG`j9?bQd<0J@0kdd@k6M9p z-uQQ@oE}Q>_?WO_qbb?IdlrkOaIs+{o*HH&bP{JkQj!gOZqjs%GHT}2z40>PV~#iQ z@CQSB5I#552&$3{-w(6;r5DOxAh!tQz`xT2J~i{2=Fytxv_T`b0;!!i=79gC{?|kP zCAM*ZE2|1`+|K!o2Ml;$?^4dQuid>o3++3W4k zhiG-_{9YK`fH08`iU<7Z>UB$k9zc>x%)gOSiSrm(Yp{V^(V-tam`I!gFyrQo)8HSlnrHC!x4v*9(}# z#N+`d148J$pnJhhNDDHnm@&9KbP@M4NX-b*6D|d663=tW zD_=8%43>h~&z>octekWd7rA$tKV( zLa>GW#%p6^^KHgiv%vhkF7)Qw-;cvD`~jP&<#ABHlRz$7EA0aQsL~Imp!|B+qTmR zu3!A1y0oJ_t8mohe+xSZOKAM$G7mo=6p8MuBBi@{n~#y0+oNyy@yHxKk?%M~a*S zvN$1NQfWXis35Kimh8W8Ji!qBLG@17Wh)qskxkr){nmNhZBRJ(p%^T%4U%@kEHdiv2$pW-d5Q67O}$DN>NOk+GhC@ZAh7Q&9u!-5s3))CCA4>g*;h|~C6^a5L9=Ph0NSD_4S zDlixM=Y`ZyoLEEdS%zl%NnFt1aCjHkCIiAf-2NdOA9lR8hg}M;dr+GMw2g*iEdmr} zA&9n>j)ehfBCB{^L z1PfH(t{1g^JYb6B7+lTI&3%drsRLF`I-DZ^m4V`Oz2iTyG2fObnh~8Q$jrgYkBEki)v2JuW9Y0DFlt8B85B2rKPolCHBp@ff$04nN zH>FOKfsjbdNLzRH<2Vy{DPs2sj^ti8B6vgq48vX{I5XQP&aB=BEm0jRoHP*sh^>XP zC~Oyq=_%1LLlhpPbpCG~VrZfJzng^5jz{HiRC*FlWGV0udH}vV1Xgm(F%Q9F-b>yF zk(fZd&Rl9Zo_uqEFRdxw<{)^#C#T?RTWD9;kQENoB*_~;kL@8ZC)CgK!)1N)=xAmAml z(+I*|<8bPPtG@~BrveKGVgrG1whwyxtz=;nS&e;TqUSz#$5xlVo>%I4r-y~H)}7Bb z5_gKNA<%z?VMP`A{~d6vt5_)PwcYr?V$E!WV?;s%iy2h~1=?{o$j1{&0tLAh=4D+V zL|1;4nR!tq_7hey4O<~EK$>y;p0*DGomkd3$NIg%W&bw2$b9oA6!9d$=cuyklf3}Y z=?Zu3fN{7;a%!sQ_cYwvL`bWN*$DqOWmJAemL1-x9U7aP75|!l!(U58s1V=$`|+VB 
z%^Qjn{4fm^Ip;Jq!f|K!!Cudyfg3u0q@7a`3RsjoL}C>SqO!1>Oi^LsS#9k|Fn*yF z3IQ{B=-SblEH0+NF__%HE40#p`6jPSyb20HTyOP+uUq4#T~CwDile6GoV3j)=Ksj; zwr$_u5cqnP242x*a~+tfL?ISqDTVN;TWCqV4PlUWQCYOY0#-%Jh-!SW8$cz?I=*WN z4q?Qw{C&EHKQX;S6&`cGd5MD*HxOx|8s~v`2+Y?ujZ=ve)rDz2ZH8gFq>HO-7Y>BD zUS&!OZ(m`ToE%Z-5dR9S@g|R)tl1&zD|=51R=@w*0-!$J5sNAnPny()7n0sctakUh zHi%uQSY6G@=8ZiUtz4?8_&qT>d7G~JMYE)U4T*GWYN~b#PXlb3ndP==Zed=xZCPaf z<))O~t`R5q+k8)PoqFCq#ZW0BYt6WcUR5D3v|;&d=bm|qmu+t(y2nei|CK@l#ky

a<{+GpU*lV!I1UBB1Fp9efV%M4udw3=hTI7!6Z#gaE)+n%a|%U{m%M`n9Zj;{2gGJ z+rz8sR{T8tU|-Mox`u)?Y=ZV3r@oVa=dF3FEUjkLQ!`LPq?}AJ#`dt6a8fJ3X9Qa% zE@Sc>#^iMA7@#ltBjNiQgoMV zlRnz6<)tP30Xg(hd*hrFFQg_z0X7QrL}=dxK=S*LW_^ZU0IT3_#+ioZXkg<3m5ku@*d1$C;B(QdiA&ek%>zPr-#82f0lb&JTp<7W`) z0j_z2`cSs;`%>0ze?!sxLu7H1hDMja`OkP~b#Ey6pvI7p%}>$VnGbfK=0kGJ<&fB&?qeiRBS zv)0FlQ{PMx>qI#Jei_>M4*D+KQ3kVZr;^oYUXI3HF)mr#f4EJ`p>6b@AWZ+!62KuL z#4}MV(>J1Xb^i(^8dx#I*)(jgd+{Q`q+rSgshEQANwW6H0e9)v6Bcy?gBBcd2hd4_lDx#rXLq z$HvmSX>Ab&^-=btJ=?eK0kk*+2#zg~O~@f=yTc^z5(UJsFEf!uIv+*ZoX^ z`;h8KMqX zlDWuf9v-e=U7qs1rPufZ>&t-5jX>SAKSJRn6g^Phgl47Ws$NNcAF$`a68;S>%`Fve zLE8gU75@(GJs<7Z5U;@>@q2pu>60g)ZIY}2tzL#MEi>ve2H(v%W}-ooa3MLJvvm@y z-p&Hv!pFE^>kdn#aA>WT`h6IphpaASz9#_|XUJ8H??qJN62Aulb zojyw#0mX9ro*Zd;8PATaYLk|dV%@s6 z7PTFFo%(-53|7jpva&Wp{)d>ZVEgz`>B$371v}c=_3PP|&mHgXaE(gK$r(kfiye)s z2t~^1OwfIFHW$6bCOXn3H-xaCd93uBTo@#)tlo2ihNjAot74{Ndvzdi825VD-WB8v9~upilC3^oaQ zI{4pP94-T4YF~xqZ-@t0mo9bnyWQHdDZ$*r!U5tWIyinqTW)q}2R`N^Lgs9!>`F=I zqT##``HNttjRTLi#3Njt^@_E134E-`gTOChgUPnwdPfI`LAWlLm^HItHRM5LEC?-A z^$(0bv8}KY*%N)jz+hM9>WXXWt5+nB+&Vvpy(oj^H6es<2227iX188nRxCh#2OK}B z7)^mP!#*iGZehW>eBz4YC`wr1s$1m3(cor1e=d&YVJldK;l%d{ zrCAef#HmL^<+!aFHzh#qMc&E;OzV){5xMdID4X75joL$~bzS?_(xRi;$mIhDxf89U z0s8>nJLZ~8zq?d?r>?o;)BxVo!)KUkkhrW zNW2dC2iMna^5tl>KJTzvD)z~eJuEqfHr?zX84j@+KsqahPFU4(_?Oni?YxkqS~td5$ow*xKOIp>ZIRJODU&%wMJtcvOet1%J8; zMN3{rMi9~#iL=KX$Bx6orL?4^9hOIMj~Ie;R}-XzpZ`SS$3WdIDro_`fX)f_q4K9# zKTX0gnhV93`iS%jkCjg2_T&BA0ivA0A)WU3?=2t}4*_;TnqSh}86Tm*hvHb<55DC{ zWj@sOrf_@58%@6ir|v^Vo($7!HNAk z8hW(&{pcyKL_O_YC(1_8uuHNIpAcEl9_D*r(2^GL$l@r#NOK5x?;Jd>{w^+FM_pOw z`2E4a{0t||QpqQELrg~)s71ubTIl+gxMqDr!&Wq%*8w-6@JoPs3rc||;0>%n9|{lY zCR=t|ijs&8FuAuownPsx@pZSM>v`K{v@d~Zzue~DYrn*iH~o1b5v!$-pe-~1Iumxs=Z2VM^iTtqU5?>b* zd1UR2nJ@QgYiZdxJ>2i;@~ds1mX<~%4?k1d zbH(nS^Q~LG-G~0PoXC9t3p!A8o3R5@ff&(PSVTX0au^PL==_c%irA@S_+@DFi5Z$50(zQDf4$HfyKz-SOR;4U@;r^ET}0J02pudtwC5E&X? 
zK=x4lvfY(iF`&1_RtXZIW!y^*9|@Bx%dX!`MBO3Z zjK>E=)wL56Oys#y%v#={cC0;URbQykA-58nVgbhrS=maOik0J#Vh`KD>>W)eySs>M zgP2(NlM}NdM4nj+^9Anvq9Q>+xFqm`Pj8oxP^brFc9@$x@uLGDCae+|Ccu$!7NeM! zALGhtSy{tyJt8#={9UngM?FE+4o(E~ml-?$$zy+cd3m`64Dt84=3)N!7G-ceaFaqf z=b})8tL8^oD8s?w0d6aO>!v)NFGCXILfAfaas~P5KaC*_UNFOzgt;azD&WcyK>N|| zU0Rv%VIxZ(NCp*D%B?SF*GV%|~`&t~hqu(9iGa=e zJSl}?(enhE5P0|?eAA-mj=jS>#%vw45cp23pd0vw^)JI1;?tjA^hAex0XBT8*gbj} zksTKt{}Vf!DOYlUsvp}2QUP@W9bEj?EbIU`javfJ*Pae{S<7!dM_zJ`Q~-_d=_C`p zrwjDc)KD|_0CEoq$S&w6~Y!b_A5U!A{k3CRVu3Ui@xS5gXbgogQ zM@!HKq*M>wRs?L^qFyQiaWpFn=%z^LK%zUw2P%kasA8iL1a4Vg%fRDvn{5&MJ~83e zJv?-yBMJpY?4ieZacn}k!;pHf9#pe%Pf$;Xl()gD*To-nmqMU`2qUN}(0*g}w6HyZ zxbrLt)tmDMZXE0;98)n@&KtAF#F`ZjZ=}3zF${i=dyJ{hFpT?mu&|s)6@#dN zrFw(NlHfJM8PP+svwq9Dw=+4*y-H_~uU+Kai z50?2TIs=O;7yruzh$JSzy>my7Cv{BYKOr9tpBzFF0fN+El5-%{p3=}=gH2szvobT| zbNczI9`}TH|6LSRi2aEuJOSeo<^rj08EhUNgkL5ax-~f4Fp_u!JoQNCQUx9nu%*Qh zdbS;KlECawztFOSC;XfHi+vFa$n25A6${*GB;Z)u1lQEEi}H%7cS?ZblTsCh5=yes zV^d|d=tSA+Wbj#Vr~L`KfWI`t-$mrJ8e{>P%{1(UveC>01S z67M=GagX(&H;yuf@58sLXmsCtVDSLS>Ik2sSYKh^?MF&%Angj;+TEucuwi8qD|CfP zd4T0mstKyOsP*o`X9~3^tY1iRG>eBle_5v8Ck5>h@Th^zoC*b?$hc$Fn9-tC!S%*? 
zbOa_|eqoe?38rehX`imJm5>iG$)9f8!d4)X*BkYPQrs4b24yP zZhr6dMb-QbeSFbaBP!$zly4G4IW!>_ zh@t7K`xf45s9Xo=>J$;3-C@bf$=QZ*Je$|Ru~n`UC43uzC7E~MC#ffQWdX$Vbmc|} zwz9Et?<;%D7UBmebP!O^@80=Jq1S%_9CQF6!}v_-!EgWkT*+QAnBxfaQ6M zX8gyG1+s@5W(sZC(Q`^;?II!rsj~NlzX1oRnPgH-0SNvE!5?@9Ij)|vnAnTc4M+L;e>@tGBj>!gTyLLf!YtE2_`!Yp6pbVD4WCT+XWb8^d29@f z+=>;61V!hLQwM8L}|wJ~fFwK+Md{PVb!Hd|?5*H&j5U<>kpk5o!|9)7O8C zDto0WI>!^V3 zfU*z>2RhmxH$F3LLIa2)7pyo1`fiO2;US=vJbEY>g*(>m^`uRd=w^*zwgMv78{lwK zg>gM-l}Y)C(~^V@NhGtQ!+n6SjE-vve{`il|AkC@@)o#3=Hr4o8HgvLogi@-k$`Pi zGuj3_MDz29*4>EfMKtrqokg|HhtTvfS#dj>=PsIMhGGH0cizQK z*HB=@Vv1`H$`bxN0`KF@zPh;IJu4P%`O4#$_=ON_0!MppL&Ak}~0U?eci?*Kd zq3*!BCaw2SxC))`&Xx5BLj*<`KwV8RC`=(yaz~)2FCTQhfPBKhA=OA4W5gyhkih9_ z9*uKt-Ti>>%aPN(foJX@**}G^AB=ilBkXk`5V_{Z;!h6-?v1MgC39eKFsa+)ObDAV ziikuoa##OWeEFhJe76b#SB8RUd$V zVEGdk9sOy{0fE68BL+1iqfq>PwJ^;>#yJBlgR|;IyDa#83*S@YbCCnEOnEm9UVvf| z8!)(S1*gW{qD5ir2e8ucg)1f;Z(}63!ig~ozQkxj*YfZr9s4l|&srV4MY8!xQ(IeX z!3dXdWxjG1?oNO1{d=!wU=V^IVhVV%r|8OdEJK2M(4maih=L{*zI`v z`mRS&5Ec~`fOlqsjZ&=G94W`$LACU#1siuZgH%V58N0u~t(Fj&$utFz$k^Q6@4*8m zOHnq*AMZB8qxz(?vpD<=2Y&u^?;bTk)Pu=c2%aZBDkDN-1V73Lg=-bIUf=uMXcM!H!n_zP;`m3Ze-Ow3&vGsE3_Jw_lW@XG@uBR4}UBW`Q8QL)&} z$Kuv@peIA|*mB^Qy8t0q;Rcp##+5!stMXN?l0SkpY^hjRHQMrI{;YE`aaKYPVh^Iv ziLwT7=scjEeG6uf0&pBft%y`8f~6D@D3{`1oc}4Sa)nG{zTi z@HFkQeS{#eNVMSaien!K_R+?pDY*5uLI&u=mogV(C`mHykIbCj8SYjf%KaeW{ikwh zX$e6)EDHZd{R63Fd7PkNPnMoo1y;#|qB}1A@{x=s7{o7Ohv|X&9!X&+T&RrPN(!0WB}yIOlI0 zVQLRVgaVo?D~=QgCVg;6tLp5`Kj)5BoW`(#)Pa>HEE(^i+}nlJe7-joJp+uBcLK*c zjWN5=>f`8WQ}FJhuyzJJxhI7L1_TD4z;y@5 z<@fQjzr6Ru!n{Zn02b+u(M+7@ii^}t-BzbESu7Uo0b#{sb=irSD59KTJ#-mcUTv=;7{u~m1MJiJ#$g|9;GscP{u*RZOt1FG zNZz1@#~dm-iWkBiwnuRZ3eu4PWoIXX8|RNriAy*OI+Y&MAhRBX=kZ+*X*TpQ=X;Ny zBUW_ag!H_0*bAT3P}~?()VHjXj%izV9hqL$P*?Xx1QK@NEtz!7VvGg)A&hyd+TxQX>-1 zf4d^3diERiEem+CGmtdZc62aezH=K~TM#@6$>=E(J$b*>3>%r#F zIG2$fZ{wt2y%Ea;2jlA3Qf9K-7(MU1vDMPTd;|vR=@}JxE+T4Q2hX1z4iIw5;Xs(h zOMunkUCe%n5hadMKu}OAPwj&8OAtt5oa&FhGoiGUjeO-j=cCtodamx7Kf#1OLG=if z0EsrEN8gMr(~xUYD}oindXf0G2W#ebtagDd 
zW-)(5h*qY&fkhA`uvd#po=Dh9WXwk}1NvcUINhu++ zG&rp%ib|pqWtSl)>v$}!qJ=23WT|7XgluU@i^rafFhglU8_G6esOR&WXWsXEuj_TW zTwTZMEdT%gyYKJ){oV`x9&rf?&d3U4UL|xF&ZuMlSOGMa)hZHuHdCQeiz8XOYLy!U zp|PPJ+S&@}ZGPc>7I!m4rHbO(+bzHZ#R&jk;HF~%3QtFQ%S@=V7r~tjN({;q9tz1) zMNQ2SUq~pY4Qi^QvT{*L$%`@TCnzYlze?`uS@_(j0zf_&$|*gkw%FAJTE%xljJ zSR7S`KNL+6lvs#~gKsS&-{&4IU(kwU3xTc@^Gc~t=B0z_z^w&%ivvvOz+g%T#SB=> zR5ci$BGHC%%j#4x6hZE}bKlF}Jbd&>Oje!7wiL*fB$5Z^4*MZPig*uOCIfT!b6u*8 z-Ax7za8K(hwykg+>4GS{eqEUzd^9|4q}pB47bMYxRnb`^$3c0E7o&k?E6OX=&ecfH zT{*XVekx(Jt|*9z^zu3G8)=cWdGn<@l^*WQovuM&j@&350y9G?l{eOn$I@Aqx%|n~ zr|Kr-`yVsP(_3N#k`b;Dx0Q=s(+_hHZzcLZWVN`yi;VTA)6sa>?T}fsBJnfS{}kW7 zt4WfeqM;GZtYMhYr7Q;uS>lgwB+=H6!ZO)wT@`$+W30@iUq2U#3_(}Q`bqY{+2rv) z%;S80{yXqhDBzLfMq!uWHUd(Ng)7K=M~2DPF5D>4-cJe<@(P;#dXxv>d;aECf>r8x zGjsFdS~9#f;}lg?Ri!MpaddL3X>NXMa;ihw$kS+Hx#G8VHS6A-dz{NGUX4Cu43Sr( zcI=qEW6a*v)EMOBUQkyhqyiEh9r>d)p{2sEY4SkbOz@c=PzT=AQeK~DF018I2zz6YwjuzO8W|133&OKR#1yCy zCT`bZ)^1?S9MoL&=+XIzZQ!{m^!jMDD+zrgbhkXt z`0X!7Rx*up6YZjE_oAsioJ~6zem8VZ)*?H7`H>@HR;ma^RC0o`Hj#$m5jn<={58{p zcshhc#UN)l@v7gxQ5GNg`Sa((d-vWZ>r5mcg-R%GUIi{^eg}B4fu)lI+2~-vm`l( zk=j*17r+A}dnBi%M8ma+SPP{ihNcqR^i|}*f2S?!5xQljgZ3xI*));nQh1ifX2UvJID1f{TML#^x=aQ zp{FV480ddS0IE(K(R4)q^zfk$Wn3c=zi1>76QmE(c2bM2m0;kx0C<*w+cyQ8BnNY6 zh2zD#zuuoZ>+){y<)I-DVSXjRUGgn*O(!$0LnltCA;eM`C8exFvE>AM9&4?Zv0nq` zzbgcm9yM+Nkj=b)`_cL-ndQb~#>|j}k(cG>%V{qjJ;M^ z+uQd7n7M&H1!jQ#0W;7VHg;p!??b$f`^rEyByUUk?4lEHYR+^iwBEk-^d5S8acERh zPHLyDgBZ=RGV&wh4s861+s zoFU>mmGh>^1Ij2o7B)7zw`O%?E^k=DJJ%hVKAu8=5*fsPxam9Tqc0=}|7MS9^l!=q zqp6TYW+@=Kv!3YC_^JyF|t<$>Q-755)n9#536Nn2?q`mvb zoFN!71=W)z^~eUI(GM$A3VWSxeyaIaThjx@tv`%b#)72@KUN__Y{_zZDjvuZ4vQAm{!9JE*jV zKhB-3>c6s=j!rOC{}YzkuW$b{|6^0dgj3(2cT04QA}b9h$mQndjw5lTAs_+LuGzY> zbqU176KdYCe0gO+(@G6$Z4vDGl17)ickYxRMnphqS99j{*kmXyH997yaad^vM-?)& zn5;hu#?R4whty`5skOBx=eOhQCH8$Q4=}U7bW6(}A1^2*ZctY(ZS8kx2vpyy5`Mz$ z_CwK$0(8XbNeJHHfhJ9i43hLtvBpLY_O-Hn$& zLnculV?1=IHfh098hd>%(8a-h)WDbE7Dc*_QBurEsrx8qO_wtu0D?^cW_fMH9+j1M+KcG^g5msra 
zj{dJ~(&*Q12%e2Xc;J-U8-TemADn}}irIz@8-7i%2UX%{dPJv8oW^H4bmfW>*|B5j zY?LOHHMq~tNIa|8t`%4ipyC+5Y*_>rucKD_qx*!(Co9Mo7UW$ej&h$ILyt@}uH2V1 zUW*GSzLtOs4m0pwofN(0i%4&*Z+%%c)_5}jiPLQl@3c_n>FELk35ek-+A?^rD6~TO z%xz#em;?IXhOgegb7w4x4_xXqA49rF#=dV%P&U%k$Y`9&2>K!*m7o%U=d(rO*;NY? z`w$9VK2uaMnD)p52Ij9O#vAAvn28(+SyYom)djtjL^T1HgOuXWD|e#2PF4u^aClf%0u|;8ouCo9 z)l2*%TIMmqznlVJY*N*J(7H7@T4J|sQNxtFQ~4w+l~~~|);|c6fKB3w(cM_S!!o}DjCMU*(qhP?!#H?{b%K+Gj z_()7+(RLEKH;~p|b_wforFU3i8H>i_U@D6%D$aSAG)s2|4zjUvrHhD$B7MCp+yt5@ z-KY{G9SNR0S1DPh$T!Zi;mQGMtDYA~pLSIE_UgaEQe>Z#E{|K3S3ch`tHe3|ZT=hY#~2 zGRL*?`ae*Le50(aV`_-+lqpI$!I~$RQ%!0;J9Sd;@9r%1y?j~yZgH^!FV*4iTf8yi zQIgc|d)Kbr3=0jy3OY0G9}ZT5Vz)&r+rt+w96%*uJW5X&5*fkq$`I0aJoHXgp(tNiUlE1F(xSy9rdBl zp-3ZHZ4}$RPZC|HeMgP@8sTaw6UCqKp0?fkj7fksAPLj(Ij(~%wir0@TiF<{s$%CA z;aj39H=U6f6m+^vkmVgfo+t=%c5w-z?n0n35>q;n4djYqX9JRix68E+v*yl?L6&1b zS!eMSnv^_xDiUo}V&{8$CI)|N&^UKu02sC~}CyDIz7^VG@Z% z>_e8CBs)}xt(*UlcsP1=plw>%dJn7?-+*GtNnme>Doxzv`$U0*85v!m(GjO&;nqpg zWf@0~DC2Hv-(DIX9)&JS_SD1BPy;5JJ){{(QhYY?UDmH3L+v@;#}W8&nih-bt=w63 zNzpw!ltc>X4AW_?ze>YuD9X*OcmXmXxqa+5q|jT26cp8f_pd1XN-xG>27^ z^nq%m;Q8|(%QPy#{UNa)5?p%gA&XOws(rd|A=aUtJf{BdV5+Cox@saxsNEtsj(qH8`BwIQQ%Ye$T{TWd~BoPAm@7^uV_RHiR zBGREWiIpk}eGe}pb^z%}I|};OyLozQa-U)zTBMJ7@a95%{DdfrNu;lw^-Y2m0x3bD zD`Tx89n>C)loqUyKe~Lg;vMy{5u6h;W3}r(oOe2Van7npE#L^ufuuOX4B`X=#%ua? 
z4Q{76Qn=GZ#nhaeJS8x2tfW6>-;Ova* zJz6xztXtwXA=0d=x!K#_Uxair3kB$$$mE>STOnS(0|L~xix=d+@PZ`PJXPv*@l~e{ zGdmzpd2IQs7>*jQhEHXTv65*~axD%Vf>!INPcc-+5KYNdctSLlHQSugiy&aV{rnz{ z`}SBuJ@GxX_q?6{co^=SH&Cq* z+r;iW03o`$dwP~4@J55PSqJ5K<65`w-8WG~10{Heak_+3$rT)za5) z+$KxFRpqM4b2MHT&v~)*UH+uClz_U zaa_pZiqel|E5};yubl2I?Jr+mw&Ln-%!o@}2yFoKE{+^`T&AXcX8hqu5JFY>hY`%A zQ9HF_YyLZWc@qt-@>N`5X~caD3bIaED}Khdu_7)bAr}frxOz^E`20W%@;|uUU|8&Z z3if@*X8Wl_=t9+24XN;WjJGw2drdVE>4B7C-Z?Z^K9lh!jPiAx$$$ z>^}28>8%0w=K1yHl0}~YDZWqUSGF{~YN4DEKNeHgnDlF6qNhU5he36;(`@68%-hD+EXcZASQf|9161Om5%9eYaR&k%qRJ!DJ+XW+bK})8TM-SF(vkKA zKS#_xmX+m>$4Xgg>5)={Q^&I`Lp zPaeK2xNvt`S|JI1Yf`gD(CKL!TrO@TMokLWF@ML+Io+qIu1McDV9=nebMenk{{Fjd z9|_! z+U8)A9dFa%T$OFKVphIf?yTpQJ|gSXsb1HcUL7f45WVu`>E9Ly+9%m>J#16^Q)s9f zf=>-?l^R-`6~R@U7Wnv!=FNL1wpN{Jd{Fpt@H~6<(|b!R?-sSp+eE~TseX9>S2z2W zqe6c9P{c2yDhw z6}<)Sdd@rk{>k>Ahn^`rc|7^=fBy5(k|^=o|MMICP`KDO&RqQV|2}w8Q@MQ3KQH;; r2QOSR-t&K+U>jUL{QtdEY)AKIn=-H6aYaiz@yFd|qH~U8aO8ggnL`D% literal 0 HcmV?d00001 diff --git a/content/english/hpc/data-structures/img/segtree-path.png b/content/english/hpc/data-structures/img/segtree-path.png new file mode 100644 index 0000000000000000000000000000000000000000..e22597253d3787c1847f1f6a56b66a375cc3975b GIT binary patch literal 44722 zcma%j1yt2vw=D{SGy+lz0!m0ocXv0^-Ho)gDBa!N-JR0i-O?c44R4?S_uY8o-Z#c` z42Ezx=eKLExz?I2K-lnBhHo&&Sraf$zcF#5H?8{#?ZT5!$HIX-3 zCX_rL2Nd`Q{G=8Asp?-Z{*E+Ig8Y^rQ-!n#h4(elS=7zV_x1JlVEi72goFgSOhI^k zKk$ZIU&u#J$1;V2aM~yI0HR`I?c?Kg z+SdczrWd>84>ucOJtzXNiGrnjh7gEgzU_>pyo~C5xzy+|I7jrFNU)W#6UiHG9alW= zi{$}YrRW)jHyT?MWM;Oqxste6E6t_{w3Sqk#EI(99!k>($KdA&?AN6XG|uaW|joIcLv?*W#Go z^|kn2zm;MtUK0K~Y1^SQ7hy$^qBpAHIR82;9Ie>FWj;?m)(`u%}L$vWh05A`NxB^P>qQ+SLXv z5;9^&K4`arNRXv)D%e>%&~Psv^o6H+==~icVtpF_FZ!NZJiuM1F)gT=)Cs%BE==H2y!t@#t$f-5S z=Otp$L|G#ec@v4!Q+lI0$4dKq(iS+@s=QQD%vJqGzr_7OLu9HZ$|J)o`_i#C;jtX+ zYx4=!+rc$xzHFPg5(ed{xn^iBp|?6(@YZRh-enpCD}sj-Jvn`MWsJ|SBm+%)eq8U5 zFEqh=M8#ngdaNtvz=uk?(EVAqC65^>rev^M$+a)bu~;hiLf+pVpB|>lTlYaDUJZxX 
z(b>>XrJ&Jox_m``ov;)B&GErhzVI$mbEpu_Mc;0Bl+EH{6$BCSpWv<#T=YnRo&(Ns-@{&n+e|BwuDJn#B zh1XP1e15`6gUvD6Bm^$!LuwhxuWCkj!%#Cly$!m$%Ao_vuL&9QWax3(zQoVV2aE0v zNY7tL^Yx%{R@UwA7VoM~&nH|0C0-L<$V93Vd4I=64Lwyj-il7G@Hv$_sQkc9pTb>a znQ*%(*ldCM1Cg5tNpEn6XM>AhEX`M_{kUapJHTZ_31@{In6T988z9S(hLAdBG5Z+@$|FtPmG#&uB{9hGSX{)m=4z?TQziNbaQb z=GQwbjUlqqFpTNU;kF&w-8V+n1;M|vC`l`v$R{$no~x#3(-q%tC9!lJZGY8c3b)z9 zyWVHa@ydDW(BZD9@FA^W++@szkrVGnx^ys^7fKal%d;cluOQD6y%}xn zhO=varT3w{0(c%6Y7@<~#E!+dy$hSAk_SqBo_z2`-Y@!Sp#df_Fzsw3=@trEhkC9D zBJ}yswHK50?#&i5<ou63q@Kg<1^)SL4uMP2IJqsgRllf}1w zPuJh;$_ABEoQ!=)RXR#yEq}2tquOB`?UxMJ7qahv-G6(tsH5sT?CY`%F+Scxxt6Ao zahBk}YsRPZto(jBSXxWyjdn-I1?(+C`EQZMqKkKVh_>=UBduPYR|J{`R3 zOm0efI3dGgRVq!6>B|h*>Z_$>++V)q5tq|F{_E?yMTS1#r4EDt5TgBzEb#b28K5qm z=x8w-|7J|#%T+2nk3h* zZM!e(;Jq{vd#LR;LrEi===x;`=r-kjuA0eZ@yURT3Q4HHts0JG*x52jzy)ZS<` z627?q1U?WMAWj2fiJXoY1h_yfL7;-*Eciyyi=-c{;h!*LzyvvCK0!eksU&75(RsWO z-@ZOkJw3hh%F2$J87y^m^`W8E7MqT>PCsPyAP`dRAv5b@Xbyy-Cnfdu_kaCsRRlbo z64wZv(~p`OCW@GqT3Xuri5!V{95CSF*#CsGbd?IdU>v%aKK!}|vZ;eBD+J2Q$^@L> zbq-{K_3vV;l_;4XfG2dsSF2Ggmuj@n&c<=MUj5nG5&8c8d+j-eVm@yO9#iELD?5AZ z30G8PB&*{A4j&&M3>=)@?U^AXo>c09EIDRa05T)5+;J0D&!=W-)NRPyN9gG2GNe=3RpB+VghRBt(oI#ejX?}gQ+_}2SL+Q-r@$kKNrI+4tU@TiKsu-qNI0WyN*|3Vh zxCx_1{Wl`Yip$H(t-sSn&R=>51`uRqMvI@@Qd3js78dXb^?+@0{F6HPDix7XQBhIa zT%MnAEH&7%?~r36EG;cnx@fjEH-EFSiI`n)u-~7dPDx!<$dlf37NgCP7aLAwa`1pN z9Ehfqh0JXgGB+G4VPROX_v3qP!NI{(Y`sfM$*{rVp0!ghAY zvs6*hL_*&afh`%`s&8!kD@jqLRFX&qPwM@h1oEtAdU$wvb_s)5FfaxZ$${83r3_kC z;2`3YlHPq@pB;@t#R+>_YIWxr-I`lm+(GkUL`n_WZn8%0QM{*&(GuL?#}k<;g&N_-^j?<&(Cip5WPDHo8_NJgg_qA z0wRt^^LIJ2J!hB72=Sau$HN8ovMh-h3d3yw?jRQz`IQR&UQw{Kc5GjZl%}1-KWn!( zbM}GcLK5OKh+_av;H|ONys)|&Xk&$u6LFYI42uzXq9dH0Ju0v!*czC$Q&et+6 z%{;Q<>FKXp7(wrso1GaJu{xa}@8k*;rwYGH7{>>KnRo%ez{G?=mTb{9%&S*>pUNN( zwi6GkfHEzv6nM_=z_$$YU76{wCx5&~2CyJU%Erbvy48Hz^KPCLZ6H#IzgY{M=Q>k# ze0<+{mWV@aUtn0+EL$Eh2jHm=`!g{)G2ZRjd3kn!r#`2^74Gita>gkXt5jHAD#bct zWPSrlzyB<+qGI$2?VH3}xU!3cN0PJTyB3m+cb~R)4^#BaA 
zKJd?u4ItF4^EXQ6(&XSvtV}4#;4i;WQ-^?&|6b@Ux40_ryr-lrDkiDKW3dX(%OevC z#H5q&;@=gKmq+gn#UCuZe|mcA>h9iZNCuv{izM~U$w?$f9v9+qkTh8E1;)E~61=>; zGuYpCbO=9x{>;7;P4QJcZ??v0I59pk5f4mD3%N#Zc)rTmf#9H{v-4a-I)YfpJ{HD& zEd2#Q3|N1a*BPi_Z4skBet!FIY;bUJ#_AKh_`E^#a0H;|nzDPR)P(#U#)b}nw6 z31Y;BQrI0TxwR|pQqXRLfI)43G^mkmhjM8XJ$uZ%vjP$(7WpF`blCs_|bXQK%d1IrSQj*b@8(a}*lOPq}7 zadmYC8!P!0WHJT0#`dB+z5@S~Q?jSh4S3=9xw%L75S zixLwbzog33UAn45Q&ZD{G)u8aNh-Dvef_O-HC|to0LB;X*5n;qfK>U~Z5}f_^}3zD z!1Lch_7yA$$6u*v>uhU~T>utx*mj}v1zpR_n#Pq%^Oc4nm=$QmWMMO=9v5uf&SwUx zaWD!X*flilxy@*5Xc(CML~|hoOK~~db=ZpHVs|=P8fWXBn2sV{3&{+Z15N}^s>O%R?S>XO62Rq|8D?`X6B-(tzq4gwF)=Y95@mji07j?L=zN|# z@+l%cJ>5d$p&9HrI~Vm^N~P#ofBt5pYBkH5NS81dYXN@#v*u6n0Riw1f2YPx6~tqy zOw;py$m@a65yRny2pgX|Xlt)p1eGNvCd$`YX{KooRh5*KoG~Kv@bGxg z9k$O;My0Ia2b5Z)rSV#bGrAZxpwzDjeyO5i*GD5wPv`YdN}GVgp= z+Y|SJyT89bnZqgc=H|wt3~G3I7=-ptklS!@a1@u6RJrhi`9YY4{ia zzzrL;)pXqYk(PE$XAAMP5;6l25+X>Y5>Kuct>1A3Q(3n?)_$r2l_2u!|Hvx;^>%Nx z|4v@O>+tUi$kQfm9}`MiiuVhrO_rD@G{9Tv4$#RC& zzk8lyZBc3Kq@8zR6l>v0?zCZ_tKJeP@F=X3?EbbPsAo>M>CR*?<72GQO>T(@Oh|Y+ z7JkPYB0$Zec5&c$K4H~dS6U5;aGV2whW=I*qlon?MhJI2nh?iGCDf<0e3c?(vYH0r@G_!Klzg&s|!2TZtUZdEREGGMFwZDD^*Mv`XOFSlfVA>Zr7L6$<2Nz(5l_zdx* z_tZOyN{Hf=6jKF(S>kvWM1oq~=;h8F*z$ZCjCuaQ5U07!*Y#c|x^nETnh}3a>7g}U z;Pkk7ZFpn6TEQdqfn%YFd|&HBoL_)NM|*`EH&J|LO)@Li^#G$d;>ezE;v zvz{vQ>dfSV(Gp)fOk7Jw3hY@~gtvoo6%kKd-f*nOqcCr9`-^TW#NEefq=gceNTOXj^BGU`{OWMdqlYnD}Qd8ykA$p?g-`a3%+r#af|pRvzp6n~UQq z|0W#rWZWc|kg-n5q`i05U z26LjZW!%yGihQw?5j+vkWKqN*kzwEJBS>goPqRU|P#TNAOCBR!?^ zxJUVRrsPsuT6J%W`_x8lFu%l|AL}9ZELDeumO>s3pw36|WnNscv3daC5Wy1Ez8{ZN z3;saPl-b2oF6CFNFPW}o4KO|2F^WBKRR%*!;vO79d;~%lXh8^YaIO=P3S%#`KSdd7 zVZtTl8Q&B(w3<@(Ti4`9yP^A_HRQC8q$#xRG`;zVL#GF``dA+S15PcPWRZ5OPO3Pb z$FI8$-=HcCeh@BndlW9Lr(12}+nD%6srT7VT)~jZEf(}sW8Ti6$eW#}_V;GAi9Y++ z<(Ydnz7O@3%E}Z{lCm%6W^){(){{>V$L%aK%OlHV`!zEjMbu zE|-g<<*T~$=3Zf;|0RW9jJWhQxxm753SqjzJekVO zr^U<9F4r(3G}*fZ^fW?Id;+&!TPU8EJZJS%6A~8V$?SO@C8hW}LM3%v6*6M5Q;mlo z(jlB0!86BH!09JbsEG#bs9^IgmJ8ej@-Bn;RPsL)^_4bA3Z*TB)i%d#**~ih?@yV| 
diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md
index 61d6ee53..182abb59 100644
--- a/content/english/hpc/data-structures/segment-trees.md
+++ b/content/english/hpc/data-structures/segment-trees.md
@@ -27,7 +27,7 @@ int sum(int k); // sum of elements indexed [0, k]
 
 Static tree data structure used for storing information about array segments. Popular in competitive programming, very rarely used in real life. Many different implementations possible, which we will explore in this article.
 
-![](https://i.stack.imgur.com/xeIcl.png)
+![](../img/segtree-path.png)
 
 ## Pointer-Based Implementation
 
@@ -78,6 +78,8 @@ Pointer chasing, 4 unnecessary metadata fields, recursion, branching
 
 Eytzinger-like layout: $2k$ is the left child and $2k+1$ is the right child.
 
+![](../img/segtree-layout.png)
+
 ```c++
 int t[4 * N];
 
@@ -245,6 +247,9 @@ int sum(int k) {
 
 * If $f$ is "remove last bit" (`x -= x & -x`), then both query and update would only require updating $O(\log n)$ different $t$'s
 
+![](../img/fenwick-sum.png)
+![](../img/fenwick-update.png)
+
 ```cpp
 int t[N + 1];
 
@@ -296,6 +301,8 @@ int sum(int k) {
 
 ### Wide Segment Trees
 
+![](../img/segtree-wide.png)
+
 ```c++
 const int b = 4, B = (1 << b);
 
From 8ab6b39765efbcb1689a839a92d03f5c30346122 Mon Sep 17 00:00:00 2001
From: Tony Reksoatmodjo
Date: Thu, 24 Feb 2022 10:14:21 -0800
Subject: [PATCH 244/531] typo fix

---
 content/english/hpc/pipelining/branching.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/content/english/hpc/pipelining/branching.md b/content/english/hpc/pipelining/branching.md
index 706796d0..6168e8a7 100644
--- a/content/english/hpc/pipelining/branching.md
+++ b/content/english/hpc/pipelining/branching.md
@@ -1,6 +1,7 @@
 ---
 title: The Cost of Branching
 weight: 2
+published: true
 ---
 
 When a CPU encounters a conditional jump or [any other type of branching](/hpc/architecture/indirect), it doesn't just sit idle until its condition is computed — instead it starts *speculatively executing* the branch that seems more likely to be taken immediately. During execution the CPU computes statistics about branches taken on each instruction, and after a while and they start to predict them by recognizing common patterns.
@@ -68,7 +69,7 @@ Now, if we benchmark it for different values of `P`, we get an interesting-looki
 
 It's peak is at 50-55%, as expected: branch misprediction is the most expensive thing here. This graph is asymmetrical: it takes just ~1 cycle to only check conditions that are never satisfied (`P = 0`), and ~7 cycles for the sum if the branch is always taken (`P = 100`).
 
-An interesting detail is that this graph is not unimodal: there is another local minimum at around 85-90%. We spend ~6.15 cycles per element there, or about 10-15% faster compared to when we always take the branch, accounting for the fact that we need to perform less additions. Branch misprediction stop affecting performance at this point, because it happens, not the whole instruction buffer is discarded, but only the operations that were speculatively scheduled. That 10-15% mispredict rate is the equilibrium point where we can see far enough in the pipeline not to stall, but save 10-15% on taking the cheaper ">=" branch.
+An interesting detail is that this graph is not unimodal: there is another local minimum at around 85-90%. We spend ~6.15 cycles per element there, or about 10-15% faster compared to when we always take the branch, accounting for the fact that we need to perform less additions. Branch misprediction stops affecting performance at this point, because it happens, not the whole instruction buffer is discarded, but only the operations that were speculatively scheduled. That 10-15% mispredict rate is the equilibrium point where we can see far enough in the pipeline not to stall, but save 10-15% on taking the cheaper ">=" branch.
 
 Note that it costs almost nothing to check for a condition that never or almost never occurs. This is why programmers use runtime exceptions and base case checks so profusely: if they are indeed rare, they don't really cost anything.
 
From b851998e2c2c2f7cba4e356a7969ee886b82cad4 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Thu, 24 Feb 2022 22:21:07 +0300
Subject: [PATCH 245/531] segment tree intro

---
 .../hpc/data-structures/segment-trees.md | 53 +++++++++++++------
 1 file changed, 37 insertions(+), 16 deletions(-)

diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md
index 182abb59..faf04927 100644
--- a/content/english/hpc/data-structures/segment-trees.md
+++ b/content/english/hpc/data-structures/segment-trees.md
@@ -4,32 +4,53 @@ weight: 3
 draft: true
 ---
 
-The lessons we learned from studying layouts for binary search can be applied to broader range of data structures.
+The lessons we learned from [optimizing](../binary-search) [binary search](../s-tree) can be applied to a broader range of data structures.
 
-Most of examples in this section are about optimizing some algorithms that are either included in standard library or take under 10 lines of code to implement naively, but we will start off with a bit more obscure example.
+In this article, instead of trying to optimize something from the STL, we will focus on a *segment tree* — a structure that may be unfamiliar to most *normal* programmers and perhaps even most computer science researchers[^tcs], but is used very extensively in [programming competitions](https://codeforces.com/) for its speed and simplicity of implementation.
 
-There are many things segment trees can do. Persistent structures, computational geometry. But for most of this article, we will focus on the dynamic (as opposed to static) prefix sum problem.
+[^tcs]: Segment trees are rarely mentioned in scientific literature because they are relatively novel (invented around 2000), and *asymptotically* don't do anything that [any other binary tree](https://en.wikipedia.org/wiki/Tree_(data_structure)) can't do, but they are much faster *in practice* for the problems they solve. -Segment tree is a data structure that stores information about array segments. It is a static tree of degree two, and here is what this means: +Segment trees are cool and can do lots of different things, but in this article, we will focus on their simplest non-trivial application — *the dynamic prefix sum problem*: + +```cpp +void add(int k, int x); // execute a[k] = x (0-based indexing) +int sum(int k); // sum of the first k elements (from 0 to k - 1) +``` -Segment trees are used for windowing queries or range queries in general, either by themselves or as part of a larger algorithm. They are very rarely mentioned in scientific literature, because they are relatively novel (invented around 2000), and *asymptotically* they don't do anything that any other binary tree can't, but they are dominant structure in the world of competitive programming because of their performance and ease of implementation. +Note that we have to support two types of queries, which makes this problem multi-dimensional: + +- If we only cared about about the cost of *updating the array*, we would store it as it is and [calculated the sum](/hpc/simd/reduction) directly on each `sum` query. +- And if we only cared about the cost of *prefix sum queries*, we would keep it ready and [re-calculate them entirely from scratch](/hpc/algorithms/prefix) on each update. + +Both of these options perform $O(1)$ work on one query type but $O(n)$ work on the other. Depending on the relative frequencies of the query types, the optimal solution may differ. 
+ + -```cpp -void add(int k, int x); // 0-based indexation -int sum(int k); // sum of elements indexed [0, k] -``` +Segment tree is a data structure that stores information about array segments. It is a static tree of degree two, and here is what this means: + +![A segment tree for sum](../img/segtree-path.png) -Static tree data structure used for storing information about array segments. Popular in competitive programming, very rarely used in real life. Many different implementations possible, which we will explore in this article. +Unlike the previous structures + +Segment trees are built recursively: build a tree for left and right halves and merge results to get root. -![](../img/segtree-path.png) +Many different implementations possible, which we will explore in this article. -## Pointer-Based Implementation +### Pointer-Based Implementation If you were at an "Introduction to OOP" class, you would probably implement a segment tree like this: @@ -74,7 +95,7 @@ It takes 4+4+4+8+8=28 bytes, although they get padded to 32 for [memory alignmen Actually really good in terms of SWE practices, but terrible in terms of performance Pointer chasing, 4 unnecessary metadata fields, recursion, branching -## Implicit Segment Trees +### Implicit Segment Trees Eytzinger-like layout: $2k$ is the left child and $2k+1$ is the right child. 
@@ -146,7 +167,7 @@ int sum(int k) { ![](../img/segtree-iterative.svg) -### Implicit (Bottom-up) +### Bottom-Up Implementation * Different layout: leaf nodes are numbered $n$ to $(2n - 1)$, "parent" is $\lfloor k/2 \rfloor$ * Minimum possible amount of memory @@ -240,7 +261,7 @@ int sum(int k) { ![](../img/segtree-branchless.svg) -## Fenwick trees +### Fenwick trees * Structure used to calculate prefix sums and similar operations * Defined as array $t_i = \sum_{k=f(i)}^i a_k$ where $f$ is any function for which $f(i) \leq i$ From be28b5d6501c3195b210cb24b46498a35990b79e Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 24 Feb 2022 23:59:57 +0300 Subject: [PATCH 246/531] segment tree structure --- .../hpc/data-structures/segment-trees.md | 40 +++++++++++++------ 1 file changed, 28 insertions(+), 12 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index faf04927..645fc7e9 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -13,7 +13,7 @@ In this article, instead of trying to optimize something from the STL, we will f Segment trees are cool and can do lots of different things, but in this article, we will focus on their simplest non-trivial application — *the dynamic prefix sum problem*: ```cpp -void add(int k, int x); // execute a[k] = x (0-based indexing) +void add(int k, int x); // execute a[k] += x (0-based indexing) int sum(int k); // sum of the first k elements (from 0 to k - 1) ``` @@ -22,10 +22,36 @@ Note that we have to support two types of queries, which makes this problem mult - If we only cared about about the cost of *updating the array*, we would store it as it is and [calculated the sum](/hpc/simd/reduction) directly on each `sum` query. 
- And if we only cared about the cost of *prefix sum queries*, we would keep it ready and [re-calculate them entirely from scratch](/hpc/algorithms/prefix) on each update. -Both of these options perform $O(1)$ work on one query type but $O(n)$ work on the other. Depending on the relative frequencies of the query types, the optimal solution may differ. +Both of these options perform $O(1)$ work on one query type but $O(n)$ work on the other. They are only optimal when one type queries is extremely rare. When this is not the case, we can trade off the work on one type of query for increased performance of the other, and segment trees let you do exactly that, achieving the equilibrium of $O(\log n)$ for both queries. + +The main idea is this. Calculate the sum of the entire array put it somewhere. Then split it in halves, calculate the sum on both halves, and also store them somewhere. Then split these halves in halves and so on, until we recursively reach segments of length one. + +These sequence of computations can be represented as a static-structure tree: + +![](../img/segtree-path.png) + +Some nice properties of this construct: + +1. The tree has at most $2n$ vertices: $n$ on the last layer, $\frac{n}{2}$ on the previous, $\frac{n}{4}$ on the one before that, and so on. +2. The height of the tree is $\Theta(\log n)$ as on each "level" the sizes of the segments halves. +3. Each prefix can be split into $O(\log n)$ non-intersecting segments corresponding to vertices of a segment tree: you need at most one from each layer. + +When $n$ is not a perfect power of two, not all levels will be filled entirely. The last layer will be incomplete, but this doesn't take away any of these nice properties that let us solve the problem. + +1. Property 1 guarantees that we will need $O(n)$ space to store the tree +2. **Update** query is processed by adding a value to all vertices that correspond to segments that. Property 1 says there will be at most $O(\log n)$ of them. +3. 
**Prefix sum** query is processed by finding all vertices that compose the prefix and summing the values stored in them. Property 3 says there will also be at most $O(\log n)$ of them. + +This is a general idea. Many different implementations possible, which we will explore one by one in this article. -Segment tree is a data structure that stores information about array segments. It is a static tree of degree two, and here is what this means: - -![A segment tree for sum](../img/segtree-path.png) - -Unlike the previous structures - -Segment trees are built recursively: build a tree for left and right halves and merge results to get root. - -Many different implementations possible, which we will explore in this article. - ### Pointer-Based Implementation If you were at an "Introduction to OOP" class, you would probably implement a segment tree like this: From 863c86a576862e3530da32b4a60ff4c2fe82a177 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 25 Feb 2022 00:29:25 +0300 Subject: [PATCH 247/531] grammar fixes --- content/english/hpc/pipelining/branching.md | 18 +++++++++--------- content/english/hpc/pipelining/branchless.md | 8 ++++---- content/english/hpc/simd/auto-vectorization.md | 2 +- 3 files changed, 14 insertions(+), 14 deletions(-) diff --git a/content/english/hpc/pipelining/branching.md b/content/english/hpc/pipelining/branching.md index 6168e8a7..849e75a0 100644 --- a/content/english/hpc/pipelining/branching.md +++ b/content/english/hpc/pipelining/branching.md @@ -4,7 +4,7 @@ weight: 2 published: true --- -When a CPU encounters a conditional jump or [any other type of branching](/hpc/architecture/indirect), it doesn't just sit idle until its condition is computed — instead it starts *speculatively executing* the branch that seems more likely to be taken immediately. During execution the CPU computes statistics about branches taken on each instruction, and after a while and they start to predict them by recognizing common patterns. 
+When a CPU encounters a conditional jump or [any other type of branching](/hpc/architecture/indirect), it doesn't just sit idle until its condition is computed — instead, it starts *speculatively executing* the branch that seems more likely to be taken immediately. During execution, the CPU computes statistics about branches taken on each instruction, and after some time, they start to predict them by recognizing common patterns. For this reason, the true "cost" of a branch largely depends on how well it can be predicted by the CPU. If it is a pure 50/50 coin toss, you have to suffer a [control hazard](../hazards) and discard the entire pipeline, taking another 15-20 cycles to build up again. And if the branch is always or never taken, you pay almost nothing except checking the condition. @@ -27,7 +27,7 @@ for (int i = 0; i < N; i++) s += a[i]; ``` -We set $N = 10^6$ and run this loop many times over so that cold cache effects doesn't mess up our results. We mark our accumulator variable as `volatile` so that the compiler doesn't vectorize the loop, interleave its iterations, or "cheat" in any other way. +We set $N = 10^6$ and run this loop many times over so that the [cold cache](/hpc/cpu-cache/bandwidth) effect doesn't mess up our results. We mark our accumulator variable as `volatile` so that the compiler doesn't vectorize the loop, interleave its iterations, or "cheat" in any other way. On Clang, this produces assembly that looks like this: @@ -49,13 +49,13 @@ Our goal is to simulate a completely unpredictable branch, and we successfully a - We discard the pipeline, which is 19 cycles deep on Zen 2 (i. e. it has 19 stages, each taking one cycle). - We need a memory fetch and a comparison, which costs ~5 cycles. We can check the conditions of even and odd iterations concurrently, so let's assume we only pay it once per 2 iterations. -- In case of the "<" branch, we need another ~4 cycles to add `a[i]` to a volatile (memory-stored) variable `s`. 
+- In the case of the "<" branch, we need another ~4 cycles to add `a[i]` to a volatile (memory-stored) variable `s`. Therefore, on average, we need to spend $(4 + 5 + 19) / 2 = 14$ cycles per element, matching what we measured. ### Branch Prediction -We can replace the hardcoded 50% with a tweakable parameter `P`, which effectively corresponds to the probability of the "<" branch: +We can replace the hardcoded `50` with a tweakable parameter `P` that effectively sets the probability of the "<" branch: ```c++ for (int i = 0; i < N; i++) @@ -67,15 +67,15 @@ Now, if we benchmark it for different values of `P`, we get an interesting-looki ![](../img/probabilities.svg) -It's peak is at 50-55%, as expected: branch misprediction is the most expensive thing here. This graph is asymmetrical: it takes just ~1 cycle to only check conditions that are never satisfied (`P = 0`), and ~7 cycles for the sum if the branch is always taken (`P = 100`). +Its peak is at 50-55%, as expected: branch misprediction is the most expensive thing here. This graph is asymmetrical: it takes just ~1 cycle to only check conditions that are never satisfied (`P = 0`), and ~7 cycles for the sum if the branch is always taken (`P = 100`). -An interesting detail is that this graph is not unimodal: there is another local minimum at around 85-90%. We spend ~6.15 cycles per element there, or about 10-15% faster compared to when we always take the branch, accounting for the fact that we need to perform less additions. Branch misprediction stops affecting performance at this point, because it happens, not the whole instruction buffer is discarded, but only the operations that were speculatively scheduled. That 10-15% mispredict rate is the equilibrium point where we can see far enough in the pipeline not to stall, but save 10-15% on taking the cheaper ">=" branch. +This graph is not unimodal: there is another local minimum at around 85-90%. 
We spend ~6.15 cycles per element there or about 10-15% faster than when we always take the branch, accounting for the fact that we need to perform fewer additions. Branch misprediction stops affecting the performance at this point because when it happens, not the whole instruction buffer is discarded, but only the operations that were speculatively scheduled. Essentially, that 10-15% mispredict rate is the equilibrium point where we can see far enough in the pipeline not to stall but still save 10-15% on taking the cheaper ">=" branch. Note that it costs almost nothing to check for a condition that never or almost never occurs. This is why programmers use runtime exceptions and base case checks so profusely: if they are indeed rare, they don't really cost anything. ### Pattern Detection -Here, everything that was needed of a branch prediction is a hardware statistics counter: if we went to branch A more often than to branch B, then it makes sense to speculatively execute branch A. But branch predictors on modern CPUs are considerably more advanced than that and can detect much more complicated patterns. +In our example, everything that was needed for efficient branch prediction is a hardware statistics counter. If we historically took branch A more often than branch B, then it makes sense to speculatively execute branch A. But branch predictors on modern CPUs are considerably more advanced than that and can detect much more complicated patterns. Let's fix `P` back at 50, and then sort the array first before the main summation loop: @@ -86,9 +86,9 @@ for (int i = 0; i < N; i++) std::sort(a, a + n); ``` -We are still processing the same elements, but in different order, and instead of 14 cycles, it now runs in a little bit more than 4, which is exactly the average of the cost of the pure "<" and ">=" branches. 
+We are still processing the same elements, but in a different order, and instead of 14 cycles, it now runs in a little bit more than 4, which is exactly the average of the cost of the pure "<" and ">=" branches. -The branch predictor can pick up on much more complicated patterns than just "always left, then always right" or "left-right-left-right". If we just decrease the size of the array $N$ to 1000 (without sorting it), then branch predictor memorizes the entire sequence of comparisons, and the benchmark again measures at around 4 — in fact, even slightly less than in the sorted array, because in the former case branch predictor needs to spend some time flicking between the "always yes" and "always no" states. +The branch predictor can pick up on much more complicated patterns than just "always left, then always right" or "left-right-left-right". If we just decrease the size of the array $N$ to 1000 (without sorting it), then the branch predictor memorizes the entire sequence of comparisons, and the benchmark again measures at around 4 cycles — in fact, even slightly fewer than in the sorted array case, because in the former case branch predictor needs to spend some time flicking between the "always yes" and "always no" states. ### Hinting Likeliness of Branches diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index 025f9171..15adda57 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -27,7 +27,7 @@ for (int i = 0; i < N; i++) s += (a[i] < 50) * a[i]; ``` -Suddenly, the loop now takes ~7 cycles per element, instead of the original ~14. Also, the performance remains constant if we change `50` to some other threshold, so it doesn't depend on the branch probability. +Suddenly, the loop now takes ~7 cycles per element instead of the original ~14. 
Also, the performance remains constant if we change `50` to some other threshold, so it doesn't depend on the branch probability. But wait… shouldn't there still be a branch? How does `(a[i] < 50)` map to assembly? @@ -92,7 +92,7 @@ This way you can eliminate branching, but this comes at the cost of evaluating * ### When It Is Beneficial -Using predication eliminates [a structural hazard](../hazard), but introduces a data hazard. There is still a pipeline stall, but it is a cheaper one: you only need to wait for `cmov` to be resolved, and not flush the entire pipeline in case of a mispredict. +Using predication eliminates [a structural hazard](../hazard) but introduces a data hazard. There is still a pipeline stall, but it is a cheaper one: you only need to wait for `cmov` to be resolved and not flush the entire pipeline in case of a mispredict. However, there are many situations when it is more efficient to leave branchy code as it is. This is the case when the cost of computing *both* branches instead of just *one* outweighs the penalty for the potential branch mispredictions. @@ -100,10 +100,10 @@ In our example, the branchy code wins when the branch can be predicted with a pr ![](../img/branchy-vs-branchless.svg) -This 75% threshold is commonly used by the compilers as a heuristic for determining whether to use the `cmov` or not. Unfortunately, this probability is usually unknown at the compile-time, so it needs to provided in one of several ways: +This 75% threshold is commonly used by the compilers as a heuristic for determining whether to use the `cmov` or not. Unfortunately, this probability is usually unknown at the compile-time, so it needs to be provided in one of several ways: - We can use [profile-guided optimization](/hpc/compilation/pgo) which will decide for itself whether to use predication or not. 
-- We can use [compiler-specific intrinsics](/hpc/compilation/situational) to hint the likeliness of branches: `__builtin_expect_with_probability` in GCC and `__builtin_unpredictable` in Clang. +- We can use [compiler-specific intrinsics](/hpc/compilation/situational) to hint at the likeliness of branches: `__builtin_expect_with_probability` in GCC and `__builtin_unpredictable` in Clang. - We can rewrite branchy code using the ternary operator or various arithmetic tricks, which acts as sort of an implicit contract between programmers and compilers: if the programmer wrote the code this way, then it was probably meant to be branchless. The "right way" is to use branching hints, but unfortunately, the support for them is lacking. Right now [these hints seem to be lost](https://bugs.llvm.org/show_bug.cgi?id=40027) by the time the compiler back-end decides whether a `cmov` is more beneficial. There is [some progress](https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization/6040) towards making it possible, but currently, there is no good way of forcing the compiler to generate branch-free code, so sometimes the best hope is to just write a small snippet in assembly. diff --git a/content/english/hpc/simd/auto-vectorization.md b/content/english/hpc/simd/auto-vectorization.md index 9815cf50..5fc568c3 100644 --- a/content/english/hpc/simd/auto-vectorization.md +++ b/content/english/hpc/simd/auto-vectorization.md @@ -51,4 +51,4 @@ To help the compiler eliminate this corner case, we can use the `alignas` specif --- -There are [many other ways](https://software.intel.com/sites/default/files/m/4/8/8/2/a/31848-CompilerAutovectorizationGuide.pdf) of hinting compiler what we meant exactly, but in especially complex cases — when inside the loop there are a lot of branches or some functions are called — it is easier to go down to the intrinsics level and write it yourself. 
+There are [many other ways](https://software.intel.com/sites/default/files/m/4/8/8/2/a/31848-CompilerAutovectorizationGuide.pdf) of telling the compiler what we meant exactly, but in especially complex cases — when inside the loop there are a lot of branches or some functions are called — it is easier to go down to the intrinsics level and write it yourself. From 3536dd8ff57e89a1305157727f9a387b169dd545 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 25 Feb 2022 00:34:14 +0300 Subject: [PATCH 248/531] fixes in branchless --- content/english/hpc/pipelining/branchless.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index 15adda57..3eb7838f 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -103,7 +103,7 @@ In our example, the branchy code wins when the branch can be predicted with a pr This 75% threshold is commonly used by the compilers as a heuristic for determining whether to use the `cmov` or not. Unfortunately, this probability is usually unknown at the compile-time, so it needs to be provided in one of several ways: - We can use [profile-guided optimization](/hpc/compilation/pgo) which will decide for itself whether to use predication or not. -- We can use [compiler-specific intrinsics](/hpc/compilation/situational) to hint at the likeliness of branches: `__builtin_expect_with_probability` in GCC and `__builtin_unpredictable` in Clang. +- We can use [likeliness attributes](../branching#hinting-likeliness-of-branches) and [compiler-specific intrinsics](/hpc/compilation/situational) to hint at the likeliness of branches: `__builtin_expect_with_probability` in GCC and `__builtin_unpredictable` in Clang. 
- We can rewrite branchy code using the ternary operator or various arithmetic tricks, which acts as sort of an implicit contract between programmers and compilers: if the programmer wrote the code this way, then it was probably meant to be branchless. The "right way" is to use branching hints, but unfortunately, the support for them is lacking. Right now [these hints seem to be lost](https://bugs.llvm.org/show_bug.cgi?id=40027) by the time the compiler back-end decides whether a `cmov` is more beneficial. There is [some progress](https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization/6040) towards making it possible, but currently, there is no good way of forcing the compiler to generate branch-free code, so sometimes the best hope is to just write a small snippet in assembly. @@ -201,7 +201,7 @@ int lower_bound(int x) { Other than being more complex, it has another slight drawback in that it potentially does more comparisons (constant $\lceil \log_2 n \rceil$ instead of either $\lfloor \log_2 n \rfloor$ or $\lceil \log_2 n \rceil$) and can't speculate on future memory reads (which acts as prefetching, so it loses on very large arrays). -In general, data structures are made branchless by implicitly or explicitly *padding* them, so that their operations take a constant number of iterations. Refer to [the article](/hpc/data-structures/binary-search) for more complex examples. +In general, data structures are made branchless by implicitly or explicitly *padding* them so that their operations take a constant number of iterations. Refer to [the article](/hpc/data-structures/binary-search) for more complex examples. + +For the prefix sum query, we can check if the query covers the current segment fully or doesn't cover at all and return the result for the node right away. 
If it is not the case, we can recursively call the query on the children and they will figure it out: + +```c++ +int sum(int k) { + if (rb <= k) + return s; + if (lb >= k) + return 0; + return l->sum(k) + r->sum(k); +} +``` + +Actually really good in terms of SWE practices, but terrible in terms of performance: + ![](../img/segtree-pointers.svg) +This is terrible. It doesn't seem like performance + It takes 4+4+4+8+8=28 bytes, although they get padded to 32 for [memory alignment](/hpc/cpu-cache/alignment) reasons. -Actually really good in terms of SWE practices, but terrible in terms of performance Pointer chasing, 4 unnecessary metadata fields, recursion, branching ### Implicit Segment Trees From 81eb4f769253e68c2f6cf16646255971fab54fd8 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 25 Feb 2022 13:54:19 +0300 Subject: [PATCH 250/531] top-down implicit segment tree --- .../hpc/data-structures/segment-trees.md | 59 +++++++++++++++---- 1 file changed, 47 insertions(+), 12 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index a843cb66..a955af63 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -80,9 +80,9 @@ struct segtree { segtree(int lb, int rb) : lb(lb), rb(rb) { if (lb + 1 < rb) { // if the node is not a leaf, create children - int t = (lb + rb) / 2; - l = new segtree(lb, t); - r = new segtree(t, rb); + int m = (lb + rb) / 2; + l = new segtree(lb, m); + r = new segtree(m, rb); } } @@ -126,7 +126,19 @@ We can do largely the same with the prefix sum query, adding the sum stored in t --> -For the prefix sum query, we can check if the query covers the current segment fully or doesn't cover at all and return the result for the node right away. 
If it is not the case, we can recursively call the query on the children and they will figure it out: +To calculate the sum on a segment, we can check if the query covers the current segment fully or doesn't cover at all and return the result for the node right away. If it is not the case, we can recursively call the query on the children and they will figure it out: + +```c++ +int sum(int lq, int rq) { + if (rb <= lq && rb <= rq) // if we are fully inside, return the sum + return s; + if (rq <= lb || lq >= rb) // if we don't intersect, return zero + return 0; + return l->sum(k) + r->sum(k); +} +``` + +For the prefix sum query, since the left border is always zero, these checks simplify: ```c++ int sum(int k) { @@ -138,25 +150,34 @@ int sum(int k) { } ``` -Actually really good in terms of SWE practices, but terrible in terms of performance: +Since we have two types of queries, we also got two separate graphs to look at: ![](../img/segtree-pointers.svg) -This is terrible. It doesn't seem like performance +While this object-oriented implementation is quite good in terms of software engineering practices, there are several aspects that make it terrible in terms of performance: -It takes 4+4+4+8+8=28 bytes, although they get padded to 32 for [memory alignment](/hpc/cpu-cache/alignment) reasons. +- Query implementations use [recursion](/hpc/architecture/functions), although the `add` query can be tail-call optimized. +- Query implementations use unpredictable [branching](/hpc/pipelining/branching), stalling the CPU pipeline. +- The nodes stores extra metadata. The structure takes $4+4+4+8+8=28$ bytes and gets padded to 32 bytes for [memory alignment](/hpc/cpu-cache/alignment) reasons, while only 4 bytes are necessary to hold the integer sum. 
+- And, most importantly, we are doing [pointer chasing](/hpc/cpu-cache/latency) in both queries: we can't descend into children until we fetched their pointers, even though we can precisely infer the segments we need just from the query bounds. -Pointer chasing, 4 unnecessary metadata fields, recursion, branching +The last issue is the most critical one. To get rid of pointer chasing, we need to get rid of pointers, converting our structure to being implicit. ### Implicit Segment Trees -Eytzinger-like layout: $2k$ is the left child and $2k+1$ is the right child. +To store our segment tree implicitly, we can also use the [Eytzinger layout](../binary-search#eytzinger-layout), storing the nodes in a large array, where for every non-leaf node $v$ corresponding to the range $[l, r)$, the node $2v$ is its left child and the node $(2v+1)$ is its right child, corresponding to the ranges $[l, \lfloor \frac{l+r}{2} \rfloor)$ and $[\lfloor \frac{l+r}{2} \rfloor, r)$ respectively. + +![The memory layout of implicit segment tree with the same query path highlighted](../img/segtree-layout.png) -![](../img/segtree-layout.png) +One little problem with this layout is that if $n$ is not a perfect power of two, we would need more array cells to store the tree — $4n$, to be exact. The tree structure hasn't change, and there are still exactly $(2n - 1)$ nodes in the tree — they are just not compactly packed on the last layer. ```c++ int t[4 * N]; +``` + +To implement `add`, we similarly implement a recursive function that uses this index arithmetic instead of pointers. Since we also don't store the borders of the segment, we need to pass them as parameters. 
This makes the function a bit clumsy, as there are now five parameters in total to pass around: +```c++ void add(int k, int x, int v = 1, int l = 0, int r = N) { t[v] += x; if (l + 1 < r) { @@ -167,7 +188,11 @@ void add(int k, int x, int v = 1, int l = 0, int r = N) { add(k, x, 2 * v + 1, m, r); } } +``` + +To implement the prefix sum query, we do largely the same: +```c++ int sum(int k, int v = 1, int l = 0, int r = N) { if (l >= k) return 0; @@ -179,11 +204,13 @@ int sum(int k, int v = 1, int l = 0, int r = N) { } ``` +Apart from using much less memory, the main advantage is that we can now make use of [memory parallelism](/hpc/cpu-cache/mlp) and fetch the nodes we need in parallel, considerably improving the running time for both queries: + ![](../img/segtree-topdown.svg) -Still have wasted memory. +To improve further, we can manually optimize the index arithmetic and replace division by two with an explicit binary shift (something compilers [aren't always able](/hpc/compilation/contracts/#arithmetic) to do themselves) and, more importantly, remove the recursion and make the implementation iterative. -### Iterative Implementation +Here is what a fully iterative `add` looks like: ```c++ void add(int k, int x) { @@ -199,7 +226,11 @@ void add(int k, int x) { } t[v] += x; } +``` + +This is slightly harder to do for the `sum` query as it has two recursive calls.
The trick is to notice that when we make these calls, one of them is guaranteed to terminate immediately, so we can simply check this condition as we descend: +```c++ int sum(int k) { int v = 1, l = 0, r = N, s = 0; while (true) { @@ -218,8 +249,12 @@ int sum(int k) { } ``` +This doesn't improve the performance of the update query by much because it was tail-recursive, and the compiler had already performed a similar optimization, but the running time of the prefix sum query roughly halved for all problem sizes: + ![](../img/segtree-iterative.svg) +This implementation still has some problems: we are potentially using twice as much memory as necessary, and we still have costly branching. To get rid of these problems, we need to change the approach a little bit. + ### Bottom-Up Implementation * Different layout: leaf nodes are numbered $n$ to $(2n - 1)$, "parent" is $\lfloor k/2 \rfloor$ From ea0e76cf8c03a82c7d8a8f8be0c44b38cef8bad8 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 25 Feb 2022 15:22:21 +0300 Subject: [PATCH 251/531] bottom-up segment tree --- .../data-structures/img/segtree-ranges.png | Bin 0 -> 6145 bytes .../hpc/data-structures/segment-trees.md | 68 ++++++++++++------ 2 files changed, 45 insertions(+), 23 deletions(-) create mode 100644 content/english/hpc/data-structures/img/segtree-ranges.png diff --git a/content/english/hpc/data-structures/img/segtree-ranges.png b/content/english/hpc/data-structures/img/segtree-ranges.png new file mode 100644 index 0000000000000000000000000000000000000000..c3f8d6aac1e668d32208b15546c4e4916140a2a2 GIT binary patch literal 6145
>= 1; } } - -int sum(int k) { - int res = 0; - k += N - 1; - while (k != 0) { - if (~k & 1) - res += t[k--]; - k = k >> 1; - } - return res; -} ```
-![](../img/segtree-bottomup.svg) - -### Arbitrarily-Sized Arrays +To fix this, we can similarly calculate the sum of a segment in general. For that, we need to maintain two pointers to the first and the last node to be summed, and stop when the segment between them becomes empty: ```c++ -int sum(int r) { +int sum(int l, int r) { + l += N; r += N - 1; - int l = N, s = 0; + int s = 0; while (l <= r) { if ( l & 1) s += t[l++]; if (~r & 1) s += t[r--]; @@ -301,7 +291,32 @@ int sum(int r) { } ``` -Magically, it just works +This results in much simpler and faster code. However, when the array size is not a power of two, the `sum` query doesn't work correctly. To understand why, consider the tree structure for 13 elements: + +![The nodes comprising the first 7 elements are highlighted in bold](../img/segtree-ranges.png) + +The first index of the last layer is always a power of two, but when $n$ is not a power of two, some prefix of the leaf elements gets wrapped around to the right side of the tree. + +Magically, this works even for non-power-of-two array sizes and for queries where the left boundary is to the right of the right one, because the left pointer will at some point "wrap around", and when this happens, the `l <= r` condition becomes false. + +![](../img/segtree-bottomup.svg) + +Now, since we are only interested in the prefix sum, we'd want to get rid of maintaining `l` and only move the right border, like this: + +```c++ +int sum(int k) { + int res = 0; + k += N - 1; + while (k != 0) { + if (~k & 1) + res += t[k--]; + k = k >> 1; + } + return res; +} +``` + +It works when $n$ is a power of two, but fails for all other array sizes. To make it work for arbitrary array sizes, we can use the following trick: just permute the leaves so that they go in the right order, even though they span two layers.
This can be done like this: ```c++ const int last_layer = 1 << __lg(2 * N - 1); @@ -311,6 +326,9 @@ int leaf(int k) { k -= (k >= 2 * N) * N; return k; } +``` + +Now, when implementing the queries, all we need to do is to call the `leaf` function: ```c++ void add(int k, int x) { @@ -333,22 +351,26 @@ int sum(int k) { } ``` -Branchless +The last touch: by replacing the `s += t[k--]` line with predication, we can now make the implementation branchless (except for the last branch, where we check the loop condition): ```c++ int sum(int k) { k = leaf(k - 1); int s = 0; while (k != 0) { - s += ((k & 1) == 0) * t[k]; // simplify? + s += (~k & 1) ? t[k] : 0; k = (k - 1) >> 1; } return s; } ``` +Combined, these optimizations make the prefix sum queries run much faster: + ![](../img/segtree-branchless.svg) +Notice that the bump in latency for the prefix sum query starts at $2^{19}$ and not at $2^{20}$, where we run out of the L3 cache. This is because we are still storing $2n$ integers, and also fetching the `t[k]` element regardless of whether we will add it to `s` or not. We can actually solve both of these problems. 
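
Before moving on, the general bottom-up range query can be exercised in isolation. The following is a minimal sketch, not the article's benchmarked code: the `BottomUpSegtree` struct, its runtime size `n`, and the use of `std::vector` are assumptions made for illustration. It uses the same layout (leaves at $n$ to $2n - 1$, parent at $\lfloor k/2 \rfloor$) and the same two-pointer loop as the `sum(l, r)` query above, and it should return correct sums on half-open ranges $[l, r)$ even when $n$ is not a power of two:

```c++
#include <cassert>
#include <vector>

// Bottom-up segment tree over n elements (hypothetical helper type).
// Leaves live at indices n..2n-1, and the parent of node k is k / 2.
struct BottomUpSegtree {
    int n;
    std::vector<int> t; // t[0] is unused

    explicit BottomUpSegtree(int n) : n(n), t(2 * n, 0) {}

    // a[k] += x: update the leaf and every ancestor up to the root (node 1)
    void add(int k, int x) {
        for (k += n; k != 0; k >>= 1)
            t[k] += x;
    }

    // sum of a[l], ..., a[r - 1]
    int sum(int l, int r) const {
        l += n;     // first node to be summed
        r += n - 1; // last node to be summed (inclusive)
        int s = 0;
        while (l <= r) {
            if ( l & 1) s += t[l++]; // l is a right child: take it and step right
            if (~r & 1) s += t[r--]; // r is a left child: take it and step left
            l >>= 1;
            r >>= 1;
        }
        return s;
    }
};
```

For example, after `BottomUpSegtree st(13)` and `st.add(4, 100)`, the query `st.sum(0, 13)` includes the added value, while `st.sum(5, 13)` is unaffected by it.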
+ ### Fenwick trees * Structure used to calculate prefix sums and similar operations From dd54c43e3ed6ea7790411ba26c2a3c4390f8d994 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 25 Feb 2022 17:05:18 +0300 Subject: [PATCH 252/531] fenwick trees --- .../hpc/data-structures/segment-trees.md | 64 ++++++++++++++----- 1 file changed, 49 insertions(+), 15 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index 57858ae2..882e2436 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -10,6 +10,8 @@ In this article, instead of trying to optimize something from the STL, we will f [^tcs]: Segment trees are rarely mentioned in scientific literature because they are relatively novel (invented around 2000), and *asymptotically* don't do anything that [any other binary tree](https://en.wikipedia.org/wiki/Tree_(data_structure)) can't do, but they are much faster *in practice* for the problems they solve. +This is a long article, and to make it less long, we will mostly be focusing on its simplest application + Segment trees are cool and can do lots of different things, but in this article, we will focus on their simplest non-trivial application — *the dynamic prefix sum problem*: ```cpp @@ -373,40 +375,66 @@ Notice that the bump in latency for the prefix sum query starts at $2^{19}$ and ### Fenwick trees -* Structure used to calculate prefix sums and similar operations -* Defined as array $t_i = \sum_{k=f(i)}^i a_k$ where $f$ is any function for which $f(i) \leq i$ -* If $f$ is "remove last bit" (`x -= x & -x`), - then both query and update would only require updating $O(\log n)$ different $t$'s +Implicit structures are great. They allow us to avoid pointer chasing and visit all the nodes relevant for a query in parallel. What is even better is *succinct* structures. 
In addition to not storing pointers or any other metadata, they also use the theoretically minimal amount of memory to store the structure, with perhaps only $O(1)$ extra fields. -![](../img/fenwick-sum.png) -![](../img/fenwick-update.png) +To make a segment tree succinct, we need to look at the values stored in the nodes, search for redundancies (values that can be inferred from other nodes), and remove them. For any node $p$, its sum $s_p$ equals the sum $(s_l + s_r)$ of the values stored in its child nodes. Therefore, for any such "triangle" of nodes, we only need to store any two of $s_p$, $s_l$, and $s_r$, and we can restore the third one from the $s_p = s_l + s_r$ identity. -```cpp +Note that in every implementation so far, we never added the sum stored in the right child when computing the prefix sum. The *Fenwick tree* is a type of segment tree that uses this observation and gets rid of all *right* children, including the last layer. This makes the total required number of memory cells $n + O(1)$, the same as the underlying array. + +To calculate a prefix sum, we need to repeatedly jump to the first parent that is a left child: + +![A path for the sum query](../img/fenwick-sum.png) + +To process an update query, we need to repeatedly add the delta to the first parent that contains the cell $k$: + +![A path for the update query](../img/fenwick-update.png) + +More formally, a Fenwick tree is defined as the array $t_i = \sum_{k=f(i)}^i a_k$ where $f$ is some function for which $f(i) \leq i$. If $f$ is the "remove last bit" function (`x -= x & -x`), then both the query and the update only need to touch $O(\log n)$ different $t$'s. + +```c++ int t[N + 1]; +``` -void add(int k, int x) { - for (k += 1; k <= N; k += k & -k) - t[k] += x; -} +Now, instead of making it actually equivalent to a segment tree, we will make all sizes a power of two and maintain a *forest* of trees. In a sense, we maintain $O(\log n)$ different trees. +Now, the tricky part is how to do it *fast*.
If the array size is a perfect power of two, we have a trick. Notice that what left children have in common is that their indices are even. If a node is a deep interior node, its index ends with a lot of zeros in its binary representation. We can therefore just clear the lowest set bit of the index, which can be done with `k &= k - 1`: + +```c++ int sum(int k) { int res = 0; - for (; k != 0; k &= k - 1) // k -= k & -k + for (; k != 0; k &= k - 1) res += t[k]; return res; } ``` +Now that we have defined $f$, on update, we need to identify the nodes that contain the element being updated. Since the $f$ function removes the last set bit, these have to be the nodes that have the same number of zeros at the end, some of the same prefix, and some number of ones in the middle that will be cancelled to produce a number that is lower than the original one. All such numbers can be generated by repeatedly adding the last set bit to the index, which trims zeros: + +```c++ +void add(int k, int x) { + for (k += 1; k <= N; k += k & -k) + t[k] += x; +} +``` + +Sometimes people use `k -= k & -k` to iterate when processing the `sum` query, which makes this implementation delightfully symmetric. + +This is a structure where it is easier to calculate the sum on a subsegment as the difference of two prefix sums: + ```c++ -// how you can use it to calculate sums on subsegments: +// [l, r) int sum (int l, int r) { - return sum(r) - sum(l-1); + return sum(r) - sum(l); } ``` +The performance of the Fenwick tree is similar to the optimized bottom-up segment tree: + ![](../img/segtree-fenwick.svg) -Can't be more optimal because of pipelining and implicit prefetching +There is, however, one weird thing. The latency rises sharply close to the L3 boundary. This is a [cache associativity](/hpc/cpu-cache/associativity) effect: the most frequently used cells all have their index divisible by large powers of two and get aliased to the same cache set, kicking each other out.
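
To see the aliasing concretely, here is a small sketch that computes which cache set a cell of `t` would map to. The cache geometry (64-byte lines, 1024 sets), the 4-byte element size, and the `cache_set` helper are all assumptions chosen for illustration rather than the parameters of any particular CPU, and the array is assumed to start at a set-aligned address:

```c++
#include <cassert>
#include <cstddef>

// Hypothetical cache geometry, chosen only for illustration.
constexpr std::size_t ELEM_BYTES = 4;    // size of one tree cell
constexpr std::size_t LINE_BYTES = 64;   // bytes per cache line
constexpr std::size_t NUM_SETS   = 1024; // number of cache sets

// Index of the cache set that cell t[k] maps to,
// assuming t itself starts at a set-aligned address.
constexpr std::size_t cache_set(std::size_t k) {
    return (k * ELEM_BYTES / LINE_BYTES) % NUM_SETS;
}
```

Under these assumptions, every index that is a multiple of $2^{14}$ lands in set 0, so the most frequently visited cells of a large Fenwick tree compete for the few ways of a single set.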
+ +One way to mitigate this is to insert "holes" in the layout like this: ```c++ inline constexpr int hole(int k) { @@ -428,8 +456,14 @@ int sum(int k) { } ``` +As computing the `hole` function is not on the critical path between iterations, it does not introduce any significant overhead, but completely removes the cache associativity problem: + ![](../img/segtree-fenwick-holes.svg) +There are still other minor issues with Fenwick trees. Similar to [binary search](../binary-search), the temporal locality of its memory accesses is not great, as rarely accessed elements are grouped with the most frequently accessed ones. It also has to perform end-of-loop checks and executes a non-constant number of iterations, likely causing a branch mispredict, although just a single one. + +But we are going to leave it there and focus on an entirely different approach. If you know [S+ trees](../s-tree), you've probably guessed where this is going. + ### Wide Segment Trees ![](../img/segtree-wide.png) From 33c204434b7b0ef0698d2fca5f8fceef6a6dc097 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 25 Feb 2022 17:40:05 +0300 Subject: [PATCH 253/531] wide segment trees --- .../hpc/data-structures/segment-trees.md | 63 ++++++++++++++++--- 1 file changed, 55 insertions(+), 8 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index 882e2436..ff46b8ef 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -456,18 +456,22 @@ int sum(int k) { } ``` -As computing the `hole` function is not on the critical path between iterations, it does not introduce any significant overhead, but completely removes the cache associativity problem: +As computing the `hole` function is not on the critical path between iterations, it does not introduce any significant overhead, but completely removes the cache associativity problem and shrinks the
latency by ~3x on large arrays: ![](../img/segtree-fenwick-holes.svg) There are still other minor issues with Fenwick trees. Similar to [binary search](../binary-search), the temporal locality of its memory accesses is not great, as rarely accessed elements are grouped with the most frequently accessed ones. It also has to perform end-of-loop checks and executes a non-constant number of iterations, likely causing a branch mispredict, although just a single one. -But we are going to leave it there and focus on an entirely different approach. If you know [S+ trees](../s-tree), you've probably guessed where this is going. +But we are going to leave it there and focus on an entirely different approach. If you know [S-trees](../s-tree), you've probably guessed where this is going. ### Wide Segment Trees +Here is the idea: if we are fetching a full cache line anyway, let's fill it with information that lets us process the query quicker. So let's store more than one data point in a segment tree node, which lets us reduce the tree height and do fewer iterations when descending it. + ![](../img/segtree-wide.png) +We can use a constexpr-based approach similar to the one we used in [S+ trees](../s-tree#implicit-b-tree-1) to implement it: + ```c++ const int b = 4, B = (1 << b); @@ -488,6 +492,19 @@ constexpr int H = height(N); alignas(64) int t[offset(H)]; ``` +We effectively reduce the height of the tree by a factor of $\frac{\log_2 n}{\log_B n} = \log_2 B$, but implementing the in-node operations efficiently may be tricky. + +In this context, we have two options: + +1. We could store $B$ sums in each node. +2. We could store $B$ prefix sums in each node. + +If we go with option 1, the `add` query would be largely the same, but the `sum` query would need to add up to $B$ scalars in each node. If we go with option 2, the `sum` query would be trivial, but the `add` query would need to add the delta to some suffix of each node.
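
The trade-off between the two options can be sketched on a single node. This is an illustration only: the `SumNode` and `PrefixNode` names are hypothetical, the code is scalar, and the prefix sums are assumed to be exclusive (slot $i$ holds the sum of the values before position $i$), so a prefix query inside the node is only defined for $k < B$:

```c++
#include <cassert>

constexpr int B = 16; // node width (1 << b with b = 4)

// Option 1: store the B values themselves.
struct SumNode {
    int v[B] = {};

    void add(int k, int x) { v[k] += x; } // O(1): touch one scalar

    int sum(int k) const {                // O(B): add up the first k values
        int s = 0;
        for (int i = 0; i < k; i++)
            s += v[i];
        return s;
    }
};

// Option 2: store exclusive prefix sums, p[i] = v[0] + ... + v[i - 1].
struct PrefixNode {
    int p[B] = {};

    void add(int k, int x) {              // O(B): update the suffix after k
        for (int i = k + 1; i < B; i++)
            p[i] += x;
    }

    int sum(int k) const { return p[k]; } // O(1): a single lookup
};
```

Both nodes answer the same prefix queries after the same updates; they only differ in which of the two operations pays the $O(B)$ cost.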
+ +In either case, one query will perform $O(\log_B n)$ operations, touching one scalar in each node, while the other will perform $O(B \cdot \log_B n)$ operations. However, we can use [SIMD](/hpc/simd) to accelerate the slower operation. Since there are no fast [horizontal reductions](/hpc/simd/reduction), but it is easy to add a vector to a vector, we will stick to the second approach and store prefix sums in each node. + +This makes the `sum` query very easy: + ```c++ int sum(int k) { int res = 0; @@ -497,6 +514,8 @@ int sum(int k) { } ``` +For the `add` query, however, we need a trick. We only need to add a number to a suffix of a node, so we need a mask that tells us which elements to update and which to leave alone. We can pre-calculate such a $B \times B$ mask just once, telling us for each starting position whether each element takes part in the operation or not: + ```c++ struct Precalc { alignas(64) int mask[B][B]; @@ -509,25 +528,31 @@ struct Precalc { }; constexpr Precalc T; +``` +We then use these masks to bitwise-and the broadcast delta value and add it to the values stored at the node: + +```c++ typedef int vec __attribute__ (( vector_size(32) )); constexpr int round(int k) { return k & ~(B - 1); // = k / B * B } -void add(int k, int _x) { - vec x = _x + vec{}; +void add(int k, int x) { + vec v = x + vec{}; for (int h = 0; h < H; h++) { - auto l = (vec*) &t[offset(h) + round(k)]; + auto a = (vec*) &t[offset(h) + round(k)]; auto m = (vec*) T.mask[k % B]; for (int i = 0; i < B / 8; i++) - l[i] += x & m[i]; + a[i] += v & m[i]; k >>= b; } } ``` +This speeds up the `sum` query by more than 10x and the `add` query by up to 4x compared to the Fenwick tree: + ![](../img/segtree-simd.svg) Wide Fenwick trees make little sense. The speed of Fenwick trees comes from rapidly iterating over just the elements we need.
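
The masking trick used by `add` can also be illustrated without vector extensions. The sketch below is a scalar stand-in for one iteration of the `add` loop above: the `Masks` struct, the `add_to_node` helper, and the choice `mask[k][i] = (i > k ? -1 : 0)` are assumptions made for illustration (the last one corresponds to exclusive prefix sums), not necessarily the article's exact code:

```c++
#include <cassert>

constexpr int B = 16; // node width

// mask[k][i] is all ones (-1) if slot i must receive the delta when
// element k of the node is updated, and zero otherwise.
struct Masks {
    int mask[B][B];

    constexpr Masks() : mask{} {
        for (int k = 0; k < B; k++)
            for (int i = 0; i < B; i++)
                mask[k][i] = (i > k ? -1 : 0);
    }
};

constexpr Masks M{};

// Branch-free update of one node that stores exclusive prefix sums,
// p[i] = v[0] + ... + v[i - 1]: only the slots after k change.
void add_to_node(int (&p)[B], int k, int x) {
    for (int i = 0; i < B; i++)
        p[i] += x & M.mask[k][i]; // x & -1 == x, x & 0 == 0
}
```

Since the loop body has no data-dependent branch and a fixed trip count, a compiler may be able to vectorize it on its own; the vector-typed version above simply makes that explicit.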
@@ -536,14 +561,36 @@ Unlike [S-trees](../s-tree), you can easily change block size: ![](../img/segtree-simd-others.svg) +As expected, when we increase the node size, the update time also increases, as we need to fetch and process more cache lines, but the `sum` query time decreases, as the height of the tree becomes smaller. + +As with [S+ trees](../s-tree/#modifications-and-further-optimizations), the ideal layout (the node sizes on each layer) may depend on the use case. + ### Comparison +The wide segment tree is significantly faster than the popular segment tree implementations: + ![](../img/segtree-popular.svg) +It makes sense to look at the relative speedup: + ![](../img/segtree-popular-relative.svg) -### Acknowledgements +The wide segment tree is up to 200 and 40 times faster than the pointer-based segment tree for the prefix sum and update queries respectively, although for sufficiently large arrays, memory efficiency becomes the only concern, and this speedup goes down to 60 and 15 respectively. + +### Modifications -"[Efficient and easy segment trees](https://codeforces.com/blog/entry/18051)" by Oleksandr Bacherikov +We mostly focused on the prefix sum problem, but this general structure can be used for other problems handled by segment trees: + +- General sums and other reductions. +- Range minimum queries. +- Fixed-universe heaps. + +Some more exotic applications that rely on pointers are, as expected, harder. To implement dynamic trees, we could store the mapping between the node number and the tree in a hash table. For more complicated cases, such as persistent trees, whether wide segment trees can help is an open question. + +why b-ary Fenwick tree is not a good idea + +### Acknowledgements This article is loosely based on "[Practical Trade-Offs for the Prefix-Sum Problem](https://arxiv.org/pdf/2006.14552.pdf)" by Giulio Ermanno Pibiri and Rossano Venturini.
It has more detailed discussions, as well as some intermediate structures we've skipped here, such as a branchless top-down segment tree, and explains why the b-ary Fenwick tree is not a good idea. + +Some code was borrowed from "[Efficient and easy segment trees](https://codeforces.com/blog/entry/18051)" by Oleksandr Bacherikov. From 5146998e09479bc2e84aa045bf15e6458a22fbb7 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 26 Feb 2022 22:49:12 +0300 Subject: [PATCH 254/531] svg figures for segment tree --- .../data-structures/img/src/fenwick-sum.svg | 3 + .../img/src/fenwick-update.svg | 3 + .../img/src/segtree-layout.svg | 3 + .../img/src/segtree-permuted.svg | 1803 ++++++++++++++ .../img/src/segtree-succinct.svg | 2102 +++++++++++++++++ .../data-structures/img/src/segtree-wide.svg | 3 + .../hpc/data-structures/img/src/segtree.svg | 3 + .../hpc/data-structures/segment-trees.md | 2 + 8 files changed, 3922 insertions(+) create mode 100644 content/english/hpc/data-structures/img/src/fenwick-sum.svg create mode 100644 content/english/hpc/data-structures/img/src/fenwick-update.svg create mode 100644 content/english/hpc/data-structures/img/src/segtree-layout.svg create mode 100644 content/english/hpc/data-structures/img/src/segtree-permuted.svg create mode 100644 content/english/hpc/data-structures/img/src/segtree-succinct.svg create mode 100644 content/english/hpc/data-structures/img/src/segtree-wide.svg create mode 100644 content/english/hpc/data-structures/img/src/segtree.svg diff --git a/content/english/hpc/data-structures/img/src/fenwick-sum.svg b/content/english/hpc/data-structures/img/src/fenwick-sum.svg new file mode 100644 index 00000000..6fc6c75a --- /dev/null +++ b/content/english/hpc/data-structures/img/src/fenwick-sum.svg @@ -0,0 +1,3 @@ + + + Produced by OmniGraffle 6.6.2 2020-06-26 13:14:04 +0000 diff --git
a/content/english/hpc/data-structures/img/src/fenwick-update.svg b/content/english/hpc/data-structures/img/src/fenwick-update.svg new file mode 100644 index 00000000..79c8f848 --- /dev/null +++ b/content/english/hpc/data-structures/img/src/fenwick-update.svg @@ -0,0 +1,3 @@ + + + Produced by OmniGraffle 6.6.2 2020-06-26 13:14:04 +0000
diff --git a/content/english/hpc/data-structures/img/src/segtree-layout.svg b/content/english/hpc/data-structures/img/src/segtree-layout.svg new file mode 100644 index 00000000..aefb2427 --- /dev/null +++ b/content/english/hpc/data-structures/img/src/segtree-layout.svg @@ -0,0 +1,3 @@ + + + Produced by OmniGraffle 6.6.2 2020-06-26 13:14:04 +0000
diff --git a/content/english/hpc/data-structures/img/src/segtree-permuted.svg b/content/english/hpc/data-structures/img/src/segtree-permuted.svg new file mode 100644 index 00000000..dc522b6e --- /dev/null +++ b/content/english/hpc/data-structures/img/src/segtree-permuted.svg @@ -0,0 +1,1803 @@ + Produced by OmniGraffle 6.6.2 2020-06-26 13:14:04 +0000
diff --git a/content/english/hpc/data-structures/img/src/segtree-succinct.svg b/content/english/hpc/data-structures/img/src/segtree-succinct.svg new file mode 100644 index 00000000..80679bf3 --- /dev/null +++ b/content/english/hpc/data-structures/img/src/segtree-succinct.svg @@ -0,0 +1,2102 @@ + Produced by OmniGraffle 6.6.2 2020-06-26 13:14:04 +0000
diff --git a/content/english/hpc/data-structures/img/src/segtree-wide.svg b/content/english/hpc/data-structures/img/src/segtree-wide.svg new file mode 100644 index 00000000..dd6bc878 --- /dev/null +++ b/content/english/hpc/data-structures/img/src/segtree-wide.svg @@ -0,0 +1,3 @@ + + + Produced by OmniGraffle 6.6.2 2020-06-26 13:14:04 +0000
diff --git a/content/english/hpc/data-structures/img/src/segtree.svg b/content/english/hpc/data-structures/img/src/segtree.svg new file mode 100644 index 00000000..ab4b4dc7 --- /dev/null +++ b/content/english/hpc/data-structures/img/src/segtree.svg @@ -0,0 +1,3 @@ + + + Produced by OmniGraffle 6.6.2 2020-06-26 13:14:04 +0000
diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index ff46b8ef..44edb04a 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -26,6 +26,8 @@ Note that we have to support two types of queries, which makes this problem mult Both of these options perform $O(1)$ work on one query type but $O(n)$ work on the other. They are only optimal when one type of query is extremely rare. When this is not the case, we can trade off the work on one type of query for increased performance of the other, and segment trees let you do exactly that, achieving the equilibrium of $O(\log n)$ for both queries. +### The Structure + The main idea is this. Calculate the sum of the entire array and put it somewhere. Then split it in halves, calculate the sum on both halves, and also store them somewhere. Then split these halves in halves and so on, until we recursively reach segments of length one.
This sequence of computations can be represented as a static-structure tree: From b9365d78809a127d33483c327c632e0a706bf2cd Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 26 Feb 2022 23:50:48 +0300 Subject: [PATCH 255/531] replacing png segtree figures with svg --- .../img/{src => }/fenwick-sum.svg | 0 .../img/{src => }/fenwick-update.svg | 0 .../img/{src => }/segtree-layout.svg | 0 .../img/{src/segtree.svg => segtree-path.svg} | 0 .../img/{src => }/segtree-permuted.svg | 1526 ++++++++--------- .../img/{src => }/segtree-succinct.svg | 0 .../img/{src => }/segtree-wide.svg | 0 .../hpc/data-structures/segment-trees.md | 14 +- 8 files changed, 681 insertions(+), 859 deletions(-) rename content/english/hpc/data-structures/img/{src => }/fenwick-sum.svg (100%) rename content/english/hpc/data-structures/img/{src => }/fenwick-update.svg (100%) rename content/english/hpc/data-structures/img/{src => }/segtree-layout.svg (100%) rename content/english/hpc/data-structures/img/{src/segtree.svg => segtree-path.svg} (100%) rename content/english/hpc/data-structures/img/{src => }/segtree-permuted.svg (52%) rename content/english/hpc/data-structures/img/{src => }/segtree-succinct.svg (100%) rename content/english/hpc/data-structures/img/{src => }/segtree-wide.svg (100%) diff --git a/content/english/hpc/data-structures/img/src/fenwick-sum.svg b/content/english/hpc/data-structures/img/fenwick-sum.svg similarity index 100% rename from content/english/hpc/data-structures/img/src/fenwick-sum.svg rename to content/english/hpc/data-structures/img/fenwick-sum.svg diff --git a/content/english/hpc/data-structures/img/src/fenwick-update.svg b/content/english/hpc/data-structures/img/fenwick-update.svg similarity index 100% rename from content/english/hpc/data-structures/img/src/fenwick-update.svg rename to content/english/hpc/data-structures/img/fenwick-update.svg diff --git a/content/english/hpc/data-structures/img/src/segtree-layout.svg
b/content/english/hpc/data-structures/img/segtree-layout.svg similarity index 100% rename from content/english/hpc/data-structures/img/src/segtree-layout.svg rename to content/english/hpc/data-structures/img/segtree-layout.svg diff --git a/content/english/hpc/data-structures/img/src/segtree.svg b/content/english/hpc/data-structures/img/segtree-path.svg similarity index 100% rename from content/english/hpc/data-structures/img/src/segtree.svg rename to content/english/hpc/data-structures/img/segtree-path.svg diff --git a/content/english/hpc/data-structures/img/src/segtree-permuted.svg b/content/english/hpc/data-structures/img/segtree-permuted.svg similarity index 52% rename from content/english/hpc/data-structures/img/src/segtree-permuted.svg rename to content/english/hpc/data-structures/img/segtree-permuted.svg index dc522b6e..e43e1474 100644 --- a/content/english/hpc/data-structures/img/src/segtree-permuted.svg +++ b/content/english/hpc/data-structures/img/segtree-permuted.svg @@ -28,12 +28,13 @@ id="namedview462" showgrid="false" inkscape:zoom="2.1152246" - inkscape:cx="192.86542" - inkscape:cy="0.99181839" + inkscape:cx="149.12595" + inkscape:cy="90.843446" inkscape:window-x="0" inkscape:window-y="27" inkscape:window-maximized="1" - inkscape:current-layer="g456" /> + inkscape:current-layer="g458" + inkscape:snap-text-baseline="true" /> Produced by OmniGraffle 6.6.2 2020-06-26 13:14:04 +0000 @@ -42,12 +43,53 @@ image/svg+xml - + + + + + + + + + Canvas 5 + id="rect16" + fill="white" /> + id="rect20" + fill="white" /> <rect x="130.38236" y="232.59842" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect22" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect22" /> + stroke="black" /> <text - transform="translate(130.38236 235.18504)" - fill="black" - id="text26"> + id="text26" + style="fill:#000000" + x="130.69576" + y="235.18504"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - 
x="6.2025096" - y="7" + x="136.89827" + y="242.18504" + id="tspan24" textLength="7.4375" - id="tspan24">13</tspan> + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">23</tspan> </text> <rect x="150.22441" y="232.59842" width="19.842519" height="14.173228" - fill="white" - id="rect28" /> + id="rect28" + fill="white" /> <rect x="150.22441" y="232.59842" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect30" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect30" /> + stroke="black" /> <text - transform="translate(150.22441 235.18504)" - fill="black" - id="text34"> + transform="translate(150.22441,235.18504)" + id="text34" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="6.710322" + x="6.7103219" y="7" + id="tspan32" textLength="6.421875" - id="tspan32">-1</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">-4</tspan> </text> <rect x="170.06693" y="232.59842" width="19.842519" height="14.173228" - fill="white" - id="rect36" /> + id="rect36" + fill="white" /> <rect x="170.06693" y="232.59842" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect38" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect38" /> + stroke="black" /> <text - transform="translate(170.06693 235.18504)" - fill="black" - id="text42"> + id="text42" + style="fill:#000000" + x="166.86134" + y="235.22937"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="8.0618846" - y="7" + x="174.92322" + y="242.22937" + id="tspan40" textLength="3.71875" - id="tspan40">2</tspan> + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">231</tspan> </text> <rect x="189.90945" y="232.59842" width="19.842519" height="14.173228" - fill="white" - id="rect44" /> + id="rect44" + fill="white" /> <rect x="189.90945" y="232.59842" width="19.842519" 
height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect46" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect46" /> + stroke="black" /> <text - transform="translate(189.90945 235.18504)" - fill="black" - id="text50"> + transform="translate(189.90945,235.18504)" + id="text50" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="6.2025096" + x="6.2025094" y="7" + id="tspan48" textLength="7.4375" - id="tspan48">23</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">13</tspan> </text> <rect x="209.75197" y="232.59842" width="19.842519" height="14.173228" - fill="white" - id="rect52" /> + id="rect52" + fill="white" /> <rect x="209.75197" y="232.59842" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect54" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect54" /> + stroke="black" /> <text - transform="translate(209.75197 235.18504)" - fill="black" - id="text58"> + id="text58" + style="fill:#000000" + x="211.06825" + y="235.24773"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="6.710322" - y="7" + x="217.77858" + y="242.24773" + id="tspan56" textLength="6.421875" - id="tspan56">-4</tspan> + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">5</tspan> </text> <rect x="229.59449" y="232.59842" width="19.842519" height="14.173228" - fill="white" - id="rect60" /> + id="rect60" + fill="white" /> <rect x="229.59449" y="232.59842" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect62" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect62" /> + stroke="black" /> <text - transform="translate(229.59449 235.18504)" - fill="black" - id="text66"> + id="text66" + style="fill:#000000" + x="233.0419" + y="234.99699"> <tspan - font-family="Linux Libertine" font-size="8" 
font-weight="500" - x="4.3431346" - y="7" + x="237.38503" + y="241.99699" + id="tspan64" textLength="11.15625" - id="tspan64">231</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">2</tspan> </text> <rect x="249.437" y="232.59842" width="19.842519" height="14.173228" - fill="white" - id="rect68" /> + id="rect68" + fill="white" /> <rect x="249.437" y="232.59842" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect70" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect70" /> + stroke="black" /> <text - transform="translate(249.437 235.18504)" - fill="black" - id="text74"> + id="text74" + style="fill:#000000" + x="248.18341" + y="235.05969"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="6.2025096" - y="7" + x="254.38591" + y="242.05969" + id="tspan72" textLength="7.4375" - id="tspan72">13</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">-88</tspan> </text> <rect x="269.29134" y="232.59842" width="19.842519" height="14.173228" - fill="white" - id="rect76" /> - <rect - x="269.29134" - y="232.59842" - width="19.842519" - height="14.173228" - stroke="black" - stroke-linecap="round" - stroke-linejoin="round" - stroke-width="1" - id="rect78" /> + id="rect76" + fill="white" /> <text - transform="translate(269.29134 235.18504)" - fill="black" - id="text82"> + id="text82" + style="fill:#000000" + x="266.15735" + y="235.18504"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="8.0618846" - y="7" + x="274.21921" + y="242.18504" + id="tspan80" textLength="3.71875" - id="tspan80">5</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">-52</tspan> </text> <rect x="289.12205" y="232.59842" width="19.842519" height="14.173228" - fill="white" - id="rect84" /> + id="rect84" + fill="white" /> <rect x="289.12205" 
y="232.59842" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect86" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect86" /> + stroke="black" /> <text - transform="translate(289.12205 235.18504)" - fill="black" - id="text90"> + id="text90" + style="fill:#000000" + x="289.49814" + y="235.56113"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="8.0618846" - y="7" + x="297.56003" + y="242.56113" + id="tspan88" textLength="3.71875" - id="tspan88">2</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">0</tspan> </text> <rect x="308.96457" y="232.59842" width="19.842519" height="14.173228" - fill="white" - id="rect92" /> + id="rect92" + fill="white" /> <rect x="308.96457" y="232.59842" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect94" stroke-linejoin="round" - stroke-width="1" - id="rect94" /> - <text - transform="translate(308.96457 235.18504)" - fill="black" - id="text98"> - <tspan - font-family="Linux Libertine" - font-size="8" - font-weight="500" - x="4.850947" - y="7" - textLength="10.140625" - id="tspan96">-88</tspan> - </text> - <rect - x="348.6496" - y="232.59842" - width="19.842519" - height="14.173228" - fill="white" - id="rect100" /> - <rect - x="348.6496" - y="232.59842" - width="19.842519" - height="14.173228" - stroke="black" stroke-linecap="round" - stroke-linejoin="round" stroke-width="1" - id="rect102" /> + stroke="black" /> <text - transform="translate(348.6496 235.18504)" - fill="black" - id="text106"> + id="text98" + style="fill:#000000" + x="312.09857" + y="235.31039"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="8.0618846" - y="7" - textLength="3.71875" - id="tspan104">0</tspan> + x="316.94952" + y="242.31039" + id="tspan96" + textLength="10.140625" + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux 
Libertine'">4</tspan> </text> <rect x="388.33464" y="232.59842" width="19.842519" height="14.173228" - fill="white" - id="rect108" /> - <rect - x="388.33464" - y="232.59842" - width="19.842519" - height="14.173228" - stroke="black" - stroke-linecap="round" - stroke-linejoin="round" - stroke-width="1" - id="rect110" /> - <text - transform="translate(388.33464 235.18504)" - fill="black" - id="text114"> - <tspan - font-family="Linux Libertine" - font-size="8" - font-weight="500" - fill="black" - x="6.2025096" - y="7" - textLength="7.4375" - id="tspan112">90</tspan> - </text> - <rect - x="408.17716" - y="232.59842" - width="19.842519" - height="14.173228" - fill="white" - id="rect116" /> + id="rect108" + fill="white" /> <rect x="408.17716" y="232.59842" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" - stroke-linejoin="round" - stroke-width="1" - id="rect118" /> - <text - transform="translate(408.17716 235.18504)" - fill="black" - id="text122"> - <tspan - font-family="Linux Libertine" - font-size="8" - font-weight="500" - fill="black" - x="8.0618846" - y="7" - textLength="3.71875" - id="tspan120">3</tspan> - </text> - <rect - x="428.01968" - y="232.59842" - width="19.842519" - height="14.173228" - fill="white" - id="rect124" /> + id="rect116" + fill="white" /> <rect x="428.01968" y="232.59842" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" - stroke-linejoin="round" - stroke-width="1" - id="rect126" /> - <text - transform="translate(428.01968 235.18504)" - fill="black" - id="text130"> - <tspan - font-family="Linux Libertine" - font-size="8" - font-weight="500" - fill="black" - x="4.850947" - y="7" - textLength="10.140625" - id="tspan128">-12</tspan> - </text> + id="rect124" + fill="white" /> <text - transform="translate(135.38236 252.11417)" - fill="black" - id="text134"> + transform="translate(135.38236,252.11417)" + id="text134" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="7" 
font-weight="500" x="3.2943065" y="6" + id="tspan132" textLength="3.2539062" - id="tspan132">0</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:7px;font-family:'Linux Libertine'">3</tspan> </text> <text - transform="translate(155.22488 252.11417)" - fill="black" - id="text138"> + transform="translate(155.22488,252.11417)" + id="text138" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="7" font-weight="500" x="3.2943065" y="6" + id="tspan136" textLength="3.2539062" - id="tspan136">1</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:7px;font-family:'Linux Libertine'">4</tspan> </text> <text - transform="translate(175.0674 252.11417)" - fill="black" - id="text142"> + transform="translate(175.0674,252.11417)" + id="text142" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="7" font-weight="500" x="3.2943065" y="6" + id="tspan140" textLength="3.2539062" - id="tspan140">2</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:7px;font-family:'Linux Libertine'">5</tspan> </text> <text - transform="translate(194.90992 252.11417)" - fill="black" - id="text146"> + transform="translate(194.90992,252.11417)" + id="text146" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="7" font-weight="500" x="3.2943065" y="6" + id="tspan144" textLength="3.2539062" - id="tspan144">3</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:7px;font-family:'Linux Libertine'">6</tspan> </text> <text - transform="translate(214.75244 252.11417)" - fill="black" - id="text150"> + transform="translate(214.75244,252.11417)" + id="text150" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="7" font-weight="500" x="3.2943065" y="6" + id="tspan148" textLength="3.2539062" - id="tspan148">4</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:7px;font-family:'Linux Libertine'">7</tspan> </text> <text - transform="translate(234.59495 
252.11417)" - fill="black" - id="text154"> + transform="translate(234.59495,252.11417)" + id="text154" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="7" font-weight="500" x="3.2943065" y="6" + id="tspan152" textLength="3.2539062" - id="tspan152">5</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:7px;font-family:'Linux Libertine'">8</tspan> </text> <text - transform="translate(254.43747 252.11417)" - fill="black" - id="text158"> + transform="translate(254.43747,252.11417)" + id="text158" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="7" font-weight="500" x="3.2943065" y="6" + id="tspan156" textLength="3.2539062" - id="tspan156">6</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:7px;font-family:'Linux Libertine'">9</tspan> </text> <text - transform="translate(274.28 252.11417)" - fill="black" - id="text162"> + id="text162" + style="fill:#000000" + x="272.86172" + y="252.29146"> <tspan - font-family="Linux Libertine" font-size="7" font-weight="500" - x="3.2943065" - y="6" + x="276.15601" + y="258.29144" + id="tspan160" textLength="3.2539062" - id="tspan160">7</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:7px;font-family:'Linux Libertine'">10</tspan> </text> <text - transform="translate(294.12251 252.11417)" - fill="black" - id="text166"> + id="text166" + style="fill:#000000" + x="292.61557" + y="252.3801"> <tspan - font-family="Linux Libertine" font-size="7" font-weight="500" - x="3.2943065" - y="6" + x="295.90988" + y="258.3801" + id="tspan164" textLength="3.2539062" - id="tspan164">8</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:7px;font-family:'Linux Libertine'">11</tspan> </text> <text - transform="translate(313.96503 252.11417)" - fill="black" - id="text170"> + id="text170" + style="fill:#000000" + x="312.54675" + y="252.46873"> <tspan - font-family="Linux Libertine" font-size="7" font-weight="500" - x="3.2943065" - y="6" + 
x="315.84103" + y="258.46875" + id="tspan168" textLength="3.2539062" - id="tspan168">9</tspan> - </text> - <text - transform="translate(333.80755 252.11417)" - fill="black" - id="text174"> - <tspan - font-family="Linux Libertine" - font-size="7" - font-weight="500" - x="1.6673534" - y="6" - textLength="6.5078125" - id="tspan172">10</tspan> - </text> - <text - transform="translate(353.65007 252.11417)" - fill="black" - id="text178"> - <tspan - font-family="Linux Libertine" - font-size="7" - font-weight="500" - x="1.6673534" - y="6" - textLength="6.5078125" - id="tspan176">11</tspan> - </text> - <text - transform="translate(373.4926 252.11417)" - fill="black" - id="text182"> - <tspan - font-family="Linux Libertine" - font-size="7" - font-weight="500" - x="1.6673534" - y="6" - textLength="6.5078125" - id="tspan180">12</tspan> - </text> - <text - transform="translate(393.3351 252.11417)" - fill="black" - id="text186"> - <tspan - font-family="Linux Libertine" - font-size="7" - font-weight="500" - fill="black" - x="1.6673534" - y="6" - textLength="6.5078125" - id="tspan184">13</tspan> - </text> - <text - transform="translate(413.17763 252.11417)" - fill="black" - id="text190"> - <tspan - font-family="Linux Libertine" - font-size="7" - font-weight="500" - fill="black" - x="1.6673534" - y="6" - textLength="6.5078125" - id="tspan188">14</tspan> - </text> - <text - transform="translate(433.02015 252.11417)" - fill="black" - id="text194"> - <tspan - font-family="Linux Libertine" - font-size="7" - font-weight="500" - fill="black" - x="1.6673534" - y="6" - textLength="6.5078125" - id="tspan192">15</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:7px;font-family:'Linux Libertine'">12</tspan> </text> <rect x="140.30338" y="198.4252" width="19.842519" height="14.173228" - fill="white" - id="rect196" /> + id="rect196" + fill="white" /> <rect x="140.30338" y="198.4252" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect198" 
stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect198" /> + stroke="black" /> <text - transform="translate(140.30338 201.01181)" - fill="black" - id="text202"> + transform="translate(140.30338,201.01181)" + id="text202" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="6.2025096" + x="6.2025094" y="7" + id="tspan200" textLength="7.4375" - id="tspan200">12</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">19</tspan> </text> <rect x="179.98839" y="198.4252" width="19.842519" height="14.173228" - fill="white" - id="rect204" /> + id="rect204" + fill="white" /> <rect x="179.98839" y="198.4252" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect206" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect206" /> + stroke="black" /> <text - transform="translate(179.98839 201.01181)" - fill="black" - id="text210"> + id="text210" + style="fill:#000000" + x="178.30417" + y="200.74588"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="6.2025096" - y="7" + x="184.50668" + y="207.74588" + id="tspan208" textLength="7.4375" - id="tspan208">25</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">244</tspan> </text> <rect x="219.6734" y="198.4252" width="19.842519" height="14.173228" - fill="white" - id="rect212" /> + id="rect212" + fill="white" /> <rect x="219.6734" y="198.4252" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect214" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect214" /> + stroke="black" /> <text - transform="translate(219.6734 201.01181)" - fill="black" - id="text218"> + id="text218" + style="fill:#000000" + x="223.48505" + y="200.74588"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="4.3431346" - y="7" + 
x="227.82819" + y="207.74588" + id="tspan216" textLength="11.15625" - id="tspan216">227</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">7</tspan> </text> <rect x="259.3584" y="198.4252" width="19.842519" height="14.173228" - fill="white" - id="rect220" /> + id="rect220" + fill="white" /> <rect x="259.3584" y="198.4252" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect222" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect222" /> + stroke="black" + style="stroke-width:2.00025002;stroke-miterlimit:4;stroke-dasharray:none" /> <text - transform="translate(259.3584 201.01181)" - fill="black" - id="text226"> + id="text226" + style="fill:#000000" + x="256.43317" + y="200.92317"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="6.2025096" - y="7" + x="262.63568" + y="207.92317" + id="tspan224" textLength="7.4375" - id="tspan224">18</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">-140</tspan> </text> <rect x="299.0434" y="198.4252" width="19.842519" height="14.173228" - fill="#ccc" - id="rect228" /> - <rect - x="299.0434" - y="198.4252" - width="19.842519" - height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect230" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect230" /> + stroke="black" /> <text - transform="translate(299.0434 201.01181)" - fill="black" - id="text234"> + id="text234" + style="fill:#000000" + x="302.42813" + y="200.88644"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="4.850947" - y="7" + x="307.27905" + y="207.88644" + id="tspan232" textLength="10.140625" - id="tspan232">-86</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">4</tspan> </text> <rect x="338.72841" y="198.4252" width="19.842519" height="14.173228" - fill="white" - 
id="rect236" /> + id="rect236" + fill="white" /> + <rect + x="338.72842" + y="198.4252" + width="19.84252" + height="14.173228" + id="rect228" + style="fill:#cccccc" /> <rect x="338.72841" y="198.4252" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect238" stroke-linejoin="round" + stroke-linecap="round" stroke-width="2" - id="rect238" /> + stroke="black" + style="stroke-width:0.99997501;stroke-miterlimit:4;stroke-dasharray:none" /> <text - transform="translate(338.72841 201.01181)" - fill="black" - id="text242"> + id="text242" + style="fill:#000000" + x="340.01373" + y="201.10045"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="4.850947" - y="7" + x="344.86469" + y="208.10045" + id="tspan240" textLength="10.140625" - id="tspan240">-52</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">13</tspan> </text> <rect x="418.09842" y="198.4252" width="19.842519" height="14.173228" - fill="white" - id="rect244" /> + id="rect244" + fill="white" /> <rect x="418.09842" y="198.4252" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect246" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect246" /> + stroke="black" /> <text - transform="translate(418.09842 201.01181)" - fill="black" - id="text250"> + id="text250" + style="fill:#000000" + x="419.25079" + y="201.1891"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - fill="black" - x="6.710322" - y="7" + x="425.96109" + y="208.1891" + id="tspan248" textLength="6.421875" - id="tspan248">-9</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine';fill:#000000">2</tspan> </text> <line x1="146.91756" y1="212.59842" x2="140.30362" y2="232.59842" - stroke="black" - stroke-linecap="round" + id="line252" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line252" /> + 
stroke="black" /> <line x1="153.53173" y1="212.59842" x2="160.14567" y2="232.59842" - stroke="black" - stroke-linecap="round" + id="line254" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line254" /> + stroke="black" /> <line x1="186.60256" y1="212.59842" x2="179.98819" y2="232.59842" - stroke="black" - stroke-linecap="round" + id="line256" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line256" /> + stroke="black" /> <line x1="193.21673" y1="212.59842" x2="199.83071" y2="232.59842" - stroke="black" - stroke-linecap="round" + id="line258" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line258" /> + stroke="black" /> <line x1="226.28757" y1="212.59842" x2="219.67323" y2="232.59842" - stroke="black" - stroke-linecap="round" + id="line260" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line260" /> + stroke="black" /> <line x1="232.90174" y1="212.59842" x2="239.51575" y2="232.59842" - stroke="black" - stroke-linecap="round" + id="line262" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line262" /> + stroke="black" /> <line x1="265.97257" y1="212.59842" x2="259.35827" y2="232.59842" - stroke="black" - stroke-linecap="round" + id="line264" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line264" /> + stroke="black" /> <line x1="272.58675" y1="212.59842" x2="279.2126" y2="232.59842" - stroke="black" - stroke-linecap="round" + id="line266" stroke-linejoin="round" - stroke-width="1" - id="line266" /> - <line - x1="305.65758" - y1="212.59842" - x2="299.0433" - y2="232.59842" - stroke="black" stroke-linecap="round" - stroke-linejoin="round" stroke-width="1" - id="line268" /> - <line - x1="312.27175" - y1="212.59842" - x2="318.88583" - y2="232.59842" - stroke="black" - stroke-linecap="round" - stroke-linejoin="round" - stroke-width="1" - id="line270" /> - <line - x1="345.34259" - y1="212.59842" - x2="339.05906" - y2="231.59842" 
stroke="black" - stroke-linecap="round" + style="stroke-width:2.00025002;stroke-miterlimit:4;stroke-dasharray:none" /> + <rect + x="269.29134" + y="232.59842" + width="19.842519" + height="14.173228" + id="rect78" stroke-linejoin="round" - stroke-width="2" - id="line272" /> - <line - x1="351.95676" - y1="212.59842" - x2="358.57086" - y2="232.59842" - stroke="black" stroke-linecap="round" - stroke-linejoin="round" stroke-width="1" - id="line274" /> - <line - x1="385.0276" - y1="212.59842" - x2="378.41338" - y2="232.59842" stroke="black" - stroke-linecap="round" - stroke-linejoin="round" - stroke-width="1" - id="line276" /> + style="stroke-width:1.99999502;stroke-miterlimit:4;stroke-dasharray:none" /> <line - x1="391.64176" + x1="305.65758" y1="212.59842" - x2="398.2559" + x2="299.0433" y2="232.59842" - stroke="black" - stroke-linecap="round" + id="line268" stroke-linejoin="round" - stroke-width="1" - id="line278" /> - <line - x1="424.7126" - y1="212.59842" - x2="418.09842" - y2="232.59842" - stroke="black" stroke-linecap="round" - stroke-linejoin="round" stroke-width="1" - id="line280" /> + stroke="black" /> <line - x1="431.32677" + x1="312.27175" y1="212.59842" - x2="437.94094" + x2="318.88583" y2="232.59842" - stroke="black" - stroke-linecap="round" + id="line270" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line282" /> + stroke="black" /> <rect x="160.14589" y="164.40945" width="19.842519" height="14.173228" - fill="white" - id="rect284" /> + id="rect284" + fill="white" /> <rect x="160.14589" y="164.40945" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect286" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect286" /> + stroke="black" /> <text - transform="translate(160.14589 166.99606)" - fill="black" - id="text290"> + id="text290" + style="fill:#000000" + x="158.99353" + y="166.99606"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="6.2025096" 
- y="7" + x="165.19604" + y="173.99606" + id="tspan288" textLength="7.4375" - id="tspan288">37</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">263</tspan> </text> <rect x="239.5159" y="164.40945" width="19.842519" height="14.173228" - fill="white" - id="rect292" /> + id="rect292" + fill="white" /> <rect x="239.5159" y="164.40945" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect294" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect294" /> + stroke="black" + style="stroke-width:2.00025002;stroke-miterlimit:4;stroke-dasharray:none" /> <text - transform="translate(239.5159 166.99606)" - fill="black" - id="text298"> + id="text298" + style="fill:#000000" + x="238.18625" + y="167.0847"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="4.3431346" - y="7" + x="242.52939" + y="174.0847" + id="tspan296" textLength="11.15625" - id="tspan296">245</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">-133</tspan> </text> <rect x="318.8859" y="164.40945" width="19.842519" height="14.173228" - fill="white" - id="rect300" /> + id="rect300" + fill="white" /> <rect x="318.8859" y="164.40945" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect302" stroke-linejoin="round" + stroke-linecap="round" stroke-width="2" - id="rect302" /> + stroke="black" + style="stroke-width:0.99997501;stroke-miterlimit:4;stroke-dasharray:none" /> <text - transform="translate(318.8859 166.99606)" - fill="black" - id="text306"> + id="text306" + style="fill:#000000" + x="321.83188" + y="166.99606"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="2.9915721" - y="7" + x="324.82343" + y="173.99606" + id="tspan304" textLength="13.859375" - id="tspan304">-138</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux 
Libertine'">17</tspan> </text> <line x1="166.76006" y1="178.58268" x2="153.53173" y2="198.4252" - stroke="black" - stroke-linecap="round" + id="line308" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line308" /> + stroke="black" /> <line x1="173.37423" y1="178.58268" x2="186.60256" y2="198.4252" - stroke="black" - stroke-linecap="round" + id="line310" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line310" /> + stroke="black" /> <line x1="246.13007" y1="178.58268" x2="232.90174" y2="198.4252" - stroke="black" - stroke-linecap="round" + id="line312" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line312" /> + stroke="black" /> <line x1="252.74424" y1="178.58268" x2="265.97257" y2="198.4252" - stroke="black" - stroke-linecap="round" + id="line314" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line314" /> + stroke="black" + style="stroke-width:2.00025002;stroke-miterlimit:4;stroke-dasharray:none" /> <line x1="325.50008" y1="178.58268" x2="312.27175" y2="198.4252" - stroke="black" - stroke-linecap="round" + id="line316" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line316" /> + stroke="black" /> <line - x1="332.11426" - y1="178.58268" - x2="344.67592" - y2="197.4252" - stroke="black" - stroke-linecap="round" - stroke-linejoin="round" - stroke-width="2" - id="line318" /> + x1="332.13547" + y1="178.60397" + x2="345.28152" + y2="198.15608" + id="line318" + style="stroke:#000000;stroke-width:1.04205513;stroke-linecap:round;stroke-linejoin:round;stroke-miterlimit:4;stroke-dasharray:none" /> <line x1="404.8701" y1="178.58268" x2="391.64176" y2="198.4252" - stroke="black" - stroke-linecap="round" + id="line320" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line320" /> + stroke="black" /> <line x1="411.48427" y1="178.58268" x2="424.7126" y2="198.4252" - stroke="black" - stroke-linecap="round" + id="line322" 
stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line322" /> + stroke="black" /> <rect x="199.83089" y="127.559054" width="19.842519" height="14.173228" - fill="#ccc" - id="rect324" /> + id="rect324" + fill="#ccc" /> <rect x="199.83089" y="127.559054" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect326" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect326" /> + stroke="black" + style="stroke-width:2.00025002;stroke-miterlimit:4;stroke-dasharray:none" /> <text - transform="translate(199.83089 130.14567)" - fill="black" - id="text330"> + transform="translate(199.83089,130.14567)" + id="text330" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="4.3431346" + x="4.3431344" y="7" + id="tspan328" textLength="11.15625" - id="tspan328">282</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">130</tspan> </text> <rect x="358.57091" y="127.559054" width="19.842519" height="14.173228" - fill="white" - id="rect332" /> + id="rect332" + fill="white" /> <rect x="358.57091" y="127.559054" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect334" stroke-linejoin="round" + stroke-linecap="round" stroke-width="2" - id="rect334" /> + stroke="black" + style="stroke-width:0.99997501;stroke-miterlimit:4;stroke-dasharray:none" /> <text - transform="translate(358.57091 130.14567)" - fill="black" - id="text338"> + id="text338" + style="fill:#000000" + x="360.07523" + y="129.83228"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="4.850947" - y="7" + x="364.92618" + y="136.83228" + id="tspan336" textLength="10.140625" - id="tspan336">-53</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">18</tspan> </text> <rect x="279.2009" y="96.37795" width="19.842519" height="14.173228" - fill="white" - 
id="rect340" /> + id="rect340" + fill="white" /> <rect x="279.2009" y="96.37795" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect342" stroke-linejoin="round" + stroke-linecap="round" stroke-width="2" - id="rect342" /> + stroke="black" /> <text - transform="translate(279.2009 98.964566)" - fill="black" - id="text346"> + transform="translate(279.2009,98.964566)" + id="text346" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="4.3431346" + x="4.3431344" y="7" + id="tspan344" textLength="11.15625" - id="tspan344">229</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">148</tspan> </text> <line x1="206.44506" y1="141.73228" x2="173.37423" y2="164.40945" - stroke="black" - stroke-linecap="round" + id="line348" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line348" /> + stroke="black" /> <line x1="213.05924" y1="141.73228" x2="246.13007" y2="164.40945" - stroke="black" - stroke-linecap="round" + id="line350" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line350" /> + stroke="black" + style="stroke-width:2.00025002;stroke-miterlimit:4;stroke-dasharray:none" /> <line x1="365.18509" y1="141.73228" x2="332.11426" y2="164.40945" - stroke="black" - stroke-linecap="round" + id="line352" stroke-linejoin="round" + stroke-linecap="round" stroke-width="2" - id="line352" /> + stroke="black" + style="stroke-width:0.99997501;stroke-miterlimit:4;stroke-dasharray:none" /> <line x1="371.79926" y1="141.73228" x2="404.8701" y2="164.40945" - stroke="black" - stroke-linecap="round" + id="line354" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line354" /> + stroke="black" /> <line x1="213.05924" y1="127.559054" x2="285.81508" y2="110.55118" - stroke="black" - stroke-linecap="round" + id="line356" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="line356" /> 
+ stroke="black" + style="stroke-width:2.00025002;stroke-miterlimit:4;stroke-dasharray:none" /> <line x1="292.42925" y1="110.55118" x2="365.18509" y2="127.559054" - stroke="black" - stroke-linecap="round" + id="line358" stroke-linejoin="round" - stroke-width="2" - id="line358" /> - <rect - x="328.80709" - y="232.59842" - width="19.842519" - height="14.173228" - fill="#ccc" - id="rect360" /> - <rect - x="328.80709" - y="232.59842" - width="19.842519" - height="14.173228" - stroke="black" stroke-linecap="round" - stroke-linejoin="round" stroke-width="2" - id="rect362" /> - <text - transform="translate(328.80709 235.18504)" - fill="black" - id="text366"> - <tspan - font-family="Linux Libertine" - font-size="8" - font-weight="500" - x="4.850947" - y="7" - textLength="10.140625" - id="tspan364">-52</tspan> - </text> + stroke="black" + style="stroke-width:0.99975001;stroke-miterlimit:4;stroke-dasharray:none" /> <rect x="368.49212" y="232.59842" width="19.842519" height="14.173228" - fill="white" - id="rect368" /> + id="rect368" + fill="white" /> <rect - x="368.49212" - y="232.59842" + x="398.25592" + y="164.40945" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" - stroke-linejoin="round" - stroke-width="1" - id="rect370" /> - <text - transform="translate(368.49212 235.18504)" - fill="black" - id="text374"> - <tspan - font-family="Linux Libertine" - font-size="8" - font-weight="500" - x="8.0618846" - y="7" - textLength="3.71875" - id="tspan372">4</tspan> - </text> + id="rect376" + fill="white" /> <rect x="398.25592" y="164.40945" - width="19.842519" - height="14.173228" - fill="white" - id="rect376" /> + width="19.84252" + height="14.173227" + id="rect228-7" + style="fill:#cccccc;stroke-width:0.99999994" /> <rect x="398.25592" y="164.40945" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect378" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect378" /> + stroke="black" /> <text - 
transform="translate(398.25592 166.99606)" - fill="black" - id="text382"> + id="text382" + style="fill:#000000" + x="400.63776" + y="166.80801"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="6.2025096" - y="7" + x="406.84027" + y="173.80801" + id="tspan380" textLength="7.4375" - id="tspan380">85</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">1</tspan> </text> <rect x="378.41342" y="198.4252" width="19.842519" height="14.173228" - fill="white" - id="rect384" /> + id="rect384" + fill="white" /> <rect x="378.41342" y="198.4252" width="19.842519" height="14.173228" - stroke="black" - stroke-linecap="round" + id="rect386" stroke-linejoin="round" + stroke-linecap="round" stroke-width="1" - id="rect386" /> + stroke="black" /> <text - transform="translate(378.41342 201.01181)" - fill="black" - id="text390"> + id="text390" + style="fill:#000000" + x="378.45773" + y="201.05614"> <tspan - font-family="Linux Libertine" font-size="8" font-weight="500" - x="6.2025096" - y="7" + x="384.66025" + y="208.05614" + id="tspan388" textLength="7.4375" - id="tspan388">94</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:8px;font-family:'Linux Libertine'">-1</tspan> </text> <text - transform="translate(156.9511 201.59055)" - fill="black" - id="text394"> + transform="translate(156.9511,201.59055)" + id="text394" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" x="4.149127" y="5" + id="tspan392" textLength="11.167969" - id="tspan392">[0,1]</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">[3,4]</tspan> </text> <text - transform="translate(195.90363 201.59055)" - fill="black" - id="text398"> + transform="translate(195.90363,201.59055)" + id="text398" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" - x="4.515381" + x="4.5153809" y="5" + 
id="tspan396" textLength="11.167969" - id="tspan396">[2,3]</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">[5,6]</tspan> </text> <text - transform="translate(236.32118 201.59055)" - fill="black" - id="text402"> + transform="translate(236.32118,201.59055)" + id="text402" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" x="4.149127" y="5" + id="tspan400" textLength="11.167969" - id="tspan400">[4,5]</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">[7,8]</tspan> </text> <text - transform="translate(275.79134 201.59055)" - fill="black" - id="text406"> + transform="translate(275.79134,201.59055)" + id="text406" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" - x="4.2565667" + x="4.2565665" y="5" + id="tspan404" textLength="11.167969" - id="tspan404">[6,7]</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">[9,10]</tspan> </text> <text - transform="translate(315.47638 201.59055)" - fill="black" - id="text410"> + transform="translate(315.47638,201.59055)" + id="text410" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" - x="4.2565667" + x="4.2565665" y="5" + id="tspan408" textLength="11.167969" - id="tspan408">[8,9]</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">[11,12]</tspan> </text> <text - transform="translate(359.89765 201.59055)" - fill="black" - id="text414"> + id="text414" + style="fill:#000000" + x="361.49323" + y="201.94513"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" - x=".126953125" - y="5" - textLength="16.746094" - id="tspan412">[10,11]</tspan> + x="361.62018" + y="206.94513" + id="tspan412" + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux 
Libertine'">0*</tspan> </text> <text - transform="translate(399.2992 201.59055)" - fill="black" - id="text418"> + id="text418" + style="fill:#000000" + x="400.89478" + y="201.94513"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" - x=".126953125" - y="5" + x="401.02173" + y="206.94513" + id="tspan416" textLength="16.746094" - id="tspan416">[12,13]</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">1*</tspan> </text> <text - transform="translate(438.7008 201.59055)" - fill="black" - id="text422"> + id="text422" + style="fill:#000000" + x="440.65094" + y="201.98944"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" - x=".126953125" - y="5" + x="440.77789" + y="206.98944" + id="tspan420" textLength="16.746094" - id="tspan420">[14,15]</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">2*</tspan> </text> <text - transform="translate(177.88676 167.5748)" - fill="black" - id="text426"> + transform="translate(177.88676,167.5748)" + id="text426" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" - x="3.602554" + x="3.6025541" y="5" + id="tspan424" textLength="11.167969" - id="tspan424">[0,3]</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">[3,6]</tspan> </text> <text - transform="translate(257.25684 167.5748)" - fill="black" - id="text430"> + transform="translate(257.25684,167.5748)" + id="text430" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" - x="3.602554" + x="3.6025541" y="5" + id="tspan428" textLength="11.167969" - id="tspan428">[4,7]</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">[7,10]</tspan> </text> <text - transform="translate(341.29621 167.5748)" - fill="black" - id="text434"> + 
transform="translate(341.29621,167.5748)" + id="text434" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" - x=".021484375" + x="0.021484375" y="5" + id="tspan432" textLength="13.957031" - id="tspan432">[8,11]</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">[11,0]*</tspan> </text> <text - transform="translate(419.1663 167.5748)" - fill="black" - id="text438"> + transform="translate(419.1663,167.5748)" + id="text438" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" - x=".126953125" + x="0.12695312" y="5" + id="tspan436" textLength="16.746094" - id="tspan436">[12,15]</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">[1,2]</tspan> </text> <text - transform="translate(380.98125 130.72441)" - fill="black" - id="text442"> + transform="translate(380.98125,130.72441)" + id="text442" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" - x=".021484375" + x="0.021484375" y="5" + id="tspan440" textLength="13.957031" - id="tspan440">[8,15]</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">[11,2]*</tspan> </text> <text - transform="translate(217.5718 130.72441)" - fill="black" - id="text446"> + transform="translate(217.5718,130.72441)" + id="text446" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="6" font-weight="500" - x="3.602554" + x="3.6025541" y="5" + id="tspan444" textLength="11.167969" - id="tspan444">[0,7]</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">[3,10]</tspan> </text> <text - transform="translate(301.6342 100.110234)" - fill="black" - id="text450"> + transform="translate(301.6342,100.11023)" + id="text450" + style="fill:#000000"> <tspan - font-family="Linux Libertine" font-size="6" 
font-weight="500" - x=".021484375" + x="0.021484375" y="5" + id="tspan448" textLength="13.957031" - id="tspan448">[0,15]</tspan> - </text> - <text - transform="translate(118.385826 235.1063)" - fill="black" - id="text454"> - <tspan - font-family="Linux Libertine" - font-size="8" - font-style="italic" - font-weight="500" - x=".33203125" - y="7" - textLength="5.3359375" - id="tspan452">A</tspan> + lengthAdjust="spacing" + style="font-weight:500;font-size:6px;font-family:'Linux Libertine'">[3,2]*</tspan> </text> </g> </g> diff --git a/content/english/hpc/data-structures/img/src/segtree-succinct.svg b/content/english/hpc/data-structures/img/segtree-succinct.svg similarity index 100% rename from content/english/hpc/data-structures/img/src/segtree-succinct.svg rename to content/english/hpc/data-structures/img/segtree-succinct.svg diff --git a/content/english/hpc/data-structures/img/src/segtree-wide.svg b/content/english/hpc/data-structures/img/segtree-wide.svg similarity index 100% rename from content/english/hpc/data-structures/img/src/segtree-wide.svg rename to content/english/hpc/data-structures/img/segtree-wide.svg diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index 44edb04a..88ac5288 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -32,7 +32,7 @@ The main idea is this. Calculate the sum of the entire array put it somewhere. T These sequence of computations can be represented as a static-structure tree: -![](../img/segtree-path.png) +![](../img/segtree-path.svg) Some nice properties of this construct: @@ -171,7 +171,7 @@ The last issue is the most critical one. 
To get rid of pointer chasing, we need To store our segment tree implicitly, we can also use the [Eytzinger layout](../binary-search#eytzinger-layout), storing the nodes in a large array, where for every non-leaf node $v$ corresponding to the range $[l, r)$, the node $2v$ is its left child and the node $(2v+1)$ is its right child, corresponding to the ranges $[l, \lfloor \frac{l+r}{2} \rfloor)$ and $[\lfloor \frac{l+r}{2} \rfloor, r)$ respectively. -![The memory layout of implicit segment tree with the same query path highlighted](../img/segtree-layout.png) +![The memory layout of implicit segment tree with the same query path highlighted](../img/segtree-layout.svg) One little problem with this layout is that if $n$ is not a perfect power of two, we would need more array cells to store the tree — $4n$, to be exact. The tree structure hasn't change, and there are still exactly $(2n - 1)$ nodes in the tree — they are just not compactly packed on the last layer. @@ -297,7 +297,7 @@ int sum(int l, int r) { This results and a much simpler and faster code. However, when the array size is not a power of two, the `sum` query doesn't work correctly. To understand why, consider at the tree structure for 13 elements: -![The nodes comprising the first 7 elements are selected in bold](../img/segtree-ranges.png) +![](../img/segtree-permuted.svg) The first index of the last layer is always a power of two, but when $n$ is not a power of two, some prefix of the leaf elements gets wrapped around to the right side of the tree. @@ -383,13 +383,15 @@ To make a segment tree succinct, we need to look at the values stored in the nod Note that in every implementation so far, we never added the sum stored in the right child when computing the prefix sum. *Fenwick tree* is a type of a segment tree that uses this consideration and gets rid of all *right* children, including the last layer. This makes the total required number of memory cells $n + O(1)$, the same as the underlying array. 
+![](../img/segtree-succinct.svg) + To calculate a prefix sum, we need to repeatedly jump to the first parent that is a left child: -![A path for the sum query](../img/fenwick-sum.png) +![A path for the sum query](../img/fenwick-sum.svg) To process an update query, we need to repeatedly add the delta to the first parent the contains the cell $k$: -![A path for the update query](../img/fenwick-update.png) +![A path for the update query](../img/fenwick-update.svg) More formally, a Fenwick tree is defined as the array $t_i = \sum_{k=f(i)}^i a_k$ where $f$ is some function for which $f(i) \leq i$. If $f$ is the "remove last bit" function (`x -= x & -x`), then both query and update would only require updating $O(\log n)$ different $t$'s @@ -470,7 +472,7 @@ But we are going to leave it there and focus on an entirely different approach. Here is the idea: if we are fetching a full cache line anyway, let's fill it with information that lets us process the query quicker. So let's store more than one data point in a segment tree node — this lets us reduce the tree height and do less iterations descending it. 
-![](../img/segtree-wide.png) +![](../img/segtree-wide.svg) We can use a similar constexpr-based approach we used in [S+ trees](../s-tree#implicit-b-tree-1) to implement it: From a818437ff2bdcaa3e1c02ee8e90879e916b3bbe1 Mon Sep 17 00:00:00 2001 From: Sergey Slotin <me@sereja.me> Date: Sun, 27 Feb 2022 00:29:28 +0300 Subject: [PATCH 256/531] add linux libertine font --- themes/algorithmica/assets/style.sass | 4 ++++ .../static/fonts/linux-libertine.ttf | Bin 0 -> 906980 bytes 2 files changed, 4 insertions(+) create mode 100644 themes/algorithmica/static/fonts/linux-libertine.ttf diff --git a/themes/algorithmica/assets/style.sass b/themes/algorithmica/assets/style.sass index 49c6c025..fe3ebaeb 100644 --- a/themes/algorithmica/assets/style.sass +++ b/themes/algorithmica/assets/style.sass @@ -53,6 +53,10 @@ $link-active-color: $link-hover-color//#faa700 !default font-family: "Crimson" src: url(fonts/crimson.ttf) +//@font-face +// font-family: "Linux Libertine" +// src: url(fonts/linux-libertine.ttf) + /* layout */ html, body margin: 0 diff --git a/themes/algorithmica/static/fonts/linux-libertine.ttf b/themes/algorithmica/static/fonts/linux-libertine.ttf new file mode 100644 index 0000000000000000000000000000000000000000..ab154440d796a815274ff79060759b5d1668b430 GIT binary patch literal 906980 zcmeFadzjYa+W-Il?C+gSHPxiX+_S66O#4!)G-Xs0hLR~GQ>G-9FbJVktBnwn5JD7M zxvP;@Yh(~Y2q8<BnKn}hSwghAzpu|dGyN>jV?EFBIKId6`{$SAJ?HhguJbz2^V*-+ zW`+z&jrn1#L;wCm&l&n!>&tb*$K8o-IiTRo{zHBo#rL(_VCXWS@Vp^^X;A!Eov>hs zR4`(|kPG{zl~3=i6Q%^Dln>A8J9OZb>%Le@{-Kg_*Lg$QcPd_Obt1eTc_o)k95wl& zcb|Mi%1xFUuDa~{vS`y?jaEy27Lb0;*vXeqJh8m}a;euQ$sBX}sB0%HMM3oQt)$B* z+&H$X>)}_WD{>Th_{g|1qe@Ti)@={+dy#(XI3n6tb$W;IGQP9MO)Q%-^V4s?MplZH z^V@{0E*q6U;ibt^zAd%MnmB68<nYxEPKSRn{Lx9HCXU(Bb>>q#!O4;WMU$_(w(Q1~ zCx+^T6Q)XI`%k`R%;b*4wyeY6Jorb+uREvtgPTUax&P7=d;MglQ4RciJNxSs>*+5= z#{wNkdO3XiiGGg6Pp|bczB{%3rRY#e$B}jMiA4WLsH-0n8fbo|hKWd%Z&OYEtl?3K z7~7s_ERcf%ot+1XKe?7#pK7ex$Z$+KR)SAe3dJwbp2MaT^<xUho>{6&MdR^W8@$%W 
zp%NJ{`)SryP5~sU2o;q0XFf~q5i-xw1~aDV)_QdXJ)IO?=RBz*W0oeGqcsBLnm?-8 z9;=&;S5;(|>jLO?=KV^8zYweg<G~2f16)-X9tRx>#u~eHH@FQMvwS&Vm35Q++Pl%Y zTee-UR3}%1orjfbmnvdyRfPC{&ZkN>OTc~c>#!@;?yBbQI;A@gDA{>LgY7<Qo}g>g z+^SYL`wh9yS>){|(|cUDGXU8S%Cxs>C^i0tdA6EC``KCAW9?GF%Fyf9at*S6P?9wP zc@fHUp9Y&ZsV()?&pt;j&0KXg+r}TY(Qmd?wtGHl!$3eo?N$odt_GUFs;zahMw_$M z8JpT#zhVQhzEQe;GPac>vzt<lC-pQm*?It(b#+hu+y&A>PiuYrkP*|<)@F^h0vbyl z^f$}Y$1c)h`)ys|*S#NSwbhMghWh(;NqR9j*W93Tbd9vP(l*jP_7GzO@pqVStC0N5 zsEb)(E*Rw3y<hMD9o%n-r!D+;_&pNs;I~CBBD7UfeLMJV@Oxm_cZqMmE$T5$Nr|@T zhTI{_PSm?!_kVzMD{Vkq`0em}&<>f2w#bbiF>~XGXutl)+re*(-^1@?0iEl&#lL~J zux;#s`epZZ{})Qxef@qN{I}HaqdioZ=&SUXM1RHB|B0ffpWmncl+xGz{^s}jM87Lh zBg$(e>N{5rpe^_g)O~lYr}k#}htZ!HBYr<kj1zyXv{D;?y!d_g-{|@O_8lTU;<uw; zmw%+hb#}HOs|RWRc%zR0NZlVb%<Vy2Fb4f`@JH&8OMh(q8TH3(V%+%jDC4L0yS3kM z{I>hw(!_XY%q9Am-_CX8-XDAZ`2RgMho~cCKH_93WIw}r{1$p2eX^Q<k*k4;w*Mh9 zA0<&X+n%mrc2nlJZz$(Y>gKfKJ45GMTjL$AS@GALO|<Rz%-HLcVxF%ye%sWwLpt?q z`sHwc_4{1>X?t94T_)<$)KD*rKDY|>#CCr^nW0Q;8}sfj*j=Gk=49E4`uU#v`Lwov zoLO~sLphw@=#2tX&5oDz3+(%Xx%Ymx2P3TM%+HgUmnJa}{y@LTQFG%aEn+@gWG+`{ ze+>v_C^fKNsooG3x>?M#xzypw$_`{H*?U#V?hNd%8#8r%FL1ms)VGyC#{D+?6Uunb zt(Q?x6Y~vYzII&1kJVDfiQk98aUFI2k8%yOhA<b>*F)!F^XJ%{Ri9t|yw1wB&sU@4 zeYwu&<9cf4oLjHIp8j$CQbsY)+yQOQxI5v$*zt${{>Oe=-{0%|H*NZmm8Aal<J<m_ zx@by&nxqnIBlFh}y1;sh^g4U%?fIkLKYrKucPpg!&e^QhN7X(t4;)3-VfxjbwevZ1 zS#!H<e6Md$VmvY~nE#Tk>sdF;<K>BY`V9JCY5YyvskzlN{&_-9kp|KC2iA@2?Bn{D zt3e{(U(bHfV&n`=j4_uv@qFg+@vKw#A$tR3tu1{pm+^8B*vpzWHU0!VgDmW`bE$*& z$Y`WA`eim~?Snqq#_XfnW_#$_3YlLh+xSTx?Py{RZssN{)$Pl;Yad@l{#^T3^-j!B zlkj26k$I>3F&{Nz%;Z{Kl??4+eyenIGqN}1yKGjQ1pIY-Qv9|AD&rOAu=p~@TXS<f z<HTP-dSlx+>dcyMv!;tNA?pqD`t>(S{rtYjdRj!d2RIjWwq{W8kI+v)Qy=R+>}rHP zYgAxfAOD=Wy)FFh{j%5RamQ=08)DqerS9HF_8R8hlhoP02mM{ClMT#GcdG&QSZd!* z|E@q*8D#<nUbXRyazP6-0=gpq8P<z(wZVr=H@E02qewlCvFZoyX8fR1Fw&f$8+_mk zeO`w>&|UZo<G^SmM7hRlvDuNisH2|1ce#^(v)F8)L1rrz8!I))cpS_{?p)fmkqQ&& z$k^=2_gwno5WQ~h(nfQoW|+fhLu@hcfbUC{GOwi>XHm~C?NUbD{)+#3>wks$4Sw&h 
z;8wL@%@0|w#<A;2_F?v=Z?vVHUsApq>TTXrhdYSdqZ!5q_5MB3-`l)fy{&O-BOPlF znn_2$1ZzNnbmR-@6U38#v_CW*`YCt_6!U#2^jh#Jyk9~45my1e1BZa0e-M}j;6GYy z+z~%Wd@11@z`#0mhWhwTiZjzBbcCRHgGJyY&<7y<D4ybxjZk0SM8bRPbo=RPg#A2z zUjO^%+V3NyrQg^;9m^z+{JxzD87C2r)$-Jr-!I3p?^qYcLVMKmCuI8Nt^uv<fctWU zyhm>Y55wD=Z@&&QYQO!sBTo@utjFW<9K(jA<G^d6K`s3HO~5aUANM(YzKxXY=*NU< zkAz(<YSR+=eHo-3Jp%mjjm8@qPoL{(K4rMfDeV~>g+BH77)B*~6w2byqx7vioxbsM zw>bX3WyaSzo8r~Zt?>=c`uO{1pZL!{kImkLahk+j+}odX{Wa8|qy2f-U+4UF$YEYQ zzVB$SVn2;Ft~v8VbNd0c^5;~4USxOK5&znDCb8Dv$G9n}^G9YUeD)D`5BfTDnm?b| z#Nk7f+97=ZZ<$N{IdV2WD?HX$j1l<k2sSYX!GD4Ewp!C~3;j8cydCY%?6YPw|J_aA z+4xw#%yp!P*c-XbfoCuV`@|0<;>qJvKaRPmz%C*Gc!ivv@z;H7KaS5cH!&Y~tY2T? zOSc9vryph?bEmGd&&F23T$OD7l!))h{Mp~Hd$SRKCv{?ttc87sZe$)Sv^TQndm9`5 zuumBUKAbP(2mLy?e}W(X(648|u2=(-J^WE`G<AM7e$f6*+0H8b08Sa$3#TU93xDl+ zo1IB{Xft;&_57Gd_`C^zKc7v!zgMhl%lh_m{Jo9eUcOA`@I-r+QfHmfU5PKT4H=u& z)g4LOUJrg{pV}K6p2R+XuksmX_UH8y_GRZVU*Y>ZPvNsYNxS%a5+63`rU3T#@aO$R zn{9;OZ(DzFbGN!C_8<ci?UHUkMcM$?wXf9Q-AdkB_?T~~fIVS>^*Hk1BVUX)t8PzZ zkEYL~-zvi%fBovieBO=uH7ov^a}Ij9P}j`$2|Ijye1CWbzR}~F=#0iMSVbM4sWkTu z$~sa#+zsS8L(PKsQ~rn5%^Rr@?HmgJfPXa=Im75{eb{^6if?xk`Te$j61y1T^od6G zv|fHaWvstL9i-c5CH7&>6XoAazZyn)4zuSjVcp(HKWM5x?)UiPQ{kf>i680&7?+1B z2W@GWQHIYFesQL2<0ss~ID3$FaH{&bA5w<5Rp711XFXeeoN@4$)%o!Lp8nSOF6M@A z3BSeP=lJ^qsFQ;qv<<&@FX2s;sgZi(TNFT(q1n_&CUw})EmKqKs%wG{VV^}`_4^xb z-k!N4)9y*$@>*LGzTNc+8MU(Nd^x}0`hBO~2le}GqTjHOfad!937^)<%}msR-=FGy z-9(?J9gkD$-|tiPW3`_8ebw(Te&0^C`5&m?$NheWPm#%-(ZlcKem|4MJg$*oFzAR! 
z5rMMiX$e?WhYO%SSi}5uqYVd9xADD_b)NjJ_=-PdZdJMYaAKVF#;<K-uEPJCPCxz- zpJf1J{XBf6ZOk9Lm<w~5x9F=@IsM^1_AlL3V&^dL_JwyY{U#Z|>L_{Nqp$lspA-I+ z`S4fbzhVsSVQig@pH@a&6w}6M;NMhoc9M+0@C*JQXD1-vpIgjXtQT|9KbLcaS@<4z z>Qv6KT;{SC@U%x~w<P*AX)ZoPnt3PvubMG6LlHl|9zGxIN;iC*<NG1oUqd)U`Xlw{ zn#37`e`dg0LH(J*pHY18wSK>-?;n3g{qr&ZjEa6$d#3P*vx7uG{M}hX{WtZ%UWW4% zzdkraIljlL-)q&;e>*Q|$vj+pj*whO6Z5Kn4(89Vzo*CdTz_(&P~T7gj3(Aoe-G@h zt3Lh19_x?a{@U@M&KBy^{rTSX&*uJw{&Da2r{A0_CeHTiX#JVP|3v*YhV?hSp4RW7 zj_=q1a6XYZ|C!Ca<ez6ScU#NjMgExui1?Iq27hlpxPD&w6B_)J`2Tw5^@n+?KHs19 zzdyux|HJq3{_D?`i8H%8`e*H47vKFqrgK)3NdHf?{v4v7)}NF75A=9@_-li|R{nvy zy_w(7WGx@5V)n1Y@CAf_)GXn{`F`A=z^}&ycEw+JR>l`QkHkL$pW%Pr=TyXBPxySk z-}eVNCGh9Q?{k;OH@dUpi`}{L3b!nNhrfRQC%AXV-vm#4d*ZKoTjNW-PvbG~)%arX z@%TD#UHoa^r>q5YK%e-2a0q-yoPcR1#W$GW#FyCP;v3jk&$Z9sysB?}k=-DEx80jP zyYGAcH^85J&G|LH$ZHi}>2-_W?b+D<Wqc8RzVm;A^Hlt47k%I%_xkv=zJL0k;6~-T z!z6D=e3gfN=-A-yjX&yci7)j`Y_4mE`ueP|E6O&~FL%A|^)}V(s?Yal)N3G{^E=y{ z5P#k4O?~!=f8^!HKl28}@AHcAq5p?=@xG0}>HQdgI@m4#S}-fVB-l3|3$}_c4kpLf z@%?nY9e+l>8S#g_d*a1`SbSCB$@tvBtoR1#qk&2Bn*;aLK7Z2x{P^R2u&yt9oM#%> zur@wr1dKA)-jQ0xerf!FR|aQR{2ga>{3CZ|{1x{>`V!}DoGT1W;BNN$ed2q;&%p0b zKK1+5esdCQEMwlviC0)n<L}}F^faGhz5k50`44@<-^V2OGIjBZvFGolo{Fz`X3(Z@ z#9zVB80+upjIZ!t|C>JIX2dr#t~R<k^da_k9%r@211j{-J@6mo0nRq$;<cE|@m%;e zhH#EE9a;%+V${|DK{|IVK75}AKYP6My#{(El(~m)_Rk;8EY3``xL2vgLTnoiM))uf zYYaZX3LoZsx{bI+{#_YoMb|k;b)6sfb$Ms$y5Pfvk7|(>(IV!OMaGZ1n)9iVwxD}C zXW6S|_^D}}HEqCu9chi`tZZBSCyh9k?oQ?YW-Le{Z903ct@tyQy2!-e!LR8C9RfxJ z|8A%Vlz<C>f7UV(6dLz&c5@&8>>k>x)=#<vT6>pNmXPJ=KNtA<hXVi1>A&Xp^ZMsM zz7JK#c@upKBy-N=`sX{W!7a=RbWCOq=B&qfDsgYvnY~Re$o3}Vi>|}(XRph*0=KFW zv@Lg#{S&~4awlm7=VMnVFkU0@k2|stNVU&nKJrndB<{bO1;ie1EPI(g%5>JSK0d>l zC}CIlD#dEyOwqaizTAg>68b&^zw=SevotU;1)q3;t^&p27BCX{`^3IrHYfpp`~_fy z4}9Jm%wqO%8SJ|cvzNJ_@Eh!}xFZdeC@nOW@b~OZqWDXkL2}NX&Ro>n=A4o}RU7>N zbpM>wdWwDD`Jf!XhP5AmKaIU-B=8|&+Lb)%?9<!0KjBw%ujBiczF+D4V1XIB+8Kdt zsLdHL<2pMS(v8+^6(;B!l?FDe6hFGu^|2vA$3VyU>2>&=^xi&SV2y4BxBP+L7<d5Q 
zA4q#ZgZwaOmE-&{bae1A-{l(T?m_mqwAUV80Ug7d*@!+|!hUp+)fwNvvwCy)($!wc zeq<%<ZG^oSV7&Nrgl*zC6L!9!&ir$h3eGEP?~VPt<D1zhWcr}ZZEz3$Y7u+piQrt& z*D0pY&}Q5--sTKJS1k&S$F+<z(roU#mvP3~hW1ZO;495#KQWiHM%k}2J{hBSUtM5# zGd7^-+B*1pud0I|_A$E-cR=rO&(ti?p71L0sUMd>TWo2|8Av^yfz025J^Rg|Gke}P zJ}kcJvm<;^wtKL9oYR*(-PyX9J>(+Jm-g5<ajxFcXk>Maf9x`kaAvW{dc+uNJ;E8n zQRZ94b2$Sa!9M2qXl~wV%q6ZlbMY6yLqGE~<66G^(KcU#=5_Q-Uk7F2PC5RC+el;n zucA_AZYBMtQd7X(I(j{HB$y3u3RY_txaUtWwJv@(^v?gQw*9BFCjDQohySFj5o76p zgKiq`_5?otGn`Etvh0T$FFuAa)_f!rPUell|MTs~jbPmL^g&-OWgO1-=^rqfZy(3w zxo0T_i&+0k?P~VCa}5PPQ%CPt(28*!W6sM;$h?Jo2{?B!)*7&nZ$Nw*cM`R@7M(XZ zImk(7?0nBWcN25OPb&2?2%m#Z8>yQZbPL}tp?er76R?{*CGJjcbf09t&H-;QA9Q1W zgpcq3<~HP=NqBQDY#FvSONYR0tAaDRzN{~6bc0u_8#o8M0XuH-ZE?A8;Cy0`o1{VB zgBlc^&YHY|d4T%~Y%igl?f+M8MH?i>`B=Th9ZQ46zPb<RT3zGiiM{WgoGWr(WR}O@ z!hdUy@6y`o$vU~0yRjdz?<~gYmz;TX=4Fkb&bX`QJYpmmR7d-pud+srsKa5_lNlNo z7{FQCQO-!tVlUN_b%t{cyH9+V%UxQqudW7n`;;<o0OtlcGf7&mj-7aKVt=d5a1Z>q zzRbzIL8%94*H`&4tMQpWta}IaC1<3M!@C*ZXgYRQLf>ZH^zYlx;><m&sKptPy^8lM zxs>~S=BRScbk>mPetezJIO`e9-k~;saUy>uJ_zp%&};kqK!1-O$dJkXeA~cy&Q3=p z?y9rhFBA8J*__K|yG=Rk+Qi!bo(AK?54Jfoa@Io!;BQP(+u$TE@*(|F#M<4~9YQ?5 zh;u*vskhpOScd|mIV<+ty)Ad3%V`JtyXW6)c4fYPgMM@e?bCqv_#U~lpzFXh><Nd# z&zYS$fj05)J14RBUBkT~eG7lBpFN#5j<MiV*1SQyMJi+s>d1Q755F_r9f^-H0smwV z>%n8xRkYSOP3ZG={sG+mhI|_m`h6Sffd1YI8z$hpPeA4@{Ej;P3EjT_I<T(vOoV+K zeusn&uhOR2!uTEJwxZm(DxLfH!GZU%&6incPr`=Z*9CU?b>WvCWF~a``s;0=?8nRR z*Fjx99Ip%P@XLQ3z8vZ)0eJng*`Kl}k8tOCy?Szg^c;8w%m9~xrsiX`>sH$C!Mgo; z{l47)2JAVSds*=>@UIU!!o~*}Yv-v5Un3Xvaqi)M2zV#s7l<=}AF0Q&?EhQRSN!~k zwa__Pclcw1@jsS!U*xW0P5zXA3g|mGSnV|tUwAb2Q_Oy+H)CZYbJ+-M19M+M;}YTZ zly$v61*O1$JLlg4Oa!CBjlhp9G$!lQ<KX@|&=TL?KhyKiCjE0x|9p}&PTvnV@EeWg z^qt@P<G#=D`{UG?e|HRj!Y}vFJpJ=d`fuX?n7O2_brNF{aPPJPJOQ|~JnG+rE(N)D zcocdwV0V7RzwdN8r+S6C)_+%6c>En<t<TZd?1r9hYU;l`@!z7^{`<BfnRWY~On?97 zzdhhCYZzlL8(*g_b3|MBamMs4zNa#eO=cW^$=LUsB$fXn{MS#Z@#_dTS2`!z&DBi} zh&ADG6OSFa@)6rUkOyc=cd!zhzY~`c>xc57+@%t4yHR$AGl}#xQ@2{@T31=uS+`pA 
ztohdC)_Yd9^|SS>b=bbazRjLtZ?d=8zdEluZ#Z8&2UDA*7NibJEleGnT9kTm>ZPem zQkSOwJ*`PvYTB7;=cHYfHazX(v`f-%Nh?p=l=e~D=V{-k9ZWkG4M$r?yG2ir_KOxo z&yJoKy*heVbV+nsbVYP!^!e!ObSvFW52l}(-Y~s!deiii(o@r;=^5!+>8;b-rT0ib zFMVeE!|6|_uSj2&{!;oY>7QkskkKyV)QnykeKXF?I4fg##^{VO8CPapoiQt8amLdb zFJ`=+@qH#8Ju@jYJ+pOYZsskS(=wmUT$TAzR%X`hta$e0*=w_3%zi2RmFzdO-^u<o zyQY=XD%h%ds}`+JZq>R~`&L6*m9)C7)!0^7wYtC6;#SLBJ>NRHbxP~h)|a=wt@W#| z_vTH=yP=)aF4QjN>5G=9E>B-Rrs|Zc4pmoHO{`j2wXSMo)z+$Cs(!6HTy0c0tWK{! zrMg3P_v*8&FRUI}J*s-{uF~DnpS%CO;pdG%zkb+_AC1Rpov4;tg;tq$lQol8xXXId z+G-uJ4q1n-qxMu<VZQyQz14})3R|56w89yw15(dPy&!d1>WI`!YFpuCrKO}r(*~v$ z*0sW=Y17iyrM;2%N!k}_|4REgs%TQQeY8iS6$a1>lcQzP1<@y>%cIXFT7fMXtq@7H zLX-67=`HJ8p%tyrl~x#%zM!rZD$*;{-^ozMiN{-EAgxf6QJOJ6V^YTL8TV&Ak+F_e z_>NXEX@zKJtIRfuR(QCs6-u(!WiQQMlf5pxBD*sCjqJCw-}^7E&?C_bV_IF2XoV+Q ztvuce*V76$c_n#e?aX$8cFmq1Mk_>V1y$u$b)pp}S3OYm60Pugq7_taSEp3xBwFFD z>T{|usxGOXv#Ti43ab*WAiovjyW+3LUx~jI|69B~zBaxlzB(R@KM{XCzBK+={NecG z_=5Od@jH%veC(rRZy$So?^AoH?7e<(+1_jSUbFY=y_5HL+1qh%vzm`;KB#%W=G~fi zYTmASt7db}n>DZ3Y^r&!=9QYtnvFFt)l}51uUTCatNClq%9>|ume<U#nN>5h=C+z! zYHqH%v1UrmDK%|sTGwRPWYuKWWYk1!(rQv`TGq6vIjJVOroo<H_x!x)n?2Qgs`h-o z=e<2!_Pn;|kv&WH+`nhRp80#`?wP%3%ARZXT)k)Fo-6lUyl3#9Gxl`Z(|UJ&_xHQM z+r4x5N4wwO{od|dcHg{v)b5eHPuum)uIG16-&wu$gPrg1d}rrdJ73?qY3Hju%Xco? zdEd^NJ7?^izVpVN*X+Dv=OsHY-Z^4t(az30JMPTcnN?j`{ZjScswY>Eu8vk6-7#y& zt^YXqkAMB6X7n?oc8}UQ>ajLX8@r9w#%!aUV>yR%_U3$%vm@v8oKJH;&UruQt(;9c zFXj9-XJyWFInU<IY_%@?aO2&LuWCG@@z@jdLZ{ZwSd51M|KI<YHQ?WnDp*u!!W+;i zxE1*8%Tma>P;L&bLU0w}#XmRpvaSQ%v9j`s7f)iREo&y22fBm#b>LLqx(hrG^1+kf zJunn(1=V00H~@YI_ku&<SMVS>T!)9D{t4&}Q0{YCJ!NxmYxCB|o>7MuQ1s$03$I4@ zn_vsL5TJ`4j_iwBK4aiD!n2^1*$3W}9C)0s33F%cV2^_jV80-xHmL=>B&D7K3W)zZ zbO0Dc_$BB$ppft@Q0|vgDWCm16kj5h`sIbQQbz#v+wVXx0hbbf7fN}2_%^0e=RWKY zpuP@Ze<<O!DGfVp-c+TffK=?`zB(-m&LsRjbRfVE7rr!XNxO(}3+V7#xS3G&`*5?M z*x<v>hGI+FEreS^r-5?9$W2=Zs84pSN~50A(9a7$rF{ZEC;R~P3xIz25$L}F^~R1? 
zX|!kBF~VzWyHzwyxPovJXib<piM9vb2v<RSfYS+6u4unnxX6uC-#**}Q0hE-HsK$j z=hebRPjoW4n)qL!Wq>~C(%#VpfV%aPq4<5#WrWkA%fSl5S<vUeO2U*q`g|??--(H? zu7%e|xMIM9=M<;|T*B?3!CH77pb>B)@g1QJYvFZ;Hm-$tDwMWKZwe1(PHzrQB22l{ zTY^-=J)zNBaB8abj9PfqOFFjt@EB9+*qTlqdihY_4&YH|>ApVXdGynC>c@vizepbf zW)eONN}Z%XO!#K#)3xwsL#YoR-W=$vT6lAzv|0K~#51<jE5R#-8E5JA<Mhu6-z8zD zGEN{&AIP8%GJJce*9^++!-F@YS1mloWJcdwc=Z1a`k4>!Jt($jV59dwba*Yi51=Jr zH1Ujuj8ZU$@W;^c;7Y=uLMMT%31d^n?O+z+FQE5>#e}P&PXNm7?SZZXFB1M5`g$$A z{m}2g_r!n8Hi~lj@c7O&f!_xXK$B|W(cd$pAf5Ohp|ne;-!4Bv+kjlc^vTRyYT^9` zomLC)FqC%4d={P~&{eev(2p`dszrc0&B6d5fd<eLFq<%SoQ3^aal-Va?4{sw!qcH^ zz*@pHpzvf<MovMLO*>}Ojsfy#SAtgv)4thnfHw)>0euV5R)IUA?}1MV(_gY{Y7uxy zYUR`-fbFe<wFoSMHm^m1KG3Q~Edq~1Pp(DaF=*>r1hBVN`&tCByHyWB9}Unxtx9SU zz}8lm)grJFItGk|=Vj;>;3~qDp%v}xgA;$XqMdyNC`T(_)^fscLsx?532&8JGZ**> zY=bhF`UrdsO|3=X6X@l&2vGjk*Mr;O*$#cR7J;2m>ZJ8v;&-v-!sfgQg!e$pzzu}y z8|_R$eFc7o20(}~eV|=4;OjjqJv|IuM3}Z*jtxG71~dxN2_s`UW%m)Zr7D3_2qU*D z4|E_5syYG2La-}zVl9HyQPpIyka(Xy0M-$vzN%gV)NAks=+;^UUxa=Rej$E6^w(Mh zsh_IDwFrJ9RbxXnHU>Y1VpBEqU665CO_{2Fo2i%T9N^pe0~Gsw1P?;H*CO~6^ek{T z@zhWCIp9LV=&Zg7jO4crDSLGZ7)6-6tDXbq5@w9-Dgw*}A;!h-Xe~nYyPuKoBh(Vg znEH7GVe0f}%IYJ8O+UY0i%_mav2@rap1O=5twpG_41Ukl_?Uhf>H_TvPA81#`f)xe zAbc9MKNvu`4|EVHB;1!%32aT=$x=uE_Rs%K$=ZMQ^xu+y_febEVu$ar^Y;|scTGbI za~5y{CjuubiQm3#q{d2C6E#&crEpeol3J)GX9sK$6=l)NVDidRwpyvRa@0nrD3{ZL zwrZ#L>Y$G5q|WN1t~yoS)LlJv8fQ<v)LW;kkMhNDCg=<mh!?8j6|K(FAf3&b#5p=w z=c!QV^OA0ehU!8M(?u%MaE;K#D%K^u!n;%@qRsRd&DLDqq5HI0kLpQ1p=EknPieWH z(Q|rMEA>}BuT_d^wbtqdm1~{W>qS-Q?|Mla-R-(Yr5eK>+&?r$k8857WG%d2<5>yj z=pk=|uJu0lKGEfzbU)_()qBpm+Iv$IG?g>u722Tbx=fS2b;eNdeeX$4)Gh8m@n>(> zEInW(8%^AI+%4`l_hWaf`+@!zxJ$1Xr@05b*6vsC*P*tdcJ61Kz<=m|;#PBhGG8-v zr{?KyEzmu>OZWSyu3D%idPEQFM{|z(DEHnEnGc&wbd&j*`G`qRjVUjt#%IS)DJ+T& zx_)>}nSD}XZf;TU;fc6ghDW!>jK=L!+Qy8$=s#kS+_o_@@9YbT&deO1-Zp0CU6B%v z<rfyE$MT1_joEpA*7VHun~J_@@$T>z@D&|x@$K*yndvbnw<y;C`r(O$;lq*V<egAF zqHWC0Tb*IdL3?ygadC^75aZ>o&Pqh(ACC&;HEI~`(Y|dgm>0dpSF#ED(U_HeR%SG2 
zw;B{vVbQ!X^F~GeK=&5u>BC#hOMG2W`{gSP)moR-A}Jju;k@XFiQ+`^qU~c|ZgEjG zIv}(EsPWOFXzA!$5kB7uzGhTM=S2t1>pv=UUUXh&LS1G|7Zk<v;iP=N+F1S=|ARaY z5}JB{l9Ha@BKpZZs)0;rVcUguwqYM^C+1~FKdDom87(?{NQ?BCF}!FV1v)EpUS@RO zS@SYS`Ih*_@qhaEB>BzIh-PT$7r+lR`a=OQG_!x^sPQGgFNvSEaUO-3H`}kaL8Y1V zyjZlbs8@?hQj+u5D&NTO*Uvb6eG*+V@zZC%5X+<UE-1<zjrE!RT0o4<e$;&a1x0HZ z!)IL9Z;cT(AhGCWvF2l1CM-o_le`$Q)C)h_(hb)e@mVnrO3#i33X4`7#$Cf#_xJls zEQyYmd;x)J!&7P4;v&Vo=$SDqx4?)6Y6G!g^h};l-xlY$#a3JWug@)rg`#J!30_eD zIXsrkM5$x#*?nDQt+L;B+Bn=BF`A#CgP+5QdYJM26!SnnqkJB7#y!jr4>AY*mDyz` zGtULOh52SEvq*Plm<Jd=uQGB@XC}Iu(K?MeV<|IQW0h$Vv)l{JBfNDKt9{+S81#2A zm}gDaQ)@GRKI@FPl68YQIS36AA1VQ?HQ{AaBnqa0t<njs8z+#b!362VIlPEn&+qz| zON|cjhXyKndQl`b87ehJW>b8cX3%EjZMKiUA<$iFj@%YW{P}@o_GW|GmqF8PDLNlt z_z*r@9zNMpd?oB_Mc&qP_}%`g(kZM(x#-Lzt}Xf6Es)wD;h8crI}MaN!`B7-yRMYF z_mX;`=d>+6A0XV5GW3dqGO%9iog$sSg5UG6l=7!aeRuFHx#iLse4kk%4annpGcwOY z#-RPu+3=nN&J$S5@7QDK1*8uleJK1Fk~WOEi_kY5o)M+e#pAdlfL@XhX7a2W-jVxi zap@eXWD0*`U@J!h<S88t7W2Fb{;^fkxL*7XgaWV(pnp8!2{~YzG_j>L$(F8)NRv|l zaaXVB&nuAc+Ad&|R8|BIO4kjQuHP+9Ss~rDP`WuLP2Dcta)c+_@J@sG_5x`J`DcOI z0cp-=X&y4?7fN@PNO!}xzy{cHZ?beB{P%a~NmB|qC_S)GdXVpjrh)@J6^Hkcanhq( zq^0@N;}z1A#4YQEk0yMuO6i%I(sRiF>on>4#nS3>X)Srm*Gn(JziyNC;&$n8dD4a~ z>F+J23h+{~w6RpG94x((0(MKU!v7j}ydD87q&HSdo5}YU;deSnTaHNY<w)<(mp%wc zA9j~MBF^_eJ|(;zzJDx`KA$S>7$<#!{C}26Ul!x572zully(lrPa<tkFR6y~z2yIj za(um2+K=tuqW^pJ9w6>tOQavSOFu2eKO*i>GQWd`uEQ0K{+9SgGi4Z?WLV>5a6V#q zOJoEm$Oz}ih-{J3AWO!Hl`@i+%4oPuMx(_t8t;~od_+dmDKeTB%Sh=UqdDm<*2`$Q zSw?Dwj5NN}7s$x$B_pdyMm9cItA#RJSITHpAmfy&GICeQ$Sar8cB_nb#CL$FBj25o z+cjUtsmSThUC3!kGJ3+_Yln>9$Ul9lj6T?szfwkDZ0tw=g2^(@<a<DO83V~bXsC>{ zx63$(_;Zn67?p88`G&NVF_gT+h%YLXF`TrEp~diy%#(3xsf-eIjE3*Bd>La>WL$2` z7zh6q2V`7XCSzhR8CPY=U~6hzLmBAb#&t(z+(7uo0vR_!rv_x)(p|=_$i0nx)Aq@@ zJz2(#4qV?9%9u4!#$S*>dx?y>IWp#<_l_wt=1-GxC*ixW<!-Q`i;R2tzIU07`*zE? 
zpYWm*84qmb&xpYL;3gRl!?R?+j7PEKu{kmxkAeyrPpp*j6!tEcj1>VH&rFu_+yogb zH_KQ>p64SnV$jutWvp2$V=cD6fQ)tMSwBw3--z3Qp1%`cQ6*y|<$igljLHQvURf;T zHS%vl{~K8{-Yk`|8UDAH$#@&ycMiySH(ADe@Vwtr#@0d^AFPz|A?5vOf{bnD_}0k% zBn6bo_!Ro-dKsT#+vo6oaY)9O@Kh7tHCe`<X)^ZilkqjWzCqTv(C-8e%J_b}j05ET zS5(H2@c&fIFQ0;6BQg$muYFi~><E7PN^bMk%d{5Abb873X37jE$qe(`@FxtFd14ot zNsDDR+AK3UOJ>tjnavK!Y(7QiNlRq5SSGXOA(<zWH?2fwbcIaLSIkW0(Z9{s$jIUQ zl&v!J(9<qoW_xHy(mEltYqHF4<n56n^Rz;lJ*UZ}Pn)M7l$qZ_W<O8>@0s&u4j3nM zAo|X#kU40d%(J)2Jg241b4fq1NM<4V&mSstNL1!fbPgkY5weDt$-KC`%;K2Lk&|VX zkUpwP=4J4gQvNa6Gj_Mkanodu&ysm1Iwm5Y{%lSzka=~n%xjQ;?LwK?O^|s5GNx>o zc@y$)UMlmJDKc-}E%UamGN;94PCp`Z26oMach+K=vm-L+l=Evh2V~A$FLORJ?<9P8 zFPRIFaZeYS_m;`LuR`Yi%VaJh&jVXzF2;t3kn!+AnM;uSC}nzVtIVa_Wj=xYCp*Yo zhRsjgJeEbyip?^gO_KTC0+}nzW&Rbu=lOmfS*rsw*Oba!i|q2rGG7=dbKM-7FQRY# z5}6yKGXI_<v!a8{muAY`IA7+=>t$AMllcm=UPI2N1v1}&zF7p2y?LR`w|2{Xr$FWw z;@{mQ^L=<fSRwNx>gD5r%uh;XehT0Ap)&uGBJ=Y?nLBpK{9>QXe;$%qMZRjvQhh|` zPVC%8zCE_gnsGAsVbi{YGQTF@*XaAEQs#crznvoU`(&91(ECF#ng2r1kKJV+%#r!i zV3|L!mw5;ozoP3fd`AY#JUUY*zyE8-7s$d_vy5_CW`!)fmn>($EH6n`V1g{>c`LM4 zR%EEG6Sm1>p0^q$%StYh)nt>bW@WOPPm|SxJgNI+rRB>?9}M`;ES8mZKvpZ_S}&B9 zvr<-@#j;L8UoJFni>$U)vf9DZz6czY)nPI~PRA0kLslnbbWQ@5vbqr0H2{d~8k2Qu zOHc^b%jyPCx2a$UR}me+9Dofy1_Hu8mdZLU1<aS#GXk)oR}Pp0cFXEr0LXiKFHj|` z53=$FhRW(YSysP1S!a+|fUGmiW%Un$31F+N0bKy`1IRxRIRnXm7P1EsKZy82+hv_i z{<Eja8k`KuWHC;yb1Gz=I}lXLIxh>%2M1&o7J+@T&MySzV85&jqF|=1A@B_$|By{w zryz4EaYK*Dx)52z;2#G6F!EkR_@WiEiju*4S;NUcoU{?>8-YFede+5TWEEpa@d8<w z*kA%6ZRB9ES=Ob!0N*7!V1=wv=o>W^9FR4dvWz|?>oW9~rU2xXR>>Me{226%MebN^ z8@pfD<#WIeS>v+6GFeyTgN1;&@dD@_Pu?q$aph)N6ZoFM_eAndMAxLjAjT%75bTyU zdA+Qw0|5C~BmWxmT|>Sy?6{8b^+kaEH<0g!C9<XvKczy}jUB)=u$3#UE?|YMn+3MX zno9m#kblckS+^2*>p-wT)@{k46l{_;4W8*!W#NBXw{Mp<1N&zpf7UWtf7vH%&SF_} zDc3ym&O5?&7<uQDcYe97I|BfDcg>V_H*t3#khP#d);-vEFZ`_C*1}?0i)?_b2ax|@ zo~(xsa`i>}Bb4RQnX(?60MPq*sjMg9e{!p=r$~S5kgVlJvR0t;8T37eoRyno{gv{r zf<BMT7&=$Ox3;CM@+GofI3nxCg|hy}_l5#lf8QqSC1h+|A?xKTS*+pKtLS}gnXFCZ 
zfBm4WH^{#^S=L+Ve7jKAJ5jJy*1LA?72o^BZAIP(+hu(?Mb<}iWNj;!_3=1apTPSG zIzHVZYdiF_m9jn`Dr*Nizd-*#m&p2(@>QX?daJCRn`G@m?`}|o&b{+x?OQMFEAoER zUDkf`eA`9Vcj)}SgzLoZvVPbh>tE#iF$o-yb#St*pQg(Cd5WxGQn+U9z%^q8Y?k#K z@(zO|d4POJr^z~2CM(Wgt}*iX&9@G`kUGFit0S_#p|S%DWC!zQhsMbc_mUkUzQI=6 zNi+GSD!v;Zl--21ru(^~Bwur6oKz~i1#favQ)HjKRCcsdcKRXNnR8@kFO%Jxd^rnc zpVCD(zOJ1Ie_Lp~EwbAa-(iXDj>zwXye@^ZyOMS)^1CmVeHy$ySIX`^PWI_jWcS%D zyDvQK3+;je*=K_OWwP-TZ002Uto5?bM&4lf&K)YdaI)<4x62+fQ1(#bh9Tpk6xl_{ z9llKVh!WWsgJQlf*(CeY9N8trj}FMbY@2MxxjmNr;}*-l0z1Z6$ez$a_Qd(JCoPc8 zShcT?%Dx8q*P^eCyw|Oeef?J1H&n{Lu}t>O`CRL+mwoGC+0&3aeW~mjd9r6E$^Oe0 z*>jLJcc1L}$+GXPl6`l9>;>@NJ4N<F<lY~X{QzkXc98vWciE2+esr7crPE|TLEMvt zvX_mM{nSv|PnXDEj{Ik$vY$itD&k|vS`z@<WtXSOeu4H_hn?$@!?>_FkoNaH*%ipy z7?F){YqN*5HxYiF^1g}O%?D(^P5hRrvfo4J`-iy7oxrtilI(5B`UIJu?vwplx$MtZ z$o>L+i5=C**jX%l*L>N#!Jf^sYYxiZhaF#am;Lnu+28Dt{q17e2gb?9ceM}NvVS7{ zGx3Me`5SzP;XATh_OU$K@hNhQ969E0Io3isZg)9>^>RX+<V5DnX)sXEiTmX=oGhnt zzMSNxa++qzX|_^M^DS~(6v#;}mXik3ljLMZ<z!8hlifj1tAL!=UF77<k&~Mur|kqe z?TPC^-j2xWR3WF!R5_>S%ju4s(?Cz;aE9UZ#+K7p$jK*v-%WDPK<1gq=$|KN06YUL z<qYa2hc(g}d_c~*$Sg$9`Q*D`shpvt4?8I5qEb0U=opUP5!go`bw+HHbMa!ZT@HQK zDK3}8dgffRS<Xmwj2sF`A4&S9$pD^9;VBs`XB2Wq7lA`^E;}fv6kTI>$Qg^=%ZVR{ zt}B+w86O4YzjB(K31A{JCc$%6O9205bX|kYYe89soa^?<xnZcBDaC;D+=PsqSIC*V zRL(6>))wbhcy23`GmSjc$#Z*^oEf|2%seP(R;8T3%#kw(8|LQAnKx6;{3uv2=T7qY zxNC`=yD7thDRS<OfNlSd`<BXCxKPghC309(oCj?=50UR-_#P>g^C)$^G+EB$=y-y- zWzeUl%2|${707)CyPqR{CAP21k`p8UYUmnlEvF1G?3VK)`Tmw9XG4{ozaNydv4fnK zm&$qN2+u7@d%cXO7VvLwDd+71Iqx9z-I$#BOXYk(+=uXO!?sU$$l1PC&Oea5gYtZ_ zMb4Kq<y38#vlF{^k-lfLoW11Pw^`2D#C?PQZ&T!amnG->IdVA5aDG76zmW4Iat{*b zJi|GJjNhO~_RBf8MXu3OuDL?4T_D%lF4rrQ8#o|0G(~RsklYid$vv@`+@$q#8<oq& z*L9oBm)mri+-BS4rtFq`Ql8uvF}bM$xhE4J9VnN5o|`#TZWgrF9Jx6GxowupJq7-} zBDrm$?UB`CzuZp5cbOpfROEDz$nAlyo>S!VcShab`{efNAUD5MZr>$xna|xbu<uOx z`d7*wIA87{<P0v5dk!|7*G2C6@D4%eg$w0gM0!z`+!3T-42p~8UV_X^2g)tkDt9!p zO9OJpAnS7EjN2i1{7ku59+5k7f!wR|<zC%V?zPEs%O=ac4*mF3?hW8Z;%>^5dvl@O 
zsYB)7a!BrN$e&K!?Uiz8!aM7r+}TmNa|X+ui|l#mo=^H+w%ofn$-Os8?tRF;e}dda zd@tT2_n`{8k09?ccpj%7o;V=)DZ)>Wle=QE+-D_sCHAiZu_JQVY?J#!soZt@<^HXM z+zs$oz`qe0FK?IoD&g0V^E#maxtn8h--2h$9J%im%H6t2?uUf8A^T%wd`iCU*!mgq zpC`%P5dgd8{&TS0FPF)!+AOykxjPTa-E~Oro|SU<_L94gGJg&I20i<c^)0f#=lcM8 zemElc$11r$!Sl;}xxbdm{cWpU{0;YLOm4hDo&l`E^6)b}cbmMx6nUW%d6A{^PE3*4 zFi&3NIr5t1%WKv}Udl9i%~#565s}xjSl-D4<wXgnZ<Uu>Dlcn3Pg=;=itpBC@^V0% zCGv77$ZOj{Ui)71I)cvR>)KsjH{|u$BCjXk^gZwNsq*r(<n`Sw?+k1?vsm5$_y=O! zAmk5*@7$L1&O0LSeDV!}??O-{c_WZljGmE8<(0rQYJoh?6udDJd1Lp<8&@H3{0ey! z7R#G7U*6=I@~&Aguk4V#8_;=UfxMez@@^%3TU6dO@=YHs@AhT#@U^{JeE(&#yxG_? z*OoUAeRm-5&SZJ`)!y9)<lR#)@4hMW79#)tCGr*#x41ywgM=SCC~rv@o~w+L_bBp~ zBKz@8@;Eo}mMxU`G-X)6Qr<H|<vn{u9`l~Js!U#t{Hu4%Tf14_3ybBwST65x*tcPU zyb97@Iw<dD?0BVE-fLaty^fu4g13_8y@QT-XUcoOQr?GrZ^O1v;NLz+-ap{^99wp* zm-odXd0%43m;2>aVQV#fI|s_!HC5j3rSkTme{Yt&ukz)6gPw1<$UBfD?_cxf9o#PO z7x;dQ%A;?2$9gHCh5W_Mt^7u*6!7>Cu2&$^L4gwv@QkEffrb?dG_F*j$!-Nw;5n&_ z0xc&faPmF{(#I)~m8C$d{R+@`0(k`rwC}D!C-QeDuIpl+pDa*-eRZJcOa*#RRiMub z1^ObhfPDSuD{xkx0%vbl;9O*#A5~z;J_UyHU9?Pr5y&Yf?vkwvT#8+zmMU-=>0^jz z-3?sM_c+qV4+e9<5d}C?4P1%t3Ghvtpukn+yBeFW*`h$%3I(nwZVI|@+@ZkD*mMgv z-Ab8m-OUvjJktPsx4`Y26quR9(-`#sWeLw>z?>;Oi6Q^IVxGg0cE@G~=EHwijsmQ& zfd$C87n|<euE71|TeMJt#ia^7h>Z_tDX;{&j}U%rvI0ws6nH!jAooeqmLd0PTY=^C z6?mpZfoIY295Vl!q`)fV#WpFh8kwt!|9_2g1=d6WI@e&|n*9o_?G9#wEeezm1eE24 z0#K>II(S|rZGDLXe=AgALoWsXz6=~vpn|%pn6JP~0kBg3)0+fdChb-DU)`j@Yv|cz z1Iqb&6p;2t2QW^7H?d{&R-WC!^Y&r|-dzg#ejgsrbORqDW7`~_+*By=$zZTwfls&a zyoS2`Y?=c9C<O3-zC(c>F$KQp049J+1^yWU$o&%AtD=B(&U^zqp}Qt4u$#0!Szw6* zHOmy(OWfWA3hc`R$lgaCe3h)g*DXPn0^dwgVE=jrz9sFuO$r>aK_yRj1}gAlnF0rq z^AqxaM(!_L6!<mD^PLqu+nK4ru_e67DCDWmOa-k?Jja=@px08tz<vdT+Z7CVP%yGg z!3Hq}lb{WcD9AVqHX%>5g$g$3MMv`k3br6$D*4lr6pW5jFe9R1=1c{%4=I?FuV9;~ zf~Txdkg*qROIo{PfUNeUcR()wcCa)2U6I=jSv|Td$Q&N*JygLySqkRI6zn&R>(Rjq z_AgU#AYuIL;2`odE`x)s6g)Ri!SmKDSU5$&^NAaRjthyqD8TdJsS1wZyLhsKmrPJ_ zWETZ5?Vw-@x<{>0ka;~grc}YP2Nb+~lY&?9edRa>C+_FTQI3LF&sPv%ICy=Df;Y4T 
z+ZDXAyMi}^sqo!Wq2O%`6`Y35Y1<T>PX6i0xji3LDmVk(Gma=Yb1Hz&g3dam;9m|Z zI2#>vrYShL7a(rlK!B`y`xLyR3s|q<eB|62P!OLZc-LkH?=Ax5TY#K<lEDrI?<M{| z;upfVa4FBD^1vJg7iB5Po-+6VG8a!)@IfEj6?|wQIH2IeTNPYFzDET3eiWIHtyggA zJ_R3#K7r0B(DmeQ1(y|o6$(Cuy-%lr37|s3736)!2YH@__qlw4oRuBG76tzr0L37t z;40!)k+y2Hg3lxG`ArJOux<4b1=k+pDc2N$jCExSzKD(&iC=$E!3|MRq2S+(z<LEM zpcUl`zLW&ueQAe+8_~;oPw?elV5@?a@KjbQ_{tmwUnT9e!3r|wgPi{aUx)9F2$&4E zDflMoZ!S}CbFqS~b-}mBDfqU)QU%|cui%yd!1gV36nwWOs8H}d^uABr`^3G!Tfwc1 z75pGY!4JEDB>=e}^#aKKXorH^3IO`HA&-4R@MH9RO#Y9__wi;0KbZ<*fV5Au0CC%M zc$OAW@E_RnIpH073VzWN98~b1D;4~byj5iiR>QjsdAk=VScA;HD-`^SJYN?p_{~BE z_wQHmyMYRRU#{Q*^#1_gzt$`G;})K@RVw%s{6C}rmpKX^g8oYUVe%g-QSj&i1>=Rh zYFMg}v5((yOy)O8r}38yH!0-h^IV~lzfyOQ=WUx63Rfv~LU*3DEmi2mWePQHsZb*U zc$2dfY7$YXX`VvOhANbjq)>C@p43617E^fcHkoH`^A$Qdrcl~8h0<+>GLW5lM4_zR z3bi7wb*Vx*QH9zh0eDU!J@*h#-{5J>cRTX7$HopT6zaHNp-u-C>b#UEaEtk~|7ARX zD^{o*_H{d;P<LeZ*sf5|9EEyKR*1DfboyYP!sP+d^C?eX<o6^08Kf5!0OHRiU;mgw z1F(m^erOQ+&)%faIpjNcnnLGQC{(yZp$mu~GDRWgq0oiMxRCe@_bD{Y1|7g)K>DzS z06D|R!+aIGh<q0ngU#TGLPd)e8lDA)0(ggG^N17x&xjS^kU|%?1o?op;v|5bqxc?8 zzRR$63^t5G9=>bna?-|aQRs?sV4FhY3lzF?r9u-b6q>k6p-J#fCg0WMzZO|#Qxv)m zTo3;Z3ly5NM4_8HD0FkVLQ{_@bSt`V>kiO89n30L=r1{7u|l&4f@OgCIp~=i0fgsK z-a9A{^IK^C6ou|e2FP2mTcP_1-#=NQMer>SDD+SluuY*Q-4%L-_($P;44q4*(Bs&_ zd=z>LJDy&k&~kJv-=WZoQgA>a{NK>C@I5<Iq33#m6$-7SE>^}AT7~TAk;D8GTD@PP zwSyHZ-=fe9hZK5|IOd(u25hT{D)bV0Hlpigcq*4D^vZsPUfrb7YdK)PLYuNcr9zye zh2EH>(3|jW&R6Iy^u04sp?8t@9&+B>tkC-t6#Af0p&iKkCw5hlXBYYLk3+k*DYQEW zz|T4#`npV^f0ZkA5FI~NDfG)qg?=UNH*7duq|gy5bhM>H$IutgS6EXOHU@*O3ga_} zt(6Mf2NibbE9`+_p28t`!pUHT!jV#iPaxc&m%=A@Q8=kY;f5W+CWRTZ;l_&<PA*rt zDRIpPDx7ji;gjHLIauK|@}}=rIP0LottTto2Kl*Bh1(V=+>W>o2Ndp%%r3;ATBvaM z9EDFqKjS2P-cW@vUBdrvgYaeh6&@Q<_;Pez5mR_VvBHze6uzoT;j2>=z6M#>B`JIZ zGN$ZP_$F{OX}8#5lft*IP<R?Tr>|G|_T384#J*WA!FGjb=K%809jNd;<g=!R=fitv zk-~S8_wF4EFF^mjHh}-WBML97Q22q(3NKz*2htykDEx4q!b`B>5%N3+UAj%-Cy>3Y zm%>kzz8ssLA@6g@rGJJ0ioR8I6n>s?Y>UEch+jKj;qsvhzYtY;9Wq`Vtnl9i4k`Ti 
ze1$8fDf|+=FJoUNysuyrd;jpJDuubP2ydRL@Y~RLDbIVQ3cpYK2i+C^FiYW&S}ME^ zTRs8XqY8h9{r^~?@aM?<LJI#Ac~vP2S0^dFbA`gY(Z6T0!h5?YypOc6$@9$wg}){J zJM8{`s=^15@dG^n+OP1ze1(6iQ1}<<p#p_}MaE%d94Q7{6+Vg`N0%skjJRVf6^`>4 z5{=D@SW6YL_bK8QDB>MdB)C|S&>=<WdyxiG<V27(P?3f!6=|HONOF-PO=c?6bdw_N z=OWFgC~{J<A}vZ3X*oxclld*<v@VK7mnf3HLy-*PGRc>{Op&%_inNDzny*Obp#VKy zlN33%gCgC?(|v*>J&@C5zal-6-E+4hy~xu$2W(fQ4{^->k-q!|+`jPio2<wg<SEDp z$UQR)R4URxMUep&iZK62&YG&opqYxCjh*KdC~_|LoHtI9!fA?}Px=L<549D!5F3US zDRNPXB1MB08D6Z&h=3v&6Mr$fiVrC=lJAjQ6}fbWA|*?}K1D`h_b6nI+OEiGct>wh z<T7Mjwn>rFLQtj17|J?U0Qr|AdmMCJr6N~kfpSI04^-sJUW!aW)`ZQ9OhoP^;wG(7 z<SO!A1^-nsMJ7*B<mxGkTw9_@*>*+nc_UMHD{?b=r;_hB=xy5+nVzJ`?TZzeft*?B znN7Yq(7D7hw@2n<*PW9Uxw}Y_1$m0xGfk0uk0`Qmsv?V`iaZceWbqV59x4$3uOT_& zpUu%OuT~s({9&EQTkJIvL%HRaryM(X%?U2yu*V<Y$Tea~?PHNo%8h^;+qp3#X|-qM zmY<mTTHVmd<+EX<7fj1`L*cAuJ+wM(@h!vg0ztkzbZp!(Xrvq644+sH(;JR8E;&|h zu0OV{H-WK6@zH_HxSROs@BuS$^kFkteDsi=bZn^U{`Qfx=(q8hYHCA#zj2>az{~ZM z)k|wm_ANch^X*w<IQ%8}SX%p7lTTvJI+a@?{zz7CtVO5tmWj~nsFADH#z_qu#Uj|! 
zs8iD>$qfvzf$6y!t=pNc+ZkQEBy~BpQ`2NO<H)eS4O#>u5!<vo*=C?Y&~1}HqEDX@ zPQjqTJ+FSWZ^`o`18rJFW~}Pp=eG+6AMQ9JpX7vn)2(r4XQvHy*^n2iYaDEfHRxO} z58o{}7VKoi8nurFKPhjRsI@f>gQ48D`3;kTxod(A{a`R0%#AfnS}kfeP|Kr$pTHZU z+|`Y#<c2{?(4fa^G&G7i)Urchb+};<3T2pGyPVp&lV>@#fhLx*zU%3^1C29|z0tLI z?!Ysw@kal~*?kL+-4(76_+7x{wQZ_3-LBvzN1AbgXy~<paOA|SW}V9$di;^^+_kct zMp?}|tw~ClnH0q0&<PDXb?OvLZok$r?Z!TDlUm-U%}+}8d7C8rmN#hzZ%T`n@EWmb z`_<`2ZtT>PHuTB-=`U)M8;Uh(7fWduYml`11n<{aO490P-me?_^!PcFNW`;j*)A4J zS{)4k8f%%fx&<VZv=+~`aqe2f{~0?WX{~ue3ZIt$GZsu*YX@8M=_Gyz{GWc7a6NM* z!PX#Y?THPV!PY48vvJ~QQ~#$guz7-qTqh-RAv4vNW96sFpo2SsU??0pp}~ob8aHj0 z()^?rEmPb5$-h|sNupu8WqR3OXOF))&^Uo^ox5dr>)g1rm4IjV%--ADC~p1QiIu0H z(``?$!Kc04urhmN(#Brrp8j=@b9=p>Wc>UM|0W(=XiV@i@tcXpq+|E`m?&DUXS|y= z%(;tE9O1$Ey;_sT$cSZiUhR2b#X5Fclga0rUX0~6r};m6*o+&)<4+Lf#*#a&>CLAR z>)Sqd>L=xPfIqaEySBZ3Y9P0~ZS7ER+rGSgZOBeq9ikJS6!>aw!;?aA)D8cBb&b0B z$gPQXJFQ)VSVpC@x|jEBgIKgOmO-0koYpwkS!?)YL!Z4zZiF!$izcmgqrLnD&nIh> z{h!B^*EXq5OY?~@B-78mwqvb~eMe^aGS+sgmE6N8zQEo-_ZnjTH}aWogNBWI#MEOA zdi+q69=sEGn?y4@)i#-t|9@zE|M)h_D_?k?(MTGPWm&&vS(as4mStI%$C4~7isK(~ z62}<h8e?4JgfyfkB!Q4-vms5rd9%4`nsRBHQc9Y#EX%UlEX{|qGqRv8Z7D3}vTSab z^5GJ0%VpUvFWY68<+5yV-xiYL{hnu(-~jEu|Ga!!Ycx{qne&`;o^!tEd(OB7X0!YJ z(U#VDHdk7?YT%=5iA7|3Uj_!3F6okr*N&2=xb=qKQc16O<9!_r(H`Yr(3tyCrQY3` z=`O(QX(jp)8sK#@zt3wKgqvxH*4X0RENN+GLCNp4*=_aA7B6<&;_b=4U?AviPqrm{ z`-WIw-rg*-2OF+g{8jjK`)}34vNPAb;nR(-kveAEJiYOt;wdSiY<qjWEft8$(P-g+ zM7;IJTkLP|tO;jpidU==|2NawA5MMAc*Uq=`+sle9#KEC+i}BwuhC~~wC;Xo{&c=1 zeMfmrw8^q;lMeiuOZnGkQIT!?UOui{|4ez<;tgVp*o&v9JSW|QY%~Bq{wm=Gq3k0K zYsyzdQ;lS`;<O7lK%tUqrC=Vx+Jn%)m1-r!PcUsj)jH2ra<b5ZFXmkO3i!D;Q?=S$ zdM)0;b<$^oOSj^k=_mC;=v=N7;*FHF74C9VGE*v*v^I~{Q_^bPdawoSKz^{|^fiW? 
z`Q{_~%&*IC?i}@&g4?sy)|p#DixPC}&8_%1nU(OJ;U4mx312?fhqZ~f+U(>+u|3%0 zt?iavUops%Sf18o^%VE9fndPu3-Et}e_`1-Ke@FS9sAtw2X>Ed{o3!(-4vPrZhH0k zzZw-^+p>Rt+;->QxrGOAO-;FOHEvqbI=OvrZ1<VN&u$qlKE3+jk7l;+fA_yy9xtqn zy?<*|5|Zg))4u<>efH=R&HnuPvHoWk4m?dZKKH>});VhiyENu6qNl0sq!n0`*KPJU z`Eg|x_rK*3b{&1v{$`~s<qXxOwR~RhvcJi+YgEmBPPfQ!Y8ICsfJm>#;ngnvN(j1% zo4vLor_ay_0SHU5IejFoE8Ti13UN1i%8nXH4XU1kB@HW<8i(5#6<P;}>1$nXZLFkC zddf{#UX3q7<w@g7M8$TI<kt?t72BcLrf}_0^uFRCgB`T@7V*7ez-PyfY_kWO8Q$fB zP8M&)r;OVSr0n;J!igP6Zr*xiX689IaqrK|FPER&#=diW=eHg@A`g9Ub}Zrxrd3&< z-aS99D4Rw{7pBKEvyEF;27R&HR;5ZCGsdHvca08Yw~LkOhxQ#iym`+q)?)O$^k>@Q z%eyN(7fv78JpI_Dd*}NHkM<436j8~_k|bvp+lm9_+in_IyUV|@psXGgUl{s&Y4n<{ z8z%*UVMft57O^IwxPyWws1+;U?3q@hYQnjTySNO%WNBEoV=r8`p_4DYxRE11y0~jU zK%4bi{MzbfO?x!M^iw?P!1|y-{Vq&%({I=9%A2|eferq#%To*l1-A>ntX=FD=ihxY z>M(xH@=9#~!!uc%C@~Lv>$Dg*PG5NC!fVET5!*K5gTMWU1^&C!#%58rp84>1+t2Ap zg{!}txkNi!aPhnQip>a!{{5+_gXN7MU;5R?+lCLnn$3#|F~+vwR~u(vyAjvC(XvN8 ziTE;)8-Qt5O^0GFXn`G`Z;+9t#arR$xWQq)og&wjLB{6d`O~p@^79EP9S^T%2{E~E zJoeS0Z0P#lL@=(ZE@v(i54RLHNt@+~P-4D5wLctF$0Elft_>;wgd85qC&DeO_lLs= z`{RLR0jUSBX%SQ8-om`KFvYqS3v9@bxHeC5BqCnThQOtf2QKa!?21jQ<^?>Mt2j+J zWzChHUi#QcF_70uhhBs@dIW3f^up;_O0?ti8+6BVbXPGq`I3S=#sv@)g}gh?*!D=m zYn0VM%(y2S^RQ{lq9@U0{7KxEHy-oEqHJR<*iF|vXS~BU3Qr@V91t{{T9NHSC$6(j zrLZzUaXDhj+H(~Pzrb>>1(yU^!HgFLy-w!W)RQf8pJs0_kUb^I5MLS?KmJ+I_>L>) z+2GT=Hg5fz#}oDJOcZ73M0zEc)iirf+#(|DL?@`m)QZh6v`Xpf3)38`W;s{E{8U*= z>vdcYS@kAqb;YzXO1Jh@geBN>$w)Ef8^o{0J?%vtKPfze8z=~6JNy{s8x+MaJ4T<G zs<CqwE05o+FM19CD}B{SeeV#(-sx%p&V=X(N5wg78s0LDwdEOsq-x?hU^)C6v8P~h zYFM0!b4GZEy<ttix6yhE*CYy4AG~GREd3mEZWY!FWe={3!l!a0q9a`(2O3wk;5pr@ zys24Tx)bZ&3=?Uib+=+4@idZaM8Vyo2b;MAY;}r$S1>my7W>?;WV@v`*)A4bo_sDC zaQj3jW3MFFtxdh?Nv+7c;`@>VUznZw%)j1$V&RU-Klq|_Pwuw!<qPjHM|^74LDtR= z-WY#>diEOD^5?kvg(Hs{&z&)Tb8L+%&qqHvZ&9RI(Zh=?SLn*?u<=@Q<q!|;npMpO zgO=Yc3ocSv!3y1VahK$BKvAN&2f>94mqWM*+6=M3`RuS*JA5<r#ocSiVY025^V|I5 z<j@{P`Eh3EIa%H}l`1ZVY%IFRVS8_}-+IwwO2hlad-t)tD|uv0QD3Ju-+^nKfP7Wq 
zYeKn!*1XP+F1U2rLIQ4uq+7Y_wr0rgdboc}V27&9l@t6@6P@He6Z{53$deK-e&IH( zhc~C$T-9g+u2Ik8h7yz^HI%fdyQ0)KS&)*HbQ=Y0hyeB0I_z*ZkGBbD;XJFGl9M42 z^T)7d^#$QZ<1vH+h{BRQEWl>wuZ92zUbz2`=Lgr1J%5tzEDkm%{ZeMri1Dp6*B?9i zAiHIB>w-&O6NzOlPi+0cAKx>6cxK(RCpXXTlD)?(lC<T~uibs-he!5Is!BqPIHUP) z$Yvh8HwoDcpjWuuMC+G<JTgsW%9&Q&wzH~)m119$vo2jJ+A%i{Vj84$Qj6wv36er| zp2LlYAdr<U=iItSep4U935jyX){?f;{avR&fas~Wh}#x;NUMi(2!w<piKE67;NoDd zv~}BQ#~u_(TPa4xuI%?o<~ELV&GtMLyWono+LPo^x*2;can;mRERmdkVB4La8ve|F zanHlXzdy0{sbVyn6zgqHN3a3o+pe5jY5d1f9Jq6CT)M|U(Z7RrH$<m44ot1SVfIi- z^4xp(({FxiWvII#25LNVy-kV?ePuI~r4ipFpB}n4!ELn!T{;eVS<sb!i%<@c#G0V- zo?N*DlB%?dp$<eoQohn^$CHuG)Vo{NTF8~_wUAY3u3Xzfs#NQPcw1`euq6yzP$96| z7Bb=#*|XRR*|n9l4mWZL_;s5l#mKj%A&q7+7|4HUC(xP!xWA;}?F;6EHosd;-}|rZ zC>xEkr0<xLZpb+8igI}4uEVl?YdB(T8gjFNw>|YEk|a4_GQJ%1bbmjV^hnlAafciT z_I!Vqz02Z3CV46sfT=TK*9UJ})=ICVC;xlZJ5NDZQX$fnR1`NOLjsyDPw{v&%!}Su zTz1TM#4$I2S;4nz>9p4pQlL#<6YR0@jm-k*z_c!Ie*rkSOf|Rr%CEQZ+|Jsqxscw$ z1-c!-v1%@tZgc#`B53Yzwq&)r+>|{bW6}e#zcTV<9x+-{5M!XEyO51>0EUrG*}W9^ z1?|`!la^{+_@Rc`5Ic&;4(~X0c-_9Oi!as34$h69NzLp%{oF5qxamh~b1^6SYJ9Fx z*tO{}JAD5)v*M%@kneDKj5D!U?|u6GXZoX=5vj4xr_@V`pvlM2Lw1`WJEzb_yMK`M z#tOZuQ*}39TGdL6U06qkl~A>&bGisq5?#6so$x^v?Rb?%=tg}I0kO}mw;&$I7EA8m z*Mw7Uw^q-!#Eg7l76@jNeSGW1Q8x^x*p~?SZ6fo#kL;e>5y;*&c0=KY%}<wq^xd^z zI5_dcGb|AL!K>ol>B)D?58j{IxMeup#lCmfea4%q8(+AKdH(*{t6hop#;;AkzwLuJ zEDO?`nBy=Yl<R0kv89qvt>YfE4hzk+23521#2y>09wL8W!y?FZ;p8ISbiuHAyDjM{ z_T~|M$FL@%#h&yqU=qb)Uo@T+Yve+Tot~a(dG+7E;yTF+(wKMpDhXT7dYjl7Xt?nD zg+Y5gf*27OW)U{QpUuNQYa!<z^kbL(B<TeCcwD5pn`Tw5;asKLOgy6GuP8O?cIh3s zM7S=k2Indc{yFz+HH0VD`<q%5Y3{y+1m4>no`xoRPd9-SArNJh#s$e^T}s%BeMKJd zAoF09<*8<F!d{zOD9#O^y6&Iv+jQssFFk(x*?sFb^;XyqzjfEQjWe50E^Xd3_jT!> z&t6fwv6S5Zg~{!IHFvi<xB02Vum1Tx$F>jVV!Kc6|I_2||9tz=Q@5;~D6c)RNIJIl zgI`#7;ND5cHVfqz+&Ot6XjM5#HxGTQxa(SibpZ1ys(_~X&gn8VO?K%b?!5_an`;_U zSgv83=;j)S1*83|zNA^*T;Fid)tr@_5xlaEJ8eW52&Y|ko3*{r=Mgpcmp(Ujb1=Jg zY)j#W8=fvd^XS^egI7Lr#`qV~wAqWA_Sqp*(^w$2@WNfj+kgM;w6kM9&|ciraUsdJ 
zS-?~SLR=KIdX+F?LV@jSHFD$VFcs2sl{yo8l`bCX0EZD&c&ag*SbaUtF;#+2hvvmI z6v7Q(IKN?NXahm1_?rh2_a5edGbF4KG^a{Vj3Qq@kAV^SF8}L|6e|Mw)x-E>x*x%e zRtufD_(#2$2fxTa%s-#0hQQ<VgVk#=uKQRd%rN)kgyk_yKlC{uXcGVXcJdMq^<vm- z#Xi8=i;P~v;_W`)`JYE3mZ;Ci-u$=VI)$}n`dEvyE@<A{B1>4_7GJ~f3kod)#qu6A z`!Hi{<RN6XWUQ1+TnwqP4O^BUKYITEMtE(BN7_A?j>P>)8z_f($b6;0$1o}7PE2hr zn%57gMzVu2dfD|Sx`E6G>`2FpmWd1dFyzG29EqI&M;x%%-^LvnuVxCyYsx6@K;-rv zv5q1uP%*=ytdfqWO^s+*bqDOc-9h_NJN6R%#J<WSIvIgDw`Vg6hNnOn2dD?RQzc~P zS0UGuH<WRvD=AZpQUH(4qmWCF!#e8Whbg>^U`HvpcdFQ4^h^lhi{Fv%h9A)4?sAJS zLAwqDJ0I|TzM~Y)?S=4uyDh>H-4qKvwC!zgwO!PbH$&NIEWdr@1NVMEF<t1L^+n<_ zc6^|B=R>;(p_fA&q`ML=BgvGyaP8EYuavgTWD-Li#ZB`N(xzt*mu_0W^&7+QZ@*X$ zfSmRl-?u%=_nRr<2|?>r^)_rXD^zVbPsU$S?85cXV~biDGa<8(TIs1`Wml_O=A7OP zN7(Gr*Wr$39KuDcgs=d+0;SdJ0p8RC>%j+;F5QRk<l#G}@wyUO-=mgLTW8Pk7zw4` zy@Ux719Uh^E}u59_9SgyE85^?=uo#PKy_;Bea)Hts)2P^(s#$)<%z2%AxmhXiH*U< z1}Q%Fns^!QZ%&5Xz1c!LIUt)B-~>VpKYf(Q7ki6Ij^hB^5a51;Lo8^qNyn~S8NVfU z$Db6wtPVti4}SZ$@4c|F|MegKesbaNU!HvT+u@x*x!*Wx46%P>8`v28&cLB}?jL8( z($I<iu^rhU%RZF9-l^<t{z6zhS{h5H?|AL|?4I3s88?TX{`Zr=x@&6Y0psn5w>)$9 zo-Icc(=+UcY?6($pBM*?f7tYwe|*DCY_qsy`-+*X5-+k<*<`j+x$w?D(=MeC&RdV5 zzEmR&Q+<iD6s<F_hoR8!WELfiatujjWk6_%A#573<G~)#?A59c09?>x*iLm3SQ*)@ zC2p}bTP#l2Y`H9#DzaSh@^iUCTYD0QZL^DEw&TgV7c8R9>HUc)M?JZwkSw3PHF{D} zCg&6Eh0I!Z+7Ta!W)$V*^xkjEvLoq_xdJtSclH@miLflL6*Cv!wkbA=8Jm1jze5o% z7jrUO@0>KAdjFP0CLmo()GQ(pF5q}hwxQk!Ay_a+`n<FmI3u2>irG$I70PMaySt&c z&Rn^Huth2=6mk_-n{L1+#%L2;scPACx&xb-kSk}&_J^~`<Fvk<=0Knbjf^R8E;Yv7 zGQtn2tE5s9LGQ;0jx40+N3g9)?xluEb2}6d#lvoR@K_0Gx*kC`t=T<VBgxgBZ{aAd zlZ6;*BE11fn{NTeV)s*y!ttYJnJWf117BoEPqO&^+vl#VKlAcE&p!J`MvJq@J~w^* zq3o?Metqky*vQ`~%AO&f*h<HZ&;0ALL!Rh`ca7NX?CXb4JSiTMjAy34fyuR%JH`&~ zS;J95Y1N{ou-GrjIYokxo%rA_>2>H%i!djY@t#*E|A2ySk{RFXW2z>d(-Bl_0heA6 zJFeqS50jSQ*MPZ}WdR^UNvm_~N`#Elje2E8k(`kzgwm6>wzlyIGM|%xp4yTkvM);2 zJP4x^WClg?l^2cY@A&Bx@7={h{<xBiY%?BQ`o_xPEkFC_@qf4_>Uz=l-&pb?Hva6x z8^o8l`V=<z==#4qbf59|54P`%8gF6!MOuIBTC9IPrtI7xl*6?87OZ$YSN77K5VFvT 
zTAr$E?Hna6xAS1L9eJvj&XGSyyj%$=v{~ytxHX|2E9|0G1RC!R(~Z?3mP0<kp$48v z@<`dAq?Ls005}4#{RxXWbI0Dwy)y$}{I~lanv1<T26Xt|1JA#*Z}AQPvIRU-%&ND( zaN@ul&)v3gS~9j9Kd%_?AMOA6qTzm-;DElTtXWw6-<owZ-4An*k?A+($zPXMy6sgs zz)Zfhp2mSy#PSY%Vpiz?Fx~?VhsJw=rSLjh@|^C)A`j)tNuE+A;}DG(B~GMLmRv<K zw>p$Hss7PtzUy=84S3gRY6Sd=?oC2Q(URV`0<XD}*5KBfAUcnq@~RM5lwmjGczqS+ zAALo*a5asuX2?#+M<X7Dx90WjG;{k9*CR9KxCVYyaUE_QRz&<cKWS7qOH&x_7S#hK zuGo^2t&v>QGml@vf=^$Oh==m0wisilZ<OUd!^3-Inbn9&L6u$20BUD<{Lgs1_<DAO zqAd0$jw?#BCt-Nzm_znFP?1CtNcwV{k!_=|yKIXK`J}PcTnEedu@0*!uO$wGUJp$J z-r7h~?|_tfVIzQp4i3<EK&s^|twfqRvRDaP8J4D%U@K@PP;K&<IUKTm(L*1Igur`5 zzFJN!3X;1L7Ddf1TLwfKm6*8<RCp3Skg0-79+!12j&~6(1FOeDUr9^5wce862<rvd z1i{DM9|zAq6f&7@yjz5jgKo0f<!dD7pWyzde+0nVCF$1ob=p}V5N->V4sQF>ID6ti z$x4F6!pM?)T0CJaEXUoJ`6StP+mbRhTN*W{FIy7GdIYtkZSYm9aIa8Kk*wjL$|xVF z^%{gF!N!yTc~bE%21^GP*JV0)_;Mm4B$PMH39i|R7Rb}fF*z?Vxg~4Di>k2ARB3?K zR#T1?uz4L+RH|uAQD9*sUDm7|E!TsT)g((=qg(4LX*Hg*#YU(BX|a?K1nu2$25<_L z;S}Ayra#9aLTx8QC>ONF;4^wXZ5HVmvt~TuQT4gAFB<<me)Ok@pV~IIS(cv~=wDQn zY$o=a@tE;~vF|5r(7N{er@r90?{Pyjezon~V|zZ!vZa`;6lx*ZK9Du?{}gR}f_;J2 z|CCLFY7Lo@UwjyTaXr=EBIF)ydA$W<v(yDBGU$Z#w1K>y#HsAcwK!ENcp{yM6HpH2 zal)9Y#ZAx9i;_H-UmSpq){!IU;3AwQM`f2~D6KXE&INa`1k`H%IsFO}C6rJPcfp~$ z+>a;?q3#|Y*$Dt}hu~d53BW}SA^-^QGEDc9Byk?w2q-xp5w2f)cezGhSTp>4F!&$$ zFY~oH#;H9+;t$9h{CJi5(PdEZszvrqWBYGWQtn^+7mdf*SamC{p1CM<aRf4tVWQ+^ zxOR|K)d&4-hJLo1@HPauOjQ0HPZvH4YyWN$u%o~I4nJ~Fm$v?8*ZaX<<9=};t~W*K zQVU&U1lO3%p_D=Fm^Fw7{KV%G{BS-D`G6KWr!!nQBlvEZwZoc1j1EIk3_XptX2@*- z-U@mIeqUo`6%<+%W=0iNO7($4<~T@yX+gPZu;&xwh0NCfnz?0lDA?lkO5sdu-SFrk zEYnoxIYrsBN>b0C+IIh*QgUt}8!fq<aVb$ad&YA6@|764sQXUb*R-%;u7kEdPl14L zfu5IXF|<sc)}!3q*~^(eC^l2c0lQ%AG7ZMEP=)!!Wzp-9c4${oMpAB_oaS~<^@rcR zQ0~4W>cXfNB45SblP^GCXJ@vcFNo+mN_jZN4=A}L8UFV>*7Y-M*mM*R0%=oN?NjaI zQ;PEWzI3PY;uvdWPx(@6Fw~$ZPdOu3%NBXH@kIZw|3ndtlkynb7EAkVmp0J)Xee!* z9S{4W7v2iRTyixgca*MWcaKx-7(xu9K)y*-Cwc5hdQ%q3$t3cG7?tZunUGj(0VsTj zs>$c{M(9l=XNNWVp*J$$6lDo+qAsU3x`=62iJSg61(7!*CDa8F#k81P^CD07XbGs5 
zBZAYdadNk@kAwwP!IEq61^RhN!NUiZIlXWxS|$e|?-|NxPMo+q9&ho*eo*|_PG2>V z^GDU<v4b{O89YBG#6I|??Ktjvz3{^-n`C94ww!WZaW(l<EAF1BTUtk6Z^oT>;m*fY za)6vb_TAsSOL(*sx2lDo)D=*}YMy-j$>#GR^E7j5!jrmM_Zuzu<Wi5CBjOPKEaP|? zGbmdUh82O;D2-vYCtD>25r)k^`81J%1id+kAWW1AW`MV~#F;JGnq)}%E-#!Fy?y51 z&q$Um!5wYT+iYE)5=!mXt9q?QufrXbjX7yj&VELg7e@<=vOKe@W?JdksVFSPPB|o7 z9V;^j^z_DV`+&t-U!P1R73Jt$_(?_SXzi#~#J4ST@Bb*s0#fq)+08RG7gIm`)E3Lc z?rs~UD(}A&jz~^g5?kM=&p(ju$+}(ArILeaak|jvi}mp#Mz|B}GX(UH<G1Aw3KS4z zkx3WG=H<MegA=gU6Sr?z)w<6Sk%a2F-Bh|7?(T&$g0_buRK;t4;2ok5yl{W|YA9tB zrFrcoZOHwI9JI&V`Y1T?dUVB4Swgx4y6GjXJ?E}?8bT%7=HLL4EG#p8uAhL}Svvw( zD#}z{2V4VrMU!A+1$=-dkoMGN3qz3Lj{^~|w`wMFp-_m?S=!og79Z{zQsxICbieq7 zGv@X<Bqx(XV^^jfT!17x$Lho9JwsW`(Y0*h!ZG^Deo@wrk;9hR44!NSS)bqM>Gt_$ zRK4*$HOO-YF;^X`wBthAOTv_)?qRZQUSCF;Ho)uMgmR+3LVPHvnMP5CwwA(tGkrq) ztw(nTc#OoW%)nMWz+D(XUx`C7xOHSw;_2}__KmaXvz0fDe_^j2oHhQ>Ul{Ki$G$Rq z;RHgb+azP`j$eMuc*A%Y-qgV|Z1%#b4}{9%QML`KRsEB<ebKm!uATgo(^>>3kS9cd zM~v-Igq~bEK#JhbQ``uEpwePV1rRzSF2dA5Rm-2#<0dE&&(p%kJK(AEaXQR{_NsN~ zv^WY#xVtcLy#ik2dAiFq*4*YU2U>s&KoKk`$WaBe#$92yy819_Nx-AI_-+W(GEs2* zrSPSXBj6kgNPa9HXNME}PJZj&gXM?K;3o4@@W_*!lG$y~9@#oPc^IYbc{7TUb{Vb8 zi9ap>m#3>6qI2EuZGB_6ANrKfIJLk3(uyOgq4hOptb6$S7BS0~8wf3{)~r@Sy!r<I zKqX)oHehjOl@SWq^IKLMDG6Wg;Tl2q_yWxx8O|kif{Dem=TI|bl<a_v0Ct9(Vvvz8 z6Mo?Fa%P<1Ub5CYdrV{bsD5CZgm}mpkcyxrg<aA{HCz-)+JW7Dvf_OG!qFoE&px($ zX750JZ+ZUVJ;#S*$<Fwc-BbIAkNvyxqyI#se%#f5gXO_18y*}O8*I$vQjUh&dp1Yd zW)xyBYMH5LZ%Jk7nT(0zk5p|037_VG!-2h2vK$*NCrIa7@?5-;T&3SqNI>WMN#|Da zkg5hcC+5mEJgTaJvnhu-)jEWY5+Ng4!bD!zJE3y}c&$N<rMpqf)9mi0`b3741Uo%i z5vm8;I!w)n&jN0}4sJ>7rw=rX8;Q0ifbZ7a3HfUYH*$6OrdB^Cb_EyvEomTWD*>Ur zsLoJr3z+3kBspT{%$Cz-V-zu{e0h+aej>R4!PPs<pJ5htcF5_7l<xc7E1}K%q<0S+ z>7!*9Jum;3_}ZBm*m}>ga@3ffoC*h=kFtdnmC)hei>T2ZgTAgJK1K_9BdQ~>ByWT` zgJ;jhb9z0fzRCI`lDY(U;3!vBlI9W6=Pn2jhPxz%yYLN>pXY4NB5HG5yrg#_OMb)= zY)R!-LK}5a!fBw2dWsA>UeX-y?}}15(VpxgN`khMK7|mGCEG!VrGj3w6l}H-0f7Qu zPLTD)$lkjp(DqG2nWdwtroLD4HpRO8)^}cav}@H3BU|cLU4Kp2wz*sGyy2ylQKcn5 
z%#w>o;+boTH+ru7{HNde)4kueI*u*YvX=ac&W^_VuYS7c+R}<}yS1_Z+TS0&p<G;} z9(vSWYrJ}BamDqeLPJtcUi-EA`=06Lva`V-J`VPM4`#xbZ6rHF$iP(LF+cDl1S{oc z($Y95mN{YbDeRG`c{ZM-Opm8|%d!=KW{A<MlI7J-7oa~~A<#@9;6Y1ybgK>Tl5WTi z6)yUiV0$y!`)IUGC=P<(J`{8Vv3!e(?!>F;j`-tshiD7@)GUC8L!;@wg&kjRN{3TQ z&GXEO#*XjV#D-vZ=#KrFLh7Tzu5aO{E&CL0zs=>c9lclCvw7Qg=#m3_+Yn-fRkS-K z$x`ihfw0d!$smx|d!Sn0$V#3X0G0uqz_Xq%+6Y()_Swa=o~|71lX0M|(rN|_t#$OB z)+T(Xu9Xh!V8dL;SQm7tKxk<>kOcZw(pm|F5%9+#n^vU9JXZqAIsg)u1tcbf4a>Nu zBBBKXi=jpzO?=sAIj8Qpec@}98O2fPTL*h=ks++{d`o8c3-?Ujeeb6J&*Efq0w)(o zB=y3tXLlXXY-Ph6zyA4Y_Un-~3FT6(AbTcejLe-cKRln_`r=*U#QuRxiGf6vM95G4 z;1||B<VVk0;E_SOi3DfK>mdjZ&A41}T`G|=1U*DF!zA6^huM!)Lh^$Zr<`cnJ+dIn zv%YrA2?RG!HM>qJN^&6HR_8Y6Jkf~9xJ50@DDqRj;#0DGQ+tAqDr`+MCA~?p3xzq> zaA=^p^ZmC})+0519y?x-dA@dF!}%vHg_#*#yY#^??Dyc>uffw4%B@&?ts!4dlN+gB z3+gd|yNc||twucw1~8-Q*P8MB)A6?oW}u#MX>CvHea_!##V3WOwnSf%hu@cu%!s=K zL>xVW{1CU@N7i4xOCD02_imw#?~|k;aw$;V|8nU_IBz>*lS95_IG`w}2EyNjoTjey z8#`S|)%}h~ak!1wJxSGN+!4-?%kru4=A*Kl&NL`2FEP=nM4j_=I#q%IBaN9Fr;8;e z>4+!dcgi;8K~*_Q1O4xxQCUuWxs<;UCtPcOvh9)<h>k@3{FHb)nG837MqU5mdC3*b zw4DE(sUuXIN<z*X*cVKnL+~`*PPr~b>QT;-#!TdO6{kv9T*EbHqpGFO>DOVKyWW&~ z{>aZ{Q|j$5ZS+Y!;P{O;`{d(iU-;R7n5WY&&Gn={SpOR&Pg>fw<W9TLkdgnxpD}!1 zi@L><KQN0vi+0yivV9aEo&E1W`wjmHU+$zYYlE((p+N!Z)-eCMntncB{lm5N57*H@ zq+j)E`c;pgefMWS;=cj^a07k0jNiO0Lt9rW<8M;$#-~eV{N`mJ4VSdlC2d2gjK54@ za3h)X09)_}+LN8>;nnL#XFt7RaESEj^8eWXx38J1r=nP)d)IJZOWtG|Gcsvu2Q!j= zH0^_i!V)j@Hq`9F=}y&@5OF|}!!-EUK<TB)p~(?NUYs301p>RX=e(;<@f+_(;}cM? 
z{J<Y5%J}vyQ+-Xw|LPp>>*3%0GewCce70J*>mOY;F8BFp`xX4lz&gjb=8a$YT3(nd zSQQmEONepz0NWm95#vjR$ffe1t#1uG5l*GppY*SETP~WnXmvQ78XF8f7-C!V(Mt+x zD~=c^!mYuK>9fvDC!vocRPS2OIpwYkmQSuL-edawH5cKaW!UC2v`$}3TOyG*h?W!r zIa5njcedry=6?Dka)nfzS<$<^k-&5b7O-!{vI%$10`59A>IxhG5ij+c!h<WkHr)QU z(^L14E?cm(4dx$E7qKmle=7BVTrTM&8y}O3D^&KQ{*vdCl&qa2#`PiimWQGp4YikS z-IgCWMkBo)arnj=#0mHFIN?R`ktt3<)>LjMp6hTs>Z3e+;ziX;J*NWj6Fg3^o~yJe zf*ZY9zBa3jW-b?}CW7Ee=turnKl?M&VC^nVc~WnM!J^N~?pl)V3QVttewJ~(%vfsB z`Gga#a*b&V?6P;H>UtYdjy%XKHP!uS4;p_X%^AdHg5FN`x&-kEDm@O5uMH=u9GM_G zN)gWiLO>R`f<s0jM*xK;gt&H}9gS6fUokN#p*CrgwlTxF5J<XCv41)4@)=XR55>Z* zapmlT>__pN<uJ<{PZ>k5JiB9LVX!~k5KeSTQZy9!rFiDipwlCZH{YT>ddMXOBE!bQ zg}o6+)FO+~#7v>6JWTrds<9Wn2KNgdJg*3pcoHbJBUx2yQNj_^yu2d=)i-b_h%G?` z5qFHVg+c)nICzznu?5_@FIG3S>sA~Yd+li@1$yY@miHqQYx@D;f|){pFZjWI;uv}x zER@X)Xsg3^%~04TN<6znH>bt=>zDtn-CQh-OB0{8LeqaEY3xZ+XCiPW9YfrSG0Et9 zK<?hnUY<8n=y{c`zsBDgG4`?Z)(3E3U_ptzzl81!94v5#F5VY<O2O2^eU)n{kirwc z5J<Ul)w^Ww7}fhI&M}=5J8w+N^2NK9tq=VAB{>5qaq_d$>l15M`D7WqJpAo9#Ahrc z);#VmWAf=qmS7H{EXTmUB4$14`8Yu(U!>{aMw$UU5V1TG8qZIzXVNUgaOfutyLjt` zuj6;h#xs^zK8S)#7U5hfM2bj>DPY74k^ZY3e_~Y8Wm}$iIL@zgIDYOJaX41jaoIm3 zO`>1rUSKI5ye9&`td9P%lXusm_ZRP|zz#k>aUB&38oVYQP(`Af&#|4y9iBQ@`b}wQ zwBCBsFhU{OnR+d#(q1bH<H(IpSq_5-(l5Lrl<O&~S(&e({Q*HOp&qt+lCmui6HcK5 zsW-x`8&h<yv5k;@q@gH|$G~)m<g}Pevv58(vQm!<o(&K*w6v1TYslsFE)XG*g_5SY zZvN{w(iBM(K(Q6EKf8aU`JP0X5>2+)88sjFUta*Gjh`Qyb{QXH`@^HKm_I<q4VaK0 zaHF=M{0<7UUIQo#%T>=SaDv_l$t)?go+^3L4(L3PgCYClC<d9u1#sUBE}n%|xs*i| z99lrUXhw$?WwF#l9k+>d4?NAT{Ms!MDYbTX-QHu5pVS<u*KU<;+0+<&=>RL(l8I28 zX!ECMGmGribXF^FxXu^5xup=>v1jYO$+YBy(xxM|j!-Z&HgnuKZ9LyUSZHZbY-x~B zupTaODo?}K9C-HF0TYeXnoVH5iyB(%sR|5AA~Z<v>~gg{sa86w#*tUGR<rA^Cr6EM z#O3uMAnL`d1u<ng*o;hps0mB8^?qWIp{JTkEodF6b(eE>9zs_-sM_03YGez#v2|Ag zCsPIjz{hqQ^^(K=Rc#Dt3iV33eQY7lCf6PN#*xn~%xyh*?C*^?&U}8AJLt0ev3t8@ zTP9QtDvxdlb>|n+q9JU4{kwOa6*vC8{D$%KoRfLvKuv8Zxx@0$BdepL6l`~P`lpW= zN1I4bIp@TpLQk9V<TSp&RKj*4MP7iEl6VY$!aW2(+JSg;yzQLc0QEzkHF0uGz7Dm< 
z>+S}KqQ(MviIQ?&;r54d3Ij@U3Q1;FCy2yFJx$a!&ou+WC<ZL**q_|=#LG;4;LZn! z+eT%{yL#o;!}tH*vF)+P_kH?5&OXnSTV~Te<C(nr*Qd3YkK$PZ)Ndn6SW_1e<_FBI zhdM&Q)?X^rQ5897l4grAzH%q6awFXox*he7PQJwL)g=b<95|;dxGzvD&^oWT^uw%F zm=!_H7>DGkcpH-)yhf_J^-ko9V4$EO6qS1*%S-NjKmW`IUPT~F5<n%oxumsUWd0MQ zBEfS+1bf)h=$`wk6*%%HWDclIlW#TG-ChM9fm+}cQ4GbW>jFbJEw1f9IzP2L7Cf=z zfya*>Do4i)qKDZt4J(4RqyuclCPjH@BIpV-t1KJrh?)$Iv#m4VuROCQvatB?Prq_( z(c|_q+j4C<zj<mvRuVP70h?{C$7+>K9|pYHGGaLh%r+)$GksVkYHCNID#(fut|zpX zqtB?Kde0G>0}7WGcIg(#j&!BYETmdU8|njKw{a#)Ekqbb$8)*PTT7<^QV{*Yo5=BQ z8K+1X4%r3q{y@%(Cwx?<>@(P&C|ei}_f8CowQ>3O|8e%#%}lzR&9`58ZT0+@W_OAs zwx+#XoS9pVF0r|Mc-NCZWtf>4mmb=ASC;F12E8BH9pMQMekAUMj&Pnx2k)EV^pGI# zR7>Y|4<MN=_AGb<*Ey{LWp=n`utKW+P%5Ykmvu#<DTIhj2~sBL5jl%$cnY9Tw}I>b ziB<M^>j3NRH`^PlE)#`{c&po=&w;6sjI*tMx5%=#I{h6*nd|<lqHcfX_~O7wDO`y7 zR;UmBy-WS!dyAz@$g*waqL^@mQ`4{gsw-QJ7Dftb<F4PYOM#Sy|15|NLYAMRm$4Qe zU=|JiWdspvYGwrGAcma^Y2gT&4~n=Z#p7!9J67UmB|3gJxdvh&lqVo2B61^~sHBy~ zt3g@-otFm}pH{Ui(DVjz4HF+9ZIHw~b-FZ_-mW#<wb$|q0WB05m=@T9?{~O#KQOHl z|B5~b5vopvmm}+;E+DWw096A&eZ-y46$U2YchO^0j<*xrRP%H0U&*aU(Myc23IvfQ zzq3S*#(ElttE+4AErmm{7SB?9$NDSi7sTD#B>8BfsscF#UxWfss2z1Ep%7<pwZ+>R zloFoHUUg&q3uMH6z03xuB30J;51bx^anpCjV_P?$awMh(9x@IMJv6?#av;+`^jFGF ziHta(^teKvUdP<o<GZg&Ocn;FSNwE({$b<&W4C|*FE(XUJ6<?;a5k}+NflJF|KNZ9 zS!rfYHBL<I-FS0u$7r$;NBNLhift$j&o3T7lg_X0UnwbT#y9Txi>q&4yz54FS8;Fa z)<3-c;ZnAkjz!TGO1{4dxkV9nuvYkUlP{&^@<f>`Th$Wb0nOwxleDm_`TQ2WshN9C z^ttkdUB~+l7#6BEhi-o8!my7hQGnl8VXY9Al)F~Z*{&=!WEF?%R)JTgkK$8Ci2_F_ zSGgat%h5!VSMIvVr3(6L?3U%W1d2FdMiGmRGouh<vx~Jk3P&#@a^y1MRV{()DtxS9 z<3_tGRVmo4zpI4!_Mzq#->#|2XWDD{?n$1L$)e~v@$BA0QJsf<r4kRJc$#iVjNP#q z7sn0l(^mwnHDIzy^g~Img=1XqLXCVNaC&s?Z;j{omufh>3P0qP`&n^kc6@(3<c%3A zADkCm$aiaj(cLYS1F$Pfakcs!?=-E{*aaVU-Zt{Bt!-S?E!AeC3~wfKK%x#_7j>Yu zxZK29vP}&Txx=;8?686K4rh&)E$Sk-@gp6Gt?p=o_SaZ=(nZ86JoQ*_Ay!By5KlJ} zWtxQ0LC&|@#t6UL4(9u1K!pi$^4-bjwD=zzO(gqQrxJtT%WhE=EnoVMEN{uFvkv)* z`LHNjm~=KVu(mxuy!QO;GJhnmSY*d%Gjco_YIRz@rNU^0Wb?s&Mv)N~Pzy%c(Q2kp 
zm4}x+a73+i@#w7F6(Oh7MTAIXFqkgGyRE=5QxTa!KxzYhu7XyEAZ4k!;gh7IfkZ-# zvIZ4uR^zph$ADaXkV`!<VLjMD5zA%LAnUK{nH$~gBq5VL10pR-r-ifY63JB~sl-4N z$>8Z(DAdGNGha<ysd^N@(<{5gU~07El29(ZaZw0|Z!)Hgw~YTiG^(b;d7I5?_W%*7 z$AmbtQQ}g&V51YlgF?B7d^6@XptVyU9*2c0WTE^=n3Sdg%G8>x)UWCZ!~Y_Eg=o+h z#5D<-2RGWQcw628j22{UVSvn2xB@mgY32_<{^$RhHZ#o?E}*G;%JUno6QF9Uv$%ri z1(%L^W-vg6b!8p7SR1x?O@i1ii(~xY*<eV62<cs#(LlpW?8LG$kuc7&QNgikPr}tM zw{NCZ&2w8=9=T$CUa6B4DbU%o;cqI+@v(}GZd+$!VhC)}`DvTz7{5Cf3lAjM0E5vK z^&v%>SKqtbOfU4+^c8nI!fj!2Nr|bKL#z7Za-HHezN1D>wu6$s@o&0<HDY;Lf!2m+ zE(;0RFU`7YCT`a9j!{~HkIETyq%d9q#jkQbXW!LRm#9EV->^Vkq7*XMbLoMdAm}n> znKb9+GU;860?B!i!(&e`i}ZTION6t;ZEo)U$ZejVx~!OoYe&pu8?xMvezLc#*Pgo) z2pdYHl@_%$HcjJ}Y@|c=*XQ$<A=BR{ufuqzAy;KX)yQs8wOQT;uC0OY%<_8~$8F8> zJDj~5CWnycclv&<%WT5&LNH!l#rML0E#orkTOb$#V6A@D^{~oS?uuyZ8{@qsU5Ku0 zFfoU==}K*VB$=Zv+w0N75a1B1*Wbm%agdLu7@gnXHrqsf&_8U>aAr_L+K4}4cOdzG z;s;5J*7>&cF%*$|gJ6JdaazRtzBT!|daFDCg<FdUX47+6m$<oq3=G`$2V!apCqi^$ zHg$B%V%*=gvZm&tXyHLwp3n8bTim||=&Sf@$Y&8xU%T4p?Hdh0c%ShDxXp_^UE8Yt zY}=M=zZ@zV-=ov{>R6GLi)`U>wm*?IwArzsjc-u+H}rMW%leWoRBQSAOZ`~?K|C|M zTt|K+nb#HU^}e=l+UrGfvZZ{biLX4GGr+o`9&zqmhg2<hPA8wyf+_*`89D0B7{IVC ztd<>XX%CV%*g-KZ2kmqi^g_vp@##PgYe^&w49$5&bT_qO3ILV|?HKh~wkG6On5JY` z2dm5+GmiR%H8OW_zIq?6Rhpg|;a7m|Kk@iSxA5fXfdlc>G@Qai&82V4^5!1((^Z)> z@zv9#<A3wXn|4w)G;HP<Sveo>TqM(dA@a{6@Y%O4739xO%!?SQvd)QME^%9;DmdnU zz!A8S2VpE5HP50^lh67H5ouY145)Df>4drEnAsqj=fMXPSu3H6CRi}D`wP$Bb$05= z-p79Yb9UG3(Sbb^^LLzNr_LTZe&=(0&i$7?j49$e<K)QE!!zG__?3S(uTM5&i0^Ml zC(a1P9MqJ+mmL5+fDY^zAQDqntPSM}4IDx2;BjgF6q%zriLXA8ca-syNMb}kG$mQ2 z<xO^BrHe^AECR#QlDr*7kKo=2iBDo2fUbf{OnqaX#<+<AE5MJa6~5nHsrNTTg3tpK z4>999vyB6`XiNHdD2Je#!~kdnsn2hVMqq+lV@&o0L)XTq_YH)P-IQLu!XX_#anDb_ zx&M)cNtdIz8v)7~1t^<)X>*3T$!_s<Dr?-)o(y3m!`k0p9DVS`iI8Ho-Sggq@7%SA zMFBv(y13GIaq+3^U<co^xj~7-m=TjC$khw%;Vv_4qFy|}M`a8?f!0H`c~tXBv>>*M zqzOP7BA2Fr06j+Fr}w~idia>To&qH5>miByR>5`%dGLX-RSvbRB<tyMmpd{maD;4T z0CL3SDohZgNJSG-*QEBR7JMa5{fv+*iP6kpux${e#JtF0xK<T<slOKtLhcwpw#SHf 
zt=Y8iaB0&G*=u^eiX3S6wH3FXef9tOp0_RD8jCj9d3}D2onX0Tai93^m49<;*PqTL z!%85+UT#bz&yTzw6$34d1x}yE_#dmirmnuGR%uG+-+y)y$w_rEl0;)-O#dXl3VG&* z|I3`pNG)?(!rYX!o;*z$h(Wks8w4bH;b)e*Y=rT2SG5#Wu?`*pQvX;(2Z^yE0*Q7| zI5^w^OqG+v8i)|sf`MHEh;WFC1&i1SHPpL_3P8~d6paW>Rg8)N8A=>JanrcycGxbP zlG?wzf4D#4j5vd6whFDwno2hSrCV60bgwG1b97-So*b@sh9&31_>l4XV)2rXLAL%8 zzrE#yx5OXAzm$aU3gs2B-g1sin8wObTx?D~p+*3K2oAiB&UKEMHlvgUkQtj(eQ4GR z4U5WvjnkN+!Fv&tgM+}@>=-IHQeZvYNTZ0TtIkEi34_8BQAq?n$y0Xr_K_wkNZ9JU zt!dhR^+vFN(T9cUb!alu=?h3H9uX{xi;>YIibjVDY}a&Rz%tCJnW4YlPmM(uQ%}+2 zXg1z)dADy-x|Db(voLbY%z?$N5A9XA>~uPmZbeat)j;vOKzO*T=bg1XZptpqIzftN zEUd_Pc4f!ct)0&2_wC$##2FOBcc&D2U^?ztzbe)ni+j8q9vQp2*btCxH=8<_hitb= z528zLr%-N(SZQ`c0!r4>PLgP$shV){=oim(R~ZKUgPt|08M6Z(j9@?X<jN7QYZ3IW zlsmWtJK!35*IER+R<3ombDQmKfp}WmxRcNKc3WsN3J*y9aBfv>2fH3&p#*ys+#Cs- z+r&hXxx6vO(cWlt$-i=CUy39*jYqBf|LDH`ie=Z(hRz0u)$Xt<lGW$%On>Q9gza2- zZ{fO?V(jSOirazxnD+ISl(OuB9wCo_y`jC%L`r)C0+a#c_2BEdZj}|f<(N77k(wjW znQ1}O2^>{*ya(o1QM*YaWehpJ6CX5oa@9f3#Z|{k<!*-)s?g}J2(Cb^N{1`Q6FDx7 zKo68gEom8dxx0^i8p46HrHfN=;M824m1ISRwTxC{g5e~fzU0SGQ>x0b(n2rN{O=!I z%%kjmvXu9`<b~mlyJIXniH=wP{t`mkaKiV;``PWA2W>nTs5>!Wd}aUqmIw-nTZU2a zMt(r&_I(JBj&}$BU*P`AfqKF_kY}FwHg2-ZCKB64zD<qEx9L^M>7X7>TfIu&h4?nP zB2^46Ria5dY+^YVSwb0+7o_{z#d{F6oV;!xKRb&ZK9BLg(NFF&-m^%N{8jW((+Cow zhFC+5UAB0nZ-g2{e;#YLs25Wk$)~n=UO0$>G~$=-wc|YiCd`~iv|By)W=FST>>aWW zu&+-RNJnr_*0*p^Q<%_DUX6Rw`t#*+@-jXZxx#wn3W7V_1)K(2akWZq;)1?@HI&@e zI*3KY?Kycdy|3s+GS_Wcmh;gE6oTWUL9nkcQH;z=JA)^v<s%Y_Uvr6*#O_}oY&qqu zcRIgT2eP`mwI_~6Z7Q6U<*87!?zCC$%<)jI94t~h^FV4VWHL4uG5(Kh(DUN+zDU$# z{4%z3J<jeJA2_L4RB6=}#(do6NwAw+?2cqxm)CMpBsO)-7;Y0q7mKnp;pUEj=Te3! 
ztslE^TRMf3r<_TNC$7#SEa!5B%woWj&q8JkV6K(dlFVQ^kW{(310{r9d4{y2Hm_fe zQ?h3R=hWP&>RU;4SFeYehy^aVU;w4E<@b&f8q#EF!t&<tTpvazb&JG8q9poavW&_K zWrre7nzAg<w4E5sB4PmO-UB}Gk&vY+O-kcax61O+bRQDOzl!At{Ad6;>4$g}JW0+g z$~M{S*e5&WnnU4`%D*3L%SIF>lO9byq&SwQwqXq#iuk?~bND4BB6Z@ck3~X{$K&|7 zRRy!9MqY8cjPsekWu4-7n@w^0BjAupHplm}jU1%rtsY`+w(zRM%cMQZ#or)GHc_c@ zc_HJFk&s-F@d(X&6A@j}1iKL2RfvRJ;8|?BCAW$Tca2H`sGtXk;GSO1g$sXbf;HTG zz<*fGA~yH7=m8W7i6El`hxudV8qs274@~b(74u0Y?i`I}qoKZkZceB)CURy99rSG} zpG9$4#7Av)4u|b*a21IEC*$d@ic**h9sWhyc<in;U<q6Q!rbVh7+$xc!SpS_9xQeB z@o>a08MF0HR#(Sf$|f#}P@2pdZ(sO3Oa~|1ocQ29OF!a(b^+Winkt|>FJ>801WL`w z#~}><gJFOw+5|6lpkf#f6^47?|6jbVMQcV~xampV;l=1B=aUxn?=+#@y9B;YGXXq+ zb#&gMO07Ln!0i>UKhhdEw{qknF+kz@9GDrF*dQBXpk>$-MBxy;R`S1n#ojxWj&u{- z7cQm5P{MfcpO|~%np7ki2zFjLq<UEX8RNY@`FJcMS`Ld2P~0K|0Ki@q*UYY}yXOyV zAy<gq%dWJEe<jyza}K)qh%qUKFt@TzxSmgraa@|>5;Tum)JlXO@mV;Yb6R6gufgt& zNnJpv`6wJl6U>Mj<VO>!E~6w%Gh8JrURz*<s9M5%c>{a`fE`UZc{rxlc4z~FJjBly zk+VE3Y^Te%W6$neY%bfD^&oeg{+8^p{aq-uZ8Mr`R^voM?Qf8F7~eU#KPIbRz0LT$ z#c#?n`S=gn#upmU2KHa(*nU=0U{0apO>}RkjRYf1$%4+UGHNpEvIsDVvp_Jp5Hxo5 z!c<>qo-Jxd=nu|g-rKy9u<tf?!)Hv09_wkyXOlocDWVwa{|NGqK{|49E(?yhe1vnD zBd&*e^g?Ai0`M97!SW<10kd$)D?uQd->q9EygE#Q^<;yV_P$~P?ioe`DlmppK~%v2 zwSIa%F`O8*JngpK{Ckj(+xA)VyUXQc{17J~pU^%w!^LxKAImPD<l-?-y%2(U4$T=y zjb}~qkm56;z)o3a#Ycq(*gmF;6UhVfBQZ0aicNH+U@r<`76e_oo%mm#l5U~ef)yRW zh(1YMdGty1U{P@9H5_|u!FO_gWNu@zurC@?B0DDHBiGFh77xU|?4fP<tzNr!onv8N zc<l(ASxqv=+;}6+?t=YQ=f?AuYKFyChsX2MxaAy8i}#zuW-vkCJR+nIDI3G3(cE}} zToL>X&2)tSrd}hz&xM)s?U-uBre{a$8*BLB_^cI!<7Hcl52!TKG&uf`M>apasV2o5 z9bbHS-O&42Pe{+3<Bmk(tC-_;73Rcr06*U%P|FD7yVdZK%>fF{nh6f4o)O^WiF1?% zXc-sBAcYj#jH6u7bmqm56>Xt^5XIabg2>@W(JDf7G-P+Bp)rv%gL;<dlQQ5~yX7R& z84$ICz2J`*u>DZ!D^wkj^~(5{?p@X9YVg)!Vu(AGXmHegX`#8cYhlyM@t3bo{OI<H z`3=(R|4rOF)!AitHMNIZf_{g1p~(JF3XaXSiqF3ge0sy~zx?;`1+RD}^p{5tK4$8C z40G7-!Tqil&Qx*n75PevhwmuSR@&^;FwKWasI()a=8v@@lK??r2x18mDO@0sifJRN z)_+bfKqdtjK~|bQr?&#HY0Q;dhe;T%RN)-v-0@)=qrV2}QK27V87OJPZat6ShnVAS 
za4#uDw)hNXt2VSDm6MqPif&Qc9wcRKO+q%jy2~L+lLmnV^I{5+W}+SKxTZMG0T%o_ z5GlOS06Xbmp@!=}b=}@S%I;AU(R4mPw#9H|UlmWz?#qs6?i>j#$!~1g_*e7OYCPJx zZDy6YwLahqrw%?kyFZ^zW-wX%$k!dIu<`R_&GG2G<eC2LVm_XTg)*gqpZ!S^dfqQQ zZ#f{j1e(|Se+lF{U}xw$#xP=1bxyEC^Qf2mrVP#l@I~_#Z8msEEeN^*a`{rd+fVQg zswzv)h9;WGqQyK*k!U=DL-4%dg_oSJHvT~k2o(M(Z82J-XDQIoL<6`XEk661&uXzj za`q(EQ;W%BurHVkV&)&ALqU7e_UawGcOARy#$9)wKe*%ixxKTqyQe<8<G|tV3p4ws zr*=<nxkcQ)|G@|M?f;9vFdUP6<~DBo;?b@1v$syLPu(`RdEv;x?VF}|PQllF@N2|! ze=WTr%m^oh@~3Ezorc<sp*<N>*Jz3lZ%^h^e86J+6y~o{w?3d9z~L3e#ZMp_0ji2N zd!}7Wz@H}P^mWjsRk`vyj*_oi4NF{+TN++R$hokN^b4>xoHWh}fY9|>$Z#E!?{aJW zDxNL*$gQ<PBCHUr(gXM_H0iQvuNC29etG-?MiH}HVkU6_Q%QlM(bUPyku-MHc+HVL zj~}6iU%t}6U6v0Q@?TVx{!ri*>At)TdDP*k?Xb*hr_zP*Guf4Yk45lf>~>lH>BQ_y zigNoHlbrR9$!|D9t}uJ@fP%<|2*b4vea0Z$wmWO%Cls|-PVY5ts7prJHyp``^YFkW z7O^x4-8*e8FvYrny%61Qm{2X#*g<q*RcAn8JHI#$0?aqs3EgqG7yoC<!oq@e;X+t^ z<NbfS&?x>Be%Dsxe#;j0CJkdX-!z}2K?N~9uK<nEWcu}(Y^b$=I4ldRHFU0W6{i|3 ziB}CFDfXC5g%nT}?lG@&k>yvK&F{pU*Wf#O^E+@kXp|;iL)@jw^bb(0rQvQExy2V$ z9Ksr|LKBW#>o4h^s|b23t>k5mj9XtZ&bQfN#ai$4xAzqKuDFJ*A@A1kbB(9c5K>xJ z<BT3hREsf2TC<xMJ?i0Y9uecIPHK)I1tH(du`8li!GQn*3^vO!6xwCXFc`+uj6D(j zvt+ki=%e|6l&K{NEkv$?k=tkPK6+w&Am~{&Ft+J?@_gfTbht6@_*Oo3=ETvvub>YG z`3K?YW_(b`#Nuo!wOw6vLqA@(scSb|rmz3f&Ii7{esX5*x=mxpZ<JJ*8ZBgp$2*n| z^xnL7=bs)On`9q;*3(?b4UMOkF1$W*UGawg$w+cV&*WTj^Qy_FWRccVDq(!md2lJc z6a_#n+<9_Keu&$F2et4chM0hW_UdqD!n$bC8~7tcilvEx5I}2t;?mL9l^}{tu4EfZ zlcnNxQMv~RYAU&MWwQE8`;0{x6;$a}sUk!%k)a@b8&!MS^g$FvG{s4Scp?c6T*E9~ z8p4M132+pQWvGB!#^>wy@mT!gb1=+D-e6K4F<mf^4zo}$4IOsrdA#c|x8fqMKZGtM zqD*@M;x)B&Qa7dsx8STtPZuc%#v&1p89;R(o#t-4w>h5d$mC5;AhsQC#mKn9E<=z; z4~*an!FOv*s{meA3%GMQ^7}x_qxY7>@k5h~*X|nMbnnnyu?|yZ!_kSU@ojr%_pI~( z!@I^`u#N0Bw$V8C&dbK%v$mIBWJ%){_IP1(Y~wc$OmAlnw>Z98o|)Wp`|fXTwHX`P z?A!0KtBr?^nDLOb)_D2lmy8!#=B1YzFey<O5?{q^ju5O}75<RsGa?41;rArV&YTud zD+v>p3eb2MA%I<&2Qcn)ROL2tO+AF?!VoJH-I#b135gG)<;8sqfi6+%$kd%^v)4Kh z#)4@Eawf7Gt()30D|ljBvg@M2{Fq2r47Ro>Np=w35bHGscF}ZTT5zGXI|F07{m`b3 
z#fiCi##R{1j%<8!->vyoa$`8ye0=n$P0re&2=aMsC_A&}NH(g*riK>_&1afJ=&^Q2 zB7xHN|MIQ9iX1{2Mifp81=NuG!A(U`iFYoxCktS~2N7G-xP<B(HnjgMa_0-*KJ$}Q z%~yWC{|-6z!kIHKq~tr;W4FH;mwtXCbmAYR_=~qMe*+z6n7MIQ+9|#RIb%{J_}`e_ zSS3osU})41Nz8Pq3Glp&F}KZPoS$(zrSRZcb~c~9@Cs%&;`jXjGRs}MWMehpr;iAe z!oLYxNv&K_RRmRUXh%sJ2qjET4**!hGn$A<O_`0TYQ1@i>FCWGl?vBz@l5-as;xR# z88sp9Q5V6nl@{|I=0UaaDoaZOJ3x#1s3ezZkr+ihfp4xhZEy-U*fWX@xU&ls2C}{u zj8&tiX~FYooj1~6T0^8UeFFRQQXtqkgg1iTd4=hsfo=LbfInb<IDhO2A|dapQBG46 zsqq|LhuSdtY*I3Ufct268Fy0{JWDDTZv`tNzHGmy42x<x1Q`|s-cL4Racd$oo6Y{h zxht^s`jPyWc)=mX+ov*Hl#%9Mn{B*U8n-cfzPUM%m+`;+Ek-T=YFn!5p8QXxP5q@w z_k;b4Bj}twCGW{bC0oKK@gGsN6II`zKlQsz9lEn$>K7lCPVl(0`W%3@=5qk>^4p#R zKxFY4e-6MA!x3VQUiQlfi%R`bUq{5(3EdWjQOhWM6|-__*5EaGq6tkf#)bf@OIapQ zjj|D^z2FHAc*23Tssfi59oaR1kcogo+ftp=<K;_eA?&Eu#W4Y*-f{+D--lZVJ2`rA zY<!^qO5@?rKe%t-gP)gNr44hdN;9*i+5P)xXFmUVer=1wgyk)1Gb+PL;d4AQZ#8GM z;$nWw<08t9ehLU1E#&HWoJ~>B$DSNPnp=Zudgj9;;1_9bEydnWYV7<^pC7@N&4B$a z&yrx3^Z!JnVPlbYDa9Wv(dz2`@UapXwqF_!i-19VQ%qS80h>Z4)CTy15C~d<7GA7t z5fJG5=rI|XO^T7?d|L!Wg>*TTqf!;^g(y!o!}4m(cQr08K(#BMxn9Wj1Exs=%+j0D z8NTGOg`H%PJ~tA6O#I?)yFOxv)=~2jDxozFqM}pMbtxzUK*fj7Q!|l0Mky7Vj!@n6 z`E4gWZM$xn|Ki+dwnfhb6HWDwn9CV5=C#S;o2EPfIm=N`Vf)nl$gOK9ZV48eV#@BR z179B6II%0ObjoQgH*|aw`&kxp$Wn)ahvZ;tB4IK~E)lJp2MZJ;C`iV0s^&bWcjCTL zf+H#Lf}NiqQ!t<lxS>Zpk$8KTN;lg{@t~l0gAlpY(irc=IbA^rNKvB2T^5`eF2Fqs zbQT2$B<HxnYB6(J1PCt!2SX7c(Ta+F?0`DBS`lk(;2~gw-|p$DP1^@<y8pn+uS9)m zkLRlX`ESWFr*fxcBx2#9)KuqmxusySIJ9!QWZO8s?TeYTFBDI(b#n(KPr}b-QIY1@ zdw4EJ9az?g*r@C0Vt)<nP`cD#qf528egfu1GCBzxG+ZLCFp6>Fd>1aVtq%84&h}O^ zpm-as_e*N<wWO=oM?(mhunBc|A8?r>O%dkfjtY4qsZ+|D&hck(lmQ}|)PJo#&&dva zbcX*NRq&RY{q>+t;t3?6O)jD97BPKw_*}%r$7&MsfyQarFvX99N?HWKVF($1D=@{f z7^SKeE}wA4ntF2g_!vC^$YskyH+=}|KW-)2iuRaGkp$A(f*TGWyFBJ9;?+(6=WQ_n zRjC$A@tgSA!H4G4VYV)tjLpP6!N_5cjom$YZ+mpVVoN2XP+U)XUcJk`cwF_4T}bv1 zPn5DRtquRT)$Gv!@`#5=j9qWcm&9*)#0z`-hl?(kPmxpU<M5k%A-6L4mNi1J@MTP` zB+dd_f%Q-V5RgK>*O)A$xpHM$A%`TRrpdGdH3L^4--r<|^g)Uf-%~WX6SQ9-lohBi 
zylUA`S=KO3DFg7L_d$SGKw!E8`ajZt@c#86s(e6|s1B$QY)1(lM1lPcO4(8j`h8&c zF{->{7*_;?=VC2(;3q>s=?iXCH)y{U-<h77!HiyIv*aoa<+8&Cm$b$BwejAKl5@q7 z8g766oxl0P+&@Yu-)o+nwEp8HV;H`QX9P)7BJ*BzCLu|p6CKypbIG*j>$d-X_UMm! z4+InLWDl@i;z7ht-Te7A2u}%-HD7p&ie^2#A~fNo`BIIffb&tR0h-6`+O@N5*#m3n zm2{RpZfvz6@B^l~S}03YGs0+cAnI_LMBGJjgS*HOJuIeDYkC8YZ<2qb$U$(IEw$8D zL>-jCMe;O{#ni1NSeD$h#T|vDV|uhV6d9U6uxV_{KbcZX&o~^<+-@5!jm*YlKFrKc zt|uG11OD3uzZIc-BwuS@iZh4siUCT*#EGh7=8sFM!QcqZM(qm>?gNN_A5<7XW&m%< zpYSfhcIJy->tly@1)a`#tl8xn*mZQk<H_gKwbHMm=#5|Na2Ua4CTJ6*5s_IUQPBai zy^@bXFwoB_ai@4dI*NE92<|JPPhcmJ-!u^^9P!5zAb!OpYjT~KB|^Od)o7M@R(^2E zbxRHDn}cju!0-#63SN{Ii2T*99o9H?@8N^TPaNIz$ILl?)%<K>e%`X{z#sk5?wx;n zH=EiwH+#qImi-6KYdeK_{i#dwI!Pc9^i!OUV!^UEmJpBEDMtUqc>V6siJxDJ*Q;;v z2PEOh2lLjTHG=F@L3MeY5Kf#;BOa)9noASUVd6XWuhGn;CXA`!BQ9#_nKrA{${O=a z{W-7|Xy6SO)|INZ{v6FHC1GJyEJj0V7_eUrieaJ%eDyx6Nal&QRqKAF#+gi+1HVxr z5#aC;+jfvhMfkqeR8NsgdKL_7KE}pvP8Os`-J!Id_ZKj_8&NY21)?1YGi~f<C~I%$ zO2R&-JbW`Ul}Fk;bAEQ#^10?UrB3N@{-flv)@OFFhd)H){GPh4nSSWgGf=0I&hsz* zRyBLx{od0q_Upv_)gyJcWmnb)$G>(>M-ZdNjpOSus#AjAiIJ!jmHuaQRUZS%On~bn z&O;ruTpcaK?H_Ei+W2~O3S-pG$yWoLI#d#tTJ~_-hlhChZ7Q+ADE2`+IC9oDLiUDK zYM>zNF6A_O6lQfvu28jg=PJGClJuG>EQS%K%o8cT{m_7CTCqBp)<_XigBcMuTx9L& z<5;%dby$`xkY>ttR}%Q#nnK!w)}$z&xk}wu2y*dhoyVO2O9T686}2|^Qe9)TmEs81 zd^#mT)qF%GE#&WxoT!rx8_@4u$C26dGc)2V2(EU{AO3B@)rHUfCu?|OS@0F;GlQ?A zL)TC2!mQquoVM)voeRiqY8-1A1=e5zin9`R#J@LJv23S=!E|`#1~RhiRH8f6V$I}e z{_m&qdI?5$Rjyp(#McsFTP>zmyPFWgReGz-xrvXjGtog!qQx?Cp0dQBtRs2w4>5=z z+$yavSGn3;$cdc(8LVJ8jWr6F^nMgumP&(j^R#SN1K~ldlb~<7?xNYnN;vAnNj--_ z%}Z^~N$NM%g<cHnXHu=x6Qapx@XS~~FQqR3Ss2KHz-kx>u=t|g!l}b-DIfi~;YtU| zI8Xm)G2=aQi?inI;&X$Cl;?g~@>-o&{4s{h;8D(&%S|{faYMCQ6D^HFDdKQqbV9JW zVz@NDl7h>_nNDA0JTYp<mKQ!h@mS+NGUnU93tMK)Z+vZZxiQn;vB^kTB8Uk)FkK0g zsnGBZLMp1|NnZ0w+USWANg4;#Lm494O0c8zi3ylD+6Js1gA%ZRREuJGm_#=M#7?uS z7yW;^d-wRZuIo(n9J~mEAP9gUK@cQB5ClPx1VI2KLGUe-A}NZZXj!IZMYa`3ksq<+ zxUS>4it9R#<GQY^xQZP&$)s-TCaHg}=bS_MQMXN<WLhWVo2D5zO5NPtG~=6To4A=f 
zZkncgAou(B0Z2f6NuoP{On?1Jcv<b)d#$ziTI*Zir)Q2xQ#dIviP2bUPzb`M9ta_5 zp*rHd6rXFQzFOSc)lLpMM!SGH6lW$beuWxs;eyLfAcdd$M0JVHRE$v%7z!;x!($Dl zk`n1<bmK!pQQ`xmLir^B-6%vR+94@%aOs&xvhSg!L^N~#VUI^BO3WQUHEw#cpo(Gy z*ADavERTqMc`vZ5LfDrW3erk62@HkSjq6`-t}5J(Arh!c9EC4fa1mHYsQXY&Rl9Lx zSF1r;Ml&niIwAwLm#0@S^KU_SbugR%i;3h;l?yD7JnIDnuHpXCvwyo|4Fi3hPKuiI zJh45kyr=c0g#?j5DX=}HD9X*+TD!1LvtpfOWp9E9DTpV5Ygg?kC1a85y=I`8QHnKQ z)*2a?4@L8DWSfJ*(H$?*$=I&DeR9C-IrrWjE)gQBzD9Q=$Md3U2lP8Jl*(XQ(7*+8 z7e^{V=KI`UM|%Qmv{$z!)bCB;x!9jjQDBqwkxnN@ImJrx8YQk$nSsk(Ez{w9Y~BYC z9zEf;96S;E(vFwx?%NKBY@-u18AtQ@uY1qF@J_AUaoo3~Ky2J8X#-rhIdRZviQW6T zZWSmWC~hJAPWAK}sRD^9&Z!8T5vq*iT-J952F9_hIIiHLvU0s34$(_o>QOH3I9VQB za4*1oHqI9hOWu%t$3i3793ae;ogb7R3Ec6fMEmFRO!R0tRu`&|J#6;}-C^r^Z#3$R zx)OJ}gArHEl+M4s(_Jv1=^8gUUbDFDep56)U~zpNh<{tq5Ir}))BP|N?AVGNuLc$P zC30_ea7Vx<OhJH<&L*71l4}+CLvgHjpKPhzK4*s!o3O923DDfp6-QZ+#wat&>Xfr$ z=^AiJkpr(lI-K-3;38dd*CvvDS?$ib#>V%sIt;={)RR>=aQr}vh#b<=+E)1c@>F5u zM@IIIjO>%2$sWjN59III@owD9f2Z)j=cS$PKrxOSL#&;bu9AMapqhhND^znttW{=( zSS^pDy`n87s{G4j=EyDTH!0{BU5UlksU_8`vDjBGqIzaDV2qgJC*VC?0i1nUEb(6S z*Pk~8Tcce|cn^oz!t4cPT`$R0Y^J-TXOe={V>=0+Zw$=dcJkK7L_q%JmO?yoGLh~- zJTS1GAkT#RNM`1y<8z_WUdP!dHkalx9U4o-$H(Kz5#EsyyDi^@R)z_8Vf4qBY44u2 zx&V7e3m7@gm50(g1D9F@)T$s%B~`N)h6f1If)QSxq>B0xo&F&=ut&Tcz;E^pu=cZO ziZvtLip{4LIi=3yX4MZ;?s<M^K8(xoypvE(Whhl>xG1}v99PbD>l=zm!TorrUzEiy zc^*}IWf`2HJ@~sLtL%3vSHfQ^7H}2bHNA8Nwo)Bn`(+&W;}P7ty5qpFWz`5EZ90Gg z0tL}pS7;+qC7oy63vtd1luTHi!wdsk>02nFK#^m*sw#i#Rg1;nkO+GLk`<~;3rFJe z+o8c@R_j|Bc)giQC(V%fTz=no!tF0ZBBh9xSSOJ(uS=vD?l8o<Oy`p5wBGL_DWGXh zj%o0q@#(}aL9SLRg;J)~u}(gPH4NZM^T&nO&<OQASc78Co_w2RX4V|sWQunny&2TA zArz)?Qx=6OWJR{qnA}6@L8LRegr1`&^b(BD(=7-#huNC0l1vV-u_e~oo$BKb)q)#G zb%HM*(qWhlB!vnKS_#gU2=rmjX#?Ab@JmtN>M>FDax=*yhoOe`;_~v>fZ_k^@38Or z)o(v>b82%jqp?$;JJjuKDk+f3sF10Kg<pH-OYhl4<LHXJrmP>jpo9j05*k33hjFjo zD}7JWI<bnvOS<ubqn6}U6wJ89hG)6FbQC9jXm4}36F#&T@0<71IhrwoQ@C*}n7f2V zv@(ym%K*Pi_2^k7lzvE-truIBIvts!_bus0RG>j;h3zfe+0_7NFe`Corv#8hb_~@b 
zsyI_ve}LFPdlAKUG{aA{x;8Y+v&vpl{HwD(igrz6V}=Is(?Nzx6#4{x5AOUI<r&2x zj`bQr^3Q%Al-+h*|NqOx;~UBaMMK-l1z8^EwZGPP|LD1U)?2>Kv#oFt{GqsiyzkFT ze_P-u2e4X5m!WMy^)h|iX7FN=``3>bf;DQssjCM2H;?yCBDgixGw`9{20=3nL@IkA zXeLTTLeLCt1q{uC6lb-&<)rvqZbFZG6MEUFP2W5g<(iXNd2loUuPjwyCnkYz<v`Yi zi-02F4oT<G#3a)b#(Ys$nP*Q-oniTN0u5+2OQNC-w{A5P+3C(1zER9cj$Zb@t>z-< zO*`E;<{*o#%*E+<VeAA2B?;N%ku}Qg;`WU21*$n>bS}5plBmVLE#iI4B7Li+5x%vW zohbdMDLKc=Pk)GC_b2gn$Vq)r#g$sE#5qZ4LK+)5n=<Vrh`I>kA%Gx6vKbqhYWz@8 zv|8*8fivLBQcbMDO%nt_o~TZ2VoMmC*g#r#h%CCOc07-Mt~$LNn>dQf51?aW6HAk% z9lD_5!LoFz>#}#VJWV=x)n#p|u@Fb^5O?t<IixFtiljTq9qmr5eb_IiKjN-3#uuP% z*XkHams*o`*wH1tZ(ZV+aV8QhA=I)4WQ%w)gBV@nKr>YZ_NTOrAR!Pl%o(Sy7_8sd zDJy^$36$Xr1Co#lS*dL_qZkBoMqckBwF>S+zJgZD9JYN8Z-PE`u2XkmUi-uB6xwPy znhV&6T7}U#{M*=C0~{Jda;gOsB<jo@ws=w~g?GamTG8hQC7|GjUm`h{y1<ijhSk4u z9eun$LD;clKdI*kQzMJR!;9a)Tyrr0!{Uhe7x?G-%w-vah|AzV0iPGo)o$Q~bQNgQ zoPfNfm3z<Omsm=imf*<opH{1&Y2z1S?0z1jEh)i(Q-xVywHDElV8FElj8o<7$pmzw z5;2)p_nLk}4WU3KH*7BLv~~!c`FI)i#+tw**|1H+)OpaK=~0|DU3?}FV-_!4V1K7L zsF13GqL&#pH;AZ%?j6K*TBGR0HsWYef+@0it2Pq2yLpc)M0QM#_GL)hv-Km%&lO|$ zceZ$-M_DNfdOc^GFY<X9gA>xe!gLa+w_Sy_p%t&YsbD4QKA1oVoK_upq=1001Q>P0 zIv<Y%1&f?yx73p&LVv81Sa%VRcAzYp9o^#j%H!3UTeo>iVnK1oGaR(&u_-A%qEGz` zeXd&2B5>!R^+-XIbrA<Z5AdrN;!3qXc;Unfv^d4N!zdL=DL|jeh#PuL5%`a4#Eq54 z<~A;e>cs6#Z9qv;R1u!9t*h^f3%1GiQDA$!mQM=z%<lq*BYQyJwfRbwx$@b?ZP$rh zo4Wp|7i=Ry!h~txz!ngXA*5;``^BWMw?7P;BE>1V{e>zOct(0oE4qXoS+RNqV!nrz z{*tZ<8qiBA<Q0*S1Ogg}_-S=%Za@40F|(H}!_YeDC;g%jHYm{sg{%f{u3fTFd)+`I zl>P%ueQ~&0eyBt%^i`kSP@CIOF;w&V9G??HFr~7gd7s}MHhgcrc&L1!$=N2KDhR^x z?kRMVEH|JYcr9}WOm}r@GhDPiSSUz<Zc_`q5|+m+Dvmuh98ckZL(aW$IyWTFA8lcX zHnwm8i#Q}W8I`soj-CZ7(7NZ6`Y)5$@|Q`Dud^pLgtS_jCbH=H3eUi<$%8OL2rjB& zuKBetwf}1Toph@mX|DVp043EVhFO&*Fgy=ru6`G7_TuY~m#@fVTn@)PA#dzA?EXM) zYvp)it@!ASnbY#Y%M@%f|7?SrV^zQq&4+iU<iuDaw45xD;#_M$p9)bkzzwg(&83}z zq>6kYF^n#qMlvj}7zhsa*eajy0Kkl?^~lY@>Q<CH=1tKTy09yAp>|K1w$tsa&6iVK zt2|jAEw)zjAHNvAr{eQ5pYo#0jpw7@B}p{(N)l#d1irBa_znH}3iyo%{rQMxBz|M1 
z!qBWw4A}_bXlN;KsCc(}Jfp98&ONft)2(JAKe&j8D?S^~J3mfvpiV=4HnRT2xOW)L zEuM8azS~acYV{`sWdhrZP5Kk^1g8wgip>zBps55kE<9lq?kMZ-_**@t;U{HBUh_7O zT7K;1KX4HbjC)w|nHi(HFe8QvIj`fQX}C>v615J9?+HfhzSFs|z8JV%gC4d7kc%D9 zb?Zw5Dr?n<R0v@w(4NeS#R;@yak@eNb@3skVyk5{&V@V5I~MJ(s8e(-oH0}1`?g-j zO+&-aFK#_e1^?H34$a_3|NX)_iMy>Xed5H6b=?LT08-F`bFvg^$BJCnj;>H42SE)8 znQ2ZxAy=vcetZR3BhwS0rwTm?P(c7Dg{2gU2RKxX07U^lyW(sJlg1lKs%j&Mu^Iqq zWGRyBWU3lK$nQ<MFK9-Xw_4y=hnYYaZQ_2BKf29&mNpb-E^ZBF=_fK+0gS|yPml** zF{Fu|Be^_>g6*R4TmhAN!7r~=rKp~*%BUd=C>S{R0J;zWF*qCs2?3;wX1p1%;qG&D zXU^<7dv?#6GjoOmXYgk3%(=O9=jMbT{7`7{`K1VU?ZY~ko+9Zz&nlk8j<niDbH`3u zFodt~0Llj4*c^{;;M|1}2>$>iLf)5Ex?Rs#SNVP7nx}X$;p2wLCe}Vij!t7%X>z@2 zvba6|>aduXta3x91k>1<RLQjKdTSVn1jAiLy-UdV0cXLj0mWiFlpHDS*Y`btr}-9@ z8f!m#=Dp1^)A1LCl;8jJMfrMe%^}0%AN%w3w?6Xq+z<UR-$!1Q{X$jk-g_gyUwt|M zH$u_x!=@*=Gk^1+j6J{qZ|A=C1>i94KhkSw&~6f#f>|QKNl-*+6syV1&pCg~6g|+f z8=Xy&s?7;XF_4FAcEKiQp9UHuBj`*sqqwNpbt!XwYD$Ax4(=uKD1Orr=gS5X&FsM< z+CekM`bmQ;SBW{tNm_8+E#$3+d0xZ;zvbqxq7;lem*1tTIT`wY9;{`Z@-N2kxg=eu zHB7?!ukJwMiS<dYF>}Tlt}&x8xlg&qEN@ZEZU}4K9HnfuNF8Sr%7C5@SBZ8^71}Y{ zfXIvoMzF?Jq7GSTwrJggzptC}!Cpbs>%~+7iyCSb3%}zmKjtikmL$gz6|nSDP<RLv zR%$ou_NxhK1|WyaTVmrVPV2wR27+?n`<J%F#Y{DSU<WI_X2(3&=Q#l^^a7!|%r)M{ zHEv6*V_4%**BsZlkj|paI|QCjm~xG{XxGeQ35NvF=g<U}8TT57<zD@Y-Rr0Q_IBbo zeldjP4`Z2G<*gaQGEXr8=M1*C{@q;WUcI(E#<EW@NZ>BK7yvIVm`g5Eo6PWb*_9#x zJD0ZJ<yVOFzqf-`HzHT0U0X5vH34kp7l6v6We4<CujZ<wh>P2z=80m_Q{>Q_P3J}` zq<Vz;DTH}|DAyzjhOm%hsOyc1lDayBml*zHGX%g_)MvE;QSBQ*C076^8xVv+L#Yh8 zR97tm@fgKN)R{?qiuPjIrAFB^86d({0$DY-!6K~P4B*7-92VvQL4+ZY>V@4f=q(R+ zr8q1|mH9f51{a(81BG+c@Wkkc9y~TVd;@qxp1gMMxX0*7_F7Wu<G*#{cjxN6GTzE2 z{~_<fuYg#&|IK^=!>U7zKM{X=>Su2GlMk*5!Ht`(2d_K-^CRzmEPY(Q=J%esN&fi_ zi#BJUkYGd6DIbSu{hym3=frE_f01_@z%Bfwg?ID?DlnDSp_qhlxSgf+xR9ck6)*=E zYEd>J6m5efL`hkS48o!&h312#hP_zBbTKeQTM-yGB$U1+Ev6q7F7$HXo0qN1WLdrJ zpY@m5GA>7;sn|O*zAa`cvaAX@c{65>1C+WY^}Jlh=nSDcodJzjRzOB`Jy!7vdP{<3 zO<!GA>f#8->JlsMDXLOnv_ef|pc(MdwaOn@@K>CX9x7krI)mDWnnE(J^thC6F1@QB 
z!U?3ZH?Gb6m(LIu&9j*;)PO<PyOr+_{*v7F$=i=Fjo(?&5v8K>#4mxrg{q=d>pm5E z=+6#Ek~jaw{l_xKZ=ZkGCeMEHW9J>^D|(^Az12GR!8`A}|K7z~tQ+~1-!hT=W<NQJ zMZseVtA>6{kSoZJTTc+-QtH9Dn^prrlsMulQ0Z4qT(TuWuwYB66QH1*5VfL--S{cw zxDjNf1w`LMH%%|aYn`O<kUDXZP&%DrMi@=(6?aq8ifuux$RsH2qGjCt@7WSiufS?g z04oCeP%@>@U<S*r3}uxJ7)-svdR^?Lhoz6*d$I6SE_HeVf14ocn#4{Jkge5w1!wUP zIsk5c90(lG72$TQI7YC>1Sc`ZeJbjNN~xH6!dI9S!a%-0s2}O)q#E{7$V`>fH~eYC z%{H%lIC}7b-`I8P50<VyeBio|jBAhHb+p}=bw(qh{v-LX-1*+6rG;a0R|L#dpr`}! z*OBBIL$gn|4TXpAdFh7K#}Cdwd|=AdKY#Bh4u0-$e(8WOG3HL%nww20??3al|IlMM zFCSYt`CCWN-*NZwDp#cmI1(sVQRm2f1I|pZv`2bN&+~z0<2(xvPGt<r0PKBA9`Igh z8SWkQ9JLN!`=kz3_GD!qGrXrMwnDm=gi~<=I<2-~J6aNgHW#o0vJ5-g!MQkuVXVDq z%9JS`IlyQI$-y?)zbmb;po+Ru<8(mjsMWa=Zf%3o5&kSuI>MJi>BvzNZWD~=n*KuY znY#d_2FY{^Tc>(4+VG;qZmex+?$AA~1D)5E244$PQn*){9#riP<d5XXJ}$JReq<h> zaeU_zq>#+lL3gUbS`+*G`oy^tmn@64ZvX!0{|Wo2r1ZUe!BAD)Iv()k`jXTOwh)fS zLBl8~9Wr=v*$P(jPwh?&DgHR>wYk00FD_{hO^N!ykIDeOM9$xN`Fl85@WG${pRk7* zJ6!WYG=$1F5`FV^8>tf|6p+I^^2Ik6PI>%yAGz}ezxy+DZJ)W6ot!)IbhBmQx;rzL z=4bD;o_q9?H?tT|(IZ6F^cp#(BN~)LX);<do-U@eHs-)m8>oV`S;o^r#M8l9el|GC zghLGQ={RAKX4Iez1kyBsh(T5>D~A)x^l5b*!4~6T<4z^SVB3pe+nW$*hjFJGMX(JF zAlS}hJk~rP#XnHs5C*|X0bcyB3u7=@1zTQG`w)XuuYgryCD;-`3{E*$(y2bgVEkDU zgYl(^!GR#CaPdE~roRv|I5dqI?Dhb3lvSgk&6%i_YAWi(Aitg!_nbkDS*&!#dIk4F zjSg&9>v>3T25rXG2Fx})u?oZVh2=s3oY%^QdoRICOi$=-1!r|5pv?Ui=hv;ZKr*+V zpS@&r0eVxMzXfibW@*3NSD5Hn6sqC0m}wBaGwYEWnVF8#V#L*o(@ati<an6LmJ;fS zixqNO5~3wj;6zirG?I20Nqa)|;;$j6m{b>BJD^MhN<#H?!(ea^!;MR+;m?JbT0(H} z;1X4SV0&OJ^{=R%FqV#2bVXnfjHSc566@%MvEa`VW2vjhu?uNJPhG52Hx~VcFcvTN zT{Bq8A*`AKwOKJ%_ym)f&80Q=2=cXNum!jS0s>s)Qmau~!w5Zk=PQ9=J1INFOsK#Q zmeK?5#9E%wDTU<2B3a8GycFxWkcQ~|EthT<MY}LVm;<~R0clAdlC(D5sxi(+fj`>C zZggYYXaoK^*u@y_f*{4j7(YuuQ5|Zhq*(#dnp4R|;syZRL6$dwokli3;8Ys8*-;op z6htk!P1^^C*8&FSOu_1))}bq@?m{$i?h4>btH{M+b~(_l5TXc(Ou2#7!9cvPs9i9S z*ei<H33in_7)T5P(XlQV2>vWF5PT^Nq^`cROE(bxg)oo?B4uD4MwGGCMs!!uE`Gk+ zQR`0#MwftF4Wg-pF%%#~5VWE~A&i=zT9q9PAiU>^DPe)_wlH%SGyFR-5BWJFyZifp 
zaO=l!y=8vx?MpBZvu89BpO}azhqd1X=6mL*^KV|Nc^IjiXr|_4R+`1M!`BLPI|;#? z0O?1HySQKuq&+9O9gZA>i$-CvS9awFhuU4}+Cw=U=MKp6$OKIm3+lU7rj#_zfpGw> zWmIFpbf-4q-5m8qcyp&j6Zc!4z%3kvnCcMf@tUh=3QLVTw^rK;s2tFzz@Y}wbP%(v zA@uf9Yy`sbg;2P&E1u>aDT6N6k93OS2elAEz&>%31x0UtgT#=X7xz27MRHYORwt_h z$)`E-BIvOI?T)n9;s;{v4jGvkd=$wRCV{&SPad&Yx*`W&z5lhPoBrf8UpRQ?pFa12 zM{f4I<Imi%{F}+J!%=td<M&M-+cWqp7`c>h{B*LtefePWqxH5Y-}<9tGso^Ty5sS# zpv~Ukj~zL5&&iVyPais#H9q=-6NcdOP1$(VZmGLZZp;4q=v|>X&tK(#WbJKhnheie zdDCd}Cx-d!(nl1e$~s<K=yVQBe<W$3j71+&S7Ib~TTr%(1|&4-u~GG4t|!$9CQ}QX z0v!eDb~d3@oK}5sSCEgnu?^LD*^*l6s&>OMGdhktgXvcBiaV%g(1e(<nQZmOrbv57 z4_V06Y7C#MPSB7^YF`ib5n2?4aCeD2TRlMET15a}g-y``+{j)xmi_cPj2oc0`{dXA zqt$z#`SXqBKyBgq#eLm`ZC`os^>h0@;58TmYFn%MUw^hvB-EE(iq{+Nx)83d_cNhO z%)GS$sHI!<J2Oe9sC_CF*g*D}bHtk4P_jWsR}hbdPOCAj7l;usobAHe@Jz($Ja|%7 z7bmWbmCbD=NLAWhiVuqu!%(&65BX@w4(hIntP#=&q;?{t-MF3FrTpAUyIgB`cKwJt z!%uCeeOZ<e{{5CZn4+g!vIg&eht|MQwYmnH$;?OJcpb`8uWi@C5CaKJvXEbWeEVf6 z$*KN!%Vk)#1?Eq*g%9q09bzH)gbieUt`~^-Ev|<fsz95sM_C55?Rv=hk~C#D^5(WH zazSqL_Lhqx`qReA!o4zpnfYh+du1-Idq^j*3y`eNc@p7HcnF+j{WTKQhfx%E;3J+C zt`tZl#Er2~&*{eJq0~xEw-=w6+!o1vXh!J;gh22bn`;F`N8~hAycno;<D<eQj!mc_ zY%7t5kA<Q_Ru*hfEPQT6_b6}YX|Jta+E^mbzp>qtu49Zm|IMuy)gXn1{v%^fk92c6 zPO_cr+14(U`alvU7&{U=P!hFGhZy~EfULL#GbtoS2$)>480*;=24Wa@7%Tv29h9yD zsJu7nm-Pct4>%hZ;dRYnsye=Q((bl9yM?IvOPdHgmd2Q~Pj0X6SkS$E=bzgkZS~wm z5-Gj5v(0nMizVfoS2lg{wWxCL<J&zrx+`M88CQCE!U)#>0xotZ`<=7cw|?-lNV467 zhsxr}?VCM%dH6W{%yv(Y+A#T)4emk=>64x+VBA6o73hp~)8f~Z)?xAM0|@x0))-ye z5EndfhBl#=)dmQ_ip|r8*n!?F^_T_46tclwAk-1lsnJj<0O1Ka;@}iDOamMBihTzI zbV?AaB}F&87g57lxZIazfoqKup}(j%_Hai;eID*&)!S)I^vEF_+GO?CMT&C|ZNGlt zHbiax6#63$VBha5%#=ZBTMKeFyfv+PxP%q>fYu~~Sw^WjEfm=Sp8&XuS(z^X3LqUx zgN?2iJ@r0+ptX(5qeOL}L<QDNt%)MAVx8*AR*4dKhy%*N&ob-bggDV8^|EMIc-sY1 z2H2B&fMqI>fO{ZcZ;;O&IoYy!|NJcj4WG>a@u5dfP9K~b|Iqj2-w93b3Eh6n8}U~o za{n)+mvi6m&;9*pe))#ocig?}xgSB9<SRF(n?H5x@m;?in(pf!zWTQ(AB^N*%YWc! z`SY@ycV7dwD^0^@*2IoT|F@*=VIB)bZqBN9q$bf7)@9bNPUnVn^};JtC{g6D5^;2! 
zdf2sKc+u-&?@DR=$8md}*uNLIXVk%<+@nTdaFKp~8ez}ApG34`5NO31>Bo(^P!78e zKONtXQ0(^1Ud^@bcfE+BRZD$hnvHzbOjt5|$i&pH1lo=WaO@@2t{BXa*nkV|LQUQh zeX{kie;O+OvwFi{Y+1g_|M=2W6E`bi<!5)Grf8IemZG?mH-T*#lM&P+C#GI=)6$!H z3-1)kW)Ln9Di&5M>tW<wG<FWaRl1GhzS#`V2cA-7g=JkD8GOj727+RR31FGNqFj3R z<$q^GwZ+n-@7(0(%8T#BvzXoBLWIebK1<o1d5dQ$yDjJca+7CSJI&4yZT~Dh$G9)R z@?OfmP(p@xVNBm!?QR3Sq5P4`4y&ymDN=S&<rg-7sPc2_+`;Xii*cnJwUCDk5tj31 zt+cl|4yvt-=}<<)C<`yRxYP39%6!m>3P>?7uvwoVY{czBd^Ngq74@xg4na$59Gymi ztAye@R>+gRm=g9@>vQ|oF5a}^dTr*(=N+3aoJsf7!nxjxjOl(!YoVuhV&xjRazKr& zKg==r<2kKHP$HBfo<b@B)>q=7*@drZikIgBOsdTofh*<#T0>Zj!o@*8Bx<<fPJHQU zqLXiU;oscVe}2JL->~u8tSiMPH(eB*Q%328^gG74jK7T8MK@}$S*Vy}G1z?Z;&KB^ z!U!T?fnm`ncz}gyjP*2Sidcp*rG!+>+BA>2kg!(JBS%UYd*#0x*xTQ~cYuGkH+1^s z|FKHdYxo!Oaeg4eZyUP)rNGi_loo&SDmcbJhYU9iYX|2{C3lRhST%SdSna7rFLZ## zRRStXkSF!y|1U@caohmJ`r<a>2e7hG8wDZlN<)2+0DTOGKzvG%zP>1VH=OROfS@@6 z=S&0k2vSiMm>d^{H%Ke#Y92h+%n-S)_sTs+fmOwYquscabX;{phScp-x8LMzG2MC` z8}!)m{EyyryX6PmUCF_}&VOj%gNGk!Fn%s^^GE(0#c!|q)qwYb5C0){XvUC#GI?uC z{&qveXMZ-|_G_PbZuwFD9svzRalXt-&w(o%56-sQ1d!BYZ4R8~!{WxEO>jkM$Vwuu z*1?v$QRM_=)rEmz+y`7@8ra3@Lcn@4>4_GLI*2bi26;Q^z=Xa4r-p!zT(MO)HgS16 zT!0cd0>-w)4BrmAwPboi#P_x_xIbp_hh5sp^b9`2biU>q!pyHIeg<@XxUy4>l*Yv= zC8E#jggz-=ZLf^@<&y$wAQHT)i&RN0J@wA~#$`$-ihJDJ`lFukzhnK?M#Kj!{tE_2 zplWsPb0#xZUxeX{Tm7<g*Lc~hZ#cTfWbCcH=+$34r_X-m@>ZXBXs$ia@L5Uy-?R2B zR$I~bCn}-%=a2%cJFsH7v<)b`CG21W{AHJj$&1~B@>|83&s^Rv@Qe`qP&hmH5%)&a zC23t5g{C`kFp?`2*{}e_X_OcvJ0HsiI_9{%Q@coxU}|$~UZKV9h2$~%WozwO=yHQn z2mB%G7=F~l=H5h4A9ClKv7Ykt7F;`O6uIDv!3usVMsydNsU8u+!Oz@DSm8>wt6wM> z)Cfe}m`lrK8M43(i>5u{%e!3K<UAm705ZEZ?*u?`6su9&KuWstF@5=)d2a0#2QW7K zsf*mqb;<bIdoF7`4Z1F+I8T;PA=RAVnIuIUl_{g#x2d#9UV0|xk)y6DrnYg;E4yDL zC*^?!M}2`DlCk{Kbt(-HvPfPbn9{v-JJGB(U$+)?&MfX&zY;US(4l_`wG21n4|M1S zPY1}0wOu6VoQ8D|rL-p0HAF2V=d7R-J%YC=h&G5emN<fEPpcU;5TzC*{TX6jCNi95 z$T(Fy%ooiWaq?znvBFKHF}8!o*coU~jEZ1iiFS;RkafDWa2rOcvzQ=;;nwqhBEa+% zUn|r+EJn_x>a|gEz%q42EmhoLdJL@&xm1;2$JNU7R(p+qBwTmuo?DVLlb`;qCGN|3 
zt6sRtZ?$`Gz3;KtPTl>*rCp|K$Tb^uh4AdNKz|=i8y52?S|V<{*Jpm?sqeopdAI!9 z<gTE><ZX9!)*W~-e^1EV@cwh3f9urUa-+dwvN|2#D3_;4Y!N!Ch1jypK1Mw)5`98i zWSkbcM+jMKeKSkM(yzmPTwPjkvtdC|T2l}4?#H;0Ol5?QGX$F3uEN^ZxV7ooJ-YX! z<)@D0LoDM4RW##H4O<SD7uvqF6Wd-F0M6{*seu<c>=rSLRtGdaY8L`+F_0El?B?_+ zt5k7->T`SSZE^4gLfwXBcAyXe#G-@rjr$Tr&?7OlV=WzBVl6|!%@-t4wqmGDWj5ob zGiv#Mzue@KzX9Qt{M9=q{`-fBrzh;`J$LO+IsbIr(j#KMW~G0VG(XNkWhPx19HU;I zo3k>Kdz^JQ7_S&MZD%b`YVU>5mn<Fb=HSVqs3oXC;spQ_X|n<XfJCWFaKo$iHo#DD zQD=-O$D7!JH*vh8wz~cuAU4I}gcOV&coV?bLM^Spo8Zq9ya~P(coTbds8z?C=r07` zq=r;mMv(mi^@3uwtKQ$<FK+xUr4r|4JF+LBSpozU!&yqwI^zz|r-oDc0e`a-yWFzx z63`^3NW>EldlKhgx>OK}$ai)A*V}QYB>(I3hWxKa#B@@kM-hT-Mh0wKH8#((5i{(@ zoUlTSD+)bcf-^4D3w~;S`h9nP+M2cBdvNIiTmIcU^#jZ8H~reZ@5@e44W0e=&OBia zLm!9!SG_bRw@O+QeF0-m5HBK9YZOX9O|<)343udSM{~?+ZGb)@XstQkCq*&S1pGl# zLU4iXOqd}Bvr*~OY7hJZa%w$J#Yu3EsGTNM(2uOxd(yaXN;({?M-U9C)erztvqXMo z>+vN<vxv|F%Y3=VRKsIxUQxsFn9wU2a$9lKRKsIJ&Xv|s7#@Q^OFRZ&3XiFFw1#z$ z(O(FUk>E{A67?@2NnSJ<E8X6vC^Y8bETC$1uXr1RJa$xx<1H3l)Hgr^Td!^zJ3(!K z-Ff2admS0~=Z@ZXs!o3O5?sd|KXCkpW#4^|KlZ@6uV1qFfPXW8-255nQ|<wOz&(=I zgB@2ZCcwCv#HGj?73W2UJ8ej)yU}NFn(xVAXK|S0*pQWj;L<srn-VA;ZOTNynqr_d zP4T4(lUeFjh<H;bENuf;vS&AnHj;^_wVIj7LiXd1j`(AN5;D|!J0Xl*fd*G<%+ek; zhCm3s5kuzcF+-S@fJznJ(fpa^-^|^5jlHJMb{7=U?a{iBv%b<|Fx{PBiukQ}7_32e zs>xU6#9TZak-rt6mOo!0Kt2BP8)KQ6|J);MewlykLHWt)CpsV_YBNHs#cFo}#`Ufm zPyRceV4dA&SYyHLJ7k)VoxQcleEO-yqmh_w+_mbLqK7mO&6VeXdD|uJm21EW&#VjF z=5Q~~g4fz0|KP1%<ph2MlPRfzsy0~vQ8peHNadgg;Y-jF4my?H^d&eEM5%?qE7}o+ zx7DK`rFP<1LI7Y`hf(d*y{9Jm<-d{Yx@;z865ccYit6$FyW)ODaf4UPO#tk)b7g9J z65fNiCEkP2g!j~XCZ}}osrB%M@Sfdhbzm4*HKG5Jt8R$S&hfO_>r$<i*y~QBAEDKv zR*%A!B-1X^->MBD8p~BWtudCOi&l_nkA{JQ(^HUf$bvdc;dH!*wD@FwBB=<fBW6yw zgVnRk_}ua%Ic3>zicK%?zU$GKpH%FhpFVD~Bx7^(x9^qvEzyXt&0wi(9trK+eVn_V zf1%)ZV#`195zlok{h?d#JpP$zm&pT-p|0jCyU!b+J9skx`TRGB$NF0uZ043|<>Y&Y zxY35cD|(;UXYi_+-V}4vonX_^Y}{uZI<L9mP9l*eu~wJpBZ(;^otDF^2RcmYtG*U2 zkHO}y5V<dC7d_~XN~#|#o%1*vqNMA<s*y}lr`&37Gi=3BYA}Y2Oe8P7p8w_}n-h}0 
zZYd5kpQ-svIr{7-9HjZTiqC3Hi2So#datCp_^gn?CW6gNOXl!qm~-7y7-a!b)kOtf z;JBSa$)^$K>(B|(f=GjM2A8*r4+lm%K(}xO#Z!lYYttMazQHq><*QI<64tUYw9f#S zEy{Lz<*&^B?u}pi&PI5?(p2-@(%o18%WLake5f7j`+ASSQXDPRuK;SK|JHL~$peRc zhQ#u~&A3;E9Dq&F`Sq>s#n!T!mNPi&Z5@0}kX<z3!6Y>ZU7K8kyQ!<7ZeCVys@-;b z4Q2DxrtS9lo-#)He?PMIPK#&9KC^b$EdiysqkV1%mR_~mkiG?7&*S9vTJ#MkHf^H~ zFI$PNH(dUwg-@(b&dj#S@7a9I%j;=pZ`pX)S+m#oot$eO>{?i9;{Zr;-<{ld^emJL zAyV$)xd_T6yUv&krIiAiS~IRtKqF&@ut1lqw==+kvSM}mg!oMzrj=DUcyA@u*9uc` z+v_9bTwCkKY-<FD+sBtDCYIydY`*!s{MGoGrL8x;#J>~Bkvb(Xjj&Ne9YOlH5zn5h ztdQ`~g8EVgss_>{^!1RRzK-~|<^&QR-C~o1SO|+gt^uSzq?5@>qFR{S*VKgs$6N!X zfxtB@sUh@mG`pK661Y8re*yR2zKspszB>{f-W7|C%CBs?{ibh3MyI0TiOF;In{0e> z-|xfxLKowB6`wt#?|Uux9TY2=N)QBz0KJf>6}=dUq!nmcMO(2ED4(3Qs@5PHR1iU` z1G6fqa?w=G)z$bqh=Rex2_y@yleqM3=jn!L<{u25xbCjw!Q(eZx7+DcM{bPu&mHKB zE#7tJnyt6H=;JXwZwy+<dT#_0(#5@2DhhC)7VL8i5l|A@Y6TtwZ7Fh@f>Egn+sXMH zP~8v`5hgu{(0OWtuiG3Rv!IXh0-bFo$1DhE5%sky7vHExMg|E(l4VP>&31ow@&4Fj zbBiaV(17pSe8WxmJ>tJ*ap@-ir#^7*g-!O{NM20yw1s=#i0&y6bM4r5L8sM|(gNId z+|9rtFE$>%_oPm5k`$aK$v7@+11Zv))WQsO-W@S5UzNl|8vHb6A*C%@3zTUMgeHV8 zcwP<)K!gmXzi?|E6?t1WU<&MOkK(pmZ#cMS4*Bnvumx_&6DTMG%_UZG)|*Kt-v6y9 z?)~A{KX~l(>zM*q$LCSo!_0>1e$-E$QV25}|ESlO$dkdo!At-+ILf}UZ!m{N(<DN~ zhSNEF1(gX`GLjuX0P6si$+~zq0N*GoPhMt=x|Fh*;{aT@(D5a@J-L9ptUlAFq8eD3 z0@TxZm=kgg4sy5vI?c6p4Q3a4q)xch;W;d{&5eI|behdhNe#?kp?7mUex-GKLHrIz z=#O&5PwAopq-rM!0#rvCm(rhAvZ#QK2}=}Vry{B=wIWr~*%>K4CPaBN14WfkL@1UB zFj~J>w5=%Mr8w=;o}nC;$A{RZ3cO!IY#{6ALLkKcG=dibHkOLv64Pu%jkmmj@f z3(of&+;T;{VGw28oBdXK@H(6A*#)oDE1RuW^AqnHo_y36wjYXhE&998-XuSmi2CN` z;|KpV_ZK%bA3O2CzWUuqPq<yRvZb_qd+zAcsMQv!^o&|8bG>FW#U+6|Dez?0N(0iz zputINH^RcBDa}i3Cx=8TmFtcNy*f{3H$HHqlaNU$ou@%Gs!DKn@MKDz2nZ7r1je*Z zL4(i<4#y&8W~0tIHHfW}1Y)8Ui*9wPULd9+=0s!~2s7g@AOzPDW`^oQo04Vrr;9D< zGNMd*DF2rmn*R$gtmnw|AJ~{7^ML$V2|p%n-vrz5lXgr06;*AX1(UkvS5Ar|g>Lyg zB(wyZ2+cHca8BjgBg0T0Qe>fTMW97IAzBJug_gohLh%9qhh@|Of8YWj&x~qB2A580 zMgjdZ$}B;O;#EJWDAZXjWj_W}QQT|p7ULd9a5{tBSyCrK*vjRE*mp&Z%R`SCc57xI 
zdJ$RZ#v<CNam>yDdx|LQ4X?A$3$3|7)E`2uDRv<$>WpI1OYGnHrs4iF6E&3?a{epw zs3GiJEEx0em72Wi)c$jSd4XXXS1tHk`)~797W)t0bMGs+blR`3d2j^xlLd0GchP=% z4`Fjh=@7K8+As}y4j&5f18|>+(X!k^AdV?A0pScR(0Bt>Pb?|vf@}t*E7q#^rnEF^ zKqk_VnkDKFB#JQlq9Syq8=Z`Lp(K)<)GJw&P8DF`T)TelCoN5oGXh3&!By)?ss^~# z<TU;s6JYrc#wc7CwEBpcIa_e?2Xj0=C#P^U4rlOF5sr!ldCUO9F)l(Dq=?g4FMeQ@ z3Z?n~t2y}QW%F$ROt0Xc`{&D|*e=WfnOKLf2cO-q>UDUE8}jBMU~u5<%-N&<R&19Y z@uy^uz;G8t^BH?&Pbp|L*I=copXW=dWVj<DQ@?)HxHNw&vkRpS7)`4euTb2BA825h z=EruI(ngOcE-z?$x1HB8l(ja7ww=}})VlM{TMTIwZTdFkT8q;8P4omdsA)q}5Ygq5 zlm=Mseg`!MR?HO^5sS#+2ppEu;8v=$%8W~uCA^z<3!MSiCFl&472bDj?FUOs@KtYJ zuEASwD^}jmeekkO9(O6^l$d)<px*AqWX2!r^>(V;(B_Nf^)QbR@bTuVJDS3<eH_Lp zuh*8A*HLI_Ijwe}M=Dh#2z5Afc0B-rda}c*I^hF=J1UK-6?4S7R!e6c!ij<dq^T;K z!reF{)QANEhIQJGZa76_3+7oV6re;lFMHQ+H_!C5(%SaKcJ5(Lu%6v;-xH60^nBMA z9<mnqnUl{+|6S6ejQap*Ym^QmNgl2#8f(|=>uUpik;aQ^ENE`7F;<Pm<blvQ$DrQm zf#;MY@Mu{i^h@Y0T10{^FVTgc0EAec&ZD7+JSXSYSKRGX!3mxl)8B#qtp)E|^#cIC z3)PQ-GDzI0V-xrr{7o$kIFj0kbXRkSa3jhpb1q10tJ7#ysAD*;)eO!I+(k=D06Pr$ z@|TeKk1r=L*YpdegY&aHv-@}<)<0Mt>q{#N8^nFVXId8R%ZmyZ#C}m!a3ko`y4yH^ zBKkQSg#41y!uCUtcND7&cW!6Trq?2))9vAj>GPl6!kT5t4jrr*bkMGl-Y#HbCt#br zNumMA+*9+>fT@n!z!!jn3fWlp1@<8%bdFZ_!k`I#9u>Fn(G1h_0|G5dDnfLmVU8x4 z<6hAWQ1&|2!77Yrr*i$ewSqyT6<-&WxLb57r`@{kLeWkg!#xDOW3_)T{;m)_Wcy&j zVzdc_W3x*z-rXcFl+>zyuwLm3SnntdUF)ZEjF@stXVvyzeAfbkaMcI81ndXSTeN?o zX#Ye*AkOK7;P2+Q6;bBjL<CLr3B(VK=TQBZZ#{Hneoj{}n*f2FTxqaT_AC$zN0)E; zTV3|dnzHHSgw6v8U*02ozq*G~PliG#sL=h1$_4}aKc<m26^*TR&ZzdJ_kS3tpmAoL zG~S2GVOsi6h1wyoM2zwjZX(?Uw-Ey-xtPaF4W}48J$<5v=+XDiqwgI;NeilHt%Al^ zu#z>W#DGw%3F~atm0h}BE0%^9(KKMQ#UErz4ZM7)Icrh}=iw>86~v$wN^AaRzK^*C zm9M@L8MG6k0Dwtg?SlXsm<+h6tiY}{T2`aJK`iwj)i=EM{-d(#artoj`S0&K{ITU* zN5a;b!#5uk9ksk;%}>Od-hRK~f!kksL$>BW6E;12+o|NJA@HvFq4$l8p4oW8RaWoz zti|k={@-^(&y2-7^w5CQc&GHt4CLl)07wx8*i5wQgm+B2n1AiMbXl{KGT;say&<wz z7FxXQXkSt&I=^&zA%FF*6U9E(p?h$*KD$G_DZL;>FL1&9tJ)h$O(|3dFh+=>7IulM zQcAm1bsz+>zK9Xa9=vSlIEyGMI#^URH-^M8VVFfl3|zt9+filE8UlQyRJlOLTwYLI 
zwO};RHZpkm$+eo|FQhg5HB^PI5C6>t)_$AGuvS{kPiztDVbhb~J-Nw3e&12)X-UiQ z-a?JHqS|3D(vfjs86t=aiB*E%F+uRlRvA>DLRbN?aB68W1CR#PUkjyqj|qWPa-}vM z!ZpvMhjom36qFFh@gIdiIP|>1;k5}WL5^AiP+LnPZ%K_HQDW7r!^Nu!Dq}8BBNh+K zR@Qx~O_|c*`0hc|4JAne6;bTc3z2v!SY`o8^jbowSoXQwj81f=>L0ucLdfQWC)$FK z%I3J+Ka=?SYp>`3asHt<9{BQ2b4RV#FOLqNu-TIF(D(B%fh6|szm~@=<qW_17q0#C zuh>8Pa$d>*?55K%-T59Fg6mdWe-#4wfX&pE{vW}{FUh|m*SsMweo+GV&}80ce8W70 zb@svLj>)cqo^E5BHOdg;=tN3e!s=^#iT>mkTyMnXFj*l@!77v2V?ia>bG5^~GlJ$C zWy@T)q2qc~EA=9#th_g&Tzy)tMV)4m16NH#ksTy6$|&C+(Bt)iL~c}n1@Q$MGI@(Y zN<>EBS@DH>9G{MovBb6n{30vFnza#VWe|3ddCYDJW+$fh9%g_UaOF&fWZx0K4!cpc zRAyaj{Ce7a2expfDLl*Crkb6Ci;Nx+(Fu}m3cl(axJt~0CSA&r^&3lSPmq0LM`gXN zDd;|<?G>$-aJ!5%0ViyQKE)puKOhoS@Z6Y012#j{u=oQXkD&_6-^xEP!@6t7k3MaW zER8;!zkmGM`J=ge<HO^BY`ZZMHyn<-oj!M;{m^SC@7NPr>>oWa_{M?5&*jfP`tYBA z;ksn(mTx_J-*V(cJl3Bu&y+LV{qudl@psvShZ6ZG7VbLozSJ!<(f)8)w#ddzn=j-% zeB$IYUFqrJA(L%t{>Uv~xZ?dMPQ5pAd*-gz<G=RsbJ=93D-`Tm^~*bjU%pi0mz(F= z8+)v*yo0_fIlPuj^3@B{<xBEalhgY2Eq8vpa!3cLy?zP4dS@{o&VTEY+;r7XqsW6B z&}Dm%4Ad%fImMS|d6j+&@uB_n(|P(SL_D<X=%>B9pY~oyKkY@SS-XaQ3eYzFbkG~U z4t`pYjfM0?5sh{%p)8+P-SEJDbWwEWv-m&ZcPb-%3z=Wuu&zktM)X(sLgE*zIdu`J zhFp(A4xp!7G51%m=GCazf~!tkrDt^4elLC{krLxzRvB?A`?HE05;@>eM`}G$38YzI zaC~z98oI0nEL6(7gl-hLjG(F6BT6UT*NI9MNjChgGyC|<L11yz5nQoBh|h5Cbg!iQ z4=yt%^t$j*V9KEko%SE%+dp)E`1kLvDdN_?|BkqJbQREcKK>54^{QX9PLPpaE-1!u z#?Z<{ir0={YtoAIwxHoxQUw?kl)$`#<NjTq_k&DdjpGE4i`qcE1J>skq7eQNPT=Y| zWltRX3G9IEiJnGf@gJE*!{6l-dh31{FjI)|r~<IgQiTv6WL+3B9W>8SB*uYI<-pJ` z_OnRa?!C10*YKnc2)KA(E+qJeF3*pd*L#5!xIykJAaNVgoF$?kgN1pKe!Pc%%u-Eh zwtp>sI-&b?f)Zd7D=yk0`gD!%)0z7AYvI$GLf7Oy3FWHOstcaHKo4!yJ-N}T4D;=h z?!QZkoDP3eTu!yTSio70KK$2x;zE@}t_#;!kKxl{Y8ozH3GZ?Nn}g*S@Prp@>YFoT z<9iO#%TO!cl~wxOS|W9r$8^XAW(6d`YS*>m_KA|tN@Hj@$CT8}F8p)DOpq~zia)^t zlIpwiD*iJGmvWdVaY_9_{txk0d9`REH@y>%x>niv<U8S}^wWIOd=s#!E}$d7D`~^9 z6oDOE1<Y9z6xN}VP;~;5@0=&e#kz;xdd7^vD=U);Wo#9AQ-$l5P8q9Jv27+*0g$Q~ zooWPI45D={BX~42p#Rn)WCEXvz~WZCnFyS>L3iGUM6LkIo5Fuog%JF_#R{1}0bq=< 
zYY9#lq|*r0Ubj*Q2d=i_Lt|Erh4D8wyW7R~4#v>x8o*SCy*BI3<aWWP&O{pm)|Yfl z8>Q^bJx85$SBx)blNayT2ahf--{^3LoMXX+#W5L4o1eTi7hgTAAAyI5(Jg+ze6_B# zgK}$bpYG$32tQ08zluJNLWYnHBt{7Prrk^*$9OaE*TXk+ONLRW(vH!`!3WB7d#W{c zGkhF3biQ*-{tCQ<j1M+jR&Gxy?}nH)eD)As9IZ8;-R(|gmT!;f9)BcJxKu09xCQ}D z;po=vqIGJogAIToQrGq=;tJyUnSS)EAHY7(0+6B}#+4e)Qnd}$E7gvzR_8fy)Zlut z!rvO3+Pm)<9eBd64UZjX1VHZ^e_r8MGdD8=q<~mlX^&sQXrWHAA~t&j-y^OAg5Y<r zj7?nsZvF$qF6DSubp!rH;6bGeY?O!xf(aZ|mpay5c#`Y4>DvF8*zkIGCNX@wZE2su zX7$@MiOj?ANQm$r9$pxpci4i?rI_7rHl#z%1NpXhBwQd~5V!E0<xWiUUB&JRmHnWw zEadKu)%Bco>ck08`4EmLB!zrPuHW4ZqPSeYKKBIah6O-1mlH}DNgU4kzLXZOWh4kg zcSwQyEfO{~P!wGZRFi~Is$|A@lPOGE@|RA~k%Glqy&85?3!hj0-PmTQ>qT3Gucvo( z{s1>w0@@MPd}zr=)^X>A*nVG6no}zO>k3_%V%L=?<)*g<s$qEY>rlm*I6t+yj;i6$ zSPcEf$mG{-wmZl3k;5NCc7E{ahnby6&Xfyh$e+yLT>i-_7N{U2oPSB4D?lU)pBSdg zV5QinCh#U@fSd4wH|a}-iM08`MB3~X+@}G8CB_*9t`Q?TYTtl>N2Ce|J=l@DK^F?Q z$9QJ9kE>3JD3XUB#HvqYq<H!uKb_u#XmW6xFApvOB0)kVA9`78EA(=dgSc)>RkgvE z%;>m4|4gH1YT}fzq+Uay;m9c3>{&GhP{s46Dpzf<m3!Cc(xf`}hvAk1gV$j|?IY?i zRkTM3P^&J?YswYKjz#sFiabT^58y$@`a+gm09h1&#v(T}>I#eDMc>?1#9ZU9>_5^` z8)|Qh)it?(bC=Vf|Ic;Z*-F{=Y+&#YD=X9S_R7K-=gofEYB0E;c=fLSOyaQBs>CAC zTCE4W8X|L#oCuqjdhfV4VXd(nqRYwpw)U15M@8WI51Hh1|C(2>-cxU`gaWjQf7qha zC`^|kg!1Y7`p?hI{nz|A@5xrO=FPze>(D)N=C<VgJ?*P_q3578vmY}fU*FL067;(X zy#qHeI*TpN9dI`T0h2o*{0?{+zEc;MFlmCGhMEP$OH7mKZm4Eq7DYz{)TQ8#@uEAb zLl_(tsG|j-j&z?~<j;g>f>FA}GsjY51V#7Er0~q=;hE`v!QM;11pMn_{W9F2?lB6? 
z9ZFqrSrGPy$m6T~Dr;ZHigaC<9CC%W^vDN4YIH%I)@BnrHge@n(`LiV=Bb;Dj+q(g zbs6va;B)7v41c!0PX>1<`kSqW)m}!=HP`f*ith~8=KjR){9j?~z~5W+#eLF$ku*Ca z=5QL!KmVke2L*#!n4Z{6Pn;K?nDf@pqwP%fD@ucRq|{JYxMr~6nwb3eoMus0#z13~ zl0mW{#*@iY#<_`=2@k}=@h5<KitJQ(!zGb3YyG?VUH=e%wp+lPc0-v(-H&f&B0z0n zY}IZ^{#x65dP$G2^s|;Osq^UKY4ILv*}xz5Vd*mGE9_BNOiWi;nq{KLwsk^z`9fd2 z?>^bu(h{{-2+Wdfm5=2=v$@k5<~Q=KsippGv&A46#;2Q?@=tB=UvBUfE?AC&4{DtH zZyme`cVY2E+e$F^Gowx6rz7wtNR;)6tFYE1csqNrBP>n;k<Lp|fii9v7CeB7w-vv= zox`ADhk~>z*52%XJ``@t4v*kNP<0R&=@I$MX6goBUEN^YvrNwEI>`^){XuuwI^Mgc zlkAF_(zeZyF7r<}Q9?dz+4_{C?HeRn@L(FCXfa3p9`zyBek>OBz$akwXvz!AhNv4^ zAiarPONA7{D{~4cXRyN(#R~ljG0zw{%~CH$63Hh{Ok)R3x<I_cit%6vt$Pp*7ldFy zG%HuBmjH+Y7(|^n=WR)i3hMU`SFYCH-Yde98EsRBBXtBC0+e|g6Su_)A%KmHvzb|j z+JrnpMRlF+Soi>N!kGa`zOnMx1B5AQ5`i?ujZw%=th;ZpSiedgDc|F)uXEdMwKZnf z6=M}nhqKyccUiKX?naNJqSe`TlQR%>wHXs#n~_RBVfA%{WTVA&{jEkzrPXe$Hbtu% zTm2QTuUFJo1&r<Ij%~*>DeiKHNeeN2JXZv^Qp{8URT`#I18>gZ(Q?#Rlc1T@nuK6y z6Oel<#5Z-&mz_)i9!^qVflA<9R3O{>s%5Yd4=0A0<>2F67-IqQcGh$;w&$SjnI4g9 zq+aO*1?D;wK$O~))<N5A?+~_!5i!~x9uo<Z7{^ZvaE}w#<!qrzIg!3-E#k;)X~Gky zU{B5#JZys)Drn>21aeUjf1?`F!R%BlKo6+TQIQXc)U!l}t)$()W)fe?zsFi-s;afg zziW3i9o=l>Bh^l;Jy`FxdlKiqz1<eW|NK~kIm;cWuO}qTPL=Y|Zct(`jAuu<`>j|h zl3#(l4Lv~zGWURkixI%=ffCk<&jkgs%~&s*-FW8mI{5b6K3571vpux=##`<vmb%Y8 zyv^1dB&&!yX|eAwmhXEJ;kADMb9Q3~0bYnVVtyJnP;<Iic9sOFNZr&TAUr1GO<`i1 z_Q6n*Gc-0w#N34wt!jIN7(6qRPe)QWFq*=cGbEVT3dwBpMCGl+$xzMvKXJPwWob)} z><Yd6!CLu!n^=!IEr4F>!;=Z`sgIof{<iibOU?P?hDqEzGn~K@cNg4BaCae4U7%wi zeGU-;VkrRhk`+{M_1UA9WE*7#OnQ01l*lE;E6tRo5t(=~@nYB<E7gV>jTM`}f<)hv zYA0M*GC|#ePE?#xMwBf_sni$mf`S(n(WMg-5UWrLLl>|pw_amuvg#REAy4}n@F>(W z#9tL|XT|Z8I#8ly+_)5OQpl~&y`~$SEEmxwJlFcME^e$#AGAxqs?R79_6*RF5=m); zaG>0nKBDDAsNwST;>zvyiHHMXSwtLJ*(H!dt%x(N6xo<)Wh1$#)m64`J%x4a5$i?| zg}7ayb=ZM~+9m}Mx%yFU6|dAuEN3e&7CE1_cTgO3D0KL-s#O30VO4p7Q|mbT0wfB0 z7=f(fbK`(3uJo9kDPQQ-_*6)wI<XO}1K=g)Q-d1|-g9I)nEfLtmZlQRcI%f8`wa%8 zOj9&fn~tA69vh4`=Cv7pMr=$#Qo7!Kq){Y0gD2#Z%W-Si>uYtGYqR|`&H0;j>(m$B 
z{Ivq4i))V>1kNq?7Gly3lIG@}f>eD~P>*Ef;(Qo{GfR33V2*7#jnPK{xvX5q1|koe zQ%Zw#rMW@IN&-fLQ9&^|WrgVZ^F~XB!5N}K_O42wlTCZ;W9(H?^}m1fFjlDYpS^CI z7$)ny>Bfhp7U?!gOsR|Mkm{7+Ce)cW>l=hIL2OYNQz)T0PpcA)X{}&Vl9RDHrb<p6 zT2|E7CDp^)ngf>_mdA0$0c5xdcCfxuQhsx}bCb|=X4_bTeak;(dI>%g1Fq<|C9RM9 zhN%E0p3*gzp<_*r<>qKXV>t&uVbQXVdnVN(WC$1$<f%HblS2Y>H#C5q98W3G^u%m{ zD&q@T)GB1uD%5@Ws1xGKY6>+5#pim_S`}<b^zNSJrCZj8tj7rOm+1mv8NeM*AA-j{ z&Oi)S0T%~`Cq^6rv|u#e+73+aw)y_}@!yEwHre0}IoxJ{cWh>S_x;pR-5*D_=!TK= zcWrJZro_3YZ+hy^Z1m7*GMIHb!lp?7YtI;Q@U148rqOc;wy_}-VpPWXWvq7<&i-NP zi0mjRt?t2^qr01iuRsx$s0xgdqCBe|WT>i5E6oWaqThw3?-u%<-4v=maFFIkwXh}q z!c>FH0g6Nmr_~V{-G~_F7@^YV!U%nA0n$>+2tw7!0357W4^_R1+<^XS#3=>^APJ-; zsPQf(!0Jwi3^Z8qdw`Z0+8{oK;Z?q>ONqkL$N~<FY<l`2cW}U^EVIm)t8u!<DFClF z0PCw&)&c|1IUFQUlDf19A9))@zvzHVxdtUEw{nR4T@t@g;??ixC>}wrxx@-gv;ud3 z*^jqAjtxHrQgvh!bFfW}gI`@n;b>vs!QYDPO}K4!j#{hJ;Em06od04m6#mI3KDE~E z`C&0OK5%1xDgRdfe~!;2V*a$n;;3*NymHMsQ=$+b&n<7~h=yO>m4C``x4=SWByb1x zFlI8>#dKwTA{W-3E?mz|3FEq3oUO;vju}DFf@w8^Z?Fc5mKGIi=TW3z;w-O0MQ;Vb z87c)yY8SgPHC*a)Fw+)}R#MfG<x>L1RH&1--5AlG>egL8Z;PYW<d4U8Pv~0E@vqx# zH;i0>T{{1J7vV-YPk!;)e8$j^9#~jPO4mqQ1--x}GEurfi_pJnA>}JI5DU>ZC92nU zSZ|n_kQBy=qNWjU?0DvtindOD1&quRs&wL-?M$q;Y?1z9E8Drf4BlgScw6)MVi|Ua zXDIr94#WnRbPT%dI_uCiX}06lx@8oUVhj%BqEM*yj?AE%sBF*)6i}>I06S2{$H-i( z5KnFmV5y=hP@AY5HUkCh%8gNbwGnecfC;{6v3eWgi3MUT6eB9a#NfkxicP&B@=E1; z=2u$Z(PQFbSZwEs;ZV%TVK{W#sO$oT)R>IQ;aV`qaA<`L;a4v>)PgdiJ8Q$hTtzV$ zZupjH_{nnz#+NbJ_~z#E`uZvPK+qWuwIRNLx!v$(o9#mbU7h*Y=j2BD%br-m>ua#t zzU*ke!fLec%71D2_H|7`d2#Dto`2FhC*Kt6@__xX0G_T1`nvM3&HFvU^KSufWi626 z*t1v3kIxroO!G;T6FmTzbe;4aNxKH;tumQbk6@3-4`7c4=OVcQ31LSUW}_XxkcaVl zalhY>u0TslbMI%McfSk!y<coL$~wj7uTe~98zH2<GGXMYu+|M$kHSy(gA7e=y1Hlp zfx+y;LD8It0dxo<xHSVfat#asD%_neEwlg;d&K!TJ|T1C1%{wc2^Tv7<VMDzUS`Zt z3YE0fAg$n*%R)~<Y<L12-X6tRaoNdk2o+6WVtFY4r)zwUlY8b5+p4XR7+8&x{!iL$ zC+Bijxkh$G7RGJ16Ne924EFiQLm~fYbP7hH*b>m)KAhNK-^)7z=LaeWGI!YhZGK?i ziLp#s9Sbj8t8KOUKS~63`E6U*d%xE0tu$!&3|H8aR?#f546)7}L(iufcg6(?x_%_G 
zW@iorMye4bO8V$G^tZ_++u>9^Y~F&g6nDq9J7q4oFwuY{7=>nWr^F;C7zP5^Z49^t zM2-?*9}eYAj8>8r>>L01w_W5lawv0X2*09c-T$r(y&|n-hi2ylU(xC-!B`aFT%~>i zT^Pj{I<l&V2-plLz%Ul2#@H<A!pLA}ukiCNuA>eEgzGolT<8FDb9qNA*<AT}WCjxG zwnBz-H8PZU`DLSd*Y5di#9rEL+)JyKd+88ci9GSd;VUdgv6qd5XV9K0uDG={W2s4g zIIdQuFLvjc!}zdQHo;Ytls=+!6^V|iNU=dZ&h|iQeofvYoH=@Kt+k0j+lF?j=BAG4 zgj>(2l{P1Hm|R@1d-6PLqo3LQ`m##R1}v)*C)i0+>v34C-OyYV2Q&$dGJToa3&ufK zuf0sPhsnJ*fT4u;u>5d=*GLY0?(@gk)?7`pt#^*6<4-*CcsShR3H@1Sjn^o=Zlt4_ zV?UJ0JbItSsXcAN+C-4w1ud_FN54ZlENMRG3C!u-Db0pf3Gz6R_<*|rXa&Lh0;QlD z9K|)FH(Z0GSd+xPL;@d*V0@`THAc0epNNqR&f8VvnSr2Q3lv4V03HCMK8MrT44Du} zgMYPz5TFmJX+Xr_%}4vbS0r{iQ|`p)mdb*;WiB>(w@(gwJ!0PxlWpSM%}Os6a#zfv z2r6hLBN{$hh!-*-F4+p-@X!VSMdN7|B5GQ9BX_?Wbpx%jn=c#D_fdi=rF)aZmq;9# zMPdF(aAb57cjad5r^=;Stf=2&-iQNmnUXU&fQz--*H2q1hTJtMP4psd#sJ3XB8KLB zTN}-~P<ikoZ6}{JNI3HXsQEs*-gcBoA_p2B!kM}g40AxZTxd_I{B31VE1fkG3N-qF zq1M@oRhK$@U}voQb;8atUIm>x>Ye~gYR4LO!_P?5c9E9nT?bX#%mVkWE$7xY`1*w? zB+;7@F$inrlm?|YbZ#jj4_(Wn1N5~h?FllG0+Wc=Lc3%E9ko9LUstpg9T5vED-ak3 zp+XnTvP-1kT}Z(-Ul+~NM*vVDrF1zrW)q=VV*PS`df=k=XcsV<ste<p&;ise6;SWZ zs(mS}<N%h`f+CliMmY=r|KKC6wH+Fy0Wrbf0$E=eF;q!$ydbpc%UMDra*@_*OqcM5 zM7B293%Ehf+;TD2%6Pz7_bIUl3F-F>vsM(J38Tl3Sf}_`d0f1L$HgatUarE@$znD1 zMa0UZMvrs41tCymD-$i~-)k<>yKs>X1cQYelekc5rG;m*ADs*eE?O#@S+GVOmg@>^ z6sm-EtP+Y^(MGGeWfV08xrSB4Gfm)J`Q|2p^n%4Zdu+>is`nLj+l`|Jn2>%)pYP-B zjv)P0;%Sa->qIJ4rwrg$T44i`3i0TID+~*UVp%IpwN<orT0xShIqAzfw5Tq{dB1Y> zaSZkeZCp~Ka@kr%EzO5zlUZX`y8t~CI$sqOuIWT-vj!i=g|npTvj$v{FE~^UOQM39 z|LoS*W_UmliJkw`i?K4@#_ooVkzeW|9Sg*}1`42B9c&Dv?QM1X+SK9MYU_kk>ytNv zWUA`4+6dEX6gT=tUOKA8u~S7j5C)^LOsWNw@VI%Xja7K~I(jWu1%c4wR+};XD5);g z8&M0ujEzu)0kFDAD+XGhEzV&%Hx!Eu4|c@!Z){>dhL;z|Mq<%y{+?~^MwV>6cj5d^ zkadS_da4DMA|jqck9cj2Y!oW&(wkqt(^x3ghzu7@R7MfljKo>^1hP>Cw$>Em7d=cL zSP=GM6!8lY32vpzbT2#ayP#LY1!Sel-sp#)!J)k7@{3hy)mAE^xT2J1F6eK#3@0M_ zH%qTq`32orJ|B6l^af^pa_sld|6=u8RkRfOn(N1ZS^o9XQsVH5{CDA0n1@Cz^1x2o z+3O)os_MitfUY@2fC_>Y5`$(5!-70G#nmgCP-;)Jk3-HJ^gWCM#%Hu*d1}S-)RK)B 
zGC8WT7IsmE31q+;=r~?33`lYO6dlwFK)mbjVErz!1f|v?zul^oUcx>RezC+XjQ5Y; zm~Sq-enCoTdH8+uGi89QkBalpgPFj`EGMN(F!4U7b1;<w6vkl*iaV{2Acd;&5iB*I zPzFyI$Yr#_c7}_=F*pH6CBZA6YZlQ#Yu>{zn%NuQi$`xCgymzhWDg1*E_cWf?-8(r zX7tK~T#tSnYHBl<27@ZOrq=KjtJ8%md6ZXjnf|d>w4fDuFfA;%BfZ7sSTSYQRR~-< zG|DUoOek0zovXKW!@H}K{o^b_*EfF+fW`4dJaWFjKg3Uw=<ri~dFQz8@R-de*BQR# z04t8&<d7{VW4myH4R}5Ip}D<Xc1y7YVc+$@Irn(d_|UX`?EItrYBeFXPd;E=jvo-! zigS0ndpvd4F|p>lo+I^O7o^{nv@EY0ezC+_Lo2FGfRVBM1_Er%se(8hg7m2E=XS3p zq~lOsgSCMgXgtAn9v{XAK!c6=@;Zd6I)N{*Lzh~MkQ8wug5eORt7-xeLp3vtg^swM zH#mb;eS)1Gl1nBtC$gbrG6L6hcoz2RJAe=s1jV$jB5TCEoyaggP@>Oq`=x0vmeQo= z@7#gTqCxVY?)R9;Kdz9zrO$k6Lv(f230T6LrOh{{N#CJ*0qoc~0RhFq#OWM5lQ;~u z3DKFHn87xHM|{JQAH(AnaFB@pUqr4D`+|bMdL{OyX95l~$n*m+|IYq_Ijp{V1<s2V zUt^40f&EgO1P^bqBsj{ZP6V`m%Csxj*4edpzu2+$smTs}$MC(i>$UvV%XFS~d5r0{ zo%;}S$YIoZTwnwIzNC%Phk|L9<P}Zr1N0#eeQ0kw*Q5K;YzjL=*$EVNHr~)H63WzR zH4VSopVFqpR5LlnmGl-Hiu(}Z!J03|-SnaAtHs3e7{l#TJ%Df1Q`n{9z1XEH7ue+8 zgIR7S3Vh-A9{3v>5CCsO2Ra1bD{d{VVa46BwAqgC)DOMb)ptAR4xprW$ECU9y1g_e zcjk-Y{2|Z6yz~iOM^Q`D7YE>r-2lkZ7i;wrp0g3YNFIf~2_<_v7u6jpD;hCb?i4t< z0CUR)^e+PQ+>?M08?k_mh-TqoP&=M?y=ZXPwpx1<ySNhpT-C(oYyj!L8sM#M)Rk)v zw@uRpaRjaLBJ9MEKEFnZAm4SFepJexng7@hT?khh{T%AT+V7Eak~TmeYDlYD_)uLi zK_B9n9I02yU8GE>SvKH#G(oyE&d7=aDJ}#fsKmf;=2F^-sKAU2VmDBJVZR*%ubNwA z6>jv8O8O(>V$jXg#%So9=3cZyHMqi7-DFOt_!iTuRuBPzDh*q=Qsvqsf})WOMeBH4 zcjyuF&(}y7et2mfVX~AkFns6ooxvcbaW7vMdGLP5v=Qz{AgzvLKm3t??g!K1VkM%M zdorKS#q~Xzq4EnU{jn#Y9tB(W8nfwALYX|Rrm#bcDJ?Zgw@7jPX)?u^lf1o6Iu#qM z;f1j^AypzcPL-6<cvRV>ZV=$o2Ih$!>{m~8w86P>sHrKORO-+?oTtO3)wT{YvH?xe zg3XKL#<Ak^?ibVmg^U{5uCj^*dspIW;oWptfmz~(+Ruf4HJJcTWv1W%9`KqrT_=Nk zPwb#vI-%f;?!NclaO{B9`fM=!MU*ZR>@7ZZgR>Idp)yX>^E(i6CRXez)7$8y?v*}X zm?Q+8gVQ}#a3vxN+HhN#!)vNqY}jfR_>Ntt)g-bl6Hp;ZF7&Qh?9oU{N$R4{&|MI? 
zjwI$gH7SdAr@-*@a#g)=fM9CW+n}ncR+>>9nuGL(Vqhdo5tN=97?<6=#X-yfQv@48 zJ)HV<h{!}X!Fb4bp`UN8G@4%h@k{S}U}<Uire80|c`VO-`jOWH6&v&mL*}^Wz=Q1- zCX?*D_fz@1^1C1U&UI-W<>3rki1IKv<?OE>d%G=m{-*LYW_1XN^$%H!1KR;e2v|{Z za+D%zOmvOOGzqgAuEtVI#F?AWkA4hAH|w~LW<oioW>m^$1w5#P)WAc~<S9Yb08#@c zHMJ7P-%`n+x42;OBtX?6HIrEQ7KjBT*s1}O32-z>5@o!>dV7ikI2(Z}ibFWUuGcR! zL1*52`FO1`<%PwAk46@H37v3P>d=XX=!`IE<SWkxvVUl`9_>w_pCc^V@I9YT1fmx< z!b!-FbEGDk$SWK6ay(IIgzGDK0?h{nHrj`(2_`NDLobR+T9gKS0UIXpWv!=kF2T0T zk!SMF7r1v>6!D_MwqmF(a6;KRYSD<p7RyvbNrSV)3<P<d6M716ms66R6&6z%7*Z#u zoj^lvd}fU>H2<UdNA8}NgRka(nE#Rd-TRjF|N4de+xbU-bNT!ekfr;exd4*>`P?Hv z{hj;|^UtB{ZI|Qn^7*HKA>~dyB;O=Q<(gMM_|g0+{aJ<fI&jie;NFh1<04SP!2gO| zOvq-GV8A;BMOHIB6qBAzx-hD*@-EnHhqIw#7$KP+c&=bMv<p8w{dSDgi^oNS7Jz7S zd~Pn$Z1_P@(Mk@UfA_}v%tiI8`!<zjF76?*&wg-R+$d=wUanZx-T?Y+L6Cv{6Zb$A zr(l<FJDsc5_Y@O}fS2W(bUtMauBtv1G*%=_h!P4C^ycx9O{h`hA<IC}@|L^$mh}%x zs6X;Q-Fjo!K+yR|FKKVhG4Oo&(M<#IpXjDh9i#|_b|GM}y@oPiQz^qG<P@s)scwn$ zD{EXF?EZ4NLI*ojRiGW~Q3osN#NTETW>mVG!eU*T!dRE;I)vg92`4mX(MBvtI2kv$ z1KD1rx_I|(r#$y>Xb9%mDs{!VPi$%nCh~s5pP!Q)nDFTW8V**Tq&1M-FUvh=zs-aD z^$6)Fa$$Yj!%kXuu3O*sQ9K?8%%8VP^)*T6GN2CT&~i5$5gy*L#hi`@00W{3A*6#E zr&)j7--owo?HIQ*2$RL}n#E?Vwcg_l#=7!9bv*B?@|!>40M_f^Hny?scRXuxRM&ss zm#p>M|DeWaxBYjY9M9i-F_u!$bvI|l`gDU%aIw&f=)?L5N=P&%WMw1Kd(lJ8pgou~ z)zr0CU{yE>K}Nz=nR}iGSQ9Z`UCTu%tO?uAe7i*~FvcA}GCAuEPdUQDME>onzqV90 z%Z3N+u~0nUzoA8#-t;+sU~*Pr0<765C!C(@pIiJUgWT0Dcb<EEQxic>u>6a2mZ!~K zSc56nMy6;VEQ=s42>wTMl~_}`)}bk2Bsd?T4T-z`5I`<k>k#iMtudac)1pytM~KA0 zJLn(PVZ02blwl`%1JORqCG<2c!67^;!4^$2#Ud6MPDrW29>Z?@cP4~XwFd6Pf%Pdd zj@O>m2~AT%smC{gQy5D<icDJ>$tpwO%yB0tM0SVnBPdORF0~Qee;|87C`gXSF1gi! 
z6UN(y!$k_Ar5~y#uUM$H&V^r$V=vMvF#zwafWi^#j$Hu8W<sf2uQ6nJ!#^C04bMIE zW7+va{>}Uo@?qI~^6*mzM{~x0^E~iKucu?x4oBQ#a5zl~r`^_*e0^RXeP$rje;x?U zxXGMo%Kr#lV{hbNn=#0D$qhKM9x&wJv@AThD}N}K9CgZ--j7DT&siK3iLUcMo|a_{ z2fZZD<Qe$zPoU!!0&{P_^hrspW0nBMB8MP^7jH!qD9AgIF)cuiNpNje)CI9&7(0`d zfrNqyCNzH7-(g|85-%%tEGl~#8GO{i6U&F%EQkSA37`Qb0Z5<GA*zKIm+GqJl2-sm z|GcZ#+tkXj4z-KJV-n-PPQn!eOkN1elEg9E8eP3?)&r@=tstp@Lymyx3s+I&Oy;>` z%6~rpoMH6va%l3?aliY->u=q9&v#z_r+<C%(ceD*!-pR_@>^d#bi>^t)2!v$rC$o0 z46hwH{*9O2-A6xpAeKoOn!hN&@WR<Y|KZP%=D+gNnWw+{`hWY-Z`Hmn|I)hy`Wf#7 zkG>{q-EQ0&|5%?O7873eD5PQ?$q<2>z9z<qG2>x_vBOYzSC4aO7#DVNQPf9;!TGJ# zhH3A$C~_BPhS3-*4DY+Jh<#X0-W53mtW|J4pS&*5O9o#eL#QrTT)^cbslynaQc2Q* zyQI)1AfG8nUck(gp_GKy-odE_a<;^{x~s4ZuWqpY&W)#IM3as2ml+aJegQdM&l`W# z2xB+gSHx~S`^*iW*YY#CtXnx2yoO`x`jYpaFYdZcyIwXXmd73T=Im!b@Lk{0yVt7W z|NKC{>mf}Jp0jRpQ+Gs0kALb>Eto&Bxa6;QydWQo;dUu%4~Fjv?XF$YeUjG8ID_FC z(8cP7yRmdGBzQfvkWhOGp_aWe<RQPHJ|`(CR|Y`Th&#I@rL|K8a-yBYJp-aAF#vKE zmDDNi16cg3o-q+4LMZZR4wn;e)DhHA6*t!tOSD$w#)t#&n<78OTq6{a0f%9iVe|A; z{)ycQsjnYLy|&thCW|e0&#vK#;fSNz;Was$eY=yHJAk^H8^v76vA#myX?2xXv~5<$ z%(3xsbfU)LH#v^YkLTY!ky(XWS%!+kA>w*Yt>dIgGiOl$Y?Ss%Kb5oySCG)Q01DW& zWu3@G5K$|sv<ez~k~(gHrOEB9n}xO_>3n4cW7S<c|FkEi`2}vlj~cbsCFrKQ0J2ef zQ)Hvww8qU5gb^4%C#wA*JXXW7UA?aD4rS(P_C~idj`c@#P}u|XFIGLL*fefoQ>+6r zov3vy`}hN;&V`?&8l?5~j?fLDk&?u=6${-3{3~wN#+|ss2~p_=)D=b1(4~6By0WNw z=JGs4_y+Jh-xIZM5v58q+lIBwHiSSG5mN&Zl$}?`8b}^gNOCvW<<cD24TchNRChvB zq9om2sp=fUKeCRxNCdGeF++l*RyBg)w2W~`q`mGs<TqJ$0)>Y%RO%W;>T*2Gw^pfs z;9?zX-DVg;C2<Mf#oSjvD_EDVg_<?kLjv?Kw#2<CO4Gj(hIm%=;^M4yrO6$O$e){W zOpp(I2Jcv@o)xN@By#)ziF+IPw#xHt{G6ks56iNA^kK=eEX%Si%d#TNvMkH;r{dU- zV_e4=*BBE*2*HFvNYfP3G|SR7&9W>_Q<^r-C`&2JSl%qlbdE(}dDjj~DP_Dg<K;&f zZ^xffUdmQ@`SUW?m5?0%*Zmy%3kW1^zuzBzII<+0=zgB(e(sO!zAlU{IT=jRiGs9J z%eV?^Y1{(@*&und<Fur3Y9J#eJ3*L?nrJ1HGqBrEb9z%nQ`<5+Ip8Ab#fj-ef}!2c z0nmh@e4QJh7jjk`>mk{da0D<l#>m$|4sNc7UCRF_jtKwmd=cA!r6*)z@%o4ViL-(J z{7Z;W&0IJ2t4V5wrV26ZHdm~sWRA3-APz-mgh6(>D%755(zYsWZ_N6zx^)keSPA0( 
zz|dTf^?QOur<Y{6I;pJ?r@kDdD}##?B`Q)KnjtK<me5Ym(9l#VjsszbK3b6e4A;o? z$X938#8zMKW?x)>zQTPjaA{~k9AGiga;jK1kUen^D}9s?qD;W&ATEw-B$`|^3C~iP zPYc_jGD*{{L_`2GoL1NgQV%nwp;o-3KspFCfV>nr5dcAQD}4RKg3B0aNOn@GUoI)d z3uag}NzMc#<1X7nC8eQw6ARm+)?r>0=;(oEHW(ZAZ1S>7bXVypprLp&2sRwIJwU-^ zqZn?4av5&(DqiTFjc)k=EtMKdTZ+qrvCh$^>-H|`ys39>N#_mMELlH!$2~Va-|jb8 z2eW*5`<`Izn)EI9>%M&LYd@U4Wo$mkd2VduiekR5y{ECx@y%<SKA&mx*6AxcKL4eG zo0Rmj_^#8oV)dn6+uLr)q{_pl@aMlZb@T}*i#Y{u?FVlP&mhjpfGz4FPLtVX1Q13T znM(L9hpHjW2v>taBW7>u9Ce8YU|Yrjs`+EoFbQ5mGVe1LF*EF}K`m%fO-q-Q3Yt{Q zR$!|W6hoE>@XMnX4{i-qfO8>)J}uxQA~ep!#u7(Ow7IjNyo!$r0ae&!7g0swZ$xDs zULya4N_{H|k&KER;W|>Gf`Q#mLz$r_+^Ex*L`NZ}6`FA@T&i0fQmr*FF$ZI@b-j0t z@7O+l@~-&WjaH;ZW^+6nccrg$d9zEJ-tOCQYhvAm6~i1ne-q&ZSUhES^Io&*o+XK) z{=Ttfa_h!5d#rB3``w7y)HN2gEbk1o1cG+Q^$!i+o-TKZhTFw|6XUL5g?Y*B3?a_c z_V*stzpmo$FX&8@3KJ{ASHN);L{mY{-p>f$W>UU>*fy!{?PPpbr2vIKLS#m;%N-HA zC2d-YO5>INa1o^7?L+>fiHo+|vPA|<g%D~?XIa<EhEf?rKBa-$q&8%Q4T)1E*}@iJ zkX~bfJ%CSyBnHD%E8+LAXwUN(7U{6^*M7#0zh@DVmimRyw*6=K{vg}?;5B9v)Bk_$ zeOc(ko(}=l($dP(E;Cj~+yCuZQs~okP+Xl_j__Zx{TGd~UGetME*Nbati93$@Y3Qr zH=avF2jT#FQWM-ibBYKLpIfGZLz$*zBM!lEoCvZI&d{*wfRQ^wQ08mnIWO8!a5u#X zVwKg1WO86MF4hhQno$s0&#%zv4OVE1KV3K=;VzQ<(SR=UBo2s|WZ{e%`SS^zJ!|2> zl2_FzF6$f6v=CRYs6}F=m0{9|!&(6=&dFmKbj^(ridoUE7^OS6L}b4x*JyjF$m|V; zyU2Z2(Yuik)Bq`m#P;Ja1R%sd<}_I`utW(mZWFjJ<jZMXB(418sd~%M(BF4l;%bOq z`d~F}$u9Budd7CF9vm$jiNrJf6Bdi2i(cZAFnrS4s;bf5Wf;ixP6Pu^=s1U$qvy32 zoD0vO{xim1;FPPemJF?@4AY~JxlT~Zr9M5a>~f497gqKfLjHcr6_a6COux3a{qcN; zJ-UpWhlugCn}<lEXK$W|tlin6&k>#l<XSj=j+&BWOKT@XHks|YsCEXCI6y-XkN{#X zU2dDnWcyjd*h-v4a?RCnuM(Tpxspps_c2S{0tLY^6yGVgue+@C8R;JTT-_2+C0p8+ zPFmV!sc^~A@&WD2y1ZMe^}2LD2-X&^>L8B|BHD0uQSPVRSCn@Qa^Ej#3R3;KpW-oM zLZhq#B2nn)1CKn@T+xe@)})<B43;-*j{r!fA|6C6W~90_-k~u<6FNrjMF<O~9=U?% z2oM#du^c+iZK{d3bkSubTY*hwjL<E}-Wk8F?IEjF*OXws6P2lAx~k?<3gZ>WX8a@b zc?4g6>9h1n(b>XuV(%v%qRWcUBDkaah?{Hi9o0p--O}X89V|ZAe;J49)8n*(xznP> zX|jM4N8zqw&TYY6)x#oZrrG6Ms@1#~-c@oJVhp(*A>3yS+8pz1dRa$eLB#gkIrz-- 
zE4;3Be>xAJfqyeVkLnD1FoRT|HPBrJrp6-d`zFll(is46I`V9#f|kagr0Gp78U@Y} zpkdwdeBazH2L63^i{Y$55uF&*EsQ4J0$}#c@)+p>7wEZTtc9BoYVjN$k>zMPJj98q zR>mJAhdPk-H8m$Gat@M!5hwRil_RU&OdYfovW>c*HJBhPTto}FdeChkp{dq%OVSlc zR;QcG5FP}nsQ?}ssw=Ygzwx8@kgyFNW~}FFlWEsLv~}HvKdy{=Bj%#BycL9o|J@*z zyBj_Gx5ZKsMyv67eR7d|qMPRlVwNF7ALqJT*KRfEwizVJu=hUmmNn~dyYNn-X1i~p zJ6cY)a-K`#o>?y*N9_YW1_uACNy;W>`vrFjqJ`Nz1(gm1D6*X3K_eK2A0;LOWg`t3 zkWdVar^H}JxdVs_bae{X2`hIr01^HWSMF=+WSH=Nn-WV>mNI`6(EXpul9f#;=E@>g zqG08^uq{~A#kX_-o2LWQn1DHFv~GZNNj5_aCQJ&m#989T!1G7!+rm$#!j_@k{5ErF zHKew9$q_JH>M9J9>334%d0%+-kY9hx*J7~hH9Kw`WKO@Gu5;TX!fh6Br04pEa*N(* zF_=ZY(_$a{qm?BQFMsj;y6f77z!#ifUTju@GXURphUt5EaJ#k64O8?+`8rBlo4DrE z1=@i|hF+nq9c(F}9?_LWJXVaI>}69ZUgUWs_uT?>YJOj)?rbp>A7&s-<e<B@1v<PY z$wm*G$(KX&oQkZ8X0qS{AO3N?KA*U4h4j1+?=G;qOn}wpQ#R^CxBdRDAL<!=%pS#F zBwXB^Iyb|`_0YU1=@yYzP6)-F1Y=K60NuU_8^=mObxD#y28dKP1{TfefuJ&TPPUh0 zt2Sb*V$7^DLtB-IDwPaFa%Z|=TOhfSaL<-Q=OJ-2aaT5?)vZ(p(&QA_a2^%h(8@uC zD<|=o4UZ{07d!%)dJ)_WDKww_v<-{Kqqet&`{w7zF1~)rjk@joN4J?jjAeWA6Zh!e ze_;EV&O6(i@Vi;clFhI%Q{UEsC6IO|4LkzFR!2@Zn#|W|PPaM%eqFcSTnC<3&}2|_ zlEHuo7`-IM%u4FqZOyY(Or0=pfgTr-tHO*KNaZj`R;Vd4>Moj$AqBI9P1YM=+c*V> zob021WFs`DHF~DSfGL=&rL}{k(LZs|+5*iK=Ldvm7HwG7?XkY~iSCtmc|3OY!bb>k zeNvE*eblk4FIL!VVbd^bKjcHZX}Z0Uc5S`3YwOt_tCwikQn9iuDYwIT-sXIBx>15P zeM!cq8iX}brEEH<@WwXaKjjd(Ztax-f`*!C!-lAo+{9)*nqXd5t_-mXr9{|rP{9+Q zKm+=YhzKsV;mVQ(MT&y$z2GUejh7~irkHb`{FpuWjE66GoY7n)?L0SMUb=ppS-0uq zH{6R}d+;UZ7x(@HpRJT?Uxfdpvu!~?yT`B=h?FVrc~0pkD*?9m2RPeCn*FpVNpJz| zr!%FT5qCE3dz||mHs;l|8C#O_2%_Ue$0W?d=dO?EhL~zLWbDa6{%c{NA*P-^1X^xV z&z`5C&FM9wfGoTCl`Ly^Wr-I!nx!9(0@5I_!cq4RVM|^^xy)&!GmvQMCyov|s>ibt z@bKH%H0}ub)hv*hmrq%%kqEoo&D}bH5~$NSKNTq;Frac83Ys6Df76W+kF6^L9@{Dh zcx>b0uq`ab@3Q+tQbcS~SD2dP8}_aH)=123NwxN)satJ=XC6vc$2LEE@91~$TitOp zo(vD;$?d)R-(7;+R?S@e{lun&v1vZL>f2u)<A2|~EM%UWdotOF$JN+9FPxf+PCs`@ z7~a-3$I2phU|nPAwMX>e*a2?v9j$wkZ_zaCM_Q-~e-*33YcgoWL`fb5!>a^xEsf7F z^9*1{csGW8u?N6@z~?68xd1CmC;<kICFME7#exhsH{L7{0;<_INb5&aX*0UPHvG6= 
zi|ww%VkR)rHDh-+knV7<4Y?0j>Z&p}w)BvMk*wEY6-P<BY|thzT}Ef9Wd!RC!O^R1 z%JBFExqH-61Ek2n`jn0RSn^UD6ozEFi$>os#Z#=&;_83^TCk#sRVV~RjM5tZ=+gvv zMwy$mIzZQcBv@ajt@}KG@%*~~#ewa~VQ{fzl4X)<U3S&x0G}8Ep9lNCZME5aa@Z}o z%MXi(wFQ?-D#UOsWssgSIm~{c?)`~B+s5x((`{g^D<y}!)NgE?TI);l``2a>MS_Wi zp4&RwWip3~oLvUPP@}u-EAwY~3fRd(?p}_KZ{!`?8D`^yh#WCKsQ4p{LTLev60v#P zIk^N8O-UQASV=z)Ut7uS;^j&-#9?8^s*w%WG*XMk!yKrGhT14zFQMr|;3Yi3uNT!M zS*NU#CMwY>qnxM_xfSl<09_M6Ug8q?tVi#DKu{S)um9qyD?4WIYC`pD$7erPWNUV= z7*4gH`UympLf3H?4zja`k{YeMrF7#3#YN`|gWwLFtHmV7z&n@1QR1wWGVfSQ+ywBB zKrupKf@A_`5f#8?vG%VY4I4#Qj(H5iw&g9AH8D1`(F&wvJV890XdF=G27-SXo6_Ek znky+hDy3~WM%3StD_PvWXc<2HVR(HOA6u<Hez`lp0GNIM$<Maa=lA+1^b>}-6P%JF z`-7s9?yE}!YM0Vg1z~Vvb__&`*^$jK0ddZl<X9`d6bOH%oiVnwXR#Qi8W2t^?I=wQ ze$hk$s+V2;UZNwUv8GZ&u>cn5#dH~0n_}&RCVQ?9%h6qiz2DT%_Woi(E@Q91@{8wX zQt7+TR8Z-=c;<>ufML@teeuOd{|n9ldcYQBr@z;xw285g&Old?1o7InyM-WKa}ZxQ zorT7vOk4=W&{gDI2=qnv3N8d9k0`KRlXL)AO=awl=<k6@A`HXpmmwfa<0rx_2O;`D znu;pWA5$NnLm46GAy5~Mc!$vdEh_N^qm68hz|07bm8QZ1DmGZHAaOx4_@dIW43`R% zNtQC~7`bh-kMOmZ0uVbs^Eu)J()MFJtOY5(=PvUk>7#Rq<MaRPGae>gA@<vfTBnmk zDO;O(%2#Tqi5hf-s{rawsc&A1Gc)5Dq|O{Dk9KH{3P2uNi3gLo=uo?GjkuQ5-QgO+ zF<DC7joRFdOOm0uR><RM4A+G3Bo-17X*Gt_@@XvwOfxKNI)R_BVXIn$6UXLrU6{1V z4Kek#asn{P!z-~uCfiJj)j>)H&_R^ziOLT$OVzZlj0U5>5+^Xd6i+*BGm*pq6Mq@B z&oneGXJU#RN*_wGh2UtopNLi1#ZZxkmexW~0>lVtTwL){6`qW@v`(c`zutIo<LxU3 z)mN_UsHVE&v1~S$=ur0GcI)zi^>6+s4y>*a3r@0i9^u~CDj)f%c4GUZWmr`7D1gae zz%_*-S3$i)bXSm#aDe`F;gVzV`P@oIW=z@#g_$&hKw0X`@v92ma;P6HUS^DiVPIbF z!Oz5dXm7+*SldyYQw!LRsp|y&;UKMTE<`Jv$I6yP3rkA{DS}@j4(zcjZS5GK$B~Wp z;|KdPxn7$BRW3TRl~8_OO2R&bIS!$eD}VYD|0fRb+lyA&dAX<8-K{Ne=gc?OPY({P zfB)_O<QYB%{v<zCmY_1vu!oovr5o`YI>T7WQJmpq*A&h$Cf;TVa|1Z90Y;b`xDMxS zm?=sRkHL!uBpK5s`Cjcz_Oh~VFO?5sY;-Y3>qY`g0csC^p7hU7b0)W~bvct5?xi_; zP~L`d^m?%rvg;bWZP8Fd3n1Q94V;7EYhwwYF(?#$rT<_kHraLJSJ=eDzghD(i<j4L zvNm{|)zK?Ed4&-Mz1jGoA%=^m{tM2aHrsG5_(vF??Q%1+D@?lNxjf5NHlJsSrOv8G zqn2mo-~+j(%SKt$co_*6oZ}}BPki>IO1G9~E1bd)wAl*tv-7hRM=tX?ebTVS`IkP^ 
z$pR0JI<4;kN5u&D6OGwvwkkpu62)1leV`elcoQ9Tu)wZAL~t#5MvN@^ak%^}k@6D> zNC^w_OB`B|k7<njpw`$Oyo?eExkx{TuF;&$iGUk|rA3YoV%CL=7__<VLJ7o#I*H7I zo0P`i;1=;+399y3JZgWmq%i{UG35-Tzn^@=LG4N?b7EW|-Kx6)ihcgf(vc5WF)n$? zYHu8wx+XoZaj@l&5DI*>VzK?x57*b9Jdqe0nQB@SiXc|e=1-@^h;9qAUZ{o=SJ1~s zp^F1gS|{*|upaJ5B2MBcLud&a@(%@;Grj3BuC{q03HU~mjHTqa8U<6?L*$ks(lUxK z>COsMn^sQuqS(>YY@|#LIO{&@42XlQkK1JfAcJJwwUD#8hyae5SX3l49uVgl3pusL zL*9&LIcg9@1M|@`#V1xT<^(B30gKl^VCBmX@BFl*wYqnB`E(>5@iK{ozC)?@YIoUo z{t6CLu{-p=t7g|q;KUDJH{Q$mzHcHm0h%|+)g`sDHxGbi#5a+5RcBX14!e046t*@l z-;Ct|#p~n2H&YtLYf{QJKPvZG-^=xb;<Y*1*RN5$&ORcFS5AhE8U%UQu+6^+3&gh) zOtDbtLlqlyFYI*a2#LTR5j*q3LZF*Z@&Gr+KlHqa=XGZn0ltJ+3?`2=>~)#V$GX7c z$`_4ZRi<u~!g1T%cC*E%zG@H0C3U|yIb<>&^RC%zGDTzMW<DwM0<<!%Q}lEs^O(ux ztB4g@B|a>Em`?mZ?7lLq$p9v;0tK_!)$#s`INvP1kV#$)qR$~tjnvMS0tHJbcwtm{ zFdX()-56g!|GX%<W7QYFK<9{Rz_07y)DIx1z7ZNLtRx6x11732*K}cRCa^YEf21AT z7@6~$xO@{I<>ZM`1kcVUio38&7hBoNyB$@a#&v+vcwRV<Nf)@mYU;vxwWAw7!#Pva zQV3Xpns6}<E`}FzWb35i?>N0%S8ZAAwcoeAR{iVc?&@RKQmge_B|3ent)?l6x?5%H zh{-hSsgkV*y^*(^EH=5*G1T9>BGZ`BZtkmfsQ)f?+n+n@^!e@TyMgxQczVN7*AcUR zU=d>L;wxh33Uw+d*+cy8)kaIWc8NnbONDHR531Q(0VDE${)D%x-esR-N!CTuO&9Nu zM)(V=DHah9U!5=(ehM}g(0e?>_Wvk%^I{DoCRVz%AXeI4(4EMMkS7HrPC@p_%9Px! zxIh*tAj!mU!{m>^cJE>?CQNek>5r@jwHFJ_;`1DU{Njo?TQORd#fyTuFt~qV=c1jy zTBE$50OkFz0q9l!TQJkoWHxP|7&vA&f8Ja7mepSJYsuhlsKrrm#ka%KLn|ZyE=eEW z(C6=1+=JxBbX%PNX-CGcGdDqm(wr#LH}tAEc%U$F(qCU*Jh$-;$su*X*HRzU_OM3{ zfoB}78{i{KacFemHbAce)T&CDDi1*1)#9>l%nq^Y&((1<9u$xWHIP9n`QaU`!zLt! zW<;svT~Qs@EjE|HE=_E5=+!t{7c=A;1)V_{UjQOp9{TRsU6FJ$Yz|rn0tvsT^<Sz& z@wmw}Ha-fr=h}!<XSV2s(}og@#qhMd6YOV4g3)QSIW_6o{d!bCxM+jExBgHRwBCk} zb&~_z1#f>_xmF(rQ@5_9H0bpiMRlUo%9oVz&nH4!JsGcwBMJ4bi@%tL!E}ZFxDLE? 
z4&V~OQoC@zg3VMcPs*6LRnmPU6y5-)3)Kx!SGth<Ti^wNPOLpgbN35oDQ5J}S`+Fn z=sR+8S&u(3S%DTVmDzHTiX<o^Mn3KMJuNBX$&j7Y#HmC9-;8J76@5$<1yj3d5<v73 zi&99$YwALCk3Lq#>`FE>Kixi(NsqPh*Jltcjbutc`)(FM{{`B|Ot|pp`~Lr;{X7nT zaX;n)nz(j&WkrYklFf<NF-pTa6uT4`f>1UToG&$UH&9mI5zpz*<!jBH4Y5fl85`zR z>`oV4G$+?lZ$(e6w57JwO%}7=SzU`YAS6eKO7rPfE{%&WsK=d1Xc-qu_1OPLi2LxU zjHE-|A~-P&^V5&wdR|p8y28>i{<jAur#iZMSHN2nG(UZu|7q}clb27ZkEuNpIvBm{ zx;wn(-cW;j&Fol!DEd9FcZCzD-B!Cvxcv_E>0OfO@@3U^7w_^}{5q514~?hN=2QAN zW~u+^t{VQ}ysm@IEw*sb3#=o%pirUMamd+zwc-H}IjUUZ+-zH$rX%9O1>};jLM4Xf zA%2FWH6I8@n-Z`9M%z#(#Ad^%ou5NAVsbXnOp}N(fG*I@_we}i#t>PGbLgop1VoWw z0aU)(@MhR$cDR3Ku8&soTfLcx;0dYcf5Y2`uZj4=E_cJlU2!{~d_q0HB^eC(goUGr zuL~C3=e{n8s=OpDo9HaL_YVvn$;02rUu6*f)>N9CwBU1V`(Xomv2ENoP3w#CBsVfB zN+VKoO>Z416y&lBA&VJTa^79o8z4q5IY%r4*c(I*HP0){aP>=&U?>$NFbNwq5`k(7 zfS!m#Pq#Vk6_xeWu3Nw*dqq1&kg^j!JrhFt6>k^)^Ynu~=cg}sv#@wZ+asho)XZ%y zaLr=WfDI^-z9Jcj7#?XhTd}bHl8oo#=L(ed;Bk^Uq-dJx55arn7>&_jXn?cHR&-#% zD2vWBCs$xlELTrr0eOEQ(io?`1Q<10YN1w-?y`0gKQ>USN%lN;x!Y(?oJW26GBy#j zjkb>n-`YXCWmY;r378>4WIWjfc*7X=I}_i`SFpXqAg=&F0kRy|TnoXR;z)ZGpC8F6 z43WSbx-5Z%2V)hdJKSYHf0T>?Gt5I8`77HqT|-~5-oE;-TdSt0`Tu$48|TofTO+B# zv1nxEuJgBD_U@VWFJagxLH;F3_VK_zrL<of@SvFO7swzq;xw@e8I#;uLSrROFog%R z=D9HGgygHL>tie#ET!Cikumg_w<UJX!8!-k6PLX!gn2~g&db{uJeR;+-!Fmjig90F zj6d=N1;Rdf1Jy7%1}Cf_V?gbW&owfvQDyTbG~Wmz2Z&KU_6z3xnAp}Vz6c}62GART z@3aC2yOiu3@I>KMOR||oVr+_Dn{YCe@oQS|z)*K@6awfIakF#E%*V%{*f!ml3Wi1` zDLs-szayicmC67FQ?NRrVG>#U@^8B1aesV#^zol)xsUEad~OTZ%l)vx<^aw&JyyDf z7>8-y^Tmv?DqkBZZUHzb!$L`@m&wg74A<7gK(Bmm88I-S(MwFka$6QKAsy>y=%{vD z!!GU`pfL*_5E3ks`(POzpjC>yiZKkJp;25~){3*k$$GR8!Bb6&2)!kCEz2^vZHPxT z;)ViXl_QH7D0{-CO^`L%3do@aeAWcUM9-ZdM*O-UKhl5TJ9}<iH#xoY!2hGZcH+yO zHn(IfGn;*qX$PQK8=1t6`SfjG)6W6BIxW2JPpjOTS06s~w6N+G<u&z{W-D(uxr&NC z;SIWP0!w6G$ZOmIq|sZIEXE|B>$9k#HFI64(L7c-O{BI5Vo3=%6hVuqmjvN<rWeR@ zK&a+>l0jgwh<1-vX>e<y?h=E~7O9}diAI=El;kFut_ASH?q;Q_g`5|4XLBvum{=8% z_vKX>6O()Kmfkqg=do%{69!mhiI}ikY;qk)G;+Q6ysI+k)o9z&bb2A9n`F9_J>aCT 
z#9P4PUyk$Z080isM1*0R&kb6W)i#VmmWdi(h{_1GV?kknLXCq#OW1PhY{}{Z-^N9M zi?1{aA9-TGr7;#g5#%4)w31IJHa&OG`ulx-kBZi=zVL&~ZhhhC(@!4w9v?ooesb6< zzFC*4k~(dn+E~@vhec6+V)Q!^aY}t>aOaj~q9}$XL{v%9c+JHhUHpA2a&Xt-NBP&) z8{asv%kIZMcVaDuu@+Vh4aZo=sTB)`J0__>S0M!rIe<mHNcS5#TNM@o!$}MR=RQZW zhEPc)CHEp&zLmy|T8mH=D~uV@s4nk@^}w%QWJ6U;4xt0U#)vqOjRB)2N8uyo5cmb) z3YlTL1=^-Tz6Rtkn!#wmkpj;&T5bkBE>OZ$i7+~9HLyFsvtx40{Ug<tef!s{uYxUu z_@c94#2?Re@GqSe0|rk#v}XMW+|bMG9=L6&*0yS3^e5j+928DmypU;ZTeGG$xpsZ_ z;>0X4J2gu~E*?808SdHBqpk~d4NZ<lS3eL-1tDT3fUo^^-M#u<xD%q>A8T5^#C0ps zZ9v|Us=&l{n>R5(!Dzc6NCp|qoDGDjjHya?V_L-O*27?h@mwPC;s{k)2{)M2+hivq z3%LY0DHH542^6}!vHgH;hXK*(423DcD7GPd1qu^A$$}qnfIX)!7_cFH#pWb`fCwJJ zDGG&gz9!DM@|ncV2af#lnjPcg{C&P`W~chb`fu<5XZ4MfyT;8Kebm5vgG<f4_{63z z{*{OC{n><YN`1xf^2HZ6zwyv(>RZdDw7xk01+jE2E{NuMfNTQI2<)dm?57mBzAy)l zID+;Q<t2G97jt3%U@I}cn-umB%u(J9#0H(!6sEgpw^9&(kfvpDU|5YCyN7}+j*z@T zKvrZI6*U-9$f8|CwDd&NM+h!pyXLn@P>&PXgTy?wXtNwzIuZ!%-y1P3+877C>Uq(c z?rL7VL3r=fs4E!sttuPaqaIthBSbN774%{>#^Nc@2mh)&g8Rb^);idMSnG0?^LvBk zCA2SCeg@jQYHeSX%<T)%g)`<lFdL#3T|w(jo55aAcO%;bQW=4M<YGJQVP*7gq4_&N zhmUVuu<~rsK`Rksg0*L0bpO43Z{9KP^cwi-!~6fI`hQRCn6xI`c0=#yV2nNEzd3#H zPbY-O=u>Xk^g34iw=0}&#Y2<2UE1fTI{P57gfC;Q{oG_h<$`uRaku3m1@aZjJgZrD zQOe>Wg3gp-Jw>c*0DqPeVXT?_XqinhaUOgagh@6e{ASTwS*68^&7!>we1pN7T2u_# zKzX{=)*`^2(kHSDqtEe){V)Fb{2}%AvY<KaThC9<eCI1`|JU~p{^}0D^qju$N%h5( z>gk`Ini5`^cAC}W4<F_}ILh09blX-xUxuZ^=OF&I`}FTXcL5b>;!@0)8@V-9A;L#t zg3|gpT}eLrL{R4d(GW35poUR`bU+)rWU>k+bRNt7VMY*9Y?F&n{paLTw6RYYS-igB z5_$oY1lYDZa&~kqiD)TEPE82Zz>}3Z?X-wUsPTQRkdlb2vP$Rf=JjLY;H}<)q!=v5 z$5fBce5XA-_c_(4pHt709Xi%u)7fvgd|&@g!ZX;}MW30UJiYuEyN;@F(PviQ{23oU z$qzkw%Hni*^mp&mKELiwor=g78RYEekKpr%D4qjBjiMtj29i|bZO{>uFe*hf6OV8g z4h{5rxiU$%Voiy$3Q-@4dsz{tDCMMfLyliDRN!)jHSDlPzQ|HhSwl_><mm=H4_kmS z083g017NYi7r@5XVG7KMF#n!v0sW-QO=Bc<)9N0S1|3z6Qf*zfcEBV$mbOptzW<8{ zZVQ+Pe{k!MU*LrY(l>;>lh=gf;gD{ly~<nB7mhX!#gg%#JeYf7ujtu&?Y}*JmN!dM z%ZjR)FHi^HC4F!~--o-<jk|D1Au7?qcSXUtLpuSwExGpQd{ZjuMYty=8Nr1MH7{Vn 
z8_(rh5$)pRmtt&zX?-<<>4>~M!5ykJmOi=+IlAr2iA~33Q|w-vL*V2l)L;l*pQ8AJ zLFpmytKEW(7=$6gbtt=lcGD>^gV2Oz3%ecXVfL&+!S5a4e^>s#@vg7`pQ9(ABs_S2 zGVrg1AlAHZ$Jt+Q-Tqpc#wDd7sC$6Pa-2vf;&(oKc*ko`-MvorAW#!t64l$(SMuun zdpnqjAL3+w3?UtVsBkTkuNPsjS75Ia*lWKwqsNq>B}86iFQQ(lXjY0DFs>NDIA*EP zIU-1)xXnihdCo^<E12SUl5GG2aWdmMNi?o-MJC}e$Z}4i<7()F<&ba3&=<?;fujS4 z6dK~NpJKD)Rw!7n8o+H(Ae8Z%!zUOwJ6&3oMiWV_g8LzZhK11RK_L*H+<SKCN%feu zE8XI(3A@`yY+?U!aM{<cHs2TxzDlP2{*GV#^WEEj?Lt&`G!+)awP%#Cubh1UsDvvZ z64$*^S?@3iVq4PLSr(`r4^2#F5Sq=Kl*JF=^b;rHKZ_6vALo>M$|*rV!PF}Y_G2Ao zs?`t*hlR$CLOn&nEC54w0;Evzr!yExLIg4EXXAyrQVNGru>il6TnWooGRs!h0H#63 z3oB85NbpApeW_YeX#>Cp+;NyqAPIKCM$^B=v|FvEur<V%s2CQq1rgYik0OQ%W7t7a z8Qd6I7ti~7%y-dIhR05|)PmQmrB3I4SUvuu<I{aJe=)W(l=4mQ;{W~)Q+KkeF4}F% zr~01ndqlm!2l&v==KF=PdiLb;i1>xcod&+$D<(|_(U-blN<{Srb3zo)D6joJzdPlo z{WpdC$HL~1mT?bj_RE)O=%vb%de{TE5ZR4N*ve9h$S}>J^xy$%)=He58w7S3B_hmd zh8ereWX5joy299Em-fia%qZ7@>sxl=UMBj(E}Aq1TNZ~3HF)4K0mHDgHdM;c(FMMY zu!SDmOhrvCiB^%=P&why09HzZ3(<~W!1gNZ+G6V2VARm&jso(04q~2eK^6ydqU*r! 
z0wiq|!Lf-NSl=CJ6`rxUdv4vntb^Zs{m2&s2fe2BYVc_jteiK&KpP%7b70que?>~L z^Nz~%gKOO>->Mya|G3KUiHAKy{Pg(W<)2vV6Q9O)e__o$l6<fUbkO4V|NHna4{W#F z9Q^lu&b^ah$93{WWd8xxuRD&J_fp)PcW{aey%n03)6_h%k=bQge@=!HF_BFYp3lJy zsuXxtFeZ}&do8h0V&IEip%&VGbp+RYh=<I5h=5+$lL&(HyyU4QIz#M1ceqx^I0^wk zaQOunNKa-{FT&($U0G~Dx+?_!mq7_kj`!@2k4?7TGIn?9$Nc@<_CMZ?+CqBm$&>f& z-wDFVGr#5&71h(c=iu)vhacEF_S4=q{=+wHoZi(D4oz*~w<rJTiyQMh{`K8=9wEX< za5v!W_5thKPI{&!(L@u2ibTPKMcZ|xLRXMvR#GC~fV?2zTqy}Sz0G)UB_nySgj-aa zD=GGE21y>P)jQAS%e0M$TeZEI4P_FcPli=Ypk*{$kOXZfYLejPgl?*Ah6Sm%Uq)G# zgTUUPe3WAadwOkhHD-@c=f?f3ILmPVKuHm!<}=}Vrkz>U5eStzX<OkOLrl|M777M* zX>bz;n!~n04JI}5YPhu}jH%A%G7QHi1GW|;UND9a2q*bsTSun(_6~_x|KkT2f01nU zu3ycM4)N-56EPqu<01xJV^*uBA<4T|s6QWCZ%OMTLev^-zQC=r#zcP5ykpNYp*R}P zS`GWR>@!K-i3mT_%kLdFV6RPl@VfXae5D7`W>2Bcjp;Y2<G`}uL&{f3#UiNnQJRD> zHVZr*=fL^GLQTktQdN>SaqNC@fSInkiU_R(#3LAJO-V?Oth}y@uWn608Q=GJeCNM^ zbGiD2`tpGnb`F?y&k5><XQqyR-VuL}Pd?=r<B452Z+-HOo$Ax-SFiuUV_$Tq;JtSA z^<V6N;^(g&XEEgna>Hlv_if=$7B~RuZ%eT;P3z`u?hs(klp(U7L<3j@5IR&yxnd&w z$J8j13^g-G4?I$e*<LBed>(0<>2Z)y`oNZ$sTL`746tt)+mjw8`iBJ0?F+FW6cA^y zUW;13?Z*A7RM#|+tFqaM+{r}pLYO&arHn#I>SmOygn6yf(L!rmf_eP3TSuV>BXdgA zTyL$gYV1@voDF<w->&<mNOnWt{$TKDhj$;_z3psGchITcH2G)uPMzB3?;1()lD@B9 zRDWqc^qt+?1N|fIb&>G#t*0J1{HF%>MX@h1{N3B{eKcSW_uY|wKp`7`kWlZ}{hdC+ zEhQBU;_0VRMx|)MxeCCCgEp1u^_Ia*St6U{vM8$|OqZ#&-^9e=ni%tKFVoHhZ%WS7 zOoK^aKgT$9L9V1QW?raTkYb4gR<>(~CA!KGQ6|<PR+a4-U15|#O<6oZNw<P7PAC%L z4R)#=>B;C!!xedUT^B1{G!t#S8}$qvFgFGsiOjYG%qP$T29+&=NRX(bma)K(C7aL2 zI=dwCv1Nj>wZX`{Tpi)54gWO6^I|eC8cJg({1bx)mKaN^7dB6?{zhaf<>{^W`cFQn zdm|pvi-IMB=*@8VjCtGifI1e5wMeFo!vl91q?TAf?aIG2Hrkc&N7IZ~p8w!+HKe-> z_h|=j;S@{(GFn<CMG`t3CFY6+S0{DPL0h7@EcCR?j;9T@3J}WTq07WUD2owQFF8kC zO&M%s==4x*2N7JFgC^8Y&++$GX{enxDOdEUTwC^DuHsRBu1%WZ+KL>Q$&hBewH1!& zObvZK)AI-a@fzM=JQJjE>1`Q$+e|zA+Cg6xJX8HA)qfH+SZ#LPuV5w4D0n-*;Z7G- z1UuR??F0zIKYZ5A6;+@ix!^fG%`~8$Ni3qyf<-K~$yHQ`Mru-+WN${WttB5?t(iD# zAlOibX?CN4e;EjikO6KK@e#e-)BM_>nByzfmU@49)20nU|Deg4az_#t->VWIe^&f_ 
zRd#cIVq|TtFXaHac3E;DY<HV<eoSEA`NFTSO$DMY>q9%%y+*_9fs3oYr~as~oF5H( zmc&dU=a>KGMZRlX%U^quzy28cYnFTs6X`*DMQNu4g9)R2pf=eKU!%ze=Co9{l6)iv zf<e>@e0~l{d=S~ladhqDjH@h86D>ZbVdldPHB%b*!OzTM^W%j`lMM#@0F{_=;Qv8U zj$~8nD6oP3v~pE|YBcDO)zUO3#*Xq9$<;+uNptCM9TFLXG06@`um<T3S&t%|g=j0> z-%ht7fb6Hmj$0iV3navMg=?lwKG7nsEEVkDf%sEMg8yY`?~A)1TR#Xb<j1-?wwuj~ zSm0wJRM{YSn#f5Sb0;4RjYV?ijn<FT(;j-F9->hIQ{dt}Sad?FJtnNy@~-f9-3 z$=~=Ze$0P`FMW|8IbPjl`0_V!4{^Ygyp5cv2-d!nQ)&oLqGx3sB>{ueTr7lNDC)F& zXf&8v87WTz{jLo;(0~`Soh}5AOv=SDn8_yBkZ716?Xw=hGjUhPP$@uy6zXC2LjRq? zEmJLgD;4mVV@YHG1qrO?LW;;@AmC0IRA=sGNWbF;0ZS9cBR!cwjQ^6oZX_g$hgZ&s zw!R*_x^<A>wQhJM*4m}+aQ1ae;{H)pl!ED~iSIB*x=#5=$C7*WmUys8l)?xIsiqui zegC)5@7?h2sZ8SQcWf5>({-TL)?T!TqQUL4yDw}*oK%jzISrrEPb@B33Xi}LkcD|D zRAK{gMa@bL{ED)Kya?zaQD}iU2>X^uEU;7M>KfKh0dXWe4<k%1KPT(q$iXgIu7@AW zmctLx+;A`LV!Qyla-=wg_+4p1TtAa-K@*dW%;^yV$<`X!y4RL3w$w(bi7mIFmn(O` z@IVQo#P|^Uw)MR<vM4VBG(~P89+)Z{>qwz6O+gZN;evVh*;(O2=~1V#6+_L1sH<9N zR1~g3Cker79X7YHcFR|SWzqH`-h8sE?Y|WjC1Z6(tiw`#dcD^q2=>WukH36))1hq> zvG7y;&-SU$=rcOcwz?)YckK$7z9>;7iu45+X76HcQ8nputr<`sSAYIEf8o#GDKqr@ zW9oMGy!tL*&kx=;$l{X-X5jm<E*0p&u<B&quFb$R6?3DeVlJ|s%;&sJ)0-T4B{_09 zRoTqykRoc2iql1PNVGu^kwcUm!If5fW%&#$TPROFQW@yAh&_K(=Qa4kJwsmewjGDX zU8{S6;}V0Vlyz*v5W2~`_a7*>_Mzw4i!+wM<iaA(13$Sm8b7IH{01OS&oycM1`Rl9 zC`Qfm8-TC6Kx$yYsylrMkf|ue^gT^zPQ$N<W*J@<6jIo9{Q}>^kf2mXW(632?;_!x zqcmK9U{2VSFWS;`AC+77eJmlt$ZSIa*fkNlrV5b}u(iPF)sG@poZ^34=ro3tv>fmv za&05pg`E`g^spy#sb*!E%mD1AGC^Y++yrJb^3}`AhT-Iqez9r$=5wsURXrywMN@FC z&5)Cl$6<z*SV~OL(_NhS<_kNuySQJ<t$0*6xZcZ6K6>V?``q7ZPghAf-=lJ|`@Njy z(KCtHUwDgdVkm1sU}O?^FwedU?5mD_71>ukea-btGXwoA@V4db>mYrlpS_N~XNtZm zc*BC5dpSnDGYWp1u5l4#Kd3}Nuon?(CU;#XHw7vT2ges%d{zAef=<*MmJhC&yl#r! 
z?2Etfm-|L`^Y<bbk|XUn<m0x}Mee$AXZJ7;e{Q8+v6msLgIlnpqM3I%y{wE&(}VGh z+_MZnM&zAx=LBxLG~H`PD}{9=a$3Mvq7e&4)G*2;3mpdf8uOQ2VWScJ722p&r9zmE zq(sogw^&>y7K12M@Q*noF>h2fn;*0Kt~T?AW$KUHzp!XaF4~OKoiDnl;=@-(*ZxVe zJL1HK@1Dr&o<C%*wn`?r52yd)o1TDV(z@Z6^tJqVheUPPVx-!fwi974Y2O|(4e|!> z-N^yp%^iVwvNIXG`E%Mn4j?CnY8Q}}4{*v5{5x?(fxTRA!bm0?!zJ|^#Y!D-&We_< zQbjcw1S`mC8LD;koScRCwt>odlw5X}2G2(s>%-_Uc5s$W49Z4qU&6R%fi%W$VfyM_ z19XWR?GJe>t0E1W0LtfxAOg>NZ8P?+9-3-luvjLA;9!xORt_Y(IXHBnIYCpjFz!Ik zi@FS!5~F@jV_gke4iwU3IZ=05jGWEQV+`6xdU{t3O_@tfp$GuT3GesJ=7WQI6XHc{ zXt>90-aa*E5G+IA4FIi~+GsKnTET3di0jV2{4aAL1huTFHNDy5t@X|!iLJwDOu#Cr ze;xO04D{ycEk9r4E)tZ*kOufr;EDwg)}oOr*~N6%!t~UI3sMa4C@&7edDd$q=v=cL z2(lY?Ry#RPn(auFSaUAitnfBk4ITvwe1#ej)T!wS8>Z9owFDlPy6!xBH*ygQ+R%~q z!}4`>7--Wkh8WM4s|h%(-IhjpSStMU>lW%KpFn~7-USNE_W^bw+@s#Mc!Q!U1-h3b z3$X@jdd|1TP&Y`x4h`H3+AK)UnWVWz>STn;6q=d=pNq!jC7MV4$?5-dXwLOXkID_@ z@8xul&Ik=U_^v8x##SZ4p}E*sH~U)7zIxc#3i_(x&GSA@!7Jp=DxkfGIx6af$iLyo ziUplygG*>A_f#x!UHqk9&_)<YqWdAvWUdT`u%TgLw-IRL`9?`thdmqS(LxDpyqmb* z(DZF~xP_;Jzj&;tGd`U3n|<q}LsPMP)L&k+Jjp-d6%3;ISnz+ur<RQL`E_fiPmJir zx4OT4@eiNho;7&8ZrpW*f9;~j;PB|SPnsMh<}Y?{{ql7!X3qgsdw{9t!OQ--;RW2; zYq&QH+T}(|GD$t7Zlr)KnrHM&Y{(!GQ8d8vKDxm&<9LHsqRod=OGj6+DT15gIVV$m zQyLf-djl27oQ$W;>4ggbPZ{mC0o1bxSYu%T3ZqKX0PV-7G~O^Ek$@${a5U`ERrCU< zr;W-@;hk5pBKTFfGv&4TJq=VXL$$lfrUbyp17|S+RU7IpfLFxaHBQfPcG);VyV`?F z`NKu#QfIhv88KwC7{V_YjKOpm?MPw>*5+_#)Ba`(&sqF!!=O}}&FFmGYbd8{U3slq zF#ltQfIYC*@M0z|ijT}pe#v(F@jcUt?zAL6lTJP*+M+#)@9z7Kkm!!#nHYPfD>=4o z$I*C}J=67+D7AJa-x>~!=B9A`Z-g9w%W#gDzpuL`R*U*TA!4@*q9GK$aL1WLzO?Sv zSdH<+uY6(6>hI|u{?H3zILiF~2M?;d^!rc??dATwpxSK0`a-glmYaeYXEl)GDAO#} zN(W0>nPw@<Gz=RB$=Dv&5X$vv0Lxh>b4FZSsK_WGi58M16%>gvD@g#6F`QHn7{6qb zo)*d`i@1u3o2r>|QMm{S6(@@ToXl9P1?XWKD&)bMbSfR-Dbu7&7&V!>Y^hKH1zRm& zhq64-JK}6<8ICO3{JdeL*TWllv+%S#SzcsLwWoZ=M^?NlO6hc4H0g(Ix#Z+Ui$5`z z?pqyoIU7u-z|Oo*3Ee7%#0%SeLscV~8X1<t0{#h0L&^6~cVJwN36&=&r;kjniuBe* zeTjgN&Q(keh{MpFK+UtjRn@G;f$+*sN<Pu5hKWhW0IWXHR7~cpja(0&>1f8-8mS@R 
z!0|b7t0C7XQhg{!1(_&IkaGN1MylURu#2F?!h6zGEMf)eA|k@7BzeVbB|JT&R|^u+ zpMVS>?IK0*L`MS#Zlr{&h+rz!R8ba8HN57Uj7hCz!Wu&^cNyR)go0ui4A>o+|0Q0D z3-DO{w860E*bRH9!^x<f7Y|<loWW6*3U`d%^UZK;6`t99?JJ@^lB#!?-zYveY0~XJ zy<zyoy%)B{C=s>HWuvB0RCjpu+M}uzt3lW^{{nb+cy%y|JjxFlLrsSbad;`J%+=iW zoT8_D6XI)13Hd9)qVgWEz64=5ke_@m5YL&;<-M938%l~0U#WoOk(%>BeZc5UWw!$L z54i-J6W)U!QVAM9`@<57b%baDR8JqUlggrg{3FVVPnPU^_NxPX@BW-#_lFn0X67sZ zUU^-3;KQPc>N_$oes$T`UO70b{=51(-${KMO<yj8K5dGk0oJWR5n5|%a}<5WNzuNX z_6anoT<7w7?L#4<COH`^`);PjS_}^!V<l0ghNi!{8j$`lj7~9CDjZ&a66+3#jtPOX z)#YQ~$|3sPHH;qk6BGwg9@PMhKz9+c6Oox!XT%uQgO?VUd+3m_)_--xm58s-j2D;J zCIfeF7u8P_l^ZccVs?1#bZ~U@3V$jVDAQf1z5_dv*5qk%2Um;C^Nq19gkJCSTD z)9xaSZw^AGAW>q*LdMCRlfx*Wp<WCg0~zqb95k!qY$_`bNzz5)ge(J%)BL+h!~wF{ zlu`!}A~0BJ3G!$!hu$GgixLpPszHMttmj9j0cxpbfoTK=m7toVz}ILsGeUihLw0W4 zD!W^1jJGEet20CPV6wWech6mqJo5^F=v9B$mf@-WNBCoqno9`gh(Bx+Q^C>OSJZaL zstim2`?_;q+rq=O^{YpE_wFA5&Z%F%0lN&T2aRc*3p3bwj~2QaXq;-~nDG>Xd~71c zPd%<A9<<>>t_;vh>?|wddI<B?qX4FDsX_q^*e+6)gUdnpig9NaGbWv4xEv<hXF@|C zHFwwp%&zEEc<oA=o7(%pxycSx%sAPPKxjr+6r`RYK`~==u$X=v-KKR5q_^f!ge=Tw zWkmFTn%x8vqpO!%u0sB}%6p#OzIul)yeNx`-n^&dLG}0Q+v-04i@f9hJ)Zus?@?vS zZxB^9fSy-hQI%m)+I;>X5`4;$*feFwcJbkLk-*S}vV%<1G{)aIXv%u5;j*<i)0rGn z1A5ZPwQ#M(<VwB&yjjz~MQ=Mw%?o%ErIB`Ais8oO=0=LqNtc4prCH~nCaG{nF*!{P zgd{{kRooId@Df(VEureXkByuA&}$_sHraxcs?;|ElMh{l5RB|b|A!RK%3#tOG_z+o zJ?px{m@F;Q>ah!~IXfvx*Qn7(4Iuzkm~=4?a0f3uKfZt4?vB+rCGP4F05C*)(~?1Y zv$nY@2Ii8%*Cw8R`TUt}%X@w`ajLjR$Q}6ku}wc5569FK+uLD3lZH2G+pgOkjTk%^ z_a(04T@OCZOX0u>>}El�-5c(nbec0-N)Rz*gY3EhQ%nN@A^5gcfVwqpK^`y!A|P z9q8(E1`JO^79ENbL=cqjCm)t3O3}1L(Z5^?gW`UyB#2YClGsa^u$yoR{9LY5k_~td zpkJCoI%H<#x?&iw(k5Ggze9^(DfOVN2t(>fLkj?aUK^Dzn(eZ)pZR@T!N~<{Nb(Sc zY&5WmPLS0G7N<#sfF(UnC;CQcaMQa`L+loWouREq{`9_`iuzCg7<=Be=dsn{#QG=q zOlL=SgFSGn_va?lmL5wuSm$p``7MHow9^x?hce;eJ>n*{#(emP%Gc7?{>|%K2lwq- z=~Rzx>oCrl9gM`Ou0g-SWUKW!YmFsdi=*0YX-YR-cXbPbJ%KZ+0o{wpryQ_9n4p<z ztr<2?(53|;I$-Wf2l6U;Ii6hJK?Yvl3p>(i0M)d5>5{<`q~SM_#S7pio+WAmqnVNp 
zljYN3cqErfD0uL~P+VAwhtTz9mJh@7Bp=yl%bTGd(?RDSkZd~w5M-V$cri`fui%-( z1efA+`^aseNr&WXIgyZHX9F*<m2Wke6*f9D`q};tc~wB?(Q^23>Z;WYSf!mn$Ta{@ z26&U*?fHCgY-^YIz^$teNqv9XdG9}dZ`(ubMkI@-FPDM7+?tj{d$xD?@)>W+=dO*W zW5TIGM%`T(_LRBG`o6S%;P`Vte#33+``r2CZy(yi`vGZtX?r^$FNO2axm|ql_1`W@ zr2VPhj?g#RS_D6ciMtSYcY<#VN?2a<A;{n<9CT>Z#d#3E(6<P`ArFDH7vJPdmwBT& z6{R>JZCUoP4eQn>5!nw=k4zj35Jw5ilj_CtNGwF8a2r4(CCK9NU?skTLKbM6Xy}J< zTO2F0b3oxU_UJC8y!m2Xd1D(33A}LD#C%Xng#k+@w~RHWd+jm@JQo%Otxq6{s94NI z5D^xl_w3^lFcN4Nk4YbNHck&AtHNU+4dzt|MlFyeMAG?3Zg_rtqam0)zV<nD_ugc8 zN)mHp1t9Mxq)ixNH;Gbeb9D3C2M&xpnA@^$Wct)&I`3V_T&nA61pM7Exy@=m9+_=) zL}Oyn)NHIi)gn|6?p?iqyG<>-RjYYTq0fI1cViRxmq5T^1(Zq(8wd!iBYy<n#OD(6 zoF5$AKzcV?{fMDv6P$QnqfpFQ8N(lOlu^UfAmy9{1JJ-SM6~48`q0fVbt8?O0oYN8 zKIcP#y;atd$}@PZh|vu5XQb3Xoagjy#CF<Xb^0CJ1R4mGNCVVS=wc$;q`nXw5-J18 zTd|u99wuEI*1$CYi$Cf8;iiM{3?IC0c=H$U{qp&NuAnGo$_z*EKKa|uyKfr>?qTQJ z!3}*ij;?^-BDOqr@A$qoNndR1&TV`7PNVwnlsWkJ1Cx8FlD=tq&7Rb4<4etX%4-ml z3iks$)JU~LinA3XZC`|;*rKd1n8$b}YxkfSWjRM}de~;BB1~k89%c`65q89<fIn2j zYynA=Wh&UT5tv=gY*a7sXwyWp9pP8rSXx!vKvrwC$vVX5GDma8JS^vIfK@G=ry{@x zv}XG~j0FPb(U1=;6r*SzWfkl|sv*vr66Gx6{!+WoV;#ODvi7!R-~aab(m!*y^mbSi zf_?f=w*O%5&TXsi>RQ*?Rl@t#S5Sj3Pv1Vc`RlFz<z4%DTV%XLSiL2=b9%!q>Ez9Q z1J`u)?$q{K9crsXIO`!&<z$>D$gKe$*CzF7yAcb2p4WiO1XrcZkkPQ!u{_1hY%$|E zgIuBj7(oV;_iH91;ye=>!AA^PY|wy5u>)c*&N6>BOvP|Rs9@o!2GDH;fFa$Asn>w9 zj;f|;IkK{;JHW;>bi#Qx!T(se_@<EFct<eobJkoqeh=`F{!0D{QQe&AH5kqc7d-Yr z&|RJ!7N5KLjQQeEcJDGsFK@vhf!P!kN(CX%S7dH730r?=)eUrw`pFK!oT?+jo49{! zxFy8i%(xxvG$-c-Tt?eI#Z)=sic8_6s#$7-J;mo@6s_m$wXFjhGKxG=NzNIi0{9~r zjLQIM$X;NkW^~mdwnY$rR$*gQ16M=DEDZoGBTUcL6%G^K2(B`lL?JrWGBqR`>7Cg? 
ztb+p;YN@L+F!YSLWA$B|qrOI8nYXrwzjZXQE)b6!oXPmQyZANBcMcA9bqtGVr#3$L z-21BfYBrVX;0HrJ;i~ApJ9q2KHeVIkIGLWD_~JOa{-gu&0^_jm<^HO$AJ8kU#eQgI z;a)4T`xnu%%F-9sTXiLAq7zXii3#B@@RCt5o8|-Er5pV*k_Ujh_pw@=k2r}?Yoq*c zmWY{+6lVztQcNXzu2=_OPj_?!0doTs0u_UJudZI3Q)?xBe=(H`z!~0#&I<Mp3t@N~ zCr`mR1tRrN6$!vmqeI4z1|prS?dhSbz^Hy{WdKXL$7(pDz7PvUDvK<ENvpY5<5T~5 z3haU|KuLL-IZ(6xUf7m6*M9@f^`)d_*b4i}8Q2YXp9jZ^jj<5a1^o0g)>e-RMg+3R zc8;d|6de`HIb9KqhfJp%wy6iEqPw1=nr`%;bJ%x&m{gO9V>-Cql<ByrYz3aX7S{iF zlMVpn-^izPG7nF$*k~GvEQpw!MTA=1$RfgIU{1S2z_rL)(~#n$%TsG@Qfs=7|Lynx ziZ4MtXw|~OtM-iZCwSW#gU;=4w2j<33cU(_nY$&~h@SK_D<_^RDjLWptx@sCiA9m% z<XDSj_~Y-JKX+r8v?`WlqAsgmnC(aF%;ElCo%5Dy+N1FCPvNXKazKaIl6yx7KZHF> zr<URmrIfXTN~skTqfEz<WKy{vWR39pb>|R_%U*0tI}rFxJ(#o=lsc9w)}eewH5bZ$ zs3AmN-b+LL2>UGX>vT$uc3=l6Dy%1API^dfY8=_wkx>@JeN>cU!PY$eoju#TdfO`y zwuKP3%?nU4UjNW)-4pX+*@0huX@Szj9E(4lSA%+l>8CW&j0Kyl!<;P5JF?-rV3L4~ zq$*J!L9Awt6IhhR4d9|;v>Dp47%<8?fRQ8ELAVbA0F)^t&;y?UK$;wm;LjjFZ3WGU z8O%gqNwYP!Ixtb-lZPb4NO0S9V*zMZ?a)e}C|VO^!N+Kz2+w>0_~2m5*VR98Sbc^6 zL^A#j>b7upkk16Vl5ze$-Hwm)?~@&fSN<MD1bC3Y=9q)Z!+R4aVJ9_XP#`8|mNJ|( z%6c(+B7;>Qy8*n&R`9^$ELF3C^lG#M6l*mdD=VQQt6Bb5Kt(D`M0#OK3tj*M9`b<s zWtz1}^t&3ziJCA=Y>KCdpf?1MM8eQ+LkPQ+zF%g~^}~vGY$l~&v@+WfhUSvY?-~~5 zBV?cq_5nlzLl1~=C4wKv^ZpTrw5pxMkpVs(+sNN2zAO%{y(F9aCAra@0iRjw3q?<6 zuf?5-8ZvRdqpo!o_X?`WbPj>x1732b*To?i(0~O>A6*ndI)!GY(TY1WXKY4aERQ*6 zR_=lpqj*lX!_A3yGO$Q)&aPEG2b<+iyin0eFI3P-N2j(=0RqH6Rg#1v%qEv&*&-zA z%0(J+d3+A*VMm-HTVQ3RXn9U@1(>o}r+p?Es?+$<dJxWnGIo%?P-S~aXY%<oD=2;o zg5+ANrV(9mEdW<foF#SS>}IiJ1u#9`%?oMVQy;3PF)F)4r*e+ho&UQ_v7z}rzH}Rv z*9ObmBfL&nz~b)w6e=`95Q9=g5Rz1Nt88m3s?02Pd<=U#ty=>8TXuZ})yQWG(;?J8 zH_+igZv{B1AeIQS;v5t7;Y1j{0%rwPNsKc<k0vKl7EwT+L68O>GH+tbku$MTynMA5 zIpg%$uo6&;MGYxeK|kfy-shdwFlDvND3%J0opKZLfN+45%Vnta%2q<f+5pi<q=a5O zT##IW$`ZT0IGnZ_lZ{oBa^O}jqdN?c@nD)dzNl`XaAZ$mppQ+byMwH#gpw2HRRh%= z)jI%p6ULSnrL0e!{^_evJoVD;8}9zrL2JsebbK{GaodFYsQMfAm+DiK>nBAeI{MDe z5j8RWviceydTtl*d^VA8eP=ix^X)xwcJ(Xj5BZ_rtvY*fuMb0-JjdZ&ypG(ho%m7N 
zJx`1dtaoLZ?Ojn3E(VJ;_@qgqgqocNQ0k!cOEgFG^GXWGI+h$7<TM=E0CQsqnld_6 z=75|k=)p0RKJ^_HiwD;}LTfTs3?3(ps*%D~Qf&@pV1*A7R2w(eeCFF!NPWrUpm_y2 zBvpwp<UiY^veS4NzH1at-`OUKXGZxV(U#im?%H|NYrMe{TGiKK5{ko>p6J1$qu*V6 z+Yj!a-tg!9hxo^Z=jYShi_`1Bvv+`><OjOG{rtX!H4qNe3}%6%1SI~~$8WB#zxgrE zei!sX@WU<TzRoEb@+)0hB%WwyWFdJU`&QQ1<4(hb7u-3El}gX$joQXDN^GzT!yjPa zA_u_;YqCs5fD}JifMldOC6o-Z7?;Z+oJ88DrMW&Zb0e3txJa8A+lmc*(H#D)F<fp3 zQS^PJzh@*=0%{7ZAyud^gl+tb9odNvZ$|9)SMR%JQrY=EUUZjLbz8&-yoy*BjitQm z*;HrB&g+#?_4i~}UblVy?K_fDL(n`F^<=jE?Q<V+{IA1ysl<viE(j58IC=3=H4~4{ z))S`qQ_Ls3)rfdrybEkA*KmrF8hJJ?HnJJXI})Zs%YqRXAPWXS1~uP&qmlE&nWo7= z9978(YE2aw-1z`sU`|upF}$F&>SnC~R=s8<{Y0HdNanOFNnL4U4JwOX3_~LrVk4Jz zVWhOW$tVBZ9~qoThDYQ6h&$-?e?{Vhs_IE3o#Is2yP|ut<ds*a9$a+~9}lE`mXJHu z6%tOWgX))tuGyH1Cn8~AkmqB8t_fkXBT^bcBhC<#_?Xjs@omBM^b0rnn<}k*`<F&> z{ut*Bi?b0EF(xZ;KqzU73A1U4gnWdpC~q#+fqfv?kd#q#%r(K;A@l~dRZ8QZA)J0n zPjx5Em(l<+C#Na8)>CwChHwb3d)bZcG~)zy5!*C#99ND&4PDg%`z5%p47vb;0HIO@ z?B^Wm)g0B;(U(8q2oA_cUf%n$lvjse;X4T&s2+Xn;3DcPMtJompQh=`*1!4?>(yiG z&oPdvzVnZZsIOo<#$|;1NgKDJP=%%02-c}WlE>tW^7XL*hD>r+EvR<FS|EE$F<d@{ zjwG9$Ou;-25X<7&`~op-*V~lJDmEUJh~Y&-CsR*~#3dc@IkVJPfNX2zAt(YY;=pQ6 z7XUu3=~*(YdlpjET)XM3!14U~fpv>nu)eZY2HfYLzA`b(g{9kF;HWwHmq7FU<BkO! 
zSjYYvke}1vnxnyjJtovqakqYgi;y~JfLc+MqFLfJ8?<0nfzLHiUNJl70l(~l6)?5T zra=IQOuzZ7mkIon3p84D)sM=0&wDw;qbN}sssYjAXYgl$usRGg7(#@IiV_UuLGWmC zl5Z-4w6~7z8K9`8s4B#!|9EXA1a%XfJl<tzbPQUfla42DK<3B!05LQG&cja#FJ!9J z<4^oTJ^Jp{vnQuk$2R=U?%xe>+nz~dluZ@Z-C2~1T6XFqzCiNg;jsF<hh9+sKDIVH zJ@^36hYxL0*2E3MbHa%!k>4d~I1O$!Y;*uNs^{)TJVBQlbxZ15gTEfRJQT~R$*nl4 zdBi!($-P7vHGvJ0fo<X75nXQV%9rbK`D5ZGPcuhB94U&Xb@=#2xZq|)eKmUynJoi@ zA^o7~qHus(OVr!I?VoOiO{{QDxa~1Afj}Us+wb3DjiyYS)~N!2(3j3^-+I-*?KyRO z?2(htPQ9z%clJs3L;&MsQ9<2lSam|U7&FAzC|eGz|NO(j>7JWo{L&Zld~sOVkn#Gk zH{xnk7t`Ix)e)bemwc@TuEq2<eNa0zC6qb@zRVnT%p7$BefTir*#z#Hs0(O@1#B_E z(9H<tPQ=ZHz^Ys-$QWU?f-xGEBG<&|2_2!F(>7x+@&?E*0#%IUa=0f}qb&k-8uqE8 z*Kl_=r<i4e%rbB=tem8Q;;3eV5*v7alfu4etT*i$O{*{QOM5cga%WcGl?V=Y<o7)L z?qiDju9HvbHq{oXKR4esdhv|))UIQvZp>s;tKa^?v8S#4uVYqxdY+q7!@9I?FQ$&6 zDQAN{Sn^Fo)<k;HK%E*CL>tf}Kwp>7VahnNux<uJ5a|IXxY6EKSgkEv+=T6e5R>rX z#Bv1iy<AV}C>Zsetbto>mu-LsqFI&)*F%6928q(2qz?mx8`R-NIYdbNdYXcW*}$3w zyAUA&Z~$Zx0tjlhfG*eUd7<gUG~jRSU<uAWN0SrdtE5qD!s}ZyHSN4M7O_Ppx}>l2 z?dqYvMn@zNhzEpnpNIcD!?A;7clCsPmSjR|Hiweo(CGSJOQ7PmiTLP5!pHxpd5w5M zbb>gEY_{Qp3%X(P6)+a%3LNbK0nlBnBVW^mQ`5w18BHAkW7kHiba1>y8X;6E9B(%Z z7n9gN=*%P;1T9INY8>T4S<VmZVN8msUko^E0FxM`yju&w-Ep}G&nICT1>S^q6*vlY zQc(jtT5cpy)PzKA?S(CCF`Oosdtl2ZGQWYmwpOD=k-7k5hc##Gx1fhH66-zyQFQrH zT7;9ytX;78j%V-RfA*#g!*q9j;}4JU8`Iqt;W9C{x>x<v6E_^V=4b5I(_gdVj)CcQ zl4+SQkkI{j`bU3u@A|{jxVp9-Su?T8<T#iY#kHruHHJ&>+P_${XUj<391?t1f3gv_ z1x8cHiTg0wi8>KOr_ijV&q0Ti`qnvJoaVaGIt0ULt~gGrV>MK~1iJ_-x|m8rj7^jx zY_@FI?<MBIEpXuIq+sw^FTJP7L%CRbPY3!egXYJJEd@W$RIwFPkr}2%=N7e0OvPjC zv9^*U959w)43<^-=OZc<QiRP<Zd}t9yI(kBQ#S?HKlc2q53YVV;!6f%gLm!cH}B26 z_&s-S6%i=Da^$I3Cfj|rneiDG`;C9_y70Dc6BppX4C6%HEc@|!{j6xsXyJ`?1X0Vz zhs;}bP9x@8p{c^=T5H%xwrL+Zh<#zBeKGf!M|Vxu+prz1RK%B!RJEo(pnaG$`!Mr@ zw&Yl7Tn#Z`tFUIz>ceXWwh5c@ae@bf2mj{e!)u<7_~JqPaD2@+-n;g=?{}NGiaX6u z?A-VC^=V&iS0KKXeVzzD&m-bV#3E2$3Q-w>k%$2MNxd949cW?GflMb3F$gYTt$alt z9I!@_JE0B^&PFkFzXSszchZIf(-v3SSmn8mt|&YSpiRC)GXRA!VXFBdZfUd0MshGk 
zq!}W|@yTc0{v|EsXISqSg;YY>R1}I~4mQ<?mM<b=TS1hTK;6vYAb}!GxKWp;Ctw}; z8z5?RsUx?V4CCEj`n42)Dl@ic+lo!o(XFFCv)yM+8~W^Pl09Fy42=vd9jVm$O+6-F z_`>#ZM17o!f;aaJR3-lCOM@HJUbkrAgTk&pXLzzd-8K~P*L1IJLvO=_d+JqSLXg;V z|BX|C|3fK^3NocE+efztj2Jl!rRO*f4Axgmy&Wpa$T}3ts2D>Kg>u9*A?-LwfBTaR zI}W}^F<1GhRjzivm#cYnCQw}im@>Xr&kap-N+pYJc*V#5WFd~768ut{YBktou8dBI z1t%o06YY+2?dpOHrcI~^s}!O3%8W^DDtHf0U#1V&LBb;?Z0`8uExgKKTwhi`DL#3P z$zXW;=&5_u;~2er+ay|pW4h?js*4Uj$S>z3_YS5|PWD$DPM^;2eo;O2$csZsNoV$~ zB|m)lgEzU?P*c`%VNT&`uL&p|@;Nb{<IfcuezSc)%t>R+#~rjKUt77-sXsyeBm6vn zn*F>Hv)QcknB$U)0ly?C038B8W8@RsbTt(Zi4L{UM#M(NuhZk#>F^(adUW(z_G`~i z(67edjq2Uw_}va}no~sjyOrSAGG<6DAxl|o&3U6iEXL))CmGiVhs(jq>S`tnR9m%} zLblq`h|#esrjEz1mzoukr9~pu7R=4WqlCW;Ulz3C-&8V%|7`fTELbLtzqRsz<DZpp zy(JtTxOktiZs5>>5We{0z#;nlXYlzCq2}S@s(>3%h#G@R&s3u05fSn^e>_)qF6YKy z#Y}`pBz{`S9?8{|X7bcWwZNm;X^9c7qBjF_c7V+(nw5x*8KWx~CYD;gCW}QB#AZ=2 zS&Iyj>{L(Bl)i6hxMTCtrEC6XwJGAY>^?QnbAD|2{lrujFT;i+&3+ZV1uf=MbPY1d zZJy(lF8m#H;9gt<CrKCwxw;*YGa}}JLhu@fI#_cn?$S~G{gp9%0X0v4h1MN)P?Q^+ z!25~32U>e^=-k{+9*D~7N*%k2>%fQtlAl?w9+!qx!%Qsh-TZ1TwY&qzilcB}qP_xV zH3_*VukL}D=^7r<1XFZ0>)Mmat+C~9%;Z+svqh%90oU-z)au)CQsgo8!*WYtIt-pI zVD|}Mg4L%`#0)K9O2utmbNPJC3u%BpmH>#s+zj8F7mVhg`yXkIxA|YcwJ?eOOa6)A zhN4Bh&0U?|a$m4vA#>1!3khxt^5(NFdK*3y?JC<3=S{mYSD@{OHH0~$J&?Pa?EpCl zT6~2qGORee$VxW_^Vea!wBrGv!$`s9t@jOB@9SCA#byIOy52V+r;|x+G<Q9%F`@-p z@(E0)*-o2WpIdqDsvBsfOWJYK#EH0*fEjWE`Slg9I--4CJ*{c9*ykx>(X(C1Hx1>M zdA{jIq)(S#ZT<6$m}A3HUxAjI2QT$&yfi#M$4kvS77$Zk?mDY4UNOh4$c=vRrqK_- zDFNj^#BJt&$0;|{Ij0q*aB!B?Cw1M;NV_!rJ5w8nls%68N?Pp=crRhha#tmZ)JLvw zU;|BPE-?(r7YWu(6hG8H10|AN4RM~?N#Q%O?00C(zJ-oCK#+MyMYz6;Vu!em)kku7 z+Ve}2satNP=kBoOzJSF~-HgS*g%)3KxfR}^83%mEmh50Zx54(XD`8%}X8p!HX-lj$ zVCPg}=g4*Np!wQtZ~q|uDpXry%a&uKV1ux}1yKPN(!Y<g{=m)7O-yOvG}f&EMG&B9 zAGc*bfug)sc)Y{~+MvFUy;Gx`18$8b2m;MS6SQFJYOcA#>rEyhKl0W3dsd7b4DTc6 zVcI`;&(fM}+ov+@c1J;Lj@|Q9R?4aN@woaI5e==IrxY?QoZDK=J*3x*x+7P%ohXL7 z-{6G52y<6+XSHq@siZ3cIgOA;;7dT!++Y$)@F)%SlY}AYLa26493ovbRK#_)QvN^2 
zqAM`7C>18k(<(3pp_mw@h>5x_U~2>&3lfhx0ExtgBqI2U2$MXFFv&yK`&tSiQ6^bf zK_L}KL3+S8<7m=|6M}Xn)IUfLW2p@a2e}BT8fvkYQWHc*j13Yn7-zA!a}6$FMp<W) z%Gnwur!I)<Hr>jz*-DjAz>hEJi?JW$V-Kz~o454L!JPpdO~wu%{%$Z>?F{@VEqo(1 zl!{INNo-@U$6al8h~6cUfu8<dfSIn0LE>X=2lQqVW^luOcu4fZ-gx@JPJ^Tz11si* zW9#qVk_k_OY9u3BgJLN4G?32=r=h!W^&_J5sJ;6A!1ji@B-E+j|Af_}r6Zty6e~%D z0$^^>^~dw$8i0D7;d;lfLMEO`3&1)S#CMro8p7o(ThMFa5@;y3v01s6dKBEXgIM~k zl&?2)Rrvqb$C<9$!c`^uB`7#)#nd#2190rK*WylyCNNosWlbZ5o~a4;WXUxoXnv`K zXzc)7s7F`r3RZeHwZ~dx{C?Is0FQ+A53^zk5R#%+V*Iq_=3g2Jg*%o;LfyXjgvIpu z=e?%kk?{ke;U>Ge%xW)^1b1Yh{&JT5`B2xga4_3@@%4#DQ_$t9vF@IRjKQ1g|C=4) z;}LJO!C*BwAl<@SFJRq6YFf7w{=XPJVk6v3oU($v|1g%gw^?Z?7f(>i0z-f9|D)|) z;F~<}d-3;ub+s(ZuWpuvEX%Si$wHQ8NtR{V_=*w67-NhvhHwid1jvPCj8e)dr8LV@ zN-1MWdJ>kl8B4RAEPdbCg5;!WM{g&~a<VQb3pxGw^fX=8b?v|F>8IPeHUZ^)f6pu1 zGIvO}=40z-_R;e^zvp-R{=VN1%6|ZM!o99^<8{SO*S;@7vxZ2Km1b7GY{uB2Y_&5d z02vGh09%=mTF*0~$zC=&*gFVwZe^O2tvD<MGo~R9D)Q+u#NnpRPEacwB|ZSoX|o|; zGKP=zwqnajY!Y(^g=*2dh@@j>H=vn{ra~PSuZ9Ss>;g-I++Kl~IX#5~l!&Xz5n1PD zJ2q>v*fDC0!XsRM<5OPBt^q-z{Thgdx-Z$Ugx%z*M&c&8n)1<$|5kRFmrvKuiUeHz z$z<FZbkQ!|sqRv59-EZLCK+tlr0NTs<`Ya@7@IeN3)=;cbn#0@x_5iIo!tLcSqxM{ zLl8(y0nqzmr)1+6#ApEKPwT2T!jYz=p+sSgdhM<uX?Si8u1j9p#YpsPnMYPy7y!S8 zup9v3Q$$|kkh8OmvlCAwIOIs$upifM|2n#;`?q6f*RdOT-DVtwyYOyp3%oU$<$A4W zjNn{qB#;^iX=D?UTDb6O^U(4w^isb~+6ohm0R$-9U3f2LJ!5lwa^0maBW(#IqXso; z>GDOjfDbd+m4I>qgI$$IPE^e#9dQAqvNw7!Yxh(oGrh!IUE^bD{zN>4vBCho3h^*{ zm9bl!)<X}7cM9LI0J&+_T6p2z>s{Dd&(QLiSwEu^X6`?7!Pc8phYyd@TeG78=7A%c z$@n-MX1TD>*5Ys)dxUS@@LldGW(`4h9+q>v`G%{Wh{fuOkag7v<mr%5=0-vstI4W2 zCJJ|}R(<zIIv;m`1e>{PHM8oKC9A&g7Ea4<oQEkiop(>szq)%1&d`)NT{o4t0PUz+ zdS9ZDS0AlO$@gFH{Pf}ctQ=cKx4_hHnESHTi`hj#cZyb8r*qwtQ}l^ttMLi9&4W+i zaLIjOE})Cy*3Z?I?-T>`0+-$?w|uzxcgn9D!?!2BMt(W-4+;2(+56^$oiNR&Q1@Us z*f)6nOG%9x;>E1)9h|csWDfVD9;C!5(5p%_5ge1F5y71@K*wrGJyrsam7IgHAn93y zXGqRK#tJt}A<K@FG?tP!Bno$_hwiQobUg0bN|(}ZQVx;>@D1H5B^jVwCP}0!5XE)j zbO3><7+mzK0faM(i@NxIE&|a4<A4N1UW>nkp259q>7Tq8my)zrEcBYFhXDRy`96Fk 
zki?Zkp&qzarv;$)=sGem=KNmZYDZTx28*@#B6b)U8efffLFb5D?F^^y>7HDR=Qu`i zF90`|uuQYQ0+=XP(4<T1N+cu<x?JO|xuSbQLUP5U#P5IP#l4wqVm(H@q9N#QOvK!g z{K134=s51DXX|o*fw`B~#9FiQOIyVXk=IB|)BvNqmIWA-UjB*^#t)y$5C3)Xd;4=0 z6kQS_zp-;apS>qJyg!mCN_c%{EXLSzfIrT>mm?yE_IU)ujUOMh9574v0BTL7bH1D= ztVz{{aED<e2DPb0batBA*{KXRBZ|ZkLW7LS-i#y&yT-}QgoLUB=mIN0mw#yjnxIBs zrvY0csJr1MX7?e>!wt-PE!CqBM>NEuDM)jbn1A7SovV6XBj_#mzY~mpTdLLrCXsvn zUoe?Q*N6Dm<AeND=3uv<gbc^`LMqN2aR$Vi3SjK_6~{teqpnQo=7KE5FaA<*(rfwR zCWqf?HVK-U71Xy)X@{fl-5rY8XlGSXGy+!Lj4Xn`tr|n-6q^rYHuf~;4`{9OyuPG% zup~{e6~H$!ZLU};G2*NQ+@N?(GNhF@vbGRbGt@@!g@7Gr<3+HfX4peh3W`&b1pyt+ zc2WD;f?p7!qF!rd5E(kg^k|xh8av`$-)f{vwpOY{XQ%2M+J?WA{O6=9aoSu}32F|% zI9=}#psJ0Zb5(5xZ-RV-o31pgoqJwUAF%Sa1ttg746*mh4zvfJ=S({A>W5g{Ad8NI zk;3(m6(;Ab@lnViYcOC24);s8+-}gjAQBJoh?A(0)mxWp-y&<LmGH*YWUB`y80#sh zGtPWa&mWm9%Itpf+x+s+-|f>z2Pc>8J@(=kB=b{)+qL>+bclcRAfM4kLY_uJ?~F~x z4`}ss+404<Vo51`+d4<!uKG-1*PiXCA~CH4EG;o#rP<?(4^12|K2`i)S5KziZPLeZ zOA0E_7CZmVxIGmoMcwMquF`-Fq^=arcyTD1Vs^sIFolYj^<oHz4W4Q?E>VdZ7~*4{ zoqD>cO55iQf%GNIaOhWHJbHKT%xA97WK@2d-@9{O^m$qCN#_-Fir~Kybs?PB7$IcJ zzZ+p(%Ls2+YS65HKV0H-bI4_1m33xkzbExwF@6lv0|-(ei6`=Gq<axu0+)REa?@8~ zsdUj&N89<<+@7~5w?eTfO=E&vQ*<+aW4%`pG`yCZ+usZC9>~qcW^>mbf_9xfm&v17 zRN^mS`{@S8W2~gz2nt9gOxsCx?6jAFbOg{fRjb?>;HyDQ?P1Rts)ML<LE%tGxD^nq zQ$N`Q!nE=WuKF5t{Pmo0YHm+QuRgpFO2?0u@9V?Yo7*p*#CN0_{Q@wtCU7)_NdeJC zrxx6nvv&?t<ihwggKt!xXEVLX92#+@*wGgw&8b>yi-J|G(K@Z+s*UW6E+G2WO5|>` z1^}_k4V*8rZxeL>lA(M4th$$<r=nOE6Zj`z9lK=WUaOtk%8OsZ^!<rf_FQq+p78O# zY!8=mCpjgBZ%1eqLdf^fr5Z@fUHGzfO-rb3McX;VhH1^neau-3I?R;SW{I7(kBv5? 
z0q({Qmv|eXD~vbd5q!o?6b^lEB)IZ2EQ*<?YWqM@1T2p+kYZxgadn=c3s@C7&Z^jD zUJmID(>hH7j-e*1ZNL_)S&@J^LxrK1qp?}ORDL{)cV7m7uQ$G8G!CchuY{E^KQ6^n z*OvxA@z!(eW82?2DU9szp4B+kcG3O7F!Agh4Uz&CF=P=HR3A|BT>;{`=uo@Y)O?yf zTC5q{oF*VEE=C?DCz?vA!>v)$;iXZ8I+68W`%`iTZ-g?m$RRLi$$2bt9l@bBg3E(5 z2l0!(3eI4v)OiEVi|K_7pwca^R6YhMrVHBv8ru%5lIy2rLc%PwB|AWmoL1ROQ~R1L zoKp>|K4pczs{r6jFH>wXhj)Kt?n*j%HRyZVDH>-h!tO`P&-Hu%d>!n&w(61Y%c1UC z=x)%Ne+~G7KCn?v5SKr?scc$^&OMQ@lpvpqBZ6a}A~u_)Y2vma)!D@<43rm;4mJ@c zWKBZSpQjEusx}Pb*n@{XDl=#v5A&i7*hHaStuB=4Z<5%aNo<cBReBPt>a<Q<?aLYe zUk`(A<p=NvUGvuI$ih|Qv{~&cX4^!QMS1{CY4!jJ0b2V~c_3Az)@DmqLL-4F(NX=U zWF;6Wf!<`PXWCx+Jp)7~w5jk&_s1UCHqf;dY5x~C4sCa6EXh`VD!u)42fjE|6U(|P zYQ0mgk&nZ1WZo&(<wiR@M}NtF?k#4#e65v#&Byy4HThj%`{<kz^y-Wgn=bx#!|f;1 z+xc~0eq;y#+pVJ}v5m<p6UAQ~bh&?}dw7&YaMl0#x{#wK=xBv+)s}4M1=Q=D!0fE! zikN7k{X$WW>}RaK9`>_BwV!daAA*!gs**Ki*{I*`OOrUQ6T9b5Njw`zWV^SXsp|kI zr34u=DhTFou!JzsgQQ8z1VjW-3b+qxG9?e=bDhIvV4Zy+`0@^uup8quPRL$-vLYql zhFwjYVRRNpGaKo(TBob+^{upnHKaa9d+qV{Q(_L15osbP4O^APOU7YmAc7skwy(@d zE3C>WBs^ep5kUJ!*3u(Pg0`Xp2){^8S%9bD*Y&ztHR3W|;#;;z;U~p*uTR7#))#fy zi0pWlU(dET_t*^@sBS#uh&MP|UqD_NjdXUZg2bJj@9n=OV`ck4-(JVLuidP@^3v*C zwbF@sR*DKQc7mBGs5a!4V|WJswx(tH9keQuh0AHDF11v7Q6Fq}1<v+|X0u?gMpTe( z%ig-h_6*j&)9iBdNFzSDxjxX?xwD4%+U;}pT4cOdL2e^oSDMD`RPA-56XM*3RjR#C zlD*oA!KXG&6Xax~DBl|Kib_VaBu(<TsLW3r-F9Pa+G%uKaP4kNND-A)CBmYR2-QX+ zq9jssM*>k`Siv}~v_2(|!fJa)$!dELs!aPwdtkNb$w&(lvL6xJW^8emf>4{gHAQCW zM4Qajs3N(5b*sWCF*e<Lh7){&C14F?GqYM9v(HhhvUC~RtT5ir2^jC{oV3cSjIZB7 ze-pDQk+Iw85eEO;a5iKkg7|49e17^@z@bn@_KX{v<et@W|Llis(w{O_r6anp`)yjI zL2EDm<`&KRUF|~WD?w;pYHbY<eCW3QYRMN*QEYT`zL=NR-KuS`zoKm`+{g;rvH`2I zY$e;{JP3@|&>q9a<p_!|SF!MA`1FTm;qd8%$bi&2AQ`uA<i`RllAg^&wm?IDgUvlU z_957Mp=CVQnCP{5+zz|3aggznlzg;4!Rj@2(C7P)lCoPywfU79;vpGPZGIKme3X22 zAFF+#!U@t5x&VMWr3=7?SQonrskantw}FX~5cQkXpbD1ABCIzh$~hP~s^!b@q1-Yu z@Eo`~rxz^C!N8eF97H)_;Iz*vq+n+>LX|znbV@|)w8iFULfK9*P?>ZYGRU0ujKS!x z?W5ABw164&GKk&|uEO2}ez6kvJ(`n7tjdZt%q53yO2cw?FrrpDp_bL>iV+{KA@kqw 
zxij$lxt=}s_jVRvzeOvK{N5+rw=Y?`+5MUA$3840o{6VtY<M+crf$}TCvMe-*UYnF z#cHQO3LaYK$S1=`kF3P+d?}c|hV>%s`mk&l<jrh$0E(x#Zn?#O{D#|Na>xF)^+T{( z{>Y<td!^|7z09nC)_Z_|S^)0BP25ElMP2ApZFNmo2(~(|+Ui=eRgx}~@@XO`r0mB7 z(@Yax4%w;?wz`=Z_Xs?#G<A>}HeuEp%oCE8oH5169juco-IgLIc%^qGeW({)+tc+c zdts{-J9`r}ZZB`aes-Yzd8W#2#}FVHVvV)n;M^f*p`hif&=aNsZPn?zhCIRFq<WjQ z4#eqJWoVepl9RjEz)}%ek6V@1w{4_PH{%Ll#e6dGRhbd0z7k_v1-V~Wou;&1R4mhc zs1Y@PdXGjc@YSZOhOcQ-xyr|H&2aV3e>RvaEA)n{&?}CGeDUyyWW3nd_kV@%!4cfs z0YVYiV7m#=rs|D@+zYat1<qSTw03Mb2lD{#^8g~r8tfS;M@S3|q#0D{qUit)6)7ue zDqF6up{?)&sZB&^eydVt$J_`lxDKPBNJ)DH6n7B&khO3S`D;t-&@_rP=!M<IWBxmq zcrznC7ymtIeCUl=cW)_Pc%=ALL<kJ5e_-++KCbue-C7;Lmw!X>D@S&p`+o7aP!!ib zd-q8`MfF%+sn_xvu<gXIItFEjN7R{isgnhiN+8=${T32YQ4EyvThaxMdZbOM!V0ko z6E&1nQU_){QCM3__ct=!_Z(j-O%qh9#mr6k{yHADX=i;y;{+Yd3h=<&6G?hlWjkZ- z>0LIm9#1$qiEvZ4p$pKOoJq$UOX)a&OTC=(oRB}dXvLyMEBI$}xC_UN_kYmVj^IIV z&$2?l?jZe>rT;E+AEeEreY}+6>q@<z7fbnll)H@|<dg(XqO?9;qU%7PR0)$j5EovJ zQsOckTqdsANp7F(L|lKZEPq2n%Ac3}Sua)U7o{NGODT#W5;8G^=I1B;{ftRMu7{qE zJc2(O5hd@;O;$6T=>xnwzfDw+Hc`Vlx!w9ZsqGv>m1zWBXcM)G(@59RCc@9MHWA)r zZKC=>rkw_Zctw4{+C)KA5EP4J5cVQl`;jtK+e0i^se#^6Cp;Sg_8=_3kmL%<*gW7I zL<R!D`1Hb9O63YKKNDAqMLtLi3AFgLl>)>3J8j}7I!d|;wOa7vCHL9IT{qiZnzQ*m zvo=4)J#bZ<Cks!`S$GBF4;Fp`(S`dA1bNdA%LLFFQ>=~8!qM^bg-LZXfdb0OajZ3r z+cw{z<%%ho_9_I07>Sa{*RTt(G`Soo4JzMSyVh2^AT$p?>+{*3DZcoe$7g*u`a!Mx zyS<ZreUtj$I!BY=7QLYRKsIfoe+z>Z+LBGHHM^zUYTjSM(dU`c40Ui9rDo<~5wt7I zaRpONfS@=TvTo{$hpCv&g=yBatY(K;5ppwe7$%y9aZ#!!+axCBx`bSfUy!Io3%4+; zqG?7|5pXk*85E(&2pH#^EzRv&&dCih+xl~~@l_Tx%(fo(d-=?gf;XAjnyaAY$jnxK zz|6Kfg}v8+PD37|QFbdvHY5}XY+=l_qpSehEf;=wSP^${au`jHX|yYeO=f;k596n| z-rGfGaAtN;I5}&iD{jIli{H;TMivpyZ>E@c;pk18qxS6*E&{%o;U4tg({I52E^<3T zgQ9GuyS+XwPhpQ6D8Q`Sye$o27CQH`_s)`6mV5AS0>^@JVh`a=6wLO9Ap{}-;}!<i zu!jTq%$flTC)cc;e@X!Avw|b73fn%3L(!j7CYh+nBuIo5>m>b&6{tOFQY3_>l!wn} zC3<Ov%HY(YK11Y+l5Sxmd&Y2mlZZ}%*aO2HeuF=)G?`(C-A}I=QwK5zC;(lalY546 zk9Xumy2q2vc&$a<YBu~z^;;g2rV+a}a3$nMI|E(2qJVIKHX4fUOw1A37|aJ^h@ELU 
zvQelQI%6mY+le3%BReJ3wr;#slK&q*|Am+N!OF_w$(QwduR9TR&0sr5HpKZ~2RgPH zjlaO;^Uh4dDEa~ok;%K>3Wi@inA4h4j1IO^D~?aRxQuswds!slNuS<Y9DeE!(Ej%6 zeuMiv=tx{RAVg!@3rRjwJnZl}Ek;a_m6&Ap9TS?<iGJ1&GwSp2i24KD(vjl!<ZaNU zX%8JYnX;{+qHT&dn;cITv?aJNtNUV3LzmDV#kpAm){+g}n`(cOMB8F4GL=?>J`e|k zx|Ae0vp%I*7#i0CE`h>eKIOyU8$@RY@!rI|x<A6W1?Oce?ht26>ZJ>W96cxp&@__< zQ*slY;V;0RM{w*$+#3|#jnFWfJ16IH1PU6#*SV71FGec-?bzX*wAMBa4rf+FY=mMs z;yX^x$E>!q@kDzz*US1lgD4@Py#~T{*$)OoX8Ci+0l*zhW0dkos`7(0yoB3O^_o?6 zdV+aU9}QEZ_)xYz6t>f7oJy~X=NL0p2dgzt%^iZ`ow4-;8}#~EX#C~Fua52b+NrNh zy!*2!?>oBFWs5(vb@H>xpvB@m`0$~DZOb}7&I2ud=i|w6cyc27=^E1)e({&v^4kt+ zL+XYX|0N!e`AufGH@abJ|MBAw4Nh&#X^#E)fZ(6pk&8#ndgmd&G51H^djms`zb*dU z*xFb-5X`UM(VhIIFuW<fK|+OqbR_p+HZV&S{Ul``#4E})Ye5uIaMmP=P9&O^iMlS9 zDtPPYg<1Wkj)c?(v1b@JTF!Rtn=2*9@vI39N;J#e7e^OVv9qqTohb*|eMmWVag>pX z`W7>Yl0H!)Q=nRsY{%U`%~jN~h#dp&^`!XD*d?4~^pM}GqxmZseT5aIf{^W`{Yyc2 zi<DL|xUaG#WitW%k9u!&4QikdXj*bkR8pfo#_Qsj$Vh@_Ay_UYxo1zAn51rS<9z+d z7sjIV1tg<m_moLUk`28NDjls!qriqnxb@uad;pO(=6vPNSm~u2+^M39hZdKTs}%`L zI6zk|QXv4|3vHRiQY?AX3IG_}4BSBg7#H4#xzRO#dY#Guz*B=348+*<0#VAd`KPT3 zsh3G6PIu~%;THg|u7_vuWZ>#f0#}1ap5~&QWP&1r+X`XzGvO7qcf__Re*_<>#Wh># z=w7#;72b7Lc>!{KNaUP#`x_DiYc`Vwf^$%`Q?M+p06Q28?o!yQm}4~N#K~(wIYT&X zKS)98Pu({BoA?dFZU`i45X3pVN$?W-Ez%(EC(YqQj2LuRJe$haGcr(MuGyRvG4PUo z8nB&=2CJLLFl;-tc3_C50rwt^a8n5gn&A<Mx}dTjfqCiz>frma?ORTpLu1{~@*J3( zJhnFT)T_tuUltn8bdPs@e|)`u+3ihV{r>Mg^wQwU#*b}EMt8k_?9gQBKs=gBOvJl- z|C{N~P(0M=5rq90k6ORf|A`+9U57sN|K%p868yItp|7_7!0~5d>A^0r+b<s8u)x!| zY<KqF1Ny$ZmL_&*_cm<*<U`NplG#|m-*O=l$pnJDzB0DFrEkUBDc}^yCU-DANGJDY zPH7{ngaV?<aZidQ4WTxS6O$?c*d!Ss1k_|;S9ihkT%ufuEo%ifq?Jihw2}qZvEhfh zM%bR%T8F~`HUY9dYl5g0h#6=R{;1A{qK~oCA*fh5=tV>6SR|fF$OP143!$(RvJ!Ek z1F4Y}ffVi4Y_+xe3T9XaOJGQla!ps^rQuK4AK0|_&bsY)idV42aw{yi!7`ZR%1g}g zT5_(;pA8f};Ihx4)=~rBpxb#&loNY_<VusEbO1{NsP77jhoG6MF8d6yaRZqu*tlq7 ztc9V3TS!f3Etx8c=47hv&d>%}pgO6-OR&HRMyf6?hkGLiYa)j8h*Bp#U8xcvtV|R- z)knZ(fqGXa#t`(oF?wAXWN$DV^ibs+_$5$QYF4lUtB6^X4j54-K}t-z+V<|gAyzd{ 
zgCWh>l3Ygy1;xT<u;zpC36(dX8nKJhp87$`PUUuLiFQ(L)@cPrKHTASGCj_!CM{N! z^JL8ieBg4}W0`TPV~fnwsa6)gmc8m#S*0u5DSv!!%J9({>ntAbeRg<D;oywP3G1(D zt<M$TJND2^Uzu;GTJg|l7wj4x+p*xW`-`!Xbrv5P*}LJc)UJFa6NJ(6n(XD)y6Y>e z?mckwj!O*{$4A9a>MYDQH*$BX7(;11O%O`if~i6VqLez)jNK!e>74;=p;fA4DT)4d zKAe(Tp9C|M?FdCU+3mzWPqAWMd>z$I%Pq`6@&wlgS3=Q{?;$bZxp@^tDF&z?kfx!= z&zJCtxK>G@Qh2Nc+^gHZz4-9e(U_YT^W%Oo7-$5_`<r3mn<mr8+G7igZw&EFqB7y~ zxK)WtqsG`@{6^RAa@^MS;G0Zeaz`NM(9a^as{Ni=@zr7Pj9z64y==~{A;HSoz0ivK z5!mut?f|C{>;!SCyas#Gu^fBhB!UNBTIG2P;<AjRNa|#RBtS63UQuOW^MODN4#DU= zpdz7>q`t}tI6jbZfPb^>7l7lvAc)XyLW3zm>qldY)q05~*%f6Aw~ZeCrod~1c_ zIcQdqo0AE?ahb`;Vr-UVN<fHZeL@!@G~&v!o!-apd+&|;!(A8KZ$fsmOgcp0$!ubz zxLr4a7_tG12%k_HehDw2SSfL&br%vekIT4?!Rn70O7NIaT(89y2hveQB}D3s)T-c6 z%ecgFti>=EXho^AW{u&FYOyH>H<Yxn7Zo&0$!Qx1Q6~ox+$&}`5wT$_Kx|AwfIb4* zaFN74=r_(k#pNWZlx?4fW7Q5!ezvfE!-|RGuZzb%J9+UDLZ}`ZzU%XMZ1*MoK5y~9 zGB9iL7Z31X==(&!jc4*c)#pC?>Eg-atI*@x+5ges&u?)p@bS9$E`_wJXLX{uT~k1f zKLCykoK++Fbm&B|8b5R7SR?p&#(|CF*a%PFs4fs_Z)kxqbWCbBROAPztGs|{SsRGa z_>8gIrLR{7Aqcm|$ws>5Ic;D8Ie@5@h>sgooa@X25lcNtRw<4m?wd>2M~$F~L{np# zJX-j#Hu1F|mk_Q!V;e_kWoo^}pUfyrv1Qr)NoYttHv@9*@~zz7&DKNPQ6?*eIs`i_ zdM{$(7I2XqE1|TB1Cr@OCU9E7gpHLcuQqj4!=uoqo|`sN7I9KQCPd2uGNDd9t5?qr z&?d47tbPgJFGVn9kj2^QgchK^u>$MC$xd)&O`F9QQqq(Y0Kh8l#(MSel;|L-C}6+I zgH)d6edZ?4l&+wb;D6U~t7i-NXgXUgM~{Qo76@X-FO|WXdJ&t@R1~dg)BH6>#OOLt zVWQNkE~py%OJOiDVHF9KLm^=+LPA+1;JOQ=UxL~RN@r&r-k`0S^#UPWUty#i3^E$9 zSFCHCaUW)g5f~;byFg)&Q#Z4+z&ENZd!r{%_z@+~tn<P9eK&9#XR_mWU&eyo*r_DH z&bTC*UvJ~~0UJzLR~#z_CN=_^2WZTT3m|u8;dX`w!%Eb?w0;CMCyl6fTCpTn%$LHc zYlLl93NF8u?9ibSUUPCBpz~>mwUs5fl~x4T$fOWi&+{X?8zqet(sRK3;;X`q5lEH= z(7SG!OH$y*i+)Xr%~y1Ak0EZR0GU|u2qH~fJBpL(P&lDL22$#Rzdd9_mV${AMnc+j zUJ_9pF#fDit>zx65vQ?U4Jhougym|w?`=eM2e7oAHC57>%`xaqmF5t`YBhkpy%xhc zIBe~$Sj47=7zquLcnR8_Rp6{bk#F8IYH;vKu_axT1^Znu56d?2Go@Mh38axMA7q6Z z1HWYs)BkOLs`%EvOvWdqjdA`<-hNk&H(vakQ{Vg5i~F7n1O{uw@$A&8?fY$a&YRfy zfiU__!*h3gdA&wsHjNayXHLCv;enh#mXD21|Cdt_K<7<(|A}H&h=cPWh&%8usHftr 
zNwKtIz%o)oR<8~s)-zd&dK$U`7oLtm0hoJ$PLmlO{2;;oxgg-@%psVASP?^GlmM^b zs6k^(u7qLA5d$7IqPK%V6-tGvzkaphzOqS{pUzF|t(f>+XgSS$TI@!Z51@K3G7%oS z-XOW=qjPaM`9clNAep@20B!~Ldo%ZUDi5%<Qk}sYA<xLILM57%#U{{lWF%N&U<;WH zhvKH?>StJD(!v=eORe36&!biibOSGKNl0tY%Sj}_Ym*estwpy|+K`fx46IaGs=AS- zDz#~&O7(O*4q$RER;r)Up`2A7Bz!_6h&!h>Hb)Ix)3y;DP#x7Zh~=RgR8MA;%~9W& z$SuYDvbzUwZ^Sa|Zi=$$rih^`-kBk5`uXv4vZkl`?_5gO)YHYtn*QUFiKBZ9rzW~T z{Vz{Edv%T=Mo9c`!ZT+t;UX?>ySR5A4Y59(Ox*kWBL{!-%15_di~LF5=XZ??=5)hc zJGX{-M+YcEcctZplr0S-Lhq!_qkUclvNO6xh!hdGF~Fsh326n>*;H0AeYh3Nu#4j< za`;3B*sbnjw~FkC8Whzqs7@SNfnCf(d5>6pIx#vS2fOh~ErjA|Y%2ugSV+MBTDmvR z+MT|JV1lMEaBs{q9bZDagg~i`bm?Y@y!1v!m$Q%2<y=m;WGiD^dhh5BZPYA7GFTga zSVoV`pW>I#a3f!g+~_807!Q*JJRD(EfVY#5x0~#bGzU8Ifa;BXfcI~(xKT4&r^1mo z;SFUKQFP31SwCG}T$((lymA%GumIa&RmeLNQa7_kvOkiXTDq0mK}fgI^`WVZ(%$W$ zy@Ts4boP&~+=dO2Cav<SP1v=dtx&sQwOV^D?cfH2xS(=~_HHdojq*CgeA6B4x3a(U zTc3CO><z(WH=DfIQX8-m2SQ$fE<QR!KqRKDMUQI(Ty&EIcU!MbIJChe|r9_-?~ z|1_TgyX*iI@3?ttrW-m92X2vwiRE@`E%G>5q3Xb<>2!E;I!t7_7&9yc8q+3tc?3yl z4=~}^_VemIhN?`Oz&)IxhEQLEs%1U`ko5V$)G}S2@ZnIY<wr?Ol)?h01nUFm+8M!Q zH#9Qz8LGl49#`7{)y9w%mH(3zUC|_m-KJEDKsnKV*0^+N$6)tdm$w71Rfp8<o{AK| zpY!?sCvUt8sXrd=-{;}$T#m<Ytp>Rj+RPp(nr-7NO9+Vt>B3^Q2HBTV#>mQ+Q;0=( z;%2HnCe_+wax*!8A|g=M!|rGrn`)0achhFn9&;tTyMuW_JM3(fij64#Q|+-<l)CBZ zxSBlyRMo9M0-cHl7TAu@x2tx!l)WKu$E)2)DCIHWD2Ul^0ib^<6+SOi`g47YmlK#u zYPS}&x?~$`4`3=5I-N_=d)rKrS`+ALr)z^tN6AO#7ULi4M)eXChf-Hg!dq?aYgo}T zX_Yq6CL$&yAU-29$|7Q!?E`#xrAA@UEcT}K|9`|~vb+EB!>BdtHfa+zZ%261<Op4S z{llp>@?7x!U+CX}|6IfUUrt#~*YO~`j+y>niO(EzJ&K<qqD;ErRqy9;lCc<!(%As9 zN-fStOjtdNvw@E77MhoXZv`_RIJ-%?RZ~M2sWHL5f?KnOMO!u1c;-tHxDX8@)GP_O zrSvoA%zkKeC>-P2;nMgRkXn#fDvU7~tSlX)!?_gHO3WK`kYZ3O8wh(LFU9pMcdVvE zF$mm+tV79060Gu4)YoM1D9(bXDbb_4Vru}L6{uCts)5eP$0ztOld~XAlxil>gzkM9 zuSPN$M4fw;yfo8dVo^Ru?l7IN^<LbT>?@ZCO)r|$2kmYXPNljtkeG+<k$_5fdh6)@ ze|~T>9(nbxqfZv_|LjOnNQAUn!J^8{TR@qBmb)(AsGIj%BvxF>%a^};;l53qijV)s zWnAo!7xxw~6o36FyJKy|q}GV~StV|R3mnrlM#q)Jet`T`8Zm}bfCLSjS;tI1R7Js7 
zrD)C2?@Q6ZBjArTBS-Z!hN}7Kw<0QMf*YtB&pEjc?7y|=UW`8cy>;sATD-15r<Glf z-&rK*IZ)nC>$G~{ZCoP2Fiv~yMcaMfI*ZR%;=Y*n)z$$xho5D@IefhA8UAMO8P{jh zy4;>xpHA<rA#{%VfBf0`FEOl+n}lI8PG4a~9hJCkiB_11-orYo+-Zmf-df@l#x4hO zxg(jw9IJ40@Z|6BJNeAS+TzOH)9d&=KfeF3pF321$D*}F$9;i*v+f%o@#?id+Pe3` zp>3Zz_suP@oIAAZ<Hc>8UOxSu&uuM^kAPj+?AfYIZW@68w57>8jHuIrzS{}?L#QVY zbAO`FH!95GwZplgZr?|CW>3rAh;qaQSbRN3v6m;L+zhp~l4GGNO+ZkbIt;U@Q_Z4| zO)S=tAomCbI-q|eA2w`#zQSg2(6l6oXiBO>u^FW(q_=6<M;)Z=q;6{g5(|UG8;SKP zff6F2_76ZEBuxF*Z#VmM#qab9`-oo(RpOgvozg$_tjhs(_~Gu0U;QBLQtbT4p}p9p zBY!wTN*<&tBze>FBKSRb0}eUlh$+6r^e?1h!QVsp_l06%Ny#Qwus(MJx7qTP(#ztD zUNG`Y2{x2X-EGAgqC3vH05@02Ne@H7h{&VZi3Heu<t(wfg&;B}`YYO4UCT+TsK^u+ z;a0LZMoM5#J2bgbIZdF7oK9(|dT{HkJ3{QZnYWw8*w3}S`RXTcj#Wpu377)6u{!#8 zP2%E#Tc*xYabe)Yc?$8}IJc21(-i+pi_$E}v*s4l9fJ-f?jm>&jBt`-1P};p_^$zA zOtqAQP@WK4t!f%Z9j)6Y89XPHW|f;(YhfYZ*G~iQoV*%v^s|D$VPR{UP2i%V9H+y5 z4bV05Ok=shYKeI)D9oUfe>FxO(!vBO3stWL@3n!MueC3|XQ?N-u)+Kb3IKghJ{RX7 zy(YyDpG>4P>)YFf!E0kQ`)nJs`5l%8jWwsE10OXTT30*Yd*teTJKCw9M7*~*o?Jv2 zTE@Ghe;W6Y1B~5c+$MgpI=zV+2|0N}Zh}_I+eog~3ua1{{1~-!CDB8#C2~0akdQ_m zi@id{lKRt37Z_6_UFTVphkb1qB{zaWIf5c^Bti4@)8_Ur#N;KF69j?;C2+=(u)+UN zi~WQgm{F6PMGjzGWImeH3IF0cC&#S6lcMKlor)9{rx!+J@Fn;;=P#huj*cg|dYm;2 zW8^D%MSY-p2x3G)c%W%V9I>F0O^+DS@2;}gy@{Sxt67h`&Nl6@Uo*)LR|44qo<$hT zCWp~}rP7S5Af2zdgP||YmjoL=2>zNb0Ci>*kdrL`9G#3z=Erw`{_b7FL%+TWu;-u6 zjE7cO>^etxBHq^*Pj)Gv-LZWrzvJRNH;Mby|2H-!T3umln8kgbfPGugul`p~*+TXW zX7x#&mU=u^YBz2r+$Z{UvgZ!mwG8e_cQ2E`L#al|QzU7!*5{GunUM1Jt!5AV@YZT# zN9|j^f;gUbB&6!|Qe{e>z<KITDb*A7UiCJd#R-uvTY&ls%hd~GIqNc%7#S$zK(}>< zhI=Q@+ywEl7`EVA9)$z}wIA9$a1ww&S!YF&4RYtWU#6FjuD=btF86_4Pg<6f7v-_v z?O6*&n+d3TGv@WVwD3O>=&AknObS`d{^!6v^(SWX%?qFZ(1D(ek6V{uK6okejZNIE zD(*<?P8Z@T&J!g%Ke^#xS{4xxZGczy(`6&Vs}}M@ZGK?UAkvn{dz0!sTJ3q6@=tJn z_R}?k8E|6lBx<j`sysbOew;Ph!)OBF9+5|pX`-+r4}pS6@h_sGPmnH52P!+du+h>w zkWPVNl8uFjG47e9V2N<vXA{YeTvtCOnNr>=N2&e?d`tk+Ik5bap{hA0>AJ&3ltL-B z34jj(D+aGZX#-pgk$K%XzH@-3lfQ`6Af5c95039FPbP~`e_)KK7P@gi(M{;7h7$tY 
zZkBrVjP=fxmea6dro8KB)ps7sE0TD=5L2tGF_Ai2g*vslO6bfOnt+PQQbqgHZhc;G zx`H7?3u|T7p^;h(v^{ntja~$5HK>$7ZxL(jLasRD-at_0rR|ul(?A+Lk`1eU`jiWL zJ1#>Fb?X5Wim*04AS+=Pk3F$j+cd#h6MMnCP`68_cKy$vxx*LLZa;fyviR>``9Dwd zo2gwtB|Q9xe|7QhqpuWxGFPQ}S0D9!>UrMFQ``RF!I0+@FBbnEt@{HH{(<%p>e>JO zpK4nE>_0BN{uRJy%s;33iP8682JaNbrD5o?c5-`5%v01^_0iSY2zx+-<!3BD<t4OK z>(3)?mca!MZ^!zm)UT&LY7`9_#j^lH60C!&YYbP$S#1jiEQ<(Ya4cH{D9JO{x_VD@ z3BIffsLeaQx)j#P!1$;l2DW_gJ@|x6*FY?f(8FZw#loL4v$Z$L${aQq|9H}}29?0Q zFMMhD4o|)I?(OL4pV(gf`8zfnf5a9`cKluOV=Eq5f7D%m{Qf$z^HcwmIIGruqRw^X z5B>uE{k%~8Lh|ku|E4?SF+W>u{N$fJH+c+UJl0F)Ry>7izdCS{Z6>XeEP0z{X*MU; znjfQBp0cRI27k5>1&Ix%g2a}Dj1CMZ_aPya1JL`#DWQN56#&v-M8q8iw+egibVTUs zt<CT7$6`^x_;Dj|Gnx#4Q_WYU7euVPbqmbag5}V)gT?F9FX{(<t?nOLD$Lb}XDy*X z-1B2om4~nXbMpo3wS&d$b1}}$@?@~Ke&7~w<HhSt=MjP%nkZMA(&NE)dhJ=fONQ^{ z|I9JFJk9HvaiI{g0|x-H97UZ7GAg1^7h0I;WSY+-R+#`k^QY@DUJjcwx-oW#^I}iA z?1cAahb7Y19>Q}M(-_2Q!7cwmB0VW&{*vG@81+9xjg;>Unm?)0R-26a*Uhn)ZX(Xp zMP0>z6!d1R{>P%r7dO`ye^8}0nf1RlJH0Izp1#RAk8F<3^&o%SjXTN(9`zAU36sSE zfJopXKLa4Lh_0~Q5>CM4maB;zN3(W-h7rab88wd_k4A7fIRUaVS&TCaL=6U4GLvW4 z8|>Fe&r(<AH|1Y5p5C?Rlrd-i#KhPUlk<bxnC`(HpFDV1Zg6qug|B}=qnhD2jdEsS zfrt16w6JLHP-cM4ALUp;0^12J&)LB=rBu+ph2;X$UTL4X_#^G~egs@%ytFJKrO(UI zH73oWmJ$sPa5RutBC?EWT}vA2)1^ZX_$0A0%2gp`Svh1`=3EUvrs_{o>Sk7xY7~>^ zXIWLqvZ~L?L1bBha}cYXwp4>z1Kh;Z4S^uCEc`6XvhXI$vZ^c%K{d-#AFwP-hdc@8 zD7mHuY^Z{!!se=t5br5m65_CFm)lP`3zmdUb2_~lPjErEP`NIm9+D3-E+?E4|KQIK zR%dNbZn@{A^T16Ql;*<t_N|ldLyw&J^u<?h)_gRa8Fi<q{xtJfo4J4ClpgFqwWF0j zItYN0F2>_g?7Ms$9;~1T!+1b$zpy#k*NA+GW;j?wrY{McJp&(L%$eqzH>H$3V@=5q zAOiwZ3gCNEL#hx}_tTC2R+^bgNHZzlAzncHy0AvgkJgBC7SD3(X9=B<LT)2-aE*Zc zO=~I*b_#W&7!t|JYXDr3R^_BD^;9s}P5`(lkm303W`gj`w>9E#i9JUi9mYpTZos{! 
zZe+Y|hI9~UrAx-+_;@TDu#_AvWJt|z3!;|$n4_&WVW@3}jQ+Brwjp(>t#6MJ^ml@Z zrs3)W!N}yU{}l}DM|?>SIarJS5(oSAwI|u;_+rba-&o8t0&dX()ZVLs%xvZsa}TLg zP=$WA_&(!CJILUHper<VCp)n17`@=7K~h$hyRp{|DG5jf$nVLWc$P@XOYpE8;hfBo zXA40o%Hh`~J6M&sQza4XhMQ9o8C<o1XIeq~D04`0r;QFqOm&Gvg99UnR^y;7g}FGq zq6wuU)F8762&v)tXb_D{*IOOI5Yij;KfV#F`l8^_8x23REF`yeSkPK3ncI4n-{a@E z<-KhqJwheu9sd5+d$;h~hxzs4i$CgL|A&+J=(07%cbKpG2l%SyDqryjS`bZFs@^Jb zp{TFjyVVlEx7aMyG4==ESUd&xhkH&Yv93bNw#YAyQq>_&mW2|7N-2Oe5sDB@$%7fx zBC@O;ULjeJ!f1WMjwK$&^Lj=ZR*wu5H{oR-i&j5?^<DuL0Y=WA!97gm>_mtG)jhXr zOK6c`qL5}OXyRp1l*4$|q*@kyo*ckFP4iV8BP9ab*rdUZxNVYwrII>5o((dVQ~-Y# z!oKQ}-6-w|G?1w}0byTx7`LQ6NDzZ=D(rWWea&~CAK09Xr|;VHINaz#xY5d2Zq<V3 z+~>s&w`ekmIaHg8!Ed^NV0%*an<T|T?W%G&$yk>OBv92<8^o3ZBghbJ_2=a#<V89R z1_K5@=Mo7M4s5R+!lo`rfjeA;CoK8DR=h3(B6tS=sW#Y5{wRb|L5zZ;$BeOFWkJ20 zcBG0hK2R3Mo?h8i@z=8&;aTy(>4TOI+n>Q7If~xvnTn=JUdVIk;S)!Sv1^(I){X;5 z^?+XCtXw@efQeeBUBDo^jVYyr>;Ty%s%KIydP|E{{Q)2-%ESlSQ}Q64T3}_)RGFN% zXeZVJq5xDMBPu(qnkdylY5UU5iCrAm=jUcQl13106(&91-Dz(`EZyA)u0A26Ur)EQ z&S21(tJVA2{SPLyM?7tQSH2#5x90t$jJx{)?%{f_pMyp&X*$fTOO9^eY>wfIKxW1| zp#wA83`&5Z`~qRbXmiUXFD|W-A_DtJsY3?A4JkFkOZEWbXM^HaYf^KCtbtX6X6{&8 zt<uCynGV-1qj%)$nZa{cVffbVcCnjPEM)<>S@n9YsmkylNCOh@9l9RpH}uZ1R(btB zxLXt4WC^haOb)I*B}6A^b}sOe8?%*3n}cp&sk2aNwvO)AU`kG6sVtzhoi=G|NPmL( z+~-|i*PlA_P?-OCMJKzTO#HQI-5*dR_`%$Jd#`sa;76+1xrpLiw1Ux0SwN=(vm_)N zY{qvbTr36TO>{!SP3(j;mQF}AAyu58ZC9{?I$1<BP^_mfFOsfXy+{x1Yq1U(z5~=@ z282WdC`M>F$qSHOVu(&A5pGlzr_DNd6GdK}>I)h%i9(=wxP~~U(5d1q>8#V7v7#EE zq99p=YU6B_eC@-8WT~-jAT)emRZ9HhwYPuT@!6{!A<c!RC9PgWzMpvXkB?mVr|X>< zs?FfeMh#Gnb(>L{S*gzPv+5N?VKSID1}!q@3&6zEfDYzQ7Gg!>YW0ssD;6S-Ji{5? 
zVoJSp{1L2B9N><*a?RzJ4lCE#5u`XpjtY4q$FmaO+<*?hmu{T>M!T{6zVVvHGnP=< z_znL9R^=<7)VXRzSp4>-V>lO2vtnP_<PMAfRBm}S{%eKO`yHIpLT9W4(Z@{i7pB9N zB;N%|cMu|`;hr#{*+nY?HwJwwIo*JjV6k2owC;6|Ftt4qisMq28^Bh{bYw4y?Ml65 zDo$J;+g*ct{+A*17X{v_H|l>57S@~^)|re3R3nN9uE<1xy<l$3%-qf%4Dkp{$|rWH z#-N@ZisigeR@lfX0a~FNtWcD$8)AL9EINa@LgV>@T|G&t`g?Gapec-V#dH-slvi^4 z0F5nivITHIxmIOjn0N8Gu6b;f@}S7{;<b;;T<G)A4IK@9?-|_zGro5dRn0y0y(x+< z4|1$@I@stIL>8(*4Uf~IOMw!;EUp1RJY6noC__{U;Y!Ai$;1~px_T?DwW`EKEg~1W z%84VR0pG#ZVwkegtxJRngjZ&=>Dv*#T%5S#fhbR$KYxwQ*TUEQLaXSXV>xmJ>;>N` z-5V68#8S!xR-`b-AcqWzFZHs~R}c#o);CNHBIa9<4=z9~b6Y~{U|}Co3zsMg3BuUX zL|1SJ(Gh@x6agTjb9oueY6E^Z!G07?Yd-~cXeLw|MR&{K2*V35BMn|o9;5PMJdO7= zR=F8^d(wI)&Wns{1PE`N($&qVx(1oo2cKvqE-U+NE&0k3QUl|6$%y3lE_eQnPy)qh zyocKkKm<NrL(IVu4frQsyXsXo@yynzlwuwMl<~E$fxn&;;f#I5wg2h<dz8(*CNbG- zG5c~)-1h^|mVNq3mhKwog;OXe7ylA*=7S3FzhJzI1GR*@w?B4F@fXKO$GkO`v;4Mb z2<I6%8{vn#JoeI$$n%d?JifdH8w4dEy6o7btTdJ`q!|oVNr$mFT)G)x-H|XAuSQF* z74%O-jBFBmFiHoTneFJtW)3mx$stg|5yz9<F$55ZQRnMZwX)Mou$%Qkz?x8qpkaze z^<KPk4YzbHme+7Kjv5;2EDi8LJ1q=dM3C`ZahoVcbJd)<Bn3w0wv+-}^rq<k{;s~R zki}<lX)Qj_l4SONh_VcILzHD(8x6of!^&vY<t|Hw&9jN{#Yas>OMY8#Fw$3T@oFvG zhI@<e9LUb9Rp~p+)vDlfun($D(S2IOEu;AtcAqwqp`?_>WKq4UMWwMPQj5x~1MZb& z*2QU9)G~bVvio+Jad8s?gqTuX42I@HSfc>r3iFBpZ%tGYAcug8yxalvT8`f?`0oNT zU%%R(vz|4WG`@6Ko&uZ>o1)VrrG<_42IzVwG#VInyg~ei@ht&*mtlA<Hf3QmRhy)A z3CMHF6@Kf+#3#yc_{#6ztOd>8`48Wc$sqnzZH8*ZecXRndoEPFAcC86C&2Vb(JRO7 zRsi8>J#pV~^?>_L=j^1609sd;we%CmZi()O=vOk>g%B#nRS4Zx)er*7XBb?KRWm~6 zo<km>kE2|nXyV<p;!q+LD(pTsEN8Y!9&A=UE<`}$tI5c^s2X8~Ct=?9iiKNTaW`vH z!hXt+sa{JSrubJkYDZ;Gr})avSq;@{NiXC$^Fz1M-IXUhfz&q&^~yNe2{hbR!ZoKB ztwWcQ^|+RwCpufDemR*y{S+2<ISc)luYxtyvx01)pVfhs{$~12KVXAOGlMoa!y!@M zORiszH3W}sGpvJii<vy(j7Tsvi^X0UhGq1FRDLc1zPw+_QpsBs&fyQw)KN>hcv&5_ z_{L2M>FBe*r2B1)>Q2IeW;eAbMYt2=t&8fAPx*y5%Ic@23IgVM!W3Oy6=Wu)p05tV zOIOqYL5z*T(91<XRDlnw!g0+lWCoo;gc%*a`c_@LD(W$8yBRqT%-`O)3%X@tkDp%E z6S;0w*W>b<?3^Qd_vu>{_n=xW*%tYuVVcs*&<;9P%jze~YGkvON+32wXwb`C4-n9u 
z=ZW=EsqQ2zs)p#f(pgOpNumX<DJsa8z@bX3m)>ZlYtG9^iM;Sl3=~~WWPXc@@S&jR z8Z#uXEVUAC6essMVP~yijJmY2s<|~iD!x$eSWKV0DMOm`D#fR7-E^35A=@F}lBWre zDq8oTI&jw=tb*5oSO>g@(u+6dc^el}4vGuF1;vK0mJ(=3uAi%I)WGrR>Zy8qS-TPU z9!=qKK=zW-qB2jqW2V>&jF!xrf*vY<0JQ@RoaZ$aRlZ=Xh3$*kN?a&eo6?%<g#~hK z$WmIw=!v=(Q`kThQDTk}g1NxLK#@6kzOGV|y=9{ke#C0ErQTuj&u`F@%3RTV|Ld0R zCR<Euj^Xa>Mm6Pc)tisC-B~>a1es_M74imFOIe;M^cXk~b=t=mRaZ4Sy^-oHb||Fw zB|-_Jw4e?J>}473rR15Gkvrm+0bxm#HbemeruJ!_w;6w`;&kOzcrhD-@dTkW4B+Hh zbA2?`IY?og+)P6^E)2-Y8VoZtdXVMlBx3g%B#BrW%mBXYyrD3Wpn@p_pMjM@ze;at z(J`o8omQo1rz9sF1YfwzhH3gtTZ=Z*0u~XZLUp`4Rc{Dqhzp7_B5In%YJGKEr8^j3 zG}gNA!RAG`4{WVkv~hLwj;Xs(-2P^#-&7y$<0A(i2*y`u@35`=*xH|bdFqbw;@i3W z{w?Crs@}<*&OCn4vPwSO+1nDfe{OB-np}rBtXt5v=J)frE7`?~!)L9P#UCC%(6KR> zaYu}iHJ@Mq#CIKa?>@F=Z1PTv7!Z5>3B6??lx97tt;MA9n&t><@5{NLamp~6)gXSz z9>fhHO5>^b+eX0`*GBaDU?w!G`&Y#v@NEA^s0LKX8o507%0|O<PIce1jJ~V1DM}>` zQgM|uNQEh?T5M4-w!FEw7W<Y^{~!VVdiF?ZW>X8z5WXT7KDmL$qwK>l84z<hn4Q(< zhsdZZt!MQ{k3ZBlNSmdhBDUO|z<xHt2CIyCpvlQ?pw$QIc|$KggK>10E*Zk;)X(e; zh)a>YvH>X)E-P&z>QPSOGNp<oxc4Api)JI{rdXC0<`1%M;fKbtFTXT3n}YH9wt>4R z4j$P4?B2xIdn{&Ci^-JeOVni7)p+}wTYojU>(1o1Ned{P_=!vRgWqj3o@h=EFBya^ z^S*nwJYaDN-iM<mWA}K_ymV2ZEfBQXZ#y$|SJqvl)!#*VlfdaQC-owD1FYzr$ZFS# zh9f~As-)njsFn$-F{M?F2B{%sA+E$OIIS!0B-$&e=Gl-lX+Yd%Bfr#8hyAjNg-UgY zDica2y{_(~jn({Sn)1|ysYKb2;Hpe4(S`mKr~{--*;Yi*nUX0vn%jpt3%OD!CPtu& z2244%rE3D|dbO<%2oAmO6gJG6#uG~p)7;L=Nk}Ii&d1uf?fRqIm^W&w_#SVOc=2m` z!R>1C9Niz!L@!_c(Mt*kj%{~t-Di^a>qSxj;3?CdEjxB%z0<Il9ona{-hJG&oYF!5 zDP`b7a7iha=%*Us#ckw_B5Yg-1Y1{LX{0>}5$qoL9GcTXt1H1mg#<7YN+&z~ozM`b zksUDLU2Z}w@ktO1rQYakP{Amm<^mpk?$j@$N>E$-m6uwRZi7x=TaS?K0jH!>4<^tQ zs0brL)D9Tp7gt-x8f@t1+xSg$i~RjfyddzRJzz427wARfZ^Y!AzR0Fwzi$8IPwY2c zwy>IS_ujV9ZPppgdXrY?Fx$reaAj50%U^tV+qzC6@ZjGFJI&rG$MM`_#kVv^;5(X$ zAKgo*7kyI6oG4V4=65mar#Uajuv$<|cF{Fcqn=5Ca%tiZocYsr!3x|aqB`1(32F?R zVaS&W^lDsK9?;~*P+C=Z!q*YG<{5+D?hU9auXX@}i2FLgZV3)97`$03XKjRy&#=aY zJw=r+s&i?r_(8KZP6r?FPMkQt>r-z&W!mx0eWMeHEJ@M!=J=+s9nSNkKaid<72irk 
z2A|DDzP0<=$GWBUyEg9MrsegX0|&Q$rmm~_<)i<Qj^Ax*(+0?{c`jd!2`_6N0>&1s zGI#}%Azj9-ag5q4P-|EemOiJ5v}k4_YD8I6$u3ixY{t<sVp%{FJYCmlRF@@=*VSdg zBStn;qb>_^Lom#tXgfnI5?zQz;AAr<lVl_6(9}gX(Qx_P0@>9Cg4+u<7?=&C_=L_* zc6Nb?Ny{Rr48T~hv#y1wfA4hhT`_h4@g0L(Hdtb!_{{jEv^9DE;YHgf6BFawSTPn1 zb-&O$F#Qw0qRsZi==kL39ok>H_iq{7U+X=!YxK}qLi8<ZvqbbkFvijy2<{Z!bKogj zig~T4I0i#Tmrt&T<I+^s5WaRHU4Pwa*4asR4&tRk#^zgyRYdEQhPZ}MjfEB;Xk-9R zmfBKEkTLZJ{V+hBH5BX7gO|`|<`B|ztV71>@ZJCr|7-%77-~^6lu4J*PUUKEGdo?+ zo7^5qnGRF64>^@=L}-X!A{b6CsiM+w@AYU5X{>i1d9QPhXw;znb?+XN31w7VetY^Z zF1~tKTKGCaiC$Zz7F;iWd6sc>`@8e+K>pF<&iSusG6%ZmfJ%_ae1RXD{|dit!=2I0 z(0D95x)*y>3p`nt@e!`%rd6ysZ3J|Q6_IU(5-l+jrf|MVi3`w6ITB@EOU%{HFcKm{ z#E>sIesG^Eex~Z{M=6Q=2AH&FQG1rCg8Q<IaD6V#(!-^T5j#*IXeJ1?6FqN?;P+MN zOoGk6ur<YiM+nyqv?n;}#G<VvT5($fMuCeHaUg=eSteJ5TCOOgBb<@@Ezt*QYDyfu zSE#+JRaqf1dN_JwX!LmGp{2!h&gNVNZ+f<_<1Z>I((!ObiOz7Rmp1}u@W{)1Gugy? zqfv^6o;4aLV(!q;(F4J9%EF7mqO^8djjkeDq1Dn4Q)F;pK8s;7T~qT^e&}zC-`k(7 zFeXjVvrn3go%{LhJ;~wyVX_~1uwp>_2JYQFR{--8c{B{g<8Glswi#CsaWT_VoeET! 
za!e5sEoLd5Qrc(~JF6yTi}0A~+9x14jjciny_Pw$TI91zf{`pING}aK)Ff@sBiG^` zv{k$jA0V0b={`plc8!v)T8iPYXAW>CH}<hDieS`2=55#@Hy92X^bcEv^`04vH$*aR zGh^{|6<zHuYQVCoX@j#7p}JO^izZT{_NlS&3zlTEY4jt*(veUSubdit2d|{VG(XL2 zX@DB-z`q#W-W9QEHTS=_JQmgK=Vssa`bb=R`)^*l<Kq|pERKQrxp%bMNL+JjX0m$x z0rPJoV`B%$u>Kk@|30Vv75HW>zz(h8p>{;<#za|7H<LFdnN%$asU}^BGP(=uG7S^i z4ExbdjsYS_QgfnUZ;W>1Rs&cU8zwIuW@ZG21m-D5S(c^4#-Pd?=VR0MxY-h;_LK*s zMm7W);JYKvpp|sjEs(+sc)2;=@(TFbnp<8Wq-{szkMlGyZy|jP|2du1EJ`)!Ud+Dv z_Va8sJ}yqz#{Kr#w2%F)qo4Gl=~#0OM(E2wm!S6){G0>i691476W=+mkf`<5`Pq~_ zf3_wz_pGIbfpg8s+X^Nw)lDo@lE*3ylPol&^x|-X!-<=%VHy~bIke{$T@?rqX>Odb zF2JoVCc~ySDr3{mTEl#*y^W4b4QeN@9s|=wL6A>0X&89|@di@59t|c+W+v`Inu&;s z|LO3qjHh$M`sj}F0saqqk1b1Y%moid1g{vZjfA4k!(V?;^zBct_vG@`J04VC&ra>R zxBbAX+Q5KW+xyk9SFh>n-~IaoyZ?M`(*_~*u?xT4_Splw_1eQ1MqlKGR9$B%QS01& z_=NU#+mjKoEq4#kJEZNW?_A=`@)>rPw83iUIQhUW@BdV5);)+b)Xg1L5%01Vm#iJf zq(qCMWZ(s<49N}y+V%_26Qr5~g%r)A0cfq{7&<qXfsJyaTn#HEUWIBVmR@bgtJR{U zrw<2lVyMYqjdFUSQd<`$86XVlYmsVKR=a}>Q#720wj#9A)Ah)Ug$N*-KxSqzv5c5N zV<@ScQL2RyRL!`Ylx2bE;8%x^<oI0fz^2LJEn9`9f4Xzuvy-7kD>i2=`7MVQ|Lbq} zy!6Lo`?iISKl#HiO&#KM2RHEPt%pLt9oR9Ij3h^n3a{?@VtVn;t%<??qrG0OKeOe? z*vYp$zIbrtp{aJ$$bY&2=-1j$Og{9-lOJR8!^HbP)d1WHu8A)0K2E74>%(w2mPzHE zNTzvd5d{Z@3L{7HQwxQjZR*lhh_V@$1Mcv8yd%&%pl_v+vm47}2JX18LeE*MYwE&D zI+2`QjfiSm?+sGOjz|CkV>n|hisaj%t;qtWnZ=<}LbgRuKS@^u_Z-y8e#mRrZ@*{! z=m~ykEbvVC)PQYZ#g4YOzP|UV<^Hqx7)N})-jL<qq1e=s{ys53;p4RfpPC#Wedxf! 
zFKg@fj%EkX=07=>(8m|;*f&^sHNSgHeD6<^`W|;rtT!LJ=TrXl(2njew6vIdkE{M+ zy!a*U58)pgxzi=}Owj8P<F^q9ig9s}^a(G8E@|XYg%foEtc&7h3<F+b7(g$Zu9CKm zKWL+?L}~<PKX^_SEx*%A)^i$876HcRr$uY9kuCt(J1e`%Mn6m#b<j$)+f#?7gKiBY zVQ&N^K-TN<CJJztl4X<S;^3)_kpVSWkI(>EkVLApJpW*JG~;vh1kC>Ntj3p&U-+5u z<OBN`4@PtM?Y~fr+B9w2w*sU6gO#Th!DI6dy!Yh6Ws^Izw&CpFsesL+J<4MACp1S9 zqt|nvQ&of*M_2`2#}s>3u*OOS!bDyQQ07t))NDl%WfzvlrjiLYsOypW$p0>|^&q(} z|2etJi}kS3dQ{;e<e~oKub){DK?BbS);$-h!j%rEmmRBm3)X>HN$GBzC3qxqNA~^c z754QZ&dl@a1IZGN<Aq?b80AlFe`H@Q&ObI@+>+--{vD0~c}CgjGG{L44vjsy!7-e< zXRFKR5q?N#VrIsMG_9=Ast5k19#$iz*vz!zN#rQ%CL~N@iLpKkj|5^0@@57$T>~*C zHX6dC?WZbNVUmMs)J>ZzBP#wSio%U-iqO}Yk)pe``8Ht>Q<8f01CQDKkcmiLc<Mu% zv5DipadJ!1YF50oc2`rNBC7pD#Z?ukc;qh9^>n8QXrotkr}UTTPNA$ie}?8m)}4}C z;e?59&Wbr$Nq?#8)Ma{8R63>H8f8^bUfsIh)WnQf6tBIB=^5>-C4s1mxtpDx;W=}0 zJ9-yMZXIvn6fNy3M8b(Rz(9`|r->+lO%@XxRdHZnnXpVVE)3!mfno{Eb`r!;1!jsu zIA>@Ewt*DKa)dhR%Yj>p^#uW`qM?dVO6nBlDQxbD>V8H<xefcdFjYvZR}J_DL03Uc zXH-Q;*bSdus-m-wVJ8Y6if$DMsB+R2YTnxX(q$wbB-a9fg<1*ZOc5=C>3t=hVw{G9 z(!vORTF!w6MoIRSU`V7v(?E5-T3|zErb=Un<?3(|WrHQ*r>jrKpz3;c;iwpIm^~5^ z;EYgQ$g~OOOvF6?)-?uWNG9rP6E&yOp4#5k2R99kIY*lko!>Q^zgyCGy7E-aRmK7y z(&uvnlfi()$HyW|>0T6w%UgpS49MIr%BE-}niOei3P6X)N+o#*P<u;Z6uM_O91scG z@TBM?4HE{dM8IK$P6l>4nIa%A_EoOJGf!$dSk<N$%qikP!91us5{fqxNRU7_RjG0K z>gvNy)V`3bP>ulKIVU%xGmoM;+r%IMm^oQ=U=xil4{bll<_Id!whUE<OU7o106?fF z)&SKwXLq7(bl^?w$?(%3nKv|Z_oAS&Y17A-G_w`oJhaj=8clQ-e=;{ffYn#3_ChOl zZd#^WsLKquYWB`sUyReK<26`kT?v1w>Ztm-IHKOKsuTPqJE6qLw!p}6xq>RK7ivCD z<be<ZhtC(B41qb#JHuGil7K3de<*lShy16j>b!WXLd9)Xpd6`UHYxmd@;LK?0m}yk zAPtShRM^gl?qH0Czr@pT0_uc#m@C1+f!M~4&(|>qh8f+&8G%*ZHF2E*R%$~)!)6#> z?i%gt8r7IOK^q8{eQlIyF;)C{rP*WQAJm)lng#Q;QuEixeG13jRgM;8Wu?x^>TS9b z22|Bh^kdqki+h+;nsK^ikb<cM0N~xZmhdSKH|AYHM?7s#0t!S_Z6zvV8iJYjHjz>~ z2{fy+rfMh+wgBqnSTW=wTZktQqN9xoD%W6gL8&0AQLJzRtT=K#29Xd3QLNX_f<XP1 zghsD{Cl<a(+Ok$28g)h2oQ`M?4~xPT;iEH}gfF1X6peItMoPagCa-}dojJ$%Adad= zt+y9l)BnRMOK2@Sv6gJSlAu+W=zva(7OHZ)8PH#<5TypnJ5U?|&<bM!tf4xDhMan2 
z(S$>Ld$y^{M9F3vbU5j_Hwt~~k!D>}02GNk96A9;oL$+DF6tR~BGtKM1>Gjy1Uum9 z#P{LEBi!WV6q*onA9``Jdj<X$w4{(B^Mw$*<z(<U;Or;xafmKBX;acEz=2|Xpd>n~ zqq|=v7eVV3Sve71w!2d3wV~8@Et}8=4tn`o`+<wQ4?za_Apf+l!B%S*S9Df5tTh&w z!INmNvzY>xrft>Uy6T{&Ccf|5hN92Z*%m@<w6}GMM%?*Py-BQc>K!%i<|d=#hrH2g zuGctgetUh3q1gfzCrvgI<vO4f2oAS)+EpS5Y9Swh(LQL47aCaaNUBavn;YnsLFFcl zJ*=Cj3Ye!{UG9QCbct+;hh}L>UbG}x#Y&)EAoo$~G$h*~Ra*=8hL#)=K1tXSlJ(5> zVIvlTf?xIm`34C&xxtLz)=TWfz_tU?SBr*fmL%qfJD(b#*RqyxlC_BZYd-(BV3RXY z6Me|+_1l8R-quJ&jEIT7Hh)NrYTJsn*R-9NV!eXpRlQiX0M)XLUVO!9H~EF|Mawlz z2)T%g#U~qbg3Sff3KUPX>pw_-0HdDec}JKgeoA+^R3*whj=!3PP+825XEmxZ5F~_j zv!{)sSH+<l=`h!TuVgMKxh5K4m|&@g{o2P)_ZCYm9=m~qHG{v8;an}?BHWhJq!Y!H z<#_^e0}@*uFEu43^Lg2a^WsxclXGbRjpmX-LlnRDs7WZImte@yMyJ6IGj3Q|Yk4?+ zb$~s8I3J31ku+=ZjVtc3_7yd*9=NEz)?UuP4_(;J4E4X*+#99T1pXxlKX)I3?jDLo z(~b%kM3G@UEVXP;NG<0JKK1s)_qJkF;pTn#EbcoZ*bb{(jCpd|ig6{?OFOO8?)n&G zpb4s89nw?qI&v6@I@v;_Pa=j8i162{uFl+D2C2+0zNRFFe9c3lIrubPTTc}~^;Hi< zJf5Arku#!?`E3!OEjn?-M`iX7dZct#(%dJ@PD=~b1>rCd2q_vW;g>SZxs%g&{gXpT zT_0`{c_GAz%FmGOfYX5L$E}?vdzmr{f09mCe>Bun{N@!;*6xy!=xaXSuiY2wUK|Pb z4P2bO+9~7ztt9V`SY1GG>I(paRbn1{E2q@ZtpV5zE@Pz$2aO8jQ2tdCi}Hl=D^q%2 znUKKa1%*$=%z7GTE4ml8tSMCsg;!z?k<G-~<$yJXfII>IVSYFQ)cz&?KXmqZLhqMJ zbGqzxd=^uO#JZ)v437x^s6pjgGddffJeNO=Zr<4BD<;$7yyz|dYcSW=YBC;}%%3)y z)_B9ewAiZtLDahzHkwSQYZ3<lkr-YX{dZB+>Yh|pQ{S7K-{s@A*^UJNovxfsV`@du z$ds(mEgUFr^aw8gS$~teQkcoC^y%SZ-q+R?q}<0-3~5E}QMR7FoMNHY5po|7ju;_{ zAW*>)-{u~MnVh$tv}DjzL^>j8S6mvq@l*_;NG??aTmr%cOqI-ADFMi=8d@!s<?X6G zF@2JWD;fliUL(#5P8UBpzBigpM@&IWK9KZ#+W)yOlt>tj;}c_#8jX!nhsI>q2xs+G zX0!fP*P<Si=?lTwc9SVH<vH@x*sPGw$&;~ww_D$}Z7P32@Gj|aqpwzpg-RJURn<YS z&!8<%R$KV0D*nx6NI&zQb~IW1<;B0<j<rg>9}$Y^`IrzN{2EnN>gLc7h@C)~h#*#^ z_zsa1mE~CSBnrL~3SwbGvYnr{+ng3mJfjRx#T^x!Bga@Vhc#}jHfr41tZL&X`*&|Y zr#5bEq7*s@MPaZVqsB@M$2JlFSO`s<Ap6-!KNY-E)|dezt}&<J<HVP8MlaffjUm>Z zY2oKRW6hZWB1FkgX{KyJ3<%*BD&<wu_?8F56(9#d)tfjP&~HI9pQTz4S}oadWTv|U zaT@Bw0F_cBd^Zt{jE@G>1BY`j-O(MpJ-qb<&-y1P?@VN(i?$}Z`rhzI#y{iYclGT! 
zWhvOC{9~HeCbBrR`quZJD|8NTm>5-a^Vs_Xx(MpE1KbMkbxyIu-U-u-I=@trQo86V z%9MjRt_+IZm?mnQUgYJ=GhKug08h|NAI)=K)}xOWN-25|Rc7>PS}GnrJ@k;&52|c5 zmq5ff;IctCv!LpBQ)^Au<3^Ahur$&DmI#SBi9+?D!zm3}r!!sMDkUr{qV(nD<x3EP zbPOzJ^C-}e!ep(+c9zrm{NV_rXIMyrz~rj2q}1^2;sN6^goD%z#p8CWyWkzvf8nAG zX*P*MlB3knDkbx5^<8Rdg*Ha`S$6d31CgXp)E*vtQ!7T25r-!hGo9=UiGIC)!(&s2 zH)R6R!NsB8MSr!G?aFcfpMsg*!-<C2xG-`5)+dU_L`+mU!NpiY;6@XsoF^!r-8J&` z<NsX9<$BjOtxWBBth!)(ZusGcX!io)4eCB6-X9SDP5TUNqmw(qF}mr3R^1o|tn#Oc zH35rXFvZi5Q=(QW+Zd3|Rh~!wBS*0AbtxrMNgG0gK1wAU|6$NWuuvPYA+2h!lZfad zz)elN7qnAHMlxW2#2Zf&<uNB~(>Sn@0X10_A=--Q2$7$9Y^j%^Q_)df=-Y&Qct|N3 z%0W~vEO_|}xA4Ni=r~)XU~zcADYQw;8=S&E^!o`zl^9KTE$XJ_Y~B+2NY^?(J$g)C zmTn;mgw$U(8;mBt-BE&16?<Pw9!qwuYE{?JbC}kSCmkjJ1kQL6lLt&a>Bc0K8X5NG z5(y;=5J^G_tRH3SNi-^W9nGX`3ME}%9ZBtp|5R6CJ#jB^J?A)O5qaly2lSrJtd+oC zAe{$l(}n5!WIvXWqlu?cllhrKj|(n#lghy`YlWlG#!d=q=FovQ;FuUxk%wAzEMx$X z3GD;0F`*M$akrv6j;0PyhdH>2fxN4&au~ziBwa=mmcU1${RH=pUJP`(svWV`PJ)=q zO9Z^Ulp)3aWSn7^%Bub58JH<pYT(Rqj)=ou?I=SYlN}ijS0dgFW^gyo^qz$m29D44 zo9|!tSoh{*r#`zy?ARWAY_6Yt&!b1*DE{i12d0EXb!h9c*^aaD?fv{rr+Fec`hTB) z>^r7-kXPoK$PW~^?)%x9Uljj!xwf}z=eF5KGxDq9Vn8S1OjTk=eku2TPHDhiNS#cX z173<fmGxL~D0xA=k}a_$0(C`MNTLlyx?qQA�|BK&3aVf8@N}hxP6gWk0;uk`#e^ zWK9x%BQtTR_L<Rco<aH`r~=Zb5Le6CfB^#bQ-Q4kSf6c?pORY_-K25S_LVCTlq!xk z3`byx8c0aiLYxJ9t6aYbZiriqF15BIlFIh6xjp~`nrV~Zk*o+Ds85I)u*)$e00I!W z+Xf&2nYOP~4yDZ^C@O<ZT9LkY9S#idIC=0;@#~R9#HKwq_6EB`M<3jAdPDrDFMMf> z$Kmttc;@&wix-}}XJzr^{v%(zZ^xoZO=!bN(LD8|C!YS^dnaO-!bf8K{y~27!G}j) z<IPj)Sa#3p-#q>6;_u>`hHssI<E6NeJ)!bV`gIS$Zom?Oy;=$0TE$P6<H}$gSlc1R zlrR>O#ZZ)6SZ4M&fu@#EQ`>=hbQV@WBH$D!FC#GT%ADMU1ubyeaFj@9uu_J!CKA-> z>oi?FDd@6THev{7;b#~nNS04!y?nZ2<~{ZGRF6I!(V+69H)!x*|B;t>uC01qBj_!5 zOIK_0Un*)mHjB}?zV+melMeqEOs3KGAs!@?cfY%{_$}UfRTlDnk&xGTzlPVC{yU#2 zUi>BMoLYXexKYq*1m2W13jEtYCg794ZAv>FE&e_KiK6$y`&XwPFTMqjf?EFje(gJ0 z%P^|>A5%443xYZW<e}vwl33F1Vj`~A^D<<82yz*KbMdRrg(x!;)SqU+$^gJ;LPe#; z>A~zgxf+aO>wFO5!XaIV`{Jz5P-(H&65R+u?y?&VGek;sAThNo1Tah;w1<U^$lQu6 
zn>HYaaOnX~k5usF;YrVxpZ?f$$KSr&FTPQH_3s`pY=7hK`@jE<cgH@NTe@>ycVo~L z@l|>n;tMW*Ldb3Xqv=2C?0f0p&VBykFK#dXPyQpDc=2I=_~qxE){i|>qj_t)!xZf8 z$+<4<(|o<{?9#tI{2!`+_J9L+z2-@<gT=TBPO;;B(WOj14q^jfl*kNfDc}O&0ZXE6 ziC7_kIg%G5FH&lp>bH*ISe?~aMCXFq2DTtpDL}`I2wl+wgjwO+Q|kfzUbED$V9an; zAXmaMaym5Lq-FVnzGe2}e_i?Dv7G_a&=>Fg%e(e~&GiqK@xqgzUsLSmH|jmzt=o@i zo=j|rBx0@h;_aV#Q0v*Z_Ri1NxgNd$sqgWo_KR=xPuIu%-+F2zeCSV+7n-0kljZ&t z*jut5Xm}|C9?(FgjxIXdd73n;OBJe3g6scL_V$5omgl|jb3c0YVOf^-Ez7bj%d#xX ziY!aAtSGYl6~}QLV~lHzanfMi5FmVnG^APPrddv!Wt65_ni5jV7|mG9XvXx&Dr1x~ zD5I3}^7dt%hH)5UJseipx~=Ux#%S{7{I2_v96KN9-RU2WEGa3^bzk@O`MZ7>?oS)v zzDGfOGwu&4gAIJQBOaTE``Zm<(%P{rvuJ3imj>WP@l%3fb8^P#1hgilY7LPJw^Ph4 zcwn>=m@g*yZHnBCARDQdnuP1=Oazu1Yvo$%W$;jIfQqd_!j@7bh(as=N|41)Z>3TD zH&C;Cp$r=RAwZ*jY>X-s$}6eZi4zY5gH4XW_cE2n7<(I)Q6{VedtMNHj>!3+#xqCu z>nz2`$~&la7w(syL0-KY+;RJbqLCy>0UJ#eP+lEN6<bJvlxz1vSm2DaoGmz#1g<S) zS0HKuLPC+DA{b(j6HtRI7>L^7JJ@U$jS=e#WFyCcw-sa$`k7Bz9Sx1iOco!gK-*=I zO_paY$RMYcGX=?Bv5%$nxMxmcjYBG)wzoF7Hmef`udNs>PztxRzqn+!z?;kWG#<Ep z|FtcA#GvUO<%4H+CX-2zNrOu;wXdB2q4)v&#?o*BV<P(>XC)2I7#v%rv=)%WGK?BP z@8*?9_-xK*+CjI8q&Rg$DpgF<D?PN32CTVFZ|0-b%|zL=Azq%VF6K5s&xG6nwTwT_ z8LTl?Me{iV6^_+0sRBl+jqtwtBws?hjsuq#>tjY#qJ65f^~Forwv6s^v7T=%tH<(9 zc?%B0f|8-kHbFBr)`AE^Kae(0-3VibV7cF@udz2rlU=zDoPOMABYvrRB8Ee|0Q(Dy znT_N4s;u9H)4Y+${uV)jL+K2{%m2$!9BjY<)Q1-ah}{5)7X(e=4x~Q_Q_1nn|5qn- z8ob6@?AFOgye5;|;`KPpE&0KNBa+i#sj^s{CXI$q?lA9vc~FO0TH@8)hWrtijd?8o zKz-<pDdh~A>#B?xqF^+}!_YIH){JQmqSguSGY1_i1yi+nQHaJlw@NTdXBiCBS)#aE z&5>w?)T9Dt6|xu3vm?klMC*XiG!hsOxLJ(hbS6>+c)GZ&F}1YOkv9D4Das+)5#$k4 zqpeX<v2`K7d<~=Nl68I3?C`|!K%hAk9(;T<msmBCTWi(LdMuvh3l~L`P85B4TFI!_ z{?74(>Gb-Y@yO)yyBOsu*ei5SGY?GJYT?963&C5rv`x>;pV=FtXU@h8ZD(o!a?wxJ zD1skDHU+;*jJFbzzZ{Ma<3TL4NW%FtU4I+oWW>}5w;@G?7ez~IA{J6S0jKUQYTqy* zNU>GJv6ANmMuKpR&?pM^RESEt3-DpQrWpr~Q?xAy)^9&C<Z-6%*l={jXUV05YnbMb zUwp4F9Se814Aif*3VYf2hjz@TBV*(7UT>>ys3ww2M@~HY;_=k<#?IiX*6XH{kXZ&Y zvp`Q0!tY}nGps<VRz(oNu!c!(U{H%5N^}Jekanzq(U6w2G#g2)<bm&%1%NpPKGa3i 
z3q>0rU9h2cL1<5sRXb#r!k=D(%~sb2=#C1QyhD4H+Ib``aC)@^Xr|}p7L%3We=JMO z?SON^eS^OB!qBR7GtJl^P>~h1j9%+ezhwWC+YUC<XS2EQ8I9X}9zA;dXP;(bA{<Kp z&qEKNy>2MHy5na%Gm^pOF`%sGb{M7AcP|=_Opo2Y)!@-#&Mq{$@6f)fgJE}T(jjOr zl;E@H;IobJ+8c-!ZHYewAapZ%8(2)>)k~3%fEn(3Eyq%myI!3vu|axIW)#YC(h#C6 z%_3kC29b`0vyf<i7-Ly#Edk>pg{(y;@-J|UELz;XXjV%|Wh8JKLw+0aL6%s+unaI; zkU*nucQjD)D99byvo5<qSSzzZys}dlTW>8`AVFp&m@MRHW*9Xh#!{W$sgBRkglRdh z&Z|A`9Jw%F6t-h)15a3@tnBRB;gc-*=<PEbYiDAGAba|r$)jfycfaw~ZI1=ife+;) z`2CN$-#`Aun-9JG@)71Fzb_qCZu+Mq2W<XblJe5T*CJA+bpOcyU2FNwfYh@vf$pey ze>ng+7dUVq`|5!c&nxE;0;q8zpP9!Q3}8Rc<7&oU8dc+EIsw{}Q4I}sHnpRkUed81 z5_)<NDLt0*pzJ1l;>1#MUA!=G7CZdpAt-KLvN*)aCx)=pxi~OHM=?MO3(iv1jKK~= zW2Xde>PQwd6ZCNgBcsI$J^?ZTJ0VZvJccmx3o#cafIkZ&1*G{=+ItIc6Kv+an~Fdy zbz>*i?Gb8uSwUVT))wBprVEc2shAE0fFY;ibFx>HE>b_*x;^(9l;gu=7{gf@6n|n$ zcdx_mO$PivnrgyVe74(SJ70fj;(>>^Xbzr_UYxTRkNa)yY+!HxB5(;4Hh;+m2G@p+ zx=P92U>loK{?g}Y&m4R6$=T?(H;xzXoZfLXwvF{~{_1Z{!cG`450%jlB067%8W8M6 zp+Uu-37#M%U$o;eC@iOVT!qVn5ScQK^8Lm2l`=FQfz%GqcJW;nM)0I_G4vH)XteyX z!4S>|Lp4_REt}tGV-LjBpE4R=aAaOE8gFe4DbG|X&xYIkkbiLv(l&VVjqQ>n&N_ed z!EfaZ#<)6-Qr){rGkEURsq3Lc+6ymVcwfrEFAQQDWfXPVY2lv)PCd=BB}8v1Uk4rH zyXeu_<?F9Y*N^qPaHoMjy&ji*ZL&CaJt@K%NSavThIm0bTNq6eMV&l0g-#AQH1OnD zxKe6YRR;@KR-)t(ftPapGeXy)(gO>)cH=bZ4XUjHzLl;C0ZgI6TCC-{j7pwa$f+St zH`OCF>+yv#>v!6vHMvp3P86=^@T=>r#roD^`nL==Kq3HJE*c#k{N9;xHzV{cYobaZ zP3yh*4iF@ItzmsHA412S=R?V1z0(1Bi(L)tkq%G;gf~Hgm<;2=t2JfuwgZc}j69l2 zTFtnxP5Poy3Iw&%L7mN7E5XPnsx*ecy4(*pF!zfa!U0d}v3X_W!CS=5YlG7H-QBCS zzuCxslHL!}>HEDA56g+Km2H7Rr?LBOb0N<k4f!lOi|J7fV_MBZcj&0m7zwvq*vJ1q z!%RlUsgfj$;y>!UM$Y}MK0U-<|Fk&p=~P(RhF)bQr)&}0g)T$|yRMd3cHqz{uLLuI zo!0Ag^Qk^qb{-`l(agYVq>|j@XO`!h2IQLGskI{2q|&b30SjH!gm?_#x1zmKPCNCE zU|2)(WH<6o-O98Rhf#~ck8n4DDnv1@&uB7Xo`TNwvetk9AhtU<O@O!i{(I^6&msl= z?e*Y~ZXH@3cjTfqeulerZJg+(&B{M~>m{V4ci!+oP*mO@S)Tynt$53q8r@&J<Bx!Y ztFpLjU+Q8Frn{3-b8N@8lJbu~67QwB>B8H}qZ%J<Y*zRTI&5?uU{WcCY~#2#LglA< zh?J&beTJzyZe7XIc_5-Sw3(4?D{(}_wg$Lw3m|h?><_@gastNyW;$RjY5C+b<JiHl 
zOtjooBSQ#B4xdAj52U2F=O7<^1%96$ev?e$U#&X;AOG65F|;P<dtUSUY=05j&>h=$ zU+k{II(NWqlf1FWa36b<QcGodYMI}wi0~`Qx!WGyl?~726aK8l9F#)oSD)a%hU(m6 z4sxglW(v{Ja_|G^odR`47q9$QxC`LEDU8IHuU-hO4prWGxq8eB{WKU=FeQmD1gn{f z8A9$08JbjUvMe?>>3}UjWYI{;6E?`HubfGF0JfniL(!4OPWM?y%mn09mZ2t=70YSB zT6G_m<wBE4XtsPs)~~J$g?k7}(Lii#FHWO9@Q{bq*$<ChGxe3w#<<PsG}{dp(H$9X zJL6_`wcof1T7G{zW(c}H&E|u*KI63oPb)uH{<Uv79`UAhI<ww}_HFICfAU$-t)+hL zsIo`<8f<Mf>RZ@yXr{c9G7mx~7qI+DohhR=*2MsEK?wn#fPh05#V!pL2qX@eFDAj! zv>(vL)G>%41fsf^K?57eae5786p<gI$chvNV+sWm2ccSB1lfa@HH%<Cg>I#-t#)0I zNKl19677?u(~3qFT}Ie71dlw9>=4B>k1Ndc`ofldU)cR+<rm5^ww>7@Jro@9*M$TA z<{E=-d}Lv7VtOJaDaxzLPn4f3#j)d>eLH{Sjd8BiID2Ts)Z_il&mQQ1^xNG&Z{y^~ z+%0Xh--b*^l>O4PsC7Gl*}@!^o1$zK1F6jIql64J?Px6&`r^a~2+VFV)kl&^VY?tJ z0H;-dR@Oo?KoWse4oCt%+mE6kC2Bp83=OL~wGaUY-7s(+qM9{Akv>Sqh&gQ8Nl-Z~ z0Iml{wBWN9Lv5&Qkc<!lBJDu(mgK=T4Gr1V!8DYk8?|m1?>6aI3QUc>J0$V=+AVty zX1Cm&_*9qOXmB;o?|Su}|M;%GCD<JBH`dr4PM_J&JfVQE#aLw<3$%Q0k3ZLwY^xK0 zweg3K?fB+Y*lTq8wzIqHL*et|C{VagzsfA4R$o<9TUBjr2&c{$Er0pYpQIbS4HMz! 
z>}RZ@wVMucOaO6ZVz$bFTF`e@n@%jL(3xVMWUH>`=6m%}LgOljzMi)!AA@bz@Rgfz z=)6YcIg3h~Tm|WhAe!TC*D5nYJx>)mdk<=BKtfV$k5*0aZ39)*Xtk1v_14AOSeoPE zlB%?`K*3dDq?}u{;N?`<p!PvG2kck+R)FPrLxF-Lc6I*CWj0a`7NM!A%u=EQZ@r8M zQ;c0s*7p$I;QQ1f+|@`grC(izsgKR#F0)EVeno~$%C$qpamoAM$Auzlcu{#3PM{gR zYtraGzT1Tu`s~!d7VK1+%<1eX0p)osr_`N@=x`do6vbe$kDP=(ezuTF%6_P{fThbt zKcDmUx8OYDynRD77Mzsbk1<|3Ok?k|9oXEWK%i0JNKpkUNF>=Svh868vm9x!1j;O# zN#si5R+_v9GI7V{bbIZ(za^vjqvh<CUnN-|SmLl0-?@ZvuRjBu`R+=h%6B2*Ic`-Q zVM|E0n;bU;-!p;e^m8k#7gX28d^ES2(e8fAbG$AiEkzl<4tWlGW3VK091H|JFa<-^ zmCF}scrEr5rvQ&X%Qa6yc2RZWdI;@u{s4#rBsD<E<QN!;mZvSu&jPR+m!ODhE6#|} zMD_r{Ga0eC5=r#E-$A071-a!4I8hsWGJF}D$TvzE<hE}CRCJG0Iu69q$^yrN3MtCN z(Z?=<6|Eb+6j79FaI`CL5UPm1qMQTh2zHrci!{5?(YgV7K~bbmBS12QEeenpL*|X} zQ*Elu>+E6xGT%6XrWfFXa$+3`d?rqN6`_VyTOXg|W_2lcfe%)pMOPLghZvC{FNGKB z)M?=;(`j_}AS#g*f}gqsWc1*RZjbp`{7O)x3`>lz#2K}^zwYtazWwPl(SP{oPjeuW z^7Jye5!(Wlyz5GkqXSQ@#2QVcpwB;6wrmsMe2asTAV)KHyWEBvQWa49|GjJth>W40 zw(bb^bbr^7JZn>>PCfN%i=D(uBXsi7BjEO<eAPNIm$YUbgD+&0v8l44K~<S}g9uK( za-dAWK%@A)?6UqREmjvSi%bxKFv8D^2qSPqFm*LLE2=Lzl!#CH8_~us_^d^3Eg<z- zxfO8*!gWFJ1YZM+oi@(&xyH(8Kw(Y*59?~B8eKXCtye2fwM^%<@Ww8u;Zh@sCJi}J zxK$p$(pTB+C8J^w%BIsp+az5gGV+ss%BxHl4tZKcoijQeTVThc2{|)6<8Is@NC&p> z+V*fbDmj&pqEVm8<Z;JFrjD{%mdN#{10JI;8dYt_2>hW9w%CqdI(8=#<`7Gj${_|u zeTqgGiQ5RO&0+%`M1%(%tN<E%fR4(s1r27eg6X9iR4qjln4Di_Ejn6ALWm?la6=qf zqP6G=QT$M7uoYXvl<Qm?NdO#ll|X{W8sy4Q>|jBxmtPq|B=7qC6+#H9HrU_^i8==) zbkmC9fxStw!={Z_iXBcx6Km?4>PSk^3e<*wrg>L$Nbo>L=(}>X0mPtznsSjqUO=l- zN&|US@oi*PkQbK`)YA%Wv|3~xWP^$^mys205sURL5xy`LC;%ev1*8Sd?Xp;CV!>Y# z6)&5IDD31D!-Ud6YZa`DI$CcGd7$<aZUr<Ro}X$&D^%!sU^2ev;F;UptJb>gMkY#H zLzO9i*XQ4S{ln6uo6C`w)~t%j8NDHP(-%UmZ$19Z_cS%th_RH2(PF8sHP>3i#f|U0 z`hxP;x#}ukxJ%p;O=LY4O%jEZkkM)MrcFY(@L+jEc|dJW5}^uhOev=MolbW`l?XRg zs^&stg-pB<LI4jj)g;R?yzE3JGv)+GketN(Nz{g*UaKKOO$f%a9);jS(pu71+v;dM zmns*Xkf0XhOvHp(jnP3vt;HFl%sP#FvAU&G=eEMEszMAq9lUStbP!wu(nI11PA}*m zWXhu@yp8LqV_So4eBF_+9lB}josMn$kNlnT{u95|V|82fPNUIhF}!_rp*J7(=e>^A 
z@o$Ph^Jf%c?!7-h{_3+OpMCRBi|;ExO`5UG(p6RMiEh)xe)=D);@LpFHx1Q7EQI@~ ziNgLhVi#wziee!rB;}+KQgDKpuBahN!B3-`&9J2o=y}*WKA;4hs)Imu(5WW;X+=GO zqWC&1(m%B7Ar3E)k3~VJW;Fora%I8es@budf_GII`qmNjsh@jOQBECuye~9tkn985 zZC`wJ*OA)-VrkFJuYdGo<)b^Nqw&#LD*oc>!kdRRxEEs=!k8^2+no}=S?0#nt9^I6 z7rf;j=GP&Nx@Sr3tZ4u^4u}N~c><mr288<2D0fk!#@Fy!3Px%NUR<qbRLfoj?QZAc zGcsFR)}^n}I$aF}|DfPlY$h<AAnPe0L2`$T#Rx+qhdZGy9z@)*Gb*D79$7_rqVr53 zCn9=4#-e5>2;uZdYfwUmXxuJQ865eOnm}Us@u9!lH(j^Q9k)BKbL_hJqaW@rW<!aP zWcuQ5hYq_Z9~{kiLd-DoO4#z*o9{oPI~9@K!OsWt|MSR8%)I^PV4^Frhy7?ld3!o~ z*PRv*<slQu>z=}W=|&!ci2_~~6;ia)tRFtkOK=V|$q-Xl(0(FOrsS@b=rDv<)DeNk z-~cieFUs=~<5{^E((JX!5`<;R06^A^yVGpp8w|kd#&Zy<7doLg^F2(>R=Jy|pxbSn zrELw0(naHHP#0tiX0;2T)dL1gpf75o(8WM8a0wm;pLWIc4s-wzBkxKfh!MXpp|e=n zcIv774ix7n%Gj{5m+5?wCSS;Cuw4^q+4H&Bb=P#6#y-C=lsz;%am=oZdVTRDzkKoC z`xa0eT?Pv~x266&C%?Uv*afeBfc-c2-Mwtv#Pg+>cJn&I1h_}uglt=dr0|Dj{;j%H zDZypC=t}sIN<ePdb5YcgZNhSlqM1`cnvJ+|d}A<&(WLlTU<ZT>&@7GL2*i#DdtS71 z_@)&cu4LLlxIx%GJ`)@6fDpYtAS5{XI!2a3`mKR#w&OB>r{3EXZtEbWZLz90X}P&+ z4K{lS=q2g^T?#Lgx`*(3Wr2#%y|`!FP&ybIw^&vmvjsk*dnMng)h-FRB16sV@BTw? 
zJnoNAO(+ZJgJWUI*cnXyzE~FO`8!0hEH%={Vc2h)6E=c9;_mXcFfh6ihQ>vD2$V@l zur#2oO*t_qkg%UE_>-~$(lMxXWUxyf7$Xo!$Of9}!qA?@TC8z3@T+OT)ht$f(1W64 zOTms2RRT`nVwTw~Ga7E3wUjh0S*mcic%M1e6R8_N>_5Hx54WB=|G&b|zx3E^%7<sR z|K56|I5v4(#I$vnk16~1zV)UQJpTIoA3US{>!!`++$eM57n&Gslo9A4Fd6Zgib^-T z<mV4q61=Y3%5ycx3w3AVhbr|XjNOvG1q&O^>Bt&#FnBHu($>1Ti3RF_=F<ImP<u6= z4bP)(=FmGZqqNen8aWdo2wIht7P#Cn(98?U5&tdgyy>yt^Z$<O>>ID{m{ZQ3P)>)% zz~Jnm>AP4==i5DB8@rpmA^Q2|KILP$A?eJW$5~R1+Y*S|p3?jXbH<~ziHrE4X{|e- z=O%D6VB)L*)@SVVSPLVtmh;uhoj9ycZV$>J3Qp?00r10;RGsb>NHIHs0mpn1u*Tq^ z{d1kcT)(PwbuT`llcH`jnN6IJ8%-I^V7r`+A`%rqqVzhR(sgG5NLCwCFdDrYHyY7| zMC?v?i(`#9uJs>y*k{Q8?c8Vnuq%-7%XB-NYD_kbRO7KHS2ql84BJ`|+-an~`Bjn4 zrG<z0_s)Lh-p@|$+Oc|SWb+Xwek+-`p454#4vnw3hobGB9$SNVb)dP<Z_`w@hlbJ^ zu1H7Cji?W;`}AiwKlY8y_itHq?~U^Vt({|Y+q$Q=Jn{Ng^1HJaelETXy@-?b<$jmK zIVY()=t_%z;)fCZUbwB+_9`c;4Y3p%VG8nsNZqie%XYRDPys9f3&*f<HaJt9n%cDB z5V;7E74gkoz5wCMEY8~2b`FuRv&veG1j8sGc>v+E-b9jbwaT;;05Ck+h5)BCk*N@0 zhPlhmq*ceQ`YOVbpxsh|O5iu$xm{EH_xsqv!qX2;Z#{nUo4Za$Jn?XJ%fqJ<34Jba zxWPC*_wnRiGsXb2h3&(Kf>QqXCLWm0MuKZ+jz6Q-M&qVA$Yva}c>}WP6m|&3HqtC= za!|RkOypHE>e_s2Jq#2Q9H5CJtZ<gENwARrZsL;Rbyq4D$j#VDfm$nUBQ^%MbFXc* z@^^{A(c-Fa;tHmYVNk{k1@p5cZ$M+@vmCMD!S&Aa)1w^%of1=9rwlVy`JArHU3co{ z?6&S;%F|lded>W@Pk-&X!a<#P|NhO}-=EoGjD~a?@zm@sHxIE;IJ5cq<o&%;{^1>e z@Y`PV49EHN=hp0!-a#D&4C}lug|YFXh-tLqwe;{9#vpJvEfZ3u(X%>oEyMs29#}OB z{m={aiotzC`R5>sK~<1NU;+%_VoH^lO7ISFk{a$U5R0ruos|QTR--bt7;TO5&GJ3q zv?7?@I@@AkaAXYPkwuVR$Zl9u+(g_s>TV#T<X6uOQ-j-Q*6OK6e~FN|6yZ&-B31S$ zUkbM{$?1r-)q8Ex#%NP*xZ6`>X9k<iZnf$+UB6tGSC)6Z=D@4RSX*?!*XOf`LXJ(7 zu4+wF&=#pRxaxh4KC|z)bIS!jdHIQ`c5N8(@JZ;NMP*&(s1ktc!7NdPm%?r2uS;Hk zwSZU>adC(Y9T3>&v!$roJq7tH=!|epD|_VtM{#q!qZ>6@&b;bI{#6vJDbhidOn_K} zn=T%(5n4}xC-ALknBsIGYKjfHSRuj|0Br2kTkj(GjF<>*0Tm^I)oG&VCr`Y(<*q;b z!PCmy%>8VBcskbj%<k#iwv68Q?2*T|KQh*=Ir#AQL;b$bjm{tWx0gTs^|^nP%>GPd zPc)FpvT*pG>9MhUZ;vW}iN4w?WmepT`cj&_G1WyW-K6}eF<xrZ6E3>g)<`z04J8tS zv?9P!8%$U+$U#}{JW;5JDq51odJgieC(J^7J^osMJJzt`Cr~%M6;O|1Z8&3Uv6X=w 
zh}`&eADTpXe%h&S#Y(uP+yR%*gIWc_&r#uHhU_h1hV0FrL4R#N@im{xU2~u`b09pG za_7RHhC?stO?s)`UKDrl%`J_e88Qn;na|YVQ2tYS^5|%`!`GGyvL{$lGF0p9T6ios zs(5%TmlAq~KN5=VutudGZikC?sJY45$aBRYE<qwA`UkqLi;Z<X2B=9N&EXP|hVU{x zp6fz9=YWVD7lXKd*zGI~mE1_7F<1uz7#qd-G2YRbNnaX)MM9D(_mIp$U((HIpb|FV zwn%(sj}K9ig|@<8k>Hk07{>;7=TW+8E~g%DLMfHe`YVGw>Fe9SJ^yY?AQUoNZXMkH z-^LH#|5p?L`rW<PNmIYQXQ8-bYi>o5cZ11x?nUL99b@dC(Lf{^3QxZNdSZwje{tJu z%p5y?<Oio;e<p@o1et0!qvq2B?}t7c#YCl;DpMe4IlU%`f7Jd`GjC@v8l@Ns5N;0* zrE<=O(R9o_k-OtQt{vqsFbY9tm`ARm@dkm$?oj<9gseU*$I4Y8h_oJ2Ad1HSKXT!& zZJBU1o3AX7Q(GF{4*#i&OgvW;!DJ$wsz`xi0J1f|fUW5e9#K1Ma<uH{XwJ4I8QgGk z6ig8fk_ERa&L1C)AhZJweBVS`0sg(BvxBablbSyroLH_yhual!b|BGcR*h0~oSXqc zBs-yB9f1B74ZaY?!q|ca8smTsLvx6^V3>fxfFp$~s1Gy~EFXgEA{=C1ZFsu@j6|(R z?}P^7_MuPdO4;?iaHz&v8I7*4v(79nw4^^TIa5Xl+jZfcUo*>_iFd;8HQwL3e^2qu zlj$9G@5KWKNv!2FaUqMOH+c^~!kGEVH<-bq{8BnJ`&Xw+zv`z|(a;%(GLJgpE@b*W z!tLnI;#Ohj2$gV1RU8~bg=REeIbQbyyBr7!2?y0>kqN#60w@@9Uj6E#POOzC*R$55 z;04YRsB#-9n*_NU9hs8G+Zb2b8W}nMK89Waz$4+bE0uvMrm)9tciUlj@j4H@sQND2 zTQ0Xl7k2QfMp(zr4bKL*D6cs>Qf3?5G4fhXAYJ?Y(8m@mqTW7q%wdjqxs>;}v7IMI ze39M>Wqu@^7<^oOGZn9mNmA6pKoy|`r^Qsv?q`*^GUs=`a$s=tiLtQP{|B6pP!#+Z zekT4LvT4HX?3OY@KL|}Sk&v8RNQfH>E@%*DWg(f0PVhx~#0XB3!K|#@N)<rCic17Y z55Z+2ksvfdV{N1|7qKBx2-wFoaQ0xXL@R}A%1$$@Wt@nol^&k7h3y=f@n*s&Bf0p* zd_1!4uH5XV%y8HKbeCcF)QGpK-fi-XW>eR10lnUM-moUtv;NP&yLEVtv2XhCw(J9I z<&OXLa3s4s<i<S-!3LY~d;=60X~<e&^tT_!-V1A?ssg4B(3K`taJB$6my~n8ygaeG zT<pOnVC`8M&Zt<grMps3!&F-CiL~e{73w(|1m%-m_)Ztcfp*349ifZ9E?{vmCmIom z6Io+L?3^uh(YipG7w{#3%L+!C<YR^1^}vT!?n?&OrIzt>DO^?t<%=tR;!Zj#gK0*N z^4=4Vzk~Tw431CIm{cmXZqw$kt8ZX#%8WOBO+=UaWZ0`a$G-Xmkfp^~NLRT@VY5ck zg<?u(?==swPcHx9++?|!VRjHUE)1Ng3AKu~G*d_;zd(f0YB>rt>qBr)$t$`yzJwh% z8>hf)KWop@cp!QD&b1`4&J1L?mhbLa3j%6N`iNG;1>w4=3fu3Z(T@n#_WL6!&MoR) z5!70V&l(vF)iP+d!PH5&F%XLJLDUFvVRpa(AoExsWbU-C#V@j%QhnM-!ciub$|+o$ zhsM!(J%hQH27oV;AzzUTx%B4m?^vIUG-SP@+0&OD#O&sWRvd)p+4ue<8N=Z3nk~vv z<&!6r7nN&bE&Am&$2bc1%HHu4ith3c$OnX#oMs>7KP)_><~qcJMz<A1G7cO^v5P81 
zeYnkiq=`Zwtsfi0#V*xz=@ZD7*N-Inl<G&EaJmy5Cy?bpvgqT7;G;%XCm&eo1RRve zPYB)&-?_M2a6$5Ud^;bPS3~l|HJyi2mrV8gP#3N}5lfkKLtOIps~~w-=O9VGZwQi4 zkbi(9)t7T5j(EZlXL-sRlQx&n4PFwX@Po2y_8l0zJ2(<?q+8u}Hm?Pw-|H?F_$8<F zt|HxrDdUKzB^C$_uirI!Ai%zV6=7>4m!1{(D-mD;V(4ABqY{e!6|`T8m~slbWU&Li zYl>+>p~6SFN-k#&A?#R+BQ{W#@%g!tQ71fk9|1KfH)=N!W&{_9bLMLpL~p0^UcVjt zr14YT_p;ciNdQL`Rnb$?cqzeOsZ9px^kTKYgG(PobOGqRm%^7Yb^ZX<MWE*l!riVw zqXG+%PA%a85HxFjOjE99!7R0kp%ck@EU)ZBm~5C&zVuQj09DUz@?zo!!xzs~2qMMI z#6PcG7yZu@lNBJ9#^`T~MLV6F#IKc^A;~{8;qZkI#RTeiP9X7?Cg6}8@%{k=Z&|^4 zRyIPrkQ&KGVno1_7FmOy!D5x!;^fZ7PISL8l89L0M#^Ju=`$h8)0c0T;zR3#hS1*p z6XNls*GQ6O|DaN{bH?Bu-1<HI|I;~2@DTywI&~I+Zbd;#7VB_VsHXwj&I+im0jkAm z)~$3P3~^Gymjv|z{bnT`wJXUBZH$t#Doo{3N&+QbA!yR=6^H_MTXG)xM=JZL0QMyl zpSdVxakp}4r7+pTLUvgSg8z0S{@aIO1G<Gj5sDV_Z|W`!g5GuF`j_;Ar4x}LdK-k6 zsA>X=!q-Y~Q3wQIh_<xc(aCjD<xu3h7(NR$IQcWMIZza&K@_A_V?b(TbrB@rhPDOf zm1I4zJ*bBkYice0no}P`uA?p&N~O(cGSd!oFzl`Z$C;#NrF?o0Nd^h0JW&QnB*2c= zR@4}gaZn;c?LZZl;zY6D#m)pHj(|JzC8IGlhg1y`I|KTaxm2K@h_S5k-ELQz_Ww4o zp-wy+?HA3jipoze;bZxtiE%zGJ^SL|fM1l{t>$*;<e~HbY@k??@I5a<o~tnX@R#K| zRUqogs=`5qNuth7RT$zEQd~6{P$@}zgA#fK5ppQZg^5LL_>E65)-8`G0gjgKF+>@( zdIjLCC}qO^QE>L7R$FYOL7o0a+;R#$<;E`Pc#XG(dVz(gRjv+`2h)K)X;E)5*Wz6n zaRH=_ltm6=HXFBH4x`Jc*5{qHe&!PGTgrl0iZ@7wY>L6fmmY_B%lbIiHF0F`mgw(X zb5XFT%%1pQcy@YJJT$C)_Y&Ez8^dy&d)c8Cf_*y>&H0C7X@1ue%6{=_$hVa^6#@_^ zNLor-wL_$(dnIVAAqud$HI~D#aGbjvs!O%1lj>%ef;LE}m8=>7<W*KsuffPs^WH%& zh2{t<I7hYQ)|wa9?oXUBCE%8DS^@Zq5Jr#pa6Y$Z$2Ziq1bu%-?VaPZ%lb5Le7s?| zX77qV4u6)A;@7-`XW1aUE2xXU*27lhfdr=P9=Ncg5f8|-c<dV>rqWW<J49^=&=8{e z!1V;m#q|WUGAKf>!Bd)(#Wg<+;{ZFYrZ1!Z3ug#!`r&GDkaT3N3L0hZ1FNob&_| zp;4JNG-_lFlZPdaKtMy|w(kVI>Fzv->or<sCz@x1JOodSH8x?IZIhGhA-68-3lLfl z07~eR4kiqUvVuV+GDYlp@h)PnKNCnIkOziTukK}AQYw;rxh)%J0ZyT^LjPXW$LdzQ ztDm~vYhV^1d(=DNc6npUs}KLb|N7lMPX+=*F3aTVnQv^{Yr8ebW|Y7ARJx0l5njh= zoa_4Jl7$>^0E?p0n2ckJ@WjJUpF5QGM~9;mi(h~EklQbg->f88s3$)&_l5U$5!m`I z!uw^9t=j}JSxDWLlD`2X4IxYEMm30~J&1DW1^}NhpImB;WrneO2Cm>Hz7G)%%3`yH 
z?rSpzq?TrSX`!|n+Ljct=J16ak63fv_(CK{Ux>u<1yB!yX$T>fJcaK}@zb8VmQH($ zz9Wp_{?FlGXtv-OJam?&>Kbo|)7Ip0!c`nj2?Akl*9JQ4DVw|z)1w0C3VTYBs*KeZ z#{FWb3*i-2aL^qlKYp3}fxjfbdBqR#K7k4@qTM(P6#Wf1Hj)E!&?^1ftPvmi`uIIS z&K!Ml-*j{arxjF}tmN~@KQBfm@P>!pn2sEtUkEx^t*NRy<4->hYJx-=K3HWr=CNxu zr`h!b4*P@OdV~#CyquxSpv&!59-N;#?8&lWylsqw8j9Y|E-2r4W^X9LCZ<PRx++@7 z0>&u%_B|ADxEyWP7WhY>;?e#Z=Y6yA_o{ElHX}N1^q45Zy_vGW2|8wX3TiOL&%I=C zjIY^%lP+!f^vxky_AR{O$_n#Ts-<-e#h{taDG028(Hx|gW`t*jpxUm+BEDiDFIx7k z#uqTQgD+6gve0Lt+=QBPGx*NTI{IK{65m-jLoe5D#JPV4=iW|<b9H@)^XGv!x{7M) zC5^88nw!ZJ%vcL^IP57H`jSR8SJBE~gc|8g5a=Qtb>*P1kS^kSs%L$b&(tI{oJKIU zJpJTtUFij#Zu(0z6WdIWKe}%^9FJ7;%tZKL{;myDC`FlPnd9hS>W=Ynvem{alpP^U zfVX><XFmF1W@_tL;fX(BkMWf+Y8eN~r_K<LDzT^fCcIg8IvO_U$`zi<M~P?FjkAv~ z|KQvRb%^6ZTEmDlA5?PMlc*7Hr8`c0oQY5X!*i6EwsiJVUb>l%eEMb_Io9n!7i90J zaPu=3(g5_<uap_x5-*Ia26_etdYsN{4F<xC<J>@xV<|0JW1^PE<U_HE=l)Fq`4_ur z0-~QP0+@i{uoo&jZo?PXjKf?HP0!L5?y?qF4Q{4aP2d2$34a0%qi?$z?+Eg+8Fm2U zO`{zBpRxUUozIiZ-Z*=cddDwKp`;#0_c`2*W+pCcC5dWYw$TTlsh_x=RHa<@No!oB zX#sr-F__CuKb+c=2gY^;*GFyXkh{+8wi+>xe0c8xF&g*Sq{m|6$0TcGJbd`JFRzL< zSQy?(JjSaJ&bk@t<xd%Ox>FBNZm&7@<i6SES(h`K*wDYFf8NXfzo*#md$P(@w54iA z{u_?6H=bc%RitR^Wr?vek`}8H{wCd-GjpR)Jn=C44o@mMjt5*Te7n457>Ia^+_2_~ z=fj*i4>bxPRp|a>9ngTK)Ns`k2bY2ipklaLEN^mgBkdwy0L=FSm`VP(7|GMwMhF&` z=axPXOV8^M0)i8{5!g7a%E3-`{28GtyC4r>BIr9Vlt**KO?LGJxN;1<VSTk@q@!a~ zJo$~<>2%cO*WrQSryzJ@7n=(uK%DbfhK{Kr5hgaW<;PG-e*dE)P|%7!@sKig$<8H1 zD9XMIE;8J&DCB$;_iKolcsXMUh@nW%)yO&Uu}b+COqF~2kX}Kulme=p`+1$U*xe7o z3EgX`=CCea@Tvq-;DFQf{mpu+M|<@oTrV0w#4H7e$;Z!!@uwc4ZqXV<G)zf`TnmsC z>d)v!mINKTiK<*F;Hzjkui;!#%PZp5VIwK|JJNT~qSq>_%x$3*nA?_AdZlDR=wYq5 zdt9NkBeFiYe{3lBF55wbZI)Q`N>xTFXGvpK0dEQ(nB2L3I1q@1>LSVh4D*rg$^-wC zfxqe^*#=3N3H+c^0tvW1$w3lSm<3>)vJI=3(Aa5SnpOAv7Q?(=5T**GhC_WcNl4I` zEQBp`4e~hjw^7?JHr0@Ho3Qw;Si>z;4FoNBL%KD{?kIM|_(l+N?MmR+CW-}%HGZn0 zp}CJB58~y8?NpfpL@j0m{eylG2Ck9aQSJ+>6t@%ndFbEXpK^N=eP(t}toFrwEqlj* zAenLr46=nHirE*>S=gS5m#mr48gYFlSQ|UPSv002RJ)FuStxsc5?g7-`_BI=+iHo@ 
zdt~d&9KPs-m<2ELcW^$KATEwWl*q&b4^awOOcG*lj9&319#J-J0toO;001Et8aNP$ zXYO?rZ%KIO0AAqc)3e%FQuT%cq$l0!9+3|&YNnPksfq<f(zo<)ADq2kYy0ZG?@H{k z<=m;?=-YM2z}KFXs`b;47nRfa?-zv}%@^amtC2-uM>ppqi6C<zP)s47s_ah6;bAC; z$B|>)r(|hO=Ma}*m)t&_WGfFsTbpo_7QWlU0$ve5eGdc;&G=xH2MIi9@~h=)K=V}1 z$uf=x#5+{O#_H!KQZP$lh#&yqFrhBUx_D_w&HX9I&>;IPf>{xYAlIW9BR8NSCa+qD zR|7~q3PV=81A|vsMuGwTV&ke9T~c7Fsv2ldZ9#7<)(@@YS`<>X2=F3@3vr7cgOJE7 z(2~VEFtkaaNeEJCg(}|3U|N(G4neQb$427_nMNRkVR3zjMjT89ddHM^WbwXIAA2O+ zRArP55pc#xQM@n>J<|W~ec^0G+M8ESM!#x$TlqIu^Om^wrY1K0V@dR_F}k9|8GlGz zh`N-0uNK8$#g%vc!Kg`hq$WBq8Do7+c{g!J_j3&PzW(E%pxz})((J>AP^u4!9&`}+ zGwn`}LtlkD%-7VaK|!mQ`Y;+q>?6c&RceS=5m>8%G!ejIkV%*OxAkh1iG-ct1J%@6 zq2C%UatmZm0Rz=t>Zy}k1@IPam;iDsW&>?SLpuRb<Q8lhk+pC*;O&R+`E$YK(APJ% z(10#tLq24$28LZ22$UL`79f6vf<UVvjYA56I3Fbd2{<p*0f5Bl*6tkIy>I<1W8-f< z^f&BYW-l?@iILxY_M=mMsmZBK{`4;<b{^ds|GDy!X!m(*On!D`@0r=~>#x81<5yVU z{3DNJ)zv!(Z{Ksr+{t&?UE3a=&Dj6`K#mPZrV`>lF1JDKN}NFcTqW3q5#c)mpJJm} z4a|0==tgWrq@J$Yh#@cTqkNkGXogG+BPrR2jGHCpdLK`wRX@b{A)%|lc_b_{u~G_l zR9tA9(&$I@7wK8@+CAKB_t5&wh`OXLgm2Lbr5*|DTi*rG&2wp>lWVDBEyy;&=895? 
z8VK;-R$>YRYa8JzTEmwz5V`<O5tm0=7Xz&hi7#to=k`1v>z&Sjck}*~<>=9es0yJr zXr5%IPo8g0wj!cZgX&c1uG5>dwr$71oK9%$j_-bN=Ey|Gsk7XA-(0ltokljetVE%_ zW%=ZVGs-!a@*^HEn^pQyvq&*E>o!bm+dm)O`Sh(BO#hJIv0Zpyd<k|aC43f`4KhBR zSDIy8W!IrE+@Z$OjilG0mq90;6#$7AcmbQjB7sX)Ajk@VpCd$X)P!;&_ND-x01L~B z^ULFF5Moy=rUEKO1i}Wn)y#9LQp({R%M(uqp61Bj!CZe&$m}z_C9}`7E|J-ZWOF1> zWx=hfi^-+2$~?TaFBtAKo4t~m-7?z8d<z*vg-MuF$rU+c6tXCxt{)eEN1fy?7*fQ2 z8Gx;IQ{pGM(E?o5y3LRU>Nf<&rD3&_%7|H51#tjv;5`csRtv^FrlEo{==6C^LtGkQ zDU%rmG6oZQRY7)B0ZT)Ew`i&cjFIjw1?eKC+F1QVpdG{~s&t6pm&_w&YOs!Kr%(j7 zEDyPvPPcIHu6QIKvbiH;9u`jIY)pId#{5aF(M*J;6X(_=+%qKn0sYfYZwcdT><)iK zvk+_5u86>aL<PT)^04MGbYnFkE@-2lijL<H9!?0$*UiS&ON(J9EE>TeJ?9VrkjIlG zdf%dsC&W4`gtT*RU!D-_RK`#{?%W!P!wgbn4kv3Q9;RBr<>fd9BgDPLmrP=Cgn<7b zU#p<r(_z#%xayE(=rc7isr1&xlfD47zAlJ)GRd(zn7D82{O^gLzR}s4;Z1$ut`2SU zd(#2SfvH!mnv5w>=RLpgz=0`jzZhf*M_<o7A4Rgs;r{5%Okmeq_KOUUNaBbxn2=Qh z;)}5RTu<7_QkDB}NEbNB;L2)ab}y+7*AoOJH1Z8=*T8cOzTj2hWat7^hU-ZK*_<Uk zp+1_bAy}0IjX{t?+Hf&%D@U;KsT7M{o&dg#%uLG@goWL^IwSEGt|j3_k4-u`3mvJ2 zjwnxS-m0VoWgXe>kDS|jNnW5Murd<@N=SSgvRFlX6z~@Ja>a%a-IfN5OlVmX`9wY< zLR`Xv6HM<M{)T-VSei<0vvxj-+^)e>12%!-2b?|idYnW)q0uAEr)O=}r@?CN<U`t( zU>&0mWM<U}o<BYcme#HVnaJtsXd*Cv!|-%EnhD{!@2bm;W=!tf;6wSzT{(}aF`4r2 zbYZk{B%O+;=ltw*J?D45`$TlKCqF*^!0BD$qZ{#j=yi*a;rUvGTh%GmN}mM)J&wGk zl?qK9dD{f!EnnbB6RBhjd3Z)gSs~P-3OAC-3d$B+9AzJ2v7LC-CYl1FO(GaeMcm^< z)Pvv|ufkT9tWrH&!py+)u<^RsYV(1qcg$;s*~0YT%%PZX&>;otBC+6!bi0{@L2hEr zEPhLVZEqwz<X+Ww`XM}zAJ6kJo+m`lL#(E>J%v&X5LxiO1w_rLrT=n4%NxU3^?^-k z2AmTR^OTG6e!Q9#<AI9;@5c)`Y89OZ5D@WRAR^%pMxIig5G>zTGGvG{z)g-9YKDM^ z>}I=z@vx^!>lpH`Iy5tM-AH;kV_}DHc+;ARc}}o_hfcS}TtneVekh-P>4frkpxNZv zKs#dT2ryBP!A>E&7oJde#1w7h%DTAP$<<W(p$Q3^<DmefAzA9vcGE;Ln&h0iqE3B8 z^!^NdPpbiKy<!~)VAfF{>PuE0KMYhoaaI^;3zbFfc9qO6*xrDEs}QyoJNWi%LGDA0 zyy)ua<nLp%RXb*T3L>z?UZ0w%(r^!AUzhf>Ud>Xc5W*rwClV|F(i}X@e}@=l;)%x( zvEdJtzYdS1b?8{$MvUF>dims+x5p<PA#b+FVF*r*-s5qZTV@8V;?c+ZlvmgfmQ83I zRxe!aFW$9hZow!@6X_k}QBRX?Zh9h|N^lvUgNzSC#=XLysu?P|K++s6Qs(gDG~IN2 
ziZ?l&AjmCMX@XYBxQb+~!WbzsGbD==$DT<9OXwgBLVrg)WSpXM0avjm$haRemI%_= zioL@~_Di)5z95%n6G{MeD)%XI^?^H$x{*yJ-YV>jClp^2R0VbQ3AG;3MHDM2-#dAl zL3ux_*v&dZ*r@Da<VJ85oOGAc+ZAnE@x%wizx%lI*VfH5ZkOkt(W#)pkz3;pIVR)V zzkKp#_m@DFU1{K+S>q^xa5wz!kCj*YnAm&pn%;@&Ia|OJ9ltw0A&JJ%&+WOB6FKSD zkjyW}B8Cj^=Dc!}b5)KKUsIO+Ek|i^dKq&SgjG+07iEbNLLmtekg^kgtsn37(~T7R z(Jta66e1DSlzMqGQBudz3T{g6R;loR3|y^6?lHn)T!j(7-i6w}i-9gHyi|*tg|t)v zfW<%y9aEqYw|oo-;}&r+64GSwRH8^Nuc84hr;vl4F08f64$i}GE14t7Y(3>D0raOZ z=O@T!qQ%KmU|rO>yVOvpIc-g$)zD1=4mom)r`1))NcI4Zb8?<04mTy1WOYmfLn#;} zpf7jxE5gJ_jiOfbUuCJ>I$dp<3w1|gjrS`j|BY4q2QbuiY*k`(-Oywzwdbz6LuR+= zeIQ~q<R^os^*w>EK+tC2^qrAAGIcHu8+m6<lQ$G_4BiPJcwxUXD!zbxrcKx+6l;;H zs3d~5bRr(!9{@ZFXQZc&0d5eTQH1Z!6V+YFqT1g?G{DE+sz|ze0J*`%s>e$shtKO_ zH&KDWz%?8o_^g(EJ>W_g34{jIcAd^zXWhwSQ<GD-eJ<on+5Or3C3iYDx^ZM*^M`|$ zT9<)s*X(=lCqsSg#)Q8$H?Az4JGE`cSGI)m<2f%8(OrC|)xu%mOzD|8g^in@Ab|M3 zkSae@s`5;P6EyN33k{qQ^b}M<-F#s$?R!?A5kPMmx(3j_xT*?&vwsy;%BzSOr@sm| zmq&x@^QY+fLx==Wmm>GS^!z*=CicF|<ZuN-rHk_TV_&`fffYh|%;&cryG%0deZMEj z@+-un?D04{l;6EXM$h8h6=@qd;s%6*F3xe?Eu8<O1!s&EBr>rAnE72b!;5el=jyZ6 zCs1kQ%(xbu;SeCe%7H@jtrl1#BF9rlXH<8Ix(6}%^orthzTm!*@%2-Y`?G#SjnCx` z?Ye#DnyFZ%v)<iMtuvS!Ka6;D<N2XCUlrr*(73~CYxO5)`i5Tr{`umOC*uy(SX@qQ z?s-f}alLK9?#IoT5vNn;dP*HlL62w*Aw41YVdYt&gJxfsEx@BGvnZb|ME#d{qrK=$ z$u2%4qz71d(dzNi)C^T2U}6bbPGKpgKrpzBk^`PD1aqsNB1Q;tn*lKwUr<#xpYrxa ztYa}<cGFBEJI(6UqdV^!u20?gbFGQ=_KxgN&kgu3@nGGaC#J(|a!0?^sCo2-zb3u@ z&t2i*{x3BgpcvJoyey8xcdZic05>cJ2EcF-adNmE765V(LN3?Hja(y2&^$Z^E(=%~ z0DlUt$s*&pECxarRK7`D?P&P&V3h%PRje}8@FZ~dH16AkVhot2gUDG3Pz8O(gAwW% z;T+rxxf%kx5CIsdnZ(EzRF<4^M}xzzweU3-zVMb&zjp689@%Rg-`3>NRheq4%m6G* z9-1&kyrORFOjZmWe)arGthB)MMU<SlSvmtf#13gn&H?e^&)1$T*y6Oit!N;^Qh^Aw zxe~h7_6BaDn@4hPL~@8s3GNM8FKya(riwy<o>KsW7uQybOs`NZr-=$Y87jlTdgF<Z zk%oh80u^$LmsiMb(UE4VkRLMW=12DY)WVX_KXhpc+zTS4K|`N`-Lo$oQC_TxVEddX z#4s;NJ^LJZr2a>_<A4?p@S}@W1H`SP1|XcqRb?;%dm$K1z&u*=)Y*x>g!Q0_kE50a z@MUsxDHlK0UYsh>cCKD}L>H|_KwTJ4kr1S{QPhkGj<*JIJl?7$INoBvhMKzTc#(Y_ 
zn!22Zs+YPwL15ww>#QZ4HJluvC{Tl;(XMW;5Cf>164f@AHtTDx<R9xOK9&PiwF{#% zircY-%A<}wZUyh#Uj*i<YpYNj^|1g)>MeUi5AIA{99rk}(Os*z7>#Gf+!lA;iTy_w z4j)d984M@0DW}E2jwKKzs;^HeJGS<3-Wg!P?7cIQHyA@zj=WAc+TnICus_)Q{>0Q= zuZlIRIg#HyIp43YWujjuez<pb-iK~0aS#7o^9jx-O87HAp^6dPBCpO>aeNtA^g!Lk zwGE_GC_0CYLJjE_MImY{qTm5u0&+%wLW^L;u6_Xx3p?u8cwZC8t3oyX1(O<(nBtW! zO&GA2!zeV%4K(qB@LHhis0JZ9ge>s1s^6&9RGaLcrT|}u4*jKYAxz-&0LsUJ=j&0W zgJmK^6h?f34L(@&?g!6_!zJqWJ^7?gwAZm<AY}@!Ic{X;e|ht*U$I)_*RnxVSAYdQ zHc|KFlkr#fym8&>kn?-eP+-%6_m92x*0J{wYzhoX-*bjePyY3uSJXON3i6_zGErhI zAx1P=g1jgxDOj;oqJ@_+)jATF8Auq(@-TJJj)OmbQSYEzOg%EXkuul;iNyuPT(TP~ zh#ZnimnH|zE@MPPHiOBe&_eS*q-3NgHX>gx)3+difQ!K}6pwf*q|iIe8n$I*`=Q5e zac|Rb)H`?2zdkm9XwX9?HNt!{4J_4He^p#$Ro@(*TkWugdps>~Jbh02$NZi{v&>il zd}{64G~eaS1nGjJc@NK>fb~`Di}fkW2#SI`KxrA($Qf~EfH7H+Yw5Pc@Sn?hRLdbb zXjEZVFekB+NL9j<V59(BFkB4ilasTv2E%wCi0K`^patBST3_lKyxlN}on&=^x+=W^ zP2#V4>OFrPXdJ8%!Rq@b!~^l@`MrA6XuPWOuBg{q^?p-_k=@^??NxS+s~9%nf}(vF z&vGp=f@;UcK^UPlD*S@g8^s!wWU-%4bWMu3$I5*BM}Hb;q6<*l1dr9r=)YN~M;wc2 zt^$$b7)bR4d}O|qF%|+yy1l9tmdijITkmT6BQuyzj@D?TT3d6x1rO{?KM0iMgvVc~ zHdR$=#BZAo)w$KU{rSkPM&rnckC`la`x`%Y`203z3S>vX9X*;qX&jVtqwI9hVsB9% z2<T1W;3~((6?GjW>;=^N>kQb**4z?vEqx-b9}(}1MihyC9E+U4<GO@i6(VtM3*g%R zSUZa8o0PhNiU<*DzRl@hg|jah;*^kseu}1VXv#*A4c)lXt;i#wu}eWN4;-+^9{gGr ziSb+wIe-QvC{TbEZ)a@wZt<?E&sJ8{HDN_v`h)haA)^r+>aY>zQ_d!h_#JayHI~!m zA+hG@tp-E-#=3(aM5U*Wv+`ED8;5`U{QL6>eT92hpUvqd_JohsE73$~>1}B|$!ua` z8~I^8?<25@gH#Knxd={^*_ABTkWz$H{$K|v{mB)rC5C1>fjUuQfSgc*Dx?DmdO48C z1s}o%Pk`J4n;05{Eq2O$F*kxdtI!DH*gFiBER{;+5^CpA5Xze}E_Db})YAz23c=*r zKS*^thtX;C`W@w~!K&(-+<I-b)~!9TKjgg*CG?Xw&f;fgxT*f3#4?1k_>ewk4Cq~2 zkkVI5=%QX{Gc=l385M<z*K{vIl(o@BxAKFnD?!X-<Tu8Y-_yL0SZpozxvAeEYh$=W zG<w9dTN9}Vw+TRDSkIPfx#i>NsqXq`dT;#YS7-%aK?^=C>3OXj^Zlo}5Pu#kV-`E& zF)(7`=Q2htR1i8k*rKG3@lAW25380Sj&=w$NJVFiv6%p}vsl~O$#d6J#~ZrW!O)Qm z(Xi~*SpNfjSs6mvAT$$<Wm<&h_!Duv`1iTA=<uBX+yCPN0A?NgRKyo{x!bEOYc^M5 zXt@?4o~ZRUnUV8WPv5D0*)nzeh8wk4-w7bDo$jf5=<_{4sTY|>k07yDZ>`fym?n$# 
z+7UkuUeIZ>niJ?Tr)W-rcJ&oZDY*?15{fWENLhBnc4=bPAVMX~>hf+57W-D#R;Z0J z6#XiT-q6$`Ub6JvfBO<Y1~-nu^}HPR{-?Ik_Phq<dSQ~c%MHR1{15V$eH7?8hyVv` zzx;%Y5o}L^M>rUY5Ul|4{74l*ge1FT_@a+52IeF6$Z3%gXyKK+u@WiMl_6XyN;Y9T zM#uvkgPOWn3Kq{L*2LC#xy*+q=9R}YwSA7nTO-WI-c$ZA(`tzrGWn>khkY0fDv`sv z<k<GqmIn_$=Zl5bM}qkyNAEE2T<{uw;=5k+9VsJAR{5CQls=z192*(jFmpTfb`qH1 z5Nv-;ctI%EQ=J>*mc<ay?Lw70H?iL%8l^j2tw&5*n=Cn^^?C>wI8vTL`OX%!oZ<$P zn!M2q$S6UY-AD-m=q2H1^6&mB7cPTvh392^BZTW9;R<#VuE6Y$if|=@*z38oZ=z~U z8?{WZMol*JC@6}Atmtl|PQ1*hg)C{g7a~bQs;U1Ho|j1g<i5Elo%V?-Lu?7U`=f_{ z{EKHQ(A~_;!vx)ZTO5Bu|Ky#(QC|g}OR<nB%wU!`23b}KY33J-UC>K%1k_Fg!yEcU z^n=)#h+KFM@6tSmI+PKQ@<0iQ2z<L_4v2LSE~;dFdOUfyJgczi(t^C8<W@!Oj+b2O zD=^Bov*@C|y%j?=2r09OM-fuC@>eudPV&$G-g_T$`CB1>%ky%R9r6!6|IA;l`%t3y z&4qRglwB)F{jfc;=Y>T}QyaH0>?w=2DbN;GjSW|SCcBb2BDCz$u<;PD4(h*gKB5EW zj0R}C04`jdk}<)dy_gSd&B~z7M<R{se$)kMFoNgCh|s7mKxc<LflC0^a&+)Djj;3B zVuhTNLRo4Qiu)JNjBa1pICmmuVV?|Tm_vCrUKe&pLm^{7`_LD7J3e>oz?N{J)7NzP zskw&^Z#nae=hz(&-ZA^t{3r{##nn~j{op$Xsd>0jiHm`e2X1-n&dDvNYlrt8e&G<+ zhlDJAU0B-(+yl5s2zx`I6siWJgwBVcz*B=#d(ls&Ucrwfl0zRrEl2!Lp(J6YX+a=w zlu%gPFW|-$9XuU%pb|ulk_PgDxETb2XvUPWY(jB?cCug#fdHBLsaZ<=@lqYt5(^2= z2O#?i&BJHRuFwMG6$jcvf~bN5{YnM^x$)AXlYJ6OBrMWvF57FOIgtwW&ZncHh1v1r z<``0ek58U@cJk5O@k_d4*tO2O4;`CLbk1bA-!b1~N&CD_>G+Os9a`9S?8|q8z6CNH zhs<)28Q+OTfVm31t1VeHQXko)%FJUVnRzTEGc3W<t9?RHLo>*Mqma3Y2~shFaTGEm zkTFomn9nO<b!QD^Mq?xmR6UevT!*1R4iOcCV0-igIGK~nDu+p&HOQGzJ4dY%LnkmM zh*BvdC|6)F%BLbR1Yx{?gG5Q)kqTP^Gzroww$O9!#N@N5PX0$K5L{*c+BLHaEgh-( ze5fm77GJZuUz4nfMCdrXaqHbf!^gHQ9QxLdc)H2!OIvcE*?vcMCK;bSc1ZaViMF!! 
zRFcAEGnkON@S!Gv-h77Q2HJ*2!XgmVMPU_Jkqp&x750m8*`alX9y12CD5nfUBz%rb zl^M`+K-!V&1*!djREzYD=156)cA91);^;!qN+WdaqU1>7NSXv{;uL;usQPfpQQMxR zQ_d<?ol<EMkb0uXP1QZ>-=k!bajE_LAe<Web#7u}@_XMr`7tCHGanzHT?oZe^Sz;z zD!KnhvOsdu3FTW`cdQ*gcI(2S$FCqd)=h2v<)UNYO~`5b5N`#E>z>cGQJyYPlS>d# z&Mz-YrWNJ}f;jXWc-IfROl<(t07HhBOcTU`1vs4Wj;ELgm5L37QbrSLjzCR66U9xi zO;!t4@iQq-{NHM~ATC<MR(6Q&+{_H3@P*xX8o~?Te_`iN_F(Mx^&@2lKii<D@9u5N zX63P4=jWmi@88D`u~Wr#w*fqUoy6k@UC;>A7k;kY0?u_kCI>^podUoW@W+_IEC!(g zRPQQjnHMn)4HyAU#R_O*BCM%Vm)X~-V{Lk)&;WY{gegcNN-lMoJrQi^G3vjF>Q#`3 zCt8NzYV58?{D#G~WMPvohX$JqoW!?o1J61G4qM`s>r`840y*>a(9rbIiP8J_Y+#Kq zm;OQdkiGqdY30}7Qhucz`SSGn6U@gxC&q@R*RGuwkBr>^>#r;CD^FrV2_tH3`ut-T zgwn!c&az(n{O7)?9LF>F!FKFNZeawgdRDkkC_3qxsfdY9>x6Hn`aJb~glZ>^E<iXi zzkFGN4{O$O2QX#k<LlLZKnbD%Py0WucwBUS3KD88oUJm4H%t^%K;tDe(b_wyiH3=O zs%v`5f&g|)JL!r1ffDtlxgVXA#->x8{s?pa+#o>Lq3SABrF<9Ej+<838e|6!&)&Lu z)A-u>_<P%Dm5AgR4mPbHGcoP4&C&ZK&wcf?U6Z?%S0|pbvybNoXLA0i@}~0l*_Q83 z-mUaYF`HIw=@QouiITA!0KMDq9@%A(+;>l$Gl)9G82JnTqB#ow(=sOyEJ%fiN2ME6 z`FToyuTTwX!C9dI%&Wedvj~n$!E)$;Fqs$pg4FVWfd%0~$;?0FwcjZEk4v09{#M9{ z_;)Z-b&-<??-5T=nFfJVH$oIBJW0?YxNbt)^R;2BnSf`quxBfrJS?PqcqL9A+c)kw z)aTm+N}gZ;;KN^QTop>5)!A@(=k&&nyKj%O0d6<J7b5MD9zloVd7+pgodF85m?wpC z;RTgnN-b@fJcQ7~Ri@CHEEwaYKd1y3jl8XB1creXz=yC#MQosMhkdi9w8|vnPm|s7 zw?aaGOF3aH)^Y;#+71*>ZD~J(F}o^Jd%R>*<%j;G?1x$vU2RkxL3)6R88i-|Z7o#T zN@vkaz}>UiJ+z+?s#a-{AS6O^;L5CI5P^TH%(ie`)Fp%??p`K^{Lzp<VD>aNcX`dC zwPrc9$d&0wnx3!vGwBgj1R^>E3$5<*Ft&_lWbM`2NNSFLoy&g+TnRgHZ*LHuLvIlB zFN~#RJ0!0k!5W=pu?YuI$fWqGw59s+Rw*XdHX*9TzD%OqE}&LV#n4qKMpO5#Uqgb% zu5~w8OgGtZOOhyPk*8sSlRSiy!VnLmhEPYq(w?F)M1N06<G1UQMBgoI0GXF-crs>P ztZvC*ybS<eo17iz0Vz5fvU3<oLj&bx^#eR<Ku}8I)FnRuitbD;RA1)hR7es!1bV9I zlx>72rwj)&QIm9V^VWl+F_ZHLGZ~|FaDK}{(G<=4*)z@ES^U0r%NNC(Sk@oN#9$&0 zZr+A3XLG*O%%uD}7nHQ76s&-hHfp8j9x;>+Vl0g8g4AV_B!B*mH!gkW{M#%47T@Qg z!ETrnVgzx90lX+_;h?~`D$t$^n&>Bh44NZ!IVjXc>R>K6$FNYa<!ni>>Ns2q#sIOZ zldv*!LnlgIE=#H1DERT0+T)zOsvV(ONifxz0}#FJBFMCbmSIEtO@IWIB$ub5g{!@t 
zIx7S=UD|>|W6@3`QCsSx(QMc9sr@~v>4m%Br?Tyf*Lw|)X}fu6gRW=u=J??B?kB|z zwxi?*upLEteb0gT;9Pc0^K_+jd+=6eV$OhkIb+@P?M-d@{OEmCvDH%i;^00LV&GQs zL8r&4md!UKV<|ELh>VoDb&`<QrbvI~{!!?&L1%1+c~J)!Sm7EPrYrQQ2uU-Kzs!Ru zc#T6Z8Rng2BT(Xk?5FAs0HZOYm#NM5kcsHD7P`q05E@XOL`-8I0bLVRU6v1&6Pg^A z{t8`xk`^3b7z#DlW&INlE!Eu5C!CEmm^!#&bk=AxgxaVE*6F1x>d}#s!L0l;w6PzH z$Yv*XqG|L*psAs!bqIh{W1N`7X5-@SbI(@>pCgXRL*o5a-I<*xZ>U~VsU4cHdB$Ke z+Sydxufk8Oqg!8D<ubD3GHNkh7G5e2KOcZ>!_;^8k=JslIWWn&P@gJcz8tz&G?3w5 zM)TChmw^LTUUcx;4+okA#V}8-!m!>%7KWLJl1H_@G^OeYL%=A1)$w^@4^<8@M@oR2 z5c&S1rp^PIqttjA-W2MG%1p+sAz*&x9;aU+niD3S%nzNE?BVHu8O_^P+ZWa<&&@4J zmfn;{d<#Q=%Nb`3ljrwM?v|AIa7!^jC<VIVx3MRkbEg@BGb*SkT+$ps3W^zMS9vij zI((dOSZ;+SZ6%bg&<bd+ipaxm8ICtX09&Sdug2rjh7Fp%FGdN3*dL|nznV|oSEH|2 z00AEX?r9eXA&;s7$(5aiJc{z^=oUmj9FSnf$_=>#4vR*lF)Jh53O3p_UEm(7oKqtF z^(-e#E5mn}5F211z7(BF`IGqE)ED;bEA6-~H-5NGnzo!amycPVTs!%oWXUDMVnO9b zQ`R!<G%xd|Ei9wY-97%BCqW>&1nY%_%ehn~cV8#`)n&PRH|0=aH4hDUlVHObax2v} zn7YAz(8pIn6o%ARY6`S!@Jz8JDk7B310_fKJoXV6g{71pjt94USz^D~@2H2sk=~QQ zvk>@t{Mnv@t2fm(@KNWB8Ge|$sNfw&!o@8>7HHUhl~pE8K?3(oSI+V?SV7#CNpZ|X zC3n4;&6a_6s3#ItpgwqM*3W)nWHtY092&bU`Jd?TG5(Fw_|6qe0M&Mz(`_5OXZVVR zfZ^5G_uAZM&$3#;I^p|jTu`PdMF?LE(;0NDXV6WzS?I=4SIOh6CH{$iwYM**XMo+L z5Fv2`5`Sc)+PLmtiWQK|A!1mI11N_JC@L)W^L9TX2GF|Enm1ZlN@tJ-9A&ZA<0F@Z zA|lZpYmobdyYuu0jIWMpw-@!*1L`KuZb}sR(l<xhwO<^!BYjR=gJ`X02;6n`YJ%kE z2Ov6DRtbJ?G`(-^>A$R;uq_|d|IBD)KXVu_s}#i86iL5#I&;P9!j@|$ySVhJ203=& z=eo_nY+WbJ3oonO{~T@CMke<r4l>}7_hKJVF-1O8HVRu4C+7zA$PAoYzK_=;bNrxK z;WK=%$7ZC=n=LZ!%WO`=;vgC;3|YvzmBv)tyA>ZzZib6*ZHxDC$zpP<u@7oKZYwtQ z6R;5*59Rd}<OR*d-Q19sM{ly)`i;)!me#0!{Was8aUilZgU?b~IR^?>ei1;YhMFK! 
z&?JBnnxzH>wK38hT9D~NyNs%YDlvK^zKQsaI>NmD%Q55biwzGz&4@mLP&+EfKm(46 zsE_UOW+S2pxydK*Gt1bN+Qm43AU+%c_JCDYYfo%H^~{k2MgM48wCO6u56sXQXYNJi zlU=D`z$boSa`oN1FqAtyJF&A#gIG^jS#?})Yv?@aU^VM>Bb|CUh=pIi`1X+no7K*A z6<<s2E$8O?@@Do07S)M=VW=(4n9eV-LjV$wvTak(mR_32Y$fs5`?as&tS!QJ)>_7q zfGvk`By%-{Bl)!IQ|EE^MWju@IO3?=>EMZyvw-!8H{cE4pVFbYFqyO0BS4>o4z#s% z9cV4<z#V*Y*a#gkmXWO6paZkefsSNh)>7C&c{W#xxV!=H&RXPap$<T!$amsT!3edm z*N59~qYF1=Eo{veW^Kjx_4A|@8?0FVR=C#o)Ijd~=}&K&qwn-u<upJf*vB)5yW2ao z{#w$EnmK61P1(Yyt&7$z+xR~=X_Kcm;~&9t&Fiy;8<2GwZ^S<;Cn(Hn1X&MA>Ih5F zM$3uICBWrJe~H>$N(=(M`@bRvsi-4}#b2zss0ab~a@AqWNJFa4)Om2Zy&WJ8Q9BP! zLvG6?IuE!0mP#7(=Gx9B1Ps#}49ZGP*m@NeQhunpTDt;dBK=C>B1UEeob_NEr5e=4 z1)XjlI#msZiO&ibE_T<Z)$Upg*^wQp9r2J}<w%_-piZ3fJW4NaqvqSDmMC-y%{Mul z<GPe7>(VYB(cKAMy3-=hLpyv4=u$tlrr$y!+=3%n+NdTR8@aNTwx|Z?UVL}{PH2+9 zCAyj;ru(h32R(IpBb0>Dw*r>S8%(v<p>@}MYI+kiMv!+<R}@PqG}e;O-#oO1)M$7U zbSk`Y>`wY+;x+_6@lWAj$q*57`WmE)M1~1IL#PW?(=Tn<rQuwGXM)jbzSQbmqCOZq zbW>mBs@*+akzTHjy3m!{ea+o=vOqX{>1ujp^!wP;$~&15dV;HaqlPT|#}z%n^9#$> z@sbXqBvyNC$s$1wG*?llb90f-P?h+|l{<--=p^Y9;Rb#t?FW8u4KYvA0C35sI)B=u z0!%I6mWXA4R4XV&%IcBhyg{HuqFMk*{NUCF5pk8-LTM;!HwAsXXy%jjW@4)}^SKkk zf755H_-;iaI7JU^F_>_|3SI<#tJN<)K<UKHQOV|M!mJRKAczP7)L!UWVLwgD7i?KY z<w044^c8PoKHM5i6BsQGBp{Jcxm~aelyh}pN)m-GKf~Cr+IJx!w_c}h?$A7SWLL<; z_PqC=DaJ}eyYJhdSkvb#>^nJE7jX=&>v9d>cI+$9>;qfk_FK0{JVkNGL^gH^Fu|85 z&i`_IaBgD#)Ex<%KQ*4d?JJp&#qsS?>-c1BXL{(&{cBT9>xhk#4@fB)>_k3@Y3ol3 zKQ2eTlPc_Ul2=71LkN<3>828W8tpJCuIZ%MYYhQ22$Dd1D~aZBW3n(*20v3_1;qfW zj!kin7*_-C=(RW$KugWn(xI4b5dO86f=Dp7^(x+bJ%BIjt7|Q?5!W?}JelYe<?Hdw zdLsS6-2JuIMN=>&kbsRG5k5(Ve$qrKG7=DZ4Kd>NK<e5uGCU2}j+jdYnqsP#&zVy7 zouGp$0@^MW;&GSCuyK`xky)E*O2UQxUFjtR`E7duKn-mnqCbm0U^KPX>##bA<8XI} z8ZP3>$3{Ni%)FxX6JL1p+<>LlZD4myO#Mi_9q`n?P3?6i><Z8sC9T6`oBZ8tpL=7d zZzkan_l#p*E#c0O0C&#Dd%SJzcf|+Jf3o#PeqI;C($A&G(Ifthps0*Z1#gN*;j!KW z2}n`TX$<sTI2?|m!}2-|{ZsBSMz)n4_6=inJU2|z9-q0IG1MT?Wk&MM3R~j^RK-IC z^c=upbtj7hoad5|H$~q7otrO!a~n`sP|U&z!&TGaT#J`D%_}ebxdmSsKu|2lsZtJq 
zJw*N+IA}YtVOU*V-T*X4;TttD#mPAs(d8R}(48CR!JQv2T;2eN7?eA%wge~)ey6LS zH(`ruh#qJ9e_R+OJ$|KyL7_g+=T2PV_&>5FaLk)Gw1)mm3x&QAaHJZ)z^<kX1oWj7 zU)}Tj6IWR~1p62RuMhF9EwEsDF89t9nMkUG=x)20%BU@LXA?N{1Yx*^1hx&ST;M3J z((*##CVXjtF7cXFX-=hq*$|gM^}m^W8|b#mJ5TsN_sjB?EbCjAWm%SGSy2>Owq#jR z9LJ7v9LE@AT;s`vkfv#xrlBFFX-JvCo5^M~ESshoU>Y)%VL42O;mxufNihtEp%mVh z>F{RR4u?Y+4lmP}VHk#4Ubf3&S)lgS`};rl%9is%X=nDFohh+C?0cX4eEdIu|KA^f z@uf}l7heJk7c1UQ_8WOoqOmJPUfd>9?Tt^QUEyoQu0Scncd^nt@jGiX<(=Q9FLvI9 z-}&xNdimX3@H;yV&bGO<5%o)_=Mu)NosQCt_~Q*Y?04X&2mmqXYjd8{I=*#)HpX{d z#apRDrnKs2?4q5n;u^B#%Ihb-$v2DrOZbZ|g`&$@yn)cf=hof0?W^>6Ao(1i?Q%(+ z*5I!-<pK!>z$vRq$~aft2-qS>=2g)fokRqYzc1rvT_hIb#g>=j73vgU)=xxpBM$lA zuZ-R+I}r@}zLx&RhJ%MkcTNvXXT#}&BXB4@FqULT506-6Hud!NcWkL4u(A6cTAe8V zyWzX9;(?BPM{pSjF5{8H3*Sk{7IVUOk+_P*??Je<-hJWb-qne3^@Y2xp1CEo;lX|1 zX;!ZpyL$Z52la<*cwUz=4h?Nw`1B%PUCQ3;TfjL$eO}*}bnLYmW;ZybsL;K9E%XAh z4#4CuPjj(84fwx7^A{&^L#Rg6RY$nwVvfJ-xvp+R@JXPepdsNp?2yLA07%h8y1_)N zz>_DcNQk9#UF%z+mZ76LKSX-2i@z#&agbmavj3%7{AMG<#U<?;Y!!q}5RpQt%0)co zaNuFjAz;$J>1s07TAfb>Bk{gL5=Eu19&DnO5IsTk=O|7Uiqp>Gb!ZC!UGH!0AP{>= zrTCzHB`QGVxlzRMo0V)KBZJ*$)Y;V%0ycF>+GPv3QdTWJ*zL=2_6FA{yMH;d=j%N? 
zX8}RJ0x#hF)D4@go4@<T3%O(vdRV|9x7M)3VwKP2$FTN8!ilAbK&h<{q9aAtJC6bs zu+p$6ptDolT(8<=>(j;P1*7pFr-_>DH>~ktEKvqg)lv{JEaHSpBcxRXnO{=?K0j7~ zyZEhSN}qFkc|d^kCQt?zvBz#Y(O2;6mu9oZi^m}*yQ{{H7sV@j?a76^rRrqXI9pLa z^13fnJ@v~v=WxmS%pF=z%{V{x2Q5y8{RtbdI^>uuyO59`Q0%b1?xVi5-$Zkox$&7> zkDlo8XURtbl&f^jA0SJDJ|Tj(A~#5%8?;d+X*GdmPp3<NOz_}u0vJ`^di%YYI82Mt zDLPN&gh0E(_=#H?{VO4AQANknm%Mg%kkN;i`M9rBT7DrgQgq@<b`jus0%c$3Z?1SL z@n@>KEfg%MYB=F`oo#e@Y&YH3=x{eY+~#z~@SQucO;yu&-|x5_?oF<+tFZ+iTKI=H z&xFJ6-tKqzEW3uqUhlQubemESI5%(x54W{{(d&pwHLA)o{S%IK>UTw2S%dyD{=I$N z)}SnYm+bl5RV?x6sfh+F|6Tr%@!|i|lN6s?{KarvT))fJC^yq(Yz-FKqFHdQ=8OO8 zN|z$LZl{nI&&!8lG&cezU9|EPS`3QZ%3T;MR4s^(Q2If|J@mQ6Nh7EN2hcKBbWk}u zFqGAb5Q*a$>fsK9kk?{}O3*9<*gz!AQ6w#E{u9ry)f?A{UpM|0-(gApRd)ZciY%qS z!XDthSX58Sw*i|tyvVH%n-P&B&TYVf+|)OfipV{g@><9NYSBlANIlu3LOqhKb0`fJ zVQjXU#wL*7g$r|8!{E<ynjan}M+7{Eam(mH2h9*5%7mc+yoGX%vmwN+CJH=-&^YXT zc<sd@w2?&OmS9@}CvV`p0z@LHtSwQ^WLfAWKpdW%D1H#U15X#n84*p0$<Kw}TRyL! zAcRqBJ3vkX$z*zorjl?FT4nb1LyMgCugcEOo)zoX%XRYLwLSf*M;{%S?P`)k*^te% z#^P<cqmY~I3E5;etAF?dQAj792H%W>rBd@FcL6D`WL=h^xbxA>hQnXGZvMyNX*Cu~ z=fYP{p1zgz?Fi<39`ijW{EZoz%NX<BK-nI{UN4K>))$e-;>ZQF;XQc)30eVzF}PeH zPClW4JP;=W>byW@Mg;L}bCgi5NF;L96VZV=@8yY#DnkS1CmcLu?m$U8WhP{VYf&|g zKjxG`!8S~CqEPH|mYPV4J6t6<d0;ZB-Ug&2s0?}tvb+GWXF2Eu;ts12qg9|8HSl1f zll-$TCFi(!!l|wDWtU5O@2ep#khAzghYKgZ{^+Brv2?_45uL3;8$R{794@5RCjvI{ z;`MUrGa;JiR1R?!`{1MD<YfOh@>wST$ejMjH8ZCU?b1KFc*zVs=n>x&cPsCq?(WY~ zw*k#Z$p}K{_EE9#P#@*X8j$ej{WIuucLddSAQMX)AaZoL?pg$q6YPjEphF>4tCzdD zo7Dw4D76!yjHJOXURF5m0`=7>H!1Y%O-^ioqFW3RbI@v@r=*sY&4`hrbq0WWg;F1# zj0~pQ?ZrI9-=Qoom^0#F&LE=~b%|BJOMJWdF<};XLr~(oRA1dXeZMbk(<Xb4S$$#8 zuAT{hi(i}0-C=P=oo-LkA{|=ekd>*&c0Z$sJcv@)#{jpFp)IlK3%_@q-KhUl*e8nC z>Zh_R!rmTOA*wO0$&SJy)XIr(ikp>pfx&qccmmRXMhDa)VmNKQa|TUfFrKAJw5KVp zOCQ)}+RsLN24HW53sqCll_~e|?b<^fGX!o93e+(J9AdyXn@QesfHUs|!uTDNUj+>v zsO-PA64oE?;*C;|46`$a(B#IHQYMqpgz}ccM!Jk#ggQhJFvUbG%E?J!dQjpQuQ8ca zI(NzE+x87iPXrxdhodWRw`|@sFrM~W#I4DFqQ#q@3Cwn+U7ENpakEtKNlE?eg(JJN 
zQAZ#dpNi+U03xWrp4Or;^~9)#tij=AOn-Xu1JWac7-T8z(R$!C{zNE-c)MAewNBb$ zJR3|?378js=;X@93HGL<!r>bo6a_%RdfSD?uonky%rzM*1O)8~z9SRZhQ+3um>TbS zQ-?TMxgkXKO{FHnZU`uO<Hj6e!2ICaE-4gH_0m)V-lbB?rOzM<OH*MJK{&wc0fB%o zh>!va_1rH9qPb|qPJ51twCMtg)Ie$~iR8F58CBlv_qRuVfp-0nvAY6|`Mmz>v0GBv zcuTh5^PPKWuO-Fz*}F;r{h_`r;E;kQj<gdXZ6JyO89A}NjNpAUVurlRo638s^0$dk z6mJ)WCXvRWBu5x2!jeMGs0ke)G~!WP!)<s}C>g#fsKlrat>Qb8@56C}fkmk#3hpgM zn<&A=wnkH~RKLS9@S_2jYd9InS){#rP0_Ue8)Kave$A=b#d`VKLBH>%LPQR?ARE*w zhx>ujQNrNwY6xuddv8evg767Ym&SE<P<)fUqG;Fw-x10P%tI=1eoU-5Ovz#(OYi{F z)191apAZ6|k|#zl{03isFqbuFOh8)M#m|%rYPT#<xim+l^W`Syw^9IQm7uku1rFjR zCZMWe{dunks+N`%f}Bc%IB8ejT-bac6MbkJeBUWsEb0t}VPF)(vOGOL!TKW}TVtrl z4i0P~>d@k%7}s18{h7Qy5oo^cwxN8$Hs&&)RS?s7)*j(Fc$Vl{(cBYQXL?rPkR)TD zgA*N_AGJU*PtqJy*1}6hTTtOlRhHQAU|=viRhNY}{*-Ve%WeGGiME*3A64Vl8}19r zGD`xfKcE~?!eGM`#E^I|xlmw&DKPSa*w4JeA0|W;TYBliNz!s7WO0hv?r6%PxMImE zZ0$r~*(@wgl9W1m5G+W!P&}L7cg&mzpkbEgh-5$tdMcm<QB&JXy}SjQ6&q)trCGt2 zmmQ{SZOl_!nt)vNT}K8)ZB((qHv(8xS1bxwz-fC{(7f~@cV1q>h3g6mtV6gGRVHeq z5J2aAsjDDH%OHX!xryM52XoU6jZ_K|YTQAsusAm@m@{dv+=eaIO~x1`CL*$QLj2x# zw87#Q0*bq{Cc!EWQj&RnSzo~>j-_{tYQ(WUoxCw<&v&GIh7M2Y^J>^*b+dGCV6-K8 zM1E7n))D2X#TPkp_@Eg3%3;gZy|a^1S#)~XzNklkEH(BPs_mFy7vJQ**fr?HP;Mt( zB%tXy7>PM^2d1MWfx3qkC)1RfBpdpFnRE2pJ2vt!{uO-^<X_mk;=S-shlQ(Q?_f^K zQo{hz!W(7}2+E30X_B132UXI}q4Vv4UGf^vEQgiXVctf_LCpfKR`PJbB$Z<>n>&bm z)b?_TPYVx`50?F2c$kZ&oZMOUyYuV=zc*8L0z(~+Mu}PfIyyBF7)-c=Hoq<0Z}A+L z8)QwjqEVr3(XDLsAE=iP_aqk`GO0P#Zjpm}c4~7h9B6g;zwdX1y4YGZtidmq(JpCm zB2#JoGm{fCB5z>{<tKt0tY^p~(K!L7YSD#l0jUBo)tsma4m`5g*dY-1XvjbeAHWs3 z`|{mcAKq#&t5(@&6OUSA7QZs*N_ZMRQDvJIoz;^;k@y|<$p^$^%A1G}gizN3s*0kV zrEV&?86yGiyMWCpSa2htjZxr_(^l8$;d(YmDv4Jd(5*S3!Hr1;mlhuhss-Id6(}CF z|Fi7yiVxYN;k5oC4JpLLqqamOrQ8{?za=}Ys>oUbX{X2biP|hn4DbfjpfRCedV%?s zEHDG>QKw3P9HZu}jNC73$LgqdfT4#8wZ=2}mC70cas`d5^IfzT>X4~1Ncy<;r=GV& zSp-q}WPW7d4r~8t%eJ)i!j}Hw6I#@^Igy+UqVf&*9+qBYAtjG{4??QZy^Z^UW5yf> zzq58>PU1rPk~`Y~E;a7VA=7neew~Ar1{36Ay0g|l8rqgt^2_hcguL_`3n-(46CJt< 
z0}e-*gr&?arXW2pmZ<T<oNC-v<pf76WFu8jz&em|*G7_b9MJBEX@G1WZd%^H&Cdqc z)(_v1bdD&aqqDKZW*B$5iM7pz8}Yn5rE&J4l0!8|FML3HUchQ^#GMfZ0QL{Bcrz#~ zwu7OR0Chs#jEYXEhnvAJlgD|YR3jmaX(zA;zqcdWqWN1V#-@ioF?Tw&`w!&YEx%`* zv0ODZu~nVXPS9L$!@W=Pd!q-HME5rCSOier9EF<Zw}<_0t+~BQP*nll)A79IH`jJV zrP7zqZO6V6YqH?}<I^Lam@D0~>-XjHOYVPA`Z;?`IW7dKK1U*7%WYU}e%!U0Tm5L* zNdaIlZ})1Bp=jmB8|G4Q_tr{@K?!DP(doo86jh@b&JCz{S#=8BX5%B}iPsSNeu3bc z&lipM;O?zs+cu3)O%EpHyEQptk0@Gpa&|*ubzxm$Y<PWkHn%n8G5o{on44(KJ{mJ( zHMCPeH&hIOwBv$lJrDu49pa+FcmfpOkYjBwBDjxBD0?Q-!=jz{6GbJO3KkDEI^w|$ z3z^ikyd+tmF%i||qptE-u(#4tQQI9)4o**P8s9dybvGQx;I`asc6j|*VPH*u&FJ-0 zw8n3UKbOXpq@W51VJFfWBd!3E%^=)xei4D0V;X$`Nnf%cJb)_DrXxRni&udvT{0@o zOR59Kp2BlazdQ5O59xEIDB=t0xe^O~S`t(Xz7ysIhCYhC(wreitQZy0jFhP2WjU}m zPO;Ax?iTk+^Dl2@@+=dniFfVz{tuxn()dufR2?6?fR9wF3y_R4B3@J-<M)61k48l* zq;V0&t!mq+JbKag>C^At_|p&QyOJtkZ1@^UD`tExwS6kq4z(yyp*A08j0Tw#U)5D{ zA40?W%UkvNSzR}KjT2MwFl3^NDrn=Z81sm59G^-NjsqVf4G@LhSViN2N>P?&QPK4` zH_}r}H_FfIb>htn_u$#z#x)14*CaWMR(a$AS07&ifmqEoRgy9M(7SO(mK9yqAxZ_z z?LFenxaMv-EACYe2o2~4PyBeY1-cy9B>EW}hH3{<8NV#4mPQ&X1#*CH#6tOCzN3;B z{~=w@PE7ofr{&<*{a*>X9~kRU4jeiYQYW_U&#R%w_8UIP4mpNc$vE_sfB-`tjWfsr zU_FyZ5_gYH{$b}`8|B#H<H0@GPJJVI|F?|uGc8XFPboh`1T;;&VI*g)v>OrIq;c`X zV*V&mg_ekr2S$f|<OULt*VU2X(3kZ+C5pi|pH~#Ym!dosz4gdik1etPSPld;n1|CC z+vD<a!3s2ak@1Ks>ip=i;s=A<K+3D$(6DF_K>IWT<Gl}HE+{GS4m}qH#SHt;p|Fhh z=&7iO%I|R9E%FZbkn%XjpAm?}1HoKOEgwH*G`0ZV@qsJE=1VkzW{IFoGYnY?7zoK6 z5<q!ovpEMd>icn@=5|X=IV-3Azlb|EZ%CGKuZQLB>>b>z0qlA-SJiuA4eO!*jC=7l z6bQ+t(H-1I+#=1$z-HqoZ7!Q*)qC>bXrppg*8@$K#>6lD=Chp8L&ALJP98@n!~WrV zo*N(<%n`0}VD6L5BC8*^4_^gHBqz!I`LNH^8FmF?ddgvA(>5EQpVRtB;?c@v?YslO zPV<f*HS=DND_&;aXY{QB*6e0~8Vu<lwR%>BJVeQdxfs_6*|U}H#=eGr#+GiR3t}!B zP<*1%E~5QhqE3885wkAS&Ji&uem?8II_{>$#<P<S`9(i-X|7fJKYlbLzY_3h?!ZI1 z$5wruJzY5<*l4~5-lEM;_dvW4$J?TjaYiz|gfL`vQ^K$q1T#yWr%q9C3#C8y#;7RA ztB<!hR=r|Ljm2#HaXf(S7be&)X#lSjp+#U=PLjSthPDhI1QGmFoiV|YIw|iTXDugB zoH!{B{7+r|<P&;+aRh(gkG~h(kvjTX>i?hkd{|Gj(aIyJj0DdNj~fFVa2gksIQbbX 
zdbu7owKr;{A)8=7mNc%t;cHcG20lC7kU(HM7j!j+JE!*Tn+i0!Jl5=}{$9_RWyrSg zhkMl_c?0e{s*keq$}_lck6_vhs7S_%Fv6g?D_gV~_+2ezGoW@J3*x$*%`kfNWFj1J zg|efgS-<A+1t<0L12@|TEMq;{Q5i^N1}tley->Lgw&5sjL)x#{1=v>9gGJ^}ylf%Y z1ePg8r}4;TE03sJsU$?<)ej08ZDgcEq|h<Uf>;@@XcoPw>vp)bV;+Cdb=c0VdaA|i z2)qr$_03R|$GPnO>x42yAVnX@Y=@7)H#hDJ8CJCrsgQk$EJUQr>E)tv4A2tYE9!+D zi03#=oPo2~dd%;kJ8O5+egDC3-WK<r)(6<zm8S*2u$kllg|g7RnI7DkDXW*e4};T& z6O4&5Ee@2)Ax1#M5J1LQGh3wqp9${57-PYFkZW$128;!KuD~DKT>n1gv4xI1TzBa+ z{)S@#jPwqV{b<nP2^!~fv?8$AD_bBdaCTrd8+9w@4_2&CmsG=dqH_w^7iM39%NL;) zU@tmc`U$(|kzhwK<OsmhLM%`27iW~+IE#G({+fb3QS&U?s<v=7Tw}~QKbZ<CvwE&B zTEVPLeg}kYe23%qKfR+d?|S$vd++rqyQg>Um~A?I>Yk&<*k6F0y&H9JR<gml9SrN= zYl<tFzA8trkhp?HVO*bg1cB|dXaXBzxATQ+d6|5^S~VsL&%y$Ut2SQ0JjG87#37wg z$QbB9oEv^j@mz*aLj{enBIX>1$&KSvvY;-z++a5Xtt9QsR-3~x5jm3sH~-b%KC?`e z?N9giZOP^Se(%Y>H*FeI&aX?S1_x6;Ys<%X?;0E3Z9Ml;eU})3pXG!P3j4JL{vqz> zK=i|{BbiUW4U`H1F`$VveK-`A^ofDl@6EAzXz%7-p`LIksK0-Vojx(wzIos9c7HR1 zRro*jX_b$oPwP%x9UP&JU?SmJP1FXS6MT(8cfhOc#4M0+4UZT4*a*Fn#|x!New>GT zDT56iW58nek^1kFxIqkbEu37U5nJq$Y74fy+v^9@W@l7a7&uoYD}Rz46z%6#r!5Gm zOP}g|ky%{UR*}~O{qOCEPl@B=fb={%g}@IN;d@ic7?Z{=7&l=qt20JyA_@suD>enZ zZ`dBxKTmXm*uZ6YKV+MEJrtrftPizr>YW3j;T=Q1W9!3d$L8-ywwbT^w~tRo6FUk+ z*%rqLU&EcUUA#k?6MR@u7<XV_fqB>+&y$Im_vZ<`d>%5$()jj2*eu?2PpA!=_D^qV zUW@$JeXd4(!#(%d8Y4eAZFjr0d+*gE2<VDJ7&)P>%0aN=^$E9gO^ZGX<U`e`%kGvA ziaw!4f#Pwk?G%qg+Zb$)6h94RDMeCDLZrHwa+ESSCM%6&LPRMk924ppmvKIcgX(Tt zp%OT3rK=ohMc5iMQ;&L)b5hevvqKEoVS@~{d`p}ZXao~xgTNi;pbRNs*<L51rJM)M zQhqvRX{NGNek3|_zh}fE4uog>#eU1a+vHJkG`UmEyMHLpOdRPCN7B~y;+s9m=-OQ@ zlvH_UIOUWbFP(SzK~RroPN9lmJ2DoV-rLXO;rLa~J$$|c;=4Sz(@L`5i-K`uW?C|2 z079r@R>~+U#S|E16=AO44W_ViBUgfrB;if46}U|~g@pl>LM=XNJp`t4hIq1q=VF#K zrQH%h2O*ofXVf8wa;ln|-I7$*Tu3?w3ehlQwy>qymYdeo6K#sxR);8BNR^F#2nFOM z&LjV_Qn*JzvsUutDE|T;T_Y32ib;-ONMx$1CV{|TjISROO<d@;86a&}2C3_%!g&fm zBV|fE9E8ZIKYT*b8oj(q7$iklbZUmHATtGGOnmQi&It`JH}wdj4mQ+UzzqfO<6#mM zWRU?vcx*~s5Nipz2)LqvG1OT|sUWmZ6b{a8`u60$yt~^qIh~8;ve}W)<V-TM-ED7T 
z(%FebEFPcii4Gs$9KBzh>QBbeuh<hz<zt13Ku87yB=B^I(eF?k;7q(xAYNIJgLW7z zs5+V{-`SlmuNj`8{cgtQH(<9<rHhdZ#U4<|bMFL2>_nzmst5e%36|{6ZkphU%WisD z-#Q*8sPj`zum>lTBg`@seT~<d)PTlk>m;1lIAx;@22~Rql)orY%l_Y+Zfc3Ng~5rM z7~Z6+Nc!6I*6Dy1dtf{jz$boRyS?pe%KorFM86dt!)HmGXM6_C8hD?i-)W7s$|vr+ zFWzoN^d7{43jVcw{IiZ-!Iq^@nAGCvJgA&m6>PwV1lEbDVpI6Qfl%|(`-ZTgOLF3S z>}}*c`-C5m-$iDZ)i@exzJg}(2%XZ=Nb-8vB=E~zY@iFZ0URXo1H1!5q9C4B&T=SX z7A()@MxNJiB<v0FDWJ)rHdLhhNS<_22^y3T=-aFSHWahSe<hLNY^73RaFLXfBx##S zKatG@$%yPb)m<VTNR@vgB7l@RCgU0}QL#-1H#xFCWI|fO8iYFNsq9hB7jd+NLgCi% zP1`pgk9+c>Jd)1kuieRHj}%x;uQx@`#0J7mq40Qe@^`mf9SEo8M$x9ldNyW9XLf(} zgx`e@aSXNlg7}$|fTxjx-heG-1b$#F^VBaaNVqdNbt%k*$cxZL8FIq#KNx|Aa^h!4 z7M@WOpS>nOxu46W>NV@YXlY!t4grgjQpbUS)k|Yxz`m~{rgX{G0aGGVf?wUdK8ypp zLwx^9bB)b$FTckey7PcWj-&}6{ujN6lRtxT5Ao=J5!AgkbF=&&C0XRp@tfQu7s*9x z?{Q+kZ2GFzxmuimvR*NEz6{*;gN0?zkj&>_CZD7_|HgF}`i-IQm!G6@?-EXn$HWh@ zrm%-$H{yz587@m);uAuQUl+!z85TFcP!_L?suX9$yXzb5$<yK%9nm+s=5Fy`;7X7d z=eZ87;xMjR$NhvlT(d0m8X6iMqeBOm&In+_hARtv<h)ytv9|drzeVKbn{HzNm%eT{ zP|@eN(b$9fC*o!%BX)Kx_g|Fk9IK<;1}?*cXBf6@gksF--d~m8{x>`VsUpxQJZCS7 zO(3FU&plmu<sa^)FA$gPQi{rRB?R<JOtch5P`n5tr@WkFqNEt1NHezQ&IValiUz;= z{M^>h2AQlMX9pHO!hR58qn?%CP=m;9fb@k&iV)YM_&cx?)u08OT66beY3To@H|EpQ z%k-aocJq(7&~<i84~Qoe4}7$=fJkONS!<CjS!-n2go%tCH8d2$3E+Jbp1Yfx;Z>K* z#`7mOYM4+D1R#e;&jk%()xl+22g;y9)Xh61bhWK%TGq_{}{W<`0Wr<L>r0GcD zz*6hXQR;(q9Jl!TVpsV5<-%yXf1fsWji_0IwtV+$cC-1(q-eD?H{{cK#j&M-qJP}3 zg&k7~^cWO-+d_GH*=GvsZ$o|jjCu5K_Ggv9#CoR%xSH_OF_ZB75ai&?ZNxRn5#+H_ ziffhx&d$h}Ny@%?`^00trEBzOz&d<@9j86CbWOxxDOQEPN^}Ttm74hL;x!lJuYM`& z&bp&--hNG~_pyn;ftPw*KY;5c^e$1ZJZp@R$G&j(s~3Z{iJ&+_RAx}T#4zD}h-5w~ z-dHZ?eUeS;G4_W3@$D1;vGCY6G{&WCVvGtsH}+EXn#u)2X=0mG<O|OYK1xG}Kq$s? 
zidV_8gL)YkjOiaAy5{lxe*|SL_Eo>WgPp-RiTmnb-FNg~&UgIfGS~6wC7vVtwXzS` zs5FZ>OoaR9*wj2~=uGpB2O?UOAT>iQ89#+2$-1J>ZM}W>MjL(~gw2{oec8s6(rie7 zLsezXrha1c1Sv2H{wnr?ID_jq;`#-y?ZN$FBw~3tQw03Lmk#;CS?!3duB<epAtw%m z1Bo$YOO0c|?~KiF+&G4j5ZJ&R3e@rOnC&6aqk=OHK|Qu6YWsbW$#yG^EGdoq^?wlS zY!3A;hd-3EVVJTaU1^+Q%HcvlA~I7QXB#!cqaZDfGlKP&fL1M8;lWo&3G#OGYcV@K zZ%dI7cVGOX_efGF?f<M=R3vh3|2>&(cv-dB8>J&QX*?D3JKj<qHdSIRep@<7X<Qfj zAF~)LzxjTwsN#;wdE~Mo-=Xz^5W-SX+z0_wpu^VV6-i(Z_pc06R-V^JU;|^d*kS27 z5k)P@@FA(xj(kWnAw_vr<nh`0Xxh^gJCU6lkHu4t9{<%xr`JA|TG#MYE|tz|TW?sC zqOpj=QSo6$oCAb??l-f25RPPwQI|CG5b&DE(d&6!6ohplj1T!K<5kIkH6wPI@c~mN znm>;YZ5bTgGQ|It=jp|e9^>zgIT(;bY`5|V;z-Dc7^K@kGVqu#w2le%p_pzxMRq7| zjP}MZ69)*ESOj(nEUXUrKZvuYr#x)|InQPTeplG0zlla$G|x--$nrQnf^*e-7*TY3 zN*<sTC_qVv1a#IEvgcWRRI<fQwg^AlZEp&Dp9;i3@X7fgIDDhFu*)AXpGCm4FnSi; z2cZB+EJ9^0g*EwOL8+K=6nZR$67g2<_EA_5Ew|{2;sf>o-G+63;Ah8!ZvF4*Hcj*_ zi}!$?fgCi}i0)Cx^-~?ZZhj97w_oV7EZhpPkexw~MHXJ2!F+*cCmFWe_dz^>S!;sW zJ=;XH#@aD!xKG6<o&;*i4JIXSPZk?Fe$9z%lF5qnDfzzGtmOu>V2Qw)?^HaVL84Z4 z8SI7T!y0%Dg&va3?MQp)07Hbx?~RB#zt0wlH|gP!H|+L!*`Ihk_9q+>zwcatxxQ!4 zEygpA@iYQGNV*?!IXWoBG=kI=p4VW$E8BTrG{E(EmIu#^C}!>5I>d2HJfPob-s;5t zXu)9K$YGqxEm9IZ%1)u(cy2m~oUIVuN;s6jiR41b!eLV}Z^Y-1!jYife|D)(N-jJ$ z9U2X8-nr|R5nD$yIJE|5*JX!u{@9UVE*$0c`3wwI+bdgn-UxijhVx<MgY@T-4{C5E zFP{(6{ql~*zcU<U!*a;$FWMZ)Q~lStUvv2vvs9j#z!rWX@2kA66p@!%S@p{R*r049 zO<ira5#(NUvMAFiEz2~{Pu%MXBGI@b=%Pep5s5~HCmLILqLD3><>O$zq&VUrN6OSK z$}uCT2%{Bp3_N4Y<A{hM8c3Ov57^COkZyd%DB@$$geT%p{Cf%_MlBRXbjBU&13ZW* z_kLawku8kKW0fuP<AN1TqrWEChz%pv<QZS9Ni&X_X+}2xru_5DFW|pfVUZg68<@*< z5;zveculD^<6`7uM$o@Pomv<}_+0$UAPdPq54#f8+}Xmza)0GKkEJ7L`Nd-COL24n z^hoX-arA}P9f!``VPAc@Ksp1BWsm-`tYh5R3&wrG97k3fe(>C%5lnBTgrFwi2g!>i z54lrMjCRD=4uhRdegNU~cyw?Wx?^Ef&Q-R{pCC`^rfd`RtdrY*2+>nxa9p1U>!HTb zR@9`uLW-f(sJ&<0fOIN=QZX|=-rl(|3bw!ebe?S?*1ySud@4~`vm(A`4WdW<C&_iN zalKQSPzRB(+bxiigvf0+>sms@O=w~}=A1?V3lI+ikDBD>;EsjUPz8K60EvP?0yp|f zbwEv)1eoqz-&jFTj%*TQbF@LpaE~Ik8t<(`B$BYi#u=nA!rV|!Er>lq*54yr!uDXx 
zw*K)k<;CPsc659q9vxvmOMO5Cd8;>e?RA*9*KlqIm3O)AIEkG^h({o7zIf*0H_STd z$J5!|mcG8X_MZIa9*pdfOKeAR;ePo*rD)>b1nF}DY*W7_er;i^JW|<#{k|NRpeGGn z!pZnuWAe!1<IQ^}r*=1+xCH%#yhESodHp2U;gJ8;y#B?w1i-JBbT?rV(5aYX68sr` z9-xF+03{KzV3YDtblcv34{Jzl+sjc2w3qkGQ~DX@dG2q*B4Ip|1vdc(G@K3YWy3vM z+C!kdHun&1Wq8==E1A{LY}xYa)~)gsyG6g7?W5~HC_hq}Q|-`SZP>bo(T$x0(--oA zOke2zCChi={g~}Pkm;-bvpvbs)8~i;%U841!SYo_pZ>O}+MK54+rkZWD;enCU>EK+ z{~qLtOZgbKoJZIqlqnEIhtP*daH2_(nW;kx*n_OOaW>FDl?c4e!--5<l}VN;hRd22 zh-#Q#5^_H0EDd%KBAnnGHch<vG70o>ZZemfe3Hu{B=olLz1OSUa?R@1*Yu^9(t2!R zSemcwfef`#POsh^VKsxOR!yWLfJ4;c!Dz7r4-aROXAe>Oh{6uoxHxwHmuhb2k><_Y z-2HD!Mm!@uTR8&^kDqeVd?pO!pn;Wy1p{}VXJg@08h0skUw_E-_2I*l;fW5|^@K2o z1Te87J}*Hg>Cue<;tPe*Zt~J6MXkP>ymaZ=8)|&=2ATYC&;(*kA4or`yb25@9}~w= z8smT!hm|tM$H#Pq_=6Ewz-XxKjyf$%SBM;(T^>0=oB+7WOX384OrJ<EL8lF4Od}?) z6EOh_g<!|P_D7I}kI1_;OT4eJ4Ix^vJOdU-D%=<m6_)T3kz{PAS)z10Lf+Wvcl97z zVGCNU;^dUJuRlGi46km9WJS5bmY8ycHTyWmcLV0=P3dgq1tpB}4D;BRiT8AIludQb ze&hM!VAG|D%hSgEu%$&)!duoZP0-?^u?65;bOYbAisMfai@Oxxaw)cj>|NA-nb?+v zD}q~S4Hk~+<NDhi-_mkLd<#h#6VrkqlA<m=3{Z<Fb{R|yTiB`x^hYZv5C>Re#$=GA zg7bs$Sq<ia<1uQn2sP*fx*Jx}L_rwH1O3rj{DYocgk+F7$326%=Skz9h<5XwC>$ZW zXRRXyEQX`Hm*7N}ZM-414el52X?O_+!hxeO7Pi*j^JwiY(GT<^X(!t%zhL-X#v_tX z#E};QAw0NGwCL(_AOZ&jbHA(lfWT!({A?>6FVDwlPS27)>2T2f*^+;!?-U=cjL9mb zD(oj<DDYU6KSPNN&TkW)5XjU<!fe8pHljylF{nwFqyCtH#Z{>WpZfBbG8tDQo$tx? 
zCJW=z^wgG|B_2-5YAzZ}?!i1B)NT6a$|>Y^q3;mj1aT-#9CCZO>*|NKQw{cl8I|x( zRvbG940m7F1h8T2`N5Qde8$3yhh+_=_TK5656(D{$<f5YwffKZi&Nq)(=+>}83}eA zTNu+*`ZFBAGYlC(@iFXj^0Izo{0{9KikB>6cd%C%wk$2%CD@&sdt<NuR@fcdH!#PR z;db=1OZU78wu9%*;(Yw9axY{ItS=t3Tm<h>gc;zSC1lVSN9@#`8RGb?qIC#Sq$h)D z>Twbx@vgAL<3H|T?z{Aq#|FGkz;)PVKkE0m47?7;b{1n3VN=6rfxip+hcTHy7y?9S zy+P<xgfIDn<MI*W6xE-^MfLCF7CK#wF3e@iXL=AT8v>3UxP6?|Bmf*Bi{RiS0YV&^ zq6}~!0&nc$<DfmTq9e~n%_&e7VHyhbaG3$tokcN`C(dgJxcc+yFK$o7ADX=@)Z|a( z^qqhAhW_6_VmA+YBC%iu+`*OYdQkWUkmMd=qp`P28t%_2+=n#Ogq0+P?yvFt;}{X3 zZ_zX%djvrR%K9}J$a;*!34S3O2P{0w`y0@jhvJFtv0S1Ht&@Xjo&4IHY_S<KTX+p~ zItM%@)EU<)u!#^NA=8M5wu}S{qN6CEAlM&%;5Jn&XW_M>ZG(f`^!L6zIQZp3%&EAb z>1Qi<@ar{Fe6cZIRC(4DS1XDcGPL-05uX4Y6j3s2=>R8$OvXzt-(eb4`a^@;hK9E3 z$=bW&hShtX)OYBgnDOR`-yExSWsgC*T&!6i`TSrrTR5fn=!YwhAzp$=@FE0)syi{w zTu&ph0#RkAy{tS=Wd&1_U@Rq>FGhc1Gatgu1YIbM-nQK_cHQ7?VP^Z(?APs%h+|zi zt=flUJWsgrioUBdidAT$`bzWzqI?4t)q;_lWHK^v0Ar9RMMgVt|L_I@IHdgy0DZsb z-5orAC3h9oE@~s9cmSbS(R{=6DZ*;#WF^T)ii&5bN*-W$cw-;rnT=R0Dmn7mI@9)p zhD6q$>f?I#qq(?kdP6qQGWexCzC3oVXEKpq^McLx!d>cUVR$wg@t}w&4)_7&&!E0b zf0g6A0++{k(P<;^Wf`_he|7fw9OK}wuZMbCFAMHs3(r6gCo9JwKVpV#qe$VEu*3X5 zcnFRg){`7K><)D3$trM|EeQII{(fZwSf|w#<1vAQWfz6p1Q&E{(T<4*=xI^S68aB( z2Bs5UeH0@af?^3YPR~Uawiuy5b_nsfm|uTU-|JyU?5`X^=#R}mt7j|yl|AtBHwXmh zL{<qdGOc4UQ+9K9JU5UH3{}=t{$slOPQz_5;OZp-<<1DSa&Qg0QIVU4n4J+Z=_5!j zFLuPwk7Ksr@AkS<L%!&AB)cW#XiIOJNvjPGIXyl3fZu-Gz3%Dt+s2L`@i-qZiy~62 zuy+P3J1RRa@nr|@NgSJ*Bwtn^*!kzp+c%RhdkdBwf3p|QbfB^sf5UUdaI1hYffNFm zxu{_ltvsL!;ZCUu3U`+z;Bw)9ktaf6JK$4cPs9J|%j=KiPbW{WpS~+`=hcsHeJpg# z*v!{LrwPo@=HJy56-Q;(^sz{v<0e%f3%0YFD>jR2O_b(RWJy`(*M)1GEbm$Ju>yyG za5oT|Z2mpHr{bz?!+j9HLQEREs(P1S@5zA(f60^6-!S7>e+s|RqxaO@hM$+uDz{Z$ zffhpz*cZnuUV>GmM9IZiMNnY?tEj&fO(3S5xXTs{yAWpVUcxFm6LR+c<d6ucqNAZ1 zP{o|;c+uk4!Xl80Y$2?FRM9xDelf-f%V{x1y;#qH>>?e7YK#$%f&PYxsXv3jU=66A z&7Z_I@8{Qqd@yPxmR}QBLbTF2QV2wyT38ndy2PIZTWc=E_0&`)j6G_FrGv}_WU~>u zU%Do@ZeY1enK>`tcF1y@uW%2g<{ou5uk~N*S=^Al6b!i*Uwsnx#9KTopTZ9#zzI3W 
zrYpCJ$$#;>T@GXVh1QKddxeE#pk#iavWtCO8yj9e$7UUhv2Kg8VXWJWv}|~KS~m7T zO)bRyNnG=Oe$C%>-IiZxN9AXgGt$GzfeR9v3<D?qKu_cRd>|rA52xp6TKxk1?48Ox zm08|z3O2ipI12xV$El#`dVciU)25=cQ$bjr4MkXl#E|Mx0LD;`FVl-^jg^91AmQo% zRXUgiBY!9%2NMA$6f^Q$dQ`Vn<f>0%=vp`<giJ<?mKEWXAbxDX`jMtX*wpYzP~E9U zfo|hbUT7J-fS00|FXxlIvR}MbWaN~HGvYAL)so)XSj`V21qEe8c^O(M#OCk;S3MAE zOY$G-Z8Oh?u|NTo<pu+gt?+V!#iS5rp!B<;hT4%9Iu`PTV?LID$-^4w97yhz18%?@ zaE6BAD?C}*%j1H5h$N9TMWp`@j0Ls^{3O^KeAZ!>n}A6)Z6m|kkgIfzyxCkV;*2bu zcvJsd_8t9uufF{9u>Kg=3Cs0d5Os=KaxY2Gfm}DpYb1hIJqO!(NpvIaT5Uw@gnp>% zIpp#wI0pQ$5hF9`?Z}shLpcA-SVM42*3~uWk50E`Z)kCZ)8kNdHiwd)zD6&QrgK5l zZCuiHB;%T;qxxSeQJfXjR}kS`QRF#64WlHzMe6wi_4#73>+zWteN^BEa~M9_Clids zpc`QRx;X5oH~`9iJ$ic0sZ;%@PxqfXwML#kRXBzJ-M8kx`_^z>S(NV7Z&W6QW}r05 z!ZzG)9*)8+d3b@OvE``jlwyTuqZ@msM3MWFLWC#SBd$_?BPmgt0#)_oD-!|`MR63- zUn~bkLpZ&mF%i=qX6ZOfbk!&N6P{p`&0&>abq531c<7eIP=9N<qaj}J53g*F<k}jW z!ic$^lwPgeFTcxWETT78kg*pD>{?;EB)naVVq89Pkua^z9isjn_BO0+9tW?jIfKHZ z0QjZ#W?wkJh#YH8?HsPXR6F<U>*lVQjpVih;=WP&Rh}y>`-l}o%2I$a2CC=Jtva|w zi+_$bIR3d>T;6>JZGiFpZy1}Egk5q$Jfe>C{8~GDLX;(%H?WDo#i1M<0u?Rz<XnqI zvLw%WTP(;K6r-2Pw(+he8jfVSg_;?2+^mN{=cN`{ndcNiYvhrAlwx0!l5^81O9TeH z)=*+@4ZFce)MX>NhG%RIAtvu>knab?J7Fa2_)qp)GUT^^<_KDBEXI3TF}<J%Di2hS zA|KKO#%qd4D5OK+D>w3<Vk9q#jRcy)Om61%!rs*SV9M6(`9^l?tEWL?{?NLlW81{c z4Cx8!b^U$$W%wr)KcY;UVdMO(dnnR}hHmits=5Jo+7#?G!@m05vtN7OYhPdP>|bI> zAzui;L90B-_W^QKA^?gU!AQ9&C*&_upv$~306t9q2h1jSBnk83cauxWU+Iakti{3> zHTG8Zg+Agyr^jg<XcW&?Uy<zpNB!SaoR!~$jgPuuREtscsGh*BLOrl0JirRgXL|Wd zXcSt;tiM`G7w^^CCr!S9-}YB7_6Po_<AIZ2PZO8(ap|MVcIA3uh45c^d>z^yBB9ux zDLZ+L$_Y-}b77~3vZPL<EU5w9NxYF?C&E{(m`E4f7YVaq`V74K-+qrwA3M5eqD6H{ zh<sWMK8H8wEOj0?8ZFbm@KMjT2tmAELI0NV)l!Fc^q48*V<*^!S_;Bn2z3^>C)mQ9 zhm+03F8RcW6;P{1;y3`)0q;LU@J`Y+0RN*thpM9d<Zq%Ppsp1f+H^(Y`j{G&RjXPb zUb8CR$-xcTbs?8pXSLSXwY%b;=nC$cw5^i<r&Om}zY}3bwN&b6QL@xq4r7|Akcwl= zq=h*0UfY!ialXRvmhKnIjB-l|{v)ej1miR4V`zS(nR9J@X-sO|m{i2z9Q-|qsPLPt zxQ_2BkR+t=+X!9(0yEcaylXb#_%R-)D>aaM=lrp5utTGx1NzmM+E5H4;9ty0FQ*;q 
zCmvsF;)OooZlu|%YbgcA*GWbe%}pW2b&T^8+9d?`uHdLG=&`U1Bqm;KOWbv5%L-Rp zN7#+@i>KL~rbP$Nm`W2TB0cL4yR2SEL*0rNSD0<Uzilr350*kh-PS9r1=xR}l&vI{ zZSbHn!Zvh2BMID*Ep;I*uh2Q9F6tDgcj5;s9@^5SBwi7YILmVaD>z32VXA}34tOX# zVC)eIgvZpY2o#Q>)HOgbJ~}Y4(|Cx2%3onHqpS}S7+5s(^ElzBFlqQk{MhC#Y>E81 zHIfd<w$R|0BTu>FYnnxyw|{fwFWVzMepzdD3~voT?F#n<9P<5sZSPn-7SeXJk!bX% zzJ|SQEEcnCR&mb;Ul5gK@ac<sf9282!^l>L1#o?ks^KW8NE((L`mvl7&`+Ez3sA-p z_9=qe6_oIR;x~#~784gWUnC$@G#HlI19CdmFuH5sEj!1wh_gr3cN$13ecr@I$$kBi z!*PHAK7WrlCvw;n<j!q+LjOhOXOQbF%$x*90^^b!{BSoA7DkYjCuun-izHnRl)Hq; z!e~&9=+*6VjxO!PrMq!yJ@HGqJ7uem#gP$z$!+k3AIhw9tMZ5Nw-Sh%5qAho@XO+Z zFTn=u?-|%&`NcyM|H;4xGkrjIRYsM&5cAHPh)Rr|JO(#_xTNiBtFVDuFbPL2&N34W zUj@NuO5$M2sSe5@aBJ5GZb8|%L-S{chqED9Ae@+#k38^Abx<D7q6!>6hd2hn1q{GT zix>dXh=~3W1^_S_NLL=A!2nRCxI}>hd`yX<ivU~~Dsc4}lS@YYA_l<dK)i?nAlweD z1^==r&XyVsz-kwd<;bTeSFfJTEyDn`>s|V}%5$ba0UN;dCotug_!IQm7xO1t?#rAy zmA&u2?5R^3{o_-3lR0%Cz2|!7hk94#R*oxAnsL#ZHC~1*uf>#uBZV;KdYoOm3{=h* zmad0A^SPMvEAE+EJo)^eVSK;f<7)uJ6~AK1Vncioer|PM>^OrWNf2s?Zi3Te29AE= zKkKivvOXz(8-V)kf4YWWyYI1yCrw~I&4IpEjst)5AaX&-EgC!o6goGcN6YDqYN(8m z$Ry|8$FnWikp`8~gWob~0k9VhT7YlKZ%^KAUuR*Q7+^>R`~~@++vGW65@(?f)lUYO z5aQ9)x0Le=8Lz7~EJ%)oj1&xhn57+6Jn@}_vqM9(@<+3Sc!6ixq2B>J@+__j?4g=Q zWFw4vJxVS?(6S<WpCZj{uoRWN4yWr$o6yh{Nm7@21LIq6XbN#?7vd`vs6aRkQWg7| z!fM1d;w1-~Xz}V=MUM8O*s(~zw~p!Wi(!i#ZcNAg>)m2F=tzY;YrV;dtkvENlQ+=h zwOE`@;Yg@g*DY!)mbC=d+a1BUW5S_j<0;V|3^m)`k@b!Ck#Ho~!s9H`!ft&WQJn`7 zA8v&tLQM{0+fAGUk5U_Arih|aOO|rM^q$JI!4`_(N4t?QYj>28y{-un5bDjGf_OW$ zbO^w4Ku1arDBz+3zC^glXlfb*@hYm;z0HPL_xLEEL31K=DTLWD&~7|(?p{h4=`R(w ztXY#v=k*`nG&b9pzkY1b?(ysMflVvN#?q_TY$+7<k5Z$fsh;&$9XS5A;o+}t9Njq_ zj<S}k(#FTw9}7Q`-mC0ao<*FW+n2=oiRezbsh&z{iEo!j`0<cI9$P2{2{)|~Zdw9~ zdXCuYK<1c7wJ@z3x%A+F^{S`uQ`fAMgRe%I<%g<2T9BC+ujM!2)0p!<o1)q+*8is3 z{@9{um`xgoT@Q*MRu00>^_cnwI2csHlZA_$IoqVXPmW|d;FSoV{)irCBh2;;i=6mc zRG-&_EFZh&@Wkw7On(uVqw7g8i63&D6HYU67ocYnMT}~$T9V+cmqbqZ@%h)6Ub^`D zz@iRF4*6loP{bN>T}ZP8ogMze1fddO#KfRR=RXTYUCQ5wSi4&uh`PF?9ze#R@85*y 
z>!?isHrLTVXs7E?Apx7;DsPsKDtmxi0N+U+X&pHC?bxQ^EQ9``R}?A&#Z9S>A~8;M zXoB7=DF)w3UkCIZ6kP|LKAbTMR>^hqThr_-uO9m^ub#e*sbl>kpFTc)?3JINIii0! zEpOlb;QEn~4Yr*JgCoP!Bj%jXW6n<rcD{xt2N>6PjeHF=GOXJO^#qIeb-d5+JRH?O z(F-Tve|*o{=n3|H<<yU!im_<+D>MDECx47Jd_(US3$TS;*qxO7f)L}jkQ)y{5M~n% zu3TgSExaKiT$QqIAfsoa7IE~5^HU!!H*+vvQIEn-$SY&;>$*^HU-lCkE@srr`eWo` z`N5OSvD%9SqAM|nY2a)xz(X?Wn2Lc65orO`O5N917^<5hpQuCwc&%iQ;k2&+vkWD| zN-||MQ)Gvu)3Y<a$~f~lHe^rc{q4vu8O8S`0T3Un!MWB^#vPR=`p>qpn5DtuS93|r zfH#)@R&T(Uf~vIJ0u%9rtWmXS&2l-IO{$7Vok*^blF{L?BOeLcP`~bU+9K)06N5}a z`GmNzS9a+~lnLaK{S@Kw&>R4$jzC}8Z|)+zCL{!ApNIdzpn|lzQCVN~XQ*ch#e9uA z0Aw;sF6^(8fTtx2H>w=rMfRd(l=a{N;0B`u9LbBQ$Csxtb)Bx|qyE(RXitP)JACwo z)NoIrMU)5QO^_y$YYSgK)xY+E$uzHd78hQX{rUmr14J;|z-U}<rkw`9cW@Y&y)=F( zR2sk6O)|{OqqnFshrwk=Z&7s!c4J6p3iEnl6QXI>IZJ)81&aas{DkTcG>2MAQ$fuG zYfN?^$c*6=8Rg5JWaM^o`^)Fir_hEbl8h!~xdTP34jk{e-`f!cXNx==(Cl_uw%fHv zMR~|(mnBpsxi<!5!C(yg&yW2F?nCT9r!d2#iYStz*#@$yI@`$NFBVx)zc%2vO>f^I zJ4E|c@XQO;l@%#U;Gdvt%XJn^HrXg<G^w}GPo(GAWBOxLW0QY$^~OIO>t_?ImC>&i z`fr#$`1L35W&01)8AgqStFjk(#H8>CLU{$rO?$OEpDHl1_#vJ%<y&D@x)`nryU%qR zim($Eu?8`r;U%CV#f-*NwN)TH=QA)Rz|y)(c9eTj-=ZcAUZS#cG^putp7S;b!^YVF zYc$+Zu4brlfj<vUckb@w>8#+3&Nyf)(Ca&|A5PnQ?Ll-6?Fl!7Yd-*zx87-Hb*XBi zIjRgkIST5sAB5y431sckUx)2s)xBCQu&<x+)QQHYd=7$`gBZ?42%Ra)8to`!HGpRz z3!YYp_#eT|(o$>0ljUv#l>Ol2L-`$WHO>HSD%lD44S+Qv70xwy{XzOnbCzU-Gn!Ll z8$uWE1_dYJbfOdlzG!cs{#Y)Uh@f?SJd}z6z0_uLxaxaGM>b6>6OI0kcqBDaNad_f zi&ypz&mR67&h#4?e^_}6yFD)4E%2t;b1lZ1XrY?fp%&tQchQ4EHPGWA^Bt=e<+9=m zp3*n!hfE}J$<qn8!8T)kK|s-B9G(^sO2Z&NC)HV9^i06Tf?lP~RceLMGx%1@^-%;P zsEK5$v7K+q9F7wO7_)dljyi$rK|QV%MOq!`rZ6IA(Y$^p66zmr3dLi&aD1e{Cm`DD ze6onMm68YBn?-rc!?C@0L_@L3{;vl5Eud_bL#j*vaa_fs8nMoW-Eu&GO?eu1bq_A< zSwPWFJg&i$Q1-N=<y)rMQYFl0Mf57@UKAFssOYAie?tP8^2^=4uDm+|y#NuCGiD)& zkgw6Wz{r7=>o`q(T?780iC2m@p@m0D=*I3|jcI8CIl_~w=4FvpDRYlgtYN8FK@@BW z)GV3E(;<3@d4+lv)O?Iq1yuY5OKahsY*%tXb6ezGbX;w4X)U2(GGtQ=u?@23srNU@ zY$l?K%FgKm%k>>cLItGI0OI@5wZQ5+k_h(R9TG*=_(ZOEeNq5aANHA1D~UM_!af{B 
zPh@mNZlNuMQV(pIl01M(Yj#lYd3xI#$J;EJfH%TtO7qWHahmh>8i_cn^;T;(4xxT1 zZFH5fr>=683V4?^)TfW=#2ev@7gcAe0U`6E;F{B{k*kU0qj;6G6x)Q)3a+_a-@xcb z`V29gET%S{g)v?4%x}1cK5TYAsYYTelSAOLMljKZ4vG|}FekNEutSW%y&!_eY@MHi zwdR7rCR!B##%icQvQPnvsT9^p`lPcd6f=~-3Mc`IF{{HRvw;;pQBEh;xwGkbwq2HF zIT1}pWI5&v*rN`{I-oXIBGn6>+$hUi4kkmXdusK?Y3K`4bF1!NF~<t4SzqpGz#^LZ zB!FM{IyKwT>R*71fa8EkH@*hhwL&;z<fX~t<qNa|3xoyeG`trEhO|!F)+#kXr~#WJ zo{OYYz*rST)B+9?YNRP;tS$u`9VkjbtC^C>Yjzt6<l{v$gO*Cb2`|d<bWnPanFi&3 zod+B&*xA6wszf;G@1p=3NFEW6g}q4>#C33+{c!|soEpf<7pF-J*ln`rkEN4I#kB@O zM~A4`Z4cGV!BLAdit>zTd@aN_?8SN5GA}D<A-A^+T)Y**U~VubQEbqtyuw^?(QUmz z=9-x@_GQY9li)EB1C_13c9XzzC1x;2K{ZiXQeC0gVA5`JTa3%-0l=E*i2%>cdwIu^ zsD7Kz5pI-Y{ljU9taTHrQwcse$h;P3eB=eOO=j0L*mo-k=Q5lxm;OB0x2VUk;Cz)r z5u6|-wlUChV;~0E#EMX~D4LEMYBNd&(|(HZVu}H6h7`?_Uc|Pfjss!vkdaYED$yjC zNBXF)bK#>cwg#8V#JqP3U78$B85~eqM2`FNQhmU&T@5+p$QvQeRF6%v*K*JI?e$CV zqi<r@(mdaWF~16#Fic#(R6|mBHMJ94k6}0>08m74l2tr*LQj#%_VcGe(<3xj0QcU2 z!Gbcth3NpGr=<ZTik}ec!Q{3EM-_ZjRP5=^^Ie3v0G^0kk^$z+<S8UT&&c26E6$wX z2(C$*dt&;bKmv|(cb)1tKUp|z3kFA|E$Aoc|8~e}w?%Tc7R?h+1R7m#$KZ7c8Kffp z5xb{8qV}R1A{BHc<b)+WB_Hru+@k2Vga*f5^a@E4|Ca3$iy!1YPP;d68T2sz2J<59 zktf-MY79Osd_p5f4EPGq(MT7nWx!;T)u9`id!5K8B0ZsZ!tR4McS?;V!pXS)6V=1U z0Y#_%EcECfv7N$s<9degddc;8;6Cs6`6at}lwZzutKFH>KT5PF<7}Pk)t`*D)8z!l zR6QW>=DEL?%jW({Mk)El`9G!_`M<1Zckk4W<@rA$joyf3(ouL1;L1h?Gq;rxYb|S3 zy^d{+*Bs>}UMMwa?{Qj1POS^PK!eAN#fG8Q=5$2h1aOXS5D}0A3I;WHj9K)5*m={s z+#ToN{$T3(AI-deM{eNu6x+uhSv?s#`_W(D`IDWgvn>+^)X`629J?_ND=gQW;mgrT zV5=ceW3)(y*uzNX1l8XL&K%ftE*cj<S0x)FuQp&%p?@Gm@kJ?xcx*kmEzUXpZD>S- z4au>L9_pb8F=r#2wbNB<kMa$=1jklm9`xmdxoYqM!2RbC$2vNsD&i$#fA*>kJ@9k% z2eY$N>0P%E=I(g?rSBZMs`v2g=Wovq-n1jPKb#A*fzfo&*jP_`^u}8r+!*SgYB_uB z9lx9?XOEr@O|I^3FZ>bK6P>lpxu^PCJaa2OGmaq5zK>=f-UL0fDzy6W%+Le;nL9YE zf$^Xw8Fwjwm5EiK+2-(Z=LbF<4q;owcxHP5KdH$T$o3(3YM$ABh2_g-M@H#c!<St~ zHogJRxG|I7puc^Y6{Y7ptpAc7l}4~w0YbkbRBHN1;D19zpJM9hL;w?(!@WU_&Mv?q zCE{~|#~e8fs!ELD=DtuZ*!d(UM@S8fpZuz8e9!onb>dlrC*gPZZQb>UE;QoX8_Qel 
z6Uj9=i<{*gY*u+4u?-&YVT6f60VH@VfPx~tEeT_~1<6{(ee&);CP2*m$Yu-4H`$ce zk&18l%)yp&@k{suJ3LvAFK|(I3muISCLw%5TZU)U&sqE~!WUTlCcXffdJnd&$3ds5 z)QYw1z|R1$m(@@pPklX00$f0(j8I>DuF(^q1zj2-wPT?HUIuQ!KB_?!=>K6k+5j|W zHGl$r>LR=W;BtaonOBY>|4}at0Xc-4hbHVw>XJ<@Q_%qyil7`JGjc#^ieOg`)Ct?^ zBu=G?%}ppppa{;1;Z0W?1CB^hNo|!RJ0qAVs@cdB?}cfcXPNx+&ob-O&NzUAY8@|v z<B$oy4(JNC=edTLB~i7z|3<WgU741E#d7-Q@M%q(+!A9ir$*Q#w&*}OrD><958P+5 z*y7%Z!&eWV;vn4npe2fhB$NdfQXCOe3m>SODl>hXC+xLpqNMd$c;`#Cce6a6_|-RK zDW9xVKa^n6SZq!dncmRok;P=l<6J^yFyRDymmLych1>%6$B^+jrHh9Z&@LoLjKf+L zPN>SD%~o>)nQR_4_HI>h;QUnPD)aK2SPMH=;w7Qng_S6EU>Uk;Ns`%87O&Tl$#FGa z4A2YeaPA?|ac!jA(}2#_cpu@tZ6h#@I0s+fg;F~<-(;rTN6n&!`ZBN|5&CXH{L7W8 z8jb-+X(Pnx2E&ZpU_{0fhH;N|F17c;oa!o+Hf+St^@U>GSxjQPMmi9^2&}-qqE_cc zbez2!yZW4F*}yy0u3LlOrtaqaFfXkzha_c)$I}mZdzVOU%TfF92G|%oCW3!x%TVK? zI9@5#*Vqr+4YrffqG{{WK^5ojSi`%R%Cvya>6xd;neQj#@wPzr%np6*k)0OH{=uPd zTP(~XYPqz9IYN<We0I-k(YW$b?cy%@dT!9PLw&K6nwIa5>8{Pprg<KogZpJRNzEPV zmn|7hwy2|TJJdrvvvGYF$@R+5EA#3;zHYO^|0$GL(Q<jSr8E}9)j|egxq!tgPGotD z;h{~aY0J!wL{=4Wo<_{`^gHQd##pJVGUfIRtyDX8a$1)uXZX5hy6`Tap``<_Y;i+| zXaUM$P5>CjViUjOW?HEXl)y^D6@;DV0xJf%2i6D!6gVWI)Hi}}HL$r~x+cUQ$(9am z4sAE_QSBM7E(k2QJU&P56n@88Bl-w`jgJmi*Je+HvIi5->R;*~{%cL!Ke#Y5xL?!o z?nSK0(9nL1<)yX8H{vtqH-C*^-#@r~VSbf5)W3K-<+X>{efl?+e<D>^2791h?mwhI z#m3CFGCv`EN~}CtnHN9c>(nh=Ba{PlzFLp~LodEgI!#f$PQiNbp)lFS)7z0DlGajV z7!&A6^G=}@C8zfb&gI3kw5>)$2}9r3n3ZOU_Ua(b3(Xj_{<@jC__-m^D{p~e<OqMp z7Vy$ukN!~flzwc*mGp>>lul0}oN{5#qmj!(S_xe|=0f~k2`8-NL4u2`BlzQ`_LcOg zjii<|O9<Scm+l+|83M0E$YxwlhrrNwT#mL0P6tkKwoSOJ=0OjRA3r$G!atjPNB?K` z#^G80fBd=rp?>0fvkRvH4Zn5yvj2SS_(y-NzoS2n9*{PcVzUcpD#F~MJHZ?jXN}L@ z`fv4nX-%Jo{<?!d?{&h~<!ee0TpY@l;&}L;fsNdyL4?Cx+D49v;vBO}hp6-=bBQ6! 
zgi;&}T-Ge`YA$_fbWOYTY@kL1GAh4#K8UCbdYd8@HEgAIl3n)hSeh5<s~e7B&99w( zQPYlWa0K;_GMhGNtp3pK=$~rZ^}+c6XLmKc9k8wH(X=!E^dU_fpGy1>t-*p$3(G{< zB_~DBuTHa{Uf(Ti-5QEnd+L>y!}^T@(a#=Rk#<QW%;-}!JFbtq+L|NHUPGP<g0hRx z=_cWQp`0Ok)|#b`ts%0V&|9oLLYrv?ZKm;bap(dyV@*OFC5K3GCDRZ`LyoyVLz^Ud zkKEEmXd%%5ZgdbXic0^`YV>NOQOk3=&;FLf@7H$Oiv!QerF1*W%(|X?dhqY(BlLw* zbl{uD=Sq?2r^WW?geMv!?Oo{sf|mhc_GBa6k*>AoTMp7M^$kJwWGK#5qS!iG?iNQV zzyz>baigo)isfBVaJv2iRDmtMxgm;4T`CgmvTMY3GGZSn8g0+HsWchB4zk-mzuOZ< z9cdp#W;O%j<4!Tx>*~OM_j<${2`C*`f2Q@=qTrFXh+&^ypN%?C9C`1B^KTux+Zoj- z?9AVy_DyT*E0)Lq{@<~-=g|*k!&?`vYV@mez1<h~+ibTT3Ng#{yN|qy0+=^mdGxg% z`e)$-hi$g8@K^g;=$R8tIH#9CoYP+?c_&vsuDmDqsH3na$X;D7pc5>N+$4qjfN3jV z-g+%}!k$=e!WPoS)fY-BbhG!QR$Dk94a{k7(}KDhoVwhmU4x^u8mXjnE*~|6ER<4) zfeTql1}<tSdiwCOQ0h$Mmsa&;sd4EtgO}0@9E@$+<%Au$g4rwe0%1XF$mrK-6szUf z(W>&f$)-QFoEqm$Ob!eVtd!Fu8={ujZ3Ac3QNuWHwln~3OUCgs<{Z7_<TbMFI6SO3 z+_Kpc9NzgOIvb|FJi_hewO`m;UY{-vSFPoe2#P=0gTIxj{4ckS|78X-=0Flcuv53A z&j4{b7KHB$<s|7Cg811p==_V<b$Fdaw9cArf>p{$POM26doFO-9o>X`ctK(hQGxdH z=HG~9lp@-D=b9r<Y%m#Jo)Nulw!ByB#}~F9C?qe8Oyp(GHHBIY58v8_6CNoPnX^P7 z?P4AEH^sa<vu!+LY-a%^o(yaT__DHX)U4IZnG<n{7afyDqlVpinnfS_#^$RV&%Sp5 z&wln!su*QYeQWyUV?8&&a?h?a5yMpBhKhVr|MGu5al{o~_)w4B#qK$B>N)X<tiLez zM+igB9UnWq|0?caz*<?<(eOzfm3irP%v)Tz30Y8@vk>OVjf4fwSp(EmoK%*SC99vO zb=oeJ91yFHHWI52VpVlGh+ntLjF-UdINC4|O?ZAMrQ#IgU_s3=ppqqpw{>6<)NYdT zEWvojaaq(`&{U?8jaMwf&OCCh-x6%|dwMsm{?+(0i;UZGN<tKmJ!6x`3Nzu+`7799 zbVddsd|pDWl>}7nN@V2dNwEjYjd(hOu9qlo1^gU_SVw8+MoNfw_tawyc?Op}O0{!& zChpR|A{F#KUuPd#lE3<^q*c`91HXD%PXGL~cYxot^S+@Jce7zXCX^m{u>ZGE4%6Md zxLX7-iY<ID%W0b6wfL}$UI1}gR>*4_azzyE&mp{ban0pI?`l?DpDxBPl-$r)gPC%i zXQks&%sndn;mc9+pV2L_s5yvTyQ=5<jGU68-RQiR+yv+iqsR4XLm4$W2_ao_A%9)g z0zefNibKw#rBGUnq+F>FhDHh9*9mWceN33%-^@qy*7`_d`i=s3SGWo6fMc|Y`G2x0 z9x)xGuT_n3R?i(HM=SS>{x;evzWrNyLQI&!Iy{7=b~k1DC0Ymc?k#sxDf>_-&U)F0 z+2;a+z%DmZ7y+g>Xrelds0`WH9hix-%LC?9(iB8js}LTe(fMSzO>=qLI=G@%+GBi= z^a%uLiYj&ni~^M?M{v-=Ay+3bIG6btlsBeO-pzD<!54R&VgG#60R+G;M<T)YsCNEQ 
z_S5K1mLTiVpVtQ+S$2GQ=equ2Q!uttmcs$xN8;Hh{dSi{yy=_TlSdq~FEpg@TsRQ2 zg(Yb7*i0_3Jx==%@=$pk=XxV`utn1lq?@y)Haxo2;3Myk8=?cIA-ZOSyuX|)v=ZK5 z&h-Aq(#5E;<9ajYDBp2W!m_N$(59nq_kEd?07)Ql`WFGE1jXS@wc$NBZLvR7+C)1p ziWv*8!i+hbPiRenuI>R$7_xCC%OGYHvD>BdZ#voB&0FsIUF@wabMp+(PyVphnOnGO znWwi*c8EW!b@e`5?z;V!&YUDnz|Wb1jERCtZ@uuRLOD<SZ51Yckd8P6=$u=~4`Pmi zE+NixH&0%;(L|Z|k3+&^&4syy;R7XjWR74CWS6!P1tW$3K~Q6sX;RHx8Yewtcu7}b zrtN5Ddah$tevsQV?p2s~YXI{OJW72qk<L(Fq}bs)r`Cl`U<AdP*P;2yQq%*nFT_8v zgHeNsw43|$94Ya{Axs^GB5<G)>6n?udqh60c%QMzj+vW2-@6iDOl^X9eW)jP(qc&_ zqN#C<g}(tm`U*7i{37m_vDd|6?DdFnl_}Yww;&QZ8jFMbTj4>s6P%Q9SO=eW2W?Sr zhVvW)5Rcu6sWNxt=WjiB=aqNXMSE#c)pCi~e+agUaW?H3?=;7T(AnIWB~@FJa1QyP z(S)1EU2a|3%U#4TRmWb8EIVNs{tVWAY52&GKvrK<D0-ab_Q`2TEaP~6{sW<)2p;V+ zN0HjmFxBtzBR~ogEF)?Ps6`y<vCn({FJHx`!;WaA1F?eVJH_WU?Yn)+mHI1VteHLU zNu>ROCQW<Z9=Z<oZtL}D`}h5t3DZwo#@Oyi(xWaaj>d2xsh=MYdcq6u2O<uO89cQW zu4g|OmodK<$i6(!hpqySk>(dFmndt5K$K)3wSuL*4W(!V=v2&DoaU(}t^^`t%HV^b zbD2jShcBf1*cm)RtK-ismiRz4*5G7+?h1!o><8)Gmo&@so;)7<>zy(E>3aQ{_{za* z5Laml479G4J!#hctAF}#zeP*q*x>9~*WDlu&!3;Up6`uF<s<bZm)k!El8(l{Cd)%5 z4f*vnb_E~t;CFFFw$zGkz6x7=EKP!z*ii=m;#1y6)DBn$N~X|??$cuZb5B3lIv+3! 
zs+a>s<fU}Or^U>3=epAwluz~0zh%6xDWB>o2t%5{>b2%zy1OT1IJMldz5K^U`BW=H zGG);M<w(<s?wmxw%n*HFwC}ilhyP>*Q5m^QEXIR$h)XE7b1io&*YqlSug&SV=$qw9 zOV5`f97c18ES8zI_0!sly_&`n?2JuT8(0}}QD*&GSKWZ5G&aT)aZS5(bMQG$ThYG4 zs)-*+n}79FKl91)`SaUn>Z>bKH?>2WxTQ;_n8UAr84SsG9(m~do%x4*02g)uI$=iT zMak|+b<^4PK*m!)vq9mcQ1;SdtiV6KU<gMuhn)p@cPWFztTa;AeJEY*x<F7TNbz!4 z3z;2Vy%0Ptj?y56H+Y)p3-=&>;ieYP>v5i2$c^YM6bGG8Sp0R-4)RfpZUpv_J49cU z6%bZ#LQ7<vbFQX9fz&CCvtCs3)JA$>S0f3^I#s+t#hNJsObgqH|0t|g#uCW;8+vks z6!7zfc)%~6iY?@F5&D#ZrTCPG{<JOYFzH_LSvwl$*kn7C17lYwZF~Z0>TK(e&btPC zq&r90&V>{7Rb6!kZNo>U+0-;Y#PbJTT^_Gx9r<lrE1$^cVJAmnq5lri8w_GP=dR}E z^W_dq=Uj%rhUOgXxT5CVoGy-DKqPu@%utuPOmXas=+A3$YPvFW*BO7cDpUGh%>LLY z=65Z6t(UI57OH0kcE0WPoxMmJ7fM~eXjemx~{Gp@fNWx6WWmp2yS*D6$rG6&@H zQR)C*1=STI#oLSC=8<+}{ZPEXBRH@cDafPNd0_oU-l;WFpl9DXw3e$}o@Q{)d?0-d zc?8)DsN4Fz3qRhwJ{AvoWlLJRfBuI{Ux@1$zxXwpce_P({9V68-9NDI+o~h~_YT$a z*=viR4eU2RgI`s(c^ZJ28Aq3GIt*Zx0PO)@pwN`#WYBOObs-el1t%!=dAA!j3$<Uj zlAGbiTp~YCbUAk3V1RwD+40Edn-#MDMGOZf1fVDSah40h4~6ndj2t}ar4$Bl^Km^v z84_}b%XvE1Vz$%^c}qQs5jW_*P-=ti+s0viZ810!-P|GW?!)0uX3$`ZCsyn6JKzn* z;jTv;pd#vC1H39C0Kb{X5!wjQw}NA#+@07^^+vm`LNBI5>f=F~#l*%XW^ym~Pp$WC zTF+HeSx)x9R82L#xW4}-N_cG3zqt+8ixJMpd&4YN5~h7Jf4+93Ayly8^y&G(yy&xP zYJ>c0%%N-tAF76$f$s_B6s><Zrl~W-Ju9$Dm+J8roF8pc3J?PFL0GXjUCdr6Md8gN z@Dk-+qM|D>huJ7iRTgL-9Lr)9ou{!wf=Ev1@G}ZrUdfFXQGidBeQlKB1Ez{14ebPy z|Nq&07x1ReGhKB3OIORXEXlHES(as4mStI%Wl5HWe77;i0pl3sIK-iZ5JG^&A;h7S zp$tPkE~PXjlw1fyDWNon&17e5$%Zm%%8;aKhGCeT>=ZIgCzEDpr|eAAG|hCnX=1PJ z_xqO&26CUC=RD^;ZRfEi*>t@A%XfXh_f6;I;uJxjjM%=Y<^tz}FqVoc(+fh07!3k( zbjXaVgR*XH)VE>Jqx-g`pHvnlbk;t0a;ZPE>V>hDZM~yt6AdZ#7WG=W%<%AI>90MP z>FAi|tXoyzy?yKTHu=!T#!EI2r55hfE3ihS6YdkXKu4DB8!#<MHPe{(1Jyka7zagy zk_2sResgGqhLe&21`X)8lG=~E4TzE_Jc_aMgeTnm6%7nYdKl!Ci~4IIazz~j7`vlc z23x+H<~*fh!bI~zl!xh<Vp+60kFEkHCV~0}_r`Y2eNsh6d|!^R6XMj{Vq*9%FNlE3 zCJ*mI8{1vXo>$B3A{|=oi{hli>hanOV?FAZr_LCro*mt)HNQ5ZMogpkin*fb>C7_J z>ctH|HL1Fq`!YIXTCBrUJQx4Um2rA0T{P535X_`Y=&pHV1Q7+NmQ)xZFN&1>+Sna< 
zS=5>){K_g&qE2xzN*Aesh1Z!<erw8$)yT)pt5gB7+egR3)W7AaI&82fRtoEkh45zj zy5gBRTgfq^6Qb8|?F!bfSo4kIpfg~|dXbp`QU4KS)$BEn9UDXOz}2CwZN(kSHyBbI zwPv$++djj{vQ?}3c@VJ<dti4uphKusqB8jk9fK4m;2by--r5Ep`UCKmdGyP9_KYN3 z>b(5ig5)59=0N5-XN0p#-<lynz)`25izm!VD~JdFnU8rShF*}EFQ&Vw0o2DEKwW46 z{ku0=S8LFt0fgxH24}@PXaK=4*sIFX0V+TTh-J^aKJXuAv-+xKU`ofQF=<s@+wyT& z7G{fA^n6@(m0rXJ5v<YgutsXwl_mkPFvVI@AvYyT+Yl`@Qu1>O`KVf?;xUq`!)Om- z1Iroztd36|&}U?{n@RgmQo-gx0t)yZNhptkX9mVagCr}R%~9>yWO~A2?cqpHV(p=N z2QqEyB<S(GE%@%F5>P&xkg04;!c$?R!75p5xC;ZW67@7H0P6|#RIZ3JFpJt^M*Nv? z6F?j=Gt6a)8g;x*@Lk=Hv$UvSyTnHMNl|#4x#id27CY9$1fe^v*RO3Cx1V0s1}qp5 z1nq0NHAkavxKkW?u+f@k9`RQWd0c*6l&x<+eo($>AFjLHt**bz(0j|~f#5Etb*zeC z%Bs|{!&7!+&&Fk4$B!V+B@B)q>wmK_Ur3i??Wul*W;E(Y4$4EKbX0*JAV5KqqX$@O zNjMcaC*6cq`Vv;@rT~^=Hm+Zo>Oa=m{Q6z>@P1W=t_*qRRrvv{uk&AZSXKC!d316! zxx=MHjDL;dg}S%3w?jFddVRmCcG8?<$d})9dwcZyL$T(sqCdGh!os#<`CqEq>Xmbh zMyAVW(`H-D|HGVRwoJM!uH(N<&CHcC`GT$VrNwd0beg8FX=bbKXmWi$?z*&?+WOh- zVP_z~zSr1bxpcfmjnPzGR47Y!2U{6)Us_^qqC<YzS!NGmeJS=B#QM&o`Y^9XXKX+< z*}?$Mb}TyHQXSCqHfINi@=jQMhH9MkM2u4v%)_CcPwbdd8F~}IU{FoY$7^yZ&CM*! 
zMXo3sRMF@XKhT}}x6+*raJfQf8{&~3?3$T|Obv9V5#s7`rs-^Jq0cBt9X?c-%IpoD z6aWV-X=hOz{H;{of`3sJowlx5S{+CV;~3MZ7KX4;MlZ0>0B3L`ly9LHW<5=O0^kkb z478T1jX)XvsE@REuG!ch?+-ayptWzqq5Cpe1RVE~wy}pt`Wr*8P4T|340l~^cFei4 z{)g_^zB@;2bR2r%IFxT*zY>5UyTudFZ9L>KWqJSmy2DEo7Rv7+IHyUXhSwnclY&zq z?VJ$rGDBmGCwOF@S*nqGN3`6<hPN_4h>|)Il%53}lM_wNGGD6Hg{3AXKR;PIDe|sz z2)*N)lTs;q$8PkLU!OE6J!OW^k)T`gc^-Pof;G=wn#sOFTEggU=u$RLS-0dzCkk#Z zRcuLB@#rT`b}6M!re+aO1)M?%!GJNxR}rD^CVb$hwUCtr<mwi(>ag~whUnFO<A3`J z^^MJC@_ZAujTahpSG0}a{bGG$74=WV0qr{AHgi!;-En1aids@sPNg965_2!19YtJ; z=rm<7<r`o<DWy|Qp*k;9mlngM`=Zpd%5eGO*Q6MJCkN>K@gi5bpCTq|U?KLv@g@<4 zqj~uS!4RK2zf3AT&H78Ii%%}Ev7Jr5jg7r>djkq`{jB}ttB0Bky-oag>FWo*9}{<H zVs*q4DxggeOJvrmUwH7!bs}4mnjIr^l8&{_gWbM%xlmeE7>=ix3$633mg|bO!o&lw zInr{aYa?+Axy>|s$5(11ub{1+CS?REK+Gs)6{cC3XD-;4Rst5{DsvBGPcW5_iA|ns zoVlLKFC3|S;&OknxTWjEx78ap>w#}bK#gG{<2#FDQU~^c{3vl-c@XATqS+@!6NvWE z17xW=yxY<W-`S;Dif1N%d!tgaYv7f<kogZO%gJ$b$O7u422dY05cQFZe@)WYG=6ls zf|tf;#D0XF0PVdADBy;Y+-|TLua&A@6ULyA=H}bIv#4)CT|NbZJdN<R!`D(Vlo)No zjJ^`j4<PC&Dit7s0FV-il`=&F1CHO$6FX)gfLWwCFwY-xc&y%m&ULkUy^pRwacp4w zuAW(oJ8H}tOq1Md>h#Rbksmeqmk+MlyK;Ejy_)qiQNokf4Ts&Kf@S^x*thpDU;F8{ z)5Edyxwm)PETV=%DM0QGKK_H3|4aH`SqJWfC=7l0w)mQ69rmC_IH2GmBrBG`oR%Dr zfOJhc`MjE{EOfh)g(xefnqt)V0&78v(v4OM6(?%-LKi-$HK%eRr7bz(WSwCt7vE~l zTr=AbEI=D9M>TSIltuMca0_O|lTBDffKk&n6``~QDT(z3e!759Y{o5;n~(uDR>=de zdd+I9I8_W~VjP>Bh-0nmW3(e$nsnq-k2DWJ6DxS9WpB?|>E}iUwrp{+(bVz1gDZCL zd2Hlhz!CNbm+m_liRc=e^*0&@muc3`gqVD=t$%p2!Ly0&TGKV=RX2a5@1db&z&m^K z?x!@GVA!~f))Rbhsy&(!tY<T2uwk5H4pdAduT#bZP^6}zFh7<H*og}B#4J8G!6~Q! 
zAq`;7a-%7<w92qt4lEmLpHcv*&{RV*Rc1LV=nVxtRZTfi_Q3ieY9vr&aVeIn!ZML- zEi_kNbvP9qPC<g`r+}*9XVzcuP{tjR-18(~K~mt50<0?@nnvQR$&r68s;V0Lmn2@( zWgjYV)eYYMU1%&g3x<rfgGT*<<p>BhBUeCi*W|EJ-7JUv^wiPB-}?FkbIN<zonmam zrtivj!yH%8oVbVmelVo>+HHQbkDtSBgmuzvz`B)V<v8k*k`#_}!ck-?UpQ5Mj;}`r ztqTE{1wCdX@c35TUf@dHN^kk9EBP@<qplf8Vfbojkp|O#mto^m#n&K-kZEBSWY;BZ zRxd?B4U}~eSn*?kS#$9msk#_2*j(dzHsAIuky_V>Q?2KuaykNZjcDqiTq%xW>L68O zma34%CwLN{u1e6wtcoBQ?Zb+c1G<&A*yoauK|{-0&`qx#7_>L|c`x(QjKXZ78q(ne z`pL=E9E%j`!FNFF=6`W87dK>9jC3Mrp1y*>=s<RyN3e9Hp!%c|XMjm(!n+zU^>DF! z&DddU*jd^YbS}H+H;0ywbvVK}UMFsLGN!`7?5Ci;yziyQS=M7+%VycE{w7EH%f~0> ze{0?_HpC44pgGD+n!IEjg!H1o{f=rBGn;YpIwa|s%2wd{^qMAI$m=-B1bQpsbx`Nz z6g#$aQZ2kr;%cuGG^dOwr4o1@@#Hu|BN$~(^LS3s2(M$}U#sYA8b6xhb?_N^9TSEe z;dNM!siZ37buPbFyiN#7^ie^pw*_jIi~R-MjH>JNdlt6sbNP}Luzjb&M^XfT4kSgK z))I71G!sT(o^%PBKp)s6s7W`GYl!>MqyUnHQ<lu)B?odG7#($T2|Fo@LvN#tP>JNr zR!Q|v8-~1Ta#?a_s(3cbQFPIQ)p4whAjL~yCtXR&fO*DoK@+7;1HdQadP|U~KT>up z=FbVyNG(jI=IK}52do|?SL7#9CeCI4`aDaWRq-Z(ouM3o6YXb->%TudHoPic<Ed`! 
zNG7|72l_{?haD02lNWY8vuSm(VO2-$-=6sEMSDjsZrXCb#L2v+F(^YFo;SR0?>~I- z$csx_%eS%`)f4YB)86fC4|~-nM|<D4fwVl?#_ga-cB_7`BL1j4VKuP3#2BiEO(~?q zSXz#uPfXbnx1q&6QIO$h!8|+)pNa?oQKO&`JP}rASL?A-TAJ<5M$(xITcl!+2S<qo z2TGq^Tt%w_?n=ZWG#so0uTX`m22ZN>5&=ilZnf9vDcS*;?NDiXMFyP~I8;#^3i}TS z5*`0w^5qv9YORjkGZBaJSgY+j;cVBUSgj%6_phEJ(SUL4mVwsJ#?kH_L(7_1E#G>c z!LN&=yEJYuk_XSm*!Il>29vsDx4fja#Sk#OcI&!7+lqUNeu?UFoTqYd5dv|?kzrL> z3mqLH1Sy<zu<V?K*kWAmFGIaoA#zLGNI71xK_xcj=fDr}zP=^ZVJ*VIv`z?xZeJCh z2Bd)w@*h=L4RyAuxJ<#vQ+USBSVr~`oqP}|6ri<FAE<duSD#SBZxV?N<?QU|?$I@y zmo7WFzp13ZW!{p}+y3l3BXhdti$|Y&<2U~zpTB=J)He`W+M`-mIJ~`g$#74f*Y6!# z!&ZmZEF11$zxm5={r)$pgOiS3@`pR$bM<_G!@wzk@OX|*@fmDYn7A#oaEqX#@K!pX zPJ<gS++3ZE=0_$fs<QGCorQRGLD88P(OGRcWj#k_aEi`6=x5VyGUAdML2Zg}9<v0U z4yh2!6Tx>EF4ZxWIX}0eikK2og(wd5S&1rGAlxUNkv^>er4~4*Q*M;1P`Im&BtB;# zHu)K1uZy1+Y%!r%4yw9wKH|b784fE&@mP`D5e!;%Nr<dQDlH<9OI35JoWJDO+;WaE z@37Q_3Y>^3t%8aunRh9ODf@JxlNi%5g5H)TsB-i2OUnWgdj6OdjHf7-0}PS21Y`6& zr6_I^z_lEiFtw%tRCiIpni(47_8LGpOm3147w0`t!ha<`#jI$i?~K@0v@O0jRMZ== z&u?6GcuHnZENL4W%MbMAiAy3qYqoXttlcU<ZfhA`G4{>x%fY1I?Tb6t4IN>-0l0Y8 zyYY^Q$LJcmGqh{Aw=pPp$tOA%FOF#S`$t#)2sIlXpTfS>2!E#JTfCr#A~{U6jslb* zABUvq)E`aR%@cu4bP5(GK->W7CHo9YR1l<Mc;+ps7T=3dwZuGGD$kPE5nBPsR+Mr| zDPe_jh`)gNmsDAd*jcbFpCb}wD?T;)6kq+k@urH;FYm+I(L=$ljOJ`;nwMjEZFFLA z$<RqEp_x2@VK~b)jtibQrH>M9Gz%es5ar=0&4Om(C`dLQi|gs7#lh2+3K&xK2Q;1n zIFfp+kA`eYm=&&sGQ%1sioX*dyu10fq?=93YH{bfp8IA^EoJJ%o!dtH2fwjnHu_|} z;u{y)`VCLNwEDVt<gp(y+cRu2+uVCVK7aBB`Ru;A*X?5g>?@C-Ra?Myh`Dg;EYoB% zMb9d1D(5=P=YPLsr+ou}m9rPA5P^(mnas*uc-}{X@}IamIu>_d6AzP)_S>qbaIWSP z0}grqR5#}(0ZAL5a7;lg*~MKwfz4btGSDsu47AHkByNZiCQR*dtT)m9p$)-OJ>k^+ zb5aMEJSQqGz+c?~SOF*@knY5>#GUhSs@g5%eq|6LW#T9h7Y1?ZxB*6u(8{LFBjoX_ zD!f^~3Dg+TR^Yclj4_=g0#uNTOx?{&wAAe3*<_g2XuX&6q8AW|>L?IZX?&61wVP+J zVIfz*YjxT}ksR$@E%QHUaH7RyHD!Zg#IWRUUuXPb(WEtEWwVhg71@W3;`Zq<G&l0q z(7xtn+wWNB47#o17PEe_-q-k!DU@HFV^YVouJx>S@LhFbmQ|})%jYxs68VDAhZocl zoY@k>>=Qm09D9@`QLvhiQ%kWFUNi6}QYDWtPS5_$ufL&GMsmz4?McaA_=i;CNtF7= 
zbvar@r$+jk#@EYnDJ}6)t<IQZFI0-B?230vM4C$}mte_r@%!xKg$}<8qu74t$D!Ae zg^M}?38N)dM_j9d^M*pZ<ex8^8)MHqmu>FtbgP!FJktB~)~0Yz%stk8%ho;Y%_)aA z&!O71n03_|9QQN%e@_L@Zfb)p2ow4dTfPZD*e>h=pD8_iEALp%hM|f_QLdl^RXz(A zpFnIprcaYBW4efpT2+h@MFz_M78@E_9vmu=tkO4JXZS#B3rAqg2pH-iV3s-uR8hP# zAyxt1l1xRcQZ*rPCDDXa2ukB}UN$bWj@yLNc4ZY{*@*=Mp#qJi`|B-r)Kf<e<VT>u zM_nl7Ol013s1equp2wK9f_`!L=wpr5j9GFH7P$0!yCa;8sT+EFclvVZrB(|vW1pxs zhZcu+m^bY#)M(U(=i2((TajO{cHNL~=(VvY_jIdmqaE_dy07#l*iZI%@OKT3ZHZEw zK^qM2RN@N%hwrM!a9^FmUEo2XHL}GxhXg54z#ZX2GH^5@dXVf$c@QMfae7XCdmkOA zlmVsp?30q<AhK~d-)tw&3UpQs{Dm^YWk)TC7uB_hHYF=SSeVeNVUPmmk!4;%0x2Wq zi&U`z{Ou)^&(~wr{v42opBOC9`{RRQ`A(-td@AS`)lOHWplE8StcdxSci$Uv$J@j2 zPrYOWqdpV*<d7<cd(k8OT1d~v_S1wEH0je3Itp<Lp<56uwZ!R+v^3Be!ENDRCMt_( z>tN(D3YV%hPq>v4Ig^fO=*V5TC0am;DLy|nKh%XN6J$LUPGNQm0JMpyl8(32%&*X% zKzh=PQ#2pID45t(2rf|%&c~$G?x4z-XtLUcvdZx6Zi*zL7Km{R(rmPaFzhB(VzNpK zb>X-nGVP`Si{Kif1-Uuwn?%~&7!yfrtW@`a8=Vd%K;FtDFdQYs4@3$|2W`3%R)PD7 z!CspuQyI&Iw)hPImnlF=P`Jwu#p{=@$qO}D#r{ZN>%peP$i1(z?pX6)hqEjZFt?co zmNYM9uI)#j-M2sB@%2XL#25XK-~auudgrQln0l7?JG4&y&`r*5jozg9*t)*n7Pr$M z&UUzWtoYq`wr@J#?<_5I>=@k5UVQYzu;aR(^>hwJ%pAO}n!vp_;Uq%Y$A}%13b0uu zyP{0H;t)bk-GWFBYG*{iK&@p8S{;bd(7M5NE1=wnX8HL7(-HYt>W}%$L{>&&6@Su- zfpvfO!tT?3W9yHfd7bTk!`(d6Gqht5J9K(%=dM#D=e{<=m_eK)@9Ee!I`H6=KmV0- zPa>#M531H;7WbQ(X)*M;;K-afUPQG(#2DmetMT^|Hy2Vfi0e^d6jQo^bC)RkcsVb_ zPMa>HsJ*SM2<|W6%-I~l*#+WxKE7K9UVVwUdch{4U{NG5+5<<m2BXy<P~s;$v{tp4 zxhH>>DKh7I%`m*!TNL@?$lr8eXMmqmx7y>|?$CCu7;x!B-)djhrI)wt?hE+?p}51> zyToaBi>7ams@6J2=E-fF2Sj({s-Z8x!;<n=`5SXw5xv2)_Wi$`KZyV5>!9Ex3Tt79 z-@tz5gA4L@rFKHlS}sipI&WO7Y}z^A_5%r|8q^uEvS28}oHF(MOxQdsY}#NGh(@u% zlC}`;9#3mQ_lQ0!;X|G%Zfd~ai&GAs8dNGlL7|*z0fsVP%S)Ge7Gh~;ktYfw!Ju#5 ze>rwhzWDO+O^m&FaO^Z&d*9ak<;6VTX7`V#Tgpe?PrWH$OdL2bo`3U<ykCA{4NH;z z=#agtR@lO@Fsh)kNL{fAo9e>zOh7S!RCfI?y5BTM#Lp&$DX3ed0fWWICIihKA=v^` z`(<e+D%NGDi72_M!g)PvP}^Lk)E%P~fr6P3%@`FWB@;4JK1P*?N1eX9RK!;9HpUBh zSzD>Ep_jn%T1ZqoR=&A+$7n1NiM8Im^@-!JtnKlXYM&gPP9ef~96omNINQ4Z%_n;! 
z!RA=&g{?={&-V;=CO1q_>QL`w_uMuxCrkeu<vt?kR<FW&&&PxSFp50mr2A;0^9@o% zDm@4|-Ix$Co|D|Tm2Tx$s<E+Nx|LLm&SJEo#dI&#M!t!jOM`o`H+Vn~&MiiyD0${j zsCq^Ix0SmCi=VvrrOC16*)jG`fxB6Dt{59!_SBIB;v4Z7w?sB<TKMnJy)c&Sb~MhK z>N#@%=;H&=9@z8>{%|HFU`N*D?{-lvpGy}L>U`icPUFmi6<Ybj)%e4~B3X29qEN~9 z3YFUj8bP{!1hLeS(;}OkkDlE`c8;|$Llf<xJBK~E6eGs#^5~kC`9UnzD0=P9cfRE5 z7+m7kZyj0O9~$Xy_4%OUv5~Hpyl4N|QiJnxHh9+wwxM~sx$~Yo@Hg`sGR|F4Sg$ZB z@k$SMUI+?UR?eMJX~q|WI;r@a#BiCw^+_XynOcXAhtfKNLj$5OfpL?vQ@9Dq&N(4V zanaI78xLp7z>ceAsen=9W^u3a>2+JS^iztn;&(oC^_6T$A6QS9!iHpFy6`WAv>Cfa zkt|LZq7LY4rE|!Rl+ijA#3&VGsUVcXc=aFrc%1sg93=rPqRBHvA$CI4hLR5vTs6KI z!_$uEm<sH4%`jF;a_!_vfq=y7r1KmEx<@MoBmr22Ov$f>F*5~i5Je>;YKM6x00vNh zN_?hLJvBJg+WGKz<rfRTUD(Cmc=ac@_#fZ&m~U3&@<XY4tG2IZCk~&v>pN2yt*^>I zlh52Tr+f{&&h-BI``--($K-#Xc*(@x+A!9T{N0gt$5zfO!v5P(>w6#T5QOGu+Jk3K z`C}6fP9CjTQF9*ZeW*r*P5SZ-h66Pksge*?D$5Twy4-xH%NJDKLJfjQgx=I|M70VN z0|^~NZ&*39TrE4yRJt%wjBG_@ad#x-QoH>fCCk3n+QbY2(}NGouWfGZXp4tETDLcx zSay5=^+(pPjceJ%qG8wCklqyx1=3GRqFrnJIT-1`^F)6_ba#}7cK_m4dHZWN(IbM> zfeDS+-xoE_-~oLngGHN#@G6LX#ZIT8(;Vajoj~vLO_oG$=2hB<J(o($_`wImFps{P z!l~dn3?F62G&4-dQnM0~p7eb1A3P%UAtJ3fDLKpkkSYR5#QPuH=wnSv#OIACfk6L5 z%5XB(XdV|DQJ+ua>*bV3I402aMn$8N@YFCAe8BHm<n&$oNYM_I>S??)LM7#Nj%5~i zTO}9{l|l;q_;RHb6OE8u<7MO;6++&i67?43=Yd$MK5kYOU{^qhZbQ{Mq58eU?23fK z+K9c`ar^--%iCse&R+k$!GJzn-8a{&JKJW?v1wh&+pkj>Ix4*_K8JC|9bR$mrrjVs z?%O0E8$DYTUz9>|`5?;?2fu%G%R_zcl8Dn_mY*Iuw6j;g>LoTL-l?|9&%7Wvr5^M= z1$+$egUGNy)x^fFr(1~8Wzu)?M$<MDylt}q0g}$4i%1z1BAV=Tk`*S%%56S@YLmEu z@Bq}4Lp_lPFb?YB=IC>6bp6~2SM4Yg5`Wvc%3DcwSBy6kwSbVV!RNSvJQG*K7_*N8 zntU@tw#21Yk`8NyC8GgRt6F|(;&5Bwp21zG*_Z9=_4oZ_Vo}8H_xTGVOE#@rdQt5> z@H!hk{G0#2?$*~&vLy$M-g(ELkayg2Pw$%eB5z>*#<SZVXUQK2v_10M#JBP;^&QnR z$IkeK4MN&Wz5}L+g1S6<avi9w`7I>S3~{;xXYAvZuyIY157X{kNF*$2F#ev!KOo*c z$|wmABU!=PNxnWGg+y?$<d%5(3gp2OjXO$8m5TGU>cceb<D;9FW-uZodzpt5Bb>du zETbMF#LTQ&{Hwh9ni`Et@i*%C`Li|>n}i?F%4g^5=s`W0#i`?0uQp?euYPh|#cG7& z<b$Y=j3=g;9JMf_8YKJ>ic1saN<L)gxB+_!{1UFJM8REJ)YZ{R&{&{`!ND+DsG1S6 
zrc_G@qPQGbI!cNpRML{EIxE<FYa`Qya?@+`S-sX<oJ%dUTJjs8;=<HxXRhAOpTnQ2 zlpKcRSX}UEU0{Hwb@d@?q60+XF?V2~k;51SO*>R#K`5&xXH^OGRN9b7BqsuBoN;1u zd8T~1jjc5AKR&S$gVViwux1Q?WR`OkL?rx4IghBE6aG96!dHZJ1wBn3o+d`8q%<Z) zV*HdK$>vo8p!<U0Il!p^_z5vS@ut$CRsjtt2n8;j03iA#Gnvj9Mf~FiRYe`Y1z|`c zq@7;ODWJjv9tc=es;(!<DqKR=@aI>3##luhy1W=Ke#)L-xhhkx&%WuRVmmb(k+(L| zEC=ylrmP(G3sg+nYbWdJ4Z?aB)1oC|J#l%8&heTP3Qjlyuq6{U+<>NQcx$r;`fjNj zGoRl8a)w`<N;vy*doec}sT%Y#Z57l4PQ}s1vPKh0zKX>tzo&DHGDU$a*X?r|SJey# zMKSp4hE}uj3gnUDvzlFC@j5@epgNA4jR|>VHSq^?J`La$)47@YJNGv(tQhW0rbK7Q z8MB*HRx&=ArU!qYWW_Zi(klTAbu=s2R#Z+YG9tZlBBr+5U73uTqCAsI2WQ*DIA0-D zzj;d+{X46`Y*e5(Q>8>Jz7cNi-f&%;+w4E+Tle+WRoi6g$lw3wne)He#oQkXsgd>f z9~{{>s#Tww*W7d0k*^Qg=j1j1EprbqK6S|O-V>+g-?N)PI0Z7*?8)svm*2X`dQwXe z+#&9bpnltrXV#-Xj<}ci$3Z;_&QPT*PE7})n7ZPrNQ?%;5Hqn98sk|ce<9`T-y+@+ z^0=rP5dR-K5U5(>w|GZfgN%iAMv770Ppf=nV3C7SH8U$OOg9W{@le8ma6uW5dr>aH z%yI$3T|73ND{#dYI9_x#(V)t7Kuc8lAYcHioK4iCZ>0)6v-A|ZceVUj$dxo~@AVno zuXL@7IbM9Tt1FS-u>8>4Uhm4F!550(_|pfUJf!M-kC{$yUbcAZ!losLMq8Cheo}sC z|8M{2%Er!)u_dd%)iv_y!G4Efp6f07`!64)*wFxt=5k<)Bh>MoiS?!-IVjOlDu>A* zpjp%TG=+eP21>mFWc|d+4+5>y#7kOV0`?xPOFqd3*d0KjQ4(-1tAd$QZ>5ZFwa2Dq z`%rqo$C&sSn~9_B?_0BSmQ&xmeectGT778X{=uH1KuOZ#2roH&HPZSn)^}h}$Y@*6 z<O?q)A9!-4Gwu#YI_3L0=acy1^T4{tG^cSd+Nd%+ljDxlP#_^FS5}}IRqrZVf%0UA z^yCwn(TLB4$nVnj5RmFvfpSD36Z(ouj@6yXouzP<BKx@#c>2jJGCTuMr7HNWtD`wR z(|~*sz7oXCK5?;B%30B_=kZKCA$ewU@&xI;V5$WIGjv|S!{`grYWYzAy%yPv=D;;v zJu^_y<2mM16n?=AO2}Ujo98I6u0Z?p?m;Oop_Glnm1M>VOBdGA%~K-5JWOi}k`J}K zaetsD%n#nD<9$DQ*~J4%id*DPj`C%p&%E$&FKB9-{=X0AeZF<?Mt-*hcFYCK-5{)k z<|d_dIk4Lt>aRN|#bA|UoLol&yG1l3GmsH6Oa-FUbf>;)0o)wL2D31(IL<_44i<%2 z#DIwc^@iZeC5afM3kV)3tR;bv0?Ma~U|vNz40obJ^xKg}amg~6cdM!e6GDKYp=d@d zeKgArKL3r|?193ceJK3qH)?~nW0@~Y`{2*+b9Jry!OC3g_4l<`W|wwtxiPEv`VAd{ zoU%E8zQ{yn!&R(1`|W4{j_Gp#oN2eH^W~GhU)QaCen(<t(`{W@@^4s=Hr~DJ=EQ=r zFE22>{@Z0@ow{M+-Cv5$*>Yd6fwjD#)Q{(VctO*pdIt7vTZW^+%(-P~a0(3>L<yxd z%&8~UQcOBhyOdsYi-vZU3k&Ju{Cl*9WcsmkRO<kSgCp_jI2}{#ZACgKjW2>PZ``W# 
za_|O7W6U^x&_F_nIc~JtGnK5ma*V=O_!FKe*zS1WiA?*YqQ|wVm->+W)=By0<|e0| zot0T%Z)(G-{-sau`^lL{oUh(B5?dFd1gtQ&^xK<PvZr5q(G{`FXHTtt<-jYi{O})k z{qH$}_3|&CIIwP&l7em6@Y;m3HW7K7>RsfYa|w$-haxC#Y@$jj+Y!Pg<&F^pHEC%@ z!%{0^gI3;TY2^kFoI5!)3wc<T>G?(wJV=Ff{ZvKB%gOPjqe+f;^jjqnr7%INngeHv z@kg_z%_yj{MQdJRQL&HP!c@NEA${PmqUE8LsaNsg*bHO34n_7%jTWT$)NE;GAcUtD zUM6Yfyn^+uyh9_damlCb%L9EKeO}h+4~M*cgWbLP$M<_zgaYotb#4}O4-BdO&vf10 zv!dVcEbs@MCSS-uw4yXr)!TVw<4f|hi#PQ7IuZw%*t{g<+j{@lu?LQvTD9312;F<z z;p5@KA+2A2VeQheGhp?&eK!4yeG&UQ9&_}_uW_E898A=HO<{~D#%pvfP+Upr&BT2T zVfC~Z=2DqKM|xW_a|WHp{}5$S&Oc)uo$_0!2DYxyfd#gL9#C(m%r=#cl#Wv`SP^+h zHvANY0lYFHr$bbK^{l$4fMh=@{HVCZIVxT?v;MIBA8NDO+qFi_RsT*tG0ivnF{A5e zEB4*1#eeSnUJ<*MSuS&p3Ip;a?==;H!43TFyu}2XY?5p%;}f);vklEA7akY4t`N_d zgA0hnmdM(v?}Q;*^k(S6@QgqPjjQvVW%SrSMW;iM%8=5Gn{)CD=~E}-)T{eas0?8~ zk3Te`(Tb1kQB~d5AsTdQ`<$gBh$sL0#~;<N3M7oPH`LRUhQE9rT2W%`>fV({-lBRQ zx)DiWeNz5#hBp#gk)+-QO-cx9Q>)j3S6VaS3Ja;uh=F}{F*1TAsJ3EQIFr-!A^a$c ztH-h|+7f>ir`jmDNZBaygPIu4%XC#a<#Lh*FibRuX~)sdAkAY`yX+WO<Gg*;jKl@0 zM}%=1P-my#K|cUWwiO;*)8dDg`d(=o2oxFf!voF!z2##!Mv|)%KYaXnJZ`beub&=U zITU+7vTW}ykNvl2J02C^?b`8I>wB~7^5e-4m!&Uqc<}5<b6D$*Me2H+kDqC7>SPPP z`lW&Sx!UH=gNQS`fgQ8M_SBN@7T%+)t2Fhf4m%N4dOkr9MGQgkIP)Ty(kIW3QG19g z&!k0;X=@OM+m&{L-AuX2L<xV#>4s}?36ANrErP>C=yW2A&p~hi{szj1OOm)C_^!e_ zX%rJ&KfP8kCI~gP`DRTNYjs8Mhp$rw+q~(izQyal8eAIebQC)m_S-t$yAQB?*Y<=R z|7~Xf$9U}J(YKp>^+nd6j+_I#<kM_^c=e#t{sD4q`Ct#8A2VdCr@<}KEIga(!h&iT zv%u*hG7QAH=b#s81x?g2qXR8-8V+GK<iQwV7vnyoLyyw~9c@4cL|mXyG%Z@`$D$7F zXC;6}n-y&eLZ|}3JA-u(mn%~<;Ij@QwO|a88^Ta!1h))A+USjYQ5pfJ4)gi|x~J+f z6=ET+ZScZIxf)SfQYu2f3By&eGKAf^Dtsjj&@>l0q5`ds^C{XbIGFXoA8;abl^8oQ zlNODL&p@x2U0<aGD5F4)sg%;LwF|yJkDcgR-0C7$5i#4_D?Ti*xH5veL}jJd>UJD; z8(3%W(omP&uT>kpy&xzLg<G784_piELYHcvqKH^Zaa4djAH-Zx_eT$MA}6mjfP}Ds zE9-Enav<?Q|Eyflw1b2b+Z;5hhr3Fr59uYUFC3TWpmI`{0c%)enJ}sf%t~$y0SQ`Q zNd*_lRiCh8Xl|96)H<3i#YLSyd^TX!04IQ318Rl4JgQoEB_DXI`Ibh%U9XL8eR*D> zz`DA}d3SO5yd^~rm&fX2iG5!VjRyw`RXZmFi~DyD_U?%p23H$b^@QaMv%89I%<9@8 
zzCLpDK_kR8nXp1Wt9nc8MaOXi=j2Cw9UWKFngGJN6|bPr2kV5-c;68+5a@m30*)8i z&|xEaKDv#nFg088ink$c;4wfuRghc)UX*g&gsO;gKgoz=k~Wx7m*wL1x!f(mZxe?m z43LrxN0I#CeFj)z*NTffm(F6>Px;v%=eHvf(~4Cc%Omn}@w|Midng>KGp(1;E^do2 z6??>!vHBs4uaylwFfes$Dwqg_yk>thJ?DqZRc}FCJqxO#1hat?#?!Hs0YXn)V;~FE z*q#KlgIhKw_m8kAr~D1<{&n(w;(4)W3UX|}TrQ@>seb%#p^~rpJ!0?x@gleh`~^o& zsLgz^qH$_$JJ5ApG1tMPBw8Z0JIRHEB7^WNfH8`A=i@5GQV^a^Bt)xVA$&Z@-829& zLGlm(&G(}Ei2vdWd1}2DuLU?S!3TNK0>U!tD%4A#i5+l`TAjO+p>^9fG~Rhk{;+uC z>l^OC(EV+vn05QEZ|oi2^t(6NgOi8C_K_7^PQSNj{q@USTZfNLZ5MBEd~gm^vyI>0 zEgyO9v^@0Cb$GsR!K2!zUW}Eh6jI`4tvEZ6`t-^fCkIWse~LgZtMH*&2kLU0&P&i1 zo!?jC*Z4*E3-Y^0vyUxdKkW6od*lw58)}f>?NDt#Y;}wBCi(ee&#b>Eb?8*j+0NEi zdNQ#o>QfY(MyLWxQ(PY#oAOZo<04V2J5VGlslXO0`EeGf<4<_YYvWI@PxnRQPcDE{ z@mV6x!6p8!BcCm*T<UuqK_qIO=u?iVMxhED6z)UZgw^0wKO8hn_&QG4Xw5OZ0N=#O zCbv@PR|9bhXg)#kfkveE(70KkG+NOz<?U)e)v9z^h>-|GJ+8JEQ3{CS1ZByBK3|7- zuueQpsqpJc5K16)(v^xZN7Uuy5P}+*D_qnPQ8l*oxlJRFwr3g27e2854S&dWR+gWd zvthL7g<s3dMs~!uEMEF>-NJr$Y~`6p77nU6ZNDe@oT-bkZgynv>l>4kQ-}Zd<^KO` z?Af2M+1o>+hQO8qS3o*PZo<`OQ=FinRTbF|5U-$m$5N%F;7BD&$wM@GMF@;1^jW#Q z15JNE#~^*2ND(u9qBhaqkC!CEMDu<*!H6Fv8|W2en54W+$Olp!=iMt;c%oX7n6Bmk zmTCZ4Q2xkhHF3;J_#EN?(_s=(6vCzWd7*i{q)=;uFM%{SECuPLD6c@GNKlqSiuJO1 z(*guT950be44QT*Y*WZbQH&C@T*BVMidf7dZ$cxX0!9ng1Cj&r({p_OrjCHGxtPYi zo*Q&Beb0h{Z~Nwlt%iJ4p583l16`Hs)4t};3U6EI)SJ^d+R;1YKKX6=H?3XafD_nA zlg<ia{M<>pznA$%b1{un24)UU2FOPyawv|$D8NK<ktPRm3=||dz8fPn#Rh^6ri!3W z$!Skg<|r)zO}$gK!ikERl!K-vc9>owA*aK6m12Sl%9_Y>bQ+HpaPNAnX5bp7grDjc zjUHJxcl(!nRH{Fp95u0`r$>Hxmw51Mfdu)D?f?3~g3hmB*xe_8ApgVHnTDUq9@r)q zY?F%=6FAlda}T7(YsX!r-X*3#Ai0%m$J0&IiXkY|;VyzgO2>mT9VoU4@B-0L0<0Tz zZ~;*t257q2DiO6Yw$TGQH=7^G9SYu2BS5K~E*-ysZ?hGqu>51={;`cKy=(sd`F|Q1 z-O+aFF3++pvdrGi2{_sQlPi7uSoVnxCx#RE20KwBqWd)cx0eYwDaa2h=+fU#7*vXR z=x;A8HtX@Xmt}Uwc}~j40nPBtP)8BFfzmc?htggJlp~u8+tZBMWDt;2+M{q!e6%$I zycA$@9w_(W?1}45d$*3P@vc1m+^YlIzTW=C?e1m6zQ!d@VR@gC*$dgbV^6L0?q~Av zo*sSv_Rt9ac&&fVj#~{b6QVXz*ogX<2;W<XoEmj}VxA=E9}?;rFgQIyOig4Qkz(?u 
zF|crGU`W{|pcX);Ku|`Y8hNMKuEwJXI_!Z1J)SHRwax{}hO(C=i1^gYxxt+nH*p}> z!k{KV$dV?DA3r>(gPBo)&|319dzV4TnKiJApew~St7pk01rhrX?2+N`AAd)F`{gya zGf^kMvTU=Xd|B_7eW-Vpjj<-xDHdAV<=ZNseX#${)CKv?(2GA*Y1n?Y>$JRB+tQ<G zy^~E;??8>-Ez~PG^vs##+Jw|8<OWd-yw0>iq;B_wkgGLk%Iyd+Nn(V?Ayjr9ZI=lR zKI*|F)H|Br{;qsMKE&>l`yN<wom*_Q{>zT|>iY&8{^#HC{$>2)+V8UaAvY3WWB$5E zpEeR&@9ZCa^Wlr~(QUoT8QTJ-aWC#iK3Ed@y@#4lI-6o^+7Gnp_&kJhnk$R~Au1jr zs6>H{4@f{q7!?Z|B)|xP6<c8<i!SQ3%(OGL5s$@Br5!)hS#*9&!k)zvc8gB+utt7f z-(=FuFRJUUYx)q#iHAZPR3UNlRhv!xh3bPYqfygYP%t$|{82%{<fAnDj62bR_1uj0 z^w4@z&x5QpfbD5DnJR9d0Jkf^4aRx`V~3*;DB%PulY)u=IeCQi<O|9RdvsxGE-hn- z4ma8^KFZ>(UfwJJYX2LprLq6Ddqet5p@zZrtKxmz`x<~Ti6p)y-!DHazbb$Au@P_Y z{rCRxsihBJ7#no?BC$M<*~;uW>HVQI!>vxG;#f!noGdGZ?{YDLMkR%5u|HTHA5uZz zM~nUC=DTT79mz1UbQ)1PfHRS-KxfftnNtyK=4r9wPN*99amPk{Wm=MJ^pnqz=!az2 z+_}*DlVAG8qa<G1muw+K&s;z|gWQC6#!Sx)&^QDR3(R<C^dR^%*UZolBe6e%x`K$F z$fp&7Lvw*E^UPF5B122hoJTsY(5xpj439GNnLlxB*$e!subU&MK6R5t*R%kj82gm% zhK&=@hjVLp!28V?UR4+$IVCz5$MP#&Rt>ZWb5RE1<d5}uVJ0I!-&Ktz;UZ_Q`IMcy zh#HI(yfM-fJx5|A5W7l8OXz1@77K&z2<3_R5F+tNUa7x{QH+Up<z2g(PW^;W3HoUi zs3eNT%OVUMCKR=B!xS^CfU%#AsJfZkOFzz)3yE$zzZ?fDnVL_iVrec7q_tykS8Axo zuM)ll6k|M8@%!rYV1Me-@640~K8vo|6`cL-!bP_=$DNd)Z`}*_e&Y4!gjQhh|Ixe| z?EP~eUqSqmRB8YzSvFWlp=ws<AG{N)X4)N3*lk8@;nwEK4PS`IUzORrb-XXu#|oAF zh2v3ZAOf`oBIi_i)_d@ROVTys1=ZXq`pP1>#8MyLm-W#Ol~HZs8s?yd;Z#ClrMy0x zPH-y9L=0u1EN=o4ex^MJQc4;e$%|%S(BpY7hNv>bZ;BzpZft9pvdLW>6fseyY;r%{ zmjrewOq(egXxua>(^zwnTNL3eB@K}ks<e33)s5fnz#X!rs<H89D14-g0yrSaR39hY z>7ihu3+xC|Ya>4+*K+OrXLnybs8H>6@Rr<}FM0s3AljKqe4#VIKKz}03OZs*p^t9w z%s#rCE-mp#0*~p!>r@%XoDzSe9g@`?<TJ5=<KF;Tnds69#jtl>O_UsTbx=tGTaS{$ z<&r0;)HoUg9j(i>i$t3Skjsqu2ij{;SHQ)lyE%15cPk<}+Gsn~KtaA^#zrH&A+Rnw zGnJOKHnxCvu!8zAg0z70=vwf^Ncs2&k`HsXS9I<(Y|VART7n!H9fgnLs04!=BG>{8 z$`+Zf7rnJdd7gqaQYSu@>7f#h^VT`R_=@xOp~8Sce~H7=VGlcYd?ggGT7724J)Ij@ z_BD4W7or8`2^1}7mF@Aqxj*c?s_m}=z2*5U*)8RlZ(cC>&*i<R<mY>vlYVc^Y-Q|) z@4eYro16#x%e7{-J5ei&2y2COIW8`7i=<r9w1IYz6yYMYeK|3z#PT9U6;O#F?<yix 
zKwA;%PvD9{t7SUjreZ`<71kO<MHb8@iC~NYl+{Z76A-w@nvw)OEMkFOgXtMm#B`rY zdq#Y2v0px)bh+J!K2?N9yED+f!NE%Gwqu`3jYi2C-d6A9n2@J4wv1R-{9!the!^H$ z)+=Sh8@ugc6}&OVxGIrG*vj}jRO=xi!B+^q9q@If$}e&Jk!?d0pV~Urz;Eh_L~tTY zu}En1XX;F}Z~=IZ301b;O^th^(A5-i^P(r#N?05~D+qUQx8M>Wa75&2MV5%~;Js8L z^*poSp_-XpCwiz22IMa<_O}w9A00*{Ek%@T)#E*B@b$1cO!F|acn#j>mGWa9U)$6x z_CMk2_*ql9veJ5BsQanpn%m6poIe{6e@lL8zx+Jvk$YJqTYIy(?U<eowvGOJpBRet zTFiz1?v9!d1^M)zZ3DwUU|W^AlV&FF$GSGtJZUMNFd{$WUfoDX3blOLctFv{{kde- z6$cC^o@oxhT!q^Sq{HJHXElBUrC6#x6Jt>VCKyzEQZ>r;M~(K<09VwIR0yPbT{b{P zXi)JPEk{)v0IfMwM}sl~Ls9}L&})>kj@k|s5{<Rd5|L(`6|B67Zq-m8RZ0m=RcX%E z=GJ8wdP7ZpwKr|6Zo0K&c}~-p7FDlWeAlj9&o;OXrQSB?-!$e8ElS*Oz3D4AzWLbV z+xs=f{^9GgSw%x@O-0_fZ>+r`Im=n0DQdjozOGx-iP_<;$1K_Mt6MkC`cg7p=-2yi zcwlJ%^R^Pi?9hz)|B#NxpEv?P8%H2%7W`HvqMjC@IU)^3{4gJ1h0o%=%cS?^N7t{4 zTp$6z(~J|AvkF)t2P3=dfZgTAbVgo{{1&LBtSB9#?MJrNf^rdN|3USMWF1poJXO1z z=u=#B=%y8UvDpcKePKsHEXX%{!bx}6Rxo|`g+Q*nyh$CN+AsfnLz~v>tZ})k5^BHI z+Pr7IxN|0NC+83m(c8PA{W+e=glGB(Md66BCIkTRfu@rlvQ?QnY2_obt&Mn^1+b|T zH3|@XQFOvA#1|keG?QHz45wzFn~3nco}L|{r=EQs3~59$q=bjhjiy<Ee!%+izZKvs z6QI%~hL;F0PUz<w$*M-MNS-7<gZFXuQE5$$#3<z?K?f*}kn&L$@X%5X1@4hKu%`<w z>8^zUJ@LgP>;+GCMXw62JTeqYmsx{K4-NVg0~L{0=VRG*nY@bpJx6o^<=9MepXB98 z4_6%u_OAV2uh%vE8%xuhcAZ_{Wu14|!5xGCtuFapwK?8g7ixL5qwKbMk9DuFuk5m0 zLmmEJR?+J+i9KWNO(m)qk4#-m*0!v;BOY50S;sN?Ot|~*TU(bu&=)eQ)oQ;90@Z<_ z>itNxYyH&FzfZ8Xd6DGoINK(R_I(3tQET*y_zxr^)Qx1PMY1(I2lnrJVX5$ULfVH7 zBe#$AGo4>DaEW1fPKIH)qX!>NG#B}z=o|LMGLqkXYVQ^f)rP%JHAT}w&f-kn?NlwN zT%sJTB`Ph%k5MOFNcG^hh4Zn0g$wCrp$)$|JDOT(=F$&Tj97*(bfS|jc>(N4E_T#f z>ZzYKhdks$03r}>r*1;W`#Rd!N^%G&)A6}ST!fjs1*D~-82!b>tccR?(m7C%+`H>% z8)oOv8-DCtgZ?eWid^y3vmehSDz?m2xGL#iyZ?Gmap$%rDpRcY-rJ{lPEnC&C*+Qg z>Gfu)M&;L-XXYh@p7D0MV{}&2^4sIFm8)C1@BQ$1z<FLkeBLXZ&2Z;}Gnz&QR5?!C zQIb(0-Z3n=qLc#Gz=%Tg9&W?W=Toi*)6_<h4QS&c=ZNx&a_6uzUr>l$oC`Te4lOSP z2%oxtZ3qTilwWHB2QuhO(UqAIwR0x9Fx^QI<5ogFxSY0?_fOlPN0wtVmXJ_<s)dsg zqF$`W#&W&k?p|Ed5dNjq-12d8Ifd6V8uOn$UN6Yk^hB2H^%x&`0#W($$kKU@Kkoa? 
zar%;9^pEVC+&oi-dNYe}*gUUCzVoWR{HQp6N$-(I`91WZCUS<y<X;kw2<b1;Jp{{1 z4mXlg=MC7F5LwN7PQ$^G6V$<`p+@N4)Q5111d?x`sObAr2w`f4j-aV5wKSXpxeQ%9 znB#>Oe(e@x!%CuQ3%_|5j2EX0qtY$7ZVN5&CeGrUVE_fG4=GVPx2=cbO|UXGHRF4| zRSGuZdja?Zy;p|wfO^M(TqBzpW@TjhP}m1MCYJ(_;wMm1LE{_#$q%k^0F^{tev}w{ z-7IR)*TGyO<RWYZ-n$g7an%~c{{Kv||I~3)qO~^C5;v>QZ2CKHEf2=~x~%?F^XA5V zA-AX0<uhsHhgXgUdmoxrYG&fW;jz9@>(-G-8~)AmgNIjKP7dDp_|34DLA5Sau2Gx) zFq~@T6M3*CxNqHiti3<rMzbyCm(MmE){mSzuNLuZqWJXb^~)Qk{t)s({FVNhVm~!w zyq2{@IGE|?TDb5~+Ej>~H=1t3wx#-GQW6(%_VqX6BVMoiQaBYq$AyoPF~sN6#>Bz5 zm}-bZ2AO#AQ$0~)#h*!tf8!N|_<d1nDK>3x92X;mivdK~53}-BRl&B7e!56N%jFHw zP9~|MRSbg(%XqmUb*)OZR>y#I1%Ty~X_QH&tRB1$S7;dIiz_a=^NhScB|b9~=#pod z`oGnVtXz$*s)?%@WhCD-zoW>y=jMyCGF$eXCE(D{2)|!0R=xl2D{qo`UYkGFttm0N ztbu@0H7#v4Q&BZzF9E425T8aa<+&ti4Gnnmd%}k=t0%PU1c$It*e;|^gmQrJ2hB@R zG)C$Vq&VZ}>JO>T@I<e2@p?PyruELFVg&@l&F84BNZ47F6;-f+*-D{*5jG|tVa0g4 z&)-X%fq7aX#}uq?@8G&c*`P+MjmGd3#BjaXo*?RmC>!yqbsoMUE1{4=sZN8#<f`?y zvlub(!_L7w@d!hW-6L#`68z!Q`h?iKyhAMW^z8oU-94T%QU3e7**8opqpc_ov10XW ze%~5cJ*cQ1A{8}%SzeC(YWK0x&ifLf<}z<Z?1qr$<)Bk*6&=3Lp5ES`PM<@xYMpZA z-ge!zjD(@%#;Kj+iot&Uw1R}j;i?amPq~jh(_R})_(OwhDc5!ZFS;4AMyGH>VKk*x zrs1z9vUV6tX@gG!w5y5htu_(ij+XYKgb!+`bDTb09!}BR$w@oXo!p*vHlsG!)=9dT zm>8U;l!C}xCHT;w6u>14@S&+dE&XPo9RI_19zoh^fDA;>lBE-SpeNHdODPmdc2o$Z z5<q-Gu@8(7l&9J*Z4)o(C;}c_sRN{rq%Nv3lFktJAWfiYb4ChNUQ~T`)4}8UD=t6} zyBI2-acx?QB5g)7)gn*E;>K?K(#d~UTN@(|b!b!bB^?8G^QOtoL%z6b^r{bs6Hc|p z6j7_U?PVUdI%F0#>Y`cP&yb$P81nKgp$&Q-oMr^`x70^WTl^8SR~~XR%|f9EUZgqW zMgIS!Q7s}5n^gDkoTY_u%f!pdN0Zfn&Ew%YB9Y7Al<|d`s1YnB$WM4F43yM>&Qlr^ zicTjS^{7$!>$5IZF*q4Dvm??0ms6G{)s&Qc#Kb`fn}CE32tz4YgVUsx85vH=r3e${ z#Y?e^a=SeV<P;f^vV6)zFnXYcSH)FlTN)RS9Ox}A?%c6legmRNEPJH2?OuF4)5u<Z zQSH$>!oFpzJ{0D?w&LLGdF7TRU473!5ZQ?|Vlr7XYuU2;*z#3vQ-fT+Y=$1ST7Bp# zvv${5tGvR~Ja2JdaOuHN+^Z(+!=LoM|D^BzN&4Phu<=ym2nlUWotf{Y$j3w%i$rRS zWYNuhN`Qh_8w@%$wZUdospJloW<H#X_CA45OJU6wc`YEKNjICA!diB{_jQGM4l0_e zlTtm%kz*%u8^=Ya0zY!Sc^m~0Fwn*6D;V8Wm+2VN_{;(%N?lSoriwAKE?z%P%y89v 
zF6vPkz;y-tWV0o0FjgZJmP)BVkPn43q8~7G?h`~<;8jW&leRJu;Sx8O9{S~&$pU|A z3(Ery3Swe^oXP<d6cZ;Pd;?*x^MDn`$c)MlCfs`5X&odki3uj-PTusl#k;zj%}l+s zMYO-BHaCS`{a&XjVeqvqOHMs`yd_~019QbgAI#I*>aBG}3HD5&)V0k2IuqEcbFDh9 zG05Xvsti;oC!IsNT2r3<k{B7fB_Y2iGSG1t)DQQee*4n0cSo&WS6uaGB%>4dC++lv zhO0y>_LrA>EdMWPr(+*bV~Y!$gr5rOP2{a^yh9;x!3|0;x@|4FZGzCJn)y(Z2luKg znhxL-8d0QcM#@7}O%%*jXz73o;C-6_v`;OKO_XvUpQ^`2{AxHgat<2T(k-}sU_QKM zgwm2*R^p#vCU)c&dN*<x-ua`_W<*W5j8I6lnsSW$h{RWS{moyYU>9ghy`g6Ty);`V zjLzalp4s~>&<><K5mRzSpnl^DBQ&qSD@99WD8PeL4$1G!XGkKI91=Mr&mlQJe>&&_ zQfoRbMfhc6xy+bb68q-d!SO?3ZH~9d?RHgqBeT0sc+DL>kv)`PGR>J}2U5)X=GdOv zo({A3L|13DEa-HXIK4(~_)vPVdwa}n5+7Ol7d?$5k3?GQkzYQ3XoV&h8S0*q#-1%J z_xjDE`k(*q0yDI-fIk#0bqD&}f4g>H2)sE*J|gD~?OQuD=X|Br@V_ploKKy3{LUE} zXC_e<b$32~<|z?r48_$$nEmcSOjjp-S4h{=`Ov}8@~JBTd?sw>S}(RbhtHN^2ry-j zj+fMWX`%w`-tKdmnG%u!Fkn+d2$v1Ko?sBL)dq9QK{N3cD9I7ycL_D4fO&UVCUUIa zfWo<A))AyuW#F@cHj6~vajBRFbsXrV62z>OOd;Q4%1l7bg@O%A0d)zC=9FT;y`^LV z%awEooamTJL7NIYI^=UV#|MA;@PB>e3Gc?DM%Lu|a~Csj-6Ee}))I-c%E8nB^vL>M zN99+}%CD`Dgx>a5uQ_<+{;mHtCl+Bj%sH}CKCWin+NSYs8^7l<+E4!F(=V(JcJOsW zEQ44$gbWbtMqN<SR-^WxD#t}viuXY0@*Ze<R%b6yK9KXF71~7JDVnaL&UITA3iKGc zOf@OWU+q!qSkEYYA(5XU6quS5l_)!GLbFsV!*xJw`l!KO8c)on`+}OZo|iOhsAp<v zqU&<)1o?j|Q&*pwk!Q4h(e(VPF7zKGC+GXYbMiZd3qrbub_A3l?bs2QKS4V(GfkgQ z2JhCG)QunaFI~lV=dN%Hso@}YCl*b24$|(BqFU;@Xljt}&PQhJq$>zMz0}Rot9#rl zvByQ1_V|{lG>n}aq+GEH=mFAOsv&sL(L}Q0`h~pcqC$_Brm(2|GLcp8UwSL;Y87Wk zD8h!{M#i$c9sdi#!B5HfuNgD?Pwl$6e|ZNYjh+Y3^ICv7{Kq&1A6XlCfAWuAu7Qo? 
z^-=i1MN}PKL?@$#k1@fyNe%OHG|1c$939S&Bj|{unmnI9ny%(#G1Zubo+$2G1Y1a$ zQeGV$2&XVnJ{RZ38BNdC(N53p#d)y-%FIi0Ix>N#sB{DNeJ+e#+G1UZ%@w3B_`9^G z9#tD6RF$H1k*lF4L_=jpln@*<{M;Y2ChGf77efDN#ZPWS(f?Dwp$Vdz02@j-c0#AL z`-1dZq46P1gsugPE9gK&lO{~(L_*merxN~WpPOh<T7KCigV>gB!Lm1)`4l>MjnbuD znz)oWy&-5E;|kBQYSeCsc_P-pV|1X+F*zWW(GDam<2l7HE=>dQp6Rmhxw<)bxkP=r zQbX;%OeuRtch7ijQ|>?B=apz)VovqT@2O3R=IGjoGEKa7AJuuI#jg%dszOzn5;Aw_ z!D_X*c)DY!_;k;Qztir7Z<<S*tAqlFv=ayB<Rse35>oq^muga#F=By&WG_o~hb4@s zQ;O93!#}*Iq)0_vwak$Hhg8<dr%skkIxb~SkT(qH{2>)RIbIu%B3p{k*EC*V&Xzz& zCqPk)5no-nHWJOGO2o(f(H|=e-X*Agq@kRG%K`7D@&e)xJhNAIwZ3RZLsSDc!Dxc{ zcr6Wra~k^$b`W)iW<i}st}|NfdU>(BS0A}Wuiw-a-=x<MG-dT0Du)4d46s8+wKj*P znGwa0+iP^qDot*#Kj1eQwk>v^G#Dz&DzgpZJF3MW{KO9ZSO4VcWdm8$g=1aca#hc| z8ZA{XKX}*aQiFIklBKP`aq?iKHexZW(J&>nWyFBw%p9uawM@hgPxAeN8jXwg2eAV* zwGlfM(Ee~jm1ZTJh%|KKqlviH1%+JFH{-Sk90L$N0Dw1D2kVtb$KFf0D(KwiDUn3| zHIYQe6_G?^l(t*S14WZWxhIb#3LQ1I&C`(tv5J3G{P4wEq3ZW$`i!aBpV0OE@yxF0 zRiOfVL|%QRhqyG-I^E_x&bqH`bK+9`N#p)c8ux$FxL1FQ#yxQ?zkyt^lxBAkH-RNu z!5Hb`bxE$6r4;Jy0vMhGvy_VhTMl=<e4c>l&eM7UqXWc1AsIMTxCKz?34_`M6+9YH z(Ug^oaWiq(65vql2jY;Lyf!Okc93-v#h<?{zqsT4iT8IiN50qKcde2Sk3ZPZw)~$T z+4+mR+~${UKV|+W*t{2>yjgr@rOm(=ADjD+TldTF{CM>SH&JKe@BWiUy`oXi`;_oA z32104-7rU{8&*&XrvkLtM4b}!)^WnFI<&Zm*Ntcpz*JWU*FZDZn7I2Q4a9R_q+zHI zf9~EO%~Vcm?0C*<C@!@Khn2HS{LB>a@rfAH<+evBwDkfeKuC%Tvp^K!d!rME1pf@$ z^B7hn=vgs>A#`f==cEP<CL)l8Y?)-JhdhTDWsKqw9JH3t6=+2s-={E504@?@P~^=U zEka6gy5SOFK)_1FLs5uv6jUl42MLUK;8$r?H43(<++v=g98Yl!`sga~YaoZiI}|(e zxQ@JEWuco=BC^8<!@BmV_e7Jb0|Cq#^u)mi(U7RkWnVt};Pq3#?YIZ;&os6_GH>Lb zmLEU)<l$!<hwoGRH0vE0Qu|KH9d(x)fnt`YDWR=h`q)}|P4RjKi^9TMPtUG9PJPcf z<q=;q?mM~W;a(a$#@+gpru_f2n({XJRq*Vg10p=6OtVSV2n?s&jA`ClgmygjTzz#d zxW&8$g0|v}PBnVeq|-_YSR4w<eLq8|o0^V9w4eaVc>)xmRwGJswU%^QJ!Tk^DP~RK zE7Zr)iO5nw<ThY?B&~)HEgFshK@&etrmc8&%TWOoP>TW$i%eg82GW43H+V`tNsZC# zaXfVQ)Czufsqx5<EZUHmKZ=FeftcfpkJRDG(|rS0jlOjE51fvTN3Q8Qa#-S~$!+08 zeEn4+AwRe1+iDD0*A~K*N8#I~IBWuDO*l%xm9dLO<rHi;+M-$n+l?6oNEm|_1v6lw 
zM6}If#7Zh?fmA_Mszi=iIg5T!8Nv@hhzT^CWFhGkOpDAyfv(sUt!t#P8k2PHTB-o3 z(8-mO9FRtMfSK!q(PqkMP}aQSV_MJY)F#v9MGdqf{!*r8i*?kGw2s{S<hQR_w@lEj z0A-$e>G4O@;4-{)_km|Qzj5Ufw&cpsCQn}(VzV7%FFcOBPk7IRSi=F~Z@{fbYe)(k z+C(1~O|i3q4xln%Y$xm$wXN5c+i^N?AevAK(9U!g$8}~A*22Vboiv14haZ&H(dw0< z#*(Ttb26s|_;f)deYyZRUBU}CE}(af?a12)@z0z@XUMELsAUjoE(p-U1urwYZH<<6 zD9lSM3#_S5taS+>yyI%!^;BZ3jRcr>8q<jth6Sd@=E>s~C-&2Rz+t(U_C&k${ezcB z=Q0L)q&3pbT&;mCwjkWW^h+Z2(W&;vf5euxQL=MN6crTbRVR1T<mvUv#D<UhG`{iL zPcys5&$~;0mG9L}!s*L<C23I!{n(uvx_YN_vGh<p{Y=}`jpEGZJxVPw6O4M|1|>SY z`Sa{p>PAayVKQ}tHPwH4_jrNykG^{|ESWa;zWhbE(BttWL7-IXc<7I_k24J&<*sqq zt}VECMsPa+mBN+BC*=T!_EBU`V6#ek0XnRSu)$ghzzN#roZ8uYj#M@Y(d1Ex;{Xe! zhFmsBQD=jsmMYj3;sA-lyXPL11X4ae!Ag2gs70sDd>A+fv7n%6^Z|t8;srARIVB%a zh6Ql9e8goyuy(Yk`P9%QGzw~AKy7M4kBOdW=I%`6tW_I^w#@creJ7PZyYi4H;}1eT zgUxq@4FHU}q2$250&wpe=H$l*#x}JEo0fEiH#{x7zd4(iRycZZZF2bDgx>04bEYc+ z1aQBtnFHK$Cscw1z0fXA81w8QOdLEXr28_pmql1WRZbb1F+0{V73XpT>5^y@xr4S~ zA2C({V8%xbYR=I}R;m@XVa_sEO;ZUq)nuM(>ad&(qp4~|xvd+&*3HW?ysKjt=y%NI zJwR7NQwG%p<VgUb45@i3IUnD<t>Zv5lRyZk1(x&<TJuGHSYfDz1WA2MvXk?O0dJ;M zXOtN6<BS?plwfI3{Nz(IhVAkm01>3dbUx4nJRH^R_;~*@IlT6-dff+8d!M{y)8KO^ zh7Ygl_8R4P<yZWpE@iJ!#h>Y9$W&>j>tNWdNmJ8`+v4j_U;}@2wHcGoxEs5c^~X-9 zGFywS!yUlhsYAE}*+Pue5NgQ#MBu@32lD9XV*mvFC=(@FHF?-8PHkVurGO0B1R72> z(&8IX1x_0c^wMa>KA`gkk_8R6f?AYud^bROQp%@-m<CEU@GF9pMY!_{<$PnXi=Xe$ zw3h7RXI-z>&$%8<Sc+WzN#yELFq83IUCP2C&Y~kGkc;JrZvyU5SR4X@x<TDQvxmhd z{titHK6@A=UI5~67_Zc+07COCwgSR6K+mL%QPLGpmny|0bpN?lv7cBGH0aWjhD`xF z_Mp(PfTI+L7K9WC73rutENOQsVja%g!bvI7!S^E@;xz!mrrP0ot$8G~#DS-zV$8MS z><JL%QcD-0mU4wXLJaPgU+U@(#m}DqwPpF8$sf%fI$*7=3^$%@_dL-r_HG(&mk-|2 zjCn+6eF&6YUGj0;4~F|j_kdkO2(gZCKbT0&tTz%8_x*abO+KPOwoSZQ-i67;m*p+| ze7Qb+Tl)^yGaJapA1j;+Bo;;kL$m0tX|cLdI%puSQQG4u*ruvLmESZA^DGL$ubFEh z8FAu(L@nkguDB4#%o9x+d0>MBC#mpbl`?G<$DwP%p)(R|j}0YnL@(nOMNN(P&}#_- zC`F+@AV^wNS3n{tHPQH15i#ebvsA>TE(tWT=;57mPM}^Uat0GCXpIP`{mJ&h5(67h zhN44IdDUBX|B`7_2VZ(?vi;)?!E-h??;Wi3fP7qj?|u0R@fq-+zAV22dXx!P&x{cA 
z>6p}BdFQXW=igyAwpQMOe|K8m!j>~iFli=iQcb-JJ*GeZT1Y3zLW7yO0TRjiwXn~G zFeQir$eJg_tOS6j7`caG#!P4lXS$*RZ-A*&Sy3|6>1q{ScIw{|jzz`Iv<f`Ed=^w7 zlmT%=5yaY<<0x`^61?$)Zd^ImR*)Ki<fSv71)jnj=w4D3FJgd}FkCqOCCQY>ig7UH zMP^ToLlIa7z#<};W2q5D8s%|LSr3uJ4Az>bM$@4@x4CgP!~zEre<{DmPVe71JKjAq z=Nn?Kb!1m^{idJ3!bIN*Z{wM1Vun-gLwG&ft5(aev!k(ssc~`j6JtBqiRcDFzW1xU zyR_CL4{Uu}%y_{of_ZJzMy&BX%-0@LjRMEpD7>j;ndlHvO#}mlsaoVjxnxGP;7#DH zk5UoNd=Y`3g(BpZJg%rvBqLlTQ<_o{$b)n7hWYVEs6oSYLLrJK)hLT(G?C1AK3h&5 zBy~!F_>p*0DzN+^WjhJtO>k@r0N>fn<B-!pBMDzGH<Hr$Az*jHxWMuYsEdSF4S&H= zb}GQ<FyJg@c~EH77;=l0I!lojQNJPEno*$QbjTR4i|5l&iGp-*1*@W&kMd>)8wg`k zikU6qsj25^68tI{77sqh=09+kOC9JOoU{J^r}m_bk9V$AYa@Yfb`}FWP(O8)i`x9) zU}zJ2H5f@HZobLpxwACxSu?V7A7({t7&8pIvW*UVsC!_i{J8vLV@tfWkW^2>J|#l4 z;{qsp9u{&G9=ia}JTXq~K{uM@a}_Zvn0M0JN|`R5athiiFa%vtp@rH{h!&jamq<te z+$~3~d5)qoT8vZyJQ`*d;mHVSWh!x3P<)YJhF`WSj9Qd(D!>{LgrPj^!^lkx5gP?< zIzA6hpc)7>h!-E(y7U|0zHi>0k%IKnQKsJgO=HM-c<+v#(WKGne{$RQznS;$Q%5G* z>u29cO*}o-yJ=Wt?)2{i_U6t5V-{22;PO%Rz`ggsu;GM!f#s8LR|&1)c-^8J0jG-% zY<8Ha$R~@NvA$e-%YzHA1Ls+GK13*p)CPq*&Ik-04pNOr6yW0KmQ%WTxw!?wfK-Q{ zsp)Wra0(mofsuwdF}|s`jOSL>l4`mjWfLmXNrQrdMXIG1Fz$4S?)3DcX!DgxK7t;x z%N^cJI5Xw2Xcg#@MvGtCx^3u=B{%oZ4)?yXW=IaGZC&2dxjjav*|j9NBXHt7_ty1~ z$fx^`<*|#)I~F&(gD{@INS6P&|8BWW9kOb~@;Y&DC+=`P5SXj)?jF&r?RWQ0>P0Q` za9XcstXELDH^aa{GO-A9j2OP7%pjeG7c`2g&Kw%>3pg-P=im{j15|oxfzJmEpui=l zERivom0XR^k%fE=JPI%>M8S{`ho3fEh*khVELgeBgEumQm!klrpc^lD!KaaP7fKu2 zjZ8PuF8@-kE@*Tv861pz0`WStdhg(`)z(NXct8v^`ZABH-tc4nny>lVrhXX;sx{^? 
zVNb(mjXKD4Ao-%&glDc6T7}~o-XDSl6SI#588?MOD=$W}=qUBFR2!#@u2$k11mq-5 z*u>wZbsT!7qd*+ED<V-S#yJ7C9_ZULgug&>kch5S4a<`f6_!?0h{BmBBJ$xP6;dI1 zDyYayEJ;xR1}H17E>+PlVGtw)A{fZcOIZ$l$3hbSuM@LR^$Dwm5WUKPZ|w|7FJ+_o zd0qw6n`Z+LKA#HVxl_JY)W&nf#t~EB;wH2Hh>sn(Ae#J*fl{|4VNB@VjkmYT`=4ul zHNf;qmn*s7UZOWu7t8N6b4YLOWCJWKsZ~ysI)H&<tv@z($_pJjeNcYK-?JnkUu5@y z`7J5x8><`z+B5Qtf5H7RdP1rW=(ih?HLc3<Yk|gs^Z*KI=@A!A2+rIB_+dmwoCg<z zh&0J7@EQ+Xu|y-E&`JkalB9w@cRII#_--+li?Lfla+3H*PF^;Bgvm`>r#xTOPbfc= zh@cSRA+gsZO0n3yJ_y6$R?2L|%~o_d>v~#1bgbY0v(tAkW9o<4P{q`1?L%K592PsY z#p{=wLU+s6Vo7>*?aA{@FYoiJpBUa9!E-Uei=29^YO63zr)ik%P0!7jak1SuE~jLW zo)F0(O;QrwB(yq61d~v&)F>d~zyflLIUI2UGNC($u?{q^2Zsqi%2IcSQ!CE_A0#cp z0Rs=sqLmblEE>YetBj^sE~0lUZ^1iM4cAiB<qopY+Bw&+A-i2<8PC#p%%+!+Xv;2) zk)AgIvXUDokt1Q@!q7{LBdtUJro_<{ST|yz5|^}spRwr^=He6{_^~oa>Bm376r~+Q zMg@d_rE;S(LRo;3N}|4>9#jYV6-xBp!rZ=9$-~2SUbo$3^SeSTMmn3H^!XFOIBe>D zS)FfA_<c__ciy)GBJ_Tn$?o>X?>d}Z)oXU>?_YI;Z`GDS+-zg&&4Z`ZW=lM<WsPs( zj-5V}S~O1WR=XY=&b39{OI|$jtDsjes&kadN}~qyKA~TocyWn40<P8Jhl|y_{||fb z1Kw7B-ix1ebR^5NEbEVLS(as4mStI%Wm%GCS(g9*i{m(s<2c3`$D|>oNg%kvq-h8= z3u%^{GL|ydkWxmoG?(TwO3#r+N-1Nkv}4>X<1$7vE@PB2E~TTi>)MW$`1JRAzeoO; zghJOo_j!KLbDyVABOS?Z{e8dh|Ihn*l{`T-VTw|CYv(5A2ccS>AR4pZ76m>2ijt|D z-V#K}n07diBhHB}U|X{-p0t;wh!hA?T#0ZcdS@~oNF7&TTaCE7gl#o4Y^#xgUB+eD zR!SVNj%_vIZgU!DII_@I@K|EM9&^Dfu&vO3iaRJEq16tIfrb1HV8|E)SdMMQ=l<%k zfXh*BGx;V*UVF_sVGRYX(?ywy@$)0`aPWN3$&&}S+<E_74?jIK_=uwlI+N<;4;unO zLwaEH7y6%x@!9E-opytH-{gU#>oD9!I~OmBCtzPkX%e-QqHcdo?uA{g3@1qy)|0Xe z*!wB3#Y0^|DIW6DhOjhnL2kn3Heir=fUa|s2=6k`L@x){;hegnn5iHmOWC9wPRPyh zSZ>OTFjgF6oX|07&|D`&WsTu1V`!3PunjZCz$k7Ds!O0*BIslu+%z0BGn?voF!rSS zRbzS?9*f=LWgu<@Fvu`zTzJ{T#eri!`M%E7-Qnb)b;JzfmiG6B{H~O%ocxy{4RlMQ zp*Q7L+FWrXuaUaHAsG5oPDfj}Nq4ViMid*tpy2LMO884bZ`R7HkLF)VdIha9M!qa= zfLm|o-&8O%T)OO8MY9m-E*II)F`LU|A-)eD)fvLFvU%>_nEVMGe8<o@N#hnWObs!4 zCmwn02;M3=(Xs@A<fp>YjTe}?@x$EQ$jprg61kDtnj3G%sZU4ev>R?j$O!UMX@i+& z>yb^#t&mMCV^(X8<6D3WjW9nsQctoO0g=WW*E)h911Yam5|#?F3Jg)zZpbTd1LURA 
zS4~Wkfhw_NEN<_>5WOC;&I!&2Li5Y5<VyP?$eo{nAZq}Hl%2QWm0)2aRYz8{krYJ5 zVlOiM^az2*!Z`TvNZy*gfBZLJ2zfm|F&PS`1R?111?Io=U%dljW_0tG5trZVM4ibS zj{)s38k|0#^5F0GZZJL7ybZs#ihF=duOdA~HxR9EIY#UY6=c{ze+h4;Mn>REf<a5y z5bBYu@d3r^Y+F?UXU)i80V4y1wH@*-ji%cfgtff~vL^^DYjrI^SOFD7(`pq2h@}Q? zmN{)hgyqRjOpn*YZ-bJdoN8u9qCC%friFR_B`<_@!o>sT0g2{7+;BPMTCz?kngh)w zC*()<m^>0(P`^2l@C9N4vv}XodC``LZQJb|J2=Bfx}5eBYoj&3E8RZ3(Prm$U*13F z-L`wTS-f-X^{`RY!fObLAdN2>8%Xzpg}lxadt>u7JW1GU;yXjf-1>W!e8K9&Z2sGu zuh%I*94HWt6m+uqK#!bEfj#V@`dSyxnXsvF4{0KrlpAoYFkU%bbgT@R%7P&pRx2dc zEQ!e`e1)I{y%dMaTAZ;~ZN6rM&^)GGc~pf&5ZVYW0)QaxVXZJvDKxgxL(1ETo}%1` zPqm~zU92%%n3Nk^@s1b;lLRcz^8)LiluT5RST-sFtO1N;y`YAgODe<y^~;?6A{!0M zj`cxbN50OJ@`Zepg*LtZ$T+|6CpvR`n-kKoBJ11uh6vgCE1EYa-ErUMq294+Z)`Yt z;~~D}^bk%|pBJEGkD5f$nVdhrETAXh1VTbtqHgS<@|V9iKHSgFJNc{kwPQH%KJHO< z1EE~xgR`L%wUpJ9%|oKh76EtTVD&iI{;<?`A<N5{Ze!eXZB1mV+WK&&wPq=R#BY&R zB%BGT!GC~V>;Tav)T#vVSf!KBCk5%ur^0DPKD{Kou)GTAUV$mT$9YXjWhe?O#>w4D zL}-955nC!ERc$WsTeMAAW<YTp0IcxW0I**9VFDOdmI+Mp!@Q3V@IzvLD&_LDb((~y zy3dM6XQE7_QNE$Kn{t(JixyABXB4-0pA!o@;8kowAk=Qu%!>1?g>Q;tlR*?+iTM}f zelhD?Vn(grlMr@ZQi$Yt{%p*rTjq8AaUl}JfVZKEoqsR<{|wIm8k(;s7F}sbov+Us zU`#$_S;*92C@5VM>wrKqA)$aQ3CLi0X@ZS5YjDf!P-SBS^BNj$$wfznRp4mhu4Il< z<&%TIsgKINIQCwakkSx7F|TsOkI>1H_gNohTs|706XblPSI7+J@~RRe<WqXlI!Dih zLpXv@i~4C?D1=v~UF1l776PbVWr%Ch)uRrVm6(h7us~Uj9~K`|Ul+e%WdNZ`DgB;A z&?p|7+ImPZBvNi~B4H2@ZJ9nKfFjt<KUvw7#QUw&pA+&!Nw+5vLb`uwY8$?sOt~KB zjmpnb-UZnp77#021z*yu`u(h%XnX0U%b(1@y7H$O4o6UT`y#eAaZau~yBP;daMa~d z1tgL(W_ZBlO;`mY^`<>hpgolxRH7GgegxRNat{>n)>qyAwC9QQZ(MEP(=5*$gucsn zJfTJ*9%A+rTWC{spUPrvRhr!!BW<P%1%tK_zq*viN);%!Gke){L2kh1Xy~Eq(2#&m z_OK2_j}AAZLG7&-;tPd-`a)qop2C0m+15HgiW|&c`Wx`axmoF4q!j>am;tnfntR(l z3zr}(ZPnu}6?dvJSqqx-rK_}1)QG=T*4kwde<X~*FBrSqobF_|QMm8=9|?w3(j5*c zzYK)kiB^MfY}3DK^ShF6;kgc`W@HbuK_NuCEGQIx2_YDBxg+y0uXrx_6HrZ<y`<vN zya!nufaE)o?M=Ytr>j66PHbCL`YFIM(&CnGxG371O)iS<XvGR-lzwC`4~OocO_QCh z{u5;NpCDL&;K-;1L`^VSS_|x->>%o_e4v4-308nD>6~0lO`kIK!%!+mj*c1Sv@sv} 
z3qlTo3>PuErC1%ttJL{(Vrm?Rj&CL0c{asd$SGM>1KBB?(M<5OBinfMo!e4JJUZR% zpPl&LiG62wZ4AQAh$lAulg<{H^2gzFboZ_u74B4w6a2o+b5jq0;q))h{N%n_KBPPn z3~J%!)oKyKaQz0+6b$IHj-S~!s$qn2e&yWtSzdUjTESqKkm_Ycxi}cE2!ID(a)%{Q z#8b@z#8gy7q<#)I+M*#QTT0+<b2X^`)1(et`&2{~4NeEJD;b|`Z5ageS@LGIvT%(U zq^DRD9{9>9pGHCd)V*gqtNQh#t*d3*p_8}Xv%~Y`-fRB-+zY(n=J8;|U?>*;&co8h zBb39U_ph1I><6R&=d)axUI-lXg9Z{ZS|1437>mCSyYbXw7vc<<SlVHkE@t+%*a;Ql zs8rF(RYKr^?oz=oyY7T-gh#cU&`PMk01H-38^FNCDwQp=u!3F5I;YhYSeV!IS@1{E zZah>P(9cJShH8Ki^oq6yzsJ3AtgP8CoIdiUQ-3HFI!j&Sdn;p`4F>RKeO<5L-1ti& zFnL&i?4;3n{7d>nQ}Y*Bw2YYS9Ewjg!;W+V+nL79D!b&^43~D0-9V$Bj9vw6)bmnL zSnj1rwOEhK8f5N(Z8}tw2-tnZo_-D1a4zeD3k(EeD+ebZ`QzmufoDpi<4=w52_#~E zgV)sWiMSn2KPdNw!+QPb*vS2QeN~`PV=!ujv$}kvQFqSX+HNqM@CLUT4Dkuap|^v| z6ZZu@&Nf|Y>qP&4!P(bb0_0jAY<^U)$uICaT{)sMUSQ($^ZAz}KHb7+;&4QHZ~kw$ zk^W<kJN7!fuiXv(x1suZl1qEYo?<`*6*u_QG?qW0Y!OTQ=}6$CwRoI|TY(a2mAYO7 z{oHa~2_b-n3@oRd--wK{&@5Z=5GYqx>MTbjnQ@l+)m=ZRRyeSvQ?gngFSbXb3ChGd z*+ZAT*a7-jRxEd8MnuwD(^c^ndg7-<onRmd3+fgG6a*s)%tF*+<_oc%!%ugnjRty< z)?ulF9m|k#dz!Gs!^Nu>1KQ%}pC3!M^<Q*t5e%cL_{dAjnI}DGdXjSGQ$zdjJM+>P zCp(|2#7yniJWmeYe&5%ge`q?@cAc<xNIAas)aQhsjn7NN{4V~;rQsKD35*@SdpNNB zZ=N3I|Lw7N_=5g(FYXEs-*a>{IP=0AQ$NYPq<lXG9rHlP&f$)2B~}|O1VO_##`xpQ zYf|K^#^_w(qSoW8>8;n{5ii{smbcMWTaTnj7SJn%1+*8V9)|>GIh?P6*I_D2vrI9f zO@L(73t_M_p{$0nCbdBe%R$XR0l&hpS$Pdn2L<YP&z*&~!wyYZ5K~Gg3d7s<`i{}2 zM-7H=d17rv7*;w_)ONyP(7E(BL9f%>jC=J)eeOZ0Bh0?<sfxI;-!0f5c+_B=8`-Yc zpD0P-`v#A(NW_3rzVO-;F2|EzFMc+AL3M%dS50Q+QmCm|Ttsi==yV2Ck;{DAAnJ_Y zj+EvQtb@UxguF=nHKUCeXAKL#K@fF<A=hlf8F!$5zmvrS-{jKeIAazJxakE9TDpOb zSRZ2%fmlEV4HgObF)j@=*2)|`l_k=CmNt=>pX>CaGC|2x4{kBV13fGPU{M0@HfYeY z^zKx7q7<st4=V@D)`JCpn7Ndl1f(&eGlohl08!v_#})UY94(j7LlAue28C8Ibv_{8 zh0vimwmft|2gH^mcs+~Z!BT%@#rGH(Rue*4Xb_H!zayIB2~WtA5=@EiXXbzUylC*H zT<&B7<)wQjUl)y`gxj6qg@i|V9`RqcrbIEmr)~b#U|eemDnAi;2o6!cuCzr6T@g3F zlL`cr%!bnZD-s3N48DS$Y}q3RZ(q78d+Je0KvU3$Q_s})xtkEw*HZL=@<8Z<+>Mj% zrn+ZmHzA!u-Sjd<NT+TViji+z8I2((Baf2}$Pl!#Q8Rxx?2F!AQJZ9@q!u1UwgdAZ 
zNBi)qhLyBG<ct@h`Gqrv4M7XH?JD2SqGyGd+)0lSb?F}?QjdE|TYU!c&XHFD(o4DS zeCWZu+)0~3*t7bgVD!X2&C{ng3+K>=5?HqlnUo-S+vZ<Od39PtoWeGiNu^$$IQH<a zqz6};nsSKc@U_T+gIg!F^;6wxobe{^`D`=}+%^GM^U0p%#%O$o)>f9|$mQf1Qd{&0 zGF9=85!B3Z*}=7lwPXRDb)bk2H~>eTE@DJuMYNo<EV=;cW_c5fhgsg#YnIpGZu9s_ zo^7?@^`sj-h#J#t7{BcrKxi37V*tAb2%7_}cLjnw9Q_)qsiqjwUmdanSz}|$AeT(B z8^0|2z5IToreI&L1_%VQaUF4QeAq|Je@d&k6TL=Vru&0~Z$L^(cX>;~AiM)11t@!Z zj}!p5PPq6(!5E8I`J+EhxLv8ZQF!Am!5E5rgmd#R#9hl1s)|M-oLU}h>)fgNA9Vy! z-?Ob$B5!K`$8Et#0?(I!U-U42=~v#@9fvHZxL4KPY6Pq#Rum)Y4v=PP7fG>s1Eh#9 zb;jG+zV2EkMG!L<UXW`biyCH&YN+&D$f!<Pn&ON~;3%S{D?Rv54>84e_OwF0J!W|@ zE5$)3w88aw%9~|9tXK|<>lk5U4NMl1)rJADYC>|8lm~ktx0Vffr3cVi7AGOW98~Bw zj^qCl?3tMWZCVjefgS{Tkpo(H-GX#K#CR<i6Qw}^1>s%_`cguEQ~y&RoFnNbycHzf zw_m+ny8ruaNH+-<_J2)yym3hPwRa%hlnc^**&o;HLrZ2(B>Ao|b0pt)+Wdi(P_$CM zFp~e;<_{%>q7~m4iJVB_60nE;kZ(1%>|(?BqD3TFED07}bA(lyqFQ$zOd;|EwoVs& z*i=DCh^8pO4x-wDN@<YA*g+_1nq&DD2cgqCbPB8~^`haR97LojjZ%Q^P67u9nHUj8 zVc$2IK?j*Z5!#9Gk!XB`BvK>n#Wa=LcPyt^NTb?$Un2{;Q7PxoIDL0cmbW@E9Q0#t zUQg0>`i%6w^x^UEi<Usr$6tE>-qg^R-6xS!`fj}2@X#N#dXeBxeD9W{hfctlv6`v> z-shE967?e#bD|!ky$5IH0m@(lcb-cRksZs4u>~xm34tV-N6CGl)CnABy2#D2vR%HR zI1n0L+5LmyNT|C&AQh=InyzNxtLhdUR9!XQ{5k-Pq_t70TE&?6BF};~qm+wTGe*$J zQ_he4*21*HLb*O>E{Ia08d<$GNCN<JGwn}dG>mx!=He}KE^sZJ>>EM2TF^_R+;XG~ z$4iJrc=F3Ll`F^>`NfkZQCy{yTCunpA=@Sue-hcUkl1XAMs5Hpa3e^84O=4fZ>xu| zh`uCMvWMRmt%10IKY#4SOz=?fn$M$>n)C^Kl*9XzESjR(7Jc)6e*gaDVl1XlDAwnd z7vnV}79kSyBfTvMkrN7`7%K)s-~0K&2NtEs?l|P9IRg37o(}L7){^A{2_aB4Z2zK+ z5)t8L-I`k9Woa=5+ZP!PONI-w2zi0yS7bO@5skXEK_qz@sLvgYN}_uID^V>V@Pvpa zr&Pldf?`}`GztjTT#2KW3yAQCC=`D40~IcQoMYpKk9{h%<L_PM(ubh+hqDM`014AM zgyL3<_%*oDbQaVXHWk+cM=#k}B|qcHqj$868wD0-4stKRC(A)Jg>#wK=i+ck)qy7h z3QNi*qT{s13B)V2xmIX9Sl{GIAm%X#gvfHSP;alGbcrIGc%!;)9H~)a(E=<qfART{ zw=C(6?d9!Zum51A_fCs0Z1#O2JhEB0@5t`YxZ`ep_?g7{7ZmBfb7vftYqs<JE_8<N zF5Z|-iVwZ8Z<qey9>MqiU%ma)ovg0(t9P|`;ym)ys=I^k2%QK4&=|*erCL{FnAaAv zjM$}xhHj?M<tTt<>?6yFGJFKPvv6g45#2=So|Jdso`AO_Qcmw8&3IQEC5osrwF{g@ 
zz!G6pA?^@W*%{wcHE5z`g9ewUz(M!OMO9q`Wubi*j7;HQtlVy-+e9T5>Xrk*hrAt= z1fabG=(4if??tUeMKmngNtHt-TW?|%ddN35=V!Ku4^EHno!@VLDKm3sG8mqEc-OOk zC%(jLPscZWAGN1gKrKGqxAq+<?X;%j;+H0+Kl-Dy{CAaypZe#KqbJ5kzkJX5{MUa$ z4UVNE6fjF6s42BT)b>*Y6{beth3|1gFR9MnjLy<S96M@C*%2(t8Vn$?_7Slb;c{i7 zt${)uGotfM9-CKXIS~MWNt||)4fsiIC`=)g$U?NyH=@2&1M6lqV7P<jJxN?hSx<PR z22{}t4RID2)Db-)X8fX-v_6M`#5C!F?b}_{S|XiYU|K_ABu;m;9ZPN(av!RUvJyE? znSieCSXcn40<omoC|*z<n7h#CpR90BMYcVr4-G%CeY(B)32jb19ts!odxxF}oth&) zJLcXJJM#X-#4&Hjp-Eyt`~B6ap<Labt&YgrW;((aohYQ{pV%U3te~Zhn>0dY=fA!> zydBF3zBsy_H$@z5zarbMs_E8l0gkc+TW5D>H%f%mS|1^Ih^aMW3b?9ps_4(LEwdgr zI!P007?~vBgGm5F>_qU=f+Naka$Qx7o2=R*IdCfjxRq(A2RD>;+-NX>w-qx~O&CPw zv8d*w&`bPd$BewqXb%2R?C38z`B7LV?X;peO7sPR?-z$JegD_unnY&fy2VSd`!>`& z<J3*_LY*@)^%QF96e#m340}N+!l-NYpm}F}NJT)u7?F!Kg(W23VVpx@6q|Sb==p2$ z1_b_g910D;c%pL!b#^?b=?bg~FMt{+@R^o|f&|1clEE@p>a8V`YRQgRhbHw3uyFys zeq_{$fq<8b;T%1F1_i@5%Q~DR7WA?hr?T3&_=^9i(>#Y}P+~N}Pf}B8IH|nyQRgW> zfHHQ#-_qjG{yYD|uRBzp3oBbi1L|+B9C{gS<5jlk9>!@(93y#?_-IC3Tos0sz#?WW zi{&Lu1Jqpr42%VV0xC&q2aA9lJSAwX3SY;n@O4zjKncDOCx~UZY$U)z(M=QW`Fvb! z%=a{_Ni$X9(}fxpiOCmvike~-8d$y@f4i2j%(TW}p>wqa-~u3FsnV8HBs{IWaNpPX zUcT?Y-M`gg+7TZ0SA6H5KM2HvpbDkVt~C&=edwV-_ofrQXFN~xngavx+@r*fzw_l! 
z2Hb{T-?rV)?C1C0qddB@t4lWT{(+-Q6}F?~V;EMI>12|-BRhFZUIgGdwxOGfKeQfL za>isUsbmX-IAcF|+HWHFK}DXli4}lMMK~wFnbremc?TJ$*H8WyIT1KjIW4?JQYyD% zQ42~rfDxt3sRu+l3e9xW3>ulTgl4Hq^N}c4q0ngQi0a<;7aMFP_kAR!RXZ()i@)#Z z&4xnXrKXREZ{-;#Kg42_&t_wjDx8c4Cqrap6p*-z>ajyiHkTml#Kd35T0!q=ulYhq z0-Ql5JOf5nDp^ia33s4|B(5sxB~9M|JxE*DKy1aKwre4hX(N0?E<QnDN2lr_%sdjq z9N+=rXYEv&g{RE|2N9{{m@&jh3mMcHc<?*V#-0f2T`||KHyyZ5H$50S;x&bIW`0k1 zc)!n6$LnXqo`kdF=|4R4WBz;3{fAGOIilaS?}dkulN!tH!H`wCZ^sZ{=@{MlQ-rRl z6>j;}yTW;#MUwkE$EI)8fo%#+Xm>d6UrtvN=@CaeAla1xmp-<D^}~R1)b&^QebIhb z#!3m0d}+vl@g|qrA;I$TQWXtQO3TMPOR)%)T_rO>7BY_uL#F}Qug{t`HG;>T?AhZv z`L@a+fdJ%ELXj8ahC$)LT?vdWp2B=J(aJtLEOeDo1!iprv=<6>MY>8qZM~(i1-0Pp z_GdcF>Me#kgEth~+I{obo%^>vx+lD4m&s_THyFa5;o`)##m>&UhM)BA{A6V7xXB5h z%R3GFd+Q>DeZ8Zx*xp^651Z_Q^NZL6*f#1lu4?r(dc0QKb&n6+lqe|{bvKa?PGct+ z@nci~J&oD9g(?P>qpC1Ylc`nrva_NF>KVKgVr2{@VIXTw^Q@9VRV8qV(P>d}4|rZw zCo&#&F}Wu!qX8rrVg!Rq3_R*fRb-<$*<r>bk<^b_rpuwFEVokofEot$Qm+(NC`bHO zUVG2TOCp%%{rC?|)7=9@Phu&1stZfmH8b;wuXI|$q&EIc0z;Ls)JQN9<cFWgZf2)D zrQr-X1E2UyoIwrZ|9e)R!opuJ)r6&REK`-kkt+<s6-L5RB@$S8IV7oMpIK)5Aty3d z%NN2Y*25W6&R>lIi#cpeb<vgKfexqM8jWoQX)Z2%HkyxfHdW*eftXtOmp>c`FP%@^ z>x*U2M>*QEwxxMJU)e&wkR1{N)8&Wc53Bj3dPtX$8jm(Nv;U@Z5`OiHg4IyCD_%rU zpWxDdx*v$33ej<MlkNdGCW&E+p3-1Cg9@6k1BL}s*aHnbAR_&Rt|Rs%U>GY?Wnkdx zsxS+wIA9Lal_<^N1W@Q$vHc1}_L|1(Y_!fmO9}gmYjNPf@Xfw~pe^nL{L5+93qoZR zSn=*291@@FS8qo9>x#I&gO3IrRUwb3f7R^hL9Y;}j9LAQMSg18y%0FyytGss&&59B zf8x>|l=}4GZ$zGKrML+wVM2YC(Q<l@hWXNzB+{;0>ccjGtc^|uLkF3-h6b;RqK&1} z*eQ&HZYfR#%P!G7&N3w}z>5eRlW%6Z52f3UNViv!a*5>bR<U$@RTsXK&9$2m7Ocgc zDy0%GR19Xk23v!%W=HD41fU4BRfmaS<lZz&*xW~nR2;AJ)e!~2_Mn1`MGeZI%!_kS zu<Wo|2^B1sT|7pmU1AzX)oG#Bu@p}rH5K>#<<aE8)UNqk60V46dR6yuz?;g~q3Ry= zjXBM_@lpQRzvvL{Irs8ApLis2XW(V5y@~nCniev4;S0cjJ$>icNPmPO#}4wPgdEcq zsK7BJILAGyzxcUYoyw|rS+bVwS$DQJHNs7D-&5;%RGFesBEMvUDpRbT)UURa65VU5 zvC(RCHz1$i5R+RWI3Buv9Sd{Ox{y-HyHqR*VUL!uBovobutq5o14^|v;>(yd+Q>>% z8)*fZWg|(=l80|@G|THDB8XF-!oL=jYBRZIScFYM@J8B6h2>+?Y6yu~_fP?l>nPO< 
zl4y!z6@Wx~{Y!_eqL{8~Fte($AR!gQv^>2v@k_GSJh}(%&?g>w@g*hzWx^jVN73l; zh$;{=t>WDy=S53-q=r9J9mN79eE1Sdd=Vc%@E0WIn}io%_{pLBsKBHOf-h74<!aWK za2HuMYXE7yE)aa-h0C6a0fl>7`F^}+i0lo|YOx)Ntp|v=iTI}_tGWi-<X|<%6lohm z>=q`pjLy@Lf|6lWku4Zo9hRCeU}v1{!=VeSAbB5SV)3yM+gAqxHTy_F&8Rw3I6(L( z17b@9^ZQiLN~0}`FSv>k0#FIw*9@6-Cnc99qbmya(#|;ay&GY*ay1kQkgz;OtSCdx zrmNK&sC28kyG2luy7F&DAnxLYXO*Xi&#yVu4{J_Bcv-Ym$Eq9F#~<t^|0k0BBc=KB zm(%V?V;v{mX@ug+@Zhn>e?+18a<d)?@#2Bu?j9v`6}c=#<wwSXyU!ykCmjODT=O=I z%Qq5|cq5%W7WJV^C3k_XI$e)*r-fmJBxXG;8figkp@~(?nozmT1lDY<M?i|*tc(`{ z)@}&ufw`6;*1;O04hAYnMMHGH*(Mgz9&oZ{BhI*!cAwAL+B*hFKLY4OwIa@HvdDD+ z6G){r?2NNU+_@CBBcfW^ow&Rv`y1nmg;7NG5#7J~?FOq|Ird*d87~j>KREt7z#KD~ zh$om#uBX}OwIq{96?PDA?I*i7P1*oj>q(e8TbUn^cxiK3+IWHbNd)x9jwT#-58dmH z3_iDU3Q`!3QiG`zHp~&Ff!uUmCw;9BcyI~7wUec#oj_>Lb#~Hh^e3PPJ&=)Xpv0il z@=cAgKv&T`Fi4U@`rZVUm|Ty~OH_=_%JRojTkv;;)faqfy`=6Ps+auZe-)&^ypE#e zFM#oWCqo<Hry!h?_!ee1Hgk6ZsYOU7?9n5nQi6mAzeCII+eaoLw1r`fPlP4y1vZ8O z4i?&SiK#=Hsn~+^&%;9Ej3(9E)lG+9f?9I@^)Npmd7l$T)^A{DsS&f!@@hPmotRVq zc#{-rof&GyeEx5ahtl}29Gp<@{B1B&;wd&fQC;aY<g3FIIT)U(s`}6W_^q*1JlC!K zQXE8lH%#2$)ud5SpFv21IwBHyzo;ofbo`9Vr55Au=~j{#uqZTDDw?{Pl3|l&3}oL> zomFgnX;mk_7-JDzEC?-HwPJYGYnF#0LL0tr8zxP&Vb+x9Y~(7yg{1bRJWOpr9SUWf zJXnvHcBBL}HEM>7p-7E-PM7u3%LSsyD8jN>OoVD^9av?7I*bfdn;J9|j+wRIvrA%B z1~!e|9Sa(H@x|A)`SF-rbXZ-q6h?R6@}~o~KI_1P`k7lqbE-Kv&y|Q7g;y?$#$du( z8s;rwr|?{gM(y+r@j&ns03-gVtIRA2TEU2FECx(<g22XfbMs?D%d&R50Re!kZT^jz z-xnvJ!*jcS^>$7NI5wuZA7(MZux{7l{EMUMUL+b)FeZ=UE!usf1O?|tX?t&VG*gh5 z>V?-YP)Qp#FnVaTcU_P-A}>SKHH7GTm6~Nh4itN_lb>QIkB(t_ii!oDQ`AZXJ#r1c zUSZ&Bk>XXD<x+f|GB#R=I0TzF*P9xp&bhP-JqHL#woO4k3DgAS3aY;4nX#3EA%=>B z7)FBt3$D$6fmW*Lpw%LcTIWjZyEfvtFj+!U6tdZAP$0)!sbW-`tZ{P|8_JSiKjpui zMO_o@_Y#m4%b3k7l9cs^)B%0LqLQV?H3ws@39|?iBR#0Zckeqo`=|ZsL+-GtBR<W4 z{<e#JxNLCu6MK&xPa2cQkL`V8_dppRy13`)_3<`i*nKeF|CQOJd-;QrfTdb1@>*@R znV;M_{zre#7oOq`{D`;E*z62;92niJd`kJ@)^9!e`SDGYlbgmr|Kt-l@^=2%-q8ac zL1!Yr(5pzwFHVDi^^e9OV{jR0U7DldS9*<0H<6A-Vwnx9;2>tETeY#w8ubYzy@mi? 
z+?sC4rYf2d^-Zmibao3g;lZ)gIe9@IhI<<}vu=iNHKO?rb#Y>cS>A|cu@u7t@#W8B z3!f?|SEX>Ca}(Ot&7fe60f0D{`i3Fnrq(rNK>_h%GL{NKs03l6gh&-Jr}go?{9<FR zN)I4nxJ~_3Z<%w3lC+X>xmX#>3t~l2sgSK=Lmd`PC<v7*jS#98<sL}W?}h0Fe-MC= z*~%H}kdP^Ul2|=sQNOTNjm8IGMc5bdKh(P^XuIe$``ZHJV-KGjd1Uuf-Ma3lPXE>a ze0Xd;Q10^Ub-vx(Y{BW?hj>053R$w-Ncmv6<Aw2~2fO)uoF+%1%OCW_M}Nir^{$s5 zIC+u}e&c!WSEF%H(C>2S^;YLAneKx}$6uhBkT?=Y86W>z?guJ!JoWg{ib^+99iTxK zS}K~!>ndzS=$D7IH&{t83E@N_84eN~4olUn90%61xf)+J>Tc@n4F%xHgfSvQpfcHj zDUx(IYgKh4>PS~GWLX9Eu;3yAE)$@r*-9S9MkqL^X$Bn?_*AH~ybjN3Ku0ct^ke}D zvvgqs$!}8D4#LpVkV6;HWs(Kx(LBgqtMf7Liv|K<Xlj_PjbgPHD|ULYAP-<n2#)x$ zh{<2Yy+A;Ob+#AUk!eAfS-pyM8LuJqr6tcui^HL~(bMMc8dJV>>e=CwUwU8p!IlRa zl}n-})mH!Qu`YMJCqEwIT{8(-V5n}cXaVnk>;9QP8Sy+TiQA(3hfW&y_bP9>Qkg$^ z;BS#~`OU#5&4J$ihLaEJBYbqy>_^{Az|&BpGCco!ysQmWTK7Ck@<VS?)FGD&qjLHU zF#baOXjcNws+)ikCb#Bi%w`zB$#5~96PgAt4OXr|*}ddK#;(>!?Pfv-Qx~iea^k8X zCu<{_P%CjJfTOVi=T5^uL6SwS3Qr<>4Lk7hL{4PI$}{32VWf;8k_cMnbl$22G)Jy& zz*lM@m_pQA=j<i5S%wxh0mi{o3cTXraq0^&;hZ85lT~n8f-4l>#qZ{ydNgo2Na<3< z-??{#D^V!$1L^X|qREHdpsKt{u(vul?E3_N?X$<bho)u(;im^FpY@wVVc`qFpg(xz zOK4n-+<kZ)(BLl}{A2yWk@;7jRW8PAfrcU9Mf^~A!*@*}e@TzxWLXelpch0gtcj)v zN6F+24&x?QLt7}YU~F=%$5tNAbhnIF<G{Na50`WeyWyy0XA*tvIRwQRxvr(furjs) zp~<#<^-ochn2m@-A%@RZVRG6_5{Xf-J;r>Xybd1#p0UoHS)jcL%aiy44n<hTB!_|X zIgQR)PNpwreOxSaX~(ajH$#?8b$yGv`nS%Su4!9KvRkJPUJnw!1O1hJ3&4YNACyd5 zm6Tgh-BitD!U9GNGNCmz0$Oj8im2y@`Z(%)AB%s=!^psMFjR;2b&PEWu_<bJtfxvv z@=~<;6HQTC>r@H!zL!7p;oMXif$!&q<GdAYVC~zKQ{9L5uN6igt>}I^6|SkV@+T(; zo^08<#r)nYFUP}oD=$8D|9cNi<bLzAJp)&;Rn3Ca>Za?2BWLyec;}%XJt%}C!xnRi zf1tY_`|r-3I5M{DzfYg}$IsDSNdi~lK>uQreBg5Yrx;C7(2W_!)mWZXhoG92Y*U+O zRmTMA+Gu0PDqN1%LH3S|Wm<En&qLWZMzdM*3JJFW9Yn${u!LLi;wqqSStQWKC_-z5 zy4r#|a90~;*?BU`;v-Ig@NzNO1If#i=nHZ3D&RlUBkMPk;0G<z5S*hG{He6T7Ur}R zl`6KV25W8cDvWRyJ`j8n_L#bHyJ1Of*(j-6)K`rpiioUUg67B-O(nJ9e??PCXnJ-U z6F>_n3^sMD%c3xz)qm&7*tb~6X?Ev<J=u;^z0SO+iyBU$Y{SX(7x$d}5r0y-G{rA) z$<&91WGG~26TuDU$)9}dtzXlO`X}f;)rT+X9)aHH00p;I#cRmsF0|l$Iq%FF?QJMt 
zMAHfAJ1+t3=DZ;55%lXB)2$xkL}{m<gyAIaJOZuhF+efLYvWDOI^fd?g3-WM&hT0{ z4L9OY^LVt3+IcoFnw^8rV$2Be?AIV#xV-2@JGG<OO>)_D1RH)0IK#!zFW%pBm(_mf zfEr(9OMlxoD~8PgTs#r^mS`zSbvlJDo(i`|7*F*&thxtB;Iw?cO1J6K<{iRm{yR!b z<o6@-yOgW+U5<=O9ZGXyK^P1NOAQ%ABqolv@$Da+oylDU$Dn{H>I!XUCx5<UZj5ZJ zz}XRZL=ZFPa4p=Rx(8j>7vqd$xa_#zMAyMfsRd4UxfKW50zhS2@BmGPbBqz}z={zp zGH^N}j~oq7l1583)j8lKrDLFN@k{3k#0pLg4j<oIddh|`;M4i@M}v`l4;R`3M;<W> z7H5&gC>-@45sft)%J}1Mv=;b3PYs%Mvu)=^lXKGZo&EfK!~14e^BnfgyB;{ZQ+cz{ zU6`-c3I;~~5Hkqcjt1eYvDp0R=G<iuO));S^RWlia{#|`?tXUH_Ho};r_R&n0M<1! zCXTa4yRV_;hFj>u4U>aegVWg<O>d&mtm)dWO$f-Sv+jm<xIHYbx*#Xu4<@jiDnVCp z6%F<eMWuuq4uAMH-H+2v&{Rvu2oBU~IcpPI+eXJ2wF5Sj@>)9aCIH7D&(k`pHqVm( zxE8H<;nQR~)>tJzDGghs>p#|A&bjg~+o<xL9g0yoNteSjF)Ty-Pd!UFe)%B#!j0D{ zA9~~YPh_vW*?}wn-aP-|u0#{$2NpD<1S5z{C8~Vh-%2HTNjb4%%x(V_zUGkdQEJ05 z%sEGy&TIkyCu1c@>x(c~5TnV@bkzurDY0~m>KIZF5-oBvo>g^ga8JOe#!EMZrA8Go z0BC!bgPU4T==nBQ(v@kXqIoDPRWca5yaln~R<dF6{BvADt=n#~$if=B5e+nCYax`Q z$1-OYhBwi@SY=I%d8}-O+6Je?hoH||d%14vu29Vm9+v>FWl^?S1o}esO(jOWMmT8u zP#qS1OFC@LD=%B7!pG*HyS5rlkbBp^!urb-)IPm;;GacX^#m_$`6st8RPm7tURrI< zS*hPYy4=@HF1Z?h_pjG|^rv&iwE^(&Y~*%lng7G9z({3dNmnMPwy7JLCm@!j`Wq1% zlyU{wERI!ngy)jI8>kkHWO@Asxf3p^b3IA0vmY*KJuCLEN4JWq?{X&!x6%ekjHB`S z5<m0gNO7=X3#YvN#1po5KE?CZ7+osPA^5r7MqgQMYLHMcPRT|;FZ<EBCBb`yUyRIz z9&t2u5wIvT_%HA>Q=v0$Up=x)Y4Ht+ZH{82vUAhnKW%>`FztY|U#Dm-q3)W(GvQq$ zx7^~z4?Q&UGtpA9`zXKv(yPMp-f87eA^~$Y&7I_LICP8P4457BZ{2c8`4ePDi0Bd0 z`(ykK)4lT_EIybNwB!8GRU3<Q)@{+uuyY0$!bIU1J7?f%un{DqTR3Od?N-m(<X${y zYz+BLPuYr7w&RrLjaY?)gNI|Pq`|+-P7;j`feyO~r=3GcOL(ixzS2Z9g9Q1%?782P zuz7^L2ex+frw(|{;_+4gjH4dx=xrOmbD~rEuDRHyadmr?)XWzrM|%4lX7S#U*F>|o z!|Pc)IMqMC+hi6GjiBgIu$w>lgL#ENIWWCr>GXC?jVJj}*sTHOE4S`jdL9n$*xaxD z+tT;=f?+;I{+#^SJHQYK-0j?VxbzIpIGvl?0Na$AWv6^rXU1)w{3P9Hw(P7lMx)HK z-GkDO@AjEen5e;NHOP{=Fx~9^VF|o)gd7I2*v2R&Y{XAr!YCy+;a;Qplpfkdw|)o% zPw7pp?X>9{-1<B4sfI3l9+{_2md;#3pIctx4~$<&Mrw$_P@FtF1M#5xbGt>h&~#ue zVq^e58zQ;9=VS2bdSV}2QI;!^-$GS{tnw}YAOY%Fk8YN3{-vjXjgPuGpd^_x+4Gf~ 
zrVgKKTT~{|9O@1foeAGOGBbl)@zBV>i)PPmUYr`a1zH*I4w|8qZ;ECIlrq*jUm?89 z|NN5UBYe}@kE?N4DC^B28ex7*zX$br0bP=BtMT+)fp7BGDtnA@CQ!k%hT@8ipV+mJ zF2^9%ERyQwXavDgrq>ln;&Nzqs4)>bTfFqyMFcv$To!>&D^S`N5$MEnpI``dx(YYp zqTF-`U6h-4<D%Sg6TQ6S)3_)H@aY}k1b$rbwe}C}q04cTg<TGUqGKO{EFkW|1L&~L zV?$x+YI+R0F%@N|ba`I1+<UP)Fx{EU^nJPgU9RuTpT{lQJKvsH#!3dZcJ`${%QPN* zJdVMRp7o9M&rJXtB;K=Lg-$WocK5(Pi8lX)rl>>t9vOo&W(<y#F(@I=x1L#pD$j7% z8t|$)Fzx2w<5SAsmDb?)I_2M|7HlobCxerVg~!SFKc1aHjy!{YnW&q@rZF&_#fZoP zBd6(FYOZj#guCboXNFitobDk_=6c}Dq@u7)j8?iR==X@x9Hm}fLd66i04|!TVTGv7 zgj&SHpq&i`M3v=0Dq#IjTHz$EL(xhuL&=|NBNNo_N5l#SD_U?P*{q#rO-`;@;;yI; z#Zs+3tA}WD2-QL`Ia$C;iDjMmnFg4Q9)P-JFTn+oM!N}WgQP^vEulcT2+SEYO$m6V z7HtF~|3u26NyOX?B#sg=O2XAS=;6@t0$8x9@+jtrIfb-mG=X9V%Ek2vXqysjs{$^Y z+67SIg&@b&<je%Rzv+Kd_s7PEhvI|&aMWec7dneSx#iF9IQ*Go8Rgv<mA~Zw<?e~@ zNZSh!b>Ft9@7xo;JGV6-+1I<f((T*YedgGgKiPjvce12NbBf=l4+*@+B>u&|v5+U` zaq`uV>N=0zID0f~@ErSN>CHgLsU!R-(z%X_!p7}``;Ugr;h}xQ`ws2!x?_<?5A8k} zO78COq58+DvQzklc#>O7eI6}^oocyX<!opSvnlU7DkySwVMw(-EcZkB?fvwmokGy! 
zFmY}Y0dg_1F>%FK0-2$(+{6T}Rvgo{#4^a$M)3np=FD1E>>zbQB~lttF(Dv!Wkuq( zs9F%9vlbm13PV%YPNS$dINTLA+GKkVTQOf|N#_@tSwh)P!K@FxIH^A?*fk*eC}UMI zEklNaBLm}n`IfXU7lRcLFU&RM>%W?OsiX|~YUm$i?<&nb<X=I;Rl7QnE$bC9#z)O) zq0oCM@ICGA`S<k(`+~us+o&&e6i@E{+O~dIvOko5@#2>QhXU^4N#*q2|Lx7eiAaB8 z_hj_t*@5Td!sy+H^e*B3Lh-2s!$DWfYv+TH8gAOzeX~t3mV~ZJ?z%sgnBC5|dYvQx zymhZPd2DQaqBjtJbaX`dCB;Z!%tt;xhCaY3_rJKb6W*I<qF|XNTQuE4*(Nq4(cAuL z#@poVKyWk|%XnEK8Q|kgB*dO+W0^<}>#|W(08EiJ7cz~kZIiBHvacc8qnM2SG7{F6 zno-MIjgluzJ<LGlQqodshKM@}<_mK!nHgDO#-RQd1CbSS1`Q1N7&N((FgnhP4oE{D zp^9e-wg7Q*2$G*O)HG(RD={b*Hmnf>HwRssg|IbS?f1vgbHn-p;vK|C7PA&O@ddM; zE%_H&+C|7pe5W<+c9ezfs}Fo(Jk}nE%}uWd$$=%ln;*O9j($hTUE&OMK85|XZR|tz z>9qK8?QO}*oKbPWVbbd@ku8Jwyu82Da8o$AU{NP;Hl%hxb8u>Z(x%fH9m?M*JG(U3 z(dTO2fAZ2P^S0Z*Fy=jcf@IaNY!hGB9_EI)H?sY<%2<{ln~MIq;18#P1c)o$OU|@7 zmg!Wbf?4EHm_$VV0Ut!PHY}wsWCqpF+MrolMaBJg^=*4Nvufc3+LHluD|;cZd{k%R zBqDDBt~Aa{(s451IZ@h+2iRPuLX}npVdglthS_443h8$Pjh#eElUn6*Dmdf<6-u0_ zFtmkMsTRWPq#i+GMT&qY>R9Z8FcNkoC(su3(3nV}SH;+wSF|EF#z){AA(2AM{M8AP zDPVT^s?hC^qTxx<Q9(HNN3Pi^zjMT`j86EhSs+%VX8QhpJ)Th~ADG(BmmUv@&!cn* zrzMKNJY)Hr9|q1;ALSiJo%uTV*@V*N`|*++429m)r)HkmpE&EDw7|%G-v710+y9Gn zi_IuOw|Y4XCTk<yelBgN41{_GZc6OH=#uf}*^zK$d~E$#MqfkkBuEwPPG0hN7}6{S z*^k=6M!g?=KM_d86wO%KGd7eN0enQZqXomsH9ov7MUhw5qX(AOnlOw*5e!w^SzQ*V zl&BYYU~xq+t%86d!%7I0%;0d~c5;f*gtl6QhX?Q8JJ|e{_FZo#Vs|u{kG>#84vZ^5 z|C;hs<(@wnpFa+;=%>Gzy1RSJYXiwc4R*fhZ00Rd8MymrU#hL&?5lkAKkt}7{VOiB z{|LXG_wxnMeEM_BeabhQ!qblOUnsA`I~v*fyseo;U84X<?F3DXQ2ku4#`(}Jglxn4 z#N7E9e2_|G*`X$>j4l((=fvI{!9{Ln%x4~4&r(`XAO~<d<|tKW8M{>gF!i~7jjarq z3Y9F3A<c1~09);V{x49PWmarFqG=?wY8NI81?*^q&lNFoL&zCzIa0Kyc&oNLP7@{G z%Axa5ZGUJy5Sn;s+mq*noduu0sj98cTa?$fIlR{&Q9h}yOl@r+`7}Q>6u#S|<4?ZA zn@8_{V0`pX{&0fN``M$J@_7G-YeLojrjGr)`;MO#bM!g*j|MJ9waFa%>XW!P5!9j6 z#gtK1kR=8@nJh6%HXL0a6VAy*vmcxZU<Y<pj2Ebf3O!33DQn`4xv-ZQ+vVgkK(Xb* zdUh9<TGFf|4c>o2qssvo4<;&AcD!uK1@4Qt+pJAux?@Mu?+f~quHfX*!NXRc*P`t5 zY|l4~Cr1=?HMi(_F7@op9opXO+XACOuesS22)^{Haz+^r^9|x1)BJ)(Ytnn>{TU&` 
zcl@pTw_33mK8QQefc%klG1C?bA`lV4RI#lRtR&3FQaLg?90%*e!1`e}ZJrL(d^r~e z{*#dSwA8$<`9~A#_Hwsbs(3~gvD94i3^%6{-4$%7#NsQqhMXno$jaylDCbt2zH|mT zUUK0!GC#{2`8YMrY-r5=SfR^fRo*GA4;B~lIlKDK>uugtkyNrbvTsM*L?Gac*)@p+ z+vdLq4hZ4dL{LRLqx>XzjA5mFjKA-Z!xL?R)_}+AEhvRPFuCi_Kp)Dv5t2PN+5|82 z!m8;24`Vt+Y|2~+xISK5AC`g_GL@?AgXC4YAl9JBqa^!O9I}rgp1~hV7zGp?2!C7& z*bzqW*n0|4@mBjAUk(0nFyJ=duIH_KL(Vx9Z)_~}TV}MSMoXqZ@O;c`qZ@HW9Iu;l zjV9frCNy;%Zy1b@g8y#(;5WKW8jjs1Z9mS>gLvj{E?q<CN6M0p(WMHjm#VS_)zUt) zFl@M}k4~!+6*CsNprkf{oN+zzTXXBt;H7wD<)!K}OWn`N)wonXTq<96SGRhp7G9~l z?Kxl(()wa>6Oeg{qMXK}e4H{oln?JI2OzUc=&!~mwJ`7-TtUTWJSi0E^tvCKf_&1O z|9b*278vxp7r|f7fBxfLPi?YTc~j6CExLa;+g)|}F6C9QIEi^i-TTHOXL$bKKK|{c ze4k;ybPogb^?UrazZvRF)HBrAA{I!R31>F5*d@JrhD5P>2Lg>-!<kL$!&|BN!`-@{ zq;~K!e3-N@x&#rE_K?H)teP);Hk{eRUZwYZmVSQEXYfmZLl_wrjsQt^1A?8K)n9eP z&Gf5o*oI$q^Ag1DE(AI&AZDMnq&5206tLR%zzZdsH=z!)5@&XU<#88)W}9c2vl17e zX7VWdU;h_CW;Swml%n8*E}A>U1S-HuO-O$yK(xIbFS+d%eSYi_8)=LGWh@-@_#Ms= z3X;D|fSNYG-&yF4`OMzdP-<$w-(ihg9OVIDYB$|ghFwG6<K_18H*)DoSPE%lEVEbj zu`S^LA+y1WFdIecl`CStwg|}Sv>uPsn_)qu;aFxy{gJL{dS)}-gc-UC1E^5cO>RbL z2ZnA6x?_mgH02x~p!f_pBJ102x**?*$mLeEyaU&zb{C3XH!vGMS8~G+K&ckCIfgSO z>Rl`G(EmaS4dvZnmiGY;hhZV98MIJ<!}aK?Mw0F!W+=6U(!;hrX3HWAo1PF8P=%xe z|BLGpB~c4-2SrJfn`ido;sJiFf;(=o&J}U~ZS)8Yoogl#ulHEf8Y}9hv~%ce7`)(J z!vuUp*tr)jhJB^ZGS{(Qz1l@!cfgn_@!%AY@A(+l3zV(!zl$?3zZ<08@P}Nt|2hIs zd~xAwr9!DqxLW48O}$!|rhjW}-fw}Oz1)gkiJe_x8-FACtZMgudj#$Ob+h(6F<HM` z(?q%a|HIDy{}emRUpl8b^DE-r@Pc<lvW97Y8!S3ZBN<)|!6u8sRF|QLOJ0MQs=|`z z!c|N;xQDUd?MgE(>j1HoIi&CypZz%IbN*<@-<R7x{eBlbYx|;St^1UCYSCcTU+D|M ziK^WSUuXpD1vt`Jm4bND+|hKH63?LO;hLfu6K4nO1*M*uY9GC%#bZd=r52>(;K1-+ zKzd4x=;eHDl1cNdM#Q%=l0vXZf}R9N)aj*Z;ZPJWXebH9ceFr|*nwz2W-gDv*&d~0 z293H(h2%Pb5@B#_OrvTRkh^S??Gkbq<c?P$cER~xB#d+%-1G35_xzLR9(iH(k*_G1 zl#AO>#g!k4madKlY@<B5CxznBQyckW(VW~XB!$cp(w4tbd2=N?_h4TOx}a)#Zlbjh zNvuFMRB(nI#~#ZD?tT?ZO&cW0?E_BTKx_R7)+8BXlyQJ|C1Vl`y$A(Ntd1(zHJB@& z!O{YD%q+F7xPaCtEs~IP^Kp*a3OY6|ZWby5^s)kAcl<ZOr%@5&aBLQQ7wNC8tzi8M 
za=L{$LJ_0|Kt!XSMxGToizsjL6aoFbe9{+{G7ObqV3C-7{FQOq#+uM<Kk%s+p8vVh z%|E7$KE<DY^NGZtCEgV+v3TIsSKXEUMzLd}tx*}9Q7%sL+Uw?D=~rI)>47h@t~@$L z$NPEzXcBuV#R8)pXGA-FP1*!F?Ar;i%=WCNa2OGAgW!%yMQoS6WW}NlO4T&M<{aw% z5q6~`EVW;txeXK|)AeknLp`mlY-eLU?SR`!$*5GX@;|t7jWGNs*SFI(qN+Zz(^Ll< zDUNZ27z03XP7bE<DZK>uL%KSES|o@Rfp<dl26g^9gT>~i?_)4gy?(fTDEHebI0k$X z$UK!)8euZ48HqXe96xG9#DBhw{TD?`{*Z6)P~ud_*S9`+Ae~*M9D3P)_?c<Hp|~p2 zSNql0Z?50Gt#kO0!Ek<3U8pS1D3u25x5Ixm_|5cP&t|tPoBDQdZ>s3(>u!q|emn8^ z#eAW1dS7ahLk^3Ush+aLA720#?f=20D-qYyDW=2JcFSVyXl)5qScpHq1FvNWf(W(I zagnQVk2o{#ToqgERE2@3bW;`GxhC4&&=8fX)PuI;ph3VbR}n_kit;-hbfK%Fjt&kq zKm-W~X$Do?%KdJ3{2Gf~L6m4%aAvXEi7Ej*aHSTe;44me8LK=>_jI<^ABS50+Sb48 z_?zzK^zzzVgZ4n`$i{yd|8SD|ntzYHyXmPVqIs=eFWwjba_Hmn%`-k}od;|izt5$K zRW0LEt)&I#4;PzynCSpnQPg;;G9O?w$^llg1NTP71-StAh61MH1t7giSI|OMRk!_7 ziqt_13i^%(W>8e9WPJg&nm9>l#U<mBlz9|>0kwy%q+NoEn1#@{o=Q-6LAx2&AZscK zV5i2|QE+w#PV%J#(}w1_b@gWD-NBiJNql-hQE130(A91dC-()ke&NJsrA<E^@A=?^ zDYYu@Gf#=ygb!~7@U*Kq=t~-hI1cW5aN01=XEJq|0wUgh%6UMOq`pRnf|`A~%#h`h zO|udsf2S1?55{f}Y8y7wMfp~z4K^;#v6(Kmz!EcE1vn3~p)4h0%$o%a@ByZN_{jwR zKGZS&arGFy%<L`peio3+Z!l^f{%0dox|WcATnWTtXzDyP_KwyPiv=G9x2#_<G#{N( z4s_2S4%aV&irL?4#ys^az`TUH`?7y4Fm^PN%j&Wi5nygp(lAbfOtrt99Hm)xlvX&( z2I6W5*p+-9G58cR{VGJ?6ND?+MU$>7q(8l?8VZL|ryRrv*%=tr{^i0dNE&SYSV@fO z3$0bs;3DMvaS^oWcH#nfz}0WpfrAEOF0#amXw%<<RRUc(?P^kOJ!ory+~Q9^yw0w7 zR@e(0S2uq!c=6)m!goG*w&V0^hDXr`1mW;gM$JHS%-yf#jVO<;#E$TzBm8};sZ#Gl z5NFf|>_?0{mPNRsoR3{TjJRHfE#hKGH8I&u_%=rYc-kaE-KI!AL~P}=vu%pRm;t*l zvOP1RfYY%v#tb_SYOd36#u<S(qYh^jR4)WA=&M8Ifc5UMH6X&J0h9*dvE(3T7H~f5 ziGVt55rRXUcI;4K-><swls|gr%!*y6Pd#-tZYciJi1M>#Oj)$hG{hT6&R?<jRJ}*a zthg2TXn^}yF6|>rO<ZGO<BXBv=)^suE#y*DOvd(8NfRbDA``*_%}xq5(^jhUcUm#{ znCU6?VL5rGCp+W=x&*8Pq2-*|?&f5)l}<3JOHQ+tM<p>MJ1xl=&g7}DFhAfl%SAXn zj3v)Cb44gXw5T7ogfq?R;nHB15d&3{7gHfLWu>u}d&rO%S%|@`*(xP)+E&W^b1+GO zAxgQ3L(L0g>YyrNwNM}J7N!OZ)`58~F?Go&HDa%FPoc`VquP}>I$Ev~xBl|!!JUa* zmsZDzBb|eX4xZlh$afE{xVh}!^7_lJ_x5kS^u;Ytzok4^>a?1=`NT-7_qCrs_{yik 
zO6$rSy@=FZe(`rXR`2@<Cc6&oX*q@a-9yNxR+1g$CB>s@Vi*P|7aoA;F<r-M@pXQP z>S_ynGYyW`T*NtSfeYInhVHB!fe$XJfe(%_6G)_*X^ydNG7}gv69bcEVJj9h*;T&8 zc^^Rsl@=b?gL;%A8nczH^lc@2x%#Bs(20M$u?!j;e`PcHT2y0swF~d!%SE;W8nr;o z$zfp8V)mg9c)8MCb?NWW=kC90)j8?-V-FhjMME7gC`VSP&ehfc|BF!EvGwb>KCirY zcxZAklxXI6@Q>Mka+xZTpD`-0>)zGI!E^g_E`1lyTDpbt8IoRX*!gJ!Lf^%7;;u=Y znuGMGEtWCoa2`bCz&<Z?#T0v*iIk5sQw-T1%~b54bifSV7^AhXoACoputbN#((M;O z4I^)ZE<wQCZYP!7whKQ366xF7me<>7@eba4o0)({nQK*L`5a`j?RGLepMo?dE$Ptc zEhLv~RZOntE_^gTjR*TJb2;m-KR|j^VV%<(hG>Z*771sJ&axE4tN4IDLg8(Rs1v*9 ztFhgLv<Q`1D%b)>Z!aeAmlY^NR<y9x#hFpUs}PNwx(_y|lA3<BjE~URu!FRVm{iT5 zp#N4^!>Z=7l02qrAJH8q&QBw#JU_o`eC~8um+vigyInQjNKgOMUUT<Q<V2($EUnt} z$?m@v%}s5w6AeS%X7AJez0t~`(_QYwuEg-^^!UKhnA;@gH(N&>2aAn8*QI9fk90Jd z#Vrq?-l{D~bTpipO${B`_5d-2{_nqjn>Tdu0e>i1;SP*;{oBlgAuyvo5~|W-0+3qt z;8PL(L1nx=^x(`-`L5A`+rtM#e&u3^;RnyZA_```K2LdR-;`^-*f8|7hi{ty&k$Ps z%YRnLx#{8aPa>279!B}4c2u*E`waIEm;My#b<QSWS#MJ<uW@`k-LOyHMpxidbi)Wb zEOo{*_8iWI3j<IKv8m7?G?>;X1H_HV@(ob71{h5hH%U$YtuYy_e=Kp{7nVMAffWBn z__|~?edA~7*R%bd_#gYsjr8s_fG=X0R=RP;4Z9sTZ1P5c7$cqi!{fW?)?90mCwf@+ zd>cq5v<2q%*VE%Fi);l?7$WnzB2PQp*9m--3*UrC1iVbQFpEjOPFw`!IdrjTIV~;; zTQa<|GfLyI1SY`E%Z@nXF5+4NIZpj@JC<9+#uF1B!Vkot%R!Azabco$UCdz}{N&v_ zyLDiz^VasrO>SSAuCve`+&X<tYWuju7c(I*pB?-wt<4<qmyPd8jZa6Kf}X-o(Gm1+ zi?r==dXr(7HKj4eOuJiSqgF?Jq|F~OgG_Gkx<3~!rih<^j4y3VG`|ggDKl>wSY_C; zcETKXkDfg+@}sbS5z80|{b=Ms1{*@d=82JQ`oX@bS%hrLd!y-h63v|@%3I;k!gUED zks>=Zsd&X9^ypU;7Kbs3(Q=|(GX4W({3+)m?@g1^8Lztot{Gd=2swpy+Tb~u3w1>G zh<69WQvHQYNcFQJwGh;&7V>p{BzPUdVyREv>edPiR0HQN$B}%|CWsL6Nmz+BS!eks zmuo0=d%76KP9IkJlvM|#9VC0;9m(}_atHh}C=7K(bOg$3LbbWAa|KpMD?Pw=bNZSb zBr4__umpoHn63M=C5jWQF{GLY)*f1*K4jO#-<Uvgzk<pK6&k;-sY<;QS)3zo0fRky zE8q9@_QeNwZ5DSnMnXP!TYUWg9RK?Lk9zl&rubIZXI;Gc-~r{_=JrUOuNgo05BKl6 z_i^Ro%eA$U0pHB{A>~oiAm0(#e#)gS3kSOze04idJ#x>%ujj`id_M1-J*GS%@&mfo zxg-0&;xXEv`P@^_X*|<AVxyCg*O=mE`=R@}zgOjzDPj7liVSreRb&y$R#eT+RF-!{ zAT?(!Q>o%j2spKh_M|M%+^X{G&dRHs$;$+Jb*qiQ6y)V6c{M^_1CUo89jyg049Kg~ 
z2!R!qI4gX$kq#!2Zk2mNl~bDIkyIz;a?}9RCJPAP82vqwQ%JSeIjnACId#MW*Nz7B zauKNwJh+2$Xeg?zP~=~`WEz1O8-@2)Ui9?$#0MsBGTzWX_%W7FKe2J!z|gLV157ew z!8>oU<XNMM=3l>s`edJ5tUSDB^Jb<;F6DLcr>F~!QdOvw>@E0CF|Go`$CQak37MgU zuWiNaOfN$Mr+XQdL@xlOnP@>NcqE8An|1J!QJZn@aPAchP*_2LLhQW*pb&fSvLpix zMQBE?OfCU2Fjn7vlPf9?hFZIrWz>LQr=1k00QvQtp?uwX(hq0@CS|daGBH3_(-$qW zHFya>NTV98^rLQf`p`i(!uqz01Ysl73b_EB2z%+qxT1JARM28dHpPyPKG=8L`VXH0 zIGwns(~uuYZtc7~>2dCKSv!jz-mZzbf35Sxo*R+@<%OYLG5@wh@9){9QAR%`q(xaS zS4{0!o_CMD?yee*1+dUHwyLvZWbcT%GxUq6&><xo2R<FmTbjMV8)8DS53<RuW^Pcf z!}_3f-)g$<eGPDeEXPM)PyWuVvJa4@gF{SXOELm1FEWozQL+In1nQ0r;17}JV;vc7 z)o5X1ZVQ2dfidJ-==XB{cn4r%3nL)Ngw=?O;DZe<@OMUMg};umC50_=J*Gi9c{KnR zST-jmEpjdKdlXsaYK-ELS)-jBYOs4{cQNb37U$1(<iV_19!?PvqM@t$uP4YkiL<Mp z6UjIEr}y_xZ{Nv>z4lc(?%Ju@-n*Q;hr4-0*md*v>+R*K>3-&x#JwX=iDtN^`-I^u zyw3<ZpU2FDeabI*xSsRn?l+T7m-!yZg!Vg~NBld@y~|-y211RA7<H%-|H`@8-kNSE zCBaS(#JlNAdR@*w#MbY07xCqBU8v(_>P1U4lEr!!=B7nfNfmjBvYsW^^%1JYTpyMk zSZ54n3ZVwg=&Pv)4ey1(nuO#)$%04|QG@1@ku-p5LX}f=4VKSshOo5AHPQv%DzaW! z)u9JdEU{YU=KBIrDb%NnpkVfqscBL$<SlCeQ<A!@=~z=g{YW{$`v5aS*}Rr)3Gj-r zlm)!hGF}nyT4I(fLtc{zmn5h`FX9mjc&q)wos%tFBAs!BpQp}j6-@{~DT)=(??((9 zZ)^Nye6llX_CG(kD((xpJrypWNvAh^0^Nbmt-WQ#Uu*`8qHt>0;gQz*_xV4x@K=VH zV$PMps)4^nc{2Qk8<z#Fb$!?McGxVM)F%d5*y=}d?WuDQG%#JHJ@Es;$z8*p%hpr! 
zh8HIW<s9(js0+aX5+p4INRX(==pv97cF|a{D)}H*u0abCmZU{i6<}fyGO-6kBzASb zFKw{~zpIwjU9=#6rC}^P6FV!qlrM_CZb9s{>GxU)o!7#y=AqIH`JytHUbW#G5@ygM z_kyp5#2bzw@XU%8@!^N?3M}COSg;w70m6y_iG_fZVpO=UD72OR0zaw>*nsU9Mkia= zZ8wX1)-9*K9DGJJ$0A23d)8LkOCzn`L^9-dR2e<?@X)P~rgjgT?c(w9xX)SWXbauP z+w{uo;|nn#R#=ZLYwK6Eyz|Bd+x2TA(bb+<z#X<3wxw;yptktFFg@6X4T@$aGjCQ` zujBrfOJ7I2+P{#i%H5dZOb1Cn8&zE`q}Rw(p~xU(NjH*gxJFplOx5spLFg)&ijk6L z47v0U7-1r!h4S&z6m53L5=i0%k=<VW-gP;oo$JOSv=X%0)}dRJ&RIv#ayUG*Mr9hJ zTZi6Ix~TYCNP?4x5gMko?SmvJ<O(f_H>6yvgaD)wlpu0|!Va@VHo?>`CZWqpW-Ig$ zi9wOK=)-bOn)gQ1nbkkF?6Xi;sLb(^{+&I$LIa<B?o@oF-?~x-<7c)V3NF*WP{+a9 zMJ-lSLe1F74VP=)4I}?i9(-{3GPP4)ZZqteU8Zq+wzMqMVc1;AQS$`2cGhqYsyo7@ zo>-QkOxCpHyz5BS07@oRYa)V0t_g!G8Q+?`I;a}hBhV0*ylj7|<blM%7i_zLf-xfi z^i<(DELFJS`oQgxwyD)`m|MmLO!K=a`f~Cbl>X&b1R#>nA}5JGvq_Z>Y$VIG7K@sT zUA^i>MjdqiG)URnQO{9xwiPOr1B=L12#D=nfrxoYgC2Y^=g-`aMZmA~KXUjE<r)5# z-9cB?NGQ=7@;EEZZl*(R+s7<+Ui=~{QT3*s_k<RwcY_g4Xa9G1sFXW9zT2;Oc3j`F zYSq<rXUmjcjXl6`q|+fT+)SHOXf6fBu-VZRu(aXCLeLk674IR%!T1<qpN4wq@P`I) zfb=;rB0--rEoNsAG^izq+eFAKP%(n&8y!<hwk>Q1#X>VEG0dPawqKe-A!0^sRHF@% zI;e=$Vv)vZesnHa+sRgwV?{_8bU}+VLk#Gz*C7PFtk>eVG@=zjbxxZnhKy-(ssy?r z5b|CP1!*))jRl!=V#%`hz2m8g;c!sQEh_R?G$z+YcE_z?k;2r8xoJ1={KxTN#Ow9( zC3`_~pt8c9OvJCNa)<WsMP%qker4@8*(-ae60zY9k8AR)&4uW22;se3cNhv|4MX1g zLNh=A+R{j`m=h@+9o@>0^Sl7gsY&HFxBJ%Np<TBjICMJrp_<gj*>yJmct`uWH_U#D z4<oQ|)9^_fsneK8GFrQ&7Xy$H@!n8nggg?sJ!l0bO{9>Tbh3G_PCx@Pp^bU8hLY`; z$56ZLLZ(bjhs)ShA7dKFruiimK{UTyKWNb1Rz@zToGjT!(vW`8oJ%2#+|!GHJFRoB zvM5E1AEG3X5+ZT0Erkl&GK#n8hr63)x*`<%ql>x{@we;a;m)Ad>w6&7(i8lyXf{V` zhGxPpIGvO?(LzqgT;Z%4ntil&_gW~7?!NFfRnD60b}UmEayaWSxa^-8;%ypi{Q_tl zHHxmmwh!|;TdvFMjaPYHyBAoFjZ~LrNDfXwoPZ_RR3X(B3{w4yaaPr~;yE%~dS0LC z!ShTs&kouuHIsrQ$}h-9C<!ttBirs|v_VO*%%j{$uB03U>4*t1ok{o&D33~+$$EK2 zXP(Ywl$p7h*&>WB7%?;J)aZ}|mTN?HN+_&tq0)mqf{`F;4V%2|r6!)IA={{nTQFa{ zI2*sz!d!^s8x~c{qxLnI6SmpaAEn}Ey(l`qLGxPkTQI+67Nt>eg_;sG;AUH7RTLg- zT$tVhRU1aTO2Zzb+mr+~baIkpFePZ`KZq<&Z=nUc2nCr`*75SP!L1KINcxLy%FCXP 
zYuZBDT#9j%j&LulyS;%1A(u|4uPJblR4P~-u!5E^P#J=P1x!gpchZTjAren(ksbvJ z;xuQ7|C$T5_nQdKi`mx6VuICxT)u{F|E6`6QaDNrz%d$2gyv}LEGW?+KGQ8KQ7_&Z z8Rgir4U`xiV2j|fzeloRl~j3*ahr2&7E*7KMT*pcnHU&?(k<u}PFh$fBmz=GvQwEo z{Hi>u&9=~fwyP5s6xso~lkG__J}lBa;tW>=^8#&ucLcF%!li5~Z>ll!!fP)HhIriV zj>iqcORov}HE}op)$*pGL3p0lOU7L%k~IP4U6?vCX!N%8rOFT5ssv+@D#cjG>uVL- zQog*9sMFzd-rM&2Yi;zFead?~NLJu9&^?d7+%+^4=_0$Pi^&xTm^JyuBq*XWklWT^ zEZvV<MTEu)=n+w0Vk{{Wx4N)yiIxK(USr|37aeV6QIL(7b+Wx~oOn%?2ujV64F)2? z0hi`jxq{O|{$}R=&6I3$D^!QVYhsj_oy5OpwwyJTIO-cxw6Pm`m8{1^q%?v-)h_U~ zK%K18Ff8H*8UR@bh94%Tg;p?apoTb@j6R}pLAs?vURwALc&(<Gy80w$%b|f8qMiC_ zw)ghDUO(G8-}B)G4B^i^XZ8B?t(A!qO|!PQ2-7h6fx~WplXLs@;Y!D4+-&^)%8rjH zVm!eQWHB;uT)ajQm9zT&hfQnNvlyIqTV~LwyOH?W2z5j}z|f$y!`DEvTuqt=&UtWO zrp@aSSVS}9!RFOavvE?hZDFbN0?}v_`%;!kdmW^LpUg68ufjXPkLuLXvumPq5!5WZ z%okzAvK7D5bW|!b%lR;bYV%e8B&1n~uAJNiae;kp&eOPdGl{auns#s5N+QJ6vEFqf zy)qIpAD~Ob&Spn-O70s+$ZpJ6GbtXVV9^8>{F;`e0`*fPZOW7u`#2==rGR_!@nRmm z)HqT#6a6HXiM3#){g+#EM2*JSVhS8855ys|uUd^*b8Y0cho-h25)8?dOX>F{f=2Ps z)K<Jtq}<*_!XO^nGJOb2m~`_`RyHN^erst{nP#`Z^SoA6o`AsgzQFve*zBq`G8V14 zK`U0a@<!!nmp>DHm+g6Nd+DWxCp?q6Lz&H)hW<|SI<+?|VaIz11YBMdCz-Dy{Y6VX zy&5k7;irjoX?^q~JM0@vTX8Wm8?zmD?3C!fAV;COQC9hlCLr$aDEY?jcHD|pQ7LMc zgOD-0@aa4gsxvD(@#~NPGV-_}+QaC*FSX>CHR2gYJK7ER*upYsA0=YA3h2KVfnP4I z@t~`NYq8jJ@4)q7E%!D3cr~z2)%wMjJJ2A@TJBc@)+~)GYG@HLk~I?x1_0}ziZN6T zS}wHVbEe;adhq6Ey-TMHvo8EsCr>|fCv;!`d{^qc!7$q~t=v3K@6V@HY@2E&G+gRD z>_o@*Z=Sz*^X1L>2QGdpB3#jp?>lkg(zidt61?DCv+n}zH+u8hm!b2UxJ%h>2Y^t3 zdpaFyTT3#aGIQERC4Vq1rUSGdf@3?lsgB9$3G`$Y@?vZ|Fs9r?g&Xw?5h|8<V{Bs( zN<PR4XlQwJdeb0DeG@=k(x6%Hg2?M(GB844-$ftP1D`?Gzl+t`x?1qA{{JKHZQ$Fg z&NT6RuC8QRmSstnCCjob%d#xXiXuz0EX&_^WXCa%F~%5ULP$eMQXma!nn^QhCe5Uo zG^J@uQ%Z(r7?xpEhNYQ%WtCwm!|*XohNYC@$2JVxVHt*HDJ^9>4DC`H-~6BVT*-Dm z_(<E`KR<r4EXj#=-uImIemw8<Smj3Q4AkRZsE;X)_yPnYISXWDl<V=e9;~*xOfLdi zluT@pevupP(DPCv==sc*dBYm5;8jE`7|AG6Obd}bj6_x!d2=7b6x!Bx`HP+4rG0&B z?4g!&@y0$~({Kcmeq1;!7EX{iYP$za($B@|_eJwyk7v~vzq)<^{r>N+o+vk1`~$HU 
z)DynUpjkY6{m(^HSH>3$s~?18zI3-qJaXN?>&pi+KJiqazo>NI2Ej~U-+Pj0aKrt# z6d(YBP}&0wz+J(gb_K*GM?x?yMv`7%eEvu0JP)m7U>}FoL;7jV|D~aEdJmM<$eM(r zAvs97JVrC5azjj^MqLBNNJ0Z5Ca!7<x}hK}5Y#NN0uM1d(*q+UVMZ{nQh=*V<c(VV zDj87SBqh@oK*G#98!D)lDfig&<@L*J$SfdSRRo}q0AVK89?Zb>gNR5cRa6GUkz@zn zfxHnNH5ZMS0!|uX^}41;ku!jkVi?*3)P`>WSqQo~Xy+7$lQEvHD-whNjUjE^F%dBu zV-e2*^@rbnOnrYpJMjJk>n4K!!FXuR$_L&VJh3;M|C<d*CxU1GVawsfO`8X2{$TBm zjjzm%O50OCmrI@~^Vst&e2UrGH^zH<{qfn6u)62PJOBNW`uqROYs@vi|BmS&s#VW0 zOKEcE$n>*+|H6~QFW`(7g%R};X@~BZ&?fW=-x3H$1i(9jSx`4p|BN2h$MO#Th{^qs zojlD+1V9uJ!pikxU3;xc2kt`f*J5IsE>HqGijY>+R+)BDFJNup1h52YQ9EqkGuMS% zTuJ(78CJNAR=5kGm%kQu7F!^=jKofrdJwfh5|CCH3k-@CZGl5g;8($LyGt8>Ev&EA zMZ=f%SaZ~dogoB7gz$w~bhkawVPF;AQRj}c!XLf#zgX}ek6xW+_1<(OFwt}1yC1Z% z6D)9VZDM$Oef#v*&)qq6{n(ZH!(wJ8a-HPJo^mqrzdyyS9nr9>z9;Tg-#e(@KD>SB z-A~?sAh~9wJM59Z`u);x>^^Ylo;_csm~#yM<k!(pP6=PrGyw@O`US-PybIP=>`zwG z)m2N8NqbDLIIZ~Ms+fAG@qsw+Ldrk~VMwPiWx$0x4VdogFGaJlvNpJEIdu({YJ$U6 z0sky0%L_eWB0xt?iF(4d_Iy=^ySW9=5T10Q`3rIdEA{&&kh>Z}pzx)8Kx3#!^nvdK zg*GC4qHcI7HGIoj@w53`yIy|mgQLt{6)=Uoo7Ma0zS-Ho{(szg<X>*_SzlD2tu&2J ze`b2$VsBShxBuibH?h%EC$AM>+UPK`nTJ;XaNphFod5JrpZa#<_~pExJElf-G2MP( zFn3_RXn8QPz-Py)ftd4h1`02#Ivj2Z9bJCVYG6tL1yL-L-E{sE1QR0*1gtDUJMbaF zZ1V=f%W07!=v~NI6bP0Eu~1%M3eh~b6gaH4SPQr{)STfp@T}s-EinBfjKVUNXwR~k zFPx5~hU<FUlv8LI+<t$5<FHYx9O&G*|DMkt{<Pn;=_^OJU>zQ^g#*)((RG<})4NYT z&rG*WM_Nat$=H)8<d+Ug?*ClB^6|Ad3heCb=$jtU&*1aRn!a7JcC1Ut)qwsbiT5$* zucqtAPl+KluK_%*J)QSx)fIsKppaA$*CfSelDv!IUUS6gha7ZgK?W8_U*X)7jaDUy z@0zrqhe%#Cz6)u)l_OI@r$w~Ur7?Jyq{F>sv;hVvDAk~7ne)_!nz=`5bxIHnlCA~9 zPt>~#Ss}RBD_t1%0CYE%)^ZE%Me(R77BG87OFI4DU6hzdJ06)C_QkU)vrD(dVQya$ z^R1k@H@)nxbfVg(zQnwgtLy%EERfw<#(K<Nhr?@DpDNp#4aRwVm_F;{yb~!>L#jY9 zPykjEvv&mz@$^?<NHQ0s`=c6n25#=^AU$18<YM)x64HgzoaCDj6Fp!(0sTGb0%<bX ziB+Ub+sRWnC!sfx!kxQ{xaN`m70J}o_=!@SVxX{BA{h0BfZIYsjmtMcKwerUj!|u} z=uxP0Ij}~lv@Ni_B*H|yHyW_IMMKo!&TJmZo2OFVyInOdlSQ%_!eU!k+!#)UC9|Q- zWcB!}zi~XX%vtqwUuH5HGmHLS@t|04^4Ij&uxf+DVNic!AGC%Im7>JgN<@wC7-H!` 
zuy^H_QlQsT?2Hr`B9gg4D{^4?7e)v)KJ$y`vV2~E2+Bdc0ZBx<8p9XONfB%^Dyp`n zX)J)pn=~(<1~!RWcG}aN9kxUVViPepybmGf#Pl;J)4pM=NBxgLrmfXv+&w*fpUHHU zr|BJwz5HLT23Jd?$#kDHw%f$YMlTQl5>>nN(@iPe5o3SXl>x4{6$&Wmlj~as)$81% zi#_OTt}YXe3GHb^a#S7mwlxQEc01Itv={N_Aga|lup&nCecBC(f^r!-QAHz4ZQ4Xo zuF@ANt`$1qt?-mq?kTO5JqWGjW&*0Be63o^E8JMC5=8Z!pOD$W7C^NUUgv5plu!^w zlrkDlDZWs6zJwusE7qh76*UxL)rY-x)`;7|bdPyn70um%zLzn{;_!8$zH+Z8FWQ%n zH>uBqKw8r!BY8OWF%}ko)l+F&V6mui1?TsPd;K2k;&Zd}0rjy(exnFn2<ubyp;g3z zv($%(21iD0jtec<lmw^%L(-KdLdQYR4+_B;qvi-H>uJCQI8Xi%>cCo=1T}uF)RJNe zhtE^Vgd?LC+^m*5CnPvc{Pb`oN`-k&(}S{;|I$2-=E?l$dYhVOW0&M{tXF;VJg>`7 zraPjS=5^M|C4R@(RY$drl$Nr<?*_>Yhc!3EG<yZPVP!Mi5bE+&Qgj7CwW+Nx+%OF{ zOmjC(LltqZKaKn~CvYxl0rguf^oqi|rpYC}aLF=Uhj@Sh!4|?oUYc9d2-?LwvrPRK z8-XCkCveYSu$Ahu3tZGaAs+sCC*3}Ao|iJgfVz1Sb@N+=L)uISNUm#$#k5c%H&k)t zXC3br)!l;SA&|a2nIu*bD&qF7rs5Hkw-Nc7as$E&Oz7W&3-V^|g8VE_`5OoiKfDqb z<1Nrx8=bgn&2^=lDek)2o}0PmCZ5R0z__fG1N{JxU8sV$ft;tLp@bnY?c%~vO@ZQT zU^WNel}22HG7TTHPb8g5jWihyKEOq=_7I7g6kkqijLHW5eGobUX&I1AxMifs^9FN# zFr@y}T5fcy?|QqN%*=4g_n_#iUN^zI9kuFz3TL8{yUh0;qj>Wkk5M#D#ux?{thP#K zgi?2pMRNaTsVF-P_CRbrroQH3_3Y+}K-!cxy5r-bK?+;k)^@hf#k}g*!mf~L<_e%; z%T{(0WDNoKgY<F>MomO9Y{5K=dd%f!8zUZf@wv#Lx;s$gj;O&cVfBT`#HV?^)&ZQ- z3pjsoVTgrj_UHoJ-2l_%3bxw}+fC91-j>WyYqpDZQd@+{TVitfv~nGClk2R?X4r0% zX1mW&k+$ZM#FH;e8m~~3RUUd=v4H|+_VGuaedxz-?dfHht$1AN3;tSO_c#)oIrFlz z%1H6Y%4C~$&T5;j1c1gm7it>9OV9rE|M*|{(K32-gx)OoS?Buu*3nNL{ilE6FX7Eu zdNcj_we;hUJo{gM9Of_K&DYbLbNJipzf%8|fWcyeU~D#%!@on1uFd4|w=H?;`b-Z0 z4!vZ~2>sSFvlY0S$g;lv?9lAB*RLZYo2CD;kADlX-6+?g3XyA1qaB@*H`~6aTMg{N z26F68_T1#umE5tX$+0aoC^%`$CDT;4LXK8+<5U}XZ~|b+<Swp(e2TBNgQ`E>rm;_x z6BnBVTpYm#^$@uC`BxvYwGwv|VV5}=StmYvP_j8=>0L&HRDK|ksIiDQWxp-eR7Sc& zj~ZCX5B&XcgIoXLYp7ww{Cmuj!SoeBvwG?*%&@-qgy=TL`@(-KNk58)kRgbsu-TQo z26|{AYo%%CK8y8lQWlqn7zsF{ZND}^*p5Y#(F({BONOA8Roh!1wKLnJF}K;spQNr{ zDDwdrcg*UGgFZAZ(jzlbey+c2koCL3sR@<hCAd4eV*zzX^+kY!Db*pO5~XDsLNUsK zs<g7Und5ng{|kUT&ZcQRU6>%0OSEtl%!(AXnyi}GFm8a=H57fUCgdG-j4@_c7%m~` 
zq#fVcw4bn96#(p&6n-K_d?fuu0+{o2YCwN8?g>(AS`E~QJVe@Qa7V=t!4XdKClR2s z1VJv(1fX$&V@Zqpc|f}m^W%2lSpYR2I4K#oId3&+7YA?Zu|T80&~8F-j6I#cduF(< zbujJM#r>&l%r`Q#KOOPK#Zc1gPWPImSKcw0d(q8~1uR};%Dd+_u+shW?pO4csiapr z8Vjx1X=42q@py&$xM}B##>5#{k3XE!y_IefO;I|QsDspTrYyZ2&AsOj9><9ma3F@( z;+$d34jj{+5UvsaPeIv&Wb9*<U}(1pWfC!dgT$~-IX;IA$PeAeLQOZFQ4Sv5IKVr_ z`H+<!`~^L{xY$E&hauQ<HYN|B&I2P()*DHX8=#CrQpC9qZuJg;_h=>#9anjEqR@1O zQjk)HxB2ikUm01=M*`PnL?YH-MmRobr!C3nuEl#h2vG^i5{Y2oO4;T3&XG}?2@(P5 zeMS!16ymqSlrIuKIZm;FRxvD^PP=o8+m#m+u^TD#YXh#*UyMi#+28q@O+efntE+2c zf1dKzwxgr)%o`H=d%oJ`4M=uMBjusY`ri2)|5h^j(_UX{z~yzCDkLj1o>W-efNUEX zkEsm%*R4zWd*)y13P&=ce@U}1*fW1B)8edB-?n$I?vGNx(Ay;rV^+^>@_YI`%xiSH zjOy$5L0i~hFXTgA$h%)doG^&mUd~LWT~I2_Whf#5c_WZO6cN;gfDKCIT`@CgtO*9e z86A-$)$oo|ZW-ThB+QztA<$5N4Yh1h2cv#%9PutpwFm>6>#xKd*txkRPZBC=ybI+D zN+A`A8h}PyuA~bcB3Py!mdVw49Q?3dl6d<AMd2+A>WobV3xraxv?g^%bzh`j#b)io z(^405af=7Jct|~S)HIRuZhdsieRue}0v7f^Uny_s@qTW{6Z^a=)c4*tccU4bYIRhp zKVrV9Dcm7iwhcXboK2<SbfOqB1(KLy-?W{b!V~DgM$A4UKx9G|y?F!f^C97p^J^P9 zGv6R|uxVL%SCRm(w1b-42WYulU8N1E)$@tEVGdx8VjZI#y&A<tZO$E~wRY1iB<3wp z^guI@ueE~*zz)dM0fv0uLQ}QZ0q{O#fEv4Sb`@5eZZjwamh|)v;7LKLYlRKmRfSF~ z`8svPi&YME#8KI`UD)i_R3aBvd*OmJROvQSZtpRMW3TL$%>I;b?WqsEDXU35`GQnl zm-MHnkFE3J6qX+SiD+(1`HpK24ai8u9PEX<H%lK_+U`!?xpV6_CY-Yi_?nX*po!0- z0-wO=Sq71R;li<yQ@12Z|1Vb+iqjI0`1>r#(dv@<krE+<rS}tCUcaBZ+I}kuIyT@* ziQf%`E@d)?QA<uMZnVMN9Gd8+fjhT`8irh&h8iX$F-j$vWJD>2MqF|!J^_=Wen~9B zMFHKwEP}9829{U?FN!h*r=JVA^wILmDd<V}kk^k;ZX)8yLgiB_;juIeq_Fl)TviDq zr3-ZhEImM1Knu6i@}rcu_!8}bV#bK)sr+In5%0RLs^-L<{hvh*ftp&>#&XApV%L5B z4?eu-z4suKRPoe_Cku`00{;Hx%w(HKV&73`%U82C=e~C4*>js+?6#RY+8c!bHe+v0 z2rp@|fwBsA>cu1@g|y3)oNvGtraot0@(o$B6(}r__&Lv^zCM0Q_;A*66bbAC;_O^4 zw@WP%5R}xCHA`rY1qdtxcGoYMsDYbkHJ-z`8sPi6&einhPVjNy&jI}OY7XCDy$o>x zK}Z+@vm}tCE309u-0~s9bpsc?3QzY?*Bdmf6V%bnl?1eqFt})`V&{3WeMLzlHx@S0 z`LB|_ihb>3ud0W^WSLn<<?%oEWhzbLj#V#-*hc=2wU*gy27DR6S=v4RZ|GB}{59P{ zleA~zWyu`q@=lCC@W<@Pi7gqwK{Rne+%~f)2I717Z2<GeuPS>~ToZ*?+D&+cE|`Ep 
zZ%Pr8Bx%+g{aruK*tNJIa8}g^aAyn&-`37EN<B2FZ%&GJ!Bjb_rqM(a^-ThpQiU;T zk{b*iyg%hQ4Q`UR2zbvpH2^chI4DFq7_B}p)q?&OOghBs#znvT=!IoOlhViUmOfk_ zyl2%1ex12eRHONUwF1>RB^03pYiX@$kyfXWH!i_0HYV#y29v_-DpXUJUgGGSu)WD% zpl|Qh@KrsDr2s2teTGFN$%um)j!grb?Ma9B?Myu{9#O^EB3Lvkl`Mr--J^bAy<$m8 zDaP*Jy=Ri2S%YU)i8uy^gg-0b4^u__VNTb|<0%rtEFusa68UhpjD#@3w9{W$S#<tz z&(g{+!54z~Xz}8OVl*8~L@L8?l73p*D6tv!^P7aLsMGzx39OBEVe0P*O16SlmNO1$ z_wM<tr3FVz@rLc_A<$KTr9BVD+t!5S|E5r)4iL@*T+%Hbg4zS<tDlo>b@ApC_ix|7 zp!a5Liic(&K6Z5JqAgW!h(;edasTmmm%hwOcU*lGYbpxPvu_%#=tFe~0|FRSB}6!+ zG?rUNSq9-|oJ5I;E0^H51GKte6(??wW;%cRl6iB9u!h!lkZQq1NlRfWRHaY>=W3Qo ztK3J8moO?M;h53~%=!wvR7%;Q(@keX9E>tKRtnkxOuwOG!U>v#E(%mgBO#6EC|aV0 zIRFc~E!2go({(YLJCvL*zoCkxl-nz*g=#2CR_X7F9eDAcyU*M)Uf%y}r|1mDGqu$L zQ>CZEwY<J&*}!n5wwonhdz2Y=n$Jigb3MQBsd(&vAHU;tAoM|e_okhvezs?7{2MlF z(^yZd$L}$_tIBI)weFCs!tmU`-TUsJx>D+E>b@=FOHcgZ<v(-rbxfUoTk`1k3Sq1y z(6O|R`4Aue)YKp`t;Ub+yi=q8_LNDa@B~=D2j_VU;`~fZE<c@j^99Vg%i(o`n*f-& zHqj{M>J}}Wc-;g4^#EUPRX{Y%`&=T<W;~1mD3-6oY&7TOsn^s^%Ft5HiRigvb)7(v zfFeM#Ch)bADo*HFY0HaMUP1#(sWjlq^_7l5P$O&r#6Oro+D$y!x?cp0hyMGspIK*= z-rL&eXb5`znI~KSc^jK$q4`$_X8&Y*i`uxBRr=V5;fng!Wy=~L6YFyOw?3X&e$VtF zZ#Ax~;I;b0_o>GN(t}%$#@Taa`q~}q^_GVCmE0%QDejY_;6(oWLIs&zaRnAt8?q&l z&02CyAS%f*uO!C=_?sM)&tqSaq#h3(Qx;>&2=z~);OIPEkP(rgAA)pO$-pNwR{09@ z&ln!ZV#*c#DLy%<jpcG_#|kRs)q>$j(O-czHYLU<ucQcM$R>BggH0niV+H*XQQ0eL z0Jd@!eSDa+5!ekbmdYp?!2+l8^^%-Ajg}($I|VR{4!&z<|Mb?;zW&QD$;qF*F}CfB zHP>a+eWO(ccUNCIG&|X|Cu9DkKEE!$IkxiPOr&pAI-q&KD5TT_-1mDhIrxT<Gm`Iv zLk@$_JZ9$z8=m{taNj4BWH2S|t1u#Rmzy{x3_=>NQeZnx(g8R_mb7$buT=rdu@Zu9 zdJ}TUN?EP!i_s$qb&9ms-%VTj$wJVY)&y4`kl_1DUq7ObG+=cNe8g;peNK!RGGx6* z+nlgGNq}TQF=ohwQkqXjgG2^{g%v>CEpMfLQ-c(kbfz5u+aUEq84G5_$%H@YX!^Ll zgT62Z<<Ypp;1kqe&e=-uO@B3iz+U-Lo9UMBfH63oOmqb{h0|fr{J)>GwSINr*!s_~ z58qrpVqNQqD>ASuv@RIdMVIb0iZ6kE6$g<A_`z&)POrvA7k~q2QymP`u+ZqU@}!9) zRjU$QvUov(SZiWp+GnxnLV4yn{ky7<lJ`BYO|rnqQV5^<Ico-1_s53qf?K$QnS|UO zbYE@Thq}E+!<E~z=McqL7ShIGTB67c_g8a*20spjb%`7wAu06iE(4f%QJH8>BH#J7 
zJCbnA&nIbqWF3BB0t@ghz5oM%f+)L6!><6@c^F}FXF}NphXb~6*I^12c6|{?HmWqc ztcnh<39cQI!MGFNTZPbTuD9>dZRC1~ZL%dJ?;?P~(0+PCXT!lPmmyReTm1#@kzb?I zng7!}zCkZ{+U6P>H$qg6hQaeqsrIXGq{q;>CpqWVxWMQ2`3Vj<k;dadNQ5c}TFFG` zk^cP>i11lT+_y_Y@Fj>-+`_~hqU*U~Kn(cDzWMUe%lrXx-uioD5)tp*_{Fgg;}FsR zDB?l&#vRO#tzJEr4Z20U!SjLed!iPh8O=wY`$$d^QHUY8J}kW{c!k@TPRQLxK6=wG z_^4ki6m8t`Me@-$ilDAc(otClH(P@y9$jO-+-r$hB+^DfKqLl_nn~n44&2rTzq~cc zML);E-UQKHU=qF>ll7;SGMtEIRs|F^Kqo8P@ZIV}?n~RqCBL)>e?fm@u6o;-j35|G zQUI(B??e>@4y=@6c=I7$Q>bU~3;W2WzGRdC7^Kj2-pSjz9FhR(omD$<4hqTvB-nF} zH*EPby|crHuEx4g(ZdLm@^+1Qq!%uUJWugJvZnw{WSU1(xXWD<cNz7t=sv_#a8RoE zw<l9LDrkC&-;@O0;w&UYm!CqCPB~!(py$LwLZBoKHP;lPaVrqpAQN=bkj^5X%~IuG z%H`pdXcFiiJo?<%!Tb@FGSNsks#3MV@R|K?%f#l)Uv5bSd@hS4<c;3AeMQg7U?^<` zBgyXZ7bS-+9ST0!lijf)>J2$8E?+}x%l(<n<2E-l9^QOaaP!`9%I1*vOg|&ptf}zc zn}Smtzb@T3xo-W0x2eHnDmQ_FEnr5Az|+|Ho1UOW63z2RCGQ<uDjad&n&%(;StMW- zrE-nm#;g|&f#}a3dwz{C4#|Km->Q|4s%4^N0ga9L#!c6$KZ&-0w%=TMw}io=<*a(m z&2JQHmI3gLy@*-5^-R)u#tKxUvNw@SQR0aZo)XV)z~SlJNiB$)qyc|n61C`$MMn3# zgM^I6Y!@G}yMahD2*@h;0d32(BwpLZxLuSNJ*g0fi6OKZKAemUEeJv+k#laqR|mRQ z{U)8ap`&Mb<yF^mHxq2RRTEcmPb}h}-RMLq4n((rT;{I5{!@5So<!4i8ehScv66yZ zhk&39xd9F4aWr&2+)o=sEv8-{l>lFKcn|17SF0-l?D|giS5GYECS!*DI{6gR;>EF} zbJJqX?ac;#(Oo+aXZ(REpWCZ+wY82#Y*rlr^8fPK>*~{sY-O+h;;tWkCOZGhhi1&@ zkL@MWoB9T=Ivx8I-R~H_)V&5yhiim?*7Py*%QWMonqZ)Kx|FbV;Ofg&(j=Eqi84?w zL+23YSx{!h!J(-OW7vh}_^!N+-gV^!=J<wco1r(HAL5Z7?^Cp%R=jY6Y$E67z!q;o zs<D+I*Zr+wI02*@TdlI0B5v?tQe;N}Dg%s89SGvzpo`II`|DDnKG;35iig<-yK))E zS-?9_;0`CsfT%J-s!_hurmTXcQOr1>zl<zgq;j|*xxO$Jr@*XusL&yVc+J4(hyU=8 zuF3o~|E=$Sn=_kj+n<@)y`vA@4vV9*S{6Ra=l(*C>%K6se$!^+g)n#{;al5n?77m0 zl{b9*Q%|WsDKMQqb>gP`*7Z76H3~IbDjwm>x%1{4I}zuH7~gEvA?$%21L_E$^P3d@ zMN_t={Dw9$#=5)|s`&}L;MOjf;Bvxe$pf+cg!aV11Q}oeV_>-nj!>U~{z*P8TD+(i z0u;}QRH~rC3X@=PQ&&ZA*AWRoLn2qj`PHfj*#=!Pos|l+`Bhq?57F^eoRAnPEP;G; zy*xhbMfE8saOz$G-6_t#BA^8LJwutlDM`g!L75nU4Tm7R2@zenH>1!_mV0Qw@)@P` zb|xapqSZcb2Jj-Y50dv%<^VtV+roZsP4tZ%*n8jYV#)k(7Rx8IA2|nj4%Oh#x$Oqk 
zEm9Y+;Xb2jUXi?I!FDt<3s_wqsS!0b8kd7dn=GfS#+F=7wi2j83tN?=UZB|u>OWPd zvCA%EF8_kLcsZ93AdMhCh1I|seV&mx0&USqRKC@LpMxNs0y0_{t3FsTmPid<7^&u5 zBmK6+em3HSq|8a1N$6!d>pAm&)ZqKDdA~=VJ9ct>!fdFhAB>IP6gkEx{u&l1{sJE} z@i+g}8TEzzt7oA99!Yt=`J%SJyk|xIN#sFe!e10vlnrLI*fGlw`b`goYebDM4|0yz zTzek~IFKs18o(P?$P;+_3Mf!DmKzseGl$|f(+L;_M3*ydVs=xT{NwwJz-$;A(kSEm zt8DnLU&~vosoYQ>#}!s(`(A~&E}7!gHyHRU!+dig&jPj$q(UWuG(w=l&<=sDJ|_Rd z-daq_IMT8(%U9wv-a4~w|KKMc=+g##3OO75%{-HgaT9o{FKL8tF}1@Pmpv;=Td_|z z5dRS2bd|Ab*nlpll95l54eDvPq%{CxbFz<Y5EU!xu`&2G`He&(%XbCl$90^^L`RWF zX(H$1hnNd$E&xarw*&Y~vN*G{{LISs;I&cIg|l?{3Bbi+fKS;-o5BTilsCW}0tRCn z)7MeRo3-aNQ&-K9%Cw+}xR;bF)EAUmV*IKZ;2lAJjukr;OVXpoRz_m6)8Svz;aGUO z(BeQTxkYShVPx$F`nvF!>AF7Q_}$tsiRICZPn+G-mEb?4ZtLtei@nBWE)4RWxYX%8 zN#dhCJqvSu+zTWc<qOcAMZNPkdG8!JEnXGUY3Cj|kUy+<ZoI_aIWP{S8CpQLEa{{# zmiLzcY|BXd)kXYyATO_I@8D&iO1qqb4KLC<_cHL?1uEOxaT%eROBLx$X`Oph6Q-kQ zg>c&9H3uTTkM5j*E8Sv>s;{w#qbivY%^%-6=Y4bC2iPwoq{L2hBl${geSvE)<%7qi z6wMK?UGC-CCtWd-OqTXdvc?j`zDd&Mk9KWs0h`QwffT9bYIz7(3vv=fY#be=F>uQk ztK!04lG3{90^eptREu>U#Lv0-ZgLT(nb?CX&?3*(gt&^3#g?;n5>NrOl&C)o2wGr| z+G@UAihU4Nm_<O|i;IqQ@De-d=#QJFy`vk`9%IzxvHD})k87g$G+|ymIyIJN4u>n; z`H9=;(D0Lx>mT6tsDLml>=$wg*g?)Fl@v~`$N+e;&MiiDQNnd~O`;3;#?^(1aSWEC zb+TAjT2eS#gB7I05l!XmY!3<LmRfIy00}4+eXrCrvg)#FegJjRfKM$}_M=KcI=?QU zGt}vH7FMS#cDni1k!Apmp7UIy7aaULOazvQC)W>(f%>tdFMo3kRh&t7)8pkY=tP61 z@*hQ`&z`7t8;!?4<2z<Djn4+z3(+h)VGi^F0Ae~ex#Lcw(HyGsTb-pySa+%uL67mG z1)4KiF*^T_!DNt_y4K;VGMhx5DQ=YLPBFC2NTcE3+!BmBC4KQp%zaBnvkmF-wG|H7 zVf5_Vy8EEf;uS&yhUSG_l<WYmG;|2QfG~L+3I~*!3XL?sxe{A~$|$)G;;axl;J3wc z3gA}R(@Gx#;jRSGnX;c!l~$@L$N2OWo+3VN66=ulF<y7+(`MZ2+EG%$pr~Ap*Nb%) zWHkgOwch+>BVk4*xo)|IG%DL{zm^k^V}NANS6>IPXTmy{Otb;oi9hQXpq)9q2=Jyb zSLaK%5m1xmv<Do8=||}X1kb1flCo);UC|LrtJ;>gHAXvDl9Pl`?DjNk)N37$sNFP& zTUX%)#j_IrqK%Dybae(yv6#?7#U)im)WTX91NE8|@l2viq>VeJQOTh$u{MuRWgfiM zAl~=QA1u^bx=?HJ2E$|bKE*Ek`Ym25oSh!parlvAviU@IqhyGON7#?{u#_PbbT^8I zs>pP7xAZVCpoG9FXx0iS1{6?&!FfX@E~l@(#^JxYKIOk@`^Mv;h(xXNh_}q_c11^~ 
zj;JTp=ev4S_0=XrB*Gx;R@|p0P>XZ}O|ehQhULyAr$+Oq2~3oW)9zetZ3gW|ezjKp zEs4n%&f--CFEXJNjOd$bEP0^8qb_GTTACC)>(Ob?RW-l{pwhC<*(+mw#;)9+xApdM zqDk~Rv6nf4EB0wydjkB-4UL7b?6=yT?1jvDCOydlneoofarLi%=iT|{WM}7OI?ew) z^UdFBR|;77;=U{uDzP^&XKeyRfv__nq=%G;b|$I?H38a%HK10=_0Y~F$>ild^xSg7 zz!@=3x?ln`OfV6XJ5LiJ6|gxuWtCmDA)ByAn_>!dN<bnh3w)2_af*+<DUM~>oc;KT zeydzTubJ?gDW+7wcvqaiM;+Ls&c~G&?9t}O!KgA<<#b?=He2Vy%`MoY_;bk~#fyB8 zI;z4g+8)&&@IC4x6pU!Z6hC^0;N_UpmAC`6HBGi$&2pfZGO`yaGkYBupsBDa<u)pG zEXR3{P2WU(zNA(RSK)Pi2)Gs=RsIee^8xNNX*SE9CNp&@d-5*ur-nnFouR_l`I|0v zGa}&<X+IjC*T>-@*AesfH2DNd(_^qaS3tI3IfH@*O`9nd*J5V|sm4J~^Z-2|a+Yp* z3WkuK#085<?|l0z{*b^6tGJfUD$KAj8T0%4(=x-!OwN$V={dSqKMTLY=zbQa?@Z(e zv<Py5Gwm;7`vtN&e+7SykLZumfjga1mQh=`nbQ|t31gLv!B_*?31W>^+EM19!TbcZ zftN)kGJLQe&oxk)3fWhv(&oR|4cyK<Wv(!;;zSh}^dN0wfM|JbD#2&R4VV)LywFDD zfR*iN4P$5@e-N_^p3at)sYg#444!IKpo*9B*qSJN*Wa<hXnY&Iv!71Ij8?C|AvAsS zOM#|G_CT>I!Ef0z$vQdp$SUS~YE{VZPTsd(9X)X)al44y#BfA96K5fHpTk>aF-nUU zbH|kKzSw|7Gjc}5@atB??hVP1x-ovO$+WvYc*JB%w+2=F3^SV?_vX<#;Rzz;+d5(5 z?Av;p=gkN!aT@=FkZU3jh;hz0V$5CvrZ$ROp%W#d35n}l<RiW!2EOtgolQ2hLTa^E z2n~(lP{zzJ-N~g1IZ6AQ_Bsw|;39x(B*+_e<$!Oy3@3Fp%8hhl_gi2WR69Ch*PLzn z3fhgewp?XPlpbA%)7;lYk06oh$FVLbooEC<<PWq&qRD>R#1P8#r11vPu2k|UmS!3j zuSMSAa8Z+kF2oYv#HU^`m7_yV=tn?8ilYMBgH4sxBGwxHC`A(bkWZ$IzR2w08iOGc zoIJJf*@?~HKmMJm58k@>*4sYqvPU0SKmFBsz+$P|bI0Dn4XZl-khvJ!^zC?4)AUsQ zOHR|Vx4*k#c*9<uJsOSpOy+7&c+JeNBS#KpXEtPXhhN(*`ldH$q9L=PYA<Wde7R?b zf5h=a^{2+x#@fNa@Rgf;;_rx~Ym;kavY9BL_bP>XtUJppg<J*}<FqPWWIBxX^<wKS zZM;%*foOGdBlLh6cZfYdKfA08BY(;)zF#(rL%x3*a&#sm3LZGEM6pY+0dod#A`o@$ zoHaTCWs*X43<NLe1klLHH)`9mtRCgEMr%O_9gPdptAcIw6WW%Zu#!w#zD;{Kh0|85 zJ7QB-Aybzz7z0Hxgw<r`kX<e%Q~<GVOibcc&_9C&x*djJM!4$6BzT5N*n!?KO|W=s zMrcID)M)tdCTt~@mp^t37rW(pkglA21c<`Y7+kjv`?ffkdcnRO(e^E|F0oQ@XJNzF z`2(TpoBk=#bn)2x^^e}LyB8YqOE)?8@EBrl(na(N&0vK<-Hc-He2<3wpzbCSdFSgg zQS>d9I_%;?#}e!i@zWf9Q<s?2XW~Y5A+bL*R;3!YrO8hS()fjHJl_s?h$J|}N*NFu zsO|9WND*!uC&BU~jD--C(vEu;)5U_4>4BvpaokQsCk*3VLMO6_a1t?dhrE;arU07z 
z1rU_@)Hk0yg63hP)ESKw0$R0;Z6~0Ur){iPEGqz`K#R#{n+jN{4-I2)HURZxiGD_1 zrop3LX2v2Uito4Svu|>a2)E$F`f|oaJ^>`rb^@zOzQ$T6AyP#k1>h3{0S9PC6(b@V z%<Bk>6R~EFapV_52yk80gO=Ya_=&&Lj{qKBF$>exvHJG(so0@MS@$o$wo-jsef985 zdxwp>7e)2VbF+6}RT+DcC7<z0vG~3l01^UUzxu^%|NM#1x>Asm+LIlcAA0&9-ngIt zzLB%<AjRA#n1CkPDDZn+F<_U|Fe-6*;#|)Q9%qGiVRKYT^=V-56&<XM3bhB;7ZOCB z#(^a|FyVs&OZK2u<fPNrLpH7^2yw2w5@aBBq3RnrHmX-dn-~Wz<s)Dpx3wkpXfTHw z4B}Sxu<ynpPim}p{+F=Mi_dOdr=IzidLktH2WJmVZ($LG7gkvQ8TO*+Baqt5>icX1 zfN8fJWeGa>B!Sz(umglA6~On!gu@y`Db*ZvMB}SK5i{RX?{UBu;6i~d<cb(L<Y@Mm zvIY?s6sQ<VPv^@`BoZo@bC`8ruYHGGDnOSKK6V^m%S++sl|=Uj(7OlskE}<B6E_ku zoBL6_5|k#y6FFP8l`p~hd4%pGEiHbK0mIOwn(Z3z>Kgyq$J$;0_Ui8L;{WR2k8uJ( z^2i6I_XUdg>%eaET_M*(_e2fWibOH<NzD%&u@)#|a)8kV;XSBMmYoK1F2(Y5f@>Mh z0U{-Lalj*;4*-JARfJ-Aen}Khv<d>v#`q9vQ^i)Kvx36x09eQ~a?GxnF+w0O#Mm)} z?U)poleU~c1Z>}j1==6yV&_HEH-C)7N#6=*I-9i!ntwg>=tp{>j_zGxo;L6`JtF)@ z$Q_{-y?Hm<Mwl`}BPq9O=YC3Gi33*#^bs`=LORVRay@t^pV6v<n58e2KroCZPbczi zUp^8=81>mCaZ(e~X*98zYdP^f<yv2l$p=qkkWJYR6Rg5{z5O889sGr|5L3Foh!`EV znKSWcDVZAFg+%Li*y12|E3%!Qy%ZIj5Me2OdvH)!Y$rN@`@qo7-FQJzz6?{%d2ZPH zRf?zDY;(1B8l<Grp0AI!UVc424T}FXrl_&q7t)`_MCkuVfRnhsb9E>FXS1D@gm;{o z_<w<LGKf6&%4D<@ER*>l`2P;wM6o;gvGx0;4}>k?;@KnoTF7l7e;m6O{-_oBb0b%; zCx3Ke!-J!Mv_}cygsByx<|45zLKc8oww%*)l~d~rSy;`cEl&93q^1hknvi$I<S!)i z+q6^k^KiyLipg6~b5GoErPd4@(#n+x9b$>xm0XzXO3Lfd!QOZ!1(qAH$9D~hxt&*T zgwO7VckYE_Zr=(IEcF>1*V4$cvinLnVCdS<@8RyaiRh6GLzmw~eph3gD>V+Tpoh>f zEvx3uP`Tcw_;4I^TJ(=WE^!{k^N~wjdV(&eMdT6?3QLho3!?*loCHQgE}ehQ2dXOo zTDpHrTM@LhVSAQ?mTXpzS*o&Xn5EBcU<J@pAax6YmaOgt%#zi?IB3ag(?Cm`j<Br@ z^`uN@{Qn+MN!m`=P)7&<6FxPISZB<z75rk|LY6dpTFIwqo|Y&4U<}DStgW~*pg>JD zW0WJh5_5H}0bts?W5m$jq4`N4{A4I5`%W*RI%<d6J8@(|U4%5J2k{V+J($;~d?RP# zg<%s><(sS;O%#er$}nE8Bjj&AGDK3nwbJ#`Ynv-~)$$S!^n1d%TEH4!0;dslrYL!e z1G5fzR;fkhR&mq)9kAiAfF21~ZF6$)8>NW|Fh5#HsUEN{d!ag5tP%n_^a*8@)K_=J zqmS@|iK>IJ=G&h@LFv<xL4DJb5+LpODa85HxYLtD53+Ues*-K;CJnK`FLh0Y;#>=X z8W$EK+YBWMn@^EYKWsA+liRfD--kTCXe0#x1o0+wn7KKimK2#=5jA7u_QG7fF@<RQ 
z%uDREitGcFAr==mMBeJKw{!G%uYIn(HsB&#xNR5~93ZhJ7zq>VIYGx$`+|+Yd^YDs zm5tuI)JUIPN{MF)7c0+AeF8b9H%>gZb+ii>LK&`jg*@=lVoG}tUEU!D&I=%@*E+3E zNc;8Eh)-`3G+kQK<27sShfX5!7Mz&+W^zkRZaSS0YT0p6(-|v;w}Ycf2{yrH?Ifau zg%ivU11(G`bh35i<qb}Fd8r}DmkWC1I=Ybk0AiLfr=7xXR5iGM90=s>9AJkbCO^s= zA+R#AY4N2@oqGS%KUD92Z}x=;XV*q=`m6oF8r!uyllHRPRdK^-N>dT@UVYf>PtJcU zr2guAFR6bUUEjZP<X*-?N4MwJ#SG$$;sdi1+b0sy5ECX<pDqYK`lxU_w4?~fMsN)i zt{AEr;Q`$NCUhntx|?@IYDi%mh;}9~C)RmrEGrGhxAKy<6c|~;IK1LQX^6&Zh`0b; zW^w~WZ0fNf%>>o1anOReAdA6^w2>ed#OtarY<mKvp%5hL>23%z(B(l`J?SKU=03)k ziI;A*77DR(5_vDL@{}-%&8RQ!Or^YH(imm;cm`ZfPgMPH$DjYjBRfy}{aL4VGCgyA z<1YKAAB*Epl$^Z9!wlf%HI1pl1IHgeb0Fi33`ZvB{`B|(8b?B{2DpCoB|X9wsIOuk zsS!JtSgk=|4OVM!vDYE5O!8uTM|K=f$ydhmGbKVZ<&r3ywwzW-fX)qoJc)1hw-A&9 z4CCDzi}e8VbnMne7HfG7Q>lyN0YeF1GwlXFBV@s`gjU;Mn|$@D^vWp$O>p|^RpeL8 zX-dP^%~x%40=EEZ0cj1W`ngWRM~xL=oEI44MY1pr*#aBTp-Vn$ke*q}SY2tG|3}GP zTA?p#>Fz_F<Xpa5jQNrlaz&MxY;cqfts(wu4ep}8s-`@cab-X7`pz3y5MMS_&5!8o zEM9v!Y+fjm0ui%NKPjMH=0nUWK8icSuUn|%bnRvbf{x_@2g(1yex(0%Ou7v^X={at zg<L<`FO*cWuwO%rFk-nYxSv-d04J!Jp`wN?I2FrZQ?MY^oJLM7?TC&{kYeBy+wBrM zEF-))K0@>o0|~j^sw5FdnYp+unu}y;Vv4}LZ!XD)h?9uU=K7N~;7bZm?LdYpR}s86 zY0HZsQ{iP8pFJ18b`5eUL0O%}8_G~A!|A{Y%QQ(+aDRj}Ug;v{Wf9zh+psvmo3c?2 z6mnc7PO|vq0%RswG--*$z%uU}EQ26!c*>nLN$Y#x7t05_ymg%epf!_624%^V?eeKT z-jtc?<iWoZO~YLtch8{3aJx8JFynJx=F@F009pFfYIY_PkctDqL{mHea%L$Zt=QPX zURMFytXU+f73(|1@L-)rZ~IWt5e+y<G*VaCgFAWM;8TR#Y$Gc=lS&!}tMo?6iUX9g zU9HtlN2b=24KJu2tmo~nG3-T%dUcMGi8|Y1qP~PYrg6J`SY4qWroAFT`gl2OJ^h>& z1K%`hutX~Fs)TX_>^jy7vn5ah=LLwrZfuIiT8CDWxFm0E&eeFQuBL5^vBCiMGKBQ5 zA#3-h@nilmviA7_DkfXc0U6LZ{qL}ub>Xh@?(T7QkQXq-pZ-3!v;OTXIy;O1{|3DB z3-|(T+uwCVb05(5@dwz)v%-4e%}>6QfkXHdC#<5l={m$s0NH>1ex93ZA3?-417G=s zJA0iki62;lAAqjZvZS&Ou>j%k+ox!s&tjj?*g!ABJC$qD$~0hjoc8`)Wo#{h@)h4Y z+>}&#{lB~M&kuxu`^~?T2gA&dsH<c0_qzEJk}OGTV1J2f<oc;X?a+%zWtZ0fCZ16) z2Zr3?tqXTN0u%&kRa$)zlV*S!d}!?uRnvSOe<u%I7-%-<tknUctwqp_8mdx<B;>;F z=>pCQ<=|w(ZnFx^S^{dLdckHz=P6k)=z_dHSlDdOj^4I&l=)8O-%x+be!O>D{pEMm 
zch$pRou2;|=5hYuoEl+n<hJ+zRDDA|iPY81qHKEpzOzDp_d&K9GE)_g|G}4344z9o zr^ussKy|_f)Kr1TsK@E3v|wGT>1GRHT`s5$qKXP?1W<R+2|)+kmdb*NVWaTZTB-@w zM)Bm5njqCv^@!3IYJt^SEiei^ClBEQ6rDH#oVs|Eowd<a0I?iXuTEqQHn4|iQQP~A zJ#@tH8CBf&ho_D{J8@vg!_U3Oj=t{e**-RV+cD{ac6t2S14nLuX8Y+s-p-gw98!-B z9^60m&67X=8QcYT2KN`t<J(X}j}w0;*YqJk1d(U*Mt88xh5i}wMT1}{Z)~RTh=#~m zU>r&i<501<9O1HkSO?hxvQ5oRaN#R)D#*D|^~6vGEl>@%wPM7~O-^q?noE~A`3IbV zGYqDeix;>EM*|^`Epc)u5>Jqqbo_bmme<qCFSJ?@Juk*DaFwUKZW~<x%1CB^tBX}V zlz&4~$z58OuM0Lj`0FoR=ql=8wa3=F>wcxaO72w1Mf2zzThJ3p3OhA^4IaZv<TVwj z69j6^=!(d8uD*<5mcv76xv0cR0(xy+jMUEWQ`v5%s=Hj9;Ej_MEe#DMNLUz%P}=8` zLYTvNKx_t(6>y8S+*i)`iz?1ffYG$&B09OG_KL)Z`1adV-E8Vy->|l(I;vy)gOOzU z=;!`%*@oF;{?htziJ`IY=Irck?7s6oL@KqF8^e`b{&cmvy(XB^Lz8Xi))n78Sy4+q zqc5B>yLg;IUNb;EN+H@!)R86JI-N54lqW=Jb<FjU-&A+dcXA?0G$eT9!b1p*4lN6Z z5WsLBa}Et)6~XP(p!wJeEF|w_RFi-Ov&~sUv0jQF(9G~aYg$k`NTsF)x`RqJ30qXv z0bu=M$J8abp$<gHiw%zFe+!?~|DxFMK;z?~OY%>B;aq#4`)3#)y0XY=rGQ^4x8S4c z1s_FaWWh&Ks3#w7OUTV=k#hf~+kcr>H34rc`D{;OuHG)-oWn6l-b?&Rn)lAdj6-<7 z1<$v{h|oZx5=jv5oNEC0JM7S6%U8Ihv?ipBP2Wz2$RLE9TPgH#P3lkX<`=8QUz+Qy z-{&`RhHywIpQpcnDO-VEdtMb}zVE`lp%d1Em*ZsN{v9L@T`TPiBC&Q8vzXu}1|;f| z5MVd&4_kY%MKGreo8*_(@$WD&*Lbo00cJO}Td7CY;i?2g0W<^zLr0scwCuyL>sHNN z{Rqih`za)E#fS(ec1w3)LsWrN6BP!zDWlL$p0~RFDV|~JZPcr3#qQxPz&cFg;wXZM zP%hRAh6ApU&=|y5uwtdLt1i6oD;pZM@(iy0O6raBN(AO+UC<B$OJAC1BBAFFr~){D z5DEz5-Aki@m+n?y7#@kHe)RIsZR>B&e1GNaJ@)42Sl8(R|AUj_`0o7!>b;wK0;T}7 z-+_Y0u=?;hfx!pe|9Q*A{$nh~geV&v3Aq`jYsceJ@%Yd8_p1*WA37*rtKN<V^o#1= zg}YEUgSum@(5cmgmy_+NCR{?oDnf}4wj&-IYA&OrkheIOH{(pJOez)_(_Kd2(M%j@ zm55$tkT<$$NA+Mwp+*c80s#yVDI>t5O%#DuO1VmIDbIQ%rPxoT>fRMoU@=}2ZjWO> z8T9B#XXGwh-fi(MC!TxM$>z#Dy|lGJmQ27RU`d>|oXx`nW;Y6J2s6;TRtO=AdJ;qi z7HYz-dK+#G%^)s*;i5j2<^UinBJtyo4?H(rtkyUdsLceDJDDpM2<?k!_t*?EYw+u_ ziFJlu2e*CRm+~27kEfq|UX_nNyLK^Z0bDe5zq{eG&5UiaM<XsTgMx{4_w&29829ZE zgYW<SoA2Hv^`E!NU>H#l&b}z+QCmubGh#%*S|EjWClwtGQ|<&FCxPRKT8ijF=Gt1w zQlx7O3(#ykuP$X{bZiNV4&__LhO-Ib*LlSo7fV(G61Rk)O+xlq03cJNA6IJ{1s=j! 
z@dD53wiQ>Ij88byPnb*_YNM=HdNmZh3AAlb)LHK{8bdvyhRSmFm63<`XFr#Y_A~^V zl2=7B@{;m+c{xXleRiOv$O3N~yn6nKxM6zISgg?K-QM<a<Gk;oCk9$0=}>fftMq3J z1shRtt3e6;OyG$eC#(h*387hu;KmZT;i1IT4a7#SK0!29ayCZN>U=3Kk0~JT1X{z0 zrHJssp(0rf550f|YX=psSftUQ%9e-%wUfAR!RtVagjyMvBh8CmC8Ex-wC=_3o!k*l zCqt%yW!NA0x!ZqI7mUS>#>uG(fb$!}4xP!Y6CW~^o6Ux2UERHa?FS+oO{Ub0d;gmd zB3P_<sSh8G_&q&_t_?H8yG74XM>Wv-Whi?jjJooQfX7=RsnZn}R$k726b~ATuSw%^ z^_}@2Zq$QHs8AX~9H;q^RvTGK9NbRwqOdmY<!vIZHn}sF&uU(jC6YNI+Yc|gpgMwn z!iOL$`1MCY$Xt!bo25O<Y<bj=G;F*cfUahO2?<Iim8|>-R4zO;b-#^Uif?rC7-U5^ zfh_*vw{gsKDii9SOLR@`{M3B%efZ><FipBCALEsm#pJ0Zk;u~#c{z^AtI=0Gt<e1F zh{hT`i8F8pF~!IVTuokUps|itbu}sDV(LEEnxZkE->)z0F4pq@oqYN(p7;a(tTpd{ zSGWG=g-7Iqf9DYgdZEplMFpPx8>5RP(-LVTrdV{sh<bre$w?~DYSar3;FZDZF&$H` zrenbm9HJ5%Cf8)(zu;E{n$9>H^z>~B&OdyMJ7bAb20sTr(fO%+y5heX%8d4sKeYJ6 z5%oVTd7If|*lG!Tl4{*=<F<x19?OFUOS$t^cf8VLzORD0&8B~Kv*`T&m+Hbg?M&4V z!hf%T(#EQD{dX<aGj}=tapCz&lFyo+^TMgBtcALdvJ%4dSV--nr9~tzk>BEcrKRWd z%K%{)ED=Df0f#aE&foFGebe$&$yz1eV+r_T>bs^F3}s#>?l*`1(cjfmf9kgUt7J17 zMCNs~<u*t8`vwo_?;>8V+o2OUC#*p`lgEYALT-@aE#P*!<}CSdv~WJRw84M#mo*Lo zuaUp(L-NPVWAgZEnwiS^$Kl@ne#m2jFZR6Xcm$DGQ$p#$+d6oYqJ!pe695`@ppuyf z$rB;wSgcUcpvzM6k0ZPS5U|{))Le#OirF7(Y*`i^%<>dw#HQ#BNGm(2#1TVmi~}CC zPs%DZS}$_Iqc5=}$Ur@-$b@{()6}deR<n!buoOSzU!ra~;ptGE{`omPvD-g;@@pSX z68pppC+<A&ls}ge_T^n0*L*xp?157!sBXl0Vi7B^#9p~pcvZ;x$aYjjt48;SA`oMe z&rj$oxb8~JkdH*TQEsDYnro6|MLPDUaO|TH3_6!VOh!is$#6t|IX5y$wYAl_3xJ#; z3^tIO4!p$?qPGwwDgN}}&#N6Fyb(8rJQ-6A_2^Gsja%b7m~D_+N24I}z>SgXxeQX3 zbZg{WGM!!Aa3NdX=nG8E(w%ZOiqF@O=~_^PQCvuDl*AAoHFM+*nn|CK4Hh;c4&d|O zU9j@Sa#-=j52uOUbN2<jviGi2QAwusm8Y?IC_eiMxMF86PBuF$K6CzcLfjv(N^Z<i zHw$3sWMo%8E)kAA1$ku7j-8pelnBSDZyTm)Nbyk~!@GB-U{05<31WMP(Uxd=VT3pD z)KKv)uuX`7E)5wOD)*2Yb-M6f6;_{S+LS;e4CO%=!095BS!iaWHSP*P%S!_&(fVSU z5rq1Q(SK1$5<zM}eCA%s+Bw)385*=mzp53#7a8>2vO)ciP{^GOh9yf6I%hGA{2IbB z-5W{2!625$MUjj8dK%9d@QhxHsxL4Ic%tW@f%eA?W}cgcBTLqzl9E8cptHD3?RZT2 zU#^CaK-V}4^`I>am@{*wT!NJ^3lz~Bu?hh#Zx3)P8X7b}7O&8nG!k8A9fyVhQ(6W1 
z1NiU@YvLnN0E#7m2w*+Ru^vh4RxYjwDK@$M&{@G04`)e3%ucE+ul<?cmP|(Ky3wUH zyenGVUGCAzH@xwn*(Vy)7I*hXEQeXbazsh_)?lD+r26F6&&_`|?K9}bI-B}-&**d* zbw4+7p~HpRMjY6j8wrcI4*o$s#v8B#mDK1@<U(|G4<!kWpIbi2rNt+@)`hVBC$Rl5 z2&kcf+^e<z!D@$u02{j-bTPSJjsob-!l6FJQ2^(5B?x~@eJ4h!o&vOa#;TwN$@`B> zI*y={-h^knd*NFv*J2$@ZA#)=d@aWzjaDK;Od#}hh8k~U2xoy}1>9MwMyM_*H&C+) zWsfeKJd;sEbgZwy4RGPHPR+-UVIoZQia=kGbwU?#VqoZs7=<K1RdTffE8QX9DO6W* zIJ9onho9_cwru_OxbI&1u9JlxC1(xKZ@K@bk$_qKh5AxxzgL43oKqnA&Dgo;Kfwl; z_9M?&x;NaA+VPm``^xvvhZLyK`MQSJO(vg}`N*`?O?6??!aj^U!7oI>q<WpOi5L^d z$UjPwWcr+B36p=2l7?KJRN{E5%0~%mejqrO!qq;&rH|=CxG_oN>U2Y_!ws>S5ZX!; zumzxpYT|^`O<--$Eo&lgAsTAX&q4Nv6aQQTr?|$dG{EFp6wB6T8*o36U{cnq*zhEn zu5*lryZV^{KM0aC8>hLpks_fpp?n(eX(C|vXhxp0fg(u026S^B-8A8>WY@w4)@_9I z++h1&nWd`wQ#aDn416_8DHp2<PM}^>kwrq5QR0~WQR-WH5V+(*FJMj1BSv9h9KxGW zX9&VSiYMF>G-th5lF`5mwDT`^ok+ikuQ!>%C;DBZaeX3wgV89rWgaw{*2m}HeCGD9 zvGv<jDVPacrK96-O7?Inw3{7yAs^Wvx%w-THJpLkk~XBt)0K`ZI>a67f!&#s;vBrO zV>jEqJF}2&8`G-&8TEx!^Mp-|M@fsOLKC|Ole83$KK;vmcY$*X2aMhjl7i>oHKOpA znq|ArKFE6sNkJ}&5u_KyUpOXcoB@xRY^4ck*~pb!Ny^n+d<15iVw+Yh;1=JCbViE* zt6MYSa3;bG{0rr%dSR`a)tL+bo(nyl5HuR)5TBk1X#cR7kX<pwQU`P~=q>OIC?-H% z7?sFD7Q4ur;Kor7>R9|jb3v~gp`%|T>h#A5?MA;R$i|4>e(5i+_{mW!QOtvk0}2JG z3h^Vy%X?zHJr*E%lE6V-MTA4~!xcf9?S&6(D0~(7L^6@$|ML&=&;Nb?{XP88a~C4Q zmqYvWq_#BZ1FBi^5v+?&C_{&ISdhzNq$o(15Msn^3V7w57j$i0kc69sGRM<oj8g66 zS`&i82!wAgx7x5NaAl|bRnFq?Sok`|s!tVvRs7CRPfh*cTr25>IDVHW3O4d#&TENR zz!fT!L}!J*1w^Q56I*MrZLQWhueAcdT+wGRn^ecGMby4kb@aw6uv_C#Xtw6+I3ZG< zm7b`hGy%P)3rtnT0kwQj>j1b_TvSiRG)Gk(tY^cp(_E>!+{i!Z`GyMOg9oZG8>D4z zDKTF3NjT^HkJ!Zi0G<e*YnPho>9rt(1gk*Uu?DdRYK1Sd<7d}lZ?r5~gTe-3)U9Ud zyYj^^e8Gk&{wmDO|4CIXI=?8~rM@Uy`QKN8TI%9OOh$$-a&hUf##G?aj9MHW;@>U# z>x-_)_GupWj`l_S*EHY4e$xrZ)nWXb+WvM4pI!1fu*-<)K-*)S=>WR2=j<?k-}8TX zPG647c%Es$6D4Q-k#_9z-Be1uNi>w`=rk^;Ow3e)3w&`siU-BQFRrc6sP=pRLi@5! 
zb47`NDSQehoWz8ddUhP2rV(nmd^f=TXj9+|lO-sB<1EmbgmNi@f3XtAKglN4_ot`n z%%Ptj#?SA;G}ko!ysS$Wel{oYi$?(Q9Y5tP+??dGdJB219t{Wl2}CJmQs@Z_ft33# zm`PUjDETX;CjJ>|pDlGa{Og*7y-Y6E>}`qEh8@@O-(^yIN_b5_acMqPT)RoWO42(; zm%?ZJ{r-JA=zeF#dl&rW)+PQjSD`C`4`^Nr=P7s&BRS#^ahrv~H_dy|t;d=%jL0i4 z8oMxJ|B|~@0*0VixvEZa_Y}RP_*JxPyQ$zOg)iFuawlJoyNZi~eajwW$HlQrTjS!t zT-dklt<G?`ljDwASY5|<oZSuo3kXD&N|&YPc(U$v9$fqIn8FvhpFU{|u<+Q}0rd^o zXRmHX+$Wjvv!SBzx-{R#7xG<Mz+Yh!jJW6YC;-BJ)BcjZ5%yA8H<Qb8pLtMvo}JML zx&P;Yr75~Te|&NOe~?W`&lmmwt+Vg4wbD`id;_rJhzE=Ru8tF_(=2uU!aq-1IdAQ- zNdM_K*JJO9hor}JZvAh$mf|69HED4z&)y|HCWJ5b?<?NF8#Vu5`0LP{%Afr}^hzG= z?MB4Cxf=fIf^_-Ny>B%MX1YNWN)umOG?b99V)O=$lsa9d#aMzyORt!&W1@Fz*b|74 zT8!hv^#BM}rH#u>eq(i+-CF8yZ%uVw<*<5(bsGbhjjU~})$4ry)5(soX+9=dODv`; zyD#AE8F4#Ymq&sltFgwTXq?STTe0GoYnqjLPl@0llLxrTwNxDQOhqAS-hXtu>3nf| zao`9vD%DTwtdz837#6o7A1kSB;}inOX3~i|9j-Wx6*+*%FY&wlc*TA&xM{}Q81qOF zka9Wp-m;ECz4N+(`43DzVzlc$bSfl++2<B*HND}UNrPdmOZ|DWZ_{(ZH4*0F{?>z; z&U>X@==L2H5SydqN}`*jafi|zLo*8sr~uaFU0NcI%C^gzk7?h;Vsk-PGtP+`lVHcv zqdVS@!?{y?wKJCM=lVhYI7q1_sKh{hNOcG((*tG-xJ{hWY|yvhkmv_Qm8S8j6wW(= zAww1MK2sPD1!>NnfX5Mx0uv(!k~_$Tur!Tv1BdTouJ$x7RUZSK+^M_AX2%eH?N;Xg zlksHK&^rHbEum-ogFlp5g}ddl+Odw_@!=Yb?tVenFm=>#1N&QiY`^Z1_lnL5@%C-0 z<+snSO|KhEi6h?a$(~>L#m9}3MK6V%p66qk&wsHgJA3`~*mQ?FD;~coxU+A()Mx^= z5$DAV<03vP9YD0(v#2QtU%;g+J7d($#sR{|xp9sf`AQWgx}h;cC?seoxe?B>M4i6M zT9PdBr!mhBWqK4|v=8V^vNuP!tV-0Di!<Meoq0MgZV!hiX4StNy(;dmag0e}SN%xW zuBl1&%--jodu-D^N5hkSgOjDD6z5=u_dRhJe6>aRypSs)dvnorD3@@DjgZP(LBlI2 zaG^D>%k1XZe>Z6h3f8$Yp&8*HfQe=JV0E#4Gk>HkgN@yA-JH_`R3b4dD&}&^?49^* zbD~La;Q<_T*lCozgl5$Px)QQvkrab7W{?dz2RM6me}&QJUse^cTaJBT5bf;`92sA8 zPt@f!w1KkQCa#@TPYs8C@#^u7dwlBO;as&k%GZ_qzVqy%sdD#cdL~Lue!p}y_%>V} zpFMk4)lK2E*F$B96C}%)7$Ky%Mf@?jjDK<*j|IkwoQ{ht>QMQl3|CMr2G|HiqiUP% zAsf+FjBv%$lr>G@^<J?L61>#1w&=tY&fA>e5I?C(cQL!Wy3}N!WT%b<<H?{nHLU(* zL=FyQUyL&Ep*@j4ze6Vu?~V0`qf_*|iuhgM(Cv|S3wF#|LlLPOmcon&x0sBqCtCQe zF-~pfzjYLr05zaM(?Ad};&xK%up2A6kLBuk!K{ubOY5vu97NKTbMjc-nZW1c6V_&Y 
zycVKX$XVQcma5X8tFBp2wshK*df@)iv?A201rOz@O>V?0R61yr0?NSmB)a!ih9Ezq zLXdJr6$T3f3)9NfF)R(KB3amb#<5Y0$8d+EA};nt4_n>LvSa=)#;&q^YE7labdbG2 zo(u-ru>CI`Jy%G;GTQH{dd=vm`-`mlPuW~mRir1vDkAS5n(T`V)>nyQFxAl)_hoM8 zc21vtkNp^D8nBDSI}q_N;&;2Y7R*kI#AtZPZY>JdVd5DK6?CWL&_#QIA10_X>8dL6 z>DX4s$jsQfq{n00k&30O%&A(l_~>L$g?UYHa-h`M<ucl$Go$n2SjNDAhjD0cn57-K zslHrb#nQ(}UJ~adlW}Ju5<o&rR`XitOkO+oQ;A71;d`Xzz;e<f!7A&h;8vp@6S&rZ z!+?n(CV{P0K~R$#yQ~Mxl#Mzjb_7;Pc2UGcSv}_<IZw)=awT@q*T7J~oWL4rDsnGH z1&xre7*qtO4gA<5R<QP8_$OYl!c~pt)Wh3rEw0qCuiqbw*{r_JUBB*n;(;UZRIGNU z+|YA<EMy3*zj0E1Ze{9<kV6z(_~#xudq#H<e!E;aw&1lyUV-cbN?m-X#b`pSAV8}r z!Ss6ru!Yx}cW6rjRay)T7A;dsYk;@~i<Q$6k$Jz4237#JQfxHI70M`PjQh`(RMZn2 zo+1DhKyD1#4GUJIo~R};z7@790o*i&Z40>oI4pNh?Ml_2pd%JEteWvP?1?TC8xMs% z)?JumW1-3BW}m5RYVG=@r^Y_P-afQO+|Z>y-9J8KFoXuUQyv|=zfXO=e`>wK5MA$n zap2J3OspC0%1YAY!I7`gIh4U2aX0)kL2MI*d!j>#?g$*EO?;8$GODmq?u``#D*}^~ zglH-h_2(>c9sC7@Ar`#q<m4Lph~^}1SQRHwOc1q%^~-I!vTDA6J(z}UXrgI0L9t;) zz9%?6{ssb)D@_z_(7+#HIz{JjmT=!_3cxuk=>kHj6nosxp|KFV(fP4<?N1L6+nivd zGTpF~nehg<A$lu&X6@|M$gLysZK=_z;EIvRSLUCdV!9D?-*B|Qu9C4aQQRAen8af} zC)6`f{cOu;r}wy-)HM^BdGx>;bzhbpU9%0Ygmp@@ciD?rrz-4Y@G+9i9(N!o(z?^d zgH6W|HXTIth`AJ~c;utFM`30<k%}9cZWKDESlYbi-l^G$vr1}@Mz*ZXvUe@nKi^y; zjjiap%wPxxrRm$&J%fLHP(7jBD{aIXvQa?xU--9kMgkW08&S(-lv5i{Q=W_)V-EW3 z2!u58%bdUk7zNZoawYuYEGfgUhvNkpZ~S&p-(WW0g0T`Xs0@mV_;o9>1p}2hEGdei z$wJzmP}cRu{1*HC2g8>|oG$Qf{o=MRzxeu@E$LC8NxJ*8{iAyQFxxrhDAU%Ad}~<R zhWG<_oQHf1^Sk6*I94=w;=u=wAFUaZ6EwbRh_GsLND0yJS7l_I4InrCT3p??cxYfp z*ul#Q;TM{G6yG<vBBM$5!NYd9r?k{M$tsQ?@lGD!tL|xMCwFXQX9j21rUQG!xZWh) zh^ngZAA9&4J<r{N^>v+nOE;l^UuehVOs<k{H#k4ak=(#(B0(=!MXX@aE~izms>dOL z8c7X~h7!%&JFukXs74Ui69)QClFQ3Iysi?@p?!lRfTpu?MbXVe*Hkf(q&;3B;v$?G zI55E}Q@Fr6@nwnO{_Fr-vv(x9b$q2IIvjatEHb)7ec^$)YjBvE?Ue?Dl$i)fc4=qK zclf{VzjgPTEir4WDG-W;x4kpIcHQ*E!L9qFwL!Nl?rE}@iDKB|>XE);Ny1PfFb{(6 zRoKZWc!Vz$7=H2b$x`LG9g?Ia$Je4ZCYy?hrz~2j07z#-X)Ff<D6C2;3Tt?t63W6w zk@ske<Hh2L7>q_^OV-<RX05}A8^et=zzF~<7vlrY1jIgwFIi<f#I6>PKJ@q4uKPrj 
zm;<0}$Nb;f#7Fu@#u0qkFK2zPe8uS0jX!n_a=_--zi}ofdbT||HPAP#mqrf%>-@0h zi(_Yhr8|XKI4*3`IHF}&GGAJVG%?BGCL$O*^YMZ55$!Z{Xh**`xJ%=Uii5x%K0?1n zAoS%yL69Uy^ChK1MJqRUDQzN$n{OhpAOXLj#mxN3GU$*%G^DKOUg~%kF}byiQ#&7= z=<QO?qwN0FeS0V5fkUj?9QgOhP|sk%X0u-J4~Go(|5eI5{_4F$_dfi<a}Vr1^7M`y zo>jLS9qZUhW(@f)mMhblVS`~1Bgx@_Jz!2&-3b5AqK@#Ev=g!BB(JOB>gVP}n2A87 zntx_Qb!5S663}JVIqU)p0%!~=Q#+-ay0d!F0AXGmvKWF^P%%2Rvtq$v>8zk21!n~U z6^j&LQCHfr?^)J_EBz0X$&vbIzrUi!>vx*fb-KS9Ul*=PmY5C<UT!QIF-S&VHtqj| zb$>a(Co`K2R#51I-`x)fjU!GDP(DjSB2=B`=|N*muHeab@t~`q4<JB)Hdnz9x(b|( zbNv-4n&D4;E+t53atX!Fe571UUekWy8ga)iJ)il9FtGP}jPvhf-<c8auiNrRQIK^d zW1(<Yw=XD0H>z($W<$^QZL;`NZ^y?BhRL46Rg&HZI$F`>?TTc6iQiuoW^lILj@ZkJ zEazq|t1Y-4uqHsWQg+LYfo#WhnnDYFS7ec=3N`quEZ9y>+CHMRhG^o6i@;)=v$-3H zAp-SxB6D&f)4?D)pB!OOtAu!EQfofy8kXxQ-deJw=;#zD)W@e|y&JQs(G54mZ+<1V zcIR~OmcHItwaa=Xv!8r;V9y@i`0anS?qD`(aP@t8^Mm)D`L?ZVt@62F+`l2#H)7C_ z{Z#$e{5xAi+*aAMuj?L>_Miv&z9842I!aO@(Cx%0QY>iqadE`S)A?HdSwNBq{sam( zQKmu#6*VaeHt7-Q6IU&KW@_(Yawgx)FSlHrH;Ut+%_qe~;ygA5AAg=Wk1g`BRbKhH zGSu{IdBx*%*`XDnP{p5%6siKA9A3$}Qx_g__S8uUOKRezjOKC#D-^)y{Fk9eoRL=% z4WHaZvxiXn$wf1yxG4A9m2M9CF%yHTWKYr139cv-UBrkE_5s?usPdp`N4uLdJ-86b z7)p92WSE*N6{R8kfX5k7&eB262|>M?YG>uqp>T!GS*okNzS<M_xGF4eM_Gk~m6a~{ zbg!<eVJ34;ZEbZ=Q<}9M4OTi#so|TZr`4aTV_|k{<EOg2QFSmP)-gu1kwDsG(+x>X zwAN<Av7o3!t<oG!G>?|o4WtH#YJx_+Jw15m<c_9PdBe?v6D1|#kjZj==B2a3&F|F> zKigf^G%^|;8?=UdJg6+tzU)({bX%lDn5esj*R~BwkmBa63eMb0G?}2tBppl&m7A%o zEi^BOKSL4|t$OkS>Y({+>k+DGVx*jvteFt)xIyb}b5%|=Q9yu(49z7)XXd93^~rIF zk|U>4exV`@18Y7m2s@4(20r%?ah1|+HzgI=yrW*<fW_k<96rdpeRan6VD9bPOm;Kt zNO|4y;4$^xzPgU&0J}XL?oO}y`da_N<Y-x0D6{7UHL1R-zTK6z=<b!*j!vawp>lJ! 
z>00gc7h{_g*oks+I+qd)u}!G33P^}CAcsqmmY|dbs)SNfw<+Z!Da9z6vfWZRQO*}k zkY_=B${V6F8k|)LJZ<xGV1LL4!m(luwD9>znqzT23K8?gBctc}=Zu+0Ykh+jPyN8~ z&X0_gil29u6&A*{<6^N>R$Z^VOWG@RVd9h#2@3)QjWQDinRDjGdhG8?tr}Vx%hzj< zz<yIqq!6zXlV%t$(Z*j?daxcQ#7v6Kjnx2T15OLJOeFeVeAU?$fk0YqVZ0**C^fQV zp5j};Ilp*}Ybm~qu3)jsl?3l;F`MA2W)o_c2DaH-?G0BM9X;V~S6Lh`NL366l7pFz zl65Z=@2;R{OgE2DdHz4<-UhzS^1K&*o<|>+Wm(pj^<i0-Wm%SGS(YVPmSy?-E4Je} zj!8_Kh7dvs2@ohv(=2s!Qkt@i!<%K4mQu>g=~xb9jIrl=tUwPfG>q}Gj`4Q<I5lG& z*7fwTtsC1q#=EyIi8%l3ek5B;oWyqM`+NIq9ep{C?)&<_{@4FX#&#Zf|A|!YXPLxg zpjxe(SBGB@CbW*sNN6zNQmX?|wXS+F;m`g1a5ndsuA`ChdWRrvRcpI9?eV$!@jG*h z%tR1pok9-L&3#v?cM)=*qAYWirbl2V#kM&1@Mba7V5ZzWLu!=uDn7x;a8^dD6nJ4d zy|p-5n_@3*(QK{qC=b7iD@WObqckENm34HMY$(KlOk4_5Tn(cZH)plw6^*o=0-H%G z9Z(TBT9&W!Hn5cyIOE1n{32*tY_-9-V&asL6mbm-UrotW2egr;2~6Q2nO??_p`@)! zIM5WNP**tj)ySOx!brd1^X7iktDT+RiQLDqeWu6UH9v8C{}w#c9rOhwHfOjmG0T7D z7GJ9O%evPs?l+>-4<4O94kwE^A-2`^plerj?v>NV*~HM4Ml+68>W)}om*2+6UKnKh z3BQ}`RXvKBwhT;9uuz^tHp`1i7WHTqAEZ?_PN4V+L6oa>6hc%{Y_C&LS`o$V60egM zK@f^e)Cu;$(Xv5`1+#K0uu+u#v-ln$pH#wZ12CxC7DPB7=!0BhN7!=B)W_=QfT9`s zleTFK?-s@~e7&iCv`u624Muaj$9JH(W7^6e8`~RmtKCz*BL_c~>n&G(Z+7IPfA#SH zOB*wf@S7faRUaI^Ip&Y_2R%FbGZgP8hCVZy-u0!Ok#PSGMdp$p|2^dOFmgxaV6fX^ zN?rltV=+f@B`RNjB7NkdGRMkN9A&apiIR+kOqMEAC11r3#45segY~9+bB~PA#cFD7 zGraJVM^d#e{uggNw!J?+t}B~5{q}_u!H8P<H;rI0`waBpX4+fNVmc5Bk{%GH)?Oy_ z4eZ<(j7EwN+)6kO<{UB;Y2_J+oIt8gSuWjNezndt{=K*0t{UJ)<O-@l5uk8Er*$<# zcEylI3iRWpv{?@J!h6$poduO<Xgj?LrlA|428|RSvYZ|6L@4v2ZNp+GC3I5!bs0=9 zdG$o6HE&ny)NuK@@UkYqpz;CgjVo6J%A;5vyz8hdQ=i#$^pN4glRAx7AGB0;9BZ;# z^lkln_1<Bl?^x)bJCCsfhPo0_zcB5pbq3S3C;1H{Pbbxea*MWHzt`+^>vRxyy<e!P zsBl)=43%2dW0OBTmwWSJ|6ynNzL7oQqmKmk4qc}&4|ol>-wM^K=kI>-KR)13MfQgJ z>a2obahugTtEaIy)#<_%+y`&`>r~QEZLH7<DwTh0WUL;aF*!_S2CRV8SDGwt1mNWF zLy)7FAV(qYUPX@NN^*LTB)T}XV%2Q%3ll3)KUoV61=-4U%F1xNR#dM^vjDkRW`@dN zA*7CRL+Qx$R{;@&3puN)FgSb-%se?MFhEbjr_{R@IV7`UsYYKS!Qw8yG-*^MbJnu^ z-jBKPNOxuKeg83`T-%mtZ6!f<*|w0NcDb3LZhmHTSMDEuM_%{bIjh_H?LA``OD^qO 
z{ZN&I+}gpf0Nz=em`8TZXr7#l*^pwUoguOzPFesJGEu9FGzy9qQoi=(%D#%;F=hph zT$|d+T-%BT;1yp~Jy1<m;bcuszABt@UwEFqw5md}_OcaZfkO?OeeY_nm6!sEHy>6+ ztV!f6QKBDu8(Fy!CR*7RpZr2}+0jg=<3OVkfgqI->3iFw640xXoo>Dc7+_S5fq|7c z)S^J<w4if)NP7|ctXM@W&8G7u@{3Drf^f;{tXOTk@aC1yf&W1DA*%s~Q8OWyX-%R~ zBsJj(T3N#?F9$9r2QEbp*pB9W_oJvtNPH41+%^m|l52eLCLn<&9m)!q`LddoPC&JN zsg9UL9dUf*!pWr<u25Kvu2gJfy1q}4`J;&GDp4_@dLx`@IfWxe7Tp2w0u%x*YN7@y zRkGm65CWir#rr&&yT@8xj&{?e`ncDhdiu7$>c~4=(e0yL7=N$u2K#&6n6<^V10PBj z1b;bN<-CaU67&dve;NLMsu&S7vs6w58W;aRu?pJK|F3X#`TsZf_VVMBe?##<N8pA& zeqYF{=26WZpst9Gj2TAcG2P2{$+Ow0L6(w((tQm7zz&lbq~brKF@*Ej3ZD%sk2;iS z9?QWf%;)qJ=2QEeS|;7}8)gNwG=wk21;OiIX{%{yOwq74xbA2s-t36eGkiztiU1Kz zC807UoXV#<loK*qN6IGabdF9oJpwO-`D(Z+OMfiX>oM%y(9yB;JNMrA^aGRi^@Clr zJ!o1x%5?ThY_2oJ)O&I-?cd(JZTnrip+Vt0shf@Na9`3ka)+<J{_|6Jee1rxbN6~| zwvOGMW1n?+>!Y^jTCc^SuU6ZH-rQHm=eH(j_7B8%DR<c~yo-JAE{H~y`w~Z@p-W^n zY&ZbhAE~_zA`GGowg>1y>|Ux;BP}#9f+#g#=8#;q=;5QlQHDg`#*#&t1C|D=0GFg) zRAu16qjFm_)S`k}266*!)C0_N73N^8a&F`YFk<Csph%M)SwD}s6lX<+R^<#6adi7= zdKHoXefR9znVQ<}k7$$Qv7w!B9=S8oqpNW{>Q0ZnU2Cjx2rX*0uRpeR_}&;w-ZKO9 z$-1ZNTp+M9dOY^@4gdJf!+M<y;k>{N<j+CJb<tL?4xnU+|E?T>cT-#pM`>Dsj-iI8 zy}26nJb4-sZ(l69298h+inL>-nS_e+0FX~(qlDyWCj~&}i?*7qH7SW8MWLM3ft2xC zmCj(}+LJg0BuZimB4o`kbSMiVC+<V1f%2WIC22uCLQE^PCIU2y0*$NycAo0|#g}dj z$8aKz5lv$v*$^HNzj>s;J>H|!yDSdx_?~dfhM}O*qqhsqst<57jqM}98S~crs{BLY z`J{K>h}UH?naizJwT|G1W?TDBqu=c9Dx8r4XB5@^jPdtl0yjuVBJ@v4*Nk*tFa+== zD1_0jQdEIZ;RE19#ujV%iZXNq*jxqrbV??KhLr4K$&g1$hS+oktSbo*%~N<0i_7Is z)^lt`{>U1U#3~5}5F8~kX)B!+0@+|g6v;}wTI*q>j!ag9tbc4}h-@|*zu#gD>v&<l z+oYb^u=B15Zr+j7`8?`dM#Crg$ylEz+1o5=1A3=1Id>%Se;zy}$Miv+cZ!gJ<9%=4 z_6279VmO2SI0HQ@&|l(Y6`cX-WtoQ185pBTz-TEh;v*t=#{s~>Wb5+S7gVT$(LmBd z{t`(GH5@rR<V7LaWX0)rM6R;Y#){Gbw8+t5L{<mlA&k)=Z_dV`FI1(5ok>#FfK;ZF zZ3&^+7R_?%tRZ6o5dcm^8AL;#%+?d|PmlDcW{>%Ht6h6TyAB?^du!U_2<wFTv`Ott z>eL52^%}h@`{vA%J6_q<AML*P_7mCUuH}Qtjb$3uUZc@%a824sZ&JdK1rO|p15$AS z6PGj^P==Br&H&jkdXUHPMQd29f-~aLks(L~>hp8}D9j@(A;U_~)Zi4vaX69aVh+|) 
zMr{{r8439M0!l~V*rBI3`r=r4!M-t|SoD?cjsT9qVYhZ5q%Y%#4vu;Zp%Y`bXEyP_ zeeCE?A$MBvMV~nwp4O>?Grr_d$H+J5pZT3*YMpUMVvG+w!td(Eck$!9=<?s?7DUIm zPhyWCM2DjNh>xk^g|EcE2yCD_PIY(s*JEQV){$o`*fZseknIzQ;CJcka1w_X7X)<$ zF@r*O8*nY8$t)L#eG6)+cG#q%D^2B;uQ7Ui&!$uKE%>|XTYL`RLj4APi{!||^er^H zDtwFFJCEGhOEz7_dHJ6RGa9TH0Rn3?_b=!N)5Mor(aW*8tY&N&YCxz&^^%YxvsV13 zI24u4_}2C~VaPY;1$QPY1{V<^78W_zlSfP{MlNW|Rw<9L+nJI%1%pHxLK@5_d8ok_ z+-((c`1r^XOki^rk$hq;s@>C)s-U~pnI$cR9RLJb#ZiZulZMEzdh2VLX*6S>0sT&e zMXbWQ9W#Lz)Zo?#fEJXvYCw1b(SV$T3DT~xW|LGvR!A%I9;73MmK^<$UL(;#1DB}~ z)pE;lAHarL6`zwr(H%7oZF#MwD$O6C8w*<|gn+isa5SC#^ZT|Q-8aOKj`8=*jkU#w z0wGWCuJAX0!gr^}{U*I%>;B>4MC`E5<2R{acUpK)|IpkXr@N(}S8D<G*nRWBH-=18 zGm)W*t+#qp-cW7M7aG{NH{kQ~0+zae_Mz_aM|qtK<w=3-<lo>AYYgzwt<>49Bs)`` zAmj%bT0&DR6SXMZA^|6p+tf%Vw+X!cETG4B1mMG!79O=ZA+Ib_cVkKh<iARhvWZFq zEjU37)q!v>D)76Ye~D#&7>qKi;C1V?33y%1kas5kHRkun{D%y6^<I;y_k}(!U*`9F z^lH9RaH<>C6{=&fDEu4vj6WtE_XaFlq27&-N@Jy<wKjVH)>c<trPJ_05$au4sTAav z=d{Algu}3{lVnL8WLbfEB+J@B-*^n(Hjy1?{zhsYK*WIx2nuiXeRT|7tz$TBuo9pX z-Kz*BcuJ=rcfDjj^pL0VF+AvCoV-=t6rNi#s#9z%QHbsnAV8SLvnObFrwkwt$OlDW zcH&tN(%2p<0%!<OaoUbo$n*B`hphInPDS2jGLg`i>8kB|Ug&K*p*Cz!Oslo`1cyz$ zdMY-p_Uyz~OM@qcbDP}NXR!>0RVR44$j1IfT}OvrZ#AI#C*VIJlnPQ?)s=ef_fuLy z7tsj3N>guaIT2Cc)$37vJ$UZ%s=e$C-NO5-AH#;taZhq`AD!E_1WoNqE+0;;nzYuz z&WyO8<M@|S;EprSYTLWUH#a&EzF*ILl+wY%bQJ}5v><6zInhy*l)<xnL2Ww@mts6g z?}7@>nju}jscBN?Q42VUFn4#h1$;CU8|kG89X%&iBlJ7#0gZMmk>B77w1tzT(K(zX zMKcgQxGx;JO@W877>K5D>BN^m$7s_C{>5Hj{`H3Q7T%G>2@mb+#R<2NZ@W9GSL=EG z^lo?9?<liIH~P-_`z@B<t^Iu_Q^Ik`8unN7o)}&^&R&S@Q6G2MU+;ITy|seisZ)D1 z?_~TMuhHbHbWNCycN<*Iwf1oH3vbaau#X6y{+I7}HK4ovRt7h+^x9NE>2VbPr5|@w zfy%=aCu*bl$bn|1pxn$qqQZdT4OAD%r+}@CFiUxSHC!b&lHjbh=xm{P3qL89gK3-G z;c|G4vqMb4#?biGiF+S7dgSzdeASMj^yKc5*rdkw)cnCG&)j}g&Sj7OmplIMo|~a} zDa15d{JZKf^Cw$K?@-0YbTCRe1lMQ(E-#e@z|WpHA7Yu(&P4l%j0p{GmI0EFrZU=@ zs5(L#@K3Y=n4eW??2cOHq8UH}MAix=A{so%c>wZLU^M4<9!Mn11Vbp0y3rS%3iSHi z!MUTyPMi|nolRx<sK@U;atNyAsrCAfK6>OREg2IyE&unzH2&>DcssCqM@iz%35vHU 
zEUk-@9jqe~+o*{K4#-&4(r1tkD}?kwIEsx?3f0*61SYDo09KV;&CnT+@kP>f7=2U1 zLUDw10GuUaUQlc6sHTCjbBvY(H6}*QXI(XJUo)YiAlk)psjTZ##JbuBIaj_ndHtjv z{ys@wr|d_O48-<(%q}cNWP0zu@Qz#VyUDl1=J{M?YJc!<y>L(3N4hVN>Sy>LGs!f% zA(=))u=jHZj{K?9ynp9i$DhUMFYeeYyvv`~{18#vc3_C<1b_!YEQ2pitoyns(J`^8 z28|!!k0^OT1i&-=9Lr>)GH;{P<T0;B@eSCR0Fn^3Ku(we7$j55ayKoY5oY`^)q(GX zA#bdwt_CLX?eL_pe(a8Yvs$$_6`nq$R|{_S8@V?Qx%}hD_#^6=c5KS-epweVGCSsh zOdin~k=K7siLx=HoYg9p%M}-6z>p^x7iwc1!(&vv!PPRV#>8S)Sv)N#*bq)agWJar zY73In);QfGzK%r1jm88tsH_}@%VL(Dt{6oTJ}jTERUD$J312TtA@@RSq%DC96}3QC zv8oOgtB6I37Ox@cpf652JP|__ULuRcqV8Y^3LK#OxZ4}mYIhA8bn1Pn$+{>!=B~j? zow~jgVRwp|CfCly!@T;ov9KC-X$<D>55IEX+`kq*eRnDPPMVCq%nLXw+^t^tnd*lc z2e*}Ws<o2-YK<LZn;L<PX;a*#ZXAM)Wv4D5=oPvV++7E@BvY|F;KgkO%g=TxSKnol zQ0K_+vJ|(NBmt*~EqY4XQ<}r+4RyhcH^md(jPzOD!T|UJKr}iu-083^ggVD6JYcEF z-20np1tN?Ugo4?3@W>&FH!=o|!dWFd7<`nS7ab_VkL6DR1ffE+Sb4n;N1gxWxbQEQ zys=+Wo4Y%Z$iUX$>8ii6{S5CPfLY&`H}nRl!&P0$^R;oj_@!PLe|uz)>O|hw7u@MZ z#{NLluwv+y_nD<HuW}F!o!R>3mtpLAj&fy^-xy*x*@HVlMJY%)ih4zcRdBRO29XTw zl{y(<NmLyxqUI9HR6xZ6@c}w7%$yW)c6re(2a8V=|2PE7<4$X}-jK!ml%~8wtI=sS z8r9?V4FSAm4eiouqelDn7L#?i!>ax<6D#`Rtq<nz!}Oy{tt;2^U!1H_B#zz+$Ns97 z5B-lAn_VjM4cHS=L9(TIDz>Xfc2m0N0^TFfCIK%|%hZp&7jzB<p}hBkkx{$^b#xT} zKu9_QA=%;6M952p89f}M;wqR&1CE%07c8^G-Sa70vPU^CKeTgyA_?LL_=;VMuc%?Z z;<q)f*{%#+Mct9l!d29ImR$wzB!&1=KzteJZsVw=6IHGP7DT}Qmju--;54wnfXZHq z<CE<K*UzuQfN$U!Qc1BP#FXM=2qhot;0pu3AdTJl$VPH%*hshv+^7|{5CWgUtoKq9 zLY~h9^AE#O<~Jj@#G`y<Nn_y;&S^dU$G7*lhdh>m#gW)?zrOZp67bf$+P|!>aC+^k zIX<QJxbFXYgU4_21XFL@v;h;J`$1Bzu4qsp98@*nX0>*F=Lo*DR^tKg>Uv-|m+ueT zEW~Q|Ey+(-Fu-O7V1GQEA$H-Arw5jtEG5uZl1E|t08t?UBXL(?_rMBb7)*RYn0f50 zpimiKi9OVoG5jNp9%zv9iv9ElrglKzcz5pIhf`69(P8%(atA$nKK!j|kIQD^4JM5N zX^&Pg1aq^#gkQI7bCur2fB#mY{H}~(COe0n+_@3Wz2G|5b2h{(+lU2@Hb{#738@`a z1Wi3-#7@vj>=nQ$z*$i30Wl0BTtqY3+3E-=Y=}Z@H@S*AHs+8Ia}nd}Agd<jD_@VV ztV1|+w$f;II6KIgnUuYr9Mxtj3{>TpI&l_~gDo!5uo5NP&cSB7SpiD5t_9r!mf9wF zB$Ge6<9F`3ZE}3mwJzv>yf+#d7>L9&^8E*H-7vOa&0YJB?%a^-jOs-G9%lGVeqYGu 
z1SM>=qZ_N^+?0ZH?s4cuA2F;%aYIqUd~wk<9P$bm#J2Kglob?kSDhx&iFAdQ_hG=F z)*TY@W<DOKhy-_rVHE6T#R9oOYydS4)&YZ5s67B)tz0H@7eberZvT@;LYJw<YHaIv zoRh-lUy7k~DB3O{ufw+PkZGYM6b94`KbN6qGTvm{x~uIiVP#vl@_^yZoEfMCL_viq z7uNDZ4TFTnt#B9D0c2WFfpD?dM9~{UNh7Fu!xbgxT6T5wuSGjLnf#nSc>BcokJhlO z`^-h6qWV@iCZ<09DVy*1`lRo+L;D^LhSfIUOT(TDgUb;a-+DUtMDB-~zGS^quMLNJ z=qUEcsCQ%MFQ!tsS};rhF()UWqY~C^Bc7D&p`1jz4Mx>$xix`)1}m{Ev=Iyr3Kr?~ zqGZEu>*8{nMVx7X1hEW`%7xT%23HJCykiX}C>*2^C_)-S;BSx&pcg?>iyb`@u}Rx# z|EmW<9U~=@tLoXzg29ZfMRX<BPFx+JfB+`Z3t|-^X_m*$RpvxJ)lre5xOmi%s72#f zp^$|2uE<xZ8c)s{A1Grjv%)T3JO5h$+`zybh->}dKCpdb)l+Xh^6IPaMMOXUozKjk zemZvNn_t~`#;4G^FuE7byY%_~+`r{+{HIg*SiBcL$ocN$zk1I@=P>x2du8TpA$2Hw z|M>Ak!+ENf<^3?I^_kGme?s#Ra!M!H$34NxKC*}{2`b^qU0TE)Dk}ad2x+mC@_(R` z$mVD8I&<Ju5McXY`=espqEv-r?7>zMr5&c#vsHX3RinBo)q$A{<Cs!a8_b};1+Xn@ z?@KO}QV<)X0p|9>SYjH)hr<QN#$lHGTvW)GIsj~yIV~aM5aqC%t<up^X&q)I`=PT` zazU(yqIK6m)-<3|YF6q{g=#kc;#eS<84LybcKG?}(NkZ$_r|^3_Z>g=H@Wwo`i&kl zFb~ywy~m`}MqDX}{@DX=o%*#v??{t>U?ex}P35@VKYi}O*Myxvk>AVxByQv_I(vDA zD|nmgABG0Kg4)q!Y^k2TH+Qm@`5aGfPQ6<b!TH+JMI9x1PLVuQ5M6`)ge1>aJjQCA zayD!LA<kAhQjL)8O3J8Ew(4DE*2{qPnof!iddUC|#i?Y>L7s8MUdS_!n0lqyXtvN4 z@j#wUxCm2}SaYNec*EHXdG>({9r-`b(%Q&Sh9+fu949$w*fqj9?r5EyXr=ngMMu4E znSOaxIr83~Rf~1*B^-Lt-#?`Jf7c<{ID5!;ZmvkSA74SXDgMlh@AC0Won5B-bBs|K za?Xp_D%P^u;Uht}5n^RU*(Tnv_{E34X1U&1kZWO;a=moK>dotr>kmuh8VIh-$u$bh zh^1g5)*#ncm1J2(qD*mBw!AUaLXgDE%Wx5EGrwLL7S88(y4+U&_uQV`(bY@whjrlJ zu!S_=T9*)$41ZW8!-!ZfvCqXaymZu8li`bv@q!B0nBr74o62E@yDw*kmqfMrOxG%^ z!i%{ZTy6`0#$A{D+G>S0f0C+yU!$xF;5OQ){;Bp9cE5~qf5^#3*k7vGA<$&8h3q6F z44z!Tma{`j4yTW1JCz!yCd!K&*cuq@=%g8hQJCDcQeOr35QVHlq6218%p9041C<(m zrP1jtYe{r;_OK-sX){fGdRaTq0}@dTGNU)u1GVpCE2AsZghG^jXwqhDY;}<+jvuzy z?$Bsa3F}Cq&lgO!sWH(F2d73=TN@Ox?~lRW$oWw>lV)WL6+!S3*wM90cB)3U2Xk+y zJznoq$!5dEfBMlw2jXwP_^StbSG8Xs^z6?)x$w2_fjxiyjnjX7hu75i;N#m{UA6tQ z$99d+R8M1YnAe^P4IXiM9kyTYG8iOb>Pt_(cr!ow>NB4f-r8r=^V^@@@PFU)Sngk5 zIB>+f<?yiXH?(8vp*g?L1_cgoKz$WwuZn8+07Ep!{Wm3+qqE04RoQ}jP<HPSQD7sv 
zHm;FyQp?wX&Z%+@><+3Im0(^W`E%ns+TsgoH%$(pxj-d@(P;i^Bz74<*$kkB>};%9 zz6xy96;S_{+z9U$)S5a%jx65=iqB@{E*fzca&;>OeYgpnte2;qJSZRJZv%6w*h8Zf z&I@%0+yx3wM%4;e(IHG4O#N}~KLs6ni=#0atlaq2_j%#*2c8^g9Mh?-gWda1eEYMf z4*2x`$>~k4bPd-XyK#K84cE}E*`<!!|2D{*^tQmS-Y^*Cs(*XohrIrdxp4brBoY0S zC&af-s$J=sh+Q~bdIRJKX*c5ti>V!$ZTwI9jZhU3c&tQrlkivLF;b?YFgnFzIURUC z+Zb@Otpyj8AXLIdF|-K^kY5MY1qIL{vXh*IizX`1g^FsoGMH42djL&AQGiS>24X~{ z5>0iGB4f^M6Bumg(xCW578F0d_5Q=zN4NHV<)0sWdb{su<Jj=@=&>KZb7cO#>PrLS zj|{0;^v*XPI`-ZVKEL+~b?!j!C)wObCo`8u!E{D5*xR9Iad0iSk^8?CK3nQz3}8BC z(B*}UVVMA`-SRMrZRf?2aR)`l82zBgI8Bjp6D6${ir}WAV#A_jhqz)aIv=pndTIo0 z$VR2s4ah$;2+}s;358~acQT+2;Au*9Opr~^SfoxY%Ht&0Al@BdJApm@8`yi8PqH_# zWockiZAnXUB~ETv#I`6-<{9TC8q_F<k(hQ;p_HT<q74#+c&a!|F3Xb@VmbmPRA7V9 z<EH3hC=}@BU#PVA-#R~%Ik{uzZlB|!+a7=Z)IG9yGAUU2vWT<GQ9&^?-?dAxe|pMc za_}16as+*T2Y+ug=$hpBZT(*Ml|7!l^Z)kpJE!I?poq~HhtKDC&-Ci_fpS|fu>S2D z4G`my<J_EPgvHWs%uYa#+0G%#2C<Kut*gSG6|u7@l4he5@kPl1IgT?Lia5<4rQ%rZ zTeymR><o`s*d;{c5(bRiNVT+AnXY7^%H<`VK3oA&qEMn&l7p;CmftS~!?&zToR^|B z!Ln+BhCCzhSeEBkRwd5`ku3^^J33h?tO)knMft4DiwZw0fEh5()WU!kYY=BKZV}sQ z6a__+(j&iCah@M*4E8W_9$&jSFJH;Nb&E5T=MRhI`R>ch^HNxMMR{I9f)`XmZ3~$Y z8*P@qM1+fXo32HKReHq2e2o<W9Pi29zB)0^$8Y?ztCnJBmvbK$$#HgVa=dhI(v{@+ z5+iJHfR1AV$^ew_vJIBY$q28?3ST6)MWHgmMHozIeO6d_Vkrr!Scll=uRzeRS8B26 zIk%6=YzAHuv2+uhAx1u=ei+`WxLilU!~`Cr^&49PgrS5U&Nq1A3oVT6S++2CXA7<V z#e!#mT(fOT4A-V$z<kPHH*$(SXz)DC=~eDfjN!Pt%~BXGR<X?@P5=RBmg|}@&4Zgs z#c**6Dk;gTHcAhriGIA|gYD2z1l{0oL9lUA!42W_a_AAvSCHTvmt*J>)!>gJberX| z#l}=t6m(>Y!jaFMj^~4D7QLnU&LRzY#b9^^du;S2dzzPm6X9e@_{QH=q%5du2(atm z>qeY1hJMWdL`R0=kO~Slu#~wdaK@46^A~`+Kx~V^MRK{BEl0JEl{K~lcE2cXL0E$J zY-v(M<sx${!jgJ~C7;5R(Jf}nb1Ih3sJD_mNt?434yTLKZgB(Xj@wxC8?m(;7qc8r z(?uxZbQ8*@XEkbM>Z}p&M^C&h#{Q~J*cMl|1($^e9^vvF3>yK00g4vcwGjo(D#H{) zTOpvyN9Fk}sN)i?<K<+~F!AsoUOjO6!dx4R#92br9PNI+SnY6^lK$t<uPT61-C{Xf zdbcl^Y9X$d`2<D3DuJ}<;r4TK2pY!llreM*NxeY9FLt5(+Zz=-79}Ghn+}FkP<CF$ zscmY9HZ>~x(}&wLb|8Y$=~=2nfO)M!@P$}3;|BDK5oQ5nqeTai{zv;sN+kX&vAHlZ 
z^`OmtWapthZp)(^0^cr@^~=ZT+vm@Cv{Sndp$+-;A>FTvr2LAJI>j}L{8yrbw~6~B zr5-|UBZ||9-9E(W14^7eN^v@?VAIMr$p#GDMTNcTsMx5KR#8a>xq{R-0h#ZG%x}gM zCy~ADK(Go#(sOiV&B;y%L+iCjU85xB?qQtXP+A-?FL(zPh?))LWQ}Bu#AdcXuY(GE zVy9WmkT91V^0h`V?-U*Tcq6YV7R=Q}<)xRvPnXK$8bWhU_3;AwHX-d3sB=#NA%wlt z=;pAZiqc{Q*oIoH)H9IA@t~U?&{}dcRbZ#;B4*UPr|`UvtsAc+x1e}$wzTY1Q8B@` zOo$to*2G8g5sW#bhXc=(XQ84DivYw!3JL|AL%>~G;LO;8OZ5V>v!DuZCd(s@U=)D1 z<_(#t4AW>4n{lbM=n%sy$Y9gr6rshXal$7K!S_oUpoytn$3}p+Qbbzh+If^=I^zyP z8)yl5KxFO-U)<VBnguF?Qn#b=5zdDSIkd0=?O!t(Aj;S{21XG;OoTOYCjNB~CZvz0 z!5gW^-m_S_+BCcM`3c_f@<h<*N}Sn~8-L;!o$k=Uz#(uvfO{ty)tTyi{@~ngf8}q= zJ^OtpudUEJTmQXXlT~)dU5Lg$t=G?Y22ShssrEn)<V^;>?eVNy5H9jbI_=m`P1^ar zi6AOCFMq>oEvqc6{%n%^=-3&o8q)ev({X`IE<<w+WTmocrLNOX>Q;}1h&9PjJFc~c zVgrU%WOoT}0Bn0n!td0?#iY`-h~rrhs)rj&MsV6@6HecQQlvW6&IZuq7TIA3?i&}B zMBzpbM(7DG#9(!EHX~JrAr3;v;Zg{l?TdqDgAc@mtP3Zgm_|+=881gYI4p-_EB*L| z&uZ1qO8<A&@h$_~&d1l9aqhe~*UcaD@_WbJ9aDWmg<p5ye}C=H-Msqi{EnszKOWri zn{x-X-hZgHRoM8z4{R-bPEgj(<viNj!+VU8J9Di<oqXc1b8qvy+@pT=(+3}n!M+=y zCr@dgWO_2k{rnTw6F_gTT~7#;OL`)!1R6Aw?0}<^nv9sG>$c+1hgj9J(Yz*gk+xtV z0#g{}ZqU3#H}u*R1~YA_$N`Nnyt%wJ@Y(gMi}2=1-A`K`PTp~i%JNEM_W?Wawpy2U z#j&I-U;cIJ%2~CpVo6yR;A#TQ!>^<*2#~0lsg!@#q%DF+UzuoDf6r?vH(Gg9{?u2g zEx=sZe&9FiEe*MUUWKj*oB_T*ggKN-?6`bS!RpY4DLD#bLoMn!9<x=tlX%cW9wm(j zY&XwzebfrmGmYo<)6_Gq2kMj;u_7$VR)k?)h@50lt7Hs*1Y5$e)?PHTX&eGA2{x2N z5fzNVeT*p11lvQvc9|bwR@Q<gp(HvzV<iQki8E-)COIvV8k1(y2#pG$SjXyepnN0( zOeZ6B5vK{bCQZ_u^9UjdsTU?~g$UtV6-3xt6e4U0=+>(lrE!8Pu^tyfu`TrihEz09 z?8LMvcBy;FYk)SEx&e*(AT4VQd8qk~jY@1#)VD|w4Q!LA$$47eLTwm#oK1&@;H#Ao z&1^o4<M%WYY#JO*SbXimU2swe-b8l$=n29&*3mAF!c<pv5Pv`QwdL@(@h`6C6iN?E z`05qB0p)Xs+!-e09&Q&O;pAS5ni7!Oev<L-1l!`hi9~w`UYw!_NUvo-4vT!My^IT> zlB|jYVpxUH7YM{AJh#NNyW0ExIK^G~aQiM&fOZt{dGW@mIJzjsaD+F+<<S_uF**nZ zh?$6VfvE4m)I{K@W|OF)l6C>###SyUZxDJnyH)wvR+H3;XA0_9+KuapjpD>DHdc}j zn-{n$dl}t{$|5xote_c;ry+>^);Wm$3=56BaZikc5}bf`f`Uu!<b9e`3?8zJ?WW!V z?UA-_!!K@{z%Op3Xqwyy`E&qWn+W+1s31z&Do|}-qhk?7f5_%@gzhuAz1D{E{%A00 
z3Yw#bE#81Rq)tq(SC4)k9uSPLX-$<i>R=?PHT^(m)q91Y>dkd26zS8GOrJblGj;$! zqgcpnucA*Kq)%w9U0F=((vc2`NgXNM09#O<hHTWuMYJ}3xJ_DD9gZ)kIw`TzusfFq z@p>AS7RdtQKk#roOLrXUpwV@ylZL=Tmt1HX*b_M%!^IPmGdR_+bbVKi*@S7jA3NPo zoz<_S`!NSi?0!rkbw}>+uco8#U*;m(uB5Daf26$|_Ul%@6O%5aqSqxP@;=q{cuPWZ z;;{-f85#}=!m8<2q+)v7GsVSaKj|;!K@If4iwE7L$Tvi@H!E#t^gFLtz@{A=s=IM9 z9r(<K4*EMbKuy^qEVr^D7rnS~le7=Yw?nC$?O>35lm@c>if;Csi2Xd<qC9J18+Egr zl{-9uI~+$ZKdZ57Z@h^Vbf5WI5I=3*cKr?XYzHQ?RM56o)X!vIp~Ay}Bqp*pS;Q^4 z-<xm1Mc>G#b<8_%!o5lxCh^h!X|`n%1^Zs8F(5@!3t~vgPxxOd!zmlJg9++5VVyP= z<@{I8jOe46&8))wr6s!#)iK?f+Yfz9mo^;RI&-2T_h%^g%)WJnT<Q5M==7tBk_wRa z?)_gnc6)kcxce7>j#}Oo6Q|24_I)LVAXfKbzEKn94B+S==RV@(7D!aqr|2^4lw)}n zK#K{bV?F}KPN3PXh)pFUUnh)mpQptMvFSthr79oc09OWqtDjIz6U1LDd7#gI@D`qG z=(9nQ3<IXelt)-tMao@+4>iRX+G_^!p&muL(35Xxj}|oSWra6On#9L@FlIkcsS?V~ zPLr=C97!Z;_q9|5s54_INW+?oNk;r^YEO}<cqqrNZe%%jpINK{KQ_?Pl9o~XVa+)1 zTbV}>G6_M&{GzOln(Qm#9Xgh6$9twP{@$^wl;!c=2Oq5FgAjnpCvPsv-|pz~>za3c zVYIZ>o_lZEdT64v`}WK^k3aOlSANxqu6yO-l60<Ds7p?^t4cfWs<ONhp?V%{2Drz+ z$9nH0+*=C2h~&WKRmK$~K-Pej^<aYK_6Fvu!@xYJVHmQBa$6d=N%25=g5D}0fFUXB z(vK3eprprr0l2OwX2AhHm<8n;VjmfxRg|Ls;)XqL7!6RO(Ij`?#~gp2w1~o8Y&*ks zS(N_0REb1f3Wf1X07_r(zyp*@5@xX{Ej1K2f)rNQSTT!$sU;%6x~Ry`svQJd@l#!- z0XtqUQfdA%*5IFeF~7m^2Srtliq4iY-fnj{y3)t@|K=qB;<2=biWMk`=+x%fJpT3k zFLHZJ_ZKehEYxW~zo4I)OONGdOB9msCxZJ4Y2QLV-bXtPh|@@-E;5=K&9LLl0Cx~s z5!Zo&lM;Agt2S;65ai`XDMJa?E{ctD1|#l_%Z6II0z-x-I}F%;AlI_~b1jj|EjOR@ zH~{dPO0B3n5;oRAq<knlQQtXRWoeF6`;<m|ikkVVOU^kzB%%;F6hJ8$pfF>64r3xg zKmRz|&aC6v(|HG$0uF}%NUz`C_S^dCfp<>N_YS4qNsp~7`uN|NqTkmZ&I4Y5^>oqd zKW+EC5HPqyvp@ci)>z65{(x}q!OvBO!063*bTs?5J8_N~njB8jIeH+s2J&=(mt_4- zNdsgV`I{KQTp+^)7V1G=OI27Ra6oh_#~80i44nG1mqn@G01na|BOHb@Lnd0lrJy)7 zDjRu;Kfc^ht!iQ|BFv68gYAI^{W-}G14V5iWMW-uu^R1){PqFlzI0$jYpyMe6Lniz z8_o<k6dExFA_A7DXlx4S=+OEAQ|YiYs(!Fkp?m8TuZdXPW6>YH_Ga!MCQrV7;-&rL zyLGykdNcESeJtYpaqc_0H*!b*lJC>}hm!cpH@x%(!(-3q#N2=EU;NIYoB3GUr_(1Z zAk&?CbvW^N-kKNrFYr}w^V3g~9Z`e(-KBb4eLv3E0S4hcye;2*b0%nMrjIl^E>Gjs 
z<q0l-^f#zrLNKE=j5s8E1Bq~yEX#JXESUXA6DXUqH{Adqky6}9(WTrR6}K%)W*~^C zC<_1u11&VHF^L&^I-~f6nP@hnJW`e;NFDfm2MZkYfGlY*t|kM6BCDpaqu>t|47B(G zWZ7puslLGp#<<sk=5#d9)-d}R26+?>z8ffgBR0i_8s8|*14=2NU!=?oej8Z=epXo) z;vLs*qi`i^7I#o|QFNawD*J&W2^UH8_jVcpW0QZ3p@CQ)C}+rP3i|-Hm>A20TB(e{ z=_m_)@N!|;R#?9F>3zG;7y>iBPv`FGe|mCv_E;p-|KIet1|q_apvB~}bQre3cKYs# zz;v>Aw(ITL9na)GI(6Uoet%aiblV%Jj?V?=BcWte9mrQg?8pD}U#GWjkLDhlI=u7t z_-$jsq~FhLc~z>><umP=KmAlVF_P(4>xU<I-uC-j?wo({mgrrn!wvhsbl)@SSSsxE zwlRN0Sg&Ugqn3fW<0VeUI!_GJC8TDkVnuZw%{Rp<j=*+a8mLJt+j)Cy0&$1|G6zgH z75f*kc7zutLfZAj<uKc19LAKI*d7<L%$P@XxjamP@p8x@((#|kItEkM(LUA5xa?#A zGR7!A9G8g3!wrWb`7!x~odXN?zD`1(N=~E>k_F`u;_)zT!1XVK#n^m)iC^)NtGV*R zT?`Riq`=j}!_3VH?qWA{VfgZ^b%htGwElWeetqscY}DeTU8RnO|3>aW=_^aFhZXHJ z^SSTv<E6`X$@kQv7Mx<3vl;Hc=Qoq|C8)tCn=_+yS1lLA%$iC4Fsw{4vj{FyPLMWl zV#x-Zf2v=UDxrmK7?oiQ3aIWy3kphcXM~RI!W!JJadJ{!y)X;oUG#8#1cQf|c#<mX zaT|8Z*g5lp)fSGCnrSWa(C8-8-*Gb;)GjJ;andFv5;AtVP-xer;LpI`NBUUovS3vC z;u}*H4Q2q8rn2YtZ-p3wbQNq}e_sH&KTzG8F6Z@6*LD3td3hqzRL*8K2zSQoKyCTZ ztA~@R=ngo!P~d5uZfn>X7{7nszcj{i;fb7h{e)dpj)sbwe&~ZEs7n>&&vFU7{fV*h zKhOQ}Xu6yQ8~CB?KFX&K#wL$8kxeEn#GK|C;L$dK=^VO7)hy!tLhD0TRf3cW_eZ+c zn~;KdDb_uLm&7fM6Njcek`_-3v@V&SC)^qp{fm+ny4HbAk!=w1gDs7ypunAo8_P~B zgC5f+37|`{GcIw^HI8lnpoLf%36V`~p40@cdodFyVsr^XnMHCFEzY=k!Rcxv<|oNC zjTiN((a`jb4*g2#THunHVtqPReCg^B)~9F}o+x1CepRYi>ru5|eKkKx!gMaTbxG&` z;_^CIq*lv8p48mJ5EP33{o2$`3Ij|*x@NYbz(V&XCx)2Tt)y?1T+zy_;$v4`yRJP` zfO~FVm#*br4D`|T&ai6Xlh8E<!%Ti{1N_?c+~2HH+e&qem}#aH+3S?O;*-Eo00VOc zor9~&ZeuVCahtM<V3YD})2cO)z%xr4C~ez<Ak_{Xj5$dQsg${(uk2=(GY$)dKWSAg zBxZ#zrt4M5+}6x=CNrHIN~4K4`w6Hd|DEZc9{MZkCv2%|$roa^DR?M8<m6hirb7u< zC9CcSFe5IvkvB|{!`s3fUbdoYOIsz`T$;AGBXkksQfnL2QOp=HloMDffbpO}7Joq_ zTNuq|3=pxFlbeINg?F67!yfp<9)>>XL8(oiV)(Qv%pU`v#%kdbj}P-vNrNFql)xo3 z0YqpM9nFA70XGvXQPps;_?9=zV-pP2G-Z*6GIEWy??$B^!6Q!Ef_j+Tk<8$U)I!Mk zW@zxmw{`7WEWDhYPNk-Ea&kJEoW5^enyfy!sjF*KXD9oavtGM4^YEMdncuvTa3U;I z)Vjbc(zWb2hsbY&cnYvWd17`a`Avmm2X`t)866^3yEa*pL-3#oe-O9w;_j%3@o(~= 
zQ*k-NJZL6``x}TaM64NPj^JIW$6o+<N)B}T5(m1Y#DQKHmu|u}M>0Sky6YNSY7?xk zWkT`lIlbNK8XKQr$p^X6>*-b*hYw1mQS$-_d>A&ajz~(c>|0CAmne*Y!-RnqO93ND zU>J`YU%f(HyP{&h*u-FnU(64Yq=yN(82O%Ff2cq2mH!N&kwg8ZG4AEHy8unB_Jo7u zSRC^5B``)psH_<5@?0%?xfJW}<o;RVddJ)Y?i~E6awtK8k?bVx_K|jHn0AY^(d>35 zF$J1&ij~DByOU`*7&Do6-#{es(g3u(HZB_nD1tSDtA63r_#;fUZo2LP)}x|1Sh=Z_ zKGH;KaAy-e?4&)DyKu{&Ho`P>>WRszS&DQA*!IHB7Rd#Y3gEZS`r0!iV;kr#Y=CLQ zWpAH_+HO^_rkm$Tkuwzf*3nj|v}A*jporEUb1Yxwngd=?Zx%x4QXyfZT|dV_*KF;> zYY2c9{r>|U)w7F6yci;NU14zhvNgD*q|M@EMb9_DqOaCt=i_6PtV?}2;}^Puuigq@ zJ+72A(G~$}I6JL8TYa=lHhtMwZyKlJF?}Tz9satzcMEy!WhX6pHo|$2Z<;0~*_9Ri zqFU$`6&b`l%f;^%wfgl;G3t0|B^AC<Q>cC4bnrpj<rR9>`uowXv$yWK-gW$;ubwE5 zp$*&??#G<$B2Nttm0IYsMd45hlP*&SGCQs4vNGF_TSYOMX>c&F!Lv~j9eyKfpB#8N zWFLdF`e;9Z(X`y4AW-WBt|tv0PEU~Hru(476KQ%lF$#^>q$i;9`L4uu_^5$25bBDq ze;9P3;7BgFRmVGq=^E42nizx<yU=!6;51qTy*iD;s~n3{*I5bzGK-Qo+LXl?0*11J zLMoTKW|ii5ndZ;1wgg>GbX^|#I5j0)$G`r-3E_V(E8w~lS%Guq&Iq%}V?F31e~puy zai3Bcby^=VGH}F%vtf)5SHK&v?lH(8K=2^9u?@b3fx$WxBfT%`9oJz*aKUY8O5xDi z9-lG_jx1+={>F5s2QOX(2e?wNm<aN&Tm|Bv_lK&LX0g|TW^p?;^G(suS?U|9nXkTl z?dLM>dj20vQ__oTJyaV|eS98gnq+4h0w3w$f)Sif6^K_<zgfd43k^UOKpujYT(L(l z4Gb}jQN|B>5&Wdxi&6_Tp@pq7Z=t?KcMEAkcPI3w#k5e@(uj2_n98K~0(Mve3r>&F z0#?B==C6Wc-WGId&l=o8U!15kMUZ5dAwUIKgT|FLW+~E6aZM>WqmMwa<IF{FU|_&8 zQWNUutPI8|ssS-N!;&%{73c`F2{sf}HUF5ux%8@<iu?M1Tqys1H?~KwNB!xKbh<t9 z=$t|K;to*5s`$%XnC83rZpQh!qHcz+Yo0$n7t#40t_GvVnof>+sLqalov^jci&=3q z(5)Hn5rvzGDmdV;lVhY0ZFt;;1x-=Wl%Q2-)i_%NvPmFYvG2s8<cJ`1Ck}G-Qsoxb zm}_am<wWBsqDvLHg(?M~PrVBVfM5%a!8WR~Ni~=|UMSNx(3pf&h1}pd4it^`$rKey zL4FD_rle7`>e>?i(a!jGwGD+lkIJQ2(TdlHy$aazH(4H*8|<dkjvu_dI+S_yxJJvP zMakPo(^u05^~b-!nOZdeip<Uh9<dQHW3vszQVi!9jLSXH0I9o&33->oD2Y8Xz69}Y z($Uu9Xwm32ElSl0?W;|UL$eX`-@#alY3^a6rm+KzCtkeY9A{n3YNXTIwpdr1m<*&E zm~+WMMKFjs*MmR3mb2<wO+yQf=W|jQ;4FmX?>56Nh&@!+l&t6|$z=u|+ySa9oeqYX z#DaKq5js<4bmHh5H8cg59$ihkZ!52iHEGV)Sv2~xvWkz=(`o#d-+>>g*_7`xGOc-E z&4+A%+Q64pXpL3X^MKHVg5BN0{GaDfUQMJa4nt4C$oLO-($-VjaY0xWw4tTSX>z)h 
zZ7EGnxw)f{vMpjgKyaOJE@Mo*sH5zSvxu4c#FE8N2Et*(@m<fBw6|{GsT|l6cLDV( zuPlsaAVK<gTCfC8Mx;6DAOp>X?Zpn8gRfJ8I$5z9=-CAMrl4x9z-jQ(<)!t;VSd(Y z^7|U~e8o#m!b^JnZ*+!Ra&M0FHT+AqP}Jda>h&)fJzI1t-C*v;%w0=F2OoK^EYZ1# zb>sYgU)Y9KrFmF>mDd%{y*BB#c`v+=ZVh%QQEOUlNN?l6K8ev~<qp&)<Uf_%4%(cw zjqU)o{_VJf4BY|CenO-PP0)(=cs|n+g#?lOvSjDeEZNzZPj)tv!8cxy^kpYZ$VQ|u zk`pa9BqrN%86(u~ls)5<EEx&Gq-ZKE42Wlk0x0GK?XG1`15BeZwr;Y;7)F3KPZ5vx za{H5Ltjp*=FsFd)AmaHZ&{*S2h{n43f`wO?hz_jI+*k6o`TcI)(8S~peWflC(&-+K zxxb;;pB~TZ_$uBQnCiz+!j4(c_f3A?=X3W4hjDG9KKis?zav^qesKPGO6&9&I?Fp# zcN^S|Zp<%G`!udIPCc)y)LV0Z6!n&~YtV;p`Lk9>xga0Sl(833%GY-1>-R=BPri#g zmEUIsJQv#kQq>IZgy33YS}Ewabb>Lwq~Q-JWi6X5Cy$<Q?$g+cuXwl|ZkM4Vsi`m7 zkQz$V)_)E|<iSu2%O??p(iqC8Xw{|Mwquu~Jw;CQ6W1RhS#YG!e1iJ3JQKa=*P=g| zXD9usz;w|WPQH%xrzauJ;jYqCxGOtl96{0_!aQPq;u4#dWV18Z0*+CpMwlfsEJ_nl zBk*q-STbSQhI_;Afqat;i~tf13r!F!pB+^^bV)=fmE;G?8w0_3Vr*&}ROisOx$6|a z(P)<4W8<W27I2OzJFJVZ@RL(6{;@)5<X^5?v5Kzj6VocCFZQ9PpjUUUPOp|GwHZYD zGI~|2Nh>Q(VfF{U`RY;7S{=`^{Mb7<pf3F4lhmEUV7B&?R~=^OK3vir%%5Cccb0Br zmAbQ}E#MolQ=#O_no+nTn$P;=G(-4F5h=li7p`49mM-O!Qx3|t=hS-Ua}wM)!K6-E zVF-m2BBc^z3hg4J<N=Zx8?u(o7~>|lEz<ad+{9=tn`p(kjg>KNVBRI(VbD{RHu@vT zG@=ShiWtkA5U5E8Pu>kl*zTH^q8O17W4JOW!UU-;dP(E{B7AN>LId{{I`vq&%o0C9 z{`fZvlp1HA*jrL$)*b3kL>_wR>wbT|&G$lT$?Wj^F9FZy@e=1xMpLJbYfbVQ(9Lpz z+?>F{_SXUVbW*7|QLzbQ4zeC*Uu=%j(kLX;0G}{Z54t&K>ak0mV}z82P4L+ME_2F$ zcq%bfAyuvlAXWA%2{M+s1=u43PD@op{GxX>G((`pfEnA9utte0N0mqjR;0^W!ab^r zjQ`6;q&OXbg<*#U!kXP3?!s~Xs#Lzt&y*N}+|W|@n)+l+ezb0e1M4TgPA5rF5)xEH z5(KOPq3(i|rsf?XSek+M!s!%)?tt|NQUqg{a=e}b$#?`OQy-^?^|Wn15EtVLSuR$w zQ|$n&B+`&2;JttvWhmh|Aw--s%VB|^1ieVO4F@G<XqKrLN7abwD5)DQt|D9(G`{#~ zRI3y%!I<k*#$kEy=EAXEMal|8e7~&BpWo`FjcPX^Lf-V+NhWWdv=xFXoU4&IY<|qf z$~XpV1cBB%WshjIJ|WrTi%jiR<&cjR&GW5ND9bh~bHyyjae=F)VB5e{cvaiB<YrcD z+g7k<vOxuNM!Pg5Z#n*qUTVvVK!HeMO4)XmZ1WOZCVW6VD@0J}$%U?B%?i9WhSeA| zc22h3hfDTs?+VgY&z2^FFmsLeY{_a#dIx2Kt29;+k|h-E<%EPcE)maMMKTU6gqIWt zy3q3@d8p))v=fnB9MzR&6Im)W|G;}Id&pAJgm@#WJe^U|z9==rDmB~56gB%{mF$GR 
z9<W<rmE3X6rO|HOY)r93DX2;_(x(NTQST))jn=Ey;1lR!j4;2PR6|(9b|7j{qyzwc zndBm{T3OhG0l?Zkzr}cC?jNo!52f>eGpiE{Hd`c6ep82i54$dviHjuxTXwHxUD$Rm zvLRW$!y5?WI94qUgwkI=q|3>}okf$S<BeAqiML9)J|Gvqk3ZC&X7#8o+@ElA9f`w8 zf*3eu$KXcl1y|uEIZWwyoMBf*kh)PxYBnmNw?lO7{m=dK1EOO$ij5{Qa89yS{8F@@ zd;VPAd6)8Xg-O((lhT#H6noAsbf<f);e`zSl<|0(o*h%S+<;zRVYM{|(%qRJVrW<X z!>{}@(YA9^Gnz}X!Gc(cyqi+$!W=rU0%RAvzj(h&TRI4lvUaj!iiW4~i4UM&V&omx zI!*Dogt|%fY`(O`*9l+m(BwJ{W`{1fT|KRf-Kf*ek0s{;XY47T)i)p1>-i9W#-P?# z@-jFa_=;QF%6e6rs;XcpsMnv|?mnm2H#am_=!NCVmRYs^qd#@<c6IRlYrD59)e#g! z%4YVcrtWUjQaS6RpSwM3BZXnw&g;)V9&3-8P1pcV<t%(sP#H~;cG7LkO=$ZWujifI zzj5*nq}#y(DD`e&{TthQH_Txdm~5kj+;cnDmM9QaDS0c3D4R^#1FX)=i+4rEn-`^7 zB;&@oJbN=~`s{Wn{mrxV@a7w#^s^>5Wz>y=Z}%j<+TD*2OtRj`<QP8CZDPZ~K77E( zW(9nxw97Fz-5o<*BYg%|V)ka3fluEK4>4(8sIZNEicCbeB`Y|b9P53=%+IP_p0FlK zy!u$<!=@NF?MAW2XaUnveHCk09iV9?DhHR;&ah*Qz^k}&6wQcz2$6B|#qz}N4hZXt zZ(#i!3U1U`ymWGgS|gZe?KPOe9QuCHu17)b`$m=U*f*y?Q>8H{zHn#i*j#u!f=a)F zbnWeZ@QXrd1~0hig}KnlJ@fwR*6#B1r@hH1b-Eq#cJxcWy~k8;DlV(#4!LY9;fZa7 zHf!gY`^m?0FQBZ9`(`sX+Mc8Q{yp1%%azW3mtIaNvVd_Ae(&@A(LgLG&W$^?3{6X= zHGEuYtk{C40LhA?@5Jf9nR{nNWEbs_mqVoQ?Fm)@?I<B51q_{E%tOryKwfG98cd@9 z9Cn5MQZyGEfO3<lJ13=J0J_gDbfvms0MhhR#^WLb07p5X*8#fG;K#smcR~Hz+F$@y zdYSEQAZ}BTrDH&WfLwYLQ2;ChxUSg@2)|HZ>h@2{1mtI=KAQQ23_wBuBgjMU;0>Jo zY102dAM}40^uM7sGYVWpym)tjY$03VJ^)*=kzoR)#4WG|8%^0;^S0n%RJ>_Xnt?4a z#O0ZrNY!WNpvgDQ(8HT<fU3_hAT&YOMQywYy)Jf{vREfVi3%5$IR<6t0A<JP6fQ_! 
z*ak1#<>qa~-xy`nMKlc~-HCT+Zh~$2)B#rKTd)j`UJnZ(bpii@8Omxqvj>PP9PSX~ zat4cY5I4VZ7fge}1PY(JN;YzIE7b^L8j4as{?}|6gzpt9xw$7kA*-+ilFy}oeI|iy zLT*lbn$<zBBb!hU`4PtwjNZdFyoq7}GuZ^1aS%J$21;tVE){io5pa2|=~Q+f5YWBU zCNyD7`#H%Dn^0YB6Dm!j;hdC#P3Swf(39zdP3WbcG9DM%1f;J6dP8Nky(yUK>Fq1A z2`jzKYyzY1r8-Ir5ck->jF9tUak$p-KzOB4{1txdlQIoU)fi#!6EY4g$BE#6`&IWb z`%rsX`*0PTP<-EOvJOS2p=huL)?lHoON)BXlC2;CBCBzYb^>NX#q5OEOqipn<*d^k zN@Ns!VgCP*y?DJa1DgB#CuA>{#78~)Yqb})kKflkz~-no5PpDWZRB=xIv9=?L)eT$ zhQh>_6H1j1IF_O*>G7!8xtMKaDWlxUKm(mUa1M<ZkFbIFGQZ_mc5fi3<HiIv1D-W@ z!r`|m^J?K%x>sof=G7LeOx{6?7u{ybf;t2zCEDRj+yol!#@J-NZ*UZE7lj7YWnFG_ zv%&*d$@nG~0t$sbDn5Z}MYFQDPs7)#ku(T*v7mr|r81W6N4)>K-ScOdlJ1X20vD1= zAH51VFTIL45B2MJ^dUv47G5+WS23uKd|^+}{P@POfyMe1<r@Z1oVWDHR3}IHy%$c= zTT4As!_W!UTx6DYxz8W5wAre4eTr^lZY{%X_ayhH%H9vr#g1OK*SC}HF4`F+8{zjc zZ<uXXkbqcp(`?EP=Y2lFOxSJ{fj^(X9Y*NP*>;=A#f5A?{InHb-0G*#SkY)Lnvr7o z{YWoPK1uU~U>!BIC#Wi<wPZc@%`pbW_M1f`TWdp0SegNRS}<59i9@~AXR8p%V+wSI z2ws^&WA7BVyD7FWY#Xk&FAu5R`OxzhkD&#}jve!dW^q4H*QNhJr`z2gy*?j}^F@yL ziLuE)TY2y-7|n?oNGqgc%NQmKwU4;qS8bR{dw|`|M$!}77E%H|8zeokC5RzHGP_GV z+92ocUd$5!$=$5%(oKW|sJWDO6*fXoyox$`84kfqJ6!>~W#>wamlzssjJ8u^C2d5> z9210oQUuX!mh4!D1FBv^<amJ|5&O*9y2ilhr8mJ}gRU&^!l_)Rx-1zs%{|wqFo>1W z^T{w@I7xGYv?u2xeYt{;6#KwmleVC<0ZmzGZZB~2%0>^?&i#L&EZ-@_ul;LNmZi&4 zkFH-?3boiJefidE^rfh1eFc5Fx`u2tKUWT|P$^mhJla(hqo_!0UB6<ezP(h~erdgm zkuP{_0@tn=A1>*|Q)||XqKmkKUaZb<pl_XI_6fYyMvtNfR*PJ&b3+X-@fJl*6)FT> zgSQZVT-3A9{ni?lAb%06pIx&K>|;7m2}WgfyZq!cIw`gSog%H)7$%=#Pe=ps8CFjm zg?cb2_bld_rQ{xlJ?<HTr^u5`Df?|@-zM_aKJ*hAp>j(PVyF%<NSUA0SK9+ZEKc)O zD78sWh9XQO#*%!%NMZUWTi+DiKy1qwSM9IGDg3;sXuTo8e|wz@P+H+vAGkI(pjs;W zB;+3|!JYg#C#T6jxD!$j>fz1?gj1MGqZn#Kf;K0CRZJX?<D|2r`50<bRP0}rqKKhD zj@-|x9Q~l3!Nz*Fd$$@<Q+0rDq8eUW4zN|d0bmEDX*ldC2)@7#o?@d-0rPWet-<8w zI#@-c+9C-osHK}*FxsY8x{zy(mxS?>OOe;60;N}>id>5#FkWNLR_*6FAJ{4TQLB78 zKb_cui_igRK&z`U!m?oDbeJ{4ZXh38bSmHi<!VBojnfatN7?{I*{CBZ0xpdgQNHY2 z%mx))c+ey{5I$A2@QKjO%aD9pNlGowb_E(vOc^RHk3Fk%HMAtzkZ^=JGKhgeY&T;r 
zfNWC(@ZwTChE^mF6G^gB;b(A}B@Nz2Xv^>!v{k81jl=`%NSAmL5yUnag=!PFSXs;_ zjyIKQ)qb$sn2>v5BZ3O>a-){5A5$eg4VGG~X;XK(&1^S1%3RUbI*Z<CY~EYtuB-B^ z?2#jr_aB+$y{~59%l#w&&hfe2|MUB~4|1pekGTsEVL|)nkAPdZtg@oKT&GnzotpFQ zqbk#Z&ujH&L$%iCaJ4q;Z10v=SNK%b_T2dWAAU3UUhWxSf(?9xpSy78V=g;?lHZSY zPgUnW|CQW>!Z#b!2UJ>(N)Y&3tA%j;Tfs@4)nb1v`e$iOiT|#`pvL%{>{weZsuBc) z09)q8RGIP+ZC7ug>SP4=ZWwJQB#d>?D4@xqL)2p=&QTRnZ{%kuXcj37`YVM3Q3dKe zdqbq7o4F7T!5@IWmHM&Mjgul2=+@CftdcQ`!p-1-FX6A`^V<%v5gXkkfp0)}0Br?Y zA!{Wf8b;pRg#8r4T^A2bu-O(Os$@inwReBv)dydjx%cpQUjGUI;7`51ho*Mi|M0ZB zQUf6Rvvp>x%A(%CG8f{7p4aX@{lE_nEq?hB&+CQJ+`~gBPi+0#Gw=M9@CvVL0U}*x zv-l7~ftC(@;7o65c00y2=VMML`hD7R%8<=j6Vn{A3TId{t}UWX7or!f3BvF|q2vIm z5u-fC+!0fr@x^S4sl-BgY8)_s8rn{oq##r!pr}#6Bw0(83S$-70p)iz1HJg2qS<ZI zs}u_UlD-OxRB4tfjJO?Z0Gj3u#k)s0QP$Fn3cn<<$t1I+0<kYA#gQ{g13-C7Cg69o z8ctVPM-HsQoaNQ!2HM#thG==ka=Kp_$Ox=k3)PMkjc}j8hy5W$uT>E>@6unF|L@$j zsUh7x(>u*udA%FtCOEKa5O?8fG;c}i2BYG5g7}AM-eUv-#$-VQ7cx<7<(B#tT^z>M zZNNh;pWk3w=BCAyf|^Q*yrm2=(}7JEv{06`*pD)ey@!=)hRs=nyP<=z1gkLlS2jA1 zTl=Jy^x{GO^{Hv@aQCL}?oE7sHw}oaLr={#(bLt%enL+R{_Sqgh|SuqT#EZFCpXYd zVGx8d)v(3d2&>ET5T>rE7+4fL<5`>Hy0AaH272XD+!NqnK(ynau{5~GeAb|DXlbXR zB~GexKv6sdcITOMLMcWpQ&?YtH6dCpP_EGiZ0!{t5E`QRh##!}ovQzw`vO*m8~Kwu zgQn)gEBl&{KJ$I2QSWU1AgcKfXQeGtSyfiu_KQ_`l}8@7S}IJm<e=bBY@7{So1@so z0!&7P&I%KJa)5jVh#N(Fls1nrxEHpuV2+jc7yvhiZAL^|Ez$CoILdDX9`6SGLjVMq zWFw?H7FM;Va85oT=-><hyHFN_V}}9tkn4CHm2y%9lfAr0;Gey-oL6hhHDy-Yf~vw6 zb*jsCeA_Z$?!rwgdI7EJrw$v)D%B?2E4Ff@g%@%*g^TRM@ynV1B^mlmUWU3>mZ1*x zYnR0+O=7gN4ADW=0F!!FZD?re=p+k@^{5fxDoUgXE)<->Btua;t2M|F3u3VI2@8Z& zHx$dzt5=qxdl0XfYRqMbRn{g$6q)GsY8I6&$0b3|1f2_WT_i_rqYrU<64`;&3ONFm zA+|CxIl^Ii7Ns`GQJX?r5P%%fF|`qiA#7TwS=RdCumDK~`-qE9sudhSOy6)eh4=u2 zY68CWDuqahstsBtCQ$3NY6~B_y8QT+Fo0K==`@!5+}l?b9@rq@S5Z4wX3=b5x<$6R zv^BaNlx37?BJ%+-OmxPxEtRQyP(~vBX5gPbrB;li(Z}UTFJ&|l&^Hvy#zb_679lFf zAoq~y*!uA&s|tIFoCTYrg|hk<#Y9!YFo~_GaG7O?3ziDgHx>oiu{^hCwkt6gfE!CZ za!B{<1PrS}q=J`}h=rH7H|731jb*z7fg=xp^U>q-GYZj0<VTKs&+Q7v_P=`XzJciz 
zdj0D=lp+2lXin~{xd#11zbk+F<vdNCarEx}o#Xf2v)PtAQ#7}Ptzi^%sW@HEr+S;= zZyLB6?i(Ciioocn`-%T?2xry>3RB7(Hzl(5W!w-vYrRRFpePm%cq{~C1Xf^!Ph$Aq z<VO~h9Hlc&cH>M(lk|{L!{=ADgmIum43`{*u=P?3;xV7qRG_+^VMgB>#fhTwJ7Lax z{DCIIoIxCMUduCPD(s0?2I!T|uT5ZJ#HuL<ol#=lV3QUko0no|3iWm29w1?2*r5v~ zjN?l_UT1N*w)tm|^tw;o8lImps82lf$X|Zr=-Iu~CPV6O3<;0Zknrvf8WINRMQ1Jn zz4U})x%-=fu4;Sr$mizAo_ur4rPpX5`PGv@fAA3R#ZKOL=exCqT08A)yN%5YE3lR& zSteYI1(U*mkk1xDV?*?Nv<i=^yk<PLgI2#HSqE7L4`Xp>3KPR%tijQWI`AuI=*+_y zXO%NydYGYcG9cf=7x$@XSCt?4LBqpg(9=k6xLCGu_iTl^u9*!Ib5cb!Zlr}cH^EAT z3Jc~l$rCL{r>ZqeIRE2;x&HpSM^`D{g_hGBYZ2~(Y!kMm9-BYDllQ!KSj=9MZIq#K zDkVK8+FUAbOl0d6Ij=K`8%WM4@^X%~AH$1M0&;HjLCzDybh?Qy9P}`wRUQVSOiWzd z<-kS|=EZ05XJQzqoY{yo1lvy|{1dg7>~&3@w4I2P>@-7xMWq|eSvP15Xt^4MzX%C( z?V``WSg4h6Ux(Z;cb9}~5qh%U@2h3jhc$7-#Aprn-!P*?wxNNC$9Yb21-XH>Upm}Y zAf6qr=#4=BhfC#ughfqe#iW}VV$Y10tL$S$T|N)4r713gtE-f1v<f$ZP>Z$}N?=0* zOAm-D75UfM9FYMQ>sOhHbQ}w0`*A^OP*h6xTG9ctRE_9M9HuvD<sd$!v&fB2ang_o zHS3o(#L-YqRMkn-%0`KC1H;6@U{NrO+1P}g_UZ&xpb_R@WBpqHQ+u9}b7Pnm)0I-U zgkN+VeRA-id?T-l&h;BHmGjtV-f`_dl2@9iPvpWUW!`&US27MVzeG1-4D{}M<dp2q z%}&p_?Z#*My&<6OII1UN-u_+I1MKJiYw4_gabuRi7nq^UdX>D}Yk^j9Uf7(4E(`H2 zcjvVSvm#hDh=F^;Wi_&81dSnFQqlN5N8|SeFb)XDC5}Qi_*ZO+>S4_#Rk?`(+MMLY zsJ(2cq6kLn?!u=`7pV%y=&1me-{D9@$wdCSQbTsB?aB`oa4`|BK6G?2Gmr@wJw}Jx z=y8q46i3htN3eI<5iBc2TCX#X?d|sm2da#2wQ=ucf9|LAsiht(!5oX>jQJTHS(Ac& zvZ2xh9q^J4py-F7MIIy_KqiX;B*qpYwk9M4-U=0=m$DnG24m2M4OHMju_>TX8vp{k zVxd-SiBq9)c{-I=hojy>+c_9Ip&DtJ<Y~i+5(w$6!Rd~Tuv(Or5hZt8#9^F#FR+SI z4H|o5gIRK+3yaO2cteDu;IY9XH@9?>4#bEtr__8|uSE|&C|pivt*Wrp=F{$2pEj`G ztMD%Cz4p_*T^rq0o_At(@?YG4UD5=sDu5qZ)&)RG2vbC>#0h`MASN6fIUAH}gf1+o z%3Bp`n-%5XTI4F$A^(pSPWa8$ivQBdsz=u-{>TAo22IY-pw)x>_8;<F?1&eoF)nW+ z4S?$<4PdaGF18m%OhSqavuN4t+?#r#0oUPIltv!QegNu=Df_uFo4q2Ko8lpC>oTAY zfyp#Fa7?BVYy1#&f%tBHurFxmg&6M#O{Sf7HU@iVXet%x1GMyj5Mi`6o1qdDG#t`I zydZ66DF76&c%4PuOmH4?ls0+E9i6D6r^Q~gm?D)Z?)(*B*)^z#N?kN{mHQusn_RQ5 zTr%&rUY$`9UO{+Q{Q&Hn16!VdkCVGeYkF~23B<MJHBhX>MrW$lhKcNn3j<ZVFj5zW 
z6BuJ&WBI0wuNMvTfkmkuahSyk{b^@}eeFqj`F4<fDc5O;0=z@<nE?<jhAC+uz>QW_ zcPXX1Fkn8i(_KUUV{8QeqgJ_?qFM=gWL7z9T8AzPyO&enTub3PR;^g-F!JCATl=pb z%8wncS;1(=^F#Hrf5C*zD*ek+M|^$$WkrvI3^}WE$I`4-+i8)?hpFPQR##FeW#Lx% zYtfbICG<k>^))+_0&XF<eqZw8k}ts=@;ZFU(hV`bw5vo_OU^@T1!K|WBBvsg1g4O+ zMoxmR2!Qa1tM(pC$b&UH4}nXePj`yt52LibnQ_g*AIrq}N;{g1v*R(=*mmfCwh(pn z!?Z+{W=|xVA88lCu~nKFbie^1F<z+jQ#@M1J}=QW^L|Ru!b*;iwKd1O+00LeS*in6 zQ>v)MyE+_kj8bOM1=`GKD=Phwwtn&&*mtUfjVV<1!3c;6xi$@o7g#T$cGRgT_D9s; z<j3-Chs8&YI%>9}{zl~@p5M!oX8tRtDs5Fqh0`DDnQ8yj$=05mhW1qU+;Cm%{_S@> zaMO>vz506p03V#c*B`kqb&KUwzj6J0zq|dG!hV~t&K{ho;G4Sp+nTJueSQ13beFqH zQ<K^Dxv`t%)Nu5kXU!G4ckh|+x*?r(26e%0zq8}9S8R1GAK3Hp`@-ui9~k9+z{w-9 z=5imIbkx~O(r7PiJF%O|Z8RXkwKdbqG2BA!A`M)Q!Py!pt*%n;0XqXrLsB~q8VrpK zO=&w$Kcsvr6wU8nmYYIk)|)WX$*!apzFDdyx;X@(KxW1)at*b#^FgfD<!wtbpluU) z3B(56P8;#087gZj=Pr#;Rag&(BuuH-q0OB?d529~O}m&?%3WbKV7_!mDMb@MR<Q2> zOW&&ZM<RQN?$~;4e&5rFqk9e-4f-~{J~|M!r#@wO545)beB`!UV|(X}AbDEGYMwu$ z*FDf0n;adPO(c#S+<mXnA-KOD((8I>{e}%az7C(?V*T{l@!L~QyIOlYa<f!!PIXE% z0^d48^506=K-2qUB>%+uP&DFzCK$>VZk%YsJ&Y4ohF-;3G@DS)7CSgfZ-7R}gOGo7 zunV78?um88h7lE^c+T>Jtg|)QH%5}(WtJLH7?P^Yc$er6lCA7xqSJh~${dWvi7itc zqxVYBn^>UMrUM5HXU%_ieQK~2v1&iJK1~p~J&0>wSDgSpV3d26le<U{;&cIJxByxk zOF5j}K*Z5p1F;|sM=6s7bpq1@(;~H-NeejAg8XJow4PC(r)D$F5rAu0k2?UMd7&nv z#~pZ-oAN|yhYqL{<QflYL5(A?1tuhH5)b$Wm@q|wiu$s#gVHc*fVdeetqtKaf-g%o zRR6c*jvBD0ZvkixYGSs>tm~2d;&fxr8fE{Fagl6|6z^V>?C0i6cq|k@a0+@*jSK>} z!2oB(_g#+uzi3X-;yU8ToFT=)y|9U&O^Z?hN820NgrilsD8vjdNjc<ACgo=GGaa~@ zA>0MlNiDczL-++3`xwg-Hlb>YtXB%q@vA@_APoVgjG-U1wV^fA**`u*XYOKEBa|vQ zsh<V_a7zv5vo3cm(?cKuafWULzmC7dXr(q@@xca`Itgp`7^e%RKEYj}y5qmNKDE(& zpd9<ipM(aFxV#SA8|%{^>NzNWWDmpF%yECU8Vv%6H<-H}q(NA|u(Af3ObcVCaU;4n z328G*fw;HT>kus~H(^vy<d-#Q0UAUs-jaSZX7tKPhvGd{4`E6q+xxKa<RV3qX5m|; zLF9((RH@QV2e3Jq+*Oj0HeszQ#($9!=qdBQf7exQZA@kYCZE%^MwL_lyE<zB+aPb! 
z+XCBSwR*nS?@gQBe!5Ghe`P_|yO`mA#K{RVgakQc&>QR~6?M_#U~EfNoLrO|pncs9 zlTapQ<*(Qk&2}r=*Klz#M(sB=G8dd}6uj9L<)aqTB$fR-u|N<OPSn#-jaWepz|ZL_ zE#5>A1K^1LW(?PaTIkm7VB!?5xUn8II;j!tHJ`OR+q%+|q=Z;VYE4UJI=od$GO(bp zs50S=<%kV*1}VD(J0-PcYgR@69u~ZnPu}aB-*tGOci*kNbA6hqeSj^G`u@&aL&@>k zaA^AQ$@Qw@k}m!hba4ds^WQGk#R(YE>}IBUtOg?wj!q>KiIuz@Gps9XCNz;s>{C$6 zskJC2)nSSqn<NnBY5*A0j!uSSoI(?Epqx3k?)V5r*?H9*qyVtg$4Q%^nsRqPB1J01 zNIeYKxIVwPEC{M=($IgLz9;m+`1I-EeH%`Zes**>nd)6tYt&EGH;>=%zHNGDzxxqT z6vRACLA%ZF2GUQ86Iq^Q1<U#?dH!hBVB~3Xh}bbDnxnM#<bQMbF3@eA*O}-!coPIc z00i+Mh!;T+1VQi+06`FZf)vG<D2kFON}~0$EXndKj_W#(qx!n8>n4ul+OAv2&A4rD zM@@4xuIHQsSV_~FRB4ixn|h_WnOrNklgaHPZQV3!lF6hQKP2RS`#%5)P$b2d*P3<L zO2|V1lo$Vh@BQz8?{9w}5j$1Px`9u9-!&-&HysG=gWF{l=gcyxJ`(&4F*ilOu{2L# zFe`Nzt%a#i0`!=73pxCzQ*l`EZAklw(<zgc%mBq*QR@OdLlbkfXE`l`CKLE|f|eNQ z1XKp7@Sh<v%h#N4A=^)oVR4S$Tk*$A3aG7Tl8>;q%D$cElchGx+o4igvfZ?je3A0X z(*5cnO*3!lh9w>ls9MUhOzhsecP(hMLPNSRt0bDbB)R0~Z(OvE>)*I$CvZ@zYu@`i zS7n04d&+zGr`ydzm%BJyyogmMqbg1ZOv>k(uk{h58mTRlWT{vS00lW47}ZFVtqsp& zjNw$x0DcD&)l;TG?zpajrw54F6~RM;CGRz8F>%ZrIWN2q1)!2*qJgoNl*#!B;bXut zS5WEnYC6571bj=btz$?Z;22}@EpH3lO(Ezj7bA#sdi5|O1a<+ei}YrCQjTCM0773T zyvx(;jVdEM&z{)(IE|1VScK5-;}fyob9aBFIp&EPYresoMc(pfI+e@W<$n5+csg2g zly^_8>O<Pk-Ffz5qxgu<V$ps0W#fZqF5ZRvZ3?x9i@GfG`hM=$Id;7&bTZtv+{x(W zI&ra*O;kh<rlel1Y^0(w#v_|Yvr*>i8Zh=CMbUHgCFLmVIYyxXEBCM(LJwN^5E?BF zZ%cz%yr|M-X{NgpC-veUD|LaqzB(mWr+T{7>7HQ~gV;3!Yf=)&1(b)Fz=*&;ZDmrr zuTTm;77Y(AoZ^8pw=@KernV-X#qe`W;tg-))Qn&I$iMsjM~vEUhlln+=xp@H7e{XE zaGA7vlg_Bo+D*1Yzj1)Hac{hH?s&f{_~E})-39GjR*%u4fAV=Pt<fO&VzuUQC)pc_ zkO;iP!~=yC$vu+2wWh15HGw!R@uVUan5y_Mn9?vu=n*IlgZHsiFJO2?Vj`BIg+Zrj z+A3LX_@1KL04D|8uxd#_NvbdVTemT;%+dGt&70Zs{a&@2KX@%eW9yhg+`XvzG_y76 zR?t;`>$WD5scmR!c`jP&ZsUf}jE!}-ajr$Ja#J(I4M43A9X^)3RM=h>0%Y^H$D7Od zzbhLX-}0JizP&l)L##t@p=^8LA+OZCzLo7&6t9<UuafwBZCfMEVj2R!$j>PiSi{t+ zE7)5LXX<!G{rZ=-U~Q!ewrVxw?%KwNxCYL?vaLP5SzD`EF&5WuZEMT1Om;HEstRSx zU`}eIY~Gwy&Xul-a0`PfEsN&T=I!Ymv!^=fr~hh+jT8d5q7`y|6w65OyFV$DZGqmF 
zNB}67wPlh3`HwJ=zZoZ{kIDY{2*PP*Q2s(%$v$%k?Pj3)^PEZNAp6N!^BNV{+VGfG zs2)S8*-Je@c&;85+<M%4v1%4=Ve{s+R7G*e&u-$Vq_eVkf!S0Y&H?ablR`|POF1jM z*pb)e!;!~k!$V9S4<Hg=gzxo&)V2{(D~9ti0N@}~H6vkWxt9S4dja8O!9IqnRo>+G zF$sC855hZAH{#XmmD++a@lRB&5!euay0Jm68qoXm&6yQ)YE$uZ+7aZ`HCzMmt(Ykp zpw$7|{#Q^FoFr@NNJ-<k4n-1>Pe@X+La7;+#aK|UK(85!A#x*YGdJ$X^5od?V~$wk z6hIA$Invl26`bnvSg)v;_cH5~dl@3Fmk^mjw#Gr=o+K(^G@>UhPBUV7SfbV@?VH0n zYP5n+5W3`2goKlWR?!c<HsPQ+X`D)ly_g8Y+Ak}HtZ0Yu3IUumIjk<pBCqJqtAKqX zz%d$1s|Bt3mlgF8rXqsJ;@yU%aQ9Q^o|}ssP3hjfz?=mPNbzP;f%pU8{KWk4ynJfl z4!jtd#fz85-o)%drJ9I#=*C-%_dgrIz~@f<)&~#qZ;kB<8<(j<47S;YV*L4Uzq%B= z@U0hAvyTid3t(s<;z_uH&uOCS2iczO=YFgxjnK?$4pVpI9d=wQp;Zs$u{?pzRFDK| zrgtr_wSyQkc3zVr7+-ZpsAldYIG|Ih@v?oxC}kHeMKeih7$52yrVn-X0EtjP+=ZDT zBlgoM(6+Mi;|NN<d6gm1La~FiIQ6ZR0~w*qK)cmt&DUAeBs7l=Bz>b{tDMPBup_;L za0Wn1jaucgnOXWRk4<*w00(7w!d6Cd>|A48Bbaun*fS=r!BWlM$jvVk{~sEO5e&%a zb^nN|pX^*ezr?10`kr{^Qfd~m8QD~$#c(cn;(;Kam@^ucpZiW3jjzl)El$^qt9SIB z-+P3A>ddf?iDT5gI8^-DBTHw!DgNoRIq*3`FoRxucz(!W4A<C)bh?=?XXA$mLue{q zX8z}eUdgwVX}&^O@xFuPec<GpgZCu{j*<7(Zzv50;S!V%A<iTF8G)l8swMIa<6xVa z$DaCE55dN$p`y58js>(FVbWvFk7FsgAW5G1XRs-Usp5A<Zx1GV_L3N<6vMR%)Qu$I z*bBF-nz@wb!f?VIG*RG$>k3wF(-qa3b#%u3VbU2Zp4pOTt|w*X-@bzq7Va|hPcIad zp~6YLtA!rLJLh^ZWmlRDkmt!e+2{aN{lYlF2XD2*&{4+pQvzOC0&U?e644)~ex!N< zC)E_*Dv8ZQxDRw&<w@G(_(TE_-n`A;(MitNXU)~<>m9zXp1~1th*EGB4>1w!(LM9b z8>{!UXAT>_KP&FD%Cifk!$~c);LH(vYPU*VtfasGZg1a1RgZ<D_MkKRQIp5l0AV>H z5~)-x``9KucAcI6Z`*QQ^4oLFZ|k5s_nb13KoVtiYBK7dR!03J2V+z4+p?G(LZn>J z2vzGR;b{GgA)itI;5-BjzR;~WW4DJwY<CAf<6#&r4>j{iNUokr`G}AnYr&{$Rk$>| zZDJc-a2O-(Qg{CvS;zLyJk{4LGLXq|5v9y69C_85n&$DH`O^jFPjx^6OA5brp;hsx z;Z_o$g^(?sb~*AV7nMAO6e}!sV&wt*wF2r0zbN6}(GezOV>?c?@gz}P%I%EDvb_sy zOatdZbm>9W(9}#`(GE8OSA&`JwF*ivABS5aMI>~f-XF$pU<iG{Di4m+5hgZ5O}?KY zA6>w_tdT75zz;0o$M%%?u_s$Io3#7t%y4j9=1zGcX3s}-pXOS*+qrwWw>f#1q8chr zWbw!7q#8&`M{rX0jh&{G$}SX|8FEonC-Yo0<s=3+o6d`17(7_g*EmRa-gXLhevs*F z9K?x5;u=lx1sC%L(Y5-}O5(jq#ZM+#rE8L8#FLCiE=dVFQH{18#Lg_yz$_%cB{eF3 
zR}3wE1G6ViW8<VG4zGOo#k=mN9iFr@8HFQM%9?}-kz_cA-*po>Np>||#I8!s)V<c8 zBiE|ndC&rS6e<xtIFmj#Zq(a6)4*&En(9yssIAo_<;in{#aeHzYtUKzL9@fxPy-f- zJHP^AX#opFF}o#aRNp@C`v;f5*2O!uKUZeaH8n<v9_s3BwJt^q!5bP)0b+&7ZrfA& zB7_S~HJ*kB%26@D64ec19<z(P$i1t~%@LIf<*29k3b2P0ir3ywUYjMaomUE#`;#J= z`61g05(&|s1jr_UPrLM78?{y7(sv1B@|rXT&rObz0VSzSImVQ0#q&w&INq2(PH#*f zK$2=dPKvdqw&>lkjj<$lKNKFI4jIPiawY%^T!Dx%V%|G{C%tCJW`l`9goNAeci{VR zn{;S_+GTrB!!k}~#S>Qf<oP?v-+OIx^0o`~lm!t5jzuXzP1G=yZ3*k+Ad4xFin5zU zPsX`!Mx@?$7OQ%U&6R3i^I6$k>2n+MRs7K=Lx(QQtE2X>w9wSEE%5c&h|2uB&QjN; ziNw=7%hwDIM!%}<-No&N6yV#Hm~sJteh1ga-9g>eIf^Q{n^RO-qNoC)m7|C%{o}V& zRB<s$dO+AEofSX&ECpWAxnLHxFpG17c(|l$aF}+#{S4y5VMeca_&6-2-6GlnC?XML z(JV-Id;{Yipp1oSCDha;<!M$AnBIfm>|v5+JwQj3WSJVDRtwU3*iSn&0OeXM=`SKi z(Bz*Q!u75FLvyEYC-VW?k;mm9TF5e|Qy>sWsUk3a80O)Fd0<GvOqCUM>Xa#cLNI7o z9(CcB$!iS3LcEEwmT&das=04CErf%q>%SWox?4kugvwpKgz5e|{v$f0R^4>tueTIL zw4aMbBZ3eKN0TkVR3a2i_0ZJ6zSg8|>?-c$MoaVCEM9281Dr%X=i!d<YK6bNkW?bZ zY_b_ntjrlQBazuCBCJCxd{L9u!A85$q!rB!)#QL(7-1KSg6L<wO=W6oG&u6SfYB>< zy9M=3FLHrH&@)Bg=$#<bA==V>-9#@wQiI!VjX+!|(ot+aPs2S8?bu$yDy#Ik(NgA| zSL>QtR$XJ2ll?=b$Y88T)F9@Qe686{ja)$dq>)jIB*|GshkX=&<Qa-BSutdj{Zp_I zMrqRt^!G|1x6E$htb%uM_r}Yc*_F4%cQIK;;RlLe+GJSJ&L0BdMJB_({!AH1fA_1~ ziUKN5^@}&PDcuF|noM)|b8><X13e+rY$>9FLdeh8b9xHk%6Xtp%Ka1<(9|M|2w(ds z%dY|9RD-#m`7r}xYN0qT`vRE0!fIy{Sna)=Vg_Odxuc<iI#vULRJ8d>H+4m&6QUA) zitlYxJ2#B9SHKkeMW5xe(HI#DwbogS$83JDt$44HUN9QIXwQ7nV7Rj_%ug8kJ&~yP z9VNnAO&&zpUY+%8n7c-ZjTC=e7C_Z{+(W*OcYi4G-5S>iA+y=dqyuVdhi<$61$BC1 zf%R)lzj#yq6D{;;foQl#IeF#&QYVltT7(wu7R3a#$X8aBy4tj&;ykS;O`z`wniT*^ zbR8rIvLjKuoExFHcN5bRThC;M)Hu|o<!l3dxRRk)<5NMWgC_wG)h9Mw)C@YRFCAaP z*N#lP-B$du-5atL-%)G4sr~d;^B5#gT|rlk-T>aE-*z{-|0w9wytwx9Rs7wuo~gzz zv~}Eg3^ewtkLYV>x@*dBsPtN!-e>AE79SeX4)MP=mqy)21^(9*!y;tB=Eo*SSQA^U zXIFXI<Pd0$D^SK@c~OYesk=p_E;K`!_aclUGH_BDfdxW3)h$67U0JEuyrCVcbmcxO z|J+t=Zn?9iJGey~18(Lgs6p0(hib0G6U|B)<-fmi_1}-7zLme9W>@L&FL;_e<HYEM z1t8$<z*-BiX~W^4Vyn;|9_FpvF_Pug(0shS`CG9Hs^@8b@uvP0x~GxToZ{|QChEmd 
zO72C>k{k!1+mM28LxXo{mULT09Z1kl3#7M6wEABP{V=VF>U3K~;i@!hep?*7%H@Ub z(J|u2=a-)pb(|E2Eu{97>{?Rd@QH+b`CgieD1|EWI9~BMHG%Qh3S}x5%F;c;st~8a z>#CppPrt3spDlhRr~dCcgWDeQIE=>2L!OVr%I5bq7Von}1nW<10Q45WXNw4y;?thg zjKOf(bLPV!?~A*Pd`iQspc-XfqL-stggkFkyvA(dBO2Y)uQ<`ccRfB-blbd*W`hng zP9?2p&CtNRpAq<^>f71WjS%i#nx(nc<wCg16b@aVSA8lH@%T;)lPj-lEY5h~`uodq zQ2Rs1-EkT$Osw1{&ZOky2!&G15cUG9wgt04+K-_t<Oyd4>2^K_NzHj29rkWeZ;&xI zDCknq1!2fU&BTQG-b@4Zm?_GwbPu31gvkiNQ5%NAow<Bg&<3cm>@AU69+8n9Bg;;$ zdiFV!t<Dnr7wyP|`EMqjfy?H4v-!C?$O~Cpx<jZ#H>Y1R80OtR$*j}rdDAPk24^OY zdh}5A0t_6)o(;u+vkcq5^$ok%Z!7*R*guJv@17ZY$*4V5EV^BW`sm*}gd1Pi3{Dr9 zLKa(?zbBwKMOr%>)MX3T2~)*ft4d|z{rqP<zIKOgS>jn8&2-=RKrG5%FB;-e)r$uc z5UXM8&uo1aTYm<9h?8XNG-DyQq~tW&dgC0{7sPS}Tl3&PRE!l}sZ)Y<dbO<!7P0H9 zl!L90!q&Tl+^Ayf<tN3~lXN|gAm_%GECarG)zme1V89uxjQV}MZ|TpJVBy72@3x_% z7Kt2c$94|uUSW{och}C<$B0fw&w$p>;`esj&{xpc#jotXp{q+Vo%m<!aZ3e*G9h5Z z_LMw8mhM@ArGt#7;A$B=cpF*z@(qM~q;vzp&6K}ZZh_X|=%d>Nu9DS3KW!0vu1XVd zgh@C;PkLffafI^IvLcdvP`NVHSo1MqA1mJ1+2mu#&#+{-59JLhes--3#IE{_>P6NQ z<wN+Eol~9Ib)V6`d8ogi?iuv8_{MJg4b?Q^H=4ItAN4%<C?_w{zRsuQ({w_#@4>#F zz`okULJkP{HG1C{m^xr}Yg@XLa%D>G!fowrZ;W7S3U(w8(*>~JmPfv$NH?EY#5vQ{ zOLsn&^|3jImbQv!_3nBh|7NMzd7IMAE}q(5N7TF+&GxrP`uiiL@5QUT?~N-t7Ww0C z+(#4^Ofis>Q{;^efZr2mV8PPVKaC}?5-2%Qsc&|2wg1H}V&_$982&c`|Lcqok0}0E zep>dwd?>PquJfGKO}vZYrAlAhRUcE^7>ZP#+I{ExE}N1lzPa0OMLlhH9ucq9N$e0S z*MP~OY6A8sd01cod$oYQGI=9?7!*dNJXLu~K&qfz5*pntQ3AMg(gM1WvaO2{0o6C9 z-OkLOE<*!<u***JI&ko9q1<@!>G!}naQ`SS>OaBmrKdS)1Jmrh*qV|v<QvX;talGZ zG~e)kh_6*$PtR1YJo4b`vngQ_yRS-P@Qpq2jqb$Q9>q7xPs}&ig)S1CJamHXRSy+C zqo-Ey)b2XSKzX0kGrR61%hy5ezwWY^Q2%&Q`$KkrKFJ+l@sYLtqOy$e|9cNvUUkjK zciltietumWVKWGKAeUpj2q+gbx#ZeLkk*yTiwjD5arEGARK=;7`(de^*j=9Lk#1Kq zIouo(nAAE7OsJ3%-AxmoS8BL4e^aSD8d10D={jD`!I*(_;ED(?u9b_W8o$}#0BpV_ zZA*{%m)Nzw(e6}b6xM@pkIao34VM-tE*lJo%-z>5b;icxPyHdFnJ%Y>e#K~<xhug7 z_U7V$>&W$XvrqneqtP3->uRl*zlTn#^}4@pntk{ZK71ineA6B%&m^RdrL{%@wMNn^ z?Hc0ma`N8dZ=}82GHau*cMt!fCmQAdV4&ZsE|(=$T9diC$yJn`ZvH~bzkEyAWyXpx 
zdRm-uTqAK_1b=Z|{W^<T??<mtxu?fd#91VF^i0y8RMM%Pp-QGRC7s0Eb{}+Zacn$L z4qhNCl76XlY~)OJ=0-;-k<1)Fcb*!1oOJRqHpYw!{Hnvl=@{XAtLu?`J+DJS5?ls$ zJylMyBgx+6u&PT%brK!kEl-4?bBi~JRrn{ctCd>^832_}#s7YJ{wDgUz2Ql>x;aM0 zDZ6mNU>G^n`-;){r@_=vBh-XnYQ)9esPh^cR0f@)!St}fWT<)E;}+QGgRK}{Gsa^R z(N~P7{QO0f2V5C^-WW7BYM>Qcw>+g&%zn=6{(LBee?+;i8+6gJ@0cyc>+xQPW_i=} zI-Swn=tX%N2!}sQG%-aoCZw&ZO{BkTYTz{w8q5EJN~2R5Yb?~;Reb0nb|1QjCM!rF zj)Y54+mai+QQ9%$84*vUB$_mzK7I#zQN2*2q^E0<3cGf%Al6(f)G74w*}58ZTkb-) zrEYn^wFA3(A9gc~$?wnBnxh?^6O*K0CEZ0tB8L^4k=L3${^Lu^ku!q9RjCCT^@`(_ zn<wv?1FDVv)Cno>8QecOxL@`3d*X*bq<_2gzjuT0nK!BxU&Nj09FWo<B)(gkv7>Ga zIP}nM8KAvgAP=lfNeA(g!NwvaFdZLA6)q~o@(*Bd9}&d!*QC3#oyU{%`Mas9cK$G0 zsdrnX`>~_HjIjA3e82lVkQrQdc>l2>);qo5`kIOlJKNi^%i=HFt{7`W9bMg8=DY_d zvAHUSsX;7#4;>bBlQ-YR9(r$m_d#}x&N6^;=1@lU-S^Ie`Twx$o<=cPh1tB9*ts~v zeV}?aua#UG=pNFIkM)g_D_0zd(1cRP@kC8hHn?b`4RvUrR}JT_jO8&D8%^M$d1_&Y zWe~VEF;U^MG*7S+bV;8Pw9u;#MbdOfuV}cUWUZ9C)lgdD|BfAr&rduK2=@Hq*Nn!; zCoG=g{|IG!l!FibxFepnADL}+e{az_J6ciEb_kacWM>XU|IK32l#jhtqj<a~(<ks> z9?063PdIJISn;GA74%p9?XKG8;<YX{Q=IVjw1WZwYwIqCH5PWh&2dM7Ujw;r!LH}r zz&KPMY(#f2j;0tnO6RbG8w8{7O4Ec0aw*4_no*m^@U|U~89c5!%ur@#-2X|+6d;%m zmei_*LPe<vt=}-~wpATidX-wY!hqX+=%HvP6)}d)6TyVv-TTkJupl74SeXB~!O$AD ztBoeL>WZ$;WYWFv9E1+$OQF~WqcMHV{luHG<<saL`wOw4XGk}2?%2d7m1l3COMyfi zlLmENeaPe0<I+-Z=IiSCHxl9HBTF-vDE{ciUtLgggaz2=Cv^`)yJ?s^&oR+rrhL)E z7;Jl>!z7cPvf<H!SI?cq8G~5`4P=2Z{?c%*kTh~u949pBlB7X*;M1A;^#G?MHKfpY z_gb$EH@N~WT{PQ>HN}`B7!BzESVX=BpbQYb_i`D~PKC5czi9N+0fmL$R8o)a<Uv!F z2@h+aB(GC_ap{9b$CC?Ze#NO>xD@993tueWa6~MZ`G0xVVlU1=@OaSE5;DI2Y5vQh zdkh{vQT$?Y!~(eZ*ty{WkINJ8&^>(LQqVY*o5+~GBOh_T{XUJ~?f9AMGgq8un?ZHY zy~Zn#TQm-Du6XXoLtc|#ZBY5c3+asURoL6?FWylPXs57NJy2JY+YrH}W;icX0fzPt z8+GBuG?74-2V((&kOXRzr(jvl7*Hf!OWaxrhGN|b3dT0V#ND(i(8~!X3E+eUPzQ(k z5gvcNK1?gcHE0QC-<#>-zgvMM#VpG=jqR~!{$Woxs&a>m@B9;Qojn}&MjXzL8;=V% zKJ}I2I}fHpL9a?XRW<KZOsP!h5P64+^TqF~_ACz8ed1SiZi}0Lnct^V{i&f|JZ2() zTfp4sCrfjmm?5jDzaOm#Imz(0$t-N8g=~ckhDKe;Rw^2`82qtbBXKC$3kVa-)KAp5 
zC83FK7z~gFEzEk7P^gitke~)4_sXEl;R|$jE7ns%Te(Hs@t2u@RMT5Fo8`+Be{RzT z!(!Ad`ndz(Y^vpw+(Vq)N^91HH6ukdb~|FD#1*TQ5=7zJGF9C2Dp3MX>~L`c2_zVe zw;3Fgpz+4E04FuI;^p?N7_%1q!DtuL^0C;U<-96}ZO*R7%ZYV<vlg|yKAInF)|P&@ zY)f}-(v~X55i4zJE7ntm^w~<}QIFs}^4R#x9z^-An@?#jkvDAKdJu1lKU=n)lk3_} z#i~@=&a$=0Y8CwG7B*5b!^&^UMnE%q!p%1|U_o$Y^>5lv$^)9uZpKUybDA;lo6)`q z|C#5mDD#=(Fq^rOtpj^0YwjSutrXm;uwPLZTwV~Vy~6?xI~qU-moJlx2waoufH;J{ zL7<LOssN2_;$k!f*CNPzX~ts+p=^qw*QdsTe5Q%Hx&Q$TRSshG5vQ!LHP~Z`6k}oR zBcV-9#$k!~V@gi$PGs<d)P_5g*h&sqxws_(iSpHVd0dY8mcSCOW@-`u9o!_&VeL_f z#8rP+x$t*1c4ljM<14<t|D&cR6>qE2#oZZa?UlPcyg{Y1z4+DD63*eNr!R$yrCm@j z7R95}4sA`MMx&uG#>n`X`mwQ5ze?k5Gj}u|da{^uI6gHo^_OewIkzFksX1tyt1fE) z6YKYkasNT#+oj1I0FC7yHkZ)@f?e58iCB~ka6p!GW#o|ng(lo>_K+&4)Cg5)8pQ!~ zzML4OO-jV^%YzAe9E6^8UYi)im6pRDh!a@%50g5j0Z3^aTQx|zoEq9^3JHb36~h~V zRT;4#J=mmSP%TKkpsZu$3OC<KHb0qcU)A{e{>H<k+9fuxn9ZV*gSo|Qwm7)?c~VyU zKL-4Zox4D_funSPJeZQt&_NO%M+F*%n?hq-_b4j*NwGmF==9v9SXoXenAsCA-YJNt zYh~g;wM8-jqiazxy6^~tg<Ud0nkikxDjI-G`)r-L!3XxS1u}XIl#M`1fojiDSh*)d z@JfjT?VMa|qALe1ohU^eOz2Nny6c5nkkV07yldEwp9?*`OX>9A-TUl)_e@U}f4aMU z1g&#ei05+g#9;no7cWdrTzoGe9`G6E?y|`C^C<T$Cp+o(f&`mxFJ0s)GfMR|J$aCB zhO77UB?@R5TU%#8C7Mf9o3J;@6mIsAh_Dv0mtod8wb<mzDNGS!D`NnMNjbvT-+Cj} zJG}s@B0?(7w(hxm@1B_&+Q}_QYadefX+T!?>76g^@HV9W#W8UEUe^9T7#;oGLEgm4 zVe+y06iwU!I4@2l<z^D1<(j7vkVeSy&=HlpN9eU~jK~m@o%BM?LZLy=u`L4Jv-Vnn zQ=E*mFn)v#f?FynW-OwcaP#d5WbK0F#y6-!5*NIq3n3757!vTI)GxB!^B1iIVAF_> zF0<&mDs@`l7CWwDh9~cIIT}c}I3MfiY`~0ZM`z%pfsclCz|v=ByoqmI-r<aOvc8h6 zJg}~ko8*l9vyvYwZUs*u+)XrQXuX18B_9a3lLiSO{rC(=fs%ymh=q<hH_EmV+|C@( zVs*mbadT@7u|XTUAESVjwlKol5_T?wls(|)fwc#}Eu%I^Wmk9V{c}-2&}pgB(Uj9@ zad_t5&%g7BU*h*Ycdu6y9ber0&@-QZNi=<G{DMZ8h)(fuJj$nak+8c}rE82W#xH5U zH}^jCuz{aSq$V^u-Tl{JjU~j)ZO84wdjjd;-49-PITF*@d1EZ*1vcLqpIUgf_^IMI z28PoCmr)ms@yh)-0j-hm=!(cMvfS@i^NGZeSP;XHLQ25(l2#X?;~BqAMaJGwL^fi3 zQZgZdFf~&UHCgf5W=3Pvj1fbku_+PV;<l6T(Q!DxAb2bdV6%zPhTy(EIJI6=*VNl$ zy|nZK0;`zBhV;^<v7Mr8#nlH?**Dd4z>gG3_oR}HJh+ToM2TJ3xPsq|_TCFssj=8A 
z2E(FYv;*ubEPUmW3*+ffc+O(U%;j`1m-xiq`S7FJ<%%}Mqf};lov|T$_HX`aSP=Ze z!u*&1g5!Afq_KX;&(a(iZ0hKCV}S$cz&1szN6gp>nn2^&8IF7oOa!!qxHj`*FBK3Z zC#1XB7)THL5yV}aQyZ{)4QhI90E2{xlM3izsRIi}Vzz3O8HjKZg93a_8w+F27Q6)Z z*&s$l5bmS|)=~`7nmP41XHxn|22!-^E|-_*f7(*0T&QP5@>*-9ethVzqQ82X_$L+y z?i#phss4c0irE_0s*n54(v8Q4wSsb1l-3FjWQ87z>mkAh#$}D0Vj<kx6s?PiyfVtn zAx{ERSK~D?n3S|II4z@u(DKNOa#|yPu~AD4*NB8mv;ZAwkucatguYCLPfNqBbyzep zX_afENP@B)R;e+hTVtIn*C-Q8r>GuLVOJ>Cs6v0P<||jKv`jTGe!A_CjxC#2I31|b zyl|j*&kdn^waP{{(0AamUoY8J=iy<j6?B@k1K^n+Bkl|x*%b;blDZBFKk2*EPqG7b z25N*9yuBeu9XmZ)o{v1za*gssnPVvpr^T{ZI9e8q3Cn^oj49QU{5xsBcVmSDsGL*s z$dM@-Xh2oeWyKB~ByGsXAXt6H;SG7RgA^vk07T!Nbb!X;rg@Di%nIwh*4M0o_Gkx{ zDnu{i+a0k=E+UU~D3|f@&3-$~w+5VzROMQheIU6wOg%C>a3cRkw6(!vLoF&R|3sw{ zQ|4p5k$>_^@mm_x%Wr=FOY?tJZ_@>v{1@&x@uWF1_nRl2X49c>TpIh*+wW8}y6|T| z{mvT?3?_<RU5)9D=Rf&oMs?vcCfyT{e(Znuj4$o&Q6(PY&jjpk#jh3rN{N%uNLIfA z+cm?}!SzD(B$5_;;3Ct-tdJa{5;Yg{6XiM_yB-?=630fa9y2Cw?D&)0$W*yDVy4Cx z5MHkni%u?GgBVxPnrt0Z662&0&5@ew8)@{1&K}%IQ4GThL|}70woDh_qu3${(s3-Y z7)J|hQ|QsVR6lhMesJ;r$HUKj<dqM6>OXz+nTJk1GIQnJIesSS-!t^k;$y$}#ZSL_ z=-E#!UAnRq<)0sW|Gl~Up1<?lTT2Un_7^Ylj@gSJzI<0`Y&;TeKe0GApSg7LMDbrA zf8uke+YdhetJJ3h#^6cDf!+wk`xDB|T<lFr5v+qZHAJ`4IBavJ_*IxTwL(6O2CdFe z;s(k4mGU#m*i-pgHrK~kG>ao|lgxNt74w7YkXq+c1CmqKK1I2iPiCrLOAhPWBI+QB z^i*@@vR8_~`)b`c)heC2;r~(@{I+DX8?Daw`O)f}TMF~vjF0o5GKGfxgo8fx5Gv88 zNMq3AsL`o34;Sad9>a2xNBxQ_e&a_vqfWyYPucyACZkHd+_COCrkRPpdv7>i(GgXv z(Bd|M28O?_-p=lfGpLJQW_7VfZj!H6blAj+RKcd?T7ya1M$`J74cc~cn$C<itZp0W zI&y6YkSmMJX3;UYiWo;KRe+n3Ot<7iKILQdDWs{na{Q9`R3nz$)QWQlcYc~puTHW| zNV3dVr4}SvfvX^N%Ga4qNU{Q!d@#_0Bnw|Fk}SN*k}OkQutiC-ln0h%)dMsNVLgeX zkpim+fk0=KZEY+iZ-k;7iy>_`m^-@O8D{R4$NWwbEdEfL#g5l)A+K85<h@%@sFWV| zle#A{YZ!)S{6~cq3j4~Y%Hn(WlW&lyQsH=W4+NA7$C*cnwL-zI=g^myYu&WNwRVal zCk4@QtzcH7huJ}In2}i#$CefWT%%;w8Wu}RIkt2PTdPI(EBpMQwaJPatE{he<4NMs z?S2ha?wNf=eh#WvBwmsaAA*XW#A%qj5ra(#-3YWIT6FgEVaF0Lep~LResa&aCY=h+ zczimO-<9D{?pvHKzPM9Y)Y*hsJTs9BHBZ?Zb+$U2I}@3iyI!|*zhrlyF{u5vamV<D 
z()7X6R3W7J<$j#MowN}lFyF{cw38t5C6<WVrt-$g6XdQm5tpH<Pc8!@0bV@L)F8lg z1U@x!*dhkVNkP{u3IZLzcu8_%Q|pvXwI-=7x85$(`O$`r_JUJVnuZTAt@dFreAs_g zO2LN{SHaMe_xZi>;e;ihOr+q$_*&t^c$4|C*OyEwKCC=29}Yk}Qyk4obxv57y@gm3 zB@LZW`lbw{uBfph*AzLHfwozt2&PzpRSMw~k#>PppQPFe{3>UFjZsWzMn(<99(@at zM#>;eskPX!gDaJ4ESJ{xtCAjmAfYk%&7S5{12a>OEjYjC^;j-7F*6$pjq?sejl0I; zX=x14AG-dpTaN`AhG%(F_s{5;DbZkTv(dmIU`&<VpS-!kQL%QYzGLu*C>?q;=xkMV z64l%CPtAk=PV+bF9TvTVH`M*-rALfTv==Wqb!absVaHzj&z4xzX?=~xT~lvtu4+8K zdu4~NirMZN=Bu9r7le;%2hY|2Ea6g*r%F^;;(??b?jjcqAA+||ky~crmf0zKEsJS< zUIe)nG?a62ORyy6EMg0}B{U}>(;~ItTaG2ms1m@__QBIYT~&Qr`r&E8t5O=CmbwZC z7!(iu@U)aA-;+wi)9|&z)9@zqG=HEct$3R9z&x#GFWkiK!H{-VYR{qkX!SZ<qg`3@ zArK^C+#6iw6uPrvI9IeIK?M3R3RD$*Dw?uk9r{&vu@jF{TL2p~Ih$&=f8?<kOqRc_ zYPi0<1<%sHhQ!NivgQ4C28(a8s{eZZnXP(P*|*+A{m{j=0O<<fqMTgZ!;r{wM{b5r zS{mstEV>i@IB6G^lePi5IecqLkZvR2>Id<;<YC2dpcnL%PYZmnb{9O0UwqZU*BKn` z-|ggoyvu&|=0|N7W4-;(#L?Y#Dh*dt@+mu3wG*?Z@F_NxbTU<lDktD!+}DX8hl?)4 zr)J5gh}M2+mR=hI{}hd$iqqGmarjg&DUMsjHu9-7oGC3xZTL2BkvegDg@F2!4#1~M zbE+$4z7TvWd{xTAr!rR|D3Wgthu~8gOTI6Yg-_vYg-_v4=2M~8zO3R?$^-MMwrONN zT&-8*pQMwh6I6hoZ47$a<H;dL7I*+!sw@i~io&PjU1^%Yz@^LSQ;{)rX=y^!nXI;m zo%qya=hPaN*?=1Xa11rtf3NOL)bi(d;8=fUs5jP`c(q!Ke=cup<KMqS?;_p|!eTzD zdkZI5luKa>v%tw?<X-?V@5OEpFaWc>kNoS-j<J34uR9h0Y85auvy#v9;yskaim~Jt z^VoY~AZC<r3%<ay+b+=7DLF^2`wsenl37WG!aXPLqhXoOX~fK0pa5RuG<HXnNFvLb zMnyWy?#rw`=&0V%Z62OqA0Jourb5#mzs|(_-I)`!NNzK`?CejnR5uy&PN522PIXhe z>iDIzRA)fm+W~aX1^z*$iz%K@6%vZy--dQ1^)k5x@oEuPlUc&v1$u2S7?Ehxn^xz- ztmy_?u2}?wDbhR;yC@h8Aqor$BriiU2n6hi?Lja&ijqm)QDW;>)GI`%u#ynxHS8UQ zj6>84rTb65hP|w`7^rhseLTkyO5&J#M<^M&ikmP7uwn=$BbIz_WE7zUUn@ch-ejR9 z)>#@CmX!w<O1ci>cn(Jn(DCe?MI7;lqTPx9y{tRfW0mdcUV2MQAmSk1yz=_mCO@-Q zbPOfcZDg%`0fx$_dxxkKCDktA2Ts!wtT0!W7>903jH~T+Cjm%3gETp-N>r!G#b>u4 zf}W3!P}Up`j5`emw-FLWb|lU3+i@5w$02<?a%d;`AnxZ*E4VY#nu;P0`O_nG&I=TW z_L<uR#G!pk9J*Z~*f3Fh9YV?Y{etuW9rprm#*&4Ef^WB8snZ-;Jbj*Gi+Ir{9Y8-9 zA!9dAeuark-_iv#SUqY`K&v4`1LR+=+1=Q{zuwmAYIXI0w5Gw@V71pb>L*eiowkmK 
zo?mTfjo3Ri{kpO0K=Ff}q}`gIM+emAE43Oeieoh<ZJ6&!>8*bbHkoF1E7eKMapVj* zVq&P2-pTJ%n4iT9DWY7WF5-!#oE)H-kvxq!atltLBNS#Pu+bAo=vO9i>E^|Yf_Uhf zv>y>=Iw?XlPN0Y~j8HNxunYBmCYpew9uegjB8v7{8WE+W_os$}2fJcRcdZT;od^}6 z3>ZVG$Xyk?EXW4XQ^XA*KbjjusKD2XP=PmDsOXHBSTtnifrSd;5Kb;zo4tLW+&sM> za!Dqx-lO+*v?h87CWz{m1Or<bJ!$}9VFHD+d~aqL<8u(hIsz+=0SPE_R<y>bc)lJF zj>0RXud>>m1P9glh8#IP+Yb^yTZ4tD)~AZRvcoWeT!re<x`z-W+*}CvK;U>;&N`lA z(wwo;eH0^TsGiOfE7R&J9BoPtAX*$%qD2YPi_4(-TEWgFdKgKu9r5BdPG@TNl4vO> z4K!hAnw2s})tXmoB5g03SQQ(d-nFu|?xCs>@K(Ci;oM;r8@^68Z0$s8-H-51oE%kh z#6m#v^%>yINb!XW5PSO`a%R@%+e5GQ5R>rohA=)gFuo7Ktusk+z#@9dt<fP7I|a#$ zZvz$xSOMX#2q@VtQV3~76SkHJ)`&!{`rvPZH#c7etoCg<wb*RQdzzcz&2Vd>s~g^o zuNB^mH<>p#d4z7|4xv0SZ}#FYE*p%)bh?|NJSq7BU?P!Ws2T)%n8<3#Cbfp~;)GSw zsOX&U3BaLS!=0pCc+;^`(aGLXiTnFi)U6-dR>504QRRSdKUg-8ZPdR@_Xd3sSU4va z<c@KRr75m4Wr{00GEKf%hprioDnw8>MXwl*NjuT!T9R^>PVytJ?ks%s2=h&<oKk%g zRS%4JNKF72vh~ENN^o$%t5Efbq*%K!v&|y)!np5yoer0azr%3(o(KP(A!~YcVg3nY z@%Eio!1O_ImpXWGmpU*0>ZA8$$M^KV`^HX=;N>$``(v<%C{^mjEvnQOihPZ!waU_k z2KT51OAWYAh#GUZC{pK}he^n|gqcIdy0QqJZ=7z!n<Yln_p%~wA;<_S(rw^L&KWsd zgS)OJ(y0`q8{vT3n+nhtU?vKBRAY#I8Rml<J=M#QqCw?56Y&xKu2Zq39XaHWYD|eX z?Q1?8qf7l0%caYYnlrXPJagX*jr>=)QZD<UhL766+`=mysjfp8&Mvwi`<>^Xxc;?m z*2?sr<yskha*bM<nS2sd<K&Z#OwXl}6sRBsb8{7Kz4{qQMV0#5E;yy?;V-z%Mpx{A zbn{=?ZLjpa)L^T%H2zZhx!v;0Wxv#^5cyicPJD*%E!F!^rwTFUJUjxi7h;0tV#jbE zy2vlRDS3|0MH*-wnWOhcfUYZeg3~VeC8kK=mwDr~%K-1Zory!@v=hgY;)F$PrxP-R zqcI~8iZ0(eL6VUrg6j#Ut|5idj+YK13-%s_s7Ar3<i2NDpO#^qmXWK{08Y#7RWV}8 zw?)D@Ewh$<e|7+;CB9aimUxq$mf^Pk0p+w*9@uH=qFD@RtAM?~9`bim7n(o0TBE(m z6V!GHJtGRm$p|jmd7(SQ5@Z*!$+Ff*lT&a`l-?8|LmO&u!)jcec>S(<=)L!AG|a2? 
z4Gp*xm_4BejluBA?R#qRdo}eYz1pDFSnOZ6>y4HgUiDvg!C%A6wYR@qv-ajEe+9Ik z{I!jVakN3*Mr=v0Rh3)2zsyQ=+o;P?e_4v7;)WW>Yv?Zf+X$(`YI1qizT-<QRjw$@ z0o}5R26HL??(8Ig0u{N{?dAt|Ieh5k<>qrmMegD*g_80en#1W%wb%1JkcC;wtIwqh z-AWu;O3K|?iWS`_5l6ffM?jFL$9MTK1rrih8Yi-9ZhQ{GB+xeJMKD29j$pD*nGT#& z^Vg`uB~B*AeHJlJv1JgkWl)gfh%M25^9Uba7C!P$_QOi6df&#a>J$?74h;B8i-<!l ziz)m=Aterh)T%mb??4;^TT%{jX!xqw30!Yy2jbAMB|kEpLma}_ia3NfSsdz!jpUR# zq&z5DI|};Z9QxM#5NYgwfo^cDh+IHft_|FbS+T|{bz`a@y=zGo!1t7uDUm_$sT;r3 zlOAN6G+w0g1(jA4iqh}YS{WjExr<Fu3H_=rHe{-IC!tUcj8IEE;TcHqT54%~PY=vY z7Jqa5;p!<?tJ@b1R@Ler-$B4)bsyF*SgzB(pZoA;>vX8c7?mJqWQbQIhP;8FMGvWP zh`lb}nUqf-f-UDx-+|GLqsNn|=RF`09Y;kyPdZIU$gz{>afFD6tnz`y`<26F7cvp1 z&49)`OBbo42Na%C($k`(_p@D&F8tNDxYOja>H9xwjWrIcCwjbXE)$jX%$_EP+tQ@V z>ZX=s&GnsJCa9jrt8|(xwe`kURhYL&T2Ri@8fwf&tA+}C<ydqI^*p67fLX<Uibd=7 z1#VH#^HHb*Z|xfGgF-NJz=tRmEL}w~tg}(yAm1^2&=3>tsZvUFQy)Rg+5)Sqeqso5 zt<b7OpZBs^;G?dT4ZQ-X2dx5D{R;&3k?yq8FId$G@IgXJY6a{2#;n&$wQu*LOG=SI zddlSS+aQ6|9f?>XR^cIAN%QLK+Yd88U8majl^q3|a;@+F<yzkd*R1uaDK}axtZY~X zWMGh+>wg8Kdad$bbzrHe=xnq6_oM1K@`>H6|J`4X7H=IY*LHRYU@h1FVw{QLFBbSi zCC;nEse)h0QKpl!pF}b^|15G8jvQZ`lE)}LAwHiJNvv=wOqz-gp<q#pvg(>44k18Y z3IRt1apaoRkDR18DfU~$X39xo$Vp;?)Qp_O)IWk$qBLC8Z;|Ry9&Av`gAEp`1zCq1 zBHz+pWF5M_E%@k4?+~rd28@(1uFgN)$Ui(+B?0+I{Ho}&<eNNh<R9o8cEttcANX34 zf8b4)f4G~v1SS7a9$5at0T&AiAgK=8T9S{X9U8zwqptBQW3w$bDTE6TC#8T0%7<EG zl#xgrrVb^YiS9uSl}SXE@cIrga(~TSy3h89x8L`IUEE&5Ft+T&Z~w=#L2bQQ$hhh> zQEh^&<z~6R=VXi|vDqRyP6tdoquW%daHn7|RD@udN-HdA1^ir?^~J4q5HZ6?d;~$s z`B1fB2=f%vEEM-8<rKRfr3Ba-v<gd5XfO&86xt07s85!vR-(f~LlA{AW*wyc(~6(k z6k8#jx}S!cV_9hoFcNV%E4Gpj6em&mmL1BV#4NRJq$#jDNMk^9Ndp;tl-+A9VnhSA z2tjE9_#-I1yHqh~v4HWSp_dSSp&-71`HN=^HF(knlZpyFgbkPqyp{h^nWA9y(d;t+ znZ|Nx;q%kH^Gnl_pgVQ>Y;o#S=M9DjbGZi%pjB5H(*m$p!BAxJ?*Av$whVOE>`UBc zG+ycrKWj8*y2C}=G2Udf|6xJHD7!HFq)uNNVE+w+ZsL0u-KBG>Nbv&Z0EpWG<I>na z2N&m_P%!LN>WJb>+Cq*_^F7OxVg*yg(}Jxs(d<(f6Hkf2mrY5O8&+*fJ(ju^@41s~ z@`<Jc@-|OPj69?j)EFRk;G_sD2(sG`@J?1VS(Q=qiX~QulM$L3F{`MjOJfr09xfHI 
z+CyPv>zFO!?<=q2{rrP|{@jG8XLeXs8!|lgf4}~|GrZ<^_@%ZRe>1xD8;kcXua){U zy5@(@n&a;)cB*{x6Zc>JYu-?NIiz{zz84b6mx!bEI`pXs|9yg=R(hXo3T%jM8cpjQ z?37S5(lI>D&;t=*(M(w3MAuLd%L8g)x$6X3au-ICSCBNTk>PuwJ%c3zEMt#ApWcH( z0TH4z4p{6aAmdTky+2v9ZHyE|5guz4v&W8=Vk#ul7Vys<QeHo#s2VBax&%#?1cC++ zD7UII%sdKY4$Y!*Rw+P1ugxX}G3yPfE4)5OGDcz_cCQOK2#hljVgkBX*gfeO<v@pU z9h3Hfk&7xwo5H25cj!uL%^C>xH@53l>j2YV*_Mluj{%Q2rX@Uj1Pn!AQ7Zj~LB+=^ z;pfBvL4!yIZ(nB$X5=L=14U8yHK`9~;!Q%db87(e05+IFdmTx}Zba~JifjbKD(IU+ zohf!+F&q{qyreX4$t)Q54Ps|Iy0Q1feI%v<b3xQBa5Wmc6TgB2Ul{-np<H%#X0WUh z0Gz*-+p5lvr1b<}!IfBJ;Fs^jFW<_E*8_LoIJ7CBp2r>N6!kdCm01ciLO$J}lE&b% z`^lA;@Gwgc(|8DxPoL@@31Kr%;pgsC<fSCOc?*RcpeKPyCJ6G$q`a5WKJ3K=0qL<q zSQm4Tg?7asp)!X2rLds<q^65aDZyD_+Eax-_71@-r39eGk{t?CpvcLaY#s-8LTtB5 zdJbEs1!K5G>Pu2DIcY!Kvm09`w`XWTQaoja5U8{SkfpS61RvTzjSuZZuA&I_K+kSn zctQ#GvQrl|Zg@|*yj^G2elM0=LwhcAgWI0FF8eEKB(+1qMpTqq3PXy&_K=UZ!bg4C zcn{3NhvRW$>1Bx9B(6zW*n%&KV!RZ@R`%kjSTe8f4K8Q266xM#5m6<Z9UB(3Ug7E; z`gjjX5lX#qSVoeR9m1Z*66jr%>^1b2w<5K4YrmYQO>#tWnOsXxxA>2@^v`vqly1DT zp??zJ{&md(-9O+i(ga@cA1f6KMolnC9-2|uLfK(Unt#*ep@b<4kcZwD9}HkCZo|*< z+sH%Xq<!2-a`e0iV)wCY45f&Q>ljTE<i-fK(52|S`jUmP;(TEXaTOE}DnA{xNGN$0 z(u(_m5MJsU!`?W-!-B3zzOJ#sO&fC<O_YMZ#$c@{XFO;n8RV2r%w)wh)fh2ciz=iX z?qSn%x6zP|bQD2WI!K;6gRTl>&Jj)7NC;E;ylEt4r+)Wy8V+$Z8)-J=&#RzovF-Yt z?jL9-q!zO^^~OLIFy;D@ZF!xBosrLKl{t_om*XB$=0FO=ir2B>NyuOq{QX^NY_*?G z#!-Q^KKtQHe!5z+%OZ8sIzc>v{pzQ=MNd|2v0kydo!+kWFl|u=^K&$VK^#aJ9YFV4 z?o2YNKED;)Mt$XtCNXZAk*eD<Zc&GE3k@5$uzsNeZ8aJG-mM&H!x@X8ZsbIY4@Gs# zJjHSD`<&cIr(74jf|dKyDF*VNBG(}`-bs3(E`yJJ=y<Fjl{@J;evUDL)EJ}?LD^E8 zv=9O7Aso-Kq@1JKxm=FKFFP4}Kx|HydI4g{A{~U+98i4dfJI8+r<CGDnAMOvb69V# zksV2G*4Mx<-QZM_A04EVslE=*b-+sEhY1^!4sg44;q##+C>SBNY6T)!I^x3z56r`R zrs)1QfgNjDcj#k>zVpV)^v8eSCC^!B>SNn})5mhl&!Lv)!L;KL`3>E{N%op<a7j9? zTHrU$nOG9r%Q!2BiJp)3oSTWkh0(S2U~i*IX#{&j8h#Sz!ixK?pEcVZP3l;3fLy1? 
zCfh8y24G(@Kv2n1fqoF1t)Qf0yFkzZ>i6C>YjI1jdG%b48GYQ#Y>?u%t-8!#+3h-$ z#=FMc#D*S2ykE*)uZZ=V<L>6qC}%w*0}$xk>0m-tY}!|vfolws&(Qc;1NqEcKxn|C z&f(_(wfz+;HikpQa1vpcX&QLxBHrSb(D-136lo}qW3e%Wnhw(Mq!Vx?h>k<!9YBq8 z=ig-!Pf{{?Uh$;!LgA$Hq=M3|a-%z}yl~he9b0j^3HqgR_D;U-z%gVON}?&zEJiSm zT{9Q~oz9Bqtul~rc#=+n@GY;gHPK9l=(5TUZW0TU4kL|}7nd&5^D!J$(ir&R#Br;5 zknV$^0Oh0roK|+xtY}u8vqI5WngK>f8i&ZaG?ayR;##=``kj28?f^;LAh=cmE_Yl( zDUUiwZXR`34Zvd^PaeS>BxqJ!L2<IZ5{v37nh)s-@fZ-%v*~9V5LrdRx;^(*e|oJR zG)fTY+0dQwJ`^}<mLrYY+W%W(q?=LvdX)Sc%ru4AME@vEForC2?a>X;Gz?#pCSY+f z!kbiIAM;&xPSE)5!MlAM4l7GkIGZ3z`$$%FS}`>djwbp?X?9WUVpqyFCMNc<dnKv^ zfJV>2Qu>Bjr66V%_uxJ`3cJ|O*u*VkiTb{^Mu>~gZ!xNDY9L~B@kJkI@#TGdeADqo zgFqu2=*z~D1}#<$Z2>=m5efP9ofJ@5i*@lBwOH>L#Lb2qX#zn+2^=)r2T))+a=gD3 zIr7H%9p`DoL1)Vubf%N1?qd<dCLSQntgKNXr$vOYNjcmw#Ws?^-dv10jTlj=(;Yp@ z{`ott(gFMftAoW(YecLAOyP$~(3Ts(N;gjR<nA2r{I8(zYYWu&f7I*^Hbx99eP3mi zBdbf_cYK+;`NPiZAKy;8to}~N0I$<ssnuI*T^QhKY@?Phucr}?a4B|->5k$48smDQ zK>IDFhDlR3loPP_4aqNdP7aJAc68qI20O=kIFpDHok?+0VX_*>PuXN%J1*diS>2(P z4!~w6M{pLj(_L1u=4v^c-r;L&?HCv%wI;}-v%v|g7$A#5;>}5uz<9{n5nzL}Vy9Ic z#|#DCxY!Iuy;U;dUs|WLS~Z_jJx6iNXra1q!(oZOEk=zWmV2g6+6W&{>nvY0G#LG= zw#!uj?d!LW8{nyq>K;MdaC1w%MnMlr0y2#O%1c1hGmK|;4Te&w%#wN<>Lkjo6+nJr zsdK=OEvBMY=Mn{n&NgHlE0YqVpq#-*LWf~APH&G=FAr2RC6GS#R{+wFAT{wOOC5Ky z)gpmRwlJ@xAM=cytI)%KV)GVpM0R8u*v70xvlp_?gaQ?c{t+A>8bWqZX?flz&Ju$c zN*J8fiIh`rP4&>$FVRSiw1|e9G=C7kHjS#6IC*n8sAv=HG>=f7U3P!%!iU1s<(|T} z`!c#ms&2(^B}y%}ZMJF1r>RfFc-!4TrVbE$D$QvWvWic4Q$7%YBQMWl6q?e}Zfrg1 z*LdK+#P~H<2yRM>>DAlog|jAM1(BpQgvT^^enewdG+T*7%nl6>Oh`n#8T2X%ju8ZU zOi5tZ(n#^-NSuBUo2+7T!#%pCBl171a<#k5n{dm<&MGGOw;w3?GjG-B@q+hnjP-ej zXf~R0;bR~X)dWQ^5X6|o^RhPX5|^mXYt+G(I~iSKCpCH!$-KF_69G{H&{J3ZrZ!Gq z!#=Un##ShR@-^(`(g-Zw!;~U@t-CdifGKf=msercsSv*jBqp&d0&|PzS#=*0T{EV& z&NuSkF5@gK65DxDv+*@w9~k>Tjm8H@MjkX8=@~CQII?zI<ee{7)3Pl$8Z}0cF%Miq zKP`=Qwy^%k5cdWrM_{*72i7)_T#tN*m3vZY#(a&@Wl;0*7z`vE_JS_Z%R-g61(1Ze zm!ei2pxJz#H;#p0Yvx$X-YG9L`=Mf);}ty-S>Hs~EO}Yxikck_@6p+m01^nd8Gs8> 
zSr^lc7OxgJsG3!DIMfYUO^34<&n2<rp&e{Ds~h>xlty_6L3rb<2jR`&%w#5U795wU z?4KA6XOqIB$?%0Gk4mNH*Xw^=H(tszi6&CvGW1>IpIwX_LQZ#!S=*3JPk32h(_36r zjWMl}IMf$q2knC!`%n))B)Zp9{OG6>{zHcsw4fYlL1E`5E>IdM2qrO%RnRFjcEoE* z1fY3!jXexg1sP){(t#vfHa5g>z|{xGqT!*>t#5qG-CNb!TiBbLOXKfUG*z77sr+M3 zZlS*v)kD#fl<V=A(ujl|nGi+ImEN}0)y)j9vt)3fsBFI`xnOWE!ocKQUKm_EQ?Y3$ zW=b&E`dm6Qz$*$|g#(jus<bGgEy=hlUG11XtWh`C2N<8Cj-n)vI~0v^?ulL0Srfa7 z(tFjKRUU3P(=9?sY}tHdCTowy)0Nc@|7;!GSzbb23K5cYCa*8v!fF^kK)b+r0^L9| zV=B{6YiUP55h0TS4q#;gd9gzfTdozXOtnk2u0{k9HC7XQkD6eNFEDSi`y*sCNbeBV znlLYmyub3VNAJJU!ljZntA*n&dYvW|;H(w^D^dYY^+5TMNY6a|@C@(&YT+M?|H6Og zvBl!Q{c-W9#b-XYc;iLh%YR_yProqr^v`~`_>aX`p>Sj3<NV@{%fH|Xmp;s2<Rg6j z)ero7@dfybnsh?c1vP12G@)yIN@4A1@;glR!c+Zsw-bpq8*ub4|IuK{C2UF<f2kc5 zk?7U4zgPwx6+1mWE%<{esjdElc~D)@v_vCn8<7?j-vYLMP)CBzt&M-Cyv4<@KJcNh zzVQ0|lMj9F>wm|;@IC*~gR@IdzqI^ks!gvy`Rwywd+^$?J;?J$)!yPuV;_EE;kRG? z&JXBsLv2C3pu56m6SCCbqEZ+2v=k{&)Cj@Mu+FguqTuAULQvUZRIN6NRH(t$m+%On zJwrZO=uixS*n{byqY;E=f!G59=mwHNp(2Ca3<go8jwE7oEk0F&+OA*(VG~OVBZzt< zb5WnGMhT#hv+oNV8;)*V=&a_b>x)e9FaFLZtOz!Q8cu<odojeJSCov6b)74l{uP{a z1R<g87^BDaDorc2skTHZ*NX)W!`}Mwy|Uepqm5YKXm!DBL11|l)e(v)oABb$lWGb7 zv?+eK4NO$^$XdJ<H@>r`xvqPd71p=jQf&<O`(JT4B4(Q`v6tR#!6p8+)%Lo|^dk79 zQOu}uSrpsru;PYNVRi6Lp9*YkYq#b$GB(XZW3Z{_uN=*_)Oaf_UoeQ2&3g-@`>&uJ zv6lL49aT4p;`=vYd9+4~?J51VIM>e|En)N0@Nz;mP+N!>2N;5@8@WHUjQU_P0CO>V z_ikLjsCU+9eXX{^?G2g9F(b%34P-Q3s3S{$6#ub28+OY!uoQK+nwIR3tFr&%?3xy{ zVRvj{eXC*hRM^Z9*!*N0sv@wNPKtUYT-QS<86`rcZdZoM7x|PJ=Tq>55hYyF8u>`R zsD^NZS2$f=qq?()9bj&lk`xixqLla$Z)PFe$YS{MKXKI6I3nO0`@xpX<p;zhW^TAM zaohH6g?dN@^j^uII=Bo}YMAyd)5W6xI#ri>v5#^=RxLn%AdPhZxKGOBdmOo+(l1I| z3l>LHK+V=8hW0DMH?0R6bnCW;HZYN@rqHi#&OX*}1#Ms+O3h5Wpyyab$#9e0>m`g3 z>SXd>vJbkj+{BADC7>@_5EM(<JZ=xGSLkEEl;`@0*JL$lkNA8g2kh)hb4y!Ca)<~B z=su2Onbs!yXlAbr)l1{5!tR9~wBCK_(d>jmYa{#Ad~%%8z*aM5&27n4pINKpF|Zx8 zEyuW0k0i@YaG&So4EgjZ_DmqV?Z=+2-4&S>#Mm`uG;WRal{nN>2#Q*2rYZME6b14% zsd4O8>FPLz&kO>+sx(BdP(wuMC&3A^j`c&vk**MT6wyMlsY<p<{iuyeF`T?oEp~T} 
z{s&E>t-e53(?)IJFzP2uu)4~tWbu=m@|;bzJ2vA()a&q|^1Pq>bxsb_emJlnQSu=( z_G4Hm2ui~OQY6C+^*TI;nJ{SWpkaYZbL%z2|D$m#0{;(WccEDcy;M?mqrLc=Q5B4) z2x!kSayb@bq~R#F5H#sQHWLV0h0fG0(j6R7RUO9-^TWy+G0n=1SgA$uqpcWv(Lw`Y z&ISyCnVK4#8+@!4;5K#G%B)UV-B494p!k_vpX{nPQTS)GUOGd4t?j1L|FTb<Wwrl) z#IfHip+E@Vc#F9%Rzn=1P>-;N+7{z?vq*cf?t~7gz$UWrGz97-rM$1U=8Srq$A?hW zJ;3C|_Sz7tz_dg4`%YAKB|T73J77cFz~)s?D!4+dv%jIGW<;o-e~NC(_<vrm+IuvS zc(2a-HQrz&UZYp5tM(h2Kb;}qFj^?p>?1fw|F=@N5BLi?P4<sk>`H;QT!qKE+I_9y zS4<wG4b`Q0TptVViUZJDYo@X|)z-mWLv`7vmN-=h#jveJS<^(f%63Rx-DLTjzEuCD z%Fq`!dBt9@<gMGxjT`wY-M6(ftl!a&jyH5BSv{7-i3p!WOmQn~$K)BBt|@aw%S%Q# zZgeXE+auOrCdv>4BT!{6o7jGQQC5q=oUp?36D8X$RUWE?Ma7bBWP#2q8@ulomvrq+ zl~KKYf4N+}i97qVjFV@Cdy1326j?EwA&1Gbj948(AiX*l)k*<54O}bKD>hZnv;qn) z_EWyo)rdnZ!`d}+X9k5Wt`n+241L;IkMOF}d&5xzopO@VhM?*SMA(m#fU1?$iS``e z|4?WyRX!a~twQ4CTt9b!;)+ti!0ZdD`yd-MtMv|>I({rt{OhdO?|*px8_{bocw7Ak zqmG1dD!WkYYE1>-cS-kbbo602?{nJ!?p9HAP8W^a{3=~0H2=V~KOGM?s*e|cs?>Op z_g>MSh0P^_^L<`nIijRNQNyJPH#baAY!+yswY;!qrICUz3N=dz6W0nk0EUSh@}n%+ zX(==*t+6J7t?CeNOL?u$QA>{6WR<-^LhoW`3^QeN)XZ+>HbT#c%_O!+0zkET=D@gH z5yMS{G@CPu1=32Aj7Y^wsz9u<Mzu(lF<D2Fx*tXy#s+8cbGT1Ea>C}cbP4^5#Hs9z zEtCq3kF6>oX-{bc$KOVHi_spw(U1F7LS>BlBL+=6G=I-@>u}ts8~w9$YgLt2>Pk<s zJ64YS1Sk8*MjNn<X`1B(?*Z0O*5R2+G2^eqOkX$h-@HrLZHCzb&>6f|uqbA0QHo1$ z#cbnl2h3Ih!^?4!wdCR{m@NT49Tqa%8edxnAw1C`L6%T(2iw_?Qa3SI#DO@J(mDi# z6h+1oU@#KyZP>cisutez^EQ(|Tl{kF)h*iXQ?EGfrUv)pL&e|TuHh<mr4_ryI(O4L z@4DTl(Yuw5HSqT#Ye8qSIp|&|HrX~~z-l1Nu+9yNE&xFSQ^~CwFn{@3lg(u5{Jv&n zqP}=_lLlNYx?P6)=-)bo>tkCo;BsA*=DwobLGCAORh34C!UQwpb;_fV%G=HIXnwdO z0<)i1qV@1Jnf)-SxgLhuGX|GtipLr&=rG)^q?la3YKRf5oE;#VTcw&+Kn-=^0DWK- zvy6GIW_X%*bHbYE)HQlMVz<g@x;2DDHprXT4epPx(O&${7X4*SiICz~cj`Mdi#5v4 zK|fEX0U$Q(vr$c!vgdXj1%+7mSQCbuF(1i}g9CzyGA%)20GZlSnRXT%IDp?vCFNOS zQq9eF;p`Z&6r#%65f!9)d}LJxcMSVkQNbnlo>{9DLitvV0>|D>xN1%so5sP?=Vj6{ zL8}<fiZP0NkpC(j+hQ#WHgN3no@0)6t;Xe69ai<&8lB7=Z*0nOH>?G3`V=brFmYYP zxWn8xN-AF@yR_=|<I?Xz>u!o|WWJu$!_6E_?N?MQ42Cq>!;1@o*mJEAQjR2`+ttX& 
z%dLA0HFXM|$^n(@r0L6jil^rGk*DS+;HgKji)!FxU(?%tZ3zaVh@vzg4-5g^g3a>b zcq(-E;74-KJ%Ch5xqUdS*kINK$uv_P_SWpNTR+FtL!oG+GVHU$9AOcD*%k;}qMFp1 zt+}dpJ+_FSZPQuH=UhEFC?~L&|HR4T<g0|XW)hY(bQ`DqtXn9gm7r`D3NAet#63Wr z%yTX)cEq6PS|}yPMz&m&rr~n-q&RI^L1^HIRlAA_1}TRR_9f+mIr7be$TUD9T<B7K zvrABR6>j#L!Rl(8BsMhZAa)gM-EhS&tDHz<Ob@#|IftztYC%=51c;<=Zy!n7ij9OL z$k*8&YzEVX@#u;PJZd;}sgdMC!l}elqXwngmOXOJ-`e19v>qHbIX$+3&XaVuwAQ!O z$1YmD0ZXf<-q%tLZO%Vm5+>BvyFOsBnLS#&GcjOj_^zSR6i}NRj5p}!vK1d?_v9#g zt8PT$KiIIR3<;M)Zz<Z2ZL+(Z?`i_7xsX+Sv5S#xb@jj(_h4h$9D1qi3?#E0jg7FK zqW*KCsRcjMW+s4Dle+K*LS)4)a68%JW_r|X-Eo$ixwpPSZ*<4Oz50sgQLJ@0rb9lh zOjSvJa2?ji283L8k`FCU_=`;fv461R4UikOwW$)CTHV&BzHJBI09D;U2l>bV&=Js~ zT%KPL*enUt1hgdByw8ADnQU}Hc$=C>aF0YE+#_OQ<5w0&liftP(CS6ss?o0~#TH-R zlx;7sG;5o0){q(Zg?5DDX{I=tlT(OJM8XP!ba@1qa?rU+gIHgGbOL<2D=W;4(}LJ{ zO{xQ?rjBkhxw^eTmh>?W-98#kglbrc;GzT~pfacueB>_DmSeO6vYQPvx`VLn8Ei)# z788uBG_+q2(h5<>4t=*(asqHbDE(Hf_#o~^QWy54dX3dU8Q@mrOb)9-Ska}5fUyEM z&5XkirK(1V9zw1PG^<v~S5~~ktB#e?CvO>&!rf1udu}dnG^Km@8Vp|#7`|pO%%lSG z2fq1<`QLf@)W98hF*1u6FO4m)xc1{W-deo>+4u!McjC7`c!+;%Y){yzGbD`~f{z(& zvkS%e^WT1TDR$voFQ{f88Csd)7{C=C*s?g`WYrJCzV~wASTxZdwCK4U2sZ~?OcVta zHH%@qAv?&@<0-n5t^ztuFttxVus-#SXQaLn+pxkjQrZT9;=$S}_2Oe~y|fK&oml%x zti2zV6Uo(tbp~Lpnbux1(W*mAEk2I#y*61$5cf1EwG3jL042*><^D0e2TB>6T;E56 zQJBJ(n$aAn-kSzo|0@<g*{H*Y(B||;Zhn~*mFdJl#Xzs)pD)*<V$WPk%|a<Nn`*Qe z&gD)#5abhc(8y#z_nk5tUzv4UoURv_H-=yMy+`<`&J61eiCT@O?!}?v#~xWa>rL@b zpUnYdR(kE>`5}WbTw@>7>1Mi|jUQ6>!~}fF2*Xt#<8_?8hxSB~-jffIUnEnE>P5&d zl3%P1uLOzl6hqX=UK0Asd5^$FrWo393Ur7fa5me5o>0a%tQ-u(#A==!?!!ME9j3=9 zCb5X6TJla|i!kPdp?E*~ZZ_%gA~r`$b1S08R_?r!d<`L)qmau3Kt&uT92Y0e(j?>X z6t#E`;AtaKMg_A1+L-P5-dbpu8`rnwhHE1_Z@jfrpZvuS!3jintVQ0TR4NN8#V7Zt zyzog3F_GHh0rJUpu3Zsw)!`JOlEeWthK^E2Q|gB21-fak0)%4+Xkkm2t_=M5a8jDY zzml1xe<cIBJ5A3?-87ROg<LIC0x?cKJjw3G77{cG$QhbHWjAYCH9%f0W~}0otT<_t zb7RUKI|X?N0<_Oq<$Z@Sy#jcEO%6;i&{L!o1Zsh=*%V-O+YYaKVVP=@i6=6q$!~Ac z_t(3Ey>oqAp@4XFeyDw3M>XpM+!r{xf#QHdekxB;K@jsOZkU0V3W9zL3?nHTP?g#S 
zsTDXch^}jtW5^v0bJ>A;WZA`X4VMp_56x>}oyz#?UNh}yFP)xHRwe$Y{mjd099ilB z={%vHdaVUjLrBlmug&bN9Jk8GNnE$G;(i)qLal_x^$(a`cuCqvQ;#a86uVsJ=FR;2 z=ONx%%aw7v24a))NWERB9xE5URi8<O);b+;@?qa&v(X7Y6Tgo?y;Ax91*e%gyYA_D zX+0md2faqicsTY-?kGDynXgg}{5ba}C!1-HS`}ZNZ#BbL_ba}7h<w#WPAR0MMmQxV zfwBUP=VJ?rN?f#Jt`&}DWkao2k^%rcupJ(^n3QJlM*9ryP&<`5W-Qd4DU2#{c$AtD zxlyQ3$(d0+av6-4ld_)_Bm%A`NOgd5;fmJmt8EpIAv+gmZ23g0=MWw_sdXNEivhI# zR(Wpm80}=EO}0)Rr6(5@l=tF$mhJWM+I98K9k1zCMR#7wiy+utao(y@6(#r9_1w<y z_GbF120!KMkMe3&#qnENe}DJaZsB$+l_q3~s#Gbe%{TXT*EDBG8;Um_$z?x$8g==3 z!bsR@j}<{Md6N8aw!#ks<cGNwQ4dKG(2nPIle0uyeozqItluWbSOOYD{FU7?+Br9M zJz~n1)+<|@si!wG#Ab!=6FSI+ImH_>I3&fuw@1TY8F-_;wI)Y9T5l7_vf^H=TsuW= zdvVSt&SP`EL@+g9@5W2WiB_uiJ9a|V+)52!b$uId_%s#rZ7vk@AKbYcQXX63hd=mV z(GT;+nkEunfGd7hXEOWLJN3o2S_Hf3i~N=6ObsSW>|Zn^lV-=Zd~v0f@Xkkf*B7bB za66ym<S_Z-m~v-4M!tAb@kJs(l@sKP2U7&ql=|k@O_k3IVh6)A$%E{49L&HqI|j)E zJ4j@HaGR+z5Ke`F;?Gk7`q!qQ+yw=BMp$)P@#j0>&x14@gYvS&DyRFHz7BOJsO-Fg zQY+`mQ&wpLbx$#16Ax#_MXTI={3QANX`6UE{GBLv<l6&O50H)#6_OTRX7MBjZS*Kt zQw&(MM^Vb7cYHEmtiXr+m=feZ{x`M}9M<kye0t}Bq8ukEx4y^&(@Q9Ta&CQ^<<=*Z z^ZIr=uQ!!j7pC^NQcnY=aCVg6wOg6?`J{B5GHu2tLz%XC+`^{JmB~?go`ltN^QT~~ z6Z3fFCidanzZ({{jcoi*@Wa@^?6l{0mKCH<!vaqcE1`Vy{GDWBl#VZ6peIVlk(Y~8 zbjDELpkNs`*%VQ4DmmX=y0E2eeS@@N+nM_&QwW?-N-lp6eU&@7ca#~v!nhK*sv9fv z5{0qT%$RQKtE4ER9cHYsQvW<<`R5qa48ni%FoRGGXJHPV>^e@c6o>|_D$b+QRcKM7 z(o72(&<te3#TJDG<#r`1oret#1D;0HRch-My*-l2%@YBMG=uvb7`MO)MfbO3HraRE zX|jpit>PVsKi!}dlJ96CNX9ZV-s#OXV0N66s4s5Vj^Y3F)&r6Dt#xp1#W!~^77<V0 zU3`y%_8<fj)!xqZ_;7c?VMh1-vlNS{eM%QR8pDH1`?OG)V>WUN^1QnV5+WrKM}mw+ zF-XjAU{M4RU%Ze6yUh3HJ~r4+h10xtsSp3z)ocXfl7)(NC1yAIAiMe>JPkveJV<e8 z66Ds>J+PSsApH4ydj~0@z;HgRH#vM=J%jAN*k=<jg3Si@QyASde)b#-Gd*WuPN&%k zb?O2!<p78$2amFV1L>SD;=!1vQWCmWuq>0ElG}Ai3@ST4j@^thXV$Q*6@vKtwvifX z)?xX3cX4;pB#Jla5k0S?9??PWo1EN8=bR!wDo;^>!L^iX2SExbqbcguO6|kdRd!%8 z01*U0OYUr<K+y@)k(*eQX$ruKKwef&BMqA>$)d@97W$3896kmd5>cZllx7Tg(Mh1f z5G%BY(IcWlJDPcYR;hOsKUr*4CIlafX?j0d!~xoW^mzf#mk(M{B3@PA-ErVhO_d>< 
zs!we_Ml4r+b(6ai6Q~}~`b|f<UsTuQ=U6=+MIVX@^Ar=-tH(nNi|BD#I&y5&mHCCF zG=*@`#_GjXnHQ(nph{sxIW9)hw5KDaVwoNxzfbqW?~lXpH>%!K<#)=K7ZOQQxkC9K z9bF=M+-H?%7x1Q1zTeAAepJ4v&P&E7j;z|{-Mra6U9RdUw&A<W`+ds_|0a_I6{iAs z=Sw=m{p|;LC-YUwg)nbUvLWG88CUKa*h}?(oxouJo@-JqT(j0gn^J4Rrg&=U(Synn zqmQWtEt%MW6>p{L9Zg7A768^7isq~PR=gAJ`<aGezKbt6tuTiU-4``lTKqRx3IP1j z#N+5q%`bk<XncIa;wk=*P`0PrXt=aEaoK1*;%WPl*;e=W7M-)B)o8rz5H1<{+L;5< zf3sLLx@AD$JI__mrSqChpTK{4AZt?_yOnm9wqvY#(yemxulU<twW@NLOP87{PI!CT zLsXA*7sDD0!=%mw1ua`R#5&+oDAuu(_#Rdgm%DN^SW7C)NGPdbjU2cvWl@U8Dm6G5 z7qbGoQ-d|V#L7$^BlL{a)vKgp1xu{dFxINKy1l1YsdfA}pEdEgNGmWr|DlJXnN-9W zGEW2(es}NxN8G!IH+GhJg6HUF`&hOm*_KMa$+9fVvMkA#Y|HY!d@I}Ka=Dx;*EMy? z4G1I<(jl9!>CI%)nZSf}HXE7_X_|pdI+NjH*mI6lAx}Dh8JebfvRpzIvn<_{C!MCd zVbieCG~J|Js{Os+IkL{t#g<*!S^ki9F%|23f8Tq5fA2rE`J>Uw%C+OGe^go78Eh~L zR-@rXOO4fPdDS&GfeMo^bcY}eo^n6@_aXE7Ttxlb^`}DK=5fpDZKvih8=B{a4=7b7 zAy#Rusq;0rR$wD=T`gZz!~dY)uT-J%nQ5gz^Yd%pxr5dI4A3=Cv-;}_hh6ttK7($u z3yQZPN5}H(uV~xK(HdA8eTUT!1Zo%#a`Xn&hDG!SGUyF(-8som`K!VF`+dWJ{7sex z;gx-(ar<uPlO*g1j4BpcZFL1+zg$aL#KKo#QLLiLD6IG(76m_fcCf)T=FmH_z;KBZ z<~Fkg5p;sA5nyZ*1`;ROfXbQ7S?qWrbr^x%1S}<xtLd}|+$e4hk7i5LKe-XiZXap; zyMR}wCn@)Hp4-&(%?uM>k1D&aCzp4qi-K4jqd7!7)51KN7F7Q=3^eZ&{S2{<_d=z} zW8>5v!^&V7W^;E^47JTfGZ@50m_wZtjR$4aIg9e3>_ulAYc=^9&w&KZ@1NWEWhq|# z%%`*2&&|o{amv$N;69HIC%QX+MUR)r$_&vpT2odmj~~T`G^qEBQPH)T&h6+!V~82P zs5+8pvPo5Vr4A%sg3PF0G~Ey$8KwCRDp(^0Vgf-tyiQiWOc{=@DrJ+nA2M4wkN2yG z=rR_oOXhFH<8Sy*zMP}|*KET3ha0(|rE(v?{mUD?uvx!`avm4C|D?`D6VawOmfuX8 z-#<D+*BP1H1XSX|6;LrkmRS56*_7IFbK6+oPMZ%mH_}EvPlOf<V9-3*Tq<F?tJD^O zWSXsAxPgd9(_F>MQECze`!3C$9?GRCb|CLph8Ky02Zbs<8Pr~`p5cuh+53fk-@=k9 zimQfq(VwCm#YOISIJuqfAr`KZKb4>hhh!9Ov!7T!10n5_i+}`;n`-wj6Ob9U3+qs@ zD4u3a7LiHX3mg9pyJm)0ibVFTfw$9+2Hv-B=vtKs@!#pV!D|H{*LQu(;(*^)1MZId z)p&t8m06H)V(}p;#pMt#QQFF0N`CAB<3*5ppb3OWkk`)By=iyj-tac55|@RVvW6<@ z-68CVlF*;XNv1}~r<Gc*XlCB(NZTF0&KRx`d=0cvjbh*zomBy!g(iQ&4C?oGR0IYG zCgBTH$5oW?8t2gl=k>c98|sue-5KW}j8A-QGI;mH_ZW}gd9QywWX)`Ct&e`oRGS!& 
z4UQyirf-cJEPSQOHaI>I51XFaGDds-!^W#;c>e0Gk3A&3`k2XN3e*}*rckf(S~zGn z8EOM~OL?z1wB?H?;NS9G68=J~`IFot_mVP6EeeYFUtw<oTG>>UI2TX%1ii$GZ)l9i ziKal}MzAnVVNVlh;#BIr1efg~4sa<d1~x$iEV;0cEg6$t0Xi^O3p7!{MNb2?e%NJ` z8jxNajY%f_8W)pI4fL7GfM07c(bERRYtmtyx{LZ`Jw%k&U{BY!CZ=b|@WA9)QFjLs z7$0x+;iU5jz`XR9s*fUA4~^6I4aNsq-ytKmvb6Hv+{+3}qw%3fAB|rG4IK(iUl4?g z@ys)LS;=1U)feLrKWquR6RzqP-`iZt-=RK#Uo)g|es1ecy(Tew3g4M)o<8U?m|R`8 zJ&kLREbCq|S)4<C)(RdhH*rB(C>w=N&zqq|tb|g|ZLy++!Z9*N9xb$D)uv=ad#p0V zCK)k~+Kl6(NwpMym1An0rrG5xn~^*$EA@ZZ5HM*v65W;t`vB*6S-3`O^+N+mm(TF@ zbKlGS(>LDXpWy}XuQG3Z?Y2jszirL9`gc#gk@+?6;H|&m|N1?{4`%Lq>nA^#U%fxg z{_b_mIKBW39RP}CGyUC;3hpBQ?rLPwc+njdD>tP&h*_Bul?;_Xm74IE(kN(}t2SE? zus^jLlc2Jpl6XAvCpzu)7j_NW*k8z+5$P`s7{2k=um1emcldjFL;D+<U!~98^32kl zasKhlzkEG&{g=PuCk@<3d!Nm8W`6PNcRu&kd%i>KwxAEV>wj;0-W<XdW|BLu?tMro zr_0tMS9^T%c$~rw${f`-1(-dBGU(q__j91%o?zK;&fa_@4LNa*Y`;-oe-#hQ$F6c8 zU1h9a1I0oWkG~mDmG={PpE9tx<@C+E{3dumu_1#$^F-o(4Tgt~^Ma{%C28end7*Y~ z)M|Ki{W+6scEFwa%KV=`5I@fIzmxe5KM}u+KV|ejP>Z$<gCSt^I=$CM4d3uNeA#zG zQ(Iq`rTOH4bS#`o@b72e-<9#1IK)|1+ziE8R<cp*m!;8XaWI~4Vs}{X>O*T(EM4KS z(w5Z<b|AbsODQtd4V3FRI*B@5Y3jS+qzbZ8bxElS5|lz1AOal=gVZG7VKFNnU1uCe z5kQ_qJ*60(1|u?T=7$p7hornqi*pSXpX_Y*FFDqi{&(KC{CIqc=k67Ji0=F24W@IW zFD#o}r)u7OtCnA{YA`*p@VLe0A8%?*{AeNbOsjV)^ymM%@e7BBFMicF=vg<M8V%M3 zq5Uj=8y{|HzV`nZDw`Vy>jD<ih&DbR4c?bAe2(Tz=Wz!g0G4gQ$vl@)c>TndI9p;t z9Dfv@V)z50C7#1_wBUIUPEZ<1LN$UgH*KkypyiYn#>~%)paJo3N(XUK585OFzO+(0 zkl7kzPN>Bpq6$vit-qH3<w>qFB-EmuF1BL1r0ut2{YJ_Wt;c4`F)#s+wbqAH9`-m& zk1{^eY?;J+*fL4&q-{wq?P&8vM`_t)gx5T08z%)Ygj@khb<c!CaWR3pge!KJ+t0{F zK)45S8nb%BQW2oOBwq(A;K+_aQ6yq)1$$J*@xTPQ<G^6$gj%hSIb%a{UKuK3YNek4 zbz=5|w^WbLx15jizt<963iP|Z4&U+Fd;02EKJ~t@ym<U$Pc9FinvU4a7T$2FcGi2S zCi5vFaLf9g&)jj>qaQYZ^6cpJ<ATq6Di~U64qg4_{V#aJ&Wmf`S$*dB-}sA1U%e~d zdFb}p2D`z`TLdAq5_<kmU;i)if9I!x`>Z^l=Ir37I>h~wlYNk(*cz9J$e5lJPb0}Q zL^2#4fea_%ehblb6_cN6!jiiRu9+IpKv~hXDNRF?Sonz5gK1?ssBEE5B+dfUMWhva zNsimZwhgHdl5E&W)%P_(lH<0NXS@xPY-f*S^eE#ajU@3NNfJ~%9!Rp@)803xN;3DH 
zNzwupiZTHwO~MVi;yBRM#mF%bDYF|%s-AR?3%0N#V%)C;e|Yh{i=P~=oIV?}&+^aj zLxxY!uls|i1fOAWd9*os^VMIJFGQ1iC!fWg^l%H@3BF#LxD!Kh=_W|ApQMQ1@BCYt zB#F)}K%z(3tvr&um0+4ictuLA!bvML)@cGyT7_g+OWo0v`I0?sqXqO6kZi|B$~(~k z$sV?)!iRey*$8`FqDL7YX(WsHNV1V6I)J&fHyoK*QYD*v&Lq1!09|T9V<;ybLAUpn z_O9-MMJf+S6*OXwc|Gj1G!n}m64<Iq*Ire#l96@Mcyyp60D7#w-`o)0Tw!#+6*t#m z33&&rj~d#9ijVID0QyV3X02@w`ZM3M7~Q;Z?Nr%d$YS9$xX<nIHr9!?JVrOVH!d9q z7U((kmTqGKu0HOrJVIEQT?7^|+LsopRtO7N{7e);Z85n@AS_g&EWo<JvCrTb)&|!d zBi^jSOO9r4{Uu+vVF|D_2`KnBQXP{%Kw%k+fS1F7LX<rorbihcX`q1j2nta`1DEcI zL?;icP{=(eC~(pW2r9351Hr*1Ho#dW001or&GnZY>Q=qv=B@yxId4ez)420fxbo*h zwpre0;Ys?d29IEQcpuV#a!z~w&#hVOG~z#3w)nw+quA$Xu{CcH_W3yJ0z1jST1jfr zbMk9Q{zGJ;(f39cnm9HjUpM3*hWy9I!$Xii%72hQFJgUk1N#1P`_b$_1--6L^-Zy< zzA2i8U(XkL$R=7hq&A4$u#p0PBSapurD{S}h`g3P_Ryn@k2E63dnEE&H2*{7d`)dz zk1F!qb9Vcu&>}94(Hh32U5X=#jU{hzvaWEJ$_7R)BaKmo06od$Y#rP*V~ZZLi%X;n z_H)-MB+%*)KfO^w`Gax3e#kh#cd2JSqfsA^1pJmy8){o;1^&@;<!$omEWZ_OSJup1 zLX&}?2onFKf`w<G3z%Kt{&g3-py~p>T@b36T~Ks^#uB9JRtJjS{{O*-pg{c#7~d9~ zl-n=WHL_rd8Ts7(*UN5Ss()2`{r?#0i)TzEyl?Ft2IjoWUj;GNWd2qW<JEsG7Ytc> zTL#s;zs@g}zWciSj@1OTS(9}AG@JB=Mi7^yNS>w-RdCaIaWtAf&YFtEMPx8A2egYm zpcR{N_0mg9FIa&=`2Sct^ub>j*OKCteX7bGw#UcDj~+WtNf8VoU_NNkj>Ihn7}zq; zJhd^6-JaMoE*)A{{)yxJR~bf~%7Dd5Af}^W59<H|Fh9=yD2*Q`X79cgE=S9GKAIZ! zhy4$Qp9qY0hC?45{e$u~hw%qz#xya*;}>hi^XG;1MT7*~IEr&L^}ocO<d>DXTAJFV zorP@17IhS|A){nB=9XYLjv^|+*s@!}QA0FsW1a=ESPO9q5eP?>(R7W1BLi@RwvwSu zX$^J-8?ilri?XpncGCDK=X4+qhCT8Ie4vDc)F+~MERNX3$c8ipE7Q4=@=bNZ${ew! 
zqDLaIGJWiEnI2_)q_Hx1kE~1|`4(K-7wrS_2mO!NJZDyB4P_azns)Fhd~Z&NToY$q zCI<Tz&g^btY%a7%=P(S=QIvD8baeL&9j5Y*WFruIpogg45&uXHxufKW>OoKub0Gl9 z>8AY{C&QhiU0NhOe(oZF-ZeSKyptLJ6Z=p~uV*3gNH#JVIy}~tym>2GwnAd@rRn;X z`4RK0h)tT%MfUF%{%vtIP88d6$M7`eP-95nsX08w8!=KoZ^a%YClnouNR_24(1(Us zNVhPT1bt<~W|2$*aq<|+{qsT~6bMN5Xc(YSpbwHj`xktRBEbOfHTsN}HlwkYZ!>1I zq8PG9KL+Hbe=jy+=?WOVl&4Clmy!JcI{wa$n%9g5OKtu443!>7ti_F4-@83e3&P47 zKmS@7fuYqm?g^tGW9>e4YFGn}UYoPZVldsGS@kzpnqSr3$*l(r;cGv)2o@8cIp5%E zv<e2J&|hhy<r0>GQ>LZh)erc?PLnzNrV+{PN-M_1`LlHmnzR2l?0+-2MrYqcp5Pes z@nm0ehR&Y4fn{2Brbh8H+s$y)(L<+wR5|-9fxS`!eqOwZrMe2;er&jxSiIehvnL;1 zYJl~{0D{<$vybEK4eadm0`C1CeUydVV^vDX?YrT_|8^eC@4unbH(mb*aQ|cTYb+)Y zqVDx+g#%Pvic6sPl=}}IqticF6!|x|57X(Rb5}t=y+TSx1-~vZM>(wMJ)qJh-2-_z zNfboNU67kQd{oE8ZJ2~ahz7T|o+Tr7mI_%AZE);EaJ(-W`8Dn>i;jg=ijKd0!?OB@ z&h6cfl?`s;+TY!f*f`Vmry;jz&2(ql!NYxtdsNx_B_39IBw`C|B)4hEE!8~DfxSp| zn%T8E?@eNvTlzc`oK!V!Yu|{Lh@vqtgnV(~CBiU|a(UzyUndC**JldXrk<IJT%IaF zgw;t%yXs{p1?ITOUqIq2YbWfvD7NBx{{3IGk~*uY|1PPs{?-HNxPv}}=2G<84C%88 zPZ(Mb`b-d@&)V(I$_k_DBblTArb_eIvR6vUX>H4i<urbFLxM70e;smq()?ZAuQvF* z7r2iqywl=hoRRSL&113wYm$rJuXCOR-7ie(DEEuK%32n*dlbD<Xzk!63sLgBlVXKk zc6q@qK?O~50(WVO?h-5w32MhcvFKG|^|ze_71vy)T%`CKZj$x;%;Z2uZS^E0fr*H( zS~=6Hu{n^@vHM_&tlbS>o2yc_IZajOuj{UoE`r>;?}kK#yYwR_qgPq{NjmVUk~%Qe zd@%nmK?!21s;UH~s)L%l#QHdj-X2nWWP;@35k>6@;1obG2);InbRVjs+p5pNLc{u< zLqpxQ(pF1%lGgEp*wA7{Z1}>vATIrts?<jOH|j8$u0I8NebM|a+$lx({kPJ(uV>m0 z-G`=F^6IhC2~k6~+LMMBr0b<<K{#&gm2$mM>rA&P0YE(h04Tv?cTF_fq<VP$4NCM# z^NnQ7Bq#AK*4t%!h_)wVU6E3UA+jCPmC`gd;I<I^ejA3?#fF00m>+{BV$S_}eoqyR z!h8`Hg{>AEICd?c&T7uAC#cb!I#1B_<1<vz>9C%ricZhtm+Dw_#$Vd{s~fwLxRd0U z$mVad*e=A~LYxq6G5xZ#m_ELE6Y0$vvTr>zbSKS<jD?tNBN?N*NBQC=#C>Y4m+rVQ z;)cAz_ycvC7vkMY%T0ET>u!UDoupc^>(BsBZA4bRkzrQ!rYK7Gbb?(G#C(XPu(KG? zYN%3Q5KTdx6?xZrf8ynrw<Bv`ICS^uADq+P=yaic%kcEKwJtpUJGLeIh(7Uq?baI; zuz@>s{cXb=tUf!%{VgXWJFaXB)0Sfyb03j<_jOUA-G?!abYirr3tr|#JdLXVS;%n! 
zC2mVEUF57~1T(O2Gaaks=J9dNCfd<|OuGQZ)~IC1k64M)5MAd8BpgIr!!&*h2*akT zEENsy-mbm@q6<c6ojjNzjV%QcnoCZ0y+<MXl-+rtm3|{(m%Qz`{(TO~N1Lch*I?1& zN~BpXg|c}lF{rlQvkg@NBwrdE;zWf@VF+GX_-Av*_42*PAAGQte^`9!nf2SAdirzs zJ{xpL1EKSuezw2AVsyOnq_BQ5zutPw+Fhpv?`8hUJLezqnZ`f5`iEzdLEp@&r(QOj zLs9ERgktEgP~<{&)`Q$%adI=s1r#1^pn#T$%l*^_#Px?Ei{Y{60Aw*4PY)}y7>pG! zxPpuin;PPZMB+BlK@w?$DB7Zu13$nRSF{XEJJD!kU=JiRgP(N>&Lr|@q$on>sj4cW zG0;CatPI`tfV^yk)a@8bs40ib-A3}jq*aF(^5}O+owQ0Luhx=tRQD_nSsU3sSdKvQ zi>DWNm55Q5$ER6*c9gGF_k#H#2<=T_J1|l2mf74AH#rLF%r6De=_MYC6ALljlo`4y zqordskm!Jq_@E-CgEq09q|^)X^hTw2{3zA-K=+2|&4(c+V$f;Fy<rtkiTV?yC>9Jz z<0yJeRn<FvgOfA!OGgyR$o^(fUqNC^D2%4s{Q)*hId7Nxuo+35bVxDMq5CS6@_W5p zT<&sdOQe~*$`R}n%nquZ4LxdA>@O__6h*hbyPJ@+GN$@RJ6Ko#Ny&e(9eYg1YUD;e z-_jIhJVkVJ+!Tm!(uWTBpa3SWumm{Ccs7=<62?fzZE>~pL01$7HA{RGOFYxlikPQu zqL;)RhJ3?OBIUtl+z$8R^dF*r2x^p=fM}BtG10P2fO}!8x=Ltpb@ufS9Xd>;rbJTL z)dHGf2=p)tohffucbMKkWS9DZ0ddM9#nDs5QZCuDr&Py+k^JrkZCsC{_nxIGXzwPf zo+x@ML4D?)a^(5A-rDH}`r2*+WjQTHkJ7zu19vp4+ufu|Y3xONZwE-Gw48fl07pGF zfg!{6++z0t<hLA8&nfqIn(pn0*1}<82JLK9iU_A{qK8Bng6u+33A}`$pDo^?L^pK| z;!8k`G-%@?+6?MVkOT@0Be-_b7~NFC)!foCFfl#1xJ(jE*i-h-hC_JDNsCw-lJazR zh3JDhyA-9V)(MB4=qKV@$ua<s$4!x^h|dvqQfmXTb?zE*?p4Sc&5g~)nwcM!BVhf7 zt<hDo&h0_IWG6jfEJHsMZP=E!^|E?r^?G@#q+YHPNH41wmeHB*sEI8xP^+XB{v`CF z%k8Xjshut=lr(fPdLc+1i;Xd+k;kEjr-Uk@88w8#oceY6cT~UfFzH`l13M`sHyaB< zqr!HnlF9)&^O-GGR&*o=>xcF#OFl2d^XY)E;!~!&R(_^B^GaC))!*2vh?CVO`DUsy zU)|Nv8Ytc{ZL$3SG2W(bNgyV0)|7&R6752yEDiM_F~(L#*Qzx1=86GqN;!SZ7jvS= z`WSxGN2CE$KJeY5kI+dn&bAYls0`2*#cVk(%oA5*@^B~p|A#?HEOy$YHb}N1mTqS8 zrr501E)OZM4k--dAte^2`ob`7vK(VWS=c{Fg@r?4p{9ZJAn_J*k`I#za&r%u43pw9 zJ8rFHp*gNY3#7!QDR7TU0yuh#++4Q(D(14FHP+G*h!{d>k&k<0^+RZk@A7%^KsAXh z6RFu7!~SaWbm3m3leE#UNx2#;e=ljGX-ziMX<Iw~@*?m0`eMNAjz4!xX5snUv0Hp< z3cJPmDuXZ>t+cgzI|A$P|DmrduMlzVK>t}mxIE;4LJ$&t{*2=kZxtGzO`8k`lp;)H zm$$c#S6b%ZvRN+Q77t`F9BQC^g6W%(y`5Vl+4s{;4nTNRKolc!X$a3L1J3Bg83@Ml zPJS_haU-YKI^!<Leht6Sxkj?<1pgo!m0TqIY;!`1$^^N)w73Z?S;b=@miN>DKRF25 
z`&r!)*E!vyNV~;GG`{Ht<rQYvd+@4Pkv-KnlJJk=3B)HDW163pHWrp3{n@05y`^YW zk~-0jFSmr)N|`k~bs`9u2$L5Umq_{tslH5u>G=uW8x?)6CYNiUHT-5<`rz?}^#5#0 zBmG0q-<I7h&sQ3%wbBomU(My1sNlC>@pH!OU%`yYEX&b)xQo0`-Sa><{vaU$g{WCP zCpuc`9G<Mulf!sYaQkVxHGMI5&<_M$#4p4yk_2NEw_;!<cl*({$Z~_Fl1ju9OibDr zV)7yK!KMz)00fAV!cYLdSh`0MJ6J8zv5;O@Uad35kseV%KpDnEPyp%r=7Y=3J;OSB zR8!VZL9S>y7`BUJ06`48UG7ODZc2(5?P51n0II|C`box)2qWH{l#U)IMS!VWi9^>D z>ryO89_Kr3Xf7VQ0sY`1+pe}6XG=0@Su=c^(9&N8U9nfR=o84g9DnA~eJKtj)nF`M zrZ?Kb(AL7qQ=~aaTTByf#+ByeRf5aSc#{EW8GDROt2d(~N$EyrbxgLSAblx5jnsy8 z3BMY@L{g7akS$bo<x~gkI?F~DODdE0Km$74zjrYvFEEdC0Znh>0LyELow2lH9f*Jt zEeh$A3UE#`t&<*AfJ0f#1!#{K;oY}PAE7lc*x`lRj&$-gqz@)GOh}CbI&qo>4YB++ zQ6dGQKggb(JawADgw@gLY&21K%1RL}l2;bGQKgdSK4=R>hED)93Vi~MOcDE1S$Xf| z<h_-Z`5LCn{1v@ZaZKCnnt5zrprP2{`YXVVpZOYX+y&ml$wWBLMhgi*)O0Wd3@|lw z7Eg{4fR^y2K;LwdzPZpp)d|>Kz%TS)AlUSWn7--E=^Hj2TuRx1utthO*_@Bb^D+AW z&rbt0NRstJ*>uFVm(18Cbj+nsD402+P<Sd0mJ*E<W8;G(-Ox03m<9JSJC&PGoP_wn zjSU9mQ6Q!tv4I=}pFHpZrmO&z4aJG>6YD1lWh`WnR$!$h0o+GrgbmEqyg^mds!=rv zBP#ybz}!y5Pq$$MeN!AZa7zH20=MKVx@-Ui7u(&A`!!O=>yH65wM^HLje@RWyNGtR zQ7O~Wb9uUE2O}j^F(Z|AQ&6-orD)zQ3#BNU#+JR=C;>Jq)dV92*f8@%Bs-|CXiH^< zp^4G~d!wqxqDw8Xj!ITFGrakGBd|?M<BQ1-KX!?Cetjv>?v6coEBsikCnF5@kss>| ztl#-VU%>qJoF_v=6}oEKei1{t++@aWeA@gJauAc;hT_+<gs$wx)R-<0!N{7&;^}T> z+wO2o?6##x*vJ?>7nXa-#w}B9#DbCy_81wGqLWY*#dKDOq-G=<4njGEC@V0Q6a{+< z`SfnO?M8=uKq*Z()2?c02QMyu8<M+VjZNDdnjAeO-5v7h&cuW<9&;nsq>0<WkekQg zV@Cx>3~2*JK#4wRMAYRp+?%`Qe`_%IjI5nYyirm8oU?Uc{q{dG7=&i$8EfCVweqvK z*s5&ir*t>?+ENW49vF6KUb%K9@m2p?aISNif4?Dq|K&f)xP-aZkS*~c{x|F4N}sDC zVDl?^bmi<RKQKi+$`nS*E%eYqXHRXd@DI{K<-C(IF=$JVvhyyUK}Ssk!wi&+b;O8_ ztYo^}kJE0!sZ#-?7tS3_wOB_Bgd{)i+LbE7-Qf$;`M20v?^%p?z<Q|U9JHq$jm`Cx z*U8V4=equkdpYzqO*QIEyPtYqdi>hl-j3a<;{7u$SH8miNb$#5Gnw3%k9bCqbeL6< z0?3;liirtZdNGUm#hmSAzB&?aGn>)~j=w!7j_d&XIF6mF70qFcRa1&>WCk)gOvsll z71Y%v#hF(g=<83A7*MrvBJIXWY*HZH4?!gC>2_bx%hrGGFZsO(A@SeUq|vgjHoBiO zmfZ#Rr0$yNhCwp&-h{_;rnx)Qj-1--91FkX5ZxL~|Ed~G&8#tsNaRRSc_R5O#Kd8n 
zx+?-MGj;=KqiyCgqqH%*DUCvY-WV3=Np;ml=7Gq89)QT8+oYBV<VVGf(OJk3z1S8z zty>*nZgF>ijEcTu2o)zJwmIn_<xwNF*JIe8_ICDkjL;?0j`x(A=iZCp-q0B$ZI0Wt zyXYQcu5ERPGRK!4<WWi%nV0r0J0;(Jg!^a3KVeBxIl3)7b<CFPx@a(@U0o!*`ItCl zBT8XRY{S|t<nA6Lf1BQ<vV-Ws4uKsmUJSBA;TGvJImULi#G2v7LaVl75FP&!)P9fR zSCS}PV75lNj;t{XP0>S1F=kJ>x}vn}SqftKmL*6gqmXJZRW7g#!qd^+PC}O~2y#j! zxY;=V_UsLsII>AmpOO^ai*(;k%-pR+z<eM7NTAcA`lXqhb{Fmr+3jlw_buC}Sgw18 z`&m)FDmvoX7Q>xXt5C&MYu3Ze*Q+J(rK(qv0h8m5xIB)bb*%MBw<+R9Az7<!aa(;! zF;4bQRkTtMCUL2czJ#I`h@4`NN$$PK^DSN4i~n?>!;(#Kmm$%->slGkrMJg}`|wXx zTZtJQ>Y+Nyn+tU94*C_WjB2liXk|myG=}4hip}w~U1`G6cTeRj-;y5V06a!0-&8}0 zCbsI8NV7hg?XS|RW5NESrNZ6P*2xsH(UEdODU(jFw~O|q6hcb|65gdWv+7hzcBH#$ z16P|;pwi!8v=`?O!ti^LxZys<<vfI@dzrQAgr4$`Ds4KM$9KQP3%V}*?a7yGm3zIw zSy?M{GON76vGkPOOLjFzx;aSt7<~d*e@6Ovku)tW+<`vc>!=Ls%X8Q-07cvp1Jyz4 zX_-MNW4N;^V=v9-f-R&T|1m=}J3&K5=CL_Q2Z5c(E>~7lV-;;St-&Tt?00cD1!HmY zkVJz+<dRY^oGP^MW1lTGA4JjVfizXR!yodElF}}F#)X<~_UY~ZW`|y^|7&-7ml|Zb z_QW0qZn*wU<O=R!`K422i6aoS=!&!MorX3dM3G04QbN}N+wC}g_yoO`pIl1&<|AZ@ zF_5oxAG2549xJR|E0s?o7mBJ^b1}J-<sdtWCuk}r53pL+0Iiuoz=Lv@7YU|AG06^^ zGj$UXhWN2`OZM6+59OsTk;~xi#y>a!`AQLVUZIjD_0l}8*O6-OnPi!5yIpdxk{>n- zazRljTCmEI2&tthEw7NSTE)7uu*#5cw0ku>^Uzd_w*onD+5E*xRHeQ>I{lZFLA*9Y z$)InKP5))KVu4Svzwq8E!<VN%j8E`;^hsXP>S|QaH^l!@=6(8iwyRbw@jZGu^Af*s zc~rC8@G^X7!}aHx%xNCrG|kUON#+OQQVdzaMo%{hc#LEY+DKy0nqjn&1tqL>=b<Rc zyi<{RmB3zQ=V`FFsZ=_MZZlP_=3}y-vFrGWrS2eTr&tYZ5X<t!kr<V+Iw13fnB;)W z9gLFSLEE2>$K(zN$-D#AGVw@^ZFCw$z?#C}Fh<whj~*4&BE=3zswFhVveFK_)O3u* zK1P$Y7AAK6{Y7d(HzYL(i59$-x2Cd~)V-QO*1WMVvFFY0Tn+1`yid6suYdjeR>jNg zu6wxm@YS3=PLl76vrZvjc!(swM3TQbF3}R?V`pwFdfQ_s#!2#^O;>I^y~@q<@Zx(* zUwG=wqiPj|7@r{_#<3<*Jc{mIR=b*^%}$8VP`>Je@Uh{m9l}>P^I=b$cuP#~V>@H0 z!BadNlir7aa0VAXRo^_mM8c2TrBTWV_c>BM$(5Vwt#-TA4GukJV2G+zONZFqr!-(0 z!$L|^(Ypc;Urr?`onuO;k$HuR3fXIaZI+=1>LFWhAoV~B!SKq|hjVJgtWO^6N)B7k zzDt<-G%)i#%O$sCGfab0e-Hz4=^y~o8A%X84ije1$E5|lwRZM)yp=K?UeKqK)A4o2 zkc54W1ez*=Uga9kcoFMiN@)yi-D8!i6R-=j1eyUX^PrB0S!Oe4iA^7l2ufG|0L{sm 
z<OOKdwGOn-@z$8!&$h==C#rZZCf$jDa8^liHMC3|CX5W&r7@H((fgSSjT}8sZ+Y!f z@FKw`8=RZ4zB6rETD|#Jd@OM&b+YNd{Beu?dR2iI*`H!@X%vf&oWp!vMGwQswOK7j zwC2R{imHvs)J*PJv3i%_gZh;S{Ctt=oHp(b-oVLYq;vd?aY^pz9VGlL5`NCbrFr0I z_4KVpbx!|r>Qp4oK}F}VS6L6^J*5>6<u<8e1y5m`Rjj75&x2BUC|aRiSUeGvJV1*l zO1ScL;<HOJIm&kRMo}9R&%~tn;-8o%71Pi>wn)H;*`*Qeb`+zIE5XF@k)!A6P0UjV zKzoaMs^h?gJ6ob!oX@G6a%QPYm?7=-Teek(oDW7kU(5{IG-JHEjyLm{H4vDBAM#<; zq_TQP{pc_87F5lbwLVIos?pV{-X*sLSV4`v;u$ul2>ZnSx?KB|-&~W{+ovjlS*xsP z0^?Ll&+N-M6){JOW@#+LAPFac4&wEocALf`weP&F5LAgC342u~3SRO`;H-nrQbQed z=0Ab<;HwD|%u|IaVup!S3+t>Rua!O_Wulraz8c!(8#~mc-W><Vc>P7-;{nY1P#)sF z{A8K72-`YDCz@V!IS1Wj*Ls}UrS+_0VXul#cELiGieqw<U19^<5?j<>?a@N9nSW<C zO#!w$M`~hfX;*8NO;|W#t>k0pZlhrMuszkHnynIJWce8k8YA1$aH;&N@QCqj$2jQ4 zoGOwTr$guyoeI0h<L2k6TQ7?#a(pzH(jjphZw!g1A5l4>+DFr5N18*qX|k)Ic$fSh z#7Gs7V{VZ6Lf*&EC^>4eJ1$MbNcp3~WTckKZe55=i+Jm%b9dk^anUB432WQ*AiAlt zlLM19Rd!OrnysOUOq$9mqNfqTdaB)L%&2k-E>39>_H7;U(;V3_T0TP^fpaXjnmdSB zOEGcKCL!xBo{C8wuy!5H5_ZtatJ`91Lsf24Ek%=R<loFu7(Gd@^q^fDU%}6q<4SL2 zboKlNvUeSJspk^eyKKJ1jD}HpY3+8x+p;}XefZdI^v~=v)|M;1Yv870t8C?_Lm(f{ z%b?Mq+cuM-<($sU?GlRTjHCIzrRLkVT<;3xP=kN;`cDo0=7*r)x;V_}cfwOjw^nd| z%qD6F0UD^22c>ZWPqxw?U^JB|p`+7RGy;KXK}nN=)UuF!v>TwI1}xE;l#H|vrxS8) zKrJKH<neZq)V53M1>Jkbfq-(|v6Gt3q!F?0pr*k-;5WG^$9qiC**RaO|AFz(n;+6$ z)@zO@j+jig2d6VN_nfM1p1$qN)PJCS_lJ>_2yh=(_m^U-cKf@lrMyP^?m|@~efMl( zpqt32sgNz!(HAe|9|+>>T9C8HlmRC-BDbGvZtn=t*VIgS>08u`e@kB9kQyrZrVr1r zcLq;Y`wT-wd3o-dt?;YD&)H&fR%*|K+<#P<h|;~9Z>tu+e2S!fIZ5Y}SPrpTKdQ(n zV-F=qcal0*V?$NoQq$D9)Pw6SJL)M{itg@S97-_KH%Nz~t0Na_=+2LCiUQ)CCep3X zaWGgi8wH29Q)&3Ug$HQ7{x&;5Wd>@9yTZu>bW_5b^Hb5ndftR8*4?I{C7<{)QQ8~C z5_q-3Foc73v)U1Bw;zIk$A+`T20I3(XpMs8#^was6&_%x*$IxvR8O>jh)$D{#?Y>D z1~$2_!HwT3Y?aP6#iB6;Q%pq<^$ty8$Tp+QDuJGp_n-#<je>LCX)-DEfr5iIUjJ8i zu1Z~YfY@Cavt~$ht|~4&6kK-XaLJ2<tk(x+)s!(vC7GR&NJl$BNM!SwL5i*1%`Kfc zF6=?WNv68P(E+@oO1c}76y4k$^=TRNLB+4{DGerbG0|%U2e(5_bn9P=oZHXYxheBK zgWTgq?z4(VcAteRcAvA_3pcn_JVKdLY7k);mq5qp9(wGkR5rJ?`q^=5Z!Rj;1!c+n 
zmi%l$q0HVCKd1NVVWtTWepYx^`o35FFwNH0g3}^?W{%Fw5@&w=5chTc{iQp**Qz0I z@z5sL!YOlYB@7@npym8@c9R8aOXk=j6v+tONlsi~lm#U_$QZ;1r3S3K%~|Jquhqcf z2hD3HNm<?R--bTH>V6*_ef-6Oqn<Bb<=fgQJY*xX{K^@V|C->ydND{kGNRmNwpnYB z#CF3`F`;yo(FCE`885<sfMzoT`hHbrI~q?(fB`|OtifqISre}ThK8}sMaa}(MxIq( z{TPr>h5BOD<+SZj?#eNz=iDlB!XF#;N5CiezXYh9iepurL6u|6UvS1HD{FHFhith1 z^7SplJIGhquwmd+>ijzvhEiV%?TKS6X`@A|WfhiY(wy1sRlIauiK<tA$t$cYd3D8D z8<LlN_2zCIGu5}S=A4td(f3+zeJj7L(M*a`8L2QF)#ib;*(39}b$HAV^7Cs^KqPX> zh99u}PMoX@wKz&1oEI83f^=S(p%eSC6$H<<6Awrhx6pZIGdKgKH6|)X`Q@UBu3Qv| zNu*UfpiNOOYGK`tR2#xv8z%+HB`&Wvlz(E{<g8(|RMNz}OJ%3v%#4IF(;CB=2P_N! zRh{<+p)qEA<OH1>)(nXgMNvi!ciOrPs1-^jFi>sDFSU&6N-cvi#;VXqwz|H#hxH>; zU4=P3s>diLsC8Fh-T1^$UJyIvoE*kau8H_cR8=)V+^&~kwuk?@=7cej!D=E(R{Lu@ z@+aNw&emu3fm*?7)4avUaNcgrchP+PH1|BmS_-8I&bSj3mUQ0eqZcPp;TR@=TQisg zn1<-Uw}<I@ai%m<QK?{M3Qb{RC?-**qaWCWg>PsHv3?b*LNKa|?J5Ly>L%b$(sn$9 znp4*6!XoFaPaEdot0V+X+7MYg1JD1cbc#5?7hQkYpV{rOjZ}v=g&Ch<8#<pwbuJnt znrH}-hv=;1cs^bf+dAe<bUc>=Hc^7)CHO~m3I2f?>wk_Cup63#<g>F}K1NTF%ZItj z<?@G!?@@tw89wOl=(ZH~M-2-aaO-leW<5!uj@j&LaQE~6Bs*`yHt7;zo47NH96eu` z6w$yswoE9JQa)?hibO$qW4<a85IeDqcmhAm%ZRHKW+haJE#H~lZl*KDHJXFX=VV(l z%-_sqvuR#bH)U}(wA)FO9(m&WmhoTBuQMNans&-iYjQC?;v_wy=B5uK(_^97N^uR+ zoYi87;e(Q)wmE&L9CS;|re?gmWT#xt=cU|qR_|!aqqty&T1Y@z2=f<=7)Pqd<)Bu% ze&nE~5N-ymjWVs2b%>o{uabtS>tfh0w<OSYQ<AYPn`*Aeu}1HHSDwfvk@G2{4Tnp) zD5jt2y3YRq{J;E#3)Sh1S061$hR@=@1W~&rA9{x8IMxOyMR8yJn7AhWg!xT$E2Ydy zqzFpne0R$7V!=}N=60mKLkC>WtEQYU^)Kh8ap3^OSc^<7f~0OD#`3}*gxEws2G<*{ za0V={Qd7Gzau?Y!H)V0rDYc!&#coe2fz_;F6_=}rA<j|>+;|aQvVwX+&dewsQG8J8 zdwVk5+^FRsGi+(DBb&0!wmq2dWVzSs^rbv%v8~2k#AxJ>V^3L{<y1XfFZVG{Zl!xj z6Z_hfDy0>rcD6C5om|aqS~VOM?VEJgV#f~s<UNpw8=KgO5KO>~45l|g>qXIx)EJ<{ zZKsA|yBmkOomVaBALR##z^jeN0tT>r8Tbx+LmPH1VS!c`|7H&Sna|wwyWf23)zwGt zd+FP6@=v|x8NYY=%wtdI<yg0TuRij`li#{`^WWXe^MYY6^Yrwi4<G-;7k~VJQ~a%* z3FTVbxiI%IC%4g=bZ5^*O;$MROuTG!ikHqLo3MyPMaQP(!<qPYm@Nw7EShj8WPd}@ z_3)0IUK-}&ByTrP!;$3(!45GZug+`6+Tk#MrRjsne5&|q<mF+vyrqu=ERV{0JO*9u 
z;Ue59Wn`)0l}%?mu~0<h;@PcVl${_A0E9%r+o1x5&HaqAC6fI(m;N2FLP2pi&V@*@ z`a$HWQ0gRcAu)a58723FlcLB=NbT^g^|@H;Y29`252o36g3@RXN_@Q~)UGagx%RQ* z2l*gkypM0!*=TI1n^yQ3+|}0}E9*=NV^MC5`z$Ai=}c*JEw&G`GgW~`*$4^LeaP$g zfe#}`F3gMLQPI1}`ZvJMXQAx=HjzzXo}XGeMsTb_94`@2jR1CRGTq-t;R+}9!KaYC zK^$w=mv%9d8s<aiLs08O?{cO~8bp%wyz8N6y$gkYzRXk2;`9;Vv<>%mgnL%?DrmCT zuE!~58p+!LKqKTrWEI*d)fKaAZ%VohL@E=&DH8uvA7yYD?-Oyvq#K%jEOo<45yT;q z2l`F2_+V$;dLAcxu{L?#m<z3#*$JuRDpF}H1pn{Vsg~L5Yriax)CY9g2rHX<cuN4M z&v<alvjbGO@BbsrFQ)tdoaX*JaIUh6SVGlMuNKecZ8acNv9{N&Zq(|4-SG-Np;ZUO z!vefEwiK!X0kiMSD`t3QB)|E1%_er$$>kjss1B~Vq2z0SUHo8wPmfk*o}`SErGC@5 z&7V<wzoOh{l!~V023o`9JlXN2pp(49!m2gd+*lvg_R>y)HYB8s&cM!v1zN4qf&`?8 z{XX+`TQIgEbt0XMU=G`pQqC3+tNdB*I8gynG{qaVKcO&L(NSw?v~ls6!*1hvTZP5s zYv=5?1U4VGbDBm@!@n*)cK;IZ`HS@5Wq!v0_`&teKmX^<uQHE6mi)sn-Nv`__vf{7 ze|zDvUw<m|lgt;f4cN+u`SokhUFXu5ALZZ62l%>;l@E?*o}#&6<MmJC{9aP~q58Nd zl&YxIgky`)`8n{UNTkNb^0T?J{?d^eioj_XJ1APr+lp81%CV7q*$pg0OX703y70^o z;x<kans9)w)^;-EMJiZ3otHieGxp=i7WuYs;>do=|M-7i&-|Jl*~gQgc!`eeBl(H4 z@0NLHrt4q0{<h^O409gn$O-QAN_ByGH`LWmUS2WO4dms@CQGLjP>xCam4ic*IvkyJ zL^(LD5t9<+TGuz@;6g~iN&{mIdGL5r{!oNv%@m>>D{Mc*A2DZ@*M#iPJ77$%D}dwy zNNaJsZusM+$3CKhlTJGM$KUu6f%D_**FMP(dWRn3=ZhZtv3n;}q<-Z08Bk4#b>Fu9 z24_yTJ_7y#WEQB}AO&&M<8<cmB2%VuZ$Z9nFfU&QzJOAB1(>9CmvZP`jBX?yRSumf z1BQWCq;nfu08FX33nw~=02F`?;{C4PC`+U%EHfOqs>;in?ReZJNh1wkR<yx4(?4O_ zAo<K+e2{4a<-qUXvF*036E}f_>TS!{)ja{d+&K3|rQ*n<8k*Ljq<S%o$H|S)YSxL; z07rQ?-Km^4CT|ekq{GTtPvWeH0C3svg8NSMH6Uvgd%)GnVo2(-NV_|G!YqlVP_$Ty z0-N;gDB)bqR+ii*KqM{ZonWpdd(twnd`}DWzj`na=J)Tk6X3P-V9uFHA6v+tQT~i{ z%jb&dmXv9%y#VK=T(-*>&@81uzO!Zl$Pq<3nwo)f%(x&sCWf*aMu3J%H>oN{;-FN@ zNS^I@(t3@p9YQt8dV^O0@trtIid(;YrDEXXh})lJar+pTRvA$n0c&m4PK=pi2bMq% zvPBB$4OgbZNv&*qya*w}n~XWyu|rsP03i3Xjw$&7jUjOdJlFt9QG|mWgh84D$3NNF ztpsTOjkpA@ez+BE;KYMem^kPVq4i4#_(seM!kJLk&m#wh3<_Jw9x>$DM=Ba>jr?zN zY2mvT?tJ70N3^+pKGb~iL%(?LmPe-DQBaS3<!m$08x4lMu>YF<S8|H4`rQ1*;RZ)w z%-#9=OIw+rjo<&s8D3zVBkqn`bA58~5cwC`Gp0PiIQM62_T8C1V>LJ<CSlPRmj;H| 
z8Dr*$&RD1#qBG9Az0f6*O{pFoSoJ%k1l**;l{0oTl6H4H&X{ycJ?g(SSr76N6gJ$7 zHg{5S4y<!<;N-S0I3R3Dqn+mZh8*IE(odtw9^^U+%ROm>j-ThA-Of5Msc@~X{KGb? zcvf-@hq)gpn_U=hb6$pFmJYlheWtVQz^AgHp3O@)V3*<8CL>|??ttq6&U`>Q^8gLY zZ~^28r6V|VFLVSFRkL)if?b>^mK|{}DF*DQ?{z8x^Ay4kxtDyHq&RJtBH;QIXC30g ze)AIL-hlaB{aIV~%Y1A%5tU9?WF9D2S~*#fKeNnzN7?W~)%^U-#T>~h1`4)0CaZ(l z%Q2Ui#yAoc$B{-vZ<M8D9M)l{BnA?V<e3jDLhGh3UamU|p{+n@RJSG}byE*eL(A|y zyCMhSY06=8d6MD`Se8M`Db6{>#U1h-xyzED=O|Z3r;3&gQJl>@va6Kz5v4Z&QNEav z$5d>Dk`GzpzR$@+<aH9SvR$7InISp#VP<WH$!Risr)IRNk;74Obd!<Sx9@PFx*!`Y zC%^={+{Gx*yD(EGEkic#@NjUUX4DHco0PiXDxs$6Q*6d0vLtedJf);~5EH84Qx<0& zw6(La%MDVd3S&W%I4{UAS5|8|ivoU->{8y$-|r@<9datyYW8}cl>EwJ?t2>DS3J8? zTm=eMvs#<8t^ZOM@7OMP3L0_;MM!rFRJQ@TQ;jW&Id-Q?sy<Sd#WwNgZqz=7be>(P z?DjuK`hl68+3L*qcN5c28Ja!I$&z3-%Vq928ogIk(e1TMZ(9rQSWMBED|b+Z2;}7E zmO16R5yMBDpXo!gW7(-0kb0EVjoHfOUR(K4p()FcGgRhooig{5>q5)?dN-La>C&0o zlpb4uR_?XS>PJhM)h~~5|DtTZqAn|Wo)jV$+|JH3g@}de@ap!;-?0)ECr}82*M~w- z8LMuobct5{^pK)HgKRZL5EKuLZOJzQc?#9@Bv0B+dk8-j4i9xNZ5L5^fV86vzZaoO zGkg)O|DZw4c?Wo!_SQHEA-XCeM)=`ndyDt(K-zKhZu0zg!S30?WkyB4RIWf7Z-e=H zmJdA6{i(ixjuNWkOkC=La0jNAXgaA#YEaikccOG0TG&RXp<IyGRYwHsvZXi1Vtohc zMC4m2*oS8x?!nKp8|nB6ubUxr)wl79`XRiFmJVFC1S=j_7;8<9T5<*B59c1jwAku8 zxH2A^vo&Xa9b23O^ZVr!^Un#wiRQpR);en5a$9@*1>reY^s*o<tp@*DsHrsa`RT>{ z-bLP&7>@FPexlDL^a&;+(=hi;XKr>IT>KZhqYk4&5!_#8i7#d59qmotrbf!wqQBAd zQzoBw?iTK=`koI)sj;R~Y8*Wc2_aRGEJzjPw~%4kSgCA5VJnBGk3*vUW7JCaCDuwd za*)g#+BVpJaDn2j#ckqcP-3JlqX&<j#IGp116pdk=dpHF`$?2qp_LS+u~FMGsT*fE z-aY>i8v0h((a`tMJmmUMB-hHz>+{c5R-S0?`gyIx`j)#o+*euooHKg4vU2HY@Sm-9 z6^0#`bLKby?McK=f1tfd$;m&nEn^Qhss3;J9x6tab4;bmSvghW7V285ur9Ead<$Et zG~5>wac}o2@<kI>a%z_3+fKCXxQEDDNKJ4s6_y(d>&VPCQ{{AD&Q(!2apbXF>`kN% zx((f$y&T3r->*pbG*-NqTdRoFc>VK`=l$xO=dH9)iB<m99v@0xi#I9TS7%qx(17R2 zjsu=MHT&FF?nq~0dyjNGgk5Arllpkly^*nLiUTo`DfU44O`sywnnUvl4Cc(S(_}Em zF_$^Lc8c!#4a{Y#sL)Pb?r_ohezKdI$p&2uzUjbxR*Uo1OpX6jYyoPV-S^UJoZa{G zM>6-049btqUw_+_F+U4`c%1tZC%X|{iVbnr);|F9A~%iO`Vs!gZRAdk#)~<`yolsj 
z%eI?@WV%=pOZTy9vXs!*0|QVnpDFe+#&lXh$kJmyh#XUG?sl5^VR=YrcbEw5Sg|S9 zFpwnPGEQnESJ&P`iw1W{L-Gw4v@_7qX4jz2?{tBv%7!+h#P&wP3YgrXL0e!b;dB^Y z&Xtu6AABwS!c_1>>rcMQf5c_F?{~hFxT)V0@CTgz=PuuK{$EYapMBH(tj=w{`sNG2 z`2IbY-rV5Neb(w*dg%+9$1Z$m<<7yIe8Kx3_`##k^T{s<Ez6m=x6hh#krTv$OgkRM z#yA@Z7>vnXO?1{>$XzK3U^O}Ep>qcnwNT~JhK9(nw{A+Ea7<CYKz)vOnDw(0SDQw} zcucm0D6_&AmZ1($9S_(pC{YP7gsfL~5Eiky><UuyQ5ptuD@BNdBSbRANgi@U9jz|< zOfyHC*T}S|(|%RcNMg9NjN{HrNp5jvp0`oOwvvxrCjMO7ImQybXla}Q(M~s=_y88i z$$1INFdaG=SuGXeyp$vYyb!u|aPH{DL1#mHQX$+FsBxzd<&!pv_6cBJAj@sh3e78) z3U%~PG7?c&0;~P(vVV+rOTo*(QtxY?Ss<DxP6}hzKyDAw{AxK}QfNKN&vflcG!5Er z6f{n;C2=&0aHb3~l}<}uyR%G*QQw}Tj~-)n)+5}vm5e<mvDiq$EC>J?neE^>WR}-r z6(N~{!CI(_<fSGN#Px1UamWmuTXm3G!T2;$9iy#^R>%=KF80L{p+C0&v9OIcQ*TcW zPr=ksVzO&^8ZyE%zjRCI>>@E%DD6VBx0g-J7Hz-M4`APru=Hcv+jFv@s^-g4Vh@(i zPwqorN`CS%_kVD5fbJL-tZ>Js!GK2VQK#Zqz?q-0>?L_E(jDv7rYMo~>e-a~AuL5@ zr6+d0U=s>G1h)LKjK4q%43xvYkvJqG8EJl`a|jn}0+eZ3$cLQqm8On{=|e<v#7VuF zc91<iSVCQPhEmtxrFfKkF*dFeJjVQgE8~vskgL3!Dfg0P@0gOSTqc>7){DiHms#~F zRE4!RNv9i2-ZPET%+ro_(2nr!$sUEHm{LtjlN7d93I4=Ex@+_HD=ogpBFeE;zOq;l zz*_S5q!xV|i^b=9$@njpA+ep(mgS2qs~0UHmbYYl;<R6}xL)Mo(Pw*kvq}*yk|Zgm zMRF%{Vx6v}PSc9oSWMN6DPd}4+m|CHsn26-RptmT){EdHR{0hnpeq)xYj~V=rewr& zO(lJqwinf%`%YW#Nv=C^_V9mKhHS?R1~1N^7ch76Qn@#A+ulL0G=H%|`DmI~BJMaE zy8vbE5M%cq0->86C55>h-)jmJoZ@H>7=S_rdEh9F95x!0k)si3b-tOLAt&*{aZU$D zE3h+DKa8;e9H6o%t>UK#KFO8JYL)sy9A$GCPKO*wvPlX9R%?EVW-$D7OSCIhY4nym z!f2x1clvTq{=9K^JBQgx<IuV#@P-eUA=p<-<T1C#%ayDFEgYQryR1j=FyE>4(1`7E z+G)~if*U@8<mEiIF;OzJNP{TrrpdN3!6cR%{hNm8kD1D_@YD1#OF4^SRtJ(6(WBoM z16R}bhQuftNO7p?<&D92{!gwkB#0A{T5)ni>hJii*thZW<lk(0>20&<!29+M$>;m6 z*tx-7vHP&pcrwEN$A8K0pqCTs>qtbLYT~6~S|c!!6vMRGN(?xt)Q*KJ=-yd$uw^zD z6b`@C+7<{8C8o*rN>)mrQqR;peirRb<tySNh6z(*$kkBc?HH`b{>%aFu`!TRuW#^q zy9S3q6c>+;A{}QK9B_1^LlDg-cjTfulfm*$p}(f1x}mbl;BnSw)_u0eANuJxUVZcO z|7!DPmTO%tmcccl;`^0feDhJ>yTSYT`M=5QWWDOY)l^mM^k8Y#1D9L*%C*0L{;&Dg z%#UAx;fJ?m-tpY`pw+tIxpkCpedTeUOJ(F=r87UGxElYnm{^{4neSD41x0T>J*LRN 
zeiHIOMDh<o{zbCbti31O;0n81Dt`t#KbnM!K>o_^F)_kw>(UbBj};M1Y~cfOiDTgd zs_Q!WPeRU98&VAN@7Kz|9`8Fgq)y1+5BWPf{r$12to$AI^m0;tt;&A}Njla_OUv-k zNEBh)jk@hr`E6yoeo0xcBcLPWOKqO+NMd+~VjRf?N(!XWC64H7i=|}u34_ZTqiu>> z{z8aZeL_Q3<u(rf-Ov8~)i-}vhlAHEzxX%6!PkI;xi<^)=e~Ua!Fc_P9EYs!tFYT1 z?ld3gWDMDb#AuxO<Vh2Fmw|-igaqQDE}{u$V7Ijc?qHoRr3|2+VKSsKfI(e?3&wfk zp*aaKgk$1K+xEi4C^1SiFwDF&$dth{gnnd0nriwjbi&J%KfL;H6lLJW`|SV;zz_!* z+GFv_sUul1w0r60vie#D!zwa%tUVaP5w`0}L)58voYuq8%9xW1<A~)jfY2Kr8k?J6 zrJOu%y8~;nl_FwN3&-(ql-Dv`4nRPAou?*hb8136BV>HN;+wlb1%t9{uU9dRp(166 z{^G5tcSX%jwUz??ff;@1k2lnQ<1>7elV=Gxeev|Nf}4&bz|9H5%?LFfpCHFfo!+R! zyKtJRlR!JDf3~z1IR-4u$E0xpN8O-1u4Ly<0XUJEc*>^k(iMB7(z!e>as&Wb*^mxG zi%e;?NC)2cZb)$eX8^$Q#s{Vju4IAZ?Vy)O)YmF-jzNoLTb#uOI~o?G1S;U-iKN(T z7yU``l(NJ^SO#{&k;M4?p<|TiRhMgWAQ~;N?RkHYDM=t;{FbU;8Y}bk%jG9+9ht+h zGYm93^|I44&3&SV>T@pxH;=1*)2H|$Ww!;EHq$OB(l_mgft#BMH^asC%{0XcS<hpK zwFXB4gV~rg3e3zao(J|w5%0)4Fw+|o*KONtd84$lKxcUl0~^a5(j+i5p~Xx)-uJ*G z0cK*rjHf>~F}a+@jHjJm9#&tgm^lh%lWo<Ww@cILzLo~zY|vs>EOy4$n{{P-k!1-x z->clA-oeq?xuff<${|lgS&|@eH<8LIr+a&U@X_*oXxh4suM~jC{`f;=oO0laxQ>x+ z`Y7M7Y8?IcUOh>IcV8NZu=5#U=aXvh_i284XIqsr9WSixlCj!u0^eajLe5gjW2=-P z8Y6!5Ra;TJB@dmXUm1$wSBjdjtUs%mFq1UZ@mq3iw>?^u7z3{28)nJt!zb<eX6#D4 zH$Is)V$7dig(k}Np&znKGnhv!ZpZ`}vZ<;jjVY5%WW>m)AtR>U!C5RH!BR-Oxw?TN zSD^`WYg`;9FJWS<T8&>Jw-@dXQ%9~kxV=CWb9<k(liSNfA1RQ6+q(^^Y(j95XdUzD zAa-|xi&-(k)MtUI%cwmPpYa*~+|G7PayGN_qq;Xq$6;JyVXbkBX3FD<fN3k`6Xi^s zSWf|OgQBaBVZoa6?Ee=_r>8w`I0n3wGIo|cW0x&^Nhgqqrk&DzAUoa-W{9W?VfI<2 zQ8~A_Df|Q@8oPV+4~&1mYk)<DPZwXAe$3J)9Ykxq(%hq=HU8-S6qjQ2Waur*p`7J^ zhm+^XSL%<ak0>_Jw*s5DPBw3};-`{X0LDgb<Sk93!?g)nm9)EphA-1QmA23AkH=yS zmJd69CSWEP6~G~5O_jLQFq6@kc-oc@EANM+(#?5has_JZ=!P^6GkGwl&dj0@?{{oS z12B_=P-Yzig9oRNX3b=Wk6x~*uT?Wyk|twu4JevJ<qxfC;yR1qyOQE*WqDN75ja|r z=<vj$g|*WZtz~(T*U7?GmnyIq<IUI?^!RC(%Gu`9)}?Yxe^e@}%Nb5`)pcH1NtTW3 zeSuHOr9WW4i)%pre2J!bx(Ivi%Ayc@o<2~)d4V&^#7e{CXh;ORHZP(RJGd#$VE`R{ z)#Mv}E&a=r6mcOWw=uuj_FK`lk#fde^;D2bbxe8dLn$A7oTNt?AGQC7_W$6s)Y=_M 
z8Smi-IXoRnF752_#g!BduX#?k4c&zYkilzgZVgUQzZ5#K*-BtneV>bF)_fgN9(8iw zMJ00V3Z^Q0lvir$1DFerwHaJaY!SvRacn3~U2VP&3k+L4R?lCyy6fs36%X>ftwykX zSKxo$5?l)OyS)zI@!5O&>Q_GXzOTG^{9{ip51*Qj*vuB*aH)3Id#EP!DIsvn`kl|* zao3|CHs2!f71r9!4-5uFrQNbsThp~Qeg6xdu=C>DcUGVI{Wt#N(O2(^cOJSOWbp<w zZxMvdO6d7Nef_`4|DB&^{Y$Kd_<;F7t`@A2vm}#tl1Y=oYfjJ84n-!k)K7{fAd@kY z$y`+QZAz1n32L@_nV==6c|!_ACiab#Eo_HOMr^5;5$p#}wX(+~J<9k<BNMzwGHFFJ z3^K8`w1$)HmhzhCicH2FkV##m%Qrl!$b<%`>Uc;aF@Qe7{Yb+w?H#On#fp^&raNo+ z{(VT|zEG>H_8nVurIq*lwj6s9f`L18{cXb=>|V@rf1<Ll96-d;Lq`reKY4%>+Z=5_ zOOH;r^&nQAX6ezh$W;x;m@qIEk3QOYyfI`G8)3bbQJ|>Qh-CJV!VwfkTAC{CG0=%j zRn;Hp^!HScPO{Q$7@UV}M0OH@Pc=5Rc**G>wM%}~`>?uJ3ZO7p0{%t<bi(NZO${}T zBnG!zN2u7>MN9GG#AQRB5O`(b=dB0oYz+E#D?#6T@9_s8Y~>#oUwUT!wx^!{+`Z2R z-O)hk{HLGo@2?mguRJNNUn~OktEsiSP6^)2{F8UiKjJfue{}T^&m@DsnNv@_Y&M6Y z){B%|RL=Gu;Cq5NM-BkdY*3ytHh>UAj?iW~E`pt7>A~cIPMj?oNCqo7I@=h|7R@8x zO={4gv+c7nPI1~zE;dJ{I{ZLyPA|?hfU~7$k={O>t-w_}+ORxkFv<3h1?+NzF@c;u zCnbU1R9%CU&1`~4-_wTUjevy&t76spjvSKre6Vk8Q2h0B4)q=dwM|$I>GxfIw2bpq z_AKAeIlvhAZ<IM>MxiA8Xd5Is2;xLM-J@)S#M(BjSw$Wf-Q>Iombjr!2_O{fW0ZQw z=#12gW1}@IVL(!xNQzzdbd|lyN7T=h4260|QUsAfJHfEWp00PcH1w*PIF9A)u6n0y zTMzGL!8reR9tbb&zGm?*!D6k;{J>~bbZy|;<9l4M2s|q1cQ5cb&ONB?+GCAmvWL!( zIF`hbc)CkDzoD4eWurCK?EKI;7TaWtitA|SJFUZ$+&Dj)m<&=mc!YZXu5gut)6-4| z=ypg(;0t9X;%sRF2Nj)fo#TK_VQgUIL=0|&OmXgIL+NwgLF<>T)>i1t%s=jaboYi@ z8|+#2cnvx>JTUCeymIYI;;a6(;9Tc2|9(UK{>y)oaS3y+AzR`@{BPKRYD2*0hoLfX z=NYH=J@6fRz{tX?@M*8YUVH$D#)~1=3qidNbR9h?nMk!5O6Q~lu=S}npTCE}roNZ~ zuP@53{wOT8&0!%6Z51lME36c-1K&|#ey=SR@&c>74DN@_J0W$sgRE+^^0zzK-|ml! 
zotsRSbSK<2Wyw_{yMn*o0jrog(CVQZQK%q_{@wh%Mh)Cm@JHXT6GcnG-<#zFyA5aU z<NCOHPWF=4r0R<se=ZHF@*;>t8aJgbIOd>VAfmHYxB}Fv8tKAc-{7uhe|;yBXg9Vh zrbD@}TF<G=xC1TCo$UC@K~k&)|BuD_XwKkPG8f2LX;WOF_mrgf$$mHTcUplYe79!$ zZZykTPu~r6%?wI)nEb<c!{i?(jwBbp8=<NjBptF)Z)D$%+AF0xwEJnrqFkiX-^zeS zG3DO8-<nx3{)P0eM6~3WVfR9rdAtZ&O>vjiZe?QBN#fhkD#t{UP4vK^F&zSC)c#G$ z2_39b;{;hiCAUzk7}w&%@D0kOk`t{#Qa!#VDwAnLZFMg#rRStUH@?3&*iRoM?2->v zY!CyA<IFQC6z@Pi^L5J@albFm8FuV8KC`Z<V)A=Q=+?ls(`C=isB@x(ff4S1Qlb#4 z9%mM(GebDiA7^J4RnH6@S!iRl;mm9{$%&=_pK@kSADx*Kn`Rj?mW`@&abj0Y3C1p3 zR{+IZj{|G=flV5l1np8QQDw%djj;eTWltTQ$pPrQyPc8_N!MD-J)vB$^Eu$^ATa_G zOCs%r1%(SdPC3|$uRayTYF3$cKZuiYX5QO3$qy0(Oe4^oIrUCdvrUX>rXW|88T$x$ zL9Q5DJ38wb-}Zn#T~XWA%La!=q3njo$*;%e2e7rsjqSuNg5!b%N#<UwJ_h#G>%eDn z`kX~v#$Xx!4Qu&8+gd5Ju_4<p{s3?`%6*)ZU37+yI1MDjvy^=<Izu0p3d${XhDn?u z_`;gDhxnm+L?M=ruw#@XtN}M-L}C|ZYbqYV`((6%-DKcgEp0xWni#RCDjT}`=;@#% z%{gKTWlXLGUX)zdL`J0mdwV)YBbNi9Aymc*KA?+?gfdT$0}tDLX593E`IE?1wR4Ny zI`2_7kcho;={T+{+eXr<M5i_kc&A$!FRna-)0Acx5vi`=q^(iO4Bk7ZYsH3qyOSla zc=06LlXDCz{upa>m0WGHm}JC6ev`2d<w0eEymTr*wq3S~Vbo#}+xj+A9g{x9w#&9u zWI2r3Hp(6k)1!=!G_ft-qu4eI@gTPCh(sq3tFdkFIm@Ltp?mg4-sX&rtQ?036{1}$ z?n2BA8FPGP6}`Ix2jPl05FA`OMlV~y$i;Xd%?TGmEo35CBBE=sfWmA)p)vo=2g1|M zy*2TM_r+IlIiq1>_`SV{$@ezbzW=imA^Z8|<Hs$Mj+*`IFZU2LXR+}#%VREaC;2ud z77@d7=>(~A+_kh{!Nw@zVQvX{IJ%98VvS+Ec#1_RYe2=Cf(igabi`6lh6#_7d-O!r zPB7_oK01!r#0c5~fR4@$q@Oy0jw7~I^hgBg=wpwloB$p8NP`Z%N9gDyfN*JFv~Nl& z+wq#`3>{9OgDi~EeucL+j2=A!=_5YULgOf$k}DnEJwu1r2pVb*vai1o8hVEK8%M@C zDlFSu)%fRc5F06dh2`om$_Iy8)f>+;y%FV3akukJoP3b9?`T}Q73iSnX>1;*T%%TV zED};~It8TMyp0rSj+=9u1Fa#l%Snpcg-yb|ct=#6-IOi@F&DQHld5NMX;R>VwJM4- zk(=ttUueRXa_K$!$T@2h$2O!DASbqw>R*WgIcIID@v~z<&IEfrLys~((jW)#5pobW zj{`aV;}a`qRK1dW&X99)7XF+QzN%z|M{<St9b9Bfi!l$cMbFLXM7`2CG(3CyBB93x zsvR-Jy3^{DqRoMXX(5t~lzl@NIklW&@tc0E0-LMq_mdpe>6{Pd=H?al7p%BK>;A`Y z5Mp}uqgz&R?Ns@2LkF>9r#{1UQ6G02dyNOwUV^c>bQ|zCO?Vs7;H{y8YX<g~2ye$v z18?VZcx!T>C%iQ=ONEXS+0BNX+}O^@i`Z;Dw<%o`ewy-JnuDEks;+3RA@$NW4rx)b 
zJ~MmHCXR1NM}e~VMk;nR4wRj<r6$ge17(x!@hm;c_(+2?yhkXTB%pC=v=tmZtD-FT zoT1DJ{q!X_vNzL9=Wk=WX<Cb}^VAL7pGeM~xkTu4gFRC0-9bkg^GKlc_RQTlu!yo| zYP53tT*x-78##XX2BGx@{Squ9v<VVTveC^8*Us$&Tcpc|xjAl~yN#bv;&H49kY<1_ zip|sY3by9S3tCgWpdeu@J`8M)=diVWWPBdjLSGb0c)WN!%R5X#d!a!cLkQ#miK|rl zRqAi@>LRJE)#FfEcjSX<)F!rUND+X^v5~Sz901d(E!8^O0x-3)#}Rsz@sS2hc#pu; zMwxytZEtOhjHqDBJ!fEAMdQ1)d;}_M5wMcH;L(BMOY_0;OSdp=d9~OYC$(mC9_UO? z5w?~QI*1pN;t{)eb5dMm!|n5S@k|l@wJ*z;uk&(q(`;lyr@r_v>>W*+&uDaBj-f~E zJ{g`Z7e^+a&e76!z#xZ^)nCsr{iWE;?p^Gqs=tcb%O*Fomx}(<Sjd#;;%rfkm9w1x z!|kG~v3kR0*~2Dmw8cm@^Qt5dCJl>OL;bv`9vC;SvI@DHO|`AF!ngJgF)fxXc?zp4 z?6THm{#Ix*@DstS|5z^26fA{J4_F4dZf+3SwRPCS-{oXG`Cip*&W$}!8x?(ZD{|h_ zIVdrfSM+s5S<&{Hv1oc+v3}Fh^pLXb1bzL-;^{*b)Q2D*ji!%Ra8r1;hzbpCV#79v zIn<PN24URPnL7YBrO#WQN|G9znnmn<#wINyg|o!$W}$@)*+lDx)CQ$y*hukh1}L?V zEmae;LaEiV#~yl=@sUQU;XP7nwMp#N;L?0eZCj73)N;?6QtN8~g4UAa9d_yHF(|Yt z*voKy%szgc&EO}`AV^<6cPpif7VXlZ3;4O@kb)N}6Kp?+5b)%w3(PeOqm}OpXQ^xm zQm8LYQ|l*G<sme{(7%7&j){N7`TOv_^!^+s_!6;%3xzyX?kWOE!Z?k2QKoTz@8J8P zW_mV%eajCh-V9rnc$0s$T<{rxpc^&cW=Q(&3(a+Res(-$qnH!wF01dbZ?X=3=Kb6m z?mdK2)@wc<mu?3f={eN^DLyz%cJ#y<K=Z=39Tj_`>7atp&S<)wIdftoayN0>-c1NS z5T#Acgir?%`rfEGzbV}UJ9|sc&PK`3s{EH-Q37aJ2LK9EBnJSh#0+9pOnP6w;l0_0 zMo?)L7#-M1#a9P_(VK0l$(tvD(JA(Ljvi%vq`@fOBaBXwBgduDiN1PH#c1w1!{{yZ zz)&-SQM+^ixi_)SK}Gp4^!442S4M}-9|kUyqQicr(rRlAQe8!?cZg?!(HX5xzJP@Q zS7O5>^JmFmqlAZlHwx_uu2lN>TkR#17J9f<#ApyW*A2KIOzU$P-(ySXsWXesJYfYS zf8I4Y#{9q;{u6sgvrd)%ykH<o+3K*BEEmbZs0r-CZ&y^ZKCO1tDK2sUps;meXq);k z<k-R*vB}f`s945;2m>8katk%H1Atf#CD+sQ^WGW?CC)4R<y}tp%B4DgV47*#ER0Q6 z%2wtI0*h6^qO!V|G0fDH)80?atW#ATUGc%G#nm%6697Q6hMhv-)TV5e1x=l)_~;@_ zPn}1MDhpM_MkzUws+2hcPxx`6fIwB<76(8n*=Q=NjdWJRWv7M$mc>FtK>_C>?=||2 zmNuiYmTxoa_-PV+ETMldHeu-s7`+{q3Um5FSBuy0tL%5If8x~DhczP&b^M(hHLn>B zmfHI787e)FSc|)|^69%hPYc4z89)D8c$R<O>Kpfj1>xzn`~Il1(i&*=+MHDugX#Xv zs=v9?aM!3WbME|7|FCe&()zkNuG2(Y4;aGNer^#gCO&h%!P96J3`U{9(qyXa7c2v( zOiRJ5AMl5rCUf>pqXC~;?G3g3*}4W53zlDFK8kwwLST{t7DmXTE;HX=jxXM%!@>+< 
zp+Ut0g_$0DexZX&|AK-AcN2T1^z!i{SfxBzt56V3qDYht3-nkZHYr9xGpt|%QC}c3 zFm#aSl=85E*2~nu*b>V}T_BP^p^8mEHqmAz6$@^=)CqrzR_q=G7G{72M^P-~Wv%uH zh0i|dZ1ws(s`?!#pE`B*36{9Z@+|t^Z7>XGGg+z{xc|n`0Q+_Q8`rlg?!cZz%8N(2 zB=;fS%E=3ah%<5NFtowJlb48_sD|2rNIM)OL=<R;yZRPLJKUAk4)hB7)3l->bZM@8 z_NpsB41AETkQWv~$VV7?(HH6;oIlEDP;|<oW&r@1Sv-A#pmW%snqE3ZE{<q%NR#k{ z#JgyzI#CWwPW0Af5#zGUp;(;$Y3Prmzzq6=r8Xa{ZtSK(^0*}G_k|*h<3X&x$SRa~ z>NNqIL!|>jlIqJh45v>kx+Ruiy2X^y#d6zOD}!6O_IEdiEc5lJVRI6yp6Nqv^d8bQ zi-fDQacP;X%&A*+nr02SN;R+9YC~x?B4n!O?*5uJ0PAiBEXlpbs;R~ESs&i3;<*>V zszY89d8ST~4K7iuE%E})_&5~fgD6cteEJe0=5F}I&^O2;B&8<Qw6o}O+vVQ)0HFsO zXSJxN$)_KICyWYYJ(8W2)YL&nyGwOP8}ybAGEw~|O1p(eb3pZ~!6Q^W&%gg`Rx(L7 z_1`6v)ZcnQ;HzPiV0{#uG($FN!V`w2f=v<x*ray5v$Dcy`bg%ezp20`1+!{M8Clo{ z4ogY>?8cBoI_Y&_CyrSv(n%5Isy@iQ#mR>VJE!B)60kG1eo=>=)uKA-15sv|K9JQ( zI*;W;d(}es5Ll2Ida*OqH!!!tW+(Mn!6YrTCHb?`2Qc&mtdP2roKff`*d|t=hN=nl z#|SJ?H7igxtE6h+x|HFp<i-Cw*r<C-L7O(g?+BYWw64oL$2FQJml}EvipKhV=EMLo zZS{m0fdPiES~=6H@nV3OWB0+gv356jZLUhyi)pIL_hLdiO{2>iZQXZ+fH7bH5%3aF z=Mb0;<9<=hhUGk5vSF#_WBIz~uD+UM+eS=sAJga>UWD1&n_EL>i`*J{VM#Sx7=vr_ z_ZGKY7FaImL`3IF=`M^m0X@s~ya{%S(8D4lxHe>`NYP}?l!K*?3LV`YQ%C!Q&g;z8 zA+Hy%&N~lj6W%#?mKWZA?C7<PZeQ+=`oreyPXRmAjH9s~{b{gMe6S2V)wAY>okC|N z#pj`zY@;P^O3OeR7K>aV#bTv(fsz=ErXxxy??WgLgH<YO@*z~}REmD!nJex$lldaQ zNa`X!W1n62j<Ue04#Dx15F0u;Gq-yB;+=ODwQ8vum^FN&Gi%hUrJA*DCxB?POJv=o zHT2?1y{mW@LEwdn^4&C6e!3?SbM7<T`nlp`SH>9d{<LVm#}^^W5q`DMM`Eh$>8!2G z#$Os+w^RIOnkg|gwe?pwsJ58z{dUEg`I~C)Xq5Xf@8{$tvUBI+(h=CXnbWuG>>SMx zr+RK8z165-uPG+C6ZX)$LeUv@`K#HabnJsQLEsQuRW24Jk?xI((aj<WBlKTt5=P3x zl_Jsm?*MsSFl*ugO5MCjiJP&B1ws>9$SEP7n4X<KcIMLi?=Hk1EMsbR`3y_rd=PU^ zP;@J#=z7p~oW&lccjV~6AiaMRQan;K5}Z=24gW>9qqM*&*Q9v>vx>Z8-#%fN(dxUa zE8FZgzHkULc-r;Ce7X>JWrY3P2D2dSH+_d~B^J_r{#xQ|x85l3jB4NbQ_R0Q%KccW z&#)!pvK3i$+7N>_CtWuKP3#~?K>gzZplKl{4%=w79gU!9YFHwLmu-hyr8WkU!-zV1 z$_!pM^d~iKry(`13zQxjs75z1ei8Gc(kMC$#O0*eWPj0Ybym4LhY3FQ4!K)#nLuRG zhm4RJ3390qP83#Mq^XZ7LcccbNP9bbI;goctGoL8U3vNCf&|cB+WfzzOD2DwT&L>M 
z`Iht1zumQG{b{9FiRFzg-`DY1U0d2?1AA6}O25)7_g6}P03#Px<cF;>BtNR!ibvw< zq$0D$n3%K?{Rw1Nyk+SaH3X34#z=A<kQ{hSN|NtW+h8w#aScDm<{-HvM6M4i6?h~> zuP;?r;pd?Jin}QYuZ3jPtBf}?0rrz4Gpz_PN*^X2=?;I$M|`unUZX+*>K&W?%kN_v z|K&@26W|Z^9?7-2JxY-BaLVik^?6X<<vjOWPF{uPXURB4ic^KRFhLSM%Su;t3s1(x zL$>tk>@7T<yM<^i;>8P5@g^qDB@$;J#JOZEeGzGLW(2<&8b|ZuMZ8$L2{JuPw=r#X znq9=}h`Xprb;h3dC8iD@qW4DNfMBb;dM_tIrz);0VjZWC4>{6(@nk<EpDNk}R4C$| zn}GN|jF#xHtVI*;X!0r<%ihHQJ>C7)HDqKEW9(%bC5r>)F86U4xL<MdF~Y$pJuq#2 zP=f<CGcrs#I2TVZC^!35OkDVXd3zV|Hm>VjbY}1dK@b2zf-evRK@fa{00@HMTNFuA z6h%=KMak6DvMkB6Wm{1c+fj8~RdHR#bsfj9<2bJB`uIy-Uw`fl2tVr2$&ruio8-8z zPn#2_uABBcZt9a>H|=%P)QKhN{%g+w7=RCvlyj2LA_mXdd#%0p+H3vmU$*3&tOph4 zbEF5E5D70H42m;M54MmVgrEmou&fJ~0{kFGI}Fj2<!%90nsg&Bwt!7R+)G$=tZf$O z7)HgP5oejYt_?%9>r-uC03@G^3JeZG1-8p7&`&RqIg+7{?l8r3bC5cchaG$uD?n~! zRvltPyk!0IFU^V0qM0@$+C^k7rCpl_{11l7DX0}#0A#PC0^ddJa^9SJWo;^uu?0^m z@s54GgG+29@y9bdFrd-_1)0p6=gmcuqp}XnhQ(1^a(CJm?9SQ(UfdrPr<nq5CItvW z0XDB-2&iTw4(v`{I59-I?!h6z%4zFBy*y#?Q7?Z0Z(#-Ob7UTnx`TZn(r~g5CUjYW zU7rth4lwvk9B^d<d&%2&Kh#9rChG>PXB>4TgW;}r<~){C;a($UbHo4(kPuRk$g8fk zkuT<kUAK+!EyEL=+Oakj`O6$5QK$w%T-mTj4ax2o<g<Bczu=84?HAwyP6ts}+A=E2 z*_M$;*DSeTENdr}{X+DI6C1lpO*XbbO}cGLje*z}%-Kbl^<XjfjgFVecqrS4)D6Hg zrIBU*SaJKPh{5q|Y{F51u*eE(X0i3ijkczvW`<hE__A)rs8B^%?ZfJxrE67<JnJDh zJ^x2D&-%%)*&R&V58z~o>t!2vfcL6B`n`%rPe}F^y!fU13#`Ofi%kVy{*>(Hd#7Pj zKtpAQm%o^c9oas`%b#~PcVr48#NwdB3d6RUwDWJa51P?%tE#%LYv&Dn$*5=?^QAZe zF#BQ`?dX0@$QL>Z>;#M@a}sEq+P<4b@YN@Qbu-$oK~M5QEW&B|Xpm)YX@11tBcB>m z4_?!Zd_#TA(2eEfe=|swE2y$wgQIcTsXBl#OaSvgGq{z5e1J<#({|IF*>i^EJ%<(l z?TaKg%lpg?w0R`=$ft-kLRW4KiaXgpG(r0iA{(f8k%G{f#mE%|B1W*u;2<(z+)LnO zha3!Zq!IAaA-p`X6N&&(W{~!vGF@eqbfg}85UNDQ5&KJsOT>pZlItlt9HK2Q^&&ih ztgvEiHV_VUTHcJPObn!{$#AqMLZtG`K};5#8&&K9h^GA5!@3VBLhA|vhrhk17O791 z2FEfuk}D6bNsEYA>N|*+eyCtcj`3T$L_ZWp450rd>`Pr+EGJ&I1@S6zzs$)pz2|0n zZBgbpGRwlHsGvZYDmPrpi?=KvAtgVx11PAS=qE8u`%gz4`_EK3F-*N^2ZxE+3`bsY zFNXkn+7*_(W&oT+^qLpt7UE4|RYxmvZ&*Yl$ED+}$MH+UP#mcraeT4EF7+Y@D25#K 
z&E4DfliGOg3|0qplN30FmvYlWfJ)8#Tf3Mu6xje6lfapMCdQO8YZv*rv<0Dmv^cv~ zJQd{-yc0B0P6NT`&<P7w5_T`}EqPC&gO0jQRtesd|CIibZ1V7t8-Dp{_FSZUVc--R zWc3Vv*JwO7aAo7*DYQZB9YPzlQv=E?Xq|OxQ25s1=kN;NN3Z0kdvZZ5p9atLWB@7J z-ZT8yQn%zjqlupB(WV=pX?xD3Ug5`bpMlNNT>U0=?`KL*`4)bPOAL_iHOTxj{$N** z?oE*H9f(Mqp_e=M9--G3;aEC$G2QEsbuSOca(wxO51FPYbZ-`YF-AzEqTSG_?cu}- zwbUIP*?_C@uoz`jDdO(1R14j!4byA2D7_F5hZEsix(cJ~jJPjMO)Hnyy<721BS=xt zmsboBtvBK+QXk@_V%Ra?(lf<y4z+fv2^nfqzXk$1<f`8QK8z06MC31Z02oE;S7+Kt z-?fXW-vl9%OP5r?+*r$JrhQMS<N3e5UloYaUHUB4FApZOr~}{n^iO;BjjJD-LW=DR za(D4Jafz*@ft?X)3R)KF8_m(cInuxr5os4*yYcwvm$ZIEH!}@HlgzC3%i?Bu@$Tgf z;0~z9MkpbaY2yx3rj27zrUPIZ-AGqEw&4nqfsHoek`a%DrFJM``yjp69>r^)3nvEK z>1q&-c*I-7#6GjM65fMf+5rPN?`z&VM~Z_+&|CIG(*_;$-JAEb1XH_R>bgT#!p@V6 zln`-#;I=kxBPAS~l9e!QB1oTjGgCq(G9=CIVz!X7PU`^RP;_9CJzSrOH2g@NDF1|Y z4LA3Uy7vb&UBeML&`l1IfmF=km0_m6FqdRJl~=yU4CHHrpUaxnd~;c!b5`7Lv1TiL zE<c@XDc5N`wdf!Dk;$Xv$lJNQ`P;a}B<XG}B5j9$cJyz`(cL|yySGQA-FWTLt*7zY zymQZP_+fyAW?+CDk1*xks5YNOd7DN8-m|;~oq<XXLwRq;emz4<IJ_B3co3A_Y%iag z!WCT2VVkr8%6l{{g`m8l4fM`XCtf=hPHbR%`37_(5^oDr)0(A~_lx+Y8CXz+q3c>E z_mIk=wd=$_=;#K=e9!m+mZJipLi|oyePf?nq`s3--)P_16shk<fWyd!Moff41p1z~ z@O1r|b2s=)OXm)rI7M%@?SV2ck3G!dkn^h1J~JJDOE$DJIPsGSm7&ckQPa(+b@j@y zT32(n`PWn~z@kmRKDSWq*P+HhGK<jT9>oT?abMtfafu1i<0z}hPK0|$a`gBH(&O7A z(k%4&;4P=<wM8E2hQmyaH>ho}0$k+9yOuY=l=v8e4#Sh29wN=0o`fzUC_iK)9wq3u z6ol3W2k5n62VOfFP7Dw_e{cYOnZ)B^F}R%8-i_~`CX-v&Ji!jNLw0F=4-|92@nU?_ z*uDd|k@5jnAG=-F+9+vn?#@qmwMY+3u5Q>SD{aU`IxHS0rKJwY>0q05!)(^>t_O{@ zjLNFfY*CdPis#WAa0$=U@L|?7eRTIHjVfq$7N4a1mP!u?;iP`f$eYqeIO9O@COD?w zQai9K+g*y$SyCS;g-V}|&#EOHq{*xJtmHFHtl4K+<hADd40&EF9ERm!O6zhM*5V?_ zR@T27Ct*I?d6H~op2`v)0B!>1{g-kR#0t4)ZVt-oU(#b;I`#)P$0*1g;Tq&B<XT+= zS#1q(=a|(IRbbGz5rILU;p=*^KW;40{Q)c~UywOX!98Hk0=Wc(gLKYPTm=D!RdFNu zg2l(F*4%t!4g%`4ac;F&UxJ=m85V_bP<Qi>t<^t3vjVXmAaSudLS$-0@vBBf9ofi? 
z?2#pmY#u5KUc>58QOo%dpOx3Z>Qg|<>!TJb(hM*u^OI9^>-HcL@E}OBmvkT$-W91H z4n*R{+dfYz^gZ_Z7R8fTCFF`2Gu0J#1i4z#A7*$_W|-9tcb7EGujUx$%UQ#mU?or) zuHXlC&!|<Kd}5Vq=PZbf+Oq|bXPvbh$SW@J5nuJz%4QW`^`oDOOAKEE@dWt9-$PD? zYWMEppOEuzVpl|(hQAOU7|Zb`_L47gM?|^-uibRp7w{Tbq!~YIwpcF*fpv74A+V0h zI$u-QL`gZN!Y{X|?~AL&W9@`r0P<oS=Mp13$&(lvhbM6(T9U9h#Ln%wnhT2~Hlkh= zkA<Z$JdiN+PQo#~_W3aNPhHwmK7;ShqPpglay)4#g}%1Yac{&fZQYN*Lyi}F`?nmt z{SGqVVY}2#2R=j}x=!J5t`7pgBhk4T=&iUkJT3bm9i|=hxueW+P8af(+<-UE9Xd^V zzt=urx#z~u<4tRYe7T_PhBi%Zq)gHARonWUm0-wmp?`Tu=Ecs0^W}EapncfWY#-L| z&G9|15j@L!EcuH6KbOOPU3PsInh)-5aA8sJX~Ldt<^Fc1Uav|YLxuSY2_eNX7OR3v zbs#V1v@V2diQSC-N*oNM05yjLOO5?PR!H5ruqXoan(kT%02p8DO1CX6Mtw4}gV2mk zi1V8=oeLY)1)^<HD3H<ub`2y}Yie9Lsm{#YmbHMZ>0X^v*ZHlSq|T~4StACj$nHGL z15a{qt<<-Wu@NbVk#rA3g~6<%JeUQE@Th@TwJHSQ3mCAaI}`#RlHZeXKHZTJT!n#n zj{^7@H1ifhyKhK$BWzL&E7%EP<+dhVOn*?EkK$avtj?-VzN_hdog=J^qqB95y9<0m zCpGBUyXrpJGOt)l_W)aGL2fViFI=LF&Z?;Y9!CB54#YmX$c7J3fVZz6d@G_Yk|v(( zBZ$vjBuhEhSD(RHGIZCzpxCn@4FRZu<0h@T&GM?-78Wb08hjB^6jg)I!o;^D`L-W_ zD<~LT>H<UJFqzuoA~@A(8>g|)^U(<wnwxdJY_j;8BRh6e)Q^+8)H3V?SToVy!!WwJ zxUTxukUIggp~m%q+<Q&TfkMt&>{da-sjJvp8ffRp`Uxz|-_7_OgzIS_t{tJgVD-(J z)pGa3m)^(yRH?V_h)5%l`waGO2CNy`3b_Xqxho5`UxxUl<z7+VAIL>>0nF38AZ>)) z5!qc9oJk5XE-S<onPF)cB#!3J47x*ADf99EL25%F5p4;@#yZDXJZl$<u}v2AaqgJ8 zfm~hCjH{*IKLV+TdO_fx4SEF_7=uL(@~Po8tl&PoEq`_*A5@*X0LNfpE}6?~HEp<a zg`PfM$yK%MlR9B^<bCx%w%*&gJyeAhqxB9vF4&dCX$0zHwB9#vf!w{h>%EuceuHYg z_j)tyofpvrrF%iz0HLEtEXj6EmhI$9i=BMMAViMtCQ=0@9!p)Z-hRsbu*EKQ5K>EH z2L}VHI^yC^TJ3dB;ptrzaN?wxTJF6lqfP|78T_x*&f-Q62GSzp1<@<lgDjT58p*jB zq_uv79$N;bwVH0;IkJB=XD7C_mV&G112aS7tf5VGj_e7abq!lz3)v@}RJWmo)2s0~ zk!SKN%KOxNbdJ4WaouM(f;@}wAzlO&*#nCOqfbki7Jst5BDzVo_jYK2zvY_t9+<!D zqAt1)OW&;8$=BEuJV)-2=dO7fgSB?SD{G$hk*2uGKEhrRY*Sp)*4uD`OmFachQ8F* zpM`9S^`MDPhf(svBRTdz-^{P%ZReJhsrQ2-EbbQ@Ta+CE(tR1;cU$M!R@xC5Z)`$m zsV5uZg9GkWb_TJ|A(~d&AJQ=mFDl1C&IdYS2Ib7R5{sB+b3zz^)Le(U9LqZC`C*FH zv>MM*ncX>Ay5CDfYF6ELT&*h<Z`jZ4748k>E=I9G0Le1Skd&jyBP0fBL)e?%5Z1~P 
z1T1IYf;5Z`11Yqi?9t08D{?zk6hVtBzCcqfg|TZ;2|y4&-iu|G-2)p&p$XLR!5!PY zWs)8{9D^2K=In`X-#N#Cdc}685Z&sHV*m_5iEtlvELjdxwOAomhE!dLKKvqssaju) z@*IWur?d*(mo_HRiQbC%zSXxY2d3o-U|Q}~@>zTMxon?4rB87!^-#-D^RQ-WXgs?( ziIr1l${dE}f~ZJaE!}jMI-{ZKEIOmL$Y8scATE{(7A?~<ZB^Z9GcD8PZnSwV)5Z`d zLRUVq&n|93YXH_AO>RSr?r81uo~7akDy2S<g`8kA^Kxrixno)tsA2%Bl5CXNS{)bA zDq5v2(f3^6tOg4F@|`S}N-JEm!>GP<yV;P_Vl*!pl*j+GzOLwmo!SdMdqBxU?Ik@+ z_v%yn8vC}X^=ub5Dr(xd7&4WYZh}{`Lb_?`4t#UaGj!maQ+41YDn0CFOO%@8q?*&g zb{ksoO|7yu-xMPyYvW&*Tk~yGu9en&JCLqI;vT!&W;;6ZiT$khu*)v)LN`8#h`|4x z9rRSpM^foq1=6aDzR4Z>cF^HivJr4)`CNbY2(+3ewo9p5EJw|rf*+J&$j<ax&-Ovi zZ)KZNv`nA%d_~I#O|7TxI%4WCT>Y8xxeP|?7XB74v4i}ffe3+ysS|Q&uUgX%lco_y zZ&6xawrFZF6cBmK88z%y*gvHSw0qno8!teMN1;(ip=^C&@u)4?BR`K097mT^HlXd8 zvZZ^o?}f7M13eykvu{Hm$V_kc5T0*`5~2^}M)ZMfAKJKWYF}E}+Cy}?SGiV{?H~ow zsdoo~?smkbQD~8Pm@22l4#dK#0wQ10N`W3x6s@;!bYkc1!K1YCqBy1*6|Mx@BlC6` z-@8tv)GF-V;;y8IS+yR1@=Dghy54r~!W}v6M7p{he=8SGvcB%_IHO;^aP=eO#S9MW zIDa#jm?8Zfj3oEU`WfD**3X+sKfyUO@4R_D5B?;ZLN(y&*5$1sD(L_hm9|1bcgu>n zRStk3gM#*l#bfAgCO?n99mkea(A)B?;XWwh{!7w!G%B4^H!7vR%b`osMkweo6g0GP zcxwCpw1S4hbh%HtRuuF`SVMqJ>!=}`xZ5tyfB+QPc=6`A*lQP~aq*agsP*W;s5XWu zz)*CwuYc3l*<CjtqtBPy85#G@&`rlOXp)sn{MM(M<2iZxSJfs`4iK5Ai0pkmZR7K; z<%(dZ^=YT}>Z{Pra~a;FWBjdhO$If=LvbhB&h|ZO-Moo(b8LCrnE^4a0HZ$ujhYEd zTcDS-vi_ksl<>_*pqIU2@rW(iEkEBK#KkhZG$8UhyYwFD-(Il8LoX-QdfATWgO{WY z(90p{Wsv&i%G9CQrGxEsxktHH^zr}<XTlpm(HA<(+ofr!lQaUCPrM10$9C#qafHeX z^A)oc5RpME^9mb#dPla*>^yKpv7P}~PZ@D9*={21xr~OYV^_#1rt8QQEXn#bv^B?g z{!FeDA_@T>=qAHA1i6p+Iy%$m>*orDOP{W@QN8x+MV0Lga<}m3R=Ky&I}dMO(spK$ zqAOtSvpaiY;vhv&CkPvQ(6;10pBOz#Zy6oJTh_Mar+eAUTfSDBqF(m2>Hqk$m)Zks z#UbeKM)a}|Q7`-U{B8fm_Rz*{=}n(;-yVcTMQp9MshJdSmtEQcVuaOf{1vqE%evd7 z+W94?4AhtR{I&8vt-sy9A=j+FvnT?mm9_TU)RWxeGczaDRg{|b!3q88n{T~;=3%=p zHDTplRmL{gcAR$Jzn`ypo%iu0?=Mo@>`5BAwtcxsC;z%yMPtuD1HC<#;dnaEe|436 zzEs}~jK{cbDLhloc{?ve2RC3jKQ-$Hh(5~FDI8BXqICQEy@Iscn)V9dwvtznaa-5! 
z734yo>TZPHU7cf4x};-}CT5Xt1`80m2-ksYTAu-SW>HKid^(yvtS2&&A-Paf^S=$> zF^8efPLomalv<(Awg-Rfpa1-=AD{WEjno<1OkK`Rq|M$XwRu_Ys84@qJ({e&`WW>1 zA?CYM{P0%(IWDo2eAf+;<UZLyi0oJU2S><10EeqoQUN{QVxw#W;qufhQ7fU6ZVQT) z3(0ivp%qFsb2+0>nMw6_lW-|+kd=B;h7H^Tmtr6+-eODk$<Ox%aj}d`aU##9*bno2 z;F2^AmtuR?EbGJwo)2G=hT&3-z$k}@N48HNNV~e>2wm=1t`(Oe2+lNOuBx{a8M(ue zK|0-V2rul6OJl(Mi$_=lsxvO$!gz%99N(sW8OOJOU~F>N?nAf0%4Z^+N615|TrGZS ziL%kvL<yESob_bY`_bLM{P7ZKrF4%P?boguoKmdykAG(wYpwlj22?6Y=5thCE?AF# z1KV=-BlPGCau)P^7~+fM?&Z`eau7alH8h<Rn0WNTyWcG8B8hs$$ffou@DzxNCN-vR z4mnfwgYZ#9nvK+KTk50pzfI)G*ji+*vjmfzqJK6?hw@`RyCUY3+!jY&HnIGY)C_X` zqD$!5Py}-PE&!Uk%0Z4_$^MPezX`mgBFD#bM2-(u`U;TaTPrG?V@w@+)nn!$=r}kC zrt11`Dh(C&cCi!Llx~omy;xXW;_W2%9El_BlOEV7REJjcnRa}Eburf)ZD*c2TNHCV zedCEooWkj~lhf;h(?<@?mfic<!(Y05=Nc&J3AaSD)jq%#a2@CZ`dy_9s4F53K(^Eo zG+89e)=x4Gc0i_a<V5&jQg7fIl$PJ<3EGSHxuBlN6;#Y7mR*vXASmml`NAeE1QoN* zm&eK=s0#LPC;gkiODaL(ITBO_&Z`hqVR=PUry{8AV<xCxwCa^K0ukKA2`yo9IBTNB zUIL+bYhxhu^@a^j{dzjG2tQ0kEBPur|F)a=>|47;w0^&{)#nUeIkakV$a$!HbSHpQ z@NpC9G4@YfB1-#6P_9*^$H{;!4a!NQ69}Y%Dlr9=!i#f3v1>t^!P1(^m~UxMzErl^ zSda`-slPPuAF78`w%O)G+u9+OF#C6s{!QQ|l~nK?NhOR5Ye>Z(3J*;xQprAMQYj){ zqFM%6iuP?MK3+5xR;Qk}Q)}3|rk1X))c<lZz)YjG?{?8i&sQ7d>!Tg(T30zOnml>! 
zhxwncU8*^0rn8G0gI^w6wXmsfTJ9;QA<Q84{rZ2F@A<V<pD$h<U81uWVK;Nv;IFxb z7wy0E-o`$HuW-@|{kj&*`TAwLp0-<R8Lqy+7USlSdjOt>ysEiP+zjD}+G%|@$+&cS zoUE1CCozFZLz}QZ!Kt2EpF4wMWI@`7^|_6$PqBi*g%X+~Ivs0@(Si*~z+Cx_+r;Ke zQXf{U_tJb#pBJlj+&14b-i+1S%KqI<|0eK~YPI4yTCJ_%_QPteX=&}-tgP1TW42lu z(xj}tYl1KZoYbyfs1vnVs2+bHGDZc$iv`FK0!2L5XsjcKI-YfOBFgRB@~o<<Js-Nx z_4$>Yp2Ut-FU_>S6Oh*>U$)wHsSAqcNRl&f8^Fe83?>U@MqvSzf9j<wn#t!WkUgD( zpwfATy7LJcnx6mT%UCc&)W$+(Aycsc?W%bl_c$nrSbzptc2PrnES0nvvQ4AbBY)wT zT&42WdZkehuUWg;wckO;pFeP*AbIferB%y0z2B0YJ=_?#lRC(TXgxPaq#f{n=yB2~ zuV-T0kv5KDJ%j&OS<l&SP$cl&!jaja*s&l@VWm&mWZZA1Ebz);`rtN|*`gQsg8~XD z*9vU0i7l6;eprFpOY`3TT3CTCw)xgAEwBP@?B8+vH-VQ_RsheD6=)+vz$Lw{ZT;hl z705o8_iP-!q%vaX0n9iQ-^waM9F7ElBoyEUfMhe&A*41ATYb=&n)-&0&8+--H65R> zQ&)0r#b15~ilg2ObU$8?60v<;d-byJKK)6qkaKa9=!ci6CoMv?QX{m;6CI#xB@Lj3 zNi*YWN|-fd?#+b=hU&wKAP!B*E!O%V&PV{4fNz3;|I!rS%t=8j<YH@O#|$fv25o@| z<)Axb>!VKAd~AH?)>Gdw#w}-Orp}vFH?4~vS=ALufAZLGp1n0bys`V!?_WRvu{@7v zoJC6D|BiBrI+{nYcm(Jy5&;2fIMGV8wb@eFirE776;Wub&LUfb(j?7Ws|E9BMOs&N z5`Z(g&RKg-ISc%S+mkwQy;G)pMm`x_e(E|WOoIq2^bT<+C;X9}+yIworTGFUWjp4p zzL&9{bkclHm$Y_bzR)$6Y1eF!VCS6F2@3CdgVoi+cK2fHeNyTwZqM<%mugACIF458 zX7L<+ZJR@XFVEWw&H8uD{1ewbYp<L2mZH=T1VJ#`^_Q)MfzNRMtXvuR`7nP0n!k8N zYQp^0bPkffwbT4f6b0Hbe}Lq$`I`z7EjFAb@_o%lJMmo8c8K!|^M<124du3vQL{s> zG;{ovH){Eo#p2f9YWd{4C+)2q7w9^Bl##ws|CeFxm*w0PcORE%p!w>KNX?ipZ)^k2 zR|XOnqKVsTD-2=cwki_`BI}}sCBQ_)*pL^<(>GvlYnx<xuB~up!AIXhV8svuF-bJu za0KO9&*z+9m%{3D+OJ&hz22G6%b)PiU*}rqxgpK>c(*PFE=g_?F!$+fxLg;@L1$ol z;&|S8^&?@g@GSFZGXmm8+cMIyFOmk7JAayu>WpdLaPga}`^CDVxM0#EH+j_ulLn<b zp&btM!eU}dYIVR#3PWUjXf4C#xfitG6yDGs##s9~G2@X_JCIzD9D^6qzb#?JoV1j8 zt7?fVL=}f4K$MYp^gO83rg5fpmOrkI3m5cWwl;j+mpS-@qBlZKyG!ufRpIQtEIu97 z1x4F}SP+&<5So`8F^VOO30U&LgVcmYk1Z+4{-Y3-{E$xx{PlTbp`#3oKoa1+&2u`p z7Z=wY2j9`jI$PJ)`!kRDkFybJ)~_S=bR7I7lW~3y|HP(ay4MdDiOj)Hz1tX%HwL=9 z1Ik~FUFJ-`3uCY4wsT*T`>e9fNM_20*G#<kl->|0Bgt-=u@XdVV5Nmk1@ApCPE-3- z;<)bv##!(oFVbg|Ix*5#ToXe^AGIu<jEkN2mvxqMuc;|Q+ZG`L!x%K9t`DOO1{_$d 
z_dq+n*zHLA>RamMkl-SQNM*k(b!DPV>n4TdeU>upcn0ivb_?xz2i8OcsefMH-833y zbG!23+6W*8Sq7R5Oz(eRj!PseV4h_O)W{NWkpy}o6htJR`dC;D*r*KwlfVE4CrAQK zBmo;F&}5TPoFG-;8dPOc6H0-_o;Vd7=qx5z4XGP3e|XaK@9|>(+vWLp(~AK|(pFq< zrw&_-siKS5K;@dl7ypel&8@a-ZBv=i=Tn%=YSdwWL*_F~G-hTK@mSJl%5sR~5$aw| zvpE_TJ8eY$&Sn$E*F$VJLD?+{9?WK(u?|X&xZZ}@j8oCiIMwM7nsgRhS%s-S$dtCx zAr-e}NZZ@mF{cCaoH7hprz7d9ZK$CZ9E;~P&j(pQCHs46H{^I>>R;A0-}*$e+iCxm zt<q@W{k|*5*Ester|7cI$8`B!wJz6yGdZz>60c-rnR=6Mi#F2iWuh=ym7f<Xd;rJD zw8>2W3!wi}dk=gNa}#8TgSF58GN&sn#!?7C98v|6LToj(Q)0qRt07>|tbl86Fl!N( z@V&gWc<RzMg`|BWW5X-fDwMR1Cz*I~#{Ew&F-XgyDYGWp<uwtYHL)d<jLB<aQ&^1I z=$uBqO)|~17l%tw=)RGy7V2~%)L^xAFBdlHwn-5(MqnUzG({jGxuBZP%Tnb=_=)@= zwRMv!x*-gtAtLs{0<5nRliL?)55-7MO~9_9OSQ6V1VfP1puECb@iIyqk~O|YFLg%A zZ)Ck_51|m4%1Ky@40CLnz^z@NYARv9ux{ZBSKq=p(yM!&b8tJjzu*#K2vur;f$P8n zN<PK#n3_*Z+wfomJ?O`SGKf04!{1R>2o_3Q`x)7#-($jVKEvu_>*2gihZEiP6z=FI zdO%b))gZK08cw=pcgAg#wjk=cS-wL1gJxWzp+PY~l~gBNPf`rWrOlMMABl@-HAI5* z<8+X4w;(VQ7kAhxqrVL~!-Qq?IEj7;)V=`4wujQ0B-IvAx!TUKOJCx2>1zXRLNjl7 zB28G_iBM*}&x?O#0$-V$SLY4K!X>3HeV~nR=4<sOc1vCwNwfECTx$-qR?h`m+w{z< zn|aq8n*-kR$TNphW8b_v&Ctsmg@S0%XmfjO1GBgO&{wZ}Nl753cm4V+<Dorf^O;Wn zLuPZV-Jf#o<}GID(@8B3y&*FgdNX{yM*ZjqHvO5ykwEGQ`S=2Qb6y5_$XlEhC)u~T zL>tMl4pLzW3IsGGc-&8-qu#-N^nl8hoFv0--gYNsNW%4QBT;!PNrqEFv2%eKa>dDT zqLb0$c7`ECaN<@$hK_KuT$Wln1577J<SS6z1R%pkS%x?#OSPR4M;qXwXv7JUZ`ed2 z<&i4_aZ{qa0d6F9o)EFvwxD1g=jjnDYLog&^1TpZHp{@y)z>S)GwGe}R*UcwWS5~g zUP5-ns7yq4yh$*V?5e#!MRqwS>nmmJm)q-CFYC{<7|veq|B~09SQ(L`5Jp)oU~XV0 zgs{YExh4GQwj1zNay$?oz_A*AzF0g0#Xd;;Hb7FF!-=j25@8q78lt|U!A8dgaSJ<J zBuf1BT1f?7+Z9guOX$k)#nn`pQBO2<(GEwQg8cRuJ*{mUM;P}&2{yAC@<@6C50ea5 z3ae?Bh!)i^Lu!>v?v^FJUY2dp#o|X<Z<^N|*2gTUTNF0>boR60>Jh+qd0Br4FyU6j zd;f%EO}@mEh!lj}994djTb$&E_U)Z`YLK;WPY!!q;*i_0BDY;ZF)XjJjbSwnb$<>G zHN12zIbweq2}?zwjq<Q;p$ChUs3fQ^v^EU6G{G}Lqp$h$hCaqf4dN#k4si(cp;}Zj z8%b*UYw0s$-rJpBQ@RT|td9!yZqE>7C(j{$Op}&=7t8C^fbJc2tm@O|&q7M%OS%!W z8rQYLZ`s9tqC`kb<p^o9yFMpk)lCB1DsOii#%E}ujno!)LtrC{z;?6ztq%)rIGl+1 
zX!%5f5JOj3^vOHhXjrmAM7A1w4NeDEE;>Ql=&Gg)S6jlOEhq0|f~}N~{L&4Oq>p^e zW^_UqYwRypG&J@PZY7t*hU^U37tvx}Dg|j5T5qU|J@n}0lW<lAY3r*6NP}2#9i|X{ zZiL9Z7DsRVOOwy7d7bKhp|0#mU0&ZJ&5Ir#`V2%baATS`g(JEltWOWTnnV%FRu@TD zvLIBVM1xC9B-0zQCMcFKD7`Ea<qn98D|aIs@1?*{DZ=EU8OSj^`TRu=QJ^UWC`HvQ zd{F~K{vlNxtCQIPo%REe*EIT5a#Je)2n*NM5}Dp)WfSyuO!JB`sp~;5iO!x0E8RC2 zp|)XF0N6R{zKs#;GTKB@6HibqTS(g3UYjVh(|wmg&B)2(qzbxkdt9`p)3u0C8#Kt+ z_VLA?Iu-;n1kkLY!&i&32tvTGe^Rizhs~+?viGa&VcZ+(HaM2-{et`t_dBhLVV>ol zmA}O$3W+^Mh$M|H_bmthbPU}pn!`)cl!@jbFM5{;W~*}IwOAhv_^gYIf?ecb+R{`8 z9KvD9OhYVr4!I$JnEbosX#ZyOqpb4_!gz;NV~wOEdJigmL%T^V&E#Kwc=e}3h+!aW zxrez#4g4#rlJKigk2Z3qOd1=39Zd?@(H9gg3u#y*XOn0QE}TJKIS9u(vf)4-CWIoE zZ$&q9jmbfdkq)J>HTVk@VkO3F1B7wDw9;Km9+C=!Sggh%7Iy?<rIBz9dh4V=F3KE6 zJp?ke0%Umb#3Lnd`nBZ)eNEcn@QBaozo+jx6@u`}Z#=YBtF_%blqx*A+gLet_**po zU(ooq4{`<QbG4C6cxe1(^7xDB%%@7i)iQu*DNb?}Ju0QH#CF_!5j^zyl8Pz{8$nPi zfF8f(4fF4*Ab6?&mVd|EfSgB5-M094DKkd*tKZL=xMD8A^>M$)vDSS_^pqf(HPbyC zgQ7h`xYSadPA!0G^fPt_uq@2$!hObOiG!u*f|3q@z|M#+O0@qHhqTlOX}zd1l-9E< z5-E(#c%s@%=MlUGk2^ZC8)MMT9T;?|vx|X75#2*s6T$*;pE9vmRTNfnsnq$mrDdtj zMrXUbremqgdHJ^Uize#@7OjoP<*02u&cB__r`eECvji^hU2-0o-1Vcdqh*Nhc5#2q zB|1s=L3th>AjpBgN)>Gs=dKASj6|73p|N?R%VBJuFEP5TuvBs8qav6)?`-YD)M6sV zqM+o&pH`bxj&8HnnA&bkZ5L(L9Hstx((zUt2@*zg1FeN>`#h%!b>eA`@nO!lgd!9M zlZ-BWqR3fZjSKWWhOH(;kduvDD(bCSTJ6qfote~Aa`>Bt!SV2!{CU-n%3eza+CKSx zsj0?zcY_=oXMa<5OU$Wo^&@6W2zTP=Q1jxW`9$|X(#vAf%S_s0ajF0WGzAOk9D$iR z_Y|KMD{|07IDkohs3BztifK|n3{(Q6iAK&<KBkw&Rd}_$y2elQw^Zgsh1FPQo=#<R z2CVV&)A^I5g5_NKi^UUxY$98u>6}TNVh_a2m28*ig2j^7kVZivWX{~gF$--_xRSOw z6b^0~GG3HonXh0^P->+uXWDx30ZOa@2<0fNr;r|T<A94@a3IcCg_C<biywI?&PQ2? 
z0KBdFR-_%4n?-eKXo(fS%9+j2#$3#w=@rs5shY(zPkyoQ>SbYyVOE0N?OdV^^Db2a z+GJl6s-lPM%HhNaZ$zrayINo@+k;~Df@FoaYpo{#)(ZHd*c47wTj{Er4w*jmhrpa# zt1(SxO)1SNhozZ=WYH}0MY9J)WX(ZGFHEUXD+Rx;rZHAXj2TgJ7L4+%3%NdICT^lN zhfL^W#Lfkmg(1BS`&ldbS*0}oa*V7fobb^oYw)-!tlrkzf@0NzWWji=EHuKZGK|oR zUIG}Q1<OftqK=S;r`p+Qa4UP&f?l=DJ)CoP%3-b5J%{bC*4h5))HQ~Nz52DQmo+YM ziWH+>17o8K4atk~xWb8M8XIDJVi*$aWB_8A;6$C1MpRdW5jBRT7)InobVjPehQ>(^ z*uf=3q59PC^f0m?K(hlKi|O050+dhqmHEec;bIQobPf}>?hR#U;?v2nPiL+*WQ_T3 zj9I{#tzi4QflHLrn5!^O2UwM8%sxD(=F1_X0$|7JA`gwZ$cZsGht-GbI<PZi28zJO zD7monaFP$=6^+<cvgc<R@p?_&J*`Hxrqt&n^;uZ=wmN5$PaVHVdujdlWX?A=$hl_D zH^?_r8<=wCd;<@SofF_-S1%+><a0s^Emy9@j&lOsQBEp@Q;9uODneLX5~@kgi%vJW z{QWtHn|lxP2f%sKYhq-}`RAM5j=U^Z$;ENTImFI59&U5?jN_2+y-Yddz@=dKj<cD2 zLFq`DaJuQ<owoGZ%<8~BYXDqSc6`<OCYQ-TC=b-?zmAgF;eym(GX?_XhVu?M#zi+Q za7Q%163*x+IXiq`WCLAv#;J-ValhE06I$X61nQUOy@=c)bCBGB+&gu1_ppPI=!ELQ zFUCNs<S=O!ivlPmP6<7QbU3{LUDVqoCt?D!te6K~MB0aH7@Pq-anb3RH<=wSGNqCO zjt|w!d1!m58OUu6wurFpgsMjPmCW`nQ*P;w<@SmB?Q}vxmRaQLN1BJ(`KOCZ%E&y) zj>uNJ<SRsuOEm&uAzRI*V6v5^d~g~HsZT?20voYqU{AJK8<DzE%Dt@W!%%txyt1-7 zy-eT;b}{GB3cni#D5B1u6ioKIFq?)}JDkLFFN-9j^-3jJP|*eu<QVEBQcjsxF?#$& z!j5Q1n-Gu>^6(Y~Uu~NpcSDJx1jXRAYIvXO)a3l!*zx<Gbp$JGMw=@S-ubW39J+6) zJUDAHetS<P&uauheTX{qablh~{n6;bE~g{VTVD75S1zaix$my~=6LgLtI_8wueIum z<IzpDmf760vsH{cFZ+X%4RhO0bLx&rZS8DMTjVsF>ZH{b6dM+#Lim4$HmMj|QzK7W zF@>MFVr;<@IYEWUmPi^(Bp2HgCLi)$(47|Gs1jv0%(HH@OBEmt7TX<Qy<s~T1vk7< zCq2K)nG!z6KIXQeUdof2yT&|aj&8TEc9OL69HB3&0bX-0GLX>0ZnE(N4c$cZM8uAp z*gS2N=gANht(nuE8S4*8#(@Q?4ztArfzM8Lbv6kBIH?e>AZD-$GZ>R+u!#s^xuy_i zkP?e^JUke|;N1j$gYM01^bXcw-2^APD$vcea3ji^62aI29aX@{ADY15O%8CSNg4x` zwh9yLARo}FudP%5^2&)%m7iLzW~SH?koU$XaRN-enRft;TuTCT)A9IpEXMv7^cnPt zauodSwaO)ZKAhzKMkSYJ4u?yMEBCaR6;%<5icX0TRXHSqQ=(YCqNs3uOxRtUNDNMU z!cjU#kIHctGBZKp7zwjLD`0tnt@l8=+O={4N6gnOg#_|cA2GYTC*z-7Lvq(Tar#y( zI^bOHNR?|o&^--IVLj^BpX3s?w4x}5QGy2v2ca#hu(k*fyHp=)UJM7tk_E|+CDjrZ z{VOz@rV}X+Tr73(SK^XpJ8DR)YMa>Vsl`T4!O9?!k%~<Ysnmm5N(KbDD88yO1Uk$4 
zj($FdH3Z|$fvLPi&v(@OYK!fqu5hOi$OiV_dcHK&-0VO3Cog@H`sDj3Zs7TkpT6%c z{*KR|dr<du4%6@_FAdb4<J<TL&nNopPW?*!N$Qi>Q|}7z{^ae{#ng9>^CDsW<as*} ze;~kxIaK@837&zj=@^kHrg`HC@2{1;0yoVeFtSOrshPGzlmRf$CK41&7o>X3UTc_A z<6zob+97Db?A2iQI`Fq1^d*QxOM2+J%v{z0qi%)^M2;#U8xW*!x1$e}*bM&_!9$7D z;@KkRAmsXS*Pg_f;@laB_M(%<J1flN`CK~wPkwN%iB$GIogWsxhx@-}^tI%K_H~dw z_eP{}2eao02M~N@g{TOqxmp$^JNyc>y#;>7qT(vM`kYVYHZDxH%<++M)sd-pRm0Tx zVSeotN(T3vU{4wgodIfH4Wu`Pp{nW#fJ}{5;3K2qFmX0tD|WK#?eji4EdA9qx-^x# zh9S_tnW<N~^7d*MeR>}iFy}60Mz(N2mt!srE1<(cNhx6?I*MX)W-C~1j{>CCXp2CC z38Am=FcZw6ESOwi?F8e&_d==dFx_H7ghDRV#F?zO8xH|zb*td4xL|;R4te7ZP?u;f zfd0GER(!w<*KQu}F111sOCXZs!cwBghe4O>Kwb+m0r^@<kv~ov#F;hG(t+$5dg0im zw$1oE>=4J6mXazq!{zVGt2N|jGfuS9Mqb9v(zXoPZ0h^hkfHABbiyGesy5`*U8|L+ zfRpcK?c*$tGsS)3nsFSZ<fw@Ry+N+43H4JqC0+zJe^EF`70ZzbqfdbxR4<h>TN9|n zLWz-8xpq8<CqKI*2A?~|VmdA)EMmp|Ftq8XS){<Au?i)cd(`0`^Mj-3(5Gr@79pN< zqqfS_zx(1H?PlYd+0kdr=AD)GKeIXte{9pcn(EBvXG()-(AaH!TjPJ&Y})1HLA*BB z73BZCC+^Ug+s#^v3hA1LQv1pU7yrDcp{zhiqeAqN@zkihqrpdhI-6@<3NfQe?oe*b zsDWk_l{bEvuOO?uQHPG;&M2WHOBK|6(1NqKDvHEhObQ{1PH3^8j^k0;4bN~GEEy$R z)2v2Rai&MsJ*T?I>zr-iHTX0?OS_&K#hhM!1=zETx;GK4nx!~~I#xx6apFisYQda# z4oxsE1!SUbf&9UQlg##@vmAzkAP^L<f{%@J4Wd;ELm?S~EC4J&UZs@dWZSH%G*P9X zomu^4&q6=7B}62k6gOGShQBoPT4D4YdInC-5|ZKE2<d2eY*bcJqft@O56UeqA*1mb zSMZF{I6l?*ALc@%me;+SCJ{(SyXrsPmmBR;baiLDcCo591Ra_jO?A{%cq>Y1USGw$ zo@Mc{?dW=t6Axpne^6Qdebir=!S5G^#Z)mdhQOzQ7+_imdqY(dW+QQjt3zTkJZq_t z7|pJ_eV4ramz>>0S=U%|(+Dm7ZEWe&wEjH2dBn_{u=M{cE&ag1VCl1Y#mdLLn$60* zYWS4{YU`R(a5V#~rF{u}Wp=I#d<{S*Q=fZj2+na=IKeD{FT#T8NP}siyqTyFJr0)) z!1t5Fa05+ddZ+}6SaEpK!qSc(5k68wdqgp_4eit2>}wG%LN70LvoFKjTHzmp{6QMg zW5+a^eFL;hqAR_>54&KQBhe6+Gm<613Xy+fwkL|qSi-W>1W%G-_qBE)W=W*EZX$1o z;qHu*Z*9oLcHwqc3nfk~q9y@*zR-JAZli4@kOg-={`DvCO}rp?FmC;U>%L3-1EFKz zx$nr}<aw-}xuNe8PEz;1EYBP<wx_<Csx?3MM~UBh<4g|&>RCrl9_t*taBiD3^~^mz z26KqvE(sb*e)$p9;GENsvwFw|?ihAM@?cPf<Ce3N_2GmS&&*rhWt5n-x+yW)$c``- zm}KP8fU=V6Fp-D^F!+SzD8}r0><NRlfNgqKH)gk_!c#}HTL$-rrc2=~Q58-GI+jZC 
zzLp;cZ^eNR2Y7xDa~ERz0W+Ul<yfD(n<Z{l<u<Bu$~jr=L`xsEbC1+OBB1t;PSVcI zNg8ntWIf~*s|n-Fo{iNfU?sGwB6?UNmBs&D73og>Vcx*T)!;5lljfEXp<e8S_Bek^ z_XF$$Eu2^wq@$CC5H8Ljm4hYCX3hp5)BqeKg{-xOP-24Nq>EOw20q&HSY9oT?k!bD z!9clg%%R*9XfOO=A8Hx@Ktb>YB2pKY^CrM3TWEXf8pKA;LYtH6jt|UJ3NA>U@NheA zbY!5eY#WVS9QP-Lsr%D7X%XNL<TCXE{Q9D!R!LQ(I_X8;-B{7XeAscvi)iVzX@VLz zVB=C(*N1?_Ow<K2Q5a-?BZf5gQVkTij^+zf*$s_V>uMVRD*5bS<DIh)zr}ySr9J)m zzl`k&c>?}GX=v}6llwo>RzCH<x^iBBViA^L^4Wj=>ytOWe~I7wl*Kpx%JZoQ4%|6; zJi5czc>10n-2Y8J{;fv+MCv2>R7?)6ZmE|;-4f*?#o`DZN5n`tQCCJ%sB3`~=n#!s z(+rmiIO-|^o`e*Dj>O5C5?2~rfx#Z<JE|PBK?MEyTLc=D`7*b>X3<xvBtQ{Rv6E)M zIMO9Y@5<D(@1}!<t+Hgv*}smN7R0s9tZ;Fa^O)A>d!WzNIMw}`Qp;S+fZyT}(24<? z$q4$JCt7GGH)m%OC$^>qYC<N4Y*H6yvLc-9XZoDzXMOPcDYFaGHHyWFD95L1?w8q! zJ(y!Ur*(xe$LLNLfhtS=2u&pUvbr8dpaw1x8zQ1K+qFZRHZxrhsAszcn8$>t33R_p z=z2Eernna<w+OHUN7UvgXj6%e_z*Hdd;{C6Y=O>LsY5uWZVdTg7S|y(uT2j6N7eT0 zvU!zKp)&*VtEIFl$fcM2pR3XD6~Y`jDqlcWk)4@E;6NI26_jldde>(`bNOq*T!6VQ zbJJXljrN4G5K%S+-5lvfA=ndKaTh(OaY&JFiYciU!!@HFYv~-ODbF09<qVWyzP41f z7c)_h%RgDQ7}EBh$-%=-aQ~lDyAFS?1C|Ex>kbB+@94$)Xr=Wro?ag-l`U;WJE#Tf z%0uw5N{ArK$w?L|E?Xo}Yr-PYE+&^MiN$gYgc__JuvJJEI7lZNT3HXbB0Ho*NM6i; z!XE<^pAbz+6fpT%!l4*wsEt(QB96*UeHzp&sTF9mL`^er2#YG0^7oKp0!petKG$K@ z-dcwo)%%C;&(<f?tKYu*k?tQ@?L;lu3xRZ?5>3gC<6;;dw^U;IDKADTG?=U~%lTLa z(oXu=726|fP&zF&hDpy$$h9+AeN<(+WPL-RLsyIVgpEVy&~D>+n*q*m4QIE-`h`dh zSDx?PX-<tlaMxzO`a8*gO8u07@4v*q@#dYWk5do+Zv4?#4)bpQ3yZ50zB%^5&%d7f zN$Lf@pSSR>{Opxyu5!ukr~3J0e1I>yG<io~>JcPlHEirN?0i=Rk7=h8yLVyibi9*_ zXzZ;)(T~izJa%|}6;MI9Ja&{b0k49wQz``TFsT+y?!agglFn>&6D(}e(Jp`VdaW{g z9-}{^oZEEAEy4+mdz|-tC;5}qKlAV1JDd8CKW5{ez490vb`IL%oh65S=6+?wU-%LY zSh)He$LZc>vfjr1ozkzhE|MIO`E^T&!r{a?4ZMv8j?yI*lcU{n2nF>@`v&Z7s4b&L z4`OXt+-g&DrD6>u^B}gROKGlgnNeJdg3`7_O=(*SO52i4Qo#31u>q8}C4q*H=tf4U z1eW}gQo0;ct`$n#UMgBgXTOjgjYJTSr-I)ayI39<w^D0HPHG#lJ6<+ey|vvvqlENd zQt?24xC&O~C7RQ;=6QD7iwe1HK6&!{UrduH>!Z_h=c5dJ@?tIhbmE#kvSXf`dU*G1 z5Axi6D)DhL^)AWq9mwzq)7j13M@p?zb%YAgsfU8C7cv|s83Lcgi{n9Ya6#+{OU(%O 
zG!K&Z(oD4$gSKR+yzMp1@W)M%DEJjN*)rBQNHv8y677ZL`YvguAhE}{sby-zD@B(i zFJxK^nHG6#+k$<HOkXOp;j&k`Y-3DWor92Qm<kaktQ-_dv7T7KSEAh}vc1)iX+s<` zH5)6dI--M=FkZBft|`qxCs1RNa1WxRLbStChxG+`!d+<x=6Ipz0X3;RW5HfT>aOpc zHtTbGSmi}U*40Ufe407b)HoOq%s6l35)ty*J&f0Z<Y)&^Q-)<EF+u_b4;C+O35pvQ z#AsM*!J?GY>|%>es>h-ng9zPWam<#?u%zK)83O8wJlXa^g8i4IcC6N*TC#RLZ@VPb zLb5(c)>i8aw)dx3tIbZAeaf}6T4Ng^TXe2KSF>)=P_u$l6n!}3;uzI+=Tp=)qC#=7 zfc3Jdaz{Hi&<2gOavoEPe|<|=_w6iY^dmQ1`R8l0LT5TjEk_HzdmW3G=cY6wKdB1< zSLfprMnJPDs8v8GbD+0@h)-(FoSjxLj^$ZJH=v>f97vyJHC7iO7y6erUgibB9>xR1 z_6j&{0*I%9(<V3M^imF}OczMUh2&Pi`0O<QJ>ifrj97etQ$p>vTBJ`69RBb)C5oB_ z1ll+w?yZpB8}Fqu7nu@RXiklKybk_#H}B`a7jVZKo&AJ!`t^P9urQ(N5njOky3_YN zAm1<k(EVobG8^>QU)k&7Yl?+u-Mmkjs4j2zID@#y<NSx3PlO5iySQKy1vZWNF8Pn@ zyNZ<o6OHcuZeOvc$L(|QANm~aK4%-gXN><)_xP&5N4_Qcp2Kdwwpg3Ow`IPF%^SYU z;9k#n>CfkVm%yD#?dPA={|H{Yfve+qs?*RdB(<{D*V-f=Y9~X-2)#p)Tc16m!GHXd zQ&WF6HI>?Ly!SuP8Sf=qWlp`RdGG2c?7M2YgaNjSlw=PDYl2k!fk*TU$~Wmsyiq^? z)4X>%sQIhQ!T0Cp-kqDnTs@|>@=xgAhpzUb$_Y_KS_eprCNvadL<$sN%98pHrg?G( zI#EDOe_TNkk`#bv(Vf=PNtM;)paLDZ1wHTX8-M7KgCD<je6CMxy<_i@uQ(hY#|eMT zXq{;8CY^jh^9KKr?lJuK25>16umIGe^xJw`k0`ZD|1jY1GG9gh4+A3x(<YO4(vKSI zS>YGMzVSxo299r@>(f29=mzlnsej`i;@-gT_i=;+k$;h*bn+bVA_l#R-)60PmjCTH z7X38F{wT)&D8}wU?TI{gs5zyh3sE;#OvFh^gPb`wP?aTT&|t0spai)C#RzR=kqIAS zK*fJG7)lLoD0u7SD}()W<4rT+f|8pa8P`45-Mpp97;_B{SiBC$SB~tx13E2mhqN}~ zS=|R1w~xCMHBThzLWcXK8vq`jWS=OZrz?U~mM6n)LF&kh;B+ckP*@HUZmP8A+*E0e zY_|sLyI4Pn)k%XRh*XBwhbm1eL5KsI0+J%HqC3hn5Tnox(cR=2hvfMF)I)vaUpa(} z@wtAjEq^Zauh~AXc|&-L>6MTBoIJatBcdpm=m|6Z0?&V1;1%+?mXb7`aNQV}0x`}G zn6Xv4Q7|Eoiz%X{L1azYEVAl=KIC4xTCYAzRf$YFf0)%x9Enb5N4Qu~7mtvQG4(Uy zDeiS9!`n%FbH)QLBF{xuV`N#F+&-)E#gx$~OKolAdp&<R`Pb0Tn<2kv*xIP)PH~B1 z!X=|t6k167sYBs0L8+m!)eNY?ivh}>DsFP3g5h2&N}-g<d|>}xVj&Um$VNwG4{$`B zR9%B51p;6sK9*TnOoExURv$opK&Q|W?89Q)8KT9uC3PjU^nPbFz%RA<)T^=lc>gB0 z9AOKdP3_i>qt>*J`=X+)<tVRY!*gQ528hoe6w!>J*oE3{D>Dm{&H;DS%IdMLz+WY5 zSa}L7#<tq#i>(?fVDNM(3`!bge#jlQ(lDvKw3^B-byimmJUfw7m(e2o1t2Fvkv5lL 
z_t`X^c3jl6ir{)|mcl*$==gnaw+;;k-@fnQLErxU|K)!L`vzLScVV25>K@q~`tb3Y z{@M6cDs@llgDdw67kTaoUosV+?Vow#C!x)%u?})C$<>xrs#quIu?kSGz_Q;Sl%OQK zXhR6Klywz0i<(F>WZ-Br1~3^LC`3*kHz9C(!t?bdl|}^8n<DsM(j;3=T`{9oH8GAB zlNHJUyH;<a<5oQ-7~8_q5T?XbPb(OKK~Abe8hGA>;(3~s7Q18sgcxw^d0kV4svJdh zYVE=5=9KBtm(X01>rfkz|6m7<Wo8SJ+T5vQZ#LjF+HraEzWm+Rp&^>NNWdT4eEyy0 z4Z{uZoX?rj0GsWgMwF-YY-*~jwYg{d;`_mgvB(EcPWQ}qPx0oH>M3R-z&zi}{Y;n^ zUWWx}<3t1V2u$#$;27W!b3OQlq|QuN${)p3NMm{Lz3}>l3$OEcUHbBuFI|v*$j7+n zg#`R!fve$AKbqtXoPoYf6BN0Hq#!$`TILVL?EGUw)$_vs)L+S;J$>~fev*3=_Y+mT z72FEgaAeO#eyqAJ9JuJV=-ef{LY)ef>GZ(d;NV;{{WCyz5Z^0&%)YmX8$!JZo!Rt} zWC8nf<gM5<$|a+QG%X|x*%uQsnmjJ$XNW7YJ(NZ;r~eA&?DtY`)z7kr1Ms}?+qw(N zZ@a*o&&DEIOQD7GfFv_5Qs$ebM~J#VdEzlNQO97~s{6rBBunIB16GFnNe?@HKk2{n zH1Pj0NE5;T2$M0OOp1JW%02SjX+FtM3J<FF0JE;#-%1mn)&u32eontINV3#p+#l+H zrG0`c0M@P+`n8#x;dNX>Ls7~WbW$No%o{PhiZG&3wDZ=>*QT&I5G2x6da)}=h+Oa^ z5(@>M;6-#17+Oe<8@U?1JZ>ZOP&&xf%NYBTW%6rf@}4YH@raE!Nl()zyV2m6B)F;K zMx*P1G){zcXec^mPjn27P;8@@uyY_elY&_Ik{k8g$@-RXgeq{QB7k@$4U)!S>ZEc= ze6R*kJiU*|{-iFnD4p*e8lNJAfRc<(WD=y2X?!FCI5u1|G6zb|Mx#Ir;GmWIQLrup z4fF;wri#J9TVgH*Kyz#ez3eL0Q6UFD;;Yr`fA!VRPanN_a_rdU>D%JI;k!duo}QaL z67!7RdGhI#<43;o`Q1m|(=Rt~xcq0M!ap3kXKq~o@af%$pT50u%65C|XjkpzExX4~ z@>6$xeD=mOe=(an@%H|qzW;W(Z~XrA-#IiIdt<}Be?D{Mo{zt*c{SSY{q%OYgMnsO z)m{I7VfOxSSC>RCkM+EH`0N{)QOMooP~`68f?PMatZ(FQ<FTSEA^U`87>!$)HHSPU z3jjDIWw4n=lKo7^lF5tbb_J6+%Q^tU;Z4wi<3Z{t#&qCBP@G#x9+KsJ$R-_wEN5iN zqV)?3S(EMZYwb2EimMIsYa47*HLhfWEh@YQE}OIksxCrXmer5jp&y6lAm*)TIBQQ# z?Ak~AG2;*qQmbTX(-g!TMKNr$e|UTgrL&~~NJ*qh)VixRrH&wj_=+z3i=4i!x@QBu zbLwX3%FNs$)}eNjU77?EQrdSMAK!9tC0)_{I&{e7c=+`6k&CyFANzFbcIePO;h+DS zl!|Y?PCdh!RP#3f2;y*7t_l7S0&V1|E3p~IteXkVz_r258`-&n7L^Li(WeCNdF^qF zZdhH;4jU%WZwODb-*6+xBG(zh$3$@+en>RJ%cy4DDcJ5P)ItCn4ZNlVFuPC@I>%&w zkA;MMnM+ndDNJNTL_c2kPIB%_L-(Y4R=KAZSk^=}1sS;_tRGlV+!q)D2-MaZl#u7R z<~`)yl4d%6OSf8nqyn}4{i<#+=T_lgoYee~ze_g+TuVJCgTiojUmn?lQ;~uuQX^WP z$Varq7QqHdR{J$S6n5jc9QZBBssyt_TrAjS@PpuHU<Uv@%Z9<7k;u8P0KPh5s-cV_ 
z<|+N#s(K{<61Q`IHmv{zB<S7Uc%q20-^zUNy7c{JsO{W~Z&*Wfqj^M`n{JY0Dalc2 zgd9QBfqRw;K?TW+Jwd5gmgAzE33MRS&uHsa_ndx8J%z$cIX4a1=D+7n>JRg-GlLK^ z*v#}ORvH#${x}yEhBZCtm&fto4n(jeN+BOTKXA|Qnj7BwQ}KTU3VxS=<-+%U+8;ep z{>0yDec#7g<GE+>xhdsy^@Lz>mC%Pf`DgEWz#Mo>6yFM%A5cE`_3|fvB!5oFd9L28 z4WbN{YU2pi-^@Lq_GQViyI3I{Dud|)EIghd(IDZ17hAx9g1TVY&x1<<KTj+RE3O`6 ztyX(pQY$p3gb?{P-bO0hL6<(Dn<_lD{swBMD-{AsIPa@t)Cn9bdrQ<eQj2Ju$QZ3c z`i+%1LnJ^TjF;L&TE%6}N8zWh{V@~oGVulzJOKXNvM0Bjd2*L8F#nC;%dS#SMEINe zhr+3&{O^ZTr&E6z=5O;b2bjOiys{zoFaN*UpQq|me!*UU<^3!FR4){yngjd?=tIw; z*SE%^8{<lWqlqHkbwKHM*vmMWXa<4xtf7^~MHpdB$AZ`yPS(m(*v+`w08vA&Jw?XJ zd94f+V+`n+0G^kK7KR)pO^JnO496t0hgKe`WCP=)1Wq>EVN1`*N-qkhXE#_8O}FR- zZAdE^i;Vh4c^)n(FK7jA7%x~&$_u)&@rj<37dISy;efHR(sKTV(Scu0Pkb6;bEKP5 zUi{Pji(e#vR>SSb+z|hN7#__Sh{jn3Cea^bS`tpo$-JT4X@pvB4vIKFz!L%{F2T`Z z5>rb@OPkb=xrv06J#22~{XOj{*GW65!K7byMS5g#Q6&Hpgr#dFnhkqfl#p6rCZvQ+ zkK?84vd&n%pIiu(TPECH8_9-A2EyC91I_&#s3RFF0F#w%?F?=Q^Cm}JiS8=auZ{I1 z#fFezHR5XqJ^qi}J{-(wy#QL!sFhDU_QfO`16V^*EzJM4&>Hg2+)_|hRbAt-^wpL& z`ix)vy|-pcttO3Dqb+j_XtP{pZ;RhDuxDUk&(pl4!pOtbIZ@H3o%GtQZjWzjZo*hp zSYXnILd8vP$CnO%^S+p`ywW$=q=iuU%DK_sKh<XoefsB2EZP^MpZom{mPlp!+;4g} z#F?EK*9A0{x^s|79ZnN;{1;dpTm6_9wID8%ttQ2@E3X7~=Mc#z?7)9q{d@F$JP#bq zquhJk0+;w2Azm&;k{20kmH28H)gbe{q2r5nF4p26eLEAjrc(?l{&zv~&_eRK>|GwW zNq+(lQk1Vm#tcezRXCYQKZ4|rAF?}M+Pih%?Z5Hc7f2j`0y5r{cYpD}{O0+uedFT) z`Wt#mbiDM`@Bh)uZ~Wz7(Nz`RpmUi=Hoy9<OK*PXEqbcX@shxm2RaAk7ISmkA>i`* zz`hf=)2m;zzf|R^J9zVP_TGz*#2>x*GP5fFHf=3CWAtkvX1$BWePztD-oQk!qiT5r zftVq7Ak+!UOz-OF{cU)N;|YDV)YYb;ypfD{<_+xY@R{6?YFO<!KFZh2U&2l&ZF>2W z&@ii({iuAP(X(GA6dV04eU|;K=9u1Mu@qVin>J}pg-%OBi37Ea&dNe<ox98e(gTm+ zJ>oIjEhfEIxF^<Fx~;|ItZrzq7B<yaTgxo1+ppYHP+n%;G^$1En#FspMzC57s$G^6 zo2keyR75-hr@gFDM<1PPsVS~*3=|bsJL_Bp#f8S!?ffm(Ep6R?@Z21$;SC0(QLn9b z*(z*0gVBK2IY(+dRm0KBHdpCXbCJuE`LMODuyy;N@EV@ytNcc*)ojq2EY>1xk=}Q# zw5ri)Eh{R&yWF}}YcQG3ChZ1;E86ODRmG2%x=w_xrG?&LxJ;u-e|Br3E946l$)9r$ z43!l4%AFRw&8Ys|mLPt}ZpJ5tpBl{7<s~&+3$0aUWhKrMQ&nSORY^frRZUfOp>@*d 
zsB>$y^ovDh7Wzqilzy`<!tp8Z)rSSI_8qJvJN##a@zskaf=1|X&0e0F(Z2Jm|4#N# zOZHFi(TyVG7zCT-0ltb$9H2N#TO_$h-d({MgU(-mM%D^OvBYdI*}mB*Y@aTS2V3x< z5+2P#V#A^%8X3rT0$FnH5xj=e8qkzkI^_A{&HJp)$s6RI<OZDo<VPS%-ht&cfVkd( zO)}ymX=7_-#uiBsGPV?v&ye)a7nir<yBg(hXaq-t{0Q{D#M4I1W;f#R)*G-j&(0m7 zU3(uo+s}{m?%7W(ZHGNEHN2Cab~oZJ^-YX$vXw%l*v#7KEk1jqhFEZz-|u7rQVRCb zTA)e+va)X8Nd$z9WVxfhR-d-AV3ASIbBo^|BYTY%qI$dLqq3048@Xlg(@(zUpN@9S zI^DJ2hx$5BJbz~&v%HtjmFK^~=jGWuPWH2RXdm&{3<Vm4hj&gr_q*{!Gp+uChS>f& zCttGvJLls^HywF&@Y7pl8~n$seP>2bw9hL(#3*XTfcvDnCUA=qk0bdIo5SKrkV=9n z+B(uqG8?Hg(ZAjJcW?xCdW^3iu#oh~O5!2>F6W_;tj9*~aI#C@E7AG5xN8D}9h1d5 z7EDgakGgCublV0c0ngPUrvzamsdo&L9NCO2gt#;T$oYIn^Ee|M9kM3|BEuv}*>BlI zYzAO8kTW+cIxTCIq}Dk*L~cwWyp^;YBTMjy*OK6h{*3m&u0d$z_kE~+6}}DmeI&EL zUP5M!JqdoFGy<623Q{Xs$Z`KDH%4)m&q9){RB)YAbFAb|t|YZ3mh0tZTulB!CbVO; zF6*W{BUqO{?^pUFw>^(lM~g`}yn>s#vdoT*WtQY178=jpllq0GNAn<<39-U>_%Q4r z%H|;hk}#1|<6;s0GCJ!ZS7cux+VUc{a0m@a7_ouF{~6{=n$23#0l-l4CCO^|rD(bI zYTuQAE2igB0%QbYmm&WOy}uL-@D7V{-cneA9*oE!6<RID&9C<V-p6O~?gG4X-dS9N zPK@|B+li5TxxiXt0U4iD1|YSy@h`DwGEWe=jFWf@rRs1jC=MX3NtTd&Q=v!*bX4#$ zeL#P~_QHQZes{FlW*t6ecWg2=?+ZQg<OLz|cH9}Mb_aiQ<)@E-O|Y1GFaN;{I4zMq z5;#xFtKAC@WD}wSPlCG?`~;H0@KdP*{KttxI?Le1jq@Rr6bUmsg2ONcu}uLy+l1@~ zn=jx>VSxsle6T<vv@?<qhNbdrr7U4o3d&s3WJ}u2^i>RBYXUr3G7|}b4ZH<XAu8wL zqa(j2l_6~;3B*@`>Hu^UZN;eh4nr*?EsY}Ln~wfcb(x0`yLm&sx5f~Nc0_11uy@o4 zd2A%HQvUtvnVIPs>p$*(?R)?FkAaHdK$lf}IQ7faFAi#LXucR4jM}sZwU1BoTYptO zIjKLTef$2m{z8BG_T~U+PVt{s8?b4${#M<OTkCY}8O+%(U><g9h7n7t0P+C%9Aa&d zX$aEkVz4`vD(C1*;^9SQJ2B$}c|gH7N|3S)utD#l@;DsiDc~(AN4vkLoG5=}yr2|W z(D8P*Hr|jEpZ;4NUPt+@{iA#D;6J1_Gl=^c5dJ}k!sn-4!Xzrb$m@rq6U!9t4b@T% z1A?pxN(s>_ic$lDEFuK_$sMQidy{~Pw>U@!#0i1WEDNyK+HJ0Izkm%^lMc$mC+2(3 z-`m_>26V`Y3tv5c#{&=Prc&Qe{fzH8ba3?WQ&%2JJo)e&PrmT{^RkY`gpc?;^a0#b zfIGs0X_4r$Xub+qz`@IIFyUpKYhKbJ9m|X1WWkFK3@c%j_cN4G8Euq`rXd)DRM{XI zk!(PK)5rEB9SVHGZqXisJp-3ceMhl8y|spVC<bg*eMg(tS5Fj9CHj(=LOx%}cdE75 zX6yS_zr!)qtSu7C1i#jzJsU@jcmS{ZLRZeky;@I|Ah@fwo}Pbj);O&ubX~Ajn#+q~ 
zF<L`Agpa_>d=6e<Jt560*Pbv?dc6$wFjy|7IQa)DK}%zH%b_+MwKPeiQcJ>ezzqum zSW!+Yrlro2(&O|WDT(61f!oJj<NsLp_K3lUgb3-4JTU%b|2gybe(ozYYYoQDe^49H zmerrK9AyK0lMPH)G4V>pU7<FM(Q5g^%ME3MGxgADNHs9BnTMc9PTgII65Y?qi(QYq zFAh?TB;7hdQ3z1tQ9>xB^)BGt&Qp2{nM<3jNnXWp?)t^wkbxBSu$#I|I>}7n+JZ~a z{{q;zd7a*jzY4U{#zBGxBNZ8Pl0Zm%TWNVEd`4RII7$VIy@nDmrG4cS5-k;KM;Qxu z@=ra{(`{+xC(Wt<x6HfY<lPsJ^?k{0b9=nep*~IGr7At|9~Hjox_55>mz%M0JYIgM ztdn~&)(~4W9`07eLut)m#nPH7%5bz{z;Mdg>Y*eYE5s$$D+?178Hat_k{Pzdp|}=I zDx$?Mm4F2SLqWtVVR4j^164xIEY9?ZD4j)>N$D&_tVD|-LbzJPzjn4KF3d*znor+( z;^5=&{xJ1%a9}Jp+w9hBp1Ig=t~z~ma%%L#xvxC@Z?-`2lL6ir><?f-JhvM<7{GXJ z)T>fY8c0EZ$wcT{^wQI&(O>g=jTT`Q85c#~2LO;X3V5EZw;r~aT0Qu4ViB}H5j4I| zN9==Uv=lBE{LTE~H}{T&9&Xh&2F45*V&;^Xdh;T`ZHnSQ2Qi<0_<b|yMK<#+a0!YP zBE^9UH892D#USz8q+>aLn<$`eEW>*jOn0&aT@cD_6{Ql5^d1=%MKL1D2vhZ9VR1Qn zT|rA=ufSpUqOQPJT1DK~(3)BU!n^EL;Bp$2N=VY0SenpHL%=DBfWkHbe)JGO-B=mw z`Np}a!@qH6_wOCuSa#~n%=qWN@7*>Px>p|v=q9GdTb;Ja#z5=L?(yc>{(-wE=OVZG zEU|;bGn1*yj-$c8&J*51475nF2h*5~W<|Gv4J)FVP+0%qb0|QtDhyVc6X=#q1Wvkz zqZ!iwlD+~k6^BZ@sB5hYx+H4tQc)G@R~a$<mByLnp64u%3O4v+C#e0}SwVItPm1GJ zv@WIxh25KH`{JZtr|!7z;Kk74KfRMTM}{BRAFyg)dy;Q$JauGpYV7`d9~wG;dfG?| z7TVlN>xBm|qZRUXV2{~J@nhr;B%^#pG0?Gc5C{9_gwa3`jeycHlr3zWDV)Y;;xMCS zUV;`(h6$aP-l;PHkf$JUBPG5PU(^?kL?is+ci)|Q@4YE)Y>FPR-yhR_C<L`JXj&}? 
zLqv9!{$r<dDrw^x3_}&w5`w96e8CjtF{OD+2xz~F&l(W_2F{B-9OX+Gz7t;nV4m_F zqBfj{jv_clpid!?YUF>IYSeynbo8;25q3Z0!jt?#-D`;BmjZ<fs2FUZyr?8Ai>wtx z-KYc+cG^%se*Sdzp|Rm3)t_h^?|5%>>5*qno&MSh=O4;*8x@}7Cv>l8ze~>oq_i{3 z--UBA1B%rLV%0o9@rmci(54$br_XC!fA)vY6JI-h>X{?1ae1D6umj)Ll_Ls!4wW#7 zTQE-E@~~J!i%90pOw<r85iscHNYW}BbbVupMeur9&|e0M66_uZ0>o0udf7nWXentB z<Co0kP{a;9MqDd5rUM`zZN#M#^rW75S9|0dA)Ze3Y@31=PjA2hw)+xj5nCEe`VwzJ zzMj`QN~748hxYejUv8@DG-^*q%~~^W_TSk(w#hf(aP-X#^xJF^S8K>Sr@hxz`u0F& zcAI{$$EWjHZRLgKn{C!;@9vv_*ymPW72E@FObmp`Z@Ov|qI`n+O%u$IgReyP3b-72 zDA(sTMONW8@#n~EiYQLg{5Kwc;~Rg<oF?FEuet>rp9CVjl|!LB?M-xNR(KCcs#XBM zF53o5s+tIerzh8q**;1Sb-L`%YJofidFlmnig~AJr;|T-hs$d9d8;wIcifNJjYOIY z_(qSdA-K_ENx1^8F1;YTJA$Pn1pDBn^}qLOrO=4Es^=cya9kq2ISWU?T2AZON<_Ie z4Cf34LZU{GCxLGwGXPkjW<e|ulgTIZ2Fp$;1NrW8AP`GQFP*Y~N8^+=jq;Dxs6m^z z=qs(TYmx~}1Y#LftOhEUFqKwfSA`R;@<|avqB%##PjB$bK_QG??<~<j=BT^lwz)6u zK5)$QoXcNTWI?dKJT)gy4j!FyXpJ9u9MM~*=7w$`p192wb=R0rZaaJbz}|^Fn$3+M zW#hTYt5zX|v4^?SoaknI84a_6@y#|MHiKP_uyrIL4O1P4!vKX_GecA|vrfa5-R%U& zAio>{b&R#G8U7FBP6=WklPb`eT+)OvbhEXDs9eblLUhZJjRiAD$UwOX7MxbxAqDD! 
zh1SWR4fby^3no3h4)iBGIX$)imcFA;o%QaV@;I9vj;%d&UpIQK=7;&9-&<L#bz6$; zHmxfb8|a>n>-SC{`;x7x+F4%fAK86Y>!^2Pt?Yt6PGSzU$SZ7-QCcE;<&b;p1lXAw zehsLLNDXS`si0scFqg8Ys{l!g!wf}974_8k0b@&hxUzS<qJJAN)>)VIZuu?DyX`J0 zgKz6v+PB48ck<7|x4n<MhD!^QUI~uxuh+BHbp+P$*}7li-Aj750(8|T_?S)eaJO)Z zFH4G<@nxygO1ei(#;c{k<8^qoI`Y$(_iA4_+tV$~cEYPYd0Vbm+k7!-cAq}7bqS{y zJ}q=EfU(%<lns7<^RL61<>vyM^lE2T!_8n0X}!z7>=~vnvj0ju16Z3epTA~Lwmj|0 zmMrVZQr%6sG}n`jA0kJ#_enUiXV2_@@W^oaDPHzur({o-Kf9bKd*kp9#gmQR22Zva zYknB>;8t|$1V{0*eBV_x1(QYD6Vmm1u{bTEz=7;KHK$qDiGAs8UpJiCme1XO0#5AB zFP!Af9n6XS*^}hLZr!f9u-i?{g&mKQ|9Z7g=;MceZT_oY?Y|y^^9s-P$hN=z+iia* z`*>rVO9&$Ru`2(Ss*K2gJ+f^ZohPQQ{z?dNKVbfAD^7QW5nY`JOJAh+V1F<+#vE9l z8&7?oKgj<)>vQJ&uTZb0{MUnbyf^O3xUW1H#rI68zbD^!^)KVQ9wgWG@y};mSLTmd zg&}?%UV4!GO-{73x*3%8JPzZ8>>2EqpRK(_DfKGkN?AKmw3<NZwib{I*>fnb0z49K zs{uZ}Gn{OY-6U!?L7pqIq#Nv#6^SUZ!66yJ+=Q3}TnaRy6id<QwZI;bi~RlMyF&|I zwe<?8NXGz$Bgo3JjaLpsz<tG=N|6M;aBlzJ*u-vMt3Emw8ru8)yKay48Y?PY)enun zTV*Y92~AP&Kxk&;eIZZKJ2iMF>Uqvn4zd8N+g%#p_s?HHZ8ny-LRp9x#D?CZ<A~<D zxtr*;C2`nh?QHqM=_2ZQ*==XOAWBl;K9P@sGJddqCDHPWu5gSY=dcWpQ<Aef46N(6 z-41FbPO`)0m>0AK$nRpc>Cng&M-_nZ<uR4I{1~OHw3vo#;HOTExGjz6M{n)f`rz4n z_6n(o1aI(#)6H8=&rTuE5*>d0%nM&Rt1}kOMMio569@We3_*-x0Anb?`3Sx>D#M(z zSAggW1k5Zbz(;_B$cJ3fVUsMlc}NjA&sfVW#NV&yNJ2#zgk(ZdH5>+26o^s!>RBv> zkDb5o;d}0S=mK9fHxwT~I2;=1$DccM;@M|zz32bu?OnjzD$jJ$^)D?g$+9f#X34TF z%d#xXiY&{LEX#`Tj^j9v<2c4~jmf<sB!Lh@2qt9GG?S(&JCk7`Q=pj+%`j{xhlkD9 zl1<7^DNToI8HVZRa2UdqVY53e!%*7p-lj9VfTNuE`<LWPNVs&)K6$7kt+nJv|98K= zUp6;$;LPrKj%<Y<@WGGW2t6<)Y!E1Qg~=>GS72Ffsf)U88>89Q?m_Cog-1u}pk;7l zR@3MR(%v(A-pZI6h;wufJ(AVWu=>W>ia{JCCieaKXO;n8uqhVMp?Soa#}Wj*WxRz3 zUUV|TA-n37lZ%>aN(CD=9Ou%Rhrje9&xc0=h*`M8{>QDi215<9JQ(U-Da#(aY)Tc6 zVl5J`L|nKM^2Kl6=Q4Zj<+xHcFeuAmqkVA9Swv5S>4_zE&)@fOt6L-cWJ$sglifLP z-tMjx3l}uyZHNBS?=06a*{1{cAbyJF-u?SM6?i0nLE{EpdqG!B=5XHJN-|0rGV(9V zqcavB6VO&HXeZ8_kW4AQpyeFHca+E>3Ul5hi4=J!PJu&_&B4X2DA~CoKq{vOp)aNT zFy)3kXi$q@n_!&=S&o(z^X%^T6o-y32Wv9Uo9qs;I{IJthllk{B<5`H`x-H)q&I;% 
zb8ntz8^N6v&)C5SJ483oh5uZjAO(|9OH@8E*wA1dhmdjSHW8&JCh$SFB|)K1ODjb+ z02OX5Oz=^e*3(G$Jjh*V8!0O2ZmfqZjCnr75|+7Q%F0f<Asg|GlLDk}C){NYE9_~+ zF#rj|NmE29sxH9+Xc++tBO6x+D^D@9mmyTH^XrmOLp+rYkE>Aqn7d3-o<CTVL<EQg z_v`CNPdmIu!*J|avBPWM9vgC2It`=ALwd8{VzY<!nuANt5<BwLj^}dVDp4$|6~$Tu z6Geyjrzh6u-t{^}u{eKCvZ8L(Q7_5?<h_&p3}u|*ew^VTW%c``nWjX3#l6zdjT3~w zfL=o44dnCHxzMbhr-0N95r+O@bUM_~&lp!Qyw7*1DFzgm1!aKnF#tx0g?FVjm=VBX zI2V1rGD{1NLL|9L0zNk7q<R>-m&s*<m^sTJXH*lyCK{X-C1U2nVYbia$bWr^x<RKe zbKq-tG#-_V6NzC-J`!AJX3|h>Sn{k7paAVjAzB{X$W!S<EIW4$<GH`DYHl_dY(}hM z4vF{^(u;=Fm;|u2@1$f=A3>!+sx~b-8j<$QWgsN4%7d;eE*8Q0RS6RUpoH={2WqXH z8>G$ZIjg7#hK-ImFttRPEft7|Xhls{;7eZx+Jk}Rj!G$%yJA3#HNq=8v0&X?#p7x) z#=R~SH;wYHezW9B%5r3U5-KC<83gd8mocMP?=DG>=AuK@q%=fPuGC4j>c8E6#iRKl z^uUPl1Z7T?#y)6F)hf-%4b8p4^1`C_Xl6}9sY9yHMMDb<QcxTz5ACIkGEHEBnJ$$b z(7A$5LxoMz1eQ0ScN(I+ldqp+9(7qWer#p8JX04>EM3Nt^Fr246WE7RQO_V1<*#q& zMlK~^#R60fT9Ax_m^n$&d^8^g=9$Y<&F8UdX06P*AVuX%&ZVe^H{b56-q`f`!YtLk z=nCeEA!~J5O<HZud?aI4E6ruB_U^1(_U*+fs}o}u>l<y}MM*5^F+XCY=Rt?1!XFYH zK9bU-=@Ht>wKxMvebaE879aobo9I)V=nlzLlhOPCE#ddEr+2-4pPX|chcSpMWL&^* zJw0mOg~}~}$7`cx0Z3h;MuCA!48~HHrz%QvRz>T2qh_jv^RysxJ0*qPor<d-%#FM~ zTA*jJnue?!&QcL*$Q-NnB9>`Mt|C-WL0Ljj#6g;Sdd5}Z_SHvlHDgmZfGnz2k|r;3 zXIVZ~s=A%QBnlVmRFm%YNE;N>b6hVG1MY4yXblGM_VDD*_igb_+C6tghIRz^8gkhM zrM@=y!#3hIlN;GlAYOLdwR7rk$}KzA?>T&KJj7!q6MGrwv;?O_#{w6ejs^ZVVm-WK znJ|5Qpeh#Tg9OTMF;#RiD@09a)v6b2#rztMO}%A{Dh)W8a&(3(#X-2>ddrNVw5*cO zMyaNHKP>~*;*1npMhS=f8781Jg3s7uxvqt5KmvLk#vfilBQ`fC;$DWieeRY!t2ERv z4KD#ATbGW=OTy_;dTMOtox^)lwkGTFXflvYB>FtVV`1NBo2i@yhC%`CDI1RYdk;_e zPm3e%;b0)(w7VmzKzhjKk+6e@p@-hY?iU4k<KqIsK~T~QqJ~iycsz76PBklQxL7y? zRD79vKC78!e1{sNr91F@(EhH+ca^DDsCIPlWvov4H%SZrmud!q(Y{cITCYy_UkVTU zs{3!-J}OIcDm?tSK@#0e%)PqL<sUf2rlgqMKjd(~pbwbvZ%tfzOLIc{4XS|W1uSu! 
z!D65g{)N7HwuMkjg_bxN*~$}KnG*c7yD2}*hk()IPu|K>z0i*j00(QSItHlDN~4#h z6m>(jQiyLqQ^HBA^njjDTH1kCQ%fz=vS9TROB*o;T)U*1UScn`87^_A#F}ZKg~Li5 zsKbKNfM)Dzji-!e0s&*Em`d@CY|Rvb##$`AH3LcwjE-ZlfxiA4fUCin>5w!-cDoH_ z@lKiNL2+m>aG<}1$E%{@jVl84yw-7CXHI%jhSTFG9)GLdtT&9W@*91UB>GK$m#@YW zO$RI6YzB{g%O5s3`=9h981|wRxTo0-1e2$w!zXjkzn8PEVE@-J6I<*7(~Ow3MeQDA zG~i7|T&JVCbG`>l_dU%0pzxKP<_YN%vQ*zf_XTzHVYv_z_D~-vfMpv|ZJ9~xgnOY) z!t^@Gxn`!)V-2fw*;w$2n`i@_@m2F{EHQ(w2)ZW3r(@PIHO^x}R>ei7swN&+bm+%K zq8nqe6{MV~_C^nO>(CpRt>C%X3fd@M9na{@v}TdoNvXn1J5*z|amd8?fJ8xuBvQ>V zm<X&;fI||)Vt0}egzwa(h4a+LAZivW%i({}Nny3Irj7ZCEO_U%h^O9v|NXn+NxnRs z`;};I?P~ot{xmM@HjBUWJjw14r`>zDJ~DzynX&C-xyLp#Yls?aLe_9I+lWgnkzPW* zHn}JCM^5(Ve!$N(aOF+SY3WU&Pk05`eROIGwU0d!A~xo$)scKc!KMz!TExqVMTAAF zB9G9~d9c%ot8seM5zk=JT{=O@X?mXm?=R=IXml<RJg-?rc&_y0d^<2)&;l1W+rfKM zI=b+CG1}qnpwsqJClhu%_Twz6afuMyycVSd6^Tp-^cr_wC93_SA`Q2d+kqKT)|I=! z>M)XsWhk!zghiEc=XzU_0?!{b${5gWYMy~V<YAOwvV%ZA8b@FqCvNKKeX66h@6y=8 zZSB!;%IB=U&1Ct!@p~`6_@btES<CLfOzZ`(h7*Z_TXXnxxonRlKid%+H=bkt=^AG+ z<h3tL729^c4GIh&y0v+4G#ZEmqvwtejCEC%jQ!7X0GJ@&^Iv&Smy~{o7UIL~%R=@5 zarc(Q%pq0r?Tu%*<4R#G>jrw=s@}Sdch^z60)s?5`St_AhM@kF=_g|`tUiTGGC&%T zR)iYOcLlM`7b!2G>;ZHpLaRNzeQeKuaCXriI{zZ7GTa-_?BSc?6b~LlptS=`j?q=3 zWNjSNt1~sK{i%s&BQ^A~2x>Vr)w&FMA(YnF^4l}&W|7QVbuRWsy%2$v906tbfPlP@ z11r6<<+t4pLS{U`y4VGRMruIy?P&luLvXO`Mfv<T++Ax?Hf_P*ZPsU6yZVRMZ6L*V zqa{1Ak}x?1rS$-Q<NkZ``KCko{I+!FHp@(NTi4i)^dKNTR}S7z|1S-eEL*mg7$-1{ z=dhtfhe&8f(0cr2Cn2cVVN`7~`|C6oO3?Vx=@wUa9nri3MhT!ZE!>d}LYj%!u=AUE z4mPt6=>}deBCCl(2Fhm5nN8v}T@b$z0uMwPG}l4;=fWtwW;;5nnel_rkH+=wlF>LY zRefx9-{V8o#=hk4OUB5vmBIFu95HsJw)Ebyet%!bO4k$nANias>fYE}*|)W4a(HNK z==Q_Cn@5c<k6m-qmJ|E7qWl@(jfooZ$)Q=%nA|BndihlYgJKD?5C1{)+PGL+D``c2 z->#cuspjHcw*&*Z*Gq#=mNRc0t9?Xsq&+pXylYd!<4q4Fw;oTvr`fpI)xRp;8uSO3 zk1w6vUu&17met=Ahu(1HM#oJ7(mXJfT*pK(v$PpHrbU>-=0@UOxOe39C82MW1gyLm zpQ-XWlfdS$0ycj*L@#M=4RvSJd~1r9>|wtZt$xLYY(3xpS|6rJED1#d#AL;YsvgjC zT$~B2$jNb~Gd>GcFi2^4`e+NRGhui7A=Xu50;xn;G!6Pv{B=p3d8a1jt4sPC;da#} 
z6Pi>g_hgZwH7-6KdMoe_6aC>;VN<aued-V1v~P%*LMcn<=uNFlbh(ouc1Em=jHP1R zLnoy-V?Mn;xxP2%QnCR<}}hv^(!?hA;%nCxm;llH}WCL`q<Ukq91jw>CK1nd|M zEZR!p4Izv6)9_5cPUx1x(<S{C@B`F_M-B{;O;ERy4;{5i7mfu5P~`V|AkIrdT?rhQ zHo<p<Asrb@@YO(A4~|3TD2|?s$}*hXGBfQW%-VusWL^{<BsvA#mL{>)p<4twq->g7 zwe%94EC3vp^fHJ<S%nX16A&I`bW0N?WBufE8)<>Tbg`|hobGj@TI`|uFFNx;T8TyR zNDI#1=_~UwCrJpU<e*jIZu3Gb&L4SDQovP+2UAI3(g#;ub;dDtSq8DgJ9r`_Jb0j` zL-WM!q~;0nNb5#};i(~~*~zqez4Xwo$IhQTlJyQIMJvnwHR}-l=3!^?xt(r3`->7s z*DVM8+Kx|-?DcKiq#a)~`~8s)y&+I!?~vtzCaqS|9Q^eQZ=O78wb))=v)f)IIqOVI z%0`blXK!W4qCpoLM8}?+`QBEK%FDu)_cRYkZ=;L%)I8QLp}e}VcN1hmG%iFl_Jkth z4V-cy^zj-e+IX8TPHP(C@YUjq6OM%yoktm`nfAZJL(Ff&LaQe4*@d~s3T)X{UdR2m zS`}!RhCVWN6zM26wNPB8)Js}y#F=9+DLHhuE|zGq!lEH9BL_+E=zKb#hv5gEG0Yv! z!jkJQ2K~uYg3jM7hEks2J=^tdhsWft=|6wqej9r$nCoW`1#@4uK6)@U(pps(DfeO@ zT-SI1m+NqN1nHMkWyxPLN0oTwTPBHlKlt8Y-N4{0Zw4oZmPy8$-KdP}epZ$=`9cnw z*|Fxj^k(gL)cI;L&$*rf#gOlq>0gOl1`<}VNlJJ%+yIV<f~F$}&JGE|?mPo;ptDn! zfnCAC=mu0=@EbPb*w8xOdDm~W)B$PAjF>ZPFDP|o?`HxR&UY^VbqihAW=d5E%3AmP z8O?=hsa9hPPgnDwKK`?Y|Mb((EPl@Si(?H(#)m;sbb_UYZlgvrR|o1w*7|DvwS@+9 z`kQr^Xr$6>-&5gAmtk~LF=BT!;ES@w4q9A>rbR?1)3U3H2I`aq+?(tT>&8iXx&iM_ z8mqawm|^@FIL8dGxEl$$ysJVZau`@+8dWT)XBbrsIfw^PT%h(%0r<!OD)Z~efsdzb zR^()Yom$<l+`gd+<I=vBXP&=&<D;g*o>BLzXz=LMCmvt*l}m?r4Cwt1-(_*17~1^E z8+&4}y=qum=^tCyVn_^lD$8GIf!tr_E^Ym~{TDjPnA&K$qj};B@0ZmLu3RxNAbQ@& zUC131m-eyd%l4(dDduLkO+A+K?fTlGv2Q<ld(ijT<n2=3<rnofk9qgAFYSMU6&-!1 zX}Gz(A~+Ohxz{fNaCyHZ`ORbK6e>DherL#0SM4q<h0sYt1b+Xk@Qr$e73jMj6aHDn z@W@~=QY{ig(6S*k0xRMGI4Z0YHiM%C;1SP^mbl^j&y4bn`Al6cy;BTp@DxxA=qcSw zNFf~5Xkw;~SHSXq6^D4Kxu^sipeeD%Alo~z21+w)=1z_oh+Wwl0=w(3se|GS$H}iH zmuE1p^ng1xTKrldeGGwVhoTB$yO`n4C;)LGe`dPG*)~R$?-_zb7lR*0Nu!nmpe2mI zSEwe--vl><TdabUpvDW7Y$d{4V7U-4(2uJf6pb^)v#}6;ir}?nROQe-5uP%PhyQ$Y z?w6^?-7%~5<j^~!$+L@X2!93tjT`Djld*ey;MnlMJtOTtL(a@5BF_i=**1-s`?IwF z`4kB66=SiK_tHWuKQ8X|rnY%gqXjR*6Y1`(eX{%~+dS2F+xe3Q_6VnA|CKki>$QX_ zNgGB`trD^kSUkd2$%=RZuq=u`pboM<R0I%u(l}dGuh2i2uX!L8$u~(8lmf~{2-S=@ 
zV<VqxR{HR=&&)SY&~6YxEgu=tw?*ZTCge_Jr-wAE(%A=*OGNmJ(iY5urw3QzvY_;= z!1boVRiKkngbK+yS~OEvPoHUuVfwv@k$1-(Iye#W7F|*`V^F%zRDxs@2Td6U6=kvz zT%gKZ_!vYQ5wxrw|M2bGq>9;p3Hv=m$HbCKmuH)Pd~D<Rza1EN$F^=W7+jIifsF&= zzV(0jPe-<|PrC-t=V$6a&|d4>9%wnBedPlv)ql?&*)<sI{K;*BK+<*OXxZ3-6%R2( ze}8T)VzjeV&B56}dvM^)yJPLX>ZrkMcRQZGZ`*KNIC6;Ois&&uM~@NVZ2^XiB4{ed zt>F5XUVRJ!u&{2P09eDK3}XLde6RzaOH55g#iBED^%YQ($W0ptU53q>WfYwBsdxIK zw7Z4kW=cR8s1=*1RwhoNb7s&o9Z4>211U1CR)vql^<f;YoAhatux>=!ue`$YOf25f zyNpBN{6|UX52kq8lV`!y1~6SPPk-L<{He(}1^8F}ioxD+`aoamPj+n`mWM!xMDnFm zpO%sjTfQ$1ZHy;+R|LDlqno-Pkb7@T1-o2ZCns)!e2Kyk_FTM;`8mB15U_cvh`dQC z$UlHQS*S_9f~tjelu}e=c!T-YdXGv3#CBBF&i4gkR!%9$h(NZ;!nad6EQ+m$&bb1p z5$A*wo0HY3QBuG{4n>*WiVI@WOh!sly$>G>XPnvNPLX9@uBF`*?9gkuS591d{)1D@ zRpvJYJv(w|cii&%FWmo6w|UL4eOvQfxp&2XKXQNWceyvd&w@{}!55#tNql3w-M}XL zKm5O!?VR#rVGaIh2>)94OFu&QV;DZg)0iEmZ|l+t8}My)h%0C!7)pb}j1YY08K9YP zZ@7goBTyS{Y2m96U%J4nYD>5SXX8+Fa{1&NN+gO8=zG(R(FEn%lu|O-g3d+Syu_jy zc%OJPjo~_^N?L?Ti#<w8O>`d)w4vDFkawVgIFGzrb+C>n-l%j<j+&2_8rY#jR9ITx zh{&DZmH^EvAt7B;Cr##ZPdvqje|PBYHiOsTtewog)qCvF@UPAvee{-~apOPEez*Vd zij|#hgSn><oSm9Fd%*rzPctpRmG7<&YYv<@$*iya;CrXv%zb;}u<SZ<OsY-aG|`?O z8&8iPI50kT&po6Q_ChC|<2oTAB!q61iL+h`IV8Ab9dns`NOek0ET9~nk5S4Eq_WG+ zRJR4{q3)sr?1+q;PrVlQqf@$`cJ1Qaqj<R#ma-RryD3!$1zl#DW(J8THcPPlr-p`& zlqe29t4Sf5NtsWM3Av<0eLiMUnhDjcoS;l}{PDb!UeK@VuUVjc$qVNBO9mubwRM=B zWZvSHZ}Tij^eyX;ymfTj#x_?!L*wSafk3bfiZ|7<Puv^~+M|O3>&A)oXJ0vc@V58< zc;a?nUWzZ@d}`p#J7Y_IHBqC-?sh)?pZeE1+ry1*J(s^WH0fBmXKNty{i)->ICkUL z-`%iP<z+ha@8!>YjJX$_crO;Ys!{MzbY?B)x}c8XSX}$ifk}7GL#G9HA?W47&~klP zxdCSmcZB=SCw{X_^_^FkGXoctp!5Aq9en5EUvE%-=Qw=l1Yq6aQ#W1^&LOGMlvpv4 z_kP(o>3w|ug6i#l94#@uoLK`sZp{pngk*X#(QZ}3NWut8xR(@oCx|>`IhRT=?Or~- zhD7c6p)Ey>L}ygzQ)H5P8W8Cg74woR`>@88gMqRtvwt$dA75&8hmKW*%6#{3HyB}; z4^0l5cJ1v?n%ZHeqoITA2S1Vgx$*xI4JPl~ml?C=Mnk3cl>bnlA$%ax;$HnWy$u}f zt>$`dR(3(JwZqq4%OoLdBo7IzL<mc^g7PKU%#d}_3q(oTRd|t)ld|<ZNnKCP5z9&A zjc|=O6K2NcaC(tS$ugR9?`FKUVl<i=xH!|RD*s--L~UlZ`ffGfl%t5ygG=IB(Mlf@ 
zF+xJqjL08Qe@=e=1D3A=w}6UqWW5Ovp#x-`9$dACPzM$3YP>O;ctRFif-{D6CQT4% zSoHt__JHaPYpM|MV3h;4AdFa*SRC?PzZ{?I=p9hEaR6Cd%^kq)LM|NbJiKU!S6Ux+ zR6&FK8xaCwY?X-K$(-bc$u)qGxway`|M{+cSZdjZg@>$oUl$f4vSOwei(8iH(9qxx zxYCDr92<Q8&{8a=%`Z2!jLyDvm39@=x$iDs6DnT9ibO+PFX`pGrwt?H>HgekY!VAe z4<-Zm>-DK<fW7e%l?%HS%e^D5mtKUneI!5e4lHw&00WSAl$K&V_s*3nCV^2mLgAB& zMnJ2k!A5!q@L(GbU&!mzrRZ#V#^9-{sV^p56tGPD%Uh{DpQ*5BOB?xaBb()$l5BA# znp-+?1Fc_FDzl`O7C<CDV-p-p+UYuNO=;!Ph)IL#S%?hw?8KcdmS#kuI){#6lY+Wn z$_ar2Vox*AFvPh79F%o-XA1^tr1ishAGp<eaB8rlEYi8FW$V7z|J#YaFCJnWpK~7T za=vhEWc22>D<}V`X7k={@rwOl89Q=}?Rjxy<}Eh4J9FpyP2rPlz#MCDyZs~+PciS< zsauEs;7=SoV$QE^PYrhcr|l1q3>_QU@87}3^v`bIc=*Jvd)B6A$sh7unU%L==6VIX zC;mamdMLz@6G}BwjG7Y1037}}-_(=o;EPzNV@vynaBU`O^(;Zq2uBzZBJZH^UsOgA z1Nuiu6#VFi%+$w`aG(@fq2}5g&)QbeZ5sfAX#*ZMOqof~#uWkEQv@C?3RFSDCO9*1 zDlwx+37Ej#>s>Vh_2)(}l43^ca@yc(<UqcHMbUSWdbFYi4w>sUWD$c&kZL)o5}20@ zG_sLg_!fvt{t_k87OZ_P!!uW3VvZK4#%~wR+9$jAVHGGOL$BZ0b$MAg$;rvH*!#S% zi{CkW@+I8q7QfwnXI_GE=fi*KFm{MX*$0cofjyktx%i4EncpEw+n%MigSltfz~U>j zsJJre)b7i#=1O238G6Lu)yyJC+Q?WQgBgAuTH&Y~MJ`{q1X)qk0kalTd1x#^6;>%K z;R#SV@A9F~AP7Ken80#d5GIf>EQ~7UkQRc%qAb$nuOB3nml0s`*aamLd|#U}!4%&3 z>-$v;S!+gGNpaM^kL^P!Z%mg)Oh_a((9bM>E=U`driE{^(lSSFpdnI76tQlDz)V+N z%#Y;;DKFp!LK{JPTX>g31qe=#Hl?7JU`K4F2V`%GC6Ewm_G7OH%o+wqHSiaaAXs4F zD4l0h1I=Ql8)H@@3-uZFp`9J6$hjZAbfR+HSr*zJetdr*=^1x7)*SxlJ0FtH#Jp{K zL&9cs&c4fxXF7J|&P@gb2mCL@!_m9h;P$3SH21??S)6>8d3v<4M}@4NB+8#4T}AY$ zC3++&ODz*I5PA?XVo6Sn#yl~Q%EumVJ%~}FuLLoia0i{(ya8e$Jmkc<<=21DiBSt; zgs?p(_<lyXaK7Zu@7}J`B4{QW)j8hJhzm*NGG{6-D8(R=0Ur|dnV=b4>=_|LeD2rB z@X=zdshrl-2sW?)Zr9;Wop{x4iEdB(=x3(ZJRPVF;py5Ce>%XQF4_6r+xRcTudk<H zpT*M)=r5(SxIBMB=*FHhQ6m|OT@}8XKwYptRA7t3->loR;E~#EKBIb9mB<S#c7$XE z*MP<bJtd9QzFmwq(Ux4Oan3AgYtViYhXif9Bz&QTOvV1~H@OcsUx!!pKY!)zg*+wQ z)jZmo+w^f<<I;TCFd7@4<1ykPGjLuEh*LX+=P|_(rlHp0N-&THQ7Y9qD^S%aud->N zgi*4DftzVf^LMCu^0uH(-l3OJ-IWC2DmFjrMw3s}KtUzOE?C8I0=uAW8mF(isWnYD zt;nMIsqLlCs?;~)Z*+&qIK*-|CDRUc3(+%G7Nxol_tF?GugAlvCek`s%&FmZ2KzK+ 
z2qb1tDcU7>)2f0iAvGJIl0^`cf#{f_@Rv76lFzATji?HxYt(H2%?l$;f8@#V&rIn@ z4s^Hz!Q4+$LC=Wi(T;#8_CarT>Ot?<LZQPy|LTFqha>$P#i6ZJotigKzcM=b-B>c3 zJKPscZ96#NbDs9Gn(8fnR&m@jHZdH1uYY7X{P4cRw{9N@SJ*-ZPv23<ttdEwJw64$ zvIyNqTZF8IP91?80*b7MoJn4!K)aEhh6s6#Cd4UB)6BVdn2F?FMH-673>H#JRO&I} z<Ya(toD+<uGKY%}57}$zCpAlAL%TKad9UlBJa)=LN5)6$N*Rq#vI)}+YpG4NjE;@E zjp+ybwvHUD=)39hss2@#BS&14$uB-<d+|h~@0O#pXLlGX#wYGM^OHT1<S6sJbuQf< z+xvG@j@(-V{i}F=Mnu2ytb7VKe;Lpb(IwJ=@2S}96KZiL0DDgt2H*>)DI=mU#+?@Q zO=?ZZ?JmWoj3Gh!NRFQvzo-nr>c<DjtjFOs(?YC190+=@XppO#r+lfg9`CZ&W%QU8 ztsxvRv2OtSdr4YZM!oBLmo8Y(p%bkHohT^L2Hdr#l@)ku#%1ZK<S@)j(W|2vkt3%3 ze<Lsu0j-iq@wNAWWC3rnMhO)(K5|+`ts`^(Fv^%P@>*6UA=Y7{&2Q1}FNr0w%)0mo zj(5nIM}wiu34M8~*&rGi-GX2GgRh=ZU0}mxr@<gP#pg``_Ap}2&g?N!JSMpYR)vjn zVY|Jt=jd!rS5kA#053x?$_FkVrTcRmz>U2}HRF*{z9f8h%Gziz(|6|H76tM6hX=re zlAyoxuC5<)-;Lbwgz$L=v{?<vLER!|b<EYwclDTy5W)mV7sPo=K3hqz)#in4c_ofn zC`aixyPm8JAlfNm+)b~0S7=s4XY}$8mYEy2@OL-x2UuoXG$X?c0Qrz0Bl4k>kdi<1 zp$x*}nN6xg4fk%7S-BAxM)(EgW+)Py49?e7AlZWmhMIppge2Nx$>8U>Rf*8B%O*=U zybP@=Xi2b6ufAdZ7E&f77G-cHd2wb9K1KMW$_<of7JQUkQ`T?6)p6i=XI5J>?eGJv z*_tFGVPb=P(+Y$J?E|DuEW8^Ctp`5lh$9ffoqYdFqlVJb7Wzbxlua4)QR+k8wxzkw zILAXHM~8P~`Ib*K>urWMzhN|La7(OItQhs-)@{K`+_LDc#clquW59p;CvMrGy(2g2 z#ZD~t6jpb=dGyN<UNsLg@y1tI#Dc*t$!9TdeN>Z<ZZHSM*#jrW#9ePZG%$Z>HslGm zXpV=6X8+w|$7FO^6Tu|OU;X96NfOHS`LE=p16soJjUq0_hH4jSFSQP!SehWYb;KZQ z`y?9XTy=<dOLXr|>s)nMw&6{;(U=Kr3>bwWi<PP3xPK(HRZS$cRcII_+&`eSD<=3O zI&3RR(hU=}qngyBBT^{5uppq}S#iKN07MDKRuNzwRC$FsA_B!&Fr0AIXLzwj?d`Il z2Pp|HR}E4T1|OxKcogfeDP@}1Qj=Q}+lF`iXs|CAN^}A$DKnZoD4F+-n*kzq-+yKL zOPv@v9J~BudnhP7YYp0JbZi{z@|Z9G#m<enpXXLb!*d`?-xN(jz4`A?4;`6^S%a7V z73dw3HW5aB1%EWcP;N}S4>A}6ddN~b%OH+UNg=-gxFBqz02mOp$fGLDmvW(fx#l9F z6jC7oT|I;rf*r>sKG<<goy@4};vS*lDN85^@4{rE(v0H&IZ1CWlY=~)hcKoX1yDgL zHs>Eft}k0+vJv~IaEj-Epj@*gOi5ClrJ33UsFF$AjEX7LLW3)qP=R5o<Lh2&;VHAo z8Y^|&&@NcY<7vkZKY020F^6@E?H$|S;oqN~e0tyg-M-)w|A@|7{KaQ~o%@?V%CA73 z9lD>ff4KKy+K~EM?DG-vF|2w0K||dZ&DmAuXFCQuD<bibvAp>7W)GScO|0|+?LoaP 
zY5wAD?(cFx`|UQ2hw%NdKKw&4hxN90FV1yR_%moVV#j5wQ%u-iGw4Kg<H(XxDl)>m zVVRq_2Q#1^CCm!xo?-RfuvzKGX<|jAvL1hr!$h1*wk>O2HOf1fhAq?aA<bISE8W&i z2XAC5G7!SEptK<}oGy1cc<dOobQ_8Uv)yB8=pLZw^9~CXiIdXw=$>#`dFurn5=Pbr z-9Q0qPAX-jQjpfysjW(>7Wgt?Sd2XLL^Tb`5P&_B0TlK}-^h#)dn5K>@h^Q|yP1V@ z(?e^DYeVK%_e&eQEB)q%px(~5o{BYl>h}5k{psgJ{Yify7C74BF+4aNnL5+AEox5& zGZWU{kXUBidGAim56-?4G}l-5nxFPqa=D*v&VBg&aYy*E=*z5kb2Je6_xob5m2bTE z)b!-wH=Ij@{>nWm{q}wDzl7#eQ(~~g8J)CEL{_-mo4t}1-KQNbN4XyPtybWAB!zy* z|Dx`n&SK*QsRuOhk{z*Cw^Ermxj>;*>g%a9sge{*eG&?#v1b{50>+6e6jYBR7xPe% z*$Cg(&YR$X{YWJ-YMg^Y@$jZb51|;K)Iwbh6!K7Wqrjn<T!AD=Bq){S8Mvrrn?Tug z&@T|2#rg$!Mmn1#L4r{i)YMUW?0V|vx{BtBZ*6{abhqA{9=|X2n)N%^);hEMuBC7e zp8R%IZvCTS<K|&@pZi+M=fl5SpmjtcBKB#2u1%w!aEQ5t?0V8Tu`M{k3GU7-(styu zS4O8@{devsv}2fbsFZ1c&+Y>>-GV*ADCBUz{|ht^mZ^#p1PsE0=?Yz=KAfpc6cAR? zg-gyqTB*^XIL%B@{{^Tc%;l?%^U7f!W@hRtnBuK!d<iS2ywR5^@Vs!-`@L#}S!K?+ zF975-Q++|oRGFtMtK7D*n5}kKB4lGqwL2fX>rnEct+REOXC#Bgzl8Lc!bDu8)m1BL z><;s0t1ahj&gFx9@4B0WcEb8iCYl<P?V}vhb-hIq(D(rg1HXpXm3f~aXiL3Nen%-h zL52hyuUsr>EOa1eNe_uZ5d${``WIRQTC6F|-RS)Ax>E@o-X&;i3re)i<|y<X=bueB zv)AoyzPlrxC!?JMH79pWG^2eRe>!wVu~XDfBvbv#WTbsO^7V<)am~t_mkcha@!%SJ zk<sgyN5Y>s1Uyzr_Q+PB-|)?M?2m@d%Kj=Vi|Q?&V98|oxUt4#(mEe6I<q{!Eo$~A z2BVY5tML0}lbGmR?vl-2J6Cr1ca@nvsiDaD?KPL5WT$c_yPXAI$o=A8W7K3ZN|MoH z>KtQtPRd$86N}Bp%Ri2><J*m9BdadC{7)UM@KCgv?3^gL#bLQayAdA4J;G(+EfUX7 zYyi)x0593_4O?mYaCtPVE54nic_nAD8L_luCzvn}>=l+dz{}fKvT`Z{tiW^Of(+-= z%2Rk&O5wqg1dhr~kgM-$1o+^(Jc@cCTEO(Vcb@B96~uKFu13yvonQaWw^XjH;9Q3^ zhWCPYy4vNng*AfW_JZTGZf`X>4*-=)0FGxyN8Ck8a=FMztAH{t%eneks&#mhZ*Xz& z*%vIi9wok-V~(Z*cUYQ{Er&TxVb-&h`JnleDY5}{wd_B_SGvM1AowxK&O*J5M?z=C z;>s@O+2Qk=Bf*#ijOw=-lM=zGwW?;qT%vid=4IIpjv6%^O<uolq+ts<%A~&zM=_(y zQbzH7%-|+IbsJ)$HxyWAeAX<MP~T~ZXg+V~t2MhX|8$;{X20@TIf?VpF!p7C9d=zK z)B|rC<D%drn)5=(w3B$sgx4(2k@s09#KG$<oDaxVQY;v)=J8$)0kIU(SjIWT%ARVz zo+0D3X564q05I2D+KiiGEzN!(QHJv731184_h`^E<wIcu%PZc}bUrcrw^HcC(x6%6 zG23&g9h1RMpV2#tuxxBHc3HG0*ju~r^mjH4OZR;^HH?-IY0O!@|48xUS<$Jtd;34U 
z<T1M0i6e$r@ckJ46+xTE_ggCbqmcDeG*-$dbUpCtSR`)=6&FRs@C9QRGXVf>VtKU6 z?3qfl;=uQ8MX9I+8<fm>__$0lUl+oaYGI>$1lhdC3raP1M0*$Rh-SDhTu`jo5pBbc zXu)dZyiU0>_DzA&+qpcdngk2dC<Vq5D=i4HGhlWszACn$O%H7$JRRnu`iHS3P#0^3 zaNvM{zx~gev%km2=U@FEE1roCSAGLJBw;qNmA#p`3}>#c`l(3}f*mIan$Z|dd!{bp zxD_-iWF0ArQyf;Pq14>zqk91ZsxtwO)s|_ZX#gTxrV_a5<^t)0QQA>OB<lLntdlfQ z0e)AjFh5`<k#oJ!^1;h=N1JH?l_J>Q4)U2mzdW7tgkjGJ%;qhmy9q{DDNcb#KxuSv zx>O@BXeI|$gu&#LX-aX*u#|s^2wZ+S*bE*o<0vagYg@_CX`0crjR*o}FpNG4WY|Q> zUoBu6JNr4UUN5OBw(E;TufxQi^qWr}dGC8K|LEX@W<Q%TIV<%M=ji{VfBHwqb3eS0 zdn<ST$5Q`RNnh&lI*i7H_f+Qo_vpJ%yp?;K1ztP*##W~Dvi)B$8gD3(x3=Zp{`Sc$ zz%pL_n;BNmbwCID*Wb__gAH&Kh9ylN6#zQRDeaICJcS}_IkFE0)CC-N*+s>GV@D>< z!1uZ+Zus;7gjX1pK(>TKR%YC2sy5Tv6rlx>zoa#nWnf#IFW)m}Q5&Gi;ltO*z?M+< zuGl}bd+*FMcRahN|KyLK&;6Q}KA%O`<y)@>_8$o5K6o{9_@pDtj2E6~rMcf-;<Dax z<rkXAqyxx}0e6%#fV>TtCUv-T)yJFctK&0HH4cG!beiX)z36nnNrtV$AXpK~pm7;N z3RNR17F-(5b}uDH>;@Q_LJbFmu|?$ztO|joLea%ge6%Qbs&I8L#l034pq`Hr2IVp} zq>%k}L4;cPx`;o}h!3tn*wKu-1|mN~%}8M?k)f)Hml#SX<TcG13$VrFm3x@$%fpF? 
z+;sVuOM)+V2mU*f7_KF&Du-G-hx;qUIQs)l&DbfQ#bRQyp`)7nJZqPZh!5;dHa;-9 zF|~OpDGqq{B|6^kiVf=}lU52k+^<B__k6LgZ*tT4(0FTZQvCX^z*N_8k=_stxcsC$ zLkOr(NymV7)uGnusSrYoKe6BkL!798;A}X&6-IznXkE%U35tWN-g-*c!_$X_j#p{R z%({fmm!fV>OyQ|(hP#kIX73K}UKy_}5hwmCI{UqtxGxkMnauso;JTQr!agL0oYezu z2gXKovxk51gXeEKb1F32)jL{Lw4VNb;VZut4@i5E@x4by5>ppU!UMuTk`I@<3bVCY zfdkA@eKPCfo{fuCieR2D7U}_PUuEFC#;O3y#$WMFunVQ)tiyy<2(>oz7H9RlO^XIC zp~;c+f%qIo3=w?Luh9`D5y5ks5;-{ysfs1DA09;o+SgF#x0)XLKo+enPu@Sg{!GN_ zkeh>nr!3;e$=r+mA#bdFc>5u5?r*T@%xo{&T;lz!m+u=ZaoyfAQe^P?uup*ZhcRF( z<uqgX?$yF_70HJnG>}5%6v2cV&|DzTR9e7jJZ*r`6chhyXt*C8`GR6XvM*D`i*jV6 z@kPq)Vb+4?KQy^fVyzRnD-I;rkkQeIPdH9ETulFTPCCu3uJR&-b(Fn$e;}3!h-3Y^ zpA2LIJ$<i5nCHGj;Vz$DBlaJTc84Nk5KH)~16O8&D{Ta#Ye+bd@AxqDjt@crM6r{a z*eW=I__!hVMgXA4LNr`L8G8cMr1D9A`c)4a+8KQtJ0+vTq;CkErm{A``&b3Vg)9u2 z%L)s%w_}jsOWzP-B;tCWwIWB6;4i4rA}ZDb<iaEtP_u!^6)Q7KPmKIhGPx2U(3r%m zLngR%|7*o9D^}bhKA1Cx0tQJc#Xc-aikbn08w|039Y~5r9f5M$((+~YC_RXoUrBIX z`MG|x<|wjAcHlE?#4em{KWV1UUcR4kgIXAB=)ibDJi89sR;k5kF=eWdMp1+^GKia^ znY9=5yM`%SM-Kahs+cC!9Pj*kMSz17RuL(!dfqc!z)i(Ydm5)s^Lxr>9GVr0zUhHA z>uv%kE4~RF*@m=IS&oa|+9iE-u?b2}Y3PMY(*w$Tx-r?-jmyY#ZLnCojk3AISKpNG z?i*M;p$bC{sWxhrsDqRDkyjcL2(>t(Lr#ly95q|;-ADv(N)uy2If&+e(GLfaZX>+s znHoGUKN@PaPHjzm=x$p&a$icUZcA<3*K%b0_{P3*`+@C$^3bjWrM@+-eQV1$ef>@A zZ!^a;&wl0j<If!%d2w^Vdg;`tcdTpU$lB)ZU%pR#r>D34=)qHm16%KkPn~@9SZw_6 z_=%%u9*B<T#Pb(-Jig<W=zHJUIF)(bw`x<z_L~zg&+OMfEN7nk#u*G7jhY@7_n*1c zd&k7-9S7g(+y>8i&y`WFN$bI@7Ig2U69aU|XeS?<25Zx(2YM}#OQGez`FY+|bIVtk z$1##ic{84DL6aM3-K}QnyLlVoOcj5ZjaGr@g($~~G-o7gFvj8)h<H{a*%2gYG~S8d zy%uEaKtdiEzcWItxqk)7lZlcujzB8IK*0tAI2Y5Igk?rkTU8$=YU?PuX0md~Y^5!Y z2U4W>@H~WvR-=9#d}N+ll&l2(5r`9fBz|Hm-8C^prC!>5qxtJ-N3TKD>(FoN&=mkC zzj0{k$x>%Y*k&Cn08Gw4>K(Tqxt$}9dO6}KunK@UYLH&BeDH#q{Wwrv;Y1sw$!w9F zEv?;Y;7@KC-ElM$F2Q!6KtJJ62Ff(wr~=$|AK*zH!H5I!q#mHLdqra+4^IjTuD>#@ z6}4WxS|V&<F_o6tT53Z>Igu1qE$;)Z1!j5HNOZ<B6qI0y&XduZ^(vhkqca`q(~XoC z6B=nd4-h}CYQ4wGv3;y)q+ywj)YS-xT*4>;Eop@=^fQgc)4LMX{E5&70f)L&g}X#e 
zsV-L;V0=0Kzwp&R<K|45KB2NuHr!2j!WcAAHbLe?)QPG|XKE}nqS@fJ65UH6@|nuC z62=%oR+zYn%-uL|UPS(tnB@j!Xo*=S!7QtmuNkKgHLeGbb(3lk5Ccv#Ea<~!l~FVc zx>J$Wz8X!HCf@92@@Hlzm1V?_2VvFRcWubhrGG>m#UV>i>#I#g52}nKeP;n^t0U?0 z8TO6<v~|PK9`q81^x@0iImR+;7cth7630Aak)6{BhpvpEKb?5Tk6E4{<@EsIi!g&v z(hoEqB>M0wD_cO7ZW)#gW}EOR0k1N%ROKSLrFHW-=jf2_M1ex(1{5fO>79pZ%Sr?W zAxH!UL6bQY=oH}c4Txrn0GDr~Xr>f!`9-COHm8dWCj6bx?L|64!6hwROJs!v!WRH& zfs)0>5>Qf5d;|*)s>_%ju~!i4n^KR`I2ybcDun>=GtDFGI~Nq;kbfs#@F{e_8w;@a z{auT(_h%m_x!>a~DfL&+%YAdWSS8<s=R7)AyBx^;4AA`dsnGo6(z#C%b5LmGl~L)P zJgo*9Hkp#n&i0X<qsfZoyl)8|Zy(fcwt?jQ)6o$<<oTj*oF~Motj;oks0v;Lb^vYN zBC*D9E%b%YlQpxHM7C8GZDIxW)ph9&grm<)@0CFi52D`-(QjJP6(@p5(DQ+ctRhjw zA`pPu$I@Sjlc+E4>YWqyPbR-=^tlzkt}4hIgG&n-;p!y}v%!y*ds*d&KfN|T?EQEl z=XFEy!w<O)PYUZOi_iH%NBp2#?5xf~{GbEca<-iKVI|l5B-bDqieV(zt-ME~bt&Y! z8yO<VHM{`K>ydScM`tjjfu|VlZv*U`YSyn~$dQ$&LD`vpRl&FP)BkzP5M=#xdD^f1 z_j@RHk!gplw?WpsN!EKH>uo(G>uv4b_&YD_jd0&!{dw~%(YYUV?&$2|a(=@~+=HCA zLeBe#cu#s9LweJ-RY6*~rU=bMbg8gO1CD_06C8n-yzI~U0iU2FRPw)lJ;m*{N6ZDN z>7ob4Uw&<FRfo&py+~cOt6ZQhx~|f;SIjvBv&L&O3+UbhUU--520grhLE)!)P2L&J z<I>R>8Ctc7;`w@56ckTJfr*0Ij{>kaz&(&BMD>P)tX5zWorhpL;3;yAP{ctE!3E^3 z16K<aM=UebrcylIMidYCfZ|t6dkhqB0>zU=@fJ|Lsf8%s6iedoJjKI^U8e;%A7aA# z^l297lRbdLgW4x@15HZP*2CR=0ZxI!B(+Eb@)n@9=a<Ex3;m<ZVm=jBzu$SJ=x_hk zUs5FL^((*5H_EU(joUfi)b}_hzozVC#6C~ZAjjp7l&D*0Fm6?r`xABT4ENw9@8ynb z6BBk>5{|<z|CZCUlw0Ntd3s`<2=>q*H$;{hFkSQtn5Y=CqF0~RA|pRMV^E0*#62X! 
z$V3AS26~?CvNCUHF=R$&RzWEZj+TuEr88!rbWlqH3~mWwI^+#5c94+lxoEPp{Q8C$ zK^7U=#S35%QfbKSB8K`n*ZbX%fN3{%t>i0WW-G6U@2`^MXdY~PEjGy9|G=*-gpOR( z)De`nf0Esmgw2qx-)QaJM~Vu|uu)Cbv?i!j?g$}Ril_vDAa$y*mPTjN>Z^}8!YqRv zA6C@N<Jgi4U7`}FJ)T)&R%&sfNhJ=_c%Rvb+M!6Z@Q^?-r50IuI8M~agwfk9OJh_E zWQ?F4lgL;{p;c20jU^~)vahXaMMl__g)>DTCw$Tcu@yq5!h)h}xlx2zbzN$!3Y5to zBS_*8Ik|y$y?^iS%@5N?paQv7qH4Z#O$oYO@+jlb>U>|<Z!GtA)=ytd+UlG+`<; zj!A^q^U?)UsXYhS5}a>NqM=*VA;}u{#Hm|E9kSyi2&*s7v~#*-+Rd~#Z*F6=IU|#E zRIO5xwnM0|5GbLdSD>|#p2=_Bqbb{VjOHlm<sc2Rvc16VP>e`;of9lhkH3zPnU&H; ze*TDs^Ny}$Ij1Qq1$FA^c8Z1&5{rCYOyGPb6!lLIfA9Vzob}n4Ki2;hLnDI?W%-3d zE5oTxT~k;nw8LLQ3x(F?F6S2t)wNnB^C!usRv1MdL(;k-n@RLAeoolNbb0Cg1V3ma zw}mvzvrc_=O|R$XPv`Xzdgo0liElDzT8XZs>f2FR-NFsZVC9{oOea0l!|y2DLHmNY z(+ThPoUm-=nspP@ilFpt&QpA2-sNcl#oKz|^icjv`TyGM`E1;x%J6&Fa)Kxyf_)nc z(jnrBYcb9ZgZp0d<pKKhhEw-xJfX{fwPr9<W?oZIYpLeAXlKnwxk#1yN!Zg5@;bZ` z`R7sQxLS*+upw4u1>ioa&haM%P%Lx9Rk|A>bMsq@Gj&8{BsIuc>MnF0P#`so#{j=Y zVO0Y?qqgrUWB)<z-hf#5oDfTOEbGV66vlLi^UO0ul|ShBS};#2+Pt*8pKkyDn*C$g zgy%~auV%=rdSQX8zdB|XcM)?mwJqG<faMKW^Tu@)elE+ILETX{#|+m}DO~?g^4uro zJnX0wwV?L)PG0|)fEawNx}GP)C{RM<Wg!HDq1G2RT}=Zx8I?D6GaDfdT0l~GP(*68 z>J%61aORAYXj~X*j}~fq7#GASL}59Qhn`Ul1~1kXt649_OePnk3j-z+xZsKhn*W?o zQ{UK};vI!uYI3xLYI}IB3>2=Y4uoS=%T**E2kG=?yixO9a~tlQ`3-mQx~^*=u1m|+ zSFgnrZ(guR?$Eu9_Q<V?A4bn&UbtI=1M+>X2Bz-M8yT7X-=9T}MPcg75$O@`7yA(9 zs9EdsLP3VWo!}O$@I?%ZC#kDg%}VlJ-X#by6oZ329zB(kP;Oa}E4F>2?c0%0=$6)? 
zZ-!P#Q9^evv7;3~)?6{{+?o|V(yPGW9Tl?0RIx^Fk{k<Qd{t7^j#?Pr^2MLFi!?b3 zj~fwHw?Kow)fi^vRX`i{H&w)G11JV$sBGy6aWQv@O3@r*-CU&z1=|8bcqx@}*`Z{2 zg_((ujQvb&sQ_|yU%8cOAGzV@VrgYx>`XLREJ-F6q|ziwh9wcr{ZYSBmP+#1G_oNS zCY>q@ZC7SB@8bKE2r1#Nd<VFOe+9)_h0;)jz7wiK1cVqwO{kR`m-*MK;a{W%g*^c2 zQm>W_6NoPFtcgQ@^A_)>I1M%{0!Hyrkp*rn+7uK*zCd4zQiqs?e`jiHLP3lsjSE2W zXxZ~EDwgGsz(*+Q>KPnJcCVO9b+6dZUP<)0&C)|R<YGUU=s1$SI2`GrG4|?vlG)wY zbSBZbMRR{FvT+uWlIUc*`e*`%xuY>+vnP&5Lv+?Y$IrzLOLG+c+(c{jWU`iOG>pWv zZS!ZNlmjjgsN)<_i-_LDPp64M8&eBUhw5l{97!dfspP1-m9#OoGp?+lgKEM-1?ixW zG*i40^`M%O?!LxB{eH&=)>*kf)wvSKhQp%+oMNxWmhl6e-Aw${n~e3V{G@sGXE?y# z_<{qRKgyW@8b`T+%koEwdU@BCcjV(Z%T>Zz?DV5^Y*%F_7SGmDQIe+Z6|oHUN8~`k zY16W$5?JUs(4lCi`=S!TVMVB*&>d+;8mv1)FS}RZ+|qGnwQ6u*i2uAuZTG_%W8DQb zh|oT3VcJj(I0gJ+bFb8O)a{gl1P<8}%e2{3P5Fq-Doml+U5JsOW}5<eMsqTZhtfXg zMIxw!81iutJ|6$^Vk^GJ;0BK-OlBra%x(22dW=l_a@CWvBajX{bN3mn7N0qHziFgv zf2?nzTo>pd{XmifRwj#~$n1|bRSBOYmxgE)M#!MWvW&J5f9G*FODl-MuSpf3No8=* z!h7W%DGOReff>h<eHGE6o;$QwR<?S;*euh8loB)(T4ffWLnF&CrsmrUNDk>BFg}Ir z)Dr9XEptfr(O9X486@VhR{4G1Ch3yzNy!mNg`Bw))_!yDDR(dxHcF4-TS>wBZ{-gf zMY%Lgwe7Hp$@);&hqDRRtoh#~L0J|{aD#d;@b8z$IWO(Tzn^#sX>ZcM!FaZdcxjOH z5`8BCKH|o-Y^}m{qRLAx7ZpEv$=^Z@<R^&X5P+rq^sWV|G~y*ei6cN$I(r~{{lJy$ zfbfAKu6q#k8tK|~e%=xKsSb&4`#Dg<`eH3zZZoNN(p`zENGj<1wC5Hwam@}jcMS`$ zF}(aE2a5gaSKiV4xSpF7fREcoCz{v)&4eI4OlKOvnGz=SfGV8^n|B>QQ4Cx3sm7p> ztGm7qoM_$>G?5bImVhfkjO)|Tg#dW4OdBb<HslEtJ*x@NA~Rrlro5_QXe~jv6%<`C zZxJGFp&-sP;MM2Ot>!|TfrI}Ddup@E?tv+m>KfSpVehqBRTKRw8If2bZ$xsjJRw2> z#43v^F{c+Fy&flrR4sUI!?Up>82t!IV0%ClLr{7JHhH_SPk30J2hTeeSz*&HWPA8b zI3aUqmUFEMPJc#M)!IFA8>h@&(Tw||QVbfndGgmy16Rd->{Ph}gA;d{m7O5RdOWax zC$azf&*Oo0ab-WQYKlQ5KvwL$1Am)9EZV`O_-SkxQ}Z=Jf(!zguzomCBXGVJu!#6i zMbN|bdA83(ZW9=KZN6Jxb^`7VPe5bP<(33ZWAgt%Yjo3N*p4PII7x$XZ~~sCYYBOB zV5+qX|6{Z3*ZuOu*v}=aYybYu34dAcD5;hQ%)?=$C_QhH7`o}~qlpqIB8DC5J<u($ zS>*@q53GCdfw6DBaKkSoTRDuM!_K7BV}HG$-8HyB+1<89^KdM>d3I-_v5WK&FurkR zb6xv~KY=EaRwPP;q;>sP7UPO|BY)4I1Ou2E|0JP<(=Xc0-;s|6UqXHnGPXsylR5-R 
zq%znwGpg#RhE)GH&@8)!E3OJCt~=DBO^5F$u8j9ZGvgPPtsvF-RwC8-29Rn^JhRoT z>;kod=zY8ke{UTJwN|GyyDZO$l3`$w5HK@At6~N)y`bDNiU!dld$6=)k_#DY!ilS{ z6b7el4!~KGmE%fWT}1=oAqWH~9MX9Z<pg@nFdBoTAX-Y{tg6F!3&oCp2{)BC0I_4m z5eAkqH5mj^#h2f~Mwu>jo*AwGWHQ@+=@o-*M(M2Myv{N*5&2iQmxadfGBIiU&A*Zi z!IU;IYc7_{XRRgnK)@)S_vW5h?CuWlw;z~fvX!}VzZ%{c-cM#SgwKom@-kQMTTT~S z76^%wRIC~mqee`3<j#f6*2diKBSoeN5iSA;*%f0uovbqVm+s06jig==eZ+D8@HI@U z-OF~_oz{Bxh26%D@SkZP*Q{0mZ8u}DbapLKwmCuBD6*zaZG$ALK2BH}ghh85t}7Lr z$i{AoW+pExVNh=?s284uh}ki>Cn$k+l<T#@0Fm7x$Nw|YYOc}j+O#4O<#Yk`9_Uh+ z(m(BdN7fW^y4ZebvRHrR)zq~UUA`roL6_fhy6iWBEWgwi2UBt&XEw+#ki{85JKA~g z8PR%RJJVS~h0#f`^qA_Id<r_ag4d%%Re6C95tIDo!kO<!Ck)9G&Rl*PXKsZ&{gIFz zpi|$>d)LU`_pDxj3&~TQlw(<f`i9|BDh+$-&>>5=&7t&}aPEN0&vuwdtU6jCR?FgC zmgL(&jQ}Y;ljWIEW7EJO*(o%jp(7>RJ<MBG1j~#!uq1s0i5YMd?I0HULj{>pwPP^A z1Gnpo+SO)^Q;*Bjrz4;A!stYplW6QV+f?(i4?C_;PxjO=M6>E(SjE$0xqY9Co+rg* zTS00L79)^CC|ULiG@dtm^Nz@3NxMFUHAG+i0Z!j)>`^=|WXFlZw-+cp<5+XkE}A9E zb`yz#lcJJ1>sfl3E6gL&%)N_<42T+n4RV3rt3dBnyybIMALzX{uKXdc%J+iaR3jsL zr`kKmdEyrIhEB~6uipt#M3!MDl4xDQ2hY)<RC8^5BY=hFDmX8nmHP8e%x9(k=xvs% zmL6uE^KU+gUY`*(Q$Lk$s*eNz`=PPfv!9t0rb>wu0)|;zAtK!Osd-{P2LH4?!GP&L ztN9^#qFy+p;uX;@cr3!mRugYjCT40?ovQ~+@PVQFi;5kzw%3zvQ^Rhyo`>7@7&4<x zlT_+)tGRocD6&)T4a+srx?7yixT!HsP@J{62wZTz1@HuJRUe~qcY&4UsR;5+d~ur- zBcpSYn|wF1%w?4dIT?S5Nr%_R!uR*)Mt-d=UDoE2q90Ak%<hi3nYSqSAPiI3EJ<#4 zPhCIoN4H9Z$vJ^+8NTmAJWj<zGeqdTmM+rZ%2f!PXz9B)sHO#(m>Gj);;5RRjGQT+ z<{+7?2_&-^AejLI$^4QElDQoqnU_8R6O-%h5NxFjG&EsG{uUuyj&H&fwaAKD$wnIK zn*da*_M*Z-x?0|AU0Vfu0SB6?38pgGOMpu$j7Wny7}=uGxQC!f)J=$Z-e^(EY9Ovw zYo?w=l!#MpY33jfLAaXLw3^3PQ_FqT1sOJqUfK2y|C#;!vRl9U;K=cz)#-OynTMUs zJ(+tqn0u~{d`(KVIqlLTQet*@%Fq7p)V~~`@Vd81xl_4+%6*V~9(4Pt<B<0g-jmPc zyY~rS7gXmxL5vIggjysy3zkZrQ$m!B5Q@XM<7%c0Hfwp5#+iA^_<J8bq~<u<%$fEJ zO1<a(Oc2vV!|#7zz3ijOTE%F6KT~o+_@=R>+83;EYp3)F{b3)uqNYOhs4OTVBX^?+ z*$!krFujO0qpb>K-*b)YEu@l?Sg=WX4~;zM{Bef@w^C@&QL_VVhJ9CNFENuOnmVKY zN!cImv`eSA{t&6R&V+L`mohqyrrh%7Qbn@cEFD<&qGaxkd8O6W(RQmO{b1GyQI|rL 
zE(ykFZ%(<j_(iG@XT}1gxMsy{z~#|=clNbJ0PyjJJCe6m^R={B6fH<vlAi#6gZKBW zP%K1fQiC>R5LR!-jPQv%-hFrt&I0x|B3?ea(HPBOr2_`Yvn9M$wuFW?kRtGukQVW{ z@D7WmO3c$$B{HTr&|O~zB4Mev%wKWv)e;&zFVd7zHJ36d^GG6fHZY@e;r>(#pcbLX zJ>4G&w)KVr9p~d)^?KBS|BJz}H6HuC!2le`r}X-*acMlzu`KBC>d6h{jvd5Wi9D2| zzV#qGcrY#NW5r1M#sQ0nGmwN7^yV>s2F)BF=5K{;m^4U!JwPp?d^l*>Dj)QlY5njZ znm-dWed^FnD_9VK;LWh8IG1J~(l<wN5|pUNNysa35<Tik0BC%L`bst{#SqhKe59I3 zpw&Tq1l>l}=4o$r8$OaW2wof>VK?wsd?T%lVn;UIM;m99YJ{OmE7}~8=|^Lj;*UYB zHJIu`8v_J89~5HhI*FFvLe!D>rwFfQZt7w_JmCx1s3#B;{y+dOK)lq6>nDcC$A>3! zw+3T~<D`>=?4e7NH8>uO2CaCP+jU8@;1)Z$9Q5e{eR8`BloBsZPG0`bdbUCA3Bv@6 zrC0@pXwZc5jM2;Ai8Ns7+myc~B0g!<Jf9~ZehVig=p+-JwC)9Tnh>@L|CXl}TDqy_ zOHahgQwv>Sq%=e#R~&tP%HUSS7bBcvO5b{Nby4;P&Dw}&)Y8>mpjjJe)^-C3uv(?m zYV;GTui8|A43L=!>y<a*p&L~y-58zOq`tZ?VT<v5i+k`9l>>N1yp59+h@Bv(+zlG8 z-&!Cn(O6JA<_O#R*$6xDDRCXXm>cE^yO78WUu%A&5`6ac0>L%mf0S%0!9o93v#;kV z{#ErscBnw_+1CoM*h2+w_^do2_T+iyBg`<*6e4)yUF}2Qi2+P^+^_ONwwr_(l1XLF zbUdqv=w`eL+9O7>kJ()ZsDyY19SbOfbCrg{j)99x5^Rv<cw5PKxIzOQg=c_VAvDAc zfN}#XLAf=Y5BMxc5|NHFG)gq;Bf>?1#)Gkrtb>r<AbwogtAYMp2;;OlDT{zfnnJY- zUHD7}Mh_CilmSp>l9O((xT9GhhA$R~!FK;ZZ}mq?Y#G1*oXb06W<~o~ozObB1#|CM zy?%4<J$rj1z@?LjrIN~3po@6vDnXQE{#@^-3HIGKzf{U=U;!XvZN5ARux#;#L_mfB ze~h3~oNokP=C3sSfVI((pYg1uqt_((Jh4!OwQG>M<|ttuVGxW8zERRr1T_}N(76sc zuj)~t9KrA(!g_UMwuEU*SXw$c@icJ79=UN6KbNHH?bh5c?H+vD4>Xb|v6^mG_C9Pd zl>5qybb5_d`cqeh>p%OdG|__2m+aBHC6^D=xB7~%c(AFca6{B-t-2~>qanAyOWVO7 z9EM{_x!-rS`||wtfQoB>HGhTaOWG27>o8qf5%@&@qL$RQc!9;xNBXJwt60aC`;2UK zS5u3)s^BSsTi~Dh)D$@<bG$}=TjDCMe|-*<1}A*6PRAS<b;j)S?&Vi8n|}W))_Q4y zHW>UzuKgD<*rXgz7T70*>I7#h()k}{wNGIQV5h-fI+5%Db;uL$Ce(nX5cO>R0%PqX z+yE;mqLk6LV$60y=4V8stD7KqX<YBt#LSJV)xH_rcsl_LaH}2TRy&66Q>w{c4`Cly zmHs$}htyYj%?@0htcROy4?2c3{y1=TrmH+V@R6<RgIl9BJ5)`9X%$6|!DQFpxQTSU zhfMa(w{nw>fNHwBQH4iNT4pLz9YZQu(jr|=IbhJQNv-Q%$81vqc!4HaTn*22!u3qM z_EKKcpEv9G4~dtqW!5Kx+v{@w0>xs<{nQSYVDj$%zb~@+BSx!ILkfuX_tys7$uDDH zB`pIFpmSd5e?!$N*C#a8y<z<)u?zEG%P#<5=pW&H;pBXAKsd(a{46KI8RQ+gtUZz@ 
z=b(zX!#4u%=;7S)jA$wk41JE+V=AiLdK5SIs!qbqhe%<4IhyHES)_xr2v34>ukJ3; z`*xMyw?}90TErq&cnDT7h4orYfjhK_m7wvnA-*#{L=|ozH#_kW=e_v7UsSp0IB`!0 zxTksV6mgHWz&%O`4Km81qnwZ6LQFR-?c?jRdn_|`%T`Wom{SmBQ&bO7EfMh7*XF6K zJcj=WPc8H$3jD<$Qu*w|TH-V5_DkA=H=#W_Snwu3k<r9A3+$D<wZL+k$h={jb2SSu z{txlqyc>dg6#1sxz=JkSQ6Ixz-s~nYUZy)i>jvS6Ot-I?;1!I0WGw3W8g+6`Yj+<Y z=k#clwrz4>1l+dcqGADK?BIiFJGOu^_QW$gz!*E`86#scD?0E7oEC+6#IgfCvWeIM z!=hhA-oox!H@TkjuFxxMM)uzgI4k%mwoQ0!;Bz<QF-qgv-TtOEsD`C8`>>ArZgN^6 zw0uQ@x}K2p4onH}xrxJ#6c?p-0Mk}V7%?!;fJ2x=;30RDm%D-HW)iOvV<4nk6pZK= zY9@Bc=U)+gN|Lh?05*<~vs-;ugX55)ST-Irm&$gR&nR#1enNDm`%_QJ^0VRIV9>*) z-OFFcP`Jx#D6>x)ON>X1C9=itF-bf6a?h#G&z$Ep_Rq5A@}2_6v`NLu#9MmwxiP6P zE%87_{GEiedDhBG&)c1D)H%cwDr@nOB{J})0TUCs*^c&-g>j2y^HsSYH(9@Bc2-)% zg?A9qEDU_)09L4Z3y}@4k7!<o-?R+~li5Dv8y^RP%~sX7k-{1yp20@SwLoRjTh)Ts z%~1*-1ZC|CXaO<Fqdlhslw0gL5qUry&4P$J-hy3x0L}x2kiw8@H6?+6p;v=I+llxn zUM&i^fx8E=-2I2_wqq_$Wtf~Sdvv>A-!<Cul)>=de2I>-puzA+S;r#=gY41UM7^xH z8K?9{ebEuOE6Shu)y6yqLnPcEddgs&9@(MSKT@8;^9G->Ov0jr67j8PJ+AY9KR%m} z&gD?=n<jH^Hqzpdw1p5}C(8y?na7L`*|PClF)TtxY%hxXm_aO#CEqgIn6%GO_zj39 zMMII<2Kl7C+A!pE8}@)66`mBb%jujUh_C?^IMz1ZMMu4vj@q7>*`~&h=1M9gXO2W? z4(7%4OVP{?7nR$g+;2B4QxH5Jy-!i_SdD4VOdX5}Bz5W@sPHeSD*PCpy8Q--WV&_p z7AS#q<{ryAs~roNcF;7ZQauF{9y~&dyk|Ll;-PeAn<cAjy%9nR7w?+FcP(emSBMuD zi{6~0rxF(>Ee&{N-c!_R;W=<Q#?Ayd*<>CT1{~~uh@bNiNLxbUS60r&k4-Le{&G)3 z_{v?jVtd)$dEqmcRieJNqwll?RO^8d$#b+n=8?8_zAc-@sfl1)AH<jGo^9((cqh2u zNG^jWaec|VMzY}ALgz<fxFl5&>i)L8_>$mloRq@-c>x{^>o&|qmq;;SW&(VI93=6{ zJ;IlSKS3t)XIJyeG<f9*l~xur%QTqfAUP3#9L=0g%zTA2Oy*DFK>Ybap1B9ya2V`! 
z7+iAr1i0jw>NXsU&YVzR9X2apDlpcU7BUvnLRHld<0B8K4?Yl`d01t!M?b=1#}={J z;d{VhU-@IoD|`vB{`B;JgxTiHZy(k3pPku|FJ0v9E(moOx)9gqLh<_xUDeB-pOFcN zCv4HK1ZTozUr>+S)mJB0kB_gO$c+?S44%K>UhV&R#*`zydBz-Sp)A<-Ga$sPOb8%% zD_}oG6_C_8E-~SR(mqN9_8=&F!=gnLUIKl^rWF8t@jj$dJ8~RCDe;#;z%$igp;7_N zVQ_MnlKO^Apnrgx4*VP-MgWL~;J+6qXJ61o%3JgMZksnF8;cCA!LYgc1I^*y9<Nwf zXgfQyXc-6o?bsKmi2{42O4y_#anHE<%7$zadFN%YbY-*@TqrB0iA?<CfuCKD(oH$< zK<B%dv0;(^(H0anvPBvyJrSBdZT2a1<B}lLC>I<W4)esJtJJI}cJpwoOP*ZNL<hcc z%kytA@v#TacGvdnC0ozZ?MKhtd2*-k{M6_E_2pNX;kNN`(_kbK{p%+(ZycA96Bt3Q ziFPNOg)ZT-{1yQWD|W-pOP1KsEsu6PvMDK;nCueTNl`&x3NZId{Sr<TEC7Y~$1p6B zai}<5m`xN&=wa}cD15peyb&;qIAf~_*7FAVs6|<lBGQ!5^b<A~#Vym8L>s>+0~Oza z)Gx2i0XV3RLjlxVG7&leo<N_QQA@SZ_GyfM6Amwj;Jp@w=XWKGFD%?+>|C_*cmeen z?M2QlUSGUzZp|^17qP|WtM(zQd>w%f?$X?cei7i&t-&EGt~jvrN3lD{Q5{Vq%8BzW zA4(~KYIgBW0!r{k80s$b%uW34dF#l!i+m|3;1c`R5uf)Bg5}p?^VM|sy0x&?*w#tV z(G8G^hIYEz5W@rQ4fL{|Aom;iiYeOOlC|*$NZPNH_48hJ+Qh@VQu@$}qO9J4vv0R# z<rdoBnP~t$icl&~XZ)6#;!;nL(lr?y&3Ow-7ut-zDM&?zVk_etmT#g>2C(#Y2pMPs z34^Joy!+MCMx_wUB3|=GkY*5p&JVOYJNa|E1uJ++c%2Tm6<|@1er@=UQmrL%@9n8W z<KYQNtUrL7$ym6f<J2EuWDOTwbYVPneCt7fS;NwzqNlvcvwHnxylLF1KeN?bWES6Y z0Y>VHbv<@lOTYW<qq%RR#sUuvHI<lNZa=_wY+d&iS337Cx|~qASjkHdvTf(tfj}&m z86R-UMOeZk;WPC3{o@h6-)Rq;14Mrf_5jZ6t*8T52~EK2Ko2Y7#nHSeWG3TPSCQp2 zJ6ICNGQi=mBa091fb2Yaa+OBd*G4{13N82^pY)BPmr|*)fFg~SY^8^9N2~?FaWP3$ zlHfM%dhTv<JTzYXtRcid3viHXL5p~c4Q?|4(ln$yKp1LYfY+=692EhoYuzW`{rdi{ zpb@3e|LM!U<8$SH?9aWKn-1Uh1T&rWo$fikeect!hF9!8eR615XYX5`J4PRT_||2t z;$OB;Sfp2D!+Y=e%Jvji5x*=+z~n3+f9jj}O-Ry*yTUvF`j1cj^YMwQ<nq}=>6)7~ zM+f2+BZpo;nft}%GHjmUy&%$iI4?i+(Fy_GbF^(r3BneLKnJC&G0nag<D>?-9<7Cb z;f<?uEwQ?c!av~z)~YKd)ERd=$_5-IZ3|zCziVL+6iYSEZ<$4jq2QEhvt~U(OzPnL zf+*(^d<E8~VuyJ*K^nnuH1lGp8tU;3mReZ`WEs>Cjnrb4gYahv#v6y~K@(jSHc<kB zG)W+Bz+Utj+min;ZSMjYRdub8&pzkOGn2_YGkHyt$z(ElkPOK%8A8Z|kOT}N#1JCJ zh%vlhq9P(9Qi@b5ML`r1EmDfsQff0Xq9XN$meO8pDV5fGDfN2&S*o>Ouk|Wq4*zfM zGm}hs1Y7;vLMD?iXRo!_UVA;hwN{C@I%Jj(l~*0EVs^G}+M4})XD)iSY8N_xH8gEp 
zy8pq}$~C99x3!JSY-{rGUyxKc?^b(e0aB#i67h$DEck33+9q$3KfC)6>)PFTV{2(< z(Y(W(R{r+cB^je98Eh3T_pRJ{$Y-(nAD*>!sX2@6z#$o|fDB%vnoB*qxg1_KTn<~5 zZ%Gd4aXD;<9JZS&9o^rpl)iPF!`Fa?O-MPR;r!`v@i?fekTUiT18S){u53tJ=i^tq z8;RRjfy>ueo=%(C!n-wxtTDl@vh)?I;PVlvDx82)5mmV5Dqt7wx^WY_=5p>Uef=&Y z$HekvkvNW8bu_r=Oc&9Nmu9Jk&?+oWz=N2zs&Qg!it}F-^~(IC?5TOqy1I!=Z*_Ve zaV*&XOYit$((iVz#Hi?s%&&L~96IMl`OMDOd}%W(a{5Q@?b*J>?v31+=6m%~rrDUR zcf_BM0^v1}eZfwQIH<6b&E!9KTG}tYr*f&NuMFuGtyWc!9Z2M(-&t>_zt?agoC>ke zBL6xO<!SwbWVWYrwqp80d4MiWbp0oxrEa-2TR~;PYz3i6e*ayQclo>bo_zDuqtCrk zI-x-0W0~JWtNxCb2W2Poy4T#mfO3JmWq<hz!>MILm;9{!kF_7ZvO=-vJ`(M)iQG8H z9J52+JuwPFF&`3O6D81?MOA{0Y$8;B=}u6GcPjY;3`<QmkD%B**gaY#b$i1W3JB9+ zjsdopZWpvN_2hVs&eE;To3`J7qrW!Gpb<h|lPl}Mj{f&fzkc#{t584f;Qqc-Y2&jC zvb=@@ck6~T&%Gi4mDadQ^`vf_xLuWmOgYt{AtjDEJ`58adiAkJlhl-0N7h@d4m};i zs2B|E^w1F0ms})0k_h#S`!1l-x*hIhr-!<X2g`*(%j8Ei|NJsxLr>f4lRYns+xP8Q zy!7|d`o02xJw3ZsE)c6U^>}s&-H=uA)OG$|76|JJ6N3e)PedE)@kdHv2!9efEa~{u z@kmFW&*Fh32cfdj&r_`0_fYH+6QzwUlOO%kp!s~mY&?JVE8NB&MlXuxu_G~hbtgb5 zjebHcWQ<;P9^(&>K2a4`OM~=!J>?;H(4FTC&@JuIXx@&0UeC)<(puh<?gF_J`LpN9 zMPi$#5%e~y`b<Qx-XGRMl@IY3&*P+vC!(eI{OKAR^+HSr=>V(36c(W@MwdE+$><eg zDGxaa_V3A?vv0>DJdravotXdmmxd_wiQjXu*YmK~9{jJ@${Pi6+H?Ih|4|k8P<JuC zAvI0WI0zc~&>|JJc{D+Th{e{@o7876gw57ywFS+dN>A&W6)UE%>U^=NZNb7btCr4b zD|)GYZS$Iyt6SaG-YbOmp7O4i%~#FL@&{V>2j8Bz^!LI2EdhUe*UWpHudhINQNA`g z3%qe-W%;VUH1LM?=K6a*C!!OPXn|<!JhVQbOpqSI#`CrCRYDZN3BFdoN_vw$5q)^( zX#Vi|OWzuLcz4CM(tFK0m7ZxJpdAE^8YJLF5O4<(P~~k>)_2aOX;+)!BSiaG(Midn zlVXWGDby}fIU5PsD-~^<nPZGFN7RSB9{TQ8V>(7CSrVQJ?faATQS({5rFERB$t=v7 z#=O!S<&EN5`xNP;pPU>gKEu*_+<DYCnD*{mx@M-@pgdm>8(^?n6wimWsQA6ZpFjB6 z=s`)Kvxy#os9DnZpht4i{aesDLU@#;8IStUabiCnm6Mk*9K@pd^DS4>q6*y8<ad<k z&plTR`rW`E2c!O-HA=sYR~IYC21W_5^yA0HK<WV^#T##^edSITVtIHrwgp4WBq{D$ zI#FHi_AcJ~<1JW~^wE^FcC1SG$k_*TtS&n=15?dC7XbY^pg*X>u|fDWo&l;8)OwY= zz|aPW<n|KZ0iq#2WdH+gp=2!@E0&}8C8WkzZv9dGlXjkY2<MosAG_}z{&bM96Qo+~ zNiLqf>YPfqP1>t+svf6jhYg8vQvfEy;Q^M9Qc5ZQUbN*^%BcG+d_1QG)hC3$B@Em6 
zpM3b30#-Acd(|(86EJR2^Rl|vXntBUsxci9I1+7w2Ezgo6vwxCKat)SRZpum2}T9S zCaE=A1KOc*SR4ML2L4oJ4L>tRY1n}pVQe-jVp~9jG8R3y^Axgh?nwI)anC^I+4i%M z+&I-Nv_^UtN1VdxAr(E0_C2az(GPk-yGfzlkQk-ie4<}zR3D=~`C_zd%qUI_t3kUF zGYKTM5tJhBrqv`6;q-T}8idpPmnGbSuCr0XO?1i>!nsB!l7pQ5;hH&e9Vk9~PRa%S z!kzq1;5wQ}UQEOf)79@p??t=n3a%^BH67cHOf+4CR>WE`!Y($>7=5ZSQJJCSo*L-d zlwx{@p6aAj1)PAe0d5IoFkqqE0w$M()HM{U#q-X5P4V5c+fq9|>}mh|s@CS6Uml+? zZkJzNc08-=l>)!3Z&TlW@=MAYtdd*Ar=TNC;77q+ph-aauyoS#Xdp<@<t*Bhg|MR3 zaph@^Ha#D@T+x*<!b321viO||MLAk)QfsN@um&xn$&_%@txYO(_nu8pXN)fBt8aJ* z`qGr>A2+e4(!m~dd94HQXSr>x-n;)>{r5`mZJ0D|)yM5Uf2pc&o45GPr_lVy#8Km> zv|L-{F$dhvlJ-B{*xIr-s=rjac{oEw=u3=-*1eVhEgvj}r^0$i`NPG;EFXH4;s$bM zUg`#l%JcABh&KDen<#P@N%$2PTu1L5Y5(N~cakFaLstXtB@zjR@2kXOT}XNd$VU(C z6b@G9#rpj6#B>59EnNqt!!?~Pc0IZss}@|}b9ZacQq3l@xGB&wb^EcK7voH=Q9Yrn z6xs<%NoR_n{KGzq6u6?a6myEGU88;R%a!vMw%y&+xlCwZy!6CF&CNkNpBJo+`KiVK zun!bi^aGh<uh7k?3w=?i*+`!_yYmLIIQEI8Ylp25NKura4Es0(DOKL7cF-~i>M9p> z_uK<ozg!`-57H{Yz67nDcMiHa%*4IHk3`Eh%BS)VPxey=T@{Xgc-V%-xDrMgddQ&V z;daD5Pa5R%(39UDTms^I=dFS7AUb%X!XHoWjpJ8r4GA!^&@~_U;0yW2hW2C_3%_WL z+o3^z$|naIi*?{0@!$-kt8y^45UrXB^o)-Am~0D_XrGirx<#U%32<64y9f<T=|_&T zJ`1%YuC<^aBJ1_|sj(I&4biM$MX|8xv7iG?r+iTT!yfmB+!}4NQ@ZAr_E+RGPeEJH zr}O5$xuGQ|Us&+H_V0<K2Yxm~d}2T|BXV<~d2%JIiM%4zRL)%Sn%D2$yS=rZ?UuXe z49uN7P*)iwQwZ@1(4k)sIy9)(EJY0%Y)Pt=1s;!<t1APtooh*|zbq*ORJd0gh)(bY zw5BhM$E&lb5;+fNzsw&F5Irhb)|fX|SfreNTC_WaKKAHN4eYgYkDW&#DN|V9iZW%N z(E5e<X0*=`PF7BC4h%#<a^$XflEq|@Eac|rv}|~D-n>uIyTv2yCc*{=nu3*e@?#ev zzP@$)Uhv|z6*DU#OHJog+Ruje5XKKEw=0yp(CLELTi6mno0HlRQUxn91u@GMjtfcQ z7Jh6Gop$V@4lpoW+Wcx5q=w(zK{WB_SDC^E=~Z?Ur)K`V-)@+i<BR<B<>@aC+#){J zepZ%p?``jG7fLw~<k<c_*g7$CM5vfMFo!)6`CQpT^7;1bJmL;jGWmQ4KY7epW1m4w zyJn#1k)o97O_fM*s&L^k8m*i{J0-%0hUuZK+QG$3ITVB2BZSl+efb8hMdLq{9Jd3a zG>6zdH@3j-t<X0M<R-CQ{TbF%ucD_6d7gckkbuYQX-zsa&SOz@@Kzt=7cs-SXiZ@O z9wA>6-F5H?(coOGIEKot=XB%=ugs-K&#An6#p~G5A8wx}1<yE8qKH*~c}#tJh_i02 zsY&&O_A_y-!XrgR5Icq*3tzQguP|<CkwZMvVQ|+VkAhr?72P{1u$oIfomE5^8a;#$ 
zPQ!Y7WUttweiQ4t0&y$uAAwPd-An%J&_4Bw97z)GlNh(FoKaftlVI<_DDoYXotUG` zT}V1b+yeNVZxQEHF4qCBOKc|LDIrVx^^he!I?};X!i)h$4fLgQ9gs?=+(9~kZV@hJ z{PY_L9+imGpXU$z22a1mnw1@Q`sw6*&pQ@yK?kKwIq@1h9riOt>U-ml^_g4aq)s{G zbi(Jvo#r?h6d;3M$lzg4$0E#SA@4jLWjO&_VsFZcezjuOsm=Wk1I<eyYcD+ji#4o= zR4V<H`Vx4QBmsIFijD{i>k+khI!YG?aKRWnpw?>>5|dM6CJqFZHo->D>);+ZDW)+b zx?Lo>wEnsE)3*N*();>~nd9RGxtWCc>`Y>WwBYQict%LBLBVdxQS6yH@~<`Dl0JYe z!Hz_qD5x)_xA@6RJVDd0Ay8acjF=+Bi{b4Kys$w_=?R?bgDXWCRbM%GWyzhqQt~&R z{r>dpObOC;lg|Fml2ehG?a!IbG)F?MQsvp-PLgID60V+g_6L)sl8K!a53}+fw?ESu z$|if`Ci#eFGh%pY=sMG@=Dr=SKB_XYm}#0XUmP|A6GoKY>|sn^2&bT$3v<db2$upw zdPGNhEf&l{rbVL4H)IkZDD(>MHg|d9`l;WgrSuq#*1TMAu2osiHRFbsBm5u#h3Va* zBrs7pyBq7Z<cv$qE^*J6zx+k0mDaoPT!&_Z`eyipAyqFLn<9pUNI-NN9CrqdynPd~ zk?JoKjZrqLVcsHYh7=xL;c_Qja;g<~YOHXG1+Y_kSb6$&CS&}Q>vsKSedtP2@=rRu z%aAa07H9E9EL;9{Pfk{jURUnM#*&ZRsadDKRpp?4P4xF0qpX9uZR3qPX+Z>2v{Uy} zD_9rZU}70+-nDSisq|f&OAa#F#w^n4NoT*$pV^IVo!B`}e)y=eajcPkwkvMS$7NpU zp3SsJLRUgBSW$!X2kAbTw%lrFh|JPe1LXiOozYA$ZZ#Xo!qR~w7{wQ`!H{P*x6rsp zzG3?;_RjSW%I+62yjI#KA7{k_dI3{Dh|X^0n%&ZF#L>E=F<+Q+ZYaS7u3HlLZNWpC z^rnvAU^?Lbd1OsEL7EX$TBv<&BK&hrS#-uDe7G>hHWQJiMGS$fNOiW^oo(|V>S_jb zJs_|*<ZWyzGc;`}-FE8Kw$d$=?NYz|#L0n%Kgs(@A2wlad!$cPxv-3SRDBtgxhY17 z4->Lz)SJd1wju7Tp_HRagN9gS|MX^^2p1y7g_fxpkseM)2pASoSVYg${v=yQju)x| zz7b8t&BC(68nXOE#nq=EKUSy39FJHI7{g_FN(RHNDEWw=D$Pu*1>s9kn$_JLuvZt` zTsb%Iow<F#vtjXztM2=?{A%58#T#y4vH6C%4NDsfg?kH{GWWfgWh~8dyPA^S^51{m zeBkuKb$$2sGe`Ni7QA!+iif)Q{sen58AN|9ZBr$vrmFe`ioYW+Mm0`|CBv?uSJ-y! 
z4_>ieBb98zLZn-q2IWY++))ysFAZg|eX&WjC;Sq}b!k!l2p>l}g%nxGzKDD#Z9BVk z1AB4<>)HTW5K$xBC48**1E=YPU43hGJ`W4T9uB+Og4ufg2&`BVp|k^Z57DYp%KDQq zmj~|w9)^Bq^d7mQ7`2slqBnVPiD`U^X;e01!V=RASRx<0g?7PVD>W}5i^Giph@N1; z=7G6}MTeXi#T>9CFt-JZKT+8n$e;P}F~;tcm$KK!HjN&==7+~0W=n{ey{Vg`^gNpU zoelC{)+J}jGue}%EqHFF^fC0J4u0n-6{<%m57-Z^!L_tjKdl<C7&>J4Lg63)hVxCg zKN_q>M^{kS)^Gu?{AacLq-3;X&E=w$hx}ex?}o-ySx|l8Ors{Py-Cr6{u~vuM^2W< z@{r2IfH1@>Fwv1_reQ8JbSg+io)5#r@_fS9H{?Ivxrm*;QEuGueKz|`U-P7TcXQ*F zEAEjW8z`F5n%6V&_ucc~WZs3HQj7e?vGsqQB;Hzk_KPc-WzQMbW3Z-MHspM-U6vm_ zBkwXgEkE&|7{vm|Zu`SzoFyc9q@B{ocx6T%u+n)BZBqGlVDVF^PZeh5CK9eikMY)l z(G1{69?l4<gaWxsoGFJ^3L1>3PdR(Yt!HIY)7cND=g*d!69H<p2%kWP1XU`Yvn%?X zA{B%Zx(Xv+At=KhCBQ+UhVBMSSj*?=dhtllXokc*AKj!?2^-nofhXC~m60<#U5Zri z*2!<|mF#CHOK;+VdW|_^CjC_j!R^DDv8k@bY&hCe#8PQkb`sawxp)N^S)*0q73;P0 zR-Fj#kj^re^(Aq`Dv5$lS_otkjE%5^Sdm1rR<R<ir|U)nO9~jpO9QaYk=VlDNFlk! zKKYGL<)>KtCoCWz-v2kd@K&VA`M<ka73*ZDSm!OD>|fsdA?aBWee$}*=Ww>t!Tmhs zOqI1_h7{J?XQES;na51Z2|`P6S~t#((Zk=f2I&5w_slp|P=6R}ib18&_sAJR@-i(b zHz#0=URf!L6&JWl25Ad8JR<!;nS>WGT*hpaoTJ4zUs2aQ@v+xIz%vz7Mi;g}aN^v3 zrDd#2c6>*^(|ixJJ@6o?vOkylTs_MVf5NWDvARW`vs}J8v4GvY`gQg+d-Rk2ppW)p zweY040{eh`j0i{%@!X3Q<7~kI$s>MqBOkA5AGrU47`ztOsKJWlFW4SdlcBFSay{V1 zZGlz8iX(1|mFQY{O73ENj~;zutz8fzvLU@{b=T3=!kL?dWpi#Q9oOW!0qelrOz~Ju z9<(G62riR6;0{Nqm3Rm%16OHG9*ie?lgcPJMmp5U<-tf{0~O@Kh}%z*heXPfNflum zMKeTj+2$^TJb0j=(8e5pmtZ(tCkuSa+T{J8$ZxRf{ihv~BH=CPXOGFR;?jLWcFRw! 
z_y+`n2BBfz`?)#dQFTa_fsCe4wLr<A_ZkB+@r7?MkyJR6EfS<6Gtlo)R-aSWugakJ zqD0Ug0bhU`Mk?JPfq>y7O)-Vjg?d+OuLeCpDWIlSmOZF|iddYYrS+>tkO5r_irQEp zDjXo^nX%1WI8^8EODRp053>)y!@h04NB%n(4{4k2*_7w&<+sD1$WJ`#+^~W@vYg$g z(H6@0u6|vfD$o7Lel8!YARo&W`LL0E;GB|tD0xGArQB_J#pMIt=(%9@3yP4Ce2~_N z$p?dc;Ff@0Puv%a0F{*3Ub&0QM?@9`Ls~WDLtGxw+z0_#Ryytq&oVlT)K4%=y+`VU zhrEt#&fdyE*hvZG@;U-EtKmhFkzuAV3!!R8<IG`}c-o1u9)msglxD8g1A5n&#yylY z3SeVof?h$Xjlu7X;kMJ568r!)mZ~w_?XQ6>0r(JBA>!9(_EGzFTz6`M%Fxs?3ylNj z5K;AD6D(?d<qM+4RkAxfRED8^H0tgmzbXZ?N@ZuMYefaxcphw*|E9KtR+f&RfC~7C zc>}{>%$=Hh8yam@H?0g=nQ&-^{5Op?w6b2B>@V_$&K~SEM@PbSN^^rQQz?y^X3TGv zTFyQ)_%nrfspn*MCG=Q@YNKMKgw<fy7!02U2a_PNy?Fs@#L$=QLA4n;hpDJY#ixNQ zX*~XrhDcVQjx$k*_zjJxO*0doAY6*yDa|5Yl~OC6B1E@J>ANg1IERs_;WRpg78=hO zwoz}LQfkov9(soXy2j9r8UoM@9SAIT+$I+k$rD6LL4vWO$~O##<ux_S4Thr=D&NrQ zme<T;56r5)*<g61vhJ7keeKOU-5Zr?i3L<v<!UiF46V6(;DsGK#D??U`}OKdjJ`BT zkhN0us;!syLGR;?8%Q4^NK5+2W+q>e1~v$Y>G=ub*^O_~%`{jLZ*%Jj1~#laGDXn7 z!VFVHNewAR6DdoT6;c!yU}Zn8Gh3XBA8dmM<Z;8TbOS*Ya*FO?SxC?FJVEi}jAEUy zGbk6cZb6<URF`)6Q^ewN1HJ+`+a*6fd7`vWeovFt))hG}M2<Jiu1QqOJBw+SxqFtM zpPkk0P4Ocks7e4zgu=bh^2nP+zmQQHvpJ)A5Xy;&{R96m%WW0it~~Z8QW;2S;IjnP zYL!m>NOFNUK6GO0^N8S1hLVTT;=J)okCHNWj36<RBxps*j**Afj25!@^9k4h5u%hW zamj|$rsk)ON)G{tQ}dIk*5f-2NQN;NR>8xC5RuqW8Gq0cU8scsAX-Z_i38RRCUb;& z0kcQmfs74n@jKm&%}3rweoAf@)A3^9FD`G5)$3y29`hJB$?DFO_qv^aVF&)gI|`j9 z(5b|A?okn7H<|aFA==E4@?0+E02=q@a`Dd{a=IKO2nd?EPQa*-O;bxk0F<|vNCDWT z&y_;|+~o#|DdwkbDb!ekDYf>rKT2-|3ZJCFQiTUcGax{{iUVGV1=?YhTZVCky;)h{ z8$Y(FSQD^#7+dJ_+T=bzoS2>1P}U|H+QyG>t9aC!=aP4WTKRcvP8RC|>7-{pum#=H zcHDN86nh6TdW1uJ1q6a5e^{*~bjv#acsnI@ds#`ZSVDrXO8AmYZz17J@o5zkM9R`4 zLyJWR;Mv|iM|@~LKJ~oM$M;lDO{WPxNa8;KjeX~TK0cT}ACEcz^I`Tf@!-J1IpR)f z``JbKOi6WAs1ZIw-S$KkeKx|-aK$K|LM=hZ_<IrMvFcRCFk^z?u@zCCXZ0$B*^!w~ z!FN?yi(}enusb-&1c#8P2jNOaAxB!fMqAAYd4!GN5I^o0L|xEoShSwN6K;?1Hr$Jz z^m^tMd@Gx>H`kP=UOUF?^!WW|Q(3VmH77WW9ndwWdOMQ}R;9cBlU<u#=4%QZ&ARlO zfHysR!m9N2wUybaz90-{lF6;iEfffc5K*wJrt!!N>?FGF5Uz%ugjYNVi*c-j!8+uK 
zW9`lPaB(_GPbZqJHcqEVQR4&*Ooj_bbC#3|FDO&sv(<9)*n16YCpT49Hpx$|dUEB; zCs(nLDmrIYR?N7nqT{yPI@(vQ;`(Jf+s&2>Yha+2aQ%XK5Dqqlh46(Ds0jcKEuX6z z{SrpZkmNZhkDg^`{*H+gJCJI@U-cdQGnbzGgLJp1SY?Fu0eh6jwJ^f8p?@3xA%)=( zqY2qqFyIhBkhiqR@2*c<v*-+K`#^p23_34;Tt2RDn%t^hWkARZylWZ!ylO&}W&y=c zp*7-OA@3AVuPO(-$5U)M`p_C05l(}?3D-!C=KFK_>%K(FSE&-I*q|0-6rx~ZZbjIK z=ma{tV>TA5So@28Wqdq$*omT7nrDY0Jc;RORs^08qeA^v23^6Rhz+_rkLrQXFE-6# z_nL9v!!!pzitu5b+ho{!0e4P$-RVv#Bhf_DEJLB1ahPV=)wbZS;v##h$IBX&X_nxj zh*?#!J+a_ywFA>fwE%Wp0lnTpn5OKaaNO_jFXfISpv;3j#F?{logW=_pHs$9o{v1X zY_KXDaE7QC8@I7)I1&Lz5P=<PHAFxssCxwMwH|WNleKxUA@Gy)Iq`?fj)A&>35WU2 zrxK$w^C?BJ%l9cGpE5ACGgAVBmSm_|wOe{$T!VXb6h+?hU=jBg@jeN#0{FM&^0{NL zaU{uw%0}QeK&cXVFBBuKqcee!1+ZP1E<;KPK?;N-VPZa`QCSeKRxAQo_puX6DoCTv zgq06`mnTOw;YZO~cTQ_16HW#Xx6hila^B>Zg#c>P4*Cm%^JeF#8?$oW`FYa^3s&!c zZRhfn4e!2SU?0yrzIM({FWt{N%?~zJ)UT`(IyKC`;Hf7cd-Tkn8yMVk@^e(X)iS3u z6+NSutNQYZ(&9l%>01iE5Opr}h7V9*Hjou1l3GdkdSlc9;zFTL3drC9aX33o$9YC& zb>-w2b2XF%(x26+jMlU)Z#4TEN4%HfdkXc+@TuYCBg7X^|C_7l#ps_kcg^94hBIK} z&Or`TEIOiaV9mitzl>+W^WXtf0kt8%gPILB<_z_z$p!?dU?R2_a)ABQaCD}?Eto!m zn=b-kCAC~V5zxiqUb{f}Y+$drao~6q<5#76MS4@(iO)<Xokm_D(6IDj(SH|0_}T5K z5*wyQs>N4d5#OFL@G6{RlD#U_=+M7SR;6KoIMSC=1-1a0p7@kr3$8Zg2!s=1$yruL zql(D5vXr6kU2igJt*E($=O*)?Q4#^FR9zk%9dx637EvuF{Eej})PSV3@Vo5D6!v1l zwSft9+S#P3@-zF>*<JEFmf4u`J^5Gmd5hG7(KFWGRy>)ln;@?~6zN>Lfcby>sQdt5 zb1U@48Og*Wlf_C$W4>^1CL^VUI~Z%^hOrezPfsVB9fV4zMd7@UDG$N`rd4^J>SV22 z>(yv&<vz90lM}MD$hU;qftIc*0rql<JnpJSHa|z+RXDM1Orw02lz&YbV*=B#s_Ul& zruF>w>JLBpY}@p$b7!!?uaEpE-`~CGK2XVsp^LyG@VE|{AH6I#!#&C^vUji?r^gbI zJ3CV)MK-+WK49<R-zFdqz@uexf!$~n(ohz­OEh0~5CBkL<Y*tx27U#Hj_I28Gg zkbQgUexY1Om589idITe1kCwX4ki^0A;+N+j47e9_wHc0*vXjv}RE@PU{1J?uo${xA zk!L5-LK$j5KM?ldGi|D|sy>}!Yw>s*kiN#Ueyt)RrZO6X0@odOZ7hgt<*^-5&Xna3 zl0ak#H<fPh?C)$UU5&rRrDGn7Gzi@X<PJ8isq_v}8<}x{{ZpQg=Vu`Yau8<-I#Iz) z_G$us5<fpWKVN=WBB&2#gis;^I}v|Ak>tmgu!k43hn9%Wf%nDD1Ivk)EaVjqimQ~* zBJS{tDQ=d=K1)Mh0?#f5_$LL}X*KI!BF|qeFI+N9OdnV-ZsvQx5ofy_xO6M>|9tOz 
z^?^QhRH^ZZG6D>PqCm~f4dA+h@Z|7LBtR)N1O5a&+y_DY1}_w)_36UB$B#a=z;jdg zu3K2&JD-1cQf^^=D|RkALc+fhbaac$lxK6jqC~NIogm+F1y33sP=of}2=n$UiiSU% z8fGZZW&v*Z=LNARb_E_Ue6fHYzAy(5Um-4&r@r&g|2@f`mZz?Gc+oHVh>t3{QuqUE z{XCSN(U28{=TlxfiF$#kl4wQ=lGtwLO*bF*?M7)Bha-z8!VctVEM<KTN-C)w@X`82 z2fY-L=vGkFNpA9SIXVKWJ7_U`xi!gaWGr^8i<t%7VRU^_A8;$nlGwE3+Mw4{<!sk= zY+6)WU0QublCRX?_I4XH3MbhZue)TTyw@*(QU95IOuql#vb?O~8gxXVa|3#$r#L;6 zRF9~5bt0s!Kfu?IC}CfKkJb&023^Lo-YR|>Hz+@Km-RK!;9OP1MEo=p3KaE3z7ydJ z&=@T1tMJj!DvCj)uYz9sYVm6oN|I_sK3oB<5e_s!rKv2rL4KWo6;^ze-w{5SSZi_3 zh3t|G{#`z9ygZ{t;<M(XD4!Rr-atol$WB;O2sA7Th#)}WqyYL2T-8F1x~d(*U_|FS zG`tR{R^fLRD!=1y!0#-)o_=THeEd!(qIax!iNANsiSV`H`qgE9*G{3IUW<ChzA4wz z%PDif^=r)>P4KmMW-0B=RSU`fz|Ks*`WxMwud;vY0PSCGImA7g#C>sTby-!D$yXZr z=o0%Ut1j__%N~%+ZtP`$0Y?lsWSd+meuBLpg?xX#>RHA52v@=0FGeJUbTJiZQ;>`l ziH*;!x0$hl8t7CQzu~d?)j%0Rb_<o?DdetGVLs^<glXX(qDqhKm1sLcl?K^`a+Ik$ zQXy4kz2#<V0NbCT7z`AVt)=?M94{h@<>)OzQ*$sF6HdF0hf2a@;cbO8P$d$!!PvxD zQ`i)gU9m<hQ7K>ovl2l?yOT0|N=Z>nDV0;#Q>6D$`eBE)X-;;MA{)=j*@-9q{9jL) zX52F?b_5xF{M286lKtrIq_&z&o|1>w@Bh&{iS7FKkDgpFv7~gye!P!KX$StXJgUbM zGczn1a^%UM%ChsrUvXXb=JC{H-`)7zj4cyB+bb+e`<?vJvDC@Ou4G1gzE>r%gQ{i1 zHgOZ~KB^R=EE)32#4Us%VMF3ov|b4bkp4u2%7q_Eu~%WF^l?!kbOc9RlOsgGvV*mq zH8q{J{BPk?{yH+izsH_4)pE91+$J1QrBJ<rndh>B@jw!Yo=C3@IJl{PYN?EP3Au09 zim*s<J~eVd$hB0XUdssuPo=sP8S!C&0NT1Ox9{kzyQ*t!aAj7i-nD3or>=8mRp`zf z8{5BN*M$1|2IHKS>GgH2eFDiI)^To|xJ}xKd2baw;(??jJw1JJ9R#Av_J`fH4g+u< zVFy|RhefKGSL-kaVQ?+z)f~SZ(P!iLwX&q~ZQ+*Z%Y1{YDs6ae>VIU9%ZRP4LNMDS zZ@HA^6-J&(6?z=W${G<9Wj^s^@iQR6O35p>lO-|8C8YjnHYvpoGkM7oy+T)wL7ox< zLP3C<JzygetXhKcT!xwC62-+kD_IVUi{5S~F3%(`XQmoli<&)kSI?{tt??k7>$MI1 zH{yVTO-$u4M2{;^IKAMpy88OY#5pU|Cf2prDd=^%F)r@AOtnYcA)JCdl%kGmX!XM+ zqPNITcwX8dgaMEerB_~TP>jM01ck=qrJ~hAZ;lnU{ozbMHZa>qn}#EJQ30~$poHim zKUp?hu$prqSVXcFm@|bNA=+JoYjITO*6ce^;8>Av*gMln{+2=hc0m4WRZCPLqrX5Q zYW$%5jVc2onVc}96_`i_mFSfVR8R1S>qsmM%0N{)L@_LmC$Y>cA#LTcgcG2wdef|7 z4fI&pj9^@!-HGbgLFz-wnTZmNrgtJY^I|7TJM6nW-;hWXGYZN_chJd-lCRhOq@~a7 
zG7U1p8yH<MAu}rR#4e4ns^wc&kCm7P-elZYi&ID}(Q4R$J@OB&RuTWyEw0SyIaM?H zp;WdS_4X5RPpoEM@ndN<P|>M43)w0<rP+$;LuVm#0?{8tJi_Epc51W)ROQp7<Hc8u z6J^2Qyv?4zvTJ!yy7ke?xd0fD_>$CG2VY66SMPOcnrAH!X<Yl4>mr{kf-CQj2{2Qw z#y)<GeIz6a<qv^c1~i$@qOOemX4s6;rK+%V(TYH<F?z;gXALZlt}PV(YI6z?v#}Io zf4P<Ygub@i-Lh8xh+^%YR(TbDeQ-;Qa0-X&WBieCNM)<Au8;AD?E}O^L_G2jVy|!- zue7QJOo5MPRmrPYP(<_!fQ(qYU?^eZa0k`VKVt8|AH)LXk8rxP^JHh|JDn6fV3<k1 zP3#5j>FCGRYXZEO1Agd3wPl1|SM#oJ^u1U{wJAUkA)rIw6E8CS<e`%<=x|eBpZ*XV z6mT{NXgfu*SN_c?HcMV##GaJLEp3@A|1T>VCI6VM7$slFUjAOo2CQj)=ekb$5D<a* zy1tX`w#&C*eS&I}$|=4pc7Xn*Xk-K^ms;gki(X%$pxeRJNDu<p$na&8m=)P6Y_!`4 z-gUXeEW6#=-Oc{~cW0x4d5VL~ZD>rU`-6YBWVrlii(YhR1p*LpkPtpQ7+Gm&4l&c^ z8u%}~pb!7>V`qawK3N=;PZ=7W{IfTxHfvJFO6(cFXKFxiXm$9KNK1^aR9uZ^GDmcs z%yHHkzijsI-<=H!gKPe8d}m-e#pNv9(0~sW5Pz#wABJow&xl)PF)cN^V&RS9>y=I` z-)CX{@8aQXJAJPF=i+ktFZkT|&L;)Wc?vS}E@Z^cY0(m^z1U@MUdRizL#f$B<`iC; z&aSiD<!|5B@*VjnE*E=yN6Uuy@y0g!zyB=U^2N9Cl<&&RQ`wd~ipzh(4r7Dc@VvVg zT*>+MhjZH`1D+x9XB0EhM)auh`4r#5(@?1&Vi9(m{PxrH(UlVm4?fiLF#EpC75R^w zr(Syxy>rbyOXSlmFb|Kqk-ch{%O5HZ{)hb2ROT#;?aQWfm7t{*IE^g)L8PrWCD5;k z6aYGd<^t{*$Z(WGFDO(iWOE_+g=|{qCT41pkAThtD_VM(4ut-83R{gY^BdR;a_$u^ zb%(!r)!F^Gn~Q^wv1{$}+ue}68{Yjc#PGr2bpL1SVPR4y$816l&rJsHRbORIfe`pm z!-g>@wx(jX=}^uZ@Zs@=@~6a(;+tD;VUL4BuapRhk>iam)xT!z<zvq7`6XaSC99kb zM%}<>*yX)Di-W)YZugrh&$8>j2UG=m7O7^6C&eP1In+TwzaTh`KsE|`DVu~>91)Gl zxk^P}pN(SyYy`YYDpM*UEClE!_^KBes}eCd2JOzE9<pIq2Smg@TGy_d8LF7MVKpKk za*MqFX#Z0k=CWzu?dn+9Tx9By2cB4ib+lp~@5a`Fn*vuTtpkxtY+7s`K=A6g`G2u> zsK@}Nb=ZLI$rL#N6bcG6Ao4t|tZ!%;6pCSMj`j<0#jk@&SO?<!=azvVXJF&FRn<Im z2+fFFf);}=_P8aG1CB1v4mm={0>g=Zm*SN0rUmIVkb=r>!4Jgz6pjo6x;!=yIFl)W zGxMgkOul;4>WQ2;&rfQp`QgilIu1Q5uOBn#)~7m7r~FRxwmz_sbLmaza~+Yl1)n_7 zc1tz-qfo!5W#HEtM1pgH+@gXJfX$-o%qSyyPv`QEbw}%Sy@auiA=~qy&BH2Msj;G$ zSz>$}jS*nTF%a(tQASQtnWF(7YA3bSPXG2zReZ%er?pIa^)H8Jyl@9wlO39V4cPgn z^94=8m0RZUHJ?s7)&T%Q<Q*o}Twhy=TaJ2=EJqC6EUs7SRTjt`p{RgIVdOvZ;9F`o zQv)`I#*nG-7aa<N;TonYu3>grZz}IH3)?ChI))QMoo4|vqIjkv9(-W1ld$m&13@eZ 
zmGwHvsz!ruVDC_}igsWXz1c)|uNuQY4N8XzDVv~zWq^JH%vvIegi#VLs2xtW5u8ns z<@NHX%zWrYW|L>L3F_R+=2?Mn^(V?Nb;-|J@11(hUC$=6$#!(YSjEipCqI8t7%TsD z%=N9+1(9US3*VN*Y+CR2V}#SDm%jsEh;h8I!CJx`G2#P<G!ZW-nbvPN5I@3p%6z5~ z`XANhu``g$&tfXD72$N+PCJ_C(imw2r^~G1-G*pF1Px7UKu9LbE)T^MGw_n|zP6_4 zt}Cwp;cUI>81uYkiX-CvNo%hOUi0YB?mK+=z6<v-1@)V`s#(#sW3mR{M`gi~IBcCV zYzF`t%G<^^2=W&;!9S}7v_@@AusSodJUkC%N93^9Ac#7H)tHj%BCPJ{(jg*5t+U*R zAgkg@I=$4VH6&tEUC0+t(Ub4~t&v$h*U!GQ`bPg><UdZWKfQPEUEi6H`($u)*pL76 z*V(s^YHwONh2`bG`|$j`R`>i>exr@G|83abSRtjD`ONo*dbshu;i+iky+9vkRFfZ` zXeM<Sqc0Qmr63vH5Aw`PqTK*|Uq1*-$~ccEMkzigu5uW|l-mV$@wCYPUB90y*h+7g zkI&uk=sQ+VY8(+l<*8kF%@b_-qat0?*`L2Zs(nf}Hi>NA)0iu<8`fm?5SkYV2ufdR zdEJ;6eN1?6K^nwOYs&gWj-(VBgi6WJGY2ZP5nWcNpZ(>61;X;K$nT_QBgg3(0rZ_Y z9(#rcvKv*`^bE!UJPg9gpP@N{zD-d`LZ61`$~3q{qR)WF#uM-qDYS5*;Or8L&t4&v za5@U0FHT7m_W&KV2Lwccp^hR7JcfVvV3Z!Qbp#7*RlupoJ%1=ig4@917M@!Wd0RRV zdD|{c?4qO`awNdJoZ@r<1Iy#_Ku6)zsnyibTKTiIC!%rqK7qrTDB%~!<~m?l0&T zS{DeuvrU3e+I{vAJ&g9{Bk>#E7co6T`?9hta!l%v{DJO<PG(d;Av^~io28l_yBm;v zCa@+-odA!dR{)Fvw8a6w+#Ui@l9JKcxG;Di#es}$Zz`eo0OrvoL1+n-nug=3;@Mc_ z-^;w^&S+rY=cZ^H$0j-1&g$Z#%Bl`wckt1f>(+EVQ6vlAeVwb<%xqm|W(PW|s|$)N zckO0tI%lrgwQJ>}RUKDgA1+uost|yyg~LOu1;`Z6BfSDF9G(fS+p4Tvb0U^lRsukj zPD-ojwOgKMDrMbzA_VKE-dxl_C`V0;<*kF9ZX$t)0)5gvW!;hP*t(fKxNgDJQB_q? 
zJbu@1TK4yMv0Hz>szd3EvX$*&%Y?O%(SH~FQbd@SdbTHqeG!2=Y-V-pZhU4EEM?S$ zVdH?u=9-4%vFVkXTSq+hhwd9rG;}v1?Q7%FKY$B|Uc=vl^JeAIzq~Q$e|?fde#8E+ z?&e`Z+0EaVJv>5)JraRG0s@!QO6!`cf~AP5@QCgM=8T1eB_p1yyyjJ4Fxi|O^@A?( zG#RKrN%<~`r%+;0AZ2|qD3BC|1Py*&M(mj=yOb}f@ysI35PlYA9NnoKCA)YGe9J)m zVPSOIST^*wiSAStU8KO}C7kHZ7|e76QwGbe-=QdS<UI*>$A{sjQ;6nSjm~7YrKGv& z2*8d22r;b6rXogNg4vcvl}T`n6(ybrHO>o0I9gjv)gl1(dwGxsoydgxUGJ@6f14vu zs&8S|mma$PI~5zg!{m=&+`RIE3e~xi#TC2WUo2Lfdh`b?gyx1wSzWmM5KFrH2c<u_ z`{0Z62h+oW7qW9c`u_GC_<7lm0YV+>B4FM0gx=Sa^}s->25?@66GJt`Loie-9FCet zG@*&t^=m+ihD;(!jRI_tVY9d`de}Uls7;Z(BNpKw!m%^8VqHWM2H5TLGwh(;H~=ev zsoodfmzJt`q5fT^SF*Oklk)D_Y-b1CF<V^}IVe0InS|dfWoD^a48rm&r@BT$P#~y) zOO!OV8rKZRQ79@Lswbwx<8Txn!cpvACZ!LwBW>vASO~^YoGN^#^<tONQ3224VQGZK zDf?ijDHbX_N1Eb}MvK@Cl#|hwoIy!jiITQ%i`hnr0N$LEAfuEDpuR~$Dn8t09%oKT zan!CZsx9^wl?Ckul2w%8HoU@~W73=qce2?Z@Op~tNBed?EuWGPMA}sB+*@GG0jw#V zdY#2E!^6`dPuJRj6^m9_(X>0twxfc910Y0{Qz@%s7PCm{h-g>SasXT?ie4VFVs!^H z%;TKdnYDKm)nIX@L3^QO6(p?}&Q%SDgAL5h&rMcV=lAXEM}1C{@D$`lts;3jjJz|| zBPJpHoJQ+RC8<e58&^Sxyc}=pYQlFaGiBwJg41YA6~c;yO-aaT8mx|FF7<i=BV~aS zaqC5o1%JbWacD%hMVcueWx-8*-~Mb<1XV`zfc!8E%13sINs<3ydcpet-Z}Qk*$2b| zd9%Fb;B&`zM_ep~e{myQ9&rip;nkYYceZRjr@~^Db*ls7O}HUw<tAVmu2%IW(z>;@ z(j=^#A8u+!tK;iNy%6x7QE&s04M~vwXiYEGBP5}C{~2IJb$Tnvz!9Kge!)Noz&l6^ zP6Ddv^^CpC3D^k&j_v@o;eVJ;XgNDun))Eo*ce^-ib%TfzI=qO9k@@tY2Yoq-Xw2P zG1iXxK^>@<NkG)Gk8C846QP4?Q0cJmsPqdnx@qB^P<TxvCfginWFIOQJ8pXa^{oR( zYHE(E4UZg`-+xbj|54Vzr@G?TkoU%Or-jLCBR)4zk$?WVsuL*S?AH%{G~6um1Mtz# zV7V<%3|g}wRE_+Te^FceOXs7<S^9e{{Smo&Pt~|zLrlPLKut9Z!0%yll1%(Y%|KX} z0De;?2<e(|VoXj@eWxf;jwuQ=0-5S9DN2h$=Al*=0lnyknWq=MZi`z$BNO4tjXU0d zqjwW(K!T8w0b%i;-SScS7fi>r1Mf>q&OY$TvA4vcr(S>RHFl%Cg;-6DWh=nw3Me9` z+S&r4#<4qBBzFsMojVSGv{P3Rf}v3e9vCg94l)+W<kgq426yOJ3oS$ymsABp01pMa z1CNk?z^!1V6jSYh8nU2n1(8<at-i>?zAxCp;!gPkmfCsd4DQ9wb00_nX#jP{nXo7C zRrPtnFG|iKpfUjqN%QxkoieQgp%rizuNY9L6g*Hen!OnufD<;Q;sRBc5iTN?MU}WV z^d!X;u4hzQOBRZ@IX>A7wVjoNJV`}wvgK(}?WQ(I$Rr^45VjBw)kdX_%6#H7U{iuo 
zo2L>mU+Pqf=C9?zRl<z3S(#HMdIW}o!QxXr6^mQD*7YoHs%&jbUECO&*}SK`b3OA+ zkl%TzW3PO&i4{l#1M(|*{_CdCoLx~Lm>HDQr_S#z@<)n+UAwbbzDfQSt3z{hc@5ti zs_hb1KrcEFRq1o^Gk^%Q1A|)8k&+tT;NCIY%B|(Vd6~4Nfjymd-aGGKm1UGBTWejx ztlT93KITW!an})6czBiI(R8`y)ULi|PZz)-9rEeH=P>9&>_&!C<HGS=K*KRShyEx? zF6xcJq<vcN7w_Pgah8Uy&m3DHt_^W$eeh4rwoEOu6(wk1Cv@)fo4i>;SFLrjWXxK1 z|DE1C_2<klpT=sAb?v!jb?qEim&PNkdX*I&VNvW~D)<nH@qy}u8AaNo>r?nZYcq=_ z*c~*fvpTIpAo`#h+a!9>*0tUVc+4>{8Vrel&|&$lU&wpuEHGf3_;bDjAMp7@d>FPq zMOX2%c{J^g?GrPL=$nUZO|_QpqJR*X;9YzFZP=(uwi;}d*W}-){v5I3y+`D?UX9-< zY~MbJ8S(oh*0%$EFsM{;ln^U72@cCi26&o&>q@s4-8pfAxAc4|E5FO;@jX=mQRz9k zaU-o)+3%sx0kV*+A1Y@7hPzM0&ow!iRAo<R4B#ocWvSjC!p5;uPCVFsa-)0YRy<d3 z(R#m>ds>PTbj3PaV;%k)P}b<?ur8V&;${Q<_jL@g2K=(YP-p<Z37kg13xo`ptpUD9 zyfN%^fWPCl;ef=`8lx5vN~!r86;q9B8TH(W)sqFJEEmBjKE)(xVFPlDPRDul#XDH2 zVN^>G@D=o&Q!vgw;&SLe@YGC^Y+~vuVZ2T^E4-1V3w6l`m%+LEAKPUzQ>fmT9l{dH zL*%%?z9dRnBeKbBm;hKtfu*gCjsPs>ULDY*h<t{%0O+KYk%JgEQsID#H|kBoOX`(u zw(vK=;Y^ggq5H81XbS>Spp8dxj8Me_&^Gvi_LGiVEA|F%zWeT*1A8lO?GQJ6d)Kx9 z--?x+<}p{(ycY~vhJ*8)SjPNKD_8tq`?_$hvfptuMz;`T=8C8GZtT5Ir;EJK#O3l^ z+yAlJsdMX`!h{%K*n03qMBP)L2pNfjcLDSSl1+eJs3-*zu2XyWipvK$K9KB7c7Xr8 zC~onBe>e1R0mAbzYwRBjsuijakjr}q&$J*i7R}mGni1ClqUr<=H}U9!3A+!J2BQQV zgmeh+2fRkC?W+x}PQV?|C+Y>#_2}=R#a$4?|KUDDNDJ00wSgiVk>>!j498wN@=IOF z{Olw5y`IQg=FexFSMA@ws(#LVVhirelejNExG%Mo55&4)m*c*SMKvhCPwPV2#$d#w zS5RCw#0bdIQ=&t_=DUM9#GXcWurYG5_{<rZ_$Q(d$GhsEs0(1NrK^zsQ0L?dW@~P? z1e}dY5UIQ#k=9%Yhe{+Ie-g_n5ek{75X3;ka`261jF0;KQf9c!0cjSs&&o&p`+>k& z#>PIg_?fdCzm)f|YtQUqjy>`Zep=X+sF#wmoM}#1Qfe@}$|Q-RG<8h6!IAE9TEyp@ zb$|nB9bjYe&qMwF^3i9%d-lL$DI)*ii!<_LKVZpw&#-F;j!yGQi78sf|C6jXQ7mX- z;i<S+67@!BhhL*FZ_YoDK4i8=ppT8I#>VgiL35?J9ZyMb)QDnM(HK;RQb+>M2Ng8f z9m+h1ypf5O3o34EteD+U(|C8!-*fXVDM_NMs_3*WWV4R%n!R*Vaa|n($3-@W`*H-! 
z+X`n?`>v~RxPIcgp4y5@>V&S9B8acOb=LLER(CXy$<(Bvrqpb3WTOIN_`WV2k#0bJ zu-6g~v{b7JtwCu8t&^u0lujMw*Eln^O2c#vbMRFoeRLVp&lPlFGZL%`=+R<HD=2Hs zpA{(7*|JBwnY*geQ#yS~W!;?~hc(^E^iqP(l$30AlvQ}2DsXE3Z2?!!<rwnYl91M9 zNi=F*7DHN2c2<fhP*|Jn^}5|jf+o%1I;M6}W8N6AIm>1iOzK3nUKFi`oi+QJP^c{| zuS^?19`Y{@lm9GL$B3K~HCogV)5aG;ZQ?YbXS^NffP{iZ9+ZWubydur<Fkg!E18q( zV@DQ`|D64Z{QCrbs;Mh`v|bXhpQ6weEV!abqY*Vmqdmc_Pf&~U|KuBgv(soO6fb`+ z<fC_gk&uv>aeKZ4pC)K^dc78(D`>Jc5EemHYYYiy!|h|)gqM;;?Su;Xj`L&%6<8yj z5s{muwq%g%<c+<3S{!=iD3nM_ppMIlI5OcYW;C>SluT)8`JUcU(2!YjLt1Fcl5g+2 z!&X$_b6L^E#w{2$23=yJS)Ve>o1SzzVTk0-pPkLtYFUz|Q<99Q9Bj+nJ0I62T9eZf z?U_ZQNt>wEsYOSAu;IKtFG0+1B#LRmTqXu$P`n?9!Nh!Y7)Uvyju0S4`1;9k21itR z@=Xb8{$S3S22=LL5_5)MTV1<$xohm~>QPs=mEN(e-~8TX>HB$kS=)^j$x9cw#!tu> zl02^4uPg4pYHa$1d%F9!tSdL95+WmBmya2lD#|Zud?1Q<ly_r01(95h%Mfxb%~N2q zInxA5os-V2V*-w?4_{;GIeNME%l|Z%mkF9D{18fOl3Cs*ha~+w>~p%~)Pw6bpuVF- zRfT?bDjufG;u_a!*W#q{$PZ_?PYDPoi?k<x+(YCUjc{%0LXC5CZ>qn4>y#e0cKN*3 zlZ-9)<*R0OEj(T^`=+YN)pN&<bX|$>Px4Q$O=-QNw0UiNcW}bu_Kgc>b=DfIjm_On zb7n@CUeVsu=<zQrXr3Ye@-mm!`TC~#ayM?YVG{0CE`!i0ik7B(3hyBbRw4$=N8YnK zjk=>LwJ67Ca!)L&@wgh(*Dm+3XdBhqR=RrGGZF@(|M%r~FY=>|F)81_B`q*L=&GtM zuqGz0x~}Gq>!ZRK6twyTO?%uv79vK{sOmyBnu^7Y29~4P3}hiQbPVu8WtK=WlVODY z)55k;$fqEnZ;Y>mi8=Tys%-ExlSWv<jS5?7W??cnDKr`w6t*$Nl3ER=shAY^kF?Dx zX`_nM^r@AjjA>qXf|E@(W+hJ>TVC_<g6k)^3LK^scYfih0`?=t>cB<I7*uN1=18&% z8m(3@MXip3rI(R|-)2w!;fm1gTPNv~gL%_m`f@^oap0e0%MUGn@&9%BX0M!3F}I~| zv?m#E*U<gsRb7cX#-)QC%^Sf41Jq;~+cnDd6vub8{UxNYV+6iS>71uqti;eHTu$QD zAZ)aZ9K}Wx1bJ}*S#q{gS5dMfFV9tMBCe;|f}<c*lJNW#9o2QM8;R>Nev%<`F&i#% z(RexVjj(*!U7ydf@#_b9nNvd2DVW$$p9>PtL4$S#E{TfKsD>2MxJ#Hwi2hz+HI~4- zBC`?cW96>Q6V?t%ti_Y=PRUJ-$*fsY<n^ABIQNgSH#ek|jLOeRN%Z;s-|1n)tV^qP z{57+e8m!Lr)Wo_8urNp29N95uUm(0D$5$0xc`J-it=Z`=7?U}=qrCVcmS|QWC)K1K zZjHvvvBG-<1BZ#|zr}lK2CZ@t<3<3OsSJ?<`d%?zHC$EVvPMg=Ol?^;tP_f+kxDHQ zzApPs8k5axvzQY!lHO@fPD%2XUlW=YaM}{HL;1`-zB2b`YwFf`>;{V|K?D2Npi4?} zq-EuKGOa%=aB8|%ShIcKz<&G@2Aw`3$(CX=+MOns1(jF@>0_K_*%`&73T&b#6=v|- 
zhsX$q*{4&?U`f=O-Ng-s`-Q^9TgPUN&cJYwINh!6&t%oNzlj|hG?-0@qJ9O78LfA| z?#`$clNwLf>(!Sst1Y@=X0`YYY=vTYOM1OtBYh>~+qhzw?M-~Y?Hk~^Z<tw521AL& zbrkpC6c1L*4S2*yzm^hu+&W~rPg>WenC?l=#5mi%sv~OWw|;B-AHOb>Kly1(%(S1r zVG6hH)10P3+y2g3Q5PVwwetdbz6z6fzX6Q{sk+b^;7W)144;^L118<qW32pxXA@Yw z!%(8DuI;#S)aVsg=1iD0CWkx*SFpGsC)4Rj`np_2EG6>7`OZbnwN<e1l8PEuw!pgx z8d4H6GCjHZzCs%CGUR6zBPO*Vj;~5!=;GD}1%@ttkNOnvgTvn7mY|-TqGZ(itwb%% z$5m+%ld_Z3of&4bG)5)GSsG$AVzm3x*X4cpJk*(R{jWYw%PC9*J^;3c+LOj?DPD7W zRo&#~)))`Ts{6O?x<CI&?vraI{oCw|IA2GS(o1}qc0_xTfsgrTil1|pY96<c$ufxZ zbd((TB|RO?ty7|K|Gvrcd8l!YcX|Dm`&+u#-aPM)$<B%Jhq^oA5B1zwJ-KSG{My&E zO`<MS+a$$hs#~PEOxCf+mRWHw(@nnS8TVeoxcllRN&MYKG0b|^OmtQq(WC!r97J1# zsLLPkeUab6!-V8_z=ehnb2$f0zVou|6kHm0sz2ySEppqF5F_AjTE|UA{#&nvpH`M$ zk-zL2QH`!~mvz-*anVZ~c^|-+YrAo*Gv>`%63x}uR@ba1cka62vK#&Df`S^}jC#f} zo7n{aV+nQIxJY!!SuMr~M;(LSI$2-1_i5DoFv}P@=HzHp**WCiA(u|>o%Ve94!YqY zXL~(M%pPCTv$T<7sA@(Q?mhE)s8z9sttloRYb_qT`~T8YmDsZ_=H*$u0TY}&QLE8u zb%>G0xq5*s+b`}9|Ne1Tpt{M=qpg7~L|g4T#9AE_Z>Okicm2Aanz2aqBPuJr<Pf4^ z>OoJ>SXoAn9yXDO`G&cB%l37R;(GO#bKROU>2qX2iXq1;e$<E0Q$_4G&zDQnkR1&5 zvFOh)Al<1!awX48mM`*sV$duiDXA9Us9BE2dd2NW4ThDi1)<fc8kV)(s2TXt^V<^5 z(uW`FQsw_H)Tmz!T(fdgiavX6R|s!Sy{*33qCOMfZ?ATx@aRewF#9%4Hu&F@m;8?1 zww5I&%3sJ`@^$k6BqN`?ylm8>-#zn4m5`P@egBy)Xfc(Pte*JDgWtLnK~`YoKvyB? 
z@&m^LsX{DW389h4<RB%_JtmJEjnan8k)jJPq?8jXMG0kRjy<;Nx7Cv;y}5%e4OOT4 z98&SDI{D#!*KU63$t8{R=2~?18SYZ~hB&&5)aqBCe;`GdQqVl6@Izh7)Kx^b_~N{u ze}8Aw`u2t+JGwiT>1_A)OH$8)Ef4N{X~XgszricGOj&^vdCSFF@H_eLJB76L;9cMQ z<BH)d01s{h55|HA6L5~^fd>e{A%H{RMjE&4%#=VY92dN)MB>4sSj64zj`=O@608{9 z^JZ^LYpXlg*Y<;jTU(NnO$PD0RpRoW%AdY8@8wWdsZTIzl8nizZmYX_W`+D&O|IeI z{?ev_NE~Y*5-&f$?L2{yb~>6X7qOC*tk$lo)}+iVx7*&au0pat`rzTy)5fKi1chY1 zRhOiZTs7b8W;&_P{>)W1GrW;)BeCj2VUf^tqg&GqK8ZL3)s*pPh_Wzh7Dyfhe7R&; z27Jaa>W?5fz98c5DqFcT;Dq<!r9&YMI}|(#9+MUE#YaD6ceBPUpJTJ3C>0?K<doNU zE&H~CCESqiV)Au07FPAKHEDt*4fvmXKyTEG%1MF598OF{z1sLXPl`UJuz5@Y9T$=t zNiw}9-{ZEH9CG`t67t2R%GhCDvUBvI4)!VYI1%0GkL**9jxIre-q|T%GPmcN8B-fF zFDgBX>`Vqvih%e*d|ttY*q}iXgom5!!&uO03_gmL&D0<caT!^Wiq|h*x4v=Zyj6cN zWv`vtxUZmn#S_2$-<M{+?#~L9k#aMqrdejKADA*+IDX58qv9GxTQrQaN^Je^hf0MO zL$Yp@(JJqApM3Q2z;%^b#dQ*b?}jAlFsu02q=4cDw?@y43Fy!uj|fu$WiSp;fkT8+ zlps6?#(I-sV5%=riNiFau-EdP8(SAUOXp4Q30{Bw;l9@o)URIK{L(&_oVxDtg2)es z6Z+}}r$AWI*79lJle>$%dTP^0u@|=Ol1~@RJ+_Tm|9s?1vv>5qpMCK!5-O@@fzD;v zTc^qg?a1{40`pc3Fy;oZ5|_8?FWgK;+s6c3_}mw&wP$&Jf=*Xnz#eaFcAxxslX(Xl zbN0-04<sh3V^XZKDEfX-g1N$%CfnMkN`OL6(g<nEDUtUg)!HQ1CJK@FFHTWZ-Cs=E zsZqn)ts!|eVDC!N*KsD%6=&sAh$b2b8)oOSxQIR{IChdtL@&BN#9wZU+L!FCcxwJ% zirOlQZ{2aEch|vX?IF)o&pohdul(H1oxNSlx*m+I9y-7`Z7)w!zjA@-O3Ri_UOFdl z?F~&U4;QyL7iG51`O(_rd+y)Z-Q-JNv2FRz?+zr~xNgUyu}%9cSFRU~oEgIoaL<Q- zy17Kr{1>V-$not_eP%-S%@}@q6(1bzptKS~g>(mC_yGd?X01k@7c94OEjIjWAEw2m zd6o?~wJu67oj-{Sa`?4p>u)DPmj9Oe^OFNt4VUBJT}o{U9Wim9U89IIOD>pmY@7V? 
zpO3Ve^CmIfD;MAVk@b~Ywc%X1z7_j#0nK&5`BBuG%sjJ``sJ2e%)UGkF;^d&e3(JG zGudKC0FVa+FK+HYwa4TE05r;a43`JJrqkC@UvS)7Fcz(3SNf{*Cr@;>U-#ke?Q=G@ ztXUr^qyEEW3h8u%rqInsrWttkxow!&Bz(yJrc1r)!SW>Cub+cekUjRA(BE_^{^{cb zIq69TiAg)lZXM{0e8P;L*70jGZD8$O&)GJs&o!N8KMqv&oDI*(NB7BJZl1_Ymo=Tf zg7_3%bM?%bEiL0JQ9=kA+6sQo06$&O|IL^NK9Xe!ff|vK+F_dHLPj2}z0hhTR<THN z?ZuhM2KEgK933;=Yh+m|_9S>%RmOQJj9^nnDlW^kczg5WNA40l^P~1*7LWdXWvf%k zHGUl~s_n;iZ+f$8^5ox;Nm`ZevvZSl=f19cA9`|W!@Rj>9jkZ2CO!W3c={C$*2uO0 z|M9mG9Y$6FNBeJBv>Ga*`#NmaEK|(gytX}xxvQ71TO}-iip<?t#%99YP12YWlT)*8 z?kQI*=5FD%noz#WRCYz=im%7|f88S1q<f3YV5q??9_aptH$*+CBgk%^g52v0><G7< z(ExeSas~_!SQUb*z;fnDl|jo{O;M4-AM;dU%-JNB5^q^us5G98TGh;FhgqFcY=Tn_ zt9re`0Ha!-6}74BU{n9_YjXW7TG~gxVg3&3;37<p>*R5}C=}*0R`>``{AKy7ZH$`e ztKTxYVWR)Ip#7F7j(+wcH_=N4ht6(ENjJ~Bi@17$tIw78z>{A^HDA_%*L=h5M0`GW zr;%M$2W;ySlGl+fe62?KI@&`0U2Zjz`G0+R)cl9l^divCsqE(`<@B#7nwN17Cdpq! z9{Hx(2;1Ne*ao%Se+UA}be*DyFX}-I9dLNg!w&exqe)Vn!L3oh;u~AzKp6x4i(mnJ zLh&ywJ6LdFZ`8Q@C0yy1)8Yb6KOU4l3N-0na~6$nn&C%eL2(`@E}2-CViGRtdJOda zOYGuNsLA)PU%j+De^?x9+}$hh>YndM{5>XfieEAfc#2iTFU2znjmdNJ<Z&`ICeIy1 zN&519+-!*VOuo*E*0i@T-HApH3r~^bvT8f~#Z$X}4$mdKm%XqRuFDHEA0pRfd*pw< zo-juJnsrMC{hE1?uX!u#*DQbVJiq4ce~Hmw!TiGch7YK3f?O)+yIr*ioJM#T*Uw0v z58Xo1@J<XU6uX6DLK^22qQ)UkM%CI*3)^s^gFofPs9*F8Od4VOH%9S@=(POM9jR)T zbu%KO`-c08y2HO8TRSMGQ*UcPh-XA!$&?RY>E_Ea@j}C2J|utMP(DmrV^LGeYiu6$ zqr8Zl&Q;yXPxpm>6mW>~e$>Be=*HjeJdhmkYU?#mo#$FzQRfO>SrqfF&K6(F)ZM&e zn5heAJT45OJN&OXTA?TZW9cwQOZ?q85UY*z*oV0f(5tarT?hDn3+5Q97#vtlau!ui zYT>*>j*~FK8oWmL5nely6AFfSOb&5flVEq+C|Jn{4so!&2ftjys!?8p7V=O!aENX6 z8R!NKP;t4R!6&{X_FkB3$Z@KjUbGnz^{z?D_J-6H(V+I^C%%|on8efzuUPVwpwICp zip#WV?rgm%-7q?_z?YklAnJ`}c8y!e$k}Ex*)dsgo5OCHnvjxmPinf_m*l)iW@Qkn zFI1=9FrC@W8c8jA03t?mRLqj}Qo5VHKMhj@4Q>(tp_)|G8ZE}i$DUlHC~7<+!MQ|_ zRuHsl1<xA+=NHv%L#j*G>cr%f-!%@TT?QW)pVQN$7Offc+$hgXn0!L<Xe<bx00RmC z0qX*H1y2c*3L#*8(AZf3I1r#*fYtFU2s0r~M}PBB;iWUsD&bdaTbsUrWl>Tv_=?u- z${*)_$>GgQFG-R_N%Xkg2CWty@71EH-dvD9wbr)h*6i}^A7@XzwIyqEuzcxAd=&5a 
zRs9v;N|*omxjWnXxHCV`m28yn0+Xa9Gd^CE<Ss2@J;CxU9hz$;tHJ4>wycRElkbWA zNmaS{J^9_sax6YC??VmuL_`TtY(fxaM28V-0!kT-L53Zf28yFn4Ihw{mlQeC)VZn5 z7jXXP-ceVL9dEZeGLmlTc2`uE<l66De&vF3iMGmb&mD>C)1ymrd*+pg|4V1k&AdHP zH*>Zzt!(n5iR*t8te0HVCd*%3o~WU7Xa+@X+`frLfjOqmH@M3J0Fgjw$4P^pq=6~Y zFw(}~U)HhZyZYqfx&;%*1Sce3(`|kIc>VHqSLD|F3p|MquQ#JONy2|tdupOZYe0Ru zy>KiNdFsulMxvkH>dU!vufOH0x{xc`HIE5z@1MW@N-wq(|4AlGlEeU`)*CGLjBE$% zX&h&f`0k(m=jHh_gu%EPCxd)F8nXdCG6+5d@+a<0C|q#5$FXV=*+Xb10NE8i<YEE5 zSB^-eBZDD|jyC>C!lLx&(vnedZI<Jnb)%h5XWEjwfyBa|tidxvoWT(}z^3a=fSehT z4lyQ*24_NAMdO`L*R`XHk_7<>5tRJcY!YMovW}5>C@lJCs=rTTy!#p)3eL0GX&Gb} zflQfNGr2QeEwQ!<srJ)M?Xr6{nl_IkYn+~nizKZ(%@J97dA7!3Te?-tRCj={Q4xZ6 zK`Of}xvWwd`UM`Mz(b7k&|!)35cnYG)Z`Eg(U}W-Lgq|P9-0=!38plXG`4_+ByFbL zoE%p&T<_^dyTJ||bF!p1Bp5YmW5LMGSqdZFTB`D3U?_NCFUg;k-=bwY%lOQ))FHX` zWV;iz&yU1r_NpR(Tpw}y6q#DTwLrjNY0gtwjxjaIkv&eY(Ma0V<W&9r*B7g``C7Ud z6p7}$BR5@^(?jw*4*oQ%ZjjT$4d~-HXgQjjQ8v`XTnw;A=qx3z1tkW-)%b>NNL0~{ zAf~@|d)+!gSYK+dwmU?%+MVWXIneggKYw;0YkFOs87G-JGDaQ0bxT(5&6A6-&U|eo z773~}2c%q+OOyMAT*#W*k+C|-Szi06#}54&D$dhfm1okw@ce`2H!rEr&Xm_*jtN6N zn1gu2w@B7Vcrr!G>>x@*UYZ&~8byd;=|<n<mmHm5Zf#^T;*{uOZi=SSyCkOwO7PDy z3GC7gl4e5G1mNRwK?|D;N;X1tGBeypqbP64Z&7YK{-%cn$X*n}&*|4lI*YtnTDICz zNbhlRE7=(Pq_G7K`DY`sQY?{o(eXvyfZm`(d6GynBpEF^qjDd6tfxfH)gZkYf7t0; zkqfA#%?<`PZ>Epn!iNqVfL)@_No^T7aKq);9(TrzRO6@u1AHImbwn8q2P8Q#j+Nu8 z%mjXgsv$3}5~uKk$C;@0Wak*PV69fWxj;FEbXkBc^&@g6SKbBMIhk6YZW!tK7AH)d z?39Sa+b&CD9E~2##6e!sTb9bk!0fDo)Hp}e`O<oT<VOjK;_9NlFWGr%`h=a`lZy>F zN_!=Pqmj+3tgo3-n(h{)EhCY17^6L24(M~}u&8r%7~(tgz0S(~f+|*&n)zRsqhjcM zbfP~HFkNg+iJGdr$;eiA#6=e#eDtWnhj+Cmi9jU;6tE3$DwPT4;ey2JiVJZH!4b{| zi-_u^#H`dbt@gVm=FM~%M!C(?u3c-_=kb%_OS>AP@6686&M|Gy6R^Xd=1jsnttEAy z)B{<KX7?!Uzq#bQTJ6|EJ2Q;LCowm(Alsg(#fiX9!v^I0+;IQGEiRK{5z=8`lHQbJ z*^}>(G&;3felK6E&C>~@)0^>M1v?rl`C-6O^6Av(agFlDF-$`RwDI=hDCCtULq^<C zGSTuJ%FdlH|1{PUCI*f7a*n_{DN)|K<Vvk}hI{Y~kQuGUJq%8`)0;}KC@?vlsOue} zO8AVxK#5d1U%8lwFGDAwL7tT2E^7*E5@lE#Bi0FXTvg1M%L9_&ddQK83!2%;j&vzd 
z5)MZL)>QClmD#Z6!Hb=CXK)O8Ak?Uo4w5q+e`v0t#$h9O#Ko#%ZL7qOK9O9Sb1F@G zz$mno7S-?l(q_Hu^o+#98vFMJgD2f=eE%Y(-<MeG^OhQ=xBhYJ4aroIY7%^9rs5ip z`~z`mWSh|_bjyZfyTJqHdnk!5>4w&pB#*(KCFlq4xF{F?@?*Eb_5ZQ<HGoZ(=l<tC z=j0^KS5Lm1rfHg{Nt%X`CTSYd(ho{0rIb=@t+m!#5wL)OhzwcBy55e<kRc+9%A8}& z%N*D1lF%{7oI1VEaUC+QR~_Rz=k>bj?&db<oML<U|K5{+;k4}U=0<vQ&U;SI`@GNl ze1C*kzg@4wUE`0k%{=ean_T0+2-(zy@xd>*AozDC9!yEtlnsI8QI10qf{Pn~55+z@ zXOv8qZHHN2Gqn|YAO|8*h_85-C|`yRTLQ9txS~3Fy;4b;h~7U3skv{Ftwx)M&+?NM z$?6r9Vp6svc+1?<khAb-9|N%|J(WB~(VuP`8UHzDQDvO`&~8Wemmle~tlgYrHj$Xj zFeWz`rLWPl@wexcu*9Dnd-G!<ICHzqun8DZJ0=9)L497x{Gon)>eL4(o)lHAv+NH} z>8L0VmgMV8n^#qD-1z{R^WQ)G=BvMc_+S1{S~1%!xu5C^O6N6P9$xx|t7G53`_NO; zcfR%JOGl2q{_<GnQg1)4M`@<=$P82qM6EMKqRD~>NaKW{3URVn@XTeg)3Yiy)f^!K z;|8iz(^xfGohJU-1lj)y)if`<+aDBVGGLhRir8}t@@)=_OGhmP9962393v=m^#yh- z)gbl+jJ1vV^RBKA0zYfFw3)rGT<@$P*?565_V)Y>5na-}rYfT>NKj#v)97>>*dnr; zwZrQ&iz<O-v-&n+jV@m}XbW6^Sxli)jvu)c^k(KS85uWnR6#8-Ui;Z(1BxFB8*sxf zU=zvuxNMgAC?ecE+f-|EO@RWJ*_5ojrnxkf#h2&}e}xlchD%IltlFM$(dPMkmFSNG z+&Q}Z%bRA-PX{ggItF#?<R~*e2h}E8$;pE0RM=e>DJDhGX)_CSA{U`jI-wRY>;>4m zpaso>B;3C(SY-F*>4kaySN)N)Oh}mGw9J{M2YX5#4N39Pt?}-JZ-F=uwwXdyAXj_c zx$~BI=BUlS>Qdd(*<i=TS{nHrU2cx_Uc2=BnXGaWLncWBVXs=1xFW1zxdnkiuivOL z=+z1(yQuYgo6Bn+zw;8nogsI{5y6Xk6DaylH&52+oh_+A^%?LCSgtEUWvLYl0!%h{ zzHnB9%x@6ck!S4xe50q{?eI0U&8@z%dFkhQjYp*wij9_4OZIeJ>nNS?ciwev)m4@M zH7i7Z5gOL0S}(7QJ7Q%9t>~M7S*U3sS)gK*kEn%#YxmSInWMFomDLsZtu-3QxW9eg z3lQT3rb5k{RM?U!3e{z(c}^i4WFUn?!=cINm5@2J;lB{TDVXF-mf7}F{WLL|dmE)0 zh0Y?aj@1-e7Z;Q`&AL3Z&Ms=)tlo$YGA4z};IN}swwRx*O3nrk^3)Xd;AAtyggQbV zFqahQd==%aa@l^4vk<*X5bD=fsufqy7xSI^v46T0WM=Z$CRp)T(493QU7;FsR()fk z=z>1LST|vAfRH|^!9le%({7@*!ff&?Z;}DCqN;iYg$e|%J)_C%dnzV((A?W5)icU; zm3e`dSZUpyBC`fvLEfDWGModD;br-Y=W`9Y95j%o6F+>SaN!ra37Nh&sm$>zS7rPc z6|I+5xb?2WO8-KiXZ*k=fix}a*D-Cd_A&(Ft(6H=Eemrl6-1{%#Va{I2T-An54KH~ z2GNUhArAd)w1)lSHx-<~e>+D-RF*=UQn{_P1QHQ4bV}`cI?XRt8oj8|=j&82vwU7@ zElANeTg7JqHvGRyJxh8KB(yqdiJDWA?^>PqT<|nyDa2J7Ts*56OY$A#Z<;JdB`Te> 
z`HGzRK4o?Qt5$2&Ih+t08@MEdX5vvfG`f|5kpnauTV~Q~;zQXUrLBrsqgMJMTT$qa zA{GZ^HrT>1-dM0pLBXcs!{}TyaU^d+e_XIlR{i0+_6=(7#cIO#SBhok%cT#c|CGi% z6oNvp%GKm*b;2X8Rx}#;U43SU!)#de5nLj}Iqmub=19V{R?rFRK4a9cKO*=YdXcOm z69ODpBSl%2GaphaOSbO5Z7b#AJnvTPE#mn5WUwt^8T%`13U``Kj2~)fAVY3aAZ>4` zW3RUR)CKn(pOD!O&7j4D%x%m)fLAe0N?=eV3O@`>?M)D9VH(du3U($vE9;JM@<4r@ z_VXyI>xO0~_~|_$-87v=&rbM)e6r4f3Z>c91?t}M(+69#RxHmf+>cH3f|D;?BPi9H z%{q+$OR!GKX+1@yxqEn}QdJU|Tf85<=xe@&tHj3gn&QhHu3l#$+@6-XxkAwmuz4s| z8wsn;Q5Z~l8p2*(HKMRPopr@`LEXC7KYxy$?}<FEP$_1waq%6(86scJQE3kuO;!!8 zeo%6$6vg=z0?Xz4%O2CP*gCJpBRQNxT>@JyOE{k(!0}V4owG-$Qh$|jM$v*%MC+qY zp<*4Zu5;%+t`vAd@oH7M{79&_C4zFS^wP*aO%K{(^QW;Pl|D)bCT&~L^RWsuG<qtq zQdLD+L%6xBtnNBtCBTx$F2?+e%UT_F=Py($_3`|mrKB0V-c)E8O*s~R?}YlAk77i* zN~JZ=sWT{)<d8BqN2}K7n(ZP;2q*+K2P{_kKg-Se2<~50_KI6t4daeq1;A0Gl`>aT zp!KW9zc9gf08)C9RVqcZ%j?)i1dUxZYxF9WV%jJmR8;)Oc>krKHx0XN9S(*lDiLIh zCN~0A-+{;oJfaPeAl?I~<(x~Ja*1L_v*Btz8LG3EkRoZVPF3)9LtT+&PMLcAc{@^? z$RoD=-{*(R4P{mq6`$<Cm=zGeWp!fiobhiP@~W1EHFqo*6>3EGtScst#q$)OU8Mjz ztNs=r$fQnFhT3RhrXF#gPcfM!N-C!ppe03lIK3Eu&_PKRquuTAMOimAIk$#Ky>iF% zjMYnv4_oDSscqp@#Y>}&*}Bq=BJE?8DxRwDX)i!mL)4x|zvz@XI$A2%UQd;Z>7}Ut zR+Fu=wj9;pI0Fjpx#pqvo6<0<_m(eMObdOM8Z#;l3c@8*OAYA|x`{Dqg&t-2=F)PY z^(Z)N@Xx83U(>i0gHc$PrUg-T0lvEl%Xb~Ape6>nAm5~Z37G(cAX`$zrYS^4g{(*| zlr!tKtsL&_O|*7<<|yOMk%p^Z_{@gbd{w@~R=Bt6tW&SHv1Pxd3$=FiJTZOvopm2A z)W+_1*F`!P?uZtK-R<>*ao16o-DnW?PNy|_&2R48qE^{M(sp3uqqluP?>;a6?yu{# z$??mR_7^9Utu7$z7XIQmP2#Yj`XJCVn-1U?P=hFLq8(|dSnMgD4Ix(;EkPAKm=p=$ zzPNYezUzAuDz}qgwJ12UBwC}G6$giF{H(G>ZP&-Uw#FW~adRf3_N!d2q+R+;bIr?Z zKRb5mQa&^I%cM1c&O(XWG*Er8UN%`>hT@l1`evC5C-$z1#g(i!;7#<o!%MxjMGk-0 zw(TQ(Zkcr!Lmi1~5_WoATQ_&d+MES0_crpCt=m##D=tMNCihMNIT{gp?=U1zSsO*Z z${9Wvn7n)*e{mQdb((ZYE#<&EOwt$k%|=w48tW35Z+BnE*;khKt>1oAXO+nsL?pmq zmB_o}Dt>)JjdYXR*`~(|FSKvH@tMB5aMg}=Bct)Yno?Hqs`IdB(xPe)@6ziXT6>FG z3SVlrGxb|LDrwH8m6KsQMhP6Mj6sadD&^TWB~}yNZ_3%$%ej7$QyUm*ig!9oeYbDl z*2_wJS$Fu6Ex|U`=VzM}Ik3Da-jJVPvT5<)BYU^=s@$H~B2sd&uXa3esaZ_Z^(e9k 
z+b{zIHA7aQPB5kK6FUft2#;=5Ht-veSB@jt@u*po2A|9_DRx_zP*6YAT^9}%n7l?? ztY^2HIJd=vko`7BhIzH!WxajYxosn8>1W@4x5rs*bovu#QS(zI(#g1jS9@6e=TWh| zV3T%SYNDVS(>1}O@cS=E1phiJzxgpc8q6V3TZ&94qtjCd`}ZaAPhLwK%?~rfx`cfQ zQX=075usRKTY6_Lh~+1`b9mk*<P;e7$)T>Mps|JZ3N_koRq^L`b#7bTK$@C{yPCq0 z1>S&D+8Dg|wPd2%V^Dhp$LU*Q(Jf}DN4#z}IB~n}#)7(ruIufNU>z^S!}I!A)ZM#4 zY;6xUv~;d^C5Q_Tb_eQKlU3d#7euX_4<wuK&&f-Uoxc?5rg5YiIdfM6W)=Vl+=2?L zP6)3`n6MH23A6zuv73P)GfvB<?Fqtc5oSY{qrDvmtuC#)EwV#nb(vR3T5Sb3bw_-= z3XM@s=AeqBJN=qj;c)-M1CL2gvyx?1UY7N$S=Q!yYj>~orpv~%NVcV?c$1sFK1HlQ z{Qac>a1p;zBBxG{gkdHcYQW|X^5uh7I|~CJ4IuI%&XYzz%z_W>I}wj3;@KRk7R8z$ z)EJEoWxSrXvtC}yZ>J2XdbWx3O}N(+86Qr%d6$!Aoi4s`{;Ot}MXy3izvuuPO3WMg zUutU8G&={C%jeO$h1lI_6HGCl7G|8{s36m{Ljt`(2?K3bG|S|;WDV;7dW!VZ{L~%@ z)r8uEevvl>i$5;;e#-RDC#!A6#Ws7fG&o6~6A9_X9h*Wi;&;ZX&1B1+(vnL}Z3ZuG zLC*l1A5EPsGx!dqR;X-n^&vvQCc7`G7DfwaU`3}$oUERqCX~F&)Omqro(+o|r<qbq zET$H6ELKpvnz<vq77UdahW!h%xUG>k-r4JQsI<-m*1x5%76=f0dbTA;_lZmv9`;<Y ztDh8BRH!vZEz9PV!r#tuun(vcoxDyhJf2Xpsu1vick2TqA%0_xlXn#WNeX$FwDMAb zo8Iq}G6>)XVkhJDB)^`K)iE0|UbL5~6<T3!lgK;cf)MKJK^fAxGdk-$pCPHHLPE4I zl|vivkixBQUU0+GE;t3=D_zp8G5!9NdLG2Vi<S?R3KNpIC<s680%D`KjE^jGmYx9- zKDxnYl_2#-Yb6%6A-kXuOSnyOwL-0K>)&86y0-j5;;A<pYp$rP6~&lsb9gZ!1reMW zs=;?_XvMMFAj1hZ>!~`&)LK@bt1f}aBHDHKC8B=QfUn^PFHbEs>#Tcvcc;Pla<yyX z270EoZfxhJ5a1&J=433*C15Dqf<6Y!ka}jrk4HZNTHZ~`=EXB16^gifX2lxz4->GN zB!=f?aZa>rDQx1^-796#$XD5d`lnj!BV*5B5)#w&QG<?!Uc^0sK4yglN+D%`h9CzI z&PEK`ksbTCZQZ+z=zA6<TlyO#OKzKWnRg!@T>s4n*KW&5quUPM_^aU+<CkR87x5@R z-ZLxB%|u<wh$pe$W(F0T)Q~^`WqN)o5tV&r2CpiroD{g#uGPVwmAm@gJ!a=Eq1HA2 zo74@nO^-Ari>FX{K6!Oo?AB}jwOa<ZK4TZx^pcH3$GZY!FI;koX82H5Q#{N7^CP6R zN)z@rRWN2sDwT;igtiHhZnkA>Zmb#FTXl;~Q0u+9>ek3kwKYZ>st0Zx+}IzU)j7nv zh_{{m$Lp|ny(${Vm$*erzWiC~BX%yabMwYj<yhOL=Qd5trfS$Kr*`7B$T7LBSz?yO zPio|zZFb$BknykA6O^j@8n4f5RU6cLHs|`;7LxPU+pIsCxtQ5KC#z64%chXsF-NVo zu&S}gE;ZBXc~07wV&Jz~Uo!Rc#G+_jlXZC^0*Ce}&3e3)zS*ctG9;7*lY!*c&Luau zZA_TTjcpxqZ#))jbhdQ{-Tfwg9+^Er&(Rw2`u%MY&w`<D&wXrrb<poce+5TqUP*FU 
zUIE`M#V$3q8FGjdy~mSO4lyUP&f7S&tvJduM3Fd|KqR^ap;NQ1nL60wb~mkG-2n?| zBG~qznrHL(&9+4B+tN=rlf2^Q?PMz-5t^3T98asf`msM<@~vn324cvjTaNdp^aFkq zD7hAjgm95iKjX|(R7lEd@)dH;z{G6J<9CGxp|3%!;%`W_OdU^Y@*4Zq759=~&N{sp zn}WRHL1|dkz?^^IMY{el<NGU9ge5NO-PMnczjEnGPvha_extLaW-;o0hMNdx82C0I zGfss{PT2opsh@Sbbz-VcM^X@60i)?CLbR18?KjX3*bUE4pE1yfKViMC%P5-qZKhfE zeJ&u+n0~n2e(5%?PE|-g*P_*-u^}KqKZ22;%^`vRjs!2g^JcE|GUWK8u0Nr3#^TPL zvxX)_OiX>oW!d#n2j|2DL?-T<I9=(RC1iy+PeG*AWc*h^L@;Y_ycM-a1(WVDAXAL% zCIChV>#bczgK3G;Wy~$0;8MwFwdPi|al6iDmY$joM>uHqN(2ki&q5hqdUV0nN`KMP zzk@A09Y8iYAQ*Xh4jcGps8y%=tFsiGXD*FGGxmEMyw)^Fjn-hIj%y!PAK-AHeh0r& zz-Gyw11`}T7KusUBk+wz9RiAptczk21NqsA!{K%7{DF`lv;-QK2!az8(h}+py9NSx zOUS+iaB~*he5-`bWrbz<(o)z=&jq!XC7m`E9aLs-5d@PBHsM(#ZP|BkyJwz*gG3Ho za|?QlwaajWqkx?}!HTxfR^h?<wy7bk-F3??PRG;;BB-o+WB+w2P)_f~8f5ES1vr{% z$0|g~RVF6GPGfS|DZnboTXTY`Q*dF7pPLIVA$u|tw%G{Luk|s0=N^Gn3Tn19+|6p^ z{_1bp^AbUOUBGA;jk!h`tZerg3w?URCriK17R`!qTh%tTLHZ#{>J4_an|g!Y>b4>2 z=0wQjb=eC=+eU}pR!&xhoY|{m|3A9yMC>huH>j`F3ZtAjoN&M-Fl9LqA~9P5dv@y| z#d3pD%$HudCsA(AwOO57=|(4dL*Cz3U^g4Jh9Ay4#mxxt7Fcbm)ZvbKRDErFwShdo znpJJAX2riDTzZDgW@9EfKCR}GjY^PfvQrYvO%-L&HV&B${jIbPA8#m#dnU@Q2aBib ztw%enN-ZeDS25e9cs(tlUbbv?b2M8+{qu(kri-YTFGCS^T2)W_#ibBe)}L0Q2hKz2 zK#dOgn2_Toj1K~z9}fGGt&boeL|FopZ6ENbuwN?WCSmTTl~HbQnuTgG3IZs978 zYHwbPesEb@m$VhQHx64Gq|X_3^xU>(DQ&{$Dq25mTBHa{o5xQ}w~#NBJmTo^#H5q% zpe6`o0q{BKX046kg!|*qeq;`FXmr}t9C$f25oMNSRR|2<4cdh%$#K50u~3u0_ivmF zd(!JyZ}2wPE<HiW5w$ZCYdo^E_v$b9tluC#Yt(q{m}Xl_TV*SBum6Iv>EbCp;IN|I zQ#F}i*Va3K|K|Sdw7a_c8*Xas?pPN$hCX3t95cN(Nm@H_YF-LjE)TM%TwfBDBRdmN z6B>z0-I+K+eQAZ*a<cw3+hSg{V9#_kDcnkTecjqfi_=~py0_JAe`Iy-0J(N~$L4@w zGe&r}+21*SzftYUZx-c+HM`scU$T5;#vL_nnh3A2kN38EoesS#G_t0nS4egw>#J+6 zyg{t?ct2*H8zKoe_|o*e%Ry0R%o~~?R?*@C7xDuHNDrAn(AWfmqI*}uLp9$|LfwmI zdp6(f){_(Z>(>Tq4&ARJexuU~{pk*k+hJVCZyY~zDO4Ra2hE1Bm~a0|z46@Z_SG(9 zPLO2-YGQV+Y$nX#u&$-F*5%Z?%7IW9uO=p(aK{sF&Q@uQ{;jym{i1e3*za<R&XL-I zUE4O)`OrXBEj=4l$LkM{@4XbZO8w0}e)v1xp<R7FU$7iu8>LrNPIB-Q=EKieFC)C1 
z7taWm*cv!FD{U2XIZUR~8%JEFY>D`rm+61j#XD=}4c#mh%;`$BR-^67rhmOYa<FgX zhVkH~R!HjHPj0xyA0~I{{k36F{rlm7s$<JY!`|VY<lax15LRR+4(6eD+4LGIVgeU4 znh--kNntJdo41zd)X4g#-e;t*1WQ<ZNc2c|7`0hKnARmI)fhH?WDj}YlD>uf=*zUp z+l1!8Cy@#Gi>CLB30x{pV=t$0%1h1UO1La5umo8`;mY=nk;w&|n{qw6tfAp=MzE~| zq@p0uQ~l27=FVGhvNx{W(>1t#W&6;X>fW89t_~@2sdeFdgRLf8q$wEcO}2M7u83>B zYhpcZ+v@t}1zR?`!u?xoYr<<@`=k}R80WmOB2CX&LFv7dK)>+I;pquMVL{lwJo)Jf zDlH%7Y8R0P=Ro3$nyT%abN&4bTk2!dpU%GW)$2g&&txY?wzZ*!sm#7UO6%{ML^yNh zk|@PD@tr$?ZTu%q1v19;`q|KdcP0m;*d@oC8xZGUlIgq5wXAYc4lL?ytj&7Sj1D!{ zsO-t$kYK6yuReI!>Vdm%AI}(P@HCY(9BX<IJZ+?B{l?;p85nzLWmEt4F=9%#Zm1gH z`^eFU4?VMY$8+Sxiy2$@8p_xjZ)%{7&5QzW3g#-v7q2LKF|SH^Z7zd1Q?5lX$A0?! z*-Ss=SRgDE)PXWDS{9Hif=#Fbr2$J~QQnGnd@M8yDs~z9q7tLXmO$V_sxFWmE&6Z; zybhiE7@B}`rLTd;VBMM@*`gqV*W=dknjCj?F8j@eO2Uy^>9wCcr+$nyK6wWb9>3{v z>BN&e-%SmqNI6+e%F`+7r=#zF>ktw@$ysuR^l!i4`_3K`JNI)kPx>+WpT!=I(`Yr+ zi=ik~u!1G%B|EJJrb4x?Eg<Co^VqdNeum`hG*6I*?{ECRG`{x<=?6bhv<;-**^!q1 zGxcqR$E_lvZ-1CfJ@l*JkUHsyzj<3q@e9BF$J@_J&yk9s|BS?d$I~#YLR?SpM2@qZ z$&Pu_I6XiDn^#VEMlfiQOw|sa;QHv$M_VKb*wm&^4N3O$!D}kIYeiqNYn#p)ezC}# zYt?c2f+b@3&y|&zU$vtCvUy0$MJf(kYA`u}^yvZ14b9(Baz(8>QCZ@;MOU=c9$mnz zJJ%vJvubT;_sWfJ3++WBk8%!#RVY-vv9Mrl%Rd0n({R2F5v7|^-}}qJO&QJ54F`Lx zmZ!x7U@4+y6l7~2AWDl0(0(g0@2RLKJzjwTQn@GKT=wrkTeH5nG+$x$TXM`L<%`Q~ zrA7H_AF(9MNnvHY=&Lt5YGTELT8pOOb~RD*N<ppG8`P#ytoW;5ix6BA&NmczazFhV z@CudD=g7^oA%@pk;&vL8I)mDmbXUY|tRk<hIehb%v}Tt<&}vi)gPw(_NTEbIZ_(E= z?<<7$3BE{0UO|~%_u+s416GFi3z(!E3Rog(4U8L5iG<c6p<M{+E|BZG!CFBRE>ILx zrXHY%Fi^mwOGBiMI7A~tv`zmzmZieiQ|Ei*38SFa36~r5P^5zX=cy}Nunbx03Rhvl zy9RW?JF%PNRlE;`;WM9p3Ak^^cdV!}DRdgu2D=^gzGPZM3y2gVE>Mwg-(WWAi03Hn z>!n#E9UkBP_k((R4@5zoJ14qq$;22@KvbikWtLI%1=)d6IAmrt>5`BOE&m;`y~kHm z>Rmjiw6QjL&HNnm(x^FT3#;-Xi>}+^U3l$=%>~9HC#z9t)jC5?V%g_EJz&|2x=xQ) zBv&CYKyewk04x01G<@~E>lXPF9%H_b6_q;hoJ|t>`vD62J&n71NW(N1rHi>9xC>=7 ztD?~uIrL(p#6>33l%zmSgDnklC!d~PA5bq_7_QYhYXV7EeqCPml@Z&`>q?tjLN`@> zwp*z>mHay(_;@6=XU&Sf{K_h?sMWjXHJ0{%Bd=<(>+8ew+wyr)vuw=s4}sk@K04%X 
zh}rrGU~9D5I4uiJb8y*gVQBpQ^w`J#$yVgm`D{eN7v;StI&HaX(?8wTq4up<o=AOq zKyfc>%0t}UL7s@nQ04T}Q_@O4r|VZw?;6woBLJkFqKj(e;)P(bSw2x8h}tq-(>)`z z+@A)%Q9+4r-HTdAerhb^9Fh8N&+3|yk%iqi-_X0EuEV&RtncdRdm?uAnt0v3?#jOd zc)zU=HYClhHNmD$t=-||^{t=n>uOInXq&nku4)^9xUsdn#vNSgX=;-O{{g_9-lsPB zp{J=I<>}Obf=o}Ey;w`@#ekzsWT`=U1wllm>Q9e*-1DdjwX7l6VK?hlI_La=+T<wY z^6T?T7lb_)_2zZ1<bt`bjVr&QKeg`DLz5jfxSjevYtR%qFt3R2Tvnm=X!W@c&zy2c z?>BV*+8Y;refXC09BZpI_a6YyY25T7N2d!Onp>GKGW$Lv@58S^bE>oQ{#=d<{D{@@ zW|N6&suy)IAZBrfNmW7PI}zahsgaib7w<yKx1kh#MxmfO0Y`NLv}#26vqqGWY18KA zTg`g3=C+p?X>9Jo95ZRMl`M`%=I!d!=jU0?#$08`+$F6ae){#8MSbAVs#uUUh+RsH zJF$3Gt#8rQ3+5`#<;5-k_RDJX_`+!9TmAp~QyytrxrUq@kFUJ1qg2}e4_LFAb`CS5 zgJ)>bQ<|p;y;ICu4ldFt*~3|la(HO<!TmR^z4P`@P#H;Rd{2sWLSfu_5<kEY1|1N2 zf!hQt3uAH{+y}Uu#_r%YjqPO^cE&X&t{U71FGjq<o*Db4JSsTuVBct>L-Cv*riB~B zb68rxfpBAEedKft=f^b`*IFm8DR4E1C@|oVv|HNEHDTD9A$G_3V2tx--{->H++*Bc z`W%~q9N>IL?&}glXy|k9ow|nSBv9_nK%UFqThc;qAzQ}2kFjAomcCCjesd#z4njA3 zxWiI}+c<Vse$GYjW6B|T;p1cr?mvNHVXkI;klj%+b}svz3-55l+<y8TVj$e`*jaA> z*tO)eJTF`ulCR-?a#$0tx6*rKW9PVp^kvL%5r&?TpHqEdi6W2p$#BkLtJw|g#_@gR zQFg=lo#XrP^y%vq);Yq1td5M2zdL^4#K<Sc-zDR$Zv4ZEJ0|R6yqkM~(J&@-oPw`C zD;82+bHe5X|A8P{iz-F3{<B@(gP*;s>$Bu5(q`!`Vk3KqT{<Iu?Js{J*PSO%cMlGB zcMT50@NdTzoAj2nnceUg>1!W+K(3>6-6{Q=Kgw^!`{Vs`l!PG%$vx7V@!zo~RzyYm zqvNl!F4jH%D*McM^>~d8XV_BY_fs;t$mULw@TgACc}eAjX)4SjG-FJO;S~4Ms3x28 zq7l=%$YoJxb6Jr6lFro_jK|O?jX_ollg`oM8?w$=IY))h`bVbb$f-VNX0DH&k8J0e zcsG&Bhed>$ej2<kFks-Z)KFIhddFsg7_YeK7tR5L#|Z!ZM~NW4F8$zte?{uvd{;Vq z<_z(_F!NV;Nav-!(s{y@E#pu9>hPhTo&DNjmOUe#ec{Z3{W5M$!)fX*X8zKQhhaF@ zVtTx8`n?r0Wte%N<W8MGpPF&^>sZUb;QkdCUgurhA*>|spTrbzF)$@Y1}9fv`Zf7w z>ih?3?9^XR!u9nJfG^p1SaIPGc*euoXBerU3ja7Wa4o)K8szdX5*4|fSf$@cH%o7l zSLr*+b9fQ(-#76N6Ze}cd_{tVcTfM~|00d@FYLC-7a)1rNqrO!;xrtP<4!R??*0fU z-lupjb<F_B1b(MK0#n2Evv}+rpT|(*H_v?7_)U1y`IPj^ju{V=XDW5EBt_qi&^2Nk zus1;&6vY&y*oJ56wtIHs9y;d0h1c25SO<(TQH1jxcouMa>qnCym)DwknO)1iial<F z?oD@2u9De~Br;2Qlv6U-(LIzGPBAZyYO{MtE2fRuLqc{B*|CS<nKh<!IGXWg$;F|- 
zB$44|nn5iR6WS2vEGzF{J#hcN&1>(i-LkxE)7JLZTgj^f_ujXD<pcL^?7ZbOt({wF znLnoe!n^!_9;5(Yylw|^D@<Mm*&{to_TDi}z9Aj`XXzW#U1ZOjWVf{CZE53}MB%~_ zhS|yQ;zi8If<6*k<rx{3OetpM43FBCj0HzJd2TpIQibx|a<g;G71K7%&7o#o808St z=INQbFjE_yp3Z<AH-|x<ams8=YmIc~b|*-OkZY!d1nF%R*y6BdG9HwJ0?NsFTftXU z$^rTyZv4|HKak!#{;VXi-O}0TPLkkBsb+phb#({ZdV;8>-=-du{z7tD^QnD%Pd|Uh zp5ye=d|-1#5>-bPk1}S|A6SWu(ZkFm<RG{ntp|#19#Ckg12oJqQ7I=w+o<8#Uu-km z0Q~G_ECS8!g;fiBDXr_{r@CrF)>-LjO5eToG_FH_gPW0FLfM&U0?&Z49jQb)*@qjU z$RaMCdcq+o#}ZzP(q{4c(HKvUo^w=qhQ}P`UzJuwn9#y)CTdYgWZwgCKs93!K>A1z z75s=YM$Pv$wHW^sSS5Z?nM3e?(3$BA(|OVV#VfoXEA0BX9q&H%5I=rN&P|i2O<e=D z@{1h(RYPAqjLMJluf>s0G?#;iQC8IHZ~Cz|S<v|?1pKwMYYi;%GF;@Fs9GvVd(fZ6 zOJESC;1q^RIba)ZI~)Byj20er`Tau7YjhOEyIm`)^y3Q$zSt58x>_wq2)nYP!xgqC z7CJlZ`Ry)Gpr*aDF78C1&cLvfU(w&<2t``AlNXZERg2<WiO+T80%!M!8{>(F>YA!W z?8?5r!Sl`15E*R#)3(*?2VWO9B|27xFJB(sEY4>;iDUB?>Gh7|$gk<DsU3;0Ry^G8 z=>G3`Q>>#u*5s&J<S5|d-UT~U!%fB}t=sQ+izW{n^J@b`JIi_(e!I4*rsJr$@6OXt zVz(`N<To`9NxSU}w$%}PTixwLHPzDFq4k8<uzFr^66x*wR<Uu2-%%5Yvk7yl*{IfS zUR2%F7V=lJc@|^5+a2;&7CF&wQJbSdYuh|euYN0W``b6W11&*Egg3_m+Frj_FtEg8 zWRs$KX}%*8im4o7dx%X&YTM>GQZ9$t>>W#Vc)WViqpovp4TW1AE1V>+z0u+itiFvu zY*RNx^r7mI)zX{eMlNS{P;fREi2Al8+qZdr#sw$qR1USuU$(TgV}-db5ZrQilFSQ@ zpABqiBsY@HO}oNGJ=*YXchsd1Em~7IvK|pO9nHbel3wl5(T3GE35P8bU;U+l&wC$a zZ*qH^NW!ElaFR1<V4us^j5zaMy+(ID=nS~Ew&ePdDt2#oq<dL@V|nt6jgr(?{iO8o z(y0dxMLE4^JgbU2yBni(lF6aR*R33}@vJ9g55~R5Ek3uno7D0uF-cq=t)oCy!IJjI zPHi&CYK&Hg-<zy1uQBOFw@Z)y^r~FmVRzSA?bV4##f|O~mo^w)SZNJAZFPkXC#%j4 zg}hovL(p@hp!fI@T&rV)`3DH^cDSv%o&qZ4MHr*<B-aPZBKzMf5%`IGG7r*C%-El1 zgrm`|4OFLl5?gm7%7C}=R8vF6WD#fJnfJ!&ma9xC;$G|s8#aEiaKB%)T1Eb*0dh^8 zwDQ5m{ik<4YWz80*JB)M={w~#lijOGcTejx(%ujM88OY~xoYW!^UdK9%5cu-RRJVu zkea4lwWd51c{Q1+DG!9I?Sj=&V{cfmva>yhPOK}5GzSEO!&(w|lK+%Wu^yvS?0jM1 znXt}eWdppPj8U4&h{AhZH~gh@VfVZa6;HP@yTMoFq`FuUEr%bZgh8eq*`}U;A&yB? 
zQ>en=RTvTmh(B3V0r41M%!?%`Fg(RFsfdFhB>W(S3eX4C<&%RkDwXsOIF6|=%StNH z+>nM37b&vIrzM~h6SRoz?s~U5S@S--mbC2K+Bf9QZL-A3w^w)g^#O0+p2nVtNjgG0 zq~E;7K4M=uwj}YnU5gv9t9@u#Xj7{;kwi)J(%#rrm#+{#PY`#7AW2SEQS52|;dfeg z`yF%R>;2X2k?}X%MpndkcR&ew)?K%>jj(Z{wc30w`}jfE*8-Jh!Ryt#U3DI}E8g5w zg)E(5B)F=8P3aepziwo0@7wd8QhSHA9jEhzR<&cBw#MRY-(c~K+#S1tSD9`0WzQ@K zgw?1WtkK&G*{%V;zezO|aWvl$cX)EeU?@WQGs`qxkofhA5Tj-8#0pH*=Au?oQcR6n z20##iub~AGr;Mh41LDdk_@~NEAXG`IplnV-Q~l-Yb`urtXbnV*`6DN5(82bvPU4Y? zxX<tP;bACgZll>&?Cy|%$DqG$VZ2=Wfg^N7hpGx>?%Z(e`umz)-#Hx#@rK6`RS>%B z7QUGGth&IUX_m8JU;{#DJR?|HTid2um7Z@~{1C4fS{{tJSR+c~ZC3et{SD`7X<zB} zeii2>!p=uk{EJexs{hNKYeTu?J4{-C2m8Ygh0d#1C3?EccRgv8GOTWo*C-OLlg8^g zwaq^iv^;;VMT>+HmUXE`p`o+0T~90J@g9Bqy6t|QQH?xGmUZY`9^+Uw$Gy=!c{7Ag z=zy>CZ}EGO3tb3nb-65u#${#1V3zS4;tHdTp;Z|$vyk=45*%fhC1z{1vcgx)9T@*m zItn5GJn5C57`$70;(y<kex_%=udh8izqi{RcDW-Fw`=@Rs39C~aFgEmQM9*PdPtJa z=RbJs8>C7)^FDFjxwGvvRrzBJLZJoW0Xi@QJNkCW`zro0Q;c02MZfJH>OF?eL}`l8 z58fa1Cv5P<K@Py{xgft#p<`10oRq98r(Ri!wxU$Hz({{F0$U+r#QhLTz}IccgrAk9 zoznZK?k_*y{<Roe8V{q;M))2($`)$0>FDCYE2GjY()ZqclPrF7JmU%!76$(8w3%Hl zvu&qYBX)_bTc<LY*jyZ6=kjuThcjQz^OcJNtVJUd^<&>S@80n>^(~p7gg#(N?Jm}? z)32&Aciy~r*<)<;o6@7d{xxa#hMdk&K|#og$KT8Hkaltr$&hUF<aYs4kW9?_J)M0# znhH`s!z6rM$n^}vPqrMCY!W^o&6Nol%n>IB7mf{n90*dBaDOs*xF`nE5~4C(0ASn= zuz0d?dG=oC=NANRto^RASsjiC*w~?tt3Knn>+w$l%C?OQ1A;EUpv=n409oEJ$Gs@Y zT5?nr8i(HZxUcla2-#Un-VZ%bj(tYmx_7z%&-Z^kFfK3y4z<OZm#1oI@;Do9H{~U( zD>@a+0SCnt95~4L73iGW1lbpP7zxOvN-Q>2H0Z|m3&u&@$$|#79<Yf4HozPT8d)Ww z#$SE{x!Jf%W8tftissZ@&U;;TJlhiqEbOLu^6Y)_hbTDi4)bQdJPVH#<@=u)-w1># z0s|pzQNZmEyv8n3e5pE_w=u8L>1k?oT<I8)Ud{rbdPR(|U#el>51r0zRUh8vfA{{! 
z$UvyDFywN{pP-E$7v4}f`A4AhP+~cyHXv&ilk;O6c;c7<_5lc24Be+OA?yna4Ng)* z$mLqHn#7N#=Q{SS+%kW2mw5QFw)@7!O>6J$*t>31_2#bJ2Oku#+LGM5_RHMS3C&_$ z`=EHmrsU@JyE^x5+%SJrkM__ZZSSV4%>(<}_uQ~Gxupk6Jj(<wtmIzcCFoVK1lbTZ zTETeaXbl8>(HK$&?P)7%RK`0*Bm>fICy%`Rf70jHt|ASH5bcsq4Y!^qEz(mxWG`XI z$j(0L;WMvEqx@Ni)F-_ly&<)GB4p(JJ7gDrs~W%c#^i4iJRK1q=%H9)1iEH_3xOkY zIEYjFgLDts`r{)n|CWrbT_rs(Z6!xZ#qiS8(sxM>|CqE%lE$QMeWdfuYozr<zk?hm zA=Hyfc_PyG^Y2Jo0cV!kDjne&#Xi72%>0bW%q2`P<%^AmluQHEz|v?YG#52g>O;Yt zV5UI8MN@3lD+CL4L0V*ms+Q{LC=^F}!X}>}sLPVly)?>Y^-5FGONLQ%_80|L5a^l; zMn}Eb6M<;PtESKQmg4zpTCSyD9m)x&ykc4f{j)eb%t4<~ARnhA&_&f|gE2!CBCWXM zHKb61#bL}af(zf7G!dwxT8v9JLt4bS^2iK)qOe32SQ3xXO#tO52Dduh6o@wgI-w2} z5RE-Osh6%ierCt!l{c);46V4UcJ;dc%_C=1!r61me*WB9o;kI5-|6S?+%ukW5Y+?2 zXOoFD9qS)kKa%WvzB2jh;1Nl(zupIHZ5GZ)d4^w&ti>YcK_+Dmr*dPXmMoNY(Tv$b zr*1}IWr_)>w6RgAlBox9^TMOT1XSI!H7`F4k$l+_k+Mcd4f2VM!AKuq$ORaU26|{T z<5>nV4c&b-Ckwh_YOu%$=$O0$3Oa)!B_vZW<1wwy=9G<qdC=r#<-R=PrI5;M@-snW zoZgz!@iT;f>g4%ph@6@L+v^k_Cx}XVJ4J!;THpEeeVjA_EubCnZu5l~xG}|E86NjD z83RQvAv~HlWv<X>0l-9QeIHAVnr0RNAf_lu0TMYcjk>asaEWOha7v7hTC+uajM0o$ zM<HOf02n$eJ=Ey|46B&V2QadDnWn(dxiDRu6QHolpvX6-DP)+mGj9qGgzj_*2XqhC zfgHsWF-mt>n89R=nwi~l{ERdPn}!Srp<i+C>?z`Z?xghMNv_Kwz5mq{ux1$3M+mXM zcI^irTsy{|x?|tz)BE;h_uqvV_~VNGh)ee{JD5y9g@aaJbh`gw)-@=Z0FInt)HO=m zly8w#NjRmS+FsaFbfZkHuosh3aR%dYjn;tL*lsLpD#wsE7fo%99KF2oX_GsdOk1(s zslsGB-v*>GJ1}a-W)V>!j`9)UfDQdv;mJ?H+yT=BS4gHd$ZF{wm}<`x_TSqp{&?iU znu9;u8FcmZH0Hm$b}Rdb(|7JYeVVxLI8H~Lc$_Fj@@SuTQK+-iJa%5S@0iq-cwV;W ziTqi9H|iaHmq`V}sp8nEHM=vMsOBW5%IWqmo4|+gsHkK<%Lg-zh?Qj+bq*|9NlJ0j zoa(KVQZD0E36PjnOcKK?O%wqtDN!+rX&+F<9!-~EWI;4j;-l9}igJSKDnO2d2|S}N zi_pDM3EE)=h<ql1C;*fw58$9EnP#j&4O1nCh=xqQgKi|)yr^O$D}_|1pn@2X266b& zxFKOP&<nZ5p!6Fm{0dsmr4lkaA|c{Xa7S9|1i>Hhh90`@z_->fB73!~Lfb^`_CcF= z+@>nZW!sl`U%QS^6b6lUmD3&!dHaT?KV)|PA*SULHh%0V#5R|Tmv+~C$%u-WEJ3Z= z?R{os577g^U`v!9<eo*3u5$G6OnJhi`YagBu!Uly;p`VKnhBRtNQ4OmW+a>nyp&?1 z)N|0p2h*%K8qH_|bbv;{mI`R-VPFm*3a3H=Ky8showA7OB76fy0^rk8K-5wI%b-5L 
z&;vwF8;Ss4Q=T0VV_+kN&LJD@g2XUk3R9vcp%v_$s+ZY5VoC}$<YZEaonH}t+T}GJ z4IXXXa%4>-Nj4cqwp`mvRwPnuwpn+(T*htuYZmtV`R1r}G}*O#aO+dUN7@GFTg8Dp zR-C+lU*p<)#y_a@a<6*r#aSBEexZp!#BTv#U(WoQN!7}hQELBf4yU5A(blX(B@|69 z3a2V#lr~Y4To}#dEus*}a{vPHv4CBgWovP&2*jy9oLc_UXj2v<wrFN~6CJypQuZa$ zRFjxq01#ojoA4+`OfQrXa&%sHJn(&<m=-XiIy<6TCg7=fbhJz!ktuW0$CZ`iah_=U zGQhZLIS_MkGCgl0z{};-0=(%m5VlkWu(lb*U^Hi5^@4?z6to&rkz`t=#4R6G4@6#; z1ZCx3$QGK+s+oqE0b~xJvSz4+UW!ZuAZ@k^hrwf@98-{SJ%MkYNd=U()kK1|RgrLw z^mwFJ5c$2$$us^2|BtF(<_$tkWP^ZO8+%%+e$=phq_cfw*O}zYyeMpB6OE0DKw~49 z2?tGnUSOqeveM;}?k5ki0`G^-Jv_dvN2FKCeyP@YWli^rl^r|1<Vk5g&wIrlx}Joo zmW~K@(61|)?=u+%m1MAre3p_h(To#y<3?3B-l+rZVxy|;az?Ob@o-A>5*1Q_w<)uj z&I4GBRc{^Rit|*#v|x+H=!z8qsXY<w1%{(*z*Jx_a11@<cq~V;m<IVBohv^j4GNSt z!@-s^BvW~YQ~?MFmLla$roF}ZTbNAEHKa-@s{!pjt}*47md%-qH7buNvNDK2UIIbW zkDyBeDk7rRZ3T{V2uj;nxDq=E2o7PA6&ZyxX5qF9*Fz*>rFSkcE_bq#REiJ3tdBP| z5)EOR8$w1h0@x)M$6QWHlHO&{Qn}<_)<Lcz{%X=7kTtAyH=JoNi=4i-Lm*!1oY2ua z_KutNla;J=P}xu|y(;a`KiCJ}z4F3#?tcDh^!_eH+(Rax1*8?>QKDq(06zg<1$nbj zwgpcL7&(qe$`wwjUrMV$d9BfuN=z$&`FYV%A`_XEBbp+zOq0%4y+z4LnkWD#3m_yP zaLJTl$f$DEK<`wZ5dcl)8&ZyB%3{nIb3v(-DXk%;p|sfGwZU@br=)|%!J7rRzB~$p z4R#I$$g^RxGRGuY9_p(g4?p<a6M?;5(mM|g1Ud#-|M;UiM~j+$L>PZ1=;Sko2kv~} zKgjBVRif0aeAc7-Up+kfGEIG5c!56;i`sPzCY}n!D3|tQx`jZdFlEp3JTXZI6A59; zA({20-I!ZBPN`^hKy5O`7&2-Nd@9LQ$e6a;@YiQZ7nb0!+nDlDGO4Ig!$3s2OE$Ry zYJEPj#cCBe5Ye43YnfAnyKqhQ8;)Mz+|w1>(D~TlrsLaMtQM^&(XsZfuI-O>hSzm= zFBv|)X88U0$h+h|!fs!=V~@db{}*<B?;))0)8AY9{a^p!aPRk}_r78>9r&lUpOY4k z@05N}Pi3{07aWSW__tv@i6hbl;yqpJ5_zdd9ob!E4AU)T%6@SJ3YFnh>?JCeVArH_ z#59Mk5z6kN1OSi|0}ayWVxpX36xF55$|-#U5PYRnR1pX+hhn%4XOpH3e~re`T)ov* zDsK_^4zo1NVm8f1w>_T{av4ezz{m*O8P$Oyp%o*5e8Lm}(+*e&yMr{IJ9qejR}Z)L zSB1LPTB`OB)i*3?dSy{_b;HoUq<QU?p+sN%;6dpeq_?vNH=R59z`3)B4ziu@-ZdTf zlIwnX<EB6`7+QbhtK_<SI#yoiZv4f8XQlrpCC?uCMHWsOMGyZRA_koBO=Uz-AS%d! 
zX&f_JMHPH+GNMo>qK~E=;WSZ$vOvCvs0Y`(10zt#5(KLjBa{H5?4V1VRc~b&vrKY0 z&>2H?l0<26L6b)>XjHhMi&9}fZDxR5Vlu4(nNQ^cm{ud0u!b?`0d{H50m$03#NU8Q zBYY9!s+9(m)hQ`b#4BY3wgHsgPquG)kg)7Kgnekk&b`u$<NeZ$cap-@s|K#W{MId9 zt=t>h;Povn*N60DE@NqbTU&o=fm+SIAPCqahyh@#g*N_i@YW(!S*T}z!lWv~qg7eF z(T39u3U|80RL_JgN4>21Vai2;RV1|l%8`salt)hBPGOcG7K-UwU{Xo;TZx!1!?$wz zTe^Yb6}*(0$^;5@Qp`$gA)cjc3o(Clenl0Xe_08|togWHZb;{WZZK(W6|Va$=gp@m z$QX{5l$F=gp&>WU=j0Z0^h-XVnXw8J;@*r56;$!08fB%sNRdlQsHGbeN0tYnUs5#n zQyQcY+(ll0f_KA@h~eac-t8mZz1xQRhn^l9euJzb5z>QVN=Kz9&fW1$iv3;h_MvOK zw}0V^p+_$;Z%JQ)ncxU0aW|<ucgM3oVEx10J=+I+`nGqBj`ZKYXy_~crUUc)4-KCn z%y3Wl(6+wbq0Wbg9=&b&{$S(1)hmt-J&ZMBAqzgD7~tQ+KAz9~C)GizH0TA41Cv@N zlus%YQp#Nz9<7vdm!xXKsW~$hju-$9N|thGbq`;5llx=`avDfEAr_{SpmtcTbc}L7 z1!$UKRA2((2!*Flg>xmqQn0^tAr+?!49E0ldo^8zBxRdSIshtp3_R3EZ(yL-kJ|Hn zbL0_WL%IZVdaBTP%;AnDF<?Rnhp`qgmu+Fx-jx+mYRP0^*F`cq_1YlwhOvlBf$T(t zPHJ!~Io}kkTU8g@JF?}T*V`Aoym!~rp@#mGJ4Q|%AC7qjNa6Q?OROipOUAbM^bHR7 z_6&_b)pA9!+Mn99?%Y^chIn54>fo09+qc|2^vrj6jXbK_IrPx$()&lG*Z%8yc@0)d z8x<$`Bd}`LF@p>wk8~x~>FUEN!%LL)rQnCp0nC&*REn}%!r;&wrNSbM3qb%CB4j$3 zR-(1$0R}lTA5<DM1r(0SRFxrJ6vto2m~qGCoxmE7m4?bG+*3hWn}Ot43E={r=VYY> z1VH{<2MP-E2{jL&g{WAMKURx~a@y3XMGLpW5GUWqx1X-<jZ1?gw#fFHVS95=d7VAp z9ctXBci*%LG%{GdYfw5YRMvS8EL!`3Y42yWBr=?IohGW?!<&vu74=?@eX(=T$?+x} zH68dT7vH&eN1s=JaMR{tTXn;x$SZ@FdmcX{JtO_PZ(DOg*TXgTg~8UArR)<AckTQw za2XvD6+`?GwB|5#F9V%;G@xXBz+0G4ta3GAir8Y5Mm}gJBoNr8R4<JdshAM-LRAqZ zoGKp(CmT%_Wo-lc>@|Hpy{3nTlgZc9!+aZ5L~3~|qG$`FNxmVYDfUok>nX8Cy<bq> zbRM-jISm<oepZz>QOzahM#)NOgrEXY2&veN44_H&WmA&u82Q3Zm)k@Zt|xCjaPakw zq}$~+O2wL?dsh$KyIm)_agl!t=g12ldu-En`+k1r?tzUS^Sss753K##1MBuRua*vb z?ByO~0y0XEbWmaD4>Ad4{D9Sr7AToAN?YfJQ}LJ5F`!gTOoy`glnTnC>m>R?`W>}r zNoo=UVsXrjXEC#oY$OT7LZrku7~V9=RIV|jb;y9oGh_;i003FJo4_O)k!Y4O%+nx; zP%donXVAjGVA_a|1x{r#Z{lxHceu_GyJCNvHL&XLK4%~%>dY?jk8YJdU~JhD8?G3( z#s;fKtbCt~?;9EEv-kYl)~)vKkBmo#clj&Ch^gq$dJn238(M~5W%rFX@Bg3i>w#dZ zGo3qM7;nZ=A_dqqU3h~(2V0yIT5-xI!=`Z}9tQX^VIQKbl*_I*lP&<2O4%umfe9U= 
z7}Zd*C;SBeL%$>%WRXCa2Y*~=CZVo4KQ6u7@vd~BoebbF%ZD6WZxT1(zk=`7uWujT zZP`3x_e#ur9q&u`b&wnW)bVE`@g&gYMFbC})7viCg&zJ8RE{Miz+?hcj!-I@ejs!- zJQ~XiSzLJ3lRW|*hapj3-4@w2glHuybs<`bU=ltM3{<gBRpy^~^36XtGM1oPpK2CU zbx)>UWgn!9p2U}9_04sr;IRerXM_B?Q2uO`KNrcLP4eer`kBFJ(QQSyc?6K?WU48d zT9iyRCQ}W`)WT$HK{8dJWa_m=jjp=-1q}-qH7#zQJL5m{$(K^~`6R8S(s4dos2jec zDDVWLl)N)(PH8ZHN2}BOg7kaZ>;YQl<O2~cRJ{UqqYe!Ykvgb^VNv=kSn9T?z8$Lx zhC8cqM!=0<kL8BMhzOUtIcUtH_~F${P_7_TGrqn1zD`#=iZIj}n+C0kxsCVSQ@u=| zIJau(?&I4|Z1ncDtVjgc#oBw?V{;J)VB)oc8kKT*3ncoJsqNBpYYF$tfHeG`-F}qp zncpzi5n^M3wbJj2KjYdt{+YzuyN-?w9o@9zON%^0f8O=mMg+gwEqdLt>Xxwg{+bn4 zSpS_&o8aP|$b0B!w=$VB=vQfL5ky4<rR75j&B=ulS|%%@WnR!P2P_g8lPRVjrBoa1 z4yUdNrxz~8J(27^i5lE<MT8QwE820-;%F)qCaJ4v2v}-vG~FX(NXiHBch8-Bl1b0) zo_mtwjDPK^{aKuuE2hFvrn<${l~2m!LJS?pgeJx{o!Xim$A~HNBs8AXoF`BGr03`H z?Ago3RL7I)nB#*~1x_n}{!MyXkt&_5Pc3;et(fycs_MxTrFTC+LeHmT6{=K)s87Y8 zOt)!2NG+Sbq(~Lx7SEGuCC&?YRN2obez#CeEqF4m!7per{HxF3CO=70rAk%dN!3rL zn{daX=}U@K9-d@>GOa86Ag$4=a9dhXs!|H^n4qXK1yB6M`g1yUO{!2l<}6%bf<Ht& z<}Y1>1B^-1G4xWy<fRsR3FYalrng20!|4ubNoFvN9yzfyTqu8})M#~fXQ9VGCmgGY zC#$L#G%RXb(z2|r<MJzK?3Ir@B=yA;56Ht4iaq|)|Nm#;xpc2F=_|Ui+3F$fM)22^ zOqCf@M(idNlqhIaiK$PU2?{7xsvuLzHCU&Q(5Z=(f6VCOT$T=<agS25lk{m7f^gsi zy+Tmk^=EAkuYIEDE;Xyt{2qA<%3Xp=$oV5Z+6|X@jwHoNuR%qJbG-J>KkIV@)ekXT zrTe{_&DkwzbN+~e)5<-lzowKVbZr(@MDvrMsEO`Beyk<A(&@x=TJ<x$@PuAC$qP@b zg)@?xh&dk}qJ3IWJ)u`UE2!uY&|t#syznA_gnt;>N1Y4|@@dYAy@U{JsA_4mBMRlL zm>P|s$fbEOQ!^~2Jl$LxEjAY~>$@pSG+l^m*q(**W-W9=itvyiRqNqh@lc<W-)Nu? 
zC)jeWrWojyjpd-bipYIsR^R@+p)EUH(tGde|IpNZ*8Mxzxz^5i^zXe#m`>*w5<9d@ z`q2)x8d`2uA29+Wgx^XplEB+qElfQB*9Y0@Pgxr<YKthgD@4Q>Ff`J{W^u<6&WFX+ z!?fTN6A%(6Vgr%eJ5aY$dT4-nEskh#O-uLP17zj12s+MNe^+12>R^~`8Gh6d3>v>z zS6{odc7zBU$(BD$4@&Q>y?x7WoT0GWgx`CIKS!$-z`!{1bNDr+7lCog#$LIxJ4>Y_ zDi_INLUJtQWU3%XhjSal5$UOZ;&FEG>D|01FtoDuOuo0NYtaK=J@EC#oy|7U_VUt> z%@NCOpQ&Ee7>`)AdbQL#s0{|qKiJxP?SX9#pIvu@$DW99T-<+P?+wj2#S<32ZB64q zmw($`kY)Y333;vCFT4U-@)^hj*=nK)mEhp#po$veaA>j)$9HTSO*HT5Y<>TYm)gJd z=%%ib#!#Ms?g&jg_-F2Vpyg}ZkMy=(VBY`LmLETGXz%<^8nW6sUzRmq;a`XS9g+P6 zE_}p<Fzo#>S>#F2!gV4&L>l<_q}A_8rx51ULKc!@3~RseTXvIT7b3{3m<;T9@L3_8 z95O3(Ylv|0hbdkfRml=sMxmlur(l3t2=51%q&7zdHn1#nQWLIhr)@veu{M6~sZHIx zxAm+w^Zg$-_3c<fUkSg(Yw=r!%qk{RKz|GN2l`t|`L|%S6JDB-22-kNMzGVL6Yz5> zyO=gXQq*OQI@DrkvO`B!i?TgSN2Nn1&BFme<z*$@DbQ8uYti+`d(nti2*Yy1BjwZ+ z@40n2xH@@l-O1L)gSS}r5@+~(#ius)?Alg$tq-p~a%1qeSlceA+rK>#ZtC%lx%+l3 z&90%eU9nnm03P~{Oh!fL0o|6)L-s79B?wkPONG3Ga7uh>RFQ=q`jTKeq6X7Z$m$jK z8o}z2VP&roYX?}J6s*+ilVX!;J(UPW>N?GEJc6GwBL!xI5|SW`S%jJSjql%>7+*&x zaHef-lIwhaQ}52_!#3_t(moc$t2fZaSTFT{xS?-H%fg=Zm<srR`@~#0WGaFGYw2td z$4zHrq5eUbO0Y=sY>L7u^OUIsHg0ncosb!BU}`vlBOyiLp3lxnu*qvjiIAJliH9o~ zYa_?7r_5B20-h0-RaT~RNIVDj{!g44yNSJfd>#4y$IOq9OW(eSw0|hr*e6NnIQgV` z5|B9dJ^p&gxp}Ziq?EEfLXJ$g!{z%DH4V!aAlWp`1HlXsW#Ph26ARuPZ_FD~da7<v z4-(a=1P)EwC&H?s(AE;tx_4K^ySkT6^tE+u{LLF4zMhO;U=m+i8E7P|WBhP$+q&fk zN9X5ozvo#Q_U|+IvV;5zM2BjiS!R?R-EmskKQIx{ppZR`(9x-jEV~P-2b#V{TxsEA z1}{7!3O4rc;>Bcsv-AwR@W)r3M4H9vFOI!VLjNqwcOmvg{uFT4g*ti}Sj+GmSjpU% z(k5FpItncm^G@3fA+qJl`e!b6A6eiE7t^Q-muAqdW>lRWg~~ip7?J7Hs;L!_i^{8u zw}B7GqO{;2I1K(7l<`Cy{|qrp!hknror^c&zs0{?yb=E`{^c+EQ_^dpq0sYlheFaj zf9yfUyy7{7p`XqfnnN^i_ejT{@A)akPZiTC?cw_Pqo5gH=5q9IS_9AZE}8-9g(8Ga z;0D2@hOYz-Q=?QLJVyh0f-4rF@X0)Yh5M7%mH-J*`fUjp5}uW|Fgm|u7s=*sM3tIQ zC}omEaB>sgIGoUa7<Dm<@(~CpblnjqPsJ4Q5e1h(V*=U?0x^|`s(lStl7J_$z~gLK z$i|zOdT$(RICTeaX-^8~Hg{uPZ+qMJo9Bg&hn!mToScY-HJXenHSZL)T2IK;pIrHF zmwnN`hRgH;U&O0rgI-psG&wJ`IDIxpNNlP}Y+Rt8!#bStPU#Kn%13k5>KvCyuC05T 
zRht9Okc;i*4JECKhHht1JBvp54;Wdgoqa$gYZlgB8@KbUF~ABgy;`Nxq92#WXciPo zqphi>%|u#q+v)-h4T~0hxU`3F2zIhIwK}2^5hw1@SKG}k{<wCbnlvW2s^Qp??&Z;F z1>RcVR^-B2PUIeijjs~qy*OJ88;ilAz3(J2p=<+qv|SW9oteUNE|euMUKz8J!{O%U zaIm$ZwPnSoj@DaNHpd=Xr&izS*%Az}W?Lve;ATT^mn!#n+?LkWt&8VXH;x}A`TY8} z_M2|(Y+s!t{c6`EthxOL_p;E)`aoddi;*@TJEyqbZKV5)N4(Kd!4KV=_GNEoI!F!m z_-A07jzPYHk00_?#+Y9m1yToOpLD<vq=v_c38xKijEQEI-kNA?PB^ttHlNcdBn>38 z%jF1}aF%L>#k3!L6tPCIPg0WZ^V2=zD}hnRFZ=!d0X!L2%X#2pOS3mE71QnbwmAE3 zv24+V{6<4d=p(E$1+^lwCS#N1N^D>dG_Ir`H%;u?PV7&A81#vu&YX-eCrYhE^NeXX zH4ZH`r105pJcbsEvLSP^fvV0n*`Z~HuVW!D*BjH&6u>>1GCT*KuBaj9N~Y!*NA+Cw zk~V6_6AUR8c+X^zj;z}v<NyLCq{j&dBA6<YU8cSx6=fpfBLSJuXM(*VYlfm0YmQ!{ z2^>C>4ePo8Gs)ZAX2$mLwKs;Yy>&A7DY^f-lXv@;)Hw%MUG?@=n+}|O?ye12tx7NT zmn`*9M*s8#!!rYbUdsu2nhi5zfpQ(@A=31@n(=?CBZ#6pUHkb%50b%aZ>`xhxBqk1 z8z`+&0AEtVHpIvHfY0^FbnGVzCX+|GAPgW^%lJYQw<hA%E)S13X3f`rIhl(Zvd!VN zg_>BeLtNEM)QwJgN|{LgaHCb(Z&hNt4&ScGe!D_U_u(4|9i_r6u%6{uhjfkwOH*Yy zmaFt5`ZbxVGafTkT-HqwuP_|vj0+lD+OMYH`i#f(93@RlT9?spj3Mn=OcyhhgNJlq zgNIyh_%2hXEl<q9qK}SfG-fJFnkbg|4QV$lhcK$84VU4s*>Kze6>;$r`lNIoxL(F+ zZ<nPGFO5l}D=hoAE{@TG;d?SrC+mo)hOv#(OXMKY8dAU(*srRn95xZyg9o9ND)4~D z!a)#+xm@wI^v4s|{_A#^KKH~&0(=S<?s+&H<@3w6w~pOPQGw;#_g@=$_>b#v<2LMC zhBlDxv3C#-lt0%}RHS=#|5EPJ+x?GhJn-Y^_OCxKebDm7F4R-HI3VcFf3H|;89Se| zFN3I{Q)G<B4iyw3@Xm3~Ez+st?`=Ui(}`PbyX6p~CY4UPWbi7fsTC3b?)|?l>%SFo zMrUODVi#P38NS&C(A)-ykICp5Y8#<V<J`)(GGS^TSt2{ADcvZDj^>KGJcI<m^)?Y> zhKNS4Q9c)!Z7x*jQyQtxw?>vrqFFqNis|{llkV&{*hRE7P6_fbMKB!<`mu?;!Hiap zN7EuyDWlU;Duh18Fw%(WRiIF4LU&pAp3B7a;%UTN3&hjS2LXW5pxcni3&W<LOs_)1 zVSSE5m7l0@Sa!t<ia(bbj&TJ`<XEr8@TDNLh9aFSNRh5_!u1k1e8<8pp12{;n8~NZ zs7EGU8lXZfq3CoGx$sZK|Iq@RlV~!9R!C@-X-NWHgCdp>j)QFWJaxGu=>y7}T$z)W z4x~A4iUH}J6FrwzFPtPGyc|5(-+4v9j48DTpUno2IS@ED_5wvIZ?SiN(B^F_hqa`* zn4*rxYE?%Um-w2$(RMYj))}q#f+FMctIL+LA|lM(`8luA_`25+eE8K9B*z0ST8=im z;_~Tev!Bt}J7r<(mDk4_zi?wf7;g~>#T+CK1UGH3Z}|KMwkQ1cK1;ywD{;FD^3>tO zeYKl=DJ^1|?UJ2;Op!qpdW888$Ze3zQ%*Tn77p7C@|=SiEo8a0252S#%|=V*XvR^m 
z&l3WI6(Vvr%e-iuP*i4C!R*Sz0AG?7DBW4xtuvZ&mrzV`BRDNn;--fs)Yb=|TNpT! z4FgS6Ph3e5I5LN><uQ$*r4I5`82t!sA-cf%hO{+@B0^~h)F&T_s$d;y6^%5|AdJl{ zD=`0$wzrRusyg?@_u8}PC6mcy-Y1jEWHOmdhGCctGs9#EA>@S+LPQ8LjWL7>7!gsx zlu}A5MNBDD4n>M0MQUkFIhJd6_GF}hXtACiwN|a=ia+b6UXN|Pw!I!}&pGvlZ0`42 zdjdi2IrskYYd?~iz4z?B_OqV#tmpOnOw9B|mLGhkYMS5G8wxfMUi~|xR@5E+Jrkeq zYWT~MAI(1ev%6!KL}sv~?82HYzdt6u_PTC`?)BGZ?R$|bJD+vCnRFbfe|C0iATl4R zf35>B>@yrXuFTE6u;aM$f%Ek8e15s+jbA7i&Yj!0_XW^-)8(AVC~y*Tl_c#yD<=|p zpyhGdOTrzxz~9S>q#y_7M79(mDbii;4NZv<jgS^uhD3`(zHlKeGG`BCIZ*ZmPecIu z^4J!uLnMAH5joG5`SBa9mx+wx$w`uwojyTARsJGE0D1bE$D<gtI4oG>|hCh}zB zvy&!=`}+Esa8=^u_|m??l}MiCd>K(ZI*-;58>)kK3OO_Jr6!ec<B0@0R1rsW0q|uu zk;+%A@%HJ-T$C8Ji8E-<jnCY1oQM0W4Xi4qC}>8B3Ocar{CL)TKI=;sxDHu6S>zvE z!G<j=`?jcg-6hEaya0@loSx@(>nUN-s3-O{l9i2+X8UCkM<GopU)fp7d`7)dIw^Nz zy2>d@p`sEKVYQAq-JTknFla@N5DpowKLIRhZFowgl@9A|a<mDD+pT$oI2+q&s%BC! z4sAFavKgy913^(iZfdjIeqyx?E^nsMNxzF)^W_l-8b{!BwQZ~{R7?2*BoS-yOOl9J zDO0pYGGSCAgk7b~rq+z;V7XAu2?Qk5#ZeWEfU`YZn_(_q@XwP}r2qH>UlPl)B&4H% z_-faImIKc@%KJT`@SzvKAAThFtHUkd?p}ZQU8`At?$PkEmkx$PWqm%Ua^@eDhJ7z! z2|fEPd*t#uIYX=I@zLvccJ6#D_niQ<ydR7sb}GeQ-`%;Zw__H&HHL4#c;J2Iy};4` zrhIGrzRL@X^GesHKWpC5Y=<|b6xcPnQt~apyM(PUS4<p%I1{(Pk=KYE>w*Q2%q(PU z6}C2-czi`@-Uyf$&`gl^SYBg>B2x*zO;6Z3Dn{*P1c~f97Cd(j<ulK~Rk@j&;rH4p zcG7ak@$ok%4k_=GeSq<4pTYP_;dza~MtDfbRS`c|@ST`;?07O~$4Im86uEfl$PDmk zBL&bWjhB+5uVjft@U1~*u?7oercMY@Y~aYB#=_Q^?cf(Vg7lCRMrWnAdTKphMou>1 z%_`jS8T>spRS>mE-7{MsvsgXl)nOdX+9^wozcp-DGO{vM(nqiasEFoZ7qqj6Q9MrB zt@RcSc@#O)4AmXDvgh_)x3Fp6U1c*e75j)pXTEmF9bY>G=k;F-pRyyYF+Jk7)rO}! 
z_d!NpcTE$V7ev6?=A%nbecfa&QmHt9K9N>=zhgcVU%2gH*VmJ`vA8m|$ZXUSscV~m zxb!5|y>?wXr0>=|1%GD<F{THEoRcVAoXQ7OdZOyX$#)#56{{CNKp31TuHzdRQG6OG zUKh`nP0qpxTaZV@qLP^<$L!Xi8MY8S;1-J<1<ead>e#>JAZ6Z{L!*|XVSFmq(HaE= zj_oIgqEef?%%{STz=uJ#=!D-u$}XH3<WAtUb-ASipH2ML99$LIbYKaQ?)@j09M}|D zHRr`Iol)KgNjJUNJYH5{L-9d;`1P-q3!0~t-?iP8QE8UBsZIGc3$NUu^5_><-c(>v zFfM0QPPac7W%$~mO0+C?CZZeoP+dUK<oFuO+ERsOl&wwXz2?ed)W#rBj<2gJpRumI zD#<7(jHe-FOOtu%0i;G^yQid5Bv_Lz0ygF=PcA0e=(T1`kWLDs$!=~gda(>2W0Juc zAb~0C5k`Fso?%B7?O#`|OTf*#1ND^qLX@ouqj?08mC$Tzt0u44GdX4`WzQfUkvl^x z_sqk}$1Wx9R>psPMw+<r6N~x%Z1#PXsK0;cq2G&x69k0(VZs*po13Kr@Ewt_9Rsjq zjlk{@G*}|&z0O_k%soeB4$Dldn=RNK-kaF39Dc;n)-`M{30ZVjmqBux*)z|e8+%EY z+w$J{lgi6Mmb>8wZ>!0v(?#s`oK=@DNc$xqCm~BU7r;ER>@{(Dq3WlDQy*Ap$VmXG zCR`w-um)jBh0S7hofiL+D;M8kH1P#Z3NFMy1`5pxBm23<W-n`rvb`}cW4<a^TmP}4 zhFd~i{z@@!@HkPNWbrrxrE3mEjV7nX<f!$Fj$JupY!IU_brnmFQfZ*9rDuOP+mY~; ziNT03(ea&IH$RlNEbQ^>oF<bm(2(j6u6{H<+hk}mi!-IL(Q(&4<64$#^tKvZct;z( z<8sLIKWld&-}&1@PQp9pd{C9Rbj-O!Q^=zN30aNg!r~ZZU18B8)7f}F>rUoj!%&Vu zD49$0jXF7%HtHnbsIls?T~l_u(K1$L)MFJ>4pJb=vzm>F!^=e;EL|aqPeEj93bI+k zM**5<)^4SgXP-@`tPsjDC+K?e<JmF@1kxur$e4mnKsXdU`ooZx3t>qv^g0%F*j(Z! 
zdwD0^a3$ilkrA&WX1w9MH<p-UI8<WV`nn;;>}AHKM}`nU_1Y{x+mr03m$W*2po2-e zP3N{*Vm7aN>noc@!#uxJFTJIF_Yu;Q&_NvdArBSsLvBL1+3yKIX5R)OQSS@uh0576 zOw^_WyaQ4I`~ZVkPu|2Zd?o5>NN!3mC1?-28l$d-yHd}?X>>LDAqkk<jq^OJ#eFI; z0FW1$N}^s6PR}Cs=As@F4L5=X1@|S2?bDfXTI&Xk;T#PP)Mv;&3N%9k9)Yz*pXmu1 zcu?_+%dNxdmMPV|50c4+99aAx)P_X{9IskEp-R$t>9y$paFeEL(Q^%sSloCbYJs0n zW+GmQ-Wo@N83IoM?vA@T@y8ET=h>JudY;n}WtAwSRsBb}Ovh)K2nD!tv|Ppt3L~ng zrwjN`(0kQEF+U0+(<tb+dYW*&3b}L1kAVR?+_+jyUxgpiP%1ug89ziEPEEjvlzTn} zAHr@b;D;!&0)9xtbb=3&_7&hm#t(D&5Vl>}@tN=;%I`O9WIh!>WKPp0_>dVrD^l~y z|5Nyon};W%hp;B)@lT_NFx&qPdI%f17VsgAar}^~SWN*xq)cn_5PC?Uwg4V7U3&#Q z#KEycy1axPQsiYScu3Qt7{Eiy9n4uHzHk{nB(NzOU*KI?p}gHa*8%vDUDEdeAHoPf zB<gYx8Up?Teu&Rf%JD-`f6?|_GXsK%e%AHro)rWj@`VEpnV1-kte>mwSH21S5amA} zYlDO!{E*cxqqp0p6)%||jdgMSkelaZyskw3s-3I1hK`QkQx)m}e2A|?i3i<Av(&`C z#_>a{>!ZFYRVEkkLyCYOa?Mx6?TLm_XgWiU>?y<tW*lXffB^s^!-1;-M3`6|D~r~L z>>GjRvb#w#SQ{qrL%Q3qBlr+Wuh&bL5kyRuYJbU8!Vf76G`oQxa>~LA03z*aSG>aA z>_`73qj?g5NKLpphSsi8?*T30hnRy16;odPm7X$!pz6EUt2@d}H0&sJl$nfFFWyw> z8nX!7xVs|Qf56#yHMHPBFs=L|I0YSIL^OzrREzG0S!n_<yz1F8e82K+b;V7<<vRn2 zazg-5?gVo+xS`H6#c+G5YtEw6?3me;h<8%Am%9em-VwZdELXL5Ik*n3U3T|Rw00o` z{Hw<_Yx0=jR3tf4C>i1sfIE+9sEER=i3lX%pg+?_a21i>>qe&*ZIC*(bOw+M7}(TI zomy70Ws93`PI+}6wN*>c)ua9Fwk(a^4K2Y$40{#zW%=Zor%xU;0!Y-@@xSW8GN}oW zc(L3(#eCYl^0vsF>@FP(TYP@B&fa6BCM&v;tH(E~p~`zNexjkuC&x8q(zyPs&MA{g zrxBal&>f}R=H1@3`Wu_qwFT7fC^1u_{r;<m^LIL-kgafq$L+Aku5cO}U2*-WKJ{5G zPbSetW4|rkCpOHcz9!MWnbg;$bmZZc$wNbH@A$+hcn=ew@iUs3?81{LueY*EBl%>f zlF2j>dtVs6eIB}z97gcY6SaMUsV8?H=?deXlY3OCJw~p)@&8kYk*fxA^grn$67TE1 znhO7S9YZkI$#Y|aJ#uBFpV=9NBC^<fuNqk7)^s1be|+8?9_r|7PAUIB^2p7J2bZs1 zclD4y)#_uSwaiuYs)5hFVe%-TCm<~X+TW);YpHcv!D{Q@16^g?KHXVGOtd$pySHx} zKD7VlPr<N$Mwcb(rZsS6c>j->{nOa2;AQCziEtV_)LOwj>Tu>lBNmb5!m~ms$%SPN z@vQ!QUgY*yPUMa(5!HS<y-2FEUW;UkBGsr!Sp!I{3W(h(B>T8a3`<8oJ~If<u_!#J ztkU#rM=@i?!Xjk!KuOPwdSMx!hGYUGFFQp7GB5c-s)3Q`Ad_~E;%*d>ky8!da(Z5C zs4s$EE+r>Y4qy^9C!k~TAIonPJ<2(}>xlA{Zq0`WbU*9=@W4_%->b~ZD)GpbF-SB9 
zxcF!cNPkD_1<E053<8V<i~+tM#J}jNMf|Bndj3@M*deEw5D;;0Vhky3kkD(>_>bi_ zZd3-vZ;h{GF%+#}82ul-wN!_$ivklomo7?1)MsW;=goZ%vJswgHyP2klw6L_(<n+4 zL1|D5#o6R?5K?jwUy;O8<#*S>GpS9<RXCu*Fp<jcJ1+-Noh<8(@5o|3V#a#k#I)c| zFh5C5m5<Mw&yP0<;+T(Owo~!W7TJtfQT-J%WxDGL{lE$D9~?Z{=_v4~*oz?hjW}Ei zun9N?ogg$w?I7qS=;+7>uMGz958q%ZlP^vGVPB)BoY>E(XD1^6VwUDbIhTqZIB;Ju zSmg*jkx>>N8#Wp@G(U|0HE%E)UubN5)?nDstXbbZ+O4dKXZCE<S#n3wHr|jhO0&1H z9vnZrrBN~@fwyB!6iJO+&T_i?FI^M`<UZA*#ye*PIb{b(WJCm=%)!b69epYt%RSag zm5v}dWs8I(ma4OH3QcgpOO!9V&&ysvQ5ESnVvMND<KCRvg<^bnbBa>%TekUdA(b@h zsgS1>-g7|?hj>JS$Q5)V{tR-tue94lKxl$cfX5c$FNMiKHE8UEd+)t>#&QEPe;Xb) z8kf&FivQ_jWi#EyCvNo`kPb+Md1OnY)|fD$#q1TTFk@S*UzFUn=Bdt}9m>xIb}3)r zwKm}~A(tYWOaXtTiPxs5<eB(vB3>&34eZH$wI^+YlLp>#Sj&0M9R3Yn9ehKCkY(q_ zvyJDmSmltB%2#utMq;xNvT|~b9JHj7oZIzD!Y!Ewra}=g`D@_nmP3u8i<wFpry&Q9 z{5EwqFqa@#Pl0O@+0)#M?XglZlx7aDb>o`p2@N-iKgmf>6zYNR3KrJiW5ZfP!>nHD zI5<#Zp3GSd(ex(2$LL$LFgRt#XR)Hgp4l3-i<#9AivM;sqbWZQvi6RW;Z5EOW$C9` z^XFRD+Xh|~`5HHCGlCf%As!WS38MTA5P51cR}O*$$%DdUSz9t+qfL~TAbh2NCqJu! zK4w`;o`rKHf@a6Fne*_2$zfXACE&SmhDaU;LP0JQrjMBha#Lgo<|LWShIuU-*6KTQ z6Q~}}5GCWFw!u19UK5RR|E(6@-E3`xJPTf3*#KymY_XMcp35nljI*m1i&nL2eJp%= z8I$%(K}|AiKa-iom2ZSL9sKs;ZMjEOg2!HQ?>IXYPON@r$Eub^qgdSoZO^K!UGr|Y z@>k=5@8#|q-QHTDvU%Rf>iW)I+ZQ{OqnjJ`llWUa+Rf>jR+^;E+D2f<wFpO$0Rg(^ zQS||w=Ufvpat4IMW)h1N69dB_R~-Y9glgdG<b8&+CQupzfY(+g&-1ioAZ4JMlDvy5 zv;gHqP`PpmRy~zchEO({(|UOrktc<tpbS+Rs8~lHf?STY7SNW0(;3?sk`q(p3vaS! 
z8;Ouu*HECX5P<a=@&njiDN5}^t7ZrTQuN_c15v3M3Xz;zR07qay%T!F)@!TWE=^bc z;81F$b=6y4ZO7&e^@U&SS#Jm@jxRrFek#^HD^gKnv-sP-zPSIPNYE29H(3xuSjlv2 z*J{r7ZEf;d#=m#l^34l(^rcIcWe-NoS+>igEPnGZ?V%>Cqp73)$i1U4X<muPbmowk ztzHH4a^LQD%~8!u;M#e@HNp=I_yMz0`DVTF6)ffHR6o&7ih<Ca0uNGBCf}kcxg_VL zSgqj17Q?a&<JrJ@xdIL|#K|fw*)C)S$Y#h5b6oDip9&88AWs9CuLYUA;J-yA#cv{) z?OdeG{zNFU*=uHZ%qN0EX4G<->TIS6CNtZG+DJP;2j5x{n@eyi1434imz6e)aQBeg z1|AwvR!srUr1YeS5!FF*oWXB&A^HmcjaAR=1iK;jOOf`zc?PS+FsEn!-KP&O@AYkO zUnS`hkxur?*83Os_ihX6!XZzMh<0Fou`SAfFR3W)yxliN`TeP~r`gJ?bbzt7TjPgU z9S*zi{Q-+ceZ?k^JJ#8|cku&nEdG-6QsbO-Rhdy2jj|!F_67%HS4QOs>(D2%X+ga7 z+@<Z}Ug;rJn9oH0OKvLhcMY#j&DHQcuo?@f4gteONJ9UPT$#a)2)v?2KzSYqfr+1& znI0Pw)!Exf9|*)rR0JyU+c*zt#4&lisxp8at+8Uu)M;EYBp?~`wWy<@s1MR7c(Ioq zUP{VuK(Zo&5Y%HG;6Hko-suPjk=Y1Q0Vf-0>mXts1nj~x!Tup8ip*jU7)`b1I*Z}Q z^}QKAONi%u;h`?ScJl+@G(XV$zc(9MXxlc;`sSrm%S>85s)!`5!({9E`r_p)HHL_n zDSP%0-!MxI>^z3rq3^ScCT~Rf)&C@3g&w86p<Rn!wu^;lg`5kNBEW@mRTyZ^3n+Ka z#VZ(H*W$fHT&HS;GZ^6=k{7Z1He8ZV7_SXMDu9$6v;?XYmF`YQ41?=X#guXisqN#b zr9My?+7ngwBC-a7D|evyyn`p7ceLOs1`cy#5OEnWPQcjth3X!-1Uzm)Qu+y|vq$P0 zDCteUwgYl_0pXFLMhP0j^h6*S=I-DyIw5f7isM>p)q@ZZup&4z$@CBsLGi_n>qus( z?W79_sm>6NMSHLgL6C<eYkGu@-;`>qw}`oJ<par*9vMm7q+B;MNS68$CVnT?l(sQ` zK#Pmxp$4rf<JlvIEdyfd_+R2tAdcXF_$q0EK7t$(I;9M-V|1$U^`b5^JEFX_`5|>~ z!pcX;r#~Px2`7bIkmd$BJ7JaZni3R_By&N&eFdRW=Q5<AHG-*0mbXC6q%vF?n_f`H z=ESqn^Ku#HWhQF8qQuK(5Hh(aulkQ7moZz$Bk|;*&sn*uZbh0A>FIjR0rYgzLy+7i zmr<HXh9c^6RU@4Ddc?@H990A50zfaPFkAxy>EyBp6`g=ZMvWY~=piM@bkt9pm;&C5 zdItXJFbVpU74pTZ*&6jXm{N_W_F2M-@VtS+L-=Enc6XnX%=lCuXg5ls)lJi5#~-?I zI94B3{>*d{3kpk~+uVCV5v~wD8kaP^tp;PtZg!3@IIcXwc|Qo=w}JPsqqrs2Mr0TA zye%-cteFSu7MBv0iWh)Nq;SZU6b1nb195(J-7HK1%l5{zw(~ruf%u2Ok<$b+aQw-f zjnmLZRXh;iAmaj%u~Q}E=Rn2<AlnH;Wi*j$nMb0c)0&$#KZDRQa29|)AlF`3AR!3% zU+}QgTA$(2t`lrbD6H_|>$&HhGO9Q$N1=#>A>-M$)SR?M%JzW6%$fD;GnN7?r$_4N zvu&AhSrp(phkJf*ba#HfLpjx5!T#7CX^_x<`p)$UF{12l?%02YyjSydop~->o$(so z6MSt*wkpqed2NC5KMykS_xhZ9ppM-)*9iFo9@o4I9v`3>C~bt<Yg2^EMROpvV-8FN 
z&DCP+#2f^P!xO~eNYR9W0Y;YnLC)kTNu~o7mRd?>3^D_s*Cca{V_h&Lv*#R)ZsaJz zY#;{#sUd~bTzL>v!4PNzND8JPUo>N8b30APe4sycP@)|2EiKSrLR0%FtnbIZnst$8 z^4Uy`hy;Z~Cis}N6Er*|kq#@4riCDoQ1FJW6|ic2UNl%8h*v_<nTq;fwkVvCEYj)% zmp2A%Qi*6#8NGF5U!<|fqFdSdQ-j+a*%*GnVEotSS&a}jCQqVoR!R!rXLE+#8l7HZ zN0ooQGDoxo09h@Hd=Xq+R5g5KBphJYf>>e=k%tYXCbRM&P0=6IUHx%$PyMgG?lQ(` zpXgQIkY3jIp++8ARCdf0x3}#)-e)JIK;+j(<qG{fWByXXfE^Uba#iP1XHdY_%NvY> z6*B~gJ!&0NNN|S6DEpBJk+ZvjAxBD}ERb1KirBNJ*jDDDd4K_3f;lh}dK$j^k#mL+ zf40IVYYDOd8?1(AgX-ZqyJrF_vj@>|7^T(qGw=bmLH0_<Y?}5&eB5A3X><+D4Ea2A z@q@5M<FvWP|1!MZkdCBd+Nj6f`tfT%lb5xSDE50?Qm?z}#E``H=*ZE_gf{T3L30@0 z-@l>C4Kh4(#XQHZ7+xHn;UHUatenFqmb*x$2vNQ#fK?^{EP)N>0wvUfnygz*i;-06 z0BAYcO=Uu;iq~4lj216XTas)!5%FB&NGp^K*+FT<fG!G<q(w9kyOjnno55SFuZ*Z) zDga4JN}QgFC;dSYjN(r9HY(rkM4jJT%5O7^%hSyX(OSm7wXob#Bn6ddcf_syG1EB^ z^8h0Gjc(Kh%q(&j>vU47B|T1MZcv@NFTJCA8a;Iq$c-Hmat>OR04}0MSe_h9I;;j{ zJ@aMEvTgBf{CqyGLSm)W{OCeu{9L?f8roB?b_zERn!x~Cyp#sTUse1~F|#~Y>F{eZ zcC4f@tt2cYT1T)35dREoVEL<n5{US@xzajE2XmM0Yg+KQ&Dt9Xowj8HrnPgw<u(O^ zA&c^crNgpO3<Si$VA)_KwsRnua_Z2k0mzwg&Hje)@V(O1sn4!m#eSZ!dst~qk|Guc zCtYNGPK-rZM7-fTt*5D`v1aY!MeEY@XLY7^0eiLAUs39*A)V@yqFIA|rBV2A>b_mr zw5fVCiDi<n<AljJ0I|v)OhR#D;1xHQPi{(+VcbVTQj1%!Q`K!7C0S)$eWh(XTaU7B zzH?8doqKwcHgBrYf(HWC8<g(>i<it-9nfY}5RYQ3TmsjVAiJ?a1I`CU@mN{pLr}4< z0u+!YXnJ%P12BPFoBN;ZPz7xXU6}%@h|yfSz+kcHt^wAzrB8X|@2N|7NC%a^XdJdy z5mGTmb>^0V@jqRWRFiiTzTO%^Pm5*)P>PYkRS(%i*gx1HP-_m~DYgux8^OUsyoYQT zd$e8;lG;JiD6hOD)h1`>3Xi><NZK6F&N)v;6I=(mX>({>m<H!w)|AZ6nMU8{Wbh3Z z@H7tJMwvH0m88{4D$kJ1rh%lz<cIM=143?sAV&fGlda|{3sqQ+cG5D#X-NF(1(6Sh zUyb*JygYH9)0bKiPeQ{}b=3Ow^ZjA$zq>nsE7`;SuO+6LOj5}LooKd5*O;|NV=Vbv zKO(8po+albD-<E|!oX!YKJZpF(P(2v$K7GeW4hAjcvN|V;O@Bc8AaDEQCNt2<cE-Z zWIz-Kd3#=3yYN_pqAW{bj^||?(MFX|s_T}t7pj^=4Kk%)aU3Y_$0YGvgX%dO@Y4j@ z3t$~E7NijznrRE1@>;dF%B{1)l60dbH^@MCqXzjI8Zp0b(a6i&#t#nnw;80!o^_q~ zg*Gfd{#?(N2X{6u+IMLEe2e4ey*~o-@k9O%*}2tPF}hay+jBd6tW4RwYr{v!H=P+- z;PuX5k5n|QyCKaedt3VqYF(#7m!P)-R~YE6Q09uxJB2DY>_0L#BdWoKwP3eqV8B9} 
zXsy_|A>{{cZlm&=+vU|9VLJ?mbm3x~eN=hWhHVhf=)yD3O*})SXTbA4@eGN3x=Bl> zX9&n+0WcdC^Fbc~fE=YSWVCK}&f;~ul-G<dI~$ZVM|^g+WYlIiPSG7g*%jtz;yuu0 z2X#$^?+c$gUZsp)#l?YVY$`uujtG`R3r`hrlQC_0qOpPJb@7r}J=7Z6fP^k&85f%7 zpb5jI_hs=^M=I5!T%7dYQk)K@20!ukqR>%zcN*8V0)h>n7#BMakQ^Ee6~LOXw-qWH zX=s$>DN|mgY(UikTx7`fBwx6}M1!+U4DKpIb)@Fam@$tHOeWXF;I1YbaJ4lLirciO zvE_u(lNumk<l4`{NdR9;70Ohv43)|$EOFUDwkk5{s?QfP;&Ro5n=Di#S82@Vtj7Ko zQi&Xj=T&m&Y~-I6Y~-H>KIsJIkx8c_ipWkX%PO(THp(rA$DhRN1l@VY4Lnp-!Kk`a zc#4dRGFgZ**f~BEn>q)g=C<g<nS)&&D^rH<Q2WfG!aK7k<FV+huJEB~$90W?SYvD8 zk#N%hc#_^dfp_<#TAp653~ye>5pBx=TTG~e1`d+O3q#A#7jkcN<y1)`l-u#X#*%Wp zZ$Zcb<w#vAIU`x`f)Pk@YL%~+fXj>}Kz!r=I<9unYW~k&Ywu^gkoNRvz6q_bEAO(J zzYC@*XW;K*AhLUrD0<~x?B@$sX--FyUeH0gA9LyKco$W_$hinCL2zBJQ}$wo03Ody zg~^1a>B9`!=@qc!Wyq(cx)CuSRQE}0{EYhN7Y@yA4|N+-D?7RdXD*DknWF>R(*sT6 zM*?lEF&1d-ofkb6?wS>i@t%gFkYPdb=i=`nhdja_A!`La5pSc#TTaFzB5wIIqo9YF z@M6OTE{c>s8zg`cGtl_5&a-l9*@s#4*~^EsI!ji6R%Wpev%*=#yI>=jOC8P$I1l!z zBOLjr3{ZVs2pFwcXBw1gKo9{d1Zff>N-u|TSolWZX5u1cRePm3wvZh`v5ViGiYcqQ z%DnL|cBIK22#ZK2Oqq5Rm5KH$L;Kt}y}E>Pr_L)T4BH{H;PdQ5{$NzwjymUz&@S9B z<RCBMlOSyeAVUy|xf)*n4EWo!^KvWpmsYh}GGP>a*tn-9$Ep)EF>z{zWj3LPE0N|) zz{>)aq9e)*M(ltvnb`^to3k9PdTAP~t$A;%epVBGuC>WFg1k%s0B%Y$gF}K`1476% zRd)oE#xSEa&2V?3r3|@@d7d>T$)SraPU&4XswJKyiVmcrK`<eFf|JNJbHni|9w{*= z0==utyP}5LDbDKIC13vL^Sv%}kw(%;9z^Gy21BeeUVdxyKugQO_r+;2Ny;kMGEd!` z0gKt^4|WeMboh&q0#p~ZeQo)%9hn+Wd9Y=w6!DsrKP;X1w?#%@K$<~6Zuz*U{co+N zl-Dz`EznlbyM31~8k#_79r|OVo_`8iV%BR@DLIPRhq)}!)r;KBT$+rrB;5G3leu=9 z17FLOc0@%G0;SO1jWCmk<Jmem?;)(4lexM|(6^-yd7FrqQYD#OSqGxk!WkjR$rj=t z5mZfEfyzXtmL{mtxin@Z?o9;a>KO8EGH6YT_*^ANlbr};UXFZZa2%Qs0=vMQz4A*$ zH<NR8wP6jJEYdl)=B(-0y0XVbL!~`Y;V~EvuJhlIisVHDHD4Y7V$0_ZtcaqU6#uv! 
zk77j|n#G4%BG%51nSxFJn9+D}$;N}Ig`VOFSlkAkC~Z=@L*+$cyBHgP*HWwuIs8tO zQPfOCM|3m#S>%#%bv#g${zTx)E5020`1VlDC0!01YS6P9I~p9}OVP9rYacZSA8AIh z_HSd;R7*L#CPe}Op8xD}#GXihjjH-9d_+brxQsVJMa_^lfyF|4?3`ro27HyfApW{; zAm;DtMf9bu!hQp$QA){ew0+%9hL!B2#RTUncTf9*n8QYZaC66N93dM3K4*Ovxet@N z1Xo_0%r1dAp+JHQlMf(x1iW#G0`RiX32*EmbiB2yV6xoyvl<diQdX!#>;tezR)Ree z#soI6gX9m(sUNt+AqaT6W)?XF`|!!zi%%s>;N#L3neB7CZs4o20=xd0xzsg?Z*mHi zG;-%sE}GysK?@FSjucC9@^A!Y1;Jwurh*4d#fzTMkhvR|E7+Iy>j0(GlF~`mi@L>9 z)S}8$Xe4zxU?Dy7{c?}Ad*R;OeLPaJagV)bMwODfN~fNzfs3o1?yTw9l^6y~PmW)( z3sJq1HQt+9icrLDh+eGOP}NwUa4LTtoKU=Daj|QXj{bpO7nR5%btST)9Ge{0;Cf&$ zHn9(MhoRF{LaQlgFdo`R0Hxw1Sat4ZM*@cyQ&z=OM<z5W8`rW)#PO6^r0kNI>y3BB z;~ipK!syf)j9O8$*tDWWhYq0`IuY00n@ofk+!|@#(4_O)jHepI$I=Omg$Wr=Ov<9i zLJ^QA(H;Xv7t{*iACneX#D_v?%!mUP(3o{9mXRh&07j%K0c!$V9S8do8G(H<Q3d;Y zzxk<c#A8{Jb~&U{jA`^D0OZg<)*cme@MKNkOJS?l>O;<88mJf<X|r$qqx+TD4{7%* z4}AXNm)ex&e4I)Y%u3_jDC7W;L#ZxUM|evlCh*bfRc|c}8X6V>E@6bQoF|V^tD_O> zwCV^;zVOsKK0>Vp8!$|vF~IL&glKj`BSaU<Nh1uQrA)}ptf`k-=zdY>`*`Oe<-O*Y zAK7^Lk{}M#yk2@s326QUK2MvlhO$?hNroeNAa@;&v=8!mVUqVp$;~8*0B0zQhI7kE z94xcYYslpl-02ArdB_Tuy)B--;XDj~c>x^vLKWQ7$=m`G8Z8SJL9X1efIi;vIf#M< zTrZa!VK3Z{H(qc9mR|)`hnuV?iY&g!%=X26krp9BD7VkU2{a~1*Tct;lJS{XxPhAF z0It9uKv6(98n{u2o9AI_E+=46wj*{-oP{<;Y4}Li$G$*OS}aRFiZjA>B2vkHfL5K@ zKMGkd5CX_!N9s5aF*c*sk)g!wZ3;T}fBCjVN81Ar>LzkHm|-+EGXM>y#NzRnn2H}V zxo^L7-}Cow`p51zAn6rL;><gY#%yQ9!v@3h)ar&Al{V%yl>5!PV%=_NJ6ozkHj~?< zeeK8_w<R{Q=O<@<R5W{P<`$V<h&7d#YP*%+1+-=BE*|^y{!OeLAo2!_`GqaBbQ60A z@@b@V32&eyRO+=_HJZ5sBI`cowDtq-QLJ4XuvtQ=J{uIi09=!w3U9I?JD`Jxv=<7| zNwz<Jl1)kF@8zhA`G4YnPBLJSUQ{8_bMgEVy$}QkEXmP4F1fxa`Z#MBmS7n|U3!K8 zV61UT4qqPSU$Q6TXBhAQ3_DoDBLM3774^)fcz%!iNsMP}IN$}a9sxB@%eah#X3+N) z$?Q7Z<0gz=KvfTT>?FG%<D7Lq|AuN4y<s`Unkk>`H|qD_Sk3@`gUnYue<Waje}hXF z(IxNj1h4En7WrQw1KvuWVO5i^tiokOxa=fbmdx+scRP&(yAV5hTxelKYy|;C?Xt*I zaH}h3FV*A$1DyO^z(YqB9#!9u@B@U(isNI#ND?$rJ*rs=s7>nSBNq8a{CHUXHjKxk zk8CvmxcdEZeAm<W0*XghG7BIN_|eX=kx8S*O^(vByDlF)#SY?Uq|G-P-kz^+ns3#a 
zPqH#g_AT7hCpbQo=-Vl_3?EO2-u>kl=qht|t0lYb>?vj!;vb%5*0ZOdi~aUve&!@A zvSfdBR$k!4i8}!{IPb+a^(O}`*$45H-NJO7*>zUJ!hRg(NekGW0<Z>a%YLb14Q{oJ zeRb<LAPXL_jNO0WL5Qq_{NF?T-v{`=hxxzXp?^;|{`8$vJfIi@eq8vS@D@fo>2@cX z-SRk`#&%|<!-dx@KFNGIT#nUThQstzHaPLq=f3jW&HRQKAKNxDzNtrkHJ6T_WV><r z8*DeaRpEx}6R+SMo@9d!r_~z+!X$?`xpMBbIu0B^t={*_sT|()%D3#nhU}q+>;nzi z!wuQ*;A;N3E9ZsQ3gN55HsJmqVnyg2Z)0Wrk-LNiAm{;DG7kuch3^Ot3O^Fw5`M=9 z*)q0`-AT70{paufkF|RK?rpEWDc$nFx*=&-w3vQlwLK}k3B2vI!YizvwX-hB;_N%t zGs08CuK@d<VoT`2zt|oZe#WM=UiLYBJ;|!9*%$F=$ktjU{FCqk@YDYwyeIrc_&uA# zILgQ_YcBJ*56P;ZV#{s@K)$uL1gJkIAf-%27Da>b&%!gv5BRI_Z_LGNaZmIifN>e; zXfjA{tO8WTCj45mJ}!Jk__}bB5vqxN<A~LETG+s9*iCfrVe3y^%gR@+Ubk^C9~QGc z0l25zgzdt&;KjcW$Fh%GPqJbRRgY8ik&|#j=wuqUhE82$eL`r1+i8h#J$j$*r$bG+ zsu9*^j;&+s**E!3N35rWN7)8E?kK46WA;V1m3^I`xeNFHYJ+e}xR>p>+D@{)4X0R* z@K09TPtij2XTm?TT97P-7U+B=7p+f%sPxuvfYP)0J+e7lYcbIH&wzq22>(O)o1m~J z*1|TiyJ$GCU><%foEHAV82ciBWV<c<>xS&3*3-g^!g~zJEe+X5TlNIOY@EaxPcjYY zvzTdF7fwM=TU*Ttr_AT?*(>}C)ASA^xMo((7UC60a8G&DSFE;EXbtDWj=Xn|)piDY z&UQ%4&5Yx}h}K-g>RahSIuNHqsKOL36IQ7tG+F`f?~KKhnHz^kHV{?hbwH!Zrt$2X zFrpQ-ufv&>o{@BgscL~rEzlk>=}y6Ih30tJmude-2^e6g)y<rOn>Y!LyB=m8*?PQC zi?9m#rH+aGENBCS9E0AII1bgJo)>G7s*WafaF5c4Px|`|D5?b&9z|!B%<h7DNa_SC z+2$aFhHJ9W5%5s@89s!oeWJ@rjkR5rheTt5ei4R!j|+GoAnx9Ap_6(BM^POj!WE7G z^P({){}s955GLb~*QD_bt~FD28Mz7Y8+R?kz2LMa{iIIf4kG*p<3bK4h6betpYc%( zJD5Cr=;w>)R7&JaaOgos*u@6Q`=fgrXyoF1MK*)|7C$`m8jZogUNh-R&{9T{PLzu# z(V&|dU=f>F1c+E5<2Q?M7+7s&YS7meHpDZPA-_)4m*~6|*HxC7rSU%+67I^d!D%+L zy2?P<Qhe>m+@_4VH`Lgh@`$Em0sqF5cw^f-QTj%Be)$}CK%=!eswzrc4PJ*gWwJQ+ zjOmI^OQTIrv%yvpli0=kS1ef_Y@DXE+RYus*UShc+>#@lE_O5X)VB9q{H2}+qRw1g z(={jJV=;$l3$K6bb}?hI8`6eKi`hNJn(k?8oUIWxzN)f{P)rK5qZ^vT3yW+{o4eFq z<YZF@49PI6txEmoNJUwR&LHY^IyTkjao8lY$yw~tOi$Y{2D%1Yfkk5QWfBW}Ep|(h zF{!cWN_E9ro#^n)aYq-3<@&N>SE<upyvny-^S-W3XGz3F6SJAo5>hnvJ0zV&WX#|Q zH$*B7TCK6dlU^PO7hQbTV2PPbth=purO{;GkV)TYN*VMPt=C!PDb@5?N;+7~1>;w* zi$xMM!oG-qxH}bWZ3>&c2CZrC+@Qe}XWgRl<K=D4Z>tcsrIvEL-c)WXv)apCMW%QS 
z(yxZX4&+iSvvmY)ovTX%Q&YCOhM;G8>78+du}d@>Y+-YoJ=Er~1jN#&Lp4Rd<Czkx z%VnxfMB)L5B_f3q55`=^5mT!DJN+fm^1AZaU?AMo=+PP)n+G%V_f0YRe2UJOL9bI} zRZ9kDaf$|`$!+$|cNzvtl5Qi@=*><;vEJmJGOu!a_<C!=RZ&u*slO>=d9;rS_QdVo z@tv~<dmD#NZEH`|Os_9Tx`9Dk1lSFo+h?vYdzwYJ)}bra8Z--05lu0QkxM5GFK9i& za$y}qv}Fa^Fw}=b%RyAeuJvl6djUVo1_kYwxIBPUy>ZIIE5!j9fg1rJiL|#mjzaE* zep{C!{kE=&LUikS>4J(JpYu?bf#9JeG601RLpfWfDr?I)#G2e{d?(l4N<Z$Nk89vq zB!qx^o^T-F0Xt_HV4^1kO-*nAz-=V;x|fl8Qwu5Cif#sV&D1bPu18W+E=7Jq_`J03 zxMJBbuDI4Jx88=o=n6Ui8ZKFHv5xr`baSw-0b9128vf;s9&RI*_TpzY^hwCpP)8<& zR&&wmypcl63NZ_SPxu=VW~vh*IwnG4s<?ybB3}n2E(s)6s_OYA97v1HNY$x9VrP=! z1$jbZnsPD26Tjmg5Y3L<>p+!lxMg`m-2c@;G~w`;b@jgK_sm`29o=_f_j7%T#+%d8 zKs2y!PXD&ywOjHVPJe4O9{6T+_b#((My)TL*mB^vXKv0}pSf?}w?+?MJMx3ux3B*1 z;_TtBKW>tkGd|R^@w)V;p>!~4=xTaUH}dr8eD8qIoJpqAmPG@jhxhmrhjw&2JJ&5A zS+^}1HO_Qp0)1;oXAb7N4-YKgGqZEc!sV+L864Z^aJyL8d+Cf}7jTkq64nzk+hW*9 z`FY4KgD!u|yv69AMGS_%a|4`n#>xhQ;4zrTPB71;cn3O=%+w3SQIVO%6cNOBS@sTU zR*;XWStu|WR)9s;sBE-GWhymfK!q}J%TI(ryqx#E;5~T+oD@R?gItm`Np^+xvD)dg zT05^J?ipT#6GCtqZM|p~qeT{rAP?|(i5IEk4cUk_R~w^jBH;JswUx_;*5PN_#Xx&& z?YeHoFzGn5a0}RHPA9IJXU#UzZmjN>ND!rsh({wZ)hc_S3v;k~8p^&2-WYB(;t;R~ zF%b4I+HbHUshc#iLijbb@zH*T9TFF!N!!FmTd(~ff58@5a7$}nb9864ErxQ#P*dl@ zbn5WY?Wx_sTY1M7PefBC3p2eddpF#(ChYFNCHDOGkw`4Pw}0b8$z_lEETOQk^&5jn z*DZWF{p!x#5qA3cu8k`n82XpE6#9?vw@)|K%pIQ9cyG*~=!>ri3?{nP8rK}|@NYQ< zgiE03q{N;bz5P3MD_=Wio|BIB4sF`fy?T+MEm&>J#KO03-yZE^^9M~Wy+_yFqdeT7 z-!gA~SH#sdNfzBG+|F8s-0}%ov@|YvAR)=ou@roQCa5HfR>R*3E-Hx=UJ=QnYe*LL zsIn*(r_>DMoD}a=n?mLQ%dX`uYgBQPt>=9SWh+FEwTigQij)HCjf&{_)%=t#N|Kza z!ZX^FS%@l1IlG>7T(0wal4zZa!By+PRrQolfP_}HuOR{<ybBG~Y1v2VmBd>}i#PHJ z$1-bn2zhF@>{2k;Qu4S;YrrrLBqETw>ab;3HDp&?$E@zMBH}PTPfNH4+}C+MxKHju zV@f%)7?&c!9{;8wq7qz0)mZ`kMIwqkg~Uc&x1u6{a2!-H7e`ef=>rKyEj;Tu!iI~( z7VwL>^a4;TuIE}|;YQ#^o+(}p?TfNIRfiFNp*_)jaP;1N`=2rA)|`6dz?MYV7fzYX z?C!;FdxEJu4?H%K|MHq!wk>U1TNSQ}*5BNi=zC%3o8OG~+}`e~e6x4Hr`(;02DbSV zw~lrLLi@Mh+q}F{eChUgk9N)6vwYxicXlYlM%VTQGA;;$;I?<QeW+V4XV-C474h 
zkM?(4(y7Euvv0sVfAr}Q-GfbChF~x~^u_v~FZSLR)NcA?=Wx?q4#T2<UbqE&o(8IY zPUqKj;*%cLcL9ok^8yr|%2Ohc-N`Etgm?JTDeR9LzKQ>I{)@j5&wZq5Z@%~oO=nM! za-qAM{-gV%j!&DLeBWXoo`%C$mcXl;LcsI#eeD5fz>WX(_-~OWeewB~TJa-gC4IuJ zU+z9fWghU^CfM7CUL-k`3T}Zyz@^A^$A64s#6Z}g5dyo#SuG)FtR|RF)}*wlgt&eE zs_}IZ#oso5YiGB(t5a;K5=(-?lIp@e-?D5H+i}n7bWhsFTTtet{uwis8qjzQcwtD2 zAV-pfK-1rSm@&Ovt9j#vk1q6FoU`MjU+p=pb=@PjhsDxJq!cBU8ey~Yc49~y0e$X* zFZVhjr=d{RwFvR!)=r9GRMa9NcMfmXBTq{wlVc^$IT!$X9%J7KA;8++fqI^QjAy@_ z%8#o3sdgvFel+@z5D0LVp~p||M~~7{{L%eb#fRhhy&PI+><4=f<BFZ?FYb(Iznsc1 zQ!BV`O6Hc`NxxXO4!^k5lKle4k?l$4Z{@ekUZ2cvjb}Hd^2^oltCHEpBw?^;QAeS! zxIDcLV=hu3ZnOsAtj*4%E8xw=6)Zazm-DJ4$@uiycgWw9HP&T25Pva3hD1NqQc}Dr z`r+yKr$2QUKiyuJwc>O;QY#ZU4Pn%l`0(^!{^hBA>9iT$d9sPKa%J+vY}MJ*MZY=w zG&x@gkEPgXg5TvYFFtjUA1SqrSxT*HBMN_I6+WaaW64wkAKEQrbKBc+vGJp`f8`gW z(aZeK4)mzNO|H08(98w-_zgMSXVO9Zln&ya^appuAM|u7F11wps}jl7+)ws&u^Ed^ zr53AtwrwtoRIY0JqIzA_o)e=#f(83SxOc4b!!jAgKfw204f3sbLeBMm4o8Y8HESk9 zQP<)xa+RFcoK`>~3}hWlLUJ{mSgL9o>`w_sXjM(1N(%6=CV01};Xkf+Rhh8=PD8-h zBnELdh>NiM2CC6tB2ZNmf_aDo2p8i9sVa&7P0|VaoUAFRdbpw{i0i8OztuHNeGF3{ zClxGGZUFNk;9)oyDu_`N+Z=G1MxJ3NcJM8BK&)mvUwxB(^_TChW(jQ&eyNz|BuldZ zyU1Sq=yi4hE#B_^_EY$~Lo~nZuJpT<*T>h7@4t9VxmC$3k=LGAwxdgD@4qQ;kH4!J z{-&J#=sD%Zt>_-mrM#6_9)DMHC{HMR#-CT73l#t5jf-gP=&lO5SVj~61CunjD{P~r zJD_Mzvk22^P9I=`^xpSgW@1a!;qfacl(WY_mj1NsVzcsZ$_p3g|GW94zbfyXzW8m; z|6~ir?}$bFUl0T0m)J~nkw43}J%56kHh5T-+idN9<=M9{UO*qo{P>RX_nz6O7{*sA z4~@S$e$ibKaEa(|_kr}Drz+Land<6{&5XUA9U7Y7-`U<fH{ftrtzX~S-#KgktQ-1X zn%@~q$JXt%Ui@fVRbAEA<@@&@dcq&}2i*O0&m6kv!L{RihHt)kxMN;yk=gIAGBzf^ zb@TA>&2vOQZd2ih{0u7H!03zu3lY_aRU~_mW=?sB;UwV%OR9K4IuXL^aR};qBp(*i z=-~fG<ttJQbJKV(=QQAL=G;8F(2e3F^0*_p5HT^HsE!CNv*c5%vXF}B7`KGjET{+6 z^KKf-VF6B01%|-}88~LPY!Np966BYm$qYC=PN~sOYI`S_L&bz#N4NzB1*wD?YamRu zFU`Odq~Z%SQQ?{Bl(|V=Eve9j3NvsJHFgNe1T{}phC91zHtssWl<yxwYM5nCTWxCg zl^TI26ns80(6{j|bDT-a=HdQ^(2nQXJBWk>?@Eh?n&RJL5v?Se!@RD=j1$wM<KcLw zZ+SSPd}NAvv6Ub~zDn^JcWIx54{t8=5ck8=dI~<)KM4vvu~XoTTZF!_x!KU>NI%1N 
zk{iNjjzL+VeeNSN5q`%{=I{JU(d?3FZpHV0OjR0pst4z<x^?szG9n<U^Pl5$BR|$d z!L#h%RQ`mzN!^pow_V%wJN$O(w@%V;+tlA~<k#m*>gZfHosvs1$LJwnVkt~>{(<}H zRQ>_Z2xs8-s{*R@?}QRWE`B5YEBv0QP)gxAkE2j5=5gT_#DD4;kR_A(aejX|JL5TT z_>Xw-8CK4(oX4+x3SQTLXFzLCo@9W?d{O_-Dd9=hl}y5$hzm*geDN1xr56!;x#sc{ zvOmS;dJFaQ0;is2(-Fs+#?OXu*OSbQ@7Bv#pN7ZVT!^$-5rKfZdy*~Yyo=s|xhUt| zDCb?CKRdy@z?cQ^7W(a;5@xW5Bp%y=ZR`x2aXBP%lEspzSqF}&v78QsjEXRZ85r?Y zepMA8)6_Vze*r}-3HImEVGK37IDU#vp>WJ4t~j4$Xk2i~BFi}PUDbFyn9O}wCSrW| zhgf3rcj=>i96$Z8C4Zl~qV9`lWy?t>@!|O{4-Z$!@8d=vsW<vvGW(H5-i-raQV(38 z%zbG$-R?`<aOd4$qK~`3hC6@BBEOBx{-$2`Hw#&Y`5XDS+>Il2*^M{j7b7>)$B{Mo z#f=vEG_E<P{^IFm?({i2clxKuSL-@=8oEXgKPAup|DX<>Q&Z#@^FyatH{M^4sxsgt zn+h*;5NjFEe|8NN{8K_P!<KsriNw>P&L^#>*fa)<6J&;y`&vr!J}nf1=g2{UP^0aX zATcYhyktF%P);=?5Zdg0wpJse`g%k;9z~V=3GlKWagZu{0D5c4kTee*z;)lX=I%dq z62CYlywAw^_}(!b{m42tdiUOY$lTaXn{W0@*8C0M`s2U;I~{r3Cf_-Zi~eREyLH{h zKhvQRo4o#i<Is)Pu_rJ5_Fr&FK6e^6&Wcq}{hZGCSabh;{x>wPW?LRe)6c#@M;BYs z7EX}wq_kRnL-s4^0J?hZT|4R2=WO})U*7e|qw0^x=5@8-NFQh0vWpwCk6Fk1Z`efP z0r}b!`0e2a*$yKJtyfNA!QPLTNX;P9Fp#bmeGnS5<DeXJxj#b4;eQM73x5_aF^T!; z1~2~3im<}Z5k`5Nk$Hu+2J${3Jj+mug8P-TYbdsmXvhWu{)(z_q(Wbj#c9am$>nYn zIpT#BmO@n=iDI;~CsB;f%-l_mR)G`7J+dtWB;1)k1>ds^SNRLoVRUDb@~9n!VNW;1 znaf!-1^FKr@vE9bx-F2@te^ThHVJ+aEBtlhL2&zXUprjjxbNf#rZV+VoVc}j)gbkJ zkRc4T;L1V;d@zivC_LVc`~c`(syB<=VK_LECi|%&rO2ON7>edqZ@6P#aO2j!O8(#n zlC_~JoM=cPQ~vlu)Ul=3utapMZ^`;s%`xebS1oLI?dMrl*}^Yx*tloulMdyVw>4gW z@voB9zp4|2Gm486cSuHy&rum$xX9tN7@61=-z*xPK9@HkZiwlW_NrJg7+WMI+(vO> zVvA^W`y8H>xFw<!kC)95H$~yA?;K5e9mZ1kR5L1VjN)JnP=W4*;(yiB7#?AM&%!Mm zzOZKiPk&*}?7nd!7Vf_8n)=SVeyc5{6IW%-Ols=s-g;ztXUbtU7fVd}W#I2U^kHT9 zP*1vg`_I_R*il24KuUZ)7AXy36jI2{`o^K}^bCf=9~mQ_oTV+J9I&t6wPxF)=N+52 zvX^dM&6fB7TO5EhF4~n{1|~@k&&9UPQsoaG*W|;Z%fo^lOy_l2x5fg2SYYL%M&-g) zzvP#+qs_F#DyDI<s4I6^rPZs`35Su*wz)3}xZL2T2e}QB$l}(vdOn=nwtf2-4z_7) zIKBPW-p?ogBNLP)pTT8{FMPXs`|#+|sL!JFw2G0do_pm+?lKl^mvrTBG$SiU*INNu zmpg&VS%o^JHpB}D5l;NFaJO&>l_CNPFsPw7(hI?C<bntS%m^R#6v6`6)oRJX!95+^ 
zLlLZ1&%2xzRGmk8Dxm`8AbuRyX}MPf3y#(zPv%z_Df}g2t=gBBqDpjWhzElT(Xwhi zeWMtbdVq3H;YFb_@ljQ?eSQ<-KP&)JL@X%IJ{w3_gUsPBH`+~Z_P-36UAI|R5-K;> zn8qK|5QL{F89hO9i|90aEWz;j4!b!F26Bp9MJN1l2D45xe%!}wc8|$uHid1XWGXdU z+`*`&G<4Rea~S<aW<w~PaGP8|cSj7RaNY&Onnp>q>h(Uf!IJ{Pgu@ez*h-9LM&^_Z z9;40TlpJ1jX^1`8U=CVJMU%%IFt}Nh6cNqNP^5#gv%MxyWRbyXl?=T`cXXjoZ;+k{ zyG<US+h!>ZG8QtG*gU?Fqs78njb>zcA?6eGnJsS7<O=ExlFlE7-QOX^r3+FXn8INp zg&iqpq&z58HIoi##b;_L4M*KUsN!fx3q71tCa3W50^lkbx`P8YA%6uApDGR~dMT*I z%w?)(fSRcU&Z#-aavJ+=DWU+ffRY}{;}>b4)Z0!-cBi*;N*!$}$QiG}CS=4W)M6CG zlG3t3s5VONkfC)0XLCZ{kyNYYwf?G@tF=?#crdC(&{PXf2N0d<J3DN1c6@%-+Krt( z*ZI6=smzq@Y2Nkt<2x4J`Rn(Ue?)G;8*ETH`rfO`Z&}SN=U7;Iot^gRR*ims*WF!< z4E>#=)pUIO*5?)}$64RI@3AGyBT7JdSZY^Zef1T>33}yKw0#A()&xz9&@E@KuuwRU z$~2-Xybk23L9gysm7wu>zDp%2^43um#0h#WZ!j=NRbS><s8;}0&muKDJD)p5^H+MR z6A)6N@M94OngR|DfIPDEVRbu^8%o&%Ek>(L0809_c<VwUYNd5d=QZ+n17=&Uc<NkS zCdhGQd8<uqy2vw`4cc0Y^#-H4Qd5<hHHVkzmLZ81ZPc^4&<Xtpo9z`UyD1I}L}CPR z-B@`DIlB<RrQoujCJJy%<WZ-UNby^^Ci$df$Ry&%lellv1i07qtx#TnC3@Y$nlH>- z)O<t6BuQqk|C;WmY0O#WblBau`P%|p`qp<zMTR2n=Z9{?bnX74@<o&B^L<kz@yV0J zuK!-=OPeq+W?;qmO;)jMxuJJajk45RsVl2lWU+M)-n+F&LMtcfk@>G~#WZ$^{S^yh zBi{y6@8r3xfoH5=7#5xsvWw&S8`b%XVyaU4Yt$(WB&i!X61!$_a{v`?=+a7t5TDhX z<Jq3``9XD72i2v}uFfiaNq{1wS*1A!whLO<6r9)cD$Ht8JHSOk(HV`4dv73BOI`_t zm%PR#*eT^ikgp+xfa?LYZ*3_N>_yICDACw-ZO`CJn(i5BNkfcqgEiYn%ZBCwGL&Kx zJZMg#44NnN1Tcoy3`ptEyaIDo2?tBY_1Viyik0<e{LQoMtzWw_IJ3M;SHg@o-}FRj zh1YJHIs3XAv+}tjcU8?a_rygRFTco+i^`)_I+Mdw?Z9P`NxX2?V!SYArp52<Sh-?{ zzNkc;H&xd#rMNuYGNrGtL2EB9FSiwc@^t0rS}WqUK8LYEt29sMi%%?Cjj;YwUb9W> z!%i2%{&tgEAJ4I;T+|u_(J)4>KkVlxYOdm#nE-M-sg;KoK7ky3Xhi6^HE%GvJe0YE zb`%_VM*w2fV?gz;2$1YT2s7)UsX!Y~<T?N$pjJ~IMu&o0yTt}g&5hm!w1BJ*5iWcP zIf_)G<u>~$+%Y%eVx10WxarG%jjao|vp*e4^oJ5HjauYcF4{W!Kk9F_MKUui!H9Gy z`NfO5A8gdPt8KxLd-}fVfl*YAY{+51_lxemNPak}Ja=6AaL?Rx|6mGs&)&vrU*1Wv z=Jlv8*(SY-eZHEqeW<d8f;hSI%Tx$*leJKJ1_4BM4b%liHftdoZ0PKa23UkOtpZ~4 z*R;6Rx3KVc1t>SYY2g*s_d2Hzd=tG|?~Q*ta$uNVNPUg*_gG<!Y}?KBHrKJLAJV(* 
zn0N_Vt4!ztp4w6A0`@YD+(;uQF9Ld6crbW|jSL+mIpBuCaSjA63`Yw$?k&KLko|ZE z0pJ6&!Gm{zeEU?!><m3x+l6aLtJbr3S~eIAFT{GEGa5H_M>1PHI_CepiRsu!lPeoq zHi?Hfv6Lmeqth6FOC9H$OBXc@LF2e^PXQ6CL@)TU-)e}+)u47YFQuxEQAlh$)l<qL z<W83Wz>E(!gyNB$zyZtzLd64m84X^C;sSdT6-;@nDrc;5+H&O(6=0T8HGke%T%qB{ zmClM|PE1ub%(F3_&0Yc4EI4Vgi<%jsCZI@p0xM5HgEIcfu%MKGqyVYKTU%c~viqeu z^E+QU$kt}&l!u*CY^Y87_VHzV4nD+g?^rc#G0gP^5}GGhJ@K7GU8BA8o;kR@Z@t03 zH!n#mPuz3g@t^M4uqbW}i9WMGRfjno0RJrlZJLBpwXdlh13h9DFyVz*HD23XV8nc; zzQPQLP=<?QmIdSra^wAKDm=2xk)|qpKoIixQ;jVLQ<2ciiz<g_S$|@7x+=o48EV;5 zeGJp?0W_8@WQZI^HY!@<YZOwCBc>9R5UG#=#icAG2uW3jx+?*SjcO)KpjwYW9VJb% zlc8(7y91$c&-YjFTikNvW^uzK%3q#b^;E`>`r%TY+2k$@*aO$}&r<%{95jA2pX`zj zITtprVRdEx?!l(+))jr*8zkG|`<{Mh@hneWT67_MsZ=NVn*U)rKyca|k6qvVxe%C1 z6gt35he0n5Izp@va)65E$PiR*Jq47?SBsu0yo!bJ#aghx6d_E677)OTD3mk+E2yfN zJS0`9#-t`iaxp>&vKwA|K^Ch)F!=GOSkV^`^w4P{jOzBHv>716ow`zg1Xc;B1Jz-4 z25<v4%xM)fhyTQOvJQXPxyKmw#LPOB7L5+BA2k@(Rrr{4W3!bty=ybK!KO05qTC&@ z)tw21ZIV`;$o4k4+|$qWvG-XJ?Ne;|@#E_HSp__VM(HhRSC1Fo7|@fs2(`EsA_TKz zz?)Oo3!vy%Vp-8w%3YwuZeHKmVld*WVmfTsc#Pm8_>$4)A*Z1Vp%x7U18+@)b%(kg zIi`k)n*ZU0l@#R3YU{EV#DGwEYc(MTgf1C6(}x-qAJ(xkt<GYlkO=}NvJ2}DEk03o z3rcDtprorowwUqPM6~3d13Lsf1U}51um(GFVOgm$;YuAeTye+fn(bff*tBZPE9R=L z10BaBy&I1```4ci{d_u*v_yx&Y4Ma>hep}xk^2+kBBRT&+hkLY2QD0b`r?g^{#ct- zUIL>T?p@9=D?z(epq&|r8Uq3c@szcYk|hXJ;!kxvU&YlIj-gR;UKX*0L=~gY0aDb0 zJq#k)qZBocJP?jDT)4+c^gFnAXyrBOWYpo!;Ey0gIdVkc&F~4yIF?L($mP_bv8Hv$ zNcS37;+FX<(<@dyE&tPr_AT2M{`5F=d7ii+ZtPj~e(s?ovB8xs-f8Ud{SPYdL~ee0 zKePSznI)FceB}d`7rHO~9<uZuL|vok7)LV-#~R5}B-LUnOEA%l%2+*!!OA5h2(5*6 z4V%-3%u0lz0+<0&qeqoLYY)$CcKWNsq5%LU>{!pjz=d~rTMx3d)M@Xz7Tr39`8u)O zRW|<C_#AyH>(Pi<c`G5ahOh>ULDT6-vsE!tLnOyBAXRrrP%S3mOOlS~rx+wsU^?)* zj%V9*6o4W+Y~q)2@l`~9K}ww!)uAY)5q4GJb3Udw*~<L%oum^{+0U_vkT0O;;)%;8 z*uVu9&qX*;GvcbI&ED!C&h)n&z2Sd7IJED`E2ob=vuXa&jQo>__CKT?A3B&@vEllC z(xICcG~C<}-u&f7x1Q?1FW$fWsnH8RJhbQ5Imy7t(aqo6`|+E%?mT+?tcAJutrV(< z3|fBaZOt&=-wfSpSnc~m(gr^9oSO&${>ocRs@y<|0FSY3Ek_M9fU{7*B!Z(Vz*%A9 
zER@jas;J>+Ll)V!0x=nJ7wuJm(9l{rkKx5HZ$y-vh0T~`F*FJk!C3S<_y<r>dfdu( z+|hlTJF%*hvsdot#9r4td0ctN!?r)YTI`!(u$!7y24k+s@XPy^cYpg#k9o@U<B#Iq zm|#TTlh^TXDdCuqi;#4sT6ZnI7Q*}aSY1R5@mkKi$q@`ODkD%>O~+iRJgOU(;+O*> z7iRLb^MLG?eRyq0GMk`niS{Ry3kD%yfK^i{fiF&S<2VS9?+HVx*5&l_c10m84@6aw z7zsCIV`xu_%6EVtVC9jB0d;TzcJNH99tJOIkZxTVOAcAE$X?Jn-1ryFjxC)i28W9f z4rHhUMJAmZi?N!g2fn$@<+FLFtorkeJMOsNCI+S}I`lpNWn8%<_{`|RebYnbzVwru zK2R=HDSu172z;Ys1Ftik%`feE+F@XA(+2)=?SQArKltch5AU>{_?n+F;~Rrk+f)P< z1PQpV&C)q^9R{Y+WZf(RQf6CIGA)<H>}wFB&jS=90s{7@sC)L8LBJujF!S?c-5fA) zIga&#sLQ!VSB5GqSpX>!L?~!>sNc*k1!nMM6%SZ~AUx%w6|}qz@m2^8jR3E9mA1$0 zYnxaA;5RcG8TK^(H;hIa1!O763upyA6C2+C@MXj)Iu3spw}XL^gp0yKC{g$uW#YzW z(ia;57EC3y$^~QMBK`A$U3o8p7G`4T!QC;*>`OSHdWjy^;4?{yjf+~5Um8kzD}g3x zu&~rcR!RR}rDD$BQdU|X{f5aOt!BsD8m(~&@If&ftMUhwe)dqr6JVw4F`cBZ3a6BK zNzBK-VT$@NFI7rNDgoY0D{2pNB8uV&P}+l)>mlC{0&y<POI30<=7qq8WPk_(F%1<7 za<qj<KgmJIw=3GnNCzN<ph+moBIiVVC7k8J6ru`na@=HqD-4&EM#ql#Y>Z^$QDd;Q zqblL|)W2O>8;=(i_4Ib%UsO~RaY#ia8g@cgQc|pY-aWgih#3!72UZmoEuTI5c2qgJ zKN0Yyb&bRQ9a}{2yqRTE5#WWTqGXY#q%`RD=_RGF)XYjs*ei*UZsMF_i}LRHi$D++ zsc)HPH)@EDh>ZTVP(uue8#rT7L(IXq^7XtpEEgjeP9?YXbCn#VwGtJ-6QDt)pTh$X zvSc;1vR(L!a$gPpKsh%O&!P9XoWl2%T8po?HKS@Ls$;sk_*XnDms>x~dd_0&9b+O2 zc5?XlN|FcF1bfO!%|J~s|G;a4HB?|Qr9n1gpBU4b?F62iMKWx0gWQ0E!t5;Ydu2WL z5R?QbXEvyN2v6FlsY|=zE4^F_tk(AekQgSas+4e4Qb^cBh{epr-tRxdmUIA2bTr+m z<JG+*d-8`+yxX?A06_Ycrke(nmOw^I8TwJVn>w1g8D+a~E{Ij717BJ{d~cX4b1f`h zcNuilH}9U+%B!opRvFjylKiE<Q@3D#=A!OG#SN@VkrxsO&$(t|(dntYn|~dvYjz{J zH?5$)v=iJ2$I$U0U`lQk3rGzCFc-GKJWNO>R9ksEJXxr_lq;I?cdm-zAQj_81wwUM zxAla+sMJ&6IF~3n%_dLf5&sZKQjnw2ZE7?2*)ZZsIB4t+7vUU$|LfBl81_g2AWt-v z>BFRQqN}QEMQFZ{ZeNzxiW>W^-|Q_dhU*>%U(K?^D8MNyclVp{9Z`bNb(N*RNIJ^3 z<lASOxodA|*jZbotZ<-IAo%3m>3i7xU}{Rqv`Qp!JZx_9N30oZd4q{5efC<7E>Pxm zIgS1Z#%C9A-xxj`{sQ>jf9Y-Zx@dx~;-XwQ2X?dxJSU`CAhnEFxw?7JWkhc&YaG?P zDA_I8lyVtJr-L3Pm!b)&Ad`+Fo8bAFp2!q5J~9zuE|LSDgu~8R)!nzvuIk*d`nzkh zdpGtkKi=8={K}5Z)2*9-aeDLncdI%C*?k{io{OQ$P52#j5IqDfq-s!Qgy|-0Am@=a 
z42PLq3atTZU110S-hqY$OeDF~$psTvj<^b18&2A|_K8ezn;6-3?~$#$zje=^H{QE= z?q$HsC0_W!vERNt_Wc)oE(t$-27TvTKRw9^FR>r~guR+ws0fFqJlV1$AGvHW^1@(9 z3QjVViNjndMI^tJYo4XlZCv1>mI4`8RWxFvn-NNl09ZXNNQw3x6{M596@UeX)pb>w zF3VUv;V(M0%+cf1@f{@A43EQx=8xEp?u5xx+vs(y{)%$zWS_%qjC~dC!sVc6HE^f- zx*?4WS$eq&5@Yc1Los1U@FqY_0^f!M2fhGsJ|(|6t&iz!lH23wA|gasTI9n8WIk+M ziV<?tr55Hcw^)F7H`td6dBc{Nt7D_`@}7wMcgKbnb)IZY*?nhr?Z4-^^2d?zlG&nZ z|J{-lfAzuL{oQ>Bjn2NUn}@d@Q{JO>KBVl?^kBTRg{|m>TA1`~Mf?-yf3`kO7C(2X zBJbxvxA-Za=Mxrx?f)MZe=R%n83zCRKV9|I%*qLa-%~Z&;9sd}bdbFt^Y|ApyJGJj zc>0}gZtuGd1$%$@3fTLtjuU;$3cZXQl*5|6z!M{Tf3=XS#`-0sT2VFbC0PBSUDgvP zct|(_K5gq4SCL&*Ox*eZX!fIo>$2H@r7Zxqd{ULj>~BO=8fJgAVD_`H19pE|%#7NW zg4yr&DMPB&Uka;VeB>2nXTTPN)o;hQPg(udZpro-H~N{7Qg#6cauinH!)lC54OE%* z|A)M{k8i5V^T+QwxlM1=G)>a<A%rwd(=>!MO_R`;^nsSPlmexcQY#j$2o(WA6a++u zK^7Tw6(0s=d|()tbs5KXoO{#g6hy~|ah=h1T-O!Xbsg7n9M>7wopD{)aUk=1f6h%q zi#j{s*X#HF=lAn9LvQXq_a^t8&-tA5d7saDzoqFgYEo^cm!66PkktGa0(<b2(G&-` z0HTi!D)4_R*hlZ4669l@>MMf$fqhr(-G?B5=j)b=Jr{L<PXzg&JUQ}Q#M2lB&rK%C z_nnLJSrLN#pWpZFnUNL>@?B;}zC{H29P#M?w_^O2=>vx4hMO@mG$F<(|CNdH6>4^C zoC18+csjHFT#N}axmN{#_J0rg`vZ;b4v&vQ{ss#97g5M>y_?PAAJ_-_DCF;%2>DyV zB|BhXlrR1N_n3bW{c$<^BgOpSBrQWr2pq{9$)v-ncBJNKqJ50plcIe@z<RU~J%XgS z66?Pr+Ap+IS`S)`xU}?Wzt~M9xnIv}!Aw*{`?P{}Ubx?2=ik=Bq}bY?OB-r71)rsG z|GGWBdbt0@Tc07=*B<;%_~O-V(;a0e?>s7k{X5eiy@g;uFrR@<fk|PE_itgmuct96 zLa{#4Izd%e=oTw+g2ue?HWh>tN)<rf#|)t8I!EZ(K>MBqRBr-QUr&x!K=t(ks;`e? z8qkO_$&ZaqngR}1NvnVqCuo-~cyrB&sgyvcim<H~VG;s$A_py@e1-{RkYX#^X#0hE zLj6n8L8BpOeAoq!Bx7*r+~)RXzr}4SMM&wK7j3+n_&Hj?uPCFl_)UWmMVYBMD=n)! 
zkO*Mv9&Pgl+6yf%*)qDYE&cAU#=Hq%Y-|!3k4*7<&VJ+7=y-)-TI)yNhv36fn*<wH z(%U$fm{6i((|~nG6p$+R4p5w4h^{I2C}8Ip@DHXgrC~c&U=nNL1%eqSl_NfB1E`z= zK=fn9NZzI?H?xh|iy~sxEXSU*&1n;B*pe2dWg0s&@6K`lP2^b>3Bs&$gJCjbmS8+- z{1aSQhQ_9WL}DP_*|NB$WpN0^SOu>bu;;euAGqSN$1>blcbiOo!+!S4WbUhhZQla- zRiM~wEy>GpUt!1U`+Y4fea+3{-!pd;5th+pM8?sVaAC;{|1{5)3=Yu`a|VrDMcunG z?(fu@ZUy&X2sN%*VWwZGXMp_G;9Jz|85D}N#{6ZqPB=Dg!YkL?fEEB4tqEDnd^+rC z#m+jd96LC_SlOkthG_-dWVcx*2JB7tRY*1~8El1Q@HbNxq&pH+wJ$EbdB;NLc|P@C z`q%8uJwxgL{w~CZ58g6#_Auxyw&%P*&?K6G{#AQ!{^XwYd+A328kkua=~fT?QA+JP zz}B$<D|~AE4e3MVgUzUywoJYBf~g_dNYM<UK?>bFQ88+DTx&-V;R_>`H}=IULaIf} zX?=fL3w2f-vCp>%9_f}2z(v^I@uzCdjaow{En=i(Oi+NuYH+_>WYoI-=<Jk}VQWFJ zo|jPbgknvBU7HUA0%8Mh1Y`#pUfLP!wt>-AjZyF}Y3-$`?pVv0RkuM90fERIoAY%I zrgLgl6qr-&34|RH#bgAA8GNDz(p{oGb}6yQhMPuj?+csE^$qiYNmVF6Rg{JCityF1 z+}nTGk&)&#I2q{0$z2^&+9kdB>}NxpZws$w?JK`|?Ew3%qtkCPPUu=G_TIsC_>R{f z84j&|^$_pf2}#ea<Q4f>{>S%~1a!nM#VkJ=qnT?>T&UnA4W&zwlT9i@k4Yv+c~>8H z{K8eNj%aHrE@^_794n0UT&zPD=O85Hn;5_&0a$4g;*^&Q(#Fa_p7>Wp=990oQWh^V zP=bSAcY-L-1E~;JEg!`K)!liK%T1=9WwnQyF^}yJ`bJEqr%l!e6vg3lO|uo}73Da1 z+EqHuV@-dU{ztpVZBKXRD8c&0Cgo{|?`cID@dVip(|g}<nB8UqaZ=zAo;pVkUzWGg zXP+j2Fi|J@!xyFBJ^Ks8f`K7r;vDv<JM!K6)%oF^KYSQrb$sCLBlshi<rl4LFq-Og z0Wnj@g=ZdOudpk(U?D4WF0b>#m-Ak1>k1&o5C8G~oR4zuMSlDx()&_!F^y#Xa1q_m zRiRoB#<k_>6?S44v&6M}91Cja=D5aD4E%P2rg$V1Bm!lXPOFMQPSmDYvQxyjooyKY z+B<tF%I&1iLyVDXr<GRAv?1VJ0J(>(+6sK41Ob9wJ8Ai45n<O!OK1TN`$P6IyU!XV zSHHrhxj;fLY0DQQyTv!7$KOY%`)%5M+Ca(eV`b$-f<a}3@+Y)<7aj&a6uS+{#Q-04 zAs4{4*x8LW0SumEloap_vW)=B)WRUhFi8CZ^38;d6H&B4X}=J5V}vD)M?|_$|D`E% z+2UO{ca8M+gq>L0TD#}q&(rTcbZv{-W-}F5dMpZeRTejfJgq&Uw*Y7Ts9_`2&^kg; zLt9?Uf8*=xK6q&V$mr+X4>_3STgR_#nq{phhT_=)chSmslJBM8iJ5Z1Z0L1{)*Ajj zfr)55=$dzA^K}JKlsWt6u0&Q@%Gku$c^cEws&{{Q=taUW{otq^%()-=8uv3~t=xe2 zO$O#(LA~k5+z%f0)wW)~Wj8ug5>oBR(~lG9Su&S~>3j4Qn%sM+@p4I@s}!e`gY?fK zn$B5P;32o7c*CVTXmA%s0WcJ+$V~mKh<eLu_0E{K6HSf8gq^q0V|HGT$K0}$4sW>~ z=CISMZ3pTGcJwtUHXKW?*-qEiTtc?92HE<_?Q7`7_RT1~#!5QxS^^~@)t2a`3yF4= 
z=}pjK?>v-A=nMNEz$_l&XFb&@5c5!hSO_ln7@n|wC)|a6<=%V9fY#WOOgciw)@x68 zUGW_{N+2Tu=oLI}(lv7~&PdvgH^7{F66&qCWVw?lVzgWs*q9{u9l=wydqA=X%`7ZE zflvUxL#T5jwc1<XfCM+1r;&&X=`~?J4vFHxN+wdo$sJFmh?ks~B3>_2tq3|K9ygKV z8X(!dFFpgjj9+q=)+4Y)0kRc=GSWPee{2lbRJsFRN4yK!MF(<ud2hYb_|U0M!__gY zt{)_+i(oX+F@N*7XVg^_^?Cnu+)@hCepk>^XzliIuCTmgjXve87xFK!$)pF9-sz8L zKgRtBO_6mg)^#o3e48_Rr0p*GCb?x9*!xRH7WSk{mLFtho4>ecf7<)dy?>n^4*QKc z+!CSG=?Lvv3WV8RyS5%q7lndY>pyqF<OTVEiN{s$2|YZpbBnP3-RIqG^oRMrvtN1^ zKmW=1=L8l;dIzarq?>b|gb%bydzeQ`_LAR&Il~PmYnz}q12%i=1eazy!KGwC)Cs_e zD7Kkum!sr0m*cc_4U99nLiC91@CbAuE!c(!e4V0Cfc&%|SW!YHE3n4I)axVa;AwSb zOfc`FYs?*_htDNXIVhMj1|jc6%o)I^sB2*NkGeCK3<+}l5C{~KbHoN9!HkdHKP23t zlf<hVX@z);NuU>7BB_nyNZX47bCAF}CO0g&lx#jkHjf?Z33ZEotil<;oK9i^5t%O= z?doQjd+yf}cStgo`BIz4R8B0a!mJ4EMt%!fv2KLt?{oOGO;aS!5F{*k6ySW3U)0RL zXt4isSMQa+o?vmkA9*s>g>rEqeQPi>7p8i=|4YoV-Z#k{BM**@4zykn_e5vg9eX@; z=7re4;8vyLd=ou$bJ}rj8;b><RblVSr}u5+(>j7@cZGsH$C@=&qc6<&pWPMqO)+3` zmaOg9&9JuCi)RO}s`o?}4Q+Ol%bup=+2oh4f_GA5wNmmD^2^h^kC9(q1HWA6>sY+} z+vJybl3(5}HZ|%dd5thhbr&r3nhC2+UI|-E7B<{W_pgTmwhX`k>167VUi=%FE+r%C z<)_unF>MeQWr-yR$$Z-fSHPk!7dCkL6)?O(tF{z|XN_rdP^=z03`?ol9PEI^zi_h2 zrNR=IE<|DEAJf9T10}zXF+?a}IRIxT0}i?=Pyv@ah5}1L#xf=^Sa}Uu+8i>!9b|q1 z`<SPD=Po+wu&KLYcxx_!BR!r_*V|J5JvZF_2rg;4S7OAM<Y0IQK20`9PF|5*tnU2I zcb<AMOx#odFZfQfbkf42y#rsWA><$lA?#=<&iM8T4WX?4TKtiSMR?pF1Y*@B839*o z4aEJ<fW_kL_cV5!g=3xKNz*$g`PnI!-23)uS93Hxcl%JVMX?%JEqpQ$DnvW{M@*)V zJC+-rOkUf+#~3*b38E51ZY~Tw(GMStOv!|B!e>{eKMV(^cvyH(L%M`}$~@gg1Dyj~ z|BrmEFT(ec)xQHD8@@MaE;%z)F**|RPtrAN4m@mfTGGcym^Ew!4iO4<y+Eo)VxID$ z@oDr2hqQ}TOUdozk1vU9JK;N5uH8)j_yU|tEzuV#mRvyAwFFa4p@mCK&DemDgzO2t zY`hMq6?kuvDydXbQ4+8dOB+dRV;6OjQapI!g&Xi-rhZMR0l8z^MwoG7EV+@y2iiAY zja!O0(qS<rQEs$q>(MW4G3_E0gXGIa>*?A>BPh21B05}u8H!z$nc<=+B(Zu?v@@3M z9ir>KaPiu1^oET<w$y9L$j6M8YqpU2Ut}NSj%wkZk-8DQ1QU#2B5rQQ!Xrdtmf&#d zGMrSV<46+0HYF0!TtlRA7;`~z0j;5lX|6uxV{M(r{E5ULEWDw=fFgDdgKL}GEojZX z)L4OSap+7_mYBZe6#rPxcUD>p0?V!5XvBiVs6DQ;KfcKn8NCFf{-Ry&g|&atv8l7( 
z=kKeS)7Qoh$xJC`J6ufp)Ea$FC?*<rq<@^=RKJNP!ljr9_XN{>!@c<A6w~6cyiayb zca506eTl^5WvfSf&+HV_<5Ta>3CxJP7WK}FL_CoPpZoF3AJt$d0&8$UWKqlsKGy{n zU*PMfPK{gefU|E-l3_JAkuPR6M>Zy4dsl*qF}Vz8rdejdL@q@Bj;XjRf^8A8<a%<$ zn?wkjTulRedtAF%kC`~RSy0-Dc`wMaR-9JX#Izo?JlNcOR#40LbdmEUj+x|&9y+lC zGhDStY~Ii+P!f7V6=I`#MF3^uVhtt^v_P$}3cT8g39Zpdcap9`(rKhaV&M{;TJau? z)}j9O<WfN}c|}5XLY>Ms1D_@B;tSzDbEHxa!DLz!lJnFCGCdTdOsPL-R(k$?6<f80 z)lCkMzBnb#Og^&#SKP4s!6!EjPjYYdzPUY{zOCyB4(Ps+`yGMhp0=LwJBX8>1--OY zzDW$~`lPaV*G*lc-BHZ)-k`bT&=;qF?4I3MA5Is3@q#Q_6?4s~tMlGW-1JqmL<C?P zn(FMOx^1MJ4M$-c1H=wpMYaLGY1JJ|Ho+jY`S2q1o6r{0eBeRJI2p#`xK@iI*yk=s zGz$t?WGY}0nTE?mn%AtD=0u}|TG}aS2%H!ulQp7wiJX9#)r#OIp#6x&b+wHqb@NP= zwW!XEq}s%hHi!cAz@;{3$@L;VNT?lIX_GEw0VHichzYc%z|{bc|H--9c}+XD%_x!i z3-d9<(ZiFbUjYl42-63>ja^pxLD5i=AWgTrckG!rx~z2|?ujI9&g&@h#YAhH;&~(O z+Bs!jHN9=-+1(MpJZWAf2SZ{ksk-f-u8hqNUe(~KT{L*Ls}s|w@=SjvUG%-uPo?B6 zYLmRJ=&!TEFO_2rUrF7yhq^0T!iD<pM5}1tVU(DIb_oEWpbooQpBG&K|Nj>R^MM#j z3!=g+Ag?aKz93oxgk?0Cw9vTH-hx<D4O%q^YMvww`(j@%X{hZ(0SEH%lIAG*!vvVO z5N$FCIW{T#l4a|zr;h0AL`Ouk_1M&f)DgcleuJz2Kk8{TNx=jKik+EB%4Aml38BMs zHlbmO)q2YPq!e7k?n&LS(L1#pdK<IdFdCrNO&=cZwlvzpoBul4tk@YO5T7<WY{6^% zM-|h@t_uQ2mt4QMqt#eP0*EE~?2+_G=XXe7B-0^jR(y>f%tg+dNK-I}DaDev{!nLh z8_x{XYv)fV%y|~fCxQ)`Ik`Tc(AR1{l#jzk2EiID@*Aitf&(eJl%^5bl0~@=i^R-! zQ8Sv@L*02MtYs!)1=37{4aA~Vyd8!E0W(bkeGyYHh^V&Hsxu}m0J9P%&|pA;*bNB{ z0bs-=^TYI{T_3E(XVsQrE`TmSF(NFZVl*NM9SBCrt9g_`pUm}ub}XUwEQ9GJnwJXi zU9<=tc`*<!-9*kCm?yfcK-4kJ3sx|)7}`+h0M3vd?1jp47R%BHdm<!ZYXufHH}rkN zqHZ-s)-U$$h;E(R%mVI4XJEdo-i+Xe-72$O8o-tRwzAkiO()4?<IW>TnZYx4jBVN_ z2I&E2S{H7Jh3mb+rjd}(Iw!Hscg<Ii)0qEZ4a4{n+RKgnXdv~YXeB7XQk1PNxES@! 
zso9`~j2znJ478O{=nx9@kmsOvUR|%7go~wO3+lFdhDg7u%B0_Hh^RS`?n8DE_L(Bn zOUuV02t*)!T#iq}XsMc*b_EW_a)yF!O2qLp9UCI;WB%&Vwd7)lZQ6|0G^SoYibh+O z&}K~|GRvIO>ETXN)Ieg4Ru9rjwb8CM0HN2akO+yW8crDbwECi?o-8NuMWYp=b7M)P z4%KJY^U!tYaP;~ta(}MdPBcD&I_R~`hs8g{%(-aqT#=J%u<Us9*%#h_@ZkMzzP3DW zb|pfiSKq&3+gkqv#;~)oH2<-6E~S6mTx9l#Am-wCeamFJZ5hdydipy|rd{FYWlH2O zcibQ71SYMTLI?1t*?;n7_Mb1m%8bxEjV|;C<7H2!)%Om`KEs~LMEz%5N?$O$s$5)& z=F74vrWgkwhF}0xOd}qBz0l+hb^T+=Gd^hwrnF?*J}ugQH8#jX#~E#39j66CN)h*k zFQn=JWCLg>0}%g+75;f~(fD*sJ1;R0OBM?SYheh9W?=}(3KDr(Yrf17U<I@vs0~(7 z4mo&%9Jay#Cu42YS!vxyZPUSO(sfuMCUpzMbWR95tQdmDfedwu@g%2%d=$n26j7Rq zWMT}?upde&Tw(vAueznRW8NU`*Oauy{TRoqYWjxAhM*x@3px&Lu$A(2Y^BpH=XNX* z+~UPHwKJhLl1UML>7Ozwj3$&;>ky_vi<^YcP&iCR2He|FLQ8PuFupk1$CyKK<5$?y zj!{306xTH6<vr{v{9ayOjlX!33<Tr$p>GXst8ZkHVNB(Neh5AchMfNSTX*@s)Qawq z*|xNL;ewK!yl5Wsa-(^MlD@&IhLw(&l&tFR?s)+cu!%f{@^`WPtTgRpjhmticUFse zT0QbWPstBS9nyECWII_<G@eQbH;;~;Ote!wCg2$GFR8{;MvzwIg>Q%6rxut+zP&S| zcASQ8iq-%-u*Z@e4RlinB-e;)G7c|Zd~S{)X6rOZ(7Mp>4IMPxTafZh8MRs=If0-Z z;vnQLSOU#kTi4u*=0yVj1SmTx46BI$89|wF+LYZ;KfcrChdN>=q+<I+<emAH@G56V z`nhz!JJ3JZsRR!!JZkV@AblvDo14!Xu(Htn)z0{=h@af8eJ{{B+KaWKyU{lngJE6o zgBUwW1BK?ol<rL9s1^N@VsPrNlm@L9#l<w)nQ_hYFs{xMJ#yY0LO}aOAMAk#MhFCc z##DozNs3x-M`c11g<Ny9Rct46iN;V})>NHdstz1Yb<{F0h}S3UTy$7B6AiHh_qHQ{ zd5q`HBRU?n*#;8f#ndLyBC{f)LUYw&I4wf%uB6StF6kdiITe-7b9=z-1Rni)xJF$T zY1nAZ0zno`{Y=j%X~sN>64@(?fbxKOLICj-18Kb;cZ1$StO!w7bY}9bdwQ|ksnKBB zb&~wE$rrS~rn0Tp{CBO<XNcj;X$HmDev>jXbo8Fx>am4O!rnneci>x(ZP>l`<g1^M z2d-MWq;A_o>HD6^2D&Ea0R*~Ao5HsD9z42ld2+RLRZA!$-1IZYAN=4D{44TSWQuQR zfw2E0Nnhi`S|07Fm^QnEoU4ImKgbeDTE)gvE9Ct|Fe4(FTC2K%LJ%}OHSl6IFjEI2 zYEWzuL?(N(M9hy#*-tG&GnoXLEB2+9SVbmvJ&M%Nqh_iHh?txw5?|)cMd^C0s?eh? zdUQjncY1V#DU%jNTd97rADTgV+WwfKybZ39T2@o2t@5dtK?iLyT5|z%+LL7gnlx!v ztPpBUqB&EMaC1}Rn9L%~d1&FRYQX>mDv&QK;m~@y7@i%uPg?%vwv_{|%c7oWlhwJ! 
z)7%yG$BFjlZV~(>JSNAmiA4tZFQcpB0pkhGt5(M%LuTd4VV7cf<+rb8;~Dq-Bb(2h zI3I!ojJDBHU@8ipy?N5Qo-%tEyKN|}_)3wLWB8ShTTr{cgLWO2K9!Qw$mh+Er%Loz z@^<!R8&(9NqG2frO(tkzV~t)cA(qz?^*FtfI&PhSJ=7q)p<pDSAu-uDKyBC7Ee65V zMSAmI6j2w&X;8$#srqA@4Fzpt;IsiPOUA^!I985(kR<^DQY{yG^+Z=Na1mOkZzY-z z!{G>kE)0+<cTMB`HPmHo*wz(C7CmH;AP#9Mo`#c>c3}zbade<<6D!fqQXIFTb&>v& z1-HbkS8rPMP7HjL*VoXrm@^?d0$(nr-xFl^Rcmn*W`GfrR?Q3!>kY$eyXt*aeKu=- ze^($XB3%;ujr&c=#V>IVITX3u;TrbTC%l2E+bZwP1j4#4^yirdhuiudl+Q%h3(#t? z|GB5myb_+elz>=Aw7bRi<9>Nyoqy$1H=PZKz;rWd1&rci-8fd?k(?bMJ)pzsfbxcM z194&r(Tv%wgRKEC<34^lKgy5uB!8O!1OMmzOC`v(z@JN@lUQ#Me>oP2@%a>g3YkL^ z)J4#_^T{?qh)V1(v;da;Z~CW*wzGNQ9$v*k>Swv65}EYjI2omX8t9+j;GdJ+jkFsm zDe-VTxt90<m+OzZ6vrpoVR7*!_htUx2;W47ZShn5hU{-RA)Vs4!6&Vu5qbr_^l2O< zpT|EZd4ag&B(D>HPw+?hi!`}A#gAuiKgF-)TlkB3*@Dmsex;QA!8mJ(K?jG<L#62y z8_JeF2~L+GWGg(5k!~d$mOzNduLL1pB3t4FTg~j^E7+xzPF78v3`{!NH*s=a_T))6 z4^lBY<K8^hgD*G%o6S5ho-GFMvx$>BKOnUQ5twC4=4MNh1`pV$#V>};4?U2-k}1Yt z%N9Gqel~IM&g{ukd^^7j^|AsrGtTcrrEkw(J0ac00fvwB?bcKLYuVGMpto~8b4%)o z-j+v>lQ%f>ojg?KaY*509?1R$oQZlfRN>d3WZz>CgAyYKiK(=BUI)&Ovoy&6K8MR6 zXP=<_TjC0pKf&YD3cW2AagvA^Q;Qd@DZ;MM#Xw?Pntq1E$Jq_+FZo4$704V@E#e1h zHG}je48z;cAT@`Lfz?xZ9x;^|=f_|p4`yF$oc|D?pUr+|<NS@tIDh%{DSj$TNjT0= zSx;~cgny^_1C!1^VAY<%l;y|zWaP(I?IC>HseeK~!z=jon)o!f=e1Xm{JB@Z^co)a zs(9GzlOFcE5aK!^EoEic*a&qlB#V*U6?OriV)@yEC#7HDlW2Av8PV*J@XG3cML_$? 
zck*vwUE>t1%-)0|Cm8TDJgGK&a-7xThg^@d<E%M$iXX~eImr*<ys%Xq9i+~{)}!jK zM~A4-mU^5;ML94#ci`LFvlqtsA$*;#9@dHW7(91{^@DnH1ZQ-+9bwDy@W-=)V&e=; zjxv4_zXm_(igWkzd-3&Gz##9QbnfoRNqz;+=s)xdzK(t<N}b>#QM-|B*^@lPBbjm` z-hy&)Al$)SFv@Wr!LNNsRL+~?4x+z&98>~t;w*WI|H(dQoc(8Poc#}~;Wv}c{U-7_ z$bBbDe>UmLXQ1^wGwJi2_@;lN3g?%gUf_e^Y)P5z1p5ZN1r(y}pKwyV4vzjo)cY2^ z^xAB}alRIUZ#c&eoIA%3pxF;(?>xY%QmNDlh;UJx-!ZA|Pw)bF;L0g+<+1FQ$Jrz7 zF`T6r{%-c{IQuSM;gRg;lPLAw%rAU|K>!%N`@6WV|K{(q2gEl&h9<b5p3na_d*?X+ zTbxaavwxR8dxHN9&YX~*;ST^2ec%VjQb`*6e{4%W{MdJB=zqwbI&%Dn<3FOaJ8j7a zPn@E&uYjv++lE~SZo2E<lh6NzuI;fsx@Gse8@?$Ns9v)t*Dn7V72In}ZhYbIaWu|f zwWl_1zWw!oM(J_>x-Ge4&8<}87bG6l4RCM>icG7dH(3GbH;ef({x?*R0r>gXg!(oF zq2A)x@*C+%tkQOZe~sUOn?Jz$_oT%vAkLRVoN595cMa&?_VFL_XT%d~?I-wm`6DR3 z$fmxJ+w1Kh%mf)F7%JOXg3S^4wu1n)fny~cod*|YOD!4osS#*N{f_mpJ~kkIDjqm! zQ~wL1UiP_#pavx=u!t1Hc88Cp--8fy30uo==C|`7@)yO6-fhz?h4j=b@EjX<^7;4{ zNl;|+5b0V?!!|mP<l;772I6NwpTn=>5AyHxr}(qt{)qhv9^f;$i6{79ad7s6P4hSG z9bz{96-R`G-e7x@ZNMP25ow0M1l@34{K(Jj<LtL6`zAYMLk+Vl**3P5{Ub!V-r_Q^ z=MB7*6Ia+c`zPCzoMG2uH=ob@`4A@|OPIO+alVAF;1_dR2pwk!Kw*2BFW?J#FQ?fd zOq*>XPS0C$@eWXg-o?<&<Ls3EB)gj(WiN_a{VpyXV<(6L^fC4b6(57t7%Cl8$8k1_ z@43?UI0V^7*jlz;e9xsge>tAB8g~xxtLTvdoSKEdRkm@y8Gq;D-7BeeP9UGOk7^>y zz<`3`kAYO}1amN`j1b*x1$5S6PT0*Ue#|sHGM@-ZPoO4#MnB*c{DAFXGrd*19o6%X z{5$*s@$z4@tKY(<yYPiS1BE)MP4DJ^%T+#xtLkx^`XGMATV%JpA^Um&+g*=I4@ft_ zWxUSs;rEL7IAMdT++OJ+wgnaS4Eu@n1AdDCow)QlSq`>FLB?$yDNj#;==&YC+<&m& z!smU)&TuY@rA6J1Q4Jg?8rmJw7Iu%6LU(@%P9Vu1WsgxguHXeWN}Hu^><Bw5kzmR= zk3$geq_hdLnm<8nKg2b0_D0)D*inkVOx3~N^w4Lxg%|P?VSE7|A;UmXZ83;A=bF(W ze+<L9gWbhv!~DoV7V`V~U!#QjGuw}(7o>lN<!xlQk}YwPNL3fuwSm{s)tYR$eJ1+# z5FFzzyakrETv*c@K1$_0kYRIysqz)+O@0X%i0-R4ROmEm2D;O1U`Jixg#K6Q=hA=j zIk<t;C}+V#SqeY+-5==z`L;2`FFsDwVV+%k?|pisV>a#ge-Ob4bU88<m8IZRpuH%x z90ME<iKQaF6pK0F#KAFbh==e-B^5gN(Fj^E;0b!!hH3)gpjC!5pi>gdaN-Q2lF1(( z*fKon1mVHN4_i%wYzTQoR3yHTcGZ_w*W<Ccf**j@{t_g#Kw2*I89z;|8tGSQV@WBv z_)vm=E$*c=_#MS41ez!P9NrU`%27?3I1C&kanRKWq9b~8=`68BJl2p4B|8+w0~&G4 
zfx5}96g$STYp@Yjfggc|V!S^XLGd+Ur-kqxk=X_yhEZBsj~q$~%61Sm0g(v~kaI=? zbi6U<K}xS6^AuSQ#l~DLd(1IlS=&Pzcd^o0@lyI?`hV`6%z`HPPV(@@SS0aPlK2xq z8wkw-Pp6Cu)D0>Mx6w=LPXRp*XjJfkhEihAsIDjeD^kv#g|A1mPW%zP8ndeDqo_A( zf3gEo`60p;{3_HsY8tIrYXGPKLR9@{mJn|*y)qSJ;Dac3K(!aO60rFK6v{SIw(*40 zb}1@ZXP6Lj+&W2dv1}EbmK9P3gvS)7|BC|%BS_eWA<p0=h~p&e!$28{p_-{_b6h5? z(HwC3E4(v2ip2pkP)DBAj@*5-x!7neHWgXB;&Q3C+*|JU7rRD#E#9Ed9d=bK{()X& zx6@H-X(_c>3uTi<Dd=dh2JAr}4|@C+CUZgDBX9LsLOy3DcNCggp3)dFh9YHVPmWbl zoW}bc*8bvYxeklDi2HrsaDHeBo@DgMKD#&UaP|94MsG0^+)HOhxj$+Q_88sXX}!h} zch#4BWh<)-d5j)E=c_Jvv1x(nUYFNxHd?&yxI1mJ7Ubo3$;BMHeU^~Yl@H#F&9>=A zliTZbTb-^eYz`wRhD%wQ&B>fsC{UME<U++op_ALJ<l3#qvVN1v#NJTC?QP+W9h{qk z4dL=|Th}*@IXSY4yRbdV*JH8gC`MCJUY=r>6{x-$jo2%w<as^TB91*m5IF{$sxenK za?{<zT<%`U9ql||_t|nR78#YGa1P~)9LBg6iMH7JZ;qBi4A1DbdkPhEzSYW<0yD(% zu;ri1TMBtTw-g$SEZpKLay9z{F1LAW3Dm>gQM1u4LohDi<U(mhu_zY&YXgUD`Zu2K zLy?l63x?V2tM8Xhfq<(UG8%IG$T8XE?QYq!spn=BGG@llymWS)_ZZ@52f3fU%q-Z1 z6FvK8I-353nT?;O-%oqe@4c4Zlm7K_=#TyajKlBz;g@_ad!M-;O#dpq{dI5+xYEBu znA&JInHiHmLA91bRVu{GL*N(T*ts0E$t<V97?dsC822mgfXnQG@;OpuOQS)%;^cPb zj&qNKx_1>><oX`P&27j{ODKICxFZ~}cxG0WbhLS*iYFWh1bji8+dtQGfn8CIfhyB1 zb2#F&dg8`~OG4$rO0OqTWVCF$0E`^&FyI+~;i`U9fiqmvYAdyR7T8=N2N-W{#&WMG zXvChpBBNZu!-a~2du;A%hoz?8gzEc4b8)T-TWIMg8w<Ur0=wB<uSCrXtDEh1_@~)t zy37%GU4z%*FL%s{nwbrfkV=tRk&8UcBAcgKnGtst$VX)!?A!Rr<@?_)mAPl`(Ef#^ zV^<CAFLD$sVQXHclF#k3y>4i)+v3cVK^y7FMJt-jcBA4%-?6tBn@mbszTIfXUMiWJ zo2<J}$=vnUh|!T_Y&Ax_<+2gQa%IJ=7+F397L5ukFd9n>?H!=vQPvKC-<P+^g(iqE zn$0;}$+Hy59D0x7kOPk}zMapoI`c!kwb*=N)Z_{kTRnCsRR4-CaxrY#Wx-=j0WQ08 zz*}9BpKp}U?6Q8ml~p<hce606s>H-fyzKSH{4&h%I2el?(`{{=%Us8lt%f-7(Jd_Z zWk`|ecJ(plN-vYGoxv25{eqPoPyaf-=QV8a@}%Eq&fMuRt7++fdp*59{i_F|Q1U*k z?N^_FW-^;iz>zZem((cDmzJ}9DY+Do5-AQ5g5?Ehr2671m-vXCnkkcUsTY8P34lVa zg$`97C;DQP)*M&_Qd=Xb9`TWZomIWybZV|1z9StQ%oeE{@hMqTMi@wqND>ekJY_b7 z<hqauu+9mg>o{_Ntf{E}glHrwN9i^h;S%ZGCs43gT+@bui<S|^uC)U5m>OHMu(b>G zC19Y*svd!9%TeH-rWVM+)Aj_hb=g8>7P$&eKrPHhs)&!pXD8a{_7d$Yz|sG85UKmp 
zQKZVsrK$>y7*@L^S5!$fj%0@n8%7ah<j-*>M~<gINbgP$30FbBVlSqU`&j9jJD`el z&Re{bb_#FtJ~RE9Ry2U;?1hyGGr{wBr$0D;9LxgO!cNN4zc^<n_#O|wi4v|0{%3i5 z_>YaoM(js+W0F3P^|N6>dmGt7c8GnS-^%aG+XjTRNg$)iAfccEVFpP*j!yz<%Yj`b z(*QYBzZdxD*p=zuLk4jRl8!DBM+>ioT#1f%BJkkX!<f=mHBQSLt%9a1*+Dqz68dM5 zz*1;@$av^U-YM|VaXt`F-b|pWE)bvmh5${U$$+H`#NXtNFlp`j+z%e-v!v^>%NCJ4 z(9kS?`x{(N{to|~<P90D^aAl$EsbP^z#w0a=>8kodrks|&~YCLM>mLAehu)3TLJH^ z%9c9CzL`CSRydw;%cRds@wso(=gs(hZT9m?C_-PGK?^nmu^^n{Vw6~yEpdV^U`;yA zdJdgE0Vq3jGLkKM5+`*~N5auOJnIC%EmI6&)<i{|V4qCfdn|kM6vQH4MjPJ<kTCU} z-n`GfOkMK1pAee)4*ON+>NuD1_V0*0y8zXU^J?HUxSn~7ci1cR&FnSy28pe_V;$#S z0yVh{s47X)Nuc^8P;<F-qYi4;ijxFto{|RG4Dgo*=vU}sI><dkvLLbRzz+e8JRDJ7 zr&GDQ-^jIUAHq|7q<{Lzs=bO&NA*uft=bvFdZaA2GFEivmzI$FWL&#b0rG8#sdrja zpX#MQjXW+rfU*FmLaF28+&Gf}(A*&an!B<OK@<wGC|`gy+gLlntG4XL)KwXHGL}7c z5+QC(RE>nATG7BZz*S1N#5hx6Qp-@Oy_3$_@Gaf+-9Q8<P_c2g9A_^RXD^v__7a?3 zfwSsm_%$0QUD$vNH{!xL+k`7$pLFFqwAqa~qr;9jvcJF&BiQjq*e==N*YWDPbSJj~ zcx2f-Pw|rM@#EYHpdwfFvvYDak8@x4tS4J`oO=K(`zC!R$(+x{H|)Z1yYJi;Hjdx+ zInL-9+UM+3@r&+5rKKj7N}<xU_&B={&v_8fImMosboC>c-V?a;_u|UCldk+63m$K< zcX5S~z0;GfoJPNUW76k$t%Pu=igXJB)7ZZR{(h8LOfBQA6<B@CQzwA_#($4Rl&4NU zGxi@x=$edQRf_^Yc<KbdnJ)5Mp28>V<4k2gU>f+yPU=UKD9L^VIkX1W0!0*+X+W!P zA$l_Uk3D(r%B1sxvFt<eM5kVS>7U<xi(>ANY{}Q&{t1nMuiD3+d->l!pp!>!W3RmP zi(k><FYRMT?*9QnSn3(uqxU|1+o|7uOdlSwr@nRU;P+1881`;U-u$)S(?dSBkKOUf zXB5qAcOJ)5^EVOA-(iC!Jj~yoy#B@%3K16gB!n!kg_Ol#Nc-6=l99sM7=WA{{8g}! 
zVFXZMW4%Cu0G42|hWHYM@2_M-0DYvZ7*LIIR*18g*wpp-vjXtpYOK+%6OZ3uQwMSK zMnDro%qd;M%GovSYvMb;j%T~@V{+|t?EvgZIZ_5nu^8+M7h!CxX1B9LypA{V3-}0M zBk+uIUIJk5F#A5--~Rw(S<Ww}M*(&j=k(VHc;|kO^oMcIfW)wG0W>?tJ_DQBOg;># z<O;q~d^OZ|0RLUdH)F|Y52t+9ah9?_!Jgq)V{z>oegi)s&fSNfi27m=;uTNg?-O|b z)9hoagTJ>w&VCGZ@Yn2jq9`r-{0cSoE_;J2<ZXiFzR6Y*T6QD7iKvZRQ0Qy8c_|RJ z5q24&W}69ZJIQ(h`douDC)ma8T6(|=95%6abl8o<A+~@H39{?JkBBj(;t&k@37!3) zg!-KAaVZ5YrK9XK{2cXVyGl5mfdI2(()~cjejvRly(E1Po${>sy}z;pD1{U)5Tm2u z_IwpC{WW$Udf`8_k9aOOQ6cUnH1$pCZRs62)?ag^+@aEocqyEQhgWe@QNZW`5cXT_ zIrjI^k9b$yR!czD9nzi9MmZwA#eTwGU@uFzu~*r@vY(3r%V7@hO23jmkUr)!rElWg z3BX;zqJAWO2w03jvBi9lnhck;Enfq0W`ccwga&<zJq`SHL^R$OTvdNh@AwIr<L(rO zL8~Vlx!8wwrCr^gP?yv79ujL|DU3baA!M?L@1r|gfkDoeR!e92jr1{1zq%3d+G4=w zmw{bsJ(E}tQ&_36JrV~)*7@U7jTB~1R!z4Afogj(B)HIW*z9!zQ+?pl5Ys&Y0|v26 zUDQDM1aU&pALy>oWHWP2lo3-gHGnV^m|C@fjId{WDyW9j0M(jVuEij5EK0`-VPChv z(SWAZf(t;vXh35AG_r;ap2^4>V5JFYO(14~hJZ~W%K#4rVxyyLlhHPTuGN1LUPF2T zl^1B5AW_t@v{GYf7G5JXN{8GEX@5p(tc3720^bN;1BvKld=1Z|Y@?}o+62NT9wRWf zNeCN&BP0rc0b!ffP>dih1F+>fCIGfT0}v|uK^cq<DaPk8wo?59VQjNv#WCdd10n-d zC*aD6*iZlGplkwOqgo-DjlPI*wk*mfzFP;z1llHGG;C@XFdNks2x})$wpvb<id0)! 
zoJ|04I!s7KflrEC2)C=(@vq|7)1Rgr0d%ZA3xAK!NbvnnndyuN=zazQNq?07ZMru7 zF)(G>WHXwp+<vb&=uw;n27p)sDzkFbjJ4QowROd0htK2nxT_qzY;>{W54r&gL1@M7 z3MlphrGF8h?{buy+wEQpHx(Mq9gRjMWD8dE84$&sz|73S3}$9jnyPa{5vRq&<swAh z#-pW{0mbApmU3%eiQNv=%oOS`DYP1Z{!|uw!=<jpUW?J|;BLhkigRB~33nRpCa=3s z2`_B6StCJjl^@G5tJb<)0l;Ztdy&-}spaOhrN~q=k3*t}=gEa(rP~C^Y$FgQkJ0S% zma^#p%3N1ifrU7xxtvy)^NKvs)+n+?aTQsa*Il5<4v0oAF%`2{dAO}TylF0oDb2xp zh?oa1T`*42SdN)b1BM`bdn`qO#T+icHaTF=;M|~?6ezfs6Kv)#0_HJ+ml=SURoknq zGT~+T!aM<($&4u`0A@TGb%YB5m$h=)!L3#$AOCT=#K3JzA*dGsmDw!@v)Arwwz$pa zd?%Y`-d1Wg3b@QAn;k9zm2ueyw2lr9hCduLnH9be<T5WZ)7kO#<uH!M^nb%Rek%U~ z$VCLk@KL%BT#&#nuylAoUz~noGSFo-vU5O}&vYKpWenk^^DT-1yNm^%s8<07`Xx}8 z*FjKMTZeIB7V0XrSe9q7E@PQdZZemY>Tp-jQi8kOJ{|5d`G9wYe1vzcU8oc~BaY@G zdol2?Zl6gh{wlN!y&hEnu&c=Q6<Akk<z%d@!cqNISeKar>{0@~8`SmZu&&mN58lze zs>s0$WdU~?3ns%|4RR4-mL{{6^K)=lWV$_X0_S>r<)4FdRa`w^pj_r+87R>N#%28? z!X>+lOG@jFlR+*&G`tAm;@o4DT?Sqz%T_B;u0K41M*$t6oIBckv_N_)qXCmQ3voZx z11L8#r>>c@?tDhzPJy^ad9URZusez9Y$4H_V}B8C9hulUlnW?`SyL`Or_M#<n?i^} zsMU(x^O`ua>DmOy)HWMQu8RN_If&Z?o4vpiluui}l6gLnUtVL=0wDdCG^-f}aLH*) z%C*RnOlb3v?V9uoG7HEdBCYZ(w;W~jAnQeTonWuf1q{#Ut*D1<rA#_DG-;n+y-&}2 z8#I1+aJ0KC4B3-FEMnz{mpsiaMRC8+iT!<za|4EV+5TvDb*qmx>#OymvQMDFA0 zYai8t0cF7re|Q`Dk)jOPAy9@J17SI@%v3w-ntz1&P0$|3ZnVcb=>S_KB^MzJQgdxa zi(scPS|n9c*|l*y?FhrNxzKyjiT|Yn1=ZR_IM1lwbWNiK1T;0^^u=g0q>mE`jtcdQ zz4URpRozWc&K^;ksW(Q{Wv8`vv}C*3&(l61E!l2O4eKpA9Kr4td=^>b)iL5{NZLip zxP2PFA#SHHi+ga7-Kt%OCOv>G=L1$XhrT0UCveD*5bqg?D7BkVVA(Lxz9ko9_SA0E zE?I@YaeJyfJiA9MZQ5<gSq*DxmX%LhXayy%t=qQm{`wwF!nB<0aHn!Lt{t$cdtn{B zv730Ts&nZTboJsrFqIpyF?26Nt@x}<R?)gBjDuD#^>JKp<o^q)l~@I%T-(VzScpa< zvr7RPnsNtHZ3n^C&{*WyL8-R?OY)^Zo=v`-FvE%D%eBo1+0T%Y%kq$u`!Ke(td5GL z+!Zb?yX52?o|=`bA~_RV^NEsVieXMp&a8sm%p}W{jmX^9tyd&x?ilF#!{}FK@D@hD z7DhlJ?Du_n6YRH68et_;@<Ph02Zs(h6E|h_u860+;v;e1h%{|8oQhdw^P0(qJ7Ap| zmAGVtd<{0SG4+y&>KEzR!teyWpq)0c+pVb@-9l?X?JInZ8jYnE3kkjqS=pFraT}0L zQVUwFqtwfE&52boLw^m-#Vcr)kdzJAV=!Z@R03MJ>ct7wWLH-t)WtTf^&*@z+EYqH 
zOaFxw2_$UEj)jC|tII&M(>)(Yk~Y(8v$y9tqOn9<_x#1n$Q&aJ@%*lYX2yn!$vNDZ z{hpV&`_=i|jhOt=YAH-9BiD|}Aa+P$;;!$QBK;!$X}Q%;T3)z$@5Aj2qpqtw_02v= z9%PfbqtJE~$=q!JY->|}U*o#L?Yk~HvSa^hYi^@^XSD5trh$WpH}7s7?01{&9%gvu z<5$12xcgv_$m~|Jr*(>bWWC#2mGA3lUOBgWWLEv!O{;H;5A5NahkJV3y2|TsylQxN zU8xM#I)l{z$B%M)4F{zJIGPto&qzsytOG%55Y}O%<N3P&L4Aa1?7)Vsw#HL4#YQK! zHkO<|6JAD|jy)=3@Cks)8D6F`raItdv@RTbtl9!VYGpBOEBZifv}#Rg;#o-4rctIV zfTr*#v;~krNZMvnlc|liWPz!Mj+#Kpq14y5Q{&EpJfndcv>pszkAbVaz|pmUF0|X! zxsXD)Yh}0xTPYB`P+*2W1^FmFKx@k+Dd^b+lm$Q`5B`rx#7pgki9Q;x5Hl@&K=dP9 zAG+M<iVSROdw;!bb{Eg>-?!to2algj#Jf5+m_z=7=l8zYzx(*W*ZQJGXCL8z#R_-y zjP5i)`1av}^*vjLHW&M&o&Ik@5!D(=>{++tu3+S$rg(kW1*zHK%H_LnyDdJrbxXV5 zeC$Q`=Aj!NSzlkje@*KuBsPk%CLqR||JHY@X^vWqG=rjlrmU`+^?2!6@^Tved0cBl zM_o-ll6V4qv1A_gR9}WCV0}bgep(%kY4b2L%v(;w!Mt8{(0SI>C3*)%7oI1&pH_{w z!OqocL333N(XV8pnr=vd3rMZjwY|)Ej9JiJnTt@sW2OS0GE5jYpedIVtI8m<=V{wS z6gy(nYBU*kNFXOBnO92NPC*nv94+~|jXl&4+lXc}r+M@;YR1bm>Q0%t02-(0OBq)t z5m}v=f<G};_}`N5Y0go?-|L8A+VIfvw^;QQqLl98mG=hgzGN~nX@fG7@<pem`Nz7P zwU$aEnhHIXej>eiV!Cn8tFxc(96X$c67ad63NcBuH<X=YY(Aq#D)m{)x!caoQh=r# zq~3Hi@57uf=mNG$JJ}q4|5<8C->lk>eoDlJV^<Be19sn}`*vcB7jiMlEp$|gi3pL& zCNG1Hs0-sMaO+_RQ0K?Q{82SUh=kuS=u%U4f;g3_*F{wKX{`}%8XXrp5t<)|P?Mnj zXag{ZF0pN+3!IhefR&aQ)$K8D11=12AmbQbg$u)0Z5N8SHlp7~&5p`o*hvlUw+Y{? 
zcEP`!i0uqA;96fV{tnw8Gq&drjr3nj<u=%oYcAU&tjA!2X_<>qq|i1d<rX;TdYxT0 z3NJ7-p=}~MlkL}{z=*9q&)?YE>F&As5~B1`FSCy==wC%FRCUoDN`j))7bvK~eG9M# z`cRZG!$=$o210NVWPFIN51(WMZRW_5i~?Pm#BWJyOE6Wq4N9KD)I#4@D+t;2sRv2j z0Aav#`k^}>zq<eJ)xOo~x9<P+-KT$cd8EF=Gca<Sv)*;zr*}>ek?y%O{rS$Xn@rYn z8!LNf5^1Td9NZP}HJR>C#2v`k7;PKg>W%hcn{6VF+?&P`Og$c6RBA0PJN$VfG<UXt z_==IfmYM#V;NEqB0UXDfdyMJarRf*JX8Mmayu-N)H?W-B&u2UR{G)I0WH+svV-!1W z3l6uYZ`nD#+8t*%uWpyK=XUhBLK-H|(P}g<tSv3Rj^-OoGJ<!iH)k(W_uk9s>@*Ky z6C{|JQ-;|c%K#V0k^wldWI0R{Yv;)dGR<Omuq$H8I`Uvpqzw?m-&ML?q6wSEu!=F( zP8x}%1p#$ZZ3#S9D3)BZnl3E401j#O5;|PH5@nWr*{(3{YTQQZidfoKS3{rHI80`@ z#I6OF<L^A%qruv`wz)%OF36>Ag}+(|7TRREb{5KM#k8-lj5ck9{{+SsK)9m@X~;P@ z!K8h#jI4klGvav!PE_KBX#nSgLF8gnEIb>s$R_QH72B$D56yuEU*yznwSs0<HVO=m zzcg<z=9Mok83Hlt-tJJt=!P4ohg?BZ9v~!DeYaudYVHEU$?b>#4m)HcE3iFmrvGnY z7PLdQ=jL5LeOqn$Y;3EIhpY2S<Ke;R8hf}tc%Gb(!7+OI>K!I^r_q|nr`JLHVSjpl zNy+hvEwX}~dCC^qx~Y3?k^a-wVHoj7TuAre>(;|Ag3$Rf$`4C(q!VCn#<H{)hdZ*< z@Kxs44Mc5?rE+s-xUp=RD^@B~l?nPpG)qivi>UcRJrm3)Nl6F>N{DWgv~2-;0)@gj zQpLK~b8%!WB;{k0SqDRb%`~9#p^zhXf7THTD0V<M*v4j5wzN`%`|X<BLqk+90Om)P zJb$QGWQ-<qP!xK`l{U47VipHcTac_(<+=Ts$c?jk0V50el4WG((BY}6g{#3d5p5d) zlHG4Ewwmp%!+v<b0!oV=oV(^C&n%q&%}qajPu=l|ue`g&d_ki3(7ILj%Y7`K_Auw@ z6&`!BlV$c`UbQ<AOh0QhnyuaG*0=9Xra#}4@PxX9i=MpW-jgL>U@O!Id(uwBi<n2u z#u`8}O5+14R<Y+UiBttt81+7?cv9qm3WuiT(-LwjM+^x}O?-*oC-$dL=zS8RfI)!t zK8sPKG@&DXPc1)!&fx)H_B0pi{(vJJ+y&q_&`?(()b?0O>5_O-Q5~p1WN=ca$k0}j z)NC%cK-%eQ<zzHKP>8f)Q`6hiy&Yz`?U8`Xcxd-cCkAf+o6AaBQr*+PdapkkwaPD* z*k9t7Xxx9q@ZLxVc5utSLZUbw-v9b%KTSXW=E0EFyPNH2>rVY>|Jjev+Z8DXsNMp9 zd|zITdK*I8dlIZ+p;VQa>n1y?4l&(Q7sirRl$s+|VdG&^MomrxQx}jW_rs(qMq=uP zg0r?SrZr(aY!W8gL<~L>k@>Vs6wD9A)F!J|i^WmMm6Ix;i(>N%-GISMj{V@YL?6Z# zEYX7nkBl3iwB@LBR}-py4z#niidr0%H=yz%VqQ~+zlhuO%BqA;xteR!NETSq`r9x9 z7h40ht#fFDt9pS=EkGZ@UdLP$92|DCLKhL}HNbm}cI4D36IP!>!;*EyArcoAArCn^ z(Nbdc!>UPkIu;i`P~-;3W6%HR(AFb=eIL>;2X@xaUhI4Lw%vDBy4si}G0?HGb7A-9 zjr)2NSM}Z78w$q4R{79{@5`3@#_GeS=w`z!%zJH!-7<9bbsm44KYGa`Z>WE?ee1)! 
zI!eoXUG_xpaK}}NaL=yZ-D_J1oBB80F}V56oz$!usun8|eF}OP5eKivK65X)^=pvD zRjJ3pYABxcqh-~l@r?0fJ6UoL+E2B`Q{8%)<HSx?Bx*IEhPqp-M{k@kH58;2Nt_59 z%poP8>Jv>k*3nmwIasHDsWX!5(2x488ib0mQ7koGJRyb51-f~<5VlNJ>NgixD!92s zzf=;T=0+iryVZk6P9~OJKtoo&?XlSGIo*qTm!ciD4x;&KPiV}7zahjzwxl_LjROf) zu|JxVZ*x}osaP>KxT;04gB-ARX+b<l?Y6Z8aD)tPWw^Gvqf@XQ&%x$(ENsHn{TVSX ztV0Z2v;1FOSXh#d^N6z5{F6oz>aHK$v1{a>9s4e~%JmfL4ki%lKEHBc&}p%H&aRqm zP1IkM7+th=*U0@>?ON?HG`hA%4mA(leCu<oh6mjy2VfP?c~S1lZIhzhz2Bf9SLHW1 zH4k_8jKmNKuev$DZas?JjyO1e;}*RHVI+-z{4jSA^Xmrb8Y!Kz4Od)Sg$`d8OI}SA zm05AE6CX*uc3VWf^0d~id@wfBeI*)fLQ_4G8qu4s+e)nA+JNbUv6g`~DAA;!$0Rw5 z^N6IzD$P-xM_(r^sECjS(2-eZyM*TJm|PM-2`<hPhp~1gc9*Eb32jX`hN#ZI3s<eb zni{sL1*aPmTC@Ux%P}zlGW%$r-BlHyMVGPYV25(9q^+WMFUOom?L@m{^`gQXZE6`< zc`@j$66)39`~$*^b|;-B;-*GqTrTSnomhhORR<bM?I28r50EEdj}lC#seDehyhxuX zw{c#1E(XMMX8h0TN@3hSqie(bi*5|Jtm<5C-h21Di#FbV>1$0MQ-!ac1$OQCg_q49 zwg2VO_x@_z_s(2kEi@L+$}jVUTl(uRI#APcX~*h<mW!9wtXsbEj`Qwnyv%;lwHLnk zSIaLO$T8n~T|R3XSQ@<b!nz9*vt89Wjo<yorOC$5$lk|n`RTXz?wWmZqP{Gkw0w8p z(dTB#1Dh7)v+Aa{+Ulb09n+dGxTgEd?g?CQ`|#1{9Oe3)Hx7N6-^ve3Jy_ruyz-=$ z2RV`!%xi#_sSPpJnxW|>;cuw|6i!TCKw=%3^Ju{0;Tn?l0)wwlU}&hZYIZo4X}Y_a zCKNig0*tl`;YE-`Gwani-xf;1Ga&;Ud`io6{a!U;y&%sNq=c6-{GCauFCxdK2IFrs z)<OB6+B9gJB#Y;DlYj7GXehy7NCM>K(hxBdEcaRgR?P|b&n08X&T9;W)goXLt%!{c zB!z+r4nbgIZsWz2RtV2!=0#XN&9aXav*qlkuC^d|1wU1frhotKk3y~v=CB7`aby3{ zyLZGx78Xp8K!3?qva&#D9?2~cWR8Z-M%k?NrN3={?!fEmm)&5Lt}n1Vkxcx~jXS=l z6tU#lYs{VQkhSqDehc_V&XJA?pWhFkzb9t|fYx%BCnbkyL=WnMO+v*7AumLi$2zgZ z8cHqIiG4bWvP<e*gjOifsqwTZM)fGIl}J(0OQ^*W%}Wzfj0(!>lu_@wMnUtGD$_rK z0;&b=7Z;zBaiV*Z;*H>jvBz8RRaJT^2r(j*rlA9UKiSnsrMh~6s34Tq%8`7VVg-Sq zh)Bvv)mUvI#dIQ^_a?NiD)jT(NVIL~AUd$5Ed?Lkm{AEx)SV3=z>R>-6K;Wqhq>_o z82*K(U|lp1k_6Pfb9JJ6vD-#RE{k)q(*(^*lK@si3-pa>-mCzlKBuLPBoxr<<!B7x zAh)kZs)pIuy870QBZt4!S6X&UfQ8FL5gwM=n!$K${OoT>Zrgft$6!DnUDN9Gh26lw z)_tQRTxj=C^Mtm>yQ>`OYtl#G#fs?b_3XK;H|N~^`i_L@cMCnP>gEX6?PRy5f4DK= zUwLNL+QEH~JT%nr^Oksg-uU{JyZZfBi_`D&2ZR2#SC^z8POl^a3obdambTUKJhXG0 
zpx>CRp|RefGacJp)wGaipmf-ZrWymHj0$Sh<_NVMba5i6FAU2G%|J6PhWQodu0*w7 z%F6R5mJ~$97h&E?ppC$UL5EbCOXm!JLv^mf5r2iXIzz3+%qLsrLZj7O41HU~V|P#Q zfM(0`0)I>Tn9r2(c?J>14};Js*cdG0<~vRDy}{S{$$B>HY=;swd6Z(=A{!0oG;tQ* zRkEkAK1<3k7e#%avz@XrecVlWQ3tg5P8pKetZv7<AI#`Mm%$T(l(`bGOk4|UE|h{z z#g%w-Fcb-aVAp9aU-v~W5nmDi4b~Hh!;BcsY61Vy`Vr9>3W?x+EE6V+d`R~q_1~nW zI!Z@YyIKQ-G?k&=oIw7_o74~xDB@*SEZhbR&WW{|cq}l>XV5E6gw6s$p&T5MjY9vB zZSB2(bmfmnXE$E@G5g@RSY0u#jz(8QP-yJ~&KD02&3QF-s_j{W>&)QF`@QLd>6cn- zUVaJTCYYRv*5u^GBFV0`%{$%Emp6S&##|qKaE7}L&w>vQ>QVR{5<;`NR5?N;x;&!F zr?n!ya*>$4kuEk3(XN;XeJt2IfvGe6Cz9xGn#oBH)P_k`$}AT+UDOG4ZK?@IlY>c8 z{VYh-l*B1=1maX-b3iM)0h1<ZnAX3w=ODAcwDL7<6LMABzqvF0eDb!@8+xqHg%3Ec z+4t{nF_(AKJyz4^>(Z+pNT<67jlR{xux+%l+=@0Xz>F4efGu+1phT-0K;%v@K`f&A zn5~2(YLy6^lT{)Lsj5Jm24fTyLsU~EO?8AIngn;ES->Wx8f}_Qyv+<Oo~RDI0U!?1 zW<=vy9H7RLu)u*2<omOrhiEyrw{=tM;c0yj?AkYA|Hbsrm~rrq`&W*fESl74N1`=H zUs`$a*1q;Va{A-f_dYQSzVq|iN*`Crj`tZJ!>-9!rDP#_Z^Wn|S%b7k(nBv<gO_y0 zQgy7*1JD4tC3G2+OmH``c0^lsf-$iULn;Hasags-dDwR(mMrCPw(X@rvPoQiE=JYH zSZa>m!k`>i;Fe9|(^$2MCgn5qOEV&=YH_4>qt19KDyR@#K{H?r=1S@Ym|qv-q@9c& zqi`|p@+6S5MQpmL4m6RypeML#bS0q$Fl9D2U<w6pd;Bv2AW>bGdSeKXb(0bJk4-q_ z2C-4H)Cj8zRI^XF1zZo^-|C+3t~fA|{_$-?6J6t7_Mj_aw3NE0d%Kr^+wI>Lj(VW# zsst6(HU6<oV=tv&^0fS$VtsHcFq-q~Uol3XHS@It<*xKw`Np6<peSoYo*jzPZJVyQ zO*h(RIohU8dO}Jzk@dBrG2$TPqJ|+nPmZM`&L#y76VchTBZ7c75=6r|BUJy+N@|#P zfj1CPQw9W37-=2x1xcbEr53bNomHEOHmZoF0(wIPteOpNRH!!w+D{Xx&g_Ib)222+ zm7GW~^U)lUR@ij_yWx?lKwMA7_GF<IQyhdxHbBQZ9`WZ^H*K2K-TZ{%ze6NU|7%@4 zS8t21fxDWO0A1bc=;v9NG^v_@WIC^!AKQ6Obw!ztN(Zh34muZnR-JT@lq{t>uf^2F zO%+}Zd=*Phgq_Kvk1#bFQKy~Og79GAl&wP*6`_hNAtop4wForNaBUNFk*Y|ts)*X8 z3QVJdtFVqH6nG-#S%T3F^;v=XlnHGhfm^D-X~cV$tn`U+6B<$#WYx1sbQqu*?b9VN z0lYyD+Gi?Fa)Q-f5no{EYNH_#j0Da6f!2HFf?&n;*Is`k{UM9K+xA;$slip*nEve0 zqk;b0hWGTb9`7`)qU47;{tKH7Mpxy%cK0D>ef~q{5H9kA_aE=NZU0c8s4Li~d<6b+ zK2mRmcT{uZ<RFu>rIbc~a+2yCK_i!pHIpDms-dVngwe2pM#CQAHc55>0tJpLsu>5Z zZ`T0;63PP;e$*i>5$;1|-DD_Qscu_hwDhSt$gxfbH5{0y#`32{ghya@Ee~O~1cE@+ 
zYh^;4(SX0*RDFaEXA%~KJ7{Iz^NnuuLIdU;RKs3ci$-}k>P8_bB}?aW5inY&15s#d zkpWIP2x1O{F~-;%VkAY*0wyM_qdb-Fo)|B6pYZ(iQ$J^oH{-Kab;@<J(}mD6^rP;* z;DejOjaxS18$1}9Dv9Y<&sEBD+!FEiq&h5#X+G2;QP^sE0N<rd1<_1vq6>bZHrRp5 zoozBp+81B$yqA&KACIN)<=x`9FOiZ)`n?ADvVs_<{V9)F7)yGHzF+b{g+jc97Yc3w z`MlyKygBGT^qXzevLpa%!0NV8RYcF2(nQ6y$Rt=>F`|~1G{$U9NnD0jnv-dZ)bhQ7 z!FXwNz&U+CPfaE@oxLk<OFwe^LY4@-gR^{$JtS-%^)83;+ZTYqL130@ZCvXH{bZ;G z7EdpuPK(p~BnvRQz~XJQJB_e-G2ub>*n-nS{KyJAOTjvt0GT2t5y=H&VR*qje3D~g z={|vEcuj$Fk)+KA9-+?1Jegc>HB7kZ+<RuM6GI%0R=RoO92VOVf0P1i$TTD(3PUw^ z$7fFn)31Mz8t>tRLsOK#!0K&x7v{}^7ma#1V-c)NjTEs+XgLHWn+C$3cLU1<SPMgZ zBwQ>{P12CWB=E6)56>teR!@DV(f4B1WiCx0iRPQa*nR7cR5Dlm$*H&Kc5EP9@X-tl zTn&iIU<<|=SPXHDB`qvPFtv*GBsH)|CS7^tCtM{Myz&HVWFD*mb8exa<HCZs#_UvO z<qTCN3&57tN9&?NWHRfoJZTM4R35Dr>2m~wlCD26swf+RlTAA4TY0S6?G^*L+_3jN zd){^S4bP0X6y&)}QN+R)yvuxPtCWO<ES3Q9k2+hYw#~1Hpb!@3$V6twwJO+5MJ-X5 z_C!Pi2QY`H>i$qJ_U)+it!npETGao0HTIM=7L9e!C#->Gj|g>;v?^#&ASsT<28uap zL5%qfnJ_8Sc{xef*BX#D@5Afo<T1Z^>C&R9t6*|<z+iYaebQ*P=ceD1o2})8CKGgW zFISXswa3(H%ma&n-DOr3xI!ZfGR)Zq{4ydZn8eNg@coKX;VI9{#d?dua^}}oW-};f zKf8MFqyVaKkoQ9&(<o!B3Yp1SAAgG+oYC2G_TR6Mn<h-hkW*ARt0vCv<mWoDUT_=g zG=U8sI_?9pUfnP-4KJn2^y}Q-l{HkEO|+s(+pj3zmk0`KmtZ01t#5)ZI;As_ChN+n zOkD_N%EhQ%UIq8k4EHjzl35Ogk1{_&eKgGsz_<!%oxBc<mTHl0%oVBEmtAO?GSk^I zh6WG-qjV1T1L=#FllmtzVQda+FA$3qv(^|wtz8RyuA{k&ce!0=KK5Gabc;*_$F81* z;(Rz>llHi#7kxgd6G6L`M_Y5vQKNizpJ$rcS;qEE87p?3y`m9Yy8psk;*L_hj+jht z#QY{7JCVBd8NX(Pw<@OYCeg4Aon`378!)Fa2x&~sjQGWg`p7fjt!Z8&X%dhV&5VwQ z!9pZt={OZ54L*GmBJt?%$DTZV^MlW`o#~H0eg%S{-q&AdRfaupK6mPqHy?lgztXE; zP5+#kUw)gp)BgqE$|aibZa`b-Nwv~`De0hwhQ<p<hNPE%9(e{t)HMbN?f*t#!qkYE zfSE9d0J>+IMonqLGG%g_i4Ld1O{kujYO-oMcmd4Q#3apxRWr>_ARHoUYc-`pNm>m> z8fAp*(86VTCX%*cv=<vdXwERTakDNMLyI0n2iU$uuX?g4{>8!3`K{n{_dPPcu+d;o zKVACEmuH@RbZRr5^D1{;w(W&7Q6IhOq`XeFRSc-P9&@6JphuRn2_#LOCe$gI8jq+m zrnDJw1Z@^x*?_o-B*F4*kC`ncUcbI#RZF9#9!qQ9O4P$Nn^xoi5F=rQBW3i@sExwf zr?xGLn#-8_&^(h;I<T0C|Ird-HGpTt-UwKs<&<sqm(4k=n5%Q7Zu-hQhL0VbW3)K( 
z997oNz#W56eWu&O-;G^!&K%B^v2U~#AL#Eh<mMF3EUB1%T`l+)o)y;dnwI{rAQrS? zBMX5kyen*EuJpH3ayEHcG8DBz=mBX3AOJ{K!&)fWhSG1!W2w2u+0`(ZxnwXMS%b;> z(*}bbmBL^;@%Nm;NVXIVCQKU$i5t2OXpE%Ih3Sly)y2eF#P&6W$QZER4abZmE!9I+ zNhoeH;>(6Fr=}GHcgB)(i_5X=9ShqfVp)L19YTz!`j_wT-f>m>)yadfK2C*hE&1tP zH*bgxIJ!NFMu*~C*1a+`+wNQ0X5-JT9BNNLI+}iHlGm65*<872?e!R-2jcg3MV(dl zp;fyAakvWjUSPZYHQ3(_={_l0LN#rrE;$|Uf@)aHr3wv3RNv`Tp1!<VMyqy!)CtAO z(^>fesgLo3as>FYejb~oGfOUrH??U#^u=Pr$aV4Da{HskeCu?t2#K;QE7SF*HQ+su zazjCx(?|7RO{*@b*^**{e8>%z$~u?@<O?us17(nkNxP1u@?(m#ymaS%Lx0EI$8H)q z3O<>9t1(m(eO6t&<C7EHSN)CQaN}QFOsg*M>s+;Bm%9Im&&0dmG%KH}fJLu9vSXjf z_oZ?EjA7jHB4Wg^W%6%mo`T`1f@+20@*0dpA~u|aRujCZtHEeN3_p65EyBPueHCA| z+L9(SD9943BvNRCltLo{=>R2hN?I{#Han*UXiwslw3vw$7d<V;h_o1bd7v8&SXHLl z(X(6nF`Te`N}d*hNFcbpE8%s6*Tmy6h4~%fe0PZzo1U;z83=to)jx4#ux-IsuB<u% znAoi-4_<9HUcG2;P#>c|l=q-dHiBwQ_u6$Z5vmNxf@^LZ8wvuL`!IEOM0K7fF~p=Z zL=M_X+4GP;gkpNirjOaSM!H}_A`+2nXd$QWz#45Sqob-ihZ-JdXx8thd?CWoMDM~& ziqK5VZV(TXo8prL7ZI~z(4}o=bL4u%K2aIF<}chuF`D{L&#tRFGSfaix7-?8-`BA! z;lqxF)YQuO;N$<O8oYlqn>+P%`qu+nuTHq<&e_o3O!eZ$+@%BcG7I>j;C4>>sZPvs zA!?Ragb54k1c|>$osqOkvH}CW3^eFsjYSRF>Ad*mxk#SHLckZaj^%RBcDzCxxZbRA zF9XOO4G*Jrm&~b5f8cPtLf-U$8FI=+5}Sa;%e-ZIph&Xw_s;$+_V@6rK8OxyChfU% z`_Hb9_y+j>M)RUbc0$IL99yo(T$_0(BVnS8(b_r-c}YAulk7gP150rMU?fIsw2hKz zi6fr)+$Al5KrQL2Cl+_S-W*OlKC_VuMiB!VD`2R_fmq<e(Uj|@E=Iu(k!?-7tF$O5 zk#A(BjWu}b*5F!2S+L0YTX&+v{jkMS_^@L2I{g)Ccgq6I=NGo#4P33b{Dtq^j0V7m zm32PcG`${Ymz28xfxlbstSKtWqW|)O`DaH42H3v^<v@BGB+UBxBmFUDqMGF9MQ4A? 
z-k9acH3xj*bGI357o7bL+PxJqVgz-xf!&;05t%PQ5G~$`!AZ<n*XXiF)=PBtx+@~; z@M+Bp%zvZ4`WXAk8)iKcxX`MapVH>If3KE5b@IhiZ~i~*y?K0F<&`&nH!YTB`D(W; z%d#xXvaHC8B+IhA+wm4BIF1P>H6etMCLuIrC{1aarkRwXG=!xT8kP@qXi6z%7~Xqj zCD62!CZ&|=Fd549M<}JVr5)ZbL;KD!{Y_aC@%x_V%3DZ67Ut9Uk6#yCx{?*$^PK0L z^PKZN-y>_6SIJ82VO0vht<)W!(n)n_*H0dPna7Kz%BAvD$)#p{bjrkjmeNm5d7Sch zRpV{XJ@bmh-X_UYLHa{wl{}SQ)k>f6{L3!(65iZNZ>Eo0Nq>IuxnKSC74{O|d>Oqt zhwHP{05tu^gvOO@4u5AwZq}d8;akmlLol1e-_c8OnLRd1S7NI62kTn1on4o%ylhpe znGQJf|HGf%+gJrQFbNe~zhM)L<r=8L;bO4qSo4PF_CBiKLGLU)I#aQ=fOHZlfi^o= ziJ0KXeb~Y3Hn%~lCC%PPr6)K2Ekocm;)9|*vAJ~#-Fyp<)ugVOG^`KH#_@+j7QOFA zONrj{McJZvV8$MA+}N+^yxqfHPwDl~#s|VNzjVXWUrVw#CZu+mN+o$ugV7N1TZHR} zZShF3wAA1ag+`-K>5YdHBdva3x~}t2daJlel0wn%JNWW{#Mi`_UBMg&gA`5Nl@=^{ zgNVxtzwPTb=>!>5cC|^o28Uxnk$Ly`+sn<xIgx0*jLMQ!n$7&tXa&*JaA)y@^f3#= zwngCI4ncoAL9UvsA(tP7Ps1RHMA4*EN>ntrGV|3HSL%<cY=Qs-w7ZbAVeN0OdQ?xV z)0tL=|KrpqczO-uSF~&xn}f`n-T*wqOl-{@Stl<7|4`iP%=r2Fl#q`Ea3n<y&t$5o z_d6;~Uft9@d;O25u%#}>Ha2R1e91>l$$KT?^^w^euTbJBtCbg;3Qs>Y-*Vr>56gYF zfLS-a#f?3E$5{+Cf@0rB`1(5T5-w*aS%7zn)iiUc0s<sf^mM!uiYwK}@;>Ip*w70f z;a4f4YeFH5#)NsT`wiuCg|C{N7+ky#T4~oI=%ZS&9w`X)lA}%$0I4F%puqrec`d-V zR`b8(e>qwe3m2qd&!5J}!j+k{`9v=`$po)2P<U}`ro|^1O>yxFL3Y?$Pk2*{Cp#?p zs~l#xsE;1j-@8*Uc*%B;^ZSIgpm#3?Z;iso^LhhUi}pJ07q6J(iik+g#2JBz!bA_s z1O^x!Apk~4y$~%7`U5uph>s8P`!HE5;^=~iAO%(f)~#bT;Ni2a0xyme(&EFa0(-Hs z$>qlx-~R>Fgde)Y%-hX!^!HNtVEOqLh{f*|3Z4q1rQw&Z_>6p3{w3vOhnD#<L>C1` z7_JMD&$rBScT)fFVYkU%ao*wqQIh)?VXMgznxjACe^L1mCaXTojt!ZX3z5&#yByvd zR$af=YkOcQblwtrV*A5B{nFLfXY{_uuQ$%<ki=xPlKY#pL0X_$nGiSBanhBXYF|Er zGgzO2tH<+;5S#&9({(K0quDEh!}N5)O!;81P65nNN2r%?Xc=+@%iHm`9+<9khHAjT zQ6UzAIU+d&&S*HOP>-0~bpZnB1Hp(R;Evp4uJqe$OS<FXupG9=x7z$6IigFQ?+OTi zigpW@=k!){r4Bfov|j$9(Qc|1>%rf{?49%$TZ?{fl>5S2tp+AjEluUS44e}SFoXqI zMvzyv2Z^>Gt(#&5Nt-cR2uOIUKGGT{KQUm%$fh(M3P)`D3ZLIYiO5r~(Rr+h*xbT$ zSGDz_nMW6}7N5wDW;5d%eiFa&zw?&;n~`i=L%6LCAY1l(TIRYHOy&dJGH&l#$$T`P z?`4PufR6HuS&5ui#^TD*u{lTtWO!&?19BELG+kYUZLl9AA70Ghx=XD23hx=DuD16g 
z@*Esm!VWj*yeI4|eacXN`r;A<YcF64_QsoACe!KHK6c0EO(VnSy8`01o$**tPb|@% zyLIFG;lYit(*kFD@3_u{xbSjx>Z5zofM`ewuo22RKSf3*wdkY-Pba0_kh5*3_^1c* z5k1In(5^Qn^H<sZ%^>akNQwp+0Zl@~we4(%k^N$%Zm(iB@~beJo>$OeiAkO4u)GWz z-agbV+0Z^KCLk%7$?7o9N|&(4XUaD%Pg1KhdL{B%Z80}tF)yQvnE|}y!!TdV#$mp) zZP;8ou|?0ek(Vlm^;zX|&~;XZX_wh;IZ0$q&R1ELQ8d2U@(Z$U7ZI5~HNR3<n<21p zb}$jjEaIHg0i$__nX}sn!pCa#fedS17LkK7Py!H#1`t7`S;^rxP{cyn8LK13i#UR` z^H$IeNv4TOZ#p6roX-iSU_A7Cto$yywzS@979I_67s^)F_T99%-z?=H;47Y&1YgSJ zjt<7}Xy<o?14ff*3hIfuBV+{cY#_X&qwt+=QsmiKU~tV!{UhU&6f=US!RO~YD+@3C z%XOIaXpGW%A!-tgT^o6UkG2Jkf*z0|Nt%4X7)tfj#0y9KwNW##zpX5~Rx-tcRfQud z;TMJ983hv`o}f4oc(1~GhWBa#9ak<$8c~VW*Wp2~lyre*#^U*;1~B3P^eOJh48c5A zH!PuBgZf7_R4zajgwd7RdLogpwqi-s$hP*Ba#E>NtPIxGFKEKbs4_ses>Mgy*d!hP zkirK^oESl`27sB!tl&eeK0n?{6G1IlN*$tWqV%2vnV5$ggLh)gO~uqc5)U*Ot_76U z4dx{Wu00K)Z0mi&P}Eb^J-%c0@R)Nn63_C_n9Wm`b*Ds`J?PQ#TlIt4zKLMK;p3y> zA-2yLCQG*s^_g*yl3g$=3-hW!L{t{dtR_@eu6z{HKvwCs=7yKhe)$Vxv_xdgW~2WL zL2P|)Zuo!R@GbFC7XOz=e&LS8Zfff}&eroc|M#xvMPu4}sso@2P|&8b7+u4E$yJ?q zX1S169WYyQXn7;-!lC8-7j@U{sbzXi$Jm<w?*HO7o!+AZ+8%8{aH9?_#`+r3#PpZI zQ;PFWS{JgaJ@bxQ=fAEuo_y9~`&D`eYrB+N0oL3Lw7|+z@C>DhlAI#{Wr+MSfZTmd z8AzT2DhmgQW5D|mvs3ooQ48yLLL&Dpu(K00R(hBMQ?+*#7CW_U1w9@o9T-FF!e4N2 za}?`3^tE-x-xs0>g)g&X2e>Hd)N$T=AnZeUINsXIelkCdJiZXpO#ovT=Po@1$_`Uf zHIyA@XwDfX|9+qywWbl!ZmOiRmTGusg|&loV^vKpwRL*r-M%lHZDmMY>AD9kABe&e zf^$5SR#&Hc9cTB7tZykrZx&>I!C5j7Q8;t2XMLk}XMLmd?fq#XhM`$&?*8B#_0L*u zx=$ZlQ|&&H(fyF_GfxXjf#}(-=zJ4++8Z61XW%~gR)g~m8FW}g4f43q4qF!BBH%W; z{j_&Xb^L{?d`L4=RKHGAVP6fy7!eY@6q<{&Eq{6#=@d>eU;~IFiGj9i6{D5GL}PJQ zP|O{*GHj_D2P$vMP_=;IC1)k7_t}Zl3afCxw*3Rl7<{0x_@24DU0^jks%u(Nt-X=U z)sp^)p#MN3rs(W~iZ)P&`Nn8%3TIbNYdQo)G0L}q-k{*Gia4UmXm@I$`nAa8B-=m? 
zUp);NYdsC6s(^5aCxKu&s;vfYxWuYD*|DV&!;h(o7PSI}gv`5>S{G(}LagGcnnF7H zU^+F8OXh#K@r<1z`}Q#yGXKt4;mw);kiv6({^S-BWo){i2P(wkS77jT68d;9<^EG+ zP0BTpfT9pe09}<3kg_O6O?+z42xcZ203k~e$rW;qQ{|iiiARhyrbMI&iLHz)O~>-p z>`<4hZXy{~2O+K|hQw<kaaC*0eZ{uDQEKk%83hSWu7f)GdOFbNKcyR`K0i@A8;amo z&`$jyfq1Kdpr=#4Xgy$X)V1_1B3d$P?HkXkqr^8cfOBG$)?gG`pd_qQ*^4fwi5qG; z$jVhSL7`kkVL=s|(`PN;kK{R9z@NxXCoR3z_Mq|Pq-Eg~=Yj{6?mbPa^7z_0n4fb* z3X<($SV7$?)Qj9)2&h&T2c3i#tUnc65JU<LEjU%~^ag0FpABFTu!lK*ZU#p-A3k{E zw|AIrW_iKys5~<Z9-D_H6!sN;8&zLs?_>Oe?!!lrqqDn@d2Z}9_fhDB6HBl<ip_nx z`?z<m>zY}Qkqf$y%G$HOk9l6|boUWkdbgwMhRaazus<+-I&`OS_eq{<TDK&2FKO=c zr_e1W9H*&1WHr3Wsfs4*)kk*#l`k4$MOx-6o)=jnt@5h)W<je!6{%i;R`+xIa;Ja6 z{2N!pNd2jG0Dr1YXrt8H#ZH4|J-rDyomh4TBhG*;vfJ$S+pxbE&K4!C5h-<=1?PWt znvIP7H`|M-InZ7Va8ZnCeE)6)%YhPM8CW`K)Imr@5IKVj#q1;12X_t$)#72%vzOo@ zA$()_UF(B;=F0WX`J#diJe^=>ss^+5gVTuQc%kj0w$|mX{A3%2>20Tz!7oqF{R><> z^3ECfW-~X!y~^cM<a*HapKHMb3R>W3kaEuTke^1(n(H8c)R!WLBV5m0!z~@?*#cyi zS{G4vga-)+;7qJxdT0f~9<MBkE7@bJ3E}}*ScEJZm^)V*WYvW<b1<3rPG=z%uT0WX z29whSB{<0~#h(y;TCSx7op4#z3&f)8gNIj}VtBL=bMJwJYQ!7-F~CEqRVinJ<HsWl zuu+UIfaRA7vZl3)6K`5EUkq#jczlZa;(CNuNE8E#Th61Wk;X38u`DJKqtm%&p)VEa zTDoiBlfr}l(#=0iX<^BTw6GajVGIQ9f4JxM-QiS3+SXaPH~J;>i-kY%sCF;9qK2QG zNhoJ9lJ8*DedMJVUJ?`W!gKy$)I6R4?dA*l#N+xmkRg8S$1m|9{+>w{Q_Mj7dmnJ- zZ(?bxfj0P9G42G*RL(`q#E|9<am7BPogoO&2t5QgQWXK1VAF3wV)vxJv`ltY`f8w) z;9F6>XrKjHt`u7q<T{*nzs@Mzwal)|hIYB4vbr%_gV5-hy9m;`qIle(-FJnk?VeZ_ zNKwc61DEl)0-*l+EmyR!o>&85a?|7`^~>9r?=KVzBOm$no<|!=Q=j;xuxXNcnf9B) zf7A2*PXd=L`&=InfBaVeiF;PO^6iy->Aop11S}+cW`-LebST5HpAHQ<cur`@Aqa-h zkZLV@f|LvdLwLycu09eBq;cVpNCF<trcuuYAX(U_<xMred>(MgnT)CCv!~@a&g8i5 ze?*Q1i_cgOFe!E}`p|?(IQdzm#kmOanb~t~#rcTvE3@IkwETwuNA(akZAK5<dip;+ z5zdP|&P9Z;opfTJkMcST*sgf*Ns$WKGPbATdox@ww}hKIqc&2j_2N{%T{}nj!v`b# z_dd+?RBP0OvvMcgO(;nK=ss-jAv_y89Txf`yrEUYLaUuoxRJUxbhl=eKC9|LkxLl@ z%cow4E2=f~;TVHcxb|~diW9HG-{XJG-)F{K&qJey|NZ`0fMFBCH&cd{nBa0v6bs<o zsf3fcZgTtJ9_fISu~rT;bRvY`d`zuF=Ay2doP8ZS<5iMn9R~3{X)Il^Xkc)J$l5ry 
zBLTUMkpGysgQs!W=ZIEZKW8)l_Pmgbx=AVS`Xxh26?R?#h*9gSx(V@96T0XgE>1is zhs@9c;~{74!ejLPa|_KL<TjElprX2bESV3s(luf{#s$`0hj<cQ3~DeAC^N?>@QET4 zZ8kX@;}x)3@XXD~FF@L+Ho~Q|W;!a+nlM8M+-g%sUr6IG<cP@uLl72Pl|)u)wW*Cj zx*!Z9OwgJ7kJ-d(Q0O7_5v{B+OF<JxQ<gTw&`3DV0#Jm9{=Lcck+g1IV#_^?eFuMd zd2PiNH*8$Be(|d<?KLH{4Xbl$Owg5RXdIWCS1ez+@p1l#h0huJRiE5E0b9WI54NBL zQ>B(;;8d=eG%y1R#*(=K(m=HT_CW*PHA|s^iV?Zt6!jueNN$EX7>_G$$J9n>WFzA# zY>ZPbqp=O<fP(?x;;Ih6K!V8WNMX5th_g{vO$aza@m&saUJS9eAv=LjZEj&06`fUG zOvJv8AfXy#wb*imqEk+9Y#~$aJ~9boq}r$0)aR$*ynq-Ukvz6H!us*d%Jb53Y41rp zyzt2j&~%o^F5E9|hpq>JLw%6T#b|XXcUPaxwbAOJAq9(thLm1fol`fQ_s5mgF*S%K z3bJtz!3L~BHxhwCT9FhMOleIjL0Ju8C8#gm8ej(r>RB1UrC<uW9__$_1kkmjtB5c{ zf}&N<plfBOSAA~J$|-u)&(L(pu+zNQo?S;BJ)K$T_}b*$ZuY{`_je&`@A{xVccJoR zeuCNNM8;P@awgudH&}1jatw>ib<<G{ogi8IEwia9N)^b1N#yk@wgUR(RAkvgBPCDs zQ4D3neh{?%u=E&BEg*W%Ug{q!QadJPl|tSY^p8bbhH&_gX$P=aJRj8_wJ_4Q^XomM zO|)n!)~Tx->x9V%ppO}{?T<HS2S%1*)2LA#+rXxDmJMnH#QkNbC68KrETUA3CbhtV zSaVBn-w4G;3sLh4CYRzNHX*{BvuaoioNZ^D?kOYh@w-3uz{IoDt!)cTvH_#GpH?<} z@=s6Qu;$_8f4bo2x3lVv3*h{dg<f=VFM->EcAv)Wz<Pz)soepcP}|Wb<V^+~@GaV4 zXGT6OYb;8qZQI5faSl5TWh|;;4}0)?x>p;d=X#I?VNRVIgey^E)+Zg_KxK1FD{IcI zw&g0S+=!CE#D^B<oUW9$*NIkDGLstE4p^+A>Try@4*}O#E5S*ow1btp6~7%;fcUkx zySdJ(6Gk<FS!GU94_*-FlW6)07)EBY_n_Sc9E=_7`8DBDI4E!^5KG9<d13Hhx_)M{ z-}vAYd$v#bO1+hi!Op%lu9i=Z>z!ulh3gZ0_Rj71M(5@qUjMV-SyT9DH?Mza%k^sK z;3k`8a>a&Eco$8yfp~>?SINQ0pJiw8Lc$bcoc339H}K_LZjjbvEv(_k;ra167oi`F zxPB9i;yCSAuoW!B<8=2&$UvTwwEj~>TA$xyqDi;P7FpejWeI7|z*<aWCoYFkynG26 z#miU1T3#-zSK~d}#?CFenqF80;)~qXi|FCipzTw);m@{iIW>j-65Et$T0TyTayiY~ zRIaw6YoT?J@%AmUDs5QLGjP`rGCJ{PYmo^I`4%nNvXyPab0?{v!!AOdE5y2k0HoMu zINy{{cNy|?dF{`KonH44vC4n~U^^N3=`O|TE>5rour9?b+R&x=_Rcv>_UFb=FP=n5 zb2cZX(UU3dS>9))aM1WQ?Ee(EhWi+gX*jHBhDgLVBy;VgZbS5=s_2U`R13a12vy7V zl3lnGZ(J2uFUGItxOx$OT?!wunV{<GI>bfmWHt+<9}oLw)d`&mL)F5>7Sj_3)fLq$ ziPU+znUwi&Y9`I4Sl%y1+%ebU&(;yx9b8tMtOqi^LrX5Yg4u_Do4R}zwz|`r^HveL zBMb=`UzHU6xV`4mm85}1KYsBlJjl@8lw9lbE9k8k*_4%8)dTq9eyg*qf88c}_m!C5 
zWk!oVO;CZk3E&ZhVKcp*MMaVWH6FTpIxW(_uhZTlP!P<9CZY!)G<Tp5Z4xv%v6kO5 zJd(Joq=Zzbq-1Rpv+tfxtS#Y9)c%VLa}?(tvv<hd9lrD8Bk%AvKLJU@KOG2mek3TF zM`NFU>Y6JiUU?)^*>I@vr%zq+4?FG{D$+Wz^IS=yM3V3WX6bi@0Dm*<h@EqBZb$5i zhvJFq)=$UEdBgX$uKuaQf7X|}8Y_5n%|-r}exIZ9vPJRMxD=nJh(Me{y_EldT~?!s znqQZl?z<XBUL0)am_(~ZUqEEj3Anl;MZ1b<bn>yGtPBE#7`9Uhn?h1YhdJ<XnWw2z zl;&j4%+7D-cI+ZB7lJUONW988wm35aoD{i39VUO2Sjt274GVe)NX7LwOl#oOSO?zW zEJFrD9>OEbBBtrpR6BuoqVXs!0{DDR+@Ab(^7e;!STeSKYc}8I+<5*>pSt1k4OdNg zc7FNJ8{d8Oyjedf?)4PXjPG<9JqUlrvH4!AA&buE8pz!2ZXuaBka5F7ZV`eU>oQ-% zH2o7A571zTj+>fdt3eNnQ)sv|%QR>S4eg5%l8CGnpjyE-t%ks9rZz0Yqk5pYU^iIz zR!gSHWOOQ4)OigEV3VAlxP&DD^)eBrikTIgwl*y20^i`N=}~Rqqd66&sy$vOf9&qw z^hkdo__@Dh4623ns_{=;6RZ??;_+!a1!rpPl+xyT-tdi!Q-dSc{sD2z46Q2U%Q!=I z8<F3M!h_^2a27<Br18^$Qgynf6@Ui_$Z8d|w8%FRNwOMN6?)C{I*YTEa7RRJQUO>$ zr~1JWm#?&W!x6gdEP^cNXL`=eBHBNv^X=9%FpBEdA#d}GuxxVFzq|dk?4qPg+P!vj zCNOM3mlzg_=<`HkLpP*yHu8vI+h8Oc<;N)4C1eD+c=B%T1ca%Cjc{h3YTTT$)Fx1| zU2y|af*6=nnJC=l6?H6}1*b8XjX_JoAx#Kfst<!5m9;a7XS{L#zejgP?i?PyBfNV! 
ze1`j$ygT<)ZW<liSb6_nz5BxHZ=CX}u<1Iw?+WBp!5KkeC?gsLckcAfx6sXFehFsg zG2VzcBhJmm8EbAFbWJc>#F`6^O4W~30FgzM#*y_91rKYSvRd<;&RE6ZXKIR?0{mc5 z0}0?M1TYtA^8xIuZ_P^M)ic6v7}#d6P>%1KLEW5*l@**<g*4-2UCB$uE*oT&(YZ&o ztY(csfcAu9N0yDnO8JJa#cmv>8nhB;7lR;mVbW->p^GA;h6I&3(eMW)>PB5NoK(75 zl`}u4HCl|ZA!T4~&9}@Pt&z#d53y%c)k6a~f6r3sYuFveq1{axJT%y3$6&j-sGcI7 z7ArO1*VxpHF2S<_Fv3(VUP7@thQC})Chn36dRrfn_X1}=Rn{^%M2g#BMdD7Wv*oJ& zNqU2uG`48<S~BND-b#%%GyJcOQ29CLP#9lv&Bre*yl@VLt{@qm;er**>Z}`X?Msgg z_yW7g4vsvOIwy|T8ERObw|s4xW(%ilh~VVaWw1Y%ac^tPAvBzpmft_SK^W-<B_8XO z`F3+<Bi0{P2P$!!+iB6w;8)MJvl(*jNFuVnx7uSwaj&$2qPmRUQ;QKR;9#ffHQKB4 zi?zjHEUR`bzBU_UD&e$a@im@x5KZM1>`kmyuB?_;QvhpZR#|GzaVA!PND*<N!H8>i zYt9|R%o;4bOzb8;B66N`IBUq?;am+)Lj}_a6UhPkghrdvK$xvl?9UH@#`oFTeVdtG z#dBr!m25k2EgHG&PJDZ23D1??SI~H7bridGatFCw9a%smMKmfEn7zE!Sw~9>)B>&h zoUjdq3z!1g0$+gqml^A%k=+$zHZ<N9!?v8BC6M))RjYtHAK2r3Tw7l#QG$^JCqT~3 zxEgdwC+{!8znG$9kQ%2_NDQlHMJK9SMGRr^m|+@wlZ!7z5Xd-yW3)Ht#&<ZtoD(nY z{P2$d<5{xQ&I#s@d(NJ)mbk-?i-Wo~h%FM_l|}pvolG5=&_~_|Tn;G5G2|;6roh0S zi@0f*M8F7`b4Tc*o9L{ok__P*K^N4NvAmR!3E+<-LkM;gWQ~Xw45FDAT<|BGFBiDf z*x_%Mev*G|4k(O&`wU!^x>Yl^p?B~7AYw`w6`l*yy(2Fk<u-7+GSW7%{wcO1XB9Y_ z;1pPF<va$a4+dt%*kq9%ti^iVzmtJkomSj`wH9fKV4qaWpd6g42Qd}ZaeUS(Q>n`h zf#Ck#G-IkrtT}!6ouG`%OaoS@CE())edysL&1Tp1ZU1fXd$aYMwMpTz&MOPM7}aLs z_32xly1euHlen2KnYN*XSzU{@9-^985AK+an_`P>O=?U`2-^{FsSDePhY80a@ON0) z7Rgx{7S)2)&Q)7z;j3vIS!8tqwvk3+p6g=6M7qFyo?F1&>;la4WQQdS;xWZVeB}G3 zQb&D?K{U`%r9(=RQ@zC6-$ljqfORU;xrh!?v+BawYBLuAE!C3}8CbKEyYR2)0wmR~ zW?ID~bPXN<;)P&G34cKHCm+_&<+)rZ$=^w8R~(Tr9p=%f1Yo#`+fj{WXdOTz?>&|; z)z+bualDqwknx;(2?+a$LakkUPdnS0?KJf}!aQFD6&l7KPs0;1kx7YI_ZtOkaA7wj zO>QTNRj45CZKvcv_&w{}dT5PQ5f1JB?&tg$Lr800=ZlD)3tv5#wV3N@j{o*T)`H^6 z`x%GoFz~CnGts9a=SBntYY;<pQMsJ!!Y~##Md!odR2@NtKAoqpMnbJ3l<B1fnM<nL zi>v`7RbALW$hO5fld7IX_}qdX-*ezt&8_@g_~MkSy}?t}ves%SIS;~B&_+Lg`wVUL zr@o>$PP4Ca(>PAucFqDceS}McM_Q{<(ixXE8ek(ZaO*<wfoW%4LQ_*2o#3p*q_&?c zcNt<ShzB&sM19@}00<0-12k1EK|o|LE*(8JfW8Mza(F792Ei%ru>4+<ox($wpg&%C 
zv-AbM$tV1cITDC<osz#*7xGwMlVp=#;C-IB)lv3mJ@1v!GVY5OX7YCP;Mio3MUA(U z`Cd1oT1Bfxur>>&KSiAus<#iR6Gjz)m<PB`^i<@RIR-%$Km`dFzJlYl0^zG2JxT0{ zQ{ls&Z=hj;Jq_^DD8VZj3j}&DnyAYn)Cz$7r;MdF)NrC$tOqz<b#q%g3qLbj?T=HJ zq4Q#@zN?2_byzX(3pxm~Mlr1m>0Jauv({PT&q1Jbfn=VRIq(p^=e(Fb=fnVo%g>YK zb9!CKPCL-;__W4p&gdMJUdB_XR)Ds8=2T7$V0nwSK9OX8sdZ5T>$8;B=OP+cPxC4j zIhk{^arI6(&0L7Z5Fr3=f!kJ`vRaDuX~z1D;<ppB&xt;Uh%L1~3Cx2xk!hu^uQa1m z$k^V&)O;!I@>H<Blf7WIs#qDq&8iK=!JMG-&M6Zcph>^5b>bhLJvBUXUaK_Mju)oR zbD5AUVqC0j|Fm#l*Uk&tK$=Dv4#u<rb_HGmlS1McWrwJmUQgBZ;#d{*l_GQ@3iT=3 zy8s6|n|2nk#f$U`j9Z}rfxk+_hUn}aKmcRwDG`j0QcrbboDpE#P$|VBj|LD~V1625 ze&~Hp>wX#Ip2BRKHPdS+ZyDX$*0TAR=fp4kx(}*a&)Eh4`oRTiaM~8uGfvw+?iX6d z4TuniBFlN{2tX}gt;`A1sS6YpAgj2SlI39UsW?W(XA*1~30517LIo!rmWk7JDjlVh za<7(B>5b>p?9o&WyIdSUSQ{_qQP_y@atZ}ar4VS?sv0r5M(MTnm;q_>f?7SkFxg7# z4}YlCWz`z&0o0;6wF@kYD0wo1i!^LiRt*6Soi|q01fukIuN7}A5?s(cp;K|;rsgll zDY-71`Mc6#^Kfb+RrWrnT+{5je6j1IsK+;l@dAVwPl(pCleu%}uFiqQe&M>4h%csf zJPlivq6x+UiW$8WGlErgo&{2Er0dPBkIbPhNsyK#r0F!~g%FAyu41aJB7!9;E>(G) z#=sKK4<XY+XcEiS1OzfAu=*m1cR}ggQ^J|7I+r)hdK57S9my>k-&cx!hDKyc^+20- zf0?BsSVt|NN`l7VoHB2LIn-x|^x!9F?99+>v!XpUOR*%+<{&;jSE{EbECydOcv*<A zc$mZdLc(MyIQW!6(-$ULA(n<yC=E%aE1vJy(hX4j)}rELgre(M9#SU24kU|=gBpWm zhhk<(79lu_8A?p1`3w1=CS3~j9K_fRI(4)4fS@;ul`Os{gbgR9TUrSarG|*+%D*s4 zkqjm!!j5oiKb2*BG~I5;5eaZs%ygJ?x`LF)Eh=@bfzd=#rv&jD6OGs@`Hl%b1^USP zAXQ3F;&I{MOoz+H6L)+t5i+9*-dB?<vbrpDOtZQ?Q=HCnI%?z-$BXde-6v@}oyVs+ zq+7Xf7SH1{76IL+G#`SJ{+RQWcR9<#KxFbM2GQhjILgMAz_GkvTMa)mJpfy><9Y^O zno%dQ-czvBA}X?D3MTcQVl+{)SA`9e_9Kc^f-FB*Rvw}aATk|ZXw4Z*8tFM0;%UGf z@orDm1e)SZ_~=O`s!k@-;u=ghpq^<RaJ)bMIFsTG^Yl|sosoN5h>yKKAG-be*bArT zpMHKCpRL<ejA8z%SWg&$(cn-?p>Tz=qO405BWIJiUQvr_Ey=uHpR^#yW-qEbBA6x+ zvLWx&)I-v1YKHW{%|=PK={{49e4)rlB(uzEab8)jekuJ?JpvsT+VoHpY6c_=DDpj7 zX0b=I?H#OM3YzbVN+4mgYCQmGjBzhH06$QVD2O#m=1R>nUWN-j6*<_+FoAg}l)0p- zvs~B0Xgrl>tYBaK7+Ash9c(quiuzg?>0B7=?v5ooa(fsPSmBZRD{~f5kr@nq@!WF> zw^N(tp^joLx+nusmtxt^K$NCL@ybM8$sAJ`V<j?+NvN4ltOU~1mtZBdBxWvt3B3?U 
zx3GFC?9F0=xWaH;V(l?W=5p8zjAJ#BWr;(w92xKSQ3yG%jo>}wS&63W%mrecg@ncD zPjQU8(cBn_A&CXF&e_TO^k=IR>&`>!3=a$8Z)Ek<Br({MPo)%0&I+mLEM#jUc-Djm zY>LE!V?}T!n;sFRs#z06S3w2?qur25ar&z3qtr`@_#3&EljA(Y+3?v4b8}RmJ1cVA zIaSrC&q#BNSebG5>SE}XT*&R>G;TUz+*t^bFEMZqSc7u9e>2@b1s4n0{e!W;@mL<* zW8~_<J%$kwdA*6NM!=ASibc@9M^iRIEM|5%u|YZ2Mf7nnq+#oRySJ(?(nxYK6TVk1 zp?q)?$!B(qYyDYChu=9^7w+hbggW=0m1v~rLY@8f!JhvAKBHi!<wAAWICph%o+VBV zGocl!gT~ZWkE=AjjaR_;5jj>oM6l7?4H4W|6!-3=Bqw^P>aF{2mEK?jE3B7N8dw!A z2(36B&<k|`vxVj|%87pv*)eG2S-ts-^}B}(|9W;3m{WS*d?q0jYXH)EP6p4zevS=j zQseNp?m07N8j`9ZAlDpb49SFXG{>pOw2ICFpnJf?Ka7*kEnw4T7KCxctz?<t>1i{V zEDf}N(PQL5$A?6k30)D4`C$R-VA|{%lQpL|6rHbXMZpu36wxF>MS9E-N#SIRq$r+8 zC{Ik4Qs@;M51`F$#pD$lOoP!D_UYvM+-V0Vl@^2pnf`09GPMu-R>yZtXV~Ya*NfSA zY5in--)*L#d1WLz>TP`Y{^?BnNf7XvsaIrUSS+KXxV@cwVE!cB=`=4X6G3D!#E(}x zC{s`j770d=?rk;{7inG0tRXBo#ps>58bo@4>P{eX7oBdQsA3SO&Q3e68T{@>*~+5u zq)k}}5!T>DHB}d4L8U69QEES@3pd264`QAw+D{5(J}{W%UziEehduM7nKQ;RbEEa+ zzn^c#PZv3oObZDXvs<}7($gxk+cZ(wnam|gmatl2^;PuoywSjLxMfU7J;k#s?0BPQ z$Iv@M-QG5Y?4`C^gzQ=*gMChVS&CsvicO{#ooU)k%*$3AHN&seB28<oswU|v*xHp@ zwH?vSl+#_2rb}5X0vwFPGGdj)EbFMBih?bZ30tS>X+2X^HqH9nWnRb*Zd)Vs!{gl( z*`4Pg^c8EyCN@~)fZXMe>n(ku6hC^Fk|%v0D<q0@Z`!z@oNn?BMo;Avn&?~NH1eE@ zzWof-a7ecoutmu=lfmGcfzY5i^q59QW%<*M$-%Jf_jMsO+nzMAF*$LTa#=uwR@Iqt zIW1JbU65trJJmlfXlxE&T*ovS9JY=hJV>YT=j`C8u9SzDW8FS*sLpZM$*V;^obfb~ z507!5E>2<vn<(WL$)-%Z@jBWIsNpG5*p(*2rIRFcWj+cH%WRNuB$@Lutb>nci`FIq zcbKX!(_2tCXRe)+JVhrTr*$9-gc8w7kSQ<}+2=TKJDn$CSCLaRAkIF;NVn$)LYm`# z>U6y~GrqbYe^8SR&rP6)c1YTb`M;>3vDwTxloKCDJg#!3;MxV=XBD36lEnNFj1%1| zoAqm(X{bRHjT>r*iOO_Lub0&tJexf}!p0VdZmTrlw;mO3@;^1mZdEqMTWOE`Z8^cm z!;>qWR?J%>FH}aZijq($cLTFSf57G&1g)yS9&fCiX4A+(dTe$8##f;iA6%;_$7sWt zQXn&+F=Ajf02waE3$O`Aj%pFZc`#3`UR)R8Yu$W+e(rJ?{`V_~{%zM}=MNr@f8OD3 zb$@E*x+|_b65xZbSBAHiMS8a`U9)dZDk|LNb2NC};lf>oBTqlF@!7$#$t_PU7{A9f zv2A<TJ|3^!H69yZI~?Eg)qNeIM%wEZV9_|##RtH(iq2jdaHXVEd51<RQIeo@BAe%) zA7rT!wW<LJ$Nr@<hPqR>gtbLp99Kb$s$hcSk;~uwImV=zEXs0O={c;%>~AYghw)^p 
zcX<ycvW(Hs9DdH3jMAIs_Q#rfddHS)<4^c`&$LO8{SLM=>L5;xU;%)TYpsfxTpF8Q zLi56A2bW}!4xxUNa4j2g0-^|~1N{Ls--Mqx-=vENUL(ruff@%g6s7tW%n0R2&5Z^O z@~p1$Sfq75k4yIPs`R&lDI0INan;Q)W2$$|uG?cyby%d`UEh(+mQJrRA}xJtLh)9Z zuN-;IAJivp{EaocE0<XvrI=5mkqd_{f^qmJ!5a111apL_g(DbPVs(_CxVp1Z!l%-u zXo)Bo1I@qj@cd8xp~BN{FOm%_X`R~n-w2aB53nu~?hZ`XlwbqG+9E~p7*h+2jYPmS zD1ms9`-4{+X~-~fLy`XLn61UAz>~`S4#aT9GCz!j8pT$q)<Mt#Ku1wELI}pG0o0P& z^ma5f;gO$p8CkIxum)BtGdNO7JMhf}=U|J0v{SknT+f}nUQ>Ob_4&v8J9pmE_6fJ% zWDQzOOA|Mn+^N3yjkj#yvL-IB<wIl7{hI%U%-c_l1Z)5m8Nz}PGKzwfdL?u7^;`Z| zny?&&zZ2#6ZWG_(oR}*`5ddOS8~J?~19Xcfid0q642?m<FUu$(C|{UBPc8NtbqRn< zuuGr(<wr~SGNWiTnM=MVO1jcgS#~(YKkv6%^(N6!YJ5NzMWfCjiS%!NQ?#*vvm9N- zXlkALH|4le2Kj3LrVcO=RbrV^j$#6d{#Ath8Xp20WtuZTEh$|Pl)mnjC4)|~ihmXZ z@&jH`hsjTs?A~q^D>32kUDQW0qogFRNU3}oYZl~{^0-oZEH9WivxK=JWylL6H;aFO z*a9$ur#QPoNpC2(MV8cz4cN*yCo9EwCqC|!DmwyWVad5|nV#<DDzWLkH^m>IM-<i9 zN_Y@Om!gz%4tfUSDl(!NoQ!n8OADb~8XDf9#icsr#I#AtDu?f&x-00EBf@MIhZ|sY z4I7U*Gu_}*W~;T#=$5OlG|CoRsY4g5v3X>J*Blr(yFKPAF(iK4Ap6`x>Fqkb$!HP1 z<y9V?`Bq(-(aCGG;0rF{z4vYvhj4BQpn4it&G31qq*_W})*4s!Fys~!R|?7Cy9BVt z>E<p0#uxo)_Y}U1RTOk3W*xuCfe+FdOC-H)F?lQjk~tTHuRW^U3uVD+nMT$!rfv)+ zuGix+y6bZZ8tua+V%R5s$cXsw$QkS5bTM$X;uzY;F+i9N;WV;xtWiQ&bB_^2x^opq z^2HTZ@V-VF41kC((LYW-;YP&jgc^!lbC!BQV;Ci1Dv5LpG2EQ&YlE1j+T6~kWq4j1 z=FF%{NYr|?VmVw+e*Mqo_;*`Ah6%;JQMALZY`WbPzxtx!4T<Y}+j-2Z@q~xGX|rA@ zh+E1cUjD=@b3#I5M+=#~k#XL%G1iidr2~<+)zP5bmE9VIj07$W4x%t*WW&k%$6Ril z{6sdDpU_NDbI%g;6K;|hW+ml<gd5|6Kv(9B6s>6F_c`MR5`|H~<ts_m$|Fd`uR`rR zj<grAY@#V(%G#tl1W^JCJv2#g9~#BmCx__a<Z1{J6cGr~5eTUXysZIPzJ7X*76(5G z^xL1H3;oEMsn<Zv{b<LSk~)@cAa|a?I0?>9vQ~{(vQs)efenLUpaIeb8tkd3Y3Fbv z`0tFMnX@+N-Okws^7B|LghR7r_yK18PQF@PIXHiYMDQAQ?ICY<B<M~LIX(PW4BcjV z>DFxn-F(EK^}3T$Z=%f2|4~jo7fuh#(i1CyH;81yg|*?<h%DW8$!n6mAydECl-RQR z+TrmncS$!&<DGU7?^-=F`nYZUw(mHcYk6%66R~Kp@6}JR<^Du0B1m$W+)UUkN%~Ms zxb?&*Vs(a-FG39tz19`(!MaPFgmu47Z-j|0=UiZ%_>sm9sq`?=LeAScL<`@HMa~)U zL$S;>`<s&ai{)Ao`FjumT+C24$~ac}(zwDii%FRUc%%|+-7?lasjk$5h^KvTEY^Yu 
zo2)E9thO}2tz-`Ca88@D+Lj}V7^j~iH)U(du$Y10Z>5LhTKs^B0o^-TbBDKzjhZS! zcB*kf{{RD7m)VqMSo~%gdoPNGe;>nUKn#iwm#CjVADTyF3AMFmx)7nc`Dzi0F%t?W zyY%Sp0TY#h-<Ks`P5wyxXnXr;A<J*yUw+JT*b?nZVBp@h{(DWP(UoET5u3&B3;Eo2 zcAMVl=C_-Ijy_Wq?^w0<9;4A5cKc;liC&PdE3|uTM%@nj%;Nt}a6<gV8|L!D>QHQU zUYEXQLVP@8^BRp#OF%B`diODZ)M1$!=LHWc8BJD~g}<!Kg>@B+>v{*`iD5*3!y2ax zxG!knaIP$#rBw~n)*av+sKMk1*aGs37V2^8077Nei&}{q26i?lRB1!Fc}Nc%=>77; z7;YP&NX;ZcGGCD3sRihA>-1rybuxmU-*%{E@o<mjz+CaAJNhb*98K7qQoCJ_lk^_8 z1Ur4Pxx%wKN-%|eM_~xU<M*u$7hcZ#{C=TdFh2PAUprZJlC`g$Kg2(?HtjSDW`jNe z;f3N3X}<7~QfI=Gz8~qm!NXU(9T4M;9B*K9oWLv`Z5Zzd$x&GzS1wA?ULc+SAC)GB z1;x<d!;<LO<TjK0rWI?pyY(x!hdvy!KE8P}VCw7{%vgNgUvSTr?RqBL^;9!qgO2CR zb_^~@xhzG)ZHR1j<>E1<c&^ZniAztd!W%G%=vZFkR#i8iR{ZMXF_?(Uv&t1T2L&GD zf>j%xfGyMGpU7<^KytJ3PaoF8Jbp&j#Sdqp;?o?BUMU=F(OZAW8*L`P5I)IQRB$V~ zdo-L5s0dk+!PmW*tV-t8zTy(O4!$CPNzu-rS?6?Kf@XD*eb2jm)itaVQ(~R6*<)Gq z?`5{Ub#Q1ojlAU4OTsWxQNT*n49tDb2C_@?^`|oV7iP3}Wyjmv#`(uPnAKnRp%|`{ zt^S4EE{fxrKL;m*o*N|AQHpj6eoSSMQo<i6b9H3(7jd>aR(a1a(mZz`JoiwXjU2); zf7U|_veJ84t%rvW!$S-56fg9`L-)~7cxa(M+)G}W-_O&-z8MeAsdexh+7bSOJgy*r zC_WaO<HFr4<m-9wO=+919epi}xW}|uiN^d8p0g66D#V>{)?(%|k~88eqUem4B+7sz z1fMjofPJUun5si;i{Pjc5fP3#fb!%j8Y!x+pqRRmoiQ3g8qN52bUlz{hI&%+Wwmj} z^nn0}pxo7<_fVr4M4F3dK;glLjJKnLXTyGR^sxP6Rm#b@5l@2yhl-T7K3qfjk}&0_ z7|*8Li5ORhfv=PxNn6=@1Hm%J0bv3x$R#N!nrRU>rB<&>_l~@Fb>AevWs{}H9ICiB zBlFUcp~4Btn(l1A?e1*$fgP!kacsvDZ)DwI_TJDPo3`jUU+0dA?W;EN!WD9RC1%IQ z;_tpYSqxO=Xq;EK-hJm&mdL)xvi;*OeMi*|kL=xufn5B1ufF$&A*hQ2xA}c8x0p1U zPvuG>)~WK^dLYMAR2u+}EpMxiv;yA_6}~7=*aXT{b(#NTR0}davf@3ga;~?PrH2o8 zrvJb$D+yUi9ae|ZZ!3d`6^}es>G5J7a6+C+CQ?XY;puD&D~D^C?926-I8UWFkxUH^ zE!8xg>HeuNYl$u6Vk}b(hd&Uj=rP{9iAI%&)2`IQkkf&O&V^t!$=Go;p;H7dXV_&T zL#6A)<B50CE5#QK>?FerIm<+OVMLB5q7HuhC~vlmX2967IJ$4KcXPyc=lGkJj-X5C zL)osEjTM10^Uevy8HrTe3lE$95xMZ3Cl-@FnQoM0f)KTZT26SMdrTO8Y85X>1tDfB zb6FOga4OFUFFHbwa!W)I!g4V4?v7|oTm<LMzxNh$w{M~bouo=@1=c1P#KNeidZcZ^ zVhp9v>8y>gH&k|~vKOsM$186%X={>{74ER=ioLBk4<j$>bS0T2XC7(7^V0zu#AyIe 
zg-D6JH1V;iW0drxoGRpDn3n6@7c>EizcKTkK&@StHje#+WUWrL*b2Mlpx;r*%57O> zX&xFop|hq}oA^L)k92RK&Pt~xy<Q5-k;J<@<NQMDGw}ts5M2t{VwRc{;e&a~DFokp zRA+*2F6OoubJsxGkQFeX*$R@#)wZ;yAd5;(7WIiP$Rgh;mRG|GHX683Q$lNmuxCgp z{f@Hul8g4gREi^TFquPw63<*{454-`aL#}M#tv0lj8wH+NtOCOl89JeY(ioO)#4^b zt&iypCN?h&X<~9(lQ`g{XRjB<XDC638)d46qdyK!q`K0wq>LBdkt~@lTQahg8{>_V zoW5>igl}IO>yxE@V=qe9x`{6oHuEFHKOMC_@+yzOsgEykiRko}UM1<Uy#@RyEKe4e z0WCKi_fKEgQMjQnu<rR_O4<iKLe~5}zpfkhv=od=K#H}}E^kidJ2YX&letPVqZr{! zvXzrWnq9}T()k4?t(CB;XmDfPO6Fq(+f$6mT+B?Oi=|*l%`tjtMhg=`_<-V9eXz7X zWMij#d?A=u;#LyCgf&$eh-iZ+>kv~w+tdZ1d!X8M2?Gl2!F~lcWAxjj?+J2;C@D1q z{^Ax)w8Nuv+SQZ*r~weqpf}iUDX?JGf|EcaGiK|vu(&{?t#8H1P3xDg|8>Dy5i;pI zWxoB}D{trZFMp|T;P;Px&zS9e`N(UQBia%a-l8QCj<31l{v{*wiLde}-jj5Rn4q_W zG3NT8?|*n(+EwT*9Ec#RypDg=TexV|w6rPC=DC%?N5&Bo3~*0@$qe!&iZayppo6(Y zH)O#jNJ2=n0hgDnA|e^C%8e^#6tYwKQcd{v*veq^?KqZiE*AM{YysB{awDZfRvZAR zs+CxQW>$njyPawwABSuYWLyb4O4W_NLDXKRtX1`_Q^{h>`=d>L#7)bo<|@3S3|VJ2 z0krRwF3heg-PT!9fWavPoQGtqRR-)=E3{M#=(1}On=y(o&WBLL%Y>2X!Ggeo(n?If zjdwS<2rtCLSC>kWjIilwZv5t&KrB^O<DKjnG)TuKUJ`5p!Q{^TDebQtZ;MB_ZjY?n z=}+kTBi_{bNa3wvze7ry3QrWiQuzDPR&Qe@5xM#XUg!9k{~zrUi|2to*BXtt`<>=N zG>HyJMz+RA$`jPAheTMxBOVub1C`Rk?dEbNoE9YbAbvMx8)K=w%|Ho!hHp2s;ub-4 zP~5^W4I0ZKX^C1%ror3AhS*#^@tJV-pbB8v_Ld|R76UdUt11xIx8BBvbK4;K17^YD ztd6iUH`o)nD9eaBXr{D9lVnB^NuBAy0sy4o?$SfSiB}rHQW#j|{?p>|!j~<)$=emp zBpw=n+Mn)g&raMsIa-|xcgVI&YI`b!WeYuxt5U;-72UiuDtZb9VfdQRn(fO!-gU=I z{Cd78b))@FestoF>;CH||L5A)w~w?{Bs|7!aH9A4uFW?k-hr&T3w|jAS(S2aR2Rhn zW!A}=vyz}Psl403d9_%Jt`K}ZKypJ}8Ze&uAV#2dAybvg%Y;$P>CJRLAZOx-zPZ7V zwx2WmgAsPb07D*n^Pz$c=BstZM*47b6OY6Or8G2vElCMy2_$cwj&LnWs3p@RP+pN3 zjFN5?e=1A;)(CHMS6z2``=zmHCgOEBnh-Qx+hw(+9}Z<4{+60cRxAj12co4bwzrwT zCNCNPNOpYEKmKukEARX2`u<qR+hiPGa^lgB)<2HF!H=yBjePO0tA6#&>Ji@5DR)#B zKF9X8435I<#p@xdFKcKgr6QFJLoB({@+di<APbLjdOOHWv3uL&Bn)!I-}}!uUeFwI zKvrrFtIaKMD`^BrJO4PWUFKy)fPb!K{`pB>aJp&&&FSfwhkxG{&0hxzGUX+>n8Gy7 zTXZ?BOt+9+4#Xs4e}+q06hk`4e75!)6`9;#c==RPj6QvzakA3J>vt@@U+``Uy<ql* 
zEU)}t_J_<bhQ~e9#APo^_Ub;<1Bc`eXGAof$P0F#rL&w>-oi4!<HX8nM7Z(9E@VX? z*~KpFLoy$+1y8&x1kw8qyGZie2Ki;VJ2j>o#eGKkwZ$o|Px5<OeB(Y%P&KmRhy21! zejP8oVVftvhFSUfYfhHm+$%--p*XK*K{o&o1KDNBU7sYo8PQR``u(zNNwtX4{1Zn0 zvKiT}3;*8IYk%%Xj;f&L|2P)LXJz-qRIni?+*_!dFS@@vu>_)voOlJId!{`+Uv~V9 zt9X(f_ug*|0mMJu<WoZAwY{m_LJ9%PtkAp#TG(TB(vg#D_EdjonwS5>{hEjxWE?5g z4&&QOCp0hqhc>#bc61t*PT8V#9ae*g*lQ0T{9g6{d0hL8YFY6eRt3mfJk0f!3Zk#N zHW+T`=<J$SO#J3g_kt$k1wdS>9Of24#7VcEIdbS@nqsBtn21G`%pQuTjx90SmbbTv zhl2K3_^h!rE%W-VD~?DWVOOxQ3xQ+#zgR#}`(Ku(gr)F=r6XMFkv5I~QIcb6zKnPL zIwfsfa{PPZ6F?d9mN3eo2$VLwduyCeineU(_;2yer1kNB_L77t-^QlnW5U0EmF@Xe z1;6+k@h+~O`rM7MjyV@Lyy|c<JRf3EWZij0Ns0+)O!QZ3CE;n8QO#UNHHhTY2AD-1 zSafrunXuZNYDYzL3QyP|0teVGY-oKaF*yWLvwf7b73n+(KLLB_cG+8OGhRac&ENjm z)R)S1MmMHfYulF_C)-x9@0zUKxIgZRUY;5fgL?~m3$Jdve&pcyZ~H9L?Y3Q`R*yRo z-Sx<C3ctIP-*J`8+)_8u3m*``J-!YfU;vN7dM<|{H&InWWa6Yd^#!0_<wdiP8D2(| zOf~N)$gE7LL68`OfnLC3mx7=Jo$9KG@g$W?aEF|+)anT{yAR$68*WmEZ~;E9*?@{R zT%l&ok;b;UsYv8PjDaF7w1J8Si~_Tse=8ObKN`ubdN|M<hysC8<+U8>Sw9#zg}Wl1 zrr|{|1}X)UknUX3-QRxW@Se#vog3HgLXdLVv=N_pv2bX|3X4se*jZTB-DQfHesSe> zpWcOKh78=0!4b$Hf|vpGy6Q*<O^|_^ydJ_~c)eT&9<z`ke5?q(BD56|VFWaZfI!ND zXI0D*<{VWNFE|j`v3MZ?3>=a)5Kh3VTK%K~`X&g+Xm$DnVGZeP&6T*Sm==WU!^IFm zt4=~NSbHS=5imF%f}toCa)&HjK!DsN3gGY@e?8G|Ycs9<gxh2J^u7ffJ{#`j$J>{D z<AJ8}@oazdwafT6hadBrO{F?JpW-K$n%d<K?@NV8zxg0v@qjDP)yEGtf3mRk^;dg0 ze|AM;*_R_XerB{IA_o^NY=a$&K~8tVhF`=zs`dWls|;KZgwsJ18-c`RNJ$B&)G`!w zX=co5Tq!xGdT<UX@jxIwB?J-lXhAENR2SlfnuYX24QL_MW!NSjl)<NLj&4SCP-8uy zlU!9{t=9p(xBf_!z=Dw82;SD5?q*HjRSZmoY^78pGIXXQTXEi{4HbgikQ+v1ijzaQ z*wH}qPf%Q>U^e`|%NAWZa4=fPn?!Vmt>?xYg*Qti$KIW{ZBH#lbtIc|%Eom)tF8<1 z2~e<V&y9MgF`#!iGdFx<^4sZ<-|UkFsW0(`OlL4sVUxG>4{qakuj$eo6PS`+wzsoz z>$b_Yz7)S_Z4bJCiq|?;+;+vvaMoDi?(5FBd$VR^0H4w!O<bYr>U7SLYE5TfX<8gt zaWwp&tBrg}N%s#Wt4?wfxx2-urTcXc0oM}YI=CE{!^}Cftj1x`VC!6nRb9xK<m|Yb z3)2rGH&lZeysot*^GYYNs~0JXatR_>XQ&0_tyHwp7iUD}d^GrnV}L*mLn5vtlXhe? 
zQ3Dlc5~0{#Q0y?Z!?vJf1czHZszo-DQ!PlPD{10E%mot2ty4Nt^)St4P)UPD1p6zn zHZ1rJ(e^jMdXi-{5G<pG0b^$`QHU%B;$4$+h09#IpufJ<dDUGbg{N?3q;W;E)Y*OQ z;*r94W}f0j-TiIRCFMBaR@62vY%B@bY+qczYKuO*U`e?#<L>D;2W;m4>jzeCG12>` zFR?d~ocBP^cLK{`<$_=!_*+;{!Yyz%$PuSrl5?3Ka%Kcm`U+e{v;w!O8Y?MQ!Qibp z>x9J-EEi{zj3bvojj$MDUSy5R`Karg^HJBg*7E&++%Y3G@KK4P0wJARN`xGG1|A~Q z3+@>>Bo>B{?L-xC27d~+<qy0fnL0=ASZm`)t{Iu^EIe_lCz}>ecIx(SS+)L)Hk;qJ zIg~M4mPXq?=sA4PAQzWjmGYPY;N->#jYI5Q=n^KjWVjv%2?Q1kT#s72DKTu3k{Owj z5g=2w11Xi$>ET_VQ6?(R7>uIkNNPDuE2;s|N37+X4lb3X=fa(D7QVr+;(x)fDm?rj zFBN{y*S+{-K3sSu&QJXDKltT1b_WWNOZ|nHUV5?cV?OrcOMC=-O5oPN_iNqbh(k+& zTv-YaKoKZFLj)m0!y&KOPx_v=+3yM0sM_X!&(gWyQ&+=TW%fOV8v362#-ge4o5EK9 zRw2RfE^PG`%!PgYYW`3BY9D_)@8UiDp2F4>&lf%+6$%f%^H$*@dXa+`?nSP26yH&Y zN~ak&?f@@f6+W;Iyod|X!_&;Q!XET+1N^5neuBK9h0nc|`;ypWRbIp*Jb=Zhpitpl z)T<#cD%b^ZM&!Mf?3#~FNTYO_Kbda`v1<W_rKM{a^c8Jn*N}ApO+8%$EnYIkuHk@< z6E$5+C-a>>>{@#&->F^8Ci4c4t`*UTs;QTKCoMt5vO@zigsY%fB`mHp6ias^HJ|sJ zxN>}PSAA0px?_`5lD`I(Tp&*<(*cm5mg1O_+2vwGGa5jN0Mi<XBvAar9kS2ZZ?e@k zCIP}v(xfsZ^@_yXlxU(OM&s%q8R@{F^fpP#z+qLNd0VF#4vX;612`Y322`wY7*MNd zK%Th-8VVe=HGuzy4XI4P84y#>ApH)V!~gB{_b-X*g0=6Zm#*X26&|o(bIlceRpH+X zd*~-l_UHZhbz$)Mu=vB{6Vi)?7mq))Cw;s<^Kd4Ud8D<qb^5o8$4}sz5Pj!6;&Jjf zM$$XISNw1I78B5hSK!2O9a!ssrbA5qQ=HgnY6q4&2LNO=|FX8kKS`=T`YBt~ZOkU- zpZn}jU&ewz$6n6o|BgL_8sHD}eM9USm>WrByvcmHo?XyztBR9a)-|B9as@rUoPFff zEkAwP7@cao#bg2RTWWfWU1mA`tumI%XSC152#SoXwBZj4h&Iap7SqM^$xGN9w=h*D zj1d53*St*MS=p2#BpN#by&6}3b}Y|p>cwkat<%cJ#@pp5HJ3QaQpA&2z>rVM>TcMC zAMnxK=WhEx<?D}V7VW`@Kox&PCNM$%O6?=pB$X>=^`p43S-Y@VBb@xH_7^`ds^UFA zYG<nW6R6_P4#UGh6-y2uJbL9%UeeCAwJd*N@xQIehYxyw^~?=)S*K+2{@sUl>N5Y^ z7R7T|sjuYKTXr}7kW+Wx()2^p@@I0dKcIcYZdtkMu%z4~E4Lj!*m3v2e?+@{o2=Y^ zSpACsZAp3X@WHVD=%Dt|o0=@jm+;XW=%epDe9*h(2R`jD_Q}fE53AqBclp-ggVCSA zuwJ|T4O#i-Vf86oK7el?c^T#~S_-bPsd9^Kj~=YlKXWs?f+AXlXBn>i^MAd(fnBMU zr)p~bC>IClr%u@|PqpsOqRfW3j?r5s<vw|8&3zm2N6*~%`oFS2!ka$7`zHL+&FtqF z=qGM&YR@gV;o`0I^I-kYkA~Tsza&rH^`%XC13qUjea_T9?9;rBKPY2w*e6eYZQs{% 
z{bBlfu<@oJHL~m9kf$E~#y9bYPsvkHK6L;uRc-mfW9+5x%2Nlw`z`#*x7p9{u%G|H zejcKqIox;6Ej70U^%fKbbGSGBLasH3yPtC%-*@(za%(n+JE1pzKC9fE&EZ?mzM8|A zrkD0+bNI5e*OiB}IsC))(xI&KoviZhta31$!xx`@PY&N6Z{T_s808Aj7jC<K?>+ZD z_?3NM|Hf0_{?2!QW-0gA1ZvmZ^!c0j-1?=vzV`5=-#qZ-w+{Zpp*abY4}8e?#7pR- z>652F_$S&2w`P}(ZP;`{pN@Z?GBc_Xh>!i8tNzn39HY9m-kNhbKZA%ti6M|U`Y(8- z{$LUZtn|g7eU9F61S~nr@$yES(nU1%>fhz@?4vgInk{&%xEZ!xQ7QV^_9Oh&)8GFW z_OZ`d^Pf5V?9rF#6FzIpeeR*J(m#>6;;qWx*%XBh9DCw_;S(H}BMi1@RRh(7_^fh^ zRlWPmxIT)U3Fu$dHfqR5n2f~n=B%>8Hg)B?jUU6us?s%hL65-BZXl;(Ag^LbA!0^x z2=f{BL|cW7Q<D@G(mMR+I|WoN2=NPA`*xJ$eA-zLaX%{XNVf!QDPxK<F!IjCPmpqQ z4p*|JxxN<jk&-Tgyw}srM^_STgbag<k7|+tf=5h_>_u}I-;EldJ_tloIdcB($gmPj zPe`@23vNTWmSJM4&w~DsXg&-~KMtlW(;Y>+2mjCT<lkNexcK;lt?$sGha`8<Z1L%h ztx`NB84RLeF-fH+lf@uZ@Hd+ES&wmDmp+=6gZgN|Jif(pQ=eoMBvWdGQ5fwpx$B*Q zXslX@FBPZ&eN9Ec>kTYQ03}=-NFD!p1ZZx_VU=U1?3aQWcLg3t%$1(FrOje8`kaYs zO!<_ARLE9Z?K21l$ze?U<IyX#(TreNQ92$^Bvu6C=7im5wo49ON{ZMbk!=x6zooNG zcI(lv?v}WBwhMRZPkc%ITU+5bzj|z|;I=p_^ioRjNunr}OR~Alm|AV@kJgr&0xR`? 
zvrQW7)^}eaB!;7*tl+SDr9rvGU6oGoD+3i3fy!twkr+rMPuz}to@|{c)dxpYYrHMU z_V28z@dSm6s9hh@OTL8LwQ_V+FzOuM7zU176Xj*0s;Joq%sK?5w>6d6aC4He(_JG; z6-hDi?tcpV$)B4M-*UP=;sKm+45*KkV?vjk^KyQ!7TG|G9<ShX1LW6He@C=FWtJTn zPM^&6k$*=)m5>tE-2;7PIJihg^qQvf)y&mSS!~WqdS^Fdh3rPlZ$7qw-k1Zbo(c&u z_P$&q&2XbJ)(#va7Qz(ryC|(H0rKyl*2BN6*7vahBE<e?Iw~yoWTB-~3+hlG%vWfh zvmrTETTz8zuv2>uiZf7vtBp-?z#0#@s$Yonnun2<dKSP_gYAi;cZqzo##NEyg3WlA zbI3TC^*B)5Rlr_B<k@@+-elI?vZ)nc*DR}LxNZb1X1-s$P9;hT2f@*zb`t2I4~Am_ zakcbj)ehimVBV1lO{If|bkG^V|E+ken|@yWZ8xMs&S08PIdK7V>*@FO|FM0XsSxcV z@#dpP(?^9DjvhT)D9}Ug=YK!@tRNpRJe@weZe1Ez(?^d_6kf!O{DAn*J4g6+e`NnD z{QTc%{v#S+da<>&pxph(OkZCY{`1Ge!K1(8|LQmh0x15Q))xV~6T&&84K?<a+~xf9 zT<&txlS|op<$DJfU$mT-f07hsh1LevbUBNt0PRj`fnaOo@<{}O=+L8>OGoRCOdB5O zon`cjvMiN%Y8Prv?5(xhTg#Jqeu7=#wSTXxlRbmBAZ+B7)p6zWWBE(9wZBwW{fMdN z7xFyXYsk@2uz!j6>=LaOs<Aaw{sX1b>A_i1nP9iXR7Ho2UN(L*KcGDukcm2T4iz+# zQ2C3r*DjI?J(ORrJzFlT<9N1;JyWlMiuf;uCh!{N383m#wd6)TjJ@g-q?(mUVhf<L zy$MFk*<#&a9%+(085>aVN_eD>cJfFmq*<xAu+^&6XH_RUvG$8bkH5PJ*i!9o_@r9a zIx=BZ7xm*AZ&N-(9g6CbrFaIa)uCbhUO`Dl=p`Hsve!tt!c}CDs^-IXDqPIwWXZd* zR!`O-RiBx8Hz;q?_LTMg|0Q!DC?S32Z+Q0En=9NAzo+K#ig2XxTI5fupr=mgN(7@l ziHJEi*4YAhNFo^PcKd3p+(Ewd_$!%?OnYm4rYk)`S}g21F{-JtIPy{<HV}+1>Wz2{ z*SYyg2XEJxWmaZ_i80w-cr`V)B>Rz!H$GSwT_iX%nZmaU5A%O0*oB);Y!{ZDQ27VZ z-IOjIELa7psOSQB13Ib>0N>!iDW|{KpKn6u*H6iS=}F90GxZP|D;WeoA8}~O)CT8d z$#3vtfeRWbFbEV)B7n;gwt!rvwHXCo5yjqcy^xk|BuuMAz+1f0k&&1jy=utUvBDJ| z$%OkCj4t-C_~<*2?OT7>_>Fs{Kc#jQ{(NZVV?W&PTs^*rU-}#`#e?s>WslaZ<Od$- z)BI;{f4Z>$K;iYTT!dWqW^fqmaUVXqi@7|e9^448RL2z>AZ9`z7c+7kX8q)3s_68{ zG@1_^hw+c9Mh`pFzMs|=`hh%C9h&&`Rw1!;r^)ns^WbYH)6U^o=dE5HAH2_^=jGjt zbf&mTlITbOZsD1mvbN~WSCqyJ$MGH4;yYf4?-(Z}1I^f?)Fc>0Sl6=F8^jv#gJ{=b z>>AuLwH`PSwEGnC3ShL!anx_s!x87O*0ZbXQLA#G-(Q|3?qhTuH88$YCeh=g&`MAI z5iabYAR_N6z1=}qXPcO{5XKom*aRFlK#c(eNa#=mh7KEp;E!7AX4?<Haqk;_OGjT0 ze3svp?XQa3rR2K7!WW+0@a28E-J_c(jiyu}oR%!tJ@e?k(c4yz{5Wu^Z}nEAeAAC5 zY4g9{`%wP5J=@1qw%Uj(5=oG)<+)YR!JUvxhWqOxn?VCKuz{6a8c=V`$wg(%ClggP 
ztOhhqZ0>{U81gg>0|s;ljePS!5^0!M6y&t24-!S3AR<XypOl@hs=9_m61UB%32=Y1 z#H`8z`Y~lN*<DN=q`lgWqDKzhEVyd}a5GK^CMap38EvW(bXmtx3y=^0e0ar*Xe1PJ zcdpz!dGCF%2sh*kul>{Jqa8KbdWU5&6&zpPRrp;dWV-p#$E3I1W8LAAxYyG=))%{b z#lbx~+oSuQe|LG8H_;k!1pUKb{wShYY0&ZV<m$^wK5WjB_!8tJ!Vh6udL=YRC01^5 zokl6k6%s0=a3W}jMBLB-t6hRrEC_3Y$I#Bo*fTJHv@}R+Qwq|rT#(Jl2%_tbao3av zA&Uy&J2+K9g_k%rqDA8<h!IServ_}6q;7E~Q?s2tT~a3>s10*GB5yY;Fv7&4-|~;} zYXTA1{lQ2(hCfM0;~iIRy~AXRx%wUMLg%2&uK?=DG6+QAn!-DU-L;N|FT`Ue9WR;^ z6u2c!f@pC>d@rm5bh*!?^Bg*{UDJ`x(2>oML6rL4Y40#@q6k%?xQHJ&MZI+3Yoskj z;2_pohtw<8`b>C<{m7dc0w!-4Jtp{LRD8p<`!r@WSOi9Lj!ox+g@Za#s-XMEx4byn zniO4=BIERrny|GMi3kKp`qsBaPO&!eQ#w6}Mzq72>Y|n(Q>hteBft(LY}3Ii(l#>K zSuqr+iQ2G8Xd$FVXVpp%FC-|pU4P?ki|-wM<ZyZLy6Z>svC-Yi|5JGP2kTx)Rkt?_ zZllXu;g{EbiB}$dI4ykK>^6MaWGn0q@V|NF`NFT(b_WtelFx*@)Bi%#ok_^;e#p%R z-2tzCHAzl~-PK5n<0`Ah904MbWBO8e43qMHN1Sj*Z7(tPVE7-q3$Q5i*P#EP8!IE* z@1Z_tmz7u~FrQ%o8Gq~^Sx4TK$R#Zz^73pC5mKs2Og9r6DC2TkM83jk<h>KuyHlH& zTD=P!o5h7UwH!ULaPq4k&0h1Vl^s{FS^dgB&!)$(Un1NY9jhx0;)dfJJ~k*A%c}aS z0?&VC<i6V`UBlNN*}0pKT({+xg>pl9$36FIx-$OW>*5aGQpERROf;q&MbJ-5!MgCu z(OXm%SHR*AK3XbEAcaZW-VnBBnp4$08e>tg7W6_nA8x5cv3(+%8FURs1*54o${$+S zADp;<d{@O@x&O{LYo&+mE64S`X}MAGxjoMpR{y_+k1sKl?d8`At0$<QvJvwC3glmg zymbqAt%moY?}HNsYYm(!Pl2AbfpMD?i)ez37{<ra6dMVPQHzHziW?En#c)n&#Wj}| z<Z(u>+8>HFY4}SEbZ@`W;i+vP+8b0$;euhH;<PLd7n}d)XC@ukHPJ9_(y8u@WkClY zBedY(YQJvn+O=12+w|zh)tevgJFw-y!_kRw?CwYZ?FY)w`g;O@?d&+hKlil<o`^<v zts8$N+qL58qMPoPUfZ^O*ENwFZ(4EV%i;0NaHW6srenALd;a16`!e29Ik4#;c0I<{ zN8l#k`yJECGOm$yG6<c_xv@+HRLEP+L5!;*+DTrC#udjg)c~C|6cMP{m_$`;0ObJ) zfS`T?`h%^A`Jj_l*fAm^Gh4wit4M&O0d<PCZOO<Xa$yvbXw$}^h$%D|QSBY;UZ3ug zgkP5Oh1XYJ=T3cW>0kH9_hgP9IQqTpHQS~&apgUmR-}ada5w)Q{wt5h*KZyyt*(9k z!H0RLXYj9}goPh(*-T0pSa;pcpCy}(x$xo^@kUk$LPSE`5p3U?P*K<7DpY~&^EO-= zZjS0dwBb<qp<Ci;o0XvpmsC^_2q(6d9A+gh&2aE9ZTa|XyVkva<fUI7$o=}?#rE$X zC|u7UJ;Hx__zy3>^3K~od-h#zPrQWK!vai1H5j%v`$;^Kwq%ZdA)|>MJ5iv5TJnlN zuD}YGqncmNLZot#AW%g@Er)`sCWspy!oUEsCjJn^@J`1@REVE$USv$1`~le${-i_4 
zJH~zb?(2=lqp=nL2-dqBmc7H@G?p-X>f61Jjh`u8y}f;=t5E8}m*%;3_|B^#!&;os z*#xXSjzuBtn<@yA4osKPmkq`h%pIpOPjyrhuLS~@)*rOPdYk0-_=`IM!3>cBAh3bo zhQ71FhE)zVFV72;6Pb`VEXQ1fTMIAT7IFXEBdbS;pK4Fpec#!A*B2iv{QkP|@(#g# z*B2xy{?b?WtQ?!T*W{eov2ESXM+*Nz-*<K4HgO!^w}ab(>14%)hOMFDdbt=2XJRym z5}ZKD)3Z__YSJZhgBaqHqacP-_r%M@O3e~56cog?S)a5Zh^deLf3&>|V3XDPKmMNc zp7*?Inx;v5O-q`lX&OpN(=@c9Ev2QDa<O8qSPKFImU5B%MMTCJGR7F9b1L31W6U|` zToUSK&KvVS=hUwg=Qie?+nk$oZgZO=g#YI`Nue;e-{t@RyV1VQ^*zry=Q+>i^L(Ch zF=fOw0(_lt*CVi|A)t0hX1PWhF$%tmKou-*%7~tA#h8jr#X-_g!eEaE&h6>(1%kc% zmtJ>X`z4$C`ukIVdt&hL__9cl=Sn;12+H$(=kzzH{@&)dUNf{vZ^`d&S<XUk?|JiD zdnPUH+ukHxx9@xEy`E-wZ3E6Kmn9cLo3=d*0fe8F|6t#|xwIC!)|t9pU#o8?0#0qI zmMq8Kh-XJBSqJb5@tjsccJU&`Srd1_mKjU!H^`O&e-Bl)=crf<QTQaw6n{?{QFD+1 zm;)4Lf|A2pLX@Ya2Cq^%dT5Gy+L$@DMgbv##j6Y%SK5#<o!qx**={de&m!Jn!RFvB zi)F{u)jO=#Ri&QPx;DpkY-fJXBttNM|IX`w-{W(I9_=iZ!v>6{u&D0QKBQwGE)uQ> zP8`I#See?bcVJ!AX!pT4!PwLfGDbkgk(4x8cla!OCa#CcK7lni!aj-H6{Qzk2GAu? zV%bj{0Ma?jX1V&~MQ7VAwv5eU``@=&0%zDP@QI18BjX0(%)r7bQ#Q+<9m}`xn6hbb z^BcLot#i5#R?pga&kG+tG5_gsS*)JtXXQKGrH=VK{CD3J;WMp;mOVC>mpbVC)oo85 zpFhDH>JX*2{A{~;O0A7WI2#7VJK*@J)aGQkonWuTbHa*=m?*L{v5!DF4!@LRvjDtC z21`Cv03hB8wDWK!aw!e~>gI9)@iOL0KG0ZkXQrkL=K+kdjG+RKU>X=1DKP9~jR9~o zqlbZW=h~j-g^?vw7d0$g_*CNQ2RpZH>we-OD|A10lyB^v`M2b~cZcRKYA+qjesKMr zsrRZEzj{4${pz{%9f7H-k7=zlZF6duzER(UXya&b`C_W+?Dt|;Ei~;8*doPY3fbWv zp<C0<-^mz}Bdrmx!#H7VWpI33fwv;aEtZCAc294&@8}16oHw%u{+_FMrhwNwP4be$ z;uG(jXhpQTS7&q&x;7tY(H!wM;+~bT0+GQ~%~n8Xo~$4cjq>v0E`ZdeHcgoACGDeQ zuhlvf))9&*o<-INLV>VOlv$XCQ|f`-f>9L#A{NI{1=ryL)8;B3MSrKyD`LAL4-#}7 zBdiT<xFEo9w3w)9Zd}GZVQ=*h#QNJ0%>Kij^RK)6^+)b`ZqwBH<A&H1w_N}2)WP{T zA6l@!|7+sbOQ$z2ZVGO`a^|w{_um-qUvPNG(Qn?mciF^x-`WG4zkcnfZ!g<*U`2Cx zvU4ju2jXknt<S>Vvm?$s7u8mfI-SEr1LRi=DeejryiMUg3yvf#SWloLQROaJPlWWe z5ctH0;075Rxh_z;(262=I+u_>i2`M21ZbejRX*F!;05SAV!+Da{Fd~Rsvn%TuyO3l z>Zg)VKh$~Uw(iFdrrvXJe|qW1SW}66?+VRZ)HW)=_>mj#WQElOuilXQ;OKL`_F&z2 z?x*?W+8ZaUvmC}&70Vv4#m(>#qJ#6G6o58?5>=<7B^)<L=5Q(cm(Jl5sRWt42vP+o 
zL=`c2e(bX}{p&fye{8<7f6gcOjr@)*M9T3Uphq{--@%k#Qs0qJO#_KPi6Vt?j3mAz z-6M&*FVlB`_MrgZ1E?PM7hN<)&3BIZ`nH-oxAbp~|LeK<R>Qxw@6(8Qxo_*7_|{`@ z8o{mck>8(-7(Kq<87AB+jS>7pfNiMn2eDcHkxY1+BGWp)-;<6w`qcN6CqmYVx()1i z!>53QL_CJu!J>Spf6JDBx}U9m##_gHTjRK&xA1e@)_(mMJ$hXBZ|#q7)sN9r$0gkx z(|Xwpd=c-&egY2><N>p(0<qlmFGiI=fcIosH4C;1qUC8pkj7MeXS^oW<@LGPVGj$i z7r;GJoj;N0)zo@V-ec~Cui(W#4FTEG47;aOVKd2xK?6UE9)rvZUOrt_hliTeS54ue zN$N^<NCF8s_Aw!_e6{5B`!v`b*vX*RIn|y@wRoEd^rq^CRO^G!S{xvn3J~s4|A2`u zR3T`IYHaFc2|>Qe`k-zOO+;K9pX7{(saw*6mfX5=6%#wvj*r2;b7{VM)a?sc9NgrZ zL9Q-5zZi2up_W(13@OaQJm$n<Oqn!9Nl^#Js+g}aulMS!p5e1t_Zv@hrvwR$vupf~ zCyu6)E9d_0abAE&LhR;iKV;j175d4o$8RZ59q>Hz0sHHwzihMVJAHRdJaH}Hth4Sp z%>VFkit!mMOC0yGZ^oIGg(pgA%pA*@J*g9GQ?I60#2hUa-RcOiQr^2OH8r*QxjiiZ z!@?N5p3UK&z>a}}6E_q+y)hNnU#vgzPtKc9#NT3y(*oImlXf2FrHC_**1oI}tyW2t z%u+&epuP@gGA!u0P$XlRq@<~85zQ<LE>sf-;V{D<D3m;2njVjd=|N%#*6UDSy6gs( z%?U41g~=>JzoG4=GvO$t`KTa%sbXNxsZ0cJw`#RPLA)w|up&$?lncPmbLyA|xka$C zfw&2QjV0~0)RkXjVSG!oL-b6YyZX9I=fo^Nk61jVW(S)QnaBoPN2A@E)t%eWw<&tZ z%Js>JHJIf+hxMR@tM!fLn`xagEe07#HUqxI#TqOZD!?LyTef2UQh5nD5TjJTG!zk0 zXEYOILtcXZ0wX<n5+<UMfXTaQcsdmZYKd&9)FIs{EE<(QK<G$@)uGx{PykW|l!+sP z0&%Qk?$T(3<T=%W*nD5OJJ9MY3HEK?y5kx)qc_&V!XAIw<?H*SGd-iczRkC9+Kin` zdL@E;UyAv&0v~^+27@YL!D)a72Qa8mT1xv0mRAOOQyQi$A0A_Ipyi!MNQ2TCq+-oa zXobjv1=0#9iZRu}B}&@!Riq>U(?dl{EMss`*l3Uspg6BEm}Z+b8q#M4Opsxh6th-D zM+gXB$R0^!nt=<YYy?Kk?(1xe^=|bo5blBM`75`soz>(js<H4bO%CB}u!xn7Ry4?Y zt!LAUXXdwsn|3bWG1M^s^mL}WS}KLtZZFMoPtR9#Sc=+-Qq&+X*WhOlh0|lNtdH|> z+^dj9w2<BPSmK!~#$+MI4Yfj;-HH!fJA#VQVYqxTIFgmG=tJKn=p3Rx)_k<SHPC%g z5i`_ak(26fP%E#j$FwO)xpEJpL}2zZQSLdCtFR3!;`%=6Y1`J{R#)wD1zbhZIeV<5 zHaFB*`Rcl>M0T;aa5IaUquhJ;`8|G@Cm4G(pGuWdFEj|jvnnt|USZLN1hJ2<c~frQ z!<J&qli>w_fVw5%+{qQAk#?jF%))wF$LFM_fs#!GRnyC2s8((AWEbgFz&uF^J=AWK zV1b$DsO4lMp|cB}he5}1YY}kMcVhjg5uRjRG-VGrrdEBZavR9LpkT&yXCfzx-<__9 zoQ0k-vDS`kY=yYRnG_O>iP$s2&_xZAv_uEFeBSaBMYfB!M?QKZcz9uxoTKt+T|msz zhmLWkLvvU~2tj73(O~{UCDhZJ%UJZ|Vu=6HKY9%RfQ?}3SEZ|U2}qGgNiupTOc|e7 
zW{q;?HnDAeT{W)hJRm39Ha9i=X!oqmgB@&27u(*~RU7FDRC`jZYi@d*HOHp;9ag_A zeR*RvvN7M|2mOKD<!S5aTjVaSX%n&`JbuBVwwot8x_d$$GiD8XV`X@+ueyD3AmH<I z&erW~o_ag-GfD^LpDb1TxJA1|qmrGR;dx40^L$8V$4kN>V??Kn@KB?g%w%JfR%0Vb z&=kvb?U9VF0KYQnQ3s_5q!GzlMGsp$8cG@XZz4!dn@vfnQs@%dB5=9UM-`#prDCG? zP)>5Z>WidfW~jnEX)u~FsOAF55^MVm-p7dyy8x2`#jJSJGIWse!^v7Xr2t-^+k0!F z*Jhm=*_mDF%^!?(7mX^i_BL#_IQ-5$XmXttIop2U(nnJs9JoBIlJiO{<Gj%O$31gX z?|KV4&(1t1Tt4>BDSe=HVeF)bt>*o(Jc95F;tr+B=^)Dn(jYwKOUuotPAMN`6E~El zshrNy;dsH3;Wf>b`UD3eDMu5oY{qH^&pXU!h{S+_p=bzPw5}?u#u43fp;?8L8bUD; zM%B1l;U@|)(t8g^{Qii4eQ1Ki(fZg#m#d>j<nm%35H_(D+K%_*S%2ij_9mb3j^f-? zCcG`b&iCf$0@%VGr8vG~G1B!~s+)zN<FZgQfK}oN6Wa(K%2hPSS{21&;88o%4?0yZ zPbb=X(l{X$HBNAZmLD0aOz-K+bZ^NF_JY?Zs}<QuH8nmhc7ShuSOCNlYNG!Y$c90B z+mt9q9st5XAP>T@BQ2+xArl57xS+_zDxoBcI@eH*eflU@HY)ZV)IX8!k-z$3tG~hT zv(xOS&vAEks5R76gK~Cfjh~lJD5~%my8Wpwt-<t3jy&+|JJ*Jy!BJ5*A%DZyp!+5E z&PK>(2F*HjTwQc%no@3wvF+G7MT#z!)&b;@W)Md@g;u=9N}J~#iZ=`cPHgqmAUe^i z0)xo5Yl&%H7!>M(i{LC(Pz?m>{gKPMz=>GgL{Q#TsnS7J+Dgc-0#g2{&M82Uj%WkH zKZ`v=s4c7%T+vclpI4Y!_A{bUt7$1H#By(~-63o^>V@1MoD8+n9hoUSa|5Lon<s`P zG-JM6L)|QSYWer2eq1(goYk6V1BX2bIykSkaBFOq$g#>NVlWp&SV^MXK4wQqtZwz3 zT4rQJF!0#8)w(JJA|zg*b`9ALqef7!CoM`vqy?tJshQJYL_{^I<&sdfQmBzLgR5zY zL0=E=h!H_m#V)<pU4u?0aycjvJq=QZ`sUO~+2EN0#E_RUo7YmB)6km=cb5z4gI<uM zOo1vmy7$tP9{nxYZavyP8a$OmZ6_c~$eCffhve9C9n%RK4&Ki5q6sv5NVuU((_rKx zwM!VWv%wI`fJ{8~lr6#n7-~!JkTwTZZe;po5CkNlV~r=QZX872PV5k0*EBlP+Ei<d znDc?E`e?K2avQHslk^wBFnqvM=Jk)J{$_O%l*FC61V~#1m0++|@&OF2FQP0H>DUnq zC9Pn=dwjN{sW|RpMGaUM9rIhETE>v=w6?*D=!tdaf>rMF`E>{UZ7vr|4<|Yt(W3R? 
zsmv3>-`mump+)hc!k5}gPpycSUnslE;f7u1u&-TKHTkhOPAwSjX?tqVET`*@7H{Sa z^hkFb=D~l`!ym^yG;3GUJj82;XCW!8BXoX^4<{;ce!&1!Sny%-qgelJ@x!*tv>8(M z`lki1S|C)UE9k&zKux<zx`!a$0A!me8jqby;H&_ghO?zNQpPHa_ZcsNHNb?caD+6= z8&63yu%>z1)RCeLMSOGqtMnQs^e;YBc-?Ig=y3_VZ6;(Dgyr_Nbmi`lHSKcg_3_Fu zS!krL!jO%^<-wjymMIsvjGA199TZAcW_(mp5(D1_6D{EeT{pExN8p&uSR7uevyFzj z79uVWIS&b0M!m%{%?yzk4?7<Ki;-`1h!%04D?W;p_sAudu1lnYj0`JitWzOVA}*qE z7ldZM9%{h~d{pVBQ^vd1J6JPk@$QAbIr*Lqq3)%@wboQ}<Qd1_I&H(-ZNb(J%Qih- z>|8o`^|lB5sv)Nq$Y~DbL_FA6Bi2RMZK)dg8N}D*;H1k8{(qyuPI@a<FacgFkzgDD zHze4(fwM`lSDwLv?SNu4t6=b)&4I1ez^-{0a;Qb!1_H+vIilrZ!DNFpNV)5X9}(x6 zRzZe<M#i4zptzD+Bv?ma$<M^K(9j$m(h>&6`%179H1zICmU$^8f{jx_e7Ga$f24N8 z{6jmX?Q7z>U~~N;mzw#<NcGfU=Jk~^xNX*!c>y#PusW@Fp7minG|b3pPiq+ExfK<x z4Y^B0+17jxCKS@=O(*T_MT~U}b_Q()MCoX(aN5$NR!P*HBOpl{hKy?Dq_DKQ23!=8 zbLu)tmYb_SUXHDjekw1lGOfjcmM5}_>)hfTvLfC~`yo*dIwi!aVc5aQNdz(1zyUnK zv>aESDuK^UN3x5eP|5Hz6dwvX7T}+$r9>tqM(I~omrEymQ%H`fX{v78G;8|0nX6-Y zb*`Dc4S|Mew9_+lR*i2#p1qg_x~l_#R?m$1JGS-vZ{<A`YJ$Mr016xnG<CZ@Vk*hQ z3wc}%)6WAwUc?HO4*>NF(jJ(3BpG5l0~*YFDmV&{k2y{?F$^bJKAp-8_$oq$0mDIT z<x-`U4chHa@|-9P(Ex9@0g!=uq>o~tJ}v;T<IS2GzVfaWgS}G7Sk24>R>4adPrbU{ z?Vq-lZ4wdL)m>QnlqF!ty?PLf*u>h<(Gqiq0Q~4E)!m`c8t9UYsfxWyD;A+evM}7Z zcXYOlo@zxORW@rB&=rcZpgxZpBSV#ucB4L)c2KV+%q&cg3ytO65iHuc4-vglqj9Ar z9I!+wkf%EjjSbbRUW*`W%0>VK=WeFH8eSsBlxWb1P5^QrD;$1^&RZ3o2#!M@-Cq$~ zi;gOCeuvE>24geI!jhTuC*@cK9ST_Ibi}&nMsH@fca0H1)C2FcwC0)JeIJha^XMrJ zH_l=AwLGTQxb0-Fl2Zt)9Py^4X7v-WIt`(X>ktaMd$^AbS*@Chfg{dIA=gZCIPgjc zHtK{x3F-tx92bKgj?u%QLn;Hjzp;w#$8CE`o22wsyYdbiS=ll#XMcG`01xL?&zCZ6 zFTBX*$Xfu?3x~RMSz(GYUoiC*n;pCcTV|A`2Sg9oM1D4wRsS+lqwd{)GFQJHb9@eB zy6D7##v)^=v$^7z!IUO+AgNirJDfQ8Gw&Kw4Q8|da`P~v@xRX)6MydXnDSf)E%vN@ z+hv>o<Itv~%r$ou-eOVXwdHp^@;+<4uG}(aOa1FV4f|W6?wsuZB#UaCPuD{h9gs%) zbVA^ih_oumjnXPW-w1y_spquP^rqc*#UqafyIAue!lvm(ECn#&3M}mWNlFJdo*@8^ zB`Njcihqi40M$zxNUyb=A*iCf82g~8pnh0Po=Qiq&a$8P_q4j(Ko@N*vUyu9`Fli; zu*z(!!)ZIU8gh#^X2l~lr^cz5jPhGVX=+?gf2G%5<|tx&iyWRXYmr`SfwZGIfuCA6 
zp>XOL^b_f-5d1L7W}?#iLgh!G7L+WPY}FJyo9OEhfP^DTYRkYXsYGP%2o%_Gk^^m3 z6COg|S70wwiF6!BNvSjGDg!8&R25WH8<<)UMui!oE~V)s{%9;KV%P;F;@ZDC>YPpo zD@`4}ITkLmqX&pBwbEl{HMh<5xbt1i=0K4RXC}7=Q@y^Z-!gxu-RfkI4|2=u7Vf0D zS`&YtzinKvur5B;vlKPs<T=uf=Y=!1k{AqJaL_DKOqC*Tg<$(r!IrXeax2kR;j~gl z)nO^L8LDtg(#H{ATvAGkxuh5Ym{QeWtrVfgcu6|%l&)k*_#u)KbmKT=$;s1|<>sN@ z90r*G;$?`UTk+e1z!-|F>mn|d0)}f^q0`}%34{eu!3t!zDni(JV8f;Z2R3bd>xDgg zpL_P2Juk40@p#*zLv3Q{y?gF???CF^Juf_e_1<Tny}u2QC(n|FTe}Q7-_Mi<d~8@7 zkOD)X>dbLd7A-RxhyY17j3y_qut<%jh_FG-z$l9UH%H@Z@cr8;cpx>O?LT`AtKpZN zxK~_va*nYJ@$oXSEYQ8<>~Ye4W)oCpNVjTM)ktEP%q19TOm+Ut>&o#qrJiXwcHLK$ zZ_9IUZ^dWr$7i(~d$2BsKMU50`Yfx;PNH2$=Cg>{och(pQjdd?{O80-ZtNj4ac`PS zoSir(A}8NB{J_z<G=``R$qet$N;kDo$R1=ns;JxVyJt{v8~(G{v)PHg_^j=QM^*Mx z&J?x0=#c{UNi|50uvYqm6FQ~j>Chk1r!X~l#!las`5OnlgG)qY?vm<;X}vYoGdE(q zcZ)eEUotvykCo|adYBz#Y{ORp&Q#sCM2)vf4B~`JE2$ts4uQIkS`J*c{x(aKZR@O_ z9af`b=@+(M*3>z<S>q>a#B(Pj#xYbal_3|Z`rE_{)cI$Jx-gN$SCFlu;m=NJMOY;k zg&-S+1*2?mo!XkJP!di&EOsOPw`xO!KVP=Li1s3Rmd$owbxrCc`vcDGQe%xB{C&p$ zQu}>?|76+Ua!2w?ZQsl-wq?KV4v`K!^WN$-ik-vN!fj<hWJ87NG$^5B6Q;VOJOH#3 zx|=qj$^#(iqbe&9?g^xAL~KOLW>uZ2sj2oluCy>$mQ}uJXSOKdS!;|E8(L5q`#ll& zTlOrQUEXI8_(JZttvPO%`%N3>;ZDrMi)tRmLRyqNP0v8O-(tI!5^mHm8LJ3^EHwB+ zqLwTM<*XI?YoQ&bdE!pXf#vIOw=}0GM9fJ~NGFA#@f}-?eJ8ib^_m-XK}kF96Lxw( zJB|?51>rcsZ$Vk<>@;pdhmKBaId(F`vYj9t#J6f>9;cz@nHXSN6BXpAW-RL96~68p z-n_osSHX?fYf7byy91rw=bqc$8E|u#ELHc?C*;X4aU3>pOd~C80jdNhu5o0zb7C=6 zX?}e|1D-4!h(fSJLVdt~2Rlie(@p4^=;QEL!$P7XGksfykKFLmwIRPRU^ItnT0nX0 z2~>-RdlxL~@dW*)R+|;7(r=T(S{eZ5pvI4XWZZYMMXtrkl}mSo-aNo%<8Uj0KzM!T ziXq-i3MJj%f@m-p1chcnu!=}5>GI5q%FOT7Ce`I3)v20C81K%Up!me;C&?fs4g1L+ z*#^un&Qh#GcgB|k(ibht)2M``ooHe;JB4duE*tp1G;=~6<pV%PWSG7Zf>4(xR;X&? 
zGb|z46@GxlVCz;$VuPmIQ3Zo4jCXeGjpO`mdZ}01nAaVf+uhe~ycDhpjR*MOGmW_l zq|4@V$AZ((rE~wV7-IVj;^=C?`lPjErW?<WxMpx(GA!>vE|XmFD+AqdWn^-=t*^bk zPkhwZh6`}aZ%<i_Y7;#tU{TBhdP>V{XraE`bf85GYH*Inq6I}U)YbVL0B$zJc+za} z)j3-j*cv?C7cIc=0)RpaqWDLpdUWlhsy$rbUxSkY#Zlng9+;{A3xJX_8VJ--j4>Rz zovn0uSv=LlBHL0ux;TEE?F=J#91o|qu%T3UICVW+N&oQ27Mb*V!`=!n{y)=RJH~bY z#!8q;mf#iL5$267sCzilQlHvXeEeX&+nrh%c1GC!^=x-o$YULPYidh4wF&dCYdt52 z^aar6*}(VIYm0#QCZz2cV4Dh0VVe;20n#?Z=mv_fL<NvNh?*q~Vg=51VmC&dLl#?M z$!H4epn9y79DSs59z>ct&r=bkhXHgiA)S?G1E(Tiy&?!GX_b@8ny6g3NKu@r@}@j0 zDyFVa&T;G)!6}{9MQmz*L9q48zUGb@+y8X8uf4%t*WO}e%RMtc_r>+8MB_47_4ug6 zFK(^B{CM&k8}*_Jm;ck=zH8ixM_2{(vw>W<>++&&Wsc?H)C&hwpX{BipFIB4t8M<C z3ENoJtF*4>oV=XJj7QLmCJ$$1k%9&0VyM{u*im`oTwr4>vV#d0Rn(eZHD}|KTTz?q z8DHB}Q<cx6+2yG_C+F~6?qbou&*dIx`J`zJ*X_7-R;niTO0i@Xwnx=QLcS-jKn?Dr zz_ii#6jNt0Dy>p5HV6o)YWP_b1*rWRV`T`Y|Mf7iHr~d8&c@VB+2t4Qoju9x()lg# zr)FQn9QLx-8}9h~4)!{$E0&XP`qGUnHZ))?M^EnJw-^cRXP|#z1{Y?`;BbcDMPc}% zy|J^QXOni7d$t1x3V$#I>MwHo&>_wa1X6yP0r+daG>@`m7PGd^>I{woKUa2Xu=7G> z9t~r&9Al}p2&=8AI8+n(^v8jkP!Uc!A#!tb1?d-U(#hTIxN!*g)1)Cqo{6(04Mm4+ zsBAD)4ftmfS)m)UF)~(d$bA#yAyZ}G^ilNQK#Z_f0M~%x@)$dwnwga~WA4<70z(v3 zOq)AHr__UP9Qx#lC15eK3JdG%YQ(s);kv>itAUPe*;b2=Tf#kgP7d-C<KZ-lHb{CT zgA77W0zecfQj(k}qX_{MMKi97`a~Ace8_obsoYf-@?8o!n}yQdM2U~W0;S;cK?5#q z3y%}Z5Wj=21W#A@<rN|x53!;Lkbva^nO0QA@~Hg}S~KcZjnP7ITtN@W%hs)3Assxc z&|6hgXyCtlZZ>V^%Wn6yc{nWYy|K8ss<zoJzLomrswJb$(N``>9jlx)uDUc=WJB<u z`v$8yR9H|T+s><FoqNAfP>@v+o`W$Lo*dv_<3Vu$06A=<F{eYj6#E*T9y9WM!`eTj z_J9uqhX0U(BfzgpfJr{#gGq8I%UVnUD=p!nf-nxSDAl+a0>5pL0YEp1gkV^)1_VCx z8ms*2S*tIE0I;N}y;uQ60-FIf6#7%n^st@gz|{0KrUz{?4ePQ?Xw+D165}q}$SPMY z@tC7GFJW@|Rk~?fEGFObjnq#M6`DDAOV;Y{)T4X9U^pB`Rzclt#C9J!`8#&K(TcrY zhWbH8lTQ5T2=pjROFu#!G6m|CAP+HpMQJ7Md_f6BtToa{0PS)G1$U@CY-2s^?d!-b z$+Fo%OBXS?#cq++ZFAb&=akOqZdrThqy_h1VyP~*?YO_I?O1QOc$uDRqxHvuEB&4B zG1?H94-<=Sm@KUtbx+AsltWW`iJFC15vRCTrRuBd2r@_&q}1CPdTFRkxp#;1ibLqS zTAjZCvUGcO048WBY1AVQsvgrux;x~D4ON~GWCIy7hYD;pRO^zc2JkysH3oc@w16qD 
zMXehEVa|AVRCGmBfCvcgjeII%S~Pn^2f;J4;MWyLyuRqNx%+Q_G|<}+?JMw9_^xdo zxAML<tr&INoSOsX9l`4Gz=b^ruWnj2D->uO9h;9j+=BVf?Pyvub?}CEanrDXxOTI) zNFUOdqJrFw?V9#cz;liZm=ENvr2k-To3FNvdG*@|U#ZTz`s2-eEY+_bJosw0We@w^ z?$`X{?S1ZjzZMlg-97RTC4Y|3_n!KE@_mpzRZb9MLZ|=Wn?`)*S)cz?zy9`(?tO0y z|7*K<kN89B7u-MZRG;sIp3i|9r+hB>Y(NMk?3fHLA!uksX#KNCWF7tD_u@YaW?!8i z#nD^cxBe2NAid?+7O|H{es1;=vI|Y+4yw;JsfZ)}xg$ryf}8hPgD=M8F9xk79o+EI zTimxE#og2A_JVJ45q})E!2erZChx5B#73uane619sTjKswKIO|VFgqpd@cfSgYz9- z%ai4b^Szun-{I?}<L@d~15kc8N@z5Nn$oC*1_2wylgpuLs6LR6WLOcDD+1#mc-b>+ z6y8~xplrXZ;oWeb&GJ-VDYJB+JL{UjIboNzz@BGu@S^Ij(R_Db$l@<@SJ-zfN%f@O zPyMB>D-4HCO1tR-eQ54+?9o18zaKZQgVR|{U81tdIs)KZwId`1G=Qt|63PgaD;D+S zD5z9MnKZk?LLbV9LKsdj5_SpROkym~ddi9D*k%uEGTT*Dt|(7I6gvvBV>DWaY8{ex zR~6C3s{YjRGTM#G)d@<pkSnSnw!=1-sX2=xNT-l}@Vr!V0Q(XUqk-K!?s%8&A9&%< zn|9AR%<kV7y*oUsyhz3lG(A)ui)|i!?eSFf2lwCl^vngRzuf=Nsm%4v_HWLNu8yz1 zrGl3f@2IISZ}>uV<=aobok~F#0rcyOf!pw0*okjsSgU1JO`@IyEMI-H0i#c}Lp<#b zq;%R_5NqluwFY<`f?3Ial~nBS!NX$+N#1n{nlu7PpKx|l823_Db6ZnS<;U^)V53wu zw|Nw?@~a|KKVWh3JVyd$^zjACA&XDJ!IUmZpFzO6vkTJ)Am==&Jfa47I1gFfVQSw* zU1}`#Gket0)lYb%`k5C(^+Cpzj5^hPj}S%Fck!%;GAKM$I>6ym+4D^73y=f2zBu7C zjU0(@oMPr6T~$R1s`sOQeY%lm(S26ywCdE-5ExYB?i&2nAKGmV54_qrYxVWTksCTz zf2q1G5G?SXSJUCLvtKzPkL|SxcfesQ^#m*P8>i)a*v<0T$WEIRHuJ!VUu<qMhpTCZ zdvCIV&A}O=?wJ=&@>W+@cXbZ*ZeBs18`*#;9aWm?@Hppgba;dwK+jS!AM<>U_RTb} zZG2Re5<$rvx1VM?O0(RCzuO2BrnNO=enPaPJEI9yp2Jm6PQ{-TmIs|>ysADi8S`0C zpPa0we=-0Y$?#;nwLOf2{P<)wpX6P~W2YvQQZh&_BdVX69LDsO(eyoFwYf)4Q2iS; zSJKivl}@WO=1i5$r7`ROaoVUth0@C^YJ!9e#d0w*$jYD!TYoCD2-6V_-)CW2bHcuG z>bl4ht2I7e%}?!Dtl`DWH{Q4Rvh}RNx4ghI*uJ1M#BSI&DW6%+n2F{s?E0RL&+LO* z#>CcLtYccgFS@g#aZ}Gd8%k_DdNOivQ2Pt~tklj4et%*d<gDjVH=E>S$UQzinsDQx zi6n1tG|>ck2e5U(EF3!$Z>vuPA;z-$WH3hZ4FYwBpP;Qt#)wU{J%$}gY@)<N0f7&1 z9Y}j5n*$LK2Z4{I+?E4r%R+aE8f9SD3|aE>rvT#uCFLfHn*65tWM_QJXKoz$`jqcN z9W$DoBawip)PHF`OLj&P%nJgr>32rwViLmM`z)3of)@qyU6({WtsT{GJ6QOM$UHn* zdul?K{pp6iUHPGarIa5VvGm1+E&8tFV%;`2+~A+Nf7OC?W7^G4S!ZvK*7s9&BCk*{ 
z=-sL80_Zup1M}e0u3(K?GDbSTd^I#xGwN$S!MYWf6Jrf^2TWXyskpd@cEQCLL55Ty zJv1Y&>&zJ>aFgTVPSSO`3-JlqHHqRH%;kXEAe+PS)+111s(lP-DnhOFo=`pB)2g@u zTPI<@Lg`*zj(XMOWC0@is$EwJmF<sFk^C{Bi>63Em{1eTFs;j!wp_LfFUVU)ADOoi zFIeV?&!VGhHiFYQwq}QiW~r;hm3UjI71Ie$F*N!z%*yBtn(q?VA+A-NOP8~qhst}_ zW8h7RS>*L!v=;W`GG}}ME1|6!Z(RbqvH?7yovR{~WG<36MDCC_R|B=G!CLqXW%?Np zokBUER<N-6C}5QS$l={e0MIj3vyLQbAYk{;byQDiuotmAC7&q@N>f+W^kd2Ri?E)c zaR<-RxJxfy+IHF6y~W{g&);<Z_1M_+OI<g3TRKOxz1t>Rm|)}PLIYqmTCF|RVyCDM zF17o6o2K1g+<VpWhuEVLU(uNx{XeSV$I<(CTDW!I_<_rO;r(kpQ*U3rp+s(OO@+=_ zxaqT2F2@?o#Tv}glCx+H79y{gxCm=7&(qwAS*cI1B#p2ks%jq<{p=E~$4yv-=^n^7 z&^xOh$PLrzc*SZ<_duQj9fA6Z8L9{JjPtNc=BUN0dLYkmC^FNW`a}#bt3^}AskDM( z;33A(33%~TrCX*>#*&IDrqzT4YtB(lFLMw<M!7nW%ZVU^meve#=f!)Q;!~Z6^hrIc zRdNh*+_vT9gP<o*Nmc7gT4N5^P_e&q%?4b9FLfbK<;ozvCb5KQAl5+1K$$3wqRv=a zRA<e9gY%Tq!Aml*_|MWh|7FpoF-PfB{ggto3N*{@Z;tjCqK@!Z$=`^MSQ2RkwKy8S z%<eeqh&=jV7u!H%^<4{B-d5tduebG9Z(41MZ7reB{{E)JuHJRWpF3kIoYrA;XU<)% z&g}oqwRUu%XZ^H5bY5w}Y28LKkN3q~tgC6-3up5DG-8Df)urphD@e6X!BT1?#pMa4 zRbQeAii^dk6BQ8g#(ehq0o+HL6Nrv)qFJEMSe5n3@k*bK$NGUjBi39(%uEg^L&^{c zRb#Dy4J$(;1auG>st(DD_9>)sLX$E7jaU@Tb%;6Sjj1QPhQyRMJgVS#|F#Oofj}eO z><;h}n$U9Rut^4$LGT5#mSgF13gS9cu(AR+C%O<jeOwdt#DXZd+2)}7X{NCMYB!7i z#t}K}2~L~tl^b{OLsw<X{A%|LT|17AEW4kuh1<Dqvr21T^`Z~nNWJneb><u!nBB~M zyMDx%S=uaY)-C_M|FH587oi7C59VSS>(`R!L;lHgXq(T0;0tLsfVc_3XjV~eM5f4q zRvqGD6;OwGTYYkn=BEX(ZyBV`*V2H^N0ylCcYf&+Dp%`*m@go$-<M7VPfP~~MyLN8 z?MTdn)VnP29g23%gQwD%ejFwCQFWD!Hqs|Y>o9L!D&&x9t%VH0P(}L172%|vfMD%* z;G2^b3R6=-o1Z$QUwQ!!h4W`&Li?QY3#l?LdG2`&X;+V?oYtHLcxImSP)p}IN-;DN z&N3cV^>L`c(5V6gxfX9z!Vbs+a73sN03x<MJ1;-lII-)3S$*>^rN!WF!H0$G@UekG zD2@t^5IRs8Fe}RQicVV}n0tf{6!u6(F(`SYoF*04!5LG4PKmX8`XRKD2FQw#FL;eP zE{Nz4&KtE{$g+V4)`G5ptmE}K+vu>q1=~a|k-(;1`*vc1^xbjm)}(86XQZgtX|vW^ zt>N}?VQh9`Y5Ul^4?CAHi464FOY^<C*0EN9C^XQqV%F9zgZFLBY7U7_8xN*#`=MGr z{}||Rs@TO|<SS|QT;9Fs?j8NrQ#)Hn38!9qW3|=N+UvJXo!}ei^Skmceqh>VvEo7@ zmnW>2>&8ZAOzG^6MmMcou(M%3)|`%>75mI2@?JVHjuB)yviNZB9xxw-bYk@+mQzwm 
z4>nkZRDV6Oeung?^dBCd_`v$!SsS;swp&ZQ+uwWm?t7k&`TPy`+>Vys2@`sGzy8$m z*5eb6b)l|1=WkmZ+u754Nf9f&`1bSnJ-PS3Z!M05s{Niq*YRw6waN#op5by3wVSP| z*qny!OVW-VK&>-CWOB%?tnH%XPW9uPqN0>T%@gft!ABLG3n+#++|i*qT@sL(0_!yC zDQu@B295I+n~Qz8F*ryYD3Hw2mM~VPs7i;QAut_cTc>)i$Ebg-M8##4aWBvE=0!%i z(eEOapl<68Xcx|&%PVvoy1;arnO@=OesTWBo6%@}bfMQ(U{Sro`vN6dE7`!a;wQ3b zb;`1weYig#-P%CbG;E3rXWK+k9rAF-ZB^O-xk-3#{$^fr;uA+UfBn-+@k>l8@>U-B z<sS6ItVE2W3Hj9LHKjiQ<|S8w2p{rDoq&dOGN}`o2XTKi(TJzY>%p>#r%)S<+TfwO zKq2}hC+eu2yG_vpOaSBCDC)J>J_=rV+*co;;2=(o3}71c?L?wR<r+5Ydxs2LjRm}% zQ26M(i7vCM#o+{MuBYW=XGRh4IHYGeE$V;|Ou*wdw1bP+ITIGb*tGy?f-0;8Z^TZo zpxs+v3C)cdIU?rijaZqUpme2=f+I0!KC!xB7wgVwx!TQk@YaGKiiJaLrJp^#w)fia zMpnOe-C*i2U&r%S>-x5~^;YXktmohxw><mY-kX^J?xp?R_VL?T>lC9}q@L-yp;}ZA z?U}Z1{ba?dMV>^%mKI^Dw~7<nQoh~nYuood$=*#}@~6GqU0%<pyBL@PbTtpxVIC}q zpEV<TneHj>j#Bd+R5v7xXchqL!Gcw^O7W^_B8I2Xax)9FK=eJTo17P%Jus-eCdu6k zZ<!#CVDQl(_c3XeJ0{(0Cs(C=6>MEIA<itYHY6g5XC!P8I5<3jy-engA>3jjiyXQD zd1{~(^yga<cqj2{D&OX>_zcPB6KNT=9V~aH5IGWzI~YSDlzS?$e+!lly|8gh>Nf?q zo)YQ;_JnQLocDfPFlpuE+h(-(U8nC@x$?xDsS;DXCQ@rtZw{qC-34Yd5S5J->ZjX< zCEzW7iv^b7_k(TchfsY)@o^jC5N{e0;N33L-c9qIP&4WHIrYg3(hi79Aa<0DknCWs z#9QlC44v9?BC!NncepAHtmh8=?dl+5x=?+_;{C)Fm6#3@kDgACj;@4=QDsQ{n;G=( zq<RW7q4*r>uuKD!xYdzp#w*6BXJC9f{+@<cC1zs><iIaNoFXxv+#wY3;Mf|4IdB2A zfVfi$XigQwv!xeH-5?(4_!MVyd=rjF%o)0JpfH@4JmTvVh{!LZgNWB)eWUI@uTgRD zRQQF`TnI~L;H98sp;!j`eJX<oaZtU%U*y5N3ZmnJ6U+ww*4AbvZ$Y}o**oiz8LaTB z=>fkxdSKCqH>IxKVzI1mZzp0Qq$odV2wRxL<MRjmmcQW-HhqC@=lq?>#a8Q<#sG-u zVs(L(tDo5{uKh#Evz;}M7TNWVLv9&}22+FV_WrAxJpQO<fBwy}9oxV3xwQ*A3r^4L zYRsz*n*Oacn-cIi)BqF^>hvzjO7D`<G_@R$M?q>s*M3Z8D@|?8?rz1@V*4cR${n$H z($vB@P46BA#7Ax*>>eBDF4Nngu|Aor8nqHJQ#{u}ov!08^+O%$sqIkZ$wOiFm$FJt zLJ5+|3Gg~f8(?VQcMT@jPSipjP4SpB-VVhDA3AM>jHZwyZ8p~6uk&!|D9i@pkpFI} z!b>LfsR}<8Qa4=m-2=n#S<8UEe-+!oKY3gcK>fMnh2c5d{~Vv!IL1Wwpq*xH#^adm z(`QPr*mimn*bS+HGmX_#O3flKKV;nk8p_@?W}5aOc({zEIG0WEDa6w~#}mYFT1~S7 z9?2m`UPUz~f`Hd0@D3W#w)GS%7b&dMWxJSwL>bPLG9`MV97@_05s?iQ(T8fwY%o95 
zGDrM8L%~4Ix(OS!*dcrkQBM-T0%}WYi7yWTY%BB|TnRem^QcjCydEX+Y;2eu1ZX;^ zMD7~>f?)LxZx}NEDRBi`r%3vs{>Hf~qxAwvd&nNA3)^>VepSoh-e(RkowmSYIo#T^ z#cGX&d~c;52KT|HSJ=d%z03%?O1r`@ym<SDf9gxYY}K%4<?=OKFJ+M?pT*jMytlOx zS)S<cy(N#cD_QQFZ05c54qs`z^ARvoaylpYF`N@#U@m5(Co6PW!iK<*9zbBIkOZ5D zzebglw}sMNB>XC&t8yzzSLITVW{{PlloDJ$!@cz~oTd~sEvYBcmD5pClvt3hb6R6W z18V#Q=`$e8+i35n9)9G7QzzpPn?dWV6}o=cS^mwbC4*I#C0e%Jzw*Q@DflU*uTO4F z-K_^PF0w4Y0)35<#4DqUYBLCxjX$XeQig;C^kc9_fVWQ;s03w!o%Szi0^COuV=!E| zigCFS<BH?3!?wuOTqK#QHD=TxjYtD(>9ngriztwiDtH*9`tC#p=pPbZcn{cuPT26K zB<>OlGN#wVNP%Qlgzhja^Q!Xv=%hF^5*|qXFtlQlyQtirCrU%rQ`)<>Lq#UHV(TrM z@bouWb_wTS6yf6smfpQSR^8tk@isZ^ei3MR@u0ptV=O2kwH|Lut?nCOs|WQv)B2C> zwdWwGK46T{Uys;!<L9U*aY=pJY>M;}n_oN}P2|J)EgRc-5sY6JzgThb&W3CpvuP<1 z@h+n3@!5{VoKs5yCR$QcaUIHpQ2*|^5YlXgNN|&oa<FCM=M%S6vf{!%i0X{^`B_UP zBN<ej#!ztzCJCjh)T3xc3l0)Gr(io%A%ICf5)fTMKzm>`XBO5=AB`}%T8=5jb<q^o z?{-<rvnFw?^>7tGYz08Pp=AO1ox9!a$^0?lLU%DJ?hfa~F0fXolFiFockut!N=)5s zPG?I?D+}x}FJP#V?76f>Yf>-v=DABwyzlloEE$8_Hn*PLI79G(Pgl}@bi>Yg3iA<! 
z{rpue8KnJ4{IFns<uo^esG=UsrJYIpv=DV~D#vFzk{FBWuqaM?3t?4q>ZuJPt=6<* zkStNFwS>y(5T6CD&X7NStWW}QEEHQp1AZ#uE5WRQkVJKxM?-2_y%h>E;zL6E%>8?6 zQu5JrfzAnK8Zc$jMOo;cfm0hfBkysccbBgW_<i{mkHu2UXjlK+$(`BG$9IKeyLU)O z@_-OWGg})yF;9ME?uUG)A|v!s#;R?OCW@f_b7GUGBUD_EdVPd1e6(y)g;RzDV(heY zu#$&MGZYo<p-)~>v})7A16%4Ki>ss(v}$OiG_CTJN7^3x<ZtTf8F)I&5l0-->c`Vr zj>I|sPab))?6{krwxil8u05Fu=X?^cfAUaWxE|e7BJ?wf>(jkS;t}*Ev1%;KT2dOW zi`1WPHNx6aJd}QT+VmOcjC?9XLd̒dz(ZXuWiHi+(LXqp&g<O>4*p_aOu6~eU8 z0d5dpEymKSwI~mGsS!y;QN{FVo?O%*Qc-JOk(ufjGcA$%R%zLC^bwd}vnFKqT1T(6 zTG5Q-8#V(N$Rx|LSO&tfRX1{TM~@zDvFw^%`lQ7&+CSQw%|D%e`JPQ5=>4BQ4X;`R zkH5HJR+eJThB;>TEYiEz)=Js}pMF^C5q6qd*?7_M{gJwe6V?C2<^h<39icig=F>iL zko1zSb>pm`g|j{z2uP21F|d`i6QG0A?g?T}3Kt6$L)z~k%ZaQiSK4fHrRg4!DhF~( zF@hJN=r56_liDc(<LDu&<=E80%p95zN3Z$`Y9nP971g`&0Oe1OS*hDoM^hiBu0>CI zC(CEsj`!}n;HoYAhJKg&@Xg4jA-+HLVrm;(%7W-=dTVOe@e}&v-%CBW=aU_a8BoYJ zkekoM=hK*IZ*Mf3fUp@4KLS;q&dit+)(KFw%D{xboAlG>1{@)dMq?uC91W93xnn4u z!GSaZE-f72CJeSdJy@_T);EzBt|u*wZs{o7PK1!yM|n+Rf*ZRzw<!^;r$v;LE*6Ul zJUEF0-ic65olACBfzZ+jn-JN{#zG_~;c&t5r&6(KAx<pBoB(akn1vwD_UAi6do!+> z=?lX1mv^HknymK+JD$&CS(dQgC3tpeVL24&OV@>-a%?&{Y2~e{mkutvCz;B+3dvf& z)DE)NgTuC9R(%%Q&U)_9CsmAT^U=EDu3+ont^0=fk{zj<T}kH6`seZUgxSU3=Yu5E z7%&mUvt^>6+(hkBEg2=5PJk3gkraVhiZkqdL=Yf4gaHmU8*bQk!;TYZx6(2M<r!Is z!Sr9qN}>R71RWV_XcBdZD){THcz32MuN@D%sqVV5DFLLe%0QwvZ&Y*wo%&WA9vK65 zO6GSNmfeJ?dTG4M*`B3YY;K>wZamt#HN{7{5K|=Ezz(wgLIzlJa1<{z!A1?ID-BB~ zuL?F{P^Zu!sx67a+z&WSkqa#EY)?~1!7OH8&8}vxo=b9OUvbf2Z`}CpQKPbQ$N7@W zS^2~^+k|-+j7=STXyNP4UTe9(odq4m`egR7YwG#qTh|<aGxosOFN}P7*Va02<^*N% z!mrM`?fG$H@0M(~^md*a?I?8{B`p_j=(_8J@b(9t*{P!(XO$hc!ve@YD;?)?k>pLB zNjXJW%^?!M7k_D}ic3385h?|%1d<#v2UJ@sm5hn0Jdz3C9;&LSgmg3E;0mZ@2pDsQ zSR1YGl0vL*%oCxzhx>SD(Gym1pOHsQYy}y{o(M&m(@|p;_$dJ-5DDT$7D47+@NvoV za9~nQQVY2Su-Pk}z|vzD{Q1RP7Bm*cKEpqPKSV{6bj}e4F=P+e7i{;mtAu26*B4l8 zSDP}=?tJQbi)DM)#Mu5>le{&heyi8VWu8Mcc<rVB5?8dep?4eDieJ0nq=xPee|j~G zvH0*l$oR5@(XPo`towG09F(Afv)aNo9t2C&<Y^R?pqRva#>Y4dC!)j2(^_&g<WJ~x 
zgmmIFfPcXRBnwNLAart$l6f8|8D!A#Fcof1QZ_G(pBs+X9!YdU%C(&&%-V4fW~U?3 z24SWvi<7L4o?w)&M=ciu5t;c)Rw%+&3a=+~RM0&KwgrVQ5}ib&6KleYp>+zIW?)0a zE1dCSNPH54LuiCn8^+RCTIo<h={R&TR69VmPc5pSs1vRhW!9&QAvvJGcAW_5PQ@Fd z<!Sdx&-3R~n+m<CmT|Ww5?=n5foo@mvU3_5x}c1F7Fc>SqCWT3%`ffUGIaMqbIYZ8 zILHJZ-!iEKaioUs!C7NIzgUD%eB8JC+Rz|tpZm2ddfCSvlLI&zGZvi1)jcZ}+7!Qc zu$`B7tQy+FyEnI<vhPI8iRJwJ^HQPP7v46nW0Z^l9)~&N=sl=mj_Oc70jy^}<|u8N z=*l!f6WY-cWl@780xBJcJ<(y0ZBR=2^ze~E%w>S65=2)*4?rQ06b>swP`I(mS6Bk* zIuk&R9QHa6SgmX7Z3Sj(c6n~@Y+tCfFwdU4JlrtLYB_xU&chbVlBxjf%3__tF~=`M zOGR%aHAlQRJX-slf{5d|ryB@m&I^1M`iEBl%2Ss%X0ux_KK{7gFl!c1X(#TSu}cJL ztm5RmmfJDbi?oG^{U!q#D^^;vjx<5`G%Q1KCqH0u#ipFsuZFxRoVb*R9137Xcv|}A zrWZk@LR8a%Q;&!^++&#b*mOGC^h_4o;TeyVKxPp@0(l6%hR>3WAjv4PRPit~r%8)D zKUi7_K+S>H{+n<v_3d<}RycxT$A_+Ln<MoO0*eh37W6x0YENktzUP5b=HF$twvBhN ziQ;3s)oVw7+Ipb637&$dG-R<mSWtZ0${%>m-_rEJgb8A&%Toa0EzH`?x)GBbK0S!0 z_WX^e=n222$eNYoC=hdIR-Rrv<@LuG@*9J}(y~j!Q%^n2g`+4`eteCd;wS!g;_jZC zuHywiq;wr*ErAP}30V(n%gJw}oP4se`Fzr_{ps{v{ls%=zmKO`=#3_pLIqHKJt@m7 z=v9l;ugWf|qE`u;BTG5v2w8g;zbu?sK{K-y&3clN_HNp~OPmkog=$@0a~Dt^lse4i z5UgA{(Fq9f#|F$7rXpI9@5huC7UiL|9QF?g>(kUg$aUc<P}MV3+~9d&aLg(sKR^yd z+;H&Is?cp7^Gn1*1tz+&;Thntkrzq(aa9IpvQ!>G70Q#GTRVLA+`?zzOR`vzqtMf8 zXW~H6e?aEdO!szLQ_qz8*z$pm^l0kAw!_j^UE{WSe&@)s6s2D9TbiqEO#VKy1>nn` zs!j{*$RCx8_bn`E{d&{><Ifse5GCz0Xcye=@uz-X<Ytvc!BQ@`ty<u}YUHQfQ?gS( z3q)#Cj_b1Q=|^oGRF~c$Yb`>P(^tK%=Bd|%n<(g?VPW7$FZHu`w>qWg(~V+JdM-K6 z(OH<w`Pu+kg{NaM3o$d~$0RE|v3=6MOmGtUGRj7bcSRGI&~z<U<gE*>ii|jSA!%zB zn?YyJMbOqr0uYJ7AoTJ=V!Wsfq@mT9z(P6XYP^7wmQ+z&1Gr4w?>bZ-m`t#$v@?z! 
zcsgt|fuYzz?Xa}>O+H42tgs7|%&;=v7|WZ8SB&t??b2j+qfizWWHD!vlj&=xY`3zk z)PXk;;@J-C&?@po>hFycCS;pd%a%BsX|dS-dBxTo?4K;qVwtFsFh8<dgr(F~lWVc` zE?e1Zv1HfTeYwR2w$w2L71=<=Fm~c`cKf2#P8B1fu081^lodOQimU||uVhbOJVs~3 z3Jfp_VSx=`Pc`M_=(Zg8qRZ`owTVqjl}nK<rgMv}z+0FKX1TKf;?K2~TKNMf7NzhS z9|s*V{wrY<fs|AKu_9iVxn^1}@X+bUxn}BL&RhdUum2%)EnowU%Ul~Mv0Rk7mYNNV zBK=+qUh{C~It$ms@6Fac#_)S>xV|RynjEb`7H6*Q+GKeI)-e;H7U8wH#q>4P%JgZO zYuIo46`5;Y>(jrQxfTFq<!7!9u$oh^8u}cm6^pf*Yjmm*H)pO<5iFk1TxV;|;^WM< zO=~uKGuJuVa^t?twO#v?oC%9%89Hce)UMFV)%7B6v9?azpdMX}#};Xu@v{e&A<NJw zXPLGJ|0_p~zE(ZD8jr0&-G~+3k8AMGRd}>ZTaDivaJ3P?2h``4BYR7qyHZ<^$5%sJ zmg8Ug-K(`t>f4s6_pZb1=zA|vuiAj;7UF&M2;J$WxLSqJSd4$o_|~bovu->$8dvlT zeaAxF5#8}d+&kUzrT_NcYCJ;UxdEd}-v`}q2=Bf^eZv|xei}tNK8?muj`4M6zUi|@ zdgi@VLpoOQVbXms#Ahv5pK$schb2V6*J(@eJdI!}Ubz8MY=(UjR??xnA}OU`Q;RVV z;*)8V8xT*mLPm@6dXjWK?3y}oo{T+pJ#NHx!-(s~5!bO1*W*WAH=VkU;8TY`GeX#a z?q#bsUQyn?Y|-L%8<wqFT)t>?dC$^iE0(QUQ$DS>yl3@_6=SXEty#RPyleHU4dokG z4J=+)K5Oy1mFvq_50($&>D8N-ty)svwQlj^@(WfEZrHSN-C}&)rHfasU%bBAI<=>} zeDu`Cs}`?YxT3sg<DwPIE**Zhx_r~J4NLJEt2TrdUvcT;H5-<#URAzu)j;`qU3k;! 
zFU@>fb***sy2T4OEFMU|k#4AK^|~dC%SSKWuwhMeI7}ZkNRQO6AFN%qctf@I0(@$D zeO+C2Ed3m(em1C|jp}Di{T#1;Hqp;Wo%%_FQET`=?W42hrfI#HyCqnoE3iV=;m^fO zHm+ERi_cmc^_cZK%xp9M{rB%lzos5D71A2<kJg2nx6`AqudOR@hEUI($L8|-NT@M{ z5&d6`r4M6Sht;_ZYnAp)1MWA1e^Fd(eT&zvr<rP~jnpEuoEZQO<4jdUB09CMwEu<1 z(au_Q|L3nibGL1RKG=jiUU6y`>FUfItj+91Ewg+!ZQ4}3g2cFDcsJByF==P*<Nq%| zIj!wNkm~=5wky{rp~j{Y?@H^i&}Zsunvm1lRl`OMm4Eg|ET1&9llENb-?rR;8v9^I zbEP#|IHQ$MYw|F5!WukUt7NwlSEuj!%z0Xmdt8S*C9Ss}uctLf?;HN)^aw}J&>;3U zrX>tbqI67n$(j|lOE;`sp@xx}%JrA7TefDy`r7r&R;U+abY$DB|4!y==cljE{Arzr zzZqTyoD#aONJPFYh`?C2Y~-+W&>JBakuWFl;(6!;S%4bDBCQxTYbB_59fe4088Ut4 znh$+={J`i0ko61V45$VhQw=n32xAJXy<3l6G!8qe5q|4<cyrBK3wq(SBG%BRwPR*G zaL#mUQ=m7d;#8gv{?T)=#m>cypNBnkKJ?55ID0QdUg;t*YtBY?>0<P3m<Jp85^aID zka6uB+84E*+GE;2?GM_Qv}?3I;C21Bc7W-~*Zmwm&Gj(r3~et1^OW|K_A6%5?$i!y z|3K9LpW1%yyTFV-r`?aeaw+m81KNww1<z|g(q7hnti7cDM0*cs)@#};+N;|4Fv}lm z*J($zpK7mTrvIe<89f`9V+JpSU9bY0<8I93TFfo!l8u<*%dsc_h%@<0*dSZb|L1es z!`jzTarJrhG~BMe54Pz?7)p;>Hln*Z%&wi#QfL_MKzP`ton#uzLzuV#CTSsz;$l$T zm#|Vc3Q68F09DGFPy1N=gi#rIC7K3Q!IvM+s@WLi+Q+gGZ0j(qV-f9`_6FjXQ8o_o z<3^Mgk7rG+nYFM9VA!QT*6rGFw0GDf*1;yTPBw*gv8ikto6csib67V!m(677AxEBI zJ?wne%Pv5#z6)6&yNLC(+1jVtaqYL-@7Nr6F?!p~WAoW1Yyn%y7O_ha<X+4M*%HLF zm$BvSGPVL0{;Sw(?FVcPTg%q5_1f>Xci9HEkzLL<u`AeS<dnCt&#|rS^K2UfM1uhw zz`n?KvM*`(uwCrSY&ZJ~+k?pDUUm(;mVK37$G!$~{u|hhY#+Oc-OO%bx3b&V?d%SA zC%cQ?&A!g|vv05i?3?I6_$_uX`!+kszQgWg-(~l+@3A;bAR}>z4Y3E<gX|&pF#A4x zg#7@mydGnZvnSY->@fQwdx|~Ho?*|j=h*Y?1@<Gb>iw9##D2nFX0NbU+5fQD*iTWU zeuTZje#YKpKWD#Szhp<*uh?7c*X(Wf8}<(SE&Cn&J(vgoz}{njWPf7svp=&B*k9O( z?62%^?C<O&_7C<?_AmA^`-B~1pR(iZ1WU1#2>3Ag7CJmdgG;ncu<$Hy<=NcEbAZRq z<qq!TE}q8`u;PWh2vo>!UcyWHDDL59fc2HbyQtuPUdaQziU;{<Ud_kw8a|eXkn;@l zIv(NmJj%!M2HwbHd^~UB&Af$A;Ao%D+ju*l#5?$8R8LOfU3@B^#;5Zc{2boR&*d}u zdAx_8&wKd=d=|fu_wkE(KcCI#@Qe9eK9A4mm+%FAAz#EV<pX>%ALL8G%e{;*=a=yn zd?jDSSMxP|Enmmi^9^7fzMOC3SMbgJO1_1Ej&J3k=iB%f_;!93-@(7gck(asUHr>@ zH~$LX!>{Ih`8E7n{#AY*{~EuZ-@tF=`}j@#W_}C5mEXp1=XdZs`Ca^O{&l{ee}f<3 
z-{kl3Z}EHixA{T-9eyAGF2A3DkH>j}C;1^h#2?@f@`w1t{QKI^w72;q{0IC|{uqCp zKf#~mhxrfrQ~YWE41bnC$DijfXur^YsU6jRt-Yze#ec+K<Ui&w@t^RQ`78WY{y+RR z{!{)sKf>SOKjUxmpYvbvU-F~;SNtvhYyLL>4S$FKmj90bp1;fgz~AG4<bMK7)}Q$Y z{4e}N{#X7t{&)Tn{|Emk{}=z5f5MOPPx*0vf~WXNUDM%7>bedV0A1>)Zqc)JtDddf z^c?hX%hesaQ+MfkdcIzu7wScNvF_GO^iqA4?$MFU*2{IDUZMN-N<E-g=|O$8UagPO zYxJ>tNUznydYvB8>-DHUPH)f~^_V_hZ_=Cf7JY)=s!!D0^mcs``c+NVJM}4gmp)aW zrcc*r=;!F&`nmc{{XD%#KOcS0E<j(M3-vzzBE4Uqt<TXf*5~T;^!fTF`T~8SzDU1R zAJ7-;gZdJEslH5Ku3x6F&{yiK^wrwE`Wp2ATBonqH=sAt<@zT53VpMFrM^Y~oW526 zyuMBUg1%k9O1n$np?^`|seeh|rGHu9t$#(|qhGD>)vwX7)xWA=r+-bqUcW)RNxxCw zr{ARArr)gJqTj0Drr)mLq2H<BrQfZ8UEi;NLqDK@Q@=<5mVU4PZT+DB9sNH2yZZh5 z_w=})(3AQh)M!7TKd3*XKdgUWe?<R*{;2+#{<!{x{-l0b{~-XwPwUU<&+5<V&+9Mf zKhj^+f2_Zx|3rUTe?@;){~!G|{ipit`VsvN{b%}{`p@-W=)csD>c7(8(toYLt^Y=U zNB^z<JN@_iyZRsW_w+yNf70LA|EzzY|3&{$|EvBt{qOom`akr4>i^O|)<4mY>7VMy z^%HtZKPfc9P?WDDc4P?jO%ce@3#-Tm4SEjJqPfB$oWg}zNWLf#g`!9l3%4i{rDBxu z2ymi{a^Vvd!Y?XCKvapK7%i#=Qk!C|2#Hz|7Ih*bfE*IzM1yD)F)>~=iDuCvCWuxs zQM8G6F-dfY$)ZzC5nW=cm?ox+8R8t#EzT7)#d)GfoG*IC1!9)CQ1pq5M8B9V=7@{M zTrp3~7ng_yVxd?hE)@e}u^1Ff#8R<LEEkuF6=J1WC02_yVy##w){6~dqqtmb5?6@L z;!3ead`@f?pBLN27sPgPmDnM^h?=D@iCyB$Vz>B;*dwkMd&M>4TJcqJo%ou#UfdvV z6#K+Y;%0G+xK-RHZWnimJH=h%Zt-=oUwlIx5Z@H{h;NB|#ka*l@f~rW_^!BLd{4wh zLL|i@F(e)k4~mDx!{Ynm5%B}@sCY~~E}jriio@cE;wka8ct$)co)gcD7sQXmi{i)P zCGivSvUo+jD*gw+(4UIe#S!s__?dW9{9OD({8AhhzY=eWUyHZJZ^S#|x8ir=_u^gg z2l1Zxqxh3}U;J5oApRmg6n_<e6Mq*UiGPTHihqfZ#V6vJ_*5JhCqzn|G&F-5+|UgH z_6%v5hQ-J-tVXtBGja^Ok!v^%r{Oa4jC`ZOC^U+UV#95e7^TK2psC9YuTgIJj0(eV zR2l)J$_N^xjcQ|zQDclXLPo6-HtLLsQEx<zaYlpDh|1{kMw8KOv=|eNR%4>kX0#iV zj1FV6(P>OEx{RsDG-J9k!#Kz2HqJF>8t0+*<9wsnxBzj;3ynVGBBS4!ZOkz)Hs%`h zjQPeT$nz~U78#cs1IA)w&{zT--ZEpkahb8gSZS;>RvT-KwZ=MQy|KaAXk2b=GOjQ- z8&?`zjL#Wcjn5m~j4v45jjN0u#uts9#+Qs;#+QxV##f9z#?{7N;~L{y<EzGX#@CGN zjT?*`jeW*V#?8hp#;wL}#_h%(#+}Ap#@)u(js3<qj046ujeCr58TT6BHVzu!G43<I zYus;q&xjidBWWBmhKvV{2aSh}hmG$Wj~G8N9yJ~_9ygvao-_^{KQx{)o;IE_o(1CR 
zdE*7+N5+fBkByg%pBOJ2uNbcy|6{yn{M2~eIAXkE{LFaM__^^5<Cn%!<5$L8#;=XH zjo%pW7{4`sXZ+rH*Z70+p7BTHPsaPkpN$WUzZf4He>MJQ{N4D-_=oXN<6p?jePSFl zJ~fUTCybPFQflbH!KIEoo*^ajI2M^DtukBMWRA4UT<MTb>5_ReA4MO9vPc$7w=9vR za+LJQGU=7&(kClGJXt9NvPuT!Xjv`C$Qn6ThGeY_%Q_j6^)f2Q$p+adV{*J~lFhP3 zPLQo~qHL4xa+2(jlVzuzBD>^NIZaNNGvqn4Tb?Ut%JXE8JYV+83*;<$q3n|v$$mLo z&XE_(xpJPIFE5b`<U+YfUMdIVVmT<6$fa_bTrMw@E96SKO0Je`<XX8-u9q9+MtQm1 zB(IR0<&|=a{G8k>KQFh*FUalkD!D^`QSOvqlDp)W<!<>Exkp|t_sVPJweqX-I{7tu zy}Uu*DEG;m<jwLHd8@ol-Y)Nucgnlu-SX>lzx;+gAipW^k>8T{%5Teq@;mZA`CWOx z{GN=<giOjqa!5WPACwQthvoO>Bk~9GQTdpBTs|S6l!xUH<x}!$`HXy4J|~}-FUTLs z7v+!TOY$f3W%-JHRsN5BP5xBAE|17J<j>@r^5^mw@|W_c{FQu5{#w2*e<R<Kzm>m} zznAaIKgjpwALXCq`|{871Nj&EAs7}9$-m0K$-iq4Y7c2o%a73A`T^}3`46zd|4`c{ z|0(|^Ki0k@Kat0@XXU5zxI7_K@}#MmOnY2=!sMoI3ezy9_NZx^7BkDVn%SV)$uaF_ zuIWHkhs(?}^UVUY&@3{GO}ANMmYSnXk6C7V&2rOcR+xUX(hQhYX3!jMR-0qY8gr}} zGHcDSS!YJfdNXQ{GaJlCGp7B^9B($6&1Q=^!E7}rnr&vgImzrWC!3w-6tl~mYECn! zn={OF%x?2sbEbKo*<+q>_G(v~7nrlm3(Y?BBD3F|ZO$<-Hs_l2%=zXe<^pq}xyZcK z955G~gXR+R{~_-^!0RZkzVWShcdlev)|ER3Q*4@zb$3@+5?z+GxM72Fp_n2|vL&pB z3dSY`2t5P{C4>-DLV(ah2qCnX-h1!8mpG0?h%v>|_nULhEvCH5`@Y}zJpbqU-}~&D zGqbbPc6QEh&dly<_1e6{y=7jz*Wq<~U0%0$gxBNsdVOBMcci!6JIY((t@Muee&ikF z{n$I!JI*`a`-yjgccOQacd~bicdB=q_fzk5?`Pf_-kIK6-p{?Wy>q;Cz4N?Zc;|b+ z^e*sz<z49g+FRva<X!At;$7<f#=Fe>t#`S1g?FX*JMSv*YVR8FTJJjV_ulp1AG{m9 zKYBMB7kGd2Zu0)@-R#}s-Rj-u-R}LxTkYLpT<G0t{L1*dcb9j!caL|kcb|8^_keMQ z_n`NX_pot}_lWnX_n7y%_k{PP_muav_l)<f_nh~<vC4bFd(nHzd)a%%d)0f*_?`E< z_l9wn_onxjah~xD?``iL?_KXb?|ttB??dk+?_=*1?^Exu-Wu<3-dgYP-a79e-e=xF zz0bXWdF#C|yba!$-dEn&-Z$PxZ@?S$4d3)F-^L1v>nD7CTnOtTNx#T1h8eUHf2d#T z4}*2qlwa--_ec06{ZamCe~drY-^?H9Z|-m5kN3CqxAM2fH?b!86a8)d?fmWi9sC{r zN&aMiCx41x;is|a5&Dr|>1X^ZzuK?yclLMjclCGkcgJg9d-}EhRKL!z_Z$4!-^-uo zPxoi|GyPfq-u`TVj=zt;uRqt{&!6Yd_ZRpJ{r&v|`~&@i{6+r3{vrOM{$YNjzu0f` zoBbAliNDlu_1pZz{bhc;-{E)qU4FNJgx}-$`h9-Cf26<MKgwU>uk?@hf8-zI|JXm) zKh8hi|A~Ksf1-bqf3kmyf2x0)|5N{T|7ZRg{+a$+{?Glh{d4?t{qy`^_~#q9`M>lp 
z@PFlB=>OVZ<zM7q>|f$v>i@>S%>S)_xqpR!rT;tsD*tN#8vk1VI{)|n_5L6H8~i`| zH~N3_Z}R`_-|XMw-|FAy-|qj#U+v%F-|64w-|gSy-|OG!-|s)*Kj=T?Ka33pkNS`K zkNZ#fPx?>!Py5gK&-%~#&-*X<FZwU}FZ-|fulld~ulsNKZ~AZfZ~O1~@A~ig@B1J4 zANn8pAN!y9pZb6G*Z6<)*ZP0=*ZKeOKlA_TfA0UwU+;h6Z}7kLzw*EKzwtNv1OA{g zu$RhGwsMrK63SD)QYxv6RIwVOO4LwQs)ngDeEh3i4Ob)7NHq#eR%6sywV4{HHdkAy z@oG!8mD*Zuqb8_{YFo9P+Fs#f7iyB4taegURE0`o<5j34RjD$nN>!^GwX@nq?W%TD zyQ@9ao~l+&RduRfHK<tarKYLrYKEGrW~sf^Y&A#iqxMyE)qZN8ny(h9g=&9wfI3he zq!y`z)gkIob(m^Yi&c|qRxN6YTB=%An>t)AQ|+olb*e7at&UJVs#o=?es!c;u8vYG z)Jk=<`jI+D{a78Vj#J00pQsbmiRvVEvN}bbs!mfsRi~?;sWa4>>MZqhb+$T3ovY4M zzfk9^U#bh#uhfO=*J_ozNL{QhQJ1RUsLRxE)#d66b*1{9x=LNGu2I*j>(uYn_397m z2K7gEqxzG&N&Q*ftZq@as@v4<>Mv@wx<lQm?oxNFd(^$^K6Sr(Ks~4)QV**~)T8P# z^|*RMJ*l2jPpfCtv+6nZym~>ss9sVpt5?*k>NWMcdPBXb-coO?chtM;J@vl&Kz*n_ zQXi{N)TipNYK{7vTC4u9)~SD}&(uHF=jvZ-z4}6JP+zLA)Ys}8wNVYIL3~QdOj=3( zJpmW5^J1Mesgg;oixww`BukP*lX&wWS(Z#C%ag;CBa$PNqmrYOW3XPiS#n%*^Cb3r zCAUm&mE1bHO>#nVVshK$cFFCN*wKzxJtyN$&?#6GO(%n7n2eH@cvqk*S)Hs&?ws5u zxodK_<nGBml6xj=lT(v*$@*kNGEVN5oR*xPoROTFoR!==IXgMW>+fu<sHm+Fx4yV< zdDl+8{oO4+ZCyPjeXUq?$$2@>)Jm_rt-YzIYk6^3XTizZjz$D_dX4NRY8Usk9NCg+ zB*&}mT8h}qyhe75>YCbmn)*AIw6`2p)RgCwb<JIUjZILy_9dHgY@)uY5lJVS*{Mf9 zjeTB&WY{7ZHgJY5<Rlw%S0`I?tk)pLXkj<ez$vznQ?ysUAVo{_d~&Z`0?DO0ws^0m zu8xjIE^P7A0#DJj{Ix}`dES|}xUt7+1xZZrYin<ANwksUO_!RqNlm76P1?BWrc0UI z*tMq5u-XnUnvqYp=<qyWJhPx6#mhGFEXB?2Y{%+XTT^0IV^e=$OQM~e;#r09+6z31 zS=>PF<T$fXd#4>FF`KXN<m+b_TvFUw;7QEp>pQuDI~%*Zdi#31x?5Z9hR&sSOXpH= zj#R!&DnExS-$l-lIj#MjOB;LoJK7uj`-XHC`Vw=w3_V<ixdqu4_Y`;%bGZyX?Ci%^ z^^#MxU%u&zdh@(Luc@WEt-Za`?-M35k1y{d$C;;ZWuLy4^Z8cx^R1jOwdt4I%;(zl zljF|s!9u3nuj!Eag>@OyU+DAZ%PsF`w`gI$WJSvh_yY>~6?r~!AlGpvImrWaeKxr= z$GYuZolAS^kxp0h<Dafhi$4%I6gLvLQrwKVRpM5QTO)3*xKqWg6SrR6261C{Yb5^~ z$-hSOt&x0dB;Ojzw?^`<k$h_;-x|rcM)IwZd}}1%8p*dNcIUKq^<e6&DVL#E>Rc;% z*Gk^Cl6S4-T`PIlO5U}Scdg`ID|y#S-nEi<ZG$_X^6by&IaTUARq~!Hc~6zRG1The zIXzYKo+^1ymAt1)-cu#-sgn0p$$P5gJyq&GRoc5w+PhBjuao@iB>y_ezfSV6ll<!> 
z|2oOPPV%pl{OctDI?2CI@~@Np>m>hr$-iFmub2GmCI5QKzh3gMm;CD`|9Z*4Uh=P( z{OcwE`Z}+rufM0WrJ38WUh=P({2L_y2Fbrc@^6s*8zldR%EVs$)GsB6Q)`gi8zlDz z$-O~xZ;;wINbMUW_Xf#5mfT~>J(k>K$vu|bW2t>CwU4FtvE(1uBy06Z!Vh_4j`eCA zD9CGUA-8BwZ+l~JEAw4>o@#?F3*vHVf(kB8P{E}MD!4R31(znM;L-#YT$-SQOA}OZ zYXlYC8bJlOMo_`65maz%1QpyGK?S!)ke2+@l7CwAPfPx3$v-XmrzQV%Rq=iWgM4vs zp=Vy9r?0@nbxli|(o&|hlqoG`3ZzVdlqrxh1yZI!$`nYM0x44<Z4yYE1kxsfv`HXs z5=j1m<R3`>f#e@b{-NX_O8%kbA4>kA<R4~yjO*<!y}gHX1w$!7C<O?m0HG8hlmdiO zfKUn$N&zA%KqLi-qyUi=Ad&(^Qh-PrAd&`%B>zbAk0k#{@{c6{Nb-*)|0wnv3m)8{ zQu42q{3}DRX_NDkf2HJKDfw4Q{*{t{rQ}~J`BzH*m6Cs@<X<WE&q)0<l7A*hE^P$Z zYj4B`CT`-4G(o1)ZEfx9S|*`VfQ%F%BL&Dv0Wwm6j1(Xv1;{jb9fcLEk^)po0q|u~ zEeV1uxdBy@dzCaml{7$=<X=_o^=y(}ZBdUd<tAZLhAJsTl{7+CtmgKjuE2cVCCs&T z9*MnMGMon0QkZHfOtloIS_)Gwg{hXpR7+!2OJh_^V^m9HR7*jsr6AQ(kZQRb0uKU# z2SJVGFEAmfiBe13yZZVX+k5-Ey7jE1r#D+xfee8_hCm=gAdn#t$Pfr*2m~?&0vQ5< z3_*>QMP^Dtt=t%a4S~RhKwv{4uptoG5D07t1U3W$8v=n1fxw19U_&6VA*hwMsEw1Y zxrtZ_^|+ijd3c`f$+4x<0dfK51w9}aIC9e)oC{Ab&$BCXEO$bI9f81(VCq!WxTLME zvZ^MiAubcugmUi%dISPJ0)ZZZK#xG6M<CE65a<yI^auoM1Ohbzff|88jX<DAAW$O^ zs1XR%2n1>b0yP4G8i7EKK%hn-P$Lki5eU=>1UdxuaVkFo<k}<Io@cQWNT$unqjKzK z-_tR3C%H8n<f&<pr=~$#PGCkLFe4C{5eUo(1ZD&RGXjAbfxwJFU`8M?BM_Jo2+Rls zW&{E=0)ZKUz>Gj(Mi9$W8_QE0%TpW6Qya@u8%s}(W3{xW@yHfP-NlMPhvrK1vlBW& zvn9<Xd7`9QYLBp@Qqg3oAcR{atf(HM$FRo5U7Wha=W>MtI6?s&p#Y9h07oc*BNV_9 z3g8F@aD)OlLIE710FF=qN0^pUq<J(8)7&m$TJleGyMzKRLID?{fQwMTMJV7R6mSs+ zK{15);uh?3%~x3vMkojaU-96|3c?5lVT6J(LO~dzAdFBDMkoj)6oe59!UzRngn}?a zK^S;Xf%6fB5emWxL&;wr>renjC;%fAfDsD72nArkZlt8gog);05emQv1z>~%FhT(s zp#Y3f07fVPBNTuU3cv^jV1xoNLID_|0E|!oMkoLytYo<w3a|(TScCu;TE2#rEXu-4 zekel$7NG!(P=G}!z#<f25el#f1z3avEJ6Vmp#Y0efJG?4BFwNT3p0|xpo%af`Ddiv zGt&74SA>EqLctZG;EGUiMJTu;1g@ZZWmtTLf*V4?4PljZ-YV(5Rs2wgf(gQ^2Dh)P zv#WPVb6ZPKOK%$lEPTn@_U_h3VpU^jS6@qeOIu@cLpQcUVtcfX;WzZj`RUN-QBd)m zjyA0zWY7G9$jLb!ElWAlur@^BBrlipa;CQQHM)Cg4CYBb&VdNBk&ipC6`Z3pa%VPn zcQ-=0)v>s_(b~J;n%!^h-v$*E<!tTKX3uTya`$Un+R<pwYwY)=6gGAS+jThh>1`|4 
z>10bVR3c>k$;Mp8i(3k6;G+QC9J>6(mG|@c?4=aw()G?_M0ZJ?xXmr?eT|-^<*dXB zTc7NsD(S@BWmKhhu2N^ebyS<z#r3fv&=bA7GHI6_yAO4eeC=-3rwIp&;dbc;F4oCq zi!=0F#VeLH`R>R`@+9eEV!1EXS0t!-X@5Rfk)cp1A~md!9ORu&&~@2!(tl;<9?Hx; zl$m=dGxrcPcj~Y8GH!_Y4Q1vY%FI2KnR_TgE)*dbijWIM$b};0LJ@MI2)R&%TPVUU z6rmQ1FbhSPg)u)jQAGs-n{-7*5N5f`bdC#hT>fm73#-g=nH*P@<EnEUgD2#X$>k>F zL{w3qjg`sfm5y@hrgN#KbE&0ssRg;zf?SD%Y$_E&uAD(Gy&zZ4AeVBGOIbi3%88G8 zYT&22T*_fC<uKP4VJ?p_mq(b(Bh2Ly=JE)0rH^uXM7cboTpm#_k0_T%zFneR9#Jlj zD3?c+%OjI3RVH^uCU-?9cSR<5MOCg;Rk^a@!^YAYRk^ZM<x;N7rCgOuxhj`(RW9Y~ zT*}qC`c~)isLth4oy(&-mq&FjkLp|=)ww*Xb9vO{@~Fw>QIpG~CYMJ|E{~dA9yPf< zWJ(oP=2EN5r<TnNeM4FWUk%A|*}T#f*}T#f*}T#KfZUGhifmr#ifnz-71{cvMdK7z zW}l68I+vGB4I`1aQ6`snu1};zyBvwkjYQ@~B6A~=xsk}+NMvp#GB*;L8;Q(~MCL{! zb0d+tk<8_zT<fHBy)m6@*L1F5re(^CkEhaD7|EPI64@MyY>s5U9#zR0CDZXprqhv3 zrz4q8N7Zq*f9CI)Oot<xQe)o6d1l*2H1AQiOxb&$yN*F{B-7|frqPj1qa&F{M>36$ zWEvgGG&+)LbR^U0NT$(|Ors;2Mn^J@j$|4g$uv5WX>=sh=t!o~kxZi_nMOx4jgD%i zt)?;#j6?$;i3UCr4SXaT_((MHk!au}(ZENdfsbS=9LZET5)FJL8u&;w@R3Y|Bbf$A zG7XMo8XU<qIFe~_B-7wXH13g1gCm&+M=}kLWEvdFG&qv!ZY0y)NT$1y$mU3<yOB(H zBbn|-GTn{pvrkvGAnr&cbR-fw5(yoNgpNc)M<Sskk<gJy=ty+yk?7VV(XB@^?TuvG z8_Bdcl4)-w)80s?y^&0NBboL_GVP6I+8fEVHxfh;2_lFD5k!IrB0&U^Ac9B`K_t`O zNRUAk^9WX1g-`6zb5mJWEhCubc%;!Bk7YU@k7b(Uu}pJ3mT8V#R&zX->1+83ubf&d zIaJB$s?X(Oi7c$F%7|pcS&@aB%jRD#BdESMTat{-S1RjxBt;sTY}qm*^>kSNY&O@7 zNIiWvTmOtmJ%q_vrnzkXu$(Kk$cX&YXS4OkS66apb9spT)A4e7i2OsC$Un{H@(}r_ z!*cZy`KQBjd5HYeVcB-ii2OsC$UoeRboPoW!B9GkC-{}pV=B4Fr0dg_$;O=Srd0IQ zxq#&2JlmXS^BW{eWl1W3ZfTA!ZDYl3E+Dxq&vxY5{ML+8(SzsC>2W(<Ul}F4b3w_z zJlmgROGVL~3rHTBV<iy+-$;xmk8tB;Q!Yw!NuFJqW9h;MUQ0?hR1oTiOS4<JBvs(j z>>4i3ZsAh@hfA{qxYY7+Nvgo5*(qF_ox-Kr0bDL%1FLr6r%^rKz^Wa%l0U0<;FtVa zwFAH8&#E2xC4av2@Js%z+JRs4XVnh;l0U0<;FtVawUcgO)ec;#Kd&3XFZE}w4*XJo z*6P47^=GXP{8E3`>cB7cXRQwWe1jTTtCMbEtqxqNKWla1m-@3-2Y#tPYjxn4`m<IC zeyKleb>Nr!vsMRwsXs3p!7ue^txmduwK{O6{;bu3U+T|V9r&gGtkr>E>d#so_@(}= z)q!8?&srV$rT$np(!IZdwL5U7{;b`BU+T}=9r&gGtlfcM>d&Jk{8E3`?!YhgXYCIB 
zQh#|s8d$rNZeX+tSMq1=4*VgPkC7()l0PF&_$7Zvn(#~hj5Ohw{26J&FZnalgkS2< zNHg8Qt4468{dv_0erbQ!{=hHo&j=HKY5yw8pS42>m;4!l!Y}zV0);<LEF_(8VmUht zN$;CjPL5ibRXV8US&rL-7mCsiyif#J^5=yj_~ridLJ|CO{~6iBFZZ7piqfzi#p&`w z5&TkrMz!!u{TbE5FZE|s3%}&gi$m~B{)}qjm;7t$lY8O&1lUQnxV=}hlv}{SHr>F$ z7OoV5fi3(}1O~S7OA#2@!Y@T&U<<z#fq^ajQUnIJ@JkaguuV5Gu!SpKkby1y(ghjV z!Y^Hrfi3*f1sT}FFI|9vE&S327}&xuU4Vft{8E1gw&?~2ws58X3~b?-`t$k@{8E1g zy6{W=8R)_<^=F_9zto@Cd*GM)^LkIZfdMdFsXqf?_@({~fZ><=GXREP>dy))_@({~ zf8m$<GyH{L>d){OeyKmh-*f}RU$|0#hQIJj{TcqkFZE~m3%}H#;V=ABe}=#COZ^%C z!Y}n__?vEE_zO3kIDntU736R)6)4ld;1{tZcLu-kOYW=|gI{uI^%(q;JFCgym)u!R zmW~Ag!~y_f0RXW8fLH)PEC3)D01yiRhy?(|0svwG0I>joSO7pQ03a3s5DNf^1pveX z0Ac|Eu>gQr06;7NAQpiii@=XX;Kw5HV-fhV2>e(Cd@KS!76BiNfR9DM$0Fck5%94H z_*evdECN0j0UwKik43=8BH&{Y@UaN^SOk150zMW2AB%vGMZm`*;A0W+u?YBB1bi$4 zJ{AEVi-3<sz{eusV-fJN2>4h8d@KS!76BiNfR9DM$0Fck5$Lf9^jHLXECM|ifgX!M zk41pTBEVx2;IRnsSOj=10z4K09*Y2vMR3O=uwxO_u?Xl`1avF{Iu-#Pi-3+rK*u7W zV-e7?2<TV@bSwfo76BcLfR05#$0DF(5zw&+=vV}FECM<f0Ue8gjzvJnBA{at(6I>U zSOjz|0y-7}9gBdDML@?Qpkv<HnT~m5CtMzlVpbKw&!bVyJ2&9x@gNp~9E(7XMIgr_ zkYf?Zu?XZ?1ad3_ITnE&i$IP=Ajcw*V-d))2;^7<ax4Nl7J(d#K#oNq$0Cqp5y-I! 
z<e1e=>6q0_aCtP2MH<H<jbo9<u}I@sq;V|LI2LIfi!_c!8pk4yW0A(ONaI+faV*j} z7HJ%dG>%0Y$0Chmk;bt|<5;9|EK)cYDIALwjztQ`B86j-!m&u<Sfp?)QaBbV9E%i= zMGD6vg=3Mzu}I-qq;D+JHx}s|i}Z~}`o<!CW0AhGNZ(kbZ!FR`7U>&{^o>RO#v*-V zk+!i&+gPM+EYda>X&Z~QjYZ1FB4uNdvav|nSfp$$QZ^PT8;g{UMasq^Wn+=Du}Ili zq--owHWn!xi<FH;%ElsPW0A5kE0xkQs|euAb0qRK7I_+rJdH)3#v)H+k*Bf9(^%wb z95*CdI*;gY#EY68+VSxDneN7>7O$m)+@bVB242n(f60=r{+_&dNN?Lwd0)|zwx#`8 zwABfecK3HeV}R{zU7VuoYi$#*wXv;-B9*Lc!CM_|&3&!pRjn(qPrnl{gbZP>Q}4-d zT$bc?o3W=9`!RajR$@)Ju^n6CHO9clpfW?~6JFSm+0z1mJmiRetPp2+WfhSR!3Zen z>uGE3T-x5UxWB!<rLP1ph+roKUQuXi#3#T?ka2-mN`ny2CWuQ=RmARV?dtCZ6i;;E zyGnSap%*((I-85Sd)hi$kU@8Q%MjwTCE&|hGGdo2;Xb%i6+RG#RaO+OsI3zP?&(^L z!bxnxIoJyu3_GH~rB}D&p;=~VmN}FY7|KrWJg2~t4=E6igc}dht({Y#B-?WA5Pee% zd~HYOl}N?7!?{x>z1Zs3nfJD|Ep5%Ke1>%5TWh(KL%Q4h^E;GEbXyg8ThQ6^Tbatb z@qsnmg(dmKQWVs)Ne0P$)?Ami&L#BP5WN?a^XM0fvL1P%DC^~#=6$-RSs&Fj>yet~ zyj;_)mus5yQBAWxscFv3HO+bpYMKoytZ6n3dcPc-FJUrYFYNfrM&R4cf(@IZX_^6- z1S?$WZY<c~r)dUU0@`q;d$3@`X6nFcZ|iAHbhq?kTL7nBTkSOC-3_Oqzo$z=Vh0h} zg9SGV#B!UsK~ld?)rM(6a|`KUGwFbV88%;Ml1m!#l2>PIOB+8}X~x(%MKc_@-1pNJ zu|IuDPov&KnoWgaL6A;YCOZ2&(49F)M$-tc_2<eJazc`17#*M(93EtR+5&=Yrb<6u z9r_D=*elV}-O3T<X-&)1nr0juq#4Hs=_vLwcwk8DX;$cO#PsT*C2dRkT6GXHy+t&B zGN06s8M8Vj9<07r3~fg;*Q*dl-K@2kE@HkHgX)nI-djx1pPbO0o*Dt$KYIK*_!=Qz zXzKLmE$<>GIZwYzN34pGv2aL?<L%dW6vx|7Khw-M=*NvC&c?IM5$DO-qK5Wd%%XYA zTiPWhZZHO-L7IUmT)BxEYA8+#t-|?G{Wu@pCc@>4hL+)YR9$AN(j2ifS2PU96jX%$ zK-o$Z^?kc&j8TI$V^p|OG+xFG<Ygqd(tjDV!cQ}JxKR9~jLY@-<Pq23og|lrCqCKS zwY-zL#pn&p^>;VZ-t9nESmE;1n3ksz*6q0U<Y`RH)0nQ!6iY_Uor`;0I1*#MAkA0| zu5@n3UO}3%7hHa%(~O<q&&HE_*TtMT1dpx~R|TH_?je+^mX*3A<YMW%<vhB2TpnmN zX5-74Xc=#I3%mPeT@F`}&w6ikV>?{|QC-w>6t;Bf(IFQW`7IsYeJe2BNTiIfM~*f` z@^D`};3j#?mu4kSt>4?<)Y^wVVG=-kf<RvS2;?OXxO@-sz;ZmEha(&fjB@m}^)AD+ z&lSR*mPmY8(8A)cEAGZCgkAWUXjf;p-09dQD8W&9evVHq(|iYx<SPSC!1Hs2XOit* zOWT^TvA?rf3ES0EOaK5tLg(P3)$m%Y#x?=Z&*Abt40wKy<TGAxTSr^FV4wiYa{wVr zd-`)t#4})ITvG%Q)OI8Ww@%<XN|k4!C=3gMfM=8Na}No4_80^_dxV>&2Ed){M{n#a 
z(XF99nzZyO+#No_saIDf_1;?yra%B3RFk3YxVW4B?R`K<D<oJNoM%aa>}iJ^4MEcQ zKr3Dx)gK(?*gRJX0-n3Tt?~fQ^0$vMGn$CzVnHZ!DP)}x{CrQaAD8eJI$1k*@0aMN z*E~9QUsnfZCA(h(d{SQg+#_VyE5f;F$gag8<Y{jZ%JUnFG!1#0h;VM$kf(`3$kRl) zJmiGZ03lBk5zgH;<Y^-OEX_lnCc@7R7xFX_{@P*LYP1}M$4h_9luOPNp&*hDDf?3b z>`$Tdk#q%^7ZX3{EPGQBk8_s2De!a7vNr{O?n1IR1%7@IWp4`nTyxo*63E^ZxRO6B zB;c3)Ss?*G;~3eK0zco0h!qm>^Bsv;Apt+%5!tf>KR+<CX9a%l6p`GiNCx{z2Kz_` z`-l}3fGd(eD=6TX{H62BzKlTjWx$p4@WdH@DUS@+vM&SS(jGi<4rE^jTxk!UIKwaP z!4qfrr93==hF{9V6KMD)f1W_YFZqiQh(riPJb}h}X%C)22eN+yE)QU`e*=E$53Ih1 zU-|=2sNt9Xz!Pftr9be58h+^yJh6sf>dzDFKz47ymHP9<8h)uiPpsjW`t!sZeyKlC ztl^jX^TZl{sXtGw;g|j(gM{qf2xRvLT*;p&+VD&MJkf?<^5+RR{E|OUxZ#)ldBP39 z<j)gs_$7Z<V8b7)C3xK%^VP1S64M~K@Brx9>C*mgdJrpFV-r+LhpWVBN9P4nREqSk z6zN|n9S9pQ_<HFem31|0aT|SGpuI)TG$f{_t1DTB8&paUsALri{M-X7W#p)o9#APg zpi+83Wqnl=QlOKk(v)(Bwd_HLwdp}daCt@qX@-}tgNzKT8P+lf89~(<)(*qZjhA8V zF#Ozj8P*QNA19XJBMefZOq6yCJbdFatQm&sZ@zIEUiyWfZ(K$uBzSX(%P#ZfjLera ztSLr($(xsc;pYb@!<u85^x=+^Va+l8l0R#HgNzKU*!;rz$bgDXE=3*vy&4H<a8pvi z1Aj?Wes(f~sWUQ`W<)$^WUiMHG?@`8k`XDA5h;=pDUuN>k`XDAfikfJxUH?F*>7$; zva1Il9^i~wn+UPb4?i}jIKX3k5?f~|QWdL;gDM%Bt7K%xb|X2@BSDpn%vA!zt7K%Z z5*S`3BXgCEd{y#<SIHAzCk<IA601%mR-H6#9jnlTI%(iKY2Z33XPro_I#!`09+#_5 zKu?`WtU8fAbs~A{q<(c$KiOLn$lemT+&Hqg1b!X~WiJW*oR94F2pSk&*1`sqZXdYR zj<r?1)Kgo<b5yw0p=x2t-k#Tgl#YvSE))iro=v#a$>35M;8N#;OKHQUwBb^j;L<&T zOJ#yP)o<$RUZLTCaW{Ryz8RCpw$7nV`eXFK?07M#1B$y+RvGEHxtFx{VUAox0n8^c z0o_ShOzni_$*XvZNV6oD>WOSNzSb&nFwbeT9-8K5FV+*iEU$GB!Y2=a4vH`4)=Ag7 zjXMF%iLYU(*3*fSPM4RJMP2R9y?v-MDT{I*UGS_<o!DD~s|zUHq(%lto#OD2sAl zu0+<$%A%}~%NW!Ssm<<8&?O%-XA|H2LZ7=($FP?iy4YQE=#oQQi5^aL8Bu+%702N? 
zmf`5Lo8UGftdICU4=>3q?LD-uh+U#w4qjeESC;;T<u!>jq#*90Z4%m>Xl$W|=98`V zzK8bP^C+X0ZmT|1vNX3%L3@W5czb&*!jk+h21<uIH#C!t9eOH(DUl9!u<5PI!S1#u z2fM(U9BgiDa-djja<CJu$-ySJCI_#kHaXB>HaU0|waLNewI&Dp#3l!O*P0yY37Z_K zI-4BqO>1(F&;$z3CI=eGCI?E!CI{QonjGvyYjW`3X_EsTXOn~1PMaKTF>B(+$xIa$ zFhW!#rU+Us3Xb-efGcUtzhX!y=oLe7#krSW;PL4@31D2bU2hY<J;dL60k8DDgA?Ts zCm^h(Ws^jU5t-kE(($!QjpMiF3emP{{I-Jl2`H%aWteT_+drD?$|ca1OQ36$1XNcx za&cEdQf?PtPakv8DfQ=4>d&Rrze!5P{RP*n71>xmp3EM7gNj!aoOj#weJm~PE|83U zLpBLWpu6e3_5Eb7nLf;<!}zOD;F9veCmq<7%~Unf=QMN}f6_@ulut70K>i$)=9bc{ zpqk?^6%m)KHGh3bhw=A?z+wGF_sG&sz7(Wm@K=I}%Vo~r074i({gdwoz1j0c*>C?K zhNr*v1J|dI_kbzg<cmEBQ2ezWa6|H6*`e@kdWyfIq{H|tLg0oLew{-53%_Fl|ByCJ zs`(pJ@D=BuOcY5!n7AuR+<@!{yQmxJ*GP3xFCGc3h~&Jfe22(|mE>N;r7MQ!-^L}s zhi4oeq)%b2V~!t2o)=&Prbs@X7r-x)k5L2sBKde;0Dq0UKo2VJOrmp%&d^lSrTGZr z5v;?;&I)&V{w&W15GuLyYykdBcLk+XTM$=dY&_Ln*3;6~OX-VJ92-ShQsZ3{31>-- z_e^{esrdAPr{$@3wM>GG6*K)YG;ltNQDU?Z(s8r^)p2-5MsqB6bR?3SU|4$h;q_$^ zeM5B)g?^n=fqxk1SP)W5Id2l6OHdFnRF|Z{Ka5LM7{Vnh2q~o!76fpc@Me{UG;dac z%V;GXRybIpTY?SG*)<Pr56-d8d3H&jU7BZG^K4t5Jv`4Y%d_oywj<AW=GpE%dqke? 
z$+NwAwlB~2=h-9k?D9OjBG0bOvD_@YCj$7fhTn6i`_3a9ZgRnU?sS5mecd5=B-8z% z@Kawp(lA<XJrvRxZ`8#)q?{sWzq-+c#~FdelEMmn$SjxKik{-!xUG}g<OO+s)sX!7 z0bhdFM4OU%AX!|n*w-|qaG9^EWHAQKWi9xyj9v+B8Y*5{1#Ciju(B*Chma?GYKW|* zLQ}0T%&w4X%{Ba`OZ)X)yF-7nhEg7u3&<uwQF`%lBhrcE8j3;z*(ga~l-70BHwJrf z8cO8-yG&-MHY`a4d@aGzhNGRJSsT%IflqyQ@1Zp8#8`h%^kbiUSj|oiJ5$-IV@IMi zkT0rWC(Ta4PRLHgPKKQ-c52z-G$T$k;xr>kGo~n&6=XBUhGiHFwqVS`L2Y{tVI_uP zwmLh2vtZN3*5_a;rVPFmY@L{}c(M|q+UAK#)=n(rcxzX~gl&o43>)?(ZUp})>(AhC zw(bFcuk9K(>>4Bt6SfUD13%8*3jEe~1^BdcA}q(9<eX{P_`>^H;D7F%1O8m+T=3^P z=Yju)^9%6jJHG^fLE;U=#MhSJG%S2^`LE#DB-Vic8@{e=;@isq1pm1=$uPai-ekkl z-!%qb;Z=Z7d(*&A_vV9Nph6f`kJMDd!dFWhz{hGD_~~jo_!(+%@Uzu9;PK5+@V`*E zfWK9(Hf((H^A7NLsyh*WS8|A9;tQK23>#n790`6@auoQ{$<g4)B*%aso7@rnq-0=N z_=4tMXp=)={i+o<WKJ~BF)oCqnrmP^=5E-gc^Y;q-iGa%^{^IGY>tQ3mkL;NiD8*# z5iG47Wu6S1D63%i<T}_kSq=LokHbdE>*o8gJn|JRiX>q*WDG2W>;UT@d%}{(0^G9% ztp1oLYy{fPIo~z#?UVIc<?x}cVJXOTFJyP{8+Kv!CY!<<+yTVYvCQ|_wRaJ>L&Be9 z7b(hB)>x$diG&uj>ztiE>vZ9c?SsDL!zxiJ(!UP=V-3@vE9v2m>inD&a#!eke3hJi zDI2QGW9y!YxK?X6RM*St$oaN}Z*k5yZj*C5Z*5&sUq4ISL&TLjT5{grgTw8qa=!Ol z{%m|p&QHkY=_8jJ5{lC3++5rV?b;7*8g9vXt4hLW=Tg!2Sd#NifNvj(vxT^?;cO{A znFeg&{K>Fk&E^@yfu)b#4VRvr1U)kzJuSYs&^y2=!WX&^F-r8;xs5XYRc@o4o{bUo zR7}v{@in%i9=`+i;~kNQNw%C!xCTBx`@X%vNYux(<{1;4R`j$Ri}4lEWyW=__)Mzt zA--qTX$<H!8}n(@7e9<Pqv4oH$3}>Ye(J+QRnaE4sfK^YG8Ovresiaehh?e%X$tx} z6a6h)ictln_|M9az0yO;Ju2}ivYoY^k*GbeVZ3oz-Q0Pg^$qi3IdShfv&I`2%$>70 z=)Acz5&pj_`FGnaxk;Oqe~&dUUE?tYwamBmkbl$K*_s;0=8b^P#?4JZZ<DdQsj;`k zxVNdPquY46xxH<v@#K=W&PL;T?7?U<UM2c&s}BDdq|sPQbbUK^Eg0XlcQv(}7E!+w zX`0xL)!$*3cWXKZOU8ZXc!=L6=EUBPrfzc*(FjU}w7Dy2U{2Kv1#=o`WX=YyH0Ob4 z%mejA#5}A|r?W)E4zr!=41)VzM06a{5k!X)O@PAwD*B#|<q?JaLs&8Fr0FoG!-&^b z*SNpKMk?{8M9YW{Cz=H1KCNBdv-LG2iE>}#d&qZWEQN1I6joRfGexu<w%20VNm~G0 zXiH#y>_|M%r^1HW1sIX9g3Yc%Y4I^Emu*0?d{`G7WwOkuHur?3FFm#*O;{Y$&j zi~Cb?*($8*LCWd#iY>#MYBM~IC_TOjeh57$u!GGplAp6%^dY;&oqWDzyts4O9eTUC zFLS)11MHSoi`yyg!|V>j7^q7ytb}qlhehm9X^DgI)K=nFh|4y0%@pK0o!6tbj;F^$ 
z?XDMB@-4^MtHaC3V7&a)SPeU5KY=X$DXfg056fa#!fM#funhJftbe@-3tu0aYhk}@ zgE;`1nt)U-fm|H{$+|gY>qJP`3Ru?K8AxX;tY^)DoSh3RSqH*0)?#Z3>|k}m_SKQF zclBe~xH=hjt<HoktMg&M>SEZex)OG(u7_=^n_-LUPS~G%5O$`Xgl(x8VNdE!*pT`V zcB9t9R@8dfhZ=xQs08dlmBZfCX7+e{8`yQ41UpU{*lyY#_L^eYXqpYXObcL(=@8go zS^}F(ov^ZWBrGfa*!~HuDE$<clg@#yqzhpk=`vVEx&~H|ZiMBd)v$DQKdc))4vR+5 z!-~=Cuw3*$tQD<+g`y3xLgc{WP&up%Z3c@%+rWy@Bv=lLU@d4@SO}_zRiIh01T+uU ze-^>gPdhC790lt=C&E(C&w<fa!2-`!u(op(EbH72D>{$Da?Z1`mh&ns<e=B6{eXlZ zeUqCTHtbpq^)^EoaDewm!^M~9hbK3~7wWeG#+qmxt$?Z2wnm>C33ofAAN|u-+b6zM zqv7t5_(}m=sT~tvtFdq=CB8wwwbkUrMl}xZPKg1vInsrM!ME5C0ewWZATLN-^&s+x zv`mf%KM%)z91CzP#IZk)18^LOD-Ke#!7T!Ju(}D{Avg}jaTtz99E))@;b_L8(_VsO zDUMbgZ8#3cu?$B$4q&9@mN?LplUw6}<W5e+u?^1Qxl~8s=)nOzq55$2<2Vw>avVqD z0KQQxaU6}~M>vkb@namv;y4b+@i=~h;{+Th;y4M%$v95MaT*Rh^9oj@)z5I8f#XcW z!T0dz;H&s}W`3Xq7QVEv(eVGfGMEa8U!9NRmpCrK@hcoap$dp!{TjzA92emZETTK` zbNKXeF^(H>T!Q0L9KXSF8IIrLI0eV$IIh5PCD6~S=9UU5Nd3-y4X&xKLfGs0mjBfV zyV`t1-H5P1;<yIKwK%TB@p~NC<M;!x8Cusy>)IG4HBN>FGSkLgvULmZi#xb2j_ts2 zVc4+KQtIj6HynLgqu_G%W6KzhUThg7(1$H!q}m>Il-dDww9+_gtin9SNU7nV<>>P| z{z%Xf=<hoIXwXsU>y|MFJ=`)j)BT$Ih`ztLpO6S9ENhy;Pqw-l7CY;~FT$~!5LGdx zeCZF^`Wz2io|R;|b9Y$mtp6X`>O2hAH9IgHJRUY|P9S@oCmW|gTAc<Pou|WA{TZ;^ zc@|{X*_i8`tF4{Dn&&Se$u2a0ZLESl&x?&qAk}_@S<r8d%Z)3HD~;cwMpt7_bggk6 zY<OM|>zp?jf7G^f$XbrJoO7#jyYUy)?+$Gh2Q#O8j0a%p^AT9~d<GUhUxO9DcZ`3+ z3h2L#Z%hpNun(Gq70whaZjOYV&2i><b4zn;a~pGlxh?E!reQ;KSIo2aw3}eLZkbrB zTkamA?f1Y=&h;3R3T<YL!`;o>$e0BC7;GJ5I%bU8!o_pAOJy+T^eWqogw+W2G5B?^ z$P?BjuGap<L7=AnwB}W*=7Fd!%mN#cO4oNRW&C?P8Rz^rtz>-LMn>K)#xKM&#v<6w zI0W`FZYZ#jampq(GXA)Um5e32u9z`n1cmKrg?{(#KDGD1wf_45q9^<dwo<>pC+L3g ztzH0&sjp#Nn5k#^xHA~d>CS+sxr1r<0aux@l&ke31}Nq~xNjP}X(VR>^OX^&ZNU~L z)<H8-;<;XlF_d%<kY03EnWSN$o%GdwjRh>qTp#Sr7D3N)G%%=6sf1D*1`APN7-g^! 
zHDIJ9J=2VVUCZ&j-2^+i1JD}m4*WU>7;*>b8Ft2OX$EZV9tb<Qha27A97x`E#%S## zr_tWN<R7bD_{VxP$v;ZF@Q?DQlYfME;U57Tx7r<!GHG|D)}|!ZX$qf%mMGIXd$S}J zxCj~L&5%^!!ao9SRE!dDhS6#gEJo=PjnyvBjrI1HYvJl^QFra?F)@LboGoYJ(%F2T zz`Im4jeB*?_LFqs!ar87pC_SkDOAtE^}W#eKR<sO14Lq>j=OrWxuq>|q4mGOy}(`3 z*158_xjYm01Mh%kKy5$p8Rs6WC$YimbMCW#>)h`-up{`ex&>BHFDFZakHZe?71m1N zp}S#o@Im0gUgt5fka}g_q73rdh`dbLQZ7O({a2&+x1X+A|K7T?c?qG}39uTQhE3Sr zU<Y<#-V!Rty2(i`WfmDbdfJwlwkigDW8bkhHq0w$J2c+#?^>eqMts{AjW<$TqtR(0 zZJ+9;HH{_kmEv#Hv*Q@!@h`!caWI~h>%HrZ?O~s0GxTE1c++TC`)OM>-!n4)2R4KX zEZk`OHrxEKM#}GwlQ&_U+;o&Iv?G)qBhM0BLVAomN88weHKUVBg13vLF8v?c+ujrN z#Q!^T_<rL-NT4U7iF_6owqJ)f@-xWMuVJ&=HeFb>E;h?xvw93{RgX8ff;5{*I?90T zPVWYr)32JZv)p8vh+Yh>Cq{+MLGkn^r{G>sH)bU@=F!p)VQMcRvRY%7s--Zk6Pgac zX)ZuN*$fC}XT&@N`qI~7S9$|-okZ6aYhACgmDc{^3C2^LcoS(4!(XRbocK(KC;p-1 zaELk=II>mEgbyem_*dsY6Qe7QLa=1MD|+4xSSDY9-oF_6b*gGA?Lt*W?rfDIcW+fm z?#yHbxpUO6NNJ(k1$4IB8FX(Ifi|cb(3zwsz9YF4_&G^kc3^7cz6ZF4I8autoi?`D z`fAu(*CV4I^)@%QQ87a1XdSg#t7d?osdd!Gf@B(WA*G^q)2*9aQ?1XT#JVgQP+eMF ziLxRsTvLSB8iqEBV6}ZZQt3h3%_x`FY3MsuPU-Fmo^sGyYvL?uBrr=t33V(zMxf+s zcepdvZlIVep;mi?I~bgft!ozm$9|tRs|lN6D6uo<X`HOr86Zc2xwb$)7ViJ|+AKq? 
z{AfK|Y}eyzq`MQY{z$E(t0N%j@3VL|9yZT5#c{d}I}vL4M3OIsRrn<E-?l~(te{Os zZ9h_LjUlkbwa&my6(f^I>&OAz_ml?LE&aZ_>pG%#<55ao)*<kto_gj>?Zz{KQHBBv zoK2RI&%s<{CXl_x^f2^h%oLou9C(w_<|sTN`VYMgET7+Rz6gY&k%*5wJRO{t3s{M< zUc#!(>(=YWXxNj6oYt14jj>q07-NisUUw3Ww7a1HY5nL{_&XSnGU#1Z_&Y{5{w~cH z9Ag*!T|6<nB78UeUCbZ$1h4b6IX~l2j7bijqgK#%{2eVlL65@UA!OrVzHlP0)w2Z` z^Mdn1FTvlzjNnr6m*MZwoWQ|+;2Q9HKHy+B@JG;_@ONl7;9xfJ7w~uD?_f4?59odP zJD3+d0Qv&{j`0%y4(0}Lfc_PK$M_rmj`4T=9n2N}WjN+gb13L!b28`@bBf_mt69`) z7PXp1t!BYq_JM{AYtvVN{?5Jz^m_Xi(A(@Mp=XD!P>ep$Cy=kU_-Q!Q$`-Y<Mvq#z zsxgN~eue>U1VT*ISo`!b9yA+ApD~Or5w{pKJeq4D#5PXGEb3wjeN<bVqtMATr{lQV z>fYHD`WyEMtY%S9lQ4@yfNK3O5yQkNtI_TZ;x!`OAJSLbhTH!8EYxXvYL6D1aifwG z>@ixZij}!B$&^?qA7^i!91R=gT7R{rmUu{gdy@X8-}l5AdKOwIvg~loDv#9jO0o}s zA?&|jA~@w?vd;blG-uD*Tj@EPF8QFA114-(X3$ntPvf7au9=oIdQ>W*d!Q}hX|$*B z>wU2P{xHYYR{SXLbUX!o2a?9?=EP0bG<Kymjm5O4aTKj-{8X=L828hi(t38SH`O`v znnqGqD=?aU=bR4FW1ag0*Xnz(<2bq0vg-@m(fY!61?vmj(fY!6w7#(2_vDpLi}On~ zb@z4~6PLC0bQ+V|mn?2J0_+D~W>j}9>sV&&-qVMWdT9e)8=kGW6vDmJBHlr5OE3av z=NNoWj|T?K)kfB5_PfK1%QM5*jw$7K*+%O>km5G)B2~Ck?-%!ZxJhaWv>4Wyw=pI_ z%4}<FhrY6dv7<4`(0zJ}QDLMZPeP2il^B((FdEgM*Y1KbXt)2B?%`hCL5+GIH69n_ z^StpQR$?{&c@qfeUE_TqkB^N{jWt-CT}Oy$y|KaA2vlSeF7gN&l|TbuW|jjVjUt4! zxtYgEJD8Kqov@l4n30(=s|hddVcwa2nz5n_*=pX1CwzNj9_DI?LGRyg^kAg?F-FLr zLVt2Tv?Z58Gjcsfz1yMlco3z17NvX>rTY}6T8|m7g;jtOlwb^Uo`@U+<g~k44;4t6 z`x@w5L_a6Gt)}*Xrm!%s=?g^D6n>AUa#dmcS16o9+%C;~eANm0aC?z+IZ|Fjbea}P z_B!IbG<BxC=b)wo?i&cF5S#L{9@A&7wM2KJ_}6Rd-a|aowB{WuIaAn0M;Rn9&hb4R zXYeyTJ9@0yv?u9al<z;>dv=4?YcEKx8Nh<Gfspn?`_Bi`S!f)9`*0A@40?jO#2R9y zta9rxq-sKf>K>c`ouG5Wy?{m%m{}vYiRiBZ`sVK6Mw5_!D~;nYYC~rS`|q#hsA1oZ zIxMnx%+gKH7Q`%!Q^<3C4sj?oruGJ%GRhR-nc2&j1}#8V|EBFuLz9NyGL&nEc((PD z^#%}ue&Y2iSUHX{Xv;EPvVrv$N<6s2xd~dw0dD|W&&8<k09HvHtdW#rg=90Vk8Ee{ zM5`lvVr^s=Rz~*6x=1rtMY=K1{Sj6~PQ`l2c~}j(6l)>ZVI|~Ntb^Q-@%(A5fxM0t zkdLwc@wxStjY+y)40$^W(qkKY2fG669=l@IV;a^x=3&L-Fsyg9LsNGY);dnaO2^N! 
z&an!s99KaK-(=s86^;k7zVR$pH{QhB#-~`>SZ{yhSdQ<MU~OXzRyHPLT_eD%#_m|t zn1L0I1z68m?6f+aPM@>VInFuR`I&Q$bAfZQbGdVkbAxj;=Ir-64`V&!d8}r<?R@C0 zaXxc4I0LTZCf!nZgu9u$mAjq0lN-4^yL-B^JIkHx?(ZJrHoJ$r-LQE7BUraT6_)JJ zgVp*=VWIvySfjrcmgnz>mHDS(QT}yUkN+5!;y-u4f-U$2?7o-5#{1^5=e`4Mw`X9d zeJX6S&xZZ=0~3eA!g@QbsUHQ)=_kTU`p;nzeHE;qUj<9&H^Hj;-LPQ(IINYwj1~X) zVTF7hY>R&bU4{=!;pMOjKHi(?O@g)WYS{IzhYjz2yalk)y%_emJH0+w+&<1b88)@g z@h*VX?8{*x`v&i3Sirv5dl;6jpN9?Wx4jR&HFy$UNIU_$HPKs%_7a^!w36r=qVH*H zhD7@*<Obr?n%aET0~CHe(JG>k5ZzW&^EKl6T8>F6*cVdxRYVsNJxf#bI(K`}bwuwZ z`VXS7XlgYRKRN+a4XsTBVj3P0xV<%<9p|FYn;#LsW1crpA^u7VPZBNGA=Z6FnIEmg zO%C}0vjEfjglGZpbR@9iV{M?)9-_~hdY)mLQz+)`6n+o!|22K;KZv;p#YrZvLyknv zoAW$}2))YHr7ej0pYSHP;rCs2%@3z$e)XL))KD9839oZ+#kJIP<UT&5@07U-b+K~9 zA=VV{KJYKm-K4%}(p@&k(N#O@7BEW^7ivm(XARZ#fdtkbExOmvA(Yw~l)?kp5MWwm z34Ld|r(I0<YipvtnwmGeA<kY&A=H=9hoLLBcBWdqMJd0cd7JvYwY|;}SLwdTC47wd zkvfjKh30LCu6;>gYd%NiB#IL19>sl_>17mhF;VX4XK3D{bgWY1n;ooCSwxL*C}u)a zdn!?mLv^>t(%FaHOA)_}_&cd2TTt2fDX1BkfwM%71OA-X@&1krj;IR$G;;&7Ci z>o|c@yPZ<Lf#`cgpK|maWZGQ7zeoIMMAs30q2MgVw@P)0wFS{~I=hA_$G?jB!t_7S zhgdb5w?<P)VJ;kJ3e{yw0<C9eslMZd{6`Y!+u?jIwFBbovuUE-8<|pXlzv5Hi1gKL zFGauN5UY}U5%(M#G0Y=$yYbj%-H=!eKBTriE75_g_^wee;-|(rOrLd*rK@;E+Q&Ve z>Z0!yk4OdMl7!qyX%e-M(B-$1I)r<vF<|^p5PCf{LK=f;oHgrvR)O{xytN33>qO%u z;3Qr}{Mo-<JN$PmhlT5go30w(2jupU@tE-hRt{f)cHm{>6`=mtpc`0c`~xe9UqRnF z1bBLDVCl&~?seFUIm?`FE-)9GhnR<%&1Q?a#5~fx#QfC!tNDfbrTMj0EJ$&bH5z+H z##-aBZ)6*5TOgzDtsSjN(7J~}NR`+>vbQzQIt;r;x~(IyMz8{VMlQFm!2XSEoHAgP z!?3%e&$-a~HTG0o1QhZc?5w!lxx%^9`5ksxT<cut{2u!({s3h1N9RW8PuOwsXJC|D zupY4*D-w4(cRTku_hSX(L130goyW08@hlL_T4&I005<uVdy#uFP{<SRv!3Iv^#A6s z_5bd#^Z((0rnXYM=pC?X8SLEOC3fxEiv4p*t&1<Ey}kx!&@ZBA3}S!p2-~-PV;m4$ ziLtpo%uX3w0=bPcw!zGQGh+e}T!k^wInz1Q2x(7mM7wz_X+Lch?VjD0_RTiXj#=GL z3==!7%x!RBC$fpX$|mOZ=B_xV;+TeGHja5X4#aU7jwLv-=gaKDaTJap<2Vt=PjUPl z$N4x`;kXRPRXDE4aTAW)aommLK^%|ccoxUYINrqZK8{auti!P$$2U+dU=C|RJXk=; z*kfu~V{nYeF%buJ2UdWi8prN9>T%4#u@8;~I2PepjH4AtCyqWGD{&l$<76B^!*LFd 
z3vgVF<8mC=;J5+D%{W%$xEIgiDVPD918=y?L07ogD>Cq&i@ga#=jo??U=h)qUF=mD zILloCI+sFjO+ec;aEc4rJg|p*Gw320a%<o|_ju5xdk^S439LtN9B`M=S)?|&h4T{V zE<~F(-S|$P4s7APhnR)s+0j{-jlXfg*?_at3Q9#%YJjBX07)|BrOQnG07(epizz%L zO0sj{A>wZ(x`OB~M0<&HnjOS*E|W>_E^@T&zn<tmc{&)`&mrHn_OmE$jr~!U4m@vP zgzyX9Q$XkH5p!Ujy8`qqcO2-=Zarw){R(t9cRlE>6tY5(l>=iFxFhC{?lE{eUboPO z)*8%|W|Fq<a%eGcz|3p4aWAd2Jdg3_ZD<$PK&!9;vndBV6-uE&*bM6&+d*RxK~u0N zjXHD9{W0b=V;^`oMw}mE2ivLUnb;40p?N8Guw4he@2%LwcE9<k`80O1y>7m1evEx= zpJQh@7HBalm06=`B+~oWGS;rxy*3Sd*XHSxnn$CI4_F#4FkNn8_lxz4{SMAfv2@EF zLuY4bYPZ^F>9dxm4_N4T)(ji{$hraj76`$HCe3``0v^MtHjTp12EERH40M%F4dWd` z%=b)SyzikSDf}6_b{z3r6WxU<owaH-Z}C}9X9JyOex$Au+ET|MidNVfAs<kjtMb>P z%(`quv7-*PFgcfkvRRWU{uHd9o91E*_ushLN`tO9KS9V^6TQYbpJ=S9`G%&}SPHqv zT7{6=6kbDgvA))tV15o-NpVgm`jV!WMRb<+48_?6l&&(4z}N*eWsL@<`1TeUMNDfK z#Mdh(6QD0Mpe>t>+4~ge#XQo`myvEh1r5jEn9a|DUSpi0^%~niuW=BrI2ix!AmN4> z+e5-Z|BwC_89PFzEioopMOKk98PaTsv6I$)7*imPt~b(<I5$C-L+6TpieFe8tS_yv zoU=*Chq2JWSO^`60WF9Dy@LTwh&ixCx1i8+;!8CBW`owJm~%0vn&`{BgI<6qWg^hQ z|1}*L`s+04&Q0q*y?SLG3!RV&`GkINFGibSmZ+)q6?8qOH5a=C&`Y#F-r65J5Yu@9 zEx2h4oX!tZq>L1D={Wak-o8nfL1<c!-dqx<*dv9jFw*Z&d135DnL(Xxp}Ub1+VgY@ z81Z%ZSe>T3So78_=uY*jTozwxbTyasSNlU-cwYg!c?J8Qy0EYHt4$EfIE_r48Qxxy z`!#A;NPdin&;`9`y>ET`9bM2FgV6j8-iP~Y4xA2Mn&~)p22`&ZwfyHANpps|4^Y%1 zbFtZKcA9<WO7l4E$o(01<X&K2j5l1aF>k=0+|}m2=EK;P`#jc_-^RY&HL@?)`G2i( zLJO?b?FwO!np&?6eg?V=*3*QV+v|3))<N41oraEM>eX9}mK3h{{+Ym$`m8ay0WqJT zIN=~92FEcU)n~2c)cY=_luy?hMU4?~?K70`{yH7pVdQHrpljFZcXQ0O#E-^%2d44H zpvLohH;rkIrv8#3%5lncFE-DhvdyKdfO&A$J-Y9h*XWd;=6t;h>&U$1TbMhS;X72~ zQ@HS3)0;%=b$N^d%AL!0f)2N+)*6FldFyE6K@jpe)mm#I&BPSl_sw!@s~7ZH=edE) z5N8UdId^b<HU*Q@pG)a;n^TBYLNPz0R=S4hV?@VNE)$5)NQfR|pP~3uBxEE)cES!M z6WURIm#O8kI*5`$5sVVbJ-VDyTcFEhPu8`zPtiQ2J9rDI74dJPHfi~uIaJ{q$9Gpa z^z%J!lbI81srSc7elS*0v%4+tGrl{c`K!!iHl4>DR4|9>GLOLgWdm?Rh<VAr*wMbw zI?Qf>2K7tKMz(ZjIzMrr@b=2i5@?RFD`p5!(P&{AE3Hq^GI}SRL8DTyuI0e#x)r@E z6PsgY&cO(A@Bf#c@&CVSmC+NI4{8ZFf#^p>cga(874hqczN4u*nfTR2%Qdz6+D9nl 
z2~Ev4#FrBP7V%4nUP|<IP3;$m?<cy5=qydmvx(AG=6l3HK=f^*B}8A?*o!ik5kFE> z<1GrAJ1`w7aH;BbxU~hv`HW~6#d$<ijw5-^rn8?AU8||}4DpW<T}Sb+(!A3=I00vO z`2l+IpdOjd9@OY^N1|N*AD~DFX>uKzH_oD~s7B^E;!8BOp3^C#F2I{6-!E!OlgoU( zPRF{G;=e^PClLP;UHiE%t+|@YT&7d7#%kWakk0N#AukMKoU`ceo3knAXv%9grS=HX zCv+V1#z8&)y+D+B+yx!N;fELS9P(e&75Vs9G4bC}n{Yb!3_x0<FX{A|a$YqQGLGnu zMDHAge72`)-a1fIyL1rPj$6zcJ*W}bS9JDvihmN(ONlxg_1vYD&RP`m3ANaDgP5I| z7Zd*y(bK7APInd{KGRUgw~wK=dW+(0pjO&I{2ChLH73VgL63nZ=I)@+5!Iuk?k5@r zJVs|ZKE0U+jfTF8<e~YJj%hg*pW+y&Q+#L<5T4+0>H`yqzm!A1LvPd})|UBji{3di zIbWjoXlnnFRQ_YAS8$(MO?`lRjx~$UUPiU1p2L0K*+YjoGw5uB_{o}Drwm>L>JMu4 zQ9>aX=Ba%F#s5TyU>^KkZQ{R^gbk2|T9?SHw*SGd*8=&dG4XdLq}KUm^?i9M*?^h% zUZn9`XlAA55@2XZN}y=1=_{jE;o(^Q9EtfgYy37No!=JLc<B8miso+z()~@w>S!8o zXR`({W7S{>?yk@T&LWvRkEO2G2V#d8YXJ*2fR{Lzk{0l{1$w}%u{QaGx<D<lzoXl$ zfp$-$e68QRLy2AwJ#Q$zwQpE2Tdx8A<83s<vp%psHj1skTK|I1?`vzo7>hUfJ!1=7 z*+s^9d#GJ%Y>oHRMnc0k#vW^Ik9YdFHg?3@{AU=G@HW`bjY_Pbo<n;z4D3@d4CvFa zSHXaG)xdrQ!+<_bLu?cJG!yzXAUwmw9MT+*19M1o5)RBEP3Y4wJ2bG9!7yjwfIbcT zB@7c-*@QmLgg(uLKF#dI0ezZz91iHy%%9<aJ`J-<!-PJ~ggy<J*U-<T34NLgeVTbM zj)!qPiQ{=3ui|(c$A>uZe3+l%*nndIx;H$J7M@|N6vqf0o8j0BJ$xtq1Q{ENt{nu{ z8`xuD9q8PF&p;~&Fq_|a>j36e1M$EMpw_@^pozh4iGK-aztV9AV*T_DOr;cvZd^Zz zIpN071~D_(_?14paX^pZ1BXzYBXlkZ!BaM1P{^#oBfuByl59dLUru4bFGz>vvc&A5 zt4P)i)DS;K(~TtA2T1x1bWz9(O$QB%b0ty217_MlLzz$Md5qai{INtYB>E812WXVL zlc@eefob;UV;aX{-;0SihKjLEY!r5+ZUdc01$xk~c#B~g=>^PzE^i;y?gOTH4+8C+ zxB&F##O*o$BNv!+aA&5@2MB-8xd`;DgdTH>Xz$kpc>fZ6$<%G?cJ&vvI*A#P;o=NJ z5w;pTk_-cPS?@;0UP!~(4#!SNW4U`8=z(qx=*{japntR927Sf8%CJV$8}80$CH0t{ z>34nq`@cu<CetI>>!g3@7hHY}(7%Cqq~=PPmK6Lu0_!Cw3BOx55bhHEvs4F9(}aF$ zq@NyKKe%e}cI+5iHFzxks|Hu;bAzW~Mx*zPrp1q^1wX#xrok%*pF*4y;GU4Xg07<! 
z`*qa8!Ygh^EPch;!CBz0<a3-dU7_WXG58cDum0&$JvDeX@gRdQ4Zg#?P6NL*wSkcf zT|am`&gfmsDD&V0gAahm6N=q{1;4?3OQCFoAK{8}=y>!Ms|HUQT!*O6;2Q-OAe?-7 z-t~{J(2Q;u{085_tnR<~p&V!@#9NR5OZrdqBKUvkB9!B7U2E)H`V95I9_25gdi>xO zh|6XCp(KAWJb&HbM>-PITy)(k+|~yA#NoJaw^JWEoa5@)bWoe3ugFiw!I=|q#o^rg z7_<JHzd_35`!CvfS~eVQB|TM#<nQgK_dFXr?+&ivd#r0W_=eQ_e7asgsiQY(2lRgY z<%d2ZEs_m^t9vT7pM-64MM1ngTaaQdFq=oZAP@9k{K`?LW@DH9-O61f47I~|%k}MO z1^&VNzQ2vWr*;KbeIF}Nt3G8xyX4|Of}Xq)7`Ka{?HGdp2=w6+ylXI&{`x&1{dR`- zZvlOdeg{GCFy9(~599nch`TNR9yGMu<NY$d@3S1^?M{#;Q}8dQJ5mmva)3A?{=<z3 ze~)&j+t4#tLl3U^^tq(_PRQ=`-LVs|4ANsN@@>FBiBWJG%BXj$n^=vWfgSa;@mCmE z_e1!6{C!%B_DM^wu<w2mLJq;d%xJ`a7<MLifOg?O4Et4AfF6VYFziA-1+_R6@4A-b zUDtC#&oeGS2`|Lo!~3omqlEeW?LPK%UWW4Mo$X1y?fQG<eiPnpEi!Jye;D3uy$ku? zjenB%wEMuy_agoK@h`#3^@E^K<L_Y~=(C`2;ji%K>pP(D;jfGj@K;z}`Vb}g82@r; zqSk<}#lIXVY#r!l_?P1ypU*+p<6n+-s10cIFY(T#!ka%|qvbc^UyQx3_;L#NyV{_x z=^EvDy&mt(VehK~E!H|n>?bV2?%z_g6tv7N1089OG<<V3bY?!@y@kFFZw=uMf9y}4 z2)cv0gOQ@OhZI)3b^>2vR)7Y;eJN9~Mx?M`7VjXKRb~~gtp*l!%o?)>{Lbdi;A_oV zV2Ej06ZXuR<}A>CfDg;fea(Hr>y?b*&@gu*ud~dvj1uz~<}Z-uFU?<q{>uCn=&#LR zgI;7_1bV4?DdJp)RpJu!YV&GCnb(@vf?kJp0uSGbxE^#haAOHH)3_H{Z^ONShWbf_ zKW*Y(VD0Tir1LV?kW*NNdmZ_{ZN822d~AM<m}`J3J**k61zm@AWe*zd&kPUWblBM_ zvG>6Dg*<ytdry3?r_QcJNQ2#A*jSm0!B4lR8|C=w!wkfkY0m^d%f>q&_TKj1I6KFl zgOGh~)EZxfm<xVCdp~2iJ>Q;>6c*SEkmf>rA<ph^?~kh%*^3a;Y(x8NFSVB%3U59y z1MRlEjUw!<>;YYFF9$uwJ_htB_KApp65d5AvQM>7MLMV3rz8IP_W9sf*{eV=wl4;~ z%)SgcUSa<ZWw_S97I|G~UkCp8_V1Cx_4f7P|6u<CWxmnA5i$Q{{|O;~w*QQnx7fFW zzs<f4{O$Jb;8)wLLGQ5dK+1R8cY?pmz6<rb$G!*G-fQ2BH1D_XNBjru2XOWw`(e;W z>_?1Z`%(K*@Q-2TvDkjXegY{xiMKP#@zsmxQHvMs7f}9J>{meFu-`z)TlQO^@7nL8 zP2RKLGm`e-?Z1Prv)3W~Yx`@&+-Pq!d}o+5%<!C)lQLXqxHBAdyfYr(C)>)|3Us1_ zHwT;@oE<<XIg>zlaxjiLQ=BP=!q+q^z^9!w_`nH_a(r21S6nsKnTqf_rw-x!I{Sh@ z*g4oJaax=f<h9gUijY>P6=yr}Wvg<h%jp6=!Z`wGdmP*^r`PEP-{<s!?|1saFL#!M zu5ec1vmPs*l_>eq&e7n1<opQyG0rjIk9Uqo?k6}WAhqY5=aAzY&KtPuE$1!Jcbs=X z-@^*O=X?Mp?K%H&{$UKq-dD_RoPRn0!c|{5Ux0q;d<pus^EK#3XCvsKGia1!s4zio 
z*ESN^E9)4_#Sc9G;PHAXc;ChMpIrQnVz<b}d#~;gcL?ZEcPQvccO+66?T!W=>y8B- zhjsXJcMEq5!*jQEw*=kVg&c7wxD!CP#hU+ccYEvt81C-o;=6S29_}8XwQenFom&Um z;5Ha3cQ1D@(CO}U(3$Q`(7oNgLFc%04A<Sq-N*3VecgS*&voa5-_P9-{5*Fa`1$U9 z@C)1p;1{|J!SC<x5B>o60PqL82ZBGyJqY|F7ZSuh*gY6EJ=8rEw9#!e9Cxw17~#v@ zWuU#-sZrwgV>d^MyBzC$CGHCBI4E(Cc8>-<2D=GL++*EiL63Kj2R*?(0rVvIB+yeZ z3n_6=b58^Pwfk$d%th`+ptrcUpf0z$w}Jk}{R`+F?j4|axp#ry<K6>$pL-wZ11{c^ zavyRZ0)51N1oScYG0@lCcai27?toFA@De4UTPG&sou(fpP?E$Mi8B%MtHdhMs}k2D ze0Ab}V`$>x#1qD_#8Zi<!0R`mMqnrMv*4e@`%ohiFD70D|5D;5@UJJ{1pjv8ZBYFt z&oFd{VMZzTD33BqvD&Z}?<C><Kf_a-;p-QU+8kfM@bK0|1>PG?t2Fo!U-EPDwX(=a zsw%wqsG#521#~yOm8kH3+&sLow-DcUDN+aEy96WgJ%d9Keki`=Qmhu^8!kl(pA5zq zHk$Drmm;+kUvTm9mS`(tw&4pdKC~nl6;(UF;^O1oV@Q0wJKT>ne}eC;6sc43g_R<8 z8s6d^sm{QcReW_8zG>nsyd#V_*Wzm;BkA2DOFf8h&$xI?>tQ2-_o~JjWq5~b3(zg` zU6``ugyi<dFnT)-?+-ynIQW+muJR!LlaTIug;k@fA>cKxVpOHSu}}(L<0?hCYAE3< z{U-4?Ko1k}PhoW!_YBx-d+<Bp?-065VYk3!q@=ZNDMD8%LRSu<s}!Lt2W!=pIIG_P zGznvs5yr9!U0KGS_z%I_b3IUtMpwh2nTt{0z3?w0q?II$rT4$g#M!;^F9F(`1AZT4 zU({(X{z)LO{lL$|KMDM`0Q~;=k0c!C0*5UEBGJ1Qit%R5VaTx&|DlA#h7t}Nig#st zfC%*)hQkSy4K?()6NVEq8wzB0Dq7<-{2i>6or@Z2Wabbua|oH05i+yz6_Veg<X7Qe zWL%BEMrPN7|1<uK%!cCqo7=%_e3l@5HjMCD0_f~s<f8GJPx#D-rtu-fd>DU+X7xpQ z%Sf;MK90YGw~d|z)fg^C7%pWzkAKQ|5&x9&GX5#-NPGn;yo!GcE6T5-Jg?)Qf<Em{ zT>Ccu4&G3D7xaDn9YT94<0Jfs5#~!7pWr_Xdbdw;mBxQ5!hb2ke<{L$DZ+m#puaD0 zcQpP>0sno8t2F*g8Q<VPj8I?-yFUlOL-_-IgDznVBV3rmyGQtrsHxFm3OjG`jxk}x z6k)^^VZ;<+#1vsfe$%pukfK9K(ZN2;a^Sh)=5TyZaD+Jm>5MW*fsTPrFh!`*B-CgU zYAhqvSY~R}SVpL^j8J14p~f;ojW(f1myn`mPQ~8AVP-wFh$(ZrIURY;fHpA+)QEB4 zoMX-bKi`~>@J4enXp`B5kdv|7E`^;5XMk!1=@Wu1BLrDS2(pY2WEmmIGVBGt1Xo=O z9O)B|^s$=(>%!($*mdY&U&A#>N25uH(4>Q14mX3p75iaQge-mR;<^*@?>6s7xJH>i zp-dm&CwmZa9yT8aecXH;RO3#EaHnHFYd(vR=S}n<!k}e@LCXk(nuI}9gh746pblYB zhcKvP{=@u-QA9}8!I#Vq1ftUD)3@t^K2wA~eS0q(l9DiJiZG}H3_1%T8h!eNK2wA~ z9iY$s5Uz2jNx0J@-09c{*#{w9<4(&y*ghDDYO%c-=t^Ty*KV~j=Guqb%RoEq4j{Hp zyA$CWk0u3=_S${mkF<|O9F0wrc>m`pAitINO7M(RkF}2lpU0^VzJYcQQajf^7df71 
zpNF#=vpV)K?O%ew(7q7-CHAE#+i&dOpv)S-rtoI(ZxMcleI@v-?W;krv9YEBM0*{0 zjbc-TVtxAtpx6}NI=Tt;79iIYVb&C3RtK0BvKH?n;cgOUHSPOsyqQdBHHEj79t4Wj zm^DS1)dXgJ47^6GDf=n=DU|sc>`p8u%$g+3ny_EAUqtvz_DcxY__f%6&3+wJqu3<g zczO%`+t5iS3Cku3%NF51s1FeSkqw<bzV)^SDXg{Eg8tL~Ct`kK<Jq^rw7<kzjd2t9 zfIVOo6Vi1E>Dq*JZG01M8zV*NHbwX~g|EY5p5|nn3~0538Je@RgSi>uT9XiM!ik+2 zC{rU^AA6e*143<fnvEnO+5{om6d~FaAzFv9tV38fNm!O~YKm}big2p$yny{#DSTb- zMPr!r6858}oR^)K!N20X0{&I!Rq(Gl*oQ-iHbsawMTj=#yz9J+GH6_zB3zqtK5{++ z|B3Sn=wF?`g06AapoD*O{sw-ngB?N6-<`jM*9bV}{L}d-Qq~AKg*WilBQK4EQ+OwA z19**wQ_fe;SKu`!PC4H=7)1#gr<?%?qbT9y6yf6(;o}tH<6(r4Q-qI)5k5{4J|0H+ zIEA;(lEyGX$tk?KSqxrd<rLn9gtmnca|-XHLEA#OIYqcRMYuUdxH(0**&*EQ5N>t| zH#>xzQ-qsSu72w&<xX}dgHCaQ5Zttz2Gw}lBs^^ro;C?jn}ny!2v3(0o-QLiT}F7i zjPSHgc-kdAZ4;h$2~XREr(MF+HsNWP@U%^M+9f<~6P|VnPuqm2UBb;S;b!0Mb1~;3 z^z0LQ_6a@vgr0pu&px4NpU|^U=-DUq>=SzS2|fGn|EcTzL!&O^IR4ze`#j%!zVq6+ zo4bRR8?Gru5D}yfHAM^xY8APLOs%jC%1Va}v7xXe!emHDEQzubT|*^hXocC-SW~h; zBI=KpTM-oW3kz1xb-g~H+u(%SanIep-|yYIiSN(rc|XtR?JoBE=GYzovP)#Q?B;o| z^fK;|J&dmi<RE(m2SHvH90ci;K1OBkoH2J!-VpK&*jfK3@33F?^Zac=N#uYWU_2-X z8T+N5aX=tuIV{YI<)|EGd{4+VkmGWkaZm;smEf~<37&L$<M^y`eAYNVYaE|7j?Ws$ zXZ=FIkj$wT-@>@eFJpY(ce}LT;dd}9mzRh<HX;v3<gu}MY%Cs(!mCLL{>9Z;LU4_r zh%4Rf`1<$;lIfJiD?>e*i<L&<X*hc+<lP!t;c2Dq+OUd5#X@K!Q!z2}E*N<i!fH|# z6H*mBIJ+h=vxUXi@%&NJ6br`UNg0bL;c-$G3r6H(L|$h^E=J_uh+K@wtBuHOLl;Sf ziLrQ{vA7tEi?MjEv3MOF4JY_1l*J>m4d%E^(nt<uCYehzV<u@Lo3X}NygIp=UI$kC z*#j#uR~5{u%&i_~gKNaUoa>q(Y-MV7DwUyUy3@-$)0g5qu7rbirTRW|7*`q7D$%Zm zVK+jrTVd4l?7W%0y#G(j)5P#nI^J|2FRtPH(CEi-XsMfB^LsAg1b&YLsJ-jgcCM?p zH|eGzH%;{^?_SU2)YYdn*XP=f^#wR@i{Pl`IBmFWcx(^gtvv$~?S+K;AfVS_p11JK z2B4cGwsZ3we4=~NzlK4^V2__+j^9(4@w)EB<642Ybr(Ewk8%VQ(F#344sb&oywDCO ztdBNCPf`CjQ|r6&pL*!@-HG?K%X)n^)0FB=_68LhZ|Mlt_!wT&pn6C+NFPvzPigg~ z>Yl+N`b4WNHJ18Hb-hSE{gT=_is$nk_3<}6p2=v6YL}zRHR9vUqr%;SZ*wbt&27}V z#rQKT@MYHE$84Z#ZKPUl!guMUMs2}o*^0li4PT`jKc$EIG)lD@r@s6^wYW@`n4ls| zqW(qnK0@aOeUH)gfRblW?L0bNk3Kh`%T4I<d~|pTI=dWwU5T#Vjh^0%j^2-cK8S8U 
zgkH9xlkMo^YILy!e_<WI!ei)SCra3b0&YY3wxf7GDBX)F+&(nw1Um8|`f(cFILo|$ znECz%=K0^qsC+AzWK72CL!OYo<ZrnmlX6w2<R9k#X&-#ztLRG3`0H>1ZuHH5o~jES z$lKKkzz2BE@1qO(kni{J`T@LvQ~o1=nqK4~f5xBn=g^-Kf8Jl9Gx;<BxjF-=&=;PR z6g+{i(4SGdmB%b~@xL<1G>eSfv3Pb|7w6+b`d3vWe1SQZqHsGNzyjv}OHiSQQ5>`d zP2s=(K|!8W<v=%{MlYTVo5JSMN%qzj=I}k?g|L$puJSCzUR4a$3!S;ER)kNPu{SVN z*X%G-Op=={$B9G{-5*^XDh@LdiqM8d{M#nQ6UB&%hq#S!(>2MNvn{${qLU`)MqYZ> z)_#U^7w^#T&o}>6v@;1ax*wSvFP??$q}W?r!Fn&G=cvTiwNx?ABP+Rlw5!k^v#b%O zIIq`6x^~pgoz>t99`q41aS$g{)zW!kr8UL9;vJ4X!Fk;$%XM}w&+Qn;^*TM4()XM` zug@bs$G+FKNnMN9d-?N|yryINn(5bF;q&nbX1>-tUAxpDpoZU8fmwa2U*HiQ^(%_m zZIQ~y^NA9fikrEyQzE{aB1c<~*_WjGv!v=HR-DsSQ71RLZqmVZxh}n?>=^3~yrr#7 i+k5}X>yE|ETr<nwo1Jm-A7pPhJ!XGbRc>uO?A*Tt-O6eJ literal 0 HcmV?d00001 From f45c926b30a86f5116ae7164554db9c922fefe3b Mon Sep 17 00:00:00 2001 From: Sergey Slotin <me@sereja.me> Date: Sun, 27 Feb 2022 01:00:53 +0300 Subject: [PATCH 257/531] fix segment tree figures --- .../hpc/data-structures/img/fenwick-sum.png | Bin 43016 -> 25588 bytes .../hpc/data-structures/img/fenwick-sum.svg | 3 - .../data-structures/img/fenwick-update.png | Bin 39961 -> 24876 bytes .../data-structures/img/fenwick-update.svg | 3 - .../data-structures/img/segtree-layout.png | Bin 52243 -> 25533 bytes .../data-structures/img/segtree-layout.svg | 3 - .../hpc/data-structures/img/segtree-path.png | Bin 44722 -> 29558 bytes .../hpc/data-structures/img/segtree-path.svg | 3 - .../data-structures/img/segtree-permuted.png | Bin 0 -> 24136 bytes .../data-structures/img/segtree-ranges.png | Bin 6145 -> 0 bytes .../data-structures/img/segtree-succinct.png | Bin 0 -> 39350 bytes .../hpc/data-structures/img/segtree-wide.png | Bin 13306 -> 8767 bytes .../hpc/data-structures/img/segtree-wide.svg | 3 - .../data-structures/img/src/fenwick-sum.svg | 1449 +++++++++++++ .../img/src/fenwick-update.svg | 1406 +++++++++++++ .../img/src/segtree-layout.svg | 1064 ++++++++++ .../data-structures/img/src/segtree-path.svg | 1786 +++++++++++++++++ .../img/{ => 
src}/segtree-permuted.svg | 14 +- .../img/{ => src}/segtree-succinct.svg | 30 +- .../data-structures/img/src/segtree-wide.svg | 1696 ++++++++++++++++ .../hpc/data-structures/segment-trees.md | 14 +- 21 files changed, 7421 insertions(+), 53 deletions(-) delete mode 100644 content/english/hpc/data-structures/img/fenwick-sum.svg delete mode 100644 content/english/hpc/data-structures/img/fenwick-update.svg delete mode 100644 content/english/hpc/data-structures/img/segtree-layout.svg delete mode 100644 content/english/hpc/data-structures/img/segtree-path.svg create mode 100644 content/english/hpc/data-structures/img/segtree-permuted.png delete mode 100644 content/english/hpc/data-structures/img/segtree-ranges.png create mode 100644 content/english/hpc/data-structures/img/segtree-succinct.png delete mode 100644 content/english/hpc/data-structures/img/segtree-wide.svg create mode 100644 content/english/hpc/data-structures/img/src/fenwick-sum.svg create mode 100644 content/english/hpc/data-structures/img/src/fenwick-update.svg create mode 100644 content/english/hpc/data-structures/img/src/segtree-layout.svg create mode 100644 content/english/hpc/data-structures/img/src/segtree-path.svg rename content/english/hpc/data-structures/img/{ => src}/segtree-permuted.svg (99%) rename content/english/hpc/data-structures/img/{ => src}/segtree-succinct.svg (98%) create mode 100644 content/english/hpc/data-structures/img/src/segtree-wide.svg diff --git a/content/english/hpc/data-structures/img/fenwick-sum.png b/content/english/hpc/data-structures/img/fenwick-sum.png index 9c298aaf9557ecf05df0124cf88fd7ec5449121a..6a5ccd56d6ee693b4a51ded91b7155ccdc748886 100644 GIT binary patch literal 25588 zcmY(L1yGgI*02wU?gnY;Zs~3*0YN&YySoukNd@T!fg_D{BMK-D(v3)`gf#s7c<=pY zerC>~@Wzgn&sv+en$MK5F~~6>5D2#NQ~Bo*2*Nh_aG;@ppP;cmt^(hXJY|)2(7;y^ zn$;Wde{{E}2A&WI?lt^_pw3Pp41P%FrJ(Pn?P}xYYvExH@%8oPvUhRxw6t)u=5qD0 z%{mk(hd>@fl;ve~{Id6#eEoE$TD#X)7{VDojL|~hvvuep56a--qM;VYym^ECEdo_R 
zMjBUvsF_+uhG?VE96=smMxGH$7l2TU$ufb}@f7}*(GprQ^p1~}KX7zjyYs^TMB3)F zZKXt=ZKuJ35hM>0XIRRhI}}eAjaI(E3)OaORF{gGITBA6sSs@|l!QZH+SoXI6!Z4> zbkv~PtzdKzA6yB|V3*dbHq9|?aJXvkjv;2hZpWgH#G@OFGB7~^|BxkOKrlq3jXahk z$d3Y#mip}-wp?e6>o`-jaklf*W?BjS63>q;zsY%_Zy<%;eyWW4XNmfHb=N&#n<;!K zPfvxV7Y=;xw?gk<uYG;3-Q;2t@OS@5d3EP*Y%4mrnRaj;(h+#!GLkEy`ZXFG0rsMt zM9lZ_Nykh0?JVH-JaPY|^z=tFM=9)jOd=u~lao5y>`Y*Ak$j1WUI783@87>ib{YNt zou^n9+!J|BC*9xI=ke#4!pspXE9(bV?Pv^Qwsb>Y_{=`|Q|22rIw|YwlDvKU7Q(>5 zFxKK}+tXvE+yLfhL6TVHprb=nq!|Bl?z1%m6Vqq+b;Z6ovi1KCmc8&&igo+%f~#$R z0#(qTUQ)GO@r>Q6@+=c>_|rf76Xzp{`W~unjpPhh#KJej<9l~4_`dd<K^hZW67qej z_wKOm=y~l+g+v@luWdX5hsnomY-Qi-nlBC(whXYrI={j5Mt9RcCahln4W{`<+6plm zaVEgsiDOJ?i?`h!T^KzZI<k2!l5?7T)2J=|Ic)>J4G8@UtGj0sV>nNtDu_zq*V4jn z_$(=>+UZ_8tLmtq!&&kj@cS8{%pUjZNC{gv5*N;;;T+YYYT(|QcjJ%_gej9H>P!4) zGJW(Q^jpkJW(T@?Q*)Yk7?X(e5bIdK#nbM*Zi`i{cM%xBi^JH%GNp^N^VmJ^y_s_& zWZG-;Lx|o*{sJ?;jUYn^;5eo@rYjo^z;S~#QwgIiM(ipcC17kmpjJQy5d^|ErR%ao zjSx3tSh(?L5%)*VkE&s-CU4ti#6s~#?*g$GN6j@#FH>7iFDzl)1n^zm8A2QD2z!9y zB65a5G8fULNJne`aN5)mj?n!$AJZIF5YZLUv01#&IC&6Z!EROZ#W&s*7ST6bot5N= z$*sbO0Twdi4@b<JX3ybEN=Jr0W*Mdze$q|T?JF&%^DYGY5k>{dp&T+xsc3g4%zTNM zc>9+Y=iU^I>)>_>W3j8i@H#i91U{t3c^L22Ef>NTi@&B+6nPe@EE@jHIF#+hLu8U7 z!^ymjMIX`{JORQRoK+*CegQ<3qSkK|i1skvee3RiA}OwYF1IGxV0)IhjPN#|aiSGI z-+NO+7AO2x=Jjo*Fg3za88BUiA*?YA*q@CPSZXD|<`WrpN=K9-q>`f6rX~9XhZig8 zPo(4jRLM0|NPVqawbWf*j>m}Z+{D^uq*6-kz1!296>e<<J0gH+1|F8j%P);+j2MI{ zjx#ju9Z1AOAyB|UEnr{KJnayugBgU!CY*yfT@-O%>bG*9CFM4wG|X0Q_>ET=<*-K{ z`U1?eI>GQh)$GvyEJ-aP+1-maa;$qFg6%Ld)1|pIb@j?ER7c2<iS07PG@>6<PMwV+ zau=mKAyej#mQ6qOl?Xx+fxpqhu%-mURE#hazVOp8tHEQD*Ow7s#_r8#iMgRKv*U_@ zdf-079>Wg5yK;$DP=6Fi)4Vdw>m7EXS$S~P?WeNrO_l4p=w^<liSSf<7;1*>bXsxE z`0L+lYO$#0BOmq<2__tJ!DeJTs1JW@@M!C3O1=8`T>HU_lRykVl4f#&I<YP^bBtrJ zkvq9?IqFm@^fUQ}3Rk(|&E;1Zlk8zD5<%h`{)Ek=MbmG#a2H>rCAh3Fu5UADP}=qh zU*j`iSK!g1U9m~0DGtX-#u_TV&_CV@-z7*?Y-rT;H%Q#h?kA5lbTo+`{Bli&(RdR4 z!(`RvtxCf4baULrsJBqI^#%8{p1!Flyq;?Clvjqh9Vi0yH%oNXMWPPMQ^PQG3G#-? 
z0V86M0qPH@>BX7|t_CCPM(1lB;R}>kghNEclbC~hrp6IuWJ;v6`Xgxw^Zuc#lh4_G ze<*p(SaKXo!LZDmiagvYUXX3S!k8XuUmejYVrLQkna*_x^~Kb=aJ5L%?0^z9$8P-) z4sg~<qVlz(G(js!sYdkDUbWR(%`uMX--}ZrG$i&zoBAc7`eKsfq^&vI!Ap!yzbke( z;ewf<xl(HqE?%aS!U4*ul~hWVD$T#k2?=D8n2HbrBpmqT3Wt!X<&o2p>SNd;uK9)? zu3MBwI(rKg?lAKMHGu&1**6%|Gzg?>T%&4WRGvD=ENXnaEot4(oww`Tz3Tr&O#Uqp z1_74NEMb&}eFrgzMsMY%sDZ7Vu-%jUA6Xcw(k7EhmH04{(8Kmsn!?8KSV%S~hhK?u z57Zx(T2e{?U4K8L^Cs+9tUTl4B5r#EIVzX32SMp3>`4K3IC}TsWn=G_nt&*g9g?y9 zIYE1TneXF6QWNwPPR`KdDzk6ABi%WnoEN6Q<(4rH35JMBCLIe>AGaolQg*<&E`Lg= zYK$nNY0<Spy&jbzn<TXRx2hIWI(&^^gAMZ^$`s$73E0+9Ct%pT?Pi9ho})k?z7A$} z2crv=i~BknL7N`_cu{uLn38DpjVBVMo0&%)8?{28z}^}xgt;3p)6Wz7B2?!~;vIGi z8GK#*k>lXS<M4cJ1$KDU#LGPe>w>+RhcSe%Rw=E`#3CJb>n;ih+^;z1FOyyor<(UI zIrnQ0qDA3FjgjTc45uE(?@CIp_x_3394s;{^y$kCUq1iT^6FeuC-;yZ_SV{F6Zt}E z8R-yfvcdy>@eA{VaDbM{@s#Eq;wfC$W6YmOYe*kFfAPFQlT~;{j*;PyA_EVqAJS}u zO61GlxM{b1iZbgJTmiB}ZXKc9=c}lgyH?2jh!;msW#5@^{1Y=lS>G56rO~^#xT4sw z%{}k2Lp8+%^d+)c>-_zYZh}eF>m?W`OC<sFVx)t?1?qr@$kcXVx8z=#O$G++M=(Ki zln{n0&-!6wCX74gu$e-My8_d+Z#>GZmQGr!@N-lSLRGJRnj2-=!mZ)th`xsT9mP-v zX^`-)s=PfH>RywUWkZ1)wCOl%k#OBBzKBMF5tX2axdLr0e_F0uQT?x>OAXFQPHjnk zYs%|iQZhAk4zBx){$IM3kasV%Tzv|@LG)<DjkdMHIv=j6@+Fui<%dM9=9JQ@*}Mot zxx`8_so>pYK-`I*Zz-cZhV&tJ&1L(4d5tKvxwtum*-0s2%7nqj;WpD7jEOeM*D3YB zn{#Nx1xNtbWl!HYcR}jgR4<Qgr%8gI3aqga@ufPg8`-P^@?#Q*lFJ2{bny7cenD+c z^T%b@Ibz(Ge3KmBb1Th$|3uW6P=%z~liA`5`$rV#P^l-SiJ$|ztRmu>ym@N)x|!_o zrc1bgprDv<U*@abUFrz()4zIpg{`+a{$D&x$xY&~{J+04VQ`2O3my3R3wnW%Q6iqw z&FRI)_9d=uA!Z&{{841dnsO;|#}qZkj1h=TbvOc(ln<(oV?D`kY^2Z+EicDtm1sUB zV$F##Zs0N-p4Ra640U%<oYS&IrqG46eF*Z%7NThQF5DVbfisqGI99#ppL>|}=7;}& z0c^<Z0$+G8Hv5y}>jLPYzd2Qz^&QznymjgXtO`gY9nl9X%B5{ZPQC;ef>5+Ei*B5J ztTB`oq}(>mtM<{ti1UZ5d@M7EgiK>W%JV*v*Yn@#^-m&hih7{FmWxfY(j7SHPghC4 z6xFzU`I;OzXb!S0{m3&{>>F3@rU439J5wF5q_vk3T)<j3*ha>tzU>9&b~ZoMDU#E4 z6iF-yK#h<9`o_cWb|iysiH6VNl~_ch6pVthah#WMm#^v|;!x>EucJCcTBupq*I&r= z5?g8G0$YBqTL5Mrj9MnfhM*7ZaH}98ehiWet+?KzBBxDU5$QDY(FKSZ&bv8nbdVZc 
zj(sOjV7yEcHWt^2v086|t?oQ__wOVD-U)fn2!tuLI(#3BVZ*GE0t{}L7FD<CHePW> zMxV7}U+m=v7C}8injpd9TUrzv>BU%VoZB=lF6GggPkb16WBUSz4yb+dnK~HLFVhv! zQC*aRDV(D+dzYapciU*kR5j?Q{XU(DAy2M&<z$E)W{X+b%!L#>OwU=x-NZBzt~oZh zd`BTn1s;TG_-bW^l1Js6w>qO2fdV?NHK!L(g%aOKslZa8#^0}$e_SamcDN3iR{6|s zBRU=bN`E>jgs+5l+y1NfbiL9s$d5YeUS3{4sB~jrR>`)a&Geys?Kbk*cSDQawn|3e z(NTXyx^>;$=f#cG^9HwWLEl4T#kfaWt0NT^9CUO;mD9&-Uk$oKBtUfS8Q8@=us>UC zIUSCOis6WUC5zO(Ih<v`)ao5B7VMsYD}D^22|jz@c1XY0;S~0~Smn<*LJd1sH;`l8 zUalu<UVi4Im3><mwXc@@0>!VRCf$2yf{~7XNuagkV&1Wu*M8!o8$0v~#CqUE0l3?6 zfbE4!FP_N;LS$5wNIh|?>eb0cLr&=&X>l`$xOn#c-Hjvb-o;9&FNz(lJgrw>#rC$H zbSLk>nE!`eSje{z@I(ra4z~zlz5<&I1fgi}Ri(Y$ON)t(uQQU8dh$Z=m2FC{$kzDy z_-Ym~x%|c4e<6k5S#R!gJAN!Jt#e(GocPIYo2RC(zA^bp8{{S+EBfBzxh3qi{VuA3 z=+Mf>Ch6U~?%r5rUoG`#&zNXw!}s?i>=QNeB!@p~+iVaK$O;#K{8$4X1YUu$=({_~ zibEwD`BlqJWuidt^-~qL0P9pU)NPuO%ka+N8@|x{jr}~F*RMh!ZdN#qo8Ki$-KUdq z8oK@ZhORv@I5hM-UTAr>th|a&<mx1KXlG|9j!G;Qn?mpytlxO0k<fE(ZFN@dCi|Zy zytX5hzt-37=RPwAUX)QX88x{S`y9;IJ1?Yy5sa0+V5bh;=Z$q__gHB0WX2@rvN~Gs z5WhYCT4B;Q?Y#oB#es=p)vKM-yyl1d8<Q`Wzj|YbawW2I#C%Ui#Ey7;_Vt@+7-Z2# zISuRIAY+o$I4=l_dTn<Ltpp*$^!Z#Cg{Lcx`afF_ZqGNcZ;h4*XFkgnS6=B121y<! 
zNcMiUY~`|_{o%~&HOXFT^ChIA83kh1{HKq?q{(%;JtHy_2?!TzH!4k#bWo7^*Vnj0 z#+%@A!ok;1(9zL9aGA_|-xMmuK7ICVGdJ)MA>ZO_bo<3};9}^*UG(7O4v-65Ab3tI zom64*6hfa}mP9iJod-Zde6?Maxc##f7jn8eTm<gY)z$mWAAIdKvqwNi_M_hZ=k`Jq zx3aQwg<(UCdX^vpeD^wUPgD<^*Ktz0%@T%fiopIZdJJ+d2AvK&&3%r_l?XHp5|l-1 zOlH<N0g|5l;e))NpYUrOs$Utlxod;mT~_eamVto>s24=?dYgVZ=v2Kgo}yn{L!y_n zX|d<~dwMcaFO~*}sY0`^kSAaPq{)Ckd9?cCRpOd17^wYnyU0S5D>KKwhulLng-gC= ze*$EGr7Lf}Khbl^dnV)gX2-?y;9Q+;2A{P`F5k`Bj$D(=l5v-<jZJ1}iT-Qww$olx zw(^~wpQTq<TY3K1JEaJZz&nZAb&+(qd*>S*=bGJ%ti6emCC(?cfw-fF?e9A|38tX& zu(4sh{wf1@LC-8$K<eSP&hf48#o^MaxT24bPp+d8+FB0Vi?W)Ry%;@TU$n*!4XMpM zxVG*v&DYznUoQYLYCDWgOx#KkJA_OIHYq1FIW4tLIfn^eKf6o&vL3r}i|2uXi(49W zbs~PVU+1+m5#_$#*CJUTR4nrfrQTr*Bg~}D$D(PWp;AjAdDw=Yi76UL+*oIDU^Eul zFHvq=H<YQ*otv4C#-tyPaly6Tl{+i-_Jne;ae8g)@QI0gZ?8`M*P=+9fq-NPI+LI8 z&7kJHEVWKn8j*$_bv@()*iqH-*RcGV%$t0zG96aEL`y|Ql)Ru5bg<6Is{`!SAA@iH zLiU&1(u$N4`MkC@A%MN~cLt~NpDQXT5Q|!o*O(P6B~lN!dhhwI#|u4@3TaQ}8_km% zCDaiALTVNy6NM4w9Ju=cmm;`kfTip;@m+ecdLM;J+B%$0e+#6Q;!RA<ug;r8&n2Jh zGkX*q%BSmn@nk|Sl&HAWL!N3cU(x~0BZ^7F0q}*2Z%P!J10nM>tD$t>s<49(z@X75 ziuYWf?|ss(B!}?Z44MDkpW`rXPY1J3TnV{}N5Lj<4qgny9-AtAQRKNbB7VJH6m<Lr zRg#;VmGy1gQD>GyEXjDiy>?#OmMTC$B|!YefiMJJuD$lRsbFScfEyxm0S7A!Yy}HJ z=lO3y^3(aOW6FK)M{^}A^lKw-4%-reg#B!A)H4nGqcFp}HJWD&XpUNrXbM1H1~0s6 zBPM`w+0WOr;!2!p<}PrXwts(vPoFO4%jLY#=)ENAwkjQRwLv=H<f`H1#NpGHljYL- z<G}JJILt>#LZV4Ejmy24SnskP2mA_Hhj+Gl!KsSz<nL2c`}%1@_kZ$vKh@A+78Dd5 zP5%uH#3OOPny3bXsH^ovso7R<_C$%lj|8*gxw~$YfR6(vF=Y!e9}qcU-JuRc{?io( zaTG!cnF5X;Ydy%FcYlrk3B?HjNCZ?=2bnzWL029#!MIYlJQ@<`M4U#A4E+538gids z_Quq^EM=1M+rYIPNXORSKX;;V5|=<Jl4N_g3ASz^Swr&e*R{u-obkTP{;z)|GiT|R zc;=jff0#6Jf}NsQN~rNYGS#g#?EUtZaH&e}+!q)-@v|{GD=VvaDJi{KuALcMErZ{` zoBv2+B*?#e_jS#D;-m6pwOJl~M%<m35n$s8@bP8pY=$rY^wazp7=TWmd^2*<TBzqX z>&*V}f$08n9T&KU45`r01@8%9!m~xaV_wsQq_bDIC%q`w{WERk+L9_M+E#^2CHk%G zMY$ia6((kxG7VKgS3(XuE=&*GPe)`2c|?ilY{~v6Cnsk}28$ldH>`u-Z=TPX*-pZ= zz$U-uvmW5F9w364hFn=!bzVDY55y%V+JXz?$OU46tvWsairZ#0E#N%=;quZ?I@sOq 
z83c`Cs$7qmiz~r4*FXBr8#G|J(n=kzZ~h*f9WG@xHw$ArvMwsCs}}>|q?8Ew0Q3<G z2Z|8jeWag1CxBaZ*(i+33WMb3jzB~_uoT0Y0&~F0<r_}&1<^*j_b1ZW9<P1@mb=$G z>ohnmBSU$i(b<Ie>zZ4$jZ5jrkH<jYkI#N;v@yAn^x1{wS{+}?_I~?z+=I#8+!4Gx z`AL$s4(R&X_IOeE0q|uf%Kohi3JSdLzrM*Hb=_a@wSbSn^wSSIEgl=js=pR@z(~Rw zUqG5xLfW{?8Rfq_EHrY)l5lzit^m=;$gwl3G8SP{OV=#qyKRj%0#OUvi~r(ZBEOxQ zNHheWwj}|VeaBLbyy@o&cs=}$zaBn5wMNkJjemo}1uW2<Y6{y6w?SB;$ez*Je0k8s z&!1m`Tsm(Irj4G|h--FWx$OD>`PF+20;_-5=1YG1^l8nfEf}ATstm^@+}V0Cbr=T| z^K|>F;LVl*_@c~L%ix>4s%tGSEqyIUnUMGS^XGYgZHqPCK=U*O>rrH@*F>}@6V!El z{QO_R`<#Kp63wqB?Uz6^-syGc4CgSfe%+sfx(n&@*V=q+{sRBv1%hd*Ii4?sJuI(- z<tN2>_j=e3vbRDTIx!K!(xNZ>*rR47CnTS(wg-0eSESY$7XuK-@BmXvBl7d>JWo$E zg?x|Bg^BQ3Y4Ue*0cI|L3XN$aat_mxYgS?{PsZ-Ph!Z?2Y#AyX`QxDTxU>3A7dp~u zcu!+h8PphpQ#f+2o!?jzfIeCewIOwXP=euNDvgZVPm*$PJ*=P}7MedMmG$e_97<At z+2@2RiCU6x$o*y8b=6u&GX!lL=@tX2f4**vkD5462$~@+pc9sh%7H2hRn__uK+1#J zOmdZt*F$j9$M3>ccGJFEMfhc1LC4oFHam~An@-2kFi8wG0kK;QCcQ{?i7s$^>Y9)4 zhByY9%JHS!`db~I{B+L{?SMg=qKWD<V%1fFTiSa=3EG74N(=b1chg>v{YeC#-QiB| z31NRn2`CIu3;>vCGgU`%SVKSev}vN1gf48G`c)_PW$7m0QfbVOJ#w9ywGPRu*y>wU zn8DDBj*L1(0OO2vEzDgVwT9YW-KY}b*b4prP<=8|dW&f_CtSjqI_sYDg7D>TQD<DG z+&SzwO<R>!{-G=qJ^RblSdQwJea}^%FAFg!emDX)2Fxo5ufpnJxWV@7H(e+&-tEN} zxgyQVms<><873VKW>pgTVAFq9VSCH(JuB*~qVPhP$Bf1q*Q$sQ7%F$Vle&lxbS9>~ zLcq{vFB>#n6vVJ);|)|D=etCV#<dq0x9lB7yOQdw;y2Af<enHAz&_7PqW`MC<Zj8R z*;PSo!&N;L#jlR^)4ylS7>=hYgL{dmVtn}KGe<NGXJqXP#mZ->V3WzB#HpKgJmIr3 zAqb-rvne0pBd`z7gz$@tZ*s<fH)l-!yMr#>0Kd*yR$|K50$NFIqEj~7^P&FV#VhPw zQ>9{z_TbW`%#mu?^tsNYA{`id__vt!4LXE*30wAS>g2s&?8``Y8iwP~CNlqWx<a3P zXeU1ST@Ge>4sza4%EjKC(Dz7oG^c+lyB@n4j)RNCw*FN7=M~3-(t#5(J}2H_-iwuc zUGJtF(69*dgNGJ8u|De7HzCZ=#h9-4)=p#!<~1RZi-R>ViO&VkJ~W&u1s5N4!!Q0W zf}+AczW~8rZYs?ADBw&@55F&Wzqj_v=;ziSXGc*fdlSr1Yx}UjR=49=dI;VEY5vLK zmWai~OT_22TiEoKcBT5OmXn_6=wK4MSPnt6Myh|nu6YSKSOzRr?%c*Wngj-h$odV$ zq%(@+u~gd8$2I<YS7o%{N{r9vR)L&6_@+c*e9*1Ox2Z48se0_{*xGE;7}=6q*vZiQ zD2OuDUVSUA%{v$!3<m#1wwO2vXHszI4;U~uKUlu|X|T#1aH;;S%swwRk=Yn+0LFD& 
zxr2SF^q(nMNYm%%Zt@M!JKbQeLNo_H_VfSrhD}Fm+5DTh2_G+v`;}tG99Yo{ut+UN zGw49a{QALu{j$$>D1tg?yl@%OJ$sr=J*uXdYOt{xt6QI3UlKb*kZw6)nXwe(<1v#d zc!Qb4O*GwV8hs6At#n*VsCf04mw(iqzJ}%v&mZ1lbKNkMDU}li_IYK)pHck$(qcyA z$!Y=Zix`gTs4<_V&CKz^EEJL<2bMOumZZ0J#LhNhK8=N2S!aHSPaWqa_z|DLrq>Me z+utT;7y$!ynmpaW5(i|2|Az9e7JfC#%D)Naq|G7Br-4O~c2(bEu5^5#Rgk`+AwxI& zM6uX(=!kvtcXC?zFjHDaWM*86D)uB&-7Ssq4Mgl1rRLjjzC8pbt#HIi`4Uu3nl~4a zYD9p7PkD}WekAT_ICT#<yRwF2s>w@i*)yg3^OtSWGj3f}N~7jOcN?-XZfV{*zswkM zJW(a2;qTI!T%O+3RDP~lG*Zs#z@PMx->GvS*;*39(-I`M%fDIiBat2Y?vUh6Ge2*W zPtVU&Q(pdCweex=&#RvkvAZK*+N`-_z4P*hZGPTo&pS0@apzL7KY^A6x*OY8wCGk| z&i#h73)EmgYK6P@iuK<BOUD=}8jU>&__dd&I&$Hz`ncj?U}DCSgY;!o^2RCr24vUY zM@E=LMd#f%vOxeX=F)ZNbIe6HkZej~)C$t!IS|V$``wG}`-lO*_kr7W`PdqS(^R5f z=9@zqBc8+bjEqr$AM||1)aiG0?Yv?F53Z2niubgoIgTUkPeHi^iJkp;p+dM=a#9jZ zvmoM2naQS`u-&X?JPeCMFb+g~S$}7vt`3@3X50Pww+{q3u-#U>!=b*ylsHM?7X9(X zwH*rAcHTV@31VzUJ`L~F9O4V8?V}Kvgn{rvvnwQ+)0#@mM@?1aN<mRk)GKFwuC8pR z%3-Q(K6eIWSaz)aG?Qg7R0mIY-Gq%Fy3VJKKIx5K49bJ>^#E&Zey3FzK_!jrfaShg zJ|C^F6$-E7u{?#=|9{m`yRbuR6;2fw$e595MyxkUYEI7Z)#;Wu2|5opthh~+xozp$ z%a?^~{qDk)*<!vU2f<`yWFJ3%{N2I{qJq(*G%gc5&f%dH`FR0I?u3E4iV6n!(S)uk z2)=yUI>v75K7X#C>UBCC*#oH`a+NlSs2D0oJC0)t(yhIcX7$gw8e(J~=AE6x_ZntQ z*M~CrXPpEkuYMDP?7GStimge;<EqjE&sdc}LJZF@W)f>batX2-IW;vbcD*Vq5XpdG zo4jpl1caHaW}V_7N~V+yN(Zs;^O8+Wc8CA<0;rK)UmT5+n*BBryzeD7bNi_-L?*#l z{sDn)hh-aNg=$%X@d4+%wbp|qpr+g(`moogKm}4<{=qcvQV?M}Ei}gC(<_eES}M)l zWn^S<m~~Qux90dRE6dBTe)PKp#eM(H3|qgOgC_e)7(T>yB-;W6<n?yr5g?@9ovs|b zy<!tt`J`R?h~EZl#>AJZ&F7#HWIAy)Qp*F{cv1XBb<iy~9FM@i`xYbFB9!8Ocwt}w z$smh5-OO?|H0ED1><AFpo36B+WI0M!$`SF5My2)(tFs*iVg45oMf;!pV7`Kn<EZ0e z7PJb`1RWzl0G5HL2ymhUSV<<>f|oB}x(QQy9WCboAsnyJ#&pmweO3%ICpbe2JnBdT zC3_x_d{wpd#SPbEWD2=b!*eB&;rs^iFq~jiL2^Kn`TIvQnV{3798vER<*UwL?UKHS zi^@9_X?jW$7mJ?8!RJ$p;AvQt!tVf6I0fU<2Ei?0IZCC^avp(PmXzD{BM@nEkcP}U zCFOI>J`MXzsAues3})nWM-AfjfPmI0fs+mpNyk<iHF?~f&rk$ppKJ`Co(yug8LESr zmYH5|gJfs6mfqbx&9epo&gMg-dOLhxUWd98Kzk;@c4^OOqSh_Aw8evb6@o`ar7kZY 
zsnKRMn8L0ENCudd6$twMp?^SAh5byGNjI`>nAl-6;_I(pJVzyz^k+Bl`_T!RG(RQn zV$A`42739?xWyyPu(KuPE}$=gY7nIB+shqoNL!B?lfT`{7ptT^dSZtMvA4H(0{jXj z!lS<ouM$9h1`>wbEvX0ITYW7ptwrw{lQ8nH55St3aGn4e_xy@21aZ4Owg72j&pTQ+ z0hi&-w2N8&W;fb@&?`;T9b7T?QJD||*O98$UA9)AgWbB5+>mSgM%R@QKxlz<CG0g^ z!!uk!*T7rg?C|pPQhcuhF}!V+oPuHyXt@8cx6J8J><QqB$I=BDfOegYx~R|ocYq8G zIszI-b}`~VA|Je_4!w0}dY1hZ@S}>V7*>!fnwVSwOk@R;COA$2;I+*V-6liClPUQ6 z%wl`2z~80IYli?1fE}kR3Un*me_{|25D16dHZO1(7TPgu<XQpctuX7#eO?^LY&<wH z@aw0Bl(+g{2Y>;9mi4+c{RCSEY}YWrjGqkZnBa;c=Chw~P-k;?w2}w-<nfpsuKQ{? zf?^`Ib2FEXikg}gz!+64ZN5j(UZG&eh?jtbZqa`$$Jl=(#n^Yr+iAZ3?Xw)w_c=N0 zAV;$sFGQ|xKfxH4dJrEKUss2UtR2On*e(5x?^-`$LT|Op>6NgbV1O^j_xFarAB|tn zQ;#wy&`2F-7Mp<15s=_y0F408B?X$gHe(hFN5VRwlPB=7rGL-EG++m?Z2Aly3)$QS zem=fOA|4E*;-^?9tzN8di*ODh=#)OQas9ri+qM-$%zKvzB(65SF~t7S_;F8FRR{Tt zz#2|iSzG@Bd=h}*pfbJckr#R<y+GdpnWG0p@>^fua{#B<^s0vPWnSsGcods~9*C_M zNl&^q22zT=_om?l0?v$qiv0Yf%}gC~X#wy3c}FYvh{qfP#<JE&;rbiE;`0qwv?I7% zMLZk(_3%xn%O1E@ITK_!gqWSx*K;#%a-jmqYf7tL!?)YRM{29lVTx@gsRURqEznUU zC-IMtfqc{k_)qEOEtlKNKk%%c%cS)ka3}!E^Z~F5g{^jY?K}^;{Iahe#qs`4$?#*~ zs{H_S0`x)B-QAs1%qI?*q46pcF}Xye&LDqKeR#js_zD#lK2xv+IMUUo9qIhGI6r>; z&?(oA2aQXu_kwI}0|M3Ec>sHKb(1P-0RK^@TZz#P3{D2%NDz?l7Xjc+DdO?%LjcB4 zvKRwClmliQO5tw)QE@pyM&1A%d3Uk01*EW{)ll+hg#kaDHv!1-HZG1qP;h#W(HfLI z)&W^A0G(Ig`uppA4h&Pd^+a}ODh$4k<cI;TLMJJiOKR#D4p8+>E!^nB`Bm^`Y?A-q zIa^kvMh@^edT`AH(0(i1tQP*?)^M~TA~KgkvkKV36T7i6U<m&thTgtf34O@vdbp<u zgI5AV?z~+6Y=?kW1B2~vI14<U7bk+a+bSgV{;G5@)&($*xdunl{AibF*}^ga#h&XK zUuMV2+`PTJxqxeuY3QA|bb}@eF13?;Wsoa?S|AAVIa(f0GYi?7LLG*q_Uqkp|Gmm4 zh(N-QQ#_4`M<n2LEx9xG_4VR}KP3T(&CiY7Vh1f-pcK-UI|Ed3tl8Z%@52jV&TWR$ zzW`b!3ykac@88`jTN2{qtq&JlIP`16fFJ!?V<C5awzHlbaM_bUC6*5M8mQ<?ASuA> z<pQO|CnfC%ntlbSny~L7WtO0`2R42C-QNZAhuhPj^C`WL#l^LLe_mcikbPv3zIhAB zUf?sg>tHA3{!9LA;y9S(bd|~8-%cfJ|48upIA)o~X=KxUtNB|kvmn-!WBTXKtqIt} zoy8W<@P#(7onJnkNT4H+5=`XGdn%F!t)A;pbX!At=iY?CcpgDZS5NklE(zMuQ$~3= zKB78MD*P94x*&OJya+P%^w)iaKH1nx&{K0Qt*NF&O7|45K5fm8BoEA>*=+gCE7S(7 
zAu<1xkE4TlJx)O2=y6_u{R3t=FXS}4IzFlU`l|(^lVDcmF=yanGti1hcsO?IMfJ9$ zxyPd3yMFB>qib%F{s9Sq^`OYBWebl>dTy$@=IS{K8ij8$Gcq#ngLtYD#DPS9L%)Cj z=7GAheAmVaGjeHRWas0XDB9}>_aor=jym$HAVjB9BSTQb=8h79ok@*JNN8+pGfkwC z;>z2RkvCPN`7tyUB*@QS=XlSFqngj>MqKkX_GNN-NAG=#|Gf}BJw4~eXGYw!p0=E^ zd4NU_HPqD3&fYp>Ss|b<F3zA`DmggplCM@|-7HZNs)ax|b49(&DoFc%kV6zq0&|<u zVK`QPS~eoxOM7PI0ysyIB$i9-sRWgQqQQWCEm_$Jp}abU4yi*>TH{R4%*=#1(H?nj zLsA2@a?xZ(n=+9EbVcPoGg?P$iFw*)yZB~;@(VWk?sn$iB0a;qq;cX57s(tvD|IY= zvg_@Y-`Im~1)dh2Kk|p}m{ik^y!n_k`lW8Z_mcI-?e_0sp<C(X`1`G?JYy&cLK9*w z$xQDXj!8$dGQO}GkNXF-@QW+m=QL3CZar9TY;~9tXXSgAIAY#G>#)s}9?uXjHj~@D z^nn<X`a@5o2M3u~yDsX8B@UnBkG4ZN?>-@8r<p0C<}+$kB@$iSr-xmyy2co)uS1q) zydN+hELd#F?xH_&YIr9&oi*Xf79dk$Z0TqJ?7r!Kh4eS`STmGU0}=G}(F|1%%!GSQ zl8y$3<4cE{lRjSfe1$8=%dX!R1~m{ZL&n=Dh<zZ`F4np|6~%=`->aKL=M*OK!k_h; zh;bJhepo_Z{KnavV@Y=`=q6P*)s7`#!3~{7_v_-!#*@ALRIOQ;<gBYF|L8Lr@y7-i z(sBcOoxS}6<ECCnx#m_U7E6ekD&UTIcUm+q%SXYiw_hq@GrUwV3wM(sd&HW=_KpHU zor@5ZW5luUhyZk{d@+r4PM8aOj_IUvE^#hHjV*T;D`mr>fBY2jA6cr1z{M#-i|!De zcQ~6~9S`z3>zpo!Z8*ZwhJ=!uw#AXL@MHkX@FL2L3Rc!KRFXMZCrin9^OKdr4_N?F zc+8NRcY8&x&ulInS>%D{l>j@j*Cl02_V*>yy6SMjZl6h{?`hwEtTNT}64hBHrQ;!Y zgb(PvxRgwx!@K1!JNHAztH>%B3ro|=m@{l?dB7@5xg$E}sX?EUqcsNsrxdo9Yl(Sj zAagx}^bx`R0rD#6q}nV-XPM@^^o0AIF_nM=VKb#L_s&l-qkbp%`Y5;R%RgSWh-kJD zyT}&f-;<$r3~wR6L$Z6&MuFLRGd>x+<CH{UIr_3Ps<h!vy_2<z6tA*vCUp3t;grB( zwvg(&sZ5|J*|Yu4?m5}B@ug2$VK^{lYl<*~r&!01C%=FJ37ND(O)QiO%;6cOT}{q@ zKs`ljMTQBC6~$b&6?J37vJW(%e~>OXF2Zb{Nbfz<JZ2B@b%YAE(9Jsc+;KfCIAqM{ z3Qy)cVQ_f)?)zSGcBm`ju7LqQjVMZmqPHih5G@yG`kMgB`NGeM-skoQvS~oT6>N|) zM`Mf^b&s2j7o)DwQ_f_Yw_IDAF;D3oIR1vJ0ea}?nJ@$N3I{i#1;abuw(ulSHG*2| zWj*r2AI0CYNa<JIQksJ=<W~bkwpMP?mKmcSUe7Dt^Wr945jK<SzadXHm9rZP$M0cJ zL5t_my%R{Afu@j;Yrlt8tqToTJ{xOBA7xs}_>$$=e<do)64Y2H_2=&$FOhe`GQ)fF z1VdB5N88)hSF&gYQ(ig)Pk%I$%$5{H=-p4Ie_?<&=$)x>C%R(XM$hXrr87^V)4;;- z)m{z@q}fI(eh-Q$)KNm-{zNiUB>aAx1{a5+<s!x>jhtU!Jc>M|ePVEZ5OJ`jeNE9I zd6zh*;*9}U>#0tl*mytMQgTLxd9Qq)bRCv(ROk`$QbczKl66Wktot+yNOyq>tjeJ% 
zSzW0>cXyxh2{~J*IZ*P0P4p=7DcP`j_EypBuE!Xj!-H2%HDH{N@%>O6^y=SFAML?v z_n+!W9xcS{sr%9C_gg>>44}Tr{u<k+z6Pk46w0>{Cq!zYQc;;As_Q`FIAbv>sFCOW zLk+t3AL62;#1`1xtS?zm_K}ujahJ4rAUB<#5cMB1PzTaL6gTRB1S_%QI#P@)4G2QF zdpq!r;lumw(O&1Xd3PqR7DCK^)L4r%$_I!b^r7!^tKGA0AhgKz$Tj>Lg+}{B7?K^W z$|1eZ?$eAV@>xj~9OyVv9y$NJj>t@iLto#cgzWyk@BzYm#OpMb=xk}svA@|gIiJBo zBcopD%}9R$(msa!u*g225WuN0o?T7<fkVJ1e_nbmP$AHf-2u&o`557IznA;JW%j*( ztEjF&rtGUr#r`~4S2{GNOIrCn!wjozP~!q4Itfi~N~0A1le|;!vnK*2Ts9q2)bcr& zssPmi8?(<tafvZ!*~%0BGIbmn<$V?V?UnrX_RW)u)!Tf6UR6cTmR8({1u&ShJQh<v zhabd#spR})?F!>yH0%__`43Ts%NK+WHtpXBHHR=jAfWk9!Op9E@4hf4<<v4dO{?zU z2+&BVf#9Y@HKX-(!ztgcs(N6%YW$9>P0qT)YN%FSOL_WPdP}9!#wV#ThjR^uUomJN z1w181H<Vv_FJ4(vITUKaj)<Lm8ysvh9JM->mP`81i1|lrN#2Feznsvc)68-f*1U1m z`)gq!i>CFF8irdGYG8tI@zi&6FJ0mIkQjRx^Sf5iL)5O|WQ{r0;YQUWZ2YavaQKpq z&d12sg4U4EDg8|=^OC-k3LcDx$n7{FnsS0VNrb+YN9V8#b>Hj!MS8!^97oQ4=Fg|8 zV(*D^4feEFF($2;>zKJsA<NP$)9oAK4lXXPs1>p~d(2E}WOB<bxaZ+|nr42va>H{x z3#^Z)P!k;m0smLcANh73`TG}+vu}~1ersrZ#t&{_D14cx+Y@k}RFNjUc{5wElg)95 zj*I&+QO`Wwgw<PtM#xSl&r^nWe1CI9+6)S8HK7k7@Dx@X6GZ+_e?j5%&N<pIpB+;Z z<XE7BmtWisI!uzXv-jDIUf0yr3`hi=nNPwZuJi~92|>(VXei)|?Q|3bB5V$W&ygTK z)anZP^84k@#ZjqgM?mE3ETI>c_qSJ|@1w)R$LDyoI0bSLx?9jEV8(Wci@0cPp51qH zou4Z1cRn0AmM^n8U4jO>HX6Z+hv!EADC*MQ7P({kB{MV6Is(qq1sqd<Ou4&W9<Pl+ z4zR07Ye7H4eAUd)QjI#R9|Qqs+cF@;Z}`vxas!ZwR~2)VD+AZEfgZ2$p#CIH54+*F ze0sUy^#$g1@PR%p(TEn2+;lno2u8?vJ50li$K8^X0g^eAGr)oR6xz_ocKeVtNaW-7 zNd3#q>bX<Xh&CvOH@<9OFTa{w(AASp|HV+#v#WweYmC*)QDm}M@UGPG4FO<4xN^~0 zw5+T@yp23O_>!5_UxJYTY<DUal>SG1$T2k3GM~H!y)L|f@4yp6P`8JC9~!a(gb2Qj z0Cd!VIm9|TBp{;VjDbd?Jn_G837O%@3apL|SOCDKYPwf$!8r`@*g@tERJII;&Cq8M zGb0mpY>X<SlS;GWwd3!O)g65RjOe%daH^(qxGzas^dP}IE5Oeb0Z-VTt7G<ETuaPi z_ID;Mb2MS*oEhnjB{irqM*vuefsU@HLEIh;3VwnFwCZ^N-8TrhI8+}@<piK}Qr^mN zg$A(3U*V|KtAN```rm>kDn83Ttg5y{QFvbs9QJ8Q-TiKr5)ep%Q?em7lPhEBD>Z^s zDO(<?pNUcvIS0Fet9oV<mx#FH-GGE4yt5tN0-YGWUH8SG!>NVc!lJQXanTGZZ67J0 zUNu-YEdYTo1!Yq}USk1E!Xqh><Z_%U1FaIA5ZLo#jJ373YzfC}U!uyR#L~QwOaw#} 
z^SX?#z5QCt{>$C+>g~x-gb<0qzrVNgLg7sS9(Tj=wjYp_2R#pYj9E|Oh2?OOK@@L2 zjg;WXF^W;!ZO^f&FaAhas@t;-G_?qa-iw2r<u|CSZxtm<o>&vpX^N5LoLJKqtb*JI zjy=Ktz01h>(H)MsGoj$@H&v=ZGqZ9B(3?(~&JW@m#BtEc@g_F*bS|y`?92;te}CiZ zFkPVxkRB_@CIFW!1UXrSL0vTHOwn%f&^)Y$N3G5>8oYqGUn6!1q_yaS1Snn2&6UXf z6BvwUfh@9Nz1^2#p_TnQ7x{c%o@G5;>I#6lXWM+M_n6t)V?nRh34jeefq$LgJuh&m zH1nul8yp@2jS76+?Z?pZ@o|!=<#cekD3!yY=M}Exb{*gDO!X)rWd9nYfB=GSLsURr z0r>DZToeH^{S`3A^|}*I8!$fXF%xX+BUXw&q08fvAx%8CtxqN(zXN?V>j1*{xir-| zWpMw+f&3OY8vt3TJ|HomdBhqpm)QfK!lEJ*&Q^FUT)V#~Xl3~=$^^PHK+1grnij_Z z4=68lnFV5K2MATUUbPM2uMEu0bpV2l<O#V>_Sn?bf^$khIEO)lgnsQy<o|Ng-#?i1 zK7v|0ysst*Kw&`86F^VJ`%H4Q8}kou3et(~fZ9IBc!PLf-%2Mr4ZT#i^sj|+B71At zdDPB1fgOAMyg0(4?WhkHT<bk@$a_A&1GEMR5XXrtAY|&`uvETB4d}PYf_LmNsb{{+ z&Q=3j-{3eSSgw-4bLYO5<9h<=Mlxtt1kFXDFtXVbg(<J7m}&UT%8CJQDd_26fvNKZ zHqK*iE(acM0G!3h%ljT|1t{`ARaQoCym&JM!s-Wmq(xhJblb7_oO^P)OLBdvEju-J zvy~HY4bWbN;C8mH1;C|jSIAwE0rnELqI3VJ<k9LJc)QW4l%7aTQgZTKlPk?qTlt_Z zfK)ma`tLz+4KVOS0O%^KJ&_nqOubM7W|U080e%SV7hv8cpqFM5P*};G&!YnaFF}73 zK-AL>*}IvJRqiP(z&wF9?*lz^tR}7XI~%BU_A;0j5danQxh{*tB?zwCaIU-9%!`Uc zX};X<@3lMm7T{Dk#Q{v-(c4>r7f65s*tfuo!wt~3Vgq<Zm9>u4?J;8L?H|-EAy=Qf zyRibf%Ca)n&Z|udIN$qRSLY87p!x6D_NxWof*q*=y+8n<D*()Zx!C1@Y6E&T<v~3J zju>{*{T`#&Lk<>kb<n!t4BHcYNSlK}*GL?-grT0m?+%M3(2;e#nRTcqI-J3;<mnt# z4PcYgY|R(j#1$ZdfU;<9-+}(xQ-dU;SU|B95$g}dryR%g9s=V)u*36u)l7v1<1ha0 z!Uqier*M~mVCb8>>zO%-DC{JY(Mna@3hg1v=+-J{r;T(2nG&=zR*YyCJ-$k1NrKjg z0Y8>oc<iA)Ao#$_*0zo5^vckuu|U=M;zsV07JcmA7WX)NDrL+(ee*>3<!bEc>X%pQ z6)NL^IpLe$nq9@Sb|Vj4gPU&wmtO>3Z#L7tdhL?^0Khx6nO+>N;C6#1rcXMblI#-$ z0l5WubcxFZw40XNZ(NEgJ3W>k7oBrz%&<5-@go7TL4D(#lT@I#osZ97_HGB)1Y#Ex zOee+AA_l~iG5fTVlalDyug$i4-8U$4lq!a87LL~3iw}GPi$IvLqUDdGGPLIQ<nx@9 zQds@%#oWQ)5}emX4t{=@P0c&%KnIr5hE8g#5c=)HrgKVLG2PHRvL>XULo@oh#L);8 zH8ycl>CV!XUDL%VZiv^u7??ya!Z1uJh46^wrvu&h{h{l@QyAwWJTjIUr&`J6OTnzo z7*ko7Pr|9*V1spEVInOO67sQJr3*mxa*dFZHpAY3%p>_!eyt3F>OI6Hy~{I1+#BV* zMca+ga*aK-FuJ%8u{uBWQ7;EPh*f(zp#U0`dq_(P0>otn9p8SuK9h2fR4f~{&m7zf 
zCx1<}8Gr_>YEfI7Ng$S8ql`UeE)&{D&u{Pbu8o8zq38nl1pA`)T--RY;+7#zeYT%S z4O^AZTC1tpQ@0J=6I#=JQ?1@E>*UkX&c~!eI3tg?O;wnJEE(D8b$AxAhbE0TiB>gv zL)R%|aenC5k3%YNC}y<IN<U6Ei1(=6KqTfUWL1+kPe#^56f1a`s00Xw(nZFk@Yyqx zdf$CQa$%nbW%cJXWf8CR+07e)&~*D|TrE+cV~rczu%zQYnKUjrbnM3{a7)PbBW*|v zeA-YL-<Xe^r3H_M3emLD7j^4rpsdB_Ff>G!g1*QIgBO$YjwS<z({%$?CYxo@+n6jT zP05cp^R==p*+tAF1=B<;$n7x$W8he-X>u{&7g{1P(c1Q(KN$n`3MSr4f)xBRPdclY z3i-n{csV13F4`w|s(3H%lv>*AVs8Cfc?}`<RNsJBd2YNJ1&V(tT>n*npaa^NhG)i+ zJ=Z7(fA}>V3Qi;W^?{gcO-@_OjHyrGI5FuT9<<A$KnN?VKGY3miTk;L*1k-nrTHi+ zL=UMc7)h6X9yAi+T0anm&!@1=?i}*PAq?1a@e_S3d{A|&@fDJ@%PwjvA<(b1tA>Y+ z>f!wrRgoyJc3jm`0lub~#L44%xH1<FmQk9*O0LL;?DLNBBoF4LwJWKSg=`&C(js22 z|Hp8*9#=RYNX3>PO2@)wgrQ>U_C_s#*Z`70yCEq1cAh3p5#yCyMz3mI1w!v`-gLe4 zz41icXGHSwqQD+o$#@={bb~efv1JGy@RuHT>4>Qmyyyc|)23DeSvvx>N;^5;L7*@l zAJ>t}*cD67o`A%9<^7QOmsNr?G2J(sHEDyE<M*NnlVG-7E)PY4uX%toyl{evH?<o4 zF+^zt!{=A)e=r)Sd-I)-eFYYFUt0#GlMrLDqC&s0^oFwXa*=m$-XOuD_}Y=(5FFZ- zY=GM=U&L3HRgsAe@(+Sdr{O3};?J}nQ%p@!>BRZC?~Q0L;n)?l4B^9}k(3UmcDm}t z^9(j=nqli`D3JFZo2jb|ttwBBYvtQ&V&{LM0!ZzfP%<64PJM<q3<^*+hry11eHk9A zZ-P$Cv08084p&166^4HqI1PL$6zZZ7sBYtalcygM+Kzd46giE&oevbtjk#>2F!{ft ztf=F0%8%>KDXIJYPBDZ-IU4i<J?Nm$!$V>(aLXf4rjLGfM5zN_ilJff6rfgqG{&<& zLTIBClI?j~M+Hu96MJ;M&$Bvn^;U~Kghrv!UKp7mSi~ek+NM&DYdgAQGY7NZ`YMY% zeCzF<D^<&Qo*xcBxd1|==H0ULxEU@b(5IXzeuAc7b1N_gJu`T$?wL{WE@S7>I|pyZ zh%w&bjbcc@KpGiFQbko5ia|H)uyyX)-)`VQ=au5gi;Gd=js5WK=HE_-OsxLNH7sor z2Mm-&yUE~R%E5is6cPEaP#qV7cr~^92i{iTMnBftEGI*ETnhIPrJE-S(4EXdEaa{t zrm^k+sVSbUJ`+*2lB|qgMCjXP;%&2Ke`dCuoYG|%P<t(rO9t`A7RSBTW^TwIXeBaL zLO`L{2#f>S$^nGIz*Oj7G9M>x@@+IcF{59ls~OylxzrSxJQ=wVNSz~Vc(UtiMp_S? 
zuAuAk5U8KTFS*!)5`?^h!d0t2&!f7y1ev`NgWOz-|MdcBOAs}`@XQe9)-NAI*aaJJ zp-(@_^jG!`yuI2A)UV*p0D7kWQSgp%P(j;cTLujp<oVuPpSV&^i;`2T4M<|8A5PhZ z^kTaAhzWa22hzAlvqd~>=A7UsK|mPh<Wy1h3;2T{AUAT-|DgMff2qw@0pUG-WdP4a zXC?A6#f<WtECjU{cqJZGv-qsO<0&RkDp#Af5;piNLF*svanwMafxNxErJc6B>j~Jq zT;yGQF75xCgUZf_`#|_v1`zy%048c`iWPqc=Tyo9>0!ApKa{zi=PZ;KJF1vZ@Jy=u zKMr6=QVEzk2a`fy5E?=?i&cnoK|72~*V2P8!9Qnk+rNu}rtMVU3MHpRAOTuJJiLLR z+2-R6LY_*Nc}vBvt9uK0(+K}@{3f5R12B%N<9R}c=|?;h0&Mcspg0F_r#-)cx3z=X zq!+R4zXal80mzeR1&}9ak`JGZc&d-E7(aC<0;`G^>ajUQ)|W^F3bI!kplT=&Leo)+ zi|ndjxD+D#t<69Ukt#uaRO2|q1&XiUjLi5l92h*V{Mrvw&pkoN?4&Fj)$=@qY1>S$ z8n}g*Z*vbl>tFhpHmRP;GEJ3o)SXD}ME2Q`wff46z4gudN`a5D{-vu(ssRSnH-<5w zb#y#tBZoh{6>JU*JhE~f8P?jt@}9^soWzg_^U&3!L)$TB(?-}X0g7+lOCYgSp}%Yo z{|=9|EXR%zo5+H&G`wUPD_2TuhsL|f<AG_Rap3zJ61ztm_X&G3B2$D-eoW=*Q#1;R zfcjJrTYrwDbopnG_<q3;B$Yaoav9q-fMQUh(JhEqs)lXE_IKPtFTudb(9rGN&-J(@ zgT?z&@?E2ufO7U<mcEFt_8ip`@51kEJIsCAYmkMjYlqs#0}u4qTp}JQI~rK%=`G)< z^}9fz^!P$wRvu{HR6(kMIVT66*pxj#W0Ox##y7uglh4~sOcyQ6lc@I{3EWY9ggc5^ z2$G`9)1w2q(6Uiv>4;4}M3sExZA|IRO+F3yo^OqJkx8?RE#>5*Fu=W8WRGna@t{~J z9ANxk^ojc<4V7mig3mi}9xharNH%aAdz&uj=?}NDyxW@JwK>7DEMbsYAdBO})1g(U z+v2W@T|U6^{7;2CxhWN#Ckp;&Wx*&%b!p|%7oEM4BU$H*q=Xt>rZkmZyjtVOzUv(c z7}zjI9({kFv5JSGdiI|k>Ii!Y5V_hZIuF`*QOi&<+_}|BgIE?XdJ6U4(wi|7kL#bk zuME7Y^pG|fw$_DDfh|8&Ov0j*Ws)RB7VUz1$~d<mozZ0RUipH{LU$}UVeR|*A*QZz z`N=)r@^$GH@o{93`Bk1xw@x{G!IzuN!8XJ@IlcmOoa>9Tss95$sRmbhXtL|x;~j*V z#G{TZRw|!=`Ga3)%Y>rHEp6HLo;(9C3;Ba48j`HOb@o4h{cRV+&JE2^*)yZ=NgAme zyiY;I;Q{wMh|-|IeXSwN(yzGHh+$OJ0zZvL`X48qQ#v2R_0@bOjck>@oc*!dd#oU5 zFw<kp6+5oce{IdcZp-%80h^B0=ppS01}&pEyfKB~0N={gfqJEfNG5oEse0y>N59== zDz|nA{<McS4fw+(7BOqxkaG4l%i*!E;A8PgQd_Mz(tE$EUf+NC(0cmQuz(&9yes0} zVXC!(PONS^J5j2np?gBXaXb0Hz^V-;@nTXcj4N|6NCkj|X3V>#uAU%oO>~1uyu+Pk zi9MP}Z$NhAW&NrjetfTW+eL?*04^E%1iMChH9vEF>O{t#0?)R}H2F<2N0!#w5k!H+ zJ(X2q%%DD;C!C)K+>rFeGrOFyM!2zkfW8z+`}E&vkwvivcMQ{l(ZG|8QV;<+wVr+) zD2VV*)mPL1E8}WsrPuAS7IAj{Kkd=bC4*Q#^n2--QC)DS$=W@dIe2Rd#?5m!=kP>& 
zR0vQJuh6fEk6wZtOX6)%4eBeY@E>raw)T?6W=`cq3FoyXvfHco5%NDkLx+NKyC1J8 z6VqnIl=q+FZ{NPDfL?u-$rsCjd=)thX3>j^ikj$U6g&ohI;8~k7<~dnX|8o(@;@Rp zj$r_*IN<Ak;?a&>-CP36IVvOe7)3aBiH<V-eB-&3;KEwry|j({zmtzF(Ct?P3@=~+ zqYAn%G*&KE$Wr0F9=53y843^&<=`h88IR#Z+@bWQv6W}}VBfok&$4CrTi@Tb<7bsg z8#0ju`qevZeG)3FU|YZu)iApi0CxQa00DMlY^<kFrQr+J4AA6;E*5YGs{|bn^L;-+ zV*wR4A)y@jb2s43Q?`gD_??|TIJ$HIj&hEBtQ{_Qh>3#}n}Ai=i72sG<p6RJl_M3p zT<+MvAaw59bw>hj|C*v_`f0--q}3G%Ah-88AnI9jn}vu6bAsb`RR6EDFO8?NjlMpp zC<@i16e1ZaLS`}xA%*8b$e1BhN(mW`l4J@|(j-$thUg&ENlHY<%u~sbAya1ZU-$og zzr0^xU;3$j$9-SdwfD95+G}l|aGsKvdS(_l3=Iripo4HQfdXv;g1?1PlS};v0<TVb z9UbYSu{sAa!Ew$6T+5$mWKfT%WNKC8rsO13W;u{y;b~ZCO4f8fdg=Ym%{!=hxLsWB z+A@tBz>)1Q$)3$B@^mYkH`2)_bUm78-Z|wf8Rqhe(?7yQwyckQ1wqD1Cg`O$BYTGP z@}BIs*br4~Xar&>Dw$7_|DSKYpN`FSU#b`aEtmw2W-ES7o!n4#Y-|-C0hYevaY{jf zgp^AM72g+s58m(C`|j;4yT7o-zytW?^JlGgS8xlq(6azwH0M|<+1S?iQwe%a(z(rj zqC*UJGmT86*S-b>>k3;c3ly!1I;gOgqIos{uNaJk;{HD9K?uc8=cxJLVLmVnUhV2+ z*SGO%2uuiB7Qy<$Lqm9U+Wl~uXIYdb;K{&=U{g6r<Lx;kfQjCJv*8+pw6SvBr>Mkp zVXD93fv_3S4-?mkc+fdF)|a5!vI3byNL%1}*~OMtZtHO2Lau%NK1iwbzyQyd3P9!f z0Zy&(UbhiSw_7rxkj2C2W~!>R&;3^pL%hbcSCrc59an|VUV{JB5wc2<@s;*p9y`>0 zcb|rNv8C+Bbp(-JQ7OpYSXi2xo13+CbyMJ-%x-uGo@cJp;W;ZOXFZvQPSE&kN91FI z<q18Sod{+ryPF`apaF!el~5@{P*L=I6?_??vO;9qP{}mvnwtBM9H4?iyfUO6yo1we zDX*uK;<{hy?-DOocmPDyVfO$`G=f}*M=d!1`FUATYQ)1p4D{~mFI9Lyb-jCH{D1$M z?0yJYQps<)HxJCr%n-gLIDL0={r2CZn`7Dt!JIYJtc14-X8;mNu1j2LTF3+^Af_a2 zUHH(}Yd65}74}?Vgw;0%uL2I~7Jl;k*y90k%7OGggP#XUNzC{HULL&BUT9iRB77Y& z9SY!9B31WgcsLW}K)U$L(e|7JbX%APz_^N*g=3ox-#Y!bvM~82*ZM<8t~DL0-*1r< zJq`y%FqBVou=`Hq#3z`R%foR@$ng*qwf}o|2;nm%Ok$3W++5y2Suyj9M4cI|W{3C< zjhjRW$O@d>BjN9*n+6-554sF$!n^Q*5p0`?;IktWT~HoMLXq_YrZYlO_h+KBWqj_5 zq%#|YeMIAdFdv5R6|20~53*U8P+1<oz`(4y#BCX{E}5vv04b&pY{jpPbGV>+<>uje zKQ1TZ2j-PnK&E{XA-TDtFrum%7{q+97Oo3t7w>>Ilvg9%3a_X8Tv4H6y<%;a5h*DZ zZ<4mjfxPi|u>#rV7V^>qyg~8JX=J~8BotT)MLRn?iC1FWQ~ITD6^#CI-z9fbic1Z8 zF#pH6Ebz?hCf3w8zY~TSacFt2^)Jl~vg4sKGz;6@y;PHwodcU-{H)1x8K^T~$anFa 
zqZ{4wi$&=fZ|?;Z-_;Vd`BcxiM<q+5CYj4y>@{~)>29N(xVXYU<d?j`MxlxJ(ZLUG z`M2Hl41XNDC-$?kdK35dH;3fYW>3-FR^NSgI+VIvZR06^kWW1Sx}W0Ylv16GeNKOt zLW576X@k!~7eV%e8>TGjs;h5;)>-=90!*=6nV5*<h2_2}X8S4X@ndKAV%Hh@zP`Sg zep%loQLKoRii%3?g9ksRrbN~$*IR~nynFE?u(9!MWo4!N<;&lP9$NCFi$iwmdyG#N zcWM=cxKtC8=afMKy(iS_>O21HD>I*Y?+pwOe*)Vco^;OGxEgEg=;YLV@h3!kAG^Bt zX=-XBodw^zMTZ#GXlgxcVxnbj&2vw2Mex|MW1F^YiFxAE<8TYy`VMJnX_$a-W@a8; zUzs07qNP>uh4JlfSQxLA6dN-$Go*0Ri#cN3;F?WD_G;EWd!V+qwxjHZG-7~SQ*$>l zksEyr5=^7jb#>uz1rw+Fgn|NL86*)NPr`RCCG`OAfYQ>^(Qla#E&5+%XGdbk3v-{A zmY(=KpA#^HXxh1VuVbUt0yKqoNFg1aoih+xMkgfH;wny0PftutBnv$E-HDeYB34C( zh6F1D{+sosOZO@&l%RLL09M~8fF4oqLjCqF4pk9h?AJFk`h;fK_G5*T$aB++i!bpW zMC8FOs<k11<tGCalzb2^I*Rj>oSmH^qt}CTR!mIH;3p3c4-J-RVqzj*=JLUhA3wsf zF70#iUt{B4TwHo;YHExnTX*dE)Ytcnu+jebVdK_QL_<ZlggEOk=(70x`~MsnX=%OC zF*%MTla!m=nCRKDPui`=3T@W91W87EdU`eyi#ll2&<9QU^N8@FCM6{qIm84{&owB` zb=ck$v!ar)qBFCzF>B9w9M`_~|6M5TpO@$2;)>~q;*#iXRlk29SlH(ih7@=Owuu`~ z1+rC7pFeK^NJ@>$4-3=DyuDu)>qC>9pWh;j5ZySxjTbE$TRXdt@nW{&NxudMc_VDz zesFm4;zcvV%e=g3EEBszmJ)iN+ZY4<y39_Uy4&~kq;pY8Nj}B_kWh^YiikK=25lwD zJ)ObvUtSr{Au4M9`6+${g{MSj)YrE@D+>mG5W>m#J?8OqsX7l+`W=u^Akfoa$y(Sb zaIgmAI4CHnH8nLbIH+c4$Cv0VpTPbe{TQ!}^~JlPq0MH45b-t(Wfv60aw+&&xw+{< zaqf-8M)+ol>3#4(_)Z6U?Vg8nUf`*vrR5oKZ&^1tH{vj$`}Cl433cMY!esZGW7i^J zCw?UEfKe1$xHHj(q*ecS6xfLeXB2Hb_S0utynhZ2S=-oDq9K^NLxP`QBebBkwY9Oa zksKBlW+4YHCT&o?V|P6a5H$MeTMV<Zv)_IDR+2=d#UE97Pb~kupOTjL1__bv`(KLt zQ0+g2VOZ|k5oGG-k|%8l!!!!z0d%l*jmetKW8>qpn9G1X@vo6l42;;s_y}C_E?3ua z*J+mvVKvcFnFWa2XP*6O%5lp-kTbqsft_IItKN0Moo0jJyLY!DxZzh|KC%9%MOdQi zg!vI+;kaWG<gzk_Z{NO|L!$orBzqN77Xu^X2gJv_cXH?63JD1va=_FEe%x+#`3;KB zS2-i*)`1&jyk`$hmCi*rNh$|1f;uA(&$TuyNt_uXFWuU=KBaGH_yK1fqWBp7+7|PY z+1V?=vck6;>+0%K=8J)D5t|J<B4sHaZia@t5SEUYdrUZ#{GTB6KRXbx)|AK|nie@f zKaURO8$bmT^o)c`0ijN`f?;2N6+v-s*9%qtSY16hGIITlW}{J7R#wV8BI~2Ebqj}V zzqy+;SgY@ollAED)&i!n%eZrM&oUf2di3y-BQ~=?YY2-5ahkKPV=H=kdutjRMm{hr zRkgPpI~@fei5e@1AfeA^TsWCIjfMbNMVAK&I5^o|_>g}q62&1oxdh^B3fm(x3p>Z& 
z6mdw*abI(r-I!{7y}zxk4Z{ugu-t!=md5{UpN6{nXExJzd+=#NK|up!V;SN%q&K-% z<OhIzT6qd+h=)j334HkQ;Z<Sftg^!4JzW#R$Q>{4M<t3SAY&H36(khue5t9aIC_K& z{G<fqwrw3qs^I?a^X=Yb6TF9wO#}4h!otEEM|e2Snw#sQyAO|udDun{#*n1fSbGfD zz3=Zo5D*aX7dmMY3m4Z{0MUrC-KbSGz^ec#dHMK^txe&MB}5^x$!m$bZkLsnMU;;- zorw6~ks~G7MbJe?kln#>h+pT8u$R`>)>wYEP!4#SaQaW7s|pX%rcIj)Ot@;keBqUs ze_a1qdjEXF6p%n|tu`$AcCoVrPo=C(z953ryoH;OZ?l7g1Dt5*Aa@`<V<ga!3P50( zln{g@PC#l^HMQ=PNAdB3zcz2$bk@u)?Ed{-go6paZ0M>zFk=AlKheX&)O4S5ws~=m zRtxYG-}V4ZfT^8w_wcZVx*O4)=hAsx%B_bXgx$o`QwsGLaRUK)j`n_^=q_Xy5fQ<L z%*$=~=jP?{z{-XW^b8uitVr^J<G*@Kq_OI@pQCtWWMym8a=zoBB^4Cx-nDC&h2Q*t z$#JVU{9#d9z8BelHzFc6H#aIOip40$;xuk4MD?OjNj;C-o;`b>c3yFH6~=4@VbvyW zD{JfbC?t@-W6zIc{e{dZ;lSd_7j*l=y1GpH&pgxB)eZ0C$zT20+8T>%7B8S>Ynv=F zAOkFc57GF>%*c4({s0QOA20|JUH}M!=%)KO{yxOLip8#iir`OkdQOf5vQjAq(Fnik z4+{-7fnEeVzgR~(pKl+bR~cvS<>l3g@(6n|Z}Gr;vG+Q=+|_iw5I5x6*tj?`#Je&R z8otS?kG=tBLpHP>RgT;;M{o?>|5#B}IMSvsFF80=1FWYme*OCO-~av-gLXk%M@LLr zS^&oZ@*<29If)ne+U2UOto+!)mXKEew$CjnaEXpUw$^*D=!eYsAkyx_h3Rj`mU#dZ zs~IIPUkc>q<u$KT3Pc111blscQ}{&deM-W@7y+kGEOgkSwj2A0(pz}xxx`F*Mn+Om zQ9L|nxzHDIaB_B=TR5X(oPfus6NS_zP4B5TwiJ*59hVBE*~HjYZEbDBZ~~9W8>aaB zx;n4l#Um*+&t9~)x6eU@Q80F$?U;E-<&>AVcXV9bW-3an>Dk%J_V&eRLxtY^(rfrz zv-Us3)K@Z@)M+2Pef#!`N2JSt*SwSa>{H9&>B-0l1&+ogkoV?IQfaAqZJRjT>C>lC zP2UC<S9U6eB|$vnQ|}X{fM-bgV{4v(9~u>JYu~>o@JCNi&uDWhBeIczh{!$6-N4=J z1=}29?(2Ih9>PZq4bh2E_n35Yc77ygy9JffiD@bwfm1j+g`XM&0udk_(-N364=qwS zU7;B#gfZ|tv}dltLevDiOAV_Ch7kcn>Kho`ymRNypJGM774-1wm3wUdK_sB&!|aJ< zEHB1Ll>K>&<a+SfF+J0by88NW#oelw$liQrmpwc_G&jefhd(eeAyiN3rA}e=Eb-!( z`pe}w9q`_dFP)iaeRV%&WMYCTBzF**BDDdiSUdj-pjb3ULSeA4sF)a!fPkjJ1#b}{ zVc~0IslCJg4Q|SZyqqR9Ru?I`xzrCI)J&hd9feSe2j>f`^5EF{P#g;wjKrKqvQE3X z9a~;rHi)c(rGif>*ZsRHQIOl)r@1r<iMl;!-M3HM-97h$2lyiLv^-%m!Kkivh4mdF z6m|@~f#}C{QmOmIEy|lWHmEUi$FL&fnwlmE)u>~OeGw*zIVM9xLzt=}N=oF5i;FkW z)8CJdRylW$9ZB2)!YhrYtc;8pRVOU#(Cs}I)UK`sc)`wMLnb=&X=FSn;{T}a4a2A) zpc4Uh`;#Yc>&JK(5O5$cFc9h3eY{Qh$B!Q=DPk;J@X7C?{s_-lHG3j+`6ez<R%D=z 
z%*=VGc$5d+5S6T~_)y$>Yhje(yLIpG-51y_4BVo%p>T0<2tC~o-$<B9x#RR}+1H(4 zQ%>D?zk2nMkWly;#pR%v&wR8Keo=1Y6OeRg78i*k6U@&YqK?D7AW@J~m_$R|jPCC4 zRseMdv^_wr%cxmWem+Mm&o*fuVE@;cB{KcZ8_e}$A?@Hgp#?MI`=LI@GiS~KX%N$< zh>=dfwGMEzbaZq~L#UF6yBG4rnYlSzWLb?FO~fG>g!`D+0)&L&AM%*eL|<i^=Gn{4 zA;aC9-C>k%QHH*D3923&5In$=tY&6recmIS#6Msbix$kudU|>W2S*^z3keUufSZ`u zE0i|0w6shaQ|M21Rs<+L6tNtJ(nVPHZE$l(hZZcjxcWF`+!Ie8%EU`NdBSnz$PpQj z;c$JwevHIwK$?d0Bjw$@gsLj_i^WtH!axKV2@j_=Hd}b@E6SQ{YmuKUlzY;OpmxG$ zWQZ4)?=O=&f<j>!5)f?r9k3kdq$Z25jkBSoXiiVg%3{JqZeKq?Vpg1`@62Z0vwYIh z#tFiEwSi=J?AT#&$MXF72w+TTe45!W<6Mhjic+$p6YjW{t${Qe{pnRLE%%WjJf;U! zTwGo89QX9@nBbh@#Ef?2CF@@c2m=Np95vc2D_0vS6pHWP<q^4Ri5;IB8qU1TTg7L( zbcp^c%g@i(Ke%ORXh=p*j*|Nn@~b?^PB9w|^=)hC=R{!%DJe=W^XV<+C>xPW;dZ$- zRlXVtzU}a12|>XSczJY2mX?=)m9{^Gd0hYe`PVS9j>iuV4`XeKX;jfMF*onsyZ0}> zj*bo?4?<zqQSK)XN1Mjw-<L{fw%`T<&}=EWeu<bJR<pc?o*r?=>(_=yo#J~D8oCWh zfv=?4clkJ8L<o5cA!J}+Fxp?CB<}Dv_UpNFUs<SH;rrQaY;25eaGIVx6&Dxhb>qh1 z<m736{pconj4?Yi2P0=;R~xo@dVc<6d;7zrq=U(xjkvWB@$uzhm=J0|8TSFQzSOsi znf>gFzB1fyHCdUNypodh<qtGlt54UGSBg#zCM&uba1r0#!gt<FjGdAA^B>|XPKnov z{xwRvq@-DaDvkkE^!ylq^Pl?Jd{0AD(@=v140)nItIMqZ$<59#)g~cfJuV{ecJKNX zgnqVVMYaj&219p~;}KVRz&iui>X3AQO)iOUq2~%WjKu0IVkYBB<K$nV=~A}vC{Oq0 z1$dsuIp4Q$-|m*x%xTSz$mVnkFLOZ9WPC!x-{DEaPa1DIy_asej0A6OWOY<%FyrC~ z+@^f^izNL!+6uuG`o_&gGzC36hKld#f~P#ZHcaRqD97DQ2-w`6N9UH*6`8kuv?Frt z`cIXM-84m6i4Q{m|G#SetUrs_x{zt8n`6toS9t~!R85Vj8>D&V*QyC+G$M&iXRnMW zSO{Jp3<yZn%(uU*nm}Uao>&O3lba-~kQEwVFZ<PBdS=bhLoQdAtt~iZXeDYDcznLa zZIz#@JRmjeT5G-eD#`64XTGMzhDTw*+?FR%m-2TR4ygE-1)pcB`1RjUXMyvq`)3?i zrOnh=W5)|y%d_|O58j?0ztTEqL9uPl@*DDJY+AHtH==4M?pP{(W>@gfnE{Uhh1-9+ zXfjzi?Pwb{Y3d)58EqU2wKo5ekfCGWHde;Sq;^$Mdxm!UVCBr`h@+zP({z0rk9-tq zbZIjJWj6mIDQd`+ol+);%N?WoaBGius0MAAhIV+@cZAc-vg&bEabMUm{g%m;Tkhp% z>K@Z-HXcvjIW?{dX@<|6K1o(Asu$E7SQRq)kls_bsyn2f=CGmKxtKd%ly4!wsQ0PZ zP#&gu6i$+-w=%NOX?H#%kLr><xi!4Br#Dp|77UaWr7X|C3<*0X?e};?PeVq5PgG^% z<Fm{t3xNUQs!@Z(&q!=X(-*mI(stNzXI2>wKBIrYlT~@vh(w}0P5h34EiqqqH|CQf 
zB`Ow3JMG%$G)_p9s^0eYudUc||I1&$cVMO@lC!v-J|YNX;m6lE4kkS*SB}w5FeFPq zw(RD$Pl%-2Kl+n6B<ag#vOPTN)R}HiQe|}ihGVzh4Q|dng^Q0H+b^F8XK=mMA<}Bt zaHT6W=EWc9kT9n((GG=ej$E31$DP-F{D$mla}9ZfS?CVj7~r+Sk(p!3zd1xT+%|e% zHR$UT7I9C$2~YaUVt(GS!_wDTR+p4I*2>H3e7f2j<19Y}#f8}M6iWt-$LBe!N>7xq zJ1VOO7}ch~dO<tN<V(92?n|dL={y>2qM8+TYTQkMCBNe=H~sgppKZSEtepv=A!|L- z!#<&wEi1HAf`y8i5}YhN&tD1dPEeNVDx;nWncpV+>n_85kGf1(bV$_0C$V&Mk1Y?- zaYO}({s>|4xVGiZMcJb<VZW3^RZRz3D|#yeUCH<Ad`^cjv@nL4$EqD4zv!{O>2lz- z(zR_)N9+H!lAnI9>hqFCN97e2<DBb!=Exg#Tq;6Ohj{~z>0X`d*H9lhym<We9-G{G zzNCl+m8Hf0ll`pCg&`e^A<=V(6~0_Nnl`GG-gt*WU8{_Xl|OEO{nX|Cb~`CwdF<b4 zvg-Z5v00*f&iTp@&e(IivQ;Jij)!!<-f2QwC*R-pD?)|ocez^KkvK<zlOZ%A+u!l1 z%Qm{Leq4JOG!Wuz?Z9Zt-%PoGicvkJTXlThBZMPfO*v6FVR*Dnzni{hUV=|txPF;o z!!R=U<*Hty6W#U2{cZID!qL*pdWO|I{Uap4H3tW2`{frk&^C~Mhp;w?;dvREX~kK- zXC;50axCH9DIR>V?V=~|hVCV{gO{!M+VrlLV!0S>f88_LIIR3PYpj|z;>{kLpIJeZ zYvCsCiyDX4;*KV=2aO8O{nJR>#AR_(NSbrA|5@dHW*e(%npyi#-3MpcxKx-9ig3$C z39!g+-@11r^3Hqe(U9~nJXKb+*Qf`Yjt%Ygx-B)%M&m$c4y<O$KS9n3&l(BViXRI! 
zS@onJKKMX8Bg)~Ivry8Mqq^>&F&55YosfR_fbo}OAA2btTbG!`s2<8o=VV<RP8{U@ p%aSBq9?9_1m^;a9@#6+v%G24AyojbV_zfMTQz!M*vsJBb{tq@^dL{q> literal 43016 zcma&O1yt4D+by~QrAq{)OF%lL8xaseBm^X-k?xX4x)mt{5TqNVrBeYFknRvg8l>|) z8~@+A=NsR-cieY4#*4__d;QjWVm@=uMTF|ZdwAFs*a!py@4kYpIs$=OjzFLkW1_=% zP?$MX;eS_LWbSKV!hb%PX5sL6a#uNR*GCQ(t{%qD=7^{E4tD08E~d`r=Jqa@4z8PM z&64m#Y{(zF>uhf9YUN;0r(tDhj(BWtMt7T!PTt&v?l#Zu+jP9bqCEVfd;)aJkCcww zrt=X9I>de1I~tzv)}}qQAN{_8vE|T)eOHj>J;&9@nzqB%{+4#dICj)>uL_0>G*_jI zH7l)GtF>0!?BohHaodMUA5#mjNJqV*XUThD9<;`Ia1f>}i%CdVcYn?&tvi?L;8#<a ze!RD{^G*|0s4*F#3?_XrVWwQ|3Vy)l*DoeA8Ivx*vZPit!GeF32_G3q^f0}12c^W| z6MBSfo4Yu+i<?`>+qZA)+v(unG3l*UStDZN;_hwEG`>tp$&2;0HXkX{8LPB<^<O{1 zQ8@H)>twUpMh+DR;pgY4prrK3gY5Ff!RCqaV>h^;<lT%FH6}y9diClGe$eF#${yIO z*vd(G?W(#_VSn;FtFOQNpU=}WCHmyKZ4yl(5E-LQJXqsk$aw2kp&0JJr;+FHBt<S9 zb3inqgXr>m#dq&>{QvmYhko($JvBlP?DH@OdfH>hWg!|vcL!{0O#(&aKFLt<Cu*56 zvEJHl_vki08AkPy`f*)l#3xK*ptFD~{&La$sB8~Rl$^XTXr?Sh$bzW$S_L*kRX&k0 zv|NbqXb7=fuEOeyMsi1zWZRNIk@0K+2`&nv-qET2?#8-Jtzl1|t@h=b2e<OMB1o9# zbqX+lC6Qj>wS~NlxK=x!ukbDT|M-zIBE9ECyl4^<zT<RM_!v~-bVO=Q|4wiybtPUK z?NJK3v#ORBy47@5Q5La@1YgSA{?jX$Tc5Q7%UG$o<1J)rmYaK#+!6Wm<x8|wExUox zZng2Ue?K$-$bhKu2QHe_YGmJu$4f30x-VU+Csxbio^<c>36(DQW#vvt6t&}k>oQee zeIsu3>BJS<1@vz^4_h-|eybcqxqRmQ8^a}I*^{X4Bety<G!inm56}+l#-$Cj(JqhZ zMVIta?>E%YGP`vYy<q~`fBTJ<$R+mgBf{BKiT-_6iO*Dixis0;|I4?AgvDBW64}dr zPuvAf)D*O|lKywp<n$BkB?$iu2xayB)c4Q+{#ptD@DLXd4+W8(ooyX^`L@Aj5$!y< zkIq9Mmp+;Qp&})xt)22lrhxL%qeo+7V=TMhCnhMmy1G{IUtb>hkXkmQ7Mqxuo)yWr zw?YmkgIV`Ik9L?`U0vx&qY|B`j(Pa`i+Be(FFwA0O)g9oTi?!n`7wDpw?^99WUPX; z_4L}u$I0O>6%-VtPxjZZk&|N~mZv`nk%hL#PztH4t7lg7X5-@G8d+OoqoJYk;0DFV z6a8FT%E-&Z{GKk={rC5G64BVSv?tje)eGO3E_8HtW$xX(_otcQa_7~yq6~UkmYmCt z+tG0;1#$1+zpq_xay5zDFt)6W$8~M&nvlax$B$eUdSYy+g$^uY5|S?)Ha0dmzP`R$ zxw)gF1Dvd^cf7p3vM0ax_4QYnD^*$!$Y@zDZF{o7Kgi3Y5B~V^GA)hk<l^+yCo3n1 zK|nw=!%2h+zo@9l`hk1zt&Zd4<ErE7e`m@*60eOQyKROkdGz!8J24OZFJHdY`kjd$ z`bcifHpdZ29wPK=?B%4T5!g65*C;5kGgQN0-BMXt>P<#Gxz(1ImbN@q$Jh7u>&q7} 
z0`l@$WbfT;PTXZ<5E8m^?b@~Q@NmyncRoJ8N}Eyq^@*yx>5{(vB?e-Z4zn~`T3Vm` z`VyFGNG=jpe15>96B9$CqjAm5%vh*iG59tvT{l@-WrE{TFx50yDY|X&$sKRj_kg;@ zusIe!3At575m(?-9NpYHhw>j?BP2v1$ONp3-8ZHpn3dlkJX3Acgq=_W0|SM<_c>%4 zt|6oi4QbKPaS`d?yWigldhz;o#OKcnvT|}?&MB#>V;}&sPw{hdaw>jS`=0DqTnIYM z=-ZbTDKHi!>}ox5Ww$T1D|OWLZPe8C%P&nv1{~hi)%zTXzBCC+h&K7}X%bF<2AL2o z=VvFpzQ?Y_q@<aeg<6EzA)cF!2Q#7%Ei9O0X(S?BTBP+GJdvw!Sv9lYnrof=d$hH) z6R+IJ^ipAKZTzF+7ds7&gu(bpc#O~SlRH=?A3m5C6Uj%&K7aS_o%QwU$S;mgPEN%p zoU@~a3Q=f#d&oqReN-ObVPR%Q$H9R+Gc$t-c=wK~wyuu;=1nx23_>4CS_I+s>o;w< z`+xjspQy6owHe{4`(mW4OCcj8Lr09g{axzfv%L^6Z)<<XU3x;yeyHSizB#Z;2y1`# z^2GC+_wS`yn?FWGU`<rpooEefy_s^Z+cR8GHYOt`CT2-kQ@mLE{d?3J7e5S5NvT)2 zN?6s98yWRuEs8|!-%Fd%Bfq@5^#L+HJR;)do(JiT8>Smmbz{|bln5i(hOrt4+M?p( zLUAn<6Ncx{pTnY@PWJ7_*GWl9$s~P*ub`rM9c<tkOt0t#FcD!9UBAA>TvPg_QN7Ip zkBpMCSJ^)G-MfVNDXrYvT5bF6nt~Mv11;;qg3_9T9`1j8l*Kia)DhWf;<`RT;;}K+ zN+5L_0I}R$bv9ORn!;m@>g?=n<lqn&5P<MGoDUPAIoMw8t_a?54Z^6Hw6nA0HR-_I zI`TQ)^uH+7Z2Qa;%~4mJGB;9PuS0V4=FLjG$!hiQ#UDR@?0Gd=kg&O`S@|`i$+uCU zO5{H~fe>IWM>y`SXzs3!Uq?qr508xvjf(2kn@pGR4psXgKMw~p=UfLLO<U%<y+FZU zpbl*S6BjpFl|@!x|J|B91_lN!0P_l}>z~2wZ}+dY^C&!gSfWuC6CEx4<jIrNSe=&{ z8ODiUigWe~Z{r3*41{=ltqzVB!hfn#rTp~O6%~WE-4V#}-OC+eBt^)_(Zs;>;?hb| z-HWBluQXsJGKTloZuAl$3%vvT>$u!c&kwmU&kIFx`jZEgCipX}JJ&KiteMpBzTAI@ zRZ}w5_(gm?gHYYquV0G~wowAQ7r$JFE`06?quub{o|fj3rn_NjLvYHm7ZS{T&E01d zWM#!F)~k^>Hm3iUDi$W`d;EEN^YrL18J`92;NYN<l~u&7-w9s2-?3G3_p1_oQZlme z_wV1=kaAm2Ru^+**iF~FrTXWtbUjp7E>0MemzQU_b?Y9hDH$2rd*!!+iC*@4)wcNp zU+mo2_V-4He!Y)wnt8+4JT!`IUu7!mkiA8^RoGECxaFbPI~{CHvrS%o(&Q6SZrZ)j z_nHq8p!3mU{&%{6Xkg%-)Gm^!&RXXsY*aL~YgAOxeSHtc1qlcU;4t7wgvG>0{~Sh| zkfB>ySvkL&TwAlvjy5(j3K%KY>y6j7dOiuoI%JZs@}&Z!479>FPJD85@`}1}5~lT4 zO;>Y47Z;a*1#cxHk+Hg{2p3j!H<`=er#&|pm;5L9^%pN*Af%+E4i67Co^e1ez@rqL z0~{9y@Avy#>NPSlOd+8fuWrxzwyT&SGE`#jYsGPAv;G&pu)t_CUQ~R1{5^_XK{&+T zWbUPErUaGR)w@s<;Q&g8T#(CK{_}^`9M?|$5gb!bX~l^byZwizClD!Fx2rhZ>ApiU z@Y;^?q`ZB5lZ&flg0%`kCoWEf8i>Hc!s3C-5lbz`W6?*J-Jz$h{+#dWH)$5)g^q~p 
z6(QGzsqnq_*RTSm5e^Ox$kuJ%FacZj=g%M5stFDkJ!%PwMyPx6m~+2=jj%xed|q8G z&hNaV<J&mQ)OLAdSs6c47?X_2c2`FU;ECH4SRa+S{QgD}2lzJ%O3O_a7EA<>{gkln zbUg_(GqYk;<NNoBjEoFOAX!UG7DYux1VU6)booav0d%-fJW6yN9GvVJ9a`i%e%JXP z?PRORmKGM`cz7HrO4Qif+b0OK=4EANY5hntH#g6yR{62x+Zb6XLPk!m7KH{uT%5@Z zF$6s%!?Zj0g#shAynwK<dokS>Wa#JT=P4pC%#o3ia~m76Q1wGL^z@!RdnW9(a3xrV z{#%*^`)cYCRO0{?G&(}etv|WxnfG5YE7E>yx_$fhPv#o#Ck?7S+iq@d^S^#+vDUeH zc;x=9PP1NFEl$B5%`Q#uUL3YI7rYFbOnZvN4%8NC9_7#eS|F<nM2`BQ41^e#WO^L* zIY^1QWJAB!-9JAx-7^3@bU=(FOYHW7oB|^eqV!cDRJpr!ge|?jIH94TY~+SIxB2*H z6|J)`(vy;M5*AmNmm3nN6tC;U?uh!+$7z%nG|jBAHTSHoCVGfn7FadG7$p-IXJ<IW zP-x~<l0IapfztC%fF8EjYi3O#q!9rbcf;CdXK9m?lH6hquE9e}D=1*XhLk*MWcxJT z(Q)N&C~lxMopiI`S!Dsk>%>GRkC2L;NAbg9v9b8NRo0q+aLhM0Hs-cH70H8Ga(oO+ z894EKmXXoYIx+EP<|ov){G=|Xa>UkZX|u`B&o7oU4I(BcCe{)JPaPe3n$C}Hkcdq# ziWG*(fZW`t1~R2p9UZb)Ffi^y#6rSWOmcE?@c0}$Lh9^pOzYX?R`q=Qmh^TiCo79( z=}@1mM%{T5iDayPgvRUPsl$J!ZF+v*2FG4+`K{CoNij@KO|6~if?vqX-h8E9t{n{n z+X9rh%gXL^w4)pM8nQW1I$&$d(XmdSL$4YaGF76G)=Ki(j%@YXdJqPIS<U3wm{CXg zHA2iQhqD*w7+lG{x(^ECU%d)iUS5Xy1n{URBAMJ9Qdnq{oi-_GH(}}1i=JN(4Nq2k z+_!P(Kb3Am8P<R}gfcew=MNU3R)A^kot@bui%E%z#@h>>+MnDx5p#=iQeB(P7x>=Z zC%UwSO_`aQW>|K%wndjsCz?ux2ofDon4RN|;i)P8x-WJnCYkE7-{&0*9@s<sNw|$} zY}ncKL~epAVPI6x;O!+Y3AZq2iun5Zb4v)$4Iby8pLM4lc9#cW-&?lk+ns)XekmH~ z|ErJO=V&QeK}97jCI)YRb2csz4P)fUbFY;LcVniJUC9#KBlG01g#`w^{c1=wJgy8* zZf<UZfq{X<38FM9|B{477h$T}+FFg3I=de^c%0WK5xlrT0O>*=TPUfh+=JFw-1TCA zkL|WOIWkhiSFAcJK5E`qRFqjADOS_g4u?XLp^`4?CaGm@&DM0fqx4S7p9aC>v7rwg zLb2Ge8p;)5b7Uh{RqfX^d}IA1C&qRhx$AkH^3{NR0D$z~I@xQTcbr>W;}ELT{B_zU z`&V;xlzr1(b9YbcLHH$XB+eZ`ww))96j7v{GXH5iUcU5-3@_*4h`Ioi(5-$V!43gn zt7+2Vc9S&92@nI6^x)v&hN>N{+=r=fFa{HB6whHS(p0qnkeLa+;e919phLDQCP3S< z@p`E5ogpy{Z6C=SI=CeoBC7<y1x0!Xpso)9Z^Pr_!qU@eeSCaYeEV!hOXj1vn_6Ia zF}AjjgdQLC{Wm+k?h;{yY^~!wDkLWoxWcKAS0Bv!(*l^hc`r6NC@6^H!w@de1OVvr zhg{yic%iylprc7eO%2HGO>N%Q0jfa=h<`_*rlwZh#ezhvZ-ZdM8~GLujfKR@hX@C8 z>*2c=K%K_+_OWAQ+Rz?`PEy9l$5Vuz7@(z~BMy&OOObUg=H#~O{rgx@IdNbRxqj;* 
zU;#wSYFwu`^cKl<0%UKQm?#iY|EjL7%_39>M95=Z=gCaDD8(pYKxd>D78Z~(Rz6vO z_t@%1^&nj#&#eXAE@J`+1;%pR_V@l5=ZDZ&dChx>*QV+oS#?j>x!h_x+t*7-NO%tk z0DykEKO+b_g>6Vk2qqq02*kvH+Dk*4UjalBWGtDWEg1&~#}HNiFE$R2jt}?Z*!HH7 z$!+z;?(eU!?0Zwym*<q}x^wJRSa{0z@qgpgu}qTt99v$I1@Q1P5EW1v<e=Xry04ua zY{JRhWM)Pery+PNXxH}h%PWE75Mjs#|I=*@WLJ47a&xs2=Bc|E=Vt)px5l5Q58Ru< zW@BS3xLudX#j0-zIKcUp9>9%ks`M9c!aGgOEG$L}Ix{4Zt?d%CXO(E*H<4qwMkDU2 zy4mq}(xJICNBJ!jj|e~x?0X@lu<g*^J7yZaQ4la|9PW)8Rxz}G?&)d!Pg4O5qBGdJ zxXt6aWdz^|mx4c>^lDzv!h(s8vz}zD@980Lhe^45SJ}spg$ZNT4zp=&E3)f=H=*Yi z>r$-@j@p=;XFEm(B_@&}^A%B+(D9_%kJ{x|U;eNO#?H^5KU=1!siAKR`yBFIK@E&& z@VQAxhj^6pptbOE>D<qsS0F+*ra!U!QfWThUbqTF4k3U?_QxgvhMX7l(CRSzOAV!} z9&(Nl!qL(R3Ick!1C$Mum&MmNUzF`RIXM%ZePlE;pwSf+ARAqA!H$w|V=(^%iccnw zHT}-^9zRGECx;#N++BXD$XwzN4+e9}`PqTt^2&<IQcq%vkVD!CC&Am4l<>JZqbum> z4C3PY?w{)FL_!H@@u366Ngjt6P$+5uWP$=#lY<fkP+nhOzgQ}%Sn)@WG9WdeCD@k^ zgUj-fz0U4mulpJZFnR!A7nmZLXI4k>U=-GF@Z{Ah(#{CRWM}8(?0kL|r_5<l9)awn zj&m(2{B{!>OpE!tK0Xb$iDeK=z4kTb<$*OdLa=0U35hB$wNX+?&G4wG!XXz*O3K2@ zpCcn9TK3L`(c|Kn1H>lfhs8%;BvvkYN`?-0S{8glLb;<4Wrn0zfkyc6{iV<qC7{1w zHJVZg|MlzFmJ^k%eivsRyX%vW%@@n*MMWu)b2%KHiK(f1jhL{o5wrrxn&pWqc1;Ft zAj!1Mo%OV|auRr9B!n%1qR=nGjQ7Ota{jiN94-9t;gQ{-o)aa6M<$i-Yd}BC10R3| z2rzCBorf<K#Y)71&LETU;^#7`zk(kYboBSP>Cw&-48V{6)E0C>bD92x0bv5rOx*Wa zqxx%S=T#_e(B`Pc+_TKW0P>l8o*p`7b0+}vi3vNSNZ#JlqiF98jEvInxBS*iC$Jk% zs!(E;FH3CJ?P6t<SFc_T)VtZhOguML=K^3eZZMWJPMf2vK(nyScKkM=EogygC@3h7 z?(SNDYHfjPou8kFvVw_?Eu*LyX{!f4333-1pgGEv0`m8Py+{*(rVhY|mzP&j2^pA6 zC8|K|@+sP5z5I}sbp<vVyG9t9$#tcVUkYNt3R6;2`u`^Kd+NT0y;&J9EVmqBKvFL- z_!?*5_Bz@z2htWBwunbas5?!<d+X?TdZCZL;C)6S{YEbmAPrhuTR(sP{K9T^DE~SD zX85CTU@%O^#HzwbL@z1XG<c&2V3k>v$xt#>!l^VpF)^`vZ6b9XfxW$+!t;^p(oxvk zdChUCx9W!61A@A#|40An{`K`Ougk9o*)+{;uzWHpKk@U^zbc3pU~|g-&W_PV2nrP# z+W<HqGZz{TosiJuYM55+k>!=w<jNZ%hmtOq`<fjlybFtqP*J-&I^M(V&uR)23UuT3 z^>x-dCRwbXdzZ6c5hU>?BH~A3zi;d1CHgI0sx<zYfyz7KKqMxenxZN%FE8@F{??ex zcD&*$RO!1`R;;3~%K`8d&_0q9w}&%eub+NTgOU$}%`H_`Rl3QMk55$qCwh8&w?neT 
zs1eCQO=V<c1h6bGCkL@!KXRkFAm#PzFS^%yZq-7uNSGqZG=a8>z4uq~`ohjtY&oC_ zN37kDSiUDmp!b_r2uwMfv$XyfXF@p-UO!J2^QaK@la;*!iMp_`U}R<%%&Avxbb7dr zcg|n{=mkhQ85Uv&etr$D;YH`V|C(?JXdf0nv?2%m32W9|sf&q?g;_g8y^BXcAo1-K zFkB6-QVy5{8W7vV%#9>9!`Y>yfQ080$8BwG^GcJ46jonp&+R3sK}G~a2sdAx>uoOC z*|1AV(L!x%hmipVVd%4c6<9YcU}d!yRg-~uJl*c5g?JWr`4w$z%_1QofqWRqt90Le z@;CfenZFe78$g%=hWsBy*G+@Zm;*e}6Kxcq+lbj2LBe0+*vUiwa9}Hteb{!Q5(}uu zTA!l|T`n&Se5w|};cyfINt}AH3z<sETt(d(4|Q~Ogx%N3U|M%NKYeyN<J=;k<+mh+ zO>AgrfKAN$r4WHkM{)o21}Oa+8bf;P%zMTrCeN$wCbd@T++1C=a7gKOGokKg+Tri5 zBWDb#yH$ok_b<mBnCo_TU9ViZG6(QURf#H&MgkSs#YXRg_qn-Y0yZN?`)lJdJ0>R! zoj6n?&Q~C(9+&99oVK^Mji$dFDoscD-muwk=+-a-c7u_bxkuUFu+b|!fj2rPW&qZp zXovOi;lr4nBYB0pkF>S53lqnoHR?G5i&4-dPNQ<QnKi!<{T1Xl<S&oaAfvFb@Z7+M zdpv@IT1r%%ot;~Id!4W=2m}D&LzuuQ_IA=?Y{10F4+YTT=~=tx4nP3O{XpTP^*`lp zWVfFq3(ZfMQoI1J!4Rkva&qW|si~=uN?9oX*fb<Hu!LkmA6HRm5V4a+4HWAM$;!$a zA!mtllMvlS5twni;%MW51sNM3p9gSvxcnh*h~gD90MC0yMs&zi*_@FWUl$Y<+}huV zt$;!7786wr=KsNH742CoXz~JQpMV5Q_*`6E{R0DPJs?)CtCg45<OpJz1FtmtOWL=w zsvS5+q6k8A@`#72qWNAdr+>doZSC#`2L__RBr_yQe*x&0*K5}jS=&IJ0nVo@PU>8N zr27KKp_@EBgm5@;e(}@xKmfkHc~1ro2Wrjk!|Im+{aIOAfvKqXab+Qh>%I%n3(#DB zeZ9J~66Q;&-!Rr!Sv%#e-!RMu6jjn^VP`j;_ZLemer$EL)D%FOqT$+mHq)-?R;V;Q zz@bn{cqJtA7TzR^K(+)V35DC<^5e&K=n9qtnIZLee~{s}&jFJRX#{^fk9YHJ-+uha z2Q?VNSJZ7q7K;-Fw(5WIUZ0hboY(y?ob<Bvz`FGmo)%=5)|kjly@0s&>MSV$dG3ST z>Ie%9OA7$Y7!}E*;6x`IXFzPx<b3G)`T1pKWi3s<Cp>1K3Ejf7oq#WbnhQ(;FihbI z3B=>oFdzeX3i$q8XgP#QVHz}$%1VC7Wq0@L!LIpj&$8-ji!+n{5BCV*xguu}*9WCF z0XqGM%+d4{fELC6YHV+HbhI|_j+Pc_MMcH?(ozD@NwT90G5^R7_4gD0`t>WZI!czI z!RJVQ|5^|XhJAf~+EE0aXBcb0e}_T5SBRD{0IUO|)C3AHqLlVKz{{V%e=iOfKCWQ> z6fi0g>+bGebU=Vd!H)?iX5#ihT?#f82w&$(BH%WLsh<@4oFDzh#KKZTDxi3Dz{HtN z-2Nk3wZ3{4VPF82Rl6*$l#GJH<m`BFd9tQ-Y8{Y9we2|3_V%`-h?eQn;vyR4iCdUj zL~`;Ch`j6#9&5&z3jF)2tZGznJ}`R%6Y~P3*bPVGg`YonwLam|2eTCT7&<`d)AFGZ zH*ULfcyiJ^os&4-GT3ke&ChbY!aVT=Dnde1QpNGh+0LjN$YBH_4K=yy1O~YFmX?5F zv(MkZ6F`8P+ciwYr%eLg+1lG%*mI_+q@<>$)pg8o4J1iX>^+UB_W72PktEVtx?p~6 
zh<q4=c=-5&fEQcaU)f2_%w#Zo`V?AM(lGbSqmz?G7(z?jX^S72WKT^^6)2kw-_fOq zDiH)M+nW#h53ZZW-+M$WH7IO5S&auNxq^?@vPyFS%vC<9z6xmOPoL(#9MzLE8C*fi z*BY%~zF>X(_6>dvL9w?h5=^L0{YLFlJ_)v@){55_5)uMRvy7McQXsRs0;3184m{RF zEP|m7m-1W;qUt)5uLip9|Iv<Pc5F%)`ou8Dyr@fXau)p22Ss$1UO0N)*w<IMt25)m zNa=$}XL{5>9ltIx*!lTXh)}Qnt!eAf6CReDY`R}6Yr?M7I-Me(Izy;;pPm>wvR_YH z{Gl=YPHF&cO{eUJQ$6~NG`st1f)#5jY`tJ0_~!+<XY}J#MLLcyzqsuXhw$8uppNP2 zGh6lD^zsh_lYg4)ETYmDHfMutUw2={jx-+j=3w?)!}XQh0KS<bpOClPV_wx$H1eZi z=I0~6zWpQO;^^~F1*9Ddhp|DJ-`QmLdl`fYQnAuoH$2aJnA`sH>6QN}{XNGc*S{)E zG5c^=_7PLo<Cp>>_AgCgK~Y_qxnZ}Oc0NVAj_aBenVoJ^qZfbaywUsEc)dDtdB;)Y z3eNY;l*Z*3zQ1`$V>a;OE@C>9{GYr(RkvPvibB>+5U{wWsoiLg=JVz(hFdZ%P^v<+ zY3>ba-ux*^=iJcz$M7%LzwE0t`{GA*eS1F?IFuxO#O)*Xn{ln{TUX8BU#69NQYm77 zSGj)PM(j8_?b)u85Og&hHE9dCqY)$@nQVSgzlwbFHww|YC%1xVr-(m3F72w1`8crK zF}lpR$WnnK{VsgAD@H~k>JR7*)Pmn!+qTuxTYk%;Bj$z^&B^NN@Pm>|yN~hy&h3tR z?i!MX$}q`}<i%^3q+P9lI1~D+FsvoKxAU!A1*1a5clleJ8J6vf2vI@DN+J0t?xeEA zR4QKv?tI^ocrdefL+hEq)#Q_XwkO#H4hv7p$sgdM1cX)7HpN+Rmn<xx?GM_1)jT`8 zp8KoC{M)8*#o3J_osZ}?TCss2?mUiO`ov#y3!#QRuz2FBVS;)c4Z#xtEvf8Vw6v!! 
zuj(<?wFASpsc-n`h*}F}<HGciN>`_^2cTcZXoQB1(!B5JBQAYJ=;3qNgf>_bJa5d3 zmLRuyeD;`!&)0x-p=w+9z(S<fd0SHPW#CQ9gAZjjLgJjl0Zr^(KI7N#H?OyjKR<4M z_B1<HnuGM_?b}2E$IMJ?3tkInR$-|cSz<~gBuu`Kiv371&x4SDz8ht;)hlrCkcOgf z<=$83hZ{V1uHoYCvDxuaRgyae&7A7LdlPEwd#}>Dpuu7BQTvfgdaz-W_eYQUr6A$o z;;1gRWA-HUc!8~9I8MjyLQ=^&PZ=J6$B*6j3TVQq_td^=^qqogbs48d3B$4ZjIn6d zUi+^v7QO6EnD<8|aFfpTBZn%fmc^=`f8fHH+ix!!OA-2%oJBA27AIHhVLxX~chIsp znVTg8I|X^ACH`c?qi`>-w*Klp?{(8`LJ}I9Sk^~5WxN_^#G&mD4;)-+DzA1*H1rR* zmi;=(3px}*#n%u0u}jPQEnlr#pf|VwsQ1qDxy`b80d1c)!3%!j1&!deTD&ms9Fy=M zF2qfr<r5rI*U0XbYiGQRUT~~pL?t_EIf<9CsdCCshEbAY+7(|qr@<OUe3O2Tu%tRA zn8~p2dEq4P*Y&;apCg<N2@T^>fiMEbM|zY9K-CHzi1*|EV4w1A>vo#BNQ9b*7*E~- z|GU0{%CWzWXSw9U{kBZ8MQh52ifk%@e{bCv4?TNnur0^!^;5yOv2A*q-Eqo-S@63T zZP&%LZu~x3Inz2%r-%3i<9bKO7d^tXn_A0F;079Sb}-%}_8)V=Y>iAyRJv0qG_v^i zEN9Hql|)=w7%_Tn(Jv(e-@~tM?CK{%XAZ=qds>GJ+O1lVKfUDd-Se1K@zFjJ@br2x z>N2La(F#QkQ9l%UR@-EON}3%V;y8N4F=ru0Uo&0mS|&b~jPC5U<$~||JLkbeYa|<I z^_Xu5!;H=bTVIf<<Jg-_7fJMAIHNStECsD&jJ|)mf>N~A95(poau+!cNd1m?yt91& zin+(y>g3Z39CUg4)bY?f3i-s-cBK5=dcd2POR6alFTJk4KA+FkWpLo<tNCRc-HTNo zrgF`>9Z#M)ine}*NAaoZvg_}npOw#yHFr|Ric0@!l-j=c!0M}qUGUGe<^{(!tk11R z=iYf2CsKo+i3<^qSR<0_Vd<oMVU75z6fP(?7L?SyxN(zTmn{GNw#qH+Xm=j=@=T9m z{l0jQbu9?emPLd2#<qv`#m}7yPDAUYF|untP6;|r57rL`E&Y824p6!_W}j!odP)De zOLcKmaev9Pq+CMO%P*KK{?S(R#`9-NSIabI&?i`4jm4k(03>=}HZ_EnxqHtuMSE*j z#e`A6E`o8imF%1*ZhiShy5aNq=WR6>@`U#`J1VPiEzaBS;ZP$oiK!+lDQkB4i>L2M z5R3S~_TRV@K0xQ$Y)W@Q7tA7k%1GFI%0$NZ9mic1A<ONwA<aY6gjx%|o3&5g{k5wD z`qfg!ft}Q9p1gTvq`dA3-+3G3*1)PQ)J6`k_~LK;V}tTD23k0+@!!gH<K>T}Gn~7K z@E5K=?CxQ|8{87w^3GC(*NMG0Bm>4bS#lGGY9&7tNwr6$j~I1t{-$GFn)_WY`T4Iw z4C{1^<LoR3TJ`Oe`d^f_OsQQneC}h!l|&LcMIDjX&)Za_)>}4{E7US;XiR;v=RW<S zbQMaX+c<R6SE>zH(z-KuQ6tsuis?72ARs5L@%QMuRC}wz(|(`Ntd(~9{(G|d9iHyz z;v39ii~AVs^lx|D4sV_|NE6gEr<=+*KO5WlbWJC{8twNN^LO3?#8Q2B7zW|j48LAh zoQOrI)6aN}Q^I)c^UD|ZKU-|ap4bj8NuRP1-z$;m`dXGF%<;nB{`owHR@1}VohsYE zdlC=#0<k2ewuHK#X|r%vcDad+I<>v%6yq`5j#Yd+W8g^oy<3if;n&rahf3R{s2C%+ 
z+D4toqdPxgm~L+<>dkF<Q-m}f_a}bEW(w5q2>4LQx{kk+RoHU$BkcES`|`mP@BR+m ztPR&WaVH@LX01K_%GU1L&CV0nIKn*+f@8Gi_NOIU7geQ0u7==?lNROVy-*kZwYu6= z=cm{~$op>ee%2(1K3B@sGa+u<7vXY7!WY>oR4%T^)|{~-I&5dhleO_<6kBT(i)f$r zSMcw0mBjjm%n)$7F0VCaPyEd_4LdlVD5oT}dSUIy<!o#_OKPyb>Wus>e*NX;$IFsS zD-HMQ>O}Jz>ZWigN2y{T7DPs0E!M4S18s1eEb<a+@uL<omRDC-Bb9i{yVsED>xKM# zwQmolFP%e`l>ZmwmoGQqJ+`qXUoj+2M2(K1V`aTRO%Jw3dBB41K)nI7&8~0v>|LL% zK|z3pMZ3vI#AXo)OmUw>Lco$hOB6c;o|YOhDTo7{-b<q-IQ7Yc0O%p439DBBI3TlJ z?AcD%rIAbv{7n^aN?CelXPdix0XKG<@MeDau>DPJlV!;hG)zPQ8U{XmSx{m2g~%;T zl?i5Dw=lZo-Km*|XV*c^G4nkIvS7GY#m#{z+<kChAdQI!M(<&2RQfEGx$PN_-jXjz zk*{CtBu>KHx3{<RKK(|e-|T0=+KmGe7EFmPJkjd0H%Lj*xss!vG<vZvd4ebfgj!f? zsx$NG!NEZ{sJnA>bE->{FpiJayRk<`Me%}24t{~hb)_(!0>6`)FqUAq*cG$2wIvO# z#aA)*IUrVmdgKL-T5H`8csSsf!N8=VLg0J6+v>g9h=HHGqo^q1y`SJMHt^&+^Me-P z8i%PvWv-y!`6_lzMgb2DUt8U@{-Dt7TXmnF9n21VeEQwm<;6t*=xFz-p})bBJN}2F z2W1qaB?j#tv%a~Nj9TM<r`z{{xL$x4YIv|X!<>kRG=SCEPZvx(0Go}}FZ&B0a|efn zsAXQAgSlS#*$yK<wW!9rANU@?mUPF-seG*(NNr5ezv1s{ZYi=~fuRGkEe@D@w)L;u ziR_Q*h$6NI?$K)2sr=J(qP!0_hCLfVWoUyT_@l*F85qUi7Zwgr8wI~W2kJnw4Tg2? 
z_0_c71BFItK=%b9?GHOky_h&Srt5Ke^p+z<<nVIXFv0h5rZ6zGay@AXhou>TNhIaA zIrd(Upb=92A1$Rl-doMBG%eAuBZ5N##zL;@q}bpSDXq`eokWKn<p&Q`b#)`b2s`)r z<xOGdpP1lq8vv;Qmr9rr3}n)<z64Ss=p;DxY(jJ}Po?lZ#YY%{Yeun<MSv8DLzu4U z?Pr@}fH*{2FM#Vy5wxQ~y6k`$%oCFxl(w5~q7FEnnF0wf1I!*kZ*~BSg%r-gvvcL@ zRbwFcKpkoSwAmB|O7%PtxS+<_hN*zZ_s)G=v_+Ua!M&g(ki=2~CY2OXH@1M$kHYXu zdaywNZC_Ea2Ko`*ONAjSRx3+!ad9N=0E03v`RyzF`}+=dPsz~l+_?k$hh*`t-ME1P zd53&Lkd~0%DEL5#7!@GiVbWi$nf1G?tNZ4S-%S}zH3>x%O0t5kk+@0<Z!jrB<YC;D zkCxWZAl5XG=(+E`^ZlJj5YR1)g{4gel`8Okuia%7e}Df6Z}_opsieum%n!42$ZNi* z>WN8&gL$e%lL$3|>W!YqrpcUo=!mlN@|Svv(@;s^4F?qp%aOTLWpiVHeKM@Szh6;P z2mCKUug!t4?yje&FxW<Ujam@4GY#ZG;R<M3fs-i$R8D$uKPoq~!oZEeRPLFYg;YSM zjWxc^%+P@S0DDtqcua_clarkZMIc{0Cb2*}F;&u6RMY%^;>u1vLva__)2;&}0`5E+ zFeX-Mj)OjaUaTVkE`+PJwCQYxm5`c9-`LvvdcL9(rdk0AM@SPwBMLcbR(3Woh%8S{ zS$5LlOuO3Jo};wr*E-&rp0`XDarwoANC6J;^E3@HaR(?od<Zfy4OmQ8+eK@0ScoVN z5<)0PqqKm%@T7S1?%liO-rnBm(%Jstq%^gkY0z0LUtCxqgF~~JYYD`S>-JtH1Ja}~ z{MyYFARuo8V^wgx2fjcn?bo-G@-V(q>Ukhj)Xhy!P3;;314DnE%WuUKaFqfrMGn#f zs`TF;qNQ5z11?dI4P2VIFc`~gNR>e#Xob^{0wY%O;90FcmyfS+x!ITa8m)zVMR+F= zSArkqKAZ>Q0)!)P8*h**c^&7Zk)Z@ia_%!LEc!S1rgiw$s3Bxjl!vwRK?nnjZD>J- ztW`llmm=#2Hobwx?zq>$Y4ZS4sgcbTY<abyL(owprdUj_pj=%ek*P<+KtPM_)-4Q> z09%2cVzUFD8^kr5BrbyjRn2OKf<Ypg%IEbTEi-lDWwdgI|6Zp42wuj|pWx)^_(Crc z9C~)C1r;P}5O$dnc|?<#st|V8k75!t3pmu!k)Ksi)T5>GC~>5{9qm7B$xB1Ns~5a$ z`h=Egb{?D3iqe2}5izCacw`>()NjB*@KT{eR$jh>S0i<8qAF^<(YvO;J#P?nsMnyw zcu|{nMv~wnav<BIWf|1e&E6nwZqcQs4vn!d-@NGois?2Y1>8fr(+AM^`k?NZ-`GP+ zgAs9YrH{pB#nz@jC68Cz={3eW{r(2MwS94$_#rBI4PS#Rt5?~qPM?d<|J)}hFE8(j zlV)~V*$WZ36=FWnSGvHFH+V=xL(>OZ%4<+Sc|g}<iwGpllr;1|69D4_^+=bgd3g|2 zdSmcw+2U&60*;uX(M#FZmc!Y_Wz{XZt3L8E_#)!t<5%2ppA~wt_H=cjgM^19l7X5l z-X9JJUhyUqQ}&Z?mgdM&@G+9q$AQSXzcC#JIw0`U4OPnjaX27L8nimEPG*(P;dUA+ zzi2pkr0ob<c{XRVhubl&T9Drc_83;i6p<ZObv(fKWbe^1+mpzh{e5{eGe-GXs{4h$ zFLm>4Ba<_M&fnUazZ*>})ihYWyljco*KkNVJZ-AD&27djm@N8I0-<hDJSJELzj85+ zJZ3(Lu6y-0tD<3T^E?#+t;m9kEUp#xm_+rbn$?*r1(kpvkYPAP!BifDscE{}Jvv(R 
zWLTT^ndUgS6je2oH1bQ}ugZB4EKQ1U!AlIL?Z<lA2DOg#&{t$Mb!EbF;kA9>BFv}I zy9e}KC)9VuP$LDqW;q)rDQPIOH9<H+Y$p!oQm|*=hj4;IHqh*UVO>}$b+&umYi~vF zzuqW1KcBVfWJ3%AVHkD1-=(Ed&L(Esg<H@=RF}k|gZh30xg3aHS>!7iSR(9#^MtB5 z5_JBBwk)_tz|geirD>YYU&p3*O%u{mGlP*dpsGs1*3NF2T0U`YZ4Dfk`IU<TIo3rX zA#zph)Wabamw9BTt+8FNK07lOS((A6_p*U)9mo-|wLOFn8!EE;k%Q!X9i#=BKPW3H znL?TF?oxHj`GSMUW2goeQT}6OG8;|vX46T@LkOON1#3eHw(Q}i$`@?s`yhlrDb#Er z*8E{@C}cmSy5d=&>bJ-92bJkPgn}{NWF2vsF@+;}tcDF-v_Xp8urVRvmv<DVatq56 zc+%jBa^uDg@XFIOFrb1~rgcmP&VUqNQ*^KqBM?vxyTGb2zp#+GTOPh&>#<1#?XYEW zn~;{abV!8)OusyU>(+qZ9TWr{2X)Z|C|IPyjPj8t`1c$^D}5*Gh7EftV{5ynHz@(# z^gksIl;MDql9D$J5G>?{tb7A3G5HjQ*aanBifRxgSq0Wb8h4RVVvsn}*B8CLzn{a# zNv|8qRvtqIaTOZ_{|>f&WP}1(Kxib|t!coSf$(uzTolkUHaOno-GCnrD~2BhtDL4e zRy$8YX_ulvM}e&zM5(M8Tr+_>Ml)7)`Jsq44byB2=YKUZQ*J0#?)u&bYumFjgUeVj zE~H?016LfnVPbK3L1nNd7nHAZ?LLq}J|fH4Oyk?}k5A)}<qKTVP`=0#W6PbF6wwhs z0QtqF*%wuF>DRV!XqQQ6n?jTO!etNM12*cyOwhtYzyI##Du5lg0KvM=8dtXI?BsxN zvc}=@A2P^oa|Dk#O<WZtT+(>^n#cJ0lx$M(((R_ErU%eT*vrd`^YX$)e2@A0z@6ST zJ>59Cxj8@I>T_{+&<ADlbw$N(w=gBlRc4j6zNtp<<f?6Loi!xF-7uSY!@$dX4Vu*o z{s%?C;Ge&K)%-(-=SETr13CzN`q`#=CM4W)I(m9w@?&6Rl&zk4+#O48RmF`5^n7?$ z7Gq*;MjwQsxxGC;V3l_uxNle+f;XxYS+Kz2UqM9h)>EA3_+T>*(s>?qHqV`(Sk?B^ zY#w_&Ah;uu9^hCxknrp8-U=oGC<2xk6FYI(-})5mrmjlJf;?+jgfFNL;gOM`75dv$ zl^z~?j+oM~sRK0`Ng?b=M<eMIj&+?`V}cZ}B_Qu$0F)Zd8*WPYIy_7QW#%53jX*M) z(Uss361qW8PmeT8QBhG@?(xWI1SF+0-Nvh;=tXzC>{XM(`5!3Qhf@mwHNAZ`A|<mt z3O%V62+;e56{vv$1r;&D^_NYyI2ytskA|uICnvJ=D(7`6!gei!Qx6G(p-SYlO}6BJ zq+plR952w+%%%o073xSn#+$j=207gr14vD1$B~e1KN`xbcpPTLZD*Sdv|`LnXU0~L zDr2M9-t)LP0x&8e1OCMeLXbY%!A4$Oer@R(L}RDjWi>EeU5AKZOBJdHUmM}!(NRWa zB{8BO+zud!Fi1$86`#SFH_+(q0c{0&bEv2j0|7c>3Sc*MT(Xvx+8u?$O5{d4Iq?EK z*&S)9gk1ue2SxhtAdxz?mmoh99+3&zQ|<iyTR3%(k?797dl<>whDaMYSm1h0f&-$W zaNO6%P+(-#>elRlNy7<p#`=JlkMAzT>hbO$G!GAtBnEGTbuWklkY;{Lx`TTOd2vWv zn+%FxKbSy9z86D%1;I!Lhi{0eod>1>-55JIxY6+W+qW)=E1X-mZfTjYv$3^7p4hmh z|NLoI)raW<f^|n%S4Eoi<FM{UWL1N@7!VS26@<TFz>+_I{X*WcfrLubv<8(L?ko@h 
zI@B^98XgWQE#;OxS>w~{X3Ant=s+#2sCW+B3F#bPTnu(Mjs)i-@KDYJYRBHBrKT=U zaI#qhD+cVwJ9Rc@fRNzU$tTM6vV(!0Fsc)K{cUP$aZ=nvlGih`3@`JRrckA4n|zV? zN2;Q~GPAR@!{rP1C2fEVV|6aMZ6~?e+4P*8%B-eBjB;{vwZ13S^5DfSGwH;J3O!t# zuFWA27d+tlhEDj39i&wbW75!PfD_>PDT?V<BwAfri3ByN9BNVD8v&&c%^kZSNwKZp z^tD1Asj_(t0ipTH0cs{VgA#9-WTN9!g~Og?@7dvl&<B`JQG|5s*EU$mb*p}7fmIDO zFSr??Nco8w+AVU}L7E)FzozkN6Z~w(=H^$asi~7P(Ja2F#lU(5-=s}BG#$xXm9j6* z3L7;0HBR_~r`gEFq-~=6+r2oN&U&|1ZSum5__SGQLSU@cs51m`1Y4_ZFmnJPOS{ri z?ZdN~_+u<wT(hdHFB2cvAq{0`BL&O@KburEJ4PUOT{mY8{&W|Yf(dC6h!;hPoB|pt z<&(D}{^#O|uaY$}>q-(m=af#)&V#isdgYz1;CF+;R#Sz?`9ZoQHTWZe`=JMfT`Cx% z83tBYw=n5S%*jzX>sxR&Gm{?xAAl@X53<|vRjucCCh`Jk9zwZX?jM8m|9N$<EWSgL zkq8-r$c`}n5d-R)rU#kk@k(JS|MB(@Gh<_8?OMkSc+Z%AaO+wN)Ud-m{3-uY?gC^# zoMX@06)-g+2V<~4cXc^d?VldG#60QYW@Ez^5fOnZ9sfK<s(i8E&5Y#toVgkfZt8ka zYw?32>JdG#IB&0w^SOm(klkmC^>4<E3v+0nI>6Hu*QPg#-E7O)awUG+Ec^2B)0FkU zth(flX*AhAF6>?b5;*$|4@VJ$5gLqg<DV;N1C{?u1zM!~?kPTaFx2-a0^7vNr~6Lu zQxf_U^K0+Iz6F0b@+cGf;4~b$z&%&fV)OY23U(ytz)2z2;aK^?ySxxjXL_>k%(n|s z4(Kg`jrp+x@xp)cQLCSXtv=5`Ob^`E(Rm*H<5Ssh8H%2rw438|XonoHpKl~xNN9(Y z3)dUp;<9iFJZ0dg7N`6x=@Mk2ydhVq_TWLZ49*4%j!mMr@{<PvLurRPh^}q4Ibm-S zt<p(Ex`qE@RNT|KI9t*0ZjT~A7tb0Eb~Rs1FJ0+Y{eRQZ)Y|JfBU|&^pI&=)>v&U} zC~H<SOxE;ijKXsTypD)NyD667%9PmQ!cpVJ2e+tb_pp2(n&pYzrrTuGQklj3(!a?w z>p`DW;~#DQ_jFrf*W$+gU(I78>o!H+mT}jv9}BHq(Tevaz_<}3r$V6Ov*bbYry>~- z>-PB}e$dFzg{Y;%GWSVE*Jrg<&qJQ7;UYeFrt{sKyD`90q<4WM$h{t@_e0)v`i!XR zj4ODEWUnJ$l-1Ys`a$GZBWp~Ac;(raFTb}F$qicKLNWeg4Nn_BzsT}C7-)jbkzDQV zHUz3B7LlX5->GSvHUlczjXkw{G`#IQ^F<tA{}qe1y|`*KQq=k;(R9wTpTzWtYu6+b zuP4*#?3_L2b(3FjtIcA>##d*@r3K%F0+GtWTn|U?c4BvTafE)#<hTs?O+51Wi;9n@ zVq>Dd^lhIw+PYr*&YR`wM8%ZF6g~|sc~vw2bA_H~!Fl`FzzaGiNy^lQuuQx-wB@mu zff=%A<2r;W(tk|#8(*WccazSc%;^e-HTsFhtCQ$Gd`qaDS(l!1J;p-Vl5YXM@u*H% z0B`&_)^aH;x9NT5aN+8ql^t7`{rMF!f;4kuX&Us+`molDbfuSXlF)o}wH$|*;6dHq zG0!J&&e*ay_z?bnch4wR`6gF{)%7P*YmqU}{<_?!Q2$uxXuXuR9~J1=d(c0=K(~MD zdWHH|RcE&EMP~cZu%}@<+I!6TNv-)8-yEqZCTR-ns|T=|V=e7aiODEr2$l-->X(I2 
z2tlA{RA{b6Z`OAS-ppt%o%G$+TWlwIV?6oYrL1|9+ww;JwfRloxVyf~ABi)4^E_o% zR8FZ-Fu2;Sj^oIFXKg5DR6h$TOQcleaiTwdU3lvnu@qicAp5VB@}KO&nm%%Q`%}N@ zTrUJ%X@&_bY_BRMC`1Pb+^wB3AxiRDDmae{+{W12dm9%!u*#b`%RjYThbra$sqtZn z3)g&qL^qGX*VaFrg}naN+#~4nbV`{U?*+7tg-^S7<$vGNQ=MwZF`&-MZ^mjE!7y4p z8AutgA_-013^~KNm4VmjdYx0&zd9>z=f7!lYx$9f<mXdc>?P)rEr_}0y19>auLkDN zxyKHz`S<req7My3y!6H?%sp)PyNju{Ta+Re=4A3;UVum~e|x5g?lP$g_NGm_rv{Wu zGum+u`)^Ga2Q;_%jW<6$a~d>1CH0{cAwoxBc*QpI`gMG=d`^_r*_n~`!}HM3nf9dS zL03XUdbyJ>0h!!%dwW~bR@z_Aoccc>SbhmTAQz)<T0pE(EIkiB6$ucYXB0g3y83#s zyKu><FLk&1&8#qmTYFznGB48}#VK=4>-(8F$K6@OdGl{~oJ@#mh?}EAnsK`lV+#8n zjPOyPCui1Z-t}X=e>ZDiXjb;^(){roEbWBbq-_R<Ew5*YG?m}QH9RI#X!dU9tq9Kc zLv`oIYhxdf(*ER#Ldt<9>w2HYe*NTHl!u|3?}NIUtslv=H`m|Jmb;ksVsBd7h}JO3 zQVR!HHr?#kY<4<ew*D|hxw^g7=J|kRqYd+^uU0o{vsZ}B_zho?x!12-5z?(yUsI}* zTa$5a84o>>qB_kdP~R`=n8E0LK}`Dog(-UPMU#QUm<3s=>p=T@+1t-g1-$kn{_boQ zOf8z5S?LtIWB+$)GvJh`@9z_|ARS6x|6}J<rzHeBqkY$I)#&id4P^y8wF#7#!9k+q z<F%$2qe#vIY0jsXznUFBAwvGwq|xc{u|s7)isI&3#1HhdZ5_(5&p6;Nqp^)m;RX-l z%!9{kd<9!b(dbVN+By+VD-WUv*U|2TMB9BO<h*gBP%!O(i;7b21TLZYicXt{R%E#6 zxOT22!Zn@qW7l5<wT8_E{d#Bj)x^pQ>*md`t9%?W``wRSXpPq94S6d=XjG30yH#9x z4?nsW=@afwdbZ}qJnqbMT%D4VKIru9et{N%_V&rl@}?_=%7Rj>TSb7#E&Nh!#(+e8 z1kD5G-3&vs{`ZAti1)Mi1I$ye?1`c$cL$-25{AlCB}|>7-D09$I4CKVx-&nq*G{}y zjTzw+OyoKE_w`T|xa@%j)uD__WE_DR18!EBndXVUmhi5YS3%hau^_O0pvXE|*y~nK zu{qCQL0V2%M7X-6?GJnLcM}BZ`C_%s=y()YxJx5%JWy^8Q9dEym3~u7`qNlw6!BHm z|9P67#_QAYYfD0utDcV>y@UDS<?$D7Iny<~c3ZjFsgr$D2Xq~lgb$nV@+oPVuMFkC zIjCpw3;z1UFEr!3re{q*8lBAGcJY|6KL>i@UPN!m>pEWSM-{5*4G*(YP#f4>I>hXr z^?R%Z+K=43c0@<uSc%%f>KV_{b#l|YMLX5lCg4)3O~@#l(l7(1Hk{Qy)<c&Jjqv7t zk1}H&e%IE4XMaR>@Snna3f!&!1eI5go@HB(b_O-^+*{!k7ET%V2}yK+4k&IO?vz=; z=KQE+EEoS8r5s(m*zLqgkLknC1(zI!z3>kg^6)2WZDvk9zj~jtU7wZ+cjzwemdfOK zr+2{V{^Yynw9$emm6S6YR@*??f@90P<NMJm%5KjM#~+D9KK7m89UVu>`Du4VCI+91 z3=dW|OKdNyQq}E~g<VxPOxKI0(ZxhH8tXFnO56J<Gt~(Dn0X^`Gl%Y-r0<o<q_y@! 
zZc<}?O5c_|N{dq-qB@LZqu%mI!u8K2N`$|)OsTvTm`M}f=6k0tLh@ncT~OZcZ9kgT zAOoQtL^x98r9B{vWWtD>n|tHHv&1cCi>(`xCQr5Oo)(^+e#^8oCFD>*nC!ea$N15P zu<gl)!Leo15and)vt1#j6={r(rd+k<LIQ*Fr_YGO<#2`T-?$Bokk-?fmhqb53Rduf zpxW1*5PNi7?;shO7*1~@+<)~(jzF58uje<ZXur)Whb`JAL4HyDkB)Ulq`R~4^c6Z7 zc&PRav1_Yo7fxnBF)%S>f8TNA?f2l(=gdy@<FfTSYH99O>X7sbeU7hGR+^~1c)qOH zjdhoFvjrqwHajnmRvmHL8xi3tH;hC%I}LHDW;o>As@h*D|L*^N<SW}@emA`t-LSgc zZYqvxKzGG|i9t7_@PlCnYFh~YnSd1*m+LP3P-(4b#f>0}@5QmyGpjbagP0MKj&7<) zLpJ@_lWne?xEmN54!hsftMwQOpz+{r`DhbIuvV~A{JX7V0IiWq2QPv!+om$T<DZFb zUG9JW0u-cs6)yiDZkq~*7Q6!b!~{kxB$ENDh`jh+-`=kXKEk$z&L|L8t%C7s#KYk# z64G&pY5Wz&2d9g3-%GBv(QTEqE0($!__v+iKaapvTp$+BCgHh-_%|`IGchs2Fnnj) z_=Ws3@I0RTYp*YVHOvUyVx{Nfdj*`zk#2!D$4JBdX1J%#VxX!@I5j;D+F~CV5s}iU ztQ^0kJxN;*km(tq5F(*G4BLfbS3<xNsJzHraRJ_26a-k#JHf-OP5Br+OK@cghT~zb z**=R0c*CGX@^3gb0I8Hcsa2w%AF5C}3qNXP)H1{73C}6>(c<Fi=C^j!`eWe$|EJ#{ z5@E$FT&dEqlj9^IE-jX|tFn<J86F$uB6iS|yK8!1O=DfmYFPKvl{<utpG|Tjn9M5r z9$0Lgge41_$=v2Uc=z_mH!b)3Bi_2-HXErf{w^DJzmr(E^vnD&{04Ri)wMM`CSE_4 z3;kl&2BYRT;h1ONzWGQ#P%qZHTRM6Z)C0E!)PW|>C+BBN0l$6OuY!la=w`!Do#xx1 z<c_HIWq#+JzeZB-o&(Q}6j0#qdQn9q45dg58SJ*!IL+2BZzsrYT9au4c4K8IpFB3J zC-#wgmh)1Nj=9E^CI13Uj$42K2Ic1Fk_<{T{wW8sz|oceLSqk9BvX)wRr58LXPf;4 zKa>OExIA2#tXVOBotBpNbf=eF*!Q?{xVp}~mz3Xn2-Cb*cYhAMSSzZ^W)z5mQN=H| zxggmfui>ncaqGZ!PvlLDLc7hDKs1&|Ik<2$#bRH8`4v`g5@%4W5dxu6qE8MSSBPS^ z*Pklqx_2k{y04$e$`z%u>Qpf1JWRE&T1SQya-N1&7gsB%xvdOgrHZ;$OzeZW8U_~O z58!Oqn#{R;S{Sy2lan6@mV$D9;J7t+r(thIk4(flOX1I2FRTHn7r|<Pl0f<m;D&(7 zc6C<_Wf*e#5)#wcUl5H?;z-!FjQZc-u`o2{AuEH>S$3`eEoB3bRPgxw1*<052atF1 zUcGwM<o{hg44)drX^cWU2N1_Br)sy^<C57l3yeS+c+tCryv47<Mgn&?gW=J2IdSN# zW#u^NJ4}!=EK&^x3mT{vEX3F&FT=yFhw=g`1gy2~yx}^vHe9A+Z>IA0@>&R^_47Rc zJv5|2CG3cB7U2v$7X?2u&$Asf%PL*ur2xp*yvL<9K+nrOeh|*se~Vh>-mPM{qhV6G zZKYOWc7KGIF<&-<h()u2c$KX42`pMQL*^=6I2Uhj?VH)2DJ8~e55?>K8^TIYj{+mO zd5VIrVgk#y51Te*CETcwfL}rY7nCXCVuencn!U`Euuq@FfZUd~QAuDB1XDZ6)r24N z8OY%l6p|Dg;q5zy<gC-JkXJ-rYyqmfB35mUl^7ez&LNi1&YR)-lrij5ktwka2>y3L 
z`rkQ?j%{wkg3W!v49lTJUW(<=uO+muq5$I8axe?4_@gqmImq65nuST0>k;Z5aOZ#& zerrQMnKB8vLod@A<@fQdfEg(#TrwU61h25`G9ldj(&?T@rS`5qSfE;iVg@&g($pya z<ZI*;iMp?0!ij_KZ46f}k+(Q{lDXrMDkI!s$SGw1Srsw;d%zV{CR`JK#fe1o%b&UF zekSii$D&|)nRkNFpZSr}(0pTw;g>jJD07^bS#}j2Dtt`v^xL~Ta048EM-K8AH~`10 zU~(1P#4s<MiBv6LgT<i!e{uHKVO6bf*XTk40g(og?rs4o1(8w#357+M2udR*DcvO{ zEg*u@A}P|TA|Mh9(jg!%UFV*-_xJwZ^TqX@b6xunx58X=&G|fcj4|%<2rp4So1L0! z-tx5I=P!}vtv~+#bq2^RWLANijXBW~*wauFR&+;alE!PfWFd3IgDw83_F{C|8AAGN z?_7AgIIJ7K);)Rrtx$&)ncf*JF(RB;7;T4GX)^aY8J5&@@C49(XXz{Hx-!fq${5H- z`s+z)tE@s!?1}<EQ(PBjvU2;f`F!^85ttV@gB^xw7wY%ss7ElFDW;De>pl0x_QwG3 z4Uo0%&CKrSf@eYgQF>3Rm^{}=2Lv^+2G#OZX`vvH=vk$MsS}_m8UZJoTwaa@&isY> zr^Lz-Ri&(6C7v?M9(LKWu31Oh^XM0r2ixEC(J*8Ld`KZ}C>nP)bEJ7j^YvZ+t6kQ| zhkHF2H>$44FK7e#f(6sp01nmv;h22!f+6jy2L&*Aa)5yyZ)Tnt12!s<tUMT@qLigh zd>g0i4d^nmU6y~qu)-;H*1_mv3LE)J29N8~ATNws1)4BD1b!Bo0mhhVX!eJRM3>dE zNN|jyLLu7;nOfUf9lt2T&UnIhi`YsiIVLTQM?q8mt>>QOAq)o>nmym@V$Yla%|fA> z($~qB;7r$AIU1p4{2YbIGi!hDTb5i-gmF24SW#tBCRbqlAvBwmbk1xgo&_v1wSocg zq0y_Kw+@u`1v<+FNNkuMX$5dj9iAmnBbSwKyOB{*g#i{HKA6EFZ>_A1zrVfA4R-;o zs2}we+tMhLK^QSy|JBC~`+5N&Rec{{u40fr_<qg3Jq^JPS@ug}%KBhsG*n__LEBOc zlmSyuW;DHY6!71dZ!upr{}_*Wt}x)7nd`~4+jh43@$NdXun>L%fHBL0TLVZf>Zi8} z>j^TM4HLQyx*pkNWw3k*Y-N9kDC9gjMiEn8U^acNRYV0wJ*o2+??&||r4E?ZKJhn9 zfKA2&BbY%Fr;%{Nfrle&Ys-f?z}Wt*odhD;RGu|p4vgBY%{pTk;h}`UEUJbE$&y&B zYF+i{AjD(`cx4I?Ujh{-fk<w^slrgaTQygM5p0J5JPk@sr3Sd$4<OPhzzwGYSY4vS z(5A6K<vOj5COhMI?Sq5}nMQ?Bv<TCl?{sZqy@y6ds6p7|%YcK7%O=DT-;M+>1zbf6 z&1P;x1{M>mR%#dlJpFEg*79CvqyjSq58AlM^4S`g7MOz9o&52Ug%M_RON)sxmC@cl zfc_Iu6y+Pd7loJ8TWWHHN9S!oaU)4;4pSnUg`4}6?Yxw33yc<4)3HJvBzjw39)5Vx z!>7WhS6SG2J5J6MzTnjM7sqr}?HPVoq2%AWHD$=WJ~+3CWZ!ItN{F|%CkJqJ#9IWO z5Ksl{x(%AIHCUmHGJ)0IyWq6@r~l46sGOjxD(@7>c|y(wE`Fk#1zNmyJ7XR&S=sp3 z{1X~>LpG#JdskPV<9R@rO(5#xlim!M%|a{^Fvx6G-{<5oQ}7!`AXYdq4u-CAWTg_e zY+MeJj=gqz$$P`M;{wnhVbB6>Bfe~wLsb7dbr?uQLUKl1TN}wvu$;NSt!{&H3L?J& z78SAmKq4a0TNUaWrak!~SZdjFxNdH4u#6KSiB$E7&8Nf5gsq~}QECOkmup;Fg~$NV 
z;qk$0;dDsQj(2V~wWu{ZBhQ<ag#|5q=i3<KH^st&DbuIrmT&k$0)|6--@DquRBAR* z6}!L7dtUS}6w)Txzl37sNwME>3hX0b8IXq*+IKe=89;&rV@BIj2>BrERst97W(|a- z3#DxjbAPC@2;n_Q(5xNqvFh;fuonttDe3aO=^iTC8#fr^x=P)*9)T#3Vt>a#U%&Lx z02d6U$rNcZ1AzkEGauM=SjQx;8AFu=OQZ5MQ|<5W8pB#Q6~1?0O1dFq-daOumEHr= z=2Q6cK+y+)Bf+49+>H)8J$}9_C`kBPJ35e^3gw6#<dQ$`S3ye?_U=NmauCRQ7y#Zz zMw@}d9P>V~0ZE9iIPawIuos;Q!1BXs1qIx|SKvHVXbEwHm-rOE(A=I5NSJbjzyA3h z-U!tcZwj7Tm8}89Zz@n1ArnNf76rTy-6-*~t>jr<LSR1;C)`3}FvOWp-@hj>?S09; zmEfn4Q}(eEc@RmT%7FX;;bafj2#fsu)YKUmvKbWI$i2%2Nx&w-=7Si_ueafW>av$> zWG+GEAOSktEUhBle?m0R4`c~IMIHnaE!iwJV~CwFmWBnAHk*XP*5gi0-N@(YPD|mr z8ehM<YyT!K3#-);*m}k=zWiM3iO}CV@mVt@Evq6VX_0f??JTF;gx4$pQ%~Vl-31l} z0-dvim-6<(K>$&3eJeJgh9q5f%g1@9lNj-_0V$d<j8-AR8}3hsk++OpnB+E3`(|j2 zD-@<9XDCXK>%By|D2R!Dq2MmCO(i2(&D67+g~YWBLx7uyNbYLe7per;7F~?<D9}s> zK!O1iiSU+SLOt_TsLDekBjsUH#3m*NLFH=p<K4q|_*9%^$mYt%g`sRHY-17=6_yl8 z`oPKnCn_qc=fv}W&Ru9%)%XA8E{L;dQOM{n<UL@+x*yrZOsT{(4c-@W`0%nF@j`>$ z9O;Z-j!j4x0TZ`K#sJX|lIxc!XHM=5ec$W@I&Gd}$)cbHq=~l^c}!^iQNZZ9Jxo@t zZ7Q`7!oPH;LN^kmw9Mk--ye<KgVF*-fbe;N8o9Sjz>O|nGmijf9}5SuiF-g5Ms)Ea z`)RI#8FFMQ9cr_ij*dbIJtGmM17vmEu>z)Ejl(py7siA#=r7f3ogN}CDCY?2?$dTW zfZbDlykTpXQFQO{Gvo>f7>!3VfpW4CG$R)=uQ%2`*(|I-f+xr<DoO!%iS2N6eL&lS zT9kZ=zqYLp(j(HS_WdBu4H2P=bYcK!2NYE#HZm8Cf}}O|UrFl^Lwxfk81g=qRRYf* zI#9I=bAJ%vOrbgjcE1hu9lbyYf`Co$1FwcuZyH)Q<X4743Ziv=_%bbIu5Gqb)OPf_ z#lRVjQXo!WfJ_nzx<KXHd=0vlizrG%5am&K_o>JgWTMN9jGQxDP_*zoOHYr4C-XLn z5q_(CYX8|jRQotZT2@=j?j9aF;PD~)UihY=&9{7Ym_rs?qTKtsv;}5}<pWj!Q=Zy# zL8k%_2F;#DoQ?DW6A>L9K~A4#oG9uAh_7Hs8Xo{Je_(!*)(z%TUlr7{5OSG@$^*&& zH&e1Kfm*Ek$tK$p<%8w%26|}aMZa&C-;RSt+X9t?%~x!}Rqz5C9m6W_U$E8)=(_)Q zzT@ZRe=^p<h$$x@9|6=HRb7?f3x+rzfLsf5=W*#x>YlrWXku_@p_FnHaFU=!79=9y z1<kdnO_I_%a&Y)$do=JS5ONlwZV?51q#14Sa41aZ10{L1mq~xsJpd%VLdq<>c!5)U zDvqX~AG$&{2Zn+X@b}G-mOykS=Z}2>^fq>zmAEHRZwCl459Z#K%>a6{8N|-8OF?V- zuG5<lOm4tejvUrpE1(nd56kVw>%m^;dc~swdn9L-7-bHou-$Dv1Myie!)~q)I?|T| z8U8kQAUX%eu}K13a2i&CA_(prmVbyKqP!@xdv!b9+t$OSCQmYO62Yqy5(jd>07)RK 
z0EQJjLSTH^R5PSPfxTqDe3=w}pxq+V3p-tvT^Cye1l?ESd2}@1ZOd_*0%sfzU;cle zuple&=ClZw@vDR_K(o~W3l-hLum`>U4KTpkJS?oH^{J!)H%&+xh=>((D@Bn3D438j zVCbljx+^545!3;<t%M$%LNtY)9QALuk}|sZOi?KL^P8UJO3{q+n;0K_JgEhYQJ^@m z-(53F6Mr1!OjD3;7H#xTzDlvs!TGOb<$b(op+1t>bsw?)gy?{HD*$>k4UH9_i}vG} zO24l~yMybkCbVgkAVxvydS-;=K}%%`PcI`gFgO^JpGr?X4s?JqU93k)p$DbbFR+M! zwyN*7COfF`=&=2-|KSO^fGlF@vmueUNd(;o+PA5@8wq;Qp}y#|v>sChCCiMX_+YUC zA^`thcl8X&;vah;{iynW!bL?h?LxFV58XPiR>N0_Bj4WKI}dVPbd_y-_sFMcBt_Z& z41*1N{lBHCZWcn~B;^+g{-J@OUd+17rPC3^lGBG%qfY?6mBg+3*3l}1TBjxmRD#f5 zD5p!@ix*wK1X)9H;}Smh-F!919?kcNKm|I);dcglm#HGNRpJi*dyX>skwkmdoeXKN zKyo%QI~#~JS77-0d4p&qCl+lgm~khdPLqRdl%n)$^?&qMa#llz>&G?Z%$X2_0B*{d zw}7T1>2TIEP?qt>Xe*31H$y#)z-Q3N$Up2)10PUur=frm4Ov}XD%~pENZ{GOZg{ml zpGt*w8{~{5i?hD7qX@D{D0kIB&#t~|&)SyCfCE*i03|*|%kIr_YP&jLsCW_K2x@Jn zi?nXNojPB?X5)5A#jMY*+-FPt@8&oXZb7v+1&tU|cR&tFo^x<<Q2^sshPW=o2$Dw1 zji@7<UJ?4z$=|;}3saX}Q-4K@<M$LAAO;48#z7VGF1v;!Y@~@5GQJT^fP`r^D2oTx zS>{0vy9@BS01Pt^s^b8rV>FT92|b&jWiJDQ2!S(Rz%O>^NCmSB;zAe?2T#qjnVuV= z_bL^`<{*3`(*2$2j$*Z}GtgZH!{3xgZIHm5{a|Rfy}J7IeK0^gjm829@B(!?nINsP zQ1Zsn=*mbHAyjmYkUS<}asYx-HkfT@KR8=BQnF?XAskugWA015>OV+l-8G=I19fRA zxXcQsaKD4(T+Ckty6{@`sC8pA+?K=aNAf{IoTrf8a-e=LaelZ8Wt*~s%A32d{Q#tB z0=4>-4T7^WGq>Qq25^A~-RK3S2VzG9k|_{xcin5+oM<`=?aT$xWS4C1!&4X$Q}2T& z@f@V2`(It4ptE26#`)gOvJc{*(E+*)Tb-*4EVz1)%mRk1V6a7s-}t=ldZVRqv!$?X zu12T?ctbBBQpM3)=ku@*$%_n3^<aqMKJ8=$SXiBdJsD4a-Eau1ij0e_ShGUO%CIm3 zWwSarg@yGFcD@?UZQ%N$;Bm%vwZ1gEsjS=zUsk>RflA+xii?K?(j>BBpeq(YtiGKi zlI|Ail7nLE0j+<COs+;*)LPnITP7=T+Dl^ov0ArD0LA}W3!yY(Qc^fw_Yw+BkcXSm z<_u>eEtsSvH3*(MprO=JjC~2;Fb6^^(oimO>@9vTX9pSCtt)b5HxSDN$Pel@j^=~v zhS@_)eV{=Eo&Wd65duQO`A6S#Q$(z;HN5_;JNG3$9NJ`o_nxXD=-p(u8!VjM+~<9c zwk`su<GeB|K3eVI53vWqTnh_%1_eVd&`5jJKs-U3%)EZ2VuE%9Y2lH)0WY{r94P={ z1U+O%4R8$PPb|T6t!H4W#n<2eEac*vni_|Lh{t0!)zx4}mF|ph8Vp%T{=@B5B3U#z zvy(e>3aW<<gS{$*NX3X(VfW!C^!YI=EqdsR2QS1wgDx6zjx01IlKcbqJCA>sz^7FJ zSfujJT^`ECp`o4JdCg5Zwm65QYH2QG-W#j*CcZn~fkd)}D!(^*k)}e)f1OdaL?<N9 
zH%IVYX#XbtlE${_B)>=OgU-gS5ia2i=@Hj2-0lnTF*T^sF6H&)F2=jn`6t=!km15x z?5yXbAB5sjZ=da${iL5?LnrwLi<|eo^`RFgw0EC<u>5R+uTo{7B-DVy*!ryE13^c- z-z}pxvDM6%8TGa(_9OQ(BX2jLDU$6i4(@w8<X)wSw!#;Qd)Z2}I3Fq>e~T3&J-Q=? z2do{6=OvSPbVweFOxupcM-AVL>tp|AQ+MD%nf+Dy<Z!@WTClLCs&irMLVc<@eg7i6 ziGK9W;mjLJ>qc#TeZ-KBkk^!@1|}JC&JlGP`i;d6Bw6PeG$nIh_E6{FYdd!f)jr~5 z{PmZ#dgX~y1l1J`uUoV&!x;T#CxWTdF{|G4(czoH#=U7Dl)f16iVrB5Xw0uyuOHtJ zIHC;dn>~^386<TO=C)%y%9Y>Ujr%SZbBOt^U@ceyau{H)zKBXwki}}UIZ%K65McM7 zS+DqW|7rn_7Ze=Hr4|WWv(pD+@9~js_!wke`FXQxnkv2T^<qCUznu@gnAFXV8%xEs zMm)=-HE~DvA}vrE0mg4m%N#vAvh$*-4=bnRT}!xGa82mEgU-C4EfuQwh3pX($JQoY zTHP~$imf*KhxoASUAS&*eZI4Mt?9gou=QrHZF02v54K^t8C;p5EBo6`?~MqXp8Yh+ zxjBC_{yg3pySzT!-{*pMZ~5V_9CKN#X<Y_RuFUJZjvj9@+`d<R!O15<T}(JjV-%88 zD2Puz6xo=1t1;e%@+93EN2EPk&f}u@xTfJ+nO*lZZ|3pJIMFSR7`J^))r{NM_Z#Xt z9;SHt8b~%=kD2bDR?XZ-Mc*@@Ty2>ff1$>)TX#{0I&#%wO#Y-qH;bh|I-f4oAt`ot z27Ab2JZzwzV?NzK;C2t)wYX5`@t)=!@3`L2>1@PnVJ$gnM6!6y)P0j@iSFiIr7$15 zao~E`xa#*rE=9RBv8IK2nW1uJpOib#E1>cxyIBuy%O`%3v9_q?yonJj4vD?5ZC`vW z9o8Bi$B3_;f5+FgzZgeie)Lf`(Fu=T^JGty2}h>bkPy`=CQSPHo!7cK^_^aAQ*QvS z@%)|wiWLT6L?KLv6=)Ti0vG|1I3p)#s5Aae3HLu7ukK`ik-#qFR1~JFr|Eb`vhHC- z+dSq&n7QiKC_yG_J^VEHcGBCZ>v&ymO9F{sa9fZ*gWd=K#%}xN&ZjgPCznE>n2eij zBxwrT&FVXSU2UwIJcx~*uc{Dx?<P_C<U+?CW~#>Iz6M{#AI#-~iX-QYD)`zY@KEAs z4(>D#rz8n{7?<Pu{*K&=f~PrSNoMcL8I%oK{rUINzD8Ank_n}O2rC5^<Lam7su_SV zV8tSs!u1U1@0i5J0fZ|@g7XB%LEKw1&^OYa0&PTsmLC}`cPX8A?zBX8PWK8%#l^=5 zK=y0rR5H0#p?m+T(YQ8tFbV66C+qU61AQ;FIHhXc=(_eTN(QDzpYaP`DI)bge0A26 zXxk`)6Mfi?ZTUFbWbPuVpu<P^)};)F2YdHDD|T^Gl~=AmcM-$rdn}P-`uZzJ$;U>6 zB#K`@eq0E>UElVg2qW0SI~2UN%r<j7+-S<r82M~nj>_#{Ho$_JDSQo0O?lupqICeb z<lR@nDZ5&OO+bZ3Lz`}S7(f<U`^Y0e%jQJ4BtiUrvkK$c7m8nNIiEi%5_xdP%Kt5% zOlB#);($tT_L=Uy04CH?GxOS9HxJGkls~rm`RK+REzNfib<+*_YDhh;>dzb52)9Q{ zgjKaKHwKSA>OA<;KJ{oQ_DGnAO#Kt_vLUI;h;U1)?YHu^fO>8c*_YSz<bzbtsSGx( z<ad3D4hZ1cZI~rwN3~n^h1B1pXfjyrbB}5bbE>Yj;4a>;xN&ghHfsNq7;T<>H{qCn z!x;g#qK<F5Y=3?Z)0vZ_95u^Ut0I+tf5+rSE6)4T7!IUnEtk8CY~oUU65-IyBlmkF 
zWO(}uH%0^gA<en*^kAEuN7B4ub#Jnsj6CWweEN=+%CgRCTKf8cAzcEya$lrDD-(2$ z!1%I5IwXhj697NTqfE4*v;!ynF0nvj&Jl}=e#7yp((A$gm!%bcWt*X%j#iHg<jb2m zB_C2(7N&$=&$LyTEp>iA7~ivX=ZO$lEw3b&xhs?mO+%)?Pdjb&M=5;7cU88-q-k5} zaASa{wf-u`$+Gq(JW7Y7k_P+jZQF>4`m{)|HCFhY#%$*!F*>7r5mAZ`H~2`$RT{DG z{N{?-|D{Ot$t%RGTw!wlS4YG%%Gkq$D$3qXZhh+p%7Zyq-_`J))Y<mQA@#Fs{e4bS zyWIH9%u%wRShxjG!!`8HrI2AETysO&42odj2P!;vZ10<!H+@f@<>kV|bOGbbbT-rV z^FBF+m#GwO|Fj2Q##JJJG~M76;U7(Eu;M`rEiuy7jJZu;KuVAP1u&ad9TYrnBaiF= zp^~?@J?AWLnt}pDYsc(KjtpKreKn;8_M0qEL(F93&*Ti6Z^bE~ruejb*(&3$i_z!r z-!48zpYz3Esn}<-UP`jRQib+=i+wGloQ}Sn@b>2yOo>Eb?r6TIH7!)ANhCS(%ixmp z2czQ}{Veavx36gMB;Gr!61bkA1{p&FEhYY*<Pl4=UyRcTM^+FJ9P&l4EG~Xre_XmZ z_xpFmGuGP>gMP<=XVlOgGv)vytH&^$3*6OTY7o3KL5oj<P=nyRr_4r*APb=L#lu=+ zPgCkYNALL*rbm%0s6mf5bcDwkfQFiXM6O6%E?ISAX(=3flW+)=o%e-`!{z?=p#fN9 z2jsFiWuj#VQYobWRWXr-Ltjz-{(U*1DL|duJI}hR^pmjOk|}!Cfla0;Mel(Z*!<w= zBLHB1U_+j2{hk9FH>0C$*{LVvB&I5NTS>y>In>X9=@L|ET0l|FIC}Gk01&J9Aq1K* zVFcx>Rk1onVsNs5fM08sks7$+!eEDGoK31S8%)m)frGF-IGzn0BtRdK4gIFp&sQR? zt*s~&oAUFu!;?09fToaXbEusIH6GT-!=yGTZ`@U+UX2(c%_1ZFwzuAxa!`5qwMNg0 z57^&njTDPK6i$&655GJnT{ZY|ex`M|>{TEFq!;|b9sI2g;CbeWbV=rj1$ySTl9Jo3 z6?DUOA2xN2pQDx8aQqr2Fd1L&LVGz}ZbA4?cS)*R5FGIV>uUmxwD9d!R6evii-Sd0 z70blis+rd!6(Xscj0H-gHf|xKd%&x;OSInxdoh0)UjYIgj!0>P-<+77G+D2NmkO5J zVbFZ*%)ZJ-e8^zx{rOc77vGZb$jCM@@+AaV<8!8TL!-F_xI(L!7+wu8k4^fU1%aMy z92y!z`Z=J;8InHr0g^wczCjKg7|gwAo3@1&<|wJcY%c+g*i-(7!)MN9rsjZDH-oh* z(z(FTRy^!ylX|j&2?~h9oZ%0_g!D*fsg>$O1%oXxrWZkX^#|9;+1~}d@MfE!r-kmU zMdIUaWL6wTJf{Hk!m3v#!W6b1CIa4~71R_0F2B*SBpWtRNAK=VN2`BucSf2v-8uMe z@RwEs+k54RIlN^!cShf6@G98Q0YxSwZ%quO`UKE$NTuxrcFJ&D0nYem8ez&4y7$UO z#0s%%mPVZkAc7!>PuS4hb-?VNpwllkz$D6}+D73Mdh5MB`ohraD;$E9`0swf2sn{! 
zSR`zVsdU$|>(HA66@cz2yL~uDp@Q@a&95zT%1{C@gb7{s+Y9%D5vCcsz)2UrGBUXO zIha#6fdT`1AN3A>WtHaw7=H^wBliz<094<nz3~?5bfMb=%HyJ?a3U`PsepG`84wW= z-<IGa)h^3a$nJ%)888okFGwv0Rm$=4fgJ+*L)bFwAowQ~!bcaH4P#M7numu0TFUYS zpiEfp$U`9m90<BGWQ!tP4^+HO5(@ROcmhEsLAK}$pG}LjL(gVbSM3HwOmUwWY%LD7 z<g4dEeP76WJJH^Qy#YphpwGLPKc3ZxcF7NE>u0Y8N(dO&L<mDy&%K8^jkM~c_B&x$ zkU$Z#)ob}%({#rpn<r{<brle2_YjCE%FhLSmslHNa`e-dM5Pomu?c#BPAIu<K+c-4 zHG)t?SVVpB4LUh_2uGJfw>CFhAa=A_zEQd_lzfj)+k*RUqi)ccbn}^2h+M6;mcm+x zf14$cV3iSdN0QT%m$fnh=rNb3E0fdL*wz*ZRT1=VSy}};9Sow4n`L=w9BO|h8sYO^ zdZT@oh1K`{7;p-25IyW)aO$-bhEVjZ$@r;k)>Hj2^E)@kcEA8nl!lfTL7T>QczgC> z0|Gk)gIQ?Bt|a(LeZF}YiSqz6-QGlA7Ob#fU^W0&|Ltg7h~AZYOmcH_OO6?fD<KSM zjR}u&!&)a?1Py>y;5Jr;8s9^}eHe)X>Qv+PUBH!4YCt9;E*Md<B~Yos@W0Nx3sjNh zgbfyVWN|oUV9p-yV2QGo!2H33W+ZSh>HvfbZ7VcWpQw)T;7LSt$QLr;fNH522!SrZ zArS5eK7k63T)8S+8UnF(?WgN_8Ki<<Sc_#~R|vf^BmVG`<?r3~Y2?V8Q#}7Fj2i<O z$OeSCBY+TKvzP#nGzzF~kT1cqx(O3kFowc}+>JqvBNmX*z-SOPKm8EZ4#us2QP8XE zVQBv*nQ(7Xa2#p1MLfH~y|46rA6l>v@O@-YahDIl|D9s=;Dir5rwqOnGA#=h<sX5_ zKwK=H>)O7(ycq}KzY`WQJOYp;v;YFhi!Mc-vKjs;OR$NLZ2v)otOJ0$Vbbv&f)haC zf=Gc-`EVWIe{ljh#{iI9Gg@XNiF5)ukp;mC7W(sI<KtT;urc5=R0v9ba|3(=0CESg z*CiiJ{sA#%gli7KK>@azf8cP9P=vv9P8FuIl7y5d%fbm+7^G46Y5j+@Ks9Cd`MDr+ zWe8I>USSC^$JN~240(+zvI4dqhM0sz^I!NBcF_c+dWj&G^?JLT>$5X6avgMxm`{%) z+QHx=Ot*xi&GMywjpMJe5ATte3Rbnii|cd<Sb?fgjZ6Ofy*5=)*%8sueE!c}6*B+b z(i+C5ka&epD-l1^XL%dobGTlJ5Q!dW1|VMR0*Ha25e6(9X)1WXj+lAY#sITuH7Cp| zGu?S2oPK(CU2D6gZQ&`8l6{x!;aDctw?`sFN%7!n{o+6OSM>)E13dUA_<IB(FAatl z+XWgvbTGr$#JIS#aM%q5uA3!8H(uYB<(uC;U|Bs#-0_r}KKd{%o(bsY5=0Q;!kn2; z#sV^YNjI|8N~50fXYBAAA9%v|kVQ@t0bCY1Yug6pTNs%Zs!W_xt_pRU1p~<D89xxc zc8%`8Z!kDK>i@jKz9So~g3h}q1Dp(>g8T}kMSvyIe*FA7z_7#uwkk*>V!(eG79P~0 z)?+2Tqv$2#+z@h13=jlWA@2hOMP;@>;nzL`U_}PgP7pF)LOz7|4+xcMFoJ4Beuw=d zixV6M`wZ4Ha3}ny>|NfSOa*v(tq|<M2IVgn`56KvC7jsR151Jx=J5IWqLAzG8P!Jw z1YjcDm4>T0rwo8w#GCfC0@NxQYPl~I+V|IXaR)x7@VTYx>B)Q{ou~izPp)bLFPP7; zce+!r5CRpCkn`|_7SZ9aA?cHZhK3@c#8?2a&Htx31J)e2#ghxL&M*CwuHYQCF1SXz 
z)RSjRsoPA<%y8US8>~HFD8<4gT#E!ib5!cQq=n=uHnyM<eva)2z`}w-b`T=yI}w+Z zL^#FM(_LU6q<9q-9aD=tvP`?a{`ISA?4P{{3N#4bE*!9}5Cjt0d*GjKo7p)Bdyj%w z7ZiMmbL%%58CBJ&fgj!mAYVXybL8!Ys1Qy{Nkc{k1yg|C$gV)HKwd_H21V5p&V#-E z{U0x1golO6cBhD(_l#t~iRrqpWjB%rqs=*xC9^9m&fjgFT8J|_%>V+Q`ubus{0-zz z%Ux%4*x`&4+c=Cp2*0u8rlU7Q{C*rrP+sj~np&4-E(n@J4pVZ)Nhr7;Jrs#>@85vj zgd<7Hk-!V|ZM7E{F=N0m7?Hfp-wLyo{|zAv$#}q@4E2=cof#DRb4*NeU+5C8M*z`# zCYQ_(yIF-(&@$u_Qb0oqo>F3kk#MW+r_^ho$M9rzivYYGM)h=L$Y#CRYo(YtEw+WO z6JbJ&@eX`EAA6b6^Z+FY_fyrc<2GIe(T^%J2yl=ouknqvJ4BcuNw960SlDt0sU)J7 zRNrVjm>*2pdo)-WT3sy$N6M`v6p)52L#jsTFPN%F+$#nh!UB3Ov&WJd-VxRSBB+|z z7^8m2#?w<ux6&F}C+l@~81<WAbaEak9nGHSY8BML&%A*63CC3;aTqZ%#=|qE^{->o zH4P_Xx(P8{nF5<#u3$hmuOCSXcQT}Wn~aU&XY01PYvVtlcRv}vyd-3QREei@gWXK6 zMZ-m!g!5yA5fP?sgEK}W<gQZhBX?v|LPr5v8S0vsz^zUL@oS7kXhgyXq#_hcvMgWz zOx(d=TH|$e_wrwT@TKG3{!PjuGm`E3wJX$rzzbRiJoZlsy9`iqfUS@(vh4vDLg0D$ z1O!GJGCJdy2nY*sKgcUqla=+$PZwu==_d3qZuo{2g;jUtHL=D*ODL`%1fiV@vJh#} zRbX6xZMhfIvz?l8H8Yz>nhaO3ES#i%C75m~C&+5wowZzL8e@L+#C)}%^K=b}B}M*P z!@zmdPZ$1nDeFkm+uoQq^enP2;3x&-#-|ua;(>abn4BD_IvrqLa3Oj@R2E$W+KU-M zzal(=9P|9sr-=M0Lhha-2-gcg5=RE{ca*)9I~>yvF5MY#nU|z1C1|7hH?x9^50Y8I zD9H9a?!-bj@~V(373uilco!s4BD)`?%Sb^BEOxc~78g_r6vF02!k+lG2YDunGY-QC zRLC-xWdVJt#$O?GD*@Qm=78%K5_bTq1M{SIxEMgry`aDx6vSg2YP_EFG$)7jx0e=A z5j3Q+FM<4L=HZDzq9Y=|fvFXwCTcj^rbagT-X5crR2sm`j;qX!jOUN-_F8bCuImbC z+a<``C%I2-COb@JF;U#fOljThH-XA-0<@wKa~}#*xv%1oI{hY=Eo8j1as~epDfT?$ z_+<abowbg)>G?F31pO(sZ=<$J$_|hc2TXfF4X1+;0&qs9097OcP}q%Mv(4=5eh;z( zCYU%uZV{oOM)e6UI<f>Pg}wb^NmJ8Deenwm!@c~-lVd@GYpR~AEFjS%t1yHFvjt{j zP>}s+OsM<?@!*3`0B*7}Hf($eY{r*j_8k6IR7oc3Qwgot#y9k8PQBfTOtF#Ig%{{I zcccS!4w&`=+*uBVhNW=X^Qpbz9Pc(EsZ~AM!kd5F)>t^9jAv0GoC6$rxDxQ-7oFoh zp&Hq8zp8SPlj*en#k?*9C|#A&8iWv_0;-Pi5Mj<-flfIanAqC`B?Btiey)*si4gQ! 
zz|9zxscq1Wp*uK9s6tsOrjcS80_bRlB?`iWtn8Wp9v;R%Q2!GiAcU-FKN>2I1S4u_ zdJ<q;fi87Wu+s{fcL)wp^8)yVv^nvyv4~6v$un24(qjAj!m@_iSfhc75mc0Z42b16 zXu(*ZT|OnJpk{(rQ|abS6WSZkRN<rmV4IOrSPi(=q3~K#U~9a#2DD-7p9uD4;;;l@ zl;CF)1*A@BpMDlNA`stVZRqhrq@T5}tH?26z=X1cRR!?7NLt7OUIBDiF-WdI2b!Md z*)+5jXf!C{yM`j-bgz*;;#@~)IY3iuDttP;-HZi|J4(jMNf>+(F6DI@f%BM-B8QN7 zF#RcC#bOQ90=poBC=!jSLU1n-;KArPOd+7)AivQ^BP7(w+Xg-!6!UF>6rq8G18O)M z5p{mcCk76V3dA6wS8mx#pL%JD=$H^m2<W$rX=hnM$pvY<9T6|W?|~FD2mG*{lCGZn zn8+J`NBZd7qY)4Au7k$=DJbI5HmMp41*Amba2KW^9zixM+&MD;4-XeoHk<>BRKSSm zxc$zG0rLLh;##)4E9)e^55>Stry16*s3Pd|<(?t^D6rwB$Q!6}oJDkU@G1*H*4r~5 zw$TpVJWMUXW36LL0jI*#?T(x=iM;u5paKOT0UmT0+%Eaw?QR0vutKw#AyEILDs;QI z(xf95&RSSlz(sEsY4u+o#y#ECzaxi_K=WG&CnkZAhy}9sM%zUrpkCiXqX72}ISk1b zAfjdbcjrb>`T|*tlwNYe&@p5Wa>M#Uj)W6)$cAJPs?Qg!|7G=$F_&?;DqCFNf#P$R zO915ul3v1MVVStOyTjEJhv)T>uL6w^69&EaF5?o8{-o<}tdgH>)`a^t!ByB;|5O}J zO&HKQLfANa3aI(5oGnH593&E1J;U{5C?&Z5Ql((v>jfV9`1JG(6}lG!Wt%_>0ZAP; zLD+Am|B=xn6*v++4L^VG>hjfiJtImFdv<Iy*5_0$UcyerY47ATQ~4BYcdQC%12A-C zh2Y+{S&y~*1zBn!-$4YVK#x{o@tn%}fZv7+ksEl2!O#SvQwZVZz@vZazZ)=70ZteX zplpWjAJf075d!!-UEZs?YO)};!MffENd}2%Amj>xR|`<WU;r+u{V74qvH;0q7hFS- zqVE^5EEDePuF&6Cf_pGx4VYcQev8-xP8Wm2V$~tmvVaB(Drbbz0YHTTj=+FQrWTS7 z0B9{p$p>UP3|rAGQ3k}Aw%!-=o!3hypuZLYMN9}H`7Tt=^Y6M(lEZoGhH3h+>lrIz zIkH4K`a=BgF~}q_85smcTAw7auOQWP-+~h&s)7eZ4(ddxB>-?A1O#kh?zQP7(0sws zoe!Wsf&GN!TlldG@To*j27q8-9fim|;aXcN4s}p>HiN4@vSDCn-)*JPgcGljvp97+ zI<-|*31G?qO8@)vii+Xxk1sG5B6M{u%bh!SghBs?BoYwIA~ONVUID)p`^S%w8VY<N zkl_u`8;Y#IrxIceAY*`u5y0Y4w@bb{Gw9JmknK|b;KD!K1=!F3+%At_4LlNoBms+` zr9xMw$CzEYw@eYI><R+vkd=Y$h<*vFB%Dh@8L0qe&yV-6{6q4hJy2@3Z<Q&Al@K#4 z*dN?wf_D}O)m11oni{RxFgODh6f(7<&OxRRnq0i$?6|J`_V!S+Kp71xo=63FU5qa& zIgOBH4+OKo?ceT!ul}0EE6D(mgjR&{Csi4qY&8zcm&0f38hjv+!2A?^GwO-G;kZw; zPV`T)08|M;$0<+jG1keEK->0DQw{cL*K_*pCDQ|UyukGSeC{(o4stALJ?B0B7+}*o zdi;js_(raVLu&L+5AW(;r?e%@DMlPgCMGWKpx1cDb9Dq{VmXk|Aly;(98gS1jw45O zz?>622S?aKwKxR^??quy3}#8RL57(kYV#4ACIssSU;u<Kf2$<PKcLicf0m9#cg?eN 
z`6Df4GTov?XZ($n)9oSS?p}+OmVgA8!U1V;Z5W;+feHd7oh)swt#BCdX{`hDbYECN zP-x;XX7A0V+R1s%l!5pfMhigx&VXPxU^xSvVJu|*<NYQ6Q&5TVCV(_e0Cb#gFwnS& zfj7y2+Z~{f{&#m+!#W%yBt1PAe)1>uhJK&}f;+WEgqcYE!bLnx;VGj#CQgeDXeK}( zWDZ*&TC)HcbVN{>Qj?aL^mL>F^M%g^L%G1ebtgWZ6fEQ}-C8G3;3eAs$p~d3S_N$& zp++D`IALuHP+z2yf-+tg<Hd7mvy<{458*LGjRGDKkip-=@3Jj+iv*O)o~9FI4&=r4 zvq=BGv9%=+0xRT7P&(KSVYs-uD#CySl)xpS;{!3cj4x3j>_(RwV;~5Yi0vpPC|r@i zg={`8mZ6q^wEz-+Iw0Y1fm7#Be>V(`g0Q?auUv4K%RS1CxX(ioon;nHWNO#CvI8#> zPWE00MLoi$fIVjDU*z*PdkfRz*ayjffltI<6KW}V_C!o6f452}W`=e*D*(_qj4bp{ zW*6ZJ%i_e+{4XRc0QH{c5AnTXRYXq;F<K_C|HF2Wo7cwq(_fnmj*jlJ`tSvW7Lbr{ zfC<v((Sc|9Fk*8BPcaaWC9Vr^%l-5+vUqPHgfx?v^}y}of?U7+CdX~?<iS88YWXm7 zZXwmUA&^-0X3L)ka%i;fNt|<95^?|#;DE?s7$ATBBJt(+8?7Q_LJ^Q-<WN132+e#i zbKRU%S*fo^T5JT-d!t<v1wpPnYV#Q&kIYasL0Q!XNUu)ElhB3+X#iM}IYy`=;UlCF z^(Ls#LD@jz{`5LN5>gSo6ka)?@_uj+cS{UuVbrD_&PD45o#LQ_;YOiRjne`NNGB=t zx~!lUKs@8Pw2LD^n}mESsGQB<X54muy4HA-Q1%?%0{gd3=WljDYum)SKY>Uo3~~+G zZpxW(w2_g~?*6{6NW2Ji+F;P+n9L74(#HOXI`gwnf$Ty}xQ<JRiOCcUIHs;%f}dUq z8wQ~^2K;qxbv0_@N~XqZK#=Bzj9{z`wRSL;0BRXj!ya@Q$Xu}P=Igs$&~C$LIBK_U zjFm`zf)8mozXp>4>8c&G2d+>%q3-Z6Zzf)7v@*}8et+8c{ik&4f2;!}#vuRNl?Bx? z@+T!gI!;%ToFksT=_jgp7YC<<Lt5PMFMM>R^(W@_%qn%xDKYjW3K!YTQ8$@Vt1Ea? 
zsr-z9^F766($S8wo8#Fkkw{_*a{(&%ko3Wjb9qB%$2X;wX?~EU_#CM`XkHCL(?$l1 zFDP~M1rsU+W#3@E7`wTilK1o|i8|Pbjj!ciE_Tm$CFcP#-M&-DC~i^X3azK_f72dO zUr7uMun8~Szfa{zm&^3tQ$QieMUFKuZ#Cw-*QbpS!q-|CM9w=1*$W4WI=CO38GJP4 zIW*`0ihrrvTQBfcR-9nI31_PJ4gR&^o?1U%6Y1$jX7QNV$gi{truu8tJ<^{(OJ7L_ zfs`q^>b1}9Ip3_~PmIN6CGQ6H7x?tgy(Ot+{yp#I_7YR*?T0>-KWI#C&-L;b@}4vN zXu75Fp8Cb84O8d7nv11VOS9F+8nUm3rY6UCUbxwDQQ*6uzg0R?A}^8oVT!UsT>U^| z^Z!yuWxZchhj-0_Y_w!%s(oee`9a5p&H(HDN83&LzA9=zs4@*0=|Mc7jFNdS?c?q3 zbMzx(Xb6|Sen1udJfAdz&;>X1n>CXsb*g!vDAskN_*=Os86{<99{r49i@*e<KcrN9 z;<%2{hRuW4{pAhBUn>bo--x8LwVJ15GG39#u~OH?p*+lsG3c}gsUHOjhb!StOv6Jm zx(pp=&%XDe6M5=58D63@4<&bgJj$}_&S>60V7pHAHn6;ZoK*j#7mKxzrBGFqC&=$~ zPz(V<K@>tphJznBvti-mq5`B(4$Fxew!)5f?)&$)&8slF{h~B!4#0)~rURm>wTX0l z-g8l&PbPedk`YMeIdH94X4l{4M04bBkOr-E^L(WMEe;=w2o$kLVcK(eoDJK=kG2hY zM`0MDGm7+;$Og&bIcpL+yprK6in=5i-hr<G{;`~+(w}#g#L73UvRrde*O^@}?%Q_l z&X=sL5@FYWSKj60(<BPeXLL~YXc{;nHL-J<+=+(7i^4(1sMegZ=nocTXBtyvb2tTM zD;aRwN4Nux$M>86JQgome?JGNXYk#gK7A^C@_r+zfstobSNm7_N(KXtUyPUStKQy< zG@`e>-Pzr6Hkc-g0sNR>{-S>JPLmQ{leMN|{fYrcVRG;*onMcnX252UkD=+|L5RQy zGqcH~64CS8myLYrA&Bat7+~1G=+~Y`()s0rN@7`dWjFWVcMfl%-KRnsa1wWdxe{){ z@!Z~ryA{)9N1j@1g8SJZvpkOq03H8oMLGHBjrtWhdxe@EMFb&UyV~fp7+Zwb`zWGX zeib6jbbHq9RJGzc%E`|29SJK_vXj!e!hR<)>9WVGYE8ma-Bw%NSY;2SckE!A<P55* zqhp@Dgl9Kx{f#&4^qRJ8`j1RppR2;Setb94!)4FZtPU|I?|GcS3Dm?W9O&9%io1o$ zFqrxA17n|xv0n!27iu~@ejQ{`Y+Bgfs9mPwjnSWt+Rx-`%p7<2i_?$tNqlo<Qgws* z+u}CH8;MR-;Kf6)wDp<38#7@^1YFUzyH0)dyOmxKJ7=F<{5oVL=KENjbz%+&MUX#O zi1G2zi|pK#qulLIy9~db%ynCdGm0gxbAjbBN{0V=*Uo6yYB9$rTE=#+|9;zHz$$5- z+70IRvt;<yMC9iTaIZ-H^h)R}^yOcn6%RS@p4I5}Y34(G!GyXG1I6+o#&TN1?v7%M z6FXM4m6Jl=8&wq9+V+lbOPrI`yUBA^jb=ufAvx?Tjr|QYG^oz2hZ8xfEk*C!wNHvE z@z+DHmB`~(<xMja+)I67lI%nyZcpd#K=M4dfh2NzXLQo&3o%H>j?J^`UC+BlHuqnY zifX<>kKfqzYO~bDJ;_JmsoXP<W$cV^giqW@d!GAV8o#61DOr4{{PJ*#{I%0~BjNz@ zMx*L;hmZ}aCnbmkBTq$wu39EtE9@tA9835Qye^(Qp(_6Dc;>(2&1lD*ICsoI9d%0e zCzLbRpXG}IdJV^ZSCjqPx~#9H%lHuZ{8qg#6TN7E?4x<_&F*yHs}EvXOQE7X1jZMA 
z?mA!y(b8u5tr+=a$#usTlu@9JWq<fydQLkVy<Q+0;{N(O>kA?Mm<x)E1GjzVcaQH> z?;lJ1e5vsujQ)0<PqwyU)z+w;q5ZqzqrZU1XX%nJ5_Nb7A6+{v`3Fh$j_>@VD$LZK zoWJKGXW5HEOObvfc;07ldgX&t{#mXjBTjDnajrAyx+KsyPT}8AZwsft@V~@KP2V5q z!aFj+>ZVF+@}pNUYm+QrpNkR~({Uxys&aDQS`BN;5}`1-xjWxe>rs866!r8LMrEen zHs0bRl{MSf_J;zFmlk`9{iPq!-#F^})F;b)HYU(tGuSM{6BQNdncDU3+|Nk8r}KK% zZhufh_pasZRN+Y-pS%+MULt?>^bK+L9Y{R>qZIKzf+r8p-fO!Ey&Ud1CNAc*fZbbz z{ZyG|560QIQbQkvSDEp*yyN+s6l4r#8nW*w=pT}b4cL2K>&@bvKex&*>9C-{cP;<H zcZ*h+v+6-sO0`+R!AV0W)UhpX-B=q3R~(kh#Kw2=c$Zk34@4Eycjg)M)IGlKm|xM( zlWC_$Z;qH?;-^MLpFwB3i+$YGJ>aL6)RN0mT^;0LbZ<QTPf_Kod{M)ZknyPa)_Kiq z3XS1cWql74lXisuU$V^!2CaVn57K;=C(~4)9}-1RZgFRA8mW|-4lvy9UNvhvd~7E7 zF=aqZ*Ous&r@rseJ=~#_i4I;9H+C1PMeSc9XKj*<XB!vqgf9h43i!0iyH!g!26Zv_ zmUtLN7&zfSWGib<ev$B`Y3a-Au5*h-0HICPt*vWqiPS+!Mjt&Jjx)a>mlAd4-z!^P zI?j^QX?PpMn%>;?%_rzG7DnfQ^ar+B2TQg&%Y&&;q8Ss@7}NL%H5bX;zPObcsVDko zMpNAs6jVAIn!hK*_s9K{bYrEYwe$B*(KC-C=bp@S#C0){;tR}d*3A3$OH$S^3Sf^% z%I?0XcWc-)w-*0^xqN%W0Y`cZvv1rdYW(Bj#nh9LH2>6R@MNhDdYm3}Yv~t8-&B!T zeN-1U-jc|$c%2Uu)(C@Nhj$p&yoFg*C1GL}A6Dt}&~-8F@e77{nWZYv>He5<>aNwB zZyVOP{>9###-}^$On&VhdsNqoV@o%zKAd*h^zm`DHI&u=GO}yxa*if>lUKLuc1KF& z_=}pnse}gZR~;Qbtuf@b&&y}^go$nB&I!Ca3@v3<Tx(=O#azEvcD9|s{UU3Of9gd( zhP*!X=M~R0k6*pJaquT$SfGYg|BD<m+op==#?j91KCj5zFM(NJo5p{-Bbzf@-k`T0 zm^|LQ<+?|qS9^FOpL`_v#!xT7`-uLXe&3Oi(NtlBQFnt;iJFYhu6swm^TTL?m8Fds z=F$_)h@Zpf`VUkvgX+)Er%IO2OQ<bGqGYH8SDRPNCN?VVEw1&=UD7%lJ&z}4F*8)M zJDpP!Y^GVV+UJu^bAzg8YHnNF;Rfl`D`$8bRKB<v)zgknPO|tLmUNWs>Sg8xt50L} zkQ6oqk}GRg3MxdL5Y5`I$jyXW4>}yr;$~X}Onj0&DzjDRkGHsn^GPow@AGPdzW%<A z$jVWk<YIyRv8n0jM$#hOo(HI|G^vR!)ut&sQJGW^{L2l`LqatNnGJU^1wXM*7qlr_ zXD&G4*RNi|-3cme?V;E5oTOeZ*ye57`AVa53G)KI|H|>hVd5uJJy=>xmp6~^)3jFG zUVd>&@EmE7%8i-W@!svZJCvfA_l@#ow=+}WRmnB*KNEEN$>hY(zj8F#&VJ_wd4><k z#LOgyBI?pXz3h+<Tg2f*rXZDZKW6T0ja&5BItt^y3z%ERa&A7Y+UeI!)GUhbYj};V zcr0eT<-r4f$}Kf~V^(OYhk2?~Z;!uQ*_0a@ao@o(TqHDFqO1^Np04MU{{JJ@)w>qm z&y>CWn4e=U&7|JQ!S0xxfSpbeZEqgWWkC1wk8pM6qTflij`vEb!OrW~fmK%3##=w% 
z3p8DHqy(Q0FOD%!!U|e(@da|3l?hbjC~*|ixRh}QU6S`P*>tf}%MaF{YN{B&dXAzD zQbcizKbrOtj%~~&tR8<74>cNmnbWdZJH;QD?S0<smG@3Fp~@OgbYtq3at0g}K9wA= z)R|~qmA+RGok!;Sj1*cs%wmV_uqa&l2(kW%RoPq=ru9?5w?}o5IP0N9^slto+6}k& z@SuNAi{)5Yq~+uOosNkab3HTrsz<|4>q1VJXGG3rQhOb0cuI*991Za`otwS;^!QBW z^0S;9YpL0v!|x3(hQ6+FuV2%oC9wH^_FY=AyQ&aMDBebz=qFc^qNgkYnVnmcCPgs& zr85&3P^ugYuY4u?n--Hddy=GU^lrwhbsB#<E){t4dG1i#Ctq^?joCMBGK^mxZ%VjB zMse#!#sf=4siqZZ+w`VgCT`e?%y?OtIaDlldQ(w^<<Ew=l!U&rNtsmPCNrcU&CT?< z?mHi$y~FKjLI2Hq>g;k~iJ*`K(e&{mt{pdi&JSA_Ep>vc`yz^qp_Zc}7y>^#PJX=K z%+LRLLO5af0QEhdmYDpaJ`tAwV}9PxtQX(4k@p)tH1{_6STgxcz1T82moqCQ?6z+5 z!<*8YWTP54HHpoMb>z0?x07owXO7dq2*nTY{QMzpeBL=~zVrI?s-Df!dGD3&(ep{G zW{uyfdT-8<mW4ILLB4GqPo}!AF>+(8bGDHiy>?jr%{po)O-1Uya8OK&g?FA@jV$S| z?xpKqlBmanq;p>4#6`2HfOEU%nheEcA&>8JR7g25G^CkHuEl+~a?42_)4EX@h`()M z)C_dGAczGcJlif*zGbuQYdP<4VB@q~^GMF;J&Ia+tR?8mh%AAO($R&~Gbgf)`Kn&r zxAeHe)biC{RDTUhjg(aH;&}2$`gUDN_>P|O?|?B0?qaJ)^feV1Qpjf+mev=DFW{Yz zw_o~6`>Nuw&@kU@t((T@`LV^-F-sdvnZ_e#Jt8-qLJ>2%*p?raoMCUvkFS=$<dn^N zbJl%Fz~5mfNw8IrI5uhXV?&;*tW(|7EHfit?+j}GmZHq6$zUt;%#!M;kWW}{Ce4&i zG2tU6J+;yWHW7<K=*I@M@uN*qu5(kL8@q9IhCkFzyj}HUWLAi`pj`<3;-b>~Qr;Dx zdK7EWbX1D3SxQ@{`yitB3pZ=u`}HqrYWM`B{#ZY$IBF`LtEe8Oqcwi!PArssof+`p zxWDB@we>}{HplIFOJwQ2Di3a`)BV{4;kr1HOIOMmd)DbS8ry>9US>B85ilPl8opzi zVaL-s;JZL86z6A0F}Tm2sdeIP8X6E__RevTWAR%du1ML|i}lA<U)z{25aT*|bgK!y zLFfGGqxXL>ZE2T5xzsAEpOH6Ump&}<Co@JV)zBfkp1UM&jb`sfj<e~#Az90&C9lr0 z)-(zHRFx6F8R^Z{{7s26S)QU5i+itEQD@)M&?`T2UKnw&H2*LZ%IZrJRhX;O@?P!o zG~4|p@u)fMG%$u37$~uf@w!cQykNU8-kSM<fL3;O<uGq>>s-a_5s9*jchUP8w8_%( z?vY!$jW>cVgU6NKZ5D^f_R>ZiZWNJnkM4!-Q&Hbt@$i2o%qVy0Y&!6|q5shL!^6?9 zC<!Nyki%EeSr5ncv$Ho=HGZs<&kT&GYGKm9Os^jkw$~+Ud;MrH7Xo?uu&2a&m$9Qg zovOEotmV&dO~S@5IBlF%54?+eB+9nNG6%#I3LgD(Hl;Q#_Ru)}IxBPr@|WA|RL{G} zKN}=_LrqpU!Mpg-PGOF=?$=R~#ez<OUl5wTl5xI<ec(y@FxAc_=jYl19!}p@X{k2% zJ%2d!R(;u9e5+b$8;)jT;>!BAbn(|=x3GqAqaCw*6jLQdE@AdB&tXmp*J<w$=Q=jY z%0$%9<7wP`L-~y*KfuDUup!IPrC04qhE@YprGDfQyVQ$L(n+%PO)1twJ|ma4!^Pvb 
zXGL)F?#-UL%Nuh`P5ZDSvA)PcZ?GYxNd6(!Pbr@i)%5qOuX{>S$Nh`@Z~cFC)TiLZ z6|3)d3Dmkt=6tQw9$$XdWa#pOLP=d5^%{#Dx2I-OwDS`|NXR$&cqb7*)gXfxBgwjn ziTt;6sporo_w*k2yj%11ulxNpBuvw;_j@T$ZQ@<E*F71o*9x_X^}<tjEC!<c@<a|l z5m0Fzg`-7g*;qAJHkU~r4G_um5Q_{;?YyjQ`2AweOQC2pOd|Rj$>WSYObvrD^wrTb z=6LMU$M#>3&Rx9{M`0lNA~0lU{+kTj74O9{!b>wu*OXW*q-1!H*zJ4`TXcV2iMij! z0RNWqT={r+tXcO!>ygdFsSw+<G+Wya713Uoc+%ca%g^s07Djc=<p0<rdBQ}Bx<6;X zB1oy5Qk+&pI(jao>WbiF#|{QwdVcqYN9vo$h5Lg0C&BoGUu;?l)Lq>(=D7-<WU4RK zF_rrjr|AC4#kgb^A=pmop{3s{)J|y@QKk?ajz<0<IQYw+_!IJnb}Cvv)6*YA=d{T4 zoc_F((<6P%)}f(`pwQg^sA1u!cKnD00Ltp>>X5WoZ@~Zs_4HP*Ur9+x=fgLM(hOw` zC^4KNOe(H&4|sLd#{*F~gJDK^U~_9L(AO8!P<l52pnVJ!5ax_sWji1iL2=l<hvOk) z`|sx7O>7V!gHHqci7D?j4>$v&V&dajy6UgOAp<BFB1Xa4Oz<yUK!V6T8Hdw7)Mte6 z0s4tT{Bn!|26s&PNVl~N`~a)?Hq2Y0P(WiR+`kr>eK=sLk){pe0u+jnPAca92pqgM zuy2PSlK;_T9V~PX-HZiJp9fnIP{#N4;vVsuZf<T8laTnq+0i!A%(Wos13?UOf|#jd z9~ayk+>5;vnU%@ZBHRIZ`!s<$!K+ub9-1y*<YZ(F;L8R_(La;y$3?-@Bf`W0(GscT zEpjlmfI}z{dK6sh+oCk(b#yKPSr6nI#*G`eyro8rKuQ3`^DJE(jL^VM9fO^H;e&(% z5hi>FG7|bsG7bk77#N70mkz2QR`HV21|Rxcxg;PPz6c)>Jx2~l2f*oEA+&jTU|I$4 zP>4f37~Pw8yk>%n!wD$Ae%wpp*9eP_4u(%+nI}iVx$r@t(T3BiD!h%X{#g;v<JbKS zkac5L*f+_Pi2NZ!MB@MJ5~>a{KfOjI!Ns!w{lyel=I7`0`tS74Fm1@7{G#NivJ5Ud zRoZ3uM;9bLlS`8#Cp;ifjnS%Q8r}0EbfG7Z-|Y@AbzR1M&xUOp>|V&Tz_Yz7@szov zhk@dYy`2SC&2ZK`wf4iX#>b9-oGKDNtEaNf+Z}G0Q5MS^btIJM5xL(8w7ojgc=xyQ z9qyaW$!D{aY-X`7E{DD|?7Ben{fND|ZB!P97_x2UrhX?_3-37j=E^Zqj1kDI^<F<D zcAzC(*6;g1Cih$E4#$n=uAexg9yAx@@Oz7IQwt|F2NgKe4KlEA5LIH)whaV}+AsY^ znUzhk229@dyC5NU^=DG9lGvpnOpl9Vp<>cjpS0cP34f0rksjkT_{tur{<_aTDE95E zf_WFtc*FT~Q8bEfh1fPfM_M!r9nhK=UPQD^HsVn3{60e+JD2pvtKj;V1f%uog@TZJ zZ!C*DK~q8P7p|sv%PDDZ?(*M^X!Py9|8DUI_5+DwDW_3Z*6OvryPhdxIP@RpgGg6< zT8rfpY^67JI5deWpAZ`^^_?|qsE-gT+q|3IID0Q{{b4i1=PiSf*cv?nr&g{}qSi0> zRMSh(X}((Z|M^lLy)9j|mfAy&-iem|jlM^2*ckS@d`(4;>zk>?6F)9T`oVK2{7g(= zrZ*zR?#JSOZX>uCihapK)Oe#cA>@%H&Ig-Zwb^*hytLN#0X+v{9vavQVn>^zp41u9 z4@g4uqxR_Rx(NbD3R%%YxtuP_&m;YlH7<Ekf#Mf+?SN;Z-051~+bgEKt*fyN59+Aq 
zCa1!!*5@|Rb%%W(zwAw~J-{9xIB+RczUJS3z9Tz$-qJdZ=gEeo62VAVV98d{{B!<w z*GmiHI^BC@+Jn<uv#x6!qjyS;?+P9UN45S~+Rw--c*yL`<f5GHd!y3Mc>S&G<l2FC zU709}`}l_e-LqtjQ{~CeYZmao4(zXvFLq?$Y5QGa+LaSy`eFoH8=l6y<5<?b(?)Tj zm#sVn9JWN>d6f}bJKwdTTAUk7?0-qu`i55Fxl{R-&o+0T5Bgjh+uN3UWlh8MqtA)e zQL{QiFRJ^E-mmpMWg&|N6e@*gv_=nm{t9g$<^%i9#dAv)CSs{Im~^TV%as!9dGn1A zA_(hfE|{PPDT<WCV~p~NCAG{`tZ8<}qxCHo(tN#M3b3B%D;H%pev>e){*J4*nxce1 zq)oBIU*P(g>E)Y?&95jMmMJNsO0gIiHp)3m-5!#?{QAs%=*#q56W4cDi7mZ#SJPsn zX3kugVDRPAtB!o3LTrEFc|Hxld-BhivYYiTjYsx_#vaX=OKJn%Eh*=4bjUG3-So){ zuHlWpcDBrIt_m<!Hhkg1@k@7V9(qJ<9#DL=I<mjDJ)_JN{Q8sSu1(*>qaCxDu2%8$ z1y?h%0c>LsYY3WC!8aZ==h**V*!H^o5_)Q(u*d1gh*G+*ek$?vYoBo)Z(#kXcqen+ zoAFQmxz}aby$sxcxDR9cX||b`W!}(xv?+Bxiu~S$@!XVg@cT(hyN@8B=QXVJ&n%|h zdiCtTnj9=Rt==W03u;JlZp1r}5>QY~(o!P8SFCf#8D$a3ykQ$G`i1Ic>aFaT#maL5 z3r#-Pj^A=j(mAGiya<;25^{+U9YT8+6+hb2^Qocvr(i*;*VA~uAgdVfKio8SI~Z)R zFT~5H?=npoI?{#3OHMW@T(F9ceJ)l@&C?dO6|#Xn*fSC}N1xs{Jg32Zf-yZ-Ub@zI zk#(NfaR947G1TFbiex_{set>Ql=4nRJN|cv``!AH_h=(pqwc9HwSFMOc(4&h8>RkO zO!umcejOfBQAp7PI==d+aXuo~J7!fdQ0=rMIRi)CVx;`WFH(<xXxgsgTd1%pM`n&u z%#_+q<EsQ+`h(Wg&bpe3U*vg)Xijt8+<|FjRxI0sP|_=}e@vm>tISQ&2$fsyF-%Ck zO7Q;#cL<30#k%c-^nY@zh8jSVaUA}?Ibq)9Vw^fF&Q->0$DWXcl!t{-ZeNc?!3rL) zy@k<k*7{CY?f6$-`v&uZEb#FaZ#36&3LWmD)X2D>4n<QdSXNA!CgvtE$WB9sAkJl% zMyu#DzWzP|Z-48HeiFJT7Q$`p^;?7eS5uI3H4GQ-O6P_QP5LpcJKhEhXLN*ILQPiA z0L{JL#^eF+=-tf}=X_s<*RnGx5<6A}<)s7oA<h=o4-VnWWz+HE3*+E_E?ewaTIKw< zA9wpsL=Q2mh^NRLy|Hco92o5X7z>XX;pHKn(M;@6j~I1D9}fX%eho(+08j~v%M4*6 zt%m^s0P2iXq$w<5WmR#m#l#$@202Jem(VR~jY2UlY}}4CpRe&rf9V|T0Rp1-uEd&s zmyw!4`3u|t00$^ZL_t(?4X4h>Ln(HoM>H8Yy!IG6e>4N_rMCkC4D8|c&J+xA?~QIv zF2Q%G7nYxUD0Zp}O8m9wN5nhYAoajTESok2FL<xUxjfy}eL6f7YXCzdLm0?pcr*k^ ztpOPs!$^W-Lb6Jt5irLes!tALotJ~OE{hv>!L;?;F-?<+y_5Rk!zF$3>d$Y(M(kXg zEd09a5FBQ1K(pK&6qKo<kryF1zYLC6Qh2~d4(Au&MEOO%VdzzI1uh?M!yiju!W-t| zZybd&ad_Y|3iB6z40l5TBR(5~L7xP|J7Ot1OXEU7In8|>+?R|Y8@lTDlF~C0?(D)l zpJ!sp-|?_LJP#8`j6~WW5%7CK0tJM;Ej|d^3#I>FEd4YZ9tg+Lyl(iSmlU>($_xyR 
zU}zwN?B5*-)MNx>-hn)WQbYyJ#|Gtp@RR>=o!ysMKZfWwZ4H8^X^_5e20TApivGh7 zV~UN~{i{L7QGfhw?}M+K<s;{I88jLN3Ul*OW^WA>Da?WLR(NsZr1HOe^+EA@5B%r5 z<5)6ivIL&3C`L?dIy#J=i$x#xgrP7REBX$BZ=g5MebTkTKTU^2S_X%X?O;`uj{-FT z&>%mr0J1h6;bJKEAZ;X+aR{EX0?E_2V9m&upo|oxiuCY6jbygy?%o<@Ss8dJh38jD z3B%@=h!0$b#fz8V`}3*DI2(ZFn?l7-rj==P2UseKP$ZEPOh(2=kZY*o{xm~$_Hc!1 zb_TLVc(9=+!mWMyq09@fyGZ0xJpiB#p&PzJk@v^&YHovGAAE<kqaNayU;Yt0*kgvR zhvMTcLD;r(7F=@g!^&qqCbp2$VL&;VgA3YQ6(j#)`QtvC{CpJ1TBD;>Zq1N9C1Qf- zV@1-N_<rqZI8r8(Q%l7TrG*Uk-MhhAnT7O+Qh0Kipp@IA+1(&4S-coa0>Y7<c^=;_ z--HvX5}BkcX@cg>n?X@j3~8h@g0Zn28mhR%*{}mVx->&}MwU)KO{d3Y=7UFLm~aO- z^V9&!aQ$W~`i~m{2T2_b0%ZynG%6*Ol6o#liO6+xvA^@D_`<6NWGF`1y0!Q{UkX(s z04PzWc+_CUxD%U(_WcK-r6i70TVl+oTd^}Z7{S4TSoDfBT6itOR=@FLC#z73dzT}T ztSCRY%ZQFeLC+V_zo|41?#z4lLD$SU#1@z5%gfAxd7lC3DxpjEKq(@&{|vJ+<IzS6 zW9~6RX{i7+3zLe;)fD!2)^Kd$Aabs!3Ap#efAMLIH-gp-M^mxmK+8OA)*s`B$q{`s z7a9O1*Q1cucPvIYh*5AWpirryQYx!Hd{m}@N~3~ODdyZ<HJ|eDjhMa987o%~hl30# zI=vBVe#;Rjo|ZOJmMI=*ExZ$pw6^^)u%$Q$Q<cY_9IeMJ!S_3Z5gZ(h!2b@1gX7Ea z-Rg(2&SD%uX&HMZiMSk@1O)(SvJrhJ4?SP%FV1#crEg8q`vq5I#>S!;05ti?%$B3i zz;3$ve9|d_?(P4OyytAC!@HN$^!K*Ytul>nKk`|q^XS5kB{ad^j!f)&lg}4FiJj<Q ziVHNqV^eDC(282MYDKMD+LA^4_vy4m+OPP}|5BF@qiEHS$7tWm=`>>UmvpH>8rl9; zN>}$RC2uzyvhF#F{6iDO4pdHvqX2JLa(#0H{kZQ7dVT0*3cOh?cB+4=^XT;F<LTuY z{&euC@95o8KC~-NDR!z#)ENiqjW)w*YqCo0C{+}kTSp^@Or!Nb9HqnCd}+$euPL%f zI@^Xi^Adf(=xy>@970i2$W_BUEF>PHS>7K|z~Mc#Wcqs)e5*wN@p3QGj>T`1yOk-~ z^_fav9KQAFJCu7Nm=;gyK~^Sq)W^q<4&PQtB%gx97kH9!Q-|{XN2``(W6_T0hUt5r zsiuvJPt!cNPBdoKLHcdqYMMG?5^cJaujg!3YYk9(bssJE_Mm3gJ!q=`@02L$IbJnm zPl(w`-tAoJt&In1-)9qP=#;H=vsB;Pa`oe1of|=Zuf0MuHXNXzw!BZHrthFQ9krbS z5$XIIVM*F;#1+`WwX=i7UM3)A8VE{)%vk>Tbh%oshFmV4(@vU_48%s?Lz$f`+}ld? 
zMF-NQL{`iV+%2+3XHOScOYdqfO44rQR*D9#J>1}EBBisifF>^vR}u`-wO2b>h#f|o zk~Bo!yoXXt7kIcfm)5*jl8H2h4O&QX&}mbWj_A0Duy^eYDZVR5sx*QUBr}Hm-yI0l za=CQoX-x^zV{ap|%m!WD+llw0gT_jkjhO54D6;JY&-ONwdM=<`Dbk{EAq6er;no7W zy6avqMJ!{CF~%5U4M}2mr!vMEV~jCYJ1$}wV~jDzSfj&5EMtr@#u#gKxQJzpF~%5U ujSd&Fj4{R-W318PB9<}67-NhzI{ycAp+Hlyshn~E0000<MNUMnLSTZ&6Y0qS diff --git a/content/english/hpc/data-structures/img/fenwick-sum.svg b/content/english/hpc/data-structures/img/fenwick-sum.svg deleted file mode 100644 index 6fc6c75a..00000000 --- a/content/english/hpc/data-structures/img/fenwick-sum.svg +++ /dev/null @@ -1,3 +0,0 @@ -<?xml version="1.0" encoding="UTF-8" standalone="no"?> -<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"> -<svg xmlns="http://www.w3.org/2000/svg" xmlns:xl="http://www.w3.org/1999/xlink" version="1.1" viewBox="57 46 374 217" width="374pt" height="217pt" xmlns:dc="http://purl.org/dc/elements/1.1/"><metadata> Produced by OmniGraffle 6.6.2 <dc:date>2020-06-26 13:14:04 +0000</dc:date></metadata><defs><font-face font-family="Linux Libertine" font-size="10" panose-1="2 0 5 3 0 0 0 0 0 0" units-per-em="1000" underline-position="-117.67578" underline-thickness="39.550781" slope="0" x-height="429.19922" cap-height="658.20312" ascent="894.04297" descent="-246.09375" font-weight="500"><font-face-src><font-face-name name="LinLibertine"/></font-face-src></font-face><font-face font-family="Linux Libertine" font-size="7" panose-1="2 0 5 3 0 0 0 0 0 0" units-per-em="1000" underline-position="-117.67578" underline-thickness="39.550781" slope="0" x-height="429.19922" cap-height="658.20312" ascent="894.04297" descent="-246.09375" font-weight="500"><font-face-src><font-face-name name="LinLibertine"/></font-face-src></font-face><font-face font-family="Menlo" font-size="10" panose-1="2 11 6 9 3 8 4 2 2 4" units-per-em="1000" underline-position="-63.476562" underline-thickness="43.945312" slope="0" x-height="546.875" cap-height="729.0039" 
ascent="928.22266" descent="-235.83984" font-weight="500"><font-face-src><font-face-name name="Menlo-Regular"/></font-face-src></font-face></defs><g stroke="none" stroke-opacity="1" stroke-dasharray="none" fill="none" fill-opacity="1"><title>Canvas 18Layer 112237-4227132822-86-13894322913-52412345678910111213141516001337122822292-4227132-86-138-5249403tree diff --git a/content/english/hpc/data-structures/img/fenwick-update.png b/content/english/hpc/data-structures/img/fenwick-update.png index 50d7dae9a17c2e91c3ba7c18631221a219414108..475a0b8dbe3f2da48f3eb2cde1445a3f541f8bf7 100644 GIT binary patch literal 24876 zcmY(r1yEJr_da|H1?f;i5CkQpk#0mnq@^1~K>_J*6r=*xEMcmB>W40t(bpS@Q-&wADgQB{`1$GM4vKp^ntA4xw!AkY@!iw*lK{N|Y6^A3E& zbd-|Uz=nTb*w2FD=WBM4bR7{0!e!JK+Sj*tQs9@APBJ=9>bB-iuBHyp5w5PT94~FG z9L-Gao^#kbSR`zS-b5g7BjlwYXt*V=je90`>|Zo&4|W)_SN}XGsbsHQ&`M+eWWyXt z#2~*cC71MG{p%;k$4^LDZnVf(O_R&W1u)+Y&8v20QCkpFbB|mY<5o6Tub6 z=>Prq7Ee@SAp9(G=kVWm0R|?f;-@V*_wV0d8v1r8npNASw#=fNhTnQfveIF`V)`vr z@Vj?TwPiNr!Uvm^D(x}sA(4@sJ7<3!$WT|cJ3F#B@;QD!QfejF{1$(CvR>l!U>(Ec zcxV1>tC?`jW%L!bpq-y-^V?}VK|~z?tDDs>>*s!u2T ztB%e10*DF{o`$hXM@sIem~e@7^UnA$Sy|sWBxsm|N3ZovDdK! 
zqf&gpcyfKdi06UzO+HKM$vST|q(S9<&e}ba&Dukn&IEzQz2(0eU!MfFb5!T`I4bUBOjEZsZ3rq31d@sD3nwlthU#QkC+SVPL+rSs%Vms2eg2dJIi@rXbz1>3x zhx;EHm1c}DPsr8O)Dp$L;@mYi7)?!0Wm@0Q{oAoL@6!WbtAP%_I94&Sq=n7|jIEz{BIop-zQ9cjd}+ZJ3Jj|k z^ozbQ|7VB-yf$MMKL^2-tLTk$jq80v%zlblfQdJCKl-;3XYZV>aN+0Jt@_LJNHF#j zQDp~*@7-qbz(oEVjtxaqAM}o@Oi^BgBovjj;dG={Wmr_Zfh73EGNXfWgekc2U@ZDi zK0BPKL7uQ3!}}{Gh>`TD`;u2#ju#AXvXSydtctRPTwWE$eCA1FfU?J6Rqb0Zk{0dR zRk;^S@0th{aV}13A{M0=oJ_8$axBK#I$l(haAFz!J1b(ACjYl{DnGQ*G98U1aL=xE zT!}#E^Y4;u7hU{$bpXwHuEtfEr7{xJ60ICv1;f@uUlo>4KO~LVhQG@HSu5v-IfSiv zrELDp7srmNhid8lq5;7axrWpQryaE$$&rK!xgGxdtJX3M^O%#G3h4$74e`!mP@;>L}EYDLY3!d;1c(T<>S0#%`AV82QB-QS2X=xLE&?;jALj zcM|_+suiq&j*cKYSS~ec))-l^kWONsDb=NhMuEx3YL~}DS08-ytD&mJ6j9Ep$fi%E zm_tNucbEq%_MoMSk~2NESmozgoGf-Us*f1+Y?Dc{B4V+(c;Y5yqx0k2v{0c1#;Sjn zyHOaX3&QF{%y-z)+v>Oa|9BI}MLv$}n)}??5>kOYZgFo(-Js=^3eW3Gcz>8y)etuE zRqCsZP3+uhJDgTNO8C?8(nnO@GIRQ=3mJuvtRK@_V#&$KTIVHx#s4XS^L9R#Z#h4+>x3NrbznX$rDb641N; zaGHo&rQ}flF)@~b|Bzvf5{ae<_kO97z7cV%{+0?YLG3Wp_{&mQpbN5KBRka1+g_Ox z6>iz-D+GzNpBx1*PY7-Dy}1?rK0CE4ewT5s7*_E5Fi^pgXyx@&+Mc>NhdjB?NDh{m zyC@$JCso)8UZ%V*5mfOw&0<^GN<&Iq@gg8Gp45}5ayMLz7|Schv5QmfPBrIzTH+y@ z3+;&fzi_Jlsijuv#|>PuFroOlAHy;i0bc%N0S2uBg0K9Ngcfh9y*_M9BcnRbiaV4? 
zdD%1^CQ(Am=l8DfCFm^D^wtPJR-?z3TO+G7Gpe!^qPHr33AOPuFL?iRT7xHzUlppi zeF(nk7W@pU{&03ph${NM93omNLibv*R98e)r8Kk^uGCv`CW`?16Bj^(v3$A(%PnrckJV4?A4aOT&l6ggBC0+k_0 zcNeCQFnG;_d({6%$2I-;$mG~luuE)>mQFhH@Jh$vP6x`C zLJPyOT74b^f-}Wl1v@e(l2ZJ+n`g6&ED!{Wn-++SQ*&uNT;s6CiLd)nmZvE zvGb-@Y>k55#LBc#@21^zH$6jb zW|(g$9eCR=SFga*Tt3ci|mmr3`f7i9RCGq&g=X^N1_)JaHL%UiIQNQ zvNlo?r8#!7svpi2?Bd;fxvpsX+Af%&Vr4K%=-nc)v`p@th|zmqZ$gNV4qt3jsorVo zMmtv&cQxzN@sF()?KZ?5lmW+!qu->S8E&}BAr>^W)_LPhiQQqan8WhTHQyw;_3N@1 z41%l2w_BgwCca#$BbN&H(-@HqkKAPyop8~6>du@>6O}_Uu59&ICvVD(^}~JM zr7a4sQup13&-@AQU!QGG)S7<#_D!}W^akhSx1(jR!mtSm1!iVvi#tMNqoSUQxbJ+a zg-T`k%Buarn(pbv`DwFixxv`E;XaE{kf+0Yq9#|pk{70VhSgf_jOb8OWZH~ZOKsF%g z^lr34->PbcEPjqwzU63zeFo=+IL8Xw(b;Zy<)M+4xh5(!UeBiP`E(XiLA=t=Mm6OB zxS?2$y}T6Z`IByQr=4^0H*4XZd*U6r*QFTrPN>h%Y2I=+F1!X|>#+vwJ>1fZ9a$6e zE?u^fhp7(}~5?mKrA1Z=ac z1`D##;u*t!*^D+bx0B7yBpm!JCScmE4yjCHL6%%bu+c4QgB;d|xP_fpen8=?Q}RNsZtxZX7RFUPlW8eiiygzY!upG+TT_iCFaN&%{=EIq z_vZ>-DH66G)W%TEN+aTgomtrRN>Ux>f8$Z{zXNFW-F8wO0X3~$B#jtSuasrb*jEhp z;fl%jOv{(-Z1r**!-Coay?EZ2O|iy4FHPB{jV!tn(_A+v1gQyjlx0RL?C(H*ocSd4 zF=}_$MoQcFN8rvcM$6c5uxO+jR=YrnJsg@II~2pNe>5FLTXA`D3V6$6?&pVhv=ZWz zo)d1f_y7a2ad~`eWBPNodF-YnR|fJGx|79(>}N2UloPqcA~+wFml2_v^rp)IY`W^F zo~^8wqj4`sL(Qz*rgEO_Ey1ndKSD@#%d9DR%sUV!Qw{!m8{?zS5g1qmb}cuwUvO44 z8vytc^gLigxRcjY@I7YVijIkC#iI3U$cPkQ@yLgj*1*zuE&GCpi!BzdG*2_&6jTcg zga+&3=n@JK&CgHvpIKP^t#Nn!^Y^b(q=fYlmt<*ITg0t)FaYNZd_}CcvlqXg`jsRG zVq-&4DB#=*A!XVbu_`1c&CqH{6mjMF`t_@UbH64Yt#~4wB7%sFOm_K?W%-YmFmm;; zPZ&-PH*J-!1n85HkWhf5qfpIYPTguj_0n)L)A{LPw)OCL)IO8Ne0SU4-d-E}_6E=m z3aEV>wx)RdKu_YXI`vd@oOpS+olhH_2W{_Wuae6wI6lTlmcT zk_2qQt>Sv_*@sl?%XM9j7Mq=&bw3<;9jifr+C;}qtIxFJ!G}->qGF@M~@z%Z1f$K zKm_=kYMRuQER_`H!;SIbvR4U(p*Eq}$sVXLoMx_yg?K9i!|m0J6suc)qZRTOtnN%QtynOnFQ~ zm{_h7NQ#gD4c7NNQOGekEbPv`dm%6XW^sG$y--*o%7F(CO>A1Q6Wlp3zun9w=yN6r zNB+#*JSrweHW23qA~+-@4K}G@l8D0?;9`jAlU*>gT=n#HapM+W5c_L$yG1@fm)pSn#4k|UpF;(&|H z7I%whn6*We*iO=@Vl;_g9GQ9SEj5B&%Kpt#`l?@EU^Q1KA?$l0s-C4t$Ic#(+IYLg zY&eE*_Or6-85u$zd&sZ5@0nDRJ90F0r>w(S 
zR@-QPh~ls9PgS%?sys7xqCp za>dvZWw{o4Wdyz5UtdlBM7dA%l1lRXmH8XV*W?VH&O=wwjtMK->*dK}+GFYcxEr4I zwPX=V9px*OMpz)m{gH`>E)SitjDEWNQg8(+l*d2N>$=FITuE>DKY7=VF2Y>bkLN$` zz#*SYi6Z2R9!(5ruv9%wNdxu|#{DqnY}rMkzx1!Y^DPg0PQJiA zq{or8>qSwSzYxM8s6w0asq66Zh~?FD=e&@c&%XZk{h;3_rPDMHf9$T`ZuRRG{B=n` zTb`zd)$lUf;|WUw4dMnG^zwer0Bn;h`fUV{CxkhE4(ln;&kTJ3oBb&dKKmDsue`#< zYWUZS%exW`*Y+uWpfR6qSw@#T|P7u`HuA zj;JGha53PJ{d7WtqFb*scEYT;-fhHkz_4xS`lTyVKv5j0e%qDGzj`h^U;5e9zk1~g zF<7yS%)jERUog*;ys0t&GFRw&){{SlekmD?RWhvA>fA+Oa<2mXhhw%(T3Eikq3;g@^8CUY8RSZN} z>|dk8;%_!#74xsy3meZp4JO|8#LQVI*W}L|eyS;5n^VfAM86{>qThMB_z#?7Kh=WW z+Q}+i+oMd5M17XP!b*th5cXoCES`=lsu*s8s^EI=++sc~6${_||E+i075ol}=78RhxrMOhKk@w4M0@iWZQJxDw}^o3bkc=E*j z=NOy+Eh^9grz}Rq!@Umc*STT-+B7skG-M^u1BaZ34UfY%N z#}@}q$Go$sF>Dq=@oaagUmZ44`0cxsfZyC6hfaOGIPotrUbl04ts zH*wM01@&9pX)}MYa%NjAcfN)4n4}rC`~?d|;gq|JXNmdqcwS&wH(#F^%D2{dc3#9J zdfON3LOsHWn+1<847cyna-{s&#q>)kwh)#~jGXkuh%=dSKW7(`czukr8S;mRU#yAa za#G90YcMJ(`>R9=^CZuwIUX>SN@@}u3FKtfUtM~=lCu}>mP*KVwZHv#2Qk_ih8tz+ zrol*%ckWf++HYylH*eW}C}zz+`jIh(XN>FJ0P?O^gL&~SL~^%#exfNw z{?@pq(sNbA#VeWU+Grx)*UMAC@+ahf__dg4L(zJJKVcoSsAekcttqu6elRKH$IO=3 z($Ahflj@A;qbji(A1QH67IBRT2*8BI{HQ-i^PXnTr`6IsauG!y`RrMDuS| zgf%RU_>?;>|A|EPZ?eYufBydcba$b%FJC_vdK7GvRwAw&&)^s4fewSWD(d64Y^QKt zF^1Oa;DQT>O_`b&Zpdj9rKbx#dB&OYgUwhAq? z4)+E`U%o`Se7jJv)A~p>V3HRh`@WZlz9@*R_jJC$?3F&`85M{d{ZE9$7i0P47o@YFs*XkBxv{@%{pA=%=+Pj<%$(_dP=PZ7oeoyuD}C8&dARn zVZAa?8lDGhra%pOoh)Z-~Q* z2b5cRelY63@Yy1c%k;HX$p);gZD+n6Kqw(x1&pDdQAA`k<7l+pHrsZxuEerG6p&zb zGBlRNGGxPN08?i{3GB4g4+#&?^l~q)`w92dt#n`@6x~9A2hPHgG4W{*^7>A4;l074ar0wP zID2LK-A@~zGmBNhfq+h4ULKDokKg=dZ~7Iu|CQSe41w238Mj)gY`N@bq;2ZYT{h}Y zZJL_U&2)zV-~LT~fPsKR?}BO+a1%W@_Xj{0Pk^if5F;~KV5nQ8!%he&oxSdCM}bi} z@h2eqYlMW^aB-Rz=vea7dTpUWZ5yDO>k?+TcpXR`N!u6cK|Qj)NzXG1z9&gCR3K8qv5oic<6? 
zSQSz;myACl90k+TPxNZM=GLf21o(&iN7-N3)`rXS)T5p;4Yw{#d97b-z46 zJU*I@G^{Tt=P)nW~6R3%7HSgC=-Ez!Wh9Qc1)fb%_4h%a-19J~6xcL0Skg8^>O z*Q%wUkmT=A_W%v1DUsf9t`}y-Ph=gy(Z7lXz`tfe^ zB6|u}b&Jip9p)bWj$vPQR%Ctl{Q36DY9WtD$;SC@Y57ME1l$5+b{M$i9CHwr=095` zn?gc4Hu8n7s)R;5J4@0cP$TYV+ifUpTLCAzTkEpaL)ZF_vKh`_4nQyb0vI*^S0g{1 z2MrfLIsh^b0oU7YqxCI%-cH7-$mO`3?nRAsZp6@f@`MB+C8~Gdm#ZClIO$8P%Bt3n zRC^=lc~E58&oa0kRDW?226cPwCr&7)4yl$g7AgcpgvY@e%H5$OPY8fZqrl)Z zkimaTt+bPUFWezo8I(Uvg!Cl?T7jj%*^xrg-74+ol!bLCEO%tz-4+(sldE{`WQ!dY z60$nGdV73?E=MgxRwG|88Zu&U+CwZ!LwOX>VC-|;GV`a>76`U6$8IsF)q(K6JtuG) zu#H49&loux_W(liBR??ssQCE!!qa15Fup)!8M3Z3G#96*e|@k%y0~gCpesC;uu#tk z)_Vpa^*C7sVhLyc+5I3wT1JSM3<@!-`FeaP35AXh#aq*giO~W>kr)oc6|D|PM!ki` zy}4{7>n^B}3_EJr4Y4xv{`22G*($W(zxPG`F?Y$$aADi61nC6oSTQV<)*wRL!Qt(K zCC3}4G`;w0Mq}l-A8)4I-27qWHv8gxFso{jY4{UC;Aypuld5G_A*;L`tFUiSOmT<@ zssLz8q4BD(_{k{J^vRPac7oJ+)PnCJ$d7Ag*Q^p)SXl5lE-3!3Dk=W{eJA{=tE1yC zJA3E;CK&Bt;QPpm>SMo21!d)h@9i9p_{waM4@Wg1Jnzb^NI$F1Ok=HC)?Zc%trZor7woRGjP@&(U(v9h!}hC z-vAyjTk-RKlqt!E-{kf=a|1S{6_^&BTeedLkhJf=8sSClH>^S#Sak#PZ;%yl1D^%; z0EXqwakvo5fnR~z0;(7Lt*HknH4ZTKc^pxaQI3yvzdyT%BJogJcieTdQy_A9+HPy= z4dI_+3l&_2CGlo1j+Ng|F8Ts5|NN-T&s2a-FuXK|N(1P+4QEQQ?ib@#e5-@GM#V!z zItUuyGkX*-48Hv!Mcg|$GV)MZi#Z+$SX4z4L~v{7TN{nztJ0U{qsIrMwz`l?vsF^e zK?0J6*HtwiqyJ*8g1g#vGYbeB1N)X61%tAYG)l6vZ&xSz6O=~HZ`M7j^Ku{536?BW z;QBB}Ed>h*tzjt}UEkZBGwuE6KsD>$lU*2DDes`WnJD}#l$ zj}uh~pa#+_v({a*a`$}L@dRpHsq9DYqwO=1sI;*@Qku{4mRWK`2$z_e-cKN(BY>+# zM2lP!W8B&LnkZ7p+`^Usg*NT5u?mW+h?=ga_4KD#M?Clc9dCAlvE}wz2C@}B6b+nC zk{!mr=Q1uXWnXc_tjDYROB^fBfbGq~JGu9g%`e6GVt7$KKk;B~ID;XgA3SU=&^rQ% zYN+R!CnaTUff^@MW&I(9<#0R#^is)>drNShe-94T?7}hWr0JTPo3+#4u3LEyS(owh z8ikLn9~=v>$Xp0phR2fbS`qq)X8IM13U=XerXE{ZPS1J}f<$axl9-IF%-HpO&E@o- z=&FBzZF?)e-$rH+*&TL^57E*6l);_58oXPyQ+0{HgjaDAc6&)mokw#wT6yvA;D@uG z3sOh;AlZj-ED6pqMbW&pey+x1sH9bW9(zh5E129Fc}BQJQ#AM{uCcCl(QRs7Dwz}O z?3!2b9K-(vW>Q5Tj!X#*Cr>iYE{ zy$1>wxBmJ&#+zM@YVzS|_G312ATE5P_qz9E%5o45HO7kV;_afI=2_?8Kiw-5W3EP_gn>N=<%8deDsWyi_et8<%*k*7V*p+a@ksz*Co7n=_>%GK<{(BwW| 
zT4X%*1gU^p4L`EmUh>++XiMAkdq`r{hmMNlmS_u|hNSdA0x>w#ugaejBK4+9&yf#I z8?AXPKb2uy@2ZYRe9$77#-^)L!_Wz@O5y$xmL0q&Q!4FLPUgJwE@w8WrIuTI$j?QK zya|I(_0GFCvcp!QKNBnHFH)-`V$(bx1Z;h#^1p-p`fxndty`}zc-`cH-BP2l?c>=S z#oO#sQG-^JEULXLuaAhwah-JNu9MNZ({i>t|MDwpFnu^KXY@9n%d{*F?yPXo#NRbOuSC#MS=URCOj%{rr2M#v)J7JqN?hY!20QL;qYy<_l%n zIQz4LlmAqSD!#0y;oJIp%ko<|F;X)MM;)Vrs`R#;yf#kGn+r}++aGCleG=074Vom; zs)y6gD@qp+Jm@8;EGzqI{n%}&4=Zd#)xShY;&yU7TPeL5TO+gZrFGlVLwry zfKPq+aygImt=^00bS;;nQsEe-0T!tg{x?Ow^}#--Y2@Oj`fmuEKXHo8Fpdi6JX2xq zJD>TDoS$3(q0Iy2scYj{dJUf4_t2zB3Aj@ z65olfJ&*^u0bee-oqK zaVmePRhvVFS0av+P&oI1@VQ@Q15-mRa#_{4%HB&rwBFa1MsT&!zN%CTKf;sr(A7i! z;d*}V-Zq$Wa`Yc5CX!dH?sS+fvwSpH5ZtgE0Hj>62dsR~2%P&~e4havancJ7@)9(!HzSn4)&KhQAPcrl_V4Fd=`G-svy)sJ-zbn+aTO;TL=4`G z!*Uj0E-+*5<8Tpn?x$99`-*0%_Cc}FA(4i2^KF%HEc)4KfWvVvbDvRXaUMr-3T{AY z=^$FYu3<3Zwv|=F=GDEz{BvJ7`&_lGC(He$M-j6J^5JqIWtn23fjA*Byy-`d{Y%)SC z)q_e@6TdVyOxYF@YZbBFEt@*v#c2~=J>z8~r8~BLTW0fujLjppwH{l7p50onDFGBEn1ss#bNAfj4^=*IAV8C&=5_ zdyh6JgLKc1MQ%TncR0xBXs|%igzBq>1;jna%(=1g3U@3eTyh_Wt*q*H6RS!(nP1#X zI2P>dK;hoOJJRklp&$NhfLB<>*NV}J9Zz9UkE+r4FIYONJ>V94@UM(G4vEJjv zBt?oBRUvMof@$A7!lbV0i`ViTQHh<}*ZTD~aB46_!;f(b5}Yr)QiKkzFNS6uFmbRf z72njmBP#GHdln-&^Zyr0&vV<|`Hs}0iYtZF$#jA-HyYYYHN#XfmIAAMwf~4EU?>Uo z#hxfWuCplKlHl)KAe+P(W#b4VqQ?|EH0na|Bb>b`Rwt3`11xb)b>ELB&39BXHq-|A zmc4qBqX^+rnNBf+=jCZt`!?L$iMNzQwdHvcIaU#0^_4nK^+}s-H6)ez1^g(FTmo$H z?4qWU2dS^F(eYLtydEcPAb9p0+7!hDNa?76G`Hgs2nsd^e$ zq~>}tooUaAb8$)h&XyhbZj3wU`HfU^nz$7^J!=5;@ORy%`s;W+LC~G9X}dn?IKKo^ z?o>77dD3Gc>IfE>YHtancLS`M>CcfEH`|37uNK9ZgW;4)Wq@4VBLs)O=$etNHWLc~cCld^O4sV2I4H&g(eLrAY7lvyeLs z4D+&k42z7z!^7(~wfkDXKW+a?rXugR2guk8^>j7euVF<^0GHK(C;eCYJF83MXRFP+au3k$NUzrkH{JOqVx@GT2{)k6O&~#foXwh z{H(JiI~`SrUV4>RAnpId{9&EMGF`9xVCjs;+hf>%a*p`+$5-Yq~ zD_`ys0h$9IU;=Im*mMB!PT4F)Vdr!i+}23iZqTowgxyM!mz%zq2|zER+SRCrys*n! 
zaIeUzg;_R+r8TdB*Fxbys{&-?D=fxkY>gg}oJayG{?2mhmHgblMDc~dto7yT{@oM2cnWMJN-hEvR59S# z+^{=iTtHk7UZw!?4uoVcpb=b!yV6^roTmZ?tQ7!i6KHL`$nNm)d;|%}WCw5yMMYdG zE#IP=r>9!Tjpsm}QC`K|co&@yA(Oj3+qSz_-2HT6#j`1+B>+8qL%2-#lKMCy)K{VR zu28HhuqTnCJJ^8sGhuc@JOrjE+<>Ka1B$eAy=>Hr=ZRph8ksR#K^*RiobLe5lq_0FWLdXkOq9KO`k} z0)|F$G{EiA!!N7w%y}v5%$MdmhI@K<(>bOb8`Hi(x zsUKor%)jzP9O@cke@MZ@Za1a4n&N#xiSi=g&Vj~p95i&b2N@8GoP=F<2gVQR9?9@~bju$H^jo-&3mqu_nhoYe^ld6zwA1L-_WOQ) zPsuH;-EO6_9;^5O3?RDY`?)`v3eJIp-kf)F&g(_00;O(YMBH7Z1{Rf!T@N+hK-kdl z!J(do2iDjvP|E_;?cNbe&2j(!VfEOw-5eDk%{yrN{G_t2@$WRh15EeIOfbn!2_LVz zMlTrmfnG*i6UjAVue(Bvy?MIDVacq(_<|5dJaT>QJ|Tno1})o9gm^D85@()-=Fv-6vM5uQ^)U!I=Hxs=t1jMrylCv zE2&8facNV+{}t(4h=M-qzpQCq-Icss*SQFfMrXcbPvr#OkPipo; z9Bmp$yco&vU&q(_`ui|ZeSISOB@NlA;?XbtiMjN4wf%ux-zfGt8{J4%WgRPUxS&-d#gX3mQUN*tO2>O!#`1>QKr zD+XrHbkaU_JkczK7P;;mMHz92Hj+wwf0h|KxAV3b-#>Mx3m7CrU;#cNYGl6tara}x z^G|C=yCI1|%JnZIECTnn$Zul{`Dt`=*TxsIgmTM*Y(Ge!vM#w6}94zU{U+(&$t^b}6M-K% zw`R?^I)SAH&#?8Vh{@t5A992CV;~<|J{hf0BJJnbqn>jPzJ}*3zw6Znl(SuA%2x^JwpKV~GsM8BefL zG^ypC)-LCgvGVRU)B&mF;@(!|9$!JnCeC{;B-CDOPW<@3UnGjM3Vb)fKS{91iQ{yY zlg#znz#Q9QybRJFF{x*WZ5mqmztZ_G9EJA;HFe%$qTFpz5x}UF?I*MdGiCA=d_G?d zrtw-zdnEBF3KL!L(Q@qImn;E@rrW#glCKYQ!@rGe0m!Y^ev;P)8gtwF?v>=BvB20A${_1|z zz?fUk{isB=mwcW%^gX>O3&J~zHL0O=Ze;H0S8%M!^{ayA4j}r!U-$VD+8a|I6d2ga zFX96K@;c4iK?PP%cHwYfH&t)mibtZ25J1ai;|Su^MBl)Sv3xg-F@*o+#XCxzx9P4u z&pfAY7s;6^ju%LEjtp5kKal2NjdC;#YF4{Qxjf5m>pX8lsW6X@j)H-AH{g|g4n3R* z;JC7A5?wHg{+|A+d|tEQrWF@(H{WFI3XADNFw6o$rxsL&qf zi9nrBwovj7oYirn346pMU)xxgkP-4x*+1xH=FxS>1aoyKNh%;-(ZmmBq)TT1^0FK& zKX?4=suu^3xbU$_!N<;82qEqQ$A_K>TPf~>c3#e;obF4WsA?zaSW~<-S*%y?=+@gr z+Bo|ey468xYA|@ozNvYqsfTUw{@Kj66bF`W7lBN0?^R(|JsHCM&p=jCv@y$^Sg>m7g4M+uA~BqfbI2 zt=kmZ1t`Wz0A6qk34ngssJ4&(L2qwwK#;;Fk_Z_t{o32ASyk=te}DVj&mrDSJcL|m zz?Mh<*|WRE#N@VaJkNd-;We88x=>6Kjs{v@M^zZ*2Cipi^IrF;_^nZb>_k!bC~af! 
z=#CD>+_39E<1xDR4Y6r+32O3%7M}fc@F|7PZ5pY|s~i%zfNbn9m$sLc_u`peKKq>C~E`!Wk6-H4h0xadx3w--ry&V` zY4K)VgVdaqTgx|*fk)WP`P@drMGwqG{lfj^5`wYb`=(!}TZAvTRjXzEc{5Dchleu_ zl%cVn;E6s~7Y496;kF=v=q0UZFG?b;#mK_)8 zfDu$AlDRyx+HU%_K|rtx`L#0)o#()O49a4l_`^tuk?K==Q=H-45;E)?Grg%?tWmqh z)6>%-wzc~)qt|$Nc#5A&OaG8}jpV^=mh>IWhmoNkO`86U8(hDRt2;V6MrdfN?}~|? zYpE;TNgNkz^vX24**{+`ZPsCfJk7Xw)9mf*>fY)A3xz4_f6-?~?G0ll+N|7{YHb5F zr&4fgE#gw2BIR*^mU#GV5O14R&!1@3$l8Z_P1|Mh5xd8k8fkKl4Ff(nkNr2sxlEq<1Yo0g{YB^w4xhjLVKnE z9X+QIVbX+@A+P7%o(&-h8!$$AG7=r|QvU&miC4~OuT^nQ4)X@q(d~2`JMKv4Kt_~2 zbZwBI%7f!hda!x1BITCPfl4r_M4r%4&2^o8ReGyXggt-(JzrnH{9hOmONB6ag5cI7 zmvUQ|eb$Ug=Oy)TIn;4BC`%MC`7b0=O%Zo;+iovRNh)-K#5A}mkj}89F}>=)3bY+D z&)epO&go#C%CI<^&ms99C-Af$j1EUaoptG=@@?9H9DzAMIwv`%*s-KE?vzk2DAW4= zf5n^C5%m3`=DF@wAT;sDZCH#F;X&^>cMfH@g2F8`JvH zw~lS8j-BL0co#3QoBR3-h259c^*?z!n&trz{q1~tY^Nj5bqo=4@gXh$rv;#U9S!3- zuOD9J`it6Q#sBT`$DzL4FdnT(>JxyfUdK7`9aDFGvE|sdzdRq^E$fJ<%Gjgn{lDq+hRc=Jy9wT~3 zXMlCz1Fq*9q!9C+WY+{ms$6&Wto|pl-T?$$gSf(^oJu;4ptQX=^{u|xB zI901U*GjV`8T3L)GM!ED$Al_HTK@j8dxk!iBeY=Q_N~eQwq6AqA+SI-^m**G zIqDbl_?XeEr#n}R|18y8ZoIu|&D7x>x!{)X=K)EZjF7jyiilo9-X)#C2a?Z`j!fT4 zqp&j;ixt$fZ(l^XvZLzRQpCu=(2KWFQL^7)9B&r!7aoMv{~6TRH3yU@@drItk}b3 zib6fHe~_SuZ#ecpD@WIW7$k+6A-Os*kWY7#O@Y?~Yp(U9K&{zD%#tM;1}OPWm_3+= z+t5n=B=}luNcx>UM>=O$(zM*|T``^UET3MQ3q)cp;d@Ot+V86+gwP&N>8MDfE2-Tn z93NM$VJf=*$aS}ZRdw99stVccm+L2?ic#Q%d^^% zASS#mBqY>fQ%;qs5bI(!m)Jbrz=fx;qZ6kukx){?>j^v_J3l}FsJWxW;U!U8Fk*9b7c|{NdQtH$o)y%+M0C&++ghd zmIP=A;!G)kk&$g*T!P6HdHHMY$H{+p4nM@in0I$|l`Wr8KRDljNeBvfPlDb`+=maQ zDhdj_OVtL4hak}ekz8bbyV!B^?u=uJ2z7)C+B@Jesj9|&PS<7K+$zf4KpF~KJpo~- zrBwEb_jC{O8bCevt}G7JQV~J4K05j}frGgazKp>2mJqU z{rG=h0y5hxMfBFR$YK65%+H`^DePTKvP1}|1tnpfzeJvFKq-NuFM@-yL7Wp%ow3Bx z!i5KFxvwCLs9OZ#MJF`G<$=U`0RuxXw|9hL?gnOlP*j?M_kO>4*f*fhppJeBDguL) zai~>b2vf1{5L6HmFjsf{g<2WzZ{B$m-v1z6*yVV%WePhi0EL$ftbe!m0HrK~sXG{L zd2yNQDU2E6pqOd~%^b|4(7@+e_rzd`^^Ums#QtXwGpLF|bY(x? 
zbfqg%2=&BB1Ko9m$3h3C&+$C@Q!}$=kP{A0O2Gf5!^4B|Rs53)LBZN!Q;vGQ2u!#w zcyu0DgJ{Zmq>Rsk@-4Uzh(GLlH=|)(Z@3A>W1vaI0L!ai+W++Fy359xd+%fc<<14Z z%Szw9$-_hepxAV4J@{U}e2IE`V4SLrs{1BaP?aT&Z$WVjBB;`zyPkXc`#(rkwtycl zcHcFp@!nHYQC7=?WpBgFWO_kQdv4pla79mXsO!J9_rohk`N@WVN=VRd14Z2Zp zAOXq>@JB0rd(oeB0|bIw%&nmN`~d@X6)?*5SS3g$q@YJe#RChNZ-jZJD5$+XyeHDVKuibV@B=sx z*ULY5C!mYsRY(iU-yj-~V3fGVm}IwAT}7+~1_){4MK29cw})}UY9swBr+XlPO9g%p zrFaxQ-}^hE$=e3~k>hUP3$x#qI5&CB@lc#1aGpR`27)6S?j@tv6DfZ3b8iUV-s5Hcp?O!LJRq^O1tiVs{i+GCvUT&@Fuc_$R->+L=lSYN@mKajO%SN4$9U`PW1pB~ z#bwddJ%_6Y{r*;|httW_EFF!q?h83L<5H~`ma2uC<oPS9c&O%fW z3J5Vh{sb~!Cp_aOT=rz;`83DMkr#_X1zmz~M>&Wul1W~VlFwK&Ix;=4QKST)Dz{>ls+Dcp6^ z7*e@BTB?3pY1F-__G?o4P|)9CpTE=B{)rWYe@JO5sQ*F+v4Ub!QYN>eqVC~M#Qdq> z34#U;jEwP#5>`&}#mYucIHYVuyV6uos4u)NNZPC{-*!C74BEgW?X9iEXX<{bvJNUyGkO9JV)o?ZBuB@6 zOhB_SDzn2awRd+XLN2<`J`f=&2Pf4WZ>k{OPJa0!U@v=F>@DUFIiaBxu1Gb)e~NJ( zRL&x6gDwz;& z|6ClgHeCBzzh%q7#FTKC=vjy^6g z`2K_TmX`lHU?C<^HI0sB>9Z8LD4a!d0cTQwh=}=6Ha^R538UV3#oWSz9XaD8WT)F2 z+S+bnW3^2T%&l3L9xl`M_g4<6O$Uy$EV1)6wYCrXPRq-?@75&p*hO#eOzgEQfq`*p zX_6r&2ob%#0*|7c+%XFaq12#+ zhKY#`utH_fpMcfF1C5ZnLN)Lx5e+4)f~+H+KR-w)Ni{Vo(bdy)EwL)q>lVSUn(>R!Ak6KYyVNFONt{QO}4FeJKn75V^dGR1~O| zQ~0!m&5Nh67r`xWT_v~AE8_GtWgEm1x@0VLLf5IspBEOSe*3*EI;DT+O!j>B2_{OyXphsoOgi%FA^Qxs37E18qglz<&y?0mRuM#m;#8bXF8c0HJk*TpXvXD}Y2QI3? 
zGQxdQSbn#Nh)9*pUjRDLviarYcDYm+zI^%8%-Xsg!J1-Hr9iSWs!ibQ^0K$Un@U_Y zQ>*Agc{m2H-w`e@5Zw0|Zhd&9>F)IZ(2?p@*M@QZ72po`0b>5%-Q9ib?0bYAp!L^q zxu9oYASlkjNXP1MjxS-_?p1DZ3AhyF%8oY#o0yoMFDxve$$T$0)t#M(HDMjvAYw)u z-F@Q33E99O9tvLLok*`S#TJ&9&qKC?3H)F1hsT&aUaJa=g9MRg^AjhyVvQNZcx{?nX*n#A~XYA)%3{WJlya6QXZM(PWIh>$e^HQlukzGi7@ zxr3M2#MqdL0+{?0P~InO&74z}=V56_;3JfGnCwljfB!;5LnDn_6EtulbPZ{$Ntmi% za$gv)O@%0hDpQK^IT#-w7Zka@^h42W{8e*v1b(oL33q$Fk&#gr!k=w$|HFs>W^I-u zw{UDcRSpggCKL+T<}#S9GxMkr)PE%qP;zo|2yGO)(%#=c>#F~rmn}^XadO&sx+VG( zs)U0ZmEWfB>+^nsDS%^BPXKwU9XObBc{T5^B_{&{m_g=l3Xc_yo=1_7#I zjgf`TvR2mC)`*ip$mOvw%hzpzz#1DHZ}li#x_tTE=KA-4)dB|RTl}Gw00SuN-P~0B zI!OK~-s6&zZeUM?LO`fvyua9(nQ7_h(7@j1{+RI4M-1)@ZK`6iPF#LNk_cJMdOa=p zdRTuMh%uH|R>*@-c4DZ!{_EF4)YC|i7cN|&&}rt{v7>3??+?W|f-7_rNQuM@B~8 zLkGONS``$A|3bF@Lip8QXw{8e)YRA*HdcMv2Ne}s>D&MLhX7yVvg1s9Yz`w4&}`eb zlb4sd0cWkXyIWXF>dxq>&FtLVlal`xaBd%rzpt*If!cw1MB*_kkX1bZsU3Bpq>WQi zf(5Jd4-7LRUcOXu?o4jP^BDP;{=p_)ctnJOQFZNuR#@mraBbh@j1~No!}&Bj`&GVC zDi;@*b+Q`b;>Ams61k^=NjW4=I>;NdmXe@majz1oJN<@PFM3a(lR;8B;!{>x+2G9Z z*|TQ}%}G|v2Q%nehMV>d`<6X}V-F6hJsZv z<#_(@!fJq8K<&84x5skleXgxn|fSVmkvPl0HCEGLX!Oa{P$(9aj)5rySj=2tP`38#4gig$4KD2-)HIm7vrwz zBQN-7lZg1w6N+;4^ZdMoqPOu}W_!C~+#D5xl)u02sZA7j%ZGnLC!$;OuGIaVl!vD$ z{3TwswULi*+Xfu#dC{JZabv1vxeMMqFZ|{>-j&!V%|#+ga!IDjvNsn7vzA7oyG~Fw zx=a>JPBkcHvd^zSahbVrLM0NQVKmQD^Ar* z^A2d5cqS6aucKZA>MJ`z&hGh9Uob!>r>riY=<4k~rlqBo49ggz&XSPmbWvMFhGfs6 zABx9QB#cJa=H7`(O3DWg9)h}tH(-Vz{g-=!H$j_-1$6WF@#&je1Jv?HKLqxe{HaGY z*B$qZi(B)UrfS*R@(^i+C?8Qlj6%QVJ3Dw2jimc=aRMiT0O*La)7iBKGF*sIxD>5o+cZyC`#wI1b0qKQ82{t3hz%O)vG7$bA5D5#X z&Jpw`AP-5CO8J_anx~bOUr~fb^Qb05K`4AQ{RkE;S?S_FqEUb!(9qE>gXvUyrftV< zd3fCnqhW(q#5__3;&Cg_U4pfA{J21M8b+m)HxnP5+1eTh1gP*$httr~I-WcC20GGq zTnO=>8Mys4pzo1nX-aNx?#(}c2ndLID^br~2?%KSVeFj37eV+pwXx}hVybBN7lmW( zFH~I0$RxnNy8pO0WAhVEMhH@qLq(?~3wI|zb!?_vUS393$Ajd&i-Q9mOdPid-p;M< z0;Q@B@F*w@pNS;waS_GPRM{HL_VWHR|M4RU{UwkNl~h%UpA-~HdbFd2%PT4p zQddu}sX2t#65i*S{1>|#d34X%X%Z>a zLN2v`vg{aJ+{wv#6NvE6ojXvR-;0meadQ(#`kX|@rF><0tF`s; 
zh9b5*AqGtxlaPk{PL^pXVs~9#9Y^j6*Y!R!s7?g_#2d>l&3%1PiFm-E2?2T7pvmn3-`2x_Z_B&)U1Q9Z<$HAe>9s*FCIWtV>1Qv6^4(X>V`p>0!~(&>)^Nd$rUv7F`_bC0^?q z*Lmn~Y@#=jf6L+QS)o|LCy%#PX6<>BV`XQ@OmClEuE6EO0`z|D8!sU_CB3q;bn4c0 z$9f+#HT~~{8ds(wc(zvZT^~MJTdd5Y!9*z|U?tg=lEO7RJ8M3$Urg-&)`{w%JgXi~ z-?#m8G#p|Q66Qf-nUn^xcz(yf@piPgQ;3R-A5@j9donmUI78v9l2&XTE9jSAc-;vR zrrk99Z++$fx%1A`obQWW_4^j{Y=u2NJPLRGY2P&%T)f4+W+Sj&rD69mDe3b@3lssA zy5ep>Xp12&CJ$5cHXINqQ?AmTr*Q{u&r$&tQluvde-nT7Zf9e$DPonei=tpUVctJS zbBNyjKf8uqteX4}!x-4hMp=8x_U^k!N?=Ig3xB+-nZ_silRsQ&RrAoh9qe*}3bToi z>>e@M>!knDrj^|N$aPc4{!zuSvU@)p+nKB%NB`2f4;;Fx(Xf3geBr%7dmX<^n{C4} z9g2T*X5SB`J5IFt7}7L8qkMUxI-N7prTSNvO=$AaLLVjDmE5t7H={8JsKRbG$Tqmh z4oD}slB9ZVg=)`*-JtMt$YKak3*GZ*&*9h6wH1;MJknZ1Maof|vRoZ1Hy%sVpNzby zTOjBj#`jSpOvIvhg~pOE^JBSJ*2xYl8edv1t}QWLdk3}84;0>BqEvU;ocM5r?l~LV z5=m~*w4sv0B`j~iDfE@_S0{??rgo3T9&IRH*tO(!h?z9#uM*#;ZPH-IS2XPZrp4#7 z_#Ug}Hv3;}EOJ`b3jCjAU(y9OHkUmZRiQ{&da^Ie^1*O06A0+SyL{mL94yWH0N zL$iSECPPYmcC=k94Xir)A4RUM3P$S}gf(1@_iL7`)fb7p87TDBLa6I?)UQKyj%hB6 z(+b_!oBFoir9ci`quE6vGQDe5I)e-Kg&SE-`7i!zS?5otDtv%K~=y9~;rJjMi z3K>elYi)D-sRv|ztkP1>+}O7?f2k`rced& zO^fak{q%!xNKnm(`{f~>=_+oS2Zv)gpPpt7?lZ|7IDVJrI=}FZ7u5FRg)C5RLCfJMBDUo zg{rbrPc1wAv51q_jClOFL1oGk%C*_iIQ$`jYsB9O__>cOe}7aEchu7H;PADTUiw$H zewu^wOi_n+QEc8?t2FpV(W%Qw-_18F{Ve9q0NeEMILo0`?HBU_`&1|!W8OrB9id$% zt%vL-F)lE$3RoS8q@CXV@!bA7CN1AsTQ;eTx+u0n?o8RGw1hR%N$M6pm6ETlmUlU$ hzSeT?9JSf{htVQG=acpRAGje316>oHVl9X8{{uU+a1j6i literal 39961 zcmZ^L1yojDyX{LiA|WX$pfm_bNQVeYcSuQ#v~);IBO<6EQc9OP~ubw))i3q}Dx0RcvS5i!0SVge$JcePaHg$M4T zP>iVC@-o_9X=~p+^^jD2|%ebu@XSqtSsDg@J=!p-1|3|F~T?3=<9SHsk(AT`kX||IEH+n{GIa~f zy|yg;4i{6@uabuozf-ERA8T8is^quoB`@w)Sd&GryHBH>G6FPYiFPR-lz}|~ za`N`XBTrq`mx>dqhCf^^+9vL%3t4Ni%IW=__-rAxMloJ`30ruO_@`u7{<^p`^SvO( z8zH{4!&EmgxVc7>Uq0y-)|@K5+VOB$ZE=wNi|-^HA@*GhwJB3Ny|RGVjk&;)GHUaynXLZo`0Ktbg2Cd(OomzuL)6xoAB-Sp`7uL1s#;oG%|2>r~i@eRv z7N^=dK}@ZqiO+W!b>dw%Cj!-NCdqdSEs5Q_U*=s42r9g&midX61viL;QaN*NxC~td zIRV~h0p0}Y`#LhxLxc_XoE+|g81uBx%`z6;{rKA{t5NFjZoG{VjS`(5=%z^|FD_o# 
zTFTc~S{aL|B(6~VuS3k)=`-SUT9;GO)}=m5-Rs7Zp6*&$9$iy>@3S)3>RKEl-^w`#r33)-S-_Iz5VCoKT{RV|82JP z{cb|c0m--YezYIoKW0eHy^GwjiEzTP++q__{PYWp{W|Ky)`8`*TJ6iN6NL&Fhe39I z(v`<*x-kl5mmBv7Suv5V$&v9m=cOZRj^{jg@5r3;o~>&A(r42`$I)?e$lk{Ax?QSN zs5` z*)W+vW{1oFfk*%bPh3q$V8odEB0k9rDHUx6omX$p8+g^b@9qC$`{z6(SA;`yYd359 zGmdMlGOO||ZhBH#__B{)D;?_nRfjyN{9-U37ve=aSO1x6o~-%5)BSG;oD2&B8}5-m z-E6<0p06EF%rg5iiCsl7?(ZiP@d${Ch23#;yZ$Ymt!-pvWQgfmRAXb~_(n9Q49N~1 z0dfawX=P>nuiw61WMFVwR*xEM@OCq9@}su4wx;R+7o1)sJbOk?chigH+O=y+w{Ocl zc)(?0Y3Xu)=HJoP#VjOLH9A2a-FfgMNlm@jP$<2_*2X4+NiMvlw^w=9<)1+%r!nHz zx-Aj&+-uOE<78r*&n>lcaER{jzvJcY&BVrr4PPx5bI|$khC0j~$@m%?8ny1fG#}sB zF|yKOPLw9WyhgW!5-4QFJLOsVnU4t0y+s@asvC_^P9> zoJ%@UV#1k_7akG8!pW&(S(Kg4^7ZT2H@bx`tK-~Pu3VAkNon3ZDsA|K@8|FDajZ^o8X&_Oq<`aI)uDLNiQ|?lRQl+B5aaso@Dk=)Qel?*;QC{BmuI%bq zA&!%ilb5e=JA8I{cv!@F4waRi9i84iGeeJoL)%OGQPcwzl@4_ulUn5D;)a{IiJu(fU2fl5~)Jz)o>< z6ScOs;t~^gC3EUUr4NpeUxM%*qRpXD^YGi|@-Pb_tuUdvxj7#s0`?5QVDIgpGP#=O zKl@*xf)t`JWjl`*$$d&z?TbD!;e;bKuq06SfaZS7blGy-RSVu-K?3 zJR}4I1y7GN+30)Y5|44@_R=7wcxiGnh0p$PqI~VBdyPJ=U0v9y&ChC5^Nss6a)yR9 zD`SNb41p&XOPkNcP-friJR+l_f?vI&eXEx7Ixrnl)vrHn*Dj^}J*8i_{rnmfD`?2b zviU!K{YuAe`2MQ8&hE}mXmPRQV;8m8H(Osz_%bGabY27=^+nqza0qy^-a!yMhX;tre;L@PSD};xf(!dim0OjlHf~ z&+s_^rYJEsTy%N~pPb?96<8gOB`Yc3O$iACZQWNlobKJfFq)HdW$jx%C2X|Fqcy5g z`&|Odi_x9SBe?`Qn%Cei2^k5JyaEEG0RaJ#F)^X?%&jkl9QYt9N6KfN@Dp6(lJ@Zt zgFSK_`axlag{1YoJD4&p-|9Vup6WQ(n;?0$udh(LVr70FjesDU@>AmT=i#Tv`*Lha z^F4_yJp%*W!X=Q>WjT^9hTh2J=@jZFmOFi|y?EWaU;5D_VZ>YM=m^!+2GJKjrHkX3 znwskC>odY5-F+!|?a|qv4tykdXyEq(l`xAPHtsQnE+R+;eB`GPHpqxu3a36PNy!g1SJU>4_Rm6q)(Z&oBit4&GVbjqH zXRH;zeU(^XV_|c1b8C`b(7zmA4Bl8`+FUYhDRlk1)?bI4es8D6Xy7WmDAlarI7T9*aV0kTV%!p+;m=V48Gnz{H*$G^p6sQ6`9 zC)dB!EPV5vRal>{x(E&?AkxRzuTM4X;}E$V|9EiuSiT> zJVnf4AX}TmWo?ol69>m~W2Vv2v!A2?+qcV_`Q&hVw(xvUmNW~;^c#JIE^+Axef??- zo(#u8VW4wYO|84ZdmDx7OX0l{1t}PBuT9 zuZ1kF(&hpw8CkT%J6Or#?ZS8OZ1Tx|9vpbZ&`Z$VilakA^{0uF!FNW;Ghh3YmaCG& z%)>+6)YL>JWUrmLFlsNsqPCtwr^5^VuDT7MX^~(3} 
z-_^4fi5`5bqwqi4@tzljaAiBwAPVQ!*48$+u#oLxtf+`RUF{gXI#JrJ{eU;*?+zD` zlai7O`V`GH`5RfLefS{y68z*%#dLuVr*^)k!V*7TaM0$abI*MN*yV+pM&HiY3(L!q zU{P`$$yss{#8w3le*XN)%*RL4_nbR~5qEhsp9~_A&Kgbw%51#IK%FC*O8n7-#E+b) ztK|3{ot=&DF}xvv)0ocHV8*$vg>D+hLdZEnUCLKT--G8%S4?LqL^Jyqt@J{@O9IaD z;!b#8UYH?t+r9 zka?u+Vj&AWYA7z|KHA^70MYQ_x4PDm5lx%&lRtgDR5u>dX=-Y!zf--uHeE&F)Uc)a zEebHk8?LmO`FMef6jBBNTt5@O|=8`>JvJ!jB(6(pavquM3Hx3EGY1bV%rW zunG%bs`uPbbav+Zlp$rh=Ecd$x%2xQEf{d4&u(yRY^>@6e5^HsAt2JJ=@1(=^iGZ8 z!BhnnBsS$Ir5DM`L-X>kLMU9^4-^vqTT3WyxVpJj?dcEn^bAq0E-&XL775#rF6T-O zY3m{}S54|PbZ*WpP0XWUa_iEiOUpBj;t=?b_STIa7UwVk0OB(XQmv;ZhfrPZ_$7RQHju%v{R@+qa!vT!!mLvm5@UY+hC&Astv^+t|l830)|!JZ3C)^dK`KwgpZ>C+4oS zs%ji|cyJ&M0lg=IiPc^$my?pmI7pkL6B4bNrDX_mcGdyPIa!`ZuOiaPsqOWOuK?#{|1xym*mmP-%_zR7XusEqf|hoA!fo9LudZ zX|^Qu^Rv^Q{(i2#G~Hsu#f_-3S{F+R`9lU7lFd#^Q?uV+KY-J1Lj1kM77`KBtzznS zytnSP|IJgQ{*mL?25&ORIb-G>b>08&`lh&h1rn4{mw};SmQ~+#2uIJkjoPNCr%@Uy zahmxZ<(A!$s5kp~n;D@XoOFW}wVgEn_U@|{CgrzOJ#{cb*?*thZcR_*?dO8M)KeilMxUJD5w9xy|lQH6=Lu`blmHF;J zycuv(JL#5|p3d*Stge)+d2nz*)qo-U@FCB09+N0mmFLpMx(jgs?h8^UK}Wl*DTmWg z#o7*MVf_oS_iC+*_|1`uAn)nNfq~9}S2u)&F_qNRCa8=cFu%9^teD+hpwA^N{iBWv zZrt9^l=KmTTa54iZwlWys9?cX2_e=HUlyYp7(AbA#en>X9(aCwL!S8}Qldd@OMf6l zfU2|}2!>Nh<+tRx^9TR0ooZ)(Ty&LJRJ=54z58deujkXJajMm|wIU4zu%(h#&X_va z^=TonSNIQr0!s0j)>bqqU_-%3UpY19SLUee1st#6h@rlY4L$$?Oc-K#7>`Na^4d`Z z@nu=?Wpk+fA>3qt6{x=2wfhR3F^k#C%F1rNyg-{ns7v_4q{}Nee5tAKN_jb1STG=CLqHhSaPR!r{A2hY?c^&$Vc_(5X;j|GNJtq&kfH+ioK*0mL2|NWL^e_Lwj3=<=*;E`0wEI8OM^jynX9 zyndrDS9TtxAmHKQQOza`s=h$2^8*a@&(1Hc|5U4Bp5rUq+Z`7CGd`=_V4#Fld@WZ! 
zk4D(wfuXK%hfs<_!2eg*ZZuR4!4(BAo4Z(?(?|)vA_Da%SpzNAt9lqEle!3 zcUZXZ2tZeNK~0Tk*lKwJr=6YMWQ`L;a7c*D&a%djpFc18s=iZApM#552TV8q8vb-t`1(3R8t$eGgaH z5hYz{9QEvcidGOe%?Lbu0vQ)69F*?f<=&(7^z=j;IXJ1!&CMaXXN#*)Tq02oY9Mnb zr}&NLb9%XO!fX59pFe-TNJ4@_9h__ra&|;a-wXX)HU+J%J&2_*geJhyw@BD_NIKgw zIVmX#ElW7yUjRi)erT@V^Auh!(#Tfa+}=i-$U@y>j?QRQvT_*2cqXXV?5k+N&=0n{ z8HV#TSQ`7A|2v36o*C^+TYQ4t??oDzX|aT^4o7m=BqTv_;aIR_D#n#4aL>^2a9PMm z;C+$N(VTm9Ha1HqIaQ8ds*{=R2DYJTjiC|3?~c2Sf$D#C{FTs<#XW#xi{u$^scR4p~2djzF%mNNIXYqycKs}^WcPBrjSh%=$+rE4$m#bh~RkuuO7z1DZXq)UW=fB@DR>m9*nNkL7G2URB=T=6Sj@}F~a zL60$Tz)Zdm+?Wmh;JxT~T|ZNOE<0Yb4J;aka%w&m2J^_=U-!FkSu7M{Py44&#NZ{# z*5dQM-la^ir1eAk!a+Sf253(C$ET-hUcY_~nHFN;tNXC94<4(R!0{j_I73jg=+?{E z#QN{0B^RIR?ZpRwd<<3r-SZu5oN`#`oive)5CLq5-k=2> z%;PldO*>#eVZC}47nb4d>Di6L;AfUV)k}c%_ld&*O3_H@{%t+GkuWSYt^1+2R=d~Z zSvV;rWkg265gyWG`yaWNFK}~lArT&8R>Sruaf&7#ExY?xR_0wX)RVO?EGVR;hd*Su zR#(PGM~48QF!K4c3-}{`z10JagQw3TBMD_;x4~ZEmQ_OUOQF72tT1d2I91-g?P0m~ zUr`dAnq4&wGJe3k*Y)ey-G7Z>K_!iWO~|L;W~~J70;Ol|qjj1qqDz*mrm;Uc z0xGkfAJF7d+<5qv&z(x4qO!8R;|baP^0Muai1KI4Dgd*Za|p;)+**@w`e$UBBUK0as7DcnZZElTf=0F?kJAu%aQi)VroQwD-v zM~6IAs0G@?%LowAN?*t;WAqNo@~z`(Mdo6aE-j=aV-u5I;f-{0Pcv)la4144McKr} z!~*+t{UuBCv^k9H9}z(b!Ryz}0)89MAf@=|QT2?)XCcWIl8f_GV$DH8D7lEt*0whE zN}Iv)k5$?n6j@nWP*uHs`_?|DShtu4P?MzdRt|t9YwPMNy6r`Kzwm#gDl(}b2Ygvy zS~}<++2qq66*KVZm_II^01*Oi1$$5#0CjBP`fwv6lj@9-xWRawe2-5vhhkJMz!yAEA+5E~lZU z&i9DTDl9Bij30vo?nG&jX;`r(Dj~qegy;*vZ*@=rc6A^~*IM~BG!$UfdwYA>QO_gq zJb`luC^_pt7v6B1!9=CLdgZWQZom93FfwunNpu=mn|oiJETLaPih6j2#E+Ixcxs~o zCV}rb{hr^QN?u;^F>>V-tNE^`-ANA~G|`V+Jr5@-uSN2{Y{Eg+<7#9g5en6OBiRbaC@q#B=FSdXxkK%dIl_Ie{in_^} z)|!y|?Bn$fm!I384xBbPu$Y@)#^5>dS}5twW9f0K*nU0L*Cy(4zY}h2cOR^*rHsxoJg?Lll5ey2gm|dRl`0Lb zr(y4(8KZ)}|7JB-Ey8DK=R{eEh>V3QSkRG(}}(=q5!!&JC!=AOrf&P8kaBQMUY z%!4J)`pi@Oe)>6j-$RkC+>V5eguJb*Y}Btqj@#RnF;hjenku31&8Au?b_W#79!#T^HoIQR82lPjsK#UQt>Ck5eGh}l zl`M-vRt};sVzKJo<8-=DWv#T_^kVMS(6pOyqj?ClHGK(op77*9A`(Xh<*Lwn>D0%> z_5~=qTU>ec;6u=JEyebUkQ6A0G)sBx-+t?;IzE3JMr~96pwlwu 
z-hnV3q0i^k?6H260~bS;RJqvDg}(BYJO2H<8zCIwgenN<^ZwyC!%{Z^;L&{W3hJfI zk8k7LA)pJvwm>~HDfii+IWT@1T^&2yx64PNiuxxct1}1azfF>M&+wyApO?Lc@sAyC4Lx< zG`1?2Np$}f!O%df()bJObh)9PPn)XTe82yr1z0&{b)DXq{UR1_7;hv9SPC=%s~)P7 z?%)!-E8c#>c4TQ2YpqY?ykBAcVBA~L`SCs@R;P0M`y(^8oudP_n>r~!jI-Lby`23x z$4d-ip~Sb`eaU!2{@O6>da==k_9zBf=bbN)mA;4(Sw5$eqd`f>)h^riNOW8LPJJGE z>dMUK?{Vo`*hY5pi(1K<(^>cT4etrFF~%(id8Ou7H;Ct(a?oB1mdSU0)X{&%k4{ie zJsjsDEScK0?F67CZo5{K!p0{pm#dx6p8MN&Bf(KnbI@`s?=bRJ1I8?o$H&rpx##61 zA-VY!kPVu(vi9aW~!%^LQvLTWA z>!+yt8l0@cb(04NUN;0>)hs)MQJ9$63w1Y%mTI`#+KAYRE(>2#Ej91@nwdvQ8+>X$ z)tRCGace6nGB1bx#@2f3RIQLbG(xtx*RMKz&zdV8x!+ow$!;++{jNiM5p^-mC$`&c z!pyii5#xPjKM}MGvI{*p4e;*-prqIkUx^RjFL&vu8i^6gkly(>x_+Yj=YmTS$$t*gs08N)5NZ$R0s zwzD=xCNXOtPGf?~4#x7LPcQog|vy2FgSx`hdxk*_5Z#P-|jnrB;1`zSA_TM#$vu*}V0y*Bkh zKG#gQ=K5Q|5o#^TYknOVWRK`&&X-psy$iy-V@mlle+CvGb_&h5oLqL?TJ941s&eDj z;0OPqS|7XdF12*`;Fo|rX^U)bxR-H$)jxqW!IiRyuvf%h2FYt zNr#)a{neV_ZESoxY&&!K3#&{WJ&VfV3t1LMqx&n7N4#!4iK5fnW31fZb0=Z7SsbM+6{YT``I@B!{D;-4!k z_|PUmvw=cE4G+wBLSo`$ND#QPt;P3%mIdznp4R{{^6gVo)F4F?dom1dX?w(liH}u1 zB1q9i`s7DgvYh2ex#canh&M>T2-sF**X;84)1;)WgDp#aL&LVt&c!f1T|d8@fRfT( zzT8LmiV@fAV3X;=g9qx3bq}jEW1l>cfi}e4GW%6AA@qKVadi1f4p!FI2zv!+;8WES zJyPa*g=TLAfMaLJ2|V4U1Q0efAt8aXMDG3j_Z@(aYXE#gdGB*y9{~`_;VCI8CGEM& z_W>5g#mAQg!0Pcvp*F`&??0HpsbV$pl0KP(^}B6nIjBLk8CqD#+20>EZCT*X#m&tR zO>qqE4ORAu&+NIHcsFj`_}b`8S^sFAV^6`%%#0BrK6DI>U2^kZ2S-QDP{;#H6Hcrk zr>RK_dj_(zq8d>HWQzgTcMJ^;WxR3)6Uf0^j~+cbna*@Y|Fr|H!Fvn}|+?1fXB8%|=+id0QwR4N!CaFRtSNQgujB`U|iQK)FOjM3x1| zqilgW0C0o{Ab;TIZI+#t5Z#DIZc2K3EYc54*>M|NA%rEcF$i2dw>jUgUujJQR3}1Y zlaK@fc>q)bjoZ(D{vkE#WD^ji0Ki-Bv8s!b4#6gJhAli<5Ed3rmGG^8oK@%rXgUDy z_$dANwuzrOM(*GIdp~XCq`3!pxz|tPMC7gAGHvEzrUN;){-Y ztD4@s!kHZ1P_PQD3uB_vX7l;!1sNF`tgbt+gzvIDT)&G|SencLL8X8^s<6Vx+dI={ z@HO(?05POA$MJy?0JYE#(9WO261c4b9FFtWcZIdt=8Q2t4$82?60oRtU`a?wNj1$( zxC%>i+gA$h5E_|6*RHTAj6%m_V3ocMBz~x(rl-FtO6qtFDlie`-k4`Xf2nuH;aAy_ zJq@z9u&oN6@iyp0NvNpus#adUD;m?YTzhppTu*y!ti2rrRTgmS1D16q(9f>0lsT~{ 
zXT#U3xM{^L>1m4j+SadRY5Dsdph^jQ^M(;wM=+3@D8}=?nVEaQ^hcI?Z(Gj~XWwG2 zyJu^gHw2_TS`g!<=rB>YpSZxf(JTP>$B815JlSm5?Dji7@{lRkO*y(*@RgYU=2a1K zaRL55E(}P$e(~bkC~QoNzaQ{Sxi1%B%j$^hWv2kZK}iEn#OwU!xjbk; zz?&m&`^N(zd?l5z%<7JG2VIGEnn!!PwPsdK*u1}8K#{i(P2`}Bms_i4S)ZNPl} zGV#_iAmF?^CUl{&G>3q-xbCOz05QX4w?>xq1o6XUZ70hzQPPuD4tg(2 zJpTN+J^$lll33K8ePPGxD`0pXL+0eSXNxhQ%ht%hIh!)p+gs3*R2mWz!Uw3UIne%X z5NdWw)ovfxU3$mmZ%W|st***>zAa3^G$O`mzNv&$uY^lyL&IdppSvH-Qb|>n=d%#U z(+c?g0r2H*-3)<6**)%q-QBlviHL;UsSK^|-^ZPqnej4}O3un|Nq5L4Z2 zn95UF;>}}R+p@w*!6_S?x$0lv4W7U?8~_ma>5`FY$@M9*2WD1P9=iJ7G;H zBRUf|i<&fDz?CAOkg-4i@So#P2FHI98Kr5(UF7`N$&IQXT+ZRxnVU!`!L=yYe>ZXE^lE8=!|OjKO;Pg9 zkp9z61}C3+?v72KKa7=-Y#ed;%)>nL;pvVQD8343jw+t(Kp>f$pHG~f@PHT~3mDU< zCGVwlPxtZtsxl}gKrr*`q)xjDIAQg-O(mVz;+smtOd5T3-c4K=o((*2{B+cGX%CD{ zz>!&FCg#wE6>!GouOBM!Es^5|Gg5JQvt^AGa>1kSXm|{`&9`AS9IX@r;nW4lSFK-5 z2ZYmVr&)T)#N!6AX%K?=9mb1J&c94f&i(#YZ!D=Uiywym`}c1obs8d?2halUrfP5kr=dan4EABfPe~*;(P2)F{HOW&WHR@e0QtKO{`C2g4>ItB=TK*p}BeFUZy z{`cc!?!W5)x8b}WaTvs4CQC|AW<|6I2Xi48F!l5%(AV8ih{Lpj*gNd=J2e93+>xRA z=da-8&EbrK^|>ygaOvsk+omdQc4^wB&QDlSBoJ`JK%D5B?f{UnEcnWYPcp1Y&z@ar zIR0IK80ZGXIVDI!GHVVe0NkE)n(69#YiuM`Y*48PqV>f`{{BgQi?R@SuuyFvt?2bA z(R!zf2T_Y-LbTTci3>^WYJ``=haoTKSbF;#`0gxc4yPb{ADoe48n)7BsC%@sYMAzC z?bt9CYAPgVxMoD$TRUtj0ZItrR@`UAcNBJp_pA7zvSAVx{b0N_^bWax5;Q`g@lt2C z#!FWT7#J9s1O-Qj%x#ks6Pf<2h{i+HpiaKe?Es+gP^FC;5c=9LtBN}Le;7`*^}0=P z3rL0=t{UolZ`A2bCGVFQ<%qj}paa>IPx@!U*SDXz_m56gT)CHPAD8 zz7QTl?9d;oUR*@2{9zodi!h-6*^gByoy3m4DS7V|B{?MHAcn@p)enZY3#=0`Tr#C@ z73eC8GWpufK((WGj%09TNPxoAiR`oxd@xB;U0rS1z1T?fA~7+uW>R_K*XMVrAh2tT z?;TID;}k!`A5kb^=oNv&-xYRpcNat&7w{qGZqRcuU%4XxuK5c%Vm4@jkSFe|-|+VF z$#R$|@p*Lv^kdZ2)KGLq=yQ#Wl)y}h^#Dy{rc=Q+tacC+nEwj9Lnw=S%Cr?OeS$TS` z#0C**0ax%H?9Tp5nPqpJkh2TWz^u-3Z)}>HG@_^3 zUWfv${_O17>F-@(WJV%U>Ap6@4~H+++)&rs18=~=wNRW)hIGMpt&pPOhvOgb5929F z=lP#DIWZ3DOPaARCQ4#qVNpybgem`q0UHswpE2sbH^Patl}rOqt7d8xc=C&ixEh@pJ_#FYZ++aD-@HNV z!X*>;-uD92FA<8zWTv+~_!DTGDA0^b+rM&@QdiSVfu8321EHTy`LC>JDO9b=_x$V0 
zmY!B5(q+VkpT1iNk$hN&hPwYhG78KgGtUd&rEIqA*$j)A;onR7F~IvjqWd45d^(e2 zX?-W=>ldEz-4d&C*ljBkJ)L8|HrgS1%C_e;8LUFYZZ^+Ixx9KK-FM*z=6XF^t1PC1 zHOFf<)VlTAg%A@iwEHYS{08Mq(3oiZi=XRyRINW=w?)q#A~^9D|@)y11APgEDV$B9G7%IFmk3(i+?Ri zt^E6HuGj7gdjgM0RGMpac(2Nz$D+&i0tA-2C2a+~m)egrF7R7)vYzZK*3xVR7E8}Q zJUi(2p!>PZ?~N5S+7~!Cm`aCY)}Iz@`S67OYZ@0SZ`0Y7(fV@!X)~sEbmsq)#grNJz`U zeABitDwcZM&es3u84Bgf>mUAbf3{`lh|On0?_T#5Cv_RadC`ph;xO8}J#TC4g5I0u zeVUTjPlCSQ!*@WHxwg|bd6ez6;2G=e$lb;G`rIPg-d23ef$mR_LZgAjiC(G3g2o+z zn>cF05y`kgEGyz)n7Kvqj8FB3uabDI-2Cp)5aFZCj2cs)ouYBMKra_&qo$e9E+yB@ z-_7SW8{oHWpfI?#_l{9NHh9CcVpGN6BR%Uw7c;Y-2zd79=@f|O?qU9t7=w#wNC8ocfl%dreQ9b zB02&FHqEU)agIYa{sQ+_*F(f~4PW1U8gF5giWBZx`1TwX`xM_ZU@}Z=myOZ+j%6$k z#@M%vveO)lw^(s9A#FIZi`D&N5pNRH=f{p!e!SlZtM(`&qP@^aUjI5{KBX$%*+tP; z@OL1tM9q4O$ORQt%V*Q(_G()rZIu)1=(q7*35g6A?NjXPtDGVp!X2gBawp}cE(b%Z zegavP;Q@5Pw>Sne3nC)Q&x0|-)h-8#`8_&SJG#bYvQ56hy@rSLb2oo6)?kuiQGRY~rQ+b1Tm!a9`DUV^dOWdh;SVN>fhH@=tYve*M0dd!FX9 z@NJ3sK{m(W?UxezBjuYo?_-v$-*2_0E??%URfegX~(h^M3z4qOm!!*xH};{>|x^Qw!W~v z7Ed`7LDNN=dBCAmSikB0+_ z`oRB_b5V6#Fz?U1?N3j;%d5JzxH9U{mnhIju2(RXXpjkYqRe8)$rLk_{(!7jWyIT$ zn%ZEnll^gHqrBy_yPOQ~tJQA(Sfsl8r|AX#~rH%dX_mg{TKV?)7nlN)`2=5W4` zs3dYgR}S??us*1CP`|ZO%hTQ7`lbtd3v`}LmuQ_#oS^No9WQ#3**??@>i&&#>J6u; zJbBkd=LHUn$}{d)dA}EaOdUJ_5=YG(W}sBN&USq@ayJ+SZ|tT#D0p>Xv9eTl2DUp< ziAspZIc{nTce{qtuDv8k##zi}zL*fshmml+i5hprRS^5rs~P@*d-2z=gud&EyT;Lv zzlOf5rFr#YKV<&XN4}f3UvV(L=>Os^V+P>6H!lM>HtGdNf!=_kMOz@z(CDvU>cWl~ zybgE`bsqA?%LF1BM{AOKb#53n{bE@gMn1{`KWbcdB0_5SoMzcOtD^Gx%Ck^ud`K4bkzWidj_$Y{Vy|5q7hF5xPH7bv!LMHe%bK z?q6cENhNsraP~sYU~@MvyTqwbAKArnCsxUGPVCRVU*_I(5ti~Ck+Ya^qsAoYjwR?W z@Mg3*Fm3-)%l|X}4%dt_Sl0DolZWK<0B^d2D@H}BheQwbR27cfRyUp~_l#x?=4V}Z z3OfwEW!XQbQhTv9d}+iV*O_Nj-}HNQkf~T^Yy^+grUw5Dp%;&vuiLbYoHXlg+3WCN zCW);ayv9e<;jLekA)vstrkPC~@qICPz-o-8g~U1Yeg z85X3a(@!Y5wG$)fzj|Vw{U(j>qs&gQ5TzZ=)qij;a&DMQ6J8$(!J@wsvtIq))jn<_ z?Io=k;Ru>bvd`vNYw2Xg00azL{ z3rqWuxwaA$6Vr1;*_Quk0YqGXL?S&vifOy~1QU8+ zDzO;9q8tq?;G06A<@)8J%2sw?8^P)+Dfip>p;LuXS|3MpQ{y{bFzWT`s^viRH72Ik 
zv$;*0gwHN&%B3q+0mCR0?$3-~YC;G5jj(xU7m6*}<<3FXbq z-XLBEqDI2Y=*daYcG8Mra_Y-~&@e#czRmcCc`vxEo7G8g%9G z>?joIWaHxEcpI#NU@0mpGPBBk^>zt*engR{T9WGk92NpeCI%kcW`kyWR<>9p8@qoX z`fOY8DtS*&k2HulJxERA>wtlwqoZRaP4-_(2U#K@g2*Ho&M+DPhJct3f2_JQ05p#a zL{G{9=c~Y9oy$2Q_x9VrA}&q~$lKnY)fWz363m zGO%k~Au7_D>d26fkp1Y;$K4>CFTx-Q)VG<)j)L(pYOF*hq~o`j*aHzUs2vg9;oKKd zZ3AOu>NNm*T#Js#i%2`FYAK|M;GuW!^b>UZ!9*OqVFH0GqE6qm0o)i1hVUI!)#FyF zV%I_LiAWV&gR#PiJAVN9OAO2KolgZds3qtZQ{gU6r@C%>RZECM5oon;(AD?SeE}^! z!e$lfl|DZ-^w!kUg6`OTuTB37X(r(OF!=IPwd9=&Af%svMT7%FPY@Uwn0=Nbr9*ee z>|-7XkqH4OlLML(wl?AEQ*=;)1F`>kC{yPuSdpp!E(OXAKu7Gj@H-#Dn(J@b9z-1&?1%ZJ&y1TPM7Yw>am_tnE zmn&tMDgfy7vO82|=RZ zu(Mz*5hXmz4A7CXswyGpXA~+kzXk4`?Y@J-&v4?_t0$m^N9bL6Xk;pnT8u6UrU@LE zNjSJ{Y;5ErHf6g`YBh6}Dbgf-fk~&?P$3ey7R-oDDtOf? znQj0&g8(LyIzaFPw^tUd80PUquiJcj15gWV)81lV%KZ25S(W!PDTYDtTJCq~ zN_bg}9DpZQNnX(N0$izh{Co_K)*Q6kJ5Gl$1VAJXYHmzSOtYy9E4YfE|00N+!|LmO zMHTn|06YRy#~9khhGRpS+9F^AFrAAHAR6Iid>xq5`p5{=%z$blU=hq(z;J{G4B)|R z9yi}ZuwY~y0@!X68k*i3CKVJ85^qr^H->wzU3M<4^1#`$;&4b zrH~}d0NZW>CYj1-jt;EMjPHg7ay%^F14fhtH;*x|Cryk@m0RLKn4Mo(2m!<%&Q*X3 zAQvG}sD=|#xBWWO_~i&R#skmA6i2LmZE4_|>$m*t%f z78cCH55MBhjiE1Fd3^7{g9)$ahZ~E^k>90&f&=bPwu@97l=U!^{z)9$JA#DeoS&Z` zCSAw`p1*;aa}qMLcJXko+_!Hn0Ag;iOu!Gu*JrK;-@Wrv5KqE)Pu|s4py}W{0V0`3 z#NorUFja5~=4T)bDT1N*rL}|B1N_YD%vY4O;Gh}2M7KzvNt>&69o`=Rfokz`NW(mY zKoOVkm|!4$eg~TZPD?P9Fftvr@WRbFCKae{N}GR>Xx#brGwAK^gaF?=5;^f*8PBH1#dD z4j2wpL83K^FW|)MlN~P8QP5kG1lPX4zV;%6%AtHM_M3rcbV$ts8+&i%JH>$~@wZSs zGUc20D)*-9J$Wh4Nwd@(Hh#vRfLLS+nH2#`RLl4zk$KyH_O?Vusx9X8XDTp$gM>q% zv07SMP{?o^3>Lo8=1^%Y26uqynGF$Iq2z)J{;ketci=T3J*9gRY7F%>zlYIee2{}Q zSoVxWUSzrQ&9lUy5*uE9Fn>8)y{oG$59S&395vBYdi!T*8DO;v#>R9o?goOAao5c1Tm(77 zM{D|MpI2}=&hU7h;!Zvfz6vENS4UG(W`(H?7_(`GRFU}VRrL9B^Labe5%X(n#wF`N zrJ7;(dnQjKJ6zgsxBXS;IRs*_ZYG~AjEtzqj~|!8R0O=B!@HX)bonAi7uCY$;GT9k z)?t_d{|ciDPzNGvC`Pp}8u{AJj~*$4%Ye>dxX!}?ZKSDp734?o?h%=4cAe(aJ=&jx z8Sw$9M|jmrUQpoyBgit9Q3Rq9a{%uTGfp__V2s`p1~v}XYi9X8e;dnGzkm^t&`?a6 zF2}2o$oy98f&p)+0&R)V#?^{!&3u@c%+=F+5cwMx8r4bF4jk;(LcA0+WCVzvVWp+K 
zFspzy)l^e+0|Zxi7cN{dWNdf_QK>+$G%hFSicEB%Y!osdhK$o;V`BqHrU(_N_bp~v z?n_zesAq|ZXkhio3@m6Qkg5QPEN1}K}ktbNu@&skrpLIN|cgL=@gJgIt8SZl13>>r9qIC zl9JAEJna2@-}8RgIp;e3wYQ7)u-2Soj=0A??wRx=&xzGHFdSDqLj#GcLj^&V>efq82jS-h=MEALnUJ#iRFadE`$5YE31L8}qjWnW zr~98?9Rme(m&6nlS1ppZ882+guGYJ0)*d?-%ve>|L z+%}r9g`YbDkqE}s5RDfaA*;M-AO_KmcOt$*@(ppd z%4w4s>L+Bf8R&F6zaw6Ef%^?6=S1~gt|{Bt-1OEh8ZzHhkvH6uI~5ZXgQ4oS+xSe< zsSnT!a_^^#he26013m(>jAjVSef93H55^|J@xq`nGBFG@WFg;8W$u2Jn4X8+uvkvk z;l`E<4il@)L!9kAq*`KGA;)OVds7Ih^4&rK%DA3yK?}(>E zBkop}buJy99xnkqm$$XVq=UqMu@4OxQ)bXmgz2?vvSRq3jh!7o7}bRoN#=$A+LoMt z{{A(i&UGm-t<2(W(yL5ayheNY>W!KPpvJNI^YhKC_mcyWR01X@!Dr6ya{q;L%3n6~ zE+mG(R#&@NU+amvY~z4`&zVULybcZ>yawXX)9$_o7w=1i67)-J#>K~%fj4ZxSRBx} zj3MdRUoI%Fng@Rlb7bg@lb37qALeF2m6P!?+w$P6Ctw58?5DtbIdvIO4)({V#Udpo zo#fk=JZ(P%?HxmF>kof-YaG|ZVRUY(_FY*b6tfn?AG#qCs{Z`B;fajjn=O8WP`*i~ zz?W3UV2n-so3oKIF@9h)h;#z9?Dg!Not+H_@*W_Qi?GGRT(;5Rh~bQbqDK=%T*yBc z=z&mDVw`L1N(X&O5Qxjcjy5M1K3os&oW6IAF`w)41qGdFm6)_2$l zem3br-};3ta9P1IgQL#qWH>nc*JsP)l9!N6gNONFd^YefVKIF2^V!$UAia|Uj}9$! zf0(fS8-9foK5Vg;b=r)+_^Y*vpmVp~31p)GxNTBE9sYIOfB-?xyFe#Z0IaxaI6Ae) z@>t4DM5<7dU%eOa6&i~7U|d;Vz7Zs6()!H|3=O@JRROA7-(H>5COX6>QingoP+0*i zlm_ZpU&FEXmIp6^ng|iQP6txp%(X{4ZO-`R{i%oFhrh?m$9ET!&k()LK2TeL!Ck1E zUk6@PFa+!MzF|3pQN$@zqZp9%XAUUPY*m+q; z?W1B02co71FV4`(yfW>INAr{PrWP>W2v=i+_{ybUuNS414YE+%i~Sd#^&}G)Dc1xk zA;gRp2p`p^0P-{fw<*|7<=(x}TZWSe*%H##-1a?$OBjV&n3_7Z(q^^=#120=v}8y1 zP>2~qh0e^vY%~f;8>BncpHh>PLm{+~AJrp>91t~vaDs4{)S?p-g8KSYAB<*7) z!Qj&D9=jB=fu`K+^mL{~LGxyjV6!ZjHTT1%TqN=sjE97UeFbs=i%6pD?y3%OrqX1? 
zXlgUt?}IAZ{S;A!JxQXwead=S%}~XV19=;=f^1;kE4k2V!>lgwSmGV=GLSuKf>nXA z-wCb~BI7jFZh!?K6S9+h@|brqxxhC_;zW{j7+ggDE3goz88(w%YZd91|Z4v0kpL^OWH<9*dB^$)7=BxTERGVtRG9dW^cx zb9SgC0TE3_J2D#^{ExcJMy`qe>Pw3hK`UglzM za}V%Ucv7UCgdzy(^XL1mp@c?Ip>?XSs#naW-6meWdd;`a_5zl%I#0!&qbOY}zzF;^ z)6mdJpX*N*d6LT3{l7f0TOX}Pqu_5qTJ0?NhOT_=+snN$Z0D!q&cQ6R zG%t&E(tb=YgHT_K>v~9^;zexfk}1|( z3wMQTf5`JhNSkLw7?t+Q)OH!8@C@d@UyvYt5nNKO^k%b##=h)^SE8dlGaDGb+wSNy zfbr((vWfCHQg0F@z3*vF8Kt{h`BH=H>}cwY7eFv7n(u9{=D-dm59RAPb*Th?7?8iN zLA0EulOZwdx_4d8hwsJ43nyU9cMr3ul&l(TTzKu@Vv^sD)Rrc#hAL2{{6YcxVKY2n3rCHq~#w z-=k?HivOE)!}PGtTstw6b634&TAQr<_D{|&ezYn8o^4HupzMY5j-$lviXL*fP2aLVeraN$j6ZE!o|wvEy|XGp;c`imF3;`_)P8Fm50HdW&@>{#Mj5jA;5jTX04IN zn;*RJZmgwSvqOX~-%Bvkl6OJHf75g4bvel$Ytdcq9od&0k1KtN4${#zNdyDMc^R%? zbv5}Vb0r`B6**pXI7p47H#kc{Nw4oy>|$Z@^748;*byIjrc3DHJh8*`NmK^7RCrKM zcZtaI+1j|b@a$VJ>6nr9&h(~gHc7nmyc-GL6-G@q!qb5=%m$tg{30>_+S*hEXaDEE z-X};>Nkyg6;#H?uH5fg#h|FueT${KJ8zAg?DhhD(7X-w~%{7LR|9?|`1=Wn*xaev3 zV+yr0{>Yc^5}KjM8>w-6yi3QI=J>H<&m@!yNZbOvzWw-+9h-{-)N^UbW8j4IO8;1= zFxnfurkXp|(lKb<-x_sB>HUwF9zl`UTI#rqd-vwnZ?Gi6XJhi)hp)ws0@0cI=?BV%E&3Z@c(o4!Ku)S?v z57lN%%L2^|=7)ml|8eDrQwFEb%(;Dmgvl4j3i4eSS25?h-97EBewBcUdt$`V*QD+YnA|;jvRSGP(TJ%-r z5)hvGC;k4D@D@A9oTb~JyGIV>1c+6;fd2Dxri1=Q=DO%*YoR%r5vBecVm$H^bkLF@ zg3>!OK0a?p4B);1LIx~jJgC)f1Mf1}Z5=Xhge-h_ig0b*>N^kz4c|}K)0&sF|WEY1i>lRih-znjf;VZ~g0!XcbWY(FH=dL@9 z=>dAoIr`iBXn&s(#=!s@gw!Jnv>E|XHn6tt5y2%MfrNMylzdu-hGHGxqJs{IKLn#+ zfOpkq6447A3*`e1rDVsqlCXh<4DQ>+AXe)n7l{GsqI3kk7^2K$ zXkg%F{v&5ITneW@vfA$V4E%vwVAn)oo!^yo9ItzrGU2zF@+K z88Odb2%^sI9UXU}YUwL6l$x|TPh>ikK@jR3T+8<&zZh0ef*2ysYh3NfzQuxlC~pIx z=k^z#4($v?v1gr1_i=*<9U!Ri*Xf`-bgi39C3_4BYuO&F{9~v#cGstRktmzzMUEC$ zzlF3&{)JtkPrMHtSSL8OP^KZ`y>P^^!S4J3EZhXJ94ZNr<9!O{{1W74$oE*M%JNA= z*Ti^A=YB&_&^J3fn;<$S0!|&65P447CrUbq0^*1lg-T?3+YtB#`3b5EA~Tdrc(S=_hO8P*`iH-`&+^|3;At>L-`I zTPjQn2P09zVLf7!ffM~Wg#`T*T9Vs801od5C5uVEKzd=P4TDG(CzdgHXj7@OWr=*K zghSIIZqn4=j?>fAlTo37Sx7J+Q4gz;s!7i%9RVcxSVxSET<8KciuhQ;cMWm$W&ULT 
z!`~4D$iRg}=itxxllM$^f`^L8-nhe_Gg`Q+H!kB89i9J61})fRtBAA=0@Oj)i{$Im z)6?pWwj@+kIMDb_B-F%$je>6%x2mF9`C`f+NaS;#TUWC8An8Z2nScVhK{ux23H(IP;e60E@- zp7gIh18zy^xHf?R+91INjY-l^${Yrmg77{XcUC0rfAXMU5PH}|=l@PZOB(|Aiu7WT z*TNs5aS|mSafku*F9*dd1fsXFo}kzTVRjH2+Al!@8%%}(1P!Txwmt_XFLZ^WDG&~6 zDa_!d(c%Odz+pj!)hqshc9Ha%LS_b{iv)#cBP=GOkqv0Da~q>DqWlJSy#)`34uZ|n zch>(L6@h24A?HU0RYP-rR6y3tV4O4&jtZ#jL7eA?l$9+;5k5&uq)P*wa#q835wD3Q z+wH;r{s-Sz3JE~I2wKClbW#;CdSYN?)GUJ4FYy06E8i4`?ew!noAXuKK=KCm+UkFt z6sy~Pa8l$A5s5}IZ*8hp0n=@bkQyB5{`+!s$(m;4baFAIH zzBw1!v|@LNwzXG?#0Fv3c#_zp04m?wD`X0tC&3_z4Kpwhagn29^Y2mF3LeQER&R_Y z{`a6X2cQ{;ZW1uvEG;bsJC}ieFl0MWS+N0Hh`>|-7VW?rAf~4e2TH+KxEe^>vc<>8 z%b#A8P;)`VSpf`S5ec{j(rmEbe(&DVA!h~562@^qS`LRhlBuEf;aE`z-Z)Hz9S56U zu{xS}fL#IgaBycD02#16t~~|Gco@~32lIj+K@|5Bw8rl4u;5{S&_bpHrC>1I+rq+% z6;7BaKn*yad~5IQ3=Y;}rs6PAQs}BK27A8cwqJ68oGk)ccrafKCjSV~ZbY0K{AX`g zPR{p-T{ui-a+p@^h@+A}pKk*pDa28Y)i`hp(%>WM(QGTeUV{hpb&8NE4w(NWAIodp z5w$dA#aa0*hq(Uhm;g%h?=b=7_8&*B+npjt``^a|@U-*epyVy9e_k6;=78Ees0Ja3 zLxr!|b_|lzNYmtbaWm1l$E1c(^2`80jbs=g*bNQRPB3j~VuZ2%?>o9<^Mqk{=J$mk5r^4#MP#{)`V z5R{3t9?QweA@%>8=Sx1hxz`}Mg`&|)Moz9}VBjsdu4f!tA3_1CgChn0khuU|DE0Ga z5u_C_B8vvfQmt>fdH8{P8ubdo)(#G`(~?@m7MmSt1nS?^iu@5P^&QOXI79P zh*8%A+{8O946zGADEiJF!$(-Q21NT6$}*Vu#DGtPY$oKU7(?k&*ptM8DTe@IL$%9I zNl6KL?Es+Rlf)k(43M~6ko>%hqz({LWmc1-^1Tqp2}UXT5J%o`C-v+&fL{24EQ|og zpSvnue0tG%&6jy;!j5Zrz`cPA1HheBRbVN=zUKp%Y5#Ap1o%91jIc&UZ7>;-3j78p z6UU?P4?Qe6G>D5hFJ`3*U}%cz=FK=jut!Hm40 zf5tP+LM)zQ>z4Hb9>$d-&tA#x&H4TH;)fe6d3X0;HFIFFSloKb;@kccdL@y`$z9+{ zrNEB|$V#Q^Ked561&j-XLQyIsiPOITNqrl1#+}+2UCbc+l~+;%za5L<@8Cj_Lj+d( z666SAE3-8VHs|B?gU51G$VzajM%}%Lp9JvbCS4%IeSP!EH>u;-ELD3;m^U9@xTAx4 zBbn<>;#(Z^p8G7w{zVde(hY;73vZq*lr+#Or{_wCeqFyvF;zx~%A-;`)SCdxXTN8Q zId5#N4AG|raG&Ro@)0Sajj>)FINz`xq|4Vh-;Cza>S_wiz6*T&13BH0i@?^f127E9 zC2T}NU?5J%>08;(Ymni?JTJ0a!0drm!aq)68ae_0l$#@N3q`yh-!P%$N_HVyE7P@& z?^a@0@RY_hz9rJ!+5qX15<0&<0J<-$QME1c0Mtk4a6iDhl?5Dc#+f5DXT;?XB6mF z^IK2Wc8Eaf_ckO2|I7p~n7E`v{k~F@s@>g&b~=Lp^neFHg9nTG!+k=x6ST9LW(Wk3 
z976x<-)xti;gMQte>H$>B6ZZ>gVGGGwax{l2W})~p%jMg5!Cw*r)(c9`hK<6Fl-(u z`uG2pOShh|IbQd@61$cdwwo*mFy_`rKsN=Sk0DMa01#wDV`GGw0q0Ts`;Qs~+#YS? zqNS#W4Ds%O&`yq*#?YE~{-hZ1;L%SlyneSWg)c6$zjOx#c=r-+0jIso*_Xvv9ntrm zqw`+OhirZ`obeIddGO!y8bYtk^M@~RM`q%Mg(M%f$-;p`d__BmS(AMdTMCaMJ$D^y zkbSa}4?%mrKaV)vE6@-jvX=wqi^@fm$XaOT$ z?H`W;LKg5RG=MC2UYOw(LGTEHlnW_@kRYDYrZDG+t9?-ka#_H|p`zh{Ml-MkAifO& zk?764cfb$o7Xa~*5$I&DQ~469T?gh5Km-Uk2(EZJkIgXM`A=*nqo=>V6yOHhGGx%w zg7np&>vyBSpI=jNZv<59S@A(c%=HNZ&x{agV(>lrI-7Ja(uqgNG1-bSP%F{_ya#&z zwP&YCwV7ngh<|``1&V_v5k)jeYXSBl+tD)CAuoAxofoeQvznd=(JWwA9p4`vW(ZoQW z1roHV;{-Akfb`HQ$G;^kycNzY0-pwH7XV|%Nsfse_4A7iMb{ML zFA!-q!qZ~Ff>#%Ay0gq1D{%-3@8K}n%3Ub|EU5$pBhsK?M`FiEyWjT@ebi*Zh@-YY zet=f6QiVPrKffeoq0-MEu3ysjlSDH%hKvV?iM$=WQ5S+C*#~%zLM)1dDp8?YzqG8Z z83NP!ua5v9G4Qm=8#zQ-%#dwlR9yR6Y8$D#dA^yxZI{$ z1UDa*=M_@|eYxYkhBNZxC1c5pI5>^)nIONF*a&A<0N_ z5+0Xf0Qh_rLDw4^Qb^!{vj&z88;wxWmbE7DAdMOvtHbT3D)%EKDJH}RQYI9_h_S#V zJj+SJZ{Y3%JK`$<$bk7Eq@4%j14rn)FtOeU0S*G`dAwrL7z0E_Dj)Byj! z#>ASnhsl*+1zq-y;{&^YBNH@T;d6jIaS-&dz|QP{0Ve|JCNmJ#($W}#aSe`E2U2}# zmO__Eod}pBxGT_<_yVmzBvt`0;~0P<#A?ldTwR^4LzV)Gi*100AW#WZOSfQb9LeB8 z4F*X?5r9(_jnfc_%AgYp?_}L^Q+sA%f$QgU{VwzMbBjhRO1o|d(<1*iS>cK(gcu7A zt_tAT-togAehjVz2q&%Q@#qr%nY`Y)BkAkx>mvfB5>_tuZ3H*F1iQrCn}2bT^F@ZZ z`X6u@fNbK^e&?nIHN4&&0lUk-&}();bRq&JW

pGUv-bnMo0RxGVp$ zb~UvU0A&AHYkw!|Oaf=_+}dAEiO>`QlC0)L0PNt>v35u&{Cs^4KpzaTYEWN>Ygt0htvZ~p1so}u$uito6}25F z!hQ)K|0|rK^Gu(-gd_)W1q6BU*96Lo16VguWJp+MGFD=V)QJ0Vmq~z1aRUPr6Ow0e z(z%O{beb$6yYd?0BS0+-gAtg32K@`|`FpYe6c-LD6k?9gpF<+?KjDgVgNE<{6Z`O% z#dOFE|3i)8H9$e$1=#}lUa0WMPC`n-BY-7CxckSF&oiAn+v)(&IzdSU=o^$oEb|^; z$HsKG1GpC5(N+Mi0ed?y*$^`}BE(ugxvb6c>lL69bS3<26QN|@u8NLfm;dAgf zcoonxFwJ>X#Vjd8Zl&0F$N3Alai`S?U0gw z3$y|DhdBn|bO5N9luuYXAwy?Zx_9pj7#Nb~fJzTUW$(kB;^Ls5<&u$+3G%q+Zm!y}Re}Bx2OuIq4250>KNM`G zLcnwIVSwmP+TbEUHE>b@x5sc+HwO3yj2I0HQnc@NiBV~4`7l<<(|+Y1z?6s<)5zEh zz&Sw!B~Lv+igBD&?a32*?6rza6Q?r+qW;;a)Gfz2MiZP{1mQVLYM~d@khqT zTN)d^9Fr?~TQ`7>3jh$#P_d}%E*=DDk}QVybu9o5Fr96aWHS*!=-1qw`r31}vt3RI zs!@7U0zl^n(Dsp~M$l7|p_S|Mu&^4hhIXU@FF2v$Y6W4Mr`gQH(-Dv37|*z+UkN-;@bT3nRd8Tfc<~cr^u3vEqdtIAlYqnndoS z!CVjur3d|loilE7D5k!EusR%8pYy=?uo1>FkX|7|MhC4pVt52%rjW@2=_7hVas_h7 z^8pR>T~287w8C{M1|aSOl^=XSAhlco&~XC*2&0KAo7!2jUo$g4aE}pkyAqwS{V%%T z+^`CMaO6#n$bp86Z~*l^4p@N#j(ju#ZKa%?oFdozbsOO>GNlh*-MzD`t9ZbZ>M_Ud z?1hUpxI5f{@d(%Syq!A*HTnyv_{2oT>$G0O-)<<_Orbf|Wksm}WsgyW6hmR*TWZFY zuU4ohD*2HX;?O+H8?=Cb?CQeUSU!%{75X*RiwDn8ScrCMm|0y7MFx5RM72H0nk=3> ze-B){V01~lCMe{eLI;Rc=Lxg4=6PBJL99#?apdso|BhTDv%;@CALqUC{8H(mEwh0d zX`N?+m&#>_5xpR@jjiSTg?R77J2FLnO+C{wVLGg$97=HxP9D$}6*3QDcor&cDH55= zk*V-y@Oc4|x+}A6rR(2(E)MBGd(A^>l(<=I! 
zK&d0|zE5@aPn$t&AfaI{%C`7L;~ah@uV8m?EVhFY^Lu=4Uww&+>F7!#=S{hd@wYFx zcTdTSLLF)u1lA1Bmbq~h)vqj7(|iJ@*a)%Z=|Sq;laZ~7xWu^hk&5pO4p$>u1FjzY z`CDZjBEF~b0K3Ggo1KLk-F$hk&v>SHamh$Cqk&vXc7UmHayjBlmx-vBNanDLL`FbC zw%H%cT-kxiY;8LsG=KTmW9hV4Pq$e0?oE!~G_SPja}|@c#BUpzyiQaQ(1Sl7Y))Hz zrMDe5x)kKJ`vsNSbwJxfL-l&fW^l_{Q`!@U%rSg9CC~kxW_6%gDTxwcTeDZb;lSLZ zQ8AUTmvg3Ca-!QX0`W*8Xd+=+)CE2Aq$wLjvpgZ#yay& ze@^FD{c2PM5(+95qM_xe&w+^_GAinOW33|j=ePcnxPZ3OtA9M6rRzA)k6bkd`j;^( z9;Sct|FS$8Z_UB8YI~o=mT*$}IsPzg_{n9eZ!~V(*q`%zjJa+mf@FgC;AQ9Su)Vik zP_$8_hT!O{wJI463#k}i7#}dXiu*{&?8O)@&qCQE{#4o33zcOUpbt7_;`uSZ*}|z=_jnK-|>TGa}e(5~9J@fSvmRCYzV(c2>4W7}{^lt68+v)eZslL^n-8>qp z!J>8Gf5+FV-5L8L!MKIqYP&aj1ZKxZvh?epJ-75|)s+4H_I&~zKR+}Bn3U64+KQ3Z zWO+1JEi7P5-*f+I>>nB%J3IK6adBGxFS^2>uDE(O*8kjm%hwt&ys9!Z)|qfpa)!yd zHWK?xpCd*Q%S#LZZuK)ybCTln?uS_ALWA82&DN;#+@2|k=Es$Alb<0fzM#O8Z=Gvl zLPfaMf5@`i)Zdi1Rgla2vBbiF1=i=thdvG5lE7-CLJ8T-d{H~~+`WGoUsUKAeQNbg zlHT}U_lMrYcpYKavI{};D{*iI5;f`-ByIdJu59*cYgX>4Q@rmnzDL?SQf_x?gtF*!nfcL+@=y|Mts+eU6cmxMw`q+-#UA}gs&*IO9dwp3CrJ_=48Y9! 
zUAna!0D}oes1az=YXu9+xTTQ)V4!dhrx<{$7|W|jyEKFBAeixv{$7r-!!j>w70QzA z!_OZtJ=7`BF&IZ& zyK|uD8;m{ix?iRiD7Jz$|7c9LXp%@yUC}EepDyDT8(-=hY$kf~Rf?dwctSch>_!)= z@gu$XR-S(@WiOLx5_!GYyD06H+FYkMY7Fi!H+f`3oq=bx*`qKOp)nuW+cW?Z%g0E}!prr#( zWj5u`-(C^5>pfrNrs&$_EMHAW@SM>d5X)URGK%iE7}4uWJzJhgy=h6Mt3&10Yr{SB zx3K+`=xcIj%~f0dTn+SHu2CF~HgU6OmME&>rnII%#Fd8J_f3yKGpILxVZONte`LhW?S%z+5R0w`E#A>l;H;tScsUo9O4eG*xw1#7*1zHpTI_ z#K+?TYJxKdYVO?Pwrf|hzMd9Qb=B~+{9HRbbzF$fmFNF}Hk^nlOczO=IW1RYSZT|2 z`a+76%gT)=1pOzkyJfCcDmir=pWT2hRk*O<#Wm3!kvT@8>y*1i%r;AfG_r$}ZP+`j zX9pW|c6y&YQpiUcGUDyjE2Y<0K6ZF`B}L}*dQL6xDBro-Gr0L=nQtPvTc#LG_D+q& z!pQ-KRII!IjYm%YF?Zb6=AL00=t(z3x$~e$Qk%Z4?40@``|{O?(6{B0frBXg&f5`p zt}%#>?lCaNj%FyHUQ22w8PfEZ4|ETFEBg2&8%l#HmT)=H^rseMx$@1Ag{_wLO8tU- z^&|GBB(2^11UFb!eHB$OFRpO4f`j*nvoxJ>JA;C zKa<#NT1XGWrDOf&@zczx^!X3Wfy*LvDK`$zXbcA(uJ3X$d0CQJSiHCgPCfNDgVT}3YcmE`=Ox~kc|L8}LPo>k5*BV!AP)PLcJqT)tJj3p91ibdar9pyrFR$c zLk*3hJU8$`s@bbH`SRRi#$BfP%I1A;O5UvwaxMmOoS!0-t&;`fi>)UXu2PD;3*ma$ zxnDo?71Uu^--%MC|0`r%GQXuNTK$ox<;dMHV%N zg8g!(BfpuSEbyVQcx{)wQv+_wj7+e$?;GL^kJ8-Xw#d*@L#HqtkhSC*6~X?)H8pw- zOy1+D4E1FAdHT^c(un4k$kByo$2%H~v0aP7)*VTumWR0|T4Z~bMv8uCwKgN_4nn(K zYiu(-$~4jpx>%{4y_?Hwygo|3K~$5i6z!oc-APMk0`n?v4x2CTtxu45jK4C}m7~@B zY>)r`suZu|KTQ>*;vFZA#&1mv`)YjomJLJB z^$$cw3kmn{>pT>B6n}+wbo4;~L3w0Cws-zJIx7}l3vDvewu<--OZ{6C!v3$lCwM-2 zW@8vn{+imbA22OkTFE$i(PjJX0*cq$_%q*r_KU|~=x+z672d3$nV)RR>E>G;=#|<& z&`KOo&niFW!s4wgf4zBvA76d#;%!Pw!_K*m`0699vWLYpont!nq z^11|2*5~i^(6g_N*iOWa1-ksS7?m*3&B@qn7CruG#oyX977{}1Iz8=UqfM1{efP!J z=>I`h4Wha8dSJl**p)rLsiz0;j_|Pub)ZaZW~tTpTV29V6tv==K?laC(l3RibZcmE zkM=fF&{17|_WK6*?VD}W0+RaI#jUi5&*J+fhAJuX{*=!* zU@EyT$bEpD?d6cISR9M~ebk0qnf}l1vpC!kay2oV%NA`xU84W-Gho z*Dsly>$G>O);VD7uZ4``c)s;iNyh4@yg;m#i(P-(s6OYkB>$~b(to+zo@PS#l$0Vd z{7dr-i@O&Y_cjf=b_nj!I2B4VeiE&AA|?-)d+#|m^kpW)lS|LV!*8K@nCf2h-Qv2E z)lu58D$0>FRdw0S@W&g@!ygvMK5>fM6rlSlXb7$k&|6ms*D4BDmRfb8$Diz9JG@1U z)-E$~8nZ#5DBBUvo$8(t6YF1Ch*#@u9_aa#;*;mwl24VAF3lmHFK&EOlG6?6)D;xq 
zS$Ofa+WLP_g^pO+#qU%sDc8I*{k5TCcp*ERDYpsR%+%D!^xb&N^qK}nZf$3NPzXmbyTzqkZ+5T;S%%( z=Z6Wy-rX0C?Z&m94GN#dJ)bO3=sr3X7_!MD{G714PV>5&P_tZRRkv)Ec=Tyc?q>%B zJMG28RGsw&jf8x^ODlpFte>Ma4|j-_Y6z^X2bdQOtu}Ty1xk$cUwAyoFn%=Tt}_H5 z%NlQ7!jf!7tv8lBX@qaDxV7xYDf_+lm5MJbjTGgcf)NoV7a1xH^zZAmy!xZtW=e42 zgRe``{#|LKQij-+ZbXmb_MWlnw&sxm){uS~w?gf(_h3mIwud2u#|CE7OkMOJ-6L$; z@9vC`=(l=xBNKC17hn9;PAbkR+xa|H-XS(t-b9`FL66c*Ya=l$de}u5sa#e~T~Ya0Ei~u ztDWS{@>G{f6ODV^@s_#7Ry#3sYB!i*Z<}}Q);X5oIUc5O)AT#)Jbk;5*OZ2ia?eWp z=AszZqY5t+l%vN&p)4zU>9bqHPq##!)90mIE?`Lt^Dri&L@o& zGA*{{w%Tit&fb2rwY`x&8O2M;PjtOEsZ+0!A4>M}{UoiCgdcGuVQrY7XC~T)M>_rz zuRguayu5uE3!QEAHJ)HK7w5ITcb=Vw#RU4BZ54$_WnGw7e7;rIN~eskNs^Be?RPiE zFkdZqky9LqmfL$4_^A@|_syA>Rub0v&o9iCZ9T4T7H#+9YZWvvT^&1kxEgxKy<7XH zKYY6?X?3gm8`q3^N8C%w<-1s+`6II>N~>LdmaB$iNg}0gCx$wgMl#5MGFV8+T_QP# zkHl$*R-j-+X#Aw!eFEd^J+z}>qLYcwipPwv;m8E9YY7IGa$PO8CnPW`G{G#i_QVi$ zl#Z?l7k)0umJ=S-+w@1bST7wxpb_SZvh2>5(e@kMGPCb4?fLSS%Qq z)YxG3ah{=WCZcFyH@j;1`NbZ0KTHcj%LtmM2ao6;vV~102;_LFaXf@#jq6>1;p_6D zVea=gWXCO-v0Or^lM~$neR9RmW_4BjPtteo)Faq9(q)w=nH0k<&l8u;li=nd>YA}Z z5`EHtG1kfl|25_HU!ws?^B!qN>HIfY|GzN3j~{mVo_3OfhKoiQ*H>gx0>ub8IovWl zot%^A{C5Ej8O>!$-~s)C#scG`emMr4`+qBrtpE~@ns9^*Eih55&Nn4Skp7AGOdc zvqWW$7tf`-LW|xP_?7TbL^Eh)kBYZ-(q-~7bn!v(q^Ml=9K-Y_Yx`&J^!oNE^eZLU z1eY6OEX$t$Jr|X6U~sTcU7Z*rD1(6b=ht>(0L35m@L^=iLp@v|Hz1lcFk|2($F6?w z9v=Ltmgj71#t)eZLX+zb%+eVeU->RDq6DJ#h#&wWUZB>ocQE7Uyt_&bc=dH-Yp^kSlROr#n3SyK7qXz!5nL zMj5;U0s@AHFNOYV70gfVNh$x2a4s-K&TWU8^Z#Z=|8L0fdC#b3%xUtQKI z>3RR12VY0-Xeq}eEH_andl+DAU3)nBU^4a@KA zxk48ia2DY3IefY~j78mAl|kbk%fR~6)?Wi>wWaeT`D5Bg-gj(cxq{Q+ zDS$~Rk>C~Sz+*JvxI&b9aJJhc#?{GxGRyGH07C&&h9D>}EuDodxCmQFIe&*LKOekK zcjqgykJp2LOA?FPs|t|E@Mm*5@RSceR=T9Br>xUW>cf1u^MEmQmba_6cb$;nX;6dy z(Rjk%r}b z$aAVjRq56%yM$z2Vb4mr)$KT}*!aM_qd>a@cx=c!NgV3}Uoz}Q$w`Sxo=vopEs#t-N>Rx)!NB$dqhX-pkc=e8`Vhuom$4 zD?hu_(HZIc;+yRK%88E*LT^r6rDNy(Hx9p@UUq)U!7w+(A(zUYYD~MpC(0K1drCa4 zf1wj~Z;o#K%fxh(D4#9&v#^ZQBbxig@r_Z*M5iAOu1kHpxz+WC-0hj=%@N-)uG=C( 
zfn#Bo`=S)ee`iA4DT5@mjE!y10xqWszsn}opZPhE#J?Juvp|t2Cii`r?y`dN{FkM_ zdYqAF(A^c1tZ*?O!G8x*T#6Tf0m4A@oQ$8 z(kE5bq2S+{DPN^514_#&!!=^mmCy^fLU&S`V%!v(+bS1F?z4AjU2tsYS!xW;>XsJG zxa@jDcG95{nh^bUC7?p!@BypF5VyJr;nTlU&;DZnHZ>4$D=X!ux>Xo9(Y(yJx&9Gv z_2_6OqqHSILvGkG*qV3ZqCZVs$T=>t zwyf$uU2_=(tU9>3GeN6FT2Z?xMfPW zMq~Gz(Qw#5HF`qXxV3d&DW+2y?)5hU&>OvUhvuKImsPR_& zlJ_Tt9~1Gj+#Ilef3RD|EByO)xFGcTl!=slnuGswjM=#JEI0XVUbNViW6Z?eX%@Om z#b+jrzh$Z4xri$~l93kqUcSK4E-yZNY`%cax7WHHb?zl=vtgxy1AQ1!Qbu1+v_yLT|Re2 zoUKOQG7RNjp4Bg^DfxKat8Z{Pq_&7%uN?2ol>e?3|9!K^ON-nO7R|odAO1EoAMU5k z&%@lS((lfcW_2Ike~@bTntz?q$Li#2_C?yI!|tb#BpP47b`Kf0ai7B_IU21m8MG&0 ze6W8YOyJ-Vv$1SH7sROC#cfa@+57zDISJ_u?`I5Nw_}{+N8?UsLe;0rCWQ%YyRSSK z#!KXiuD1vicdB&}Vq)nE-`xIm>(^Xf_Q$k7j+T?y{?v)ap`=Xh21e<-VkHLk`xn2> ziZjoe(m#tx`BkHEynbdv%7S}Z5!({cZB!PK+?0JW@zWK17mO2hpQZj|vP&r!pKflB z+ZI@Sv0pY8@30Q}R-~fynJqXs75B=tQ~pHRniM9sL}ym7uR)UL9DOKOFXa@Iv+Si; zy(!-)xv_YEl@8g4SV&Wjgxqnb!B+c*B68fOav%ua;&jkvJ4}C*7~Au4Xb963`x*^@Vu){#37Ds6+<+R)N>Xx6-xR>v`YJXR3VoHJalUF?`+h`;P`=bvImc zO`0BI@)6TlBje^CYP%HLTNNb1ij1nN-X#SwFH#QS1XyKjzE<-bq-P$B|SvG^<|rbry}C#FeLqD>p|rmmZTu#g)uSqTC$3 z##MT9bE76xP)E&CE=z4kMh(@1ywiO#gOaSo~gR5+6|3(IOaic=ml+ z`l`a$RowL!OTLELYWL{KI%;{2>Be9co1_|l&H1P5)ZFsAV)HSwqRBBfeUDb?nw{2b z8l+w;U*2=~+aqY}^DIQCkKGy7s~hwtEn4E zr1QTI%#eNUTA)|aDosjP9z@D2lVz>j`T49X`q7xXS@>sHPK#~Fl9SE1N5;kFa-0{J zg}<+ETw3)vHYd)e@DrWBV6ZTQ?aGT!{kqsE$|6)w=bdVEUW0hRXC)#5(pf5(6}!_{ zlPr~GE~|?~!bxEwlh%1Th`2pN-~=-Uma~`53U$|8c&k)ovY=C=Z~fB?eQMn zpU7f+E)X(xo2#v|Uw{=X&o%$*p;tHc^`d(4BMsbfTC@M0Wub^X9yKb!>hjCq6;i~*7j8~m3 z{#Wb_xplIMq_&Xqj+%#Yx75kvrUyPVL~!OSKD(tb^MvYlO*!)i zFAHgY+u@n^EBG;|6d$T89NK;;8&YQ+xQeIL`4$E}0@}OpN}I=0z&j5KpJ&|(dU+)% z*&h8Wg{cyzcH_>RS>Mteiu@s8h5J)hh3T5_znWOIqm+|_&PY2vl$-IJmhmdjeECQ0L2dO})I=0bzQQ}6 znnhjQn{cnq{51S-!oKUT2Tkub_9(r@5X z#7m4~++B57IkwTx+Gl)3T~W}&jO!?ayZ(D9L^NbV*W>;t`4D{43s+{pUc=rC*~`2Z zl)s)r;!^$eh7s}CYn36+k&(l~rx%>`o6a(B&n=H`7-ov-pK92z^WQ7nUrigKqhoRXgBMWdynM>eZOADaFP zcU0r~(!%eY#uigWnPw{zGy5@y;tyroeH_=k7^#EXYuqkmTHZt#?%2D6MaY>G#nq!< 
zoKvLFD1IGR^pOeM+*Y>eB9T`^z>al%i7k2o-~#{rY)#1+cI$Bb*AJnrvH_^-{qsZ5 lF@T{76ua}E`Scl2)aM%)%uYyzT2b($Aah^(gQS7a{|6*zST6to diff --git a/content/english/hpc/data-structures/img/fenwick-update.svg b/content/english/hpc/data-structures/img/fenwick-update.svg deleted file mode 100644 index 79c8f848..00000000 --- a/content/english/hpc/data-structures/img/fenwick-update.svg +++ /dev/null @@ -1,3 +0,0 @@ - - - Produced by OmniGraffle 6.6.2 2020-06-26 13:14:04 +0000Canvas 19Layer 13942824-8621322737-421213-138-5222912237-4227132822-86-138943229-524123456789101112131415160013tree diff --git a/content/english/hpc/data-structures/img/segtree-layout.png b/content/english/hpc/data-structures/img/segtree-layout.png index 30635299d759919b9dc75dae08725c5bf1dea5b7..f2fdefca92530df3f03ce3118fed2bc300675861 100644 GIT binary patch literal 25533 zcmaHSWmJ@1*e=~&(lH>QbaxM}gz$nONOyO44Gk(FoeC@GG(%ndxFj8mFchm>VBqSt!RTbzfBqS6+@K_fM1AKlcVKN1Op?S!w>SBRE zfmoI>@HMupih&0b65&4L1Nlodb3XVcm8arsPaPL)PaiXPDP+?Ajj_ch@Q;i_TCS9NU&b{WA3G?GwGM)Em_x4rw{vW{RFgUD7R zv!6m$E<)_eLXuH$LoG^+!;y;wQyd=>k<7Ij4EsB-XzkjyFVuKC#q+qkQDSWao3A~v z8Ch3joazi|9C-Mwt_87`^N-#Vy{VZoxKune|3*Ny{e4+=`%VrJv*T; z5D(6R;fiD*;|2UTebHjSoDb2{9 z(YCr^Q(c9*wSRhhJdb;c+{zjLkBkBmM$s<> znJcA-u5SE72>B0^Rm$m*(v=L1qE4{k1PbBTLcwNpNFBlOMxqIm89B&@_(s2yl@b&l zL;cDWMujgr5|C6Be2XcS&nD-LO%~7jLD)oM_{0*WIZUcu&$Pzx-WY3)?D;z_iD|({ zy-r1zrGW}k>~h_U9}7M23u;X3NMW`tS^9VoV~QkZc~_#kH*tPWB12=zwl%?ekN+vs zB=XxbXC8XHkYz<^usA;Q9XKTF#FzO-2sz6gS*04sGeXy?WeV!W5_%`()MUh!XI&Fa>4-eh z!SN|03ANorU2n#3HjK);=Y!?SN$@xG&LMn5=(9haMh<*0f?D2)>A|&kq^NebdD|Gw zT9f50#_Xh<%^rCA?jGGP#I5VQh->v|-K;-XZkjqcw#BpA*=z{ zO@&=w>E3AZE`Y#rO{Bp4j`VeedI1^5sGO7W6XHwpvS0YqtQ21s!gltgp&D`a(yWjl zdh`{9b)gp{{)P)8S`r3Or*8(ULH**cNQqSH?0Fg=l$?SIAo<&FbwOSwS4qla2_Z zA$0Jtv>VQ`rE_1!+xqAwWi7q{T4+MQ*t*@|HsWK~ z-LYfiq55Z?JYkx0b`#n6l8wA%IYplAjf!Jg<6hG<{bX>gL^FtwHUV>835}d!Xh*pW zR4lYY#qRtT_P6Y1Fs60!_w~zjDWcS$IH{9rw(BMsoS1srf!-addtyBNRZ3+?K8%ys zm|uN(y9H2eLt#!91xzGlA*csLvGtWj!b1M?v|o4ODqV(z#IP68jryYCrKCu`r-H%< 
zMY{0_jEhgK5pJk z_))2ZQHtl>&=N-Cj}cgTJd@tIljc24g~3mIJotrCqBqZZzQ(cYvv4zfklKTXbPHjl zDtZ(!kaSjMSy8Z3qh3SIc}YJ;+l2cOaLc77Oqn?qFbXeQ5HhLSoD)3 z_1#f_3r!-HF4vk7ly*$fgBiVU(f-1v%%YX?L;I(wa)-BR!;f_GXoCxAy~TBf6x@$( zzf`*FQHgPfXirwg25cpDmB(!C+8na2di=kva!>co_?jfn@O-1M8~!<~92dn2eK%Wp zRv$gp#&;TJYjmW8Dy3s=WZ@)}B||e&d zvRjg23~)r$vi>GEXr_IvV>=3S8E46fMQd1Qtu@sqzoF$?J8l(gt){oz>Lt!$El1}? z!oCp+=PYJqL0%^ASiuX#xr_4-!B1_MWZ+8BR{6*vj~k=t$w@+#jOOsc;IE!hc!1D3 z4>magS>qIHUA)!gYrGfA)U+Y!ew>Ev4m-`93<~3O5^+N=Ly+1v98s8jc=rOIazXvP z^V`Lxx0q8HPgC{IaiHc&g{_0U2PWut--G$8wi{pE^EJN6Cm$zhD_UH&H42u)Fx=B| z8=1Z9<_<9p-!`!sK_4pdrp$0LmRz9J70BOahwu-LIM}Sc+z|O%cs7ZXxQkcI%7q>b zsah?QQGs^XigC*ucMaY5P3#pj2j}!CvcjmVb>E)IiRB9sK+kBJhTw$Ne{cBtpU{D|ok~R;995N5bVB?x zd+~}bi-8XY73LjsjI6EFfI?@Pe>6sS(iM9bp;QLf^!q|8ZGOO3{ij8aEx3{EVw&Hm zdQD$bZ5>S|+8T2tywsr-F6LEPffmp#d@)vAqb~avc9J6NN9a2|vw)s7sp2t9mtlfq zTRkR^yZ^TKqd~BUSBjSNns>*G*A0qQcOWkI_93FDBi<{B*%qn7MS)(Zyn?p++xPtB7#m!j7)4q-D z&ka+V@UMB`vwi6h*<1PPejze`*s1h1NHwUCpZYp+>ESJOC6o!50{H=TBs*Vg^>t)V z*p7XW(dWPRwy3q8cPk8KZDVB6f~)@Jp2B)KT=>#x#=8Ce6(>c8^_8U2XSfQKo(unv zeQjl=c7mzMKr=Ei1}bqhLpbLy@*4BS)lp7s*&=s4rO`q>YI}1#ndI%09c6M$l%Tml zeE!mw&|2%TfAZa^JjK|zlwqXrsQb~DkcdZV*EFF2f)zTC*>IEt{>XOj7?v!LK;=Yr zPU(MtjO=Ysv^ArPiY%4*Q}EX|^qduklh=Qt+KXwbYjs9eNNucIS(&0Dfz@7=Ip zaZHM>+$CArKSZr9)7{BQe!cx~$3)%30Hdy%4PW~?L@u3|!z|bZpBrUOgE|hClVlpF zsbjh`jiP%}k50+*GmhcOtKqTPqVEQ;t=Br`$sjCGm?{oF?Hq^_UOz;{-*GEih{(*x zDsA;b?skjL9-mYOn^;^ckRd-W-u>S1_OzDXpu^N*b~HuWvOSDO!JMni zD3l3Pmsaf#u+x$2&49xy zt1aGds7HVhq@!(ZS>A@&TwQ00NdKWjMyvTQNs3$2>?Smq<%DWWAU#TuM7zJGk-z_( zWL(x3Mv=?Gp2c2Z-QLjO&?8ITV^iJtq>|;7Wb(5jBy582;}fVsEDXy98b+13z)J5+ zn}L0&R%^i!f#%^B`_(=&evMPMd5Z=5;=W)>>r7S^kA~)5*Ns{Ds*k3Tf_m=f z$a*+W_8qKSS3_a_YJwT%ExzdYb?yuQ39e>rO<(b#j$Amh+>j0j77AxzSb?2ktrY+et5jo8(y18o=Uj)pDjJToTZJh z4Z}Wag+Ru}d5vS&7vk|ul_`_Qf8Y`B;(q5tu)vj$l&270CwV>=5VsclmK1tf6c*!} z>57Z%!<_gE4`RA}s;NW4zED*EB|7qkSw)=^k6!-s!0muKR037A82uODt6zk24-q}Y zPDF67WQq?wM6q5bq8q8FP3Wl~PqZcIpBD^!5Xrf~ur_~mmidO-q8TL%n(eD77*p7- 
z3p0j^V`lKecu$VP|G5$S@4Q5Q*ja|LaLGox#And6VAh!Iw6fQ+1T zZpm-haT}z5YE`kJx(LSPZW~zM@pxTzk*&ftW$3`d$?TT-Mzes=6Pa3b(T(`$Dpp>K zzP!i9^ODT@#NItZG8wMVbTFvXA^|$e+h|yp#AK?l1A$?Lp6@droPqDcwh#Zwp#^Yf z#rH@R&C;k9!hd5VHoMzOK1BBBiU?FiP1#(pxb*qD$qCNPOo@ zcG`HDs+5s-*l*XT-OCY1*sqA1s^9J=&n3vCTw+Z3s;azZ*>2_`SHvLM-wG|b`au2f+T%A6SQv>CCl>cnl~4^9z%!Rj%hPZm`7 zVV3zy$S(SGj#RDK?Hb82$EQ%`psIRRRh;T*a#?c`rsy%Oc2G^63YWcC$Y%7o_EM>C zWSjOXvYAiZ>0n|}*)94@C?r}W2z81*9+s{h{9e#4LhIq|K9h^cho62!yKRqS67+stf}hc zwETxRL%1z2XqRLAYTnM9CZX1svExxxQRe$w#VpoPVigW*tWi{HsKoG}Ju22kJ&AzS zgkE)KGOL{YkAjLj2Wl`^O@)Q`851>E%31NI#9_7e*No;}L zBE67z0s-=5laS>&l;W@3w~xp*fd~DiGDNCh;{`QwwM^#du;)`w`u^R>>J2<9c z>e=s!ok1KF37Tar4H`&_{eo<(FZ#MQOO8Uv_09<=-gjCb$>@XeKW4(|Gr2`+ znZc{j3lDE2B(lS=`UF=govsq-KnN?p0`P4umEUQFUdWZS93FF}V; zw1vGE^M}8_(>!_Zv$Lkb3Qv%|$B##u? z@BTA|wtVQO>}``STpi66&s&G5Iv-8+(s zmxSy0dkr-{r;ZEdCeSEBBCAU1ZaH>vlrg@AH%_K_uj4muSc?-eEFGah=;a%v|uRJ_S1N zwsrsYj;3Yn-^kZ2rW$M$@$&c5?^_m{u(s6ei82S(bN~I9UlumZcrPY2_Fn;eqW3$6 z(_m@VoyQaT@BlW5w*H(Jri3M|M0Y)1(yr zY-Zz(7fR+|ea2URi6(m&!hXoQ)Xq~z1;Qet$dRSd>TaRa&g!m;7p*WfeYe*>_Kf;j zK1BQ3Nwq;oXE~yC1Rvx|{HZ6)q{BjJS_G(e8AXy#Qoe=X&Pj*%E?c1qvr-r0k8+k^ zAIrZzy3b5PGb-_xwlC&JYkhzg!efKWLiF8BDDI*ps>#)-Fpq8P4Sm|1U#oJPjn#D&yUh4Qj!@_wlQ{Xf=p(h= zZgl1>-i1FwCbv4o{b%T!h)7f-Nm-SN{Mfk^M|8%Z&-UgwxBM=wKx#1wnDzEj1t{yab#T0yVLm5gRa_`(-)!?!Gi;*Ib!5Zv_&zL zjCY@PCxnpIElWEjd4E?B6`Hp#Yoruge(R)Q@O#|3su|wO4H3Cbt;(fqGh-uhBaYE` z-8SLC(r3b-iif!p%2@_dk7J4j1Z3+crV>%b+wdRq*k^jZ9{bYmj5_I?*zMKpsNDsz z#3i;HJz+D%wJZ2s{Ib{8vNAt8(&q^dt6REQ5V!r~#!rGUVRQW=UA>x#8tGWb@1x@k zUnAB_zq0#yX5xC%rxQEpG#bB8@56R13#bpi+B%LZZ8ZtlFoZn|i)Vj);mS=zLCrL> zLRiK?-|5BwyN&vpEh7QWn#48ZW3rYjH@A*M!RP+SKc8k+}o)TV&Jk`{&m zRxsq`u3w4fO5*fL;~m3qH+W;B0)I`TL6y8BS!H{9&(5ka+OE6~h10mSqQe0P8N+NG zL{Z2n;`9!wRUao)P`;;c$AyPrO+p`HX3n?@6F|P+$l9B6Gwvv&hWJM*>SaL~s%Lg? zeFf!Ol@|(&Leo54P&%^1o0*R;2`sa6lJ9p0w6qAhW^_^wMrg{*rkPY!0%IdTeNufX z8+1t*He%sp6bf5mBcpw}L`DqZAX)uMP)r-r#eRpc!t2V2fB8m5M;sq+PA|8)A$%CW z(u{J|^|=0q=1q##^lO90J4iD|C|=eb61Ouw#nlxyO1g? 
z7(-YfsY^3*@}|r>-AWni!)^F7wMtvU7W^a-(G2)K&Ac)=$T_5j&GaqmnvS zk~lejgmufVCFuA<%c&B7Hr0mCaTH0@0)I}R)Y(0lmON!&*5J*4{A?@91GRuVCVJNI zoJ$&Ah`fGhomNL_;E$pj<1_G84*Y^fRYYOLJCU-W*T!BM^ta{eb*yTe`u4Urp!jW43Sop_nGXoBK|kFQ<5 zch^v*ObDUJO9Ly~P!wG1(iNike^w~&a(kWgHluZGt$sCFCS_J!or*W3S>E4b9b9v1 zx~~r)kL6l!Pb+nq;WaDGoAgFk_`;BeGB+8LIQ@?&Je}Jywte57$B2HO7?MC1mM6k* zi42EnqFz_#mPMj{DjjgNJJu^^E)MJn(^$Si3J@YiFC>6$TAb0YX%WCr0%It4A5D`{ zLnI-`Q=&pYni;zp*HcVxwhDxgRnI){Dq-%u24aU4#e8(O({IuI9~5C9<`sn@#IZ(} z=^?De^sjJ+6(ct=4rxfOCZ5S$Xq8yxH(f<(s%&r5Tq?CSRS3C_mxfJ>riZw#biFF!dg($NGi2jC^5ctp zxSPm%3) z$+pzP+u<7GW*2Ew%ttzyWnWr3hcMy zRj%UMb$qOx*Gl~8uPgHEHC%m&xv7nG6$|vEb!rBYVsVbX3FD7{M2Z>i%s4kZYMC3y z$^p+Hs>iJJoMg0~?_=Db7qSoY&1JNU5{ZtF25~OrJjDL1Vv8b$k;Mvrizia8k8jsi zv>r#B`UF``kNcc9uo}|h+6u5wK3NAQtXLE^Ib)< z*VY6b{9%?tAI;TGqVG2;y)3n)3+w4jBZ|hM=s*!4|26#Ef=g^IwpYr)HhEmHCl!u; zaiC7X1jbXc5CI=Sh43#*ux)5TtBJ~rbVai8D_pIOJ*~Qi!UA-|vOFkF!d)0y{yM|D zQ&Ru$DY8U(J&Kxm3rT8okILy{&|4=tn8MB)oc#!9+(=fpxk>g2e}a{@E^Y$nJCrbK zK<8flHAXh7H`unt zZ_B5UjpI2c1<0(+Ui(wUQtdni8N!TJ;tW4rzTf7A? zQ{P!Wn4#j=YVZPa3<_~lfx?1ZxB2tDx|baV6ozQwi6pNc$4|qQ+BRQ{vU!4I#w$LA*(? 
zecQ%Xzjn9Jmr$(<2mf?-!nEPG8>jh<%)WRaM=jDqOAW6X_7+_*9hK?VcFMl1pqjDi zCxMTNz0Pi;$V?bqO4$Q6D&L9ipbA1EY4TeK9EW@s9~83FD%b}uN>w6saEUDxg}$zm;jbumEFRt2_ZaN?t0qbG!L9w7(Dj>;Aw+ z>dgD>Z5#<*%I9ZVkKMt0@FX)$8dk=m9JY=rTK!8ieVX8JylzY+rw(V8V9hUj@Ffm1@;nm;bO51DaTC1h}FN=;Yfz+;#^_3S(%TCA{${~>XapcaZ-CIRIAy+99#kXkSla`8xf zr^!c}?0x=ZUKi5@TE)sgn<#IXF^}f+LfedqBJ*uI?MICBwtK9xtRBisP6e!Gp5d^A zoYx^qGzm~SeTRw6&V1LMPXRD_!^O-wKdW;peCe zu-uoeNqJiIBgO}NFHoIQ@)>J|zTm5sODC;GNR6^g_j)}v(0x9*7(S;cPIO7NSC&{8 z*Xv25)5RrgvcY`kia+yyXEnz0c{8sTeWtB~Dozj$w;b(@R(VZdI`p1Uj724u$haE6 zdXBitJdL$@9fK0>FLpY{lEhLRF>6)*TT+P|+hdgRNeK(d4&P(JPR zJ42Bby{8XCY$fRx>VsV+)kkUQAswRFQmf0DWVpW^4gCs?wq5-H=~rKA*~Q0ULD_0yl#qPe4y zFc^$<<(LnTxpUu>B2K~0($ezzUm{A(a|en2Z<3``nMjnx#KbEU39y`;9C^PTaJ^tm zSE|nVBYqc5@b>A{x~rw{vq5AQ%IMVNiWXT@N~N5xbzUTwHeCCB5`q|J4hRIYtnQZz z9;UZM_G|`4n|0spk>X;jnEEfuisoq8V0ZD-T1yz(#Yo6#D`Wqv-@vBj z-IqkSwY4?=E5PTg9{#ej=g~r4;O$X^=fO;({qGhByJR{Er7y2mj8*=gtd7lnw{R1s zo&IW-{B5GrxVe9Lcg#dDt@5Bgm0X>V+92BpKn<)lxZ^m)x1?_NoV@OkXKAfOeMb$90<7$|(BrI{nz z(He9MOHSTB>tnd~chS?M05b+v@^3gt+_I3v%y_9LLw;ME%c}Hqz01n()fyYZ&A#MP zi@qcy*A8^)p-W6=rGx^v&0ZtFPf`?PV=|99;A|6BCNlZgpG!+gpX@H@iavcBrTU)L z?VzH;eZ_AjLgAiju>h`Ltc35}GC{Auzdu!4-1KD*-3)-(yr% zRNdX(j1PLeN=dW>V!JsC4bICgD>jrEnVFf=i*65h@Vmc@E|UNDilt}SLiil#co*uO z4s`M`Ugb^gmjuGk>-+DRm6KdIy0NK+?7OdKO-`=fhV%-zpRRY=O_pHm8@WY|1>AQO zDaOC9cg!@o7-Mjp{}J^?r*yK;K|i1W-`2qU?!Mdle{J`F7pE(Y7j0;Ewzk$h#e@DX z|F9am0aZhuB4`7L?;=IE<4 z_{?RaI~K!~)2MMg{rur(lMt1cgoMQL*%&Wgv39W%hUtr3@!#p^-p5OJNWJehrR~)H zqRl7=1_nm6oqrD0pZj0#nIZLlx9D?xc9I|Rz=yPXf4$o9NwSMK0n9ShZhlDXR8rFq z$9ewEPFx;Aq|M*@CUu{Jr%E+NkcwYq^4EW`p^TpRrq9DHExlr=@28%@ErhgLoGkt0 zn+h{G_vk!m`Ql*n;g6w&?r!MfV77=$9x^5P=M5FZB_%+tR2n;RoPfHEATcvDccq{I zJ)E~d>g{6)ar~c?l5^+@a<|>NFAfHr53u34&oz0JI$0?}p_CDrg!L6IV3AOHSv7KO zhq62_Mnq^r-L98?7WMk(=3W<@v<26vU4vs&PE1VHr-@kf;-T_dk7QZB$$YAw?zsYL z489w9@_#-(szs{4{l5>v*S!WX-T_?Z1l~?8#3xZ)VphNZcdZSR zI*T#U6(V$w-?b6$Ia=MTBRdV)#N_$nq> z)Gd5_D8pkTGGrdt`($M#OUS`}l|j-srQ%J)5Az-z0tRVXtz}NIBpaSTi-xWN#(+zB 
z6V~`~9p|c<#wgWPRn0DTMmXQpbB>+dboM6FTqGu3lWtg(R=)3%dA#eIsd$5coMxXB48drKfqv+RxnGgJ%gD`rx!M{u0C0O$;>2<&ovYliE-sZ>*&V#1xYr-F_K$oP zH%AStAH}_%=Lh}GoK*{YfbSm8eUHc#usMhC=A-huuE~P3Qi*%AhdkbkEH=2 z22O&KhSD$a;(=55mF5QzYRcTBwX5^|>R$O3;J2IW^HDnJ=c?7O(@xyO#fIdD75{iD z0c!-99k)FOD<#v9gd(GU1lwt{oxmWI2Ns0aW|THB}k=5sig%BI+%qbb#cc;=0s zhi#eVwh)=1f0u5O9uuA zH+pGYEFT~4Tdr3ES^(T8&`V_jE{5kmtjF`Tu7%@r_di_(Z8tq#>nQp3Y5T7}FQup} zqnMZ&9u1;Ni1qMU#4HGGwh9zh@WwczrbWneP{~90`d=>V; z$IIE}<(&T4$0JE}67Dh~0Eeh#f&<&VhLT-c;nASX)xKxuEB*%+p>6r^-*+MSVxiXV z*ruZrAKk1Y9KF_IHVGYvU7jQbYWnAA0G z!)kDVtQ8>W)Qd408K0yCDuL*p_$LA`E0WQ8R3usk$8!xX45%1*reM_kK45h~I5^@# zM^ZC02k3m3I_jMk<)6EN{kS|?&Cki9J#N~iM?#Fu9sD9w!pB+bUMl#`*LwJ4M~mNu zt9b6=%CBtDCU-Dg{#Sp~8EiIR!tYW(!vCz*^s0!qaO)Z*^n zK~>YHWq%5j<6?axm{`nTenX&*6Uq$e1rD>7^+dn;EPA!tg725SLI6bo=7|D(JXNer zn7k^c$Pd)PREwVn057FnajzS&g+P66gCW=&3R$Uf{780+M6i;rN_y~ zmkQ`$;_H7L0KL&|^$qIm|1-C+;KPdmt#+Na%5XdDc~TSd_z;TG(Gh{UKmCOiu+FH{ zv_`J@#7{4Wckkan)9!qn3K#b|e(_DOLPbM^6+9=X(B~JKoheKTWya0I4J&_EGY_@3 zwLQTWfrI7chv=Pql}-)hh1p@;zpo!|4Lm&sdD`yy!Py9&XRQ6YUQxgL z^E)>s{xtU7P zmPjEQZ~dJ*Hxn2%u#g`?8{7cW)c{Uv8=SATBYnKPAbesH$S1Pm%lw{A69ej;Oe;om zDbQ@u4I@UVqszUC2%fg#klG(EP$i}Fo6Tg8QL)NRp9o_b9C?zd8}X+eF# zT22-z;sV`2o;3~>r^krU>^K+~b93{#UjZUci}l-C4wZ!nIkWAO&2NQcHI$COCjsAe zb9Cg?lsrTG{Y{_k*)#0WSD)0*h6TpZ%Fo(1lO)OjJv++?I?km4G5)I3s3|4S4=kzV z%~}LN5f;;peSjbPwT|=e!0l-_dmp6;B?5wD1NRK7yge8NDym&5kDd?Lz?FaXiUeqc zEimv;!3fpX3Ig#t6!qk7KUgG`l~7th+JMTZT7v?(TW`!;|NZ$^@F$JIB=8K~YVZTL zU4CM(KO7fsq*a)F~+`Yx;e-1~RoixAP#jfA7)?k&`DApP>c+5noa6oAwRt~cpmJSGd| zP-iNQ)dA-Pu~vC01MvW=x7~VsY_ZlJhE!`i5$>}ZlmSri=7rW_5(SqbuX#6CI;Vkr z+VlU|4Sulv_f>Zt2xwX{_wJLP4M3kP&z{9T)A@YbO{$5+Yt|Vl85<1DQZHzQ?`a3- zh3#z%6R8O&&7z6($~~Ea>IM$FO@vV4msm=YY!51ZVlkZxv^v>Uz;N899yeEPOokGpCU?Y3As7s_)Uei8?*5;+e!%cYKCm`DC*nIH zJ3t@S*o@ITx11?G+^()A*?Ei7N}l&KBlc=%B>R0%&M?4hcH^IdI-g%)+kY?DDnxN< zIl};q$TgI~!^X-Q4d^rjh_ab-0|lUo?w^-JV@VtrYGZ*c1tTsRd?(7LnYT4-60$w1 zek4}O4`|NEg7WZKCZz3u3cupVwpQ2i2L53G?$>(D#posiG{Fw(Ru~HHLH7V=Y=He> 
z0?rHUSjFXl4s1`mFfHu)fSfGs;7c=|VWAy~$6w`2bDEw_hW=45pVEOm41(PjrrJvtm5O z2X5mmVW)+yN&y5tfl+ypCHO@aKUt)twDbb-FTxy7PECDz)4-ibBVrA-D#A*#uwr;K zP9gLgwoOkQxk0n=^!rv*Ab-I$3U8I1z-K=K0NrsGmH*-ZR(v-vfE$c#5dr|piQ#>| zr5d?qdKJ{*)<(9e;Ol4l0s;aE)M5y_B*w$|;%@k0)h!q9H_ff)!8-Ujia5KYK%8S76s;6_V1J=GYSeeq4c?rY>d?(vQ^m@sI zir*5;qBmi^W=wkMitx{Jvq)DA;cluj{h2|Y#J|h^F0je_0EUijz{)FDepxUX&JlYL z;NKOf4TRwZ&F?zj9ufm`7WhjVur6kMH z$B+GBaFhW#Qt_FiJ$v>H;Vb}4Ojen+8MS>2=XKkF(g$5?|NDD1wCuBrFAMY(6%a}K z#$M#Wvg7Xo!3He)3lJv|%G|kXt%d;WwLf3O2FM3kza(O2CBzN`>q(VDwH-kyT>ylN zpnVB*_SEDgB@nQPz8@0|0G(t5Tp&;(X|i4TMpOu*5007~f>7?bR(HtRyl6vD)( z?(R=z4$A8{FLd7?#G{w|s3{GPyj~5-2hYO~jQ(?A0KWeBZLZc%r(OCD@Tj8|2VgJZ zYY~L60TF^$G#qu)gn~Zi0O8EBR);(-1JJqzNQ3&{eVsslQcC#?GKAc@sca*59r4sa zuWZ?acIh);A?0LtsmUXP`r6ISt!ayei)*Axho_~37f&gHQVvLZz+)Sr3rx(+jZ-&>Tq2#v zWVHFDRUIr^Y<&Ds8fxK;5Wr?I6DeQ=5lf1GcMs&#RE>>hZvvIhniP=j2+=yyrN6X3 z(aP(4YDy1Er04Rv)=(zbr8gSQEUC~g+!Qssi zcXI+D>eQ4J*(e-x_W!<;8UOS&>x#yo{-&S&cz+Ud`R8{$AUCZBXXC0rFV#iE&^`!- z0xfm!skT4)2{jMO=xW9`43b1zFW!lB(f!355Vq&HAZ}fqhHG1y~AoI(sPdGrc6l?A-f_f(bbVPy3 zGG5s0>XLylOfFlzd{ZC7%sk&3BoY6F=Q9`}@MF{&2PekeNQf-AxXEm*AL%8K&#Rr#5gqs0JOuQ>X#A_ zj%R0Qa!qQMHZ}ra8ZZcG`$3fG3bM{}1JZG7;)K|LaU7NXQs&Ru*kZtekU)y5tE(%^ z+HrR=7eohi_o(qV4|jLTO1IXvFAU(gLp4Gh)$yE1lVi?gaiIF{Y3#3L(KP-G34$H6JeV&jND^Qlv%xjZU+i-r?tWIYu9># z?w3~x?&*1Aayn;`{M~Mf^{eg|60PhRaKHMo-$A$uRuB9KM5&H-SH#o;c8X0*dv>WbT8TmfkyTjv`IHYZzAD*T^(wZA5<^f?T=)AK*5K^(LSb?*_hWuGxnND1>#8`xb!$yyXxFo(#}G zf`o+Zr`LcSu>?edNS6`*3<&=-V7!00Z>s^5?*{}7WlGH+U}Qx0fFQ&%$qPJ?n_V6) zW&p;n2KO0FK=;ST0HKEw${d6t{l5(#Kqx%@MTd=%v=c_mED8*#$L{DSAbVqG!=3c3<0q4zUKg|v@p*}FCw$o+Agp#L7K&MCn%eM`Z z#%%!G2t$5m2M7<9wL2q$CMuG$` zA9#cBeiznx0mov9sx`W8f_zT~4xF+du%q|f2f)rF%-cFR1j?4jf4~MRB~tecyl0C@ zDyc^(FILUGVZgU407Ao`TY!MLq@;w`>yNH;^HF^P$kVF*uk0y!-VTGQwgnVg`Grjj zfq7B5Puu~8*MRtug4>8vR}3JOHHgI#RttC}K9C22-6uzQP8wGd8~|fYdqwfDe^_Dz zr)UbqOz`!~2fv(@ly<;)wjdD){^k=%^ATGOQdY1>#N*Kxz`Y|pJ_x7JKs0*_K!})} zToL#Iz_Nt@{McS*bpzUnO-drL=NeA_@PQ2Br5EZP=G#I-RJ>5(D9O}9aR^6IqDuN) 
zzw!kb3M#;Ez`q0=^e!QE0tOysJLn~WEHE1&p5R7gh9En$Ia&FI{KiKuopazned8WL z<}bXxh4oE>yku9eBS8tjn{|OP5N!v!YPfI=y_Ek4p4OZ2fa9hZ0CA|UU~w+MM7aYI zS`e}GN=L`PuImlChrrx=6(&f8qHoGdlfthUO+C%~|Q8^ug<$inj8K*Uz1fivJPxm!*< zLkFMPl6d}^ewIAn07wrhSRyH%T|1mfA$lv8B3slA!YCtqkO!i~f5!TnF&>(^;^g4H zwGVcf8b@PVMT@df2^ru4L#OufY~2HJ{+(L>xj(EKLs_{u0ceg$aQ#+d{(sfWb@9j9HjT4 z;*`}|hu~uRd8=8oOR#A^;~pcVU-hf#-K(P{fcp238`70~FnC~;md*nkwe4pu(Ys@j zEJZPSEma7X3b`3~OZnBS$!FZq4$IK~Wbii6QUSiYa7-M?X|@NEGzIlIb~kMud%4u#7%4U!l0 zt-W^={(S%Y1_0KkN%a5s4S`M;p8t0(C)9bZ$EjhN!aZD`@tU?3fqtKh*|GEk2BmlP-@dpsWvGN)lF_i{!MT~FHNNgYs@fhH{NrWzlM-Q$F(wTMbdE!W%B6Zk%Y z`Z2Zc1Ef5fLFLLp{^ra5zHFgI1X3 z6-Jr=#(pjg*xinHmlqis8B{XfqJ-Vg9G-vn!0u;06UL(5D2!*eyUgOgh5ZwdI5 z?!E!=3{V2qb2zZ7fP0Ssp(peT3?qP_fn@aD?*slF)O#vQ0-pg>fX{8IU0&RX+Ak6q z2>73oyKvDHxCnR{xEmM%BmrLnF9Jv1?kxpejb~R32O5JnnXlgm7Wi3*rNAyf?sJ&+ zeF7+t`hF1L`Sk3ffdhCNXbL<491pR)7_+?Qz+*s4h~=FrXo2SN3&0W^%4-L_0*t0` z=5ukx^}tiW$H0UT%ZdPQMq}`M;9VfykMGq10!#er8(B!^2Y0rZ|A+?e1nxu6JQ6q& zLV59KdEJ5E5g*wbVjaWChi!;R6-WGlgyuoOzqIQGBNCVg1b&W^Z`4QaXaL!05h)J1 z1CcxnWstB>0Rpu{S3v}|oh>!Zo8e$rME7=^~>I^ZruW(qhdy)`w+l@>P1>p`PWb8Q^JUEm^r1@`P zK5(h`+ik%k$Xp!A#-a++kQ@Zo1O3eFBpMHYBUA3_Mosi=dr|+70UFs@ehe}Zk4Gl{ zP!NGkLkoe+LM-b{LIm2B&Lnhl0&>k+2;9rRLjzhGxqSRU=<7us+VCwxa_Mw#-ir!# zfp<{*btbeiQyF*`wckrfffGg%<5FNhAsn+-2$)A+EmJZQlEEzW{{?Qfqf&n27UY{@ zARmdRDY|qyF~}^~($D*ELcU&qkXKU|h1BVzD1374FXGX{y$hMgpT_g`q(25WZg-ScbHZA97}xB`Txnz&*%3 z{s_|e1qv%LA{32pJQC8`sK56E*Sh_`4AKm2M$zS8BW+c26e3B;*?$#U7gnLDY)__^ z)@V-N9pZ16M-hr&QSkgf8{t6iG7C|Jc_?Ur*2oDIy$|i_Kt&XBJO^!VgPlt;s1IhL zwXd-KRZlk}kn6}Kv`M~|;5BXMA?@h`q$si{xojAPf_JbdG(QxSLFR>m5P`HL{ro8F zR%EI@-ROYQMp#W)2BlibY zk=Ckp2*2MBwU^b$Gx5HQt&X(S+kl;@zHKO;gfAAkAbgLA>mEcRu0?Cc7lfv9v!Z}! 
zQ#8@r#R(T7V(|*0nFG&7n$Kz2pPr*Qq88E^%_cOnY-Kd|=9zy_r|)r5pdxzCMQB4B ztiR_uNaoB&0g{~wEhr<5B0?1M&AI(F*5PI37vK%&p_C9!z1)+GcE-g;8TNzPE-M$f6jp}+PkUh}2_y(nl`3w1U6+}W>qXjQmZEq0Jpq|$&vvv&5$u!jzW#edEQ*r-8WEH$DW01vyHEo8$bOC5E?9SwO2DJY z;xQHIN^n+_FitZrM^SgD;}niKr1JI0;r%8gU<&ddm`L%A#(NBzLoKrRHS&cD7B?X+ zP_VnGtI;m@;sXAkh|F|rkjd^L&K$B=3PooAj5KDyBC=}bWO@jZ+hDb-*CS$P_2vHv zEpjXZk-G_qpzc5#!LAgaJW|b&Y3vmwEYC%GA(tQxUkYl68z@*aRRU>^=c4{7j%2p5 zPzGsGzNO$f=MUPTc|VTR=Z;cTh(+UM5!z7I2(p}psO=Y^&BfX5FJ`SJqtGVm34+tM zS3~RMmuRiL2d%4NoZ*;4V193Q;ZXu}_8)>wWcdlsT@(*9Z9%AjG)xvbIEDmepb>=E zA(9wOe&!fPYWb^er6nR%v5$DAL$%q*L zfHu;>AP%KKSzwlRwg;uuMlLAxQJ&Lav_|ETpYo5udz{_=Vh*I)eG9eOO9W?*t&ajR zW}^+~E&P9bXC5S1b=~1_HqB^O?Hi+o5m*ReR)G;R5I`|dhJ+Xlh2!9*2u@{%vQyx4 z#pQCP5?3&`v57+#CncM*O^{gviee;Ii9isNSggea0Rjmb(85R>ts^OaeD~IE4Kvfz z-P1EYlRj0idY1QIzkA=g=lt&PoO^DWu^V>Mf?=aC4d8dykY6pkhmfI(bbBi@1VaIU z+=g4BMX<1eV-xjzldBNmMvI*^85#oB96{S_FaE z-%wcY*bEZb`6Egv1*I5Lhl|q|V+u+>fVc(8g8x8P`8tMg`I8E?4R;{;zn$E7wW-07 z^>9IOCBMuARmR{hbIwyyo9SeyITP9DNWI@z+I03emUkZRqn(sSf-3->}c~4NndQiJ{nUAwzj5 zxi8upi{tvoxGT-K^0-G?hJkz^S&w3`nrXoqM83H>Rp{u}V9wi6l@hzivoNIi1cIe|C~TE#F=QG7=P%Q<5u1S>80R_$LI| z8*!1(BiQGsGBm@vr(%t@hcMips~m?6|5I3QrxZ$B6-y?et}R79$aQCmCIl+?;AG)1 zD7;2vsY*TOx!#So_OHopH>DZ`ao@ncx)1l|E~F-2S%!gBVo2`ZY$p~dcTqSiav9d4 zd!X>g88m6Y5XBA>sOSLx7Q<%WAoraZ#-L$XNO5zyX5eh_8FbVu(C9W6@jJ6g+#2dp za@)3R4wj?%fbQRA-r(w34`qapt`g4Oj1pq@Zr+8bNei0^AT`7N%Q7{22f)YDv( z+Q%ak^8k{%pF&IF2WjtpJw=$qi8|KIkxo%ao`?nVc>z@jC>RtqZJ5PA4kVM&^cJM&@kRi zZfw@+IH`#xQH~&+y#@iy{~$|s6Q$1@OuX3CBWQz%d!aHs8ds&(-A4o?U<| z*&oE!#{I>#BvKv4d$AlyvGZUDLu#;2Qv-hOjd8^>$+pMsKyXEiR{2Yu$v9YLVOe1#`WQN1@myfih5schgOAQ z=AUC85fqonN?(fKrVIE2{5p%5X$M^GR^gT2<3e!qrX!0MilE83VJW+{u z=PRg7|Alp8=hC;y(n;9I3(?;H1OfEaVmxeF@{^&Ia7E-3y<9skT=Pk6UpoTZJqY}k zVW_c`B)0cU2=105Q-3kCVh54Id=CL`WG)ZkD()-rJ?_Oks3o|C+h9o>g7A5$TMeWM z(N+Xf&tTb~y#;>PNw|;id02OGA=X~(i_F0g?D-fzzb`!3(E^{T)T9l=))(XWUV`KM z-7MEtA%ngE`?s3X9Ycjm8@6)+hGW-YJD1{^>`AIn2j_=tr{P3xMYy)Fx>%M0mZ1dl 
zz}h*t;EZ{Ff%m!@8J(}^d3_x+FLBv-cXAs(^VJMa-M@UL62m{IVOVfC0@^QQGJF$> z$vc7$V?P>+uVEdyQj{iq{%f&BPY3Q1xt8Sk4(D;-1^BKPV~FBVmh+p@Vc){B`+F`! zaQs}3-(QP>{0f|zci`9_iEVoVoz`!1yl;QWR0L<2k`_el=yFayI*<#|(KaDqnt<)- zz?`!?F@2*Hr5Pu6=OXxPY%Fg!UHBQK9)|MthEuN0&T0seW|Pm^(h;N$3z zx{kq2nT(9y*(7FfHTKUl^bQEiQbID+Kpsdfe+B9D)p_3I(>UY%9Qn;=8!%jQH`(*= zE<+l75$3(*10n`e+JGM-eg6-9xPO{(F~MiU&zWUQ-ojOC|3dcJ^I*%F*oHGO-|8p= zuGf&AeVpvZ{u8(r+lv?i+m_}0xybZ=mQ(kFuWHtiN-$JHrHv)|+1mI6& zia{~KQUiv(t_$CP4~B!X>3=hh?Q1YByB(i>H1<_B0;e|+Fs(#3qEM!+SaJdakTWn` zJr;GX0`+SPg0GjuQH8*60qWBO2yXZF;d?F**VW)VUV?yfxJWhj;U&nP9>&o5GV}#K z+42zxR^}t%X~fCT3LFQe5^*S0#^MCyT-3R0oJ>56ZQa_#b>q=C%|*M|gy~T)kXHNm z2Aa!K!ZH*=9vC{j13}uuIbM4s8p0dMZ^qV$VZQGpP2Cq6gNA4%g3H%>KYdW70{BB@ z9siWx*|?Y!bq=yf?Z}A5oRx$4UVe)CF#C)1$o(Rf2!=j`;foi--{&Ip^#FpCj*M$A zLl$l|>S1r-ZVD!9e~x4GpEe_i{%2g^QY@)LgMSIKON|JeYEd8GMn|`t!Wq~~Wa4hX zcHfWb5ZPw^k>^jV4 zD1$uE$S*_VejDavbY+bi&}o03v^`QjaslS=-Gn<2$Xf0~ zU5ZJ^(-7!fg}Rbi=WPQ5pPwLeR!UNh4q!eqeN)hZwBx(0K^AEZI)!IZUl(9lbvlLy z{~eu9PZATpFT#Z{&*RvAivD%NE%<)ULMJyJ9e)Rc>`G)f-U!bd3r}}t%KB645!5a~ zC%p@S%cGcU*PTqkcXBB@>IMX@M^HDb@jXqw-=2wasO&p#Pi;mm8+hRZ(U!t>HO@8cZ zF=QHstuIIC`9svLE6`C_p+lX3hIJ}VI5K4mGMM*}`$CGrmMO@n%twGR62bmx1Wbn! 
ztS(2#x(*#moM1I0vvzLyYzgNqMJ98g8SGkY#~BFRPb3w_;HEF?_y>6P34fY*!)DitYMn_-qZ&_%e=-=P(B>^ZZ%in2l|#LFRmQI9B199Evg; z^&%c0Ar*>$6DJ)nA@jT|oZp81&=!tX)SZ1e_SS|Ya~zhXEW-xmfrjpT$gVtufMGL* z0f+N(`hF^Fa%|(n#Yo>Lg;(N{OX#jpnSsFWQp{2L1?D)#+3yq#2VID{P-pVtT2kB4As<4( zz5#QHo<}EJDj+tSj6zoL0t6m=u`N51QJRB}rvd?6BQg%n=!|!wqk0U1ZmA}=k0XJJ zc#KDYv=6~p%=g=a?=}+vzJ-&R2SBCQ>3R zxmd^UZDb;UjeYq#Loejog8iQf=HeD{J4rh9D)x6R>77L|7j5JY?1Ol0#6B$bMSNu` z0U1sp57IVL*I@44Psz_XoQGiOJuE(3Y_QRUiyf}VHnijF)`!r!79w#Y=&Nuhd?$hq za5lIA^Mj_KW4<@z|NSjZ;rlMfkVGdMvYB*W&~+mT@ckMc(VO%Sn>rIg{Dla5Hz6Q- z2K(jIaKt>A?lv3WiNN$pG}hBFKWrkl?LAT(W7ztA(jKqtNyVlILn_hHo`!9WVZnA} zS)WFoIE+9pI^)^65_vR&puGqjYQkqNwy6SvMJED=_epG0JT_5!%tJLg*$6;BicWeg z0@>PdP6Yz8T^J@@hdE9$kFGvk6ItGN1UV7V4MuITi6k%)LEyCTjCG_vkt^_B9>90` zF22752wF#?uEd=mj-tM=!?C(O{B0_Zr>Wt4;^M4*q^;v3m>UYA)=C8StvIon7Cu`M z4#$GH*D2@Am8BnKxPiPVBQb~U8g#&m5h(P= zq)f+T`73dTbvpx}KUIOi~bF(Ax~IkrXPMhel*kxOO_uwCA9K-;5!o zWgHtW87P^JA>;YTgfyZN9Z&E3Np9mk;d2IpuI=cQ4j=%V zj)uCo9Shz=@be^w-(IF?ovJ|~bP58flTj}ZkQkEaBx9&>6FRyrp-u0E&v$$Io@_D} zoys(9Uv#*W!#!Uku`9a~#2kmtXi7L@k>%)&BS`Hjr?Lk7b7q&LC8H1GdG(}i^R|Vy zZvNojn`_m9JWlIyf3ZN-Jrcv|>9O!cmPn6a7Y93{p*E zwIgWRg>Bu6dL7xqZK&(Ju$}D;9>X0|5JqMBoP_ph1h%~cbz(oM_;nTP^Qdsn(eVE` zhBH66qAe*iab+pG3|EjBr5ahePa^QyjUXxR$=ZYlZ8C;l7GnsreJ~!ji_{|9H4j%I zwxUCdE9?%UvDqK~)=bL7*^Z#6BHX6}o%lKo2d$&~MHYiCwHU^ljp5MKNi4%ibfN>H zvz-WHV(qdQ!)LaYrhSn{65u+H#B{XcJFG&-8=Ywlw);rN-y*vb9Zc+#S`y1d`1`(a z?9RAHl*({@Ckb4>i@Ne232gQ^(>Olkn26wFJPC}pgf>*;dymf=nUA=q@8NKC;yCEc zc-E@$zbYITbs6`J_p8ddCiZ1y{`Zia@IC}5krj61J-m+ZGw` zqo~vI?-BUM3Bg{{IEb=0{C`jQ`z{8aU1~u+ZX%WOi6P^YpE=2ivWL_k q@ZpF|eE*r_;{%4@Hg9Dq3;zf8$WiJ6(do|s0000_5a!NJnqUU7g;&#E=lEV6ulcS~JRSRcJ%S%^n zoLpu%)XL&T{NzQ7&XyNkY@IH#>)Se7Qk=EC$Sx|zu3~AylTW(}BQqM1=12>C;iMvHrQad=3r{9r;Gj z+Gi5~=QFyv>8JxA=I1MGX>GT&v(q&(+0J^KGEJAt?>I%8ZvRxN?tzVT@+vBuhpODH z-#qqhdGki$e_ndN&_GvD&p#qU(|Wduyjh8~iMIIsb(thNUsl!Itc`(NcXxDju>bPC zdiClyHrah*R;@Q;WA%7{7Jb(ts6 z68E=jX=yzjaKL-Go*Rp`d0}??bmN_G-|Q||7+a+t8vL4V#6d5;S&MQdJwGgr+CYF2 
z|Mv9uzBDha{9O1jH}_?Cj?Vnzq@MnnGwGF;iF9&W&jhQ_nwZ=g92}&XE_nW2hGEyP zw1*F=xVgCx@bfF2JGb}5i4(eK&RjZ!cPc3ngX|$V0U&^pDRL1Vu(_Q!YOF9+Q%C`{3CpA3HNt)wH!$oSj9Nykz&Wu%zYZZ&y}U zZf*RQ{5Y5AV9Z{P{GU@HLG!0)XD=NB~pYM22=FIF?y0nrKhKh;` z+*W<~?n8S-dR*QwV!JTxloDXv>g?=nZvHU$xDEe-L`K<^+o+E__wI>pqR?})Hxd*S zyezrTz`)@6*|Wd%DT}vn-+n+qKnWKww&cVgpkzML`y0qaUK4LRW`6({m1D?;qJgchS-k&z=JHg?kpgX{MpE)j@*!(W6HZIU1)=?;@K( zSNBFl#I_JBW1(ltqYK~cJCEv~JsXS!hF+rLj-a^v5$X%p4ft@WWhd7}~%LiZ~Mz3A=T6&V?M+1dGKP0boJ+c@A)O-oDp z#0lz{*w`xz<63v_++pY9+L)no>t;d0q0gT`OWE8Wq~uV*jm->IHQfQUj2k2*CW@|a z>gi!nQBjGychC3xH6~`}lluA%+vt_K4J!fqP z!`Ri`O}4#}kx_R4z`)CQ?-;YQvrPt_4Y8*`d@#mkioYNK@&!A7FC!zPu$Y)Xwt@KJ z!)6DK2fH_L{4hVf%dvpEyr@W&;36+CCF<(w)2HXd5)$^}v0<0fvfggmLz~FNj%R%y zg@R226p>d{%t+W_xBTp(i%a?Ldo83FaHc0xKRRo678r#x4-5A z@sDFmxv|igT}aEy+Jtp{6_WAf z$(o9MXUqiPFDNhX^Ln=>uQmHkBaaJ|-(yR?nmc{!)J;?p@Ob}-hzQ) zOVz7aD%nqVA2JOY%J}m6v!S?0=|zgzFabWcgULdPAyh5*4yoGM99VdOXZSAPC^(u^ zRRM)EwV?Ar=4G6q(OeUo?~S)#H!RJ{&wnqdHGdt$Z6egF5QF}*Pu6RXwA+l5p&`>v ze}9onA0(Il{?%}xcz~`1a3AZ*)oX5VSJBnoG5$7%cK7bxp**(lYFHQ|CBnNa>aqOUqN=yu!n6UX!>&w>J*-6my@#BNbzh`fM{CIwGX2^)+ z=8dMY{X9II>FDS}ZrxG_zM?Ojx3+Ggvg76Dm6VdI>*#peVsc0+;L|7b%$yuCZb89Y z^ARYYdeqXBCr{=U7XzxQJcEN)>FB7pZlz;myB!d)F*7Smr0!u^S=@YU{Qdh!-apFC zbv2OVMpbM;MMlTO+zh5=)>&FwdePVS zJ7bXvsKhBIwiAuy`r-s#OiT>fmb^L-#JBSL?nV*sSx25QunjxF#>PhYo$s+@$M!HW z)nThySy`cCZ}|Di2M_ldDzNVmzpv5jVs!d+Af75B3VoZ znO247uFLF`xxS-z<;PYmSD{TO$HvW@%l>XC0t%}}?t97RyRy#O+FHPgwtU*GWxf_gbFJnfvkMW?I?>Yqx^E=%J(tl$18MKR6bS4Y-k#k`tI# zQ&Z!(<`}539}9nZb=lRp)`#Wf$&!ZES9CKJMiu!x{5wY-~R#=Z3Rq&z4tK)(sCon6D{m zta`s|)VbpN%903vI6fvRDdOhM^#+E9buVA8qoJXBT2y2%Qydf&6k$V8wUKl@(T&6`1sWHI~`!4eu?54nd?o* z)o>^{I_AfxdVquC^-(D)Z|`7no*Lv?DTLLJ3Ktv}7nl0mp%z`&k(+M2Fy3ZRetG}G z@7r=~A&2eXmAHR=IX3_F@~Fbf&pxwO&u~V;x}fdMvdu4B9+O&2oz|;AL5V}3bJQgYUo{5i+_A9o2`>8cSj6B|;ZF_G*=@tatiPtAC|Xy6l_8BLxh zs_DNzefqQmScRHtXl#_n4QML;{MP$eZ#y&gd;~k=&%=U(+t>Sj%>Q=s)T#AY6|6J} z|CpnQxH#GL5IuVi1!ZM%3kV3L=H*3Tx1h#8&W|;bn*+$rFU?zk6cJ5;t2A$HZ`U>t 
z@^ecoER5cA%rO^BZbji}XGaa@fsQ5|$_U^kr1bOGuNhCD>N<1?{74Ed19+V;9x(s= z=8N;*3#SbXNG-owxQC^*e*2*Mr|daHL(x9B?(fe*fl;gxA$FDCKR$19adkCf(8uSp zIbJse@8-N=MBfOeRIux`)-0bw+74`m=C^NEO-&!fr_$dAK{>MXbh`P43+o&mr)huq z2z;r}>{jZo52hU*9eo+lFLh0epR%BDT}&kC!YqSi6BAp3ntz z^RB7=LY4Bh{O1S#(VGH-gU3)=z^}&gD`Uc5OP7HKp8%y$9`!H$HtbEi=-{w2?1){N z{_tVQmhK(9q62mHAISN(?|b&H=xgY>J^)u_Awx--A$vbR5C-J*-iwoeX*m>3O<6h9 zQGDT8{G~sCE;L8&-y?Hf7%-u0W_FwO*Gku|Km-K`2Z6l2y!dso;o)1cO9$*m0u=TW z#vefAD0s-MhT;CfuDB^Th~n0joH7PzOf;+1^z`*;&LNSJuRuq4?A*BlG=)b-k1Gl_ zmI{T4v_0Ap1+p%XM$x(gK==g@tYI{R$5cZw$BsT!l&)T=717 z^ytL5Zwh$y>-}#0(Nf-E*_nDyADwD=K#yw&6)7|o&CnBecKqeWm&9nnGVUsEKZh+H z(ahN9jqRX}o5st}U%2pcTcoRt%W?qYv^L$HC@m}$NlH#8U0qUIx_)3_FDg6vl7oXz z2-VvM$LvpPf=_)!e*?hZtgb$`Z{NQ7{JVEghi_)$z+EVZoAb-bu?-9iObl0hhF?8@ zbYD9qiy>c@eHUMP?vp1pmo8mGU!kDTd?2&e^Y@JRA^Vq!hwYy_e>pKei{dnY6|r}A zs4tQSl*7zxeiK@ck)|HM!X4S9khvdUG^hgW&d9ZlUY=Y1J!0|Z4j=uN$(_`J{m-1( ze3nKjBlaDox9@x6_`723moH!Xt38jDIlkWvhS7Te2(87d2tRZPkW4%}e^5Afetv2^ zPqYV$;CfrD3I@x@omt!Nq2uZvxOl4mT~CkrPQ!FZtZ!a^{^<1dEuYopG_zVCdOCr8 zqT#3d3pZ3swW?OB0g5bUV9I35YA6L>rz>>$k}2b>{zQXf8C zdNt*4ECgoy{CWn}K>>jP$f^Yj$7SK(o}N3+R|K#LDz1LL{JEL#LzNrt3tv-X=+td2 zQVi?Xt()j8V1o;=z`~&bIu`z}hX{kOUlk$SU>`U#iZL-U5v$<RDArY`NM~#3Rdi%G*krM3fonrzyPfn zmcU-8_=!>!)V+jajyLE3Jc=TVh#I{A4W^U1sdcA_kL!!KWAjmf*33+9@& zUERg;+fK*6tHP4<9Wa-#hQk?&tw<(HaJYxf8W!=y)l26 zcTV!kC#B+lg#!*y02yZsOecCD@BZ-NL&Lb&mW}H#T3e4n1ri#3=|j06iry-{rL}dN z+}aXnwq^oH^js+(O`bvF`W)@#y1`O=5ywICP+C^jV;%u8={|ja9lJ-?OLDyJelS|- zD3D+B>i9MePEJC~;bDlHziMfDY%;DTj@F8`2G1+M`juwpL&b(&yLK&p@1s==+9t27 z+uF|0+fqfsN2|@czEltU%&F- z{RHVzQc?NTo-Ey3pIcN!f9=|}h~sp}lBDcs0wSJyCeR@AXf7m=%R%YglwzloywE(~4*eDzv9CH($oaN=^H|j@^gA!t&MhJfc zRy^L%vyUc6uigpWuR&Dd)vH$uIy$$vXt-Ryd;^T_fs8wc)wCm%0_Wgv>c+!|2b7eR$?84K%X854 zh!~!_x)fL|kqe)vrnX5-R|#=a7^XjezHiHxE#?Z3K+lPhaP_KxZ0t@b?SysJ6CN#Z z-v$Q;ZW1&2a#<3-fGR$wsJM9Vg*&maV5b|bJ`TA2O$*8kgViT*Zf;&YXMB>Y^}DXV zzEX;;_s8k!?M_Zk*F8MSq-a5Jlhj#bK;9Y}lrl3jVF#UU<@_==^@00IW26FCcA_jL zm^Hb^ty{yb%-;+SF5j|2t;^&bF}R=w4Hh!u*~;eFY5kOf8D%YqNO 
z93V%7U=edW+2xQl47T)DEe{Jj8zHf#eW zjoL*Y1qjnKc?N!ef6tCVIKP4e7k%$u@L1DrP62^!Fq*(0f>Kh_a&rzVo2i?9+EBoj3&c*5XyTNS9 z)=^McU+fLuSW;RFlO-5rNnTZz5^hv|iEa1CZ{G%AtB(3Y7ZH+h_u$#fU*7D37_T21 zN~Tk1Wo31I(DKfu?ICa*98|bz?9aki;t>HO^{|JePqY8{oiC zHW!f7XxIakw5;#|s>@it_46&N0F{^;q`9f17*FmGsRbY^Clr+|O1 z_CkSHpuUHP)G0l^#+a3mu&k^vl=H23xuZmooGAL~HB5pJ4@xq@UU*LOr&Ps$(p?@|!I{mm| zFgiB29=~|%GI=VrVHK+v5ENwozRa<|;woRG2-gE!% zirz|Gmh?FoO<<8X%E}}Fam0)b2naae|MUVNMiG3_+`=L{D(VJI86+Vu1_sV)?vRv} z6y6~3cl-v5dJJnrOaYMD_ysNczEAIuE5U;^SL$lJe*}$&7Onz^u<(ut4<1a5(yHIF zQE_Rdr>Bp4@PLLUxPG4)EiJLbbLvHdLn8y5>plPenm!Vw_qwG;EZlP|+P~+oZ3zhp z#3~r6@$P))WJb($2zj=Z=ag^`GXs6+KCUB>itGmEC+dF!rbYAX*P~EUA)%r2ST7hR z@te*JI7pz~i`aCq6FLS=FLryZK`gGjce!~o0gAe(PyZa8gEn z6-qp?hF7n`@_+NbiCj)gshuT(6G&iV`=7cx1%moBGc&F3PWJZi$F0wIf8k1uL(YOf z*V9xRc>mtb!^4*^Sur>)yVSnVyfJJS#hOvL?czjdyo|f3$Ioxvo!5Y=9goi*vKk7; zQ#8k4(L0WLAN5(0%+^T}vMObgyv*+NZ`Cc-{YIjKQu*c4_2ccy;c&2oXQ8Go{;jPR zY+vwTPESu4M)xPtR_*F9qx*ufaH0*0t+ux(9rbr#7$*j<^tGwu;O%(dNURZT3waLj zDJn!1bgZnAKqujwHsh@c#H0W-0rl{!{kO(^=FFLRIbYeEH*XFv&2e%7JjSL5icwlk zz)+FvbLRnDkg?l5e@zhF)L1JCWwg=YeSi|!Ku zKmXC=$D?1qY(gFI6#7?GxZf8H@{(mEln+;TWxRDS)?x#YryeZfy(6m;Bni6pp0oxi z+Gz3fFK%dS^tOF*j(+f<{bKyiERWrR(#wF*?(T*BzBKx_CI-PCK7QDGtsiAU-T_ih zB{N4>P0--?@1SDmAjDxg43IIK=sIaC`o`vsU^RMd-*uI6SnVlca(0C39jZ;0SRmZ_$$ zej(BfmH1;}!R#+f=b#_Zy$Pc7{hjHuNe|!jACOb!6%``GR#*mTBMs-@vU@<+6Sw5T zV+1P+anA(VA9-_i@l)g0@nK$m5N&|N&Yh>)+uOS;PW_{t5blIav$O3>ZAz4m{LJ$~9v4OQ(%(GuhJ z+Uzh4{_S21ZAUbVf@u?go(=;=o6wH}VQD{*^$shvXfhvu?mjV4{MEkpAOkfW3(GCE z+N5KijA(vTTeisKioLwN8X~vfZmLgH4Bo-SL=6hFg6o-^o3oi6vUdaNstdV?4Zm%V zFa>~x@R(Dl=r4`b-0JE&b=}?lC2U(LQgpk`6hJd}Ny$CK!^5)<(O>4TlP;v4B3Lo5zm){Jw5tKvffPp*8UzsCIe;FM#b&v`B#Y9%nWGG$Q{ ze2k!FJZPqj(Fc9P_nC=8T11A|F zfEBWcSF~~~6u!Q`d1kdW8bWh_rA}yQT>0@?ecv$;27tw@rx)taTUiCItayER?!FVO zE#h8%Q&Ua?cTz|cJZj==0#6BF458x`wtREvx70I4PeNuj5x9g;AYO1m#cNi6Xp~n~ zJLUega|G%&8iwglq?QS@qdG0R;ivZ*FT-hLrj7G9zsk{@|Fp!`JZ9u*bEoSW0rH$Ao)uSQDi zAtrC>;IM7$)~!VA*VU!>ivQC3s)itw1BM6DTTM+vLoDF+#E7jcLz+t(BFc_1>C3LJ 
zn}X}3%Tv)!6oI5@0hfP%JN^9n{CYSLq^vj7?4yF46k7JX@FWHo(6I}*6<|L){9HiP zLSBK=9{u{YZ+w%fa#|kVvk?phxz#!hm*B}kEZ4&jgGC=Rm`@u+{0ZRx@7WR6ix;_& zjwQ^ggB1{-wvXm(ehMAdm^@h2P=X~Z4Ec|rKGo&zg+DRC9r63ra54>@79gmjug^L! z81nD3o168TnWvt>fg-g4>tms_}_6tCShj2tEbIxe);jGg{;4X({Kdgg)e+}{`v}b zfbEd(`)>_Fw4$0?NW=IQ*IiiMxt2q~J;VzIcX(R&7kk zGsXsrttDM1cKn(d3LhLa)JzcFw3A2kdVbD~D<*h8!h~&%WRWKCAYfenRD;7XSA-j;*@=Ecdmee&%pTZ@q z$LvF4#7@0`Mhh2gV|Zg7{8-5U0ILoA(sV=O|&xB26bjok-}gyvWzp)s+fHPfUpS@0rWW%19unsksd$ zH9t41h))dX-hA1`W$MmRK@#(VA(F}_7KnqyeL7vBHob}Gw1pA_J)bBOTm!$BFJ(&u zunsbuMrtBLz9HSce=Jcc00ROa%CBrdMZ>_}j9?PQ-W2licLk--HZp+Zu^1wv5BDS; zoy?r$o5@$oK1^cq@TdrM=T=r$#)r_i9l7(v?>Nhgy1=blDAvI`PGyvfO-NA3fXm{_ zpR1u|)%JZt*^UzTwXj|3^J@jUxVWOw*Sl#k3RMZM6Z|`a8(pm}K`eNZnIiG%Ri61j zUZJIkf|OvDm3aMeHmvbF3TKK54{z_DJst5P76*D+DRC6Ank-S0Si;TlBLNw`Mm3=BFF#H`g0Fr3Jn`JIK; zreTxS#LnbJ-}>?ESN4Vapp#&5%_t4fUTy34q_o6DrqB{z2I@&HimW--~>JlmPX zLr~KycyhDGImg*k(L1^j0jQgpB=hJz(1Zz*&V1xj$8+}u_-2Swrfi_E7}(fGY@2Ff z@Ilv1bfhVkK+jRr(C{DthYQG!<*N4hnJ!}Ss-r7Q!xjR^1@ik%YilMNxSL%8Qf}=O zSuHRq9vJ$Co~@6S&CF7sKY#uZ&leI+8IU9n7k3bDGSa9MizKD&LdCuA=!k-^XVaan zrEhF3&?#X^X1CBXG)H{?3L=dPM}Ve=_rirXzk3<)Q-=5`-jLU0&V~e}{p9NOE5& z>nupu5>K{~O@7DWhg~W0S@mENK2MSR=ur&fynN$|7}z_ZEnlu(+g&9CqUM5@y^K!! 
zy0de>v@Z;+85j`I0sqg?yR0Ng_dq7d)bhexIa};hau0}27JJW{b%N99!f{UiHCHcx zfZ-IfO!yDhP@lITv3WP-(o^g!7{|0hV zus{Tj$|CCO>qnN4-0dnXsmjR(I(m3`XiD{iFrU%;d1}|opLeb}$lVEo8cAhp zgQyE^8X6?%o5CA!Vx$H63fL+CJQsViI$SEBztcwu+l}SXs{1|SJM!pY4J>KjwLiRw zayDQ(4p6J4s%nAX8Ps^WS7rV}UGUfZ8D|DrH%H!@9B1GkLR{{t4;4Fs;1*~6ePQb0 zWB$WWW{JF^uUgN&{?Madk9esRc1q200KYpYD$t0saU0w7jG)J0$?mH3gN&2qIU~#WZ z50;*(R!1)eW5A5V=6fxHc-G55zX!vjc=`5ifnr@Xc093?6@zK^!xvj!{#g$yLo{J1 z*D*K?FR*BPoHWtRu(TbuzP@093I`ZIjn)VMN2ZT?Y?PLk#xw~ta$b_(V$2N2(wh|@ ziujQylV$k>64GJI%O1oq1r;OD5E<2u9XoCy+d>3U(oLV}rAESp`OhZR9;Lm?36L~2 z7#Ser2cZ6V+v|_7DQ(qIRZ?Q-;8-6W9gTUtMz}9wI&!NFh#3Fz7B5tv0A}wr+`DU6 zW8UnM^DiifxPUh{J0%EKUI)c?<@byUNk6~|f-&VJ#@mMbug`H(bKgFl|HkU=gEj*- z&@$7%187q8_uI0;_CkN{N4|{MQJX>yMi(xe%h1=JSZ9z9#s4rfMgJ`EM2yybiO_Q; zpfF3rjB6MYS{@B%Q!_HUkJhQCtQ-g}$vty&N)u^d@-fdi$$d1pSdT52eszNC(Rv`0 zfc|9zCclWtc-$9hmpB}%qLPwKgifUzW);Rljsr!3@CI$6yjZ@u3knHkyMBKc4WXV8 zA0Jg!C0%GBKutxJ$|8MD7fK@d=aO?i$KW*27`p}Gq`#Zmp-VWuQ!Zd zw)yhr?w#mpc}Vq4bi*hlM&7>$tX;-lZ4dQK0Gv;LsJKe@GSiVu`;JOU#ehMcGcxK1 z!L@+};xg4Qazb0%$NF~XeKie@401s^IlTB!zVi3$fLel1K*8p&t{8HE*n}mXzb|1P zlo2^ZBvec@@Q>`Cu*k?6-P^{Paho0}4u<`zg=WLR#I*eGj2qw#GjW+ezxQ=RZ>yg^ zeHTmm-^Z`5xJ$YJ;PU=E{p|DjxC)-}=h;~uxqmBSL(kn~A;Z1>EDiVSDt?8u zDEgf{ZQ;p^L%5vM*LUdMYH4k)3X3}f8@5`nWc}@?3{-9x=-lhx-n;PjZPCz6eE<36 zQf6)eYbSlfu;}6z1od>`jp-1_UC3ehjvoC}-ENON#iLtxnJ4l zY^T1NnGQ;?byIRw)ZDAtmAQIaamzOhp}#G`pfu%>x*VEcHp@y)y+Ld|iKF{yj(IPa z488X@`!zeF1OH637TPQnHm{xuAoKILZzoU#1lLD={~han_pU33Tca6CfWIWG6B>p8 zW?EKF0J;mB5ThI?y6l^~2O0M4QQN$E^CI}?bBUgD@ZXDeb|*-h3Z7i5QHwEd8$cth zCDjWTE=+fA?)0S{-MU*y7aczL;lrrX(!*kx2P<9eg9DVSx5J`Ot+V4sr< zgGJ}^@x?l+%O5G)Qshz)gQ_D8BE>=|kg|%(DMdwprDc8w>Rdc1A=Bq-khjv_^NqVN zIyvb;et#Yvl}8uJK;)roXqc#^=reK`9|~ed?~Pt~(cD}CN@%g|$VdnFL)D#*?{Cmd zVsPoI;E=osRr`XT<0@i>jLYZRN1Fg7B_MoEBIX-`58#=!t1Am{7a`b%uE@IX!Y58Z zM~Gh?*Tl_UJb(T;8eckaQt#x+PS4)9uM2!aLXiY7Q1hJN49+etneZSxA# zRE1?D))n22d0PMXi^Of{>gmbEpiJnmEX-Jds3feRVQQ4DWrG5sp0{_$VeSp&v((5l zhDSyw2_zl}Jgm&*quDg^P3R{PT 
zhX)hp7_{qvL^6|>3vIJP&b)IkGjl8&lUAa5CG!|>T97KC^O^_%2fT@Mq)#2nXg+%-2o3a|ML6LZA-H%qeQm772wsorkR zqb`8V zWqRJeff~L*c$w3PHs+@}@cu>5p5-7V??93T zakE_g?d$w?G%;^--Gt^t!ZtTH1~Lj6t6^Xu2c@@VI`mE)uo9guKKz$s-T+bBfk8nh zp36NxlW)E}Ha$HgL>mEL;M+iPPCt9PE!Df=I@EHbf+LjohwRzv>+?6^F}&{Yk0%Wa zjVzz%?h@(@X#UgR^SCM8Z;V>nergC6^k_lYVgs&(z)cCkOsd8`q3VSSMrLLWB$>q4 zzQ!z!snu?b=e~aP=39PUm#Ojr1_+3I%_Acz*c#-SV<;vc3>6^A-9HwoSfYdEi;I&} zF)=ZbfX{E5Vk})2rif$U&5~E5QYYb~6Dl5^k->>Y(?bAE>Jp`Iigp}l;~ZDUj^*Ux z2_lPwIi`G?=55=zzeXFzr@nsw{=UzjFS|XK=F%DjI?>c-Mrw87ps_={z~0V)>Z(2a zBl4N8w1`M_ws!Ifh*%0t+wK)Ksl2nc{QZ#O)AQ>PiNpe#wlhmakg~zSirJ~;|1fL< zcStw`7$*l>PS?A4*`&=m)vl_5cFp$q?B2Dj6B;c(Q<$t82#c3~+J|W^M0=}{BRvIX zwPzG9&j~ORS|nsrxoBGJW@W`g_|N9;OiJkTSX+u^lzZj{uL2DH@<~Xt1B!__Gv6$8`_Z&L@gUa^-;iHgFR1r;K90go9- za-RQMz;ZGQ7#t_8)6sj93M7QkIskW<`2kbd!(xZ5;5k5W9*Vx^F2m~B|I`m+$_D#* z+KL6P3fJVfZ(W$#7_qythN+~%8gFTw7>EHgL#EmZuzqhzqoWFQTM^Tn9GC(l*#4gY z1GPcXh(lQX4ih|d;6N(ad?zYJT}$g0Sd0o#qXc>=?A9&EXMa>3_a8od7kN5V#+m$f z9Z*d!_<9h#SxEf4LF}a)mN39{oL4(36+IPgF&%bc8Q;`jM;ix+OdQDhPeZr1Mu5r! 
z^=@CA`JgVhGPhypwS2%1X=45T{WQBvV$$JK!>!*xzMom#W(T%o7rLg@v`pUP$Cz>y zWf49 z6))^pA-r42cX$}DeXDv%-sgb)TN;oFQ zQ#m-^aK=0pu?SE#QgF!bQ?{T2h{^lo=TGmkixcf!Qc{U<+X-Z}Z4OTVObw%kG~CeO zxG)&H9Oyb=*GWo>lZhC~IF82BQ|tThNYEWw$RP~+6Iy^F4khS}5-@ApeGwiYXLK zl_Qo0j~&y4kT042|A;0$dNs&naYEjolGZ23+t&yO4Aqefknrl7(MZ_a zuiX&r>Z+wG4Ac&dbRSUjKe}gCY;l(k+l*+OPV)`BAI!^4^_{5PEJmi zT&UmtZKPI?NK_h{Y#d97W;|r7iP$eQJ6k2iXPJ+HI|inWWNOWD9K4YC&zkQ z6$_b;JX$DLnU9-05WdSqw~m~ngd6F02v?gsJEP$fen?xTrw%N!Xrdu~78O%!Bn`F9 z1yYGSv}rq#=5u4ZE=u^jWbLxdD4F0_@O&{eirE?mqal5rwg|8rPDHHU|Xi8);f ze<8R%1tWQxkTKn`azh)g&6QUd8_M!7Amw7H(|6 z?Ax=&JlYl(obDbTqOL$~*Kb|{zc{c*6R5V|z{RPcV`SnM)6&xDD_-EVpe-zF;M|gS zGAH{EstMnKX z+2LgHRvd}&!%*K}c^#o$fgd+fQbJdq3EzL+F_-h`(FEA9XMZd%a$!e#|NTW=a1ger z_}x#@Z_6$XodBjV?AURj8_ar`pA)&y2$2nkgbM{vv@aa}6Pf># z$zf>tk>LQ**9{E;gp+}@eE#%Ffs_HxS`u@(Sf~#!G?(ag%wH3e8l9A^(_oQhi|VnE z0z3&Evg^k0$Q~g5e6YmU1-1jt&@8AuVGl$e0DW)m6$^sIsB~#|JIe`6<4l36AYb z*njjpG;|9Fn>t~YPQnUd`+022F}mCa1l61|=Jc5}K8Nm&;4HHOsCCr&*|mG%Bcj)Y z#T5T<7T}2gjT<-Io>;?z(maOt#ik_n#qA4H0rMk>r%s0+sxyS?gKShBh^1ZVg7fD| z7#uqe$&rD%=J8dW&EavdgAq3^ku%{4C6M8_ffC21Qg(5?H-#dX35-m=Z2%2z6~I5}SD zF`|5tY$@9s(TJLRGBU@%VTc^%FMLIEA9|AAI2~FD!bW9;!|d+vf2{mjFk?kFFUFy))HPNYrwo?!HVRZcEo=~FSp|oa`wbxGLEZZb{ z0K>k0t$P${SY?)comuZRD3fmc##j?GnZ5;;aAOJwrmls%WBAE(nQ7m?l`|uP0s;WP zMlWCZPqe^a-{v@mvseT^J4z31LSl z^5ggK9E4dklTCkju(E1{I>C^8@@(xAJQuW~NPsY*&d{$$3@WH*aU6jH=1%>Xyol45 zt6*V07@=^!awQELvUJk8?e*(4#Adpf^C4L(8XSouXIrsIO}H_mlx!ZnI1t-`Yq7xz8H5F-TfwzzYAxP} zCzUMYUX*ZT9uP^|KfDAS3sHsP#nLr`&3OtB@pW5Ue*KKS$tKWRv;yMUMByR`GGbRw zpj=t#+l8Q6P}Snz%Pw6V9pp$8H7)I|v$|HLsIkyJfV2(9fdq#wz&(il8q8OdKinGWArqHjF+o?38s(4u)4^3%1z$8?;Kc-&Jbx=7FE( znB!lCfWX`pm&B1zAlTDgGpKR(*?+}u!zZ=4o!I`xsNEZ`(&r27cQt7&GJhE~R0@=z zvg{jvSJdAUX6|-&%KYW9tD}K@UihxbE_JHc-otVRMeJL+Mok5rWWQdfQJ+bkZT*no zKAs@m_$KTwABAP%r6H-Jva)QX*7PUSHi+2u93VCruG&X0!x(al7o)N8w{sQxzK)Nd zf~}5po*3vTi97dALzNV;&q-Wt)?MMy5O^?3I6Zo2AQS;4X_AhEC29B^ zpWTiO^}@uGOT+*OJWb5w@7}a|Gdd6(gb%qom^+$jN7BGaNXP=yJqt(Rot+y3^Ok~x zgJX+}#V{-J-Qo 
zuh;|>e$saz9nus5IL7A5S3B7Np{Oh=oMy3SX2kf#Eu5o&jhG+t&k5KDD?)=}LhgTl za0?ndz47R5fj05!?B59lUfXp#2xI7-Fi9UezAqzf3Vu)y7^H8X{g!jCiP{fj`0^!z zdU*kRu?$9%dHM;g{l_b&fw;vja-!Za0Z zgB9lQ{C4r=UEJ|01Wb|%BLeiHHn`c(V`GX4?3MJ(U7y1OhUi$jpIr5Kw4RbYvs}HL zH?B7z5s7+7phUb$IBuyWSC?Rc#bBpBcGBg41DGcU*oUE^NemF(u{YRCO+5j#Til^v z7!}2XEHzQqTWYBMiocje64XD2etDUh?^Cn5M%}-!fo7BeY;87qI*nuqSN-?WyG_qM zeufbT5kd`v=&A^#KTl4ogCB_FY$aAX;!Ra=s1gVA8cx{Cb|W>%kT(?W``ZPB#IV}q zFsUr;oPTDs>~ZkQ5DB3}@DgTZthChi$~qW2Di05X&DwJfY~C_o{KxzfMFVM`GrMA z`Q0uaJ$$&d!SEr#(V4!WU|qDD60Q+j;M}8F-e|Rl$b7rJ;CF82y@K)$qG35bnv+Q~w^%PpD+Vwo7gD1l$!!mErjd1(j6O)5O zf1tTM^z#i56aS3(0Whi!TrZ;w!4`S|#?E)(z{69+>$7rlx=~oKp(zlj{`u#8>-ZH! z&BUzfDzzrxUoiclBKGQ6XrL7IS~YAaVkR*ib^d(B_m{Ed0#0c$VGW^`S!cowPQdjT;&H>gv^l!8E}5}yKWVVfle9NbZ`-EdKk$8|wsvt44bJyj zxW~ee9}l70NuqO4$Rx&H3oRgYGqgZ#QUgR>3bI&CnXslLNxS zQ8?XtKwMludkrDctEVE*Nzg8+>h%JoOM1isUkXl@#jzC^&^Ce;L{CD?MO6kFstwL79V*M=ehy%uN{iRB;3tg-cbiCkQ}}A(oiDeXd8?lz@yr5 zrf%&49o0r+)R5HzlAgk<+aSElV>4KtA5$eVV-+V|$&<#>7<{MJpElOE`owM*eC32~ z))F4~0CikQh)TpBc}x-K1)WR!)J+Y@5Om|>5*00KxK?deIo7xf^p&_o(9K^i-5@7` zZ6APxUB0H^Wp2)aIfgFiBz}f=kQ$PS5K9y;l{m&-$gQEX6)|)dYz6bg+BS^MW3(Y0 zGW?Z#uz^6`3&|u~yu?i5{sJP&o`*PEw#z9DQR0BPmWm!2 zgTM?sACr80v9-GkYS&I61pmwNRuGvPF;mEZ8P3IIz5Q`25VOpu;6*eK4J8w<+k3a~ z5cyUYJX-C_7v8YDl4`fasByW>d^q|Ir~VO~pG1PBFU33k@8Gt2@;z=gFj~MG^&V{( z16U&GaG`qypTG>2H&z?-iG$LEmm|YugoePXOJSbS1=P++-dE{%1GH zOznMFwv?UV+^{JKYr(_RFEBlZL%&}%bvU9sNGyl#QS9hlTvU{4zXL+zM4lYxw{t>5 zs3yz0|JOt;@HufDu_}oxaj>5>^pf*0*MoD#R=lVL#=BsoD68%*GF7XLK1QIrm>;H>u_>6zVR&an5P6dQWsn=e#_NPDD_;pv_06lv*RlQ zW!KkMjg@ z^E7-J$0NNQ8nQ#6^lL+{8xp(qy-xlBS0MY2`k*mB=BQ&PNF2|te>c``II${=gDR7= zpoc3$IXWSY=EN_8*ODc*85<*Z{1#$|9D5}=FIwO{r#!-5!E$Y`tGkJ2a{^*A4IRn* z+50un&Lm8^5&1}LaC~<`gYY)8<``H%nqpUk8D)rBYgc~;bb5%M&mTV?2SRk^TZfm? 
z3rVSdhDV!;c6|Dckg-Dy$OcT3Sk`T(D{o;l;VhKig|D;d9b|w77Ef0)$3bSi5pJ@l z;;MC5UmvfhrzhzmB>aMWs$o)9XN31nhBnFCVdmk=uQSzD(P~`Y-9WX<*=zjej~J5G zB@j;#{%zHksBXREa;`g0JK4Q269fVgv!wQZaXbG8d=i>zPpSRcNy8uMwfv_ak%5@+ zsvOUKRug$NlWte!3jFD$4<9}d z=mU*x4m)GHu>8{=z27BNa3ws&lb)5e-OdDkjL-y}11EG$dqjruE1cF6pf;vW$lW3s z$RJh`h8b%s=>C#^N@mND{iFZAtQf|tC&}YSPwxhslcX?;H+5m%Bv!KZn}tckEFWi%(j!KRcAbzp z1a6m$fLV?4@+wGi*2wS4?%-}oFFpuUHQ1Y9STgY4N++hix#E~=oW&9wo170ZU;^7u zS71HyTnMElbp=ndrX|SvfBhrOW0By$qEDTRo`fve)sYL&vT%fl7o_~|+vBhC^vLWJ z3GmMq1;DfUPl==3O+NpQd+?5ZyMQyRWMJVG%Vr94kMN}S<7l|(k#1^cB7-x~G|rbVr{djxcN;mNi6laQ0B5w5q^IaZ{~I(UjR|RafmPU- zzGebcuzvwmu0(kv3og1goG5_%wpm;G10JDFq$Q44c*Qz)-TMT+EsigdN%;N%8U6&! zmAvsB%<}>EqMtoG0w@Z{VYOA0_I;n$qlU>yI841*d>!d67U{U+mp5VTbYq~FxZg-~ zG=gbZmiZs-JgD6bt1c_h3mLV;#&cmXIrWf99NkhCg4Iy(9>pLLE*{^Opl`kq{KjCL z@jwLT0hhdd0&}JuK9rwKJPVg9>A?a0o7)`hc3 z$KlYHTznnlbnQednjP4;+W5c;>{RX_Wa1sRpc0Kl5F_1wA6tgINH1Xj!)_{ zh-CXkD=QTUl20z*$`PGny-xzXo12<~!EI_Po;@^+!CY3Nh(%DVU<;<{iGyann(v8I zHl+>&F&Ne2c zS_GTx+D@CJNMEDtcTaL{tJg)vWZ70Yf1%#Aaif}?&V7knV$+8jaStI)V}+vWf{fw0 zpa*x$&b^u&t+#w4$H_rP&r!E0(FAfaU47SC*W41Wgbt#){cwYDtUIo&-3wj)u;bvJ zSCK3&?yA=?tc~bDqr>9ue6U zEhDqYNJ1I+`Mkd0>;B!p|L)_sj_de*s*m39*X#LwjPr4x=L69NVoe2np zX2?NOe6D;NL6h|EXs1jObA!ZW%LAOW>j5Q40Ldr~Ux zvGW)kbgun5G*o9r^Bmo%$(Gvg==vhU)aAZy+dW~p~yTTecl+e^}{CLL&@`f(>`Kj!Kq|e6jBj^ll z%}yfU#8|o8;*z#w6pmp*h33ES;3xw#H7~Dy-Yc4DisqpmLrF3RBcBZ6Yv8gSSoNI% zGSbbO9}z?Z=`6dD$BTbL6rWW^Hw-L>x2I!<4uScrTjqQ)l?^R!=MCz5CnuK1|z{eV0GXDPxW5(rU4 zdxQaG2k?`N0VW@J%D6>AVW4R9ooJ{tj4EY-3}SG4jKnoujrkVk2;d%@cZwg!iA%m) z4rZs2eu-*w^&RA7@&chKv6B?PK>{W6#2PB<##m|;Bzj~7iQu&5e`g+tsTQ*!!17DWxHhSpHP=2EC_Y@&e}j=mx)A_8bBA45m`hScHYo_KjtByqEy#4bb# zAaGdWW}^Eu%#a3GG#R*9fsBw+xs_Z!=szj#?W%Z{EclXY$fg#@K0)L9fC!2}gCe(= zOb=6!J+a=xIADasN}RULbSm5=?(LLVONkP*<-j~7X}TP}gb2F#g}*qVgFp+QqSxlX zheE-5?ngvTw@QpSBTAG99>oO(wRoKg_?`A;TrY22bM@=bmB4_(z?+A-FQErgAB)4x zz8`G^t-(LI6%d&n1+vXhaBKE4%fjauFC+j}r~*dPGX)K{XD!I|%PE1lf1P2ueVh6ZzoZBNuw>>ek`&$-!gj 
zDO$xJAetP8ur-pMAR5Fq0?whP2&#}zYV&&pfwAU;deRgnnGQ$+L`h5`rW3$80zmJs zYHFgz7^w=+5+0jX=cFpkAaw8nwz~W1Q5HO-lNiQJbU~R{jS4guye8OjB$7%4w~UvM z&ji2SQ)smnngS72D}a-jK}flgg_uXQ?Vzizs})&*NUR+pM7gly0;YsBAQh?iDe^Mu zmpX1lJQjQ(Ux4?8J2@~lmGkiDzMP!&p z={f?VG5_c!1PHQ-Thfe8(4R+VrZP%=FKaW*yucd;5GLL9*7v=7(E zx1B-Lw(X_}fCrqlU=$Gho}-`)fy+%&lx?E~vIST}w`-Ii@o!;7mLo!62)p;WfnD?( z@0A#m0%-G{gm9o5(LysMLjkJEB>Yj@Bg%4jARR{+4HF*~6aQcnR$^g}0!8HLQFES8 z_ULAuXTGh6w(KF>zAYQpm3Xeo@G@;gA7)}@b#YP}?Y1+}avUPG;TL=z@ga%D62z4J z@7J&d%Hgmxg9)*1vMMz>nG5d;1E;XFyt#RKw|#t0z5MB=x~~p(s2VWERvJFyLJ|}6 zHvRReW~LTBsyBi8(nNa!$i&CTGpGC4s)~#J0R{wV5&t_G$SQ~h7#{i^*aM&C^@c+* zf^w74USx^03JTt&jk&JrTUaOx@U9y8@a$$w#A*a`NZ2z}|2>S3=D%^sKwtmi4?DiS zdwB!|6iu>#!;-SU=po+cnzgr&fB&b+B_qROn#T&(ZOj$s8&$h@?rbW%XTpjfIsgn~ zOK?3b$h4Y@*ZTUfUC7Y99UE)#xSNrYm^%UwLBETSTZ13FALA-<*zMIo`IG)na{yt7 z1c=5J%{!3MA=KMt%t&Bfm8PA;;Vc#RFQLWT&zuGwCuR^}8@Aa6Sr}wcE&}rA2mB!k#B>sfQ+4rCYlfR6iTM=J4S$U+Tw@;S&!G`J zj%Xjd%tE1KKS~Giv;p%jg&kDI-vNISS2Pq2 z*JXBVrz-D)vHe6}B`5HVALo-DSVB6C2vVgn*t_1PLM+r{U)af4(T8M`V#aH9s_`CGl#~%P# zKU?Jlp|R`x60kp_^#kCMhw#?w=vA|Sn&kg&u7`lJ5Tjo-N7WdoO1vi{bkKH!5xY&m zh>Nk#GAw69V|vIU?M=+#Jn*2D0G$+Uz$M#+A3;rlUf)4-*VBc}cGAQBAW8^sxcVM9 z@g2=)N=Z$n+UICmV)#m9iFpbwl*El_+@oh9kKrDbY84zBN~0PjlRcYjX=r06m6OV4m+4&_6M8m;Cq?jW1SUqm4*0fZ=!@$ZhNARPHSBD67HO zaS(!k@`iQZ3j1LwP>Pi72j<(TGDHveL=E(hokze!`Rw(dU)&nnKvgROd+Fy7=eWQ& zj3|En9y1z;tVE9xbT^|U(jFh4KLUb>H^6B76uxb=*MYH8yf9cywGxQ-)7|01x_vv% zQ~SS1B0hHSBfp?)<}2*Ul=^=i`!uy<=l?&qAWcn!#X9xee{RHouSV1HpQ7piy(Z)r z&+kZ!-ShvwJ8Ao=d}H^VyT;EXyRquw%9{Ow@w=CnXwDr_e{h&folE^enabHWTbkY} zC@RQjm%O82c=+J4j=paCgqzm-jrJN$GwGVX{5R!&JNbm&cJv)KeR$vjvts&NDZx<*t##+Fu{}_C z7>Z=Pzd?|Nk-{`IJoIQoc=;jj6W0_D9U4B8QA77^)ZQh)N#ZZ_$-ABMr|!J13q5O6 zz!b?$wZkm@$k4}g*|ioO=KI%eGE=?gzW=aLep4vzzDt!oDT&3}j2l1vjL_%$SX}OX zwDRitHTJVX!<=W9Po9pt{kyCwL-qLMbw+IUZV}w3r1oCmUYy0)*m-JwC(j?6Xs;gL%k%m$@Nc8O|J5C z?jPk8(>xQIfsseaI_0ZU2Ud>IQ2FP)Zc*{oeeCqYxJfuc+?{W&;+cD=dy;li=sK@m zYp~6nAE>D>U%1h4_x#_nqLcLA)9>V*kGVX(bwj&H?^sepad+5m2DR}9)5bEJfS%b+ 
zx?-#W?o+gyTN~X2=6L)~d++h|9L|zvqg&ZJ#Y^E8&YMWJwzJDLEJ-%KaYjcp+K1JQ zef!I{%ngjDVoP`ZE`{>sycHEV*UKJbbeOH>b8Ee{+Wn4)k+yW5IfC_V)xo)|LkE=T zChnNM$bWeD67>P$onHNYf4&X1rf--iWPM@3@wg24nfJcoXKrXZ75NuAXkYcr`R2eM zSJ)kyYrddjDYp8-(}yxVQt(UUO=OnOLpjIxa|I?=bi~FWt#j0yN-na%Fq55n3CvI6s;`t`(=Pv^% zzEuXtU2=c1ht-K+&u{~Cl9YP{WB=UZ9BXfx5R2T#XMH2Ro9Y|69P)U+sVH6fy5UsK z1S4aoM0L>53+pydMC{~}Ag9Y{ zb7H1qd%}fdg;%zFjhxrJ^liZTtoYb=mgJvy@YO)tkxQG_ww=E%P`NX!gh$6gHDIQ} z_36RNgZY2wUyn3STK2SU8Fkuu`{oOuvCrx>oa>wp^8eaM{Vum^rHE$t4yO&$9_=TW zCrrjF7I{m>X#GMIwleH^R@rhqyl6fz+2Q?aMXQAO1-%_~M@h zl0=HbIeLc-Pa7{tD_wne^5Dz2;W|YIRHrUABqV*H8H(9hSMe{rJ?$Jzula}9r44%R z=lqikWb>!ExDRcP@2!-}moSoSQZ@nvohgFH0Bsy;-E-bbVWrNoOBd!3?d`=5g9s z(>_JhkNiALWwfz@LVknSZA~eK1(_ztvyAeJYA-6;yCxG3b`G~Lxctos zcOMn!?n@b1p1rJdmi?Z7j7|}Aw2$y5H^p6L&6VDgG&z?HY%RtvNF9uqKd$`o@}@V6 zA8+OluM^MZ+9SL~t^E9>V11hNj)$L9Hs9JOCY0YCn!oQ#Wlz(oqgQiGb8T~m?(-qY zJi0Plr1`-#Y~~>Q4G-x%>$8H2=pKka`scg)z=QU%r~5r+?ahhtYIGx_yL^je4eH)T zXQV!rQkd`Vd3yKHm*0Z^R?q8h8Yst_CviM<-PPx??Vi3t5X&bv%0{kjGsc!_R2+R& zTQ?ik*b2>2wyZnJSV%{opkc@9DckN>bi<@Jaq9b;=CRKu@kT2f4)rB7eE1o6$4{4y zGJHz2>dgC$PyHuoT3e}0+>K>TADG%LvG`QolxhE_UF%hJOeEp{0m?}ak2QrETE4|i z%%|>(P4VS+hSnyX5e|!4o$9{elPh2?t!sH@_uz-ka+lXVRoV2Oqu@kaOUbLFluwOy zExAdJ6P@fWH$*8vQ*=sPc1jhAQuZ)2(u8+pM6x%{99HC-cUg1%5B>8m^mJpxYJ&O_ zIIJ0#?G~bcY2MwrI8zq+#P(L&q)9-q-ccTY8QNC|4vlqXW`7Gwtzeh>onxXnwyUmh zc~hk5s|4*`yN)ky<}}_|t{plvCeNg)&@{^WPCowNa>u97`Eq9a5655dX2_ZG9B6v* zp`H5pS}wT@bS7!Fm-ZX{s_A|Rp1wWH(a@ROER;RNyqnEL)*DUgbTh2 z#me0JSr&LiXwK{ItDak;8&A0FdWXMzBk<&EwNs?c-j7V-+bg(Sk18_u?DMf7SKRSO z`{tq=i`mcZdbuX9%s%d}d<-uC37UIM-IN|yDa})1&g@f|*j@L0e$%Em6?Uwt8GjS9 z^Qcwp=_|4W#jh`|cd}Ku{HMF;xQA=24VyroL8M0N$26K@qvlLncD+l}l%X$6TP+;@ zOiwdhp-8dz^wb2<>>k~GGnV>w%h;M={-e1soI_u|2&XmjXQBkMn#!Aho&IfXVx_Zi zdT3yye%Iu^1e>>OEE8Iz_*m0jw@`gjuABW(X`E|xfg&}gr&Hravs;Xfv3rBNf01uG z(~fD0Xl1R&m(HyICs+MbDq?5OIGQwgwYAmsv7bpA_iAa(9;Okf$lzIVn?7uNX>fXS zLvP$#S=n(%R5$N9O$;sv%myM5$p$O6TFX1=wnon36#S0QI(0UV^r%*?FN8t(no 
zPuv@h+Z?NYbY4JUULv2dSormGR;pO(Qyn*d@vM8xu~t@^KEQPIV2*2P?tzkfEy3o- zuYZIGWLoY>H)RB_mb}x{ok9`Z0wZ{Jr$R} z{`mr>%IAK<=5I!oBbSA)U0f~-;>nTT?hyZ4_-l57D#NyW*H;RpqjI(GF4B0OI#q5; ze`aOVL{u)d?rhD$SNGR=on9~-} z5*}q!Q>ZtJ-TU^ae#6y^v0g2~e&0O3>2*GL7uY>J`J4L8qe;J!o(R?pibe++_w4Xb z-TeAf+SM)2xjzcm{o5;cvV5R2iG#A6A;j-}ZI!(L%q$G_y$^b@mNRZY<5o2(Bh+ql zcVlAhzuJ=#zpi#}NfL{%4RHF`o_bFB?vYoa7mWZzs}3#|H=}+W8w78`)7{0L%*~KdpFyOmkoB8 zuL{*OHPj8hEq2J&bKFT|m{nK8Vi3mevh9-fyVuhZMVoTo*aVr#EaxqZwQZO(Lr~=3 z{lYMfgFEhLRM3s+E<0o8@S5%qx>r*a@z^{VjxTL{rJ{Kv)m@PN;>$L*mw9{l%rSYy z=TWK1t+R}JI{eU#bu9cgyT5If{nF{@9xt{AQ?D?1x4*aC^^cD8ieuHZp}X<^>`+;1 zx|)~_jvk431_>Qfd%laOY}N_f$3-#8@`6 z=gzumL8_U5@kwE|_B$RL81*OzC)%ZFKKp65M(bz9TNzENvPi4XSGdpJYtbIF*&6OQ zTiI5Zz1M{zYW&HRTZ`e>sgO7gH7d&P<5?f2r^izqHt2t3r1r92uE|fV5}_Dvuc{+zw`F$dDb18Yvgir zFTK6G)Masmr6~Wl$YFNf)6}`vN9YH#C?ZpN>0NB1JJ{%K*MB|S@now`$sf+bcQ};Lbe?}q|dFy`_I3!p- zHS(x&9scHGX0a0RdrL+QK^PA)XdgW zi}!5WKW$$ayeP%GG%h;49AY!t(s3jI#~^od{B{P;)pI9a#}03pq#A0FmNQqN_B(Hp zJN+VlfLkcXzEgi^+;xe?J;HhQzl272erQ>t*_E^8$9eupZfJbKr*Nllz0=IJ>myvP zH=0*6RJ{)~7m?pJEwkj6er(FPXR2l8iq{SO0rR&0cMWtRB7>KIgnYS`a4Wz$&w<;# z@`pElFqQHCw{pdg7!}rCOmKX&sq*UAvdmX@y%n4Gu;(3X%4m1cq{LaBqSG7O$i*nS za7(1P^LD=i>xVr@tFx`|hmrK~pulz4iJBdpUPm#R<=U9H#q5L}vtb zO-@H~u9Iu;AFq9zeR=Tt;3T!{0FT?-Fv{r*Is!aWnIlmAiK`rzBN ztlhujc3Cv$g+C9;d$e8V%F$q&X0QDlG9ziY>8Iz;n>4I?*$6(WwjbXV{6~Agc=v^> zKU!PR3_q9A?3H*MY4-VFQMH!D(aO9>Yj65mcys!-_=JY7v6@b>xVrdpy{?^o{+(%S z+I-fk<4ej#9trvjxemqtf0CcqOuO8fxXN`^YUjty26f(rr>dUkZ(RPgmD7M#Hv74# z$IF}LiE~evG9G(!D!$%%vfOO8qw@j7^$Byqp1-tALxZgkHi{cMF^&}FuTfc|V`sTD z5aReNX-kKvQvK%I9XewAd7C%N<@MxG)4A;lJ#p!z12R-+&}Y(uVidl_O8{{HF4sbJ#C}{oPGBg|9WaWK%qHvMqIL|T1kbIq3I3U}G92rp}Twkkhuw@kBxy=CXUTz;0?sipHCTo6uka(?nO_|x2%XTFE#dN%9N z|G3M7)oD+g+zfLD4m#6sp7XEy||%Xqj$%#gb_AX{`$8lk3+5s|(xK zSEQ>|ajrA$byV$0Up1$+*!CWFE#EaNd7`7KZfgaft2w_q@J0WpvCeyn)W8|38B9xTn+l*K*F6J}yyoe$_eWn;a}I-%rs| zK#MZ`V=YC7%XFOC*~um_Na(_WSbzKYr2G@Jzm2UgQ9<|CqqF1G_Tww5bX9fV`ozpX z**WHVSH!JRDw8sPvNLp^d+qKMTE(oBy_2fW$t)@iESopOL-3AI?2g}*4TJFm@xN>K 
z#;^Y1J+=oUNbq?t$u-@kK`FIZxFw7p`kzIU#{(x;l#27)nAtLs%=Hy!S{!eDVEJceogQ?4HQ@utoWqbZlG-OXG^6+7pK9 zrxw=q?D&eGbgFfp2UKl4j5Q-@fZrG;ez(xZTw)mJfet;mpr{{MTlc zzG&{JS$mtuLnvMh!Y`MP+>DU1ZIe*Blw3T1=!fX0xu;)##CzI^759~GHvAW@L+>pj zw0%X7`gXkM*3V8~s*Z%*o4!aDHof&%v`rnWOcn2!Do%kHJfEJ){+Xh^ljJRNAV|`W zhhgf}B{_?UwI?2O9baB=P%S2MqRo77uKV1Yp(BTmf2p^y6!)P066LSs{99(yL{xH; z`s~AvCgls{~9!6STe>lL2yOV(bhH=`re&BE*?=mb1v?=eg{FSwgXzkU3ky&|!IRY;)kZs)O_~0DcCf12@`V$<`8DRm3-2QX0&KkG1>A0LXFr%d9-ns5 zg4%8Js(h!{5uHzI-brtixDWVepFVqeXRmcmT}|nQyxd45iN)u%izTJYDqqUtg8#^j zuVLaAnm=AuF%i%E3H-dHWTq#i7tQPR5l=wz-c^>A^zS&tEbqmRXE=l&&M9<#sS z-DfPY^q=0_c-!u-;(t*yB4x{l5jGV{byHhjIfRDGdhL+Li=B zjH85Mx@vF5tu7gF0o^>sa;WjX*}BInd_LR4MbGke9x<)ku=Y4cx)=5>>qO}W#{>4hzWLbli%)v{qpkl>WN;9Q;O@G0*>zvd*nnz<9}$IJ(ev$Y07FECfFqL15Bb2hGcAt*XLnX!Q6u2K% zSTkNw-%bIWZ$Fo6@z01eu!3H*8-`@{*HqJuR2w1q+CTxJen0E6kHrQn^Q#?Tm@zFE z{rbE}<2bxJv~T{XqZ`f2gOgKdnu>fui|3=f!?rEH zN%as+ftV;@&u_T@IzRL$kqrH`e~=t?N)qZKOa>qw$hGILTOA2L$ys zPoV&L^f${*HeNe0?r~Yzpvc z8ms!0xmiKTWpupu!)FHC?Ed{*Jua|2cNgzprfOR&Bdcj=lzeDbmBYUCEw%k@2m@6^ zv}J#qb;*T+F+I-YYu$O?+6Ds5R2SAp{CZioTk+n(UqRc=SvvW|mJT_(JnCIIk=*or z_pdk}0h96C;?kUne2ekToGi(X&!yGoXKCxKcXaskMYauOc8w-wWNo)D=;-%QE6G)( z^=nIP$=KGEA#B7i6f+89Z``Ayq7in^Xa2H1>G_|S zW>JrUo3XyPR4+E~d;i~25E++_JdykVF%K&0diFna)&KoZ&7R(j{y)Q3{9+2-JLT~I zYZUeH_FJKbRLne~aT~x6q5()3mLnP!{-+Ztw)u{ z!Cs-#D=XEBkriPOURB^1m8m5V?(rlE*~zw@7&HQ4s+Rii4vfwIqth>aC`|1JM21%51H z)*nCFzf!S+pJl~=(fyEPQL+OtKXq9x*A@_9jzi0RJa1~F8;TR~BgibAAl4q4n! 
zmf*MCS_F0j-!cf(D_C*!O=%nfp>vf7`=7v4YAr-KT2CR1gg}~Uc)XbczvQ+1+^*@% z=70r4+4sD`!$XjR_J9wFi)V|#DG-0pFVH7C*4JB&*m)4vQp!Au|JPeHIt&s+13YdN z-1gpx#^|5Sz_C~;vy&Udz!30zh|B+gE7YnC;0}Weaz0`F@S#I?@LPd%Pa50#kK zCo$Eo)}EFt0wqJg%o&?HIKT=Iu91ZB&^OSYe)RNzxd4G1pJy@4vQ2^Jj_-Gp5JW2W zV$cuOEhK8{3voeDp~l9k*$o|$2BdFWcI~2v<`a=Xz)=Mztq{(wS5#Ci^IYXXs3GRU zFWYnlS$MGI7yKjIkZ%gS{~-~9GeVZ^K&r$X#HlK73Ze%+m?AJh-~S(e31S;rVF_ME zBA)mH3{9Jqo}A@92`fG5q#kB8WI#GQ1FsY?r?$d)XxLCij(EGm>4FpX<2)FD)74_% z%npd=as>wqKj2IeAAA(*SNEJa04+2=8$Ap{@%^wx8iIjSU>JeIg;Dq#uv)Na@r}DV zbTWkY@B;($#SNZt0)teqe{?i8>;p(L*Wu&zE=zAaD7KpX9$+BXzF6^yv-xMfUv-so!B_KE$^8L}@^ye61 zSAPeVRp7ujmmkGTgRc}NwBeMo@ootFA9g|lv;!Bz_|hefksE|xLzcLGEnSb-2Pw0D z`in;}X9D8@#$tZzGnpGeG=rJmSTKIfZV@f{!=LiG`j9?bQd<0J@0kdd@k6M9p z-uQQ@oE}Q>_?WO_qbb?IdlrkOaIs+{o*HH&bP{JkQj!gOZqjs%GHT}2z40>PV~#iQ z@CQSB5I#552&$3{-w(6;r5DOxAh!tQz`xT2J~i{2=Fytxv_T`b0;!!i=79gC{?|kP zCAM*ZE2|1`+|K!o2Ml;$?^4dQuid>o3++3W4k zhiG-_{9YK`fH08`iU<7Z>UB$k9zc>x%)gOSiSrm(Yp{V^(V-tam`I!gFyrQo)8HSlnrHC!x4v*9(}# z#N+`d148J$pnJhhNDDHnm@&9KbP@M4NX-b*6D|d663=tW zD_=8%43>h~&z>octekWd7rA$tKV( zLa>GW#%p6^^KHgiv%vhkF7)Qw-;cvD`~jP&<#ABHlRz$7EA0aQsL~Imp!|B+qTmR zu3!A1y0oJ_t8mohe+xSZOKAM$G7mo=6p8MuBBi@{n~#y0+oNyy@yHxKk?%M~a*S zvN$1NQfWXis35Kimh8W8Ji!qBLG@17Wh)qskxkr){nmNhZBRJ(p%^T%4U%@kEHdiv2$pW-d5Q67O}$DN>NOk+GhC@ZAh7Q&9u!-5s3))CCA4>g*;h|~C6^a5L9=Ph0NSD_4S zDlixM=Y`ZyoLEEdS%zl%NnFt1aCjHkCIiAf-2NdOA9lR8hg}M;dr+GMw2g*iEdmr} zA&9n>j)ehfBCB{^L z1PfH(t{1g^JYb6B7+lTI&3%drsRLF`I-DZ^m4V`Oz2iTyG2fObnh~8Q$jrgYkBEki)v2JuW9Y0DFlt8B85B2rKPolCHBp@ff$04nN zH>FOKfsjbdNLzRH<2Vy{DPs2sj^ti8B6vgq48vX{I5XQP&aB=BEm0jRoHP*sh^>XP zC~Oyq=_%1LLlhpPbpCG~VrZfJzng^5jz{HiRC*FlWGV0udH}vV1Xgm(F%Q9F-b>yF zk(fZd&Rl9Zo_uqEFRdxw<{)^#C#T?RTWD9;kQENoB*_~;kL@8ZC)CgK!)1N)=xAmAml z(+I*|<8bPPtG@~BrveKGVgrG1whwyxtz=;nS&e;TqUSz#$5xlVo>%I4r-y~H)}7Bb z5_gKNA<%z?VMP`A{~d6vt5_)PwcYr?V$E!WV?;s%iy2h~1=?{o$j1{&0tLAh=4D+V zL|1;4nR!tq_7hey4O<~EK$>y;p0*DGomkd3$NIg%W&bw2$b9oA6!9d$=cuyklf3}Y z=?Zu3fN{7;a%!sQ_cYwvL`bWN*$DqOWmJAemL1-x9U7aP75|!l!(U58s1V=$`|+VB 
z%^Qjn{4fm^Ip;Jq!f|K!!Cudyfg3u0q@7a`3RsjoL}C>SqO!1>Oi^LsS#9k|Fn*yF z3IQ{B=-SblEH0+NF__%HE40#p`6jPSyb20HTyOP+uUq4#T~CwDile6GoV3j)=Ksj; zwr$_u5cqnP242x*a~+tfL?ISqDTVN;TWCqV4PlUWQCYOY0#-%Jh-!SW8$cz?I=*WN z4q?Qw{C&EHKQX;S6&`cGd5MD*HxOx|8s~v`2+Y?ujZ=ve)rDz2ZH8gFq>HO-7Y>BD zUS&!OZ(m`ToE%Z-5dR9S@g|R)tl1&zD|=51R=@w*0-!$J5sNAnPny()7n0sctakUh zHi%uQSY6G@=8ZiUtz4?8_&qT>d7G~JMYE)U4T*GWYN~b#PXlb3ndP==Zed=xZCPaf z<))O~t`R5q+k8)PoqFCq#ZW0BYt6WcUR5D3v|;&d=bm|qmu+t(y2nei|CK@l#ky

a<{+GpU*lV!I1UBB1Fp9efV%M4udw3=hTI7!6Z#gaE)+n%a|%U{m%M`n9Zj;{2gGJ z+rz8sR{T8tU|-Mox`u)?Y=ZV3r@oVa=dF3FEUjkLQ!`LPq?}AJ#`dt6a8fJ3X9Qa% zE@Sc>#^iMA7@#ltBjNiQgoMV zlRnz6<)tP30Xg(hd*hrFFQg_z0X7QrL}=dxK=S*LW_^ZU0IT3_#+ioZXkg<3m5ku@*d1$C;B(QdiA&ek%>zPr-#82f0lb&JTp<7W`) z0j_z2`cSs;`%>0ze?!sxLu7H1hDMja`OkP~b#Ey6pvI7p%}>$VnGbfK=0kGJ<&fB&?qeiRBS zv)0FlQ{PMx>qI#Jei_>M4*D+KQ3kVZr;^oYUXI3HF)mr#f4EJ`p>6b@AWZ+!62KuL z#4}MV(>J1Xb^i(^8dx#I*)(jgd+{Q`q+rSgshEQANwW6H0e9)v6Bcy?gBBcd2hd4_lDx#rXLq z$HvmSX>Ab&^-=btJ=?eK0kk*+2#zg~O~@f=yTc^z5(UJsFEf!uIv+*ZoX^ z`;h8KMqX zlDWuf9v-e=U7qs1rPufZ>&t-5jX>SAKSJRn6g^Phgl47Ws$NNcAF$`a68;S>%`Fve zLE8gU75@(GJs<7Z5U;@>@q2pu>60g)ZIY}2tzL#MEi>ve2H(v%W}-ooa3MLJvvm@y z-p&Hv!pFE^>kdn#aA>WT`h6IphpaASz9#_|XUJ8H??qJN62Aulb zojyw#0mX9ro*Zd;8PATaYLk|dV%@s6 z7PTFFo%(-53|7jpva&Wp{)d>ZVEgz`>B$371v}c=_3PP|&mHgXaE(gK$r(kfiye)s z2t~^1OwfIFHW$6bCOXn3H-xaCd93uBTo@#)tlo2ihNjAot74{Ndvzdi825VD-WB8v9~upilC3^oaQ zI{4pP94-T4YF~xqZ-@t0mo9bnyWQHdDZ$*r!U5tWIyinqTW)q}2R`N^Lgs9!>`F=I zqT##``HNttjRTLi#3Njt^@_E134E-`gTOChgUPnwdPfI`LAWlLm^HItHRM5LEC?-A z^$(0bv8}KY*%N)jz+hM9>WXXWt5+nB+&Vvpy(oj^H6es<2227iX188nRxCh#2OK}B z7)^mP!#*iGZehW>eBz4YC`wr1s$1m3(cor1e=d&YVJldK;l%d{ zrCAef#HmL^<+!aFHzh#qMc&E;OzV){5xMdID4X75joL$~bzS?_(xRi;$mIhDxf89U z0s8>nJLZ~8zq?d?r>?o;)BxVo!)KUkkhrW zNW2dC2iMna^5tl>KJTzvD)z~eJuEqfHr?zX84j@+KsqahPFU4(_?Oni?YxkqS~td5$ow*xKOIp>ZIRJODU&%wMJtcvOet1%J8; zMN3{rMi9~#iL=KX$Bx6orL?4^9hOIMj~Ie;R}-XzpZ`SS$3WdIDro_`fX)f_q4K9# zKTX0gnhV93`iS%jkCjg2_T&BA0ivA0A)WU3?=2t}4*_;TnqSh}86Tm*hvHb<55DC{ zWj@sOrf_@58%@6ir|v^Vo($7!HNAk z8hW(&{pcyKL_O_YC(1_8uuHNIpAcEl9_D*r(2^GL$l@r#NOK5x?;Jd>{w^+FM_pOw z`2E4a{0t||QpqQELrg~)s71ubTIl+gxMqDr!&Wq%*8w-6@JoPs3rc||;0>%n9|{lY zCR=t|ijs&8FuAuownPsx@pZSM>v`K{v@d~Zzue~DYrn*iH~o1b5v!$-pe-~1Iumxs=Z2VM^iTtqU5?>b* zd1UR2nJ@QgYiZdxJ>2i;@~ds1mX<~%4?k1d zbH(nS^Q~LG-G~0PoXC9t3p!A8o3R5@ff&(PSVTX0au^PL==_c%irA@S_+@DFi5Z$50(zQDf4$HfyKz-SOR;4U@;r^ET}0J02pudtwC5E&X? 
zK=x4lvfY(iF`&1_RtXZIW!y^*9|@Bx%dX!`MBO3Z zjK>E=)wL56Oys#y%v#={cC0;URbQykA-58nVgbhrS=maOik0J#Vh`KD>>W)eySs>M zgP2(NlM}NdM4nj+^9Anvq9Q>+xFqm`Pj8oxP^brFc9@$x@uLGDCae+|Ccu$!7NeM! zALGhtSy{tyJt8#={9UngM?FE+4o(E~ml-?$$zy+cd3m`64Dt84=3)N!7G-ceaFaqf z=b})8tL8^oD8s?w0d6aO>!v)NFGCXILfAfaas~P5KaC*_UNFOzgt;azD&WcyK>N|| zU0Rv%VIxZ(NCp*D%B?SF*GV%|~`&t~hqu(9iGa=e zJSl}?(enhE5P0|?eAA-mj=jS>#%vw45cp23pd0vw^)JI1;?tjA^hAex0XBT8*gbj} zksTKt{}Vf!DOYlUsvp}2QUP@W9bEj?EbIU`javfJ*Pae{S<7!dM_zJ`Q~-_d=_C`p zrwjDc)KD|_0CEoq$S&w6~Y!b_A5U!A{k3CRVu3Ui@xS5gXbgogQ zM@!HKq*M>wRs?L^qFyQiaWpFn=%z^LK%zUw2P%kasA8iL1a4Vg%fRDvn{5&MJ~83e zJv?-yBMJpY?4ieZacn}k!;pHf9#pe%Pf$;Xl()gD*To-nmqMU`2qUN}(0*g}w6HyZ zxbrLt)tmDMZXE0;98)n@&KtAF#F`ZjZ=}3zF${i=dyJ{hFpT?mu&|s)6@#dN zrFw(NlHfJM8PP+svwq9Dw=+4*y-H_~uU+Kai z50?2TIs=O;7yruzh$JSzy>my7Cv{BYKOr9tpBzFF0fN+El5-%{p3=}=gH2szvobT| zbNczI9`}TH|6LSRi2aEuJOSeo<^rj08EhUNgkL5ax-~f4Fp_u!JoQNCQUx9nu%*Qh zdbS;KlECawztFOSC;XfHi+vFa$n25A6${*GB;Z)u1lQEEi}H%7cS?ZblTsCh5=yes zV^d|d=tSA+Wbj#Vr~L`KfWI`t-$mrJ8e{>P%{1(UveC>01S z67M=GagX(&H;yuf@58sLXmsCtVDSLS>Ik2sSYKh^?MF&%Angj;+TEucuwi8qD|CfP zd4T0mstKyOsP*o`X9~3^tY1iRG>eBle_5v8Ck5>h@Th^zoC*b?$hc$Fn9-tC!S%*? 
zbOa_|eqoe?38rehX`imJm5>iG$)9f8!d4)X*BkYPQrs4b24yP zZhr6dMb-QbeSFbaBP!$zly4G4IW!>_ zh@t7K`xf45s9Xo=>J$;3-C@bf$=QZ*Je$|Ru~n`UC43uzC7E~MC#ffQWdX$Vbmc|} zwz9Et?<;%D7UBmebP!O^@80=Jq1S%_9CQF6!}v_-!EgWkT*+QAnBxfaQ6M zX8gyG1+s@5W(sZC(Q`^;?II!rsj~NlzX1oRnPgH-0SNvE!5?@9Ij)|vnAnTc4M+L;e>@tGBj>!gTyLLf!YtE2_`!Yp6pbVD4WCT+XWb8^d29@f z+=>;61V!hLQwM8L}|wJ~fFwK+Md{PVb!Hd|?5*H&j5U<>kpk5o!|9)7O8C zDto0WI>!^V3 zfU*z>2RhmxH$F3LLIa2)7pyo1`fiO2;US=vJbEY>g*(>m^`uRd=w^*zwgMv78{lwK zg>gM-l}Y)C(~^V@NhGtQ!+n6SjE-vve{`il|AkC@@)o#3=Hr4o8HgvLogi@-k$`Pi zGuj3_MDz29*4>EfMKtrqokg|HhtTvfS#dj>=PsIMhGGH0cizQK z*HB=@Vv1`H$`bxN0`KF@zPh;IJu4P%`O4#$_=ON_0!MppL&Ak}~0U?eci?*Kd zq3*!BCaw2SxC))`&Xx5BLj*<`KwV8RC`=(yaz~)2FCTQhfPBKhA=OA4W5gyhkih9_ z9*uKt-Ti>>%aPN(foJX@**}G^AB=ilBkXk`5V_{Z;!h6-?v1MgC39eKFsa+)ObDAV ziikuoa##OWeEFhJe76b#SB8RUd$V zVEGdk9sOy{0fE68BL+1iqfq>PwJ^;>#yJBlgR|;IyDa#83*S@YbCCnEOnEm9UVvf| z8!)(S1*gW{qD5ir2e8ucg)1f;Z(}63!ig~ozQkxj*YfZr9s4l|&srV4MY8!xQ(IeX z!3dXdWxjG1?oNO1{d=!wU=V^IVhVV%r|8OdEJK2M(4maih=L{*zI`v z`mRS&5Ec~`fOlqsjZ&=G94W`$LACU#1siuZgH%V58N0u~t(Fj&$utFz$k^Q6@4*8m zOHnq*AMZB8qxz(?vpD<=2Y&u^?;bTk)Pu=c2%aZBDkDN-1V73Lg=-bIUf=uMXcM!H!n_zP;`m3Ze-Ow3&vGsE3_Jw_lW@XG@uBR4}UBW`Q8QL)&} z$Kuv@peIA|*mB^Qy8t0q;Rcp##+5!stMXN?l0SkpY^hjRHQMrI{;YE`aaKYPVh^Iv ziLwT7=scjEeG6uf0&pBft%y`8f~6D@D3{`1oc}4Sa)nG{zTi z@HFkQeS{#eNVMSaien!K_R+?pDY*5uLI&u=mogV(C`mHykIbCj8SYjf%KaeW{ikwh zX$e6)EDHZd{R63Fd7PkNPnMoo1y;#|qB}1A@{x=s7{o7Ohv|X&9!X&+T&RrPN(!0WB}yIOlI0 zVQLRVgaVo?D~=QgCVg;6tLp5`Kj)5BoW`(#)Pa>HEE(^i+}nlJe7-joJp+uBcLK*c zjWN5=>f`8WQ}FJhuyzJJxhI7L1_TD4z;y@5 z<@fQjzr6Ru!n{Zn02b+u(M+7@ii^}t-BzbESu7Uo0b#{sb=irSD59KTJ#-mcUTv=;7{u~m1MJiJ#$g|9;GscP{u*RZOt1FG zNZz1@#~dm-iWkBiwnuRZ3eu4PWoIXX8|RNriAy*OI+Y&MAhRBX=kZ+*X*TpQ=X;Ny zBUW_ag!H_0*bAT3P}~?()VHjXj%izV9hqL$P*?Xx1QK@NEtz!7VvGg)A&hyd+TxQX>-1 zf4d^3diERiEem+CGmtdZc62aezH=K~TM#@6$>=E(J$b*>3>%r#F zIG2$fZ{wt2y%Ea;2jlA3Qf9K-7(MU1vDMPTd;|vR=@}JxE+T4Q2hX1z4iIw5;Xs(h zOMunkUCe%n5hadMKu}OAPwj&8OAtt5oa&FhGoiGUjeO-j=cCtodamx7Kf#1OLG=if z0EsrEN8gMr(~xUYD}oindXf0G2W#ebtagDd 
zW-)(5h*qY&fkhA`uvd#po=Dh9WXwk}1NvcUINhu++ zG&rp%ib|pqWtSl)>v$}!qJ=23WT|7XgluU@i^rafFhglU8_G6esOR&WXWsXEuj_TW zTwTZMEdT%gyYKJ){oV`x9&rf?&d3U4UL|xF&ZuMlSOGMa)hZHuHdCQeiz8XOYLy!U zp|PPJ+S&@}ZGPc>7I!m4rHbO(+bzHZ#R&jk;HF~%3QtFQ%S@=V7r~tjN({;q9tz1) zMNQ2SUq~pY4Qi^QvT{*L$%`@TCnzYlze?`uS@_(j0zf_&$|*gkw%FAJTE%xljJ zSR7S`KNL+6lvs#~gKsS&-{&4IU(kwU3xTc@^Gc~t=B0z_z^w&%ivvvOz+g%T#SB=> zR5ci$BGHC%%j#4x6hZE}bKlF}Jbd&>Oje!7wiL*fB$5Z^4*MZPig*uOCIfT!b6u*8 z-Ax7za8K(hwykg+>4GS{eqEUzd^9|4q}pB47bMYxRnb`^$3c0E7o&k?E6OX=&ecfH zT{*XVekx(Jt|*9z^zu3G8)=cWdGn<@l^*WQovuM&j@&350y9G?l{eOn$I@Aqx%|n~ zr|Kr-`yVsP(_3N#k`b;Dx0Q=s(+_hHZzcLZWVN`yi;VTA)6sa>?T}fsBJnfS{}kW7 zt4WfeqM;GZtYMhYr7Q;uS>lgwB+=H6!ZO)wT@`$+W30@iUq2U#3_(}Q`bqY{+2rv) z%;S80{yXqhDBzLfMq!uWHUd(Ng)7K=M~2DPF5D>4-cJe<@(P;#dXxv>d;aECf>r8x zGjsFdS~9#f;}lg?Ri!MpaddL3X>NXMa;ihw$kS+Hx#G8VHS6A-dz{NGUX4Cu43Sr( zcI=qEW6a*v)EMOBUQkyhqyiEh9r>d)p{2sEY4SkbOz@c=PzT=AQeK~DF018I2zz6YwjuzO8W|133&OKR#1yCy zCT`bZ)^1?S9MoL&=+XIzZQ!{m^!jMDD+zrgbhkXt z`0X!7Rx*up6YZjE_oAsioJ~6zem8VZ)*?H7`H>@HR;ma^RC0o`Hj#$m5jn<={58{p zcshhc#UN)l@v7gxQ5GNg`Sa((d-vWZ>r5mcg-R%GUIi{^eg}B4fu)lI+2~-vm`l( zk=j*17r+A}dnBi%M8ma+SPP{ihNcqR^i|}*f2S?!5xQljgZ3xI*));nQh1ifX2UvJID1f{TML#^x=aQ zp{FV480ddS0IE(K(R4)q^zfk$Wn3c=zi1>76QmE(c2bM2m0;kx0C<*w+cyQ8BnNY6 zh2zD#zuuoZ>+){y<)I-DVSXjRUGgn*O(!$0LnltCA;eM`C8exFvE>AM9&4?Zv0nq` zzbgcm9yM+Nkj=b)`_cL-ndQb~#>|j}k(cG>%V{qjJ;M^ z+uQd7n7M&H1!jQ#0W;7VHg;p!??b$f`^rEyByUUk?4lEHYR+^iwBEk-^d5S8acERh zPHLyDgBZ=RGV&wh4s861+s zoFU>mmGh>^1Ij2o7B)7zw`O%?E^k=DJJ%hVKAu8=5*fsPxam9Tqc0=}|7MS9^l!=q zqp6TYW+@=Kv!3YC_^JyF|t<$>Q-755)n9#536Nn2?q`mvb zoFN!71=W)z^~eUI(GM$A3VWSxeyaIaThjx@tv`%b#)72@KUN__Y{_zZDjvuZ4vQAm{!9JE*jV zKhB-3>c6s=j!rOC{}YzkuW$b{|6^0dgj3(2cT04QA}b9h$mQndjw5lTAs_+LuGzY> zbqU176KdYCe0gO+(@G6$Z4vDGl17)ickYxRMnphqS99j{*kmXyH997yaad^vM-?)& zn5;hu#?R4whty`5skOBx=eOhQCH8$Q4=}U7bW6(}A1^2*ZctY(ZS8kx2vpyy5`Mz$ z_CwK$0(8XbNeJHHfhJ9i43hLtvBpLY_O-Hn$& zLnculV?1=IHfh098hd>%(8a-h)WDbE7Dc*_QBurEsrx8qO_wtu0D?^cW_fMH9+j1M+KcG^g5msra 
zj{dJ~(&*Q12%e2Xc;J-U8-TemADn}}irIz@8-7i%2UX%{dPJv8oW^H4bmfW>*|B5j zY?LOHHMq~tNIa|8t`%4ipyC+5Y*_>rucKD_qx*!(Co9Mo7UW$ej&h$ILyt@}uH2V1 zUW*GSzLtOs4m0pwofN(0i%4&*Z+%%c)_5}jiPLQl@3c_n>FELk35ek-+A?^rD6~TO z%xz#em;?IXhOgegb7w4x4_xXqA49rF#=dV%P&U%k$Y`9&2>K!*m7o%U=d(rO*;NY? z`w$9VK2uaMnD)p52Ij9O#vAAvn28(+SyYom)djtjL^T1HgOuXWD|e#2PF4u^aClf%0u|;8ouCo9 z)l2*%TIMmqznlVJY*N*J(7H7@T4J|sQNxtFQ~4w+l~~~|);|c6fKB3w(cM_S!!o}DjCMU*(qhP?!#H?{b%K+Gj z_()7+(RLEKH;~p|b_wforFU3i8H>i_U@D6%D$aSAG)s2|4zjUvrHhD$B7MCp+yt5@ z-KY{G9SNR0S1DPh$T!Zi;mQGMtDYA~pLSIE_UgaEQe>Z#E{|K3S3ch`tHe3|ZT=hY#~2 zGRL*?`ae*Le50(aV`_-+lqpI$!I~$RQ%!0;J9Sd;@9r%1y?j~yZgH^!FV*4iTf8yi zQIgc|d)Kbr3=0jy3OY0G9}ZT5Vz)&r+rt+w96%*uJW5X&5*fkq$`I0aJoHXgp(tNiUlE1F(xSy9rdBl zp-3ZHZ4}$RPZC|HeMgP@8sTaw6UCqKp0?fkj7fksAPLj(Ij(~%wir0@TiF<{s$%CA z;aj39H=U6f6m+^vkmVgfo+t=%c5w-z?n0n35>q;n4djYqX9JRix68E+v*yl?L6&1b zS!eMSnv^_xDiUo}V&{8$CI)|N&^UKu02sC~}CyDIz7^VG@Z% z>_e8CBs)}xt(*UlcsP1=plw>%dJn7?-+*GtNnme>Doxzv`$U0*85v!m(GjO&;nqpg zWf@0~DC2Hv-(DIX9)&JS_SD1BPy;5JJ){{(QhYY?UDmH3L+v@;#}W8&nih-bt=w63 zNzpw!ltc>X4AW_?ze>YuD9X*OcmXmXxqa+5q|jT26cp8f_pd1XN-xG>27^ z^nq%m;Q8|(%QPy#{UNa)5?p%gA&XOws(rd|A=aUtJf{BdV5+Cox@saxsNEtsj(qH8`BwIQQ%Ye$T{TWd~BoPAm@7^uV_RHiR zBGREWiIpk}eGe}pb^z%}I|};OyLozQa-U)zTBMJ7@a95%{DdfrNu;lw^-Y2m0x3bD zD`Tx89n>C)loqUyKe~Lg;vMy{5u6h;W3}r(oOe2Van7npE#L^ufuuOX4B`X=#%ua? 
z4Q{76Qn=GZ#nhaeJS8x2tfW6>-;Ova* zJz6xztXtwXA=0d=x!K#_Uxair3kB$$$mE>STOnS(0|L~xix=d+@PZ`PJXPv*@l~e{ zGdmzpd2IQs7>*jQhEHXTv65*~axD%Vf>!INPcc-+5KYNdctSLlHQSugiy&aV{rnz{ z`}SBuJ@GxX_q?6{co^=SH&Cq* z+r;iW03o`$dwP~4@J55PSqJ5K<65`w-8WG~10{Heak_+3$rT)za5) z+$KxFRpqM4b2MHT&v~)*UH+uClz_U zaa_pZiqel|E5};yubl2I?Jr+mw&Ln-%!o@}2yFoKE{+^`T&AXcX8hqu5JFY>hY`%A zQ9HF_YyLZWc@qt-@>N`5X~caD3bIaED}Khdu_7)bAr}frxOz^E`20W%@;|uUU|8&Z z3if@*X8Wl_=t9+24XN;WjJGw2drdVE>4B7C-Z?Z^K9lh!jPiAx$$$ z>^}28>8%0w=K1yHl0}~YDZWqUSGF{~YN4DEKNeHgnDlF6qNhU5he36;(`@68%-hD+EXcZASQf|9161Om5%9eYaR&k%qRJ!DJ+XW+bK})8TM-SF(vkKA zKS#_xmX+m>$4Xgg>5)={Q^&I`Lp zPaeK2xNvt`S|JI1Yf`gD(CKL!TrO@TMokLWF@ML+Io+qIu1McDV9=nebMenk{{Fjd z9|_! z+U8)A9dFa%T$OFKVphIf?yTpQJ|gSXsb1HcUL7f45WVu`>E9Ly+9%m>J#16^Q)s9f zf=>-?l^R-`6~R@U7Wnv!=FNL1wpN{Jd{Fpt@H~6<(|b!R?-sSp+eE~TseX9>S2z2W zqe6c9P{c2yDhw z6}<)Sdd@rk{>k>Ahn^`rc|7^=fBy5(k|^=o|MMICP`KDO&RqQV|2}w8Q@MQ3KQH;; r2QOSR-t&K+U>jUL{QtdEY)AKIn=-H6aYaiz@yFd|qH~U8aO8ggnL`D% diff --git a/content/english/hpc/data-structures/img/segtree-layout.svg b/content/english/hpc/data-structures/img/segtree-layout.svg deleted file mode 100644 index aefb2427..00000000 --- a/content/english/hpc/data-structures/img/segtree-layout.svg +++ /dev/null @@ -1,3 +0,0 @@ - - - Produced by OmniGraffle 6.6.2 2020-06-26 13:14:04 +0000Canvas 10Layer 113-1223-42311352-880903-124122522718-86-9372452828594-52-52-138-53229tree diff --git a/content/english/hpc/data-structures/img/segtree-path.png b/content/english/hpc/data-structures/img/segtree-path.png index e22597253d3787c1847f1f6a56b66a375cc3975b..44517df16f4870d83204c10caca34498ee1843a5 100644 GIT binary patch literal 29558 zcmaI8bySpJ^e#Shmvom%C?VY)QZh(42nf>M9U`qDAR#F*f`kf4Gk}1I^bpb^-QDmz zkBVmD-T$>|IG32!siuswk)Hm$Nqjx`+6RZtA<0ovvp`keyshVGH zee8UXLUQdGe(t{emfE`=I1so>M=X!MCy43;DS?>ahRug5&IfxjM}ZG-g5wUtb=~_M zTd&sTv9%5BhcSrd(Ob}q@SqX0ym`bdKNM-qAk2#P8?#L!R#sN8=jk)r}4L z=mJ)pO632{UiTFqV>4fG4$9ow>`c|(fAC=Bo1Io+QBlQSh0(WHGI!UhuLx*8dgy(L 
z87enJAEd^v^ksPNFUX1eY(FX2uW06!H?idBV|A&n z2ObCyVd}{4`1i8)k^_3F4>UDoZjMuXZTK?@>BVhOH3Cn!AFr&gW`S4JR7qwK%G7u; zdNv>G8kdkDXlrMeeY*X}y3DlUE2oUiJh(|?Yoba-uT1yT&*gf)0X<4xCffs?p?fg= zM{b7~m_@4`+xBNu=tpf+!_U?-&sd{4x<0O`VK0h++tyaIT`?C%iq#dBl-|C7|HQ`T zKD$=#N1vUk@v}0$a%C?s!Rb2ZBKc_CaWbn&OoDp91D2j-#))#n*>Gn##sVgj@zL!g zOLi{kw|dm|cx%UWVtEL1A+KR-X*pdaxLS;6GwXQD6pD z4B*vP)`yD-ufQv;Gcj;fUx^Vs@v>@i4QGWn!D{k}cicayNAmScpJU9x*I&#V1{2n$oV+{BhpC)yi-kmKpwDHoMB%0aT$>7iGjI)G`C6J z{)js7+qATAL>bl59WR=#+qM>Ycm+QVp<&hn7Yf!1$7`g6Mn(70 zPleu`7>uG3Nm|}qH0`wt>`Z?jM6QuV=uPd*>+^&0vl5NW=f8gc_L>ZtZS*cFuWFro zmy)v7|D%73aHk^tEZiOLy*jaqn%+ac2f0OUg_LDwFowm1ZOAG$1taaDSio);MciRF zZ#l8Y!ONTc8hfpXM9_KaT@t;-NWL5zWc7DuC)fzDU%$=|IN1d66oE_5^(HTGup=7p zPosCa_q1(a`eW;E{1?C9%N47o#Xb5ROUhou*~tkf^^x&KnL=?TJIc&LH9(1vGh#>m zIni8)?T0=lW{$0`rkV#fr^VmZ)sdD!Ur0)7D#|+8@L-><{mvA?XOzx~h(O&6RCtB| z#ic3W!~uA)QH?!`l~p4GClKu_Y|m}m(^flu3GJa9{^v?a0A%NcBL`_ZHIa?C9{L%D?j%@nDPMa6SyP`EZuG6&Z}n9Pv()o8$k84DD(E79i<< z{XOJD488TTr<7kW1HRF?Cm<%4d3@|uSXlTvF_BGJIQ{*5s?tvWzbh*nAwkShZZYTv z^4Mnd;$A=Pt_J%1*#rbe?XD>0v8A4X(6m^=s!W-Wu9CnLMqKFiHXXeF_c_HxXb3%Y zp!vtLU+80$KeCPfs%RcFqRM(lkDt#zGI`*DI1aLuswNN`0Mj~8NVxwfYF|-|35UFX z#@>8sdu|Z@RuMyx_V?Z9e_T2>F!R|18E+i+Q;qCd80ZsH4r&xd1d5c zfQU;8!a6p0B|2_QSY9;-S}W~|fL|KfoAkf<)fgt#wK!+5a={Vv7hGgYSmTvcVioSpX(`pF z7)E^ZKDT=kHBy3KCe(KrXdknz0AU*@MEQa#TH*?9r;L0xxc*2r>eTJCP-0O9mUj;p zWd)qtN>62_M9!Go)S2a@fRmsT?u5M;LDAcu%NS2ucs2qrXB)1Q|3wU@i36s|_f*mi zt&1qOntr2{l&hxAb?2QN+uK{u|Ngq4mtxFcqxJqK=~AA&Ky~`rA8H}FS_vC$Y-ZFq zVDR%-K2M?uWzG7<54$*OYJ#`37UzN+Ym$u~D9XM8Ya47IqgYS!q&m<-`Gg`D_KF6z z5LT+7es8mA2an_pc<&AYw&0E4Ag>NeHQXijd)r5fxFlrA&xP2E9q-e#CM>93Hr$^& zK6XJqvzY~Kndmgcu%X2L4qiA(TRQZ0p}z$B4;lCir-`!{HcZCMGstV$WLT;`G}=hL z*Y79KK^dEVWhk@h{q{7DM76R%j*h$h8Yy|oy_g*%T7oy)YjfIx( zq%Jub-`ntHQEN&mG2)6n79;COLN{PRyb~;_l;mbp>{ZKRYQ8^JYHN#)tzGuSjV;2= zAMpk|-Pjk4se3oLp%-sa3al=D7_N&G?{ahOGo^@!Bu+Iq#m1}(EZB>#Q(l4W`E}V> zZ@ikZjld&iOIpHxF&+N2RP7ix5sGflv*0rSRx3Myrzr8gMDhtsL^^o-n|El&&%8F} 
zuAZD@E)?eVaYVT#3fu1O5))4`L&s4vym*R;rJ38)_?~$$h&zeQ3Cq5lz2by1lZ41) z3Xri(DxpkR;?=oyA6TWP*qfA1psIL-I~u_no$IJ32RdG<--I8x6SL?>qZt$#!=^Lu zD)hD~qS?^c!=+%-rmY^B?_Wvcvm%X0mAE6TNB6vMoQcoHZc#jx z^raU{R6$2rwQ;S(!MKU*#a{?Wxvq}}Wpt~|KkVc(8e{4Mf8$T5y?zIzRLl zClar(kfJC?Je1SW!0*ozI{I^KQe(e(*l1GcbYLVFMci%;4ok1~p@JVL1G!S_E5FiW zUk|lOkJmaH5Hd)@u0oX7Q-OyvE5cPZG(LvhT(VWJ{SwcOi;Ls(XJgKJ^Wy3Ml+XoN zD)QB1)Kdt#E(8(P-HABZtkc7B>H9#$Z(#|43|n++mCN(jEDBQz^E;9*772a>!W?>J zQ&{38YB|+{kEKj>PwvZDn)(`ukinvhCEFXkV-*m5)DmZAOf7uWl2I`WN_KwnwfUeq zKv(mitUZoh`N{=%)7PF-2&TTbF#l(v=q0F(DE)61xa>jLHU0W>^ntMRl%k)XD9C)Q zLEb0$as)lBSUuybbr0dad-u+-_S*a}Rx&<8<@&!%f@E-HbW|+#Rx*RfY$?sayg(z9 zKbOoZX7;YMP}waM|6KlP4h=>!&x_)Vrj=zezt<}lP29eGeQ_k{GV^|Y-JYMHpD$Cx zw9zZ}=g%kOpN%>TmEvL00@S_?-nC95nUBF&r*jO4e0A_m9$mZ(n*}U{mo$dY)-5d+ z)+4M(Xn3benv102+*AT6Sk2t;`GAKPD!zRAQu6Nl==yv)Sy54u&vRA7JoLsJnHO}$ zQ(mkWB^v@KSX^Ax$d$~D#-rr7`AIxpY1Ws{ZQ^%#bB032Z-Z?SbbtY}h@%owO>LbG zI{Dd;$|ywBEq6E$9us`J+=s3^gi*u=8z6s^yggq|pqEJh{Q1$Zcbp0!g)}lUI^LQ{ zW>rZdrlnOkHKp5YJ|YC8ADzBA^~^Izd8cw>J22FYRff?*0Uh40je|?fQOKh+Yz*Ky#bmjZk_fKCIOgf59uW*~)4-LTpcC5=n(pTB^#Gb%cIxYA4lY_iV3y{W3|>dPk+ z786(d&f)W6MsSve2f8w4OU(|q--R5Wucrr4s!=HzS4NqHT&z$&)6t=S76_~}GSKO44#tlxgBhSRLki&;o$tg#I*n%c%~b@AA&k-spV{rRa~4VER|jB5G2u$eYOpy@2)-h z1qA$$`bFsLvZaH?YaJ(EEWpD9$gJ5nIX`g0UK|r?ixg*08b4>mXdPd+1d+{}KNGx^ z$a_93RMH0;mSbgl5Clv+a&pqJqqEb0w_#1xa}~es`Y_(Rsm^)&a7cmZap-MO`Xi%` z@d}fZN8=$2;<|0+xbPSGB141MA%bOt_M`|Y`~`vhdwh`hh5u&F&0*^C_b16}e|C?F zYl{+a-m9~{Qjj)&Hm;5P`0>8u{3{YxWxf}`skVILDS01&?D6*I3eE8Q(MY7MuO*KI zESgWZze>2Q=xQD|RYc5!9vvMGO2nDx6R}B25BT{@YZ~B7xP;tG>K;7 zm${A>(W5AALQsMT#SoQ$krZ*7lm%0IqNdhi!~ex=nWB3Bsz8 z_cpiK{d=RkQ2xBGjsE1_qDU^CDRQsSOLR-@HFw^e%@1yll^cDv!2$2P3}6Lh*kV&p z9|FaW&-P?Hc#z%B+(Ln=Zakt0{(K=S)qA|5VbC6dy%VE-qptL3q*hZG{R0P6sa5Bz z>-oEzJOhbey}c;=r!Ts|M0-I^3faUw$~2H5l}AtQ2}|adL9YI^OH=>O2OB%P)XYp( zd3pJQvzZnxA;-!U;Msr3$88QU4>BJao=_ym+F z{|`_D1iL6=mWAJWGI`F^b>vM?jg4sn&vuJn^k;#y+U$(qwBAkbEI4jNF5+Q32-4?! 
zp?6y&jipaNu!x9ch9ijIG#S*$90e&aYN)7 zCZqN;Y?l-9k;Cw>G4N6Aww|dmE5bOiYPy9j@|378Iox7Wo$FBVj9LBQEs&4(Z{@Dd|I3 zUNmEsV_b;v56ymqGZ#%PeJ>U_;H$Ps4u&i_{Tbt{vhb43vo;ia*zzqXb(jhLu*SZA zB={`+%)=McfDN_pjanXYhe{Sowzf#7I*jS3!6u4he~+3FqxuL!9y?4J7MoHlr$<;; zF5jEX(4=CrVL~K0bU`Km`dPP!^%qI8b2R8EQ4fKN(3V|ha#IR?`!`qAK9PE#KVdXl zIlb&dM#lNpk&Fn}hwtTI6YFE0GuBDtD+|VosCcUh6^Qo{9UmP;6L)`WBbwy(haFoc z7=%cTmVZ4&@yZ&n?#2Uq-4fbR ziFwPHvHWOgG~WZ8IpPO7{9e#>OuLSV<(8v6P5tQP5;k)XjHmilhnyY?AAyiyspEQN0f zEPZ}Z%8UBaxcr{5-Lk4_na}omMZ6BpK(>1MVEOO`@Ly2@Y8=VXESK5Vlm@a}~&gkS)C;G;$rC?qUW47TJ@=ID}Mz*YJxB?Z!p#I1eR#Bc-wS% zD76V4Giu7s-yc&efi~L@T6ED@nbM@0gH{79iNFpyRE*ubkE-rYHz2=?Mk?QcA*#HX z&qYcV9bgPy@4Hpq-+{5?1CXOSAvz@zyvldENzCW75JBYhsJIAw(mk(iS%-es7pCFH z!hL^Iy^{Ca;j$sKC>lc>&kw<(e$sFH?x_eQ5IE-wq>72V>9+7gumV@Ge7&5+-MeEe zK7COx*jJ5^7F?K>eQeZzG6$W>z)l9(&&zT8WxdT`B62HzuGtq9#wz13y zyP>0WyUO>9jcCJ=9=j&3H4UEjD*ef8h!FOZuJM?HFW`!&_E+$JEG}AO7UtvPn~^#T zLR5O-N^XT>`&e~q^APKJR>cJvW>EaG750PEVCYjcU!KN6J~o$p^bm z4rUtn{zS#Gs*#3^eI!!g8BGEN3H}RqJ|zL zUm@ks;1kQX9E<=OL!QMJlP`Lggedfhz{OfgV=hOyWgW;0euY%levoxUj;eF&`O+bS z{HKmxfjn0*%U?E>tGCkl#qH$q=X7gm@%%N4@{==|Ir(l-wq>^HO7qFf(_O{O(>P)# zE%$fk!C~>R`72jhQZSTSgP-n0WfTxfPh`=S;~reYTSwTnifNx&H6f+UMECdx{M`n<9~w2s*PhpcV>x?ZXirgwXkg6*uWZ$%i7e znubjf43+ai-Iv(p9KLy#4ts~QvcQFVPB#D4%7bl)Oo(Y-%#x_A4w_@LK%#rpj7_Lj zH1de;Q(_j{BRZj`8)0375z50bci3>P(n1rXP=ZQ+7DUn&up{aY5JogxHA7xPAzx=y zF8EyRW$s~G3!GGXrVX1o(S7c%0C9SD(0N1jZPN2=p=Brk&J0H5bqAJi3*Ij7zTmUr z=z&Y?>yu}QdwZRO^tvpGJYhu9C^H^?+#1=!{p?tPfM5qW?|Z5VhTndm^2`*lOR9BZ zv5)HN>iPo8m<)X$4h|-E_85>E=aH>~`qb`xf01bdL^*9LIH=I5A3u*8wnfRe8_ac@ zZ>9g6YeW94og4yC;9^j$D^sF{fuxGo41KX+e}6x$Aqms~FT6;iGEVpAwThGz@wpWM zFd!zzcJiD!oQs*6*>YF4QkV~5WXhDX$TWxL9e{5B2S8T7RUwerj~Isb8D%`6I+zGC z<5M1fx1rBP;xvUe3q`{S;Lm+VyFh_BH*NKA)<;iufZHPk%`FPM*N*g2-FTry@6oS| zI;E8?HxnB3wozM87u`1@v*DZQlcLTe9(qxurCO8$DXU9e@q{uKAZ1D}bq+xB{DPrB z79Qg+&8ti2@OPF(w(z313)a5qa$LvM5eS$eozkaov$IY9imwgk_1N$mg%14u$s{9_ z2a30+dD6(LM^fzA8?K6rrT;-&=lP^#Y?i7e2Or%!wi9Xz+L^zH+oH+E#RaHG8TMsZ 
zStC8ya+2{fb8>!zOifbChCH8SpP8#V4*jyQGE-d1NhiKzR7S7EhgV%0WA~*0iZUb} zkRiRP?0Bd;tBUfQA?CA)af(dztFq8#k5n$b&VAY$#R^vLK z3+NOIs|Eka1E3dSzQY}j5{sN=ZQvZ_^-Z|l{r1SUQe z98gycH>cAV2R)1xdx;EEQvpN{Gxd*;R{KYn-<(bi?!fsGEj`Z(tI34=emw85vx4iu zwP5pTRIVE<(`S?u4T{_dd5hyMRC{Fr??|Z)z46mY+X=T5cjf&A9i_=ZJg9 zK1CH36N}KRo$n2cuV|@+ou~lF@{Zdi9aQ;4A5}>~ z0UOh6^L%^q3&a#KP}jFtdte3UvJ2lsk0^3GF6g2y3RaE!zXv*#JCvoD&CwaITmSv} z_IkgQ2rTYvBwV=LwC91uH{3Q6oB*eAX)(BfWc0Ss54&AX(RkeY^lGLp^S4r(a4&u^!8qW)0LI~j?5f#UG*+Vn2oa!;}yFm!ZX zt*MT9ApUtmM>siv{Z`?NEaN07mdESEl2_Z`!f1YuKL;?KG;CM=(mz@ed8tPCLgMuXSD0$G40K<$| zsxw91^>vU1rar}#<09>mxzscb|(Z zZn#JVjcdQm)TlUG^8xA6+F&UthejZ<(SBJcOTD1~~=>OINC~(LmIk*#}{5IM*D^e!-`t_?H@DKm>f|&Eo@~YSI@o4#A7RTFvQURd3 z53sMrqyy35jWou135uDXTiXEsA`uL|zm;oPE_oHtuyrP?P&F#j^YDwVWYKd&lw5 z?oXDH$YYJm>=dQK?Vh}p6$+Qzl=hE+m$GGZSd^EBEdfG+;$EC~ZNtzlT~3y{eP7jj zSq5cnsjn9WAeaI`3_+P&2+k+n3ga|T-TLc9g+gx*H;W>Uv%>0(jTvBgbcd_5@{CwS z41dA>)weShCL&AA@p)(>l9D;6JJY1(T3%NE|jZZ}Kjq^(K8Gim*viVS0`AHYic@IV1xH89n{4+IRcdY^B>NHs$ z&TT>$y2coK{rKkU?5wLGRDQN*vxujT8uqj|eJ=V=u2$_>$s674_Wc{a-+;M2|C_3H zemd=1!HwhEb|Vb@5;@TL)7`v~OHQz)Y+PKw8s`Am+)M8}V-@_m1CLUBBilk-Zj70U zbnIo#EB_C2a28jh`^R4ZSX$>Y`wIYKf9jmG6>zIL$hma$6=Db_Z;nU3w#NI#i?XF~n zw_z)o1q6tHpAV6+?n$x*;B^9nR2B&9Z_?6~3=OG(YoH;5qFVyb#+MmB>S2&;m?hJs zpdl2D4q$@cim9D3H)+sL{=Q_*$J?~3dIiJsawru48CvMed~%=Lpo_yqf0p$TB?BA) zxs$bylmI3<>0&;Det07R8O#^IDlQU#tBmSezI=hDNKNn{`)k?V>hKjC_YVM$h}hA>6+Z{FXh+%HXa z{%gz+|KK~<3w|~ZPJs{BmJk7QyEEIh6-V~@a~5DbTY$agZjav@10fE50WOzNDp3$2 zzM|eS7lahr9%RIXHYR>q>p$zKJ5)7WwI4Yja+T;XR%Ub+l?3c#U=f@qU^f8$?qU|> z&C1Gp(u`&FPDP5|FszXxMOzE4q^L4RZ}sFw{ws^~zUF>D-HccEx?y*iT(3ikr%%vP z{4`bFY}o+@1oA+BzkN$2$6n_|4g8YugIuX*wsM|y@bPX`P85wIBC2sxf2zfR-~`)| zVN-iuFUz@)$=Z~9z=nUgREyC(_(bn%uB57*97GWQQ?aj(etmI5{{h z!4AC!+YIs9C>3D4hdbZrjq?hB_V!wJCr}eEN3L6*UJ6`^SJ;{GyuODbwEvABON4*o85h_v3_1_EJ-JUAq&g&fGhb}Ckir$9=M zK=?`pdK~wqA2`z2`wHT|yZA^24YgE&poGsh69{);q5MY@7e%>tM##s^+W%nhJe--a zL*0z+BtmAIBcF`~loM&X^5r7!5ik-^1lS%dbw$O*=(Gm;hXAz-$fIN%JXWYg-I##H 
z=kWARymtGUGprlMpg{tzPrqP{h@|My%7T;tWVh#IX>+q=@^^aapn8DLjdd+}vd(yZ zcKY*G++1nSsNSt$MLb}1os`v7nB+u40Hp*(JlbIDOr(^_$9AM8HE}-^C2XU1!d&3; z-dBgwlEf8pE(`fT>I54@yANa%Fjznx@tWzu|E1?ZYBWHpA&!(e{6kjx$s&Qa;!ogS ziwXc|dmlC~f?RNWsx}_vS~P<8(51`M)HpomWAelHF+dOa9S^IR8T$_AN&%_UaCL#> z@>{^#B!G1v`C9B~y;Nb*HdlP2xX?Q=;Bz4^9(*Z?Jc59~@(>7TXJ!4`oo#wKR+grI z)J8A;nLa}sJ$qxcG^06%xXZB7ixYSzUn3QSPR!#6fL&$EUeSD-BC}LSscT48beoJj zm`MC)Tf-|;Gc@*)Dzt^{yPy%J;8n?)y;ezm1>%QMh_jP9TYQJs9nb z3q5jGbzv7pC}@i5j;=?DzQvuv`7F@L)M4ANeU_MlQ0d>uh~Jvm&VQgBLLd}83~|zN zAYWr-;>#LQguf4uYn95h4Q%$GX2iL%)FZ2s*)*PvbQF;ebk;sn$yLI*U^!MSXx&yb z7B5_->Q~leT>2$^qD|Ubx;$CP*B!m{>+(5VgoOKQZR&eN7P8vfR!A`}a~99&sO6uCudj$Stz8jD#Bp2zA9R z6sU+p)xl-S$jgwowq6pZ9Ot|BY8p1;eE<;0Fhk{<|XJyzx0NxZ=J=G4gk+%b>S}h?HZ?^-+oc#%j0F z`m7!R6UiJdutnxT2nXzU1E+S=aw-j05lguIQU5 z_g5vteHaR!QHjdKy3r0@O8#%S2p> zk_rD!QWQh&-~GjkOu!V~z*j6up8R)pvaIUA_ozoV7Kc70epo245}y8VKsW3k|E7ax zk^kQsNuqN9TN`2u*nju7TEe1%3HC9-t-wRtnW6t#P5T@6%zw8UF#TI@s(Mr*dYB&C zlkUfG8XEvsw#aH>IbbMs>DDt3fqUMB<1@xhyBS$_8JLTv&jW%5Toxte@p`JW@ak6@#Rk!w`k z6;ido`}Zyj*sHV=V4#>_@qMnyz9m9%ce*7oAcnP)-Xu=lf3sEK5VTbf81ifX&j1V9 z;B$SB86EWsOHHJahWrYkeZch(BK|W3ocQyGY=z|j<~xV>9OQHI-Q_nhnKtB>(H;L; zq#O4XIfAaKdYUD8VK4$!!K4>(wL3iU2jJF;q}yNwXmii2&(MGCFmxyXMW5||U7$}@ zeNUC>(gX%}l{}{e@yMeMcyc1oYoaFsL!t8if>z*w(1u z6yWjm$m{f}_R*41#;;zba|lK|fNg;Ih;@>bMCvt{H%^u>!$&&d7~9+WFiC|g-=T?( z8m$@o8ueJ#>`iM*1g;^Pl4KdipusMNG|DO32dGGG%Ey8cr|`l@#5)}U6KD^i^eNK? 
zBT=m>v5spa7SF>)8F$+Vs~Nm2d@xTNsx>LfmdLPom1cFyY0QWdoh&i;fqMIK{EzQO zDNl4E)Az_fhpu!8`@;gCTpadEADcf>zu4Nk+wuU80vlFTt8ctH9mI@ma=)L5-ZnjScCL)sg zF7ed*`hvryVZfJhmV4{*t*Lzms;(OyM=Si?bIHknAcL_{G@y=95XFAD5My|_QWrLx z3TSVT_&))PB2%DNsmudXT)=OBUcn{SgwTk%XaoiIk{2NHx*SotL>i&j@uO|1F8rCG z@{~bdlo-AbNK$CA8hIYw0suiliTkQd&#b(9Z}LlIPEL-jxXMZRI!G2AfT#c{3o$T| za$um8g?@U*mPclPcYE`(xL8kN%=}xB2l1IM)~^g+i@9RYzp)@^Ncr$V^JH_(%=iw} zksm*Pe5XY4(rt!4+9TPO2lGZ(xDoO~lZBP_VxsNV)=LcHJqz+B(FAy-s|bqypRa5r zBy4s)I?tYo0^JZuF}z;}``X&rRQv9kJ2oWb2X5K|cmPphVz4*c=nSwQAnrx1KNl79 z-qdaM+Ia5u_-B9ri7M0Q7@$))00<7P8OW$7D*F$X!}rbB?Ez-eeDT84i+XFOLBMgM zQmBzC`aT;QQV&KAt~3wa5lZJVbMShcQ)b>G4g|^~HTa;AuF*Y3eF}=AKVNO*hr1vP zK$-j;C^VEQJ3j%j#xsyC7R9%BC(^#0sQgf1yKmb@0rD`pe7$mgz+ul1@oRh#PXjqU ztuZF^@se7P12Rj)US#?I#rywfN~ir2w^BU6`yq9bo%MkrW_Mvp365T0xNo?l)JzC* zHST(`Ni3=sdNr0G3T&(WlsrsDdqP{_CXO6>FYE(c00{DC^vH$6{tC+nmZiyrWgKj~ zAMnFF;g}1UQh4NHCD|DR($WLoFn1El%X86y@`TI4KpLTn3|<07&2{=G)=mx2sjAV1 znKixxx$bW3mCHTSmPVIfka_FuZ7~#b`{)AY zgfVGH3ko6Q-7|O|Y`A$Iwb5S(TMI~v4?q>+arjpi*>eCSPZfK?0F(Lq_pjTI)2-w@wn6j`-z^wTZDG;pxK_@qyO1ALAbI5w756^&EF5(5rWPKR}_$xBG1g<#~ zd^W3FuHSz-SD+Acz8J&Ep;Lk{6>wag^iUXRUOGCQNJ&4Il#J9ln^ZRbs<`kOM0IsE zcf>sYxbG#NTr{BPjew2jp^k=?yE{jD&jL^*E+>aBhFJ#}7uOIJk1<4ytBqTg%iXlD z$E(C7@OhZ*tT`_tD9Lf#JRT!NIdPZJ4YQu{9Q#QJoIOiGKmg#s1REP0z^$~Kd=`yV zJv=-nf!~xS2iu`mW!*8xH*)4F1RFV2RfQH*7RAAM@0{Uwur@i1BzcR^z@0n*$a;8q z2(Z-EDaWeXxgO-00IUIZmDkSHSoYO3A%tQ#NUwvWQt4#=ELTZ1wkt4?a9 z@nCJ=(O`pC!}wGQmqGcJZwp{=XGyi?T72}AB(zh#wFK^FF!aKmaDHQipsnV&(gumo zbs!g^HAoB~NB~J2Ep}&Ce-qYt#v^h^M|iA6AXxSx@WfrW%dTkk1Q(tPpMd#6ELn!g zz#(1qYY?`z=hA6X3AAYDHi#Il=Dm)ey8QHjkHlNiWqcwI$jZ- z)yjc$(&M~1v4y>`{pTC<9zmzcH=wE#ZJzkAb3Pb)i_|YQOXZx>1&7g) z1SPJX(7=v79&WC>TJW{yvBv-YtpQLkWl)ZT*g`)jef0>7kj~A}4(O4Zj|L>3M)FYp z$t%i*k*t&!MWFB03X-1_dkf^fIBP7>;Rl#yEKoM3r9Ct_T+0hJ#)@ioTSNsWT>}tZ zDL`wywkO{L{P|C|!k_u3z=GX=+ut%O1trfi(dpzco}Y4|d`AfeUM{$&GL332ALAJS z^kBO!XXeNqd~vwaa608E7I^x(Q=F5NbH;nZ%xUVQI{%MbYg6myN!APoF)z2Z)U|u*F6lbL6fZ*M%1daR<|f 
zP){0a$>;eHC!=`YEk=KfUXZKGW9U;fyuvg9BozXoyW4gb0vZ?S=CXwxDZne10C^CS z8AO!n#Yf|LJ5~@VXkIr{>d`u=FVY;Y!5TU=0_YK>v?4AxfNiO9`jY~9hamuIA_aJ| zwZgiUrkR{NC47L&DACG$3M_mgeyrvsRKNvS?Y}~Uid5vnXxl7+ekk=}WqkxdZ$^O1 zx3c+Aje^%49}W&tq&m>R)Ey&P@(>%8$j?D176@-cZK0t^xTEIfC=!Hz2^;xwl6f0)mUS0*3fi^uojY?a*&t?BHg0k`jDJ=I+u%=2;P;aoplJQ(^oQR(=s>CE zb~QxBOa(2hA1>pwm!?!3v|6~DTK+n`Lhlp=sPlzLqW1!JgaR*q=L0ck7>$NFVD7-S zaPJqI82Y5c4nWqT39B~Q*(UB#SLZL9E8mFpGQaqekk?EN3S`9KKGRx=47`rQA+0|d zH+cyd0D$v|7?^pJ0}k?gt1Evdm4weDK(?_+nx0iHlERpNwc0vh*0oir!8!Ydx1eU? z+uy0f91b=jxtEtc0eB@XGUvw%rcw-o0rG?%R$<%-exC;O58Meodko7z@@3!3hH`hs z2u>p~7!)x+6XFw80t%{rtFj5igK&j1%!MvW3!4t~>6$PgKpqirdjfwP*Uz;7fEObl zaGU^qZol+YQ;R#1MDid&j#mvL0l>#&dOvPSv0$A*&y^I!NX<*%GiAi#Qn0P@EsWMz~ecF z-?YwCH3eWjbw3*pgT;OD;1#kz3+N}28bAte;~yXmJCQuTfjO=})ottPxF8vmH%*4W zr+5XN)a~}#A4x@k91q!LhD7c2BM?N==V2fV+!XWfVYVdUf+Nh;sbRdoLttJW_$d5x zb-gxApg-#xu#A6K{?G^_DZys{qt7y>A3vTu(Xi;YNvbcri?F-7_NoOCaLh&%fQH~8 z8oa!@7}b(l0Vz?jhizN`erh=GKw$n?fE`v%9kffbV-LTPR3qLG$xkX_Is2tcns7M9u zSBEjK?j(9VS->3tW-AB~_&~Gy8epD@+fHVC4F*N6#42P5x4H<%*;G;Cv`bSEbU1aC zZ82IeR%y!&s-6O2B?|0Waa^tKDpC+`P-Px{DG0b#zQ9u_sfcnrXhl8|ib!Er*3M0o zn)eN5+Ar)U5A{ET+zk!M?XBeoCIREGH(X`GvWM(oOzlQ7m4V^p0SpLQ92^{tJn0XA zzS)6ILDiki==x#Q3m~exKzR)|*3tI2!9(DbZ&Op3cbHZ3D^s(zgUwvVnnMN3gvtrZ zUR3PxF8hMK7^#Z|aXSG7;9S65xr49=0S;4fWau*A`T+!E0S5$!1^Eho-ERfkqrq2Y zI;DhwKKp)4M?s-17k-&W4{~pS5s$Pq&AVcW6pUEz{>Bm_7-Rb}iq$@&Khtnja0v@M zGt`Z9`tRe06~PD{8er_)9@=vH zTy?Yt=plqOanCo<`0WPhvV?SqGKE~{W~LCi;B0LPI5|Ha)uMh}2s8Z2olbbi3gvt~ ztu3_Wr@J%sERZMCAw}e4=x;y(zZZ6<0fyyxXKW5AtZWh2)w^4*X$ul0c$+2S`ZOAs zyqP;w_MUs1N~3^N=T3N-zx4 zcgO~g49(2y6LNn@k2FijH_LX^Iv2g0!cL%bzB!q|!NXey^ur=@tAjiObUx+*>A)yB zt$lW9RFfH`sQ4@)U=?1#!%%+o^a$7wF}cPKV~$}75~byS0@=zB$ObI^Zz61}cHTsm zlE-v?=-vrPB2(VKx7X}KLsI<{=7EWybevnz2l8YF!OrLhY?oOs$E?!n+3L>ZK0_WC82k>NN+}FF@Qfxk`BJK1YuCbg)a4h*w$+K!=1IV z7zZPNh7)4NU_@$3`WBw9b^J}n=M@IV$j$w#tEg?>=6zI9eW2$9zqwwMxmz4Fuqd`@ z3qf`e0NU!C+oHTnkFG88b}`f7C?$300Q6QdndSmkL&$c-A~fWE6>{$Pe@-T1~`TAZlcq$0L2WlpEu;% 
zud?knpmYng1&knVx-(TgVYP=kRWuSl7FG%ca?I7xHB^~GkAbOy-4OQ$I7)8ltv@*J zWI?q=KL-dBdcfz{%r*NzjXXrMnLP}F8dZh5%Ehe?kVL5eRUNF9n_CNJ8Xb^ze*r+{ z(SW)6>VNbC!lerXONIvM{P*TJa^`Zi`?YdXdY2-bdwXgKZ*Oich$ipzz&wZPI&Q!k z3FSZgNJ2vLJ6j~Vx|$CZw~FBW1&n*p`l{&wdkwfW+?FuioW)m9i zCgk;Sn~Ex{=3rN&mPdgig{W1$;bSn%-|PZPL{xn6EaxvZUgIF`A)pn01Ilk8mv(JF zz_bXtD4vUPiM*WAhEW{ttWKmt6+HeH4z$zhhPD*LJb|(+4o-!rJ^gag;vgzrK~2@3 z)@#Fsu;2h^KT_Hcf+*NVitU^!c$wIRz14*?l-azKxQ*yU`w*lIc zEU>P>X$(Qa+50l@J&+6xfz-&A&~`rjFW%8}UYcntKBYuk@>s@a6rceI&_gM_t>+Pd zQceNR&$2UOF0!mQryNd%^B50S{?_mt}^Q^2P>A}!- zvvnu*uJnL%hYV1Svy|NgPPYb2RG3U(TKjAwVnJbwi!lDB_`kIPxVX(YmD^RtH?Y#Q z?ck(}0ecBX5XCEKlo>RK4j4kQ;pTh;)gGq9UPfxnEGXGh$K zO+_8WB!nmsY*`%;*0qg_t#+|Qcg)H@Dnu{s32CJuDapC4zwycmu?=q2jQDK@Yaxk;$rMn z;xHR@sf1h~0BM2;5DMA>RBC+u6=20O7(>p(ffYd4&7aobrqbZEO8HtvA&#XR8aTUB zvjOJV;qyt$B$?}dG?2oxa&sqwB|ZQOv?HMFLfZ$={|XvAkWCxFhEzfh?-XP=!$qJK z?Uh_u*I~|1j|@zcX|gplZrOpeI6<9e>n=>E zp&i%M8h03WAu64gnb`{x!IwZH0c0Nzy|nc&?(d!jYNCucerdIptQ~*#-LyYbAO(1?px@-5yh-7PXnK-C=|uXb{Qb_^qU7b-C%_If z1gta1z`uotrbW|A`tboG8re>REPeX3MFJ%zja78tnd`dTJi(bB*8l1f$?_#{cE!o2u^D$&FraHt_^+%g(4pukF;+nWu?NGfA6xGW1aMflcT+=nrsV| z0g?YNDCZXdB@4doA^|k4qymiW2pnRMUyj^={`@(pmp*l-HZs(QIXCANJ`3(z86GGY zvC(Pfj1p<|*--;uv5*&Zi1YaIW2DqaXYFP55Dx?BMm*a7e$XHurdL=7XjtM}E1)5Ca%0Q< zCs3wZL+cN^Y2)d*C8VWu!8c^c4R#i9HV9a7RYrs6t=xw`CRYb9B+0xcNMK>dAhx() zns1F)&_9bb=O_8N_R|~`up2WC-$u_W8&+v58%ki9<{4&O{%LS>YUHcaC3k@lTcEw9nPeF`t%8q)<#(L+_6t1uQzr>3!Ro&ChkN(c6C7`zND8=^0i+cIh-cz{ZKzk^S5^Um zcp8&oIn+0A-oABbW@YUhi?4DC2?;UO*B>XLV|V8saOJ67BbpBT0Q8Txrq>-jSf=iv zU_yX)**xC^Dxt9V0Nb~Q^+uVwlbtF5I4$mkm(0w}a5XgW^m8)cnP%qY!STCDK7qi$-tw~z zy-@`&?;BwGtO%uoR@jKPj#<5f&?Zr18ZRz+*sXkGqep?Tj^ zX{Vd*=SudQ=?gZ35o_Ld$4ghgr-Qar?_JChz_s8Ml28q_ z5S0!Eq>*RM{rdp_7mu$_;GCH~Yp=cbx?*-(POzfd^hB=Kd`g`)zQ3XV;a#YN%8v@o$GEw`N);~qLx;LbFc{-_G(K31vH!xc~)}#>BJ^tm2mLX2aq}YVG&&=Q! 
zUsdLf`|j>s8__Gpyt9L|Cmt1USYGaC<&}h zn4f!~lsOb@wo-Bv%|3f?YH-U*IW4Eo=PK9z`-GW=*l~?)k^u2zlK##%j9%B2kgfcK%D5YTo%G3%ybCTpDF* znsk9O>$53KBcNppH03ivi{R}* z=lvUa%vIR#t>ya`bE%)pCw3(g;&eF*#BbORi6>#%8G5Icrp!$48uHlX=Gnca4EyD< zlq^C;%9V+^a@Z|#+9P-4(N5FB%Y_puvf*}dy!pe1sp)o~sfI5N+h+H-=A~B7E{2J9 z6O_3^(d5x=*(s@u7-w?xk-?1=mo5JqzmGZ0$^5q3pu)1FEXPaHJ*ECCQb?07gjm~B zW?q@z%$Y&7UefWo1De(M58-x>{tq|X{Gh#yDO$FE&(%RulvD(M7#Ur5kNra80!sZF=)va+8xCRjX8wQ zJA`y;tv<1l6v1?LXn#8MaJ_P`(Mi!89f?*dC>+-`jS607i7JlQl(gz?&>nG7pd*>G z%SUCqn3+ZdOR&z^He*yDu<3+E)0+2dzF?kN^~G{1_L#9Z59Du%5oiB!Xu4ZIuK6s@ zbMvM4G;Uzc$SrwWn@Wc}ifNuRVq1Kp#az*ux!4!XE8KD2k`bv^hLBhUqVJ6W3L$4R>bXiT20}`iJ)y*>^wE zL>F$1o-+)NWhT7%3dZk!iK}Nx zN0q{ftDoS6l3$64Y+jpy%k{)^H0*uU5gaWH*Bab2PJLCuFjT)m?NrFV=2e7Td92)f zJ?b>ak)zPY%amkt^!<{!vnJb>vt#BBbz#mFk-rn#JllIDwPPMKbT(4rkSJA-a)oRy z=W>0xogFDH)OH}(QMZpKsJ7b0&W@~3Ha9A?{-<@>p*&@= z`O48~j<0WeSt7Iu>=Qa_<3#k!X%C0=97DEjQnAzDg;;#b&)vLsh}UVYrFQ(;?t&(M zbM~zVI;7M&?!1v4C&fgcHD-;iIC4+gt79kAvK)@o+qyX-_vVHp4Io9Fuudx;w(JZJuv6-+tdwY~3I`6`rA)fFo1C|lKfNPbt&@8X3iMw*N*y&wMRT(Y|~zP{_6{uErI+qI)df_u2$|4 zSNw0n2P)4=+{f*$M+gqJc|VQ6e z<5)7gGo+MP1vxAP+f0eoohV}rJ+bKOre*KKzMWQB&vClP@eH@KqJo6BMdJK62OP8U z^@x;dT#G>3zqfaHZMdJ1b@!7WUsOAFM3z;QzgXiJhUkPxESZUz%?; zpD~!ivGPv#@StXf$F-9I?ip{GYLmrS1j{~n=kpT zKjM?$VSX)r`bd*PQ)Gn7-BzX|EK_=8|FG^_fKP0iBkIG?eHu16BTVw3B<=r(kRW!B z@_!@9*Xw^(;@mT4^A@L6=*UU2Wy_A(IcY4s#YPqMG^u!>F8{CavR~CK#ezRiAGtiJs^7?-)SnRB|73a9h&1=oiw*^zG^ zE}n8_?=9K{?QGt|8C8-oTJ#McKfFwJiqS%v+rwZ_h7wgTykGx336Z3 z83tZh=MFwTWj-hwh&MPk@1w>!RyjyxkP3`P6O`ui{mwAT&g3^ZH1Ffrj!7iTQ_0_v zraBrT%_-k*v(yWWVW=$5sLWW4J2#?Xw!|xhgvH5K%_`eNYFPapjniD59ChtVp)OB& z>c82CKQH9)6>W{3V*|Dx?NuxOW5R}S^mD5}g-miMk}J^pIm4-&TM0hWvzL^79_72T z&fWX3_^zYPRsLD>N3j9CIc<<# zBhBSGEupYXr8(WHspua4$g%#X>DL0=lFjTeGI@*+r3J(8bt#Jlslu4LP+iZ_3KISN zmTs0Al`iCekV4V5t;xV-RMalGS-Do6mcuMLs9pYq(lTGed5acj+HTyaaD{xH$gzP_ z(!q!K$=;vM-^7rl^`l4n=AKJQer2CUx!qzKCe1BVuExj|v(R-^3&C+#SZPOp+cIJq zyZmJIqU`z$YN3&K{f?hKPOG4+3q6W++NjjUicLo=B@}UGx9iI>E9R$bpc(qj6xF#V 
zLpZZ_3m=T%L7&*#1op2Qeb*y*jTL|T^=lY4K_qHuy)0fkeqMA}KEYH4mo&RHa+gKo ztDS_yrfiYaRutOwuSLgSGK)^mR)N(gY|b56cZUBmg@DE*yJb(1KwG+PmDw|Jk28Nx zc9Z?kiuB?F-=f4pGLvU%W-^3gigu2GwmLJQv%pkC6^&|QWDWH-%}>Wf?@5?Ifla>h zLy1`>pP#L{h5@~zg2$|%&re>pP~maz%luMu%XCL2t4EZPkx`7^=R}V%R6h2Z_F)g+ zGH0fo=f5^A7s~6#$@%GH4`zOX(ig#dK^k_2EIKus5SaN%M@L@4AtBu0f)lm%K0n`m zb%f<^()GxUjAr7wJdJlCYtB@?>^I2{T3hI|+I%pq;+dZpCzAlYd+5)PZf;PhKv!oh zcMUwIj*gC2?Vkh!8tHa=;0RVzDnXbyZVoHZ)0O*c^ZI3t{_gr#KQXX_wQhN8uV@akG|uD zQF4Elxi}uCpT$7B*8s=XmtG)pA;ioUL^9fW>O*)FnTOiRR-U))0j#+T67tKGlnP*#DT;=8W z2V2FSZPSBBkxfj@ASunH!PE6u-xVNayl!a;fU(cbbp`EE-y43l*OyMh)|{cSJ~1(| zXDQOYf*}UC2x&K6$50w=_VWdlp%ZvRYqL{~zMq$eGnmN8$eiXUB=0&BL(6{|Or#zo zOh#~!)$hm1wCs%o7cUn8mcKFFWHSKz84&eyp=DiUTsLMfjskWZ(!eeUl%Mvdy=4_W z^owDrbD^cS2^OsI7obOc>)d?&5+R%b3Q_nAwfO5-XlbbcAg9%agka_6w$X?H-^z^M z9rB8Z&lpgcumkh+p~LMXJkV7JV874)s@#9N!VX5=TEueJCd01_9KBY*6u+BvL$Eh! z{vo`st=TQ1&QKSFQmnD6-WNQzGm?q=!9ilToG64bg@8q;heg0ST>J zcVy?_sIVTS9ZHi_26+$|_olO-2*=07v;t=fpN#C*U^jHtAC_C-0PqJIBV4DIJs96( zP#~)UbOM-lW8pvtuuQ=Z>y3`#BWUr$_$(AF8G_6c2Ax%0JOxBjI?xUs_pr1E%^tms ze=7L3mKfK`Ly!WPZS?nmcx5E(&U0WEA$b70|7#z4^O9f^K@cgzaS)scEPJ0$Vyb~U z8(>-EMlWsq20swR&((W`&(=7{gS#>On(^OOMBfi+QBRTp6P$R`j3@Z~{Q!pBcewmq zC6ZHz_GI{57#8qAJ3f^WO$Izbzr?r~JOR9*Z~Xy`muGFlsra++WyynzYU3kGvhGl- zcss}-`+?c$_3wD)+x|mEpnQQ;v#HR30Y+{boYBB(HybE9dDRc1luC4TbdG?j?7dmu z4Q6mbvo_$y2MnOmhG)*e&Fupi;W2t7wIblcAEBYy+Hy^mn$;+Tw0}ngUx9(BlfN7P zc`3uF=EcjGW1yb(IoRw4gJ29qcuFH;9kJ0+XjF5(x7Xsq8^~(F{!({w5d@6C#O?$I zu3K`TYzJO~dtqLq(>DpojmqGd2#s2xEhi;VvU#5NFaj%TIE42aJRRXCTco5z3U@z0@tF7eLDGD-s|FE>iw-OVgd#z}EFCxn;QZ_b z>e9#c=@1uiDu`b(%(WjtQYf)V0O$mPUf>FKvK0sb$x&5N5j45n3Z`KxG6B+%qpSdt ztFFbfL#(>NJ>0l{0YV3Q_NwOQ%n0}eV+wZ;bBS>x>kb}dDFR~R_dv@UdlIoVR~O&x zcZv{efnluy`5cB;=cXmxt55_d1ituh07Gm2M1fdR?JzA5j7KpLkCD*cSUhlSlAQVW zj^3lB#I)%y+(U@aYi7;QOAdkC4uU&Em=yU2inC_u~=q$O2J=Ig3 zY6o7<>#(qFn1@FvC(y(n0)qy#0$OmKghh-LPC3uz`{$=S^JzdXbpO#sf{3RfG9g5T z5t9a(^&qb__}E;O432xGui7VKhhbrY*`@^2TmpRjMS$dOfbWe0TaYw>JhdUoBT&Xi 
zdD10TSSzIjn)c&nZ<^!Ya-O&RTK34x%d7Sn*ajJ^3oR{b3xE#ub3aW{pi)(AWKs4skFQR64DuZbRMaT%)W&o>*6#Z@k@{~6S^WU%c55QD`@XiDR zQl37U^VoX=!~k*Q@12Ja)mQ@j-i0Cnp z$?`HDMAF?_1`PIDlMH~ANzJir%DX`QUY*fr%ZDYr9E1pvpb(~#hog}NHsam@z9zT| z%*k)x@|^pOLp`;aVw0Ywl_5O^*+Js-{HYj6emiP;?mEfk?+0o&0u=z0EYl#2_RT7jfEZp1#@aN;_ z3*k_+AQVD?;<}axl{Qt=Z~LEa-9ljAsSTKch;Rbce6&Qg{-t>1gb!9Oz`o_|wWn+5 zeTc~AIsm}7o&0#kV|^laSw;lPC~R!(djOs`c-}{#@!35An3Z{&If7R5r;q~efZ+;D z2wT9vGwa$O4_h;PkbT3nvViM*W2d5~hL9r>Yo6lABap9=k^nf>DjFJzC0cAZZzckN zMhNI`<9^;SU6#M6LHUJX@i55s(lVeOhBROaB>*-K&a}mo=m>BVK#14C^2H}6PAJ)# zsd{pBd`xlWO6Pmkc*Ji7aM1XVEFw_1bdKxJ%|Lck0)UsFLE{ugxX$lX6nMcLtgN3W zgMJ?zAPcC&Y<1+w$URWPb6E||L6$)lVp!oGfzky?oe7XcQO}=0H+HOfe(VNx(hmkF(CKrRygcG!emFZkAji~o=?31k(7R}9$?!y>`p9WCM&QscLG zc787Iz5D)Y0!UsX3xJqz;0n>A{X(ZE<|18>7= zXBZgjV9km~9=-r`t6qs^&=QP5O0uS*z7Yam4~NFv4j}bh_i$T=TnRZ#=HhhWVouby z1RfGVC+e^WX$w}rrLC<&I_E|fjj07~(Ef-zd~Jm!florh4q-t|Ehzgoz|71HmgRLH zlTy>$h`|8lm-lwHR}n`|kp9M0*UZ7{lPvC@xA7Mg=*Ue|d&8Fn83s%~VxWq20Zust z4OVLejR>`n1vXI05oZFhjQ!?evi=Cxwvrnf1vGI|Fwh%mTFxem;cateGYf1n0w`9<@G8lpldHR9LJGPBnU`!y199VPXAGFyJX*XlQ znX9aWTSnbrk*xrvDg9DY4X`_b)rlF(E+7%UTNeS`7@m00>Cp&EB)`QQyK{B!VEIkff%84O8YfygkW!Mn> z{`PF495OxJBOb>Rb=V9cl)4R%V#6vimqir9&I5udtX^eMt3bdb!weMppTLkU*)OT7 zxdj^z(Moryig>43Uq!?=yo7B5SU-5s{rTGD`yWea?7iyPa)I=251SKFROLeP9QHp5 zErj=qLqtV>^{O%qD-?{6fY>#rlQ=&=5Bdov^5Ga**AVYz*-8xyi|h^OQnMDPHI8}6 z@W7OR35)T6S>iYFkn5C;nf9fSrzPZTWmuo=uUYqDJ@qURCe^l9gkz_tr{Dj0Up4w2 zo_v3mb!grnf5SoR-JiWsJoHSK`+#*)ZYlNIQn&z&ZAsK5@NZUUsy0kwr=e_p&!101 zO4_{96ycu9p_M)W1&um`-zpfqz=IlGJ&Hi7_Kh))HsIfJzEKJ;`AR~>9?uW)=cVGN z$Va(s*~~iFAoHrak2te{J<=nn9=yE0|A2uD1rw7Nq(=3>MirK-ci-@>9n3?V7C=JT zko4W{GHfK12%43M>C&MC&*kLSbfR6*&CFUaw%rU55U7(Es9|L=y6$UR0B#sVqwXyK17v!k0Ip~Hp=NPmJ zu)K#sX@it!^76rU3$yV1$MArbuR#2Zjg3VzHx(X8#=L_lr6~Gd$T=v37XRA_F%wL}ZfGd$1D%B8(xuOk6%h^- zG#lLCFzBWpL80F-! 
zL_u@`bNck;2o|)RfPxITh};i0hI<4Z)w79fD!B3B;{LYebAJe-Y%r7cc82}mypsdW>&J3tfbyUBSS)(rS@xa|SI zp+Nz=Xgx5);M1E`X7cL*lQIls*D4|L7B-)1lQAg~e(@3UGi+ew1sgpWKEyN_){qp5 zZXsp12QWwog}TyCS9bpc?0+?3TS6n_F9ni4Hpu#oyM4Ui%1u*nYZmeE0b@Nd=lBe@ z7q{L=N-gQ*_xDL4g@fz)7Y}Vpp8IHE=jwfYiMV``F>c;1^!D`~FEx`vLG!2wDs{w5 z2S`5y4W3i%x#Hk2@=Y5opWU{;iz@}(SVr`4UtdafrZei%Pl$~PMh`<>!C+s z4f~P^CYf~Pdsx``#)m7ANP|3FLOlA4$-f5}U{IUD-feDD8nWZlwXx2LRu<(nUNW+1 zUn&QX4T;?c@5v6VT+df z3>nRD_#*M9{5a%DzJ&k?KpH8pGD?!e&;9-3;A~_7h1lWgP*En77;tvd{liGjS)FQ% z3k5E03`F{eEwBF=j>yl?=ij-5Y@c%f?RcDr)GBKv3(95FK7+O-zhC8tu-l-!=tv1BA=u<L!%GrYCU>Tk5$SVOe5(^K~1z-VTk^(>-E3${$JM%8I}Lf50@?^af+y7aOU-i zdN3s&9{oyOtIB8Rfdi-AJ9NSFqD&@hBx-ajr;5DPdd<9cQa)7*Lml-Jb0HQ?*)%c* z1sPV!ituKPc)KIVS}6(@#g6fhNRF_wdapCQQfd_}TIiU%T)wtJut@Df<<+gDXyN`N z0LvLG19efMPm0nD5^q%j!^S&QYh0KSsQKZ2e@=7Z-%OcrEH$)H+OmV{J5$Zwr`GXI zUL*!_Yv=fj{N^0LS=_MRbq8t%t%x)}!25HX6sMZyFlFhrm`!t2_lhIMRJ!_E5RL4D z`VJE|DQYFCAnxXxHP#a5s6SbyCLwe4kzDJoR_gF7+s(xx4GXF-0liJN;}rj{X1Ckm zNd=LSH%!k4DZ+3BCs8wo*Z5~e znM`H$qod>ar~|QpL%eFW30)d$)HP;_%#?+efi%x}#Rm#__I@u98F3E80}f>!bW9$^ zrH{4BE((tDw4W|uSf#ZHV423g$6FLPmm*K zW(#Zct%l6BsnDRC`h>1Jj6!1ubZBe|Bi>ifWdRD3jgX$>T}oICRkz+)>O56Im3|UewbVuO=~tUC34LDA3*voc_|s!r&@& zU3sqYx)CoPxdb8en2YCk!r!df@i#aA1@MG^NAJ(lwI<2>J^6)NYQ$X)R{3Et$enqM zoa6}O;*&mUP0%>{{&jS0boQ6ty6}O+-{A%ser0q8=qN^XZ1ELz>@{9hJVPvAinbRF zBNFQcFvFH2YrjaqJt^M4Sm~1ZqK2`k|8j~k#gun@F}x?XB@j(oyi6?L|D$ynjh2~b zFgJU8U|DNg?;PE z;Cv?F|3kCPA6tlFg^ngF(hJSd=I6G2r_~i}bNObaYIQqBe872e;*T!IMGn@!nlO2E zY*z}}lLvnQC(P&rH`^2i!_p=?Bp1OQki%oC?T@G z>!}J$@|W10`fSMCa}rZanYtMZFv!Uq8U{2{&<1{Pw4<-rY>Md_ziXa|QrC8TJ9bZ* zlT}yP3_<^PW?g=1a9nnMC@s<2ih|nlk%=*<#;^nU!RoMrh5H zBW0F#qJU^owFCc6qK_x@_JSYzaD7Ig9hGh0(E!=6A>+Dw71}D{xVz~2b%Tye>XN@{ zm-^>}Z}$|=(DnaLL% zp<|g#bt?%1OH%XU$=&yb+D}D3d70r3QWVZV!?=^WbBYS!?INp8DmlCneoO(_Kf zcc38>^0(pRd`|zMvkwNS9n22K-lnBhHo&&Sraf$zcF#5H?8{#?ZT5!$HIX-3 zCX_rL2Nd`Q{G=8Asp?-Z{*E+Ig8Y^rQ-!n#h4(elS=7zV_x1JlVEi72goFgSOhI^k zKk$ZIU&u#J$1;V2aM~yI0HR`I?c?Kg 
z+SdczrWd>84>ucOJtzXNiGrnjh7gEgzU_>pyo~C5xzy+|I7jrFNU)W#6UiHG9alW= zi{$}YrRW)jHyT?MWM;Oqxste6E6t_{w3Sqk#EI(99!k>($KdA&?AN6XG|uaW|joIcLv?*W#Go z^|kn2zm;MtUK0K~Y1^SQ7hy$^qBpAHIR82;9Ie>FWj;?m)(`u%}L$vWh05A`NxB^P>qQ+SLXv z5;9^&K4`arNRXv)D%e>%&~Psv^o6H+==~icVtpF_FZ!NZJiuM1F)gT=)Cs%BE==H2y!t@#t$f-5S z=Otp$L|G#ec@v4!Q+lI0$4dKq(iS+@s=QQD%vJqGzr_7OLu9HZ$|J)o`_i#C;jtX+ zYx4=!+rc$xzHFPg5(ed{xn^iBp|?6(@YZRh-enpCD}sj-Jvn`MWsJ|SBm+%)eq8U5 zFEqh=M8#ngdaNtvz=uk?(EVAqC65^>rev^M$+a)bu~;hiLf+pVpB|>lTlYaDUJZxX z(b>>XrJ&Jox_m``ov;)B&GErhzVI$mbEpu_Mc;0Bl+EH{6$BCSpWv<#T=YnRo&(Ns-@{&n+e|BwuDJn#B zh1XP1e15`6gUvD6Bm^$!LuwhxuWCkj!%#Cly$!m$%Ao_vuL&9QWax3(zQoVV2aE0v zNY7tL^Yx%{R@UwA7VoM~&nH|0C0-L<$V93Vd4I=64Lwyj-il7G@Hv$_sQkc9pTb>a znQ*%(*ldCM1Cg5tNpEn6XM>AhEX`M_{kUapJHTZ_31@{In6T988z9S(hLAdBG5Z+@$|FtPmG#&uB{9hGSX{)m=4z?TQziNbaQb z=GQwbjUlqqFpTNU;kF&w-8V+n1;M|vC`l`v$R{$no~x#3(-q%tC9!lJZGY8c3b)z9 zyWVHa@ydDW(BZD9@FA^W++@szkrVGnx^ys^7fKal%d;cluOQD6y%}xn zhO=varT3w{0(c%6Y7@<~#E!+dy$hSAk_SqBo_z2`-Y@!Sp#df_Fzsw3=@trEhkC9D zBJ}yswHK50?#&i5<ou63q@Kg<1^)SL4uMP2IJqsgRllf}1w zPuJh;$_ABEoQ!=)RXR#yEq}2tquOB`?UxMJ7qahv-G6(tsH5sT?CY`%F+Scxxt6Ao zahBk}YsRPZto(jBSXxWyjdn-I1?(+C`EQZMqKkKVh_>=UBduPYR|J{`R3 zOm0efI3dGgRVq!6>B|h*>Z_$>++V)q5tq|F{_E?yMTS1#r4EDt5TgBzEb#b28K5qm z=x8w-|7J|#%T+2nk3h* zZM!e(;Jq{vd#LR;LrEi===x;`=r-kjuA0eZ@yURT3Q4HHts0JG*x52jzy)ZS<` z627?q1U?WMAWj2fiJXoY1h_yfL7;-*Eciyyi=-c{;h!*LzyvvCK0!eksU&75(RsWO z-@ZOkJw3hh%F2$J87y^m^`W8E7MqT>PCsPyAP`dRAv5b@Xbyy-Cnfdu_kaCsRRlbo z64wZv(~p`OCW@GqT3Xuri5!V{95CSF*#CsGbd?IdU>v%aKK!}|vZ;eBD+J2Q$^@L> zbq-{K_3vV;l_;4XfG2dsSF2Ggmuj@n&c<=MUj5nG5&8c8d+j-eVm@yO9#iELD?5AZ z30G8PB&*{A4j&&M3>=)@?U^AXo>c09EIDRa05T)5+;J0D&!=W-)NRPyN9gG2GNe=3RpB+VghRBt(oI#ejX?}gQ+_}2SL+Q-r@$kKNrI+4tU@TiKsu-qNI0WyN*|3Vh zxCx_1{Wl`Yip$H(t-sSn&R=>51`uRqMvI@@Qd3js78dXb^?+@0{F6HPDix7XQBhIa zT%MnAEH&7%?~r36EG;cnx@fjEH-EFSiI`n)u-~7dPDx!<$dlf37NgCP7aLAwa`1pN z9Ehfqh0JXgGB+G4VPROX_v3qP!NI{(Y`sfM$*{rVp0!ghAY zvs6*hL_*&afh`%`s&8!kD@jqLRFX&qPwM@h1oEtAdU$wvb_s)5FfaxZ$${83r3_kC 
z;2`3YlHPq@pB;@t#R+>_YIWxr-I`lm+(GkUL`n_WZn8%0QM{*&(GuL?#}k<;g&N_-^j?<&(Cip5WPDHo8_NJgg_qA z0wRt^^LIJ2J!hB72=Sau$HN8ovMh-h3d3yw?jRQz`IQR&UQw{Kc5GjZl%}1-KWn!( zbM}GcLK5OKh+_av;H|ONys)|&Xk&$u6LFYI42uzXq9dH0Ju0v!*czC$Q&et+6 z%{;Q<>FKXp7(wrso1GaJu{xa}@8k*;rwYGH7{>>KnRo%ez{G?=mTb{9%&S*>pUNN( zwi6GkfHEzv6nM_=z_$$YU76{wCx5&~2CyJU%Erbvy48Hz^KPCLZ6H#IzgY{M=Q>k# ze0<+{mWV@aUtn0+EL$Eh2jHm=`!g{)G2ZRjd3kn!r#`2^74Gita>gkXt5jHAD#bct zWPSrlzyB<+qGI$2?VH3}xU!3cN0PJTyB3m+cb~R)4^#BaA zKJd?u4ItF4^EXQ6(&XSvtV}4#;4i;WQ-^?&|6b@Ux40_ryr-lrDkiDKW3dX(%OevC z#H5q&;@=gKmq+gn#UCuZe|mcA>h9iZNCuv{izM~U$w?$f9v9+qkTh8E1;)E~61=>; zGuYpCbO=9x{>;7;P4QJcZ??v0I59pk5f4mD3%N#Zc)rTmf#9H{v-4a-I)YfpJ{HD& zEd2#Q3|N1a*BPi_Z4skBet!FIY;bUJ#_AKh_`E^#a0H;|nzDPR)P(#U#)b}nw6 z31Y;BQrI0TxwR|pQqXRLfI)43G^mkmhjM8XJ$uZ%vjP$(7WpF`blCs_|bXQK%d1IrSQj*b@8(a}*lOPq}7 zadmYC8!P!0WHJT0#`dB+z5@S~Q?jSh4S3=9xw%L75S zixLwbzog33UAn45Q&ZD{G)u8aNh-Dvef_O-HC|to0LB;X*5n;qfK>U~Z5}f_^}3zD z!1Lch_7yA$$6u*v>uhU~T>utx*mj}v1zpR_n#Pq%^Oc4nm=$QmWMMO=9v5uf&SwUx zaWD!X*flilxy@*5Xc(CML~|hoOK~~db=ZpHVs|=P8fWXBn2sV{3&{+Z15N}^s>O%R?S>XO62Rq|8D?`X6B-(tzq4gwF)=Y95@mji07j?L=zN|# z@+l%cJ>5d$p&9HrI~Vm^N~P#ofBt5pYBkH5NS81dYXN@#v*u6n0Riw1f2YPx6~tqy zOw;py$m@a65yRny2pgX|Xlt)p1eGNvCd$`YX{KooRh5*KoG~Kv@bGxg z9k$O;My0Ia2b5Z)rSV#bGrAZxpwzDjeyO5i*GD5wPv`YdN}GVgp= z+Y|SJyT89bnZqgc=H|wt3~G3I7=-ptklS!@a1@u6RJrhi`9YY4{ia zzzrL;)pXqYk(PE$XAAMP5;6l25+X>Y5>Kuct>1A3Q(3n?)_$r2l_2u!|Hvx;^>%Nx z|4v@O>+tUi$kQfm9}`MiiuVhrO_rD@G{9Tv4$#RC& zzk8lyZBc3Kq@8zR6l>v0?zCZ_tKJeP@F=X3?EbbPsAo>M>CR*?<72GQO>T(@Oh|Y+ z7JkPYB0$Zec5&c$K4H~dS6U5;aGV2whW=I*qlon?MhJI2nh?iGCDf<0e3c?(vYH0r@G_!Klzg&s|!2TZtUZdEREGGMFwZDD^*Mv`XOFSlfVA>Zr7L6$<2Nz(5l_zdx* z_tZOyN{Hf=6jKF(S>kvWM1oq~=;h8F*z$ZCjCuaQ5U07!*Y#c|x^nETnh}3a>7g}U z;Pkk7ZFpn6TEQdqfn%YFd|&HBoL_)NM|*`EH&J|LO)@Li^#G$d;>ezE;v zvz{vQ>dfSV(Gp)fOk7Jw3hY@~gtvoo6%kKd-f*nOqcCr9`-^TW#NEefq=gceNTOXj^BGU`{OWMdqlYnD}Qd8ykA$p?g-`a3%+r#af|pRvzp6n~UQq z|0W#rWZWc|kg-n5q`i05U z26LjZW!%yGihQw?5j+vkWKqN*kzwEJBS>goPqRU|P#TNAOCBR!?^ 
zxJUVRrsPsuT6J%W`_x8lFu%l|AL}9ZELDeumO>s3pw36|WnNscv3daC5Wy1Ez8{ZN z3;saPl-b2oF6CFNFPW}o4KO|2F^WBKRR%*!;vO79d;~%lXh8^YaIO=P3S%#`KSdd7 zVZtTl8Q&B(w3<@(Ti4`9yP^A_HRQC8q$#xRG`;zVL#GF``dA+S15PcPWRZ5OPO3Pb z$FI8$-=HcCeh@BndlW9Lr(12}+nD%6srT7VT)~jZEf(}sW8Ti6$eW#}_V;GAi9Y++ z<(Ydnz7O@3%E}Z{lCm%6W^){(){{>V$L%aK%OlHV`!zEjMbu zE|-g<<*T~$=3Zf;|0RW9jJWhQxxm753SqjzJekVO zr^U<9F4r(3G}*fZ^fW?Id;+&!TPU8EJZJS%6A~8V$?SO@C8hW}LM3%v6*6M5Q;mlo z(jlB0!86BH!09JbsEG#bs9^IgmJ8ej@-Bn;RPsL)^_4bA3Z*TB)i%d#**~ih?@yV| zJNB>I-cQWvN2+gkK^q42xZP>Jy7JZ{0kv6f&ExFC}&aMQ}fYxp! z=!56La>D@e+~Uqy`sd*aT4hk}=dAXBr7&Kwry4#97$jdEiI7-aKfzrK*pxE!>4q(` zw$&3F+1v0p!#w897z^V;xsHx=pEOBG@#(Q+rz7LTg?--{d!{B5ed{+w@9hl{ccYOmb7GPQrkCx$nmg=D(OcqFKL z>CF#}XNGkHcq@yGu3A(fI2ek<-`5kNAVb9`D_QBSE;T{Gb!zxV1dq*vT?}fDY*KAFVFmRKw;AU96_ea z^_`Z3CUnCt2%74mth-}cAV2qKu!8yS?Fv%y{exiHK;3+W9(Iheg@J~G`w)h#o#`=+ z#r0jXuS1dC0Kb=WzxWiz?M-mrVq0=SxsyZZQ$X4$VHmrIld|_nqrWv#BLX|b0J{Gu z_;%Rp9ko16+n=bO!kiNE>;=cnb9Q}VJIDFH8$h->ik}nl=H{!+nwt1>eb$_k3x?mnpE~8JoC5eU zJmcaGkaMUit&Q)Ou4qK?BK3|UEM}9(&DrHA`IVC1LX<8{6Y-dNg{Zvfd`bIfE6qlV zOL5~Sbnue?*O)@=XWsaj?v0j88kft?dyYR>Gp>E4wSWRQ+VXBvzLJ82Dkt|*&ysg( z%So`qU$L8Ut@sg%ZZ+(z4Xoik{kh=36+TF9|1T{J@D=}W#|EOR zJ~Aq5zQ!CIuw~G{hlfJ|SIAYzqEetZQKVd^u?&>QryxH+xYE*6u#Vxm49+VrW2bytE>s}@ncycsDGylqW14!pHhID5+S+C>3AYf=0mO3G6FB) z$y6HcKj(P|{d9_owun80L?1qJ9EL%;+=N_84=n1WWItdHS#zxxKZ z-J*BJc?VyqSj8(SXi#F6jFU6jYPqq(dX*ar6xc(_YRqAL_OYL11`A(ur>J#c_<9VBZ67uyNBF> zfTRM3W`DGlfP&9%AYnjFO$}MA>(#-y1TkWQ0MgaLd?(;GE2^sv4i{>s^5uTDxL*Iw zaU(*KashSS^>N!XV1%$CC0s)T8(2ESFE~`@8>Bz=^{lOTe<{H-UhslFP^mF{3v33q zr3`KTvc~CHtI_c=k*VNP3>lA62eK2T8uL2WZT5TUD=RCYs>b~B^mxwzCXVX!} ziZ3D}fT5B;cA;XTr z?++^A=9U(CWaMGTA>cj0ROEr(DHF_s;(c$afq{sKh&)ET_3=u*Clo&@Az_{}OvwM4 zmxie*4{j{{@#`0K5{vbAoVG{%%1Ub6(;XrU z3k$RLibXY$@m{H$nNa|~scUeM60m%-0AoUM#jN7nw`h%RTh2&=PJ;n(dcwMQlef5#40@i zHy4MC%4LI}5F)U_W(I>UkxKhL`K9G$Hv3&Pz-c_(?xay^Hdg=sR0#!msLinq0m(#0 zcwk#t)M|pj^iR*v85~Xc{oNoR)oK4}T#7JOpg_jY&yOHTjCcaD2?HISKTkSER9rmM 
z^ZBXy`RN`U%j8@}z-%<|3gs{K^pONyF6DMR(oo#Dn;2MBD!^WTMSOH+lL_?#gdNfc z?qr*d9$hOdD{us8!nWtzQjHG#u^X^3fVVdu&lIxX9czCbvDXVC(B5Jl4MZvdX6Agp zJq~8Fm-DvlpMm}Y_Ng0KHGo1rYwOXw%l-LkQw1*>gBOtH0$~i;pkRh9t>FMBCnqPX z`&~8U;F|RSW@>xfzG7jidyg@%RHF6<9bG;hL$&W9_%%Fy1`q(cdV0>b;lIfjKx6`H7>A9%6Z91Q?(Ovh=b~?BHUNC&2Jk~b z6QKhe`5L;(ZU@Ev?qaQO)nm|c2&e~u5uOCD$Up5`Fm0yVU`ql(ij;`x1%L*?SBZm& z7=oYp9T~tkk`)Mz_QxwJK#v5_lfdoH-r{mep@{4(4M@yeFiHZCC-;e2Z6+o^&g|9U z;@*4}+4%VQd}s@bx}Kkp&v=cwGJu=cyx=(ysE2T%p^}(JfF;3^;4rkpOX=1GZ`?i3K%8SpVR}8_fk18(XB#YWZe2 zE3A^F643SkMt1DKf>`N*+CNxrOT%R_(y_7{8Hk~@adO&O^?YV>G|5DRxB;3U>W3b% zXF%&28XCgk@!$jw;ckgN7a9^`IF`;=tl5P1!)e9Y54i36!GXnQA5eDpfprrB4LmQfLDZaa->+^oTYg$CoRwSrJAbiC4XyAj3%WwKaH9fVDj3n*rAF0Uihy}Z1>ySmzeFmBNc z$wn4x%nQ`(O5YF5GO$ckJO-UIP5nagzscsxI^i&3{k z!}n8S5GP?^VPDdMOa$L)ppyHKYVU(ZuK6I?7*x*=!+l{@NJ$E81EX4NL4-MXT(DW zmYM-R0fq#|F$BU$CrCs-MGQj&CHm@UdGKJqs-nCc63QV-0!W;2IT(q$1OR3OFqQ*m z3tS(76dESxrvQsDOy)De>FI<3NTosO>hwk4IILSGrJ@=FOMEE~GK*9nKfg(kSfIRT zh58W|h6(~u>`GUh{aVK>Fwshv1~?>~-<$nWTWh?pKQ;mn)oiq1J6Y?T4+Ru_AP6Mm zm4*@@1(jG2<|=+c#{E||#$Ys*4m_)@`z(at6P(~?B1lxH3%^3<3=M}O3tII4IkOTq zl7M0Y#RBx)Slh-E^7T4MBa(YkHC$;tX?r6mYhpWV(#4){95>-#M)FOS3J!Z?{LB~UaC zNluFCYWQDw&Q+UYfKwwOB_&;~xL);4+Zv2hXtX!|D0t=K@9&@PdVTZ&LLij=!Cb`c z`3?{o={|~(o-u$1jMUG}%vjde*4$4w`d~bXAS`Nt>+Oq+>ge;zzdJg9epbl+1}tnK zj%F0Z{nL{Zoq_1jAo{+B~a<8)T;eKgb^n;*8wuQF2GSL zje0bY5HMRU`6nf*hqi3;b3!sE5iqA5uuC8qVn|3xF6R}K*OFKH6263ClHF+-0v7cc#0F8@ejoO+_ByrG~ZxH zCM_*}b9*aSWh4c{`=`YjC)atcEv3-dN33MPnK0RJ*iy4GugT(1Q=l5`s zdI2rHCz3=WU$u%9q@O(~Qkx7QpQwa^f%$WIC}m~K%3=Dlg($sID zqXLq~f0t$@6fhtD z3>|>opt0aTv^Di-i{FK9ShFnKWpM1K)GHdvJRqM&_)=ivkz^!{uEGXNjK3Ah+) z_2m~3I6XaWUtRt1qt*Q$q%N2VL~6f%K#~AKHhgv%UXW-d(CNeeD@l00(eQ2UC8Yl0 zaHLwK1Go)hM?c4Lv%YtqH#Z2EVWpZ)wi`XjK+BB>C=Lya6b?UFhI?z-CJa)0us7*I z3BS3&hxo1I@yhVeSBRe?s9_*}2j(0Pav6Z`Ad7nMSzvu&S z^sig~3+A~N=|C$RB&!33+vgVhvKL??Ai=<8{YGeGW5deITK@f1Z=ufG8qCs5lwMub z5`YqfJix~ffshB_V9aMrC&Jru0WE{eX8Q?v*UNN32NbJT`I=1^;sQX^S^;DUz*STy 
zaO94~;Fv%Y$V*^79|cK+KmV`n0gJ5x|EfVe9}&iXrIh}mp}7Djz?n~$sI!1TrvGp5 z01_8VHQ1z+S@}RS1jw~v@j`m$*5FV1LFWymrp0Hs?~CnIQ&Xe2Ug3Oty0-=&Rsh5d zND2HJ>~=sU*$+voEf{gOOdHbg1S&f^lQBZoOu=3MprDPS zvQ~gLZ=o1Lq5u-kwZDJOmVOI*CmX?+v6XF)7H|GV_Eohy2_m7kgK~)gapcC!`X*3%_ zPf6j!jsx(NN_}ynn8ZYJVc{On)GAP2!6M=C+S@aN7zJSVm9_P;^29G|U0pBG8FS;_ zRfO^8O*^)>Tf}kPPtXVR9+EME_QoC*p_jQ7zf3M>1r#NSO13bf7iS=IOy%%m|P_ekg_{l??^1Y9&}T zYEq!I1`!DMxI{)=d>uFwz=DnKZKE;Q0Og&B+w&e2qFK-gASxxr^nH;^D+N-xJvw>Z zw~&&MyfBaw#JD5tFZKe|1I_-Wr^iI2gSq3uobCaw_Y0{M_88EUg3A$X0dZnrF~Ru0 z-(MlcC>j_Tgr}y4@@Mp23Or(+QSP&>5K&fGW8U^159RIUjpkHtj0wYyexcYtsjU>xp?2sL%Z>3o zX=ty3&YcmZZ*A1puJ?=mMFz~PW23!~>~0AvQqr>q zX!vI|o)5LZKF(sikd1K-N_}M4@ic(nm&t4HxDjfPQCN1hCy&-AavCr3z|$-9MRQvt z2F+WW>$ik21*{ZGg;?%tX!;|Z_gSMzUEBSe`wCD2^Lbl>` z*T5IJ37H#mDYDHK8uh5AbV+R3m&Q9;B}51{u5mUy@>~7W9)+wmKrQ{y@+;*@+27O1 zSw=Ijb#yxXT!Zz)P=&zrH%{WLF;5nWRiV}>@wuaJs%593>iTy}l_o>!bSXeGvR6ruOXZC|VVik?daTr77|FQ%?7Eyj+AlGsCD#jP5K9f!?* z=CddxXaL-gHG9$BVmBw$3ZC}39)695{U}&@QSo)d3kbG)& zc!D9R%%r}_ot&DCD}6GIhr#mc`(N0wtc^Y{18$KVv`SY*R~pVgay&n_M?PP9tFM{m zqzTg(MOX70Ni_~U!+P*8D{0G<=&)1rPo}XlB&aaRG&-|nW~w7=e63iLnAw&M@2i_) zQQ4XdpdNX=-ODQ&daoF2nW9E}64unXlcT9)HOoD;C3TcsWYaYfE?Jc) zmIiM&*Cr14X4QF#I;Wv7>Zyb@a>5iTDtCB8Buv)V^9ui7fu2%tD0HnlAyF`{U^1TB zm8u51Epswj!Jt`(T-8AY-)nVEJ-T4O=bhD2v~)e6VB%4OcvK|g9>e97XXQH>(1(kYZT@6d9Rce;KcmBiUg`7m9qgy?W z=ejPl@3aoDgI!B6q1Kxmct?lEaucVw=$f#G?nKZgRz9cc!W`U%deFR|XX#bCx@~!& z2}2}Wd;PzxKpP{zl0}M}h!tF^_eor!x!%exuoRC;@fIi_sp=bE!aUpd%si}xwz{`B46y)sgyM_j6#DCuiTCqh$F3krjZ zr%U^hR)V%JJFG76B|Bz{$g*5qxK5R7zENyGH~;K`xhC45qfLh;HL17LL7#lkyxA^N znuE%xn6O^R62uZcGoI$j*_~^QUh$0h8?UlHL|vf>l~2py;?1 z$+GzI^1Tl&d61K;+uqA_9`eZB>#0S-9`eu16&gy!;eR+n3a)t!70)AOxW@7@JQ#kh zlkS{yozRHTEW56)lv_;2skWZLez-JGwygj41$lh2&bF<;H`?>bDwlY*vZp4HQfP>( z!sV~?T(+<$v-NvPCXBHqwl41ixPsGRH129SC?iGlZLtc^ytB){gqfi z_Rb%xbp5XlJAz#=Zl0hI(n|FbVK75U{BQJ`+N_c#w95vj_RlHTTy8pKI2t1?cfOnt zmcIyIc8v9RXTYtwlDwjMb1B7a8M;=&oRdRDITZ;XKN&+>!((7m3HM=!CaAmfqY(E^ 
zz^XCxRu`enib+!xs!~~w`dhA{32DwgC+Mp+Hmeu=&#ED_8P+)0PV0kllxHc=#R9Mv zJGPV?3)XHNr}Sx~A0*D#8(htl6XVlgmX}HFlXkPMVe5_IoI~HD^o14wW!Q}q&}gWP zlCD5Bop1iYp`VtRQWEfIcG}qF%(-gkLaPNQrU*MEXk}fz7>~Z=b(408QR+;(^rnZ! zUj<`x|1g=1YI8PdL1HZ*$h|M)8fpg5B>%XOm!g|vG7Mq|%z@8uhu3t)RDXu$Sr|Lw8{|P*GX|N$`zl5_E zxy>2zKvh|pEBxX_+oA8rp++bp-V`3P>hx#2@Kb^734`_5iCBYwPKxK>OHOA>*N0g# zO|?j;;K)S65~jG)_17zTbkd$~e~cX3*58&6!Fd?YV_lu=zRK!n@1-BH);TOOa#LDe z$5n3oRK<>jcS;JETcW=h>Col&^l@c`wCj;OzL4pzoN|po-Ss()#`x$@nnRR|V;%LR z-xQrBRYy9(`vmEo2<5#d8@jZ5qbt$;vSPi18Jr1J8RJd=H^Iy3f@(|fN)IPo5)YLu zn=_Y5?fCgwT%w68}y#iY271~$~Uf02q+?jUPxb_*n(6&H8%%V z5mjwi*K;t;bw!R>mqxKk{H~&tzP^s92iz0t|YKazVy-%2E{T%XcW@GSB zj5XoYiF%q#{q^co@czU}(pcnn3s!~xFzXDdlc8Q6f=ej#bHtYrH)&H{uZ9V9iIiys z`4jcph-#&ve9PQXBG`pq8E^8>=ckNHE}Enok93yeiw4}c_VL{`d*UX&-DM4^=mD-Xkw}vK$tvaIZ*GV|!>FzT)2rQfb%UKkeo_zX;x-*Tu~4Q!0K*p|;Ae5%WH!Yte4fA1?| z*Y88zcdEJ=+GfCY`hsq|F+NAdr0W-Edmf@xU3M>sdhobRwUU`G1?Tk09g<&G;dJ=n zCM9TdJaPalG$#LnJVHodDB@7VdghlBGs5rWb!)|(!gGuOi*@RsC{I0Q&%Q+GBxh`w zLPpm~!CSvS+kazzMPJ`wNg(l3pKp)%a{Ku73GTy5JNR;pK_?%UklXQ;%9mGs^?7HW z-4i;Gz<@cpfXtmOQ7!1olA=xJW17J74#vlyZ1r2pD8BFX1AV`n z7QlDq9Tqsnqg8@y1Y zATiNnt2{&)cBy+R^TfF-wW?S4AdaBk{8!RyfL85Bcsx2e;UxcoVtHveuq`G2a(~fq zm-jz<0bmjhE{6&m?gr@{-TNLCP?4=`4*}0ig>VUFYCxuG_XOWQqxYjY)@1w zh`%vjFC&}uAgth_`5Us|gaTd4gpb#0XTo9=9d7&o+S)Zq{Db4M$D<=}!wtt_N5bqg zP36Y*B9cn^?r9)RLv9-AiAwyWcSNE;s|Qn*vUKiJL;EReXDS26z1B`IQ3~$Yd8W;o$hChBx#a2N5CiTD|?St(e3cU6XiSdW;=?+cKxIMSDQT!r67Y(!_t@Rt+k5VLHNsI`uphFa9f_8EKt$dd@UAZZtIaw={@e`vf?183X3c`8 zx=4+gWzme>qIQwPG&Mry`+C`ALWwKcdBUXdrTcqm3;`9drA?#c(eq7b!6UYvtzq0_ zu4~uPq{zsZfRhII|9chN0FNXfBh%WsPNUUK?(_&VyjeBMEdZ`~Y0N5p{kjIU@^AL`-rx$7 zhM`HKFu0xdos&RALu0d+5?uK@GKWtLDJv_RSo9AFkaKwlGpq{;bckNEXf#F0zz}7L zQB8n8L;=sy#KN}ml`aFC7xYv_4Kk9vCNhByylBbJ&NiySee-5KdmkAIDbc=DSk_Zg zl98+^JRt%1#*LJL9DPoh^B7QNgn)E6H2>Q(Jskrcip73^M}w$9U{iHJwGA7d-v{Op#~8sK(;Y<4h}+L1fOd1 znBU;y1^^Yvq-)~=P+1$Ge@K53>E*BJ*w}S&Uo@RuymTMjLPZ6lAvha7;4NdbaWq2I zQj?I59Kclu03KcsaM-pyZvcNq^B!O9?v)|$PgSP$%1U(5vcRXPhzaTZ0__qxVq2z( 
zzCeBkYK zZRtYwW?H~yG(iJ2 z@~nv}&~w3U=D}qwzWaK|Lp&hqpb&)lot%8T0DP|4mV&(e7!c3E-L*g9`ei(md2A#V zNguLnf<3$*q~Qw>Y8hx9Mmva}w`U}lhK54W>8S5nf0z5fSZ<{_tz^6y1H5965V2g< zU;p6X>{~Z(Mq?-e zmpF8~dbJ5(UT!YE0TG`%csdgD4`ycOT+msASt#*dg_VP266g+kef&RbEA@L(@SqO= z{Ndf%*VZAn2NaZX!G=chKeqOZABn8-1v<4+stg<;}dYLqn1QOOV@J!Bb*m-0|Vqef*oS z-aTjtJ{O?3!xZWQbob#F*a307W)t>PIbol?h6YsxHc<6aD_pJ*7s(>t85kMSE74?_ zhU>(V#7ATI5y!p**DWW&4b0#fsWG>Izr@PM_NL)Q2G(MowX%v9p@gO_bf*5NK1ef^ z@aC&Z(LrP&%=Ih0h$U>NLZqdp2^Uhcq6Iw_^!2aNZEbB$z~!|b$ot^1K=+$Lv&YZv zj_hV$EMgWyMwrR7m#Y3tKioxW2Vd$r^ZB9R<3wTiMJ6PO0eJ-jj0&sg+zxmk$g9No zDnK(O&q4lal!4^zMI~s(n~!MNk7z(-3++hMUK|(^iwzWGg#t?;rU&;SzedmAdIfF? zi(3LaKx=s6mAl%92&i{l@GSYSUxluLUKP9a0|u)xL`7??%2-B?5RJS99~>r7lfa54 z0OtCL4kCi)h@6Wn^1#fNmX?5}AF={k<>s?ra_YdTqBVF~eWa$T|9+Y&zgQ??#rqO* zc?(eJ=+5ondR@$l7EsuhGldYr5gyFZ<0=Xe0$uIz>bp zNogqsr9)amN*YAEMH)mUq+7Z{x=WBd07hn=ldWitojCYxY3l?;kIGd)Z`MKzv4eOO!okI&?rDG<>N?u z%6+c957d1z9&;5i0DA9{lOR=0&wnfM{U@|_vmlp=U4WU5P2F-9{X`}3-#QOfZvNXS z0NKi>;N)Tx_vIrrzWw{2%sA{p7Frkv?ycMgFI10QRoG+DR?(E}T=}v8NmYIjdIgIw z{(ozpmKeN>0kD<-e(K)}w7dmzr1kk5-Gl$?x}{qy)PDvOsHJVwbl^5$Oo3oIf&hoi zLEq#j4PETsAO*P6Qj0%)JOJv0aw||I#weVUSi!7==EEmi(eR@xB%M&(5ks?d1WdX2g`odn^7tn(q(}NG%qqq8j4*oT>m|)odGLuJjg@xfZP8JC{ z8-;~z3<5(aP;~9AgC*Q(SpsEjohYS@Dw+Q7WN$AGW$18i(0rCEAtz)`;+ZljJP;NE z75p1&FDrSz#2}9EmGAyru8&|!|=7$^Kd{6nVMH1}qa@05o6t0ZQ5fm6<1IO~zKv%p4H5veCOYGuE zt-3!t4C+~Mb5PQM*P7j$TmPTf4$;8PKr60w;was}lTiErzN1XT7;k!A;h)Nz@dIyR z281~zGBOtZA@rKVMpxy?*d5Wdj36Xoc3!{CAvqP0gc})eK4jLtEW% zc^~!#i`KKH`oS9}y2gs)9r%STo4bx^{Zgas!h)x~M*iUVIXWv+Mv8|*7ow_%$eT4elN(5Ji_|7}GjU}NxoQ|_DMxUrI)R_hx~Ey%|y2U%?|4Y16&p!TktEh2rh<%m7}ErX8x<1pP7XJG!r~F61p(^ zlli8NyRHG{kQwaC%wBZ^e>(1Gg~#Gx!P(f^H=tx!YLlS&-!;Hs0LtV^3J&iL;y!*9 zSYh*2udeC`EDt>+^Z0ksU!vBX7^g7h$c<~h+MrsfTfv0G%jAdg#Nwa$TZq9Tp+Z2cX0C!mCb4By^EMz!1eDmdz%bRGCBuH?R#;91w<8iUMRJuJ)O$@#oj` zwJR;66~U@R$5O&^IRGppFFt-?j-@h5_R(t=7M3A^Xak?a;;52%C%BfYEYQhnScOF;6nUa+}R2>E!0`VqjsJK=iP%?}|-G z_z2KBaM#6NgcdUi(b3Uh-KH+tH*Bt0qtm_MI0Ro>l 
zb=EC@Tk)%j4IlYwai^t`}q@k@RY3ufq{2@VZKMP&a_7#erP zTi@9D>NWXDg{1e}H;0RJXF!WFk2n3ie*HRKZF}sbPyfD@Q#vya`~vujLz9!wUg|2W zg6G=P+l!jkfY${j9^nXpQGnZr*zmk;UEWV%3@`*-{38Yc!NGsZ%=#&ipf4L9iaYo36N9cb znGL4^TuA%$^e6bIDYO(!l^UTzXuI@5N$la{$Jn!GJrQ;2rupyKxI4j3+5^;jxs5D4 z++|>Q6m5l#Y{1m?v>dlmLMfn>Ay813kFtsD2W3yfFYyA)}w)5~g8%c+RyCU(sJl&Ew91@Vvi zAKh7{L8ywO%sl9O&5!)9guMbte6>`IagF!osu>RrhLO^=a$JrES>Q*eJ z1#iFW=e9TYH+YYQY_jfgalcMP-`B4&GLB>vu3*Vn+2N)Ne;KGv+q^-Rk@Ok+giuD( z=;D!b$iZmzW&Zi8UfiNq%?nJnHiCuqWt^C)ez9z9;Um(FWb*}+uf|$_YlNtaTQ0bA zov)}(-}HF9yQGWOUHe%R4Neo62-&u_B7@BIqQ& zb=)JwSo^rFCZPjNIuL$BdQ zL(Ml0SdaEE=9SfZ|K{*ckj^&$CVuW+jgA|85gc~y4qj!q%ofeXqP6OG?%YbXGuN*w z*uQYfeD-zpVmGkW7uEiLaS#)!>u>q7Vdve=?I5$uL14%;Z6KP5Y!N?bGKkHQ6bbhIEu9W127cVjj8+zv3!Cnrbn7?EDWFg*7DGx z)z>#z`W&*mxVt8apo*{D4jDMfpP#rV^{Y1d?B_Sm{poJpQtK@Y#*dAkxVDb7xrnhQ z3)|`j#kjD z_4sdUKc+sXK?EPrEbg5)FKcG^wW*!pne!KS-AyfSy-Um9{xd8@xpXa^} z#D6IaP^HMt)dP3iez}OZPHs(_T)43A4ZIhRmK=S#EX=(s_pJ1^J7Vg5RMxZJh_=|Y zXTTq!{5FgA%J5GLLd0$-dT_+#%L-GI=NeW=)$5>(i(Rg5%IVrKs4fD7-II&AEJ|p@ zO+QR5LdN4)#P>Q&*GMbZ1_3PFt)2DVIaI2#FC5B`MVnBcT(So0&tH|gZp*BEMXz*@ z%`~82YNqAx&crgi;9TujVDBc|c6WIz8docI9%$J*)#Uy_)2EJv*2e*1G+kaXVD-U3 zudVL4u0vFG;;m17UleJ1t5cMj!v!NEoaQA3QUf0QY%{hl|KR#lK}QC8su`vE_`~M0 zmsj^%UGV8ko+n09HF@A4FNwq|US7Xq>KUkZ{zH(#NTpYXu$sK`?HAu|4-6I=Qug#3 zZZ(XRT>Nn3%SxfR{<_wqSp>nfK_J_h;(g(h5ORH3Csl z0z93mmh3IWk-er!K-`K_PWd~X5u6JT7wP(kH@GA?Ngfi7JTZQN^PA?(aEGpPcY?Zj z4jM{FtF~0GF=YKcBr;2`?C$MPEVEgYpGtl18~tq{4H0oxE^_O#8RbRP2bFo>-Y4nXUnLYy#)|1q|ISA(V;;Ve3w7Vv8Il8+)4Vszoq?2 z;G5jn$9jLSr(Cru5rS^76liFHZv!B7X`&8CC6F;XY;^pRNtuH4}*tQSweiS_j zHdo!Ei#bJL#j+XM5GAboihZ9+ulKMb$!^Sbt56{h*7phJ zy7VG^MP7zip=;2#OM3Wnyq*7e&;dW{^ylGE&Jrj6>vQd#y6?HpKRB%O?2Eptt3dO) zqduOSKHx}2^DD^njj4%V%cA{@zu(fnJh5F4v$TrlU)25EuxwsvkK-FP8CSzjxq@q4 zwsp}t<#vbY*!CU&%{=i;5Vg7#`<4pYH&No`SXcyDOU9-cB6za6;6|Aa4#}cU# znq9Eea_zTzy1ig%i|%zi75;9h)aT7(IacW_jKnf1Cso`Cc;N5!f0$W1Ieh zvRqh`c922DN)mEK2HMo`0zu7K$@#o~_b6Zcs0n7_77fs4QtRk)*u9HS8zf%w0vEP5n 
zkK$Z^oub|9`xG?($8KU~)d-hh5U){u4*#lO&5*7B42RhMa0gdQSh)2%qI>3Axlbii3$3stdkU)#twhGK=53BpJ}&l_nXexzx>R{^KYVw9t>oCh zuwaPoi-wCOyuS8=A;`TE`<8gyPWE$#!qd2^T2EfR@Obj(IRc*2+yL?UV*_7}t4~Iw z6XX{3#~%4?gtam;*VR!GgcF*cFLDZWtFzwzta;lle0!{nBbeBZp|FN;M#ml{g(t~| znCO-gTe|f#GRh>0-o1M|?A*G#G1nG7LR#FrYLmzVhXSJ#Yv zQqSB4G<@Se1uoQA2JtABJbdmb%+Yb8%7y{Gy^U&rx0O-#lpB0L4~D94DOeen*JQu%~^ z<$QPN_%|x+qxT=Doray5IKVUNFzcy=0*!Me}v zS07?##c@aWll3LO3-Nk>5t3FAcYTkGaN>mjsZI+5zaKP+izz(|vvjl{QGVPFjx&+8 ztlX!T+kE&a#0<;IX2x%CEk)e+X05DdePpeku#*2iakh;82i9}9nX(GBA3g4>F61rc zYmDInf+102!a>$C^GDy-o|!U)3cvBvHrd~AxfSu8E#ed5Wuw=vX4<(3jE39sRV!q- zn?}DHYT8O$3c53sa6D=56d}djHK?5^Eh9;9$T+uc%xEBft2xm(Uc&O?LMldLaQ@cd z;8nm?`{B;?@U?Ehqz7Dw$C2j{n=9eO?NNh_+C?@Wys-Y zTR6^#*Rzv4(kX9goqYGK>7n>;_?)#P+s8I@vfjk7I1S&wzT1IAh~}zT`|a^s#-7NF zyG=rWv$_pkUx-g-qDV8!Yq zVHh|TdFS)Z`QLw-27H7N?Cb{g+@A$<-14Ryrrta#ZrvZC)1#!+!eLR~?#21|%+biy zB)Nq4J%#N+DW<1Aaj>SXSYZ&;~C zK9qkCKRZ0i=Dgdx_$-@2Vrua)I8yvAAMDa^b8_tYeLe#5m`oc2LJSTJ`P->ddU{Gd@;z}JLP44JHy_-jc-MC zJ~24OH+D-yz{MnW53P_)@^*7P7Ax+?BWaS*FxG|j%{Jv(oIMt*MAD0E-Kk2DRG5BO zx!5F|ohtE+?fU;<0XoH1F7`T=J3Z)qh6MHq(Gf)NUq6+7al&GquuPDWtG*c>!vs2UjL+h zt%a=e-P@vy8!sbtjYuxVq>TPv!_suYl_pO=`?KOjqRNX9T^IIPy7<&Mj2hDu9`z{u z&mS#e7SyX>=h$JP)y8Y@{kS5%{Xd&-bD12oc+UAn9ZWTsS?s75-lXolbeUz|W7c@n z_lTS8*U7a$g#BEQQJMxxue5J2BW>roW-d(~W~q%;(Cm{aNxS}%`0btv;yaS%yCrAK zule)8e~x zWL74@SS`FyRr#r&z)EqT>+OS7G+M&9a_4u@b5OU1YY@yb>pzp3FAQmAbvM)X?(sjq zy%kV5`N7nSHS3%a5$YeDp**Sj^k!UI+c3t{9hY&5k2pV4uUfW6==1B1DW4h~5@GVM zKQpD<78b*_6Nze>Jh6%L37u0CFjf?3>T{K~`4J+oqG$Q$Ru`4}&(+M4sv9$0dn=OE zr}$&O40F>5C{6C%q}`{@XwSNY@6lewmH2H{B?~(4&_(`K>U`@~EU_=cwmE|K?QbZG zk%*_uU)1IfY4l&eaRm(SH#X*BAN=mly5hpCbPxRFVnXIx@R!e&{>xn z#_q4t>w=q^>%VYb?KF{H-fZ1QM?<06s`77e8m&L(q8Qs;pl8}_} z!nb{O*JKoHo6PE&CCT%4bDruJ!%oFQ-E%aKIl(t&8_mWWN-ptR9$cMS8e#P3c4(JP zuTX^p!pnBlN)ncDNyrU7->XV}vtRv7u_WU?@yFq%o6>}DfBxh`D7*i8R(KZHmBdN9 z^nJB?@dxdrrOwufGuCocb?tk-M}0=78r0Pk&Ixbtc^x}k&FC{Ty;*8=9(g*2ZqwTO z$BMe?{6CM!~~iPrw$de@Q!b@(=( 
zZFL>abRTj1tzQ3oogz$F_;Y%4gs9$e*#5{X^qi=&H_ZEXh2r|Y`IN$Yt81wy7kSv& zp)?C6c1>M=y1)BoRN@FX^^{m%9?2Oc(Z2q5B%`W&-TP`w$Uv3EageQhZ(4YP)RQFf zvZo~}^;t$xc@|@aF-c5A?eq{&wSNo|#S!cB7-iJQrLTEt*z!GX-G_{exUfz z&T=_Uq5gH)=V}sH4_4y$!G4B$LfJE5&5j;Ce@Z0g!H?E`8IN+8gU)p+>>H**{G1%O zr4YNbZnr~{|Bp+7>w_ct3!=3e$HRQ_{$5wIhPu}<_Jj;a>cqJ%^{J1Ctarxv$Hl4Q zwY8X;BR~I`8z+zI{z1;!vfM~OBF&(z$011V)7nfltBlF|^oI{oikw`}gbyx|)joE1 zq+oGLrX%oJD|y(KE@-2SoCs$9;g zenluer5b-}zZ?9CS~8i7dPWpa9KhSkOOlk z4Dw_L78ZgS*D&rMeCE5ipzGT`xo(z)B|%s%W!~~_^*-?j^o1I&Hc3vJJJ}?L{TPPWJZfTRtC;;% z?RH%plQuL2sZ>$s+=CBKzoXjy*<7&yIWJCBTO?#Z{Ds^XLp&nLv?xYy+$4_hboxh| zez8^dPf%hWJPPUAuNAhmNp%!{L|!%CjN%=B_4GAw(R_9SJY=G9XjwMU7( z6~a|iCY77qWJU5j3l$my!`l^4_R5z1t$1+j2`V4CyB+_~>h$sz{er(JHij!z&)k%$ zZ!%vnigjFgSJmtN8b8g}C3zG@XM<1ss?pcSeLs!&T3w7dF23L27WNkCyYzPY-dxaK z@^q+X7tw;gQ^8MfIKlMw>XlWWi`2xjqB2XyC=2)M6xEa*J6C95v`_<&ZphM-yn1Zx z_z-TW@rlt_CEi%ORE)bEh(+9_i&aEh7|C`Dg9z?9DXFaty;|W7o=U0i$R&weX*g{T z9Gz%kOyfMkPoC2)os<7{VJElV9_3$84JAvOcfyM@(75u7A+%evHC1v|+R4@8VwHb5 zw0Kx^^Md&1LenQ#gzq;DEE1eH4m#&4)Tg}TXoiTb>g3_+4Fx*ga}nZ6rHh~~OL>nw zMI7x!7qjx#JbG_zc=TH_=sdlNnyIj*xGieWby9X8Cf!f*CI61MX(K};{PMVwqUSsc z7kAQjeOQI%okl>@-JJ3ovY41`<`3sAwXvF9tY4)fSJbI-I%QB_ZOp2*#>H>$C6|c` z9k2ftdRHWBtB_Qm@#IOnfJScgE`p$s{i(9%JHoGv3a{?T=O+ypRnVU$V`elfU}i8p zZzf`Q_v?5?gc(kT{oyC?Gm~GaT%|HVow0Z#^bSKKxYQ|xtlEMY;cI22&-%{P{MUq3 zfRkSPa${{UvE<%vAZ74?^%;Fbx);_f60Gv7uF8gwi*C})ipe#hctq_kgr;SJXN&js zC>wBto*fsaBtGC@-`of+0}2&rmL?nidW$L`c3spRTS%LQ;v*mbE=sPIX!OR1V-RwQu8Wz>SJM6EC=Z7-cNCUm z7sH*q*;@6c+#*kzJ5q&%X1_jjfN>Cg?p}>4(yF6ohcRI|9} zF(ZVCG!^gDR`#J**QP({)ioUZ;?vgJsQo0&KbYb~N#<4?q1W5-IODQ?eKqu7jiKqy zSAnbgByO3rcsH7U;KorAg*xFXD?Vr`onR-!ily1j@Sap*X6=!xt-aeMdc;46uW=OX z$9!!~F^K-bult2jgWeaZ?p&!Sq^}QF)~zzs|JI8miawuZR-oRl+HabSzZy&5MY*26 z^fyYk^wgf~4XUVh@}Fy=6SiFpggkrhUctws<|geP&l&zS&M!#+S^ibvHg%|G7s&Kr zo9xTT-q7l{3|Zq#fr-lloDV_{(XQ-dcXmAk`I65N=O)tvVrQPaE>m~neKo(v*|)oq z^DDia4o@b0(8bY>O5-GEokkGbW$VSu+a0n*Nw(WxIqON{Eoc!%l}}QOu`2Pz&90w< 
zcGOuiCY`Q?>Sv}{w>W`lf=(Tc^y7o?{`fv3{+&l?*eR2rBT2ObUH_hvPStF&NdIO! zvs~TfnyyAiOC$-Nlujwry&G}z)m=GsmqL=JnV(SOPh<#eXMnbCBCtobNPBvvP1p@N zY@2>8T5a~8B|tC-zu`S18EfB5c5nYCm(J2sw(Y-YQqQazBFX!cR>``SGHR8Gkm$6R zED=q!eOi6E-byU#Jkrm&c)6uI@orbXkOu!3>Iwb9Rw1$7{sG=Q{_iHx2&v-vW+u;E zt-f=IZ;WcBC);eVdmw5!;t!4up8j^X+y3D|#kCq|82&V@rTq+b&)lXpN%1vKs&*b< z(+7VO0leo_J3*vn!;jg28qLjwSXkYeXIN}`y?*zs!j38sEbZHel z4gb!RPV$5Ajfl4mP5ue>4W}$c%nJP0#RJpVbb<@SoN^X(;y2p@wxzvhl-{qKw^b@^m9r}uS22D z-pL6qmGehP7{CcX)_e0KX#le1@Xfcbt7&V80A~kK#?de_9Z*|8RUv_a1%_5uxWdA} zqt1y!rUqli$N++(u;~H)xKN3-e;s@ED;LSgv$B!~BpwU)D<1DM;r5+780z@vAxGqjO#LD#=$+)xISq_%l~ zM_9OC&Z-NzmsHf%(SgCI_8TA$=;5ls2EoPkj?*bCr;r>!y{xqmDB%RD2$(}(PLK{K zMbc7I&aT{7RLmEt()I>&B^1QXn>Q6sen6M|M z;TzDA3)F_!oi+m7ct=MEZ=SW&c?y4|E&?c^l;skki(Y{vGxCy$pI_F^t^J69Q5$1hA2Hll7UypO zN1Po`J|9k(!m^&LQQ2{X@+J^Zn1A?2PDW-39A#YhRoR3a?lW_8J_0TxkQTj3HIhJJ zi~z2A1XLd-q3{}`Lx2D>ti)H3^5u7aG)HA+G1iudyX;U!QVYLr4Zt!1QaZ;tVV_G8 z02YM;*wF~O+#Tb1`1oLWOq|X<9?7j+49v{f2a2MAz}13jB+Q|00H|D! zAIY6Qw+e7vF5?dD+LC8Cgtj=j4q#m z)PezuqrjQe1)L(Q9ADfH%|zS|(1GRUrRsA>C^-PQaJ9)GhLMqR0eCJQpI_}QN?Cue zyE!m0FjcK^*=GR^9&J#MU07Lp1E|+;`ijq=(`G)3dJBE-bEae0A_&Y9dx5LN9!T8& z{yiK7uE3bNRrx1RjI6A}4h|f%va|D689rOX?79%(C2AQTzX>P9QT#Q!FA%CN0F0m= zo+5#lxhe$8{>Xa}1gayTgJE&0h4lw^2Lk9^?->}-{6=}()M3TuBHWzs zh>6j_lfB=X0X)L^I|d-TL;y51w!R23P9)^yxV8AI5$+58$D+fqsBa|WFWvU=9RP%}De6EO=Hd&}Wx>63p z0q-D*GG$ccOBEaCo<0SvDy+yWZEfuj-vH#0h$#X10?lg^dE$UufH#5v(b91m9%~q3 z0f3AmRt;C8Ci0Ga%Z zG6dD{ruT*Fb|91zXlXDLCqa+@?%klDj1t^KtP8oUaj61! 
zH2A*rrA7hp7)q~xT5tfC`eR&NdwYAPwr~M3SE8pi>Cl@3@(+gEU@Sz?CbR)73t+c> zVV=TVvrnq!;TTQXE>L2P6((!w?|1;LR}$uh0New3biS_~0DJ~Bb?_08+mx4=e>6;Y zzrDi)m$g0^V2SGfaHcnTZs6)F2=j>;#VIK$jN9H^Q^=PRu^j{yHXwmzw@1-Qw0SyI zZ>B2;gEHo}Ntn4oMHSxGCJA7bM!3bg8ngAMT&$O>K;}4HcmM#;p!9V5ow?coBw3h( znnzn(+ul+qES_t+3EgoUUIZ=$>9)^w+Vu zfEKM1>Kk8IVb%k;Ss$4)Ng&)ZbF1XB3?s4JrJ2m1dDw~72|+8g@`o)o4$BFJcYXo4 z2JnT3`}?B;vr(Yj;aEyTg9wb3^8L1I4E#&|`BHPIuN?ZXU-kD>e}N%Q{s6!S<*0tr z#y`6*&84fU$uum+_r(dYzQ(}eh{%uX9KMqkQ1)~4XVxx>v!%*YMa3?_WEdG6`vb)i zUqK{j&`cm;18}`8ta$sae1Q?+uznd*mwO00US1NIhhYH(7D)A|(G9x9$i^lDXjg!8 zg4;UXU2^nQQParW<>vygUBDz?50M7)ap?*rvv__RiW)r*Kx?TeD+7>x^NEH=%nO0K z+KU0;6mq;LkQk&yv2;tQ(ag?aOXr=SP`9f@Q?E3c>^uT*YSkv67#+r4ZwQ)$V! zIr>TANTfEA-;x&0p_2Sg-&)Cvk}1nQU(Dy3P-@_*x=_FSkio z8t}#-jXcaCi|0}(8a_9yZfKwaq#8OR@U;s9`8FBQ5+T<6GMH1QJ}adb?+uI8oHeXAD2NOoSYu{kkpvQ2DG-6sml4^+RWkd)f&g<{zAjvn}OVam{Z*(@4$ z5>O3Kfp>Y%TC(Yh#a{wkGMW?1lLEAn9c zttqJ&%*25)Q}@)(;kv^BS4LVfXp+Gh&)y*+A)yl#B%>mDWEA2@roZ8Vs1QJvaK5q` z&cL7%^PI{Gk4XsD#Hs)s_1^I@MoV9&{6fT-Z;oY=__%~fX^v7V0YT8G;tbZ$NRZTR zHa0xyvU#t@MFOOt_yEZ32J?5FimLB%@+Qs)Sj4(==6R5Bn0;#xZUU%BXf#;qi7u_I zlm_HDxV>a~YTpeHNYE@nJHN`=)vvKf0oy_b#BDGJmn?4-#uT}`yJO?wb-UAQAFTK8`N8Bkgjv8ekkt8hFE0Xeyr;k4w9vp=%6;yB zjKAjd=g%F-qN+JUqiDrtjg2!_%*9emN;@RNYiO2DEb?%6*yhRrI4;x)QjQ1i1fmkK3eMZzbFID(@-JG^?;0rm?xuZ0TU z@ngN{-SimI4n#*q$#gEj1@xk%j0^@aj0FRLl1H-yXl;Y^VrPNNL-Glv@M8Pmzm z9=nu>T>xwvN&NZv`_9e|Qp-Y1#dQjbUf2p*BO`iLG_=-5`-GUH68cw0=81C2^}Ik| z6_}TIk4D5*S&FdO#=?e6S4E|@1|AH&_J)QAiP6f5l$U9EASpI+6hOtZ0219V?>9}h zM2d!kiwh*KZ5H1*{qLt>2M?df?1S=m-Cx3hPO+gccF$$EXuw+nqB3OD6DWDvEi+0} zIlPW_|FM?INmAr4!+@e++uP(tx}FJ&;7>!O1wc${02_u=XWtD2yyZ&L3!6qFj7q_H z=oFK7aNu@7`4s{`!(-6Quk>`}hVZBWDgue&1kYgsqyqYhpl_s_Xg559R=nx!J+9S< ziGK{Bg^*E~_Qq$jp8G)XyXPtb)=-8u8}sRltpY;W$|}_WfGSg%J8W)kJ&EKq#@VkC zh5jILfnf{Kdi8j8s;!Z3Cv<43!qXEH2htR8aU(nacTPYzouH=u`m@qNm{=lT* z-_#@qyl>Hv`1;5{wFJ&6OmKb($a!wpJ;Sc8$&*6K-lVbYEm0UeH(`47;P6n&&Fujw zd<1lq*&dTAsx!)by|HA$!)AGYFya!#_bnc|Ke+rX+4yB+EkJ9BA7^3ClruvRP$_H3;pK=n8r;1N 
zk#Cj!OS&Cg*h*;YL2d)+Qu%WCmuTtf=@1pj0&sU>XQvo`{sKd>2+vHTtRBt6Tr(Z) z82!C_*G^7O$nL)g9+?@|%=61XPkE(DZxc-c99o%3cjNpxprD{&WTvdTdO|1<1zc<| z2So~-`snu;!pO4<#FW{Ub0htI)jOERl|{QKy~3Rmu|P4{ zJ~c%KJZ>^NIte2)Mz}?|W{u0d!8_N7M<*wm32;nd6AhX?Lx2{XMZ=g5 z$rA@z2Ptpwrim9fRmUR`X8t5dzzwHZp3bcErSpcIsk0V#zcuWVE^_MdZmDN#0FSJD>@K?jS9j!q~NH4aUO@^*t= zU6|0K5TwlfQRM5edRE18-pl#XTNlZ{GQ@xkQ%hU>UR6JYk@ ziSq+8aRgTibGlgNJvz0(0sVw)H<(&_daxmz0A0Dy+&%K83;eYK>+3dGlJ6^-SdJ5d z6$@>it|BS&Gh?wK_jq`2fMAB_v{;8D0q3Bn!61#hg6!D=@zKap6Uc`s#KcJAY%)Rn zH2ImMg9Cr0Di`w%hIluoG7nGPAr$(?RM8Z6J-~$n4;usKiFR02H@&^S<#_S;ud%c zLKHxg{XVr;uXsd}`<=|(AfNurmaYU)Dd^z#AVa*tK_%1e`~8q2i2n5H8ZZ8CAnGK zpyL1nP7V#yEkG8lKEgB#{jQJwB4eOMJ0}5Q6P+X1Cbrj$VNb97gJSWUC9(* zl%KC!3x|`R>i9N|7D&0;C-^Hz>FM(dOTToqVqqJ1%DEBV8Fw}vxjNDpamsYw^`AOu+0kc^BM16E9EQY z{hdp@-ysHOXMJm{!Z;vjhvVnU$^y96@(SP(A;k>hn&{+YNS4`_8cR7DJz3+3$iaL; zNGm1{S!D2YG&sO++0F?9`v10w2o-3G!l@yu7ie_+wwQuckzFUE5QxAjcv9JHO?UDg zc==Tf3K)Q|V^)Ia<-XaGjjZG<5|7lS85Yy3u`dyzWqO^K#@lvAe zq?lhhS)=ElW#&7`Di8*D0RxN@@-`D#_>@gS1pqB3W>EV>L&G##IehQ~dI$a)W#T-) zoL}#NS`}&5-={vHitGjJ#H;Z#jlk+{K>;f?ECfOS!dLAt2@*gU3}G6<-Me?Q&9|(u zVI?UfS^c#&R{3r5$}SZ@M4l7ThM1a~212u5saX(^tjo&dj~S`P*mtn%Y$ejKJH640|D*2I!ikgzi}gQAgqnC9Stavxh5X_Q4{<)CK?Ksf8C6SZrNV&y-5-8s@Vece=@VON;P|5ENpJ}=Cda- zA%pT#(OvNPTfcq7ZQvwr>JS)lwV9Oqljbz!3fT4$?y#&BBCM5e9DMpr*C$ZG4ECksQ$wbwURu^T_o1 z)O)gEn1s;!=GqNJu8&0QyLY~jm3!j-(_WC}NxKg1i0~}0LdDB4k=)S)WV8(Xfd#=6 zBti^llVodMc4=UCIC8Buk-Kl%Z-Ey(hq=h7M+j~3QPlpe&q5n5)cb|u6n*R~~)M*YX zcA(_flbSj_Hs~FP5hG7;XJL+fCfob%8C#SmtHzY`fAU@O_v6bHAUUwOwB#Sh`8>aI zOyavgw}jI(2$mw-Swcx(Tld@Mw0hYfZ71j)OBZ0LhP zaePT7mJOnahY);1ZOR{J3zynNzJZXy7-Dj4XcUAv!3-wKSIbo?KZ3*&DE1{d8o+7G z2rLzO_Fkxd2smdjJ%S8vLPK03x(lYSL(S<~%{&lr7%sL2a`Et3y`~0cYFUV`qTj#w z$0DRj1EL0;*-l<4EWNx73K!URMcj|D!T(ctlXq}4;?aFBh2-x2-!z}L@U)c5k;Tn< zOo)Wvl?mTX$XSH^D`eMX2hOrADL@rke6QoXEo=kEQ%X^h00`F!fg>Yu3vve!A+Z1= zjt+^uz^$;zNLfg0AOqlLAD_T256JJ};NT2F2`SKy4YXw>P$RUDj1WVvRdtF#O^4Rn 
z)|LQTEd#fRvC)aIeT8B4Fb+CR*$bj$u*&yf_dXB#W2>KFk+`diz>6x#ih5SHKOl!3AXBuGP}``(NkmP12mn-hw{DDq(;%KqBZfMhpEmL$Q&4e!fD7+P9} zAo=c*4Vqa^UCNeRRxnRtBH`Bm+bie4gye5Q)>$-0V&zRSp+yzp3!=$g-hT-xcCfYW z@W6|8mwykHz83GpSR<0+s2oeiJ9k=vRVKQ=gHY@Y0ZRbmO;HaLhSqZ0m$P0aJ`I?Nn^N)Zxq$`$3?_@fV_SAC$A0Bw=j0@W+Az)}2D7;VHy4*7 zquL7eh~6E};kj7?1LTmMkI`21zmaaj|zfrhw5>2g zMJ8*zi6vF$_3aOkldr z%Z>WrU~KD|O7NIp9^o27^;TjCA7*eB+%hLd8yC1Y(#qy1VMuyBLvsh&gbDH4^#sH< zGbAVg*^5bm`+s@;n$jrt1}vRGa4QV*8L%v@xJ87qzonuD3Lx`y*RtQf`Klx z;c6sE_N`eLoshNj#!W7n@2{a($arwOJKEzxE_07;?beIJzb}YqC@V~#wV$XnVJ)7D zqekChmYu;Bd&_=x3s=;7>e;r9)#I-wb zI(ptOyPdBl(8F^}y2ryFwSKp%YmJsL&^W|(Rr+HT)+3+pM*f2ReOnU7uP#=FtNJtfT2uWN7Q6#qrPF`HO!bIaGU zBjq>7pW>hXLB{HWPdh$aF=a+f*_aO6|6KE`(P=>mWnJKMyE*aH`g1v-G}%_zkAY7A ziOGPpqnihgh^Uw!hO+f9aeLnvFPwGWb=A-g|BHhao+M0}!V!8`%90}6)%_La$$T#s zo*vgl#Z9usw6}N7^Bs$=5Joy&-oIB-4c0@H2VA-vG-zAe6@PIaA7t0{J5}YHduujp zr5mjrn9H2)3U+)7;}n@)FtujGxo4WFaU-9L_)+txu-2Ub?yE0}J<*H{+3!3S)t*ir z;Ou%|-LBr9_7#g3U)bhEODB+X+WL~nF8%oT+U-6e%cz({0h8H*m{Rd(#O=7`hkg_@ zQmem>HaZk?T~CFrr<*8$oGP@KS2VT6;7m~83q+{YlbS3J&@tX*o#s#xx=cCcP<+7N z`Llnv)TtJCpivknHEB?0*~IVv*Qb~Eic_^@@9Jbngd=}*|LQF4lY%3ϑGv0O=o(X5pU{dQGXQKTJWzL*bZY{Rw=Xu|M)xO8o zO0ROsQH?ksJL)ZYHrj6!eqoJe;3aN))-!zcTDWIzS7GV&)sxtQPg@0VhfHE{+t&t+ zMyw{u=Dl!yC{{{lKWmqku!e9%mg&nzk6AKQm<I9E9^1QT*{yj-KDV}$WIa6am1tDmoUK-LR4F(w&2hlx z1Fn7^=Mf**VS`IeTth*RH)^$tf zr8~N+@9m6GxfmT6df4^6@(s~)^$L{bJmOY#WAiO;3#i?{wkebZ;T;l9B1SnAljc8t z$7IQp_=rGI3r$O{$bruGRLTA*1LK>9PyF=)y8#$+Fy*WUd&B+hDAxaT?e$wR2j5&ai7-<$ zDD6$>mnElfop0ZDe#UnDJ1x<>&0E=Yd1kS)o+dw#dF^j7{?>tyiDHD&!O;bJa`}aN(;+Br_k~KS{;fV|l9;K#4e*c>aX38G3){*=a-!>=N zs=`wr_1fG}^o#qw989CoNilX@uvcSHq#I#kUeMOepEZBCF^N zQmj@%cw|bqsZqoHYA(9-0q4+uXYW)G`2jce zAErSjs}{i-p>Jj4vmN7n1rL;0(VBieFDj*v%d1XHtK-`$&0inrXaA<%Y%3O^JC?Pd zcFg!`1xFoKjo3=FW8u%R^!e{z@i~qkMmULPUsj^hIaRics9$L{y+1{x*$#G2TkhlQ zvvo{SGRReo6J^(K+C)QTyE&T_&|8G+;`5_zfV$wU&gD^Ekc?A7t~+0Lsn<7$1`mRq ziZ83Z|hjWt@@g{;h(-Q{$RVv?j`e~zQp4_W=XGsmcH1)aJ|w~J9$H1 
zG7_CwoXxdM`s|3Cp{kN$3MSpp!npAhdVJkaem_}}B_yu)l=~Zq6|Xyc&m67d|EcXf zgQAGKHxD8zNf0C_C1=S9LzD~y3WLO9hKzs`g&`x76bTZBoJ4YzoFxfE9`cZpC}BX! zVF=sr`>(Cq4_mdLwyQs!>biZax~s#z=lt&TEPf$1r|#}Ze7+el8lhWG_36@@h&fa6 zh^*_q=lzbDW&u_Cs=_4$V9`tp*gzt+yk){8S=65PM9 z%jtEae!O)|oE#|Mo2y$v^Z7M&vBcV5*bI}-I@kOZb#L#uTa3PTA#PT0d33ekOy!_`lAX06vwy%z%R5RMiI>uowb)cgvAMWg_yf$^He5oQ~?$&+ixh$}U`CF~Oa~RWFxD zDt3RvyN$lt@_H!PH;+84Nuoe3YXnXd4WwrWX+R3~VN{8dhCFIjYkKBGq;~e9#EZ3&FE(JEoV$!dh9z$_Y%|GXH zBs})ad?Cs!f5mPL;nRZqpR>aloyV>-VwTlTtqQz^J=BAS^2+@#s;&dNB2-+5o+xd~ zt9g;3d}rRO1szM6SsRVY`v|eBGF`GJml9J)~UHq&z_2FW_#_OT4}v1E#5*h zwaB+>B$V(t!pNua9ieS(d$EQZPg2+$dM$^L^Gsb$~QY+sf!3F7B zZMn0xoPX=>C>{n?J@lDy*ABWBj~!rObdspA_o8V@t=MXopptQU>KRWWl;t!1OYl=q zexUYTT^X)&Qv_a`5tfGfjM$RFm&+|1+5Zl^)`P>^iKK!R?_(znar zwO_m6I^)W*!VOloYgf`XNF%d*Ugg8R9zJ zgTq0<5?fg7#VJX(?4QQ3#AnUr0D@rAvsx$9SWsgJ{u-6E!tk!_*&D}z>I-hH5H~@1TzVxq~ z1ZT%FALtYGwqRjZ!8J14p@W5|doQ;5Wnz9Uc9Et)V%?7o1AN;Vb3S;yEcjU=UcGh^ z*nz@2$!8 zHwd^~l=TSTG)2ZG{BavS$1B|KHd&|W%qh4X`N;-Z^$Lv9Uhn%ESr#C!WcNC&!3eAt z5=Vu0ASo!FPm%F_%73Py0(a&LQ0TWZIDZ1W#IPvXuI?mlgUuU1rpU@|uw5Q$Zfb0j zzl}MF>e`V!dx$cOTCdqV0Pk;fRFa&HyLXC)PsiaL@bS2W-YtO_rrNLwb94)#LY+Hy zXkb8O0=tXR*4gluOlZjAbLChK0TIJfK7woF8LE|<`Q^EET$!ZAy{=iMMsUJ^lHuzX7aEqWXC(IdNgCs`*mv zpL`@zA>2ao1$zTgqjQZ44A!NsTe9|2m7JNIzw#sAER%7IcXYeg=*Q^LRmrD~hLS*s zYgZ;XuZwMNizuef$v&Uh& zc~^-7*h9yG@5q0nyWfT=|F5C`VIb^3E%5$Bn6pbuo9xr!0br{7{)3NCWR;RS-vY^A zK4djl7@)fRpaZaG1PUYZ(!(`?Pjv8A_0dVdNM?p;-{4?SryI}`u%Awn9-aWSjPW}F zK#ba?0Z2;UJshk6q)kN&eWt&YAb?hOH71+lJplci>+i!U+g|kw3?{_`mK8wOoBmVf zebSC~oGg0kkye;rSg3H~v+^QLcJ;>t0IZEHT~@z_0X+bm`hUASlyP7Gv^ub_sf9H< z1th5zq0kJrLERXzVLXQk*7Koh`RPH9`hRZ4U8VEe380(-;O&X3F}7o=vw+6y4?qN2 zo27a(N{t<;wFAon00%xkM20<(oOQoAv~|Q(MAy|FvJ_>;$9DlqVZgX9Wtzyt6_0sy5XA|*9gh$L17l*IvUKZ8n``92R? 
zI2HU01>kop!7VK<3vOB8Cbj`QMid_(KXTUys3)L+g6Kw!9v>j|36SW6%xk={m()O7 zoQc`ich`mpKp!-K{s1X!<#NLT5-nn&pk*mc_wVd188jf4xKxP;$g(|RU~sHjGkI@k z_@B$xVh4SlAbE#}M>ejhqGSL{2OPfv)n&ay+mgWR%P1ykN)V{0)lZ+|2K|3LD*8)l z-<|*u0I(;<(+v^-H+o>3^#3cA0iZANK1EP=Dh`FIaosl5wQ1CENCKHQvCFYX@V5m0 ziS+a2GcoID-gIcz#v1?3hz6axNlqX?r0p9IW^PwOEB?Y!u})G|)?fBbp-^Fa6cYgb712>LQk@y0 zn&n4GO50G)VW{f*p%h#biO%xd#Bt9a9c_+A2gIqJtWEI?!C3_g{fQN8jfVpkYUH~b zj^2JU;A?{K3r%>!ZnoRF-N2|n!;+=T^8U)Mc=gP`GZ`Z z(Z^_Au-UQ&2J*1(SB%_W>m7%*gf}Qal}k+ap@c(HCMPh3>BeK5g&$sdfk*e`eztQs zwZGMvL$z8y6@RevD`Rc;GD8VnRos^1+9b^ydz0vOwt1k1b9w%5+xbU|e*r)L))Yd*Sx(;m3WuDf znM3|lIH7rHG~Q(NOZWZ0uN>%y*e#tK)a$nl_lCn6S?34z5hUUpH z6p}jN=N)R(Y0nJh=Lrq}$|>2YcWCU+7cF`Kyd zM^&3#^3GZ*2w!~~3?$&(b(YOnny0Xp*!I!cYw~dOYjRNF*5`)!NOa!!E;El=`BwIk z4+pWl?`vGZ_wDl&zXwx^aPwetWvC4pn63i88gj=81-C4W;jPV54}C5j$z$dJURBZ6 zfghbBZ!7)h@+Fr>L21-nJzr*f+RLTl0<0djm-A-2g2)BV{rLIOol_y7<99N__(!W- zA~i@tqjRlB-J*BJ(>`~W8J>;)lS0ah>Gs$PFR%Hx#tbaVRT<7gD&L9mf#;R6iz0Vl zS6iskCZONr{k&JH1z9;)a!fs}ucffFP$_FM2>~vqox6qef=K=HdOE-bFO+5i{tqNc|1T zYh;$d@^Z~@h@?@)B1)_<$~*biqE0#$e`W2TQ6KHcR(^~|l0j^;M;$Ed zIX7qgo5Tm>eUMp41l)fZQ8gvM*tC6mJ7TFG#3h6GrdY2{R5K2zAzqgo(=$*Lx*Ix2 zrT_=TH*6PVjiC3CQrAg;eXjHn8w_Wq;!kF#y11=GE^3Fvj{?M=;P@0`vpe&@_6xkT zyHGt`9g!3E@Y^M=$B`ANt1%ryl_EiBQBZAP#R5zEiGq|OipgMv^3Do}c5nO3N6IPf z-S7W!@%rA+$Rz1qkvyF-UF6=V&rkk5AiwzSLWK)qH?l=TJSRrdTeJHhy*fUj!P(Uc ztt7;lSpB*iJWxQdf}zuDcO00{k47@SyLX&x@|$F`M!qY>XzpK6@cSkRwXmyq&ppaE zXk(o-m$hGA-htV&>+f&+JEuhCUe%%^DJ*2jS_nAIsG+IsdNc`{O=?^EPTO?KV8m99 zaq`m>a{)7}eS#;p+kI1WJSFgc`_+-?`K>|FTHZ)|{QS(kh3B&{kQ!mu7p(%&0Ji6I z=m`C=w$@zByVj19AG~GKJ|XGcLEQV*0*n%oHb1XPRzUuAOYQd@38_uw`DN#hzuVU+ z!iww%gGJsshy}9BG+86{Q`1xH#uJsT^H&ds8g05bkjiGdZua51w%O(bUq6xdsto6p zx4bAo5Wm?ud1B>2K7^P)a=v~VO=Y8m?(>fQ*gB&6I-(_ zuXb!TyGQtaOx;x|-*Da)cH}QXh*P%P2*d43H)2a+S67y-a}6(7Fa!y2%02{#HqYS% z%(A;{C8+2S3#{EsI(brb+RB{eC3qHL1sV7Fk&#{6pXiG-XxxY{7^~4WZ~DX!)hfAj zGYhNjdc^Ctsj5QdH(Sbf4k%fp-@Rzb#y!O`Af=Il37EBmrT0v%3*79OP?PdCB~ziS 
z5iMf^{k4f|MtQte-S1dygWESxL8!Xi(zZj!h}@CR@mD?ug=ro%4%-)?kzt2zo1&6e z-Ebl^(t&%*U0UJbR*m+amD;mVkfzi|Q~z&i181eXmmQ1q4({ssisPk4Ti(%qRp03( zyzuWs{!-Lfuz3XO#Dh#9l@%M0J!%sou`M%6R4#qZu1Qz}pJSM;T|O) zRA?6lU>G7Bscx$lTVpw`Z|%Ue!4*7RwX&NjPO8#=-li66{C1Erh=nSxH{?PX_Zp zIClnB`V=WO#Qx)@rE9Yg&eCf9N9R7~^!A4#e)AV7_#n^ZXu!Sjj-q)!I~5IL=+@$S z-;ZwA9QtcPc>SwB_xGm~FmZBoI_mW}KI~wXbTU(d;KNu$JnVw2NBO7Z7D|Raswj*J=d z>^uJPG?HcwTm1GuN0k**E4^oYW?*w*c5YA+lZA@NN@^dWMm{sD?Cdoqw0RUH66QFZ z*8jd{8$KdeC*=Y*?qFR1vv9Vp4b=-6`S@enJ(PcSVQ^(<%%f;mstaYhI&$&FM*OYj z#G&oFc9y|C`pWWTd2gp5wLFet;`M(OwNqknaH*P) z4g!OyfT1`2a5LCp5!9pvFhyM7(}$JZj0y1??OViiXzJJ3I=x}uSa+tdWsNvGBMP7;cW$4V;E9*Zs3&oD5AlgTd{a+G_a@5k+m+G;bM;zfiJN5Xnk>FkB3jewg0w` zLn37pY**giW*|8Uno1Uyiq*{m;>+##^Oo31L zv@_b1Ld^Dj&v_5Sh=}bP)w_yS-d9Kz!-rKho$e62Ye)yh*t8OrtmWqv65?%&3$e6B z)jKO+J-&D&*7wshMhsK!c1X_Ap{$p$ewz9C*coQ<3Jlk;#wLM z)IvhFN`?HHlKMkUrD-XKW)Y3vH+)Ub_;)Y+&$>dZ`xPC#9E5iHUX=!V^6~t=qhD4H zN}>Lr8Tt(xG84}KKT~ui9yyW!eSARR{}q - - Produced by OmniGraffle 6.6.2 2020-06-26 13:14:04 +0000Canvas 5Layer 113-1223-42311352-880903-120123456789101112131415122522718-86-52-937245-138282-53229-5248594[0,1][2,3][4,5][6,7][8,9][10,11][12,13][14,15][0,3][4,7][8,11][12,15][8,15][0,7][0,15]A diff --git a/content/english/hpc/data-structures/img/segtree-permuted.png b/content/english/hpc/data-structures/img/segtree-permuted.png new file mode 100644 index 0000000000000000000000000000000000000000..4d9273d73f25fa5dd95c048324ffc1ad84933efd GIT binary patch literal 24136 zcmaI81ys~u^e;MegCZ&29nwQfgMffEN_RIRjUXv9gmefDDS|YT3P^*X(j^Qf-3|AQ zzx!YBuJzu1t~G1)JLh}OK0D6d`?L2kMoUwf0QVs-1Og#YRZ-A^Ku}d75EO1~Oz>n< zn@tvcp?k@x>SBXGf!Nm3;NLjzDn?!q2#G23gW_)2ihM}vt!U`|%+1!@&%)CN;^*hb z>)`6-Wo6-R!|Ud0m$fVT5CWlxs46_xeVM&I@Bi|_@5{~uH?=2&gf%!d(KQ34_Z776 zlagSZFxIehmgcjQSK*OslXyJ7XCSL^99vK-Lw=7BHyXE~ubneIn!5(uhAY5f^Zg>7 z+j!B>I*rVXA?6^NQNgDU0)w0U8QIW1=pG6@1$We)p445!8HzdQ_Ghs&CDwzt1us7( z)=@7EO*VTTgG2VA*8RSk7J9X^jg?{9B?@@H$RjVozI1;$*gtu 
zW8WqOJ{s4@(n$DD8Dqjzbkxec_P4Nvyf?HO=XYC=G_4kka7A+I;JPGUo`TIfY8>zs zm%5}PFpz^ZR6hy%4M+W;8-Xj!rK(mIZPMg!5w=GI^^7p8wO#dhIa(Wz zr<2L;Pi9L_N+RCDl=TS`}%z4o%hC%Fe=w3 zhNUal#;lgzFL6($FPP8jZ3f>zdbw|YdAjo?j*O@2c>GEK&=7tJ#qY4FP*N3<-Kemy zon9ytE(re+X7}Xv`m?a72XL~Nm|}DkSSbW0=4u>vV}M{z%bMC+|MNdlswyhj=eH3! z58Rudr*P_Vp49_#$7*bzlC>Oe#)Sf}^%3spO^IxIf#cg08O#Z{j6jO4fl9)cn@H;To~o+q#lmc}7n`8ni1o?e-^?s5 zW952WgtU_OYkv$-m_qZyOwd|oNYV+pg^%*pIM#mxz76ODX*7GS3%JgSy*k?+1`B>$ z?upx+tfI@v$iSx-OGJw5J&(!ZYMPl&rLl0aMwacwpC?y0EyUFrI30S2t-e|7^(-oP zENyn2dg1nYK;8}s8-1vGOd`%iaf zkJ|FyzyCf_X#%#)!1V+(ao)X)p!HV9lG)N9TC?=KFeceKODn?I-PoeXl%)<<%l08*R( zmRFn~4)UR-i+L~tp032O6?<`M_F` zRtxzbqm;chVolZIe7O#pZS4R8)h<@g2<5W1vrA4*O>OD!>Cyi59J;(b6r}$|LxY8h zsp}3N+lljxr~qlodJ&Qw5b7`XxDdGRXZt1%4$_36PSqAXZS0b?ryQ&1nZ{-udej;) z2XS7k1k_dexmzK3>_aQ^`>5*5r`S>t3tsPzn+Du-Um%O%u& zDR)YD7A>jnAZM)ikT@wB(M;WjnOp(?L0<-clfL9I;o#mEDAt+Y&T!rMe~MNr`Ge4` zXGP$!af*cXf#JI=Pq`dHpi+KIIzkQP%Ja}2Pt>c z=vlUv3SDl@2^6R;*$>O_9UYYZdy(He+<0cHRLh7}!J*dEfS!-huJ)L1DgIu=D5l6t zh0rznwor4tufzlU*Hz=cFxhv`Zc&QhLji>P3^b#uI*&Awuf31$>_ND2sNz~&=`N86 zl7{EOV3sVZP-eu2RLBkt=51P%I3c(PHAbaQaHmYwxsQ}^F1{Em8)-N zIMyUz6#Jjys`@WUjnaf(2Jcas%m!oL#v+I!d9@@uoQ_y&tH3?L`n*~vQ^AL1GrG{v=vI3+YUZos=IA_t&zFT~Vt2bZ)6j0Zi zB=uJ{`}dKraiKJ8WZH^`JWGs#-ElB7_Np6|FDN^@RVf-sO%;y+X%!((vM4qb_{hB* zA;4+JK~z_?ByIow82^<9yUin@-!=&aN)n_!Q;zvKczTjQ&=H=PIU*6g@|Jix+$K{`V6% z&e}b?3zJ#t^al=*1P2SlH0ye3Vnc`x69{_@U%W zp*VIg%9GQ|zDepwXLZ7(#e@wa-+|RluUlDF?p92M9Xo5L-iC>EI|heYi&cbo}(wbkV4C#f||bao_4a_FnbDt-N> zz^U-zxm$0dw88=v!qJ<(UFY*q`m1~1e=IcVC7+2DmwHJytQBHB*OH!%?_DLO3j z#QYfkrKlIdoKfD8D#V3@I(k-Yyw<#xMe`}-j4=lo7mI=w^ShYjG%Y3M31ISMyJMZC zPN-&oPoMB#nB!9Lf%MVCfzlL#D7<`=M%UKU8K=iCiMhF&a&o9}wbYs-*|WtEnZ=O1 zAlGT#{=!sdjLvz##bMy5U3j{p?=z)~y5&~dXsJzC1YR6wdyPPS$Kp0cam{hV4cSJ~TIqx%in{nU-V(!Xp#wqp(As zD1(rdQCnJ4lHBS*Y8PekELH}+ z;P0aVa_gEQ;m&f%R+{2kr1A2ZG5M1E(p}=vykMfk{h5D9$O3T0hIRJ$(J=A*f_Nk( zvKkj&#RXlRdK@BDF$t+vYHf#y#1T84zdJoTCTAcZ<_NA5eq8))+Ga|ca05jbZ7rbT 
ziY`g_NT(=C=w5T0ForGT(QFy^xO*0^nvxP`UdSyJOzD(6>9;e}(+6X^xV_q=d}N7U z)qd%?8vsHDa?_V#eC_Aqf4ZV-0q2%!Z?W5P$2(wIeOH}qh>kb2FHPw-&Son65X~}> zl^atTtvOT>LHz-9;Xf3L_T8Cz}G7y=Wp81>NrTjE!t|mI4s!i})5Wl+`4-9-J?tsycIg?Zreh5hOTIxhepq0v8 zKRHlWa&2BEvmMD+ovyLz z;uzpI=lcPfQn@wyW`M$nc5`D&g*u15t-Kad4#~zYxd&p)QmqfDRjr4(!cG&`KZ`Z| z7ff1wU0esXGX!kJ{12I-P-u5g#)5PrT0F)ljG7zM5UPxBQHf5GI9T&W0uyNwGyQzE zCPz_AnC*2#3@~jiAf~>{ldX%*lDxxKY3C`xfl`#tzrQOS#tJYF4m^#54tmJv0*{6v zy5+ht-Q5cB`OGwX6KDXny!kZyxYdy67z#Jp?^8imfe$QbA9EVop1A0!O|wf&=Yp6@ zEsgi>V5;b&)0?x!r;V=jrKWA-GPmbUg0@4|1>p!jt&R`jm&C(o*W0LpY76&rCdo!e z+brHcAx(ZWtNSBfls;5&R5BS2i;y1#gwnU?R!Rxf8CMG+=qfY*EgIuj@A&)tYOiCM zeZbSYcpon;>$Tw>Njq7Ir^+3OGUB&>d_V)b!ueG40A2YVt&cjZn+#9lt@7RH7J5#8 zaLQDt@|e8W%n=`J_Ob(P=mit2K6$e8tL#|{pP95@n}E%L`}w|wAW<8ZgDP%!hte6A z*8I^Pvw2BIq?Q`=XFR3gISB7mG&PGsm|f@a^9>pn_f9N@eL`dST2Mu#_bn|>HlFV1 za;@@nV0qv4KU*pq*V%vVxIN+f`>U*LuGvdfU7eYMA?#<#)Adhjxw*9N$0pIckB#3M zqX@&K)eKM;C?9b_CL!xgIrWd_ywG5`sCy`sf5_xGR-J(Xe0_~|)cxRTVtRTXqCdIJ zq>*HpC z28qeZm4U5_fy8V#0E>tg(Rwt(HR5`c*IkT4juNBtsgT;7V~M7Lm?&_^sFeuCD{M!l zbU5UdN6xfK83imgRnfPhD!|g$SH!`=!C&qcXlIdrrLir|_{+P#Z{)Y{x+V=g6W8q| z$4fjb)))C0&zKYMH!nnqw(&uu8<@-TbAUx0AA5Ux2BHnDjpmJhdXaQ%wb&kX4b;Ya zcaAbNS1O-Q^ha z()6et|G8=QH+Fco%~*lrP=;U>AgkK^E2^xxHxWoAu;>C4HL^q=xy%Ta>s5Rgo_uaK zVZlXz!`N1-M0GG9ARtxcauBe?Jbo!fUZ!>P=WL-W_k@c9`X@!+{z($<01(~9$gl{C&NZp8jVb$ z4`v;~wzQ|J2o#>Er9Gy8lH9#gsp_w_>{uW^!&RI7^a735a%-#*OX}4zyL$`JNYZmt z?2j7?`>7aLx@yx)16)64dW`?>Z`OKh0t8_|Ha4rpIP_6EMa3q z1fw&F<%yW@c6!;>(8!4GY=aBPbVi`-)WpQZxGzFhzAwik3_wot@k55xlTWNF0)lK@ zXhpLxut6^N9Q$LXK^3jbYy*$y>h`v?h|7!uaF(UOfx@=Xwx>S7{`E^YQ^e(=?nndt zNm`BFC~e}^YTjcS1rE{&BXUWC;y$V+OZ`r%I-g5;le8Ci1D@sw!2H6U^}g9RRU&54 z)W3^XB_>;=jV@hYsiW?69HdH1?ag*9npuG^o2=*7M=-M><~}%H-If?V)L)IX@d4kTK*S zP0eIF%5H{1AMRmWnYD$IjJ5}37)QJihwH}NCz8a>R&r$&#I9f59(&R4Cqqd>hwC!=%e@a3^MSPfv$jBv6wmTc4cv$OdcGHxQN;KRJe_- z6<$JFN-QtzQ|r0G_XLaGt`9n#1HZAz=tZNSDHKm%=$PYOaH+w>+jRyqP$R!Gjtx+= ze>9Ll?bTC*8uBgidp#CFc2)B0Y~Xt9OcXsbr5e~LGUsV0 
zTyC7z0=UkwBgAYuUi$GQ0|!n#DNF5@0&uZq0{aYy&7%r(hj16}r>LG!hYnLVy{wt5 zb;{8hxS;`N(1ll=gi$a zQgrmoBlucV3DhCPE|>45JUtsVTka5R_BZ87HN#sXw`2-* z-JUF~#bhT{M14C{>S!1C&3Xt7xhAYbu2jZ|p1HxKpSOp{`C%&qI^`Yx0()Q4aXw;VFkY+?qM6lIS{gK$>v1AdCmMd~kH`J%NBT|i>dsQHeJ=cu6(r^C7&2irc8RO&CB ziYyean4NotItk?V#C=Kob!5>YgzR^#Vl0Jv8XPUZA-7S!mZ!15NB*iQMw-gvwg-#x zZ1W^Aov-sf;yntj-R8>Fc~r^P{%61AL)V|19-SF9IQ3{quprGCsUgu41gO$yk@2S- zEe@7FOD`YTUzS+M6r*RVDPpiaMQ--u395z@@i4S7#S=BBfXRDIj5JOsVLbt;qW)W%i_%KdGUL7zI{3X7GY=Y5#~S+Un5GO0}Oqs9JTtI$}Aq zekp-^#yW*{j2WS&aK5+CLmzjDzj$(;w$n*^N~*!pizhTZM(In&uB3}8t#FmjNt#5~ z7cIPJAMJWd##Wq-$ITj|s49`eNjfDHTHm*h;CF9n>xofq3zgsah@wV1kZxclo@vm9 z`~7IU!x>^cZ6E}vE-X^_LXg;@ntecuT1zz@EIQJg*1`P4x)}$#stHEsnu?;5jM^?f zuG;Fean>;JhgP@5cQ();-R%V&q#oZL637b8#(nJ}*U{H>MT|VJfW$dH)c!IjYI@_e zYe(G1-FB2FEvuW0h!4Y_QFP@Xf7kXVR$1qW^IjNX2(eMMJJSC=6p~q(D*G%TuJBq2 zYa+iEvy@nlEzDc`?waK`?sQn|bFuh7hB*5CX6tJWfgeU_y7)g;=?kMmZi!a}uu5&r z#!>_n$pe2fhGZqs=N#Wd;^f1e?+PjX4$ZyyWVKLxQEYfjJ-8v8PxLdX2R^ zOf)3su7<$g4bVFP|4E(7EgYHw#53^Wnk`HcGUx2(p^FYQ6i28sh7hjr3nj`~D=j)4 z{o06{BTVm{`BzdB9{QoPR)zbVYKgXJ^I6ZE;7pb|+KK3T=xB(w( z3Y}FOSNK##ix8s^QgMfmd#1JREEo*&73Aa zkjB{;gbw50Ezds*F;5$4&u*jgDzz)5*6Jy7ifg)%iibyN9j~ni>V>WHrVZ94rp@l~ z#yziaO&d(rymv;?INL_vOb0jC9w?Vs1VEATHd$O%R;AHL@@NO&mhXy_T5l0qUNAW*7bnaRb#@Bu%}M8<1pw(%W^ zQ1q0N3SH}qv<93{=3M;>K~SgzS5ui*Rr3kAe`YvL^k&IfiaOD*je+oi0&$A1Qo@u@`O>4ML{?phkMt^s%}V|7Gg z&l&^<4eychPapu*8{R+WfUCudAqx{AGhS8re+4Q32b4MOMwWD4DR#8O40#PJ@<_`v zVkh;nrGN0;)OuO^it!oEoW-)Vp7%M)OPc;5tM{{sTgSW|lbFP_7nf=UTVeiv#~M9j zWG_kGyRk`W=GWFiPLKSQ032V)Z9#HLJWJHA7(~0e9}Vv(GAwfGf20={PEDYZ(D@+O zm+UmzDY|Kmkx!&Wo1wYX2HEOGn-4qbQ}~h`8mls4x5{5uR+hq*=C?2lgm5rv7NXq| z95j%|m$%*qB0yyH^H9WD#nMvc!K+EkSFdj8DVs<$7+G-AUk|ZtR6zwUHm`CQpP;-> zoD=d-x;?mtXw(o+QZ zagLK}%Hh%W^lrypFaWi4)^CvcoOvfizsfY{3f>re^9rP1${+}wY7dgAG_Gfql9Ku*f5nD+ zO;kq@SRJZl!gIC`%854YmlPQdolfl4 zR4*T-Z=8KGH4&Mqb9ewS2QC_M@4ll;kW+2WHuAlbxfMfx0O8gEC9$QwBx%P@YDB$Z z_ow@K#dP*j)`);kj`P_AnA6zc-K$K%vwOymV@u=X@l8W+d_Xm6aC^G8QY_-`X2a7Q 
zJ861;Lp=Y{K3S6c3EAit6#da_MP~aa#O~MU9!hT?_5dE2>cDYdV-hSc-rb5KD^rk= zn-g-t^yRzTy*vB4CXZcTpfIZm61({zi2=2?aPU$MsQ5^3MQlJ3%NuWgHO;>{%afIJ z%A}e63PS$ga11;ogAGjmZFaXh?rtSOQO5?HA`tEuUs5)ImU#8Er13jVaF36VQwUgN zLqLvhzR>DF7qHjX!}0AULfn(#!E;xE){cibzZcfejAinrbR4RPvT?S#M!u#(=Uhdf zyyyP&4Uf-ya~$r!Fxb~;esg(}!oj3kA8@*@3evesPBoEnmL=9h&9EavcaE@OcA?TA z5*DnBf-m=RqA0BVf?)1_S)HZsghd~{Ktbpb6a~E&{MYZ3uU|zAyUe8cAFZJf7r&^n z#+ft?;DLzFxudzY9$?ZN)%$Hv!KzN^(HHnCD-vzD%zvRI_}>Oi9ktZJ23MDJ19ty_ z3fhPl4T^e(083+do!7c1$c(tNg3HqXGECmj&GSL~=tZ)q@PqRkCpgssT*#YN`Qn=< zJx|QAWQWQ4M^eH{W`7`5)KHKiff~yQ_woMw3nvr+JAjf-WV#7YGaaycC#96ks_~5{ypc$ipKt|lFE?ziNtXni`~u*W zwD}%>wXbJ41(V5~L(?i74Kjq;A5FSkFqxDS5XD)p11?`o0o&4E_)>x=-8;TOK6FTU z1?0>qrAV>bjoJb`^4^(AUFm%%=)NQ?7^{Q3$lL@^EG@ z{Ori*T(@SL3r<)FyrH|Jj)%=gF7lr$62%ob0*1)a?c+5ZRyR>tcm)i|q>BOcb`lr8C>^u-4)ashAE zOgjfX<@9=XR!xYg85$NbGeA$CbjN;wa4(UwV<465Xv*pxX~0!elNhjyLpTbAU9Vi% zm9E*`&+a@(!2QSG*~Pi1$Bx1qzpL5k_?K}^Zg_B06G9RHLqs4GyM|?J?m}-G@3$?k z(Pba(b3Pr_nfNLn*($cEfq?-=w%#P{v=Fn`!<__g&gc|By)DaSxu^W=X&nm=KrAS3 zuFgsm5no zNgk?{MpeD-w%qa$9;M&YfIkw5?dBpd!zd#WAl_ZmA}asrheUrt3&Pda?ah@xsQP-j zsUcClg@8R4fU^OJ`4C{ss5@pZS5mU z(`IoNq|~^cGmf0kcDXwSiSize%E&a1YJ;`}qyv4qmSNWiP@euImTaF1qYt#V53R*c z4^S6G2^UY-l{}Z`!U!jQmLE2R4s5? 
z(*~bgK$4P@kbZf(!G#LMKXbuXJ0A!)cVnL`_u{?`@j|I0x8+L0VuP2GrV2}6ARbJ7 zdhsMrI*0=5$$SUwyws?c6@o;O<=^0ccJv-6P^s{JJo$`1Ans)d7x5>F=iRS3Q3V7A zn}ILz3K{M1f5hc1@qjW^*KotxgJBQ;^7KHJVM_tjbVI#HrP3M~gIhY!s6<_dx?>(3 zwE>ssWg2wy+ntC`6Kk5CEUP5P@qFc(Ix&!t(M4 ztyRmsE5NapK~13ad9$Y$7Ql7yfyw}~L7?;`IXM|1+-MmF6{XpX1Ho@YWp;QK?hh~> zU2@nt5q#t|YY)^q!)u%!21*Q|X?*dcS88gwp!wq0M6#C=h{hpM(`Z|CTGxJQ+EGd}m|8T`sgpLS$B~{o@9IZstyzGdHt56I=Q4nH zz?;TJAVz}_X$ceg`l-rU4`b>bvx9d*p{yRYm{-F<$;0ZoRZl$vTbst&9CYUUGdHEw zo|U<$bIIY;tdz}k4vG@ux*pBXxnNry6boDI$xCC(%Dnx0{9W89{Tu z42}ms?V-d-V19w;gp_39JmmxvDVxfyRGd{EZPq#dEZCcQHP$NB%7Bpd|`Vf)@_U!Ks27q_%?Mc$5b@ zF@OSeK!G3LX%$%E>(GC164rJ(E1^kU-Qty+f4xl=OsDy8c1z#%VU;ddme-;~r)3yd z6V)dD7T{4|B;Rv^{sJrZE2I1W4K&yLH}Lat?7s;u^Yj3{ZcWTsO%vZ%*aHl~`heQ_ zTJeAMOFot2*QaViWU@CI{H!X3@f2%(&`sS=uA4bBwJ;4&*t2Ys? z`)|2Ntm76Ug$(vnJY>xGoBtac6|%KdDMH;*wLTRP$|*u^mQg5FO?%O??bLOE^)lfG z50LO`fqzygD*-TU>Rvy1E#!`rphZzs0xV+oa$rCAt&P^7&rJVFpfET1x765h!oT?o z@?L-?|8@o=#J}`^_bgrh%=K>^yPe;E3&q{9~LVT}i}rd^yTh;?{<3)?h*GTFQSb z#Qy~Jg53~m;{G?67|)RHF9MEPl2&f4`!X0I3LVuQ_Yt*HU8Ow)j-nD-86NfjyMV3V70-)HHK*%KYy!?wUlk zlowI+YSYijR+G`M69oTB){D0!YC7IkVc^xhQU@=ily^V>d(4k~eBI?rgdDzt_(yy; z*F!)&9iz={-aLkObE`!|)ef%;cB@4r3{Z8U@Z=xuBg6#%sDr)>N$1Hlw&6r;iOHR) zG9>2#bb&m5Si1mhxyVA)jvdXJRUFL7mv13j65RmI@I+-gXW<>^grdp6sR)T({+kMD zKG|EQK(JNnF~Eq-*lPl_=@*H_+zC!q9>^-Sv0mXb7xE#^LFD}3>F4ha$(;TZ>kq1K{eSDi9wkhH*?@6*nlhh) zp>B*JD`$X(oTCo{05?nVTNGz8LMU~@riiZ59KOXVuG2pym@usbZ^ye$e^0#$g^kt{c!VgyVbhUL5I2;q@F`X* z)z5jB7R_?nbWI|CK2=JrS#y}J;X?i<3|LU*DDIh%(!VH^Moo9)o!CWteY5CikA23W zfANPUb*pCvxyT(v`afwN03tn>vn zTS;1mjhpis;dth93a%adda2lQ7@K5o1}-;|Wfrx$NcCueKxDZ{QiM>tmpA1PqB87+ z$L)ocOG+c!HHue!@1Y2n1;2siBK|IesSs&Yo(CVV3+0bxwq@wT-pAzZzAv z7^K*`Jx)~bz4f|p?>{>rBd(BVd3S+bCsqdkdslrIh;F=4d-#qJ<~Jf!!n2ok(3^1~ zWQ=Lv9eqEA&n=j444^1T{34RvR0SkManJ9nI&PPD8W%@+(&ST6UM|P5m4d(r#+3)E z$RI3y4F*3TC%1cL0W#x$fZ%uyuHLhOyaDkXz?jq*{cUYo0Z;-5fDtn*D_QgHrxz_R zkA_9TFo@>f@kYv_|6Paax6JB5NdcX8(ISjWb~}{2@6gHnF9*h}%w(QOJADv;xlf-L 
ze90lUPfb80o&r+p=@(=m=;}Q`Sos>ZZr_*40K%TWe5OJP{0Mn7-0t3k8zp?(sz-TuP(a zVxP@kgK8BrTY3(?N^*2tY@go*U2!R#B1cwYptPH!^9b38{jVbYKVWNU(l53$&ZNP9 zje{h^--@0>aMX}=1Mc2vG~VOu010>I?wvt24avJ(f{fSfWI18Xj_Az^a}2I#Sp8$( zxLd>$Gs9P+#V8$Fb)%QE!pt(Gd8|7>HSJLBEO()}2=&p=0xaF}Ic{jIP?|gCoLPK~wp~376NR(5=oR{ub zPKL9zKx%E(nJ+$EOv-C{BdfKe(kuOJdDcsvVb}XzM7Gq9BGs792kj_z``6pXotKrN z?3P;f$(-WP&>0{4=|D~&|!H(_gEV&w(M@51&TS5YTP zHs1HF`hpD-AKL2Kp>X7RfJAxxdDz&Z+}bbRa6BvCU(0gyJg?r|EL#qf2+fk}Xi3Cc zr0TnaY^(IXqq75L27S6>ruNvQ-+d9yL7|%ZbWKg;KYv~T!=jK04$QrJr#OPKt5jTq znxRw5=8l&Hl_WCZQUz)!dHh2jq`Q%nf``%@pkYc>Jxk>6SfTO}zyik`-OTi>2m>0> z_6QpA1=hzKRp1<2gUUulzdO@+IDhS{^|7j_%QkziHi0HWf6yuht_uVnk16|4_;13~ zraD4GI<22GDaF(C@KlP$e7Wuj{mVF=?1pjszC?@gWMl^}u#Z2@PtKml712e_Eo*aU3`Qp|}qrIZOmj}r9< z$P@L&LOfIyJ3EAnM2Qb?W@_^WZx(7-F8`t)D;paQBkufXqa@9(FWP*GGjtpBWBIhRQ~)GQR;YBTWuxmd>z*U-$AUB7^X6po2sCRM`!p%_9gEX35J^pJ(O5CPg+`C#Thqotcxds*(Al$;0Z4_vSayEB8?nfsgoV$tA${#{uzSSp#*5#V(+U zM_O8XNijJ)TOBl&wF0cw8eD#`r4^?tW50<%EG(Y<^h)8vG0(L@ha2dWN~-PC*r!?8?W^&rr+^C9ndx{){r1Nsb{1t#kWqZ1==? zc@w~RSV~ID75%bs;JG_!jvNMhc@7{VN^$Sfj40PoK82j{y38;QbMh>JKKUyZ9PzGzZx;flkr4ggG(T_f89+I)qE+L@!CauuDH!P#w%Hg zX1P=`nlJxWWPnbp63racqCCw=+{lwF{0u68;sC5oqNOT^@v#8(Lknv2wXoNzNlD*8 zY3~96l%6JUa)VBdl;SC1Rjb20zJdVN17$mxi9c-`*BpJ3D>9Snl81=*krC-5qFed4 z#y|mqg~WCrv$plmcJBydmoZ


igAvc8}GY^Cg4assx@I7ufRC}JRWMPoQi-HNJgeDsB1}HZWQX%3eZwS z9+&WVdM8^wgGipud4ag>H5Tp5P@s$aL6_cw>sG3$pA^J}Ok*W|Q08_o&X`B{d|<(% zdzn5<<}}iQ>Bt*Lev3hApfZOdz_61I_3~tEnXk4$Q%;~}076|G&Z-T%@-_8eC6#5p z2)MgBgT6Yp_b}0UMgP=pQ{^YZw@4u4wUGa$toW43_i)=FLG^4gSbNZw(z2Exn&`z% zFxB|06r7iA`Vt!GI?B{y?(|RNe&qNq^wapwX)j$aY;gd04odf+t})f*!31I(HV%%M zkKBu0cCTKE^QKia9d-D~kE;#!OqN*V>cqqwE`|DvZc1~S&6p^flLRN`^<$4Y3YD7las28Ko)vQ**xLB5V` zqfB)3D;xzn8b0Oh2LBM&cdagYH&}0X0;H7XpIJmHHD|9RT z4m=+cM$Zs-djFfKfhjR+!lmyl;6vFXY7ZF-A(KHa-5LkuV|1DtPN$zpY6_HAia4< zQ}E4+0rG?d{Q631m;D%&NWA13z`ZlXm#t5DB{Cy~*1Fvs2s~cAW{bt>NW`F zj58+~Qwy)c$Uj>!(|f+?h~xYCjMgEeOlOJgPctf%FBSHoF9bqPUpdnYQycZnryPd` z(R#M8(ZA+0RpRuNfZcj{Oo?rGfQyyc9^WYe%>VPFnhx1G?9Lbw=aNDxQUC1;Ut2f$ zSWo`pdpw>r7rOUn@j&Ojp^xBR`V-eECFE za}60{0>|fnnMsn|=q67#elV4X=kV%C|IJ*e$46Vn_CV41{=k7keg2XJobB3xco13c zY7R=dwyFowCFp*DIsUdrcLzml&qax#OCex^c=c$0{v9Dz3?WbNoC7W3C+O`>A1{e< z`cpWbH39d+0l?-luBefa60Ah+&|5;L)Gc+?s5Ij0vOEoA$LgCD(tA za<_S%7Bq$qfcOzf1?YMCRjwC@40G3W9xLPDjKU633WWq7IaKZ75EKuyV9udwqx~+q2AMmGn(kVP9WB^$Qy~s$#JqHD6=z6wS?1IrB zUwR%UHXO;!;kSHjPv~6|uc^ONWo!cJahNo4-`(tmpp@aCY>Y9Dia(XP5dvr2GP2$S zRPlLGeNi41up8LZ2hdcD3_L+`RA!2;*|{j`oPIo;j{VBgLSChUfkvq|T9&u0fZMRT zGnS!`-fTN3vAQ=3-E};;HgU&kGu0(v7}sNZ5H;y zj{>6fuR!laF9^rP?yirFs?AZZ&i45D!)kCE{!l@wNPvXE<$+}YD)imVOC)_?_4EM6 z|J4fWh(F4sI?pR4<5t5jJVQWdH(Glx=&mnPO-5}ui1?`@P`NA~!JJD?_gz}uO5NM8 zpeJIIjYSJDXBFww*zgmRlSNN7!QAN}7;=3%d=(5@!W}?=6EYeFK*3kqT_#kS=m;yy z`}qD;txHBj2%7gU^>1@~gg}qV+Kf}xD)Iw*uD~r*fMfK6l?`ic31n`LAAq|xt8-qX z!bjEGe4rWxy5h%c?Vg6B7P7HuMB)|W+3P%`HO4;xX-I`jN}sZE>M{ez2Wz71G5||91q6cVu%(p z#ji>Jzh1`fJWZHYD_aJ)u|H5=bRG5$3|NBgB0ImjySwEkH8eFvPNqPM7IU#dQadpD zd}4}RP<#h@Bxa5>6Sy)Yb9bE#IyJDei+ zt)|4GDHCkh#`D!6)B`93G3n zmTbLGHN~2vmLB|GHSa>yMZS!sAI}eH3JqcSnXj&pBt^E0Z~gtm+T1v?w_ikd>lOl` zsYL$20AGzncVXx<{x%*X+N9R56+xX?k7?Pk%Cg12p7mwRYo?%aSGU}w9lbXb@$Ge@ zgB&YkvN$=MxZSAx_04q?&TVE~#Cqa;em0K>*KJ(HO7|__tYN!b1=mct$79~g?e=O~ z%s^|l-u*=%!m?$o9maBE68O3@-N}0tx5{q7_#kwIaS#7U(FkCsI5c97E)&!=i4L6Y 
zn75SQd^_lqqjKYhcxhe4Q7W-rmlT7FP^&YRxL&=MoDG|EHbqY$`>a8hbYJ3O$=fD$ zjx)*pC)l%<3dLNGO7^4z0BHJUjaZvy>)y#daXw4jWbw=uP3$;HxYtOvIqe#a{P!fR zquwlf_U*p2T|a&-p&;Rf$sVCaR#gA2GtE|BqXDeLOE!9z;K=zbVv|KjE!EJY)6}|Q zSW%dA{opRB^S8xVFj0sU8v~S9di0@z~trz45ABs;Qt%g>w8=F$pMf@ z*ay8VPoH&@9nqHhsfLi;fq9W^?ndt z=`eP+8UgQvW>T#MTSB=hT`6$Pq3voq+`$Vo^dYln7Q1$pb~5X4UNFK_3KhiX$XtIm zOsW@cf2Kg?-=g;KQhk+xOEzesDmXIc-4W7zKVZFoXbam?!n-1+Cf%=l!XcJU>CiDo zviXK$EjasoJb%*U1z;0&P7WF`8lMuzOhKT`9mUc`!1VYwarP-|%Ajfk@hue9J|Y{` z=+aa;^ekT_%&D}CIE0ua-jn4?K3zw*jfAyYKH{Dxgeo7Y&zNz)VHw6{3M*2L?nyLp zLbwTV$)EG(-avg5MBus`jFhf|6%Mmbm^!zH4_9qZp_k!Pc*%wyG(E*p=r3h*rh5%rLQte@#-CV#VE{=IX|~ByhqM(d;rNZ#jSdhm{H@8tn7Mf z8QgOhh2a)HTGcfPfv5Bj zrOTAF^Dgmytc*6|(77ou0E$=UO*BURK3<`NbC24_wcyA&R`*NPTl+qZvUGEwrs2QE zE{RzSb17W)J-a--ybZ}a0HSb1G0 z$9OWOvX!AglG(sk^hlyf(G{*@!FV}P>C`_BD#9+W=dNv0aTE;tc#_t!3%r=ki$=AD z-mozFC&V)}+Bd&{S0l z?)%HTjtP|i%IaX(dG@AT=O2ra7h}w+(?x=*HOUHSvry5GQ6WrK$Ao*z+?n5!V1!JR???$I z$9Zz|e4i_OC>U($bC9C=u4XyB(?z#cK5y!=n0!|J`)7ChrXRC{j_9}Ex5d@gMO`Y8 zxBXO}UYDigCqRZ0;Bn+XmQDLxMI+BvS3PI3;OcE$rpLz-EO%kqg*_tF62{+>d^JnI z(3-LI_ZOO9Az;Zi8mkw~(Ose^qj&@lf?|e}-gBw8(y2 z8cE5PHL_;wjx3Y1jGc^qX^bRBC}pqiEZHghQrRa$VAVI3G+~Zt`G@R`w?rvJHG8~mV;PNU5o4oq)hVy4-t6Zr zd0EHbo1BFekKCG8P|?QND1SIvb0Lk#&%Z@^Q;>P_oxjs{?)Hu)IOkjaR!5Z$)z31U zQVFh?;B+rZW~Ljq_k-JayBGZSIS{$qSN!JqZ+rSH({06mh#z5j*Z%LHO}TG;l{8Yh z?4IZ+U-{qDMI7z)Q|r@nklAGL5Tl)}Ih$}|Z5A9ZSIzRehh#C@wdpgYaSeHJZAUe{ zZy(&(exZ>rrFO(W!La z;IN4W#<}g>+@G@AgTkCO>?GT_Bk{u`Iqobcp5;7bEsT0MhT{O9O7k~s;f+ERxEjQ# zJjosQ2{ErWo#*XVF+bl=SNcb=`V1GE-9r?kDkO|Z>8gD^${kGV<14dHS^=5hqB@!;VokEC1 zeBt24egjm^+tF^YROlNROn-URE%`(3*cr=reBQ4QnJbrI!7E@BvNiAE?cJ(O7YI&m zFeR`dvhkwgG8T7RK#Oz1nE=~Pg_2zg2x!41vf8>A5*s%kKt94$&hVK<9es5K$E0pl zf>B=OH()gdXme?>_^Qc@?mIw4(XM0FrH&u>8E0!Tv^}P-*36^Z@zer{D~J>ChD|)G z79be`v*%V`b&+7dSI$vgT3Wgp5(?tSXdJT-dE8^NF&)h0W-lDxJ0ukXaPVcKfsX_@ zo$`z-=+@at(USEa4$Y&vrR_-LI>trsdZeG4K0s7?L#5G^Ozu26t^kkpqPX7wnF{q^HnLD_2Vg=Z<%wVION`SPxRQ}aFfcF>OcPKKNjqazosc+P 
znb-}f3h57U28hBF$Q8R^>(i7Q{f6+n45YL)ew&~rQ9wQo0qeLqm#7Dq^PID`w%T*7 zbw0^T-RdJKnt4EyhzN3dji1z+x70_ac)<0Rsu@=w(NHvIf=6v!4Z2Oi*t4_oI2$GEyI@8^4wd?^hI3BOs z0()B}-iZ^ESf0vH7;RbWn3A^BkLlj3BI{>`0TNEfOTXGoiY?UWrr zIAHD(=gf0>hC<>6fJSx%Fdl=!tV29hFR)S}u=zst>97+`mbRw@nIN{sY6eeW@no6Bk)S!gUi~0eXlPprJ}3l zMoE%5B|uL;eE2Z+hEdQ79NhL|si!+(FAS*}e<3@>M+M$sgK%;1N2_gxGT1L!BN5qU05)zsL6$zxK6I*!u1Y^Lf?lxTZ>FJ-ziCsOND}QeC zhi++|p&JfETZxGd_gO+Cku*_C(&1r^(s;NcsC zIXY&&rgwUDv~wT1q;G3E^XwUM2pqtMU|PG$p~1<%y6=y=EEth#ZOZ(0TniIP#AFrP zRTN0YtHi{E-4WMPP1vdWmg(v2g|=1TXGKLtQ8C;y zcL1qJojP^O)y-`N@-2wQlxoPhvfrF*A9~#d=#;jT6DA{Lx#I4h%eFzGLLjnSZP;a=5Hx3}EfL=20m423WL}i*r^S-?YCLcumfg(_L zc6LU9FgG+b1o8I0igEvbS;#j)A@Lpu_J?;^5)hjJ z6g5y&Q&5&pK&U}5M~mohFGC9p9*AaZag{r_tBQ&6NmJl7*xTDH!I6iy@9w?lvPG&N zE$!MP1R*LPy1TFU;sFsE#9e7WwJvU0f2zd$sMG%9ISzWmT zf<%`mlDG%$PBXu00y2!QV}vzIN{TpLL`0-hE|Ip$ z0FzFGrNaTBN`$)gow<-tg*RAsMMVP8;YeY}tJ`G$Q3OU|(N&(@+?(sR=G5`>6+oae zKx3d(&z}cZi8`R4lAnq02+6Ux$))#c_jH8MKLJgrsZ^>zgq8XE` zbC3jnN`wgNbU6#PjW}ZvCjVdu*2F}UK&T?vzIpQ|;@-VSkn5wNfkI3Vxw^44i@mw~ z7#Xbuhet-hohQP>!z1aQt{SqV^y0;fp?PS4R&5uT>;ce$A ziH4qR-<@$j0CybR6y><#KIrTtFctiMxEBXaKQTR>tzfMfW`|dx0h*baL8S7^P^rjN z!N9-}5gBPO&7JmS9@rB*U?k=i7Dhig55fT;o8_Q}hlfPRlQZPPzxAXb20}$xG*Vk7vy{5RNBnFt1Ye{YJ7K4|? z{n*%fV0IDp2qF(4EMwtDadL7ZF9E;_r&m^@+npTX1;eir&}pAyYHF$k;C$9|X^WCgn@XgU7TmM!&Pn+xBL7a z^2S;nmB*b$#6#euvt-SKVQ?KLu5-7W4jnr58lcO76{s?Oz^P$C;zA**hM=w(Qk*_? 
zTz6_^g|!)==0I>&xO#b2$Hlu#b-e4Z=?9l0dH2k<3R0n2zN~%AzU7sbI-lt~&G&QN8c+`%c_2lK{O_p_LfrQAn zcG6jRUMLQFEHQGmApqKw7q7LIt?gxP?QpY~7S~)tYsR}_An4rsN<|N@eh0Bc{f;s@ za3hD1pi1f1ldZL*6xt-D%Fudxh%4Z@LUOv&!0z+proObv5&)C~V$d*1n3StLEd+mFG^cTCFW5*B+Dx|t7t zzZK{><`eb)vIjw+2cWmC2x^$-MoxW>k3at-dA}>PKH~1kkTp5L5PE$tlv^eEu*e~*S&@;EyKA(;fdN-Q z@~eNJl9H;~+X4UAC>*Nv3KmQGHUcLlx5+CgRM%(1o5PJ%4Y!xP9f}-`>^J}5IX>0g z!lH0+aL^)jsHZ23lZ(rpjg5_1Wyq?*%Eso|?|4Si=5@jR{5<}8R~mjeK+doqT*2k& zQ5Z7E8qVl#tgrWg`?i>Pz_4Eq$%y5(+x^(t>CqohR9svI&xrDkXm;g_bB(o`S+2aI z;$(moyXrbgzB@}ZeDraf156*IK=@bcL?6S$OyoIXWNn?c-xY55>PNGySM%Y_7EN%u zqp#1XtE;n}?4_>G#B26f9$XD-Nt->}Ku^gsprLA}yxR2L`;##+7UV>UK6 z5fKrUv~o(AgrhtPBr!9UyW9uvj>Qp{VFQqum{_x8VP}{1%E4l5d)qSpeQevCu`#Qk z)Dozg`hXYr$?)pEmzg!wf&rqTKl^#(G4nw&C?_RWkneH@lP{7np=8k%|R; z0%ev}7ikit`VGG3Ho?$KzP>NI@MYxSt(Rl`$2)vsw($MjxlEjahw9}Y)0J_awHfA@ zTIg_%_8wSs^ZNHPrz9jQtr{PxxRFh7-mJ6Y7wf=E)YaA1b0r@8{{4H99>>fdHukAg zJq9!(_;za}zILPuVah+x@fpn`4gQoY-rAqU&<=temTk(2vdTS;j@h@DN95oW2=kd@ z>M8L*79ENTq3;Z0#P=SGLN)5enYag&7=6nf2N(+&DcSfAc71djakk>e7D_3L$dpAd z%|TFe>r;*{l?oI5c2M};AHtfs4vVmL{^@llpZEz)FCYFZkC8j}?``}4|C95HU;fN} z_4T7YtiL#v=CfAMX-OEP9xtWM3J<8>oX}9TWAiZ>zA((j=w{rnxrkSD+fy|5cj2Tb ze~+ebwzHF1KfSKzV>qc4NcV@1!u|#Q;9IQr^#RQ1>NSP~wQc+aEv8 zXoo!^{&y$yPcb3-D~}gf#j#;?_fz*(Uz}7TrM_}M-*aS_H7~BYDLbRst2waw>hJ}( z!_%D2p>#=MgWIj>DQ^wY3iK~;yJREfHTpgJ??E-3tUD^}<;@OU*LhijV@`Q^vT1XU zvDYiH7qIJvlI_R!?a=F2r<{|-4r|q7iFOVhKW`gn{xzPhFPk+n5Q5sjn8q-LYyO+Y zp`aReyP3NoymFD{!e@!=IV@u5&oj@B1ZS4n9@VHC#rp4luVM<+&XvHc!C{fk3{+ z9B=+0G{M^2{^|YYiJ=amDV3-5y^0do(PyoBn64yHmcNJaU(v4{4KMIdh~z2?eEEwx zu;tnP2o7TBp>fY+HU%eEJxRS`W1i8y%`#b7Du1<0ZY!lki*R|_x22ksJvh#dIZBh$ zcTI%(JF~TS6Pj#{^E}d%Pi{5mNr-k67NFZXZk+8da_ddOF}`RPp$qvEYemfRa_9m7 zYph>}rQ$ed0{AU3tMoQ{1aHHDserK=PVCtx=0&t>{9dWS!jsVWEH%&gAt{zDOC1)I z@LXF>PBe~Sn8!^z)SMK=h(2sQn|nBn%NR!rB8pz7Q^=0%Hy*fmn3#?Jq|hPY6aRv^ zRv@;19joXS-enQNQR?HA=+qmw!VqSYp+nFPjyh*Rd)rV}yArckYMpP)fGK!%xXjnqbqHYTNEK_2skj|djWO4$ zojZGuG~^5y5|7?lc3HWzY&A?uKRDub%UX*~=(}!J+^?L}y6-d5ytg>+p40uOpvCYq 
zEmu;MU4!#PC<8BCQ|JEbMS-06HUqVpT?LZum%JJ&8sYSj>FKfX(wj%u^ ztyl)RtOsHAl3h30r~J8A(o5-=Rh>?=U3+luG8_F~`(SniM;Rw`r&dFvQZAvWbcPh_ zwbl3AZ!@I8QSQ)B9?Upna`dvBKO@e#?L)e1%(53>F)`s7J*hG1o<_q$KZ%{>tgsSp z!ezL_GaRS4sZIF#M}Q!@M1N+5oYjin69{A+{)?=S&Bigl9(^f!U{K^{{d_Smq4=9l#Nj&_|v0GcMX$9$ofQmm03A`0w;mzp)z|EhTK+x6s>oRML2Y{wG!guk9!lN03 zQ`#+`P$!b_E!`_!gM(cwav*|Wv5igu;4d>g`YWY@!=Q#agu$kJf?UCQv3xhG3)sgr za)f)aS%k}WBxc3!)PW**U0&fo4$n>aF{7He4gNJ!5apvq_MMJ|ivS>9%wK#D05JbA zp8s(EzYu>imq%ykLcwpqV6Y7Yu)DZE2mp}2^034szlWt-mZ0+HY7*C4PS5r1HsbGo1r{n29hk#t?rhyVP|WM3Yn6F@XF0YwXN*DRRK|->D@Tt#t==7+RVhlgRySmx zb35l)Y{`iR9h|w=1?A*(wij+p{j*gl2*`Rn`?7&7$3vB^$eE#4idx-Ci&)!e{t`_| zYkXI||3MR8EFyb>?qyyv;un1-Kd&$H>JaW`fwhwr*{(8osSAh8`V3j4?>7kW#oo*( z9p^N4gPuzT21~6ub8udUATBbI$u@k zXSl0yZS&pz#*cpHErJ~1AJ428)Q8nSdUwuU^9P)KXWRt zb2vd z^erq>&nzhpZEH}HPt3|lGAW~}8U(-_uMT$Dre;4fM#ABnxMSye0~-l)-RCW4)EjZb{d)aw~!@q{@xlDMwA(^B39~k|7KX`Gm4AI*xDGs-i8(ES>ZUb~;C(*SC3J-Af zy75cwgDfNxCHc1UZyru0O}_IzzpdiQGPh`=->(Kv^sr4lFBbFLaCv1iw_RMichZmA z7&BGtkI}5Ar##p6=}=lYWmsaRRouSVuTwFbSPuPGIBxn`@52R$Xs6Z7N~i6!1D>4@(B5=^XGjchP$AJ+cSYvAKeneKeA6E5 zG84FN73(K3I;R0EihDaxbIz0a#aSYh9xmsDycFgk@mY1@hIAqGfFLNgFK(d;v{?*3UsxgKRJFE?1nG4 zqt62dkKR{Lyx*n%OgjlHnN}m}*7S((=Sp`1VUV8>G&Uv9YY-$BLBqv9 zp*G)j+2d3sr2ZNqWwLRkrUb{Lski2fBVmJ7&n7+{Iz^8S>=gOjtnzXS<`Km$D_45I z+jY|WrEx0#xBMGhWz5K$hi7{zN!1pGE9W|q+*1NEHwWyQ+WXuW+`~GwnFg`k9ifUi z-Zb9B)3;#?wBU=&;|_TA+53nTjy_3W=7UYg-o~(9cp8yAtRcS#14JXTdu7JA z2BONDaddul&9iXJs4FUT@0`czd~c;^tg>lD)AhzPH*UW}pAu}G>=5Y6e%GDpeoWYY zd{c@-aa$bB+aiXsc4#G|^}xOmh_x%>cBe?g!3DDAUgg=;co)j$JDd!v^B`?RetW~- zwfg>qhsK^Qx3O;59d$gl)Nw_^Hdm7ig+qom?unS}aE9$~*3tMHAE*H{OcH#Bwt7B3 zz|)J6L9g^KynD+$Z9eaon%VmNOxNn<*`v8^9+Nt`D79Tinv}dajy+~GuP*09;WLc# z?!R14zTgciyS`;*c}WSqiIWiH?8B|ubUIY($yd^SMn^dVH^L!~i&Y+Q5`}1!I-PJy zrZ%iruYd9R)G1*a=u~Pme|h#?de)NGf|p(W-|Mqt>eT$} zyRo>blA5#I46foi&e9s_hMLgEnjL_#@gGl)sI1uu~tTXWSBrfu$*RRJ}ZE zuQi6Q!FsrF0NwJ@W%u^G*>sH$&X+pSin1JB7?!wBCmV~U`BG55-*knb=Ep9!)I4cr 
zRQnVq)HTJs!;Um|ix~TZs-MBCu3yTm6!b>=;?cD|rL7O7I1&7%w%)R1G2R%yuI`B*^_f*M+Kmrjj;h#z+&3CyoCZ&A_t+)%xn5` zI0`P0rNE4bTt|=I=!mvjM>_~CYo~Q8)0LYNZ!VxdSy*bC6@4wGw5kzT0Ex-NbIRv8 zlf;<#8S*}juV|QtSm4y(w;q5Ya(zN?bv1R`e~2r=1(i4)KQLyR;RRsj?d14m3l-B~ zDP(>4ksA(F0A*@VfE^{Eb{Mr@F1A4p$e6z2c=^Ksb*dVNz#N+6l~SzR=kBfQlA8b{ z?VU$S*ibo;n2yP(=?u}o2oA}f3Y2#-`V+2k-}o@Xf$-DLzND(sqz|c(4evm!Zcy*c z-|GQf&05|d*Xr^j(9W1`%IAsoSh(Yy1MP(5Yege%dhK*YndhlOYYg6pkzL5_FS_z6 zEXjRXD%==qjagIO5i^CD5{u?UkjI!tC#^TJt;eY}JJm#^cEjUejqQI>vSvS^SmpOk zxm;13!gg*Tpv+445ekVKYKePWeXE!$g>kWHKHh%QX!<#d*A?}s&M_o@J95`oTX?Lv z?ZaWN<;Nmp!jnd1i;`icl5E%eEogF|&a_xGKX1Q1@&qq#Os~4+ZZR*>sH$g~In$l@ zBL2S_yoHJtc{)UtxL;)UF`gMa3NJsMO{p&YSzYc2A&Ufa}Nl%Wj|I{J!8;`TM_$X7J

GaE`HE+xh;B?T(w zBh^i~mPnJvgdS>FZ6XkIo6+dy59A`!lduh%zyK zQ+z|7!<49-+UREqz2q^PflT8I^t78li9t>Qy!kIgU;2t*j68JJ)0@$ppg0QA|13-P z!J7dz_oQG1AFMq7Y_wNo<%#`=i)G`E{1GvQ8nYSxwn{ZA&cYSg?b~61hc~R*UZzSk=7-Sipm{t{^jAmfvMWq}WJ)bBM zF%n3Kt#k+*t-N@zYCr;qYIm>DOY5e>whj!aepTzAV8?5|G?ZD~n3aVSE#7Yg==@~3 zoqEYHgkukl2%Un!)ZIoZOQkl&J=muQkGNoi{p&;4&5IXltf>&*8r2#a1<(YtoH4=h}Gac!cpLXVY%UY?dqboZjMP&$2jZJMmqw$+pj!*#;#&7e`@@7_Cnx{VB ziA6AKm={GJAzhj#5k)#fkAWJfFT zL6XiZ#HYDmVMIl&_1K7sa?G?(#z6ddB=BWnFqgr?YzOOLg?$OIr<(Ce&*(f@?Ja-8 zWm}+uU2Jb@pSM1hk;hBou)@l8L)Rv9d%`WI7d>_Z8WxAjU@N{CfYUS6GP&ENYvvCn zb44=CQNvVEvxeyUH-V8QX}WFKNoa4@jf*R;*#_%r2@~E<;xz+Voda+b^_Y(=0F9I z;Z&{$%?x!A@c{7y_t(nzl1UgyF~2S464az`*u?DmwmU{QNntZzaE`Pcn{u;kIa58l z(0cud--2TXH$24@-u~yQ#!&_Z^Tvl#m`@R%BxVUQ3o-pzeW~BsUinwIM%I$q(O)#F zo?5ty<}L#_uaT)Tvrx7(UjyEf9yjD8o?efH<5oT6izdT4D|Qj^uQAD`BC(o*(rMwe zLceIh_0d*m@d8&Ca(Jp47rp*F>ZE^IqIEgL=DNY?^Ffi=yVsk6LZNkypgl=c3j7a6 z`ra>Zn#b(deO;HWD8VFL(@{u!cjP#*@%SFUZ$6sih4bv5E2Y2(gjfbb*lZfeU7%Q4 zHVOwm*As}iLYaQTyDip?S)1<2O*TD{t$ERUu{v_-qfP3B9%%s=OuGX1PpUdZaPX-~1l`*d92~B?bysMob27H28OIoL?BI^Ec3z+ApH<0e&GxbB zIGBma*oYS}ncG`nYUBo9weVi~q>O!Pr?8_K*(o|hZu7e0?4VrS-GM)w8x2(HkKUCL z`evD%H1yf`J4Jq6PCkr?=S}i-So<22O(NTVYWgt1ebI{K0hfSsun|A6%Avf1;BVJ` zZ`ou>QC>s;aRc1p8R~}%(z!YKgLJ-;7^}u^XM}x6x2OM z-!DCX8QLQVDs-%KCRzTskLumC>(hKROXWqrSb7c$hRsgBssEW(-Y)-qC&48_neIER zCC=v==C#XQJo|S~nH0dpE$FU8?dMYqa@}1A;njN7oYwkqEHP{OcfVamihu@z1~GAb z+E0|YveI0uRu<3x)SZUr*h1+bdk;rSZwym?F6AY@w2^Xr3DHgFvccC7&XJnf zWl7zkwcG%->4g}e>jK~GbYILCQ45CuX1vAgX8m~em3IwqgH2R?go-TycJYYoBO}ZH z)SaUHE;H|&g@R|Nu0(pX<~pHx!51F`MVpiFek%(SPNm*u;H+=}ZG+A8vI1Y3o@&leX@X4^&+uVW{fEwjf~x%{U^-fFp~k!w@uw5N7oA z>A;l9bY>O`7BNkp;PLRR>ZfHD#4~1@aO{$ir5CjMHV<7R-U2i+RlPmQhVT^~e(d4) zRO(%cIpj$W2_@L_0{>qK+)S569Tk%xaZwdC)a>2(dv3}d@jB9iG`EcaRq89KWwvza zaYFkgp`{|BY+cu~{#$<;Sk>RTS6+w=qRyE;{7=nyXruDshEXO5NdF0_;jn7p-Aq@J vLFmK!KPvBH^aHDM@KWx;7?F z%Y`0(G4sqbPwc(td{$AC!9XQOg+L$}aC=?T>2qW*^;*olGHaZfLQe53VZGXYd9d}}?&6`^9Ida`M1u5?!N9TH8vBO 
z4bMAo>0wKR-8W)Qqgn7>%EHz`3?{wyB#^0svAgd?2e9%EFOrN}>)C>_i3Cy?AXLE% z%qFgIRmOV)dgB@5}JukZY`ub`fmRh|M@t8G}>iod{;ITTvQp9XjRN~xY z$=v4WmYwB__cp=UAz$+lLa@O*&}lp4p+AEIXd9kGe{y_<*ZTKv#Motd=oq@h-hL5B zR?TufS{%H+*gH8r&5?;FZ)k4*-d1O~AP7EIJeaNB885;a8rr>89^>(SdT@&(dnH|> z`Q!NNYFHUZ2cj0BVuXmDo*7A30(*{-0Y?sFM_8dq6`UTM!AvC?kj!a<==X3zFf4M% z;&r*dQrSaJ!^!!~pMS|aj#;aW!(mwrpGCVPOFZc96p7%+7v@80+2+F;Eq8~_-VZKq z#mu+}nsA0WuqCk_7Rx>zui`d@{a3_j!nYO?yZkZ4l$S}55gZ5|`jppzNsy_W^tN^% z?~bM>OW);MrWBBg`s%^e1R-Hrot&JcG&eW9?v67mj`18si$0F%>Fd`{Pir46cLwP+ zI&IYa`Xvc|WxLiLsWVe$T0O6fxvN(zj_{W85$PIc!#^asrF~1q8Mm&=-OlN>{no0e zoz`9kSv5>_IM6#hPiVyse9Sy($>)xp@XU%^VL z`u|B}&%8bR#;8WfZHAt&kf?N}BOw9X+1crRIb#kPFMSuM_Fc(wcf9E1+$awQ+&B1G z7T7!+_Mv0lET8=+q_)*0KI)8Q99<|JNg{A#no z9GjV{cYK@sGu5}>Rz5EChA+)Qu;3^b)`TYz$RWP1JzQ*JIfspg0f{O>Ya!Sn-y#~H zOW=lGU%NNC>>+ks>`lRJPm~bk%g3*Fg`lC634Y7V%Nw7{@O!)&cbO{F4+s9!{&>r* zr1VJ-=MokZre98UlS#_)kW5?rwxgJOgv`QH^4~U4ZP<9ps*UaR75}4DWF=FOB)t4)mpbC><_A^+fg|wy0G5-{x6gKsS#uB}&=9P4xc>98g;k7K0`JWPb)C zg`||!%I4;$&`>0Da&lb8x3ptb=qM;CpN?s;=<*a>c~VzA3zM7xgGVljk?)cSYk2_!p9PyJ6H6h*Y7mX+nhTo}w(b}CqUs&0 zg<0iN!aCb>QxIxJgk2)GL>@HW#o%bjqM4Dy+{=lUE>l;e*JPQ#D*pawRxjWa{(%+T zxBlO^p9=*L)n;2^v@5(dZfnGB zdE`ffZQ?)@f{iH&-hi69-_0roqbs|_?NaYQXQ4<35glFhqFhc6zqsh!)LH)0X^Sf{ z%?Xyq!Ta{)ZKUjzUwg&(PCfsoJRj*FFqwLGuQTZv>_5PTU|&js2a+l%!G^4L5RV4J z>}fpI7|tN%%``&xDOsmyr4Qwmk*W5RKo*#;UoCufRVDBE8_r7>E$b4#h^-)fC};E#KAn72Pvb9P=!4OTX_4VuN7Fj|mbl zk-?lQ$Zi9ZqG2{z4H6eI?_%{=Cib_zsPf^1v0Wv=69&|tTMV@-P~Qhdj5*@ow%7aq z$=dI|7NFV>29v-|+xO%;cT2LB^2tW+$vZ0Y#^n`vv6DqXrm}&iI_=OlS_w*(*Yy$J zoT$PNgw3AoheG<()(I%aIM7)U-o_$FLrhr=5{nn9*_pEC#Xh*WS-6t6&bDFmL6`rO zzL{M(Ia*2kwX3+d>%LEhEY*G>c<}lg*BxTH9%Sxy+J`hUAJ1FJTUd!O@0exzu!t%W zi&FN_6)-Q~WwGK?Ti>w4I6j+JnmuVHW0Yi;ER%Jhlf10&-TTfqgRIBC%!eAIH4}f| ztHe{xpc&;yoAHen$|+4bDn`cEWnmz|sgosAqyM|7zg)~#+AjlVU*yfl^M z(6-uTBP&ccd%$zdd5(ri3GEQ@qBrpY^m*y-sW5iHk0j$IwS#cEak~~9VX$KGZEEQr z2I*)1Ua6NODG`BZpi_0mzmT0-IQ&7uUW-N)L=~Ltgh(!Vp(2`gZ(r+G7W3C-S#+7w 
z3O%t_4`N2m|1*pf?DLi=;;9d;c&aAsOfipaZM5SH5(KK^6H=o}dX613eZweJs7QY5 zM>XLy7;@#vePo+UaO_EQ_*B8d4pImLDHtLd)vI^-64Mz}Nq&o!?hRHdQgDVC^Snmw zE=z3=v}I=eVo85c9Evr65NCRye%5gP`t@ta?{I=5y+*E~p_ySa5`6sbKm@do%`$@! zY|Vt!zWg7h@K$B6=9Hh-t=Ra+MU3}=T|v+c_Pp4AXM2Z1%+q^P^~!a>Lg!U5_T|t1 z{%|w`HkEvNyteyOg%n=vgdL|uHvJsEMyG~`^(W8$q4Z|&)4n&pj~;YvY(w8zl7B^# zY<y5(ohae5UrcN{ABEV>f9;T8&-mz!N)=ooaCmqKkAzeW@|=7k+u(X%oaIav9jFd@ zKZ?r9$-yH%$@VAsymVTp6&EL)Y4svv)#c;-C<;RQztVDW_irxJ-Q_{9&)ucXTwSDe zB=NK7&)@gfDQAkvr1Cuw37pah34P&kJ4_7z)tJnu8W!>21jbrgT9DDuOh7&J ztGk=)Loaq3l#^RZ;udmH?1(nuZqt6}Y>rWBcTII&` zO)jabxu3AfAJJ;9He44i^;Eaw>jP36^##!EI!4wHCb=w9yWl~b_d3S4N(moXUkZmT z1l%UG>><^>M)fwOITk$_57S}YewFDWg}<$dj-lqsT^%4bU`?8yTvP@;qh z(R!ada$^2NdD-4!^>TywTI(%t%h9=p#*LjtHhv?@V^&L?Zhj<07>#P}@5m~1t5+#v z%2`;)@Yq-bkYc#W<`7-&bda!}KFrnG0?T5Uo10V9(0H5;@;8EF6GpqvhREycF#k<5 z7pbu4H^;T^m7St)78dOa|H8r-WqOTWktFA`9T` zI1-aL?i49x^wv`uyY?@>bD(X2=4NOgEDBGN-+H-A%PVnp z`{Y0D|MXfh(`5gLlmv2WHpTwqc5d`&jue_w+RIKbn?}zIrmZ3U9G5-wB#{U6F9ht} z^Noh5mveA8o2liI7J;VepT3AOj$iGPapCu%!kNLL;19=355Bu*wGz=-p#*L_9$aiSFRkM$T8yzQ@WB&PZLI}aTl9ZU5ZV*}ujax97a6U%=uwWE>nAFc-=d8;A) zIGU@EMxyW@*)AABw8|QKRsPju<1g*Pky}7thhayqG{0<;%xECBE{IF!yy} zWXN!b^+&~HI_@U1=WstZ0-;4ZQ^dytl)4a5e17-!;bmc>r~f4EbxFuOsAO~6F4gN^ zC4T~YT>K0=th#TVhoC((?v5a0(WwceQ%H#Ws)DIC^*n-@H~2G-;nwzF8gwGAV6cGS zFWX285h@U_VXpDq(Hjy51u5vC=8$zEVV3sXAQ2fM8YIB z+8RpVpKs)@t=QfxZy!3{C}GrZB1S=}><|!ivFiBsna0uCd80d$gvWX^qS}0TwAsx* zHnysXuRrdu7;**b1Wx>ow{Y?+EL`3y&Mv*@dY?(>8l4$Hd%GIW-qx|OK;{oNRWxX5 zP8Yq7IK3`ya&mKTuGgZ+eh@r=9vB^s^>BB^pF?z~;1{~}W82@|)f-~sk&$m@|YAia@ z*jvObEM6!iuo#2sN%QS-jKK3A{=u1!YI=RMvp%(;Y9f6kll~LZ2KNW{!GA}ZPdZ5< zkU55iK{;bR(a_hpD;Ndz@Xps2O*eMnjqvdBDtWTcK%>%KVf=Htng3w3b;8bd4^1 zn!8o}_KSJOKR-c81wV#WRB#|-k|KA2-HL*S)^E+tVs73V!qn+Nd2Hu~3QK<;tp)$MnN~x*~nHw)I`zzV7YSk-VfNd_0rJYI*zPYDsz9>fu6jQT#$8 ztFANvSlqX>LrPu!FU;VbSm2H>YU=&Y#>CJ&^pD{GQTjWTHk2 z6iFbU#U$jEFchGYNysgGj8)knIJbj2JF0w)D{Mw5*f8A!G1vL-l5Ej)PfbF?w|inD z1f(UEVs+S`q3vD%FmT$kxe2|0G 
zLcMspQCt61Hz-8b@zh%x#4rRb05(v7X1CN9>ie*`+#3@IO(+vlB~qNlwytpfiQ6rUsXGRj6hW2mLLFiIgp!otZSK7!ER7re@N zw#Kq3Xd#3n7F6P((qG&e+u}mk!NPb*0WP}Xsk_kZW)8*-Fbt}nMs8FQz|}x^n`$Ot zAxzo9UOVdP>+a!E>9iqVpp-6eV`H;xbY(qNRdUBDdHG-c5eFvw^DC7JGAeQR58Zkq zUMpl0zWX;nbVO##)8H867x-;wCI3$~@%Z=%s)^382jN{{4+2b$Z|itPjmK&{!~60(Y$&`XYNP(D-elrh=}@*SAGik-10;VUlP(tM~IDarIM?J?aMJDLhtX6 zi=h!^V3f-h4@x=TekLpV%TQACKHjbf#MQH-_NUc@14V#EoV7wRB@Uww2*`s2HI`!z zF7f5EG5k*J$Ylntf2It#N?#s>{2Ue?{T&pV0y>W%6X`WL1b~%aTV0JkIK8_%8vg4V zf*nTn&sA5p?6Ag40=y9Vc%ND0ZHG!0tOF5mr-f1pQlcatJO<>s>e%c1lzV5* zLs*Xt1{q}IL`~SI&e~!#ep(#jx8vIDf$A0nSF11)GK#Xh5h*y<$PU6-bEvRpS@e7h zaOm)loO$sQ$TF(17Lv2xK8#;rf&!G1ORI6NpS!Fis%3P@X3_0R*grd_vGVO0NRw1V zW)H@`ImKqW{;LxAb+f!MVl^a%LFM&@Ql!M8HlD=%+**evSHVtBz_a!8I7;M(l$KW8 zUic#kWM)Sx1mVg;oFR#Z(a;ReDW=hc4kYtZr@!jPc1fV7LirX2Okx@P~zgR4?lVubKAG) zkjJ^)DI~%q=r?}FI&Cdmbu>dR_ab&DqvaOu35~*o-i2Tz(By9kMMVTm7<4TCxvOv4 zZvO1T&H=!JQo&<>OB2F@?dRpu#NUdu5&N&$mmkom2FyLkE4*p5MuVhK3(43D&$2e8 z4C7&E>#@6F;9;gXe9>PUG-Nm3;l$42d}*pyL0*|olU)k3y?KcVpE6#C6rkUC&;WnV zQA8$z{7UticA!fS$GV2|Xtz!pl3?l_DOE7u1Y|OXvrrz^EJvO@iLpAot;&o~7TGw} zGkdoacbxE3OGS%9uclI!1hNSRtu3T+V@)V|s+vm;;($D7n~B{;n^8z9^%{Y-6L%h5 zTvclCOlo0j=NycUSP0KTbH-?9(`YUWwdRfxOj@T%5Uv+GY6x~R8f@8D&cdqioQjv; zB6t#n9WiA!p0No-^Pwr!uaS)@l z!kPCGstZVvF?tdHWD4dGGj3@9T^0Nbf7#bLl_t1Xl2bie={XS`vrMtSUB8UAGZ4H-Y_5oI3rr=&Zv}I(XASRiN=HY~0L{utt6>VnJ?{DUCLOhrJU?<+# z#$bn}B3Y%G%MUkhn2@MUwG`kX>n5FmMhudSrl9z;H8L>SK?gae5wXLL3sK{w(7%5W zT=SxO(+W&V%*;`l-Lz{;dwi=BxzNZeJop9a`O_5(#a#gXChws7h5tdXR3^6TPc-d- zwP@qs@KAr*adYQq+h)Lk4i>fk4pkX#1xX%?)hM?e&QvOk4+%7MyE`0kX&FNV_6dT=nvXPVN**X7;*0Oa5SXOQqNs+Sh}&K z!_Fpwg(nTfvUJxHg*X>D9=@e4aJ<_g1v4MY|S z0)IC8y8aApgY$HFa_(AQ30#SM+0^&q-5l%Drt#ak9W;grxWALdnYP20t=0g09rEE8{`egWgeoS zVhbbqT|04Omi||rPvQ&40E`U%rt`uP%q5WTYmG$dGcjAJXB2&jZQGhZtt4s~2J3`s z(BYK#DI5Fd67u%(nT&2MPKj2}%Vb)?Y76NlJeV`-gMR>SZ_sG zr%dt0e7@~q8)dt_O$IAxBSUAz2nLa%9VIS+B?R4T9KjoeKL_Y_6Nu1l;z)MAemV_} z|J%mZ2r!8;oamRbepln9S;4?nsE=5VqH4X<-~^MW*1LEN0seXGJ6E`Wl)5%18qX56SbAZHg!RrY}o$0 
zAz7mvXr!ssr+&(yJjgkEIr0ZOWnn2*!zQ_2ypY_gG;3RmkzVj&3}b&I&qZ{6ufBPd z1TgDQGzcb*2_b3_*xz}XSObH=&0`9L9TFnMgb)r*kC6xPsUdjhsJ7Asi0+K-BxAW7)X$%2W$eLqyV0p6Z!P;%N-a4}!-QB=ry z9};2eat@yh+gOF863Jke&HF$7Y*I^7huO_RJ=I6V%qhb^+zt!25q!A7D_XhHc9HnwDQx7gU& zu<-Do;7y zgxRooW_}K{HyohaFVN;R&&f4Wo1-V4d+?vEwbmNc1kFAV-KxFGsX2|)*2k08-?bwn zDnCB%j1p(nTKtuG<$LcQK+0dDX*<%=0?SjF&ucfYad*4-lus%{k$T@4mZmV}>~ej) z0xRHp_#x$oFrBDl<&5rDem0!<_^AEAH~Et%lQ<@=j6&AF{Q z+ozF2RRU1*7f55+gRZ#a`sAmhQ!FO4FaXJWLjX!w?Rua?ub3hU>g|Wy3zOxJ06@YJ zHDx~BMUS+2D1#nMF<*Y;U856zXQzpyTjEV!+LiA^ zvhM&c@!@nmxyEKDG*?FX8SZ$@LUZCtFHI!ILNpB&#ny9jaWNe)R1FRdRV&eC-5g5q ze~l&l3%umC)mCu~8S^owJF>aTe7ODN(PEU(UDx)AxLGZ>#D_Awg(jPYrg#7eId1&H zYx=m;^_A*%SX9)|)&GV5gm0a{;9me{jUDA*%}xjB&JK{Z7#Yw(m;;{c{Aei?Kt3Z) zF4o=A6w2>cTZ?%=9E|3dd_25=4!Y0a>S}Ja?@=pLhVA)!jiZS5ddMplfO3VRS|wWL z1x_1(f`fxcYAn?rZ?-ap?44J-IPUK=-v9h`01tm%>2>7D6~^WO ztB{GK#|Vqi8wz_iylIBR8W64W8}=-R=$M>?HhK@O=Pe3v{+NS$IxWpNaWccW*n0NT zEZe!dd(e9U9}WQvgpG~e0pL|R`9B_tYjgES93OfKJ+FKt=jw9_`R!hk@IMBY+t;Sq zt*VaSY$wmR2k8}IMu?FCN_KtEuG4vI$g;Asawnr1EO66OIlzpJlfC*@?K(F)eDK?~ z5DjS0JhH#!FdbpSwXkFy!A0<%{>0fyV>4)Ruk%|1poA5xZrwO!aBNI1Rn*V7pkKeu zTi~KBquI@*M$3SxL_lC-GA+%@n!SZXzS;d`RQL*D!m9&^_@pQ;v2t@1Qi`9A`>=T9 zI*g27e2sNx!7AlGG{M0?b*VR!kO)jjN=jO}0mvg*zk%pP>!~tJtunnnr_`BwR&5Gx zo0DedT_JA5i;#q5s}E_9MJznWvIKDRZl zm;3Kn+1S_?oc{j(tK;q6f=@zH5)$(8W1@6MCrC#plK4k$B#GOH=1jA}l$|4dZ-NO& zqaxCs=#ck?;hj`{<)#HNJHv(xDm}|DvQ$rp<#ThhopAR|8*3lZR_p$l z^DGkk#(965lbGM`4RB{DmQfJ2pbVf!pQ{BKu=jVfE{M4Cm)E}qm+YD1v))pytdN|} z23I#NS?5I2DKDHIs61Vj=PJ1}dJHfZ-f)IQ%F4^H0@6Fj_u*Er-B z7tF%&o~R~M_Vs*D>Z|jhg`?&p5s}AhlML@uxkuof;~NHk4=&!`-Y-4dRKK||(49hYhtqz2YcO=#;nzX6tbXb&*OA}C_jcW&%&Y8h zzA-nR-OxZmR(6C(aZKdlT^`6ue2!$76h}44fff9-|iMOt!9HHpvW6sNuIa=WM>zQ<`;*>vEDWJd!iK zwvSY$NReew7S;kr`zswqS_Cr-y8gGBK(;bfOT=wv0fy^#v^W7E?pnX62dy!l3Eh+P zG3D=Ub9r(&tRR8>taBc5#K|MW1F{JQSd&=&#z^b<}STh1l!f zFQT~|7NL?3uC}s?E3U+?s%1&i_s`6GD0r5!K=TTSoS>%tf-f0Vo*>_{ME1eAO{lVKnNNEkh`FREy11Rd1YScn2n}t zn3mjUqY_Q3%LfG#ttN_3fb*#V?o0EB&hIv#yX&n?KkWvNh>N{w9P+lGvq6&=JzY8L 
zTZmcR=BD*g+$zNJ&*$YQcPvu@%A*9(^_d1on)Fw0-Qa0chJIu!0a}etOsu7r!~n$n zxHmbtz0ebzkiaOtSv~F-N=$PmDpo8v`&G-pgU*F1&$K_`Z@Zr#07_&401LJ}9W`~J zd8W@Zhg`=sBmhbcIS*eLJv~0aBOt6C9SwtA;LLl>gLVP2AJ+~U?FbSxbLMINB7cTV zlJG8|8%SrKfFE22z!s0)eDGo`5?~u;A8s|mJ_~f$FCpTPkm#xm zdZO|Qt@Mzxqy{%xkcJauS&YR=zB-)`^Z@Bez`*N(x9xfr4WMQ}fzHIiMe(i>>~Ljb zUaN0w0tn8=qL%tac>b;LSlmb&OwWfxI41$C))|EKuIVEq#CEm@4LnUqocQgYzmiE=Nm{sD_dJKA*gunHyep(W5MBtjvj4=lnW#y%E8*Gg)pSU z!%@MUKsDA~rr-S5?`R|&j!t3G3sVm;S2}ezpHovwKp|uVQf!Uw96kdB12K<{(|N+0N4@fW}z?nqAvnHkGBM(^eQ{i*KcJ%HCJ*_jxydJBS$BeTFy^+k*gH z|MYkzn!u(n|2Ic!ru_*SOK2o2j=}zIvAS&vj|bt1`(dN`4nT!Lel8Nm)F@WZ2C=V* zUv>Q_ullVS4pX5ZrerQ+WI;qdG4g%N5*dmKsgV0uFr_NzEhRwejpakCS9f;gjEpGl zTQBf|m1NViL}+P)dwXR?p6;FNY-YMZNp`(kQeHKqQx}!f?J5$yYxxo}0;Y+!-2jg_ zsg+Je=%&$03~jt|pr9R@OkZ6cS6SqS7NG7qb_J1*Cl6GS*HWR>9++A zlz;;pOd4!AKsdMDY{IwQY?{9}S+iKY7e~Z8RMMN*cio?s1fnP?vkCTs(c}G%(EagG zf8u71rW)h5av@b-qql8E1X)U&PuX_*i|KuWPAth;z%M@@&I=6h5wIHw@%p^9UlbB~ z)Lz(I-q`pJT=z^zcJx%6FpxM(g*-k*4&DszL6yKWqvL}*haqW0&@R~nqsL5aVU}UL zzA6M5I`F%>!0m-s=<)G^yJrF|+o7n0gh)0BSl)ol1N3?fn|{;s(NU&txh45Gi3PHx zU_tz{+(X0&2{Npb*u>NjruGU3kMu;<*TnOxFpn~Yl#lO5qZleWP2@og2YDLWc&)Ck zLO_L%00RTFgBUcFCJ+bGSuR5{I_1G~7;Kjc5=G&hidGhEIdfr_m4Bq3gHr2BT&7B(!T^QJ@%>ql`|4)~$V-Q12pl5%GAJ|Q5fQmS%LHhwcR(Tl zhqPIX`9e`v6n_&^0i)6U2LUguT-nzaj;S@R3-4m1NSOjPZlS>u&i|F$VV(#`(oN$$ z?|~D6@LmKrU!qZxP!@LZ8HWK%5*WDZXV!3$%N4F|+mCjK@IfSE{Y3l)>mKP3>aKD6 z+RGUZ>TbT|u}@K=cC`60%(UFFtgIA|-tJB;t?lVP1B@=vmxTb4o8SEj64))qAX5Mi zo5wJIA&yt-;$^_Duwl3$z7m6IJ-NmwV*L&#i!gSs@s#E>p+d9w@{Em9sU-b zRW(E$?(|wD^Nqn%z&<{%M}jTmC(wsjw6(SAZ2YO^AxP9A9v>AF-m?zHBvaNm15<{U zEj)&Xh6fW`h7-W1wSZjGadvT;fc%gML7fGH%_`xjsu#ZL%1Lj}068OKksgDdf7)#?~=#y0e$6gcNwJakEfu6;mrKQvTZ zi$r0QA4&E~(+fmrwdl@p0(L7vU5sXUY!caa_`~P~A!6`|ogE*Kfb{>(@3%cU2f=hl`PaKVfBl)v@%sK1R{8v6&a1 zl5;#=mt*DN;F$Y!$XJD37Hr$E72;~$OAkau%rc!5lRsi~4rXngG%STfOiix<8!Z8x zYF;>Tn6`oGoR&37fsHMe_u0}(HZYM{O$H<&b-Q)e&}ktK7l^V5~w|xHA^er zmykt6#al52mQeZ1TDSg8^A$D#0svdYFsHv$- 
zN1UjFX#`z%MF9gUnk(a{RQX;ing2*(bKoX0mM+}9y-EqFMrWE_{!xHnn9$~q>h&jV zAbt!3g>xpLImy-aEw2<}mr9wQ@Z@@+(V_@V1+R*!YB@_e!r~8KSJ6wr8w>?c< zs0haGXsW^I)fdOLPt}fV?AN!N+%^=TPo?AHiUn!gX}JRi78VxJ*j3M}APVjFf z)x$_-<(=~1Rg>>8cpp`WXoD=omp=e0My3VQ4M6SQHDl_}buSbY?+|L8Atj&irm_;h{))eAnEUQI6 z<2$RI0@72yW~$_dkm5WtJJ;8eQ4>gCVjl$h2zb963JE+#!SbHnifKd3A@Uafv> zIj9zxZ+k&&mW?^uGEns(e%j=>dfi#vP~TQ)vO>vZ^d}O(P+*QNJJ5T@eO|@Tbw*ex zdy_B{VRp-b%y2yTNHyToW z{v(aR3d0AEL-=p8Bl-Vv91`~P0UrGQoT!v6tvkE_JLHV+BmYXWAPaaS9i z@b*houK%8?8W$*sw5uWGgQDO$!kMJ6%ef(OW`c9cY9l88-&+U={u}zEKsLR{lz;u3 z>GnJ%vL2+u_La;e=M}b{PKklie?vaVWYx>ZlNt@WEX(?5vXAI>T$w+BaT&S){lCW0 z?B6Uc+6UzSnWHTkn4>xbOIY<$^6q#f$YPM z3je)q`u)F4>V_x(yMzLIN!)oBT=m4i8QEqE>xF=2yPVPa?^&Ii|4mFq)TsHt!LjN7 zH*nV(D?ycVCWrKMm`O3fRk6%?3aSi~v%QgGn}+@iZ~BuUvLRO^D|Mu>tL^tZBtH-k zsN;VCBO}u6i0hi}f#-P%g??etE`Xa$o==!B49!EbjTAO;@qpaLoM*9mU|d{HN7(*n z!PI?V3v=bxr&aQO#(!3sGahDu2NuEX|F{fCUr)zRD{P^=&kC-JCMfO-Yz007J|Q8^ zX4V}I+N`esZ-SxF&q-I(bS!49|4nhzIk<#H4l8AN@jsJbT&4i4ZP|!rn+^Y)aY|2& z5+s(KC%W~&k+69wBkc-wg;;05Kl<;2pU;}h=!CV8_l5rr>}=W#%>UVNC1$_szh|*MpDe-2tYTMwQT>0xTQ4~b zt+O`@{|!*b(754C?0MD&^txdbZAB1@z7(-Ky@>)-W#-ut&&^%qLbO-57aSqD&PX$) zGpILkKK+H`Qk$C2A8-|bA?7v?W68>|3RCjq#pm75>V(CPC;y*f*5h)luj*AT`E*rD z9fmud6a(T5VfwIgM}sGyD-MbD2d={4)~;Tbz>%o-?B;f4la}HHH91rvw5o26qnhzc zGTmj%|Fu!19@y$^YUt!rqy4g%a#;+6U$|)g@a2P7-%iyga`TwfOl9Va8Vl zv5&jmn&fdmZVH?&UvB!5<43wK90r^$ETrH^bkFpf!2$+V*L zJcW4z0*!jVT+5`Hv)PDkme1s_!eEQvo%n2Zdkj1Y9bQ`T@CgFUl9ubXI>8$*b z6m*mM`S(xmo6Z+|Z3m@9Gp`&(#X!C6;5OWc;Bxru_ixABb4wsrexaHpS=Zn5>sJNP zTaUZWgdwTjYxdI=ohUcAji=Cz*819t+KS-jp6iHJpeWg&y^=mtCY`m>X{Debl(`T6RWz@*VhL%bN0*uWZcnh^%n|* z&0g~$`$j0xjxHY@L{>XbqNIRhR)A;0vl=f92oHChIS+=9jAO`c^x!7P^ONRPnc77d zJ|6F{GVLz}0=B=YgSD?iF;&5#uZ6E=jINSWP^5y5D^iPGyxN|rI4K)Kxuhh*+r2H|4FrP z>jOyK5m710&g735Ks4NQf1}{&;*#xov6?36QYvS z8M%R2mtHwz003UlQGj!4RlC0RHlOnh+ouG4@6ma!J-PiJwSY8wFHO*NVgZ#I6%bWG zYX**@g#sGa_j(0M&~@Kp!pES&AsN6ep}oD<-&o$oeb2D!3Pf1ab;;~%0tY_-ygF+6 
zsVw>+dA2#|anyPRPS{}`&V>y11J16|p#uxB)3k+n+7qoDup48n3k~oC#tkr(*w~0>NBhAP4rnZ7B1(gX|vLC*Q} zl4@WB0GJLA9H3q40_Q;h1^pLL4VQ-t*3%-O_cs0&_W6AOh6yw_X5)oaHuKkLfGg*> zq3H64GQPtZ!Zj9u1A$%=4X{{0)mEK1@1^7D7s4WhWbZBm0K$$c=;Apr;rFi6q(>O$CVJhqURjFCsFW;)nsWOI5rA+PtvJ!!|$LYLMMIs6~F}R z*3HSL@4LF%diwk8n7+t5n9{tZeh8)TeaSm&2sUX(UHm((oq=Qu)zTzhg-2vvDE$i# zX!Zf9Przjt7vR{&KxPc(!#shfN&RhqNiYVx-WZyl#nx9^&FS;4B)B#H>LT2gT~=N&vk&KBs_qH0E&KSWbW$fa%zbLYdluDb_T$TXHs7p z06HndqW3bL%hXB7{<;6x^`6{DFQmpvcnt!_%J?gj8mGc1d;9Eyi>iXhLX=i#4HM?} z*GA{bNh?2*Dglf#|KYs>VnPepHkTn1uygiza2887D zJ_jJ=LBSZ{F@IaV+~gC~6;UxT6t9xOaU!r(I>#MwjVbn5&yR}^8$HFaxRk!1G3{0t zM-&S__;tKdALK4sJX)+aR|ekE4b^r7W)~cHcz=7o1DKG6hK7cZbZX#zP(&DrH~>oZ zlqBp$^!f8=bi$EH`}Xv&bCnuw9jfYC8D(;XH>6YVgtlPrLy8YE9bgC{-f-|nt7LVP5bcyg!3ElX^P*K(%lbgr>^&_ho`Gdfsg1>Qc(r$O_h%|IIfMK zT`P;SnjA03NJ=eiW``12x;x4m^k`~GD1JyAwbdlOc-mzj{c^`Ar7zZ;yE#7YHlZn# zKKJRJkQTW+(lzXbpQp^}<>W?wjOW&HCIbr#I{4ZKD7C$_unZ1+EQ3tJW0y9y_fG7= z?^U2D1>-KvaP}fap~(l1;;Lo>rot&1zI4?%m*F>$tr)woqgdJy8<@@PHWK-bx){dS zMuH!=JAm^44+tB*AUh@td%1qn`3UmUCcqSXz(KLwiwBRL{HxCq>Kj46*T!6cDzUJ0 zGemSbd=S2IGta^*-Y)b#oX}Dw5J|nThap6{5B`D#5rNx<#Z%=jz7G+6Y)V>mt_3I6 zbZV`nqRE9vh5djE9ZJE0ToYPe&WUwvH83h`KjzW)-oDW)=! 
z`RXQKNcQtNIKVXsuxfV@YXw~Q2|-Hc5e4M@u2USn;`_yx-z^WW9Cuflc}nTTH#FegOQYvGZ?>sSiG5(&eXR6`$;&MWd$bjTnW51LEFiADAxk15%+L7t^*awd4 zLq`ZO^3LJmXcGQ$7G(RD6GT9hmcFb01d2=!yZM(wLQV@YMeMOVQK%x!rd2k)8*KHjS)vJMkTDg8vt;~Xg5Ib4Qp9FIfM?pw)_evBM8Nk|L;~W_?Wh}Q zXqo|rm)Pvao8jjRNn|@%886Dcu?*a3Y-ZSMxdfgjqOo{NYA= zdzmG;T;mRNPXJ_lUL7ieKmx7e0kF3TLdi}-qsJM&vhW4&Tpdv?dZkc8i)uiFDa`Lg z6}3(uI38*fb|cD>XLv92++kdz3?*)dc?>g^&0ZU^AT?)-`egv_Sb7x_&Er!GgY?v3eZ>@2IUbV<75A@zjgs%kpD9;O^fb zMsVSZ-h+(`Ej$+&7lj_L*8s~N`8ynA?}v+o;{84Qu2mrdQTbQ%+)dQ6h*?WJ4^Pk2 z(|&ffqPKKFsF1{GOAO#;AdiCV4yE=d#(~&Ez{!b%PJYi56T=)qTthBB8?WLaCfe0T zv70vm=d#MK0uHbUgRdR=H8+>)dN8XB$a&wp1AYiN?FPlXi?l1L*;~(_17y<#2-84Q z^tayL5T8x27aYX90Wygq!*&rsq1@OGITzxnnDxlNQ`fp(yIGJk$VYd@LP6>5=?Mjr zLLj_I7MupeOv3N*C!?-lis?FAtgD4B#jPQXNYa``G>s3(FNwCeg%R(Q?$3o;0pnACV04qrB=PRszF%uuQVT;dnzQKuxvznIR(!m8rqJU!V`LvemB_J6i@57w8S(xotJ@=u-wH_@T5?0bk7X$R6;y8r z4yih>bRo3en9*~a-FH`-Xff(e#wm*sfdc9wl5{%p?Mxk!142Cc#9k0;izl2{#H^!| zH~7AdhNc3twuBz2@s7{cO$0&d|JvJI2&@Ods?TQ3GxN*Nz{bYL#g!C6$Z1-yzHP_- za+^aMDSzOdyhOG+Q9Uf8018&nHb6Joy{OmL|FHIzQCYQJw+M=eAT1${g0vtZNFyE6 z9n#(1h=O!Tmx6RlcS)B>izpzC2uO+aS^IszGtL;_@AHiD3{ha;_kHbauWPNj=A4Uy zQsRhv(L+@lhW<#V``(zsNniLIHpYCNQ2@Rl_ngQ zh-ie^&~>>#E`WsJ_2b(QXtA$6m{+cUAl*-ui(pnlpL=-wRee40@y-tu$WPRY)xvJ# zP=0|-T%=r9{=HBp1ZDAoaf;+X;aGxU?onUjSH4E>&w8gDUi7G~g}gXM8b}4tsnD@V zIw1U7O>Us|MxKOqci)G*Ig~5&3Xan8K)pOKpwLXI6g_#)pcuo!{`tA`9V#JtC=w%u zeSCZhZZExi>cIv#h@0`wxER)`?!A=@H~NTmHxchg{IvRG;c?0s!@$?Tqb07l2X?LT z-X5deEA-Vy$DavzMYmexnBBC?q<#mw#?OneJ&L3q@frVR-#P5);lX=as9B~X(HV5( z1T=~X)|Ysj$vlE_1Q_XEot-afCBl{RVLa!xNtt4#+?KoSQUD34|RB_yfmepqU z?a$PgiUoAk)PYcpsK6UXm=+(6&>S5@A8@&{Fl%0#W@v%8PzI0;@W{zallbestiKaZ zUfHV4LbDATW|6{MuKXr{{!W!>^7;OIRHkbqM9wAF)G1$dlPDbP7KaDj*P%qtkPhE& zcq0G}T~}7--B=Kg?Rs;SYdMA*zUDvtdM4K77&kYff`LG4b7A56S>DL? 
z8lWFmTOcI?sA!d{=GB=TiAb}OUe}<<$HtdFe0N7L+ET!(KN8R$D6ykro)f^?7@WfQ%hRVZC(Q&=aCm%+d9Ntt%+GfO91 z8?9g_PvN1K>xI$M=Tkes)MMztgP>ZAhGwZs%FV)tDSOz3ctl@@b-> z?Y#d99a_0`y{k@rG!xR7UETyGGQ2b#L{;V^WZ&u?VsL|C19Q4rOB%I>@Bem76zwI& zmb4G@#BQ9JbPRZ@XDf5|mOFBftv@iJ=XTXSPkKRk(LNNYApQQJx^pMZ@$%dY9fwk0 zOA9eY+H6CV({N5M*Dv@FIjRWROtU}|RhGbXBd)wcGq6g!-Op~tUasp-B5z!?OewHv zc^kY`Kbj60dU~&!OqW`b3$HSDNS^AG~idP>!Z6S5WOIM`Uo|Vn7BA802Poi%XV|;=dtd;b3#|7H9T*p*=2nG z0@az;jXX7fiO0EgG~&CUh2z=bGsQvl&47xpRab~u9nc`qX0BgeoJCR!rUC0C1z@5| zi&07dw61rIN=l!@-wpFvbf`!H~o8?1Q#A| zaRe1T9U3d9@`{sqtpaDM>tK>GurYS=Zz+QM@9)_z00f~TR?W%PR9JC) zN~bv-vlU=?JM}Hyt=XG$v3uSnaehjasS=CKE3G_oSDD0{9yZ^4S7b3>;%=H?0*Jja zQ$YoQCw0K(xq#RHYuJ~Gz+}{-W!g?(D!hzO+KpqE4kS7qt+s4>n+QzB?QfS*Yu*UV z%8FrBrsV~4K$c?W;hc`3fIuN|haN1AqsOw02aLasDByd+Ib^o_8l5cyN&?^yAXXi) ztw7{}2mMPpf*buW_>ol9$jE4AFfn4H;JM4uFLj-oSZyxE)&i;q%jwe8#-;a}PT!>z z%dwu(o-yGyE+yloRfL8FmlaryI)uf%kxiU`vHi^pAh(3CUz=;}f24(mVgTatwcddW zD(l?*d>zh4Ss(UH@f{7>A+dMplvfeMaerCT4wWSrz)qu3*;AqD19)kw3I!u6?vsN7YSMpr^N@i%3G~<9`pOmn} z@`8ai+3&7|)eChpe46 zzxvV38b8;-NM6!y@%*Dlk6spxZsK3`b(p-vLE~7oCE(BmP+}Rj@1}J&5ZgWAP9ZE0 zdc_QBs4E?voGR7S7*?p?X^W#%SxY9Mh0rFea{oNsjQDrT16C13D+9~2_4Xr99?uuM zZ;?Kw#K->x0QVE`!-s(K!^QTNV}#Q(XDO^WVo>G>SI(Al7tk+R-v2q5QzHTkiY`dD zQ+ORH;W)^J+p_YRl~)NkQa=H+1LAY)6DL}zMsi+s2KBnYw?_bXPhg{-e`ZbnBX87x znfuQN|G_t|+btI5j;dbgPx$ro-7MA_Tz4%oY4i4;HcfEc#hLB*ICF$A1d9(`B;e{}0?Nl-&V#tr`lt-ej-14&9njkO_Tl@p_DS zBOLvj0@`xaOT+!DvVUr^^eyqf^n%99Dlu$Rdmf3U=Rk@&0eIJH zvJek|VBeiHNB}_0J_59&MBEwNO|yZA6!PXR8<&HTF-OiU#uNGa?87zhfItO5u{l+e zOq=@S=(>^j_brnb&A0mQ?H8%no(@jEeIOSa8JQb!bt#|56$YHd@Vl(CkP*u$C~zw_ zASutBL%UsAA*;a1cAPfcV)vHJ2g|eVOdzBPpKpBwG-at&PJ@4EGt|#1HQ&a^Z+_SI z{wZ{_LI9L~z_fUPkw(kUpEeNBFafF`K&pmQH)o_ik3T*rGxQobIXI%40v*oFq@K7p{PsQi5RH9^Oe{MhDG60wi;`4{V9&*WSW>C5SMF{e_VK_mN4Fs^L$>w;T zWXaQr@9S1=8N+&h?E5(qv_DLR(`eOX5dRd2JjnY8H412~?!_N#R2XDJB;o`!1+d2f z@c>p&cxtU^8M$-beV+K;(YD3!TAT)6%E4PR4tOO0xaaD&`+f-GJT{M9oOmw00&I74 z(4P=KR0@RV$vqeUTA)NrNlAg7Wfv?CxKv`A^rZ>9x{H<@r&ZZ3-pX%I#rY+C7oJ<7 
z#bVv#`Q3Ou6&-p-@_;{j{#m>RB_0Pa9&7)4&z}~FbjSIKor+52)brrN*B!0{zE4Qw3-mW4yYsCR!5v5nJ*j?F)E|~xWI&`^l4K~M z_~pGqmMz}aHK&t(Go#zUFO#HBe$|!OB7HmcRw$z6Dn9 zTvSw~-L?Z;IF!?D5JMyZq#@)&zE37h2nYzU{c^3yf|LJnQ-j>hW_&?cK{knn@uP}> z5`E2$ZcveNPVq4^YDbcA=s)3aqv_DSe%<#0s0FKhM}R0w0MuaIK zG2*NdKNyG{3Qt@&VikPn1FrnUsZy&@bLekL`#l>%!z8SH1aqE=?5_6WVR&wc1stVc7v8 z+RY~8AE|I-sDwYO?Lu!SVlYwn)}Ol);gf^`7qwHlThMyT?oa2fKzRlr9yoUKatgXb z-5Vm5W>I#h@>YEoMHs#WBXm4QAPz@yXFyZZOa+_a>Fj%EP5UpPHIt0O?EE1Zqn$bp z1|S2Fj;8{c1Ex|y<{O7B)3w~^DPMX+gJ;mIzAxqahLnQg=3lZ5et$bkmw8~zIrQ6q z>2=XtNkB&>!84JkIs=`xeJLU0n8h=>!bZshL*#1rx|K14Yk6-8vRHogsDXNtykj z_<8k9Hi26CIVBCvZLg!LB>+HP-y%R?`fY1)Ka8f@2jDeG#X10e1XCB``nUnb&$Mtb zfvr)5L2Jm{w6wt*D{rWB9oD9yiaC{T)b51nUgiTAP6W?yzh^riAz4pvCET z`f=U&%UbLY^HcZXKHQqiy*@NZbS%0&wS6`^7~sYT z*+iCuZqwH^)n`zMBw)E~lAo;%%913>mS{SZsVNU)4JbbLd%9NXv8!CH;rBjI?kZQW zapr6L6^>*ymP(CzD##ung|Pq|u9czW7@$zAR2T@FRNBG~huARI;G_w#K}3Wco0xbX z{@uTHCkMJ2mNTJ>1;~+vlwc5c@hB;y;Ya}ZT~j`UV$1kXca7y9W4(j!PS^mJxVeWo zM&$eV?-hbOzA|WPERN+^N+in~h8X3ygT4TYKa*;Bi}`VBOeB1qU`EI9ygBZMimKzX zq4Y$(do=rLC3j^F7a(hf?>uls1Co}pD`dK(a|u8E0P@BPW_+-(z!NP>Qd zi;4XMqZte=EVZVZFp4NBns66i7ZiZOu9Ve$%brUg_$)c?&I_ujsxqn-#^QV3y!KG1 z<{{{~%-~oLud(RU@r0*J6LvZLJmu}Xz7^ksI;-JfLheJ8`}QC}f>EobrD1Q)@@LRk zp)H*Jz~h;*F)c`@Rv^B~C9bNfsCe$L+_JZ~ADxOF`uz{i`9R^ymqCsAeSAt)H=iv1 zB745`-6@ydCGp+LiYc6R{gAXI;TvGmgocNMJ@8Q;nj!f3!WBaT$8YynvDS6WbleA{ zJ{IhG2$j#Iq~f9X$ujdU5`KJkb~eTJ&`iMpJY{lnlEl2AOI?K{Ne@8-2 z&S{8AEC!5Y1qf=9X1WHw@M;ji*hk{c8>A*!dWa?NlGO1ezRUsDoOat9;u8GBKe*y)6X=5Y%&ZfY}p`F1}`3H%x@U$V5D+0nZ3>t|cY*=k1sQY)_2=lCKr zacXRNpS?vzxnLshsRtGt8+-ZvgxFYU1PB0jqn*Z{vdJ5YckE|-?PlO|iK5x*x1k& z(Yrh9)&Ka0L0P}LZC|1H-ItT2cePkwTc-4I-RCz=vX4ErF!S9*!Mqj|cvE|l@y5q& zyXIQ~9jzb4?sxa~(%&~T+F-l3;(h+lNu!rIcDKkFdUnraYwuS(x7=ymIJgS`(0_;9 zEOw0M=J5TMT)KkgNMB_^(WIt+Cfzt3S0e6&D$z>`^B*JL`xO6f%u1zBNex@QrmLub zd28LbM0owAiGnYlv488L`?U-I=4fhOxa6_hnKtxj{1Z@FYUdF zE%6RlSDzsXSUFg zKKj|x(%;r9_c_M!BK==|5R$mXJ6B+8n4-xqaPIl3?AeWRf&esR2(rA 
zjHbWw@;kf2bkWqb-W;9FYLKjZ(^$g1R*Gq^8A5xuVM8$bSW9%Qe|0%S-eTBLpRZh4 zv)Xu7z=vd}Z;bd+O1Wy8gW9@N4LANtYGwOY^mpi=*|qenoushY>B3 zE+t>n3GT)Hc+^lv%Q0$a78><2?}dL}C0Eh}1?jN=?TPQ@vtih|@AX=(mQJn=BaP)u z_&k1%RurbdO665%+&)?l$V!pq3mxhITm;v?qAJa5?6;qgrO@@Kv{kt3XV_(loCipJp6Ik_(XA{kuWSq!~jXFpc# zbA3?si&=3@z6DI3mo04`ES4q>ZBgt*WBq^sQF{3avRcw9El7salFi5bgl7PZs-j= zPt-l;a-)p@%LS00d7;S6SQmX5&+KBu%!d+^5pt5KJd*!$a_+?o-lU^y)&t>{TI{Ni znLVfO^VM28bSF$4eNabMu{P*V_Ao5)4CX0? zm;MN$rA=;I*)_}>GdXz?gPC>Ankj{aX|5aJs)&T>FNzcMDn*WtwoB{&=hga$SBp}p z=FCeHL`facq;vj(QQbtOarMT_-`{00Yk9>N3h%V6>=LJtn7qAo z^qeP@{T4>Lb+1+C(b0T395nM;A7gkaT&Whe=AHgMC$eVOt;U-r^~bDDCMe5#A3HNC zll>Rg@esQzCgkDdyG~5+!m9wqMR8NAi_#bA3K`fDp|crN>4vq%`=3;t$P=HT#BY}-tCpRSDmfnE?gdEiw) zpGNFY@1GG3v>n8??X-PYZt*5@9IEG;&v>f@htOnJ)Fb135!mXE?+7{A)r zb;bTXz8O51S>#Y3@o8J6%iF zsAg}g_WKm&V%pirn3+=9sb<@W^Bm|f)szv1lpesM^5qR0bhG$1XA;?K+Dz{A1bUlf zaw~Psq5O(=n?)7RDj=KIQ>+Y{bax^vrOtD(FPnYi{hxnHqN&Kq!?cSTtN&pbTvuz# zvd@27^yRO`X&pn!mpTnsnv3G8C&6S122| zxQ9jRUr;>RrZZHfGmheg9q>P28+%GWFP!v$3Y+4C%9&2qU(D~iuqm>|=5(CNR`r9> zF+b=HlG@-Yo(`xJ11)B2w1-u2mPxzkW?TtD-6#5T5n|RCi8%=`e@rj zt;nx-%vd~e{Ao_v%d~lPB?}AJx{gKsbDSgR_QFac4LPhvLwqG(n&M7O6>H|w>EEIk z-`Qt>VCn=-wkL*jPP!IZbz0_LiZTsu6eaJq!oPeV*wieeFTB4_;djQYW^HIU`Io0- zBEw%wEg_peL`}rn>mIeAqOjbcuh~;%c}iHMHt*Z(sNfWxktMD^%!@}9LN?fgx z`a7nVxLWu!J0a$qb_90k;+#5MLg)JGy<%lO`m;v#E9#4t;xt)j2(nntwJgoq>iAn- z-h8rZIbX=i-fF$sT;)>Nj#j8d_ZmA=ON!4dqb>`FPsl7?;;lP7i#c93pG0PbRUF0yJD@$(mEzWY(f`NiwpCtX z7~LSz=Sh4D&Oh9E53^pj{I+e*FT}#@pOdCa{M0 zj$or_#Ikq)HombLSYmk=EB3cK$esGgPfk2_b_7!=aI(HOgtKqU?C-Qv{)0%7UuStC zoM%~z_=?<*Y}X$7*2o5Z6JvL1l-y^Luz{mqyLA|iY_R6_2k9=kBI5<^&RDVgPf2^s zjr#vQRW)gB-&n1i&?@Jfl>F6XVA8noYxT|~{FU&=+p)hi!3A?;0psx3CuqlUd`-?a z59`UQVt8yGPT(t2>j&j=C2R(_1SGOi<)F96`#Y}|#OZZir0Yi-m$hP}M{ljRAF4Mw_nJvIz)2XRj}Vxd?ZcrxB=YEn_$EV_MR@OinH64_zTsAsYnpNYpSiDaomWI$q(#dX8gCsGcx5yy z-5)u+e+;ti#J1lm7u9uqr8}hXmPjv-Q2>jem1(LkR0!6>Rsr&i=;iC!W8%NBoC1 zAMUV!`uMUdZSHHWB7-reopif3_lT^w7F}j2qt@IU%JV2=)`>7!msVt5rfIzz^S{cX 
z_6BM-|q6{~ZlW?$`#jMIBA=0f4vMt&uE7K9JQ#+jg#_>AU z=;5*HZ3%~3CLw#w5t(*tvuz?B)`*z&pfb;UB6Ca2`AQgba4k>T2y5LbEk`j$p@c&F z1^mr-S`n-4Ti^SM(B!^GV?|?i>_2-G-7l>2HhB9V?GFN2X9u z!s6%&Ized8DmF!mCvhiZ_9!@&wr`S|_h|_3)>Kk7;(w!Puo8W0I|A6xbSjcQbRXb2_c3(%vux2ZtlaK;yMBeTFZ4TnHU5oL|KTm! zdzBZ5R4!qn({FXH)=t)AE351RG5alM@Dn_z7mLO|Vh?#_=oG86jEhvfa{W19r9?%% z;+JV{^fWW}c$q#cGT4rB&tvYDZ%hr^9o?^Y`@cKT3tsWBI2V+yKdquI7ZEb`jCioS zkrDY=33pa1NhoJ9dtwv!lg6cav~bVtI$76KPqe$&gr>$y1{ioKq{ zuKY~ZS#8r>`j6)XZJ3R_RwO0ge4oW`u^UWP?6C3LU&(g28&XLJ?!z0fsZh-pyWz@z zpm_u6QieP8t+z$q33&ZEIs(o4j}>kUq$(=s{{lUm2XfN#zL5SjFvTyLR(jv28~QtD`V-3k1-E->Lug(PTi>Pzi2%s+RqAZDo!5B+sx3@S@At(}-Gpm~0sMq`@mGIq z0`L-sfqD9*6BAI7k`8T79D|$NHI+hxRzhf(vJ0|523;^8`0PLvsTU0Se!D|6{VNF5 zQSRKi({g#b)pA)YX%CBm6 z>tJvOFiOM$z>>FT0A?|GwyOXMsd_$YV4x^N(C58wy{U@7$FbV^Im_+%Q?Nq@B<%y( zZQreR0j4db;9CM?@HO}K^8@vUAeATd;()Tel&ERIvGpS*jj6$Nk7lM^ViWt2foSC* z$;~zJp<$aq!SnY#+Sj-(=jvDil>PupZ?GK40M=ay7U?H2YaSu{G7K(NR~Q)o!YQ8K zS^625oRU)VgunM~>CCn<0mz?}Q%&dz6e2e94E5Q@qTZk zkV)n2Y4r7?x^|61yQZHrZ|A@|47U?K`#RRnnULZ{6O49n@O&7Ifs=IXt&|gCuKI$cz4URvH!L9r* z>`N7WcMVd0*Fu!;j+baXzz>(GH14@h4O^)Pnz$uR=d8Ma!|@;d!u)fWh>43pDmsJk z1|}hHAPgpH`n6vz=IPmh@~qi?yBmld&tNTYPy<5$0MujIP3NJZE=y(g=QliWv$G3V z`d_#!L(I7rx;<2giLDEV7yqucJ*s29W39XQCouFNId{D zeS_s}RUS$=7)hxozgGwF77*`(2AB0{7{*5&*q`~d9tgf^zpAK&d|CM2vF<|SO^+!Q zM(^?4p)`>^am+#*Vd}y#?1A^!^5P=MxPCLDK*#((ApG|sY^ZNT>&p|5kV|L7wc>W> zjvMmj`S}4mDlX{fiU)auq02+IyA1Bar3;Yeo)O!N^g8qoEuzBh1yCWf9-=Hx2_&2^VGmIBErp7XO^Qh4XR1M&AP$3C4h ztFwQA2`jb>ZJDqZfByUw2PYT>@M+@U$Y%|T~ z-U9g!`Y6!ad;wloh+|a*@zek_oMK+*=il(~t8m#X#7cFKZ%-89bPWtla~25njE%)X z7rM%7f?hV^VJEb+!HK?U>~=QSO!|c;Yv@G*_7p;GAEcVF?JZ%s$U!gYf=N z&<8QGfcXQgK<^I*{Tf0rl$g(vQg;2t6&NE-DXpTyNkETfy4-t0$Zmg`@1Ooo??RDsWD>01x=Y}G+)>I@>DWRc9JPzHthzb$*&KHoafdkX4_wQli zMf81c^+p9zBcrVeQVyw@@@zTcIUeJ2II6mU0CNiy6PYFfYQ?vUi!b0gB8rcT!+UJ0 z#2(QVYwPcP;}`ZRv~%R&w*}TOT!!8zE+22ZXeNJJ7gE1a7knt2Z$xs4Ln&ZMe(2}t zM?`u^%&OC)>riXDJD&qPRzI3Tcj#=FS6Ren+iIqq0$4XN8UVdlGyu$S@76cNa6cw> 
z>mI`ZyF49k!)Qu<1xH8TA)-1|YwXk6AE7c4a1(*y^Z3pwJQUE{*CkK(`uqC<>>aYG zl&O++Q)W7P){3@MCLpbt!5)Wx2p%r32#}T{uMMp^!>!P%{nZ(G&BV#kvEN-%UI%iK zJ`naoY^4F-0}Ou&2cONog#db=>~W(<#-Or!|2foan|rHwJ@r*d2{Y2>>p10-Ev!BqY-CMm_)i$sVxl>0`hp7YOlKxJZDm86M#w z%xifK>BryC_HSpi0uJ&wi`o!k;S&xnDH0B=0n0BH2Ce)IiW!eeS|uVhlv}-RqRIKJ zxQt=zResX3iFox24MuGKo^MTukBbj1G~k<#+{xl0;n4(kM`YncRaFMZ(Lw8ku6t-H2+2 ze({@R*9cP<>r}UXC&#!asW)%D)kpJTm6|m(3Hk-}U`cOYfZxwYARDvP$MMYPNXG^B zddBEAW~fi<>9R!m`0~ut6~HH|&kmf%q^B*pl4?UefCHy;dgFZ;QDOa4CBKLe9ErDT%|mJEhR4{!>C>h0f4 zpuS7GHB#TafRl~@jE$Y0IpM@K26`oA9242ESmZ1msX)Aw@j$K)=E5-v2<&SKTIyqn zf*e7s);iIe6M1DIE<$3y25cqtM|VzPy-dJhySH5*SQQQK+xZ;x;uxZxL8!#ZZ;E?; zkedriE@-xQxw5Z;C^-U-rg8WTz_Ad6-w2J-WT-xgSfT`W)+ZQPh4AHp(9<Xk;qg$1$r&g9>LRp|Nejz+MksH0)#Bu===bh-Sm8X!FY7}6IH7g z0=^^)ebJXKe|6Z(4b-nz6M#+vQ&k&%PdGtaxDFhfUeg3b9tW8k!U9FTf%ygA6QzJ> zR^j<@&=07bfSG>>o<1`epoKg&uxWh`H?oW4j#)x*Q$U*U44QOA7YPnf8(%BwfVy*o znvX9P#6M2!idZ>PqhMy;_3uP|oXhe8WXOl`L;;NjF*hyNtgx3)DCU9te|B~@QDG>I zNc0hzAdn6$1l)&`T0sExzWr)mywTZgV|L=bddXTz-9I#Q#d1nve;{>G?Y5no7ShoG~3Q75h{wn*7?z;G1MkzSj>w1K+v~f7j_?O#uRw z0ti?kqsTJpi%!!LMR(<=c<=WYDGAgHm0p7Q4+iXc!*dPP)}KIZCuCe2c|&yv`~z=$ zdN(~XmCl-aH4V2b2v=h15T7q<+I3dbxSj6TitL7zkRHMMKBAlVI`{)Dw2FQh7x4$JqULE;CB4WJFPYQ++yQE|2yaz!B zn0==8u>e~n21+$msO3!xrQmm9&JW6qll=wIbJFKXy@6p@&o{?Uz(`$bBI^j2Z2|wrg($W34^mYel(ZRpXTO)&jJLTz9)f#Q0T*9W>fqMjzoyWy)Xp;ZWxD2 zFV<^Z(Adlf_-?=es;5p)0}>)4a*FTghb-Q^O{le7051mvQX61>^PAtqun2~-sv362 z@+j;qMC4pQ?;U~2aUgO!@2@a^eiwB;9B4b@K|w<1@xVSh0p9_Ra48c_yQje7IuCRlku@x1rV<~gq+RR36-nwlbtzXioMM3J zo(-V`VGbk2{;aH*kyP!0QZe{NI#X{912WBPzZOHh0>agUc`#3|vsL~;-)Mz!t-w^J z(rH-|G0unQYHZUL&I(t_=nSAc-%4Q94|*Wy56=18NaX=*YL&kLH3Z7088ToJeB5)b z)rVv^Q(nPe?(Q13_`yuk7vLrt z4BI@G-B767njuy0+Y1jVDF)!ReTt?iDN-w9aeRd%Y+IhnM!JbU?jam-=?m1*7eo}` zd63F!h6!}-m60t)%MDxqE5ADxt!u&U0cMxyE;LZO`?1wY$pPUPj!jN*fKk^^QY?5p zz1|FA7)duFwD!I@Ug|M4nkp84;%3cmZyyAFzHe$@P5}Y&0#>ROOf9bv1%(t%=9#D(;*Imm%pZ{~13%Ky&21Id*!M_10BQgX0rSiAN{w`_yxiPJ z){_rG)oMj_f`Wh_ARX)mFAq47!SNrYDHZLM9q|k^dwX6}BsDv$t36QQa?R{P@&LoU 
zLV!BF2)>|}cbHVA?;SsUz4+%R8a(i+=-9>JEw8iOjfpVzVK~O*@3a=}&huZL>{%ci zhj9#<;0ZJi?ib+jZ|Sr*nY;(bbd~F-%G-EVprE7g?FEl*(&EuZ-{ZYQdplDn8JKCH z(-{G3W}#-qyQxFq)ro>ROqgnc@k>JLBrp%aC!BoQ6A_S)%=p{>Hr0SxL=T83FWbmn zCbdv-OXL8hI1JcxQ2uGI8U}Ej!$RBZ;NYKN6@(ik=%eU-b+&L-sFH^{kstgvK@CfL@Q89?zyrh_XgDD9hm-JR;p!5J zzrf9Zz{QnNuAiu@4`DigOhFZNeQlSgHi$Qpo<11Z3c>?~OjYHwKX>h$1Qh29AfRwS zs>>%6J6)VosHlXaRX%z0osL_*slGz5aah;EO0$cD6&~b#IP}bkq%^|e?tz;p+xf)> zJU3)p(-dgo%Zs=&*47}eW`N{_7TlPwnOa6qBTREj%Fht_z0S7lfvbur$chK$A@+mp zmczgSG9y52sAH(4s@=92;k9!nen^Cpe9oBd1%nSpgL8cpb5NuF{1i+oU-?EfmLqUF10ysruA{rqt0I6ft(tNm4GCbacSp?)*rhdr@ z=8__|=wNc;=&W)?^b$8HHa0ktqB$`4J_T5hnEm@RfB5T&6fwPqJP-t`a$`aROn~(K z1o#;7CplOjbvfSAN8&o1sW6pv;@M4bpMv9ETlj+EC{%|L`3mW$iP5^4^7c`1z|XC(WX1Q;1KLPDCe_)lARRl1>{dg7QKHAT_Fm7 zZ1XvyXu?nt%0XR#h%ljwLuwqC_CJsaf|kF{$@mCrEgPPf9AD~pU%(Vtg*{{Cuj)Q+V)M237DQr6Zf{3HiFqZlOKYx<-l&^N3ZSfc@xV5svEJN$>KMz3*MElzF=i@9wbW*WUd6O21wFLv= z8-O}(j<3qG=`~nK9zR+cxB6DY2xCyW&|w%9=WB3?vPq-bhFy;2P#_>h1=JF8qKA(U z_E4$C&LC!!2A%M=RI66S%)c-Y8xk{yPoK>9=6#}o^!f8()CaM4@ju=CkB1THPY?+j zWc(WE2284JJ#4j508y@D!`4u`_N_!la9@mfcUD z2&@#mmqf;~B9#L&82}`v7#J9jc05177j9qjHkJIQfD5uzxHA$HQPL>N%FiJT0BXR} z*D2a}&w-Ghc8Mw@GxG#Y-;t^Z;ENY9s0SnrppXc$EMBMv%Ow>6?59>YT={c;CM6^c zdn}9?52sK;Uiko!&y)QJz^ylq^!W{C9J~^etef=R5!tC@gXyi-G-&2Ao{+?n|5x5j zTF(c*c1i_`6!7_^fc^|=*b>~{Jg}e(XFO-omQk?`rLqQ#Wq9th0W%DrarDN6m~IYs zONb8+FxxquSEMg5&m0s?Uk&e$Ws3)xkLZZX$};V?>BggD4_4XE(t=oKw#JG8vR??e z6o_>P(lF9%$dcy6kwavXfMG(rNu;)Y3i1-DAt)eFuW#4uPJHNLcqY|{4``1{6xXRuh0;F~Bs_TT#fh6iUV z4TYGnJBSg9UcRM0EIoW)Ff}XKd zvx2`Tq5xo9dGSgZn2sE$V1986ye$yY_KF9jC21sjFkZyydxH}OjTVyA$R=`n--YKg zC?|&wnQRJPIxEe)ZK`HsV8E=%r0LJDpMTW($_JT$13gG>n8ORukr&%QMN+d6Hh3i= zPzX}4TFEaX*sI1i8iKvPM5S!G@^_=9btlo>1#cEbd3lIywpq}N_zuegbD2^zXMd1` zdB`gCCYLWRJA0D@@bX1vJ*BzqT`&jU{uqskRK9i*plwA_1a zaS0xCcwk2X({e6Sg#1WAU?7Rn3953aCr}|RgErhbq@m#0?gWZbXm7#b>2_C8~nQ!NR;dk&S}NAenFr`-Bl&^Ka$*D{0AZmz#3!>%rN z0|dMe`&Wi{k`Qe@GAbHU+po1Y#IQWMPzM$d1xG$EJ3Hewg&oZ0FrNXVjS*LdYtHH? 
zAZE^oGY|zttaoW%|GLeB^kYA1KxZQgr5iL;oPzXBzrr?V4|K21e;U}thJi({ae(8S zk%A--A4(;>z=t5wz`(%?__+tt1X}O|)3!IX1HmN|3WwujhkT5Ttn6JnNqOQscg${h zMg#^{bjLGPY9~3Zf2J5KR%gBz3TGBRxrK~`dZh%4*n(o%ZWr+hV0Z_{C z195M-)x_Fu;$%!zP{Z-G_O$xoRI||0(@VlYms=uF`1!-3tObLPZuyuBrR*S-Zm8jr zi$IeZKM;DPem|XHOG;dxtHaxXk}3==-OaQmRaD+0+ZY7sgnhPDgfjY0DR#^uqYjDY|WkCZg^ zXC>rBF({GmArTFS(niKl}>*8YH{Cy4i56Q4w9-;9&~Llfpsj15M16ef7I?NiF|=9I^nQ4ud-q za6^ty5%HqnUsz65T)wu`qnn?qe99x;+}vDKZ3N9P zgjotGuOS{nWd3nGW;B`oJ&JsJZss;-52JDdCv)nL>c`Nu9sKw(k1=(6!rYzrI!+y+ zH2SHLzlT947?nCbqM)*|w$=+kr8fATff4FsbNA7Q4C?Op`T6O)aZ_v`H>xx}G*hHFOk|NwRlRuu4RDnrl@ixI%FfOo!}hzg zqV0gJ%=&-P{1hr92c`jhAZ&(=xlZ9#Zfh^3K84zWK&Xhx@3d)uc{eo$a!LjTzT$ zsLW7x>W?Xmmjid>ONN&O6QosARM9UY;VULtE0V*T*ksR4O-<`44h;!+jTj+h$ZLM4 zYrb3(M#X)S7}oDij>u1(E4eG&HB7x4Gr#<-;&nyg7PU3Mn1C%?s? z$)<~QY7I?|fzPkn(%`kRUim}pC!{)@l2m}k6}&kB)wK>@c(`~?hu*@l9Dy7;FBC66 z={F^6{B)?*w?n(0$dsDgk>BNciT^@PmDEVGBsj@sJPu8#id78{)iGoTNIJ!Yd}ytcW3!&Nm6PrlgT5@x=#o=o zC0+%~LI5&hLPZCF&Lqd4P$sCaFm|^3#8^&BlX2MFH)>a{+`P-07TMg)KTw8E^9UC6 z0|ez>T#~MG6W#mt^wH52uTK6gzk~;p)98~3k8Mgp1N267a7kdQN`&|eTgUpWh=+O& zX($fLO|EE_HZwE|Zl3OM;sMaZ-nr{~*TA4E29vm)Mx$cBwCK+0j-#`?dtK*`plFCM zOb{%0z6%+Dt2s&AigSOX3gw_3T<{f?ei^e*9TH!FjumN#qe44>trq#MPgAv?O_}u^ z9NG(BCD)OJe8*At3@mwxC@Cp5E}ro5Z5|CUEOJBqvYQj4-)>P4f*2JBX+^Hjw5>z^ z3xha@m>Z8ja$T#ejvqXrh-RGG^_4G0@FfsA@GZi7xjbqkiyY`*a z>r;M!v|XeoiY7H#!%fqQe++hiQ1?+QJM>MXM4oOkqO9D+4}AJn%ui;n3+-X#@APJo z7PqNN8E3SfxY4irD~~N01V4+&tZc0a;hO2Y+2Y?^RKIA)Iih;HIYzTfXR*k!Z%`G+ zIOx}ri#nj#;UI#oKFNVY+@MRM_g>_X&W8DwFMf+Bf7LfsMaPPV-F=-VL*!k9S!m(Z zXTNyr_P90{jW}=z{yfCD{luV2!e{-x)7-pE*s8)3}X z;~J^g?r8mRuuAFqN)j^QCB}JQzq*d6pVUco5${ho{E*DZZ4W`3qc-y9vgpUo-=jtU z#$!`EcVz}G65imbdnE30eIKPg&D*$FVh}nQkXe`EWB7(-y#9vo5s2481JpQ3w!sVX zKi~WR`^7=P_K$#4O*e|Zy_>|pm$<}udM1^w69xpEzy7?Y^I6iony8FM!G?`y7(HJ) zjhmHqf{QMkbHsW;{xlP#EKPX7FI;}2(DLy1Pz^oLHcmv1u))Ieu}}MVxd2bT_o|ue z?)5H0$IU*5PuE?9-XD+dTK(_;P68}G)&v@nXzFx4;`&LXVRH8QW_Y0`{22Z=nGmLY zr^TLVhQ5|u&wag+)x2QWC-})PNT-e7S2_Oie>cNmWlO47k2|y0Iq@p 
z2YGbg$zXf8hls)Hsjh7=m6MUOgcr>=!yctxjOyUvwKt0s4Jd}al)2Qw4(uwJ-*BU*-^*wOrII>lKXPNVR(Y&CU z$l7oj-T5W<{_m*)Q@tpUzflg|qk9BVkvmi@%y&H=YOJKu1xSjS2!6!`C~7{RH=ncv_i* zLOAhkny}cb^xc8=uc5+~+8^xKag*`OBJqCMG?O!r~vY+zxIH z^s14weI7RQRL&HWr0d~OZ;Adt0W<^4{EEk9Gin0QmKM;%M*K@Wjl=P&vT{5l594ZC zytbB=gXkEPG+a}70&f;0YvqnVS_{i1{q}Br6~~p8gUoAKAZeV^{2piG9_)pi%gXUj zY>`whiF~}UoSg0GU&9|&KKsx#wuzxfB3oaPNK#3j!RxR_oQWn-8S2|g3+XUUp0ziL z0eaGoKR&wu`1vx3fApM>={s-fGC^|W^TazW&BLN)7VZ^~|FJ?^2E<@ksyTH{WU(NI z)uZB-JycGPSrUz1CuxcsH7BZenxqR#^J0ls9Vj&zt7V>-WS-3LLvR9qTvpBy1`UlA zj}e2gWqXdnu~=AINQYZGKbK0RXl_9*GsUL3PyKarBXPg3v%X2nan{!Hk4Iw@+hk~Q z<{tRXDE!_3HL4=p1RxvZQaJbj%B>P(xOB~BFT<_jn+s`aGKb1NJWJAF zO<9{rO<(EemPz8!dQeCkKgmRLvvP@B>K? zlx8u$R9Zxb{&IjnYeo&qWe1OzZP>ky5JQO!78ceZdtz<%3gODt}LLbch8@rTPrtPO;M}SWQJI) zACMNzCa_3u_HCN+H3N4@&Ub^Pl@^qh<9PAzm&k%YW5*w_jorn&e^5F;OEL*xlX<;L zcLOTP1PT9NB*mLk%F02ynKo=2PaC``v#oX89Q3q`KvX2lX@dpACX3uZ;K%X?~{G^ zJZavl^LwR^#y2GEafP9s%ax(`eY~nhP3K57|9iUW(N{~b=UmB%)K=8~Z)_s*a2$ig QsQ>@~07*qoM6N<$f-WJPAOHXW literal 0 HcmV?d00001 diff --git a/content/english/hpc/data-structures/img/segtree-wide.png b/content/english/hpc/data-structures/img/segtree-wide.png index 6d70aa700e31b347dd29cea324497942600bfff4..bf268fc2e036ce6d7282f11c53a924a6403ca694 100644 GIT binary patch literal 8767 zcmaKSbzD>7_y6dSP(VsTVU*G!9fHI_8j)@RVZi8cw4ee4(hW*0os$|<36ZW5qm)u= zFuK1NKcC;{&mXU`d!KX9JV;M|&&82F|& zkSYcagx<;;Pf3BlAkyb=fial}#MB!EqW*jRkN4>{XFfn=@=-PMF@!n#_}h9pfc*Xa z1)bfY-uAX04uUW*r|fMxW)SEuNJB;GX+X|au73*4MCPARd3pV4{ftC$4wBliM~qtU zA+3(-&-vR_Na))#5Nq;6jNWXd!q79uK?OxuAthy&`a?qy8AX6d%~{RYKZ%Kvko$GyV)=pi|WcJ`u&t9`u41 zf2PW+tJzPs{X3un`JxUn-e^-N(>|x6ysgdpAqCDRT+|JQ7>&>I$E|vF#S~>RE%-47 zyuhdHH8JErXKdn+4E~AS{eYMB?^+BA<@^6kBKODs@;^#TdH(MFwY$`la4ax9GNNy2 zZH?8+k@9@;{CVWpuV1HzhxJ@NJcK2Lg>T+rWPFP^J~1Kf>+7qdtsQ4*X}L8%J}zZs zWaQ)L_blPvyG}DRGvW($2XwOPGpnp$=Sa`_ZB;VR^`vAmCn%+=jwO^8itxj&DUvdx zZV+YF3QK=81i$?LBwU$&_{Jt)6G%3EC`>opcG8O-q)7Q3MyLAeg-_c}RefABkrWGU 
z!zL&ma^B9)4Z&yZB$R;RZJ|^aRX&gf5Pi83Ku60MO!kFXis6{hgvb>?E1V$QG~@1* zDZF#O+HFP;{oh#s{lxUP~LxzpH%=OQQH$#P&Br)L*{HirqFs1c;XPKNr8~oaK;xKSaW& zZs(p%Cc8r|GmrN&2n`qYAlPx}cdK#^;2jb6M$9ubnDuQ zR@w#9HU!GJ7s(f0Y>2xLAtCyDCy$K4dnrf?SAsOq4dD~*=DUcS?cqz-c=nck)9&|> zu?o_qybe*<%Ct=3BB<8(bfbOWzkI;=2fX?ivdUmYkU(%nTtW8YLc)~WnU>Q-}i3)lrR_=GtXEw(of+~0hEts>%D>VtHaSK5Y6*U5V~~|@zC$i|16sS3evePOEFpoTlILGC7${jG&AY%*? zuPykR1hsna6UlBE^tD^7zq0)5b#HSZi}#(a3El^(L`kLl)ETFWgy$4{Dps{3iR1Qb zq*`}=ppcJIj^Ya%`pT++CB1pBdl7B)cQ@NUeAwv|I%VqiNaYyFc=YLgpr+EVL4O?a z;#soG-HZOQH$;`S1u0U@cllxzeKUH-FRE5cThlPnOiMiLz|HSj@-{@42sA?UKY9mz zQ#tfhD3LGxX-H0PjvV8+$%0A)M>whqA8o9!iBI`HMN_T%XL5=G5O6y0JOjNc*R_u* zsYx#h%n6U^ok&CdpH$aOD52`dGO?$fqVsK3{!X>gxPgtGRR;OUUZELAhhmNoG#w-t zmb{wGfIEK@wHpjl`G@FBXC+(gia8f2mD16vFL`6oWI$WBB!NG{dTNa{OD5x z$!)Htpi%=ib}c?{V8(AM>J)- zC8oC-#7NlmVegS@T)&VX)2jcu@Q&f)0Pllh4^ofY9$c%3BTJ|bJZM41)2PIaptoKM zaFTYT&mZs%xC#Q5il5xx;YvDD#e{L8@IT0Zd&&1n9M2eJY|j=uDOdbQBOTkoe#sD? zg?;+<0to(>MLpoB~9(EbqO#yy^X*q`t=JS)B?lwhEuS@II6loBbToerleor^Qoi@%H23|S5OY7m zI%LExW9hMesEt~t4(aeF8WQBZ5{ZY8P%Cqk~-#P@s`?&&u~ zjiIVkPLL?S(FFt3sQ7-1^-aot4MPR`z0Ly0!d(;-Dl-GK53?uE4I0IrFs`;8xM#U- zY6OJ{FPS~)`oYq_`!tG4HB_cI#6`yO@ReH9-ZUkixyGSktBBG#JncXi(l(mwEm8RD zLmIf&gkR$A-oX zDQ1ccYKEP(1@8UX-d;^Lw%Qf2uC*glJpV;q4pH0DbT+tA^YXwYD=W)@JU`$!SylU? 
z8C%efQoQ2jaSCpG+g(TM7hwo4Ip1Tqi{Z?DvF1C~($5Fg&o9qUkSpmHUcEnBd^g00 z1%q9X$6Z44a{I%Nk!z7R1iyq0WJ}ijZJ8}lTZdj8$esQvY}@U=^NvZ3yEtgN&SBQ8 zPe5+Bla|G^^A_Ip^U{HE(cx-G*?&^FGi8nl2{G|B+ZQdvIns+0YAVV%tl!PtdL*>*o zw7T7{2^SzYI8|$3nO|LA&dHWw-dC!Xne;d=)y|>g;?RDk!I=*$DJe-XJUaRt&i{@T zI!aIiFl?I?)z*%f%?}SRo*@lhii?YzV49Igg)M1^*)N`;&5oaIxv@vM1$`XlvP~1~x z*;=vH20z+vSW5wd43co2TFmxncj%)B&4--1vGDjp1f}UyaW;Fa1Ak8?TUuI14`FU@ zZfG1}JYWgnPxLd^q{@`&F1y;082g;Jb<`?-IeVlhQoTDSL3j7%jlQko72G-daUmnL z$!err+yd!7*(iP2qjVp``ox3LG0pYPK*op)+!DJ*bS}0dnde}Q(z@F`c3?%|STs^4 zj`{PbG?CZ$v#|H`;ybdAii`F`aCIB*ySYj@gaj#ncmLEjvNJy>^G&tlfW+lnrjM3< zmL)bN*|qu!(j?78d?Nj*_^Y=}KF{KdpY_YR&F{r7)c$d#-l1_U3I)~}zL;H8+@}mv zj@e`@NxPC7>52NydK8C(BEuiXetsJBG1o#$v6fZ?yqs9)9k%`;XBa+I#CzJ6L;>%T zuWp8~SXq9jL*(&UM=V*)w&r;|^O#6`MCM}+{ZY@mJ1pE8A*|jMbCeYJgSwok_Mi|! zLDk(Q4y~F!zLHS$1nCk56rq77Dk^b8htG`LPDhlgo6a)v_$mqvtF|DP%R9abwOeh~ zrxWrPHh%glqsI2c6<=8<|vckgvGBPK3WcEe7KqzT_Xqlw^& zaL*8^Bw1D9xxvRhKp^4%o# zA6pw5?wiaUfjQ^GyGQp>i5YvzGvKaR4D$tT2{rt-eS%%|duNV5kTw@!l<;&x7myv-OZuCRN=RHkh}VH7ma zJn*DVk$%j!nI|eItXc0&F}kDtq)q#DW2d~FPGkTDgP0Wt(_n`84(!mN2{vwd{K|*8KtT4Jtv|65KmR>`O z#^ep4$|^^{>XeSbnIz(x+S|%jtHBUuWLg~HVP04YG2{F?#71Tn3)WiNli-^(^wvBP zCJA^`^bn`;Yt<=%uv(R7oZE=a3c4)caXiiW6`DxcsF4~yz)Ti$2-z_M+se2k-^cHO z*IKy(<9f*ULAmz*&{0!PrOlBJ6SOC!APA|OImykM<`kUzd*$w8;CW@DO}+XzYMK?X z;Gb<0&K7wGMltC=Xb5C7v`hH=+W>(i_(lGQW`u*f*y^%R z{;Nsz%w$h$Fr2p1j#`|6ewS2@8Y@3vpdmNmKA~B@(W+*yd`B~2?6JG1vsh};!&BBz zXh_W*o~O61U1mN1ihFjEu2(unkejhSkFy+R zC7d6+B>xQ8sz~(U1o++$a*zSz1YrDrkd4eP5glKbqtI*l`%6zS#3?RPYco+eOO>Vs zuS1*vEL1=H>F4Jz|D%p`rz+URy%eT?nqek&- z6V6&mTG3;zl0(RkP7bnq;WPjne}6TqWxkJ=rx9(0Y@Y(w;k6Y;&1`qQyao3Op~sw( zXPSZm7(0tFh^hWy69wFCAc{~|E&*^ubS&nQtcizsI74{u_By|>``X&H=U>_Y6*$bH z;J5P3112raVg397KpLA4ha?ot-<886+8q3QC8Vw%5-|n- zxyIvJtulA{@&8`W{mqEat!=8AY9M(|qvVB6Ng+S7K?Cppo_7gKsTfSxtz!eKF z(~hA+gios+q9H$e*vOjAt^?9QIKlJZI~G|P^xO=75ZcB?1_84tTrb7j^wkefmDx6A zGR`;uk!B_SCIHbTR5IbbXMJtNPbYbGn?X01WO8-Ai$}=AAMJTT*HPhnF3Kt(Zivh= 
znOsYUp(8Usba&3NG3t{yAB6N}RH11R$!M2;#cBH4<;Zub{l?m`* zC6jAUJPPOu{xQi+rUa}2-dOOLBTF}f%YZmdCYd;Z4L8ny8>n~ue*(Cg)+joVB0 z{2vaQC!57T97Hu@Q#DQOmxzG$`vI_h34c|iYDwX_@-X0yd4O_r^+=B_KtS^*8bUh5 z7C3n)QkCZ0IotgxVFAtILkQ{U^-{<2ROF0(*h+-AD*=)5hSh^7N0i05DBxg1x5UVr zj$|(`S8tkzzkHo7-$k8Qw5C?U6mYu3mI>|^wSVW=d;l((3`ossUa#y{DN@K1xg?7g z-)xq&HDT*GAaN_bn`2SoReZAvOwqT*&3}El?w+eTP1=fQHNeePrd_937L;8>k&rD+ zc{?&a-AIt`{sOla92}Xru)GQ$=bq*rA2U~4BP+(JF}@~`2+gXVn3-xCU-sNLOTL6+;v0y@Jwx`W{ zWjsyan$)+qBApW>V-OlqY<$duxEt=8#Y68=@5;=Jby@XG@mT!TL%q2#s+ zZ5Z;t|8ex^|Ci(G=9e&il{Zm!fV?QcZ<>u7`u$4Wg_dvJ5+apN!1vxt|J6+8UA_At zMV^M-7GP)sCR{7k19WZ&u79+1O_JrHuxl9B>h(u9SdiM~AzaJEv*{+ukh#p8Jwb@zr?0hRT;^+o)}Hi5-CYgrdtHan&{qpr>Q7C( zp)_!iM!DjiJd1wua8`SZ5cHmgA*yr6$)O!&N7AHv13z5*M&}U!$DQ>m6rEZ%LSxkrK zAoJ!^%wBF(3pHHaij_ArSlq&YS6oA{QTEN}g?vi7b{3Ufy!7NHPgH)T0y!K^H^V{Z zAl9cMGAsM;CmWf*3HTRBfA5Z=CWbq(a>Vx5_`UmZunOQSuR0BVAmjXZ^eSvkBjhTI z>iCsP#R!keScj@Z!Yn4?8wVMetf0*}8uE4&5M~Pg3QazYD`RWM`J;MNfb___l~_aP zT|2@!a=3<%wZU&RsMpXp=d{HyF^v26pF7-=XL#Gff25aHCf+QTrzxm|DEAJC>QewR zX8;Z}@W*l2QNsS)tE40U9{rTJ(0B!nKrSx2XTw=+o9t*?Rae-craXQK`68m9&=FHZ zWtYsdXlr8^>D!wENWFXbLSQ7X&psnW;(jccu3h^+JrvykWZc23^8iHqOE%-Jq9%~) z_+loxbCo=SN4N9{umd0)0cNIiO)@D2W>brpD(t#7N$^f}{aeNe@2%{HJ1=@nuI!3T zfG4FCSm)n-uGwyWJAy(^i9%giVb+ZOkINf^MrdnJUIUS#S=GaRY$ z;$SIAOK`yJN#H{daJM+6X&MtK1m6dIIaoBO(4~TQr|Q(|)z6)pH|tv~wW(Ua%yGQK zj*sBFWLIg;eIq@Lz#p=$BAxB;^by63jr0BR%Z?xEfr=QrT8X+Lu84fNFOh6x5)xEj zHxY|?n3_)0)-y_$1g5|ihJYCZC*c#pxgr?iO?Ybb zRX&wPj^lK1#zjM;IW6}v8$nxC^G{!dE@d~rTW1d9{7+AVx$$r83FC(C(oL-$7unKt zU%SjV!fS>dGLlKixp|V1wBsvrt=0+CG}f&tIc6!UJ;=*9y3!eb7@(Fs6F_<#8hv#o zu_?p7PYj(na8vV7`VoAg+slTJ8?gDb#-ZO>zn#v%orbfR*-~2!6Zfq<_(ftbxcECG zr}rRjNYZ|=)EOf{{7zg#B3>&)*tQcE9UmW`)zTv4G+LyAjGF*nRB2o65_{v1E(bzlN25`F|CC5$hc`t;h0m_TBG32FPs(q+_?XD%ZES<0T(Hq?8mF zFB%U78V6`JItyqi$kOuY>{oiLtE+1V2e&r{ZJ;YBfVWuP_bh>DiNKrXjS`*Q_0}L` zW8;IxMH^Ra`_bWCPo+7c`|Rwj-1Q8+@OAu?;hT;1b%R6afB-2#>a(~L+X@t6)A!~k z%+8MR*4Ea;-GH35H2MV@rxEwYjQBmf#d{cq!~OlIZVO2H`Po@H3!tHaNN8wmoB&D? 
z-I+iq1$#6*KW`=G*w0uByyDML5fBh0B(rp#O9Fk3wY3>{pe7M}bZ}tQ*w8>>WNf@P z=Dq-8>pXw)hFd2`N-pI5I41;V-sDR5dCo^Vsv(O-IqUcm~u}@?4N))R8?)4pEPA}u_UUt2D2VNEE!{Fshk_lHB^HlfMB@v zc-l328MhkQ%Vbs)P%qE8KYskU0fCq#9ClBzlYxXSpeH7BouG)p!NK*9AMZh1eSx+M zcDkO29usZzL>YzdhOlA>Mte1%*?EY+{?w9^pyfp%n3RDi^kclU0t=lrt6X()giFZuk^5^oZx-YeOl9U14Trt=(W@0{Lb9`i|m zpo=FJR`0#0v*2|?g|4#PFrS~8;93}wq&bKt7>ppgQGs!N@q)4WbiKTCMO#~YFw-f| zSwIn}gXOj#&0}WVZ!id^$Qbdg@B5dUS6JI{XuI{3|2b=Fc=It1$vID|51ig za@IybaV5p@^~w3fPGM<|S4`uaj@+(F>8DSCYx;ybX7O?Wul6I(w_CA(Ag8$&U*w7x zKR5CY%|dFu$1mj#iH(g7^{VDoX6Q)@pJ5KrJj9qO0F^YSh2W#cCJdl(kFeI|mm7Mb!ILZAzb0@2)0IDX3T@5x z_(rDV?OpF1oNnotNg&ngw}lhv*r;k{L78@hw8NTNV!@T30*m}+4L@22PESu0vB6k4 z8VC4Nb}h#{zVW=QFy8AB15<>y&IPXJ8t_E32omMEVVX_^I|6p?NIa}UP8=3++SwA) zVfD{@X`s0-P-reT^y&0^RI;wP}MYL zk4yjTwEb7CODVYSL;Wk#{eR1LpYfhmS@fb%VOV@lPR`ncjY)5e8{#DGXn%hPD9W1m zwYIcOwZjpOK?my-NUV@q!&@?1gVOcDi-SqLfSs=-ruB~6nt?Ew5D0KCovh#HjwNDA zstrIATE=aCb({rGyUD%2V z77$vAa}8x<@o_4 zAqa$5bNyR@*6EPg_@z#uH0o3dG_Bd$Lym1{`~a-iZ`UvQQ`Z2{U+^>Un5Vd^v+E)U zK(vyA%at9tX|Jnr0yamSQhTmRQI$WdtV1_9Ha;x(gTi1#U*=q(hCOVlSew72#c~(> zV|**uc(%l{U+$1VGnb@l8=;eC8{-QLLnQ1$@0;dM&(12(Hjm~A_)TiUYVCVU+B)3l zWI$;`)`PJ^$P5F);F)Ay1r6W{rz!nz-W>=VpB>n;)z`TD?bIS~UqTbIVTEaXEVl}~;dHWX&rjcn2qKlzzE79XBgvrD z+SoF)Rm9ue|qOu zu6R~}HKu6pIY>pgqrD8{c2kN^KmT*5^!;@|%;ssl&VwYsWIXk+`@UVn?%tML=x2mm dhr3rKc00a#%5LXdz>hsZ8mc-fRmwII{|9TMBa{FD literal 13306 zcmbVzc{tQ>)c3btS|k+_D!)P{ktK{JL@AMdSJ^^#lijFnMTVlqzGNE=S+Y}<5iw-T zo^=@ezVn{1e$Vn=*LywJbG`EiGjo6MbD#U%XS+Y=%nRka@`q@dX%PfDbnB+93WDsy z!22+oz3?~c(3U#<+wCBION|Cz?li{!@cE?U4J}7i8&gN82lgh2nYE3T36F!3y@`pn z!y_BVsa@p~@X!VHAsKs<2ae`8)~D6XtxOP26XVnTS5M2CJUq>Rg`fX4-!;)IBBED? 
zPb;e4X3C0SLXgwQE!pd8&XE({Zq8~8QKT8A%Y)K1XP*AE`(_2t7yI_S_^Dp{Ud zOX+c&`aaoRJX6WOd4xsS+Oha^Y%IcX{*IK2%Il=Lf{D+a)0=BkZhG>-sGnr6*S0L4 zoe;)si{e-=N*rGCvEL8T(W9?DJpYrxx~7ky58Z2^RaM|lPLPA1d`K;cG%1UvzwkDVH-g0(_ z;h!@F(fGRN7oJHLb-QhNM2 zJsNjhR`%HLZ80@BH=mZ2EGK7xhFD+332@E3b${6vKx?l+>yI~VPOIB;xnqcKU5Q@mHUXke_%kL ztvDfJmrcpcUV8@zyZ&$c5zUw6U33_)$jDQ7qA@2~Sy{O*T(B=Xla`WV>oUH)?CdvJ zS5uR{_aE+*;>iez{)(ER6#FQndZdB$5)JY zlIvq6r|eIde(~CY%1vf8SXW0UJSiWCb#fY*$nTXiFkq8dpO!(SWMy|xmM$3#4i26c z6D#g*b6OnNPB$!P>&&;2x3s*7+^06DR{!|%wS2v{{82|RSE91M{)FRz(`1*(nw{H$ z0|#VGO_Q1~gnsxi+HyV!RR1Zej&FV$5tAr9SFbXFeoI8%1~>58s?!%c*4nqeGKc7q zDI?$8Bb5{sXy5W02AR~pJ#+5d_vC?&$3mg+-id7#50xCDzc=0I>*rrBRX=PosG+Jl zFzfl_#}5zMZ{GVEu6V9H>*G|k&CR)m9{={Lix74bk4a#3a*(*FN&FVdt?Ia;S3m4S zOG~S4U@$rBIWseJxAr6jtR;{HCLI!keexteC8c_E*Uohj&4hOe^7ym7yyOQGl!)H- z^Y-@kCDG)@02Nw4I(9j0Yhq<2Ia0(S%cAAN&uH<^%rKuhX<6CUZz!e1dUY*>6v_bX z>T>V-a}I*_p5gCX3zMQJk*zOb6oG5V(W${@0i^3Om2-2d_VoYtp@K8C{% z^Zl73?4={(yr5}5eU*L=2LArl2T(0s_|;n&-lL?v5S`W-a1x$8S`AOmM>A3jUpx`! 
z6?oKe`kf8-6uM>0uHCCJeGjaKQDENVdQK6MlalLwbc%eG1jf=~A6JR<=|x{T z&r)B29%b(qr&tWi?~Jem#=Z)~B)kK^S+X)5QC(N}9b(N6y$^HR1-574#O{btwntI> z^H{e9aXe!M=U~iSO0jTlWp;89oc);Yl=0~oNHd5{{AE{{-grme#S!5Ctb>G$i;FWQ z%IS3P-V7n&Lvv0V&MPG&1M4FBfswjVaA1**fPiwqwWW<50{=!(B-#9#NVPTPF-gN$PR{{O; z{v_Hq@IuaKg zNp8H&MP3>C-lD52!YeB)hnh*(uel$$H8eDC*0+rK%!xd$6k*>2?f6H1WV}7cn&Pv3 zc0c_B8cv!i6Wrny%WIlOmo(=@dbc*0Lf*W2Zkbuy`seCF&&av`2jLI~C+yu>F46CY z%}D!4@2b|;)^?Zflb;-RVDp&2_c1w{#oafK)DlUa9jg80zG7d$vV4ooXzaR`#ZY!y zMpnnRQs$U(ctiwi^s)JMGmt$uH}~CrFIL`$d;y8A*wvFT^PaU6V9D2bWsiA-ShG(b z-m{Oro}z_Up~Y>^#e_sej9sBPbXw&C*xQuV(sw9Ob4Sep@^-_hU(zO1fW_o9l5O62Cz zyJdM5;R*^0SrD}zw`YU2uLn6ud?mH$z5pv$*3cN9^%U<_=qf%qlkrXVDVTD|-D{dM z#8V>nyFt$m&xr_k+3o_By%xhTc#XoqthaI06c+oi3l=l571VqK(LFx?tfGTpsp05A zC+jm+Cnw>>$?gEqS@-!yc?2d>VPbe>sg?H2^5T4+lbF(PPJt?=DMC83Fmhjc= zrrhYtiP4_kWK7)wku?vCNXB?&u>3uFV6A?I;nBUQrxRqTTHjL5BPbZAbfx-SJtesO*s){xH&z#Zf3}M7s5hG({DFp`3ZDZa z6dCJ6S1}=q4_7m!u5TM0h)GL#UL_ZX|NMC;`9+L_;2=*uMLE{C-f#Y6TZuAxD$(5B z-2FlDrROYc{9#E+=aW-=-W4*hG)BAM2we@B6EO}2CczUNC@d_DoI6uKGCnSAXZNXT zs(5nPr@y~H(z6$W(FaI0j@hEFq-QYfj@jNr%-$~;I2Tg|Bh${{MC=D?AanTSwqy;w z$6cey$k__=hQcXAUi)roOoX?2&J9?u=IWXbHvA1SE-XyMgPo6JQT_8;jEpHsgq{C6 zm1^-$ax`>K+E@slB12fIeypgg8_~4(L+@3NVFhH9KQ!xp9C`KX)$bzg#nwzi^2({~ z-SsQMCkw+FFK}^vS(&3WwzLe*d-C%RJ|2EMC*5oal;k0ESY$u^rI49c2+rSS&Qw4^ z070r=vG}oXEt4Y2XKvdHQ5cRN_klFgzTWT7jVYJ7^z=*roIk&Jnzot%wuzLHTfULN z!ghWuWabwGw}B=mQBFNSe%*Y_@z;5-P2{r2mz|R%7$qFS7OL{Qz%HTF1|uH!_HBG( z;vNRhyYmH~2?Sc)`T+Yqb#;>%BPKzMW2hVg*(KQ^RX;qY6peX7&mnhOL`1u(P!HRp z{$3JSmXs&!+MaL2F0uS)m!c&LGqbXmmcMyZl1K?wp4%rooBx!!>oLfNb8r}L?)^Dt zWI0ICNJfl|j0tIJ2L?#^Hpk1Usi|5;j@+Q$0vVjYD&eY?(Lw0pGqsz>#|h@hCi6xTUqVAB^(7kX3k+Q?0DL{6L={ zd+WT`KNAIgjg1#GOP+B?J6DV}P2YkXcy?~?mBb<1Go&Ea%K6bIcpn3zc-WkBPcCM! 
zPcRT}ZfQB`!7R~U<{`%BK7Bnw>A7_EQrVI?L@*K%`GKb7QDRQa9?aK$7<5G~9zWiV z96WXH2w7q+y)%y;x6TJn@&MwzGW1TIZS4nV%xTm7vQw z_wFX|?0c7)?~LMy^=fg+s!4ag0y}@6Oh9b6g_IfT5#Jl_O67(u?U7C`>A;?FjDdCG z(21WPu97A?(7d<*djO&*x%aUU3o{GLJr$K_y&IEG)@_Ub;Gd3#TR+%VxlJJiFxK#L&(>NV>ww2_HXx8=#z@<>c9l+NtR|8j%i2 zhGVhD<~leGHat(c?Y%LVNHM4S0p_zfQP1@Pk|9P%Rn^E)3`)(w1`1!}yt$Rg z(YJCa(Qb#yl3OK>$$f7ictPIE;=z1O`Y^jhV1!LsJ1Bg6p#$ZLb=Ni25N;Y9|5H?4 zj9LwPi@p8!65px-mp7=_zI}&JU0D@ypWT?kvBM?Sw=tPsP0QKQku1LSXS%O)vAbMM zfjji=+q0K1AC~buj4etkE@l}T8d}t(%HFwi5H8|ZZ2w3lFIiez%G|iI=hv@aM_5?W zM!Oh!4bsh7Lkx(ZGF@&{rCeNG4Y_6Dv8=4DZYn7`tx^=Y^~&5&0;$97baZs2lzR!m z@8A1pWo1b$<+MNM;^Dag;1IY&fvDPC^ClEYP_%q8pKV&lT(Ys)8Qv-+Boq(yPM$n@ z*-{zuq3ciLd|JCn=?^eG(9^*wE}PUId=Jzd&tqUTim` zXJxgTZa;c*tgFyLtHhZPy+f~&9vT@nc63;5Z8`P0a0*#r(J zl%%+bKg4w9OW%vWkGI}rd8|G8`JS)RU-cjULx5`at`7@qmAVRUXUqXVBsu^2=QJ-b z9i8}ma1&!|wex!1p7W}PhIn~4m)#DO`WwL9`8)5C#ACjPSbY81J(-XzcKx!DSgenm zNuthmW;8j5*}YY^Dy#i5dh6VrVc()>_gp^n`pp}928LaT^Y8~V1bOk|1rx^kN{M3#Y=rNHxA6G4~ zIMtg8$Y3Ipl9Ix;Fc-6>SNEPzW`}b5`Vg=Bty?dcuiW>dJ#avEYom87BRf0$7;>DR zJ}%mG)f}RZ>HN7Pd+uDn{2gwurVSpse&{+ zmCdm(z{#aePuuR1QVbWkB6?Y(=C zl*jbfeHK$arL=t!$Qe&(&L-o_APm4A}xaCUr=Aqhm>hA6(J7YJewlD>es##j5ebtm1Gz6THIvip1S(3AOs%%Y7T|I1dR0&5w zQgp2gZ*V7u)|slJ+HefXgS&{jfdLUvKeO^)R;RAGnwpw}Wz0vrrkT>6T3jG@tDBi6 zXSYA57L5^e+E^QNW+Np$bYLfjo3o6qY_$9933mbFQ&C72v~2x77Br&mV&Ji$4iZaD zOiUL)b~I8w3jC@b%m6g;6g4Qg#sOR*S7NAzx%q>MxZpqO)bXBDUG79RE2|8dZh93L zfVPP`Iyxwq5KxUL_|8~6AM7VkY0R+l^XZspd4A76q!BnGTTpJQ59|<%cm-dD=Mb72 z8{;g2t;6b4pGouqL>R|-})Pu-q7iOxNf z_Dcopom|ABr;hThk(fxOQTmq>$+>HK`9mv#Be$8gkBAb>KANE}|0q?gLLc3Mv7ini zs6oZaI~nDvS|L|sdle;g6(+zeB!B5(y<3Yoa@E)xix)8q&8jfxLXR42T;e1RPni6* zG;s(R2Ksa%k}Yj*mPAo$%WnGU3wT=%H;K(Q=wWta_89o*?orKDSKj7K3oeOm~o=u<^s8g!L$~&vN$WC=!wdujYxU7dqsZO@> z3pjjdT_^9>ZDCzCTa z$sJAq^{tlMdH_qb=ZH-U{RVLGZd=8Utm6=L)2Av}J28fd)|`pqD(>)&ass$c|kD*-PkF2arfIve6 z)qnFp2&@kHl5V@Iv&PAH11G35&wtcxp?7VDrBSAC+5oxj!HoYL+{h4%FGmG20xHKl zqK@@8z{-!@Lzi247p)F&oxIu5ab?`7_V(a>J}MEnq0&&zinggaZbQi3u~Mvy338qL->8Jnyhh7-Sn;+ 
zY%`7W?5R-q08NuGO{<4%(V9NiI~ussc9AtVPc0e<5(9hTz`Q zt4rxh4wVK67LTvt0EZ0Ju1^uJdaOEP;lisg=~oVH6mJW#PP?T&#|%%5sWgPWZF@PV zg1*tOy+AH!k|q_i)Qc;hB$-Z}NhG>+)YR6_>9){sri)duZ`(dxIsOPI9&!BhWk zD=b6;BulPwfWk{Reky%&WJNY4y$=#B&kWK2LQ3=;ScLdRB?E)#Nz#`79ijH=;&5@M!qv}sVn01VxSpA!6*1dRynJ&Zb7NLS z%S~GIG$H7k^K|UaSV&FR!IP;)PkAmCdy061^+F!i8hxa(yF0^kAgU0>@6!LAB@* z4*6B9f%*;x*PZvUvgD$SJY3j|I`lqALCM zV(`C%sCNDxjHLoubTCNTx{_f8wgsnk`XWS;{cLh-tn}4I96PnQm%9=V{Is#sr)=}% zM-QZmU>%gyoYR?{K0XMj96-$&rPgYy)?00IR010ME{KVB3ILVVSsJsv#cq02R05o~)UmM^xJx}G09}o3x)9WV&OTGf;xhuK)5Iz+g~{)BnD{jA zoBa86YpA+>$=YOb5`2GW+c)4g#tdIcWW)lBnFx65L}podBGsQjxX=VVC)Ggy6+O|J zzwf*7xg6)I(j_%l*COEWiV-l7h8{tAeUwoV?q&(-i}^3BlU3EYs9O_;yethyNhRZVFE4=Qbq}`o?=8akY*0!FCET|6uVi z32ji$w;e_w4k?id46Ljfo-+Z4@4-MI<9#rUF6M6L15fHXd4E8Ub|&gL@azUlpfQl- zqyefo4ol?C(nQoH8Q?asVej8RYVR;wJ%rC8IIPZ_Cn(jPy9n9`#L)O}gcq;sNC;7f z^OOcEL`;QgPAW(wDnRG#2{(b2rO^znpgP{&1>LENj zSD~9;1NAME+^-9UC#rcPM@HIGR_&Q5xg0I_` zWT7svl`%m6wnmrdJOq-UNEgE7b3cD3w4`a~EOJ?Dus+BG7FArl z!z}RTLKuw<(TbPbbNrPv#sBLQ@QgnD9iG*2kxgl1OZ#^wD9~{WJon$31egiRp`0$R zLHtYSzdoU38UH!<_Y<5Yn~X3<1GJ&$->IqM|NRqTn-Dsd2UGu!p;O^db6TUXB!)ga z0N{NJAX5vZ!+;y==<3olGA5gE02t#(iNUc!YYfDZikh&CpPXkN0xr9#-PzIc5-!vL zSv(S5gbGz~aBu^K2?f0-q5I_a?SLf9dkTG%#Zy&`hVD;6P6HwGY<9(Efbl})WP(d~ zlEbZS9Q$F65rDtuMdhuQsVOO6ASk^AY-YSOpAM>lZ20cR#vO^r;u8`Cqi#z}?}T)H zc={7SF`=QMkBTG_0b$|dlDB}&9!%5Bl(Vzr12iQIFu`=4JX$_JJ~n(OG{L~*->RxC zMDfs40sRc~B2`d>5a?#N#qpFzNq8_zZj36~jsNCDzCgUSS|8+;{2F{lW;K0ca#GIC zO>DdE=(nAn9TOx)l_ZeR%YcBs;o9fBY=}b?$t@Ajwa(stUl!{xkRWvt7Z%499zVV$ z7s%8%&6ju#1)C-(trbZP_B=unhoAx|dJ#$ah)HLATtmMgocKO*wIbE^ZWw? 
z9ak|QK?}y(GD&ZFbOE;k+&l~Laun+c51N?BGd4D!cZr7fk5|xVq_|{bVP76@BriG zW=BtGQt}T9YF#gpgwBKK`{eWJ~dPtJj&>C96s9+`b1#j1;%6q5hW`|Bd zpVDWW-bq^?mGyLI-F)k)$~?@*e3Ng?o6LNG^epUyUM)kPD;9SF`(wG#34|VZ6avhN39KtJOqD#no4r*yG}SY zHFebH6iHa%dAKB5{BLjm>>1Jh@}7u zZNMty?7-7=sTX8A#+L+GWXw2yL7n+)k(EqhV&m}Q^ zDa~_3)6!04fv|4n3y2w-+FF(Mp~K- zC{>K5h+V(;u#teEU>t#<(4-SlH>^CjF*8-N)HbM^e7F5#o-nSBLoRSyOKyFmG%qEE z21>P165NSE@6cYLV8FWT3Te(8xYjFj{IMyR4Gl#{fsCjx_tzb97C<|MutW^W(8r zQg;mYZ{_3Ggz=UE0Rd3V2@Z}O#e8H!ldMI&pr9Z(azRK)qlphX=#C=`ev<272Re>l z)Zjv>P=sxAe7tTYOc>gjon4wQ?%k;G8`!n?pfaKc@+27rvWjW!7@4Cjy9$Q=Y-cXr z>)M}p6ct~Q@>C8AS$B<0&xD2^+w-MQeXl05h56%pP=lFw+vH3x!j=~IyaS z;x@|J3?QXAQQUDPcxNnH%k#q*uZI8m38C&7BIq!DXPe|12c9XG^5JmHW5_~);d;eD z2L!!4JCrr`v0;@YeGi}yP5~fg0B$ivb%j*uumq@hbA`gzRv#^{yF*z?V&x+#k3x>o zMAV>E`*)HY?aM|o1Jq%oB$R9y1iyX&wJ8@@a)$PgwGGVImbTqWqy~N?E;8CzH*A-I zQ9^rU1foa~Adew_?CAqkkix89e*%Y1I?h4Mn^1o_;?T1bz^!5Mhr9D1s zejVZ(6vV~ZQsUxvAhu14!hPT)pzI0m4Ds4p&D^Rmi}4tKVui&DBC&wRH@Q@QiZZvd zYHVx+j+2_(z-$599mH=0}uvLAk@g#x@x6Q9KD2!~?Z!<=?R zukh{sykowlbKti&wC-|;7kw{Go624Xm&JYTpmm!X^A-<+yy z-6kL;wDU^a+-|cT*D;<CSWSiXVUWa19>}sKJ z2$`Wv-1KhbwYX6x$O4dgT=X!%Pz>J0EW2hl+W*rv$DCy}tQbet?dOKFZ+OM3|<5UBO!@pSWRh zILYYJ!VK-4sy-d)Ngrt0;NajmRKB;6$FQucPshMz=y>KNJ5&Of1*@$#*H;d=oh|>= z{W2fQwf`Bk3tiSQeNe%0D0mBiSZ^j-umrxsn)H8rQDd6Zo;j3>+emxq?|)@o6q$`#Ny(voyEQ?RS@5+|ONyBF-5;KKhUlw&y27<>7drLX}IA zbob63Sq+W1l_bc~roS^9J}N5DZ&QJ+@ayw^M$m7}*7i9+KLI+jZ{0$B71-JFx89r= z3C#d?i^RC7G;Qqo18y4n&K9&tNNiHlUPy4+47+y{OQ^pZ8qf}KxDY_P{IDjJ`!J;u z+Sn9U2tS%`lt@{9zci9tKYx1@Rr3zMaqy>u6$yG-wZ&P{8`NU)RZln zkitA@d(RkUdUsjB0xa}YP@e)6nc*TLARb-1QFIh~p3wZsUH?=2<7+OnA2-4Tl}cPN z!otD{P$37&E*7+Qk{&_@G)XaHcv?$Mp$|0wk)>smV_@w&UXC+ot~U)7W@H@7$jD%` z%!rgXF*%2pl14F*kMmh|P%CtYqnn2g9RlAf1zqK&yiNCL z0FyZHj?ZnU8tUrm;gh}1&AStiK|{MZaRJ;p0-fJ_F|X>WG7p zO=nmZ%+7<>8GJNUBn7+6K~|7lc&>B|*1@7T@7}F~9JGsQSN!)^6lb&fHyd_tM-f^b pv_X0MrxO2Dr1;SBvd+alW#fAF*{{4NK%b>prqQTqO~{{zawnK%Fd diff --git a/content/english/hpc/data-structures/img/segtree-wide.svg 
b/content/english/hpc/data-structures/img/segtree-wide.svg deleted file mode 100644 index dd6bc878..00000000 --- a/content/english/hpc/data-structures/img/segtree-wide.svg +++ /dev/null @@ -1,3 +0,0 @@ [SVG markup omitted: OmniGraffle 6.6.2 export dated 2020-06-26 13:14:04 +0000, Canvas 13, Layer 1]
diff --git a/content/english/hpc/data-structures/img/src/fenwick-sum.svg b/content/english/hpc/data-structures/img/src/fenwick-sum.svg new file mode 100644 index 00000000..a6d62fb1 --- /dev/null +++ b/content/english/hpc/data-structures/img/src/fenwick-sum.svg @@ -0,0 +1,1449 @@ [SVG markup omitted: OmniGraffle 6.6.2 export dated 2020-06-26 13:14:04 +0000, Canvas 18, Layer 1; numeric node labels and cell indices 1 to 16]
diff --git a/content/english/hpc/data-structures/img/src/fenwick-update.svg b/content/english/hpc/data-structures/img/src/fenwick-update.svg new file mode 100644 index 00000000..394e041a --- /dev/null +++ b/content/english/hpc/data-structures/img/src/fenwick-update.svg @@ -0,0 +1,1406 @@ [SVG markup omitted: OmniGraffle 6.6.2 export dated 2020-06-26 13:14:04 +0000, Canvas 19, Layer 1; numeric node labels and cell indices 1 to 16]
diff --git a/content/english/hpc/data-structures/img/src/segtree-layout.svg b/content/english/hpc/data-structures/img/src/segtree-layout.svg new file mode 100644 index 00000000..257b831c --- /dev/null +++ b/content/english/hpc/data-structures/img/src/segtree-layout.svg @@ -0,0 +1,1064 @@ [SVG markup omitted: OmniGraffle 6.6.2 export dated 2020-06-26 13:14:04 +0000, Canvas 10, Layer 1; numeric node labels]
diff --git a/content/english/hpc/data-structures/img/src/segtree-path.svg b/content/english/hpc/data-structures/img/src/segtree-path.svg new file mode 100644 index 00000000..b4f27e99 --- /dev/null +++ b/content/english/hpc/data-structures/img/src/segtree-path.svg @@ -0,0 +1,1786 @@ [SVG markup omitted: OmniGraffle 6.6.2 export dated 2020-06-26 13:14:04 +0000, Canvas 5, Layer 1; numeric node labels, cell indices 0 to 15, and segment labels from [0,1] up to [0,15]]
diff --git a/content/english/hpc/data-structures/img/segtree-permuted.svg b/content/english/hpc/data-structures/img/src/segtree-permuted.svg similarity index 99% rename from content/english/hpc/data-structures/img/segtree-permuted.svg rename to content/english/hpc/data-structures/img/src/segtree-permuted.svg index e43e1474..5ef98c2b 100644 --- a/content/english/hpc/data-structures/img/segtree-permuted.svg +++ b/content/english/hpc/data-structures/img/src/segtree-permuted.svg @@ -13,7 +13,10 @@ height="172pt" id="svg460" sodipodi:docname="segtree-permuted.svg" - inkscape:version="0.92.5 (2060ec1f9f, 2020-04-08)"> + inkscape:version="0.92.5 (2060ec1f9f, 2020-04-08)" + inkscape:export-filename="/home/sereja/Projects/algorithmica/content/english/hpc/data-structures/img/src/segtree-permuted.png" + inkscape:export-xdpi="132.43047" + inkscape:export-ydpi="132.43047"> Canvas 5 - + inkscape:version="0.92.5 (2060ec1f9f, 2020-04-08)" + inkscape:export-filename="/home/sereja/Projects/algorithmica/content/english/hpc/data-structures/img/src/segtree-succinct.png" + inkscape:export-xdpi="132.65491" + inkscape:export-ydpi="132.65491"> <sodipodi:namedview pagecolor="#ffffff" bordercolor="#666666" @@ -27,9 +30,9 @@ inkscape:window-height="1016" id="namedview462" showgrid="false" - inkscape:zoom="8" - inkscape:cx="396.19673" - inkscape:cy="29.549452" + inkscape:zoom="2.8284271" +
inkscape:cx="186.96904" + inkscape:cy="56.328076" inkscape:window-x="0" inkscape:window-y="27" inkscape:window-maximized="1" @@ -135,11 +138,6 @@ fill="none"> <title id="title14">Canvas 5 - [0,15]</tspan> </text> - <text - transform="translate(118.385826 235.1063)" - id="text454" - fill="black"> - <tspan - font-size="8" - font-style="italic" - font-weight="500" - x=".33203125" - y="7" - id="tspan452" - textLength="5.3359375" - font-family="Linux Libertine">A</tspan> - </text> <g transform="translate(-79.370032,1.4160156e-6)" id="g10659-2"> diff --git a/content/english/hpc/data-structures/img/src/segtree-wide.svg b/content/english/hpc/data-structures/img/src/segtree-wide.svg new file mode 100644 index 00000000..1d38e472 --- /dev/null +++ b/content/english/hpc/data-structures/img/src/segtree-wide.svg @@ -0,0 +1,1696 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<svg + xmlns:dc="http://purl.org/dc/elements/1.1/" + xmlns:cc="http://creativecommons.org/ns#" + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" + xmlns:svg="http://www.w3.org/2000/svg" + xmlns="http://www.w3.org/2000/svg" + xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" + xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" + version="1.1" + viewBox="60 54 410 70" + width="410pt" + height="70pt" + id="svg392" + sodipodi:docname="segtree-wide.svg" + inkscape:version="0.92.5 (2060ec1f9f, 2020-04-08)" + inkscape:export-filename="/home/sereja/Projects/algorithmica/content/english/hpc/data-structures/img/src/segtree-wide.png" + inkscape:export-xdpi="103.46" + inkscape:export-ydpi="103.46"> + <sodipodi:namedview + pagecolor="#ffffff" + bordercolor="#666666" + borderopacity="1" + objecttolerance="10" + gridtolerance="10" + guidetolerance="10" + inkscape:pageopacity="0" + inkscape:pageshadow="2" + inkscape:window-width="2560" + inkscape:window-height="1016" + id="namedview394" + showgrid="false" + inkscape:zoom="2.5390244" + inkscape:cx="116.76816" + 
inkscape:cy="-10.971841" + inkscape:window-x="0" + inkscape:window-y="27" + inkscape:window-maximized="1" + inkscape:current-layer="g390" /> + <metadata + id="metadata2"> Produced by OmniGraffle 6.6.2 <dc:date>2020-06-26 13:14:04 +0000</dc:date> +<rdf:RDF> + <cc:Work + rdf:about=""> + <dc:format>image/svg+xml</dc:format> + <dc:type + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> + </cc:Work> +</rdf:RDF> +</metadata> + <defs + id="defs4" /> + <g + stroke="none" + stroke-opacity="1" + stroke-dasharray="none" + fill="none" + fill-opacity="1" + id="g390"> + <title + id="title6">Canvas 13 + + Layer 1 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index 88ac5288..35591bc0 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -32,7 +32,7 @@ The main idea is this. Calculate the sum of the entire array put it somewhere. T These sequence of computations can be represented as a static-structure tree: -![](../img/segtree-path.svg) +![](../img/segtree-path.png) Some nice properties of this construct: @@ -171,7 +171,7 @@ The last issue is the most critical one. 
To get rid of pointer chasing, we need To store our segment tree implicitly, we can also use the [Eytzinger layout](../binary-search#eytzinger-layout), storing the nodes in a large array, where for every non-leaf node $v$ corresponding to the range $[l, r)$, the node $2v$ is its left child and the node $(2v+1)$ is its right child, corresponding to the ranges $[l, \lfloor \frac{l+r}{2} \rfloor)$ and $[\lfloor \frac{l+r}{2} \rfloor, r)$ respectively. -![The memory layout of implicit segment tree with the same query path highlighted](../img/segtree-layout.svg) +![The memory layout of implicit segment tree with the same query path highlighted](../img/segtree-layout.png) One little problem with this layout is that if $n$ is not a perfect power of two, we would need more array cells to store the tree — $4n$, to be exact. The tree structure hasn't changed, and there are still exactly $(2n - 1)$ nodes in the tree — they are just not compactly packed on the last layer. @@ -297,7 +297,7 @@ int sum(int l, int r) { This results in much simpler and faster code. However, when the array size is not a power of two, the `sum` query doesn't work correctly. To understand why, consider the tree structure for 13 elements: -![](../img/segtree-permuted.svg) +![](../img/segtree-permuted.png) The first index of the last layer is always a power of two, but when $n$ is not a power of two, some prefix of the leaf elements gets wrapped around to the right side of the tree. @@ -383,15 +383,15 @@ To make a segment tree succinct, we need to look at the values stored in the nod Note that in every implementation so far, we never added the sum stored in the right child when computing the prefix sum. *Fenwick tree* is a type of a segment tree that uses this consideration and gets rid of all *right* children, including the last layer. This makes the total required number of memory cells $n + O(1)$, the same as the underlying array.
-![](../img/segtree-succinct.svg) +![](../img/segtree-succinct.png) To calculate a prefix sum, we need to repeatedly jump to the first parent that is a left child: -![A path for the sum query](../img/fenwick-sum.svg) +![A path for the sum query](../img/fenwick-sum.png) To process an update query, we need to repeatedly add the delta to the first parent that contains the cell $k$: -![A path for the update query](../img/fenwick-update.svg) +![A path for the update query](../img/fenwick-update.png) More formally, a Fenwick tree is defined as the array $t_i = \sum_{k=f(i)}^i a_k$ where $f$ is some function for which $f(i) \leq i$. If $f$ is the "remove last bit" function (`x -= x & -x`), then both query and update would only need to touch $O(\log n)$ different $t$'s @@ -472,7 +472,7 @@ But we are going to leave it there and focus on an entirely different approach. Here is the idea: if we are fetching a full cache line anyway, let's fill it with information that lets us process the query quicker. So let's store more than one data point in a segment tree node — this lets us reduce the tree height and do fewer iterations descending it.
-![](../img/segtree-wide.svg) +![](../img/segtree-wide.png) We can use a similar constexpr-based approach we used in [S+ trees](../s-tree#implicit-b-tree-1) to implement it: From 3968141169ad8d36c53d666a63f89b883bfd28b5 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 27 Feb 2022 18:28:39 +0300 Subject: [PATCH 258/531] segtree intro edits --- .../data-structures/img/src/fenwick-sum.svg | 10 ++-- .../hpc/data-structures/segment-trees.md | 56 +++++++++++-------- 2 files changed, 38 insertions(+), 28 deletions(-) diff --git a/content/english/hpc/data-structures/img/src/fenwick-sum.svg b/content/english/hpc/data-structures/img/src/fenwick-sum.svg index a6d62fb1..1a72235f 100644 --- a/content/english/hpc/data-structures/img/src/fenwick-sum.svg +++ b/content/english/hpc/data-structures/img/src/fenwick-sum.svg @@ -15,8 +15,8 @@ sodipodi:docname="fenwick-sum.svg" inkscape:version="0.92.5 (2060ec1f9f, 2020-04-08)" inkscape:export-filename="/home/sereja/Projects/algorithmica/content/english/hpc/data-structures/img/src/fenwick-sum.png" - inkscape:export-xdpi="103.8797" - inkscape:export-ydpi="103.8797"> + inkscape:export-xdpi="99.539169" + inkscape:export-ydpi="99.539169"> Segment trees are cool and can do lots of different things, but in this article, we will focus on their simplest non-trivial application — *the dynamic prefix sum problem*: ```cpp -void add(int k, int x); // execute a[k] += x (0-based indexing) -int sum(int k); // sum of the first k elements (from 0 to k - 1) +void add(int k, int x); // react to a[k] += x (zero-based indexing) +int sum(int k); // return the sum of the first k elements (from 0 to k - 1) ``` -Note that we have to support two types of queries, which makes this problem multi-dimensional: +As we now have to support two types of queries, our optimization problem becomes multi-dimensional, and the optimal solution depends on the distribution of queries. 
For example, if one type of the queries were extremely rare, we would only optimize for the other, which is relatively easy to do: -- If we only cared about about the cost of *updating the array*, we would store it as it is and [calculated the sum](/hpc/simd/reduction) directly on each `sum` query. -- And if we only cared about the cost of *prefix sum queries*, we would keep it ready and [re-calculate them entirely from scratch](/hpc/algorithms/prefix) on each update. +- If we only cared about the cost of *updating the array*, we would store it as it is and [calculate the sum](/hpc/simd/reduction) directly on each `sum` query. +- If we only cared about the cost of *prefix sum queries*, we would keep it ready and [re-calculate them entirely from scratch](/hpc/algorithms/prefix) on each update. -Both of these options perform $O(1)$ work on one query type but $O(n)$ work on the other. They are only optimal when one type queries is extremely rare. When this is not the case, we can trade off the work on one type of query for increased performance of the other, and segment trees let you do exactly that, achieving the equilibrium of $O(\log n)$ for both queries. +Both of these options perform $O(1)$ work on one query type but $O(n)$ work on the other. When the query frequencies are relatively close, we can trade off the performance on one type of query for performance on the other. Segment trees let you do exactly that, achieving the equilibrium of $O(\log n)$ for both queries. -### The Structure +### Segment Tree Structure -The main idea is this. Calculate the sum of the entire array put it somewhere. Then split it in halves, calculate the sum on both halves, and also store them somewhere. Then split these halves in halves and so on, until we recursively reach segments of length one. 
+The main idea behind segment trees is this: -These sequence of computations can be represented as a static-structure tree: +- calculate the sum of the entire array and write it down somewhere; +- split the array into two halves, calculate the sum on both halves, and also write them down somewhere; +- split these halves into halves, calculate the total of four sums on them, and also write them down; +- …and so on, until we recursively reach segments of length one. + +These computed subsegment sums can be logically represented as a binary tree — which is what we call a *segment tree*: ![](../img/segtree-path.png) -Some nice properties of this construct: +Segment trees have some nice properties: + +- If the underlying array has $n$ elements, the segment tree has exactly $(2n - 1)$ nodes — $n$ leaves and $(n - 1)$ internal nodes — because each internal node splits a segment in two, and you only need $(n - 1)$ of them to completely split the original $[0, n-1]$ range. +- The height of the tree is $\Theta(\log n)$: on each next level starting from the root, the number of nodes roughly doubles and the size of their segments roughly halves. +- Each segment can be split into $O(\log n)$ non-intersecting segments that correspond to the nodes of the segment tree: you need at most two from each layer. -1. The tree has at most $2n$ vertices: $n$ on the last layer, $\frac{n}{2}$ on the previous, $\frac{n}{4}$ on the one before that, and so on. -2. The height of the tree is $\Theta(\log n)$ as on each "level" the sizes of the segments halves. -3. Each prefix can be split into $O(\log n)$ non-intersecting segments corresponding to vertices of a segment tree: you need at most one from each layer. +When $n$ is not a perfect power of two, not all levels are filled entirely — the last layer may be incomplete — but the truthfulness of these properties remains unaffected. 
The first property allows us to use only $O(n)$ memory to store the tree, and the last two let us solve the problem in $O(\log n)$ time: -When $n$ is not a perfect power of two, not all levels will be filled entirely. The last layer will be incomplete, but this doesn't take away any of these nice properties that let us solve the problem (look at the bold path on the illustration): +- The `add(k, x)` query can be handled by adding the value `x` to all nodes whose segments contain the element `k`, and we've already established that there are only $O(\log n)$ of them. +- The `sum(k)` query can be answered by finding all nodes that collectively compose the `[0, k)` prefix and summing the values stored in them — and we've also established that there would be at most $O(\log n)$ of them. -1. Property 1 guarantees that we will need $O(n)$ space to store the tree -2. **Update** query is processed by adding a value to all vertices that correspond to segments that. Property 1 says there will be at most $O(\log n)$ of them. -3. **Prefix sum** query is processed by finding all vertices that compose the prefix and summing the values stored in them. Property 3 says there will also be at most $O(\log n)$ of them. +But this is still theory. As we'll see later, there are remarkably many ways one can implement this data structure. + + -To calculate the sum on a segment, we can check if the query covers the current segment fully or doesn't cover at all and return the result for the node right away. If it is not the case, we can recursively call the query on the children and they will figure it out: +To calculate the sum on a segment, we can check if the query covers the current segment fully or doesn't intersect with it at all — and return the result for this node right away. 
If neither is the case, we recursively pass the query to the children so that they figure it out themselves: ```c++ int sum(int lq, int rq) { - if (rb <= lq && rb <= rq) // if we are fully inside, return the sum + if (lq <= lb && rb <= rq) // if we're fully inside the query, return the sum return s; - if (rq <= lb || lq >= rb) // if we don't intersect, return zero + if (rq <= lb || lq >= rb) // if we don't intersect with the query, return zero return 0; return l->sum(lq, rq) + r->sum(lq, rq); } ``` -For the prefix sum query, since the left border is always zero, these checks simplify: +This function visits a total of $O(\log n)$ nodes because it only spawns children when a segment only partially intersects with the query, and there are at most $O(\log n)$ of such segments. + +For *prefix sums*, these checks can be simplified as the left border of the query is always zero: ```c++ int sum(int k) { @@ -164,18 +168,18 @@ int sum(int k) { } ``` -Since we have two types of queries, we also got two separate graphs to look at: +Since we have two types of queries, we also got two graphs to look at: ![](../img/segtree-pointers.svg) While this object-oriented implementation is quite good in terms of software engineering practices, there are several aspects that make it terrible in terms of performance: -- Query implementations use [recursion](/hpc/architecture/functions), although the `add` query can be tail-call optimized. -- Query implementations use unpredictable [branching](/hpc/pipelining/branching), stalling the CPU pipeline. -- The nodes stores extra metadata. The structure takes $4+4+4+8+8=28$ bytes and gets padded to 32 bytes for [memory alignment](/hpc/cpu-cache/alignment) reasons, while only 4 bytes are necessary to hold the integer sum.
-- And, most importantly, we are doing [pointer chasing](/hpc/cpu-cache/latency) in both queries: we can't descend into children until we fetched their pointers, even though we can precisely infer the segments we need just from the query bounds. +- Both query implementations use [recursion](/hpc/architecture/functions) — although the `add` query can be tail-call optimized. +- Both query implementations use unpredictable [branching](/hpc/pipelining/branching), which stalls the CPU pipeline. +- The nodes store extra metadata. The structure takes $4+4+4+8+8=28$ bytes and gets padded to 32 bytes for [memory alignment](/hpc/cpu-cache/alignment) reasons, while only 4 bytes are really necessary to hold the integer sum. +- Most importantly, we are doing [pointer chasing](/hpc/cpu-cache/latency): we have to fetch the pointers to the children to descend into them, even though we can infer, ahead of time, which segments we'll need just from the query. -The last issue is the most critical one. To get rid of pointer chasing, we need to get rid of pointers, converting our structure to being implicit. +Pointer chasing outweighs all other issues by orders of magnitude — and to negate it, we need to get rid of pointers, turning the structure *implicit*. 
### Implicit Segment Trees From 661715a488b29e5b3e0f1fc4622ea0ecb63df4c7 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 27 Feb 2022 23:26:24 +0300 Subject: [PATCH 260/531] implicit segment tree edits --- .../hpc/data-structures/segment-trees.md | 54 +++++++++++++------ 1 file changed, 39 insertions(+), 15 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index b5e39eb7..6c4c396d 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -8,7 +8,7 @@ The lessons we learned from [optimizing](../s-tree) [binary search](../binary-se In this article, instead of trying to optimize something from the STL again, we will focus on *segment trees*, the structures that may be unfamiliar to most *normal* programmers and perhaps even most computer science researchers[^tcs], but are used [very extensively](https://www.google.com/search?q=segment+tree+site%3Acodeforces.com&newwindow=1&sxsrf=APq-WBuTupSOnSn9JNEHhaqtmv0Uq0eogQ%3A1645969931499&ei=C4IbYrb2HYibrgS9t6qgDQ&ved=0ahUKEwj2p8_og6D2AhWIjYsKHb2bCtQQ4dUDCA4&uact=5&oq=segment+tree+site%3Acodeforces.com&gs_lcp=Cgdnd3Mtd2l6EAM6BwgAEEcQsAM6BwgAELADEEM6BAgjECc6BAgAEEM6BQgAEIAEOgYIABAWEB46BQghEKABSgQIQRgASgQIRhgAUMkFWLUjYOgkaANwAXgAgAHzAYgB9A-SAQYxNS41LjGYAQCgAQHIAQrAAQE&sclient=gws-wiz) in programming competitions for their speed and simplicity of implementation. -[^tcs]: Segment trees are rarely mentioned in the scientific literature because they are relatively novel (invented ~2000), mostly don't do anything that [any other binary tree](https://en.wikipedia.org/wiki/Tree_(data_structure)) can't do, and *asymptotically* aren't faster — although, in practice, they often win by a lot in terms of speed. 
+[^tcs]: Segment trees are rarely mentioned in the theoretical computer science literature because they are relatively novel (invented ~2000), mostly don't do anything that [any other binary tree](https://en.wikipedia.org/wiki/Tree_(data_structure)) can't do, and *asymptotically* aren't faster — although, in practice, they often win by a lot in terms of speed. ### Dynamic Prefix Sum @@ -41,7 +41,7 @@ The main idea behind segment trees is this: These computed subsegment sums can be logically represented as a binary tree — which is what we call a *segment tree*: -![](../img/segtree-path.png) +![A segment tree with with nodes relevant for the sum(11) and add(10) queries highlighted](../img/segtree-path.png) Segment trees have some nice properties: @@ -177,23 +177,41 @@ While this object-oriented implementation is quite good in terms of software eng - Both query implementations use [recursion](/hpc/architecture/functions) — although the `add` query can be tail-call optimized. - Both query implementations use unpredictable [branching](/hpc/pipelining/branching), which stalls the CPU pipeline. - The nodes store extra metadata. The structure takes $4+4+4+8+8=28$ bytes and gets padded to 32 bytes for [memory alignment](/hpc/cpu-cache/alignment) reasons, while only 4 bytes are really necessary to hold the integer sum. -- Most importantly, we are doing [pointer chasing](/hpc/cpu-cache/latency): we have to fetch the pointers to the children to descend into them, even though we can infer, ahead of time, which segments we'll need just from the query. +- Most importantly, we are doing a lot of [pointer chasing](/hpc/cpu-cache/latency): we have to fetch the pointers to the children to descend into them, even though we can infer, ahead of time, which segments we'll need just from the query. -Pointer chasing outweighs all other issues by orders of magnitude — and to negate it, we need to get rid of pointers, turning the structure *implicit*. 
+Pointer chasing outweighs all other issues by orders of magnitude — and to negate it, we need to get rid of pointers, making the structure *implicit*. ### Implicit Segment Trees -To store our segment tree implicitly, we can also use the [Eytzinger layout](../binary-search#eytzinger-layout), storing the nodes in a large array, where for every non-leaf node $v$ corresponding to the range $[l, r)$, the node $2v$ is its left child and the node $(2v+1)$ is its right child, corresponding to the ranges $[l, \lfloor \frac{l+r}{2} \rfloor)$ and $[\lfloor \frac{l+r}{2} \rfloor, r)$ respectively. +As a segment tree is a type of binary tree, we can use the [Eytzinger layout](../binary-search#eytzinger-layout) to store its nodes in one large array and use index arithmetic instead of explicit pointers to navigate it. + +More formally, we define node $1$ to be the root, holding the sum of the entire array $[0, n)$. Then, for every node $v$ corresponding to the range $[l, r)$, we define: + +- the node $2v$ to be its left child corresponding to the range $[l, \lfloor \frac{l+r}{2} \rfloor)$; +- the node $(2v+1)$ to be its right child corresponding to the range $[\lfloor \frac{l+r}{2} \rfloor, r)$. + +When $n$ is a perfect power of two, this layout packs the entire tree very nicely: ![The memory layout of implicit segment tree with the same query path highlighted](../img/segtree-layout.png) -One little problem with this layout is that if $n$ is not a perfect power of two, we would need more array cells to store the tree — $4n$, to be exact. The tree structure hasn't change, and there are still exactly $(2n - 1)$ nodes in the tree — they are just not compactly packed on the last layer. +However, when $n$ is not a power of two, the layout is no longer compact: even though we still have exactly $(2n - 1)$ nodes regardless of how we split segments, they are not mapped perfectly to the $[1, 2n)$ range.
+ +For example, consider what happens when we descend to the rightmost leaf in a segment tree of size $17 = 2^4 + 1$: + +- we start with the root numbered $1$ that corresponds to the range $[0, 17)$, +- we go to node $3 = 2 \times 1 + 1$ representing range $[8, 17)$, +- we go to node $7 = 2 \times 3 + 1$ representing range $[12, 17)$, +- we go to node $15 = 2 \times 7 + 1$ representing range $[14, 17)$, +- we go to node $31 = 2 \times 15 + 1$ representing range $[15, 17)$, +- and we finally reach node $63 = 2 \times 31 + 1$ representing range $[16, 17)$. + +So, as $63 > 2 \times 17 - 1 = 33$, there are some holes in the layout, but the structure of the tree is still the same, and its height is still $O(\log n)$. For now, we can ignore this problem and just allocate a larger array for storing the nodes — it can be shown that the index of the rightmost leaf never exceeds $4n$, so allocating that many cells will always suffice: ```c++ -int t[4 * N]; +int t[4 * N]; // contains the node sums ``` -To implement `add`, we similarly implement a recursive function that uses this index arithmetic instead of pointers. Since we also don't store the borders of the segment, we need to pass them as parameters. This makes the function a bit clumsy, as there are now five of them in total that you need to pass around: +Now, to implement `add`, we create a similar recursive function but using index arithmetic instead of pointers.
Since we've also stopped storing the borders of the segment in the nodes, we need to re-calculate them and pass them as parameters for each recursive call: ```c++ void add(int k, int x, int v = 1, int l = 0, int r = N) { @@ -208,7 +226,7 @@ void add(int k, int x, int v = 1, int l = 0, int r = N) { } ``` -To implement the prefix sum query, we do largely the same: +The implementation of the prefix sum query is largely the same: ```c++ int sum(int k, int v = 1, int l = 0, int r = N) { @@ -222,13 +240,19 @@ int sum(int k, int v = 1, int l = 0, int r = N) { } ``` -Apart from using much less memory, the main advantage is that we can now make use of [memory parallelism](/hpc/cpu-cache/mlp) and fetch the nodes we need in parallel, considerably improving the running time for both queries: +Passing around five variables in a recursive function seems clumsy, but the performance gains are clearly worth it: ![](../img/segtree-topdown.svg) -To improve further, we can manually optimize the index arithmetic and replace division by two with an explicit binary shift — as the compilers [aren't always able](/hpc/compilation/contracts/#arithmetic) to do themselves — and, more importantly, remove the recursion and make the implementation iterative. +Apart from requiring much less memory, which is good for fitting into the CPU caches, the main advantage of this implementation is that we can now make use of the [memory parallelism](/hpc/cpu-cache/mlp) and fetch the nodes we need in parallel, considerably improving the running time for both queries. + +To improve the performance further, we can: + +- manually optimize the index arithmetic (e. g. noticing that we need to multiply `v` by `2` either way), +- replace division by two with an explicit binary shift (because [compilers aren't always able to do it themselves](/hpc/compilation/contracts/#arithmetic)), +- and, most importantly, get rid of [recursion](/hpc/architecture/functions) and make the implementation fully iterative. 
-Here is how a fully iterative `add` looks like: +As `add` is tail-recursive and has no return value, it is easy to turn it into a single `while` loop: ```c++ void add(int k, int x) { @@ -246,7 +270,7 @@ void add(int k, int x) { } ``` -This is slightly harder to do for the `sum` query as it has two recursive calls. The trick is to notice that when we make these calls, one of them is guaranteed to terminate immediately, so we can simply check this condition when descend: +Doing the same for the `sum` query is slightly harder as it has two recursive calls. The key trick is to notice that when we make these calls, one of them is guaranteed to terminate immediately as `k` can only be in one of the halves, so we can simply check this condition before descending the tree: ```c++ int sum(int k) { @@ -267,11 +291,11 @@ int sum(int k) { } ``` -This doesn't improve the performance for the update query by a lot because it was tail-recursive, and the compiler already performed a similar optimization, but the running time on the prefix sum query roughly halved for all problem sizes: +This doesn't improve the performance for the update query by a lot (because it was tail-recursive, and the compiler already performed a similar optimization), but the running time on the prefix sum query has roughly halved for all problem sizes: ![](../img/segtree-iterative.svg) -This implementation still has some problems: we are potentially using twice as much memory as necessary, have maintain and re-compute array bounds, and we still have costly branching. To get rid of these problems, we need to change the approach a little bit. +This implementation still has some problems: we are using up to twice as much memory as necessary, we have costly [branching](/hpc/pipelining/branching), and we have to maintain and re-compute array bounds on each iteration. To get rid of these problems, we need to change our approach a little bit.
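As a rough, self-contained illustration of the iterative top-down queries described above, here is a minimal sketch. It is not the exact code from the article: the `TopDownSegtree` struct, the use of `std::vector` instead of a global array, and the `rightmost_leaf` helper are assumptions made for this example. The helper replays the descent to the rightmost leaf, which makes it possible to check the $4n$ bound for small sizes.

```cpp
#include <vector>

// Hypothetical wrapper around the implicit top-down layout (assumes n >= 1):
// node 1 is the root covering [0, n), node v has children 2v and 2v + 1.
struct TopDownSegtree {
    int n;
    std::vector<int> t; // t[v] is the sum on the segment of node v

    explicit TopDownSegtree(int size) : n(size), t(4 * size, 0) {}

    // a[k] += x: walk from the root to the leaf k, updating every node on the path
    void add(int k, int x) {
        int v = 1, l = 0, r = n;
        while (r - l > 1) {
            t[v] += x;
            int m = (l + r) >> 1;
            v <<= 1;            // left child by default
            if (k < m)
                r = m;
            else
                l = m, v++;     // switch to the right child
        }
        t[v] += x;              // the leaf itself
    }

    // a[0] + ... + a[k - 1]
    int sum(int k) const {
        if (k <= 0)
            return 0;
        int v = 1, l = 0, r = n, s = 0;
        // invariant: l < k, so the segment [l, r) of node v intersects the prefix
        while (k < r) {
            int m = (l + r) >> 1;
            v <<= 1;
            if (k <= m) {
                r = m;          // the prefix ends within the left child
            } else {
                s += t[v];      // the left child is fully covered
                l = m, v++;     // continue in the right child
            }
        }
        return s + t[v];        // [l, r) is now fully inside the prefix
    }
};

// Index of the rightmost leaf, found by always descending to the right child
int rightmost_leaf(int n) {
    int v = 1, l = 0, r = n;
    while (r - l > 1) {
        l = (l + r) >> 1;
        v = 2 * v + 1;
    }
    return v;
}
```

For a 17-element tree, `rightmost_leaf(17)` follows the path 1, 3, 7, 15, 31, 63 from the example, which is why the sketch allocates `4 * size` cells rather than `2 * size`.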
### Bottom-Up Implementation From 292f93e684d3dacee1b526d99bd1eb082c8030ef Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 28 Feb 2022 01:09:32 +0300 Subject: [PATCH 261/531] bottom-up segment tree edits --- .../hpc/data-structures/segment-trees.md | 44 ++++++++++++------- 1 file changed, 27 insertions(+), 17 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index 6c4c396d..008028d7 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -299,13 +299,15 @@ This implementation still has some problems: we are using up to twice as much me ### Bottom-Up Implementation -Let's change the definition of the implicit segment tree. Instead of relying on the parent-to-child relationship, we first assign all the leaf nodes numbers from $n$ to $(2n - 1)$, and then define the parent of node $k$ to be equal to node $\lfloor \frac{k}{2} \rfloor$. It's easy to see that you can still reach the root (node $1$) by dividing the node number by two, and each node still has at most two children — $2k$ and $(2k + 1)$ — as anything else would floor to another number. +Let's change the definition of the implicit segment tree layout. Instead of relying on the parent-to-child relationship, we first forcefully assign all the leaf nodes numbers in the $[n, 2n)$ range, and then recursively define the parent of node $k$ to be equal to node $\lfloor \frac{k}{2} \rfloor$. + +This structure is largely the same as before: you can still reach the root (node $1$) by dividing any node number by two, and each node still has at most two children: $2k$ and $(2k + 1)$, as anything else yields a different parent number when floor-divided by two. 
The advantage we get is that we've forced the last layer to be contiguous and start from $n$, so we can use the array of half the size: ```c++ int t[2 * N]; ``` -When $n$ is a power of two, this yields the same structure, and taking advantage of this bottom-up approach lets us starting from the leaf node and go up to the root: +When $n$ is a power of two, the structure of the tree is exactly the same as before and when implementing the queries, we can take advantage of this bottom-up approach and start from the $k$-th leaf node (simply indexed $N + k$) and ascend the tree until we reach the root: ```c++ void add(int k, int x) { @@ -317,7 +319,7 @@ void add(int k, int x) { } ``` -To fix this, we can similarly calculate the sum of a segment in general. For that, we need to maintain two pointers on the first and the last node to be summed, and stop when they are giving an empty segment: +To calculate the sum on the $[l, r)$ subsegment, we can maintain pointers to the first and the last element that needs to be added, increase/decrease them respectively when we add a node and stop after they converge to the same node (which would be their least common ancestor): ```c++ int sum(int l, int r) { @@ -325,44 +327,52 @@ int sum(int l, int r) { r += N - 1; int s = 0; while (l <= r) { - if ( l & 1) s += t[l++]; - if (~r & 1) s += t[r--]; + if ( l & 1) s += t[l++]; // l is a right child: add it and move to a cousin + if (~r & 1) s += t[r--]; // r is a left child: add it and move to a cousin l >>= 1, r >>= 1; } return s; } ``` -This results and a much simpler and faster code. However, when the array size is not a power of two, the `sum` query doesn't work correctly. To understand why, consider at the tree structure for 13 elements: +Surprisingly, both queries work correctly even when $n$ is not a power of two.
To understand why, consider a 13-element segment tree: ![](../img/segtree-permuted.png) -The first index of the last layer is always a power of two, but when $n$ is not a power of two, some prefix of the leaf elements gets wrapped around to the right side of the tree. +The first index of the last layer is always a power of two, but when the array size is not a perfect power of two, some prefix of the leaf elements gets wrapped around to the right side of the tree. Magically, this fact does not pose a problem for our implementation: + +- The `add` query still updates its parent nodes, even though some of them correspond to some prefix and some suffix of the array instead of a contiguous subsegment. +- The `sum` query still computes the sum on the correct subsegment, even when `l` is on that wrapped prefix and logically "to the right" of `r` because eventually `l` becomes the last node on a layer and gets incremented, suddenly jumping to the first element of the next layer and proceeding normally after adding just the right nodes on the wrapped-around part of the tree (look at the dimmed nodes in the illustration). -Magically, it this works even for non-power-of-two array sizes and for queries where the left boundary is to the right of the right one because the left at some point will "wrap around", and when this happens, the `l <= r` condition will become false. +Compared to the top-down approach, we use half the memory and don't have to maintain query ranges, which results in simpler and consequently faster code: ![](../img/segtree-bottomup.svg) -Now, since we are only interested in the prefix sum, and we'd want to get rid of maintaining `l` and only move the right border like this: +When running the benchmarks, we use the `sum(l, r)` procedure for computing a general subsegment sum and just fix `l` equal to `0`. 
To achieve higher performance on the prefix sum query, we want to avoid maintaining `l` and only move the right border like this: ```c++ int sum(int k) { - int res = 0; + int s = 0; k += N - 1; while (k != 0) { if (~k & 1) - res += t[k--]; + s += t[k--]; k = k >> 1; } - return res; + return s; } ``` -It works when $n$ is a power of two, but fails for all other array sizes. To make it work for arbitrary array sizes, we can do the following trick: just permute the leaves so that they go in the right order, even though they span two layers. This can be done like this: +In contrast, this prefix sum implementation doesn't work unless $n$ is a power of two — because `k` could be on that wrapped-around part, and we'd sum almost the entire array instead of a small prefix. + +To make it work for arbitrary array sizes, we can permute the leaves so that they are in the left-to-right logical order in the last two layers of the tree. In the example above, this would mean adding $3$ to all leaf indexes and then moving the last three leaves one level higher by subtracting $13$.
+ +In the general case, this can be done using predication in a few cycles like this: ```c++ const int last_layer = 1 << __lg(2 * N - 1); +// calculate the index of the leaf k int leaf(int k) { k += last_layer; k -= (k >= 2 * N) * N; @@ -370,7 +380,7 @@ int leaf(int k) { } ``` -Now, when implementing the queries, all we need to do is to call the `leaf` function: +When implementing the queries, all we need to do is to call the `leaf` function to get the correct leaf index: ```c++ void add(int k, int x) { @@ -393,7 +403,7 @@ int sum(int k) { } ``` -The last touch: by replacing the `s += t[k--]` line with predication, we can now make the implementation branchless (except for the last branch, where we check the loop condition): +The last touch: by replacing the `s += t[k--]` line with predication, we can make the implementation branchless (except for the last branch — we still need to check the loop condition): ```c++ int sum(int k) { @@ -407,11 +417,11 @@ int sum(int k) { } ``` -Combined, these optimizations make the prefix sum queries run much faster: +When combined, these optimizations make the prefix sum queries run much faster: ![](../img/segtree-branchless.svg) -Notice that the bump in latency for the prefix sum query starts at $2^{19}$ and not at $2^{20}$, where we run out of the L3 cache. This is because we are still storing $2n$ integers, and also fetching the `t[k]` element regardless of whether we will add it to `s` or not. We can actually solve both of these problems. +Notice that the bump in the latency for the prefix sum query starts at $2^{19}$ and not at $2^{20}$, the L3 cache boundary. This is because we are still storing $2n$ integers and also fetching the `t[k]` element regardless of whether we will add it to `s` or not. We can actually solve both of these problems. 
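To make the leaf permutation and the branchless prefix sum more concrete, here is a minimal self-contained sketch for arbitrary array sizes. It is an illustration under stated assumptions rather than the article's exact code: the `PermutedSegtree` struct and its member names are hypothetical, `last_layer` is computed with a plain loop instead of `__lg`, and the whole-array query `sum(n)` is handled as a special case because the ascent only ever adds left children and so never adds the root.

```cpp
#include <vector>

// Hypothetical bottom-up segment tree with permuted leaves (assumes n >= 1).
// Leaves occupy the indices [n, 2n), and the parent of node k is node k / 2.
struct PermutedSegtree {
    int n, last_layer;
    std::vector<int> t;

    explicit PermutedSegtree(int size) : n(size), last_layer(1), t(2 * size, 0) {
        // the largest power of two not exceeding 2n - 1:
        // the index of the first node on the deepest layer
        while (2 * last_layer <= 2 * n - 1)
            last_layer *= 2;
    }

    // index of the k-th leaf in the left-to-right logical order
    int leaf(int k) const {
        k += last_layer;
        k -= (k >= 2 * n) * n; // predication: wrap around to the layer above
        return k;
    }

    // a[k] += x
    void add(int k, int x) {
        for (int v = leaf(k); v != 0; v >>= 1)
            t[v] += x;
    }

    // a[0] + ... + a[k - 1]
    int sum(int k) const {
        if (k <= 0) return 0;
        if (k >= n) return t[1];   // the root already holds the total
        int s = 0;
        int v = leaf(k - 1);
        while (v != 0) {
            s += (~v & 1) * t[v];  // only left children (even indices) contribute
            v = (v - 1) >> 1;      // parent of v, or the node before it if v was added
        }
        return s;
    }
};
```

For 13 elements, `last_layer` is 16, so `leaf(0)` through `leaf(9)` map to nodes 16 to 25 and `leaf(10)` through `leaf(12)` map to nodes 13 to 15, which matches the "add 3, then subtract 13" description above.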
### Fenwick trees From fabf13d9d995097420ce6e39624491c36a44deac Mon Sep 17 00:00:00 2001 From: Declan Kelly Date: Wed, 23 Feb 2022 16:11:44 -0800 Subject: [PATCH 262/531] Fix typo in mlp.md and list-ranking.md --- content/english/hpc/cpu-cache/mlp.md | 2 +- content/english/hpc/external-memory/list-ranking.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/cpu-cache/mlp.md b/content/english/hpc/cpu-cache/mlp.md index c0622e37..11c5b660 100644 --- a/content/english/hpc/cpu-cache/mlp.md +++ b/content/english/hpc/cpu-cache/mlp.md @@ -3,7 +3,7 @@ title: Memory-Level Parallelism weight: 5 --- -Memory requests can overlap in time: while you wait for a read request to complete, you can sand a few others, which will be executed concurrently with it. This is the reason why [linear iteration](../bandwidth) is so much faster than [pointer jumping](../latency): the CPU knows which memory locations it needs to fetch next and sends memory requests far ahead of time. +Memory requests can overlap in time: while you wait for a read request to complete, you can send a few others, which will be executed concurrently with it. This is the reason why [linear iteration](../bandwidth) is so much faster than [pointer jumping](../latency): the CPU knows which memory locations it needs to fetch next and sends memory requests far ahead of time. The number of concurrent memory operations is large but limited, and it is different for different types of memory. When designing algorithms and especially data structures, you may want to know this number, as it limits the amount of parallelism your computation can achieve. 
diff --git a/content/english/hpc/external-memory/list-ranking.md b/content/english/hpc/external-memory/list-ranking.md index 6a043588..07b33c71 100644 --- a/content/english/hpc/external-memory/list-ranking.md +++ b/content/english/hpc/external-memory/list-ranking.md @@ -48,7 +48,7 @@ I/O complexity of this algorithm with therefore be the same as joining, namely $ List ranking is especially useful in graph algorithms. -For example, we can obtain the Euler tour of a tree in external memory by constructing a linked list from the tree that corresponds to its Wuler tour and then applying the list ranking algorithm — the ranks of each node will be the same as its index $tin_v$ in the Euler tour. To construct this list, we need to: +For example, we can obtain the Euler tour of a tree in external memory by constructing a linked list from the tree that corresponds to its Euler tour and then applying the list ranking algorithm — the ranks of each node will be the same as its index $tin_v$ in the Euler tour. To construct this list, we need to: - split each undirected tree edge into two directed ones; - duplicate the parent node for each up-edge (because list nodes can only have one incoming edge, but we visit some tree vertices multiple times); From e80cce7637963c9c08d7536970fd6da3f6f3f2d5 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 28 Feb 2022 16:54:21 +0300 Subject: [PATCH 263/531] fenwick tree as succinct segment tree --- .../hpc/data-structures/segment-trees.md | 53 ++++++++++++++++--- 1 file changed, 45 insertions(+), 8 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index 008028d7..73248999 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -425,14 +425,56 @@ Notice that the bump in the latency for the prefix sum query starts at $2^{19}$ ### Fenwick trees -Implicit structures are great. 
They allow us to avoid pointer chasing and visit all the nodes relevant for a query in parallel. What is even better is *succinct* structures. In addition to not storing pointers or any other metadata, they also use the theoretically minimal memory to store the structure — maybe only with $O(1)$ more fields. +Implicit structures are great: they avoid pointer chasing, allow visiting all the relevant nodes in parallel, and take less space as they don't store metadata in nodes. Even better than implicit structures are *succinct* structures: they only require the information-theoretical minimum space to store the structure, using only $O(1)$ additional memory. -To make a segment tree succinct, we need to look at the values stored in the nodes and search for redundancies — the values that can be inferred from other nodes — and remove them. For any node $p$, its sum $s_p$ equals to the sum $(s_l + s_r)$ stored in its children nodes. Therefore, for any such "triangle" of nodes, we only need to store any two of $s_p$, $s_l$, or $s_r$, and we can restore the other one from the $s_p = s_l + s_r$ identity. +To make a segment tree succinct, we need to look at the values stored in the nodes and search for redundancies — the values that can be inferred from others — and remove them. One way to do this is to notice that in every implementation of prefix sum, we've never used the sums stored in right children — therefore, for computing prefix sums, such nodes are redundant: -Note that in every implementation so far, we never added the sum stored in the right child when computing the prefix sum. *Fenwick tree* is a type of a segment tree that uses this consideration and gets rid of all *right* children, including the last layer. This makes the total required number of memory cells $n + O(1)$, the same as the underlying array. 
+ ![](../img/segtree-succinct.png) +*The Fenwick tree* (also called *binary indexed tree* — soon you'll understand why) is a type of a segment tree that uses this consideration and gets rid of all *right* children, essentially removing every second node in each layer and making the total node count the same as the underlying array. + +```c++ +int t[N + 1]; // +1 because we use use one-based indexing +``` + +To store these segment sums compactly, Fenwick tree ditches the Eytzinger layout: instead, in place of every element $k$ that would be leaf in the last layer of a segment tree, it stores the sum of its first non-removed ancestor. For example: + +- the element $7$ would hold the sum on the $[0, 7]$ range ($282$), +- the element $9$ would hold the sum on the $[8, 9]$ range ($-86$), +- the element $10$ would hold the sum on the $[10, 10]$ range ($-52$, the element itself). + +How to compute this range for a given element $k$ (the left boundary, to be more specific: the right boundary is always the element $k$ itself) quicker than simulating the descend down the tree? Turns out, there is a smart bit trick that works when the tree size is a power of two and we use one-based indexing — just remove the least significant bit of the index: + +- the left bound of element $7 + 1 = 8 = 1000_2$ is $0000_2 = 0$, +- the left bound of element $9 + 1 = 10 = 1010_2$ is $1000_2 = 8$, +- the left bound of element $10 + 1 = 11 = 1011_2$ is $1010_2 = 10$. + +And to get the last set bit of an integer, we can use this procedure: + +```c++ +int lowbit(int x) { + return x & -x; +} +``` + +This trick works by the virtue of how signed numbers are stored in binary using [two's complement](/hpc/arithmetic/integer). When we compute `-x`, we implicitly subtract it from a large power of two: some prefix of the number flips, some suffix of zeros at the end remains, and the only one-bit that stays unchanged is the last set bit — which will be the only one surviving `x & -x`. 
For example: + +``` ++90 = 64 + 16 + 8 + 2 = (0)10110 +-90 = 00000 - 10110 = (1)01010 + → (+90) & (-90) = (0)00010 +``` + +More formally, a Fenwick tree is defined as the array $t_i = \sum_{k=f(i)}^i a_k$ where $f$ is some function for which $f(i) \leq i$. If $f$ is the "remove last bit" function (`x -= x & -x`), then both query and update would only require updating $O(\log n)$ different $t$'s + To calculate a prefix sum, we need to repeatedly jump to the first parent that is a left child: ![A path for the sum query](../img/fenwick-sum.png) @@ -441,11 +483,6 @@ To process an update query, we need to repeatedly add the delta to the first par ![A path for the update query](../img/fenwick-update.png) -More formally, a Fenwick tree is defined as the array $t_i = \sum_{k=f(i)}^i a_k$ where $f$ is some function for which $f(i) \leq i$. If $f$ is the "remove last bit" function (`x -= x & -x`), then both query and update would only require updating $O(\log n)$ different $t$'s - -```c++ -int t[N + 1]; -``` Now, instead of making it actually equivalent to a segment tree, we will make all sizes a power of two and maintain a *forest* of trees. In a sense, we maintain $O(\log n)$ different trees. 
From 5f10c5eeda40fe0a6fcbc177cc1948cc63d31129 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 28 Feb 2022 17:56:46 +0300 Subject: [PATCH 264/531] fenwick tree edits --- .../hpc/data-structures/segment-trees.md | 66 ++++++++++++------- 1 file changed, 41 insertions(+), 25 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index 73248999..2f55c4df 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -473,57 +473,73 @@ This trick works by the virtue of how signed numbers are stored in binary using → (+90) & (-90) = (0)00010 ``` -More formally, a Fenwick tree is defined as the array $t_i = \sum_{k=f(i)}^i a_k$ where $f$ is some function for which $f(i) \leq i$. If $f$ is the "remove last bit" function (`x -= x & -x`), then both query and update would only require updating $O(\log n)$ different $t$'s + -To calculate a prefix sum, we need to repeatedly jump to the first parent that is a left child: +We've established that a Fenwick tree is just an array of size `n` where each element `k` is defined to be the sum of elements from `k - lowbit(k) + 1` to `k` inclusive in the original array, and now it's time to implement some queries.
The `t[k]` holds the sum we need except for the first `k - lowbit(k)` elements, so we can just add it to the result and then jump to `k - lowbit(k)` and continue doing this until we reach the beginning of the array: -To process an update query, we need to repeatedly add the delta to the first parent the contains the cell $k$: +```c++ +int sum(int k) { + int s = 0; + for (; k != 0; k -= lowbit(k)) + s += t[k]; + return s; +} +``` -![A path for the update query](../img/fenwick-update.png) + +Since we are repeatedly removing the lowest set bit from `k`, and also since this procedure is equivalent to visiting the same left-child nodes in a segment tree, each `sum` query can touch at most $O(\log n)$ nodes: -Now, instead of making it actually equivalent to a segment tree, we will make all sizes a power of two and maintain a *forest* of trees. In a sense, we maintain $O(\log n)$ different trees. +![A path for a prefix sum query in a Fenwick tree](../img/fenwick-sum.png) -Now, the tricky part is how to do it *fast*. If the array size is a perfect power of two, we have a trick. Notice that what left children have in common is that their indices are even. If a node is a deep interior node, it will end with a lot of zeros in its binary representation. We can there just remove the last sequence of ones, which can be done with `k &= k - 1`: +To slightly improve the performance of the `sum` query, we use `k &= k - 1` to remove the lowest bit in one go, which is one instruction faster than `k -= k & -k`: ```c++ int sum(int k) { - int res = 0; + int s = 0; for (; k != 0; k &= k - 1) - res += t[k]; - return res; + s += t[k]; + return s; } ``` -Now, when we defined $f$, on update, we need to identify the nodes that contain the element that is being updated. 
Since the $f$ function removes the last index, these have to be the nodes that have the same number of zeros at the end, some of the same prefix, and some number of ones at the middle that will be cancelled to produce a number that is lower than the original one. All such numbers can be yielded by adding the last set bit to the index, which trims zeros: +Unlike all previous segment tree implementations, a Fenwick tree is a structure where it is easier and more efficient to to calculate the sum on a subsegment as the difference of two prefix sums: ```c++ -void add(int k, int x) { - for (k += 1; k <= N; k += k & -k) - t[k] += x; +// [l, r) +int sum (int l, int r) { + return sum(r) - sum(l); } ``` -Sometimes people use `k -= k & -k` to iterate when processing the `sum` query, which makes this implementation delightfully symmetric. +The update query is easier to code but less intuitive. We need to add a value `x` to all nodes that are left-child ancestors of leaf `k`. Such nodes have indices `m` larger than `k` but `m - lowbit(m) < k` so that `k` is included in their ranges. -This is a structure where it is easier to calculate sum on subsegments as the difference of two prefix sums: +All such indices need to have a common prefix with `k`, then a `1` where it was `0` in `k`, and then a suffix of zeros so that that `1` canceled and the result of `m - lowbit(m)` is less than `k`. All such indices can be generated iteratively like this: ```c++ -// [l, r) -int sum (int l, int r) { - return sum(r) - sum(l - 1); +void add(int k, int x) { + for (k += 1; k <= N; k += k & -k) + t[k] += x; } ``` -The performance of the Fenwick tree is similar to the optimized bottom-up segment tree: +Repeatedly adding the lowest set bit to `k` makes it "more even" and lifts it to its next left-child segment tree ancestor: + +![A path for an update query in a Fenwick tree](../img/fenwick-update.png) + +Now, if we leave all the code as it is, it works correctly even when $n$ is not a power of two. 
In this case, the Fenwick tree is not equivalent to a segment tree of size $n$ but to a *forest* of up to $O(\log n)$ segment trees of power-of-two sizes, or to a single segment tree padded with zeros to a large power of two, if you like to think this way. In either case, all procedures remain working correctly as they never touch anything outside the $[1, n]$ range. + + + +The performance of the Fenwick tree is similar to the optimized bottom-up segment tree for the update queries and slightly faster for the prefix sum queries: ![](../img/segtree-fenwick.svg) -There is, however, one weird thing. The performance goes up rapidly close to the L3 boundary. This is a [cache associativity](/hpc/cpu-cache/associativity) effect: the most frequently used cells all have their index divisible by large powers of two and get aliased to the same cache set, kicking each other out. +There is one weird thing on the graph. After we cross the L3 cache boundary, the performance takes off very rapidly. This is a [cache associativity](/hpc/cpu-cache/associativity) effect: the most frequently used cells all have their indices divisible by large powers of two, so they get aliased to the same cache set, kicking each other out and effectively reducing the cache size. 
-One way to negate this is to insert "holes" in the layout like this: +One way to negate this effect is to insert "holes" in the layout like this: ```c++ inline constexpr int hole(int k) { @@ -545,13 +561,13 @@ int sum(int k) { } ``` -As computing the `hole` function is not on the critical path between iteration, it does not introduce any significant overhead, but completely removes the cache associativity problem and shrinks the latency by ~3x on large arrays: +Computing the `hole` function is not on the critical path between iterations, so it does not introduce any significant overhead, but completely removes the cache associativity problem and shrinks the latency by up to 3x on large arrays: ![](../img/segtree-fenwick-holes.svg) -There are still other minor issues with Fenwick trees. Similar to [binary search](../binary-search), the temporal locality of its memory accesses is not great, as rarely accessed elements are grouped with the most frequently accessed ones. It also executes has to perform end-of-loop checks and executes non-constant number of iterations, likely causing a branch mispredict, although just a single one. +Fenwick trees are fast, but there are still other minor issues with them. Similar to [binary search](../binary-search), the temporal locality of their memory accesses is not the greatest, as rarely accessed elements are grouped with the most frequently accessed ones. Fenwick trees also execute non-constant number of iterations and have to perform end-of-loop checks, causing a very likely branch mispredict — although just a single one. -But we are going to leave it there and focus on an entirely different approach. If you know [S-trees](../s-tree), you've probably guessed where this is going. +There are probably still some things to optimize, but we are going to leave it there and focus on an entirely different approach, and if you know [S-trees](../s-tree), you probably already know where this is headed. 
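The Fenwick tree operations discussed in this hunk can be put together in one self-contained sketch. The `Fenwick` struct and its `std::vector` storage are assumptions made for illustration: the article itself works with a global array `t` and a compile-time `N`, and the `hole` remapping is deliberately left out here.

```c++
#include <cassert>
#include <vector>

// Hypothetical wrapper type (not from the article, which uses a global array `t`).
struct Fenwick {
    int n;              // number of elements in the underlying array
    std::vector<int> t; // one-based: t[k] holds the sum of the lowbit(k) elements ending at one-based position k

    explicit Fenwick(int n) : n(n), t(n + 1, 0) {}

    // the lowest set bit of k
    static int lowbit(int k) { return k & -k; }

    // sum of the first k elements, that is, of the zero-based range [0, k)
    int sum(int k) const {
        int s = 0;
        for (; k != 0; k &= k - 1) // k &= k - 1 clears the lowest set bit
            s += t[k];
        return s;
    }

    // sum on the zero-based half-open range [l, r)
    int sum(int l, int r) const { return sum(r) - sum(l); }

    // add x to the element at zero-based position k
    void add(int k, int x) {
        for (k += 1; k <= n; k += lowbit(k))
            t[k] += x;
    }
};
```

Note that `n` does not have to be a power of two here: `add` never goes past index `n`, and `sum` only ever moves toward zero.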
### Wide Segment Trees From 8b175f9fd9ddd6b56ca622b44304030ce83defce Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 28 Feb 2022 18:29:03 +0300 Subject: [PATCH 265/531] wide segment tree edits --- .../hpc/data-structures/segment-trees.md | 49 +++++++++++-------- 1 file changed, 28 insertions(+), 21 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index 2f55c4df..0a828d81 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -571,19 +571,23 @@ There are probably still some things to optimize, but we are going to leave it t ### Wide Segment Trees -Here is the idea: if we are fetching a full cache line anyway, let's fill it with information that lets us process the query quicker. So let's store more than one data point in a segment tree node — this lets us reduce the tree height and do less iterations descending it. +Here is the main idea: if the memory system is fetching a full [cache line](/hpc/cpu-cache/cache-lines) for us anyway, let's fill it to the maximum with information that lets us process the query quicker. For segment trees, this means storing more than one data point in a node. This lets us reduce the tree height and perform less iterations when descending or ascending it: ![](../img/segtree-wide.png) -We can use a similar constexpr-based approach we used in [S+ trees](../s-tree#implicit-b-tree-1) to implement it: +We will use the term *wide segment tree* to refer to this modification. + +To implement this layout, we can we can use a similar [constexpr](/hpc/compilation/precalc)-based approach we used in [S+ trees](../s-tree#implicit-b-tree-1): ```c++ -const int b = 4, B = (1 << b); +const int b = 4, B = (1 << b); // cache line size (in integers, not bytes) +// the height of the tree over an n-element array constexpr int height(int n) { return (n <= B ? 
1 : height(n / B) + 1); } +// where the h-th layer starts constexpr int offset(int h) { int s = 0, n = N; while (h--) { @@ -594,32 +598,32 @@ constexpr int offset(int h) { } constexpr int H = height(N); -alignas(64) int t[offset(H)]; +alignas(64) int t[offset(H)]; // an array for storing nodes ``` -We effectively reduce the height of the tree by $\frac{\log_B n}{\log_2 n} = \log_2 B$ times, but it may be tricky to realize in-node operations. - -In this context, we have to options: +This way we effectively reduce the height of the tree by approximately $\frac{\log_2 n}{\log_B n} = \log_2 B$ times ($\sim4$ times if $B = 16$), but it becomes non-trivial to implement in-node operations efficiently. For our problem, we have two main options: -1. We could store $B$ sums in each node. -2. We could store $B$ prefix sums in each node. +1. We could store $B$ *sums* in each node (for each of its $B$ children). +2. We could store $B$ *prefix sums* in each node (the $i$-th being the sum of the first $(i + 1)$ children). -If we go with option 1, the `add` query would be largely the same, but the `sum` query would need to sum up to $B$ scalar in each node. If we go with option 2, the `sum` query would be trivial, but the `add` query will need to add the element to some suffix of each node. +If we go with the first option, the `add` query would be largely the same as in the bottom-up segment tree, but the `sum` query would need to add up to $B$ scalars in each node it visits. And if we go with the second option, the `sum` query would be trivial, but the `add` query would need to add `x` to some suffix on each node it visits. -In either case, one operation will perform $O(\log_B n)$ operations and rouch one scalar, while the other will perform $O(B \cdot \log_B n)$ operations. However, we really want to use [SIMD](/hpc/simd) to accelerate the slower operation. 
Since there are no fast [horizontal reductions](/hpc/simd/reduction), but it is easy to add a vector to a vector, we will stick to the second approach and store prefix sums in each node. +In either case, one operation will perform $O(\log_B n)$ operations, touching just one scalar in each node, while the other will perform $O(B \cdot \log_B n)$ operations, touching up to $B$ scalars in each node. However, it is the 21st century, and we can use [SIMD](/hpc/simd) to accelerate the slower operation. Since there are no fast [horizontal reductions](/hpc/simd/reduction) in SIMD instruction sets, but it is easy to add a vector to a vector, we will choose the second approach and store prefix sums in each node. -This makes the `sum` query very easy: +This makes the `sum` query extremely fast and easy to implement: ```c++ int sum(int k) { - int res = 0; + int s = 0; for (int h = 0; h < H; h++) - res += t[offset(h) + (k >> (h * b))]; - return res; + s += t[offset(h) + (k >> (h * b))]; + return s; } ``` -For the `add` query, however, we need a trick. We only need to add a number to a prefix of a node. We need a mask that will tell us which element to add and which not. We can pre-calculate such a $B \times B$ mask just once, which tells us for each starting position whether the element is engaged in the operation or not: +The `add` query is more complicated and slower. We need to add a number to only a suffix of a node, and we can do this by [masking out](/hpc/simd/masking) the positions that need not be modified. 
+ +We can pre-calculate a $B \times B$ array corresponding to $B$ such masks that tell, for each of $B$ positions within a node, whether a certain prefix sum value needs to be updated or not: ```c++ struct Precalc { @@ -635,7 +639,7 @@ struct Precalc { constexpr Precalc T; ``` -We then use these masks to bitwise-and the broadcasted delta value and add it to the values stored at the node: +When processing the `add` query, we just use these masks to bitwise-and the broadcasted `x` value to mask it and then add it to the values stored in the node: ```c++ typedef int vec __attribute__ (( vector_size(32) )); @@ -659,17 +663,16 @@ void add(int k, int x) { This speeds up the `sum` query by more than 10x and the `add` query by up to 4x compared to the Fenwick tree: ![](../img/segtree-simd.svg) - -Wide Fenwick trees make little sense. The speed of Fenwick trees comes from rapidly iterating over just the elements we need. +Expectedly, when we increase the node size, the update time also increases, as we need to fetch more cache lines and process them, but the `sum` query time decreases, as the size of the tree becomes smaller. Unlike [S-trees](../s-tree), you can easily change block size: ![](../img/segtree-simd-others.svg) -Expectedly, when we increase the node size, the update time also increases, as we need to fetch more cache lines and process them, but the `sum` query time decreases, as the size of the tree becomes smaller. - There are similar considerations to [S+ trees](../s-tree/#modifications-and-further-optimizations) in that the ideal layout (the node sizes on each layer) may depend on the use case. 
+ + ### Comparison This is significantly faster compared to the popular segment tree implementations: @@ -682,6 +685,8 @@ It makes sense to look at the relative speedup: The wide segment tree is up to 200 and 40 times faster than the pointer-based segment tree for the prefix sum and update queries respectively, although for sufficiently large arrays, memory efficiency becomes the only concern, and this speedup goes down to 60 and 15 respectively. + + ### Acknowledgements This article is loosely based on "[Practical Trade-Offs for the Prefix-Sum Problem](https://arxiv.org/pdf/2006.14552.pdf)" by Giulio Ermanno Pibiri and Rossano Venturini. It has some more detailed discussions, as well as some other implementations or branchless top-down segment tree and why b-ary Fenwick tree is not a good idea. Intermediate structures we've skipped here. From 1887a3c78390c82020adfae49095750015776903 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 28 Feb 2022 19:44:56 +0300 Subject: [PATCH 266/531] segtree comparisons and acknowledgements --- .../hpc/data-structures/segment-trees.md | 23 ++++++++++--------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index 0a828d81..3224563e 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -575,7 +575,7 @@ Here is the main idea: if the memory system is fetching a full [cache line](/hpc ![](../img/segtree-wide.png) -We will use the term *wide segment tree* to refer to this modification. +We will use the term *wide (B-ary) segment tree* to refer to this modification. 
To implement this layout, we can we can use a similar [constexpr](/hpc/compilation/precalc)-based approach we used in [S+ trees](../s-tree#implicit-b-tree-1): @@ -639,7 +639,7 @@ struct Precalc { constexpr Precalc T; ``` -When processing the `add` query, we just use these masks to bitwise-and the broadcasted `x` value to mask it and then add it to the values stored in the node: +Apart from this masking trick, the rest of the computation is simple enough to be handled with [GCC vector types](/hpc/simd/intrinsics#gcc-vector-extensions) only. When processing the `add` query, we just use these masks to bitwise-and them with the broadcasted `x` value to mask it and then add it to the values stored in the node: ```c++ typedef int vec __attribute__ (( vector_size(32) )); @@ -663,27 +663,26 @@ void add(int k, int x) { This speeds up the `sum` query by more than 10x and the `add` query by up to 4x compared to the Fenwick tree: ![](../img/segtree-simd.svg) -Expectedly, when we increase the node size, the update time also increases, as we need to fetch more cache lines and process them, but the `sum` query time decreases, as the size of the tree becomes smaller. -Unlike [S-trees](../s-tree), you can easily change block size: +Unlike [S-trees](../s-tree), the block size can be easily changed in this implementation (by literally changing one character). Expectedly, when we increase it, the update time also increases as we need to fetch more cache lines and process them, but the `sum` query time decreases as the height of the tree becomes smaller: ![](../img/segtree-simd-others.svg) -There are similar considerations to [S+ trees](../s-tree/#modifications-and-further-optimizations) in that the ideal layout (the node sizes on each layer) may depend on the use case. 
+Similar to the [S+ trees](../s-tree/#modifications-and-further-optimizations), the optimal memory layout probably has non-uniform block sizes, depending on the problem size and the distribution of queries, but we are not going to explore this idea and just leave the optimization here. -### Comparison +### Comparisons -This is significantly faster compared to the popular segment tree implementations: +Wide segment trees are significantly faster compared to other popular segment tree implementations: ![](../img/segtree-popular.svg) -It makes sense to look at the relative speedup: +The relative speedup is in the orders of magnitude: ![](../img/segtree-popular-relative.svg) -The wide segment tree is up to 200 and 40 times faster than the pointer-based segment tree for the prefix sum and update queries respectively, although for sufficiently large arrays, memory efficiency becomes the only concern, and this speedup goes down to 60 and 15 respectively. +Compared to the original pointer-based implementation, the wide segment tree is up to 200 and 40 times faster for the prefix sum and update queries respectively, although for sufficiently large arrays, both implementations become only memory-bound, and this speedup goes down to around 60 and 15 respectively. + +Code and some important bottom-up segment tree ideas from were adapted from a 2015 blogpost "[Efficient and easy segment trees](https://codeforces.com/blog/entry/18051)" by Oleksandr Bacherikov. 
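For readers who want to experiment with the wide segment tree touched by the patches above, here is a minimal scalar sketch of the "prefix sums in each node" layout. Several things in it are assumptions made for illustration and do not come from the article: the `WideSegTree` name, the runtime `std::vector` storage with an `off` table in place of the `constexpr` `offset` function, the extra covered position that keeps `sum(n)` in bounds, and the plain inner loop in `add` where the article uses masked SIMD additions.

```c++
#include <cassert>
#include <vector>

constexpr int b = 4, B = 1 << b; // node size (in integers), as in the article

// Hypothetical runtime-sized variant of the wide segment tree.
struct WideSegTree {
    int H = 0;            // number of layers
    std::vector<int> off; // off[h]: index in t where layer h starts
    std::vector<int> t;   // each slot holds the sum of the slots to its left within the same node

    explicit WideSegTree(int n) {
        int cnt = n + 1, total = 0; // cover positions 0..n so that sum(n) is valid
        do {
            cnt = (cnt + B - 1) / B; // number of nodes on this layer
            off.push_back(total);
            total += cnt * B;
            H++;
        } while (cnt > 1);           // stop once a layer fits in one node
        t.assign(total, 0);
    }

    // sum of the first k elements: one memory read per layer
    int sum(int k) const {
        int s = 0;
        for (int h = 0; h < H; h++)
            s += t[off[h] + (k >> (h * b))];
        return s;
    }

    // add x to element k: update a suffix of one node on each layer
    void add(int k, int x) {
        for (int h = 0; h < H; h++) {
            int slot = k >> (h * b); // slot of k on layer h
            int node = slot >> b;    // node that contains this slot
            int pos = slot & (B - 1); // position of the slot inside the node
            for (int i = pos + 1; i < B; i++) // the SIMD version masks these lanes instead
                t[off[h] + node * B + i] += x;
        }
    }
};
```

Changing `b` changes the node size, which is the "one character" knob mentioned in the patch; the sketch keeps `sum` at one read per layer and leaves all the per-node work to `add`.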
From 4d5b6a1fd7d30741a150c02cea9c5aaa79c78b10 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 28 Feb 2022 19:58:57 +0300 Subject: [PATCH 267/531] segtree grammar edits --- .../hpc/data-structures/segment-trees.md | 28 +++++++++---------- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index 3224563e..865879da 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -183,7 +183,7 @@ Pointer chasing outweighs all other issues by orders of magnitude — and to neg ### Implicit Segment Trees -As a segment tree is a type of a binary tree, we can use the [Eytzinger layout](../binary-search#eytzinger-layout) to store its nodes in one large array and use index arithmetic instead of explicit pointers to navigate it. +As a segment tree is a type of binary tree, we can use the [Eytzinger layout](../binary-search#eytzinger-layout) to store its nodes in one large array and use index arithmetic instead of explicit pointers to navigate it. More formally, we define node $1$ to be the root, holding the sum of the entire array $[0, n)$. Then, for every node $v$ corresponding to the range $[l, r]$, we define: @@ -439,13 +439,13 @@ For any node $p$, its sum $s_p$ equals to the sum $(s_l + s_r)$ stored in its ch ![](../img/segtree-succinct.png) -*The Fenwick tree* (also called *binary indexed tree* — soon you'll understand why) is a type of a segment tree that uses this consideration and gets rid of all *right* children, essentially removing every second node in each layer and making the total node count the same as the underlying array. 
+*The Fenwick tree* (also called *binary indexed tree* — soon you'll understand why) is a type of segment tree that uses this consideration and gets rid of all *right* children, essentially removing every second node in each layer and making the total node count the same as the underlying array. ```c++ int t[N + 1]; // +1 because we use use one-based indexing ``` -To store these segment sums compactly, Fenwick tree ditches the Eytzinger layout: instead, in place of every element $k$ that would be leaf in the last layer of a segment tree, it stores the sum of its first non-removed ancestor. For example: +To store these segment sums compactly, the Fenwick tree ditches the Eytzinger layout: instead, in place of every element $k$ that would be a leaf in the last layer of a segment tree, it stores the sum of its first non-removed ancestor. For example: - the element $7$ would hold the sum on the $[0, 7]$ range ($282$), - the element $9$ would hold the sum on the $[8, 9]$ range ($-86$), @@ -453,9 +453,9 @@ To store these segment sums compactly, Fenwick tree ditches the Eytzinger layout How to compute this range for a given element $k$ (the left boundary, to be more specific: the right boundary is always the element $k$ itself) quicker than simulating the descend down the tree? Turns out, there is a smart bit trick that works when the tree size is a power of two and we use one-based indexing — just remove the least significant bit of the index: -- the left bound of element $7 + 1 = 8 = 1000_2$ is $0000_2 = 0$, -- the left bound of element $9 + 1 = 10 = 1010_2$ is $1000_2 = 8$, -- the left bound of element $10 + 1 = 11 = 1011_2$ is $1010_2 = 10$. +- the left bound for element $7 + 1 = 8 = 1000_2$ is $0000_2 = 0$, +- the left bound for element $9 + 1 = 10 = 1010_2$ is $1000_2 = 8$, +- the left bound for element $10 + 1 = 11 = 1011_2$ is $1010_2 = 10$. 
And to get the last set bit of an integer, we can use this procedure: @@ -505,7 +505,7 @@ int sum(int k) { } ``` -Unlike all previous segment tree implementations, a Fenwick tree is a structure where it is easier and more efficient to to calculate the sum on a subsegment as the difference of two prefix sums: +Unlike all previous segment tree implementations, a Fenwick tree is a structure where it is easier and more efficient to calculate the sum on a subsegment as the difference of two prefix sums: ```c++ // [l, r) @@ -561,23 +561,23 @@ int sum(int k) { } ``` -Computing the `hole` function is not on the critical path between iterations, so it does not introduce any significant overhead, but completely removes the cache associativity problem and shrinks the latency by up to 3x on large arrays: +Computing the `hole` function is not on the critical path between iterations, so it does not introduce any significant overhead but completely removes the cache associativity problem and shrinks the latency by up to 3x on large arrays: ![](../img/segtree-fenwick-holes.svg) -Fenwick trees are fast, but there are still other minor issues with them. Similar to [binary search](../binary-search), the temporal locality of their memory accesses is not the greatest, as rarely accessed elements are grouped with the most frequently accessed ones. Fenwick trees also execute non-constant number of iterations and have to perform end-of-loop checks, causing a very likely branch mispredict — although just a single one. +Fenwick trees are fast, but there are still other minor issues with them. Similar to [binary search](../binary-search), the temporal locality of their memory accesses is not the greatest, as rarely accessed elements are grouped with the most frequently accessed ones. Fenwick trees also execute a non-constant number of iterations and have to perform end-of-loop checks, very likely causing a branch misprediction — although just a single one. 
There are probably still some things to optimize, but we are going to leave it there and focus on an entirely different approach, and if you know [S-trees](../s-tree), you probably already know where this is headed. ### Wide Segment Trees -Here is the main idea: if the memory system is fetching a full [cache line](/hpc/cpu-cache/cache-lines) for us anyway, let's fill it to the maximum with information that lets us process the query quicker. For segment trees, this means storing more than one data point in a node. This lets us reduce the tree height and perform less iterations when descending or ascending it: +Here is the main idea: if the memory system is fetching a full [cache line](/hpc/cpu-cache/cache-lines) for us anyway, let's fill it to the maximum with information that lets us process the query quicker. For segment trees, this means storing more than one data point in a node. This lets us reduce the tree height and perform fewer iterations when descending or ascending it: ![](../img/segtree-wide.png) We will use the term *wide (B-ary) segment tree* to refer to this modification. -To implement this layout, we can we can use a similar [constexpr](/hpc/compilation/precalc)-based approach we used in [S+ trees](../s-tree#implicit-b-tree-1): +To implement this layout, we can use a similar [constexpr](/hpc/compilation/precalc)-based approach we used in [S+ trees](../s-tree#implicit-b-tree-1): ```c++ const int b = 4, B = (1 << b); // cache line size (in integers, not bytes) @@ -682,7 +682,7 @@ The relative speedup is in the orders of magnitude: ![](../img/segtree-popular-relative.svg) -Compared to the original pointer-based implementation, the wide segment tree is up to 200 and 40 times faster for the prefix sum and update queries respectively, although for sufficiently large arrays, both implementations become only memory-bound, and this speedup goes down to around 60 and 15 respectively. 
+Compared to the original pointer-based implementation, the wide segment tree is up to 200 and 40 times faster for the prefix sum and update queries, respectively — although, for sufficiently large arrays, both implementations become purely memory-bound, and this speedup goes down to around 60 and 15 respectively. -Code and some important bottom-up segment tree ideas from were adapted from a 2015 blogpost "[Efficient and easy segment trees](https://codeforces.com/blog/entry/18051)" by Oleksandr Bacherikov. +The code and some ideas regarding bottom-up segment trees were adapted from a 2015 blog post "[Efficient and easy segment trees](https://codeforces.com/blog/entry/18051)" by Oleksandr Bacherikov. From 32ecd0f325075f08630d8fa6694e76b86e1cbcc4 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 28 Feb 2022 20:10:17 +0300 Subject: [PATCH 268/531] segment tree comments --- .../english/hpc/data-structures/segment-trees.md | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index 865879da..2cfe2072 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -4,12 +4,14 @@ weight: 3 draft: true --- -The lessons we learned from [optimizing](../s-tree) [binary search](../binary-search) can be applied to a broad range of data structures. +The lessons learned from [optimizing](../s-tree) [binary search](../binary-search) can be applied to a broad range of data structures. 
-In this article, instead of trying to optimize something from the STL again, we will focus on *segment trees*, the structures that may be unfamiliar to most *normal* programmers and perhaps even most computer science researchers[^tcs], but are used [very extensively](https://www.google.com/search?q=segment+tree+site%3Acodeforces.com&newwindow=1&sxsrf=APq-WBuTupSOnSn9JNEHhaqtmv0Uq0eogQ%3A1645969931499&ei=C4IbYrb2HYibrgS9t6qgDQ&ved=0ahUKEwj2p8_og6D2AhWIjYsKHb2bCtQQ4dUDCA4&uact=5&oq=segment+tree+site%3Acodeforces.com&gs_lcp=Cgdnd3Mtd2l6EAM6BwgAEEcQsAM6BwgAELADEEM6BAgjECc6BAgAEEM6BQgAEIAEOgYIABAWEB46BQghEKABSgQIQRgASgQIRhgAUMkFWLUjYOgkaANwAXgAgAHzAYgB9A-SAQYxNS41LjGYAQCgAQHIAQrAAQE&sclient=gws-wiz) in programming competitions for their speed and simplicity of implementation. +In this article, instead of trying to optimize something from the STL again, we focus on *segment trees*, the structures that may be unfamiliar to most *normal* programmers and perhaps even most computer science researchers[^tcs], but that are used [very extensively](https://www.google.com/search?q=segment+tree+site%3Acodeforces.com&newwindow=1&sxsrf=APq-WBuTupSOnSn9JNEHhaqtmv0Uq0eogQ%3A1645969931499&ei=C4IbYrb2HYibrgS9t6qgDQ&ved=0ahUKEwj2p8_og6D2AhWIjYsKHb2bCtQQ4dUDCA4&uact=5&oq=segment+tree+site%3Acodeforces.com&gs_lcp=Cgdnd3Mtd2l6EAM6BwgAEEcQsAM6BwgAELADEEM6BAgjECc6BAgAEEM6BQgAEIAEOgYIABAWEB46BQghEKABSgQIQRgASgQIRhgAUMkFWLUjYOgkaANwAXgAgAHzAYgB9A-SAQYxNS41LjGYAQCgAQHIAQrAAQE&sclient=gws-wiz) in programming competitions for their speed and simplicity of implementation. [^tcs]: Segment trees are rarely mentioned in the theoretical computer science literature because they are relatively novel (invented ~2000), mostly don't do anything that [any other binary tree](https://en.wikipedia.org/wiki/Tree_(data_structure)) can't do, and *asymptotically* aren't faster — although, in practice, they often win by a lot in terms of speed. 
+(If you already know the context, jump straight to the [last section](#wide-segment-trees) for the novelty: the *wide segment tree* that works 4 to 12 times faster than the Fenwick tree.) + ### Dynamic Prefix Sum + ### Part I: Performance Engineering The first part covers the basics of computer architecture and optimization of single-threaded algorithms. @@ -140,18 +172,19 @@ Among cool things that we will speed up: - 2x faster GCD (compared to `std::gcd`) - 8-15x faster binary search (compared to `std::lower_bound`) -- 7x faster segment trees +- 5-10x faster segment trees (compared to Fenwick trees) - 5x faster hash tables (compared to `std::unordered_map`) -- ~~?x faster popcount~~ +- 2x faster popcount (compared to repeatedly calling `popcnt`) - 2x faster parsing series of integers (compared to `scanf`) - ?x faster sorting (compared to `std::sort`) - 2x faster sum (compared to `std::accumulate`) +- 2-3x faster prefix sum (compared to naive implementation) +- 10x faster argmin (compared to naive implementation) - 10x faster array searching (compared to `std::find`) - 100x faster matrix multiplication (compared to "for-for-for") - optimal word-size integer factorization (~0.4ms per 60-bit integer) - optimal Karatsuba Algorithm - optimal FFT -- argmin at the speed of memory This work is largely based on blog posts, research papers, conference talks and other work authored by a lot of people: @@ -181,27 +214,27 @@ This work is largely based on blog posts, research papers, conference talks and - [ridiculous_fish](https://ridiculousfish.com/blog/) - [Creel](https://www.youtube.com/c/WhatsACreel) -Volume: 300-400 pages -Release date: early 2022 +Volume: 450-600 pages +Release date: Q2 2022 ### Part II: Parallel Algorithms -Concurrency, models of parallelism, green threads and runtimes, cache coherence, synchronization primitives, OpenMP, reductions, scans, list ranking and graph algorithms, lock-free data structures, heterogeneous computing, CUDA, kernels, warps, 
blocks, matrix multiplication and sorting. +Concurrency, models of parallelism, green threads and concurrent runtimes, cache coherence, synchronization primitives, OpenMP, reductions, scans, list ranking and graph algorithms, lock-free data structures, heterogeneous computing, CUDA, kernels, warps, blocks, matrix multiplication and sorting. Volume: 150-200 pages -Release date: late 2022 / 2023? +Release date: 2023? ### Part III: Distributed Computing Communication-constrained algorithms, message passing, actor model, partitioning, MapReduce, consistency and reliability at scale, storage, compression, scheduling and cloud computing, distributed deep learning. -Release date: ??? +Release date: ??? (more likely to be completed than not) ### Part IV: Compilers and Domain-Specific Architectures -LLVM IR, main optimization techniques from the dragon book, JIT-compilation, Cython, JAX, Numba, Julia, OpenCL, DPC++ and oneAPI, XLA, FPGAs and Verilog, ASICs, TPUs and other AI accelerators. +LLVM IR, compiler optimizations, JIT-compilation, Cython, JAX, Numba, Julia, OpenCL, DPC++ and oneAPI, XLA, Verilog, FPGAs, ASICs, TPUs and other AI accelerators. -Release date: ??? +Release date: ??? 
(less likely to be completed than not) ### Disclaimer: Technology Choices From 2cd24492f4684f5f264cfaf141f946ff10e9a071 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 2 Mar 2022 22:25:37 +0300 Subject: [PATCH 276/531] bugfix (tnx @BorysMinaiev) --- content/english/hpc/data-structures/segment-trees.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index ab1b8253..08aa0fa2 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -151,7 +151,7 @@ int sum(int lq, int rq) { return s; if (rq <= lb || lq >= rb) // if we don't intersect with the query, return zero return 0; - return l->sum(k) + r->sum(k); + return l->sum(lq, rq) + r->sum(lq, rq); } ``` From 823cb159578f27d1cb5f7765648296dd19ece2f6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 3 Mar 2022 02:34:13 +0300 Subject: [PATCH 277/531] faq draft --- content/english/hpc/_index.md | 38 +++++++++++++++++++++++++---------- 1 file changed, 27 insertions(+), 11 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index 2a3d5d63..e3687d1b 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -19,27 +19,43 @@ All materials are hosted on GitHub, with code in a [separate repository](https:/ ### FAQ -**Release date.** +**Fixing errors.** If you spot an error, please do one of these — in the order of preference: -There are also future parts (see below). +- Fix it right away by either by either clicking on the pencil icon on the top right on any page or directly modifying it on GitHub (link to a markdown source is also on the top right). +- Creating an issue on GitHub. +- Emailing or texting it to me directly. -As of March 2nd 2022, the first part is 70-80% complete. 
+or commenting on other websites — I read most of [HackerNews](https://news.ycombinator.com/from?site=algorithmica.org), [CodeForces](https://codeforces.com/profile/sslotin), and [Twitter](https://twitter.com/sergey_slotin) threads where I'm tagged. -**Fixing errors.** If you spot an error, please create an issue on GitHub or, preferably, fix it right away (the pencil icon on the top-right). +**Release date.** The book is split into several parts that I plan to finish sequentially with large breaks in-between. Part I, Performance Engineering, is ~75% complete as of March 2022, and will hopefully be >95% complete by summer. -**Pre-ordering / financially supporting the book.** +"Release" for an open-source book like this means mostly freezing the table of contents, filling all TODOs, doing a final round of heavy copyediting[^copyedit], drawing illustrations, and then making a print-optimized pdf and figuring out the best way to distribute it. In any case, the web-book will always be available online in full, and the e-book/printed version will probably be sold on a "pay what you want" basis. -Until I find. +After that, I will mostly only be doing minor edits, fixing errors and reflecting changes in technology or new algorithm advancements. -The best way you can help is to share the articles on link aggregation. +[^copyedit]: Hopefully with the help of a professional editor — because I still haven't figured out how commas work in English. -In either case, the book will always be available online in full version. "pay what you want" hard copy. +**Pre-ordering / financially supporting the book.** You can't — until I find a way that simultaneously doesn't sponsor the war and won't put me in jail for either tax evasion or treason. -So, don't bother. 
Just share the articles you like on link aggregators and report/fix bugs and typos :) -Italian and Chinese (and I will personally translate at least some of it in my native Russian). +**Teaching performance engineering in colleges.** It is one of my dreams. -However, you are encouraged to make your translation. I'd appreciate it if and also sent me the link to the translation. +There are two impactful books on which all computer science courses build, but one is 50 years old and the other one is 30 years old. + +I want it to be a book that CS students read after they've read [TAOCP](https://en.wikipedia.org/wiki/The_Art_of_Computer_Programming) and [CLRS](https://en.wikipedia.org/wiki/Introduction_to_Algorithms). + +There are good endeavors, such as "[Programming Parallel Computers](https://ppc.cs.aalto.fi/)" from Aalto University, "[Performance Engineering of Software Systems](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2018/)" from MIT, and also non-academic ones like [Denis Bakhvalov's endeavors](https://github.com/dendibakh/perf-ninja), but these are more of an exception, and are also not deep enough to get people to the edge. + +I've created courses from scratch in the past. I've already received, and I'm looking forward to collaborating more. Which is one of the reasons I rush to finish it by summer — so that colleges can pick up on the idea. + +Competitive programming is, in my opinion, misguided. They are doing useless things, but they are good at doing wrong things, and the performance engineering community should learn from them. + +**Translations.** + +There are already volunteers that want to translate it into Italian and Chinese (and I will personally translate at least some of it into my native Russian). The website has a functionality. + +As the book is evolving, it is probably not the best idea to start translation right away. 
However, you are very much encouraged to make your translations and post them in your blogs — I'd also appreciate it if you sent me the link to the translation. **"Translating" the Russian version.** The articles at [ru.algorithmica.org/cs/](https://ru.algorithmica.org/cs/) are not about advanced performance engineering but mostly about classical computer science algorithms, targeted towards competitive programming audience. From 545ac212e3a5b3632fb4934742850c3227b881ba Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 3 Mar 2022 04:04:08 +0300 Subject: [PATCH 278/531] hpc faq --- content/english/hpc/_index.md | 38 +++++++++++++++++++---------------- 1 file changed, 21 insertions(+), 17 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index e3687d1b..6601a654 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -13,37 +13,41 @@ This is an upcoming high performance computing book titled "Algorithms for Moder Its intended audience is everyone from performance engineers and practical algorithm researchers to undergraduate computer science students who have just finished an advanced algorithms course and want to learn more practical ways to speed up a program than by going from $O(n \log n)$ to $O(n \log \log n)$. -All materials are hosted on GitHub, with code in a [separate repository](https://github.com/sslotin/scmm-code). This isn't a collaborative project, but any contributions and feedback are very much welcome. - - -They are undergrad-level, and most of the information there is not unique in other placed on the internet — e. g. the similar-spirited [cp-algorithms.com](https://cp-algorithms.com/). +**Translations.** The website has a separate functionality for creating and managing translations — and there are already volunteers willing to translate the book into Italian and Chinese (and I will personally translate at least some of it into my native Russian). 
However, as the book is still evolving, it is probably not the best idea to start translation at least before the first part is complete — to not potentially waste the effort. ---> +That said, you are very much encouraged to make translations of any article and publish them in your blogs — just send me the link so that we can merge it back when a centralized translation process starts. + +**"Translating" the Russian version.** The articles hosted at [ru.algorithmica.org/cs/](https://ru.algorithmica.org/cs/) are not about advanced performance engineering but mostly about classical computer science algorithms. They are oriented towards competitive programmers and mostly don't discuss how to speed them up beyond asymptotic complexity. Most of the information there is not unique and already exists in English on some other places of the internet: for example, the similar-spirited [cp-algorithms.com](https://cp-algorithms.com/). ### Part I: Performance Engineering From 7e0229e50adcc286769a0b82b58555bbbb693ab6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 3 Mar 2022 22:52:19 +0300 Subject: [PATCH 280/531] changing wording --- content/english/hpc/_index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index 2781c7a3..384c281d 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -66,7 +66,7 @@ Competitive programming is, in my opinion, misguided. They are doing useless thi That said, you are very much encouraged to make translations of any article and publish them in your blogs — just send me the link so that we can merge it back when a centralized translation process starts. -**"Translating" the Russian version.** The articles hosted at [ru.algorithmica.org/cs/](https://ru.algorithmica.org/cs/) are not about advanced performance engineering but mostly about classical computer science algorithms. 
They are oriented towards competitive programmers and mostly don't discuss how to speed them up beyond asymptotic complexity. Most of the information there is not unique and already exists in English on some other places of the internet: for example, the similar-spirited [cp-algorithms.com](https://cp-algorithms.com/). +**"Translating" the Russian version.** The articles hosted at [ru.algorithmica.org/cs/](https://ru.algorithmica.org/cs/) are not about advanced performance engineering but mostly about classical computer science algorithms — without discussing how to speed them up beyond asymptotic complexity. Most of the information there is not unique and already exists in English on some other places on the internet: for example, the similar-spirited [cp-algorithms.com](https://cp-algorithms.com/). ### Part I: Performance Engineering From 05bc260593490307cc619c2af4a0cf10849c1f2c Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 3 Mar 2022 22:56:10 +0300 Subject: [PATCH 281/531] change faq order --- content/english/hpc/_index.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index 384c281d..293e8167 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -35,7 +35,13 @@ The e-book/printed editions will most likely be sold on a "pay what you want" ba **Pre-ordering / financially supporting the book.** Due to my unfortunate citizenship and place of birth, you can't — that is, until I find a way that at the same time complies with international sanctions, doesn't sponsor [the war](https://en.wikipedia.org/wiki/2022_Russian_invasion_of_Ukraine), and won't put me in prison for tax evasion. -So, don't bother. If you want to support this book, just share the articles you like on link aggregators and social media and help fix typos — that'd be enough. +So, don't bother. 
If you want to support this book, just share the articles you like on link aggregators and social media and help fix typos — that would be enough. + +**Translations.** The website has a separate functionality for creating and managing translations — and there are already volunteers willing to translate the book into Italian and Chinese (and I will personally translate at least some of it into my native Russian). + +However, as the book is still evolving, it is probably not the best idea to start translation at least before the first part is complete — to not potentially waste the effort. That said, you are very much encouraged to make translations of any article and publish them in your blogs — just send me the link so that we can merge it back when a centralized translation process starts. + +**"Translating" the Russian version.** The articles hosted at [ru.algorithmica.org/cs/](https://ru.algorithmica.org/cs/) are not about advanced performance engineering but mostly about classical computer science algorithms — without discussing how to speed them up beyond asymptotic complexity. Most of the information there is not unique and already exists in English on some other places on the internet: for example, the similar-spirited [cp-algorithms.com](https://cp-algorithms.com/). **Teaching performance engineering in colleges.** One of my goals for writing this book is to change the way computer science — algorithm design, to be more precise — is taught in colleges. Let me elaborate on that. @@ -62,12 +68,6 @@ Competitive programming is, in my opinion, misguided. They are doing useless thi --> -**Translations.** The website has a separate functionality for creating and managing translations — and there are already volunteers willing to translate the book into Italian and Chinese (and I will personally translate at least some of it into my native Russian). 
However, as the book is still evolving, it is probably not the best idea to start translation at least before the first part is complete — to not potentially waste the effort. - -That said, you are very much encouraged to make translations of any article and publish them in your blogs — just send me the link so that we can merge it back when a centralized translation process starts. - -**"Translating" the Russian version.** The articles hosted at [ru.algorithmica.org/cs/](https://ru.algorithmica.org/cs/) are not about advanced performance engineering but mostly about classical computer science algorithms — without discussing how to speed them up beyond asymptotic complexity. Most of the information there is not unique and already exists in English on some other places on the internet: for example, the similar-spirited [cp-algorithms.com](https://cp-algorithms.com/). - ### Part I: Performance Engineering The first part covers the basics of computer architecture and optimization of single-threaded algorithms. From 35818df7c6a6510cdea12c7dd8fcf3cb5c863afc Mon Sep 17 00:00:00 2001 From: Alex Shroyer Date: Mon, 7 Mar 2022 13:17:45 -0500 Subject: [PATCH 282/531] Update _index.md s/advices/advice --- content/english/hpc/architecture/_index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/architecture/_index.md b/content/english/hpc/architecture/_index.md index b4361e36..5b61a175 100644 --- a/content/english/hpc/architecture/_index.md +++ b/content/english/hpc/architecture/_index.md @@ -6,6 +6,6 @@ weight: 2 When I began learning how to optimize programs myself, one big mistake I made was to rely primarily on the empirical approach. Not understanding how computers really worked, I would semi-randomly swap nested loops, rearrange arithmetic, combine branch conditions, inline functions by hand, and follow all sorts of other performance tips I've heard from other people, blindly hoping for improvement. 
-Unfortunately, this is how most programmers approach optimization. Most texts about performance do not teach you to reason about software performance qualitatively. Instead they give you general advices about certain implementation approaches — and general performance intuition is clearly not enough. +Unfortunately, this is how most programmers approach optimization. Most texts about performance do not teach you to reason about software performance qualitatively. Instead they give you general advice about certain implementation approaches — and general performance intuition is clearly not enough. It would have probably saved me dozens, if not hundreds of hours if I learned computer architecture *before* doing algorithmic programming. So, even if most people aren't *excited* about it, we are going to spend the first few chapters studying how CPUs work and start with learning assembly. From 086c25da5559ee971e27a79edb1abfc4b82135e8 Mon Sep 17 00:00:00 2001 From: Project Nayuki Date: Mon, 7 Mar 2022 19:25:22 +0000 Subject: [PATCH 283/531] Fixed typos. --- content/english/hpc/arithmetic/division.md | 2 +- content/english/hpc/arithmetic/integer.md | 2 +- content/english/hpc/external-memory/virtual.md | 6 +++--- content/english/hpc/pipelining/branchless.md | 2 +- 4 files changed, 6 insertions(+), 6 deletions(-) diff --git a/content/english/hpc/arithmetic/division.md b/content/english/hpc/arithmetic/division.md index dd31f657..41b35e30 100644 --- a/content/english/hpc/arithmetic/division.md +++ b/content/english/hpc/arithmetic/division.md @@ -121,7 +121,7 @@ $$ \lfloor \frac{x \cdot \lfloor 2^s/y \rfloor}{2^s} \rfloor $$ -then for any integer $\frac{x}{y}$ where $y$ is not even, the result will be stricly less than the truth. This only leaves the other case, $m = \lceil 2^s/y \rceil$. 
Now, let's try to derive the lower and upper bounds for the result of the computation: +then for any integer $\frac{x}{y}$ where $y$ is not even, the result will be strictly less than the truth. This only leaves the other case, $m = \lceil 2^s/y \rceil$. Now, let's try to derive the lower and upper bounds for the result of the computation: $$ \lfloor x / y \rfloor diff --git a/content/english/hpc/arithmetic/integer.md b/content/english/hpc/arithmetic/integer.md index e7a325ff..bd70314b 100644 --- a/content/english/hpc/arithmetic/integer.md +++ b/content/english/hpc/arithmetic/integer.md @@ -56,7 +56,7 @@ Here $\bar{x}$ represents bitwise negation, which can be also thought of as subt As an exercise, here are some facts about signed integers: - All positive numbers and zero remain the same as their binary notation. -- All negative numbers have the highest bit set to zero. +- All negative numbers have the highest bit set to one. - There are more negative numbers than positive numbers (exactly by one — because of zero). - For `int`, if you add $1$ to $(2^{31}-1)$, the result will be $-2^{31}$, represented as `10000000` (for exposition purposes, we will only write 8 bits instead of 32). - Knowing a binary notation of a positive number `x`, you can get the binary notation of `-x` as `~x + 1`. diff --git a/content/english/hpc/external-memory/virtual.md b/content/english/hpc/external-memory/virtual.md index aa5a84ed..1c299307 100644 --- a/content/english/hpc/external-memory/virtual.md +++ b/content/english/hpc/external-memory/virtual.md @@ -13,7 +13,7 @@ These problems are not that critical for some specialized computer systems such ### Memory Paging -Virtual memory gives each process the impression that it fully controls a contiguous region of memory, which in reality may be mapped to multiple smaller blocks of the physical memory — which includes both the main memory (RAM) and external memory (HDD, SDD). 
+Virtual memory gives each process the impression that it fully controls a contiguous region of memory, which in reality may be mapped to multiple smaller blocks of the physical memory — which includes both the main memory (RAM) and external memory (HDD, SSD). ![](../img/virtual-memory.jpg) @@ -47,6 +47,6 @@ std::sort(data, data + 1024); Here we map a 4K file, which can fit entirely on just a single memory page, but when we open larger files, its reads will be done lazily when we request a certain page, and its writes will be buffered and committed to the file system when the operating decides to (usually on the program termination or when the system runs out of RAM). -A technique that has the same operating principle, but the reverse intention is the *swap file*, which lets the operating system automatically use parts of an SDD or an HDD as an extension of the main memory when there is not enough real RAM. This lets the systems that run out of memory just terribly slow down instead of crashing. +A technique that has the same operating principle, but the reverse intention is the *swap file*, which lets the operating system automatically use parts of an SSD or an HDD as an extension of the main memory when there is not enough real RAM. This lets a system that runs out of memory just slow down terribly instead of crashing. -This seamless integration of the main and external memory essentially turns RAM into ab "L4 cache" for the external memory, which is a convenient way to think about it from the algorithm design perspective. +This seamless integration of the main and external memory essentially turns RAM into an "L4 cache" for the external memory, which is a convenient way to think about it from the algorithm design perspective. 
diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index 3eb7838f..1edf567e 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -183,7 +183,7 @@ int abs(int a) { A very common value for strings is the empty string — which is also its default value. You also need to handle them somehow, and the idiomatic thing to do is to assign `nullptr` as the pointer and `0` as the string size, and then check if the pointer is null or if the size is zero at the beginning of every procedure involving strings. -However, this requires a separate branch, which is costly unless most strings are empty. What we can do to get rid of it is to to allocate a "zero C-string", which is just a zero byte allocated somewhere, and then simply point all empty strings there. Now all string operations with empty strings have to read this useless zero byte, but this is still much cheaper than a branch misprediction. +However, this requires a separate branch, which is costly unless most strings are empty. What we can do to get rid of it is to allocate a "zero C-string", which is just a zero byte allocated somewhere, and then simply point all empty strings there. Now all string operations with empty strings have to read this useless zero byte, but this is still much cheaper than a branch misprediction. 
**Binary search.** The standard binary search [can be implemented](/hpc/data-structures/binary-search) without branches, and on small arrays (that fit into cache) it works ~4x faster than the branchy `std::lower_bound`: From 163adbd8d8d78958a31f39804507c5815a9c5d43 Mon Sep 17 00:00:00 2001 From: Quentin <14336407+Qrbaker@users.noreply.github.com> Date: Mon, 7 Mar 2022 15:38:11 -0500 Subject: [PATCH 284/531] Fix broken link (parent dir traversal) --- content/english/hpc/pipelining/branchless.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index 3eb7838f..07903276 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -88,7 +88,7 @@ $$ x = c \cdot a + (1 - c) \cdot b $$ -This way you can eliminate branching, but this comes at the cost of evaluating *both* branches and the `cmov` itself. Because evaluating the ">=" branch costs nothing, the performance is exactly equal to [the "always yes" case](branching/#branch-prediction) in the branchy version. +This way you can eliminate branching, but this comes at the cost of evaluating *both* branches and the `cmov` itself. Because evaluating the ">=" branch costs nothing, the performance is exactly equal to [the "always yes" case](../branching/#branch-prediction) in the branchy version. ### When It Is Beneficial From e6f1307fb90a18a3e8d8c340a321e9ed0822e71f Mon Sep 17 00:00:00 2001 From: Quentin <14336407+Qrbaker@users.noreply.github.com> Date: Mon, 7 Mar 2022 15:46:54 -0500 Subject: [PATCH 285/531] Missing article ("are") in sentence. 
--- content/english/hpc/pipelining/hazards.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/pipelining/hazards.md b/content/english/hpc/pipelining/hazards.md index 0be339e0..d51485fd 100644 --- a/content/english/hpc/pipelining/hazards.md +++ b/content/english/hpc/pipelining/hazards.md @@ -1,6 +1,7 @@ --- title: Pipeline Hazards weight: 1 +published: true --- [Pipelining](../) lets you hide the latencies of instructions by running them concurrently, but also creates some potential obstacles of its own — characteristically called *pipeline hazards*, that is, situations when the next instruction cannot execute on the following clock cycle. @@ -17,7 +18,7 @@ The only way to resolve a hazard is to have a *pipeline stall*: stop the progres Different hazards have different penalties: -- In structural hazards, you have to wait (usually one more cycle) until the execution unit is ready. They fundamental bottlenecks on performance and can't be avoided — you have to engineer around them. +- In structural hazards, you have to wait (usually one more cycle) until the execution unit is ready. They are fundamental bottlenecks on performance and can't be avoided — you have to engineer around them. - In data hazards, you have to wait of the required data to be computed (the latency of the *critical path*). Data hazards are solved by restructuring computations so that the critical path is shorter. - In control hazards, you generally have to flush the entire pipeline and start over, wasting whole 15-20 cycles. They are solved by either removing branches completely, or making them predictable so that the CPU can effectively *speculate* on what is going to be executed next. 
From 67c2cf4b8316c62c2d29887d29805fe24b90e128 Mon Sep 17 00:00:00 2001 From: Quentin <14336407+Qrbaker@users.noreply.github.com> Date: Mon, 7 Mar 2022 15:53:45 -0500 Subject: [PATCH 286/531] Broken link - missing "-" in auto-vectorization --- content/english/hpc/pipelining/branchless.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index 3eb7838f..1c6df537 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -1,6 +1,7 @@ --- title: Branchless Programming weight: 3 +published: true --- As we established in [the previous section](../branching), branches that can't be effectively predicted by the CPU are expensive as they may cause a long pipeline stall to fetch new instructions after a branch mispredict. In this section, we discuss the means of removing branches in the first place. @@ -217,7 +218,7 @@ That there are no substantial reasons why compilers can't do this on their own, **Data-parallel programming.** Branchless programming is very important for [SIMD](/hpc/simd) applications, including GPU programming, because they don't have branching in the first place. 
-In our array sum example, if you remove the `volatile` type qualifier from the accumulator, the compiler becomes able to [vectorize](/hpc/simd/autovectorization) the loop: +In our array sum example, if you remove the `volatile` type qualifier from the accumulator, the compiler becomes able to [vectorize](/hpc/simd/auto-vectorization) the loop: ```c++ /* volatile */ int s = 0; From 737789ba2aa9f733cb4b3473069749abe22fc42e Mon Sep 17 00:00:00 2001 From: Radoslaw Jurga Date: Mon, 7 Mar 2022 21:57:54 +0100 Subject: [PATCH 287/531] Replace "wait of" by "wait for" --- content/english/hpc/pipelining/hazards.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/pipelining/hazards.md b/content/english/hpc/pipelining/hazards.md index 0be339e0..2cbf42a9 100644 --- a/content/english/hpc/pipelining/hazards.md +++ b/content/english/hpc/pipelining/hazards.md @@ -1,6 +1,7 @@ --- title: Pipeline Hazards weight: 1 +published: true --- [Pipelining](../) lets you hide the latencies of instructions by running them concurrently, but also creates some potential obstacles of its own — characteristically called *pipeline hazards*, that is, situations when the next instruction cannot execute on the following clock cycle. @@ -18,7 +19,7 @@ The only way to resolve a hazard is to have a *pipeline stall*: stop the progres Different hazards have different penalties: - In structural hazards, you have to wait (usually one more cycle) until the execution unit is ready. They fundamental bottlenecks on performance and can't be avoided — you have to engineer around them. -- In data hazards, you have to wait of the required data to be computed (the latency of the *critical path*). Data hazards are solved by restructuring computations so that the critical path is shorter. +- In data hazards, you have to wait for the required data to be computed (the latency of the *critical path*). 
Data hazards are solved by restructuring computations so that the critical path is shorter. - In control hazards, you generally have to flush the entire pipeline and start over, wasting whole 15-20 cycles. They are solved by either removing branches completely, or making them predictable so that the CPU can effectively *speculate* on what is going to be executed next. As they have very different impact on performance, we are going to go in the reversed order and start with the more grave ones. From e5e6749f576e49fbdbb28220c74e8a88366eebd1 Mon Sep 17 00:00:00 2001 From: Radoslaw Jurga Date: Mon, 7 Mar 2022 22:17:24 +0100 Subject: [PATCH 288/531] Replace "sign byte" by "sign bit" --- content/english/hpc/pipelining/branchless.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index 3eb7838f..b820ee54 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -1,6 +1,7 @@ --- title: Branchless Programming weight: 3 +published: true --- As we established in [the previous section](../branching), branches that can't be effectively predicted by the CPU are expensive as they may cause a long pipeline stall to fetch new instructions after a branch mispredict. In this section, we discuss the means of removing branches in the first place. @@ -40,7 +41,7 @@ sar ebx, 31 ; t >>= 31 imul eax, ebx ; x *= t ``` -Another, more complicated way to implement this whole sequence is to convert this sign byte into a mask and then use bitwise `and` instead of multiplication: `((a[i] - 50) >> 1 - 1) & a`. This makes the whole sequence one cycle faster, considering that unlike other instructions, `imul` takes 3 cycles: +Another, more complicated way to implement this whole sequence is to convert this sign bit into a mask and then use bitwise `and` instead of multiplication: `((a[i] - 50) >> 1 - 1) & a`. 
This makes the whole sequence one cycle faster, considering that unlike other instructions, `imul` takes 3 cycles: ```nasm mov ebx, eax ; t = x From 2956a7a8621f5ea002adc7fd33bf4225e14cdf3e Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 8 Mar 2022 04:54:43 +0300 Subject: [PATCH 289/531] removing broken link --- content/english/hpc/pipelining/_index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/pipelining/_index.md b/content/english/hpc/pipelining/_index.md index 9dc491d0..3d7d49b5 100644 --- a/content/english/hpc/pipelining/_index.md +++ b/content/english/hpc/pipelining/_index.md @@ -44,7 +44,7 @@ Having this in mind, hardware manufacturers prefer to use *cycles per instructio CPI of a perfectly pipelined processor should tend to one, but it can actually be even lower if we make each stage of the pipeline "wider" by duplicating it, so that more than one instruction can be processed at a time. Because the cache and most of the ALU can be shared, this ends up being cheaper than adding a fully separate core. Such architectures, capable of executing more than one instruction per cycle, are called *superscalar*, and most modern CPUs are. -You can only take advantage of superscalar processing if the stream of instructions contains groups of logically independent operations that can be processed separately. The instructions don't always arrive in the most convenient order, so, when possible, modern CPUs can execute them *out-of-order* to improve overall utilization and minimize pipeline stalls. How this magic works is a topic for [a more advanced discussion](scheduling), but for now, you can assume that the CPU maintains a buffer of pending instructions up to some distance in the future, and executes them as soon as the values of its operands are computed and there is an execution unit available. 
+You can only take advantage of superscalar processing if the stream of instructions contains groups of logically independent operations that can be processed separately. The instructions don't always arrive in the most convenient order, so, when possible, modern CPUs can execute them *out-of-order* to improve overall utilization and minimize pipeline stalls. How this magic works is a topic for a more advanced discussion, but for now, you can assume that the CPU maintains a buffer of pending instructions up to some distance in the future, and executes them as soon as the values of its operands are computed and there is an execution unit available. ### An Education Analogy From 926b9e2c2111b6e381c794242ceff3b9cd53e3d2 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 8 Mar 2022 05:28:16 +0300 Subject: [PATCH 290/531] update hpc faq and toc --- content/english/hpc/_index.md | 31 ++++++++++++++++++------------- 1 file changed, 18 insertions(+), 13 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index 293e8167..425a7c95 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -20,26 +20,30 @@ All book materials are [hosted on GitHub](https://github.com/algorithmica-org/al **Bug/typo fixes.** If you spot an error on any page, please do one of these — in the order of preference: - fix it right away by either clicking on the pencil icon on the top right on any page (opens the [Prose](https://prose.io/) editor) or, more traditionally, by modifying the page directly on GitHub (the link to the source is also on the top right); -- create an issue on [GitHub](https://github.com/algorithmica-org/algorithmica); +- create [an issue on GitHub](https://github.com/algorithmica-org/algorithmica/issues); - [tell me](http://sereja.me/) about it directly; or leave a comment on some other website where it is being discussed — I read most of [HackerNews](https://news.ycombinator.com/from?site=algorithmica.org), 
[CodeForces](https://codeforces.com/profile/sslotin), and [Twitter](https://twitter.com/sergey_slotin) threads where I'm tagged. -**Release date.** The book is split into several parts that I plan to finish sequentially with long breaks in-between. Part I, Performance Engineering, is ~75% complete as of March 2022 and will hopefully be >95% complete by summer. +**Release date.** The book is split into several parts that I plan to finish sequentially with long breaks in-between. Part I, Performance Engineering, is ~75% complete as of March 2022 and will hopefully be >95% complete by this summer. -"Release" for an open-source book like this means mostly freezing the table of contents, filling all the TODOs, doing one final round of heavy copyediting[^copyedit], drawing illustrations, and then making a print-optimized pdf and figuring out the best way to distribute it. After that, I will mostly be fixing errors and only doing some minor edits reflecting the changes in technology or new algorithm advancements. +A "release" for an open-source book like this essentially means: -The e-book/printed editions will most likely be sold on a "pay what you want" basis, and in either case, the web version will always be available online in full. +- finishing all essential sections and filling all the TODOs, +- mostly freezing the table of contents (except for the case studies), +- doing one final round of heavy copyediting (hopefully, with the help of a professional editor — I still haven’t figured out how commas work in English), +- drawing illustrations (I stole a lot of those that are currently displayed), +- making a print-optimized pdf and figuring out the best way to distribute it. -[^copyedit]: Hopefully, with the help of a professional editor — I still haven’t figured out how commas work in English. +After that, I will mostly be fixing errors and only doing some minor edits reflecting the changes in technology or new algorithm advancements. 
The e-book/printed editions will most likely be sold on a "pay what you want" basis, and in any case, the web version will always be fully available online. **Pre-ordering / financially supporting the book.** Due to my unfortunate citizenship and place of birth, you can't — that is, until I find a way that at the same time complies with international sanctions, doesn't sponsor [the war](https://en.wikipedia.org/wiki/2022_Russian_invasion_of_Ukraine), and won't put me in prison for tax evasion. So, don't bother. If you want to support this book, just share the articles you like on link aggregators and social media and help fix typos — that would be enough. -**Translations.** The website has a separate functionality for creating and managing translations — and there are already volunteers willing to translate the book into Italian and Chinese (and I will personally translate at least some of it into my native Russian). +**Translations.** The website has a separate functionality for creating and managing translations — and I've already been contacted by some nice people willing to translate the book into Italian and Chinese (and I will personally translate at least some of it into my native Russian). -However, as the book is still evolving, it is probably not the best idea to start translation at least before the first part is complete — to not potentially waste the effort. That said, you are very much encouraged to make translations of any article and publish them in your blogs — just send me the link so that we can merge it back when a centralized translation process starts. +However, as the book is still evolving, it is probably not the best idea to start translating it at least until Part I is finished. That said, you are very much encouraged to make translations of any articles and publish them in your blogs — just send me the link so that we can merge it back when a centralized translation process starts. 
**"Translating" the Russian version.** The articles hosted at [ru.algorithmica.org/cs/](https://ru.algorithmica.org/cs/) are not about advanced performance engineering but mostly about classical computer science algorithms — without discussing how to speed them up beyond asymptotic complexity. Most of the information there is not unique and already exists in English on some other places on the internet: for example, the similar-spirited [cp-algorithms.com](https://cp-algorithms.com/). @@ -49,7 +53,7 @@ There are two highly impactful textbooks on which most computer science courses And yet, the computer science curricula in most colleges completely ignore this shift. Although there are some great courses that aim to correct that — such as "[Performance Engineering of Software Systems](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2018/)" from MIT, "[Programming Parallel Computers](https://ppc.cs.aalto.fi/)" from Aalto University, and some non-academic ones like Denis Bakhvalov's "[Performance Ninja](https://github.com/dendibakh/perf-ninja)" — most computer science graduates still treat the hardware like something from the 90s. -What I ideally want to achieve is that performance engineering becomes taught right after introduction to algorithms. Writing the first comprehensive textbook on the subject is a large part of this — which is why I rush to finish it by summer so that the colleges can pick it up in the next academic year. But creating a new course requires more than that: you need a balanced curriculum, course infrastructure, lecture slides, lab assignments… So for some period after finishing the book, I will be working on materials and tools for teaching performance engineering — and I'm looking forward to working with other people who want to make it into reality as well. 
+What I really want to achieve is that performance engineering becomes taught right after introduction to algorithms. Writing the first comprehensive textbook on the subject is a large part of it, and this is why I rush to finish it by the summer so that the colleges can pick it up in the next academic year. But creating a new course requires more than that: you need a balanced curriculum, course infrastructure, lecture slides, lab assignments… so for some time after finishing the main book, I will be working on course materials and tools for *teaching* performance engineering — and I'm looking forward to collaborating with other people who want to make it a reality as well. From 61c7402953cf20c878d58122e2014cc3e60ea104 Mon Sep 17 00:00:00 2001 From: Dan Gora <41758210+danielgora@users.noreply.github.com> Date: Thu, 10 Mar 2022 01:59:00 -0300 Subject: [PATCH 300/531] Probable explanation for why uncached writes are faster than even cached reads. --- content/english/hpc/cpu-cache/bandwidth.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/content/english/hpc/cpu-cache/bandwidth.md b/content/english/hpc/cpu-cache/bandwidth.md index 8a39f862..e2a83105 100644 --- a/content/english/hpc/cpu-cache/bandwidth.md +++ b/content/english/hpc/cpu-cache/bandwidth.md @@ -1,6 +1,7 @@ --- title: Memory Bandwidth weight: 1 +published: true --- On the data path between the CPU registers and the RAM, there is a hierarchy of *caches* that exist to speed up access to frequently used data: the layers closer to the processor are faster but also smaller in size. 
The word "faster" here applies to two closely related but separate timings: @@ -101,6 +102,8 @@ In fact, the performance increase in the case of the RAM is even more than 2x an - the instruction sequence becomes simpler, allowing for more pending memory instructions; - and, perhaps most importantly, the cache system can simply "fire and forget" non-temporal write requests, while for reads it needs to remember what to do with the data once it arrives — similar to connection handles in networking software. +-- A probable reason that pure non-cached writes are around 50% faster than reads is that a write cycle to DRAM is around 50% shorter than a read cycle. For reads, the address has to be broadcast to the RAM chips, and the memory controller then has to wait a number of cycles until the bus "turns around" so that the retrieved data can be sent back to it. For writes, the memory controller just broadcasts the address and then the data, with few (if any) "dead cycles" in between. + Also, for these reasons, a single CPU core usually [can't fully saturate the memory bandwidth](../sharing). The same technique generalizes to `memcpy`: it also just moves 32-byte blocks with SIMD load/store instructions, and it can be similarly made non-temporal, increasing the throughput twofold for large arrays. There is also a non-temporal load instruction (`_mm256_stream_load_si256`) for when you want to *read* without polluting cache (e. g. when you don't need the original array after a `memcpy`, but will need some data that you had accessed before calling it). 
From 001de5e174b04cb9c9a346a850231f424a14bef8 Mon Sep 17 00:00:00 2001 From: Alexander Nenninger <44548181+AlexanderNenninger@users.noreply.github.com> Date: Thu, 10 Mar 2022 16:53:15 +0100 Subject: [PATCH 301/531] benckmark to benchmark --- content/english/hpc/profiling/benchmarking.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/profiling/benchmarking.md b/content/english/hpc/profiling/benchmarking.md index 7357f451..d873ca62 100644 --- a/content/english/hpc/profiling/benchmarking.md +++ b/content/english/hpc/profiling/benchmarking.md @@ -51,7 +51,7 @@ int main() { } ``` -This is a very low-overhead method that lets you run more experiments and [get more accurate results](../noise) from them. You still have to perform some repeated actions, but they can be largely automated with frameworks, [Google benchmark library](https://github.com/google/benchmark) being the most popular choice for C++. Some programming languages also have handy built-in tools for benchmarking: special mention here goes to [Python's timeit function](https://docs.python.org/3/library/timeit.html) and [Julia's @benckmark macro](https://github.com/JuliaCI/BenchmarkTools.jl). +This is a very low-overhead method that lets you run more experiments and [get more accurate results](../noise) from them. You still have to perform some repeated actions, but they can be largely automated with frameworks, [Google benchmark library](https://github.com/google/benchmark) being the most popular choice for C++. Some programming languages also have handy built-in tools for benchmarking: special mention here goes to [Python's timeit function](https://docs.python.org/3/library/timeit.html) and [Julia's @benchmark macro](https://github.com/JuliaCI/BenchmarkTools.jl). Although *efficient* in terms of execution speed, C and C++ are not the most *productive* languages, especially when it comes to analytics. 
When your algorithm depends on some parameters such as the input size, and you need to collect more than just one data point from each implementation, you really want to integrate your benchmarking code with the outside environment and analyze the results using something else. From deef86524281044abec05983fc39e8e2e733f97e Mon Sep 17 00:00:00 2001 From: Alexander Nenninger <44548181+AlexanderNenninger@users.noreply.github.com> Date: Thu, 10 Mar 2022 16:59:51 +0100 Subject: [PATCH 302/531] tense --- content/english/hpc/profiling/noise.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/profiling/noise.md b/content/english/hpc/profiling/noise.md index c530c160..8dcdb032 100644 --- a/content/english/hpc/profiling/noise.md +++ b/content/english/hpc/profiling/noise.md @@ -11,7 +11,7 @@ Situations like these are usually not caused by fraudulent actions by their auth There are many things that can introduce bias into benchmarks. -**Differing datasets.** There are many algorithms whose performance somehow depends on the dataset distribution. In order to define, for example, what the fastest sorting, shortest path, or binary search algorithms are, you have to fixing the dataset on which the algorithm is run. +**Differing datasets.** There are many algorithms whose performance somehow depends on the dataset distribution. In order to define, for example, what the fastest sorting, shortest path, or binary search algorithms are, you have to fix the dataset on which the algorithm is run. This sometimes applies even to algorithms that process a single piece of input. For example, it is not a good idea to feed GCD implementations sequential numbers because it makes branches very predictable: From cdd917f98937d51ef44be7bcded3b3897888be70 Mon Sep 17 00:00:00 2001 From: Alexander Nenninger <44548181+AlexanderNenninger@users.noreply.github.com> Date: Thu, 10 Mar 2022 17:09:26 +0100 Subject: [PATCH 303/531] don't insult the reader. 
--- content/english/hpc/arithmetic/float.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/float.md b/content/english/hpc/arithmetic/float.md index 7feb2769..4bc2a1b2 100644 --- a/content/english/hpc/arithmetic/float.md +++ b/content/english/hpc/arithmetic/float.md @@ -9,7 +9,7 @@ The users of floating-point arithmetic deserve one of these IQ bell curve memes - Then they discover that `0.1 + 0.2 != 0.3` or some other quirk like that, freak out, start thinking that some random error term is added to every computation, and for many years avoid any real data types completely. - Then they finally man up, read the specification of how IEEE-754 floats work and start using them appropriately. -Most people are unfortunately still at stage 2, breeding various misconceptions about floating-point arithmetic — thinking that it is fundamentally imprecise and unstable, and slower than integer arithmetic. +Too many people are unfortunately still at stage 2, breeding various misconceptions about floating-point arithmetic — thinking that it is fundamentally imprecise and unstable, and slower than integer arithmetic. ![](../img/iq.svg) From c040deb9b527648dcadafc660e4ff426ab0121b5 Mon Sep 17 00:00:00 2001 From: Alexander Nenninger <44548181+AlexanderNenninger@users.noreply.github.com> Date: Thu, 10 Mar 2022 17:37:57 +0100 Subject: [PATCH 304/531] don't insult the reader fix link --- content/english/hpc/arithmetic/float.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/float.md b/content/english/hpc/arithmetic/float.md index 4bc2a1b2..54da4b8d 100644 --- a/content/english/hpc/arithmetic/float.md +++ b/content/english/hpc/arithmetic/float.md @@ -189,4 +189,4 @@ fp operator*(fp a, fp b) { Many applications that require higher levels of precision use software floating-point arithmetic in a similar fashion. 
But of course, you don't want to execute a sequence of 10 or so instructions that this code compiles to each time you want to multiply two real numbers, so on modern CPUs, floating-point arithmetic is implemented in hardware — usually as separate coprocessors due to its complexity. -The *floating-point unit* of x86 (often referred to as x87) has separate registers and its own tiny instruction set that supports memory operations, basic arithmetic, trigonometry, and some common operations such as logarithm, exponent, and square root. To make these operations properly work together, some additional details of floating-point number representation need to be clarified — which we will do in [the next section](../ieee). +The *floating-point unit* of x86 (often referred to as x87) has separate registers and its own tiny instruction set that supports memory operations, basic arithmetic, trigonometry, and some common operations such as logarithm, exponent, and square root. To make these operations properly work together, some additional details of floating-point number representation need to be clarified — which we will do in [the next section](./ieee-754.md). From e3f7e0e2d362adb811c07bba2252700e26f15015 Mon Sep 17 00:00:00 2001 From: Alexander Nenninger <44548181+AlexanderNenninger@users.noreply.github.com> Date: Thu, 10 Mar 2022 18:12:13 +0100 Subject: [PATCH 305/531] Article --- content/english/hpc/arithmetic/ieee-754.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/ieee-754.md b/content/english/hpc/arithmetic/ieee-754.md index 9c708ffe..171a3c93 100644 --- a/content/english/hpc/arithmetic/ieee-754.md +++ b/content/english/hpc/arithmetic/ieee-754.md @@ -58,7 +58,7 @@ The default way integer arithmetic deals with corner cases such as division by z Sometimes a software crash, in turn, causes a real, physical one. 
In 1996, the maiden flight of the [Ariane 5](https://en.wikipedia.org/wiki/Ariane_5) (the space launch vehicle that ESA uses to lift stuff into low Earth orbit) ended in [a catastrophic explosion](https://www.youtube.com/watch?v=gp_D8r-2hwk) due to the policy of aborting computation on arithmetic error, which in this case was a floating-point to integer conversion overflow, that led to the navigation system thinking that it was off course and making a large correction, eventually causing the disintegration of a $1B rocket. -There is a way to gracefully handle corner cases like these: hardware interrupts. When an exception occurs, CPU: +There is a way to gracefully handle corner cases like these: hardware interrupts. When an exception occurs, the CPU - interrupts the execution of a program; - packs every all relevant information into a data structure called "interrupt vector"; From 45cd7455b5e742e90a341fcc3851dd963d890106 Mon Sep 17 00:00:00 2001 From: Alexander Nenninger <44548181+AlexanderNenninger@users.noreply.github.com> Date: Thu, 10 Mar 2022 19:06:03 +0100 Subject: [PATCH 306/531] Fixed mixup between eps and epsilon --- content/english/hpc/arithmetic/errors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/errors.md b/content/english/hpc/arithmetic/errors.md index 14952cc4..a5a56cd5 100644 --- a/content/english/hpc/arithmetic/errors.md +++ b/content/english/hpc/arithmetic/errors.md @@ -71,7 +71,7 @@ bool eq(float a, float b) { } ``` -The value of epsilon should depend on the application: the one above — the machine epsilon for `float` — is only good for no more than one floating-point operation. +The value of `eps` should depend on the application: the one above — the machine epsilon for `float` — is only good for no more than one floating-point operation. 
### Interval Arithmetic From c61ce4bc0f9f87e181ef0355265d021b4b17e6ef Mon Sep 17 00:00:00 2001 From: Alexander Nenninger <44548181+AlexanderNenninger@users.noreply.github.com> Date: Thu, 10 Mar 2022 19:21:35 +0100 Subject: [PATCH 307/531] low to small --- content/english/hpc/arithmetic/errors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/errors.md b/content/english/hpc/arithmetic/errors.md index a5a56cd5..3091e36e 100644 --- a/content/english/hpc/arithmetic/errors.md +++ b/content/english/hpc/arithmetic/errors.md @@ -127,7 +127,7 @@ In this one, it is easy to show that the error is be bound by $\epsilon \cdot |x ### Kahan Summation -From the previous example, we can see that long chains of operations are not a problem, but adding and subtracting numbers of different magnitude is. The general approach to dealing with such problems is to try to keep big numbers with big numbers and low numbers with low numbers. +From the previous example, we can see that long chains of operations are not a problem, but adding and subtracting numbers of different magnitude is. The general approach to dealing with such problems is to try to keep big numbers with big numbers and small numbers with small numbers. 
Consider the standard summation algorithm: From e34314ee0d28762ac9456d2c8eae7c42f7092fe2 Mon Sep 17 00:00:00 2001 From: Alexander Nenninger <44548181+AlexanderNenninger@users.noreply.github.com> Date: Thu, 10 Mar 2022 19:24:55 +0100 Subject: [PATCH 308/531] int literals are easier to read than words --- content/english/hpc/arithmetic/errors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/errors.md b/content/english/hpc/arithmetic/errors.md index 3091e36e..2a077941 100644 --- a/content/english/hpc/arithmetic/errors.md +++ b/content/english/hpc/arithmetic/errors.md @@ -139,7 +139,7 @@ for (int i = 0; i < n; i++) Since we are performing summations and not multiplications, its relative error is no longer just bounded by $O(\epsilon \cdot n)$, but heavily depends on the input. -In the most ridiculous case, if the first value is $2^{23}$ and the others are ones, the sum is going to be $2^{23}$ regardless of $n$, which can be verified by executing the following code and observing that it simply prints $16777216 = 2^{23}$ twice: +In the most ridiculous case, if the first value is $2^{24}$ and the others are equal to $1$, the sum is going to be $2^{24}$ regardless of $n$, which can be verified by executing the following code and observing that it simply prints $16777216 = 2^{24}$ twice: ```cpp const int n = (1<<24); From 0033e640cdc1e74f404bf35fed29764766b2667a Mon Sep 17 00:00:00 2001 From: Alexander Nenninger <44548181+AlexanderNenninger@users.noreply.github.com> Date: Thu, 10 Mar 2022 19:38:31 +0100 Subject: [PATCH 309/531] one straggling 'so' --- content/english/hpc/arithmetic/newton.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/newton.md b/content/english/hpc/arithmetic/newton.md index 8ce3cd37..90fac488 100644 --- a/content/english/hpc/arithmetic/newton.md +++ b/content/english/hpc/arithmetic/newton.md @@ -3,7 +3,7 @@ title: Newton's Method weight: 3 --- 
-Reaching the maximum possible precision is rarely required from a practical algorithm. In real-world data, modeling and measurement errors are usually several orders of magnitude larger than the errors that come from rounding floating-point numbers and such, so we are often perfectly happy with picking an approximate method that trades off precision for speed. +Reaching the maximum possible precision is rarely required from a practical algorithm. In real-world data, modeling and measurement errors are usually several orders of magnitude larger than the errors that come from rounding floating-point numbers and such, we are often perfectly happy with picking an approximate method that trades off precision for speed. In this section, we introduce one of the most important building blocks in such approximate, numerical algorithms: *the Newton's method*. From be3b7328ac2f5365d7d41e11b5e785094dce8245 Mon Sep 17 00:00:00 2001 From: Radoslaw Jurga Date: Thu, 10 Mar 2022 20:09:50 +0100 Subject: [PATCH 310/531] Add missing word "used" --- content/english/hpc/profiling/instrumentation.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/profiling/instrumentation.md b/content/english/hpc/profiling/instrumentation.md index bdf3392b..e31208c8 100644 --- a/content/english/hpc/profiling/instrumentation.md +++ b/content/english/hpc/profiling/instrumentation.md @@ -1,6 +1,7 @@ --- title: Instrumentation weight: 1 +published: true --- *Instrumentation* is an overcomplicated term that means inserting timers and other tracking code into programs. The simplest example is using the `time` utility in Unix-like systems to measure the duration of execution for the whole program. @@ -77,4 +78,4 @@ void query() { This way we can remove the need to sample a new random number on each invocation, only resetting the counter when we choose to calculate statistics. 
-Techniques like that are frequently by library algorithm developers inside large projects to collect profiling data without affecting the performance of the end program too much. +Techniques like that are frequently used by library algorithm developers inside large projects to collect profiling data without affecting the performance of the end program too much. From 62fd512214cabaabecafb3c0ff282d07c02d558e Mon Sep 17 00:00:00 2001 From: Radoslaw Jurga Date: Thu, 10 Mar 2022 21:32:37 +0100 Subject: [PATCH 311/531] Add missing word "to" --- content/english/hpc/arithmetic/_index.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/_index.md b/content/english/hpc/arithmetic/_index.md index 686d15ab..ca4357c5 100644 --- a/content/english/hpc/arithmetic/_index.md +++ b/content/english/hpc/arithmetic/_index.md @@ -1,11 +1,12 @@ --- title: Arithmetic weight: 6 +published: true --- As we repeatedly demonstrate throughout this book, knowing darker corners of the instruction set can be very fruitful, especially in the case of [CISC](/hpc/architecture/isa) platforms like x86, which currently has [somewhere between 1000 and 4000](https://stefanheule.com/blog/how-many-x86-64-instructions-are-there-anyway/) distinct instructions, depending on how you count. -Most of these instructions are related arithmetic, and using them all efficiently to optimize arithmetic operations requires a great deal of both knowledge, skill, and creativity. Therefore, in this chapter, we will discuss number representations and their use in numerical algorithms. +Most of these instructions are related to arithmetic, and using them all efficiently to optimize arithmetic operations requires a great deal of both knowledge, skill, and creativity. Therefore, in this chapter, we will discuss number representations and their use in numerical algorithms. 
diff --git a/content/english/hpc/complexity/levels.md b/content/english/hpc/complexity/levels.md index d0757754..281bdea2 100644 --- a/content/english/hpc/complexity/levels.md +++ b/content/english/hpc/complexity/levels.md @@ -30,12 +30,12 @@ You get especially frustrated if you had a competitive programming experience. Y Programmers can be put in several "levels" in terms of their software optimization abilities: -0. "Newbie". Those who don't think about performance at all. They usually write in high-level languages, sometimes in declarative / functional languages. Most "programmers" stay there (and there is nothing wrong with it). -1. "Undergraduate student". Those who know about Big O notation and are familiar with basic data structures and approaches. LeetCode and CodeForces folks are there. This is also the requirement in getting into big companies — they have a lot of in-house software, large scale, and they are looking for people in the long term, so asking things like programming language. -2. "Graduate student". Those who know that not all operations are created equal; know other cost models such as external memory model (B-tree, external sorting), word model (bitset,) or parallel computing, but still in theory. -3. "Professional developer". Those who know actual timings of these operations. Aware that branch mispredictions are costly, memory is split into cache lines. Knows some basic SIMD techniques. -4. "Performance engineer". Know exactly what happens inside their hardware. Know the difference between latency and bandwidth, know about ports. Knows how to use SIMD and the rest of instruction set effectively. Can read assembly and use profilers. -5. "Intel employee". Knows microarchitecture-specific details. This is outside of the purview of normal engineers. +0. *Newbie*. Those who don't think about performance at all. They usually write in high-level languages, sometimes in declarative / functional languages. 
Most "programmers" stay there (and there is nothing wrong with it). +1. *Undergraduate student*. Those who know about Big O notation and are familiar with basic data structures and approaches. LeetCode and CodeForces folks are there. This is also the requirement in getting into big companies — they have a lot of in-house software, large scale, and they are looking for people in the long term, so asking things like programming language. +2. *Graduate student*. Those who know that not all operations are created equal; they know other cost models such as the external memory model (B-tree, external sorting), the word model (bitset), or parallel computing, but still only in theory. +3. *Professional developer*. Those who know the actual timings of these operations. They are aware that branch mispredictions are costly and that memory is split into cache lines, and they know some basic SIMD techniques. +4. *Performance engineer*. Those who know exactly what happens inside their hardware. They know the difference between latency and bandwidth and know about ports. They know how to use SIMD and the rest of the instruction set effectively, and they can read assembly and use profilers. +5. *Intel employee*. Those who know microarchitecture-specific details. This is outside of the purview of normal engineers. In this book, we expect that the average reader is somewhere around stage 1, and hopefully by the end of it will get to 4. 
diff --git a/content/english/hpc/cpu-cache/associativity.md b/content/english/hpc/cpu-cache/associativity.md index ee3203cf..b9f278ee 100644 --- a/content/english/hpc/cpu-cache/associativity.md +++ b/content/english/hpc/cpu-cache/associativity.md @@ -95,7 +95,7 @@ along with a "tag" information which helps identify which block it is Performance issues caused by cache associativity effects arise with remarkable frequency in algorithms because, for multiple reasons, programmers just love using powers of two when indexing arrays: - It is easier to calculate the address for multi-dimensional array accesses if the last dimension is a power of two, as it only requires a binary shift instead of a multiplication. -- It is easier to calculate modulo a power of two, as it can be done with a single bitwise "and". +- It is easier to calculate modulo a power of two, as it can be done with a single bitwise `and`. - It is convenient and often even necessary to use power-of-two problem sizes in divide-and-conquer algorithms. - It is the smallest integer exponent, so using the sequence of increasing powers of two as problem sizes are a popular choice when benchmarking memory-bound algorithms. - Also, more natural powers of ten are by transitivity divisible by a slightly lower power of two. 
diff --git a/content/english/hpc/cpu-cache/bandwidth.md b/content/english/hpc/cpu-cache/bandwidth.md index 252cb866..88b547ad 100644 --- a/content/english/hpc/cpu-cache/bandwidth.md +++ b/content/english/hpc/cpu-cache/bandwidth.md @@ -21,7 +21,7 @@ for (int t = 0; t < K; t++) a[i]++; ``` -Changing $N$ and adjusting $K$ so that the total number of array cells accessed remains roughly constant and expressing the total time in "operations per second", we get a graph like this: +Changing $N$ and adjusting $K$ so that the total number of array cells accessed remains roughly constant and expressing the total time in "operations per second," we get a graph like this: ![Dotted vertical lines are cache layer sizes](../img/inc.svg) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 61aec502..ff9f73b4 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -175,7 +175,7 @@ With prefetching, the performance on large arrays becomes roughly the same: ![](../img/search-branchless-prefetch.svg) -The graph still grows faster as the branchy version also prefetches "grandchildren", "grand-grandchildren", and so on — although the usefulness of each new speculative read diminishes exponentially as the prediction is less and less likely to be correct. +The graph still grows faster as the branchy version also prefetches "grandchildren," "grand-grandchildren," and so on — although the usefulness of each new speculative read diminishes exponentially as the prediction is less and less likely to be correct. In the branchless version, we could also fetch ahead by more than one layer, but the number of fetches we'd need also grows exponentially. Instead, we will try a different approach to optimize memory operations. 
diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index 8e3bd30e..35528c7a 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -467,7 +467,7 @@ A lot of the performance boost of the S+ tree comes from removing branching and -Although nobody except maybe the HFT people cares about real latency, and everybody actually measures throughput even when using the word "latency", this nuance is still something to take into account when predicting the possible speedup in user applications. +Although nobody except maybe the HFT people cares about real latency, and everybody actually measures throughput even when using the word "latency," this nuance is still something to take into account when predicting the possible speedup in user applications. ### Modifications and Further Optimizations diff --git a/content/english/hpc/external-memory/_index.md b/content/english/hpc/external-memory/_index.md index d7c1612c..2255945b 100644 --- a/content/english/hpc/external-memory/_index.md +++ b/content/english/hpc/external-memory/_index.md @@ -41,7 +41,7 @@ It becomes ever more important to optimize Modern computers grow ever more powerful, but their memory systems can't quite pick up with the increase in computing power, because they don't follow the same [laws of silicon scaling](/hpc/complexity/hardware) as CPU chips do. -If a CPU core has a frequency of 3 GHz, it roughly means that it is capable of executing up to $3 \cdot 10^9$ operations per second, depending on what constitutes an "operation". This is the baseline: on modern architectures, it can be increased by techniques such as SIMD and instruction-level parallelism up to $10^{11}$ operations per second, if the computation allows it. +If a CPU core has a frequency of 3 GHz, it roughly means that it is capable of executing up to $3 \cdot 10^9$ operations per second, depending on what constitutes an "operation." 
This is the baseline: on modern architectures, it can be increased by techniques such as SIMD and instruction-level parallelism up to $10^{11}$ operations per second, if the computation allows it. But for many algorithms, the CPU is not the bottleneck. Before trying to optimize performance above that baseline, we need to learn not to drop below it, and the number one reason for this is memory. diff --git a/content/english/hpc/external-memory/list-ranking.md b/content/english/hpc/external-memory/list-ranking.md index 07b33c71..cf5d9929 100644 --- a/content/english/hpc/external-memory/list-ranking.md +++ b/content/english/hpc/external-memory/list-ranking.md @@ -52,7 +52,7 @@ For example, we can obtain the Euler tour of a tree in external memory by constr - split each undirected tree edge into two directed ones; - duplicate the parent node for each up-edge (because list nodes can only have one incoming edge, but we visit some tree vertices multiple times); -- route each such node either to the "next sibling", if it has one, or otherwise to its own parent; +- route each such node either to the "next sibling," if it has one, or otherwise to its own parent; - and then finally break the resulting cycle at the root. This general technique is called *tree contraction*, and it serves as the basis for a large number of tree algorithms. diff --git a/content/english/hpc/external-memory/locality.md b/content/english/hpc/external-memory/locality.md index 8607506d..a26ff70f 100644 --- a/content/english/hpc/external-memory/locality.md +++ b/content/english/hpc/external-memory/locality.md @@ -34,8 +34,8 @@ In this section, we will do some case studies to show how these high-level conce Consider a divide-and-conquer algorithm such as merge sorting. There are two approaches to implementing it: -- We can implement it recursively, or "depth-first", the way it is normally implemented: sort the left half, sort the right half and then merge the results. 
-- We can implement it iteratively, or "breadth-first": do the lowest "layer" first, looping through the entire dataset and comparing odd elements with even elements, then merge the first two elements with the second two elements, the third two elements with the fourth two elements and so on. +- We can implement it recursively, or "depth-first," the way it is normally implemented: sort the left half, sort the right half and then merge the results. +- We can implement it iteratively, or "breadth-first:" do the lowest "layer" first, looping through the entire dataset and comparing odd elements with even elements, then merge the first two elements with the second two elements, the third two elements with the fourth two elements and so on. It seems like the second approach is more cumbersome, but faster — because recursion is always slow, right? diff --git a/content/english/hpc/external-memory/sorting.md b/content/english/hpc/external-memory/sorting.md index 6ac13ae0..e58b2887 100644 --- a/content/english/hpc/external-memory/sorting.md +++ b/content/english/hpc/external-memory/sorting.md @@ -37,9 +37,9 @@ Remember [the $M \gg B$ assumption](../model) when we introduced the computation ### Merge Sorting -The "normal" complexity of the standard mergesort algorithm is $O(N \log_2 N)$: on each of its $O(\log_2 N)$ "layers", the algorithms need to go through all $N$ elements in total and merge them in linear time. +The "normal" complexity of the standard mergesort algorithm is $O(N \log_2 N)$: on each of its $O(\log_2 N)$ "layers," the algorithms need to go through all $N$ elements in total and merge them in linear time. -In the external memory model, when we read a block of size $M$, we can sort its elements "for free", since they are already in memory. This way we can split the arrays into $O(\frac{N}{M})$ blocks of consecutive elements and sort them separately as the base step, and only then merge them. 
+In the external memory model, when we read a block of size $M$, we can sort its elements "for free," since they are already in memory. This way we can split the arrays into $O(\frac{N}{M})$ blocks of consecutive elements and sort them separately as the base step, and only then merge them. ![](../img/k-way.png) diff --git a/content/english/hpc/number-theory/_index.md b/content/english/hpc/number-theory/_index.md index 091f476f..f4936581 100644 --- a/content/english/hpc/number-theory/_index.md +++ b/content/english/hpc/number-theory/_index.md @@ -6,7 +6,7 @@ draft: true In 1940, British mathematician Godfrey Harold Hardy published a famous essay titled [A Mathematician's Apology](https://en.wikipedia.org/wiki/A_Mathematician%27s_Apology) where he discusses the notion that mathematics should be pursued for its own sake rather than for the sake of its applications. As a 62-year-old, he saw the devastation caused by first world war, and was amidst the second one. -A scientist faces a moral dilemma because some of its inventions may do more harm than good. One can find calm in pursuing useless math. Hardy himself specialized in number theory, and he was content about it not having any applications: "No one has yet discovered any warlike purpose to be served by the theory of numbers or relativity, and it seems unlikely that anyone will do so for many years". +A scientist faces a moral dilemma because some of its inventions may do more harm than good. One can find calm in pursuing useless math. Hardy himself specialized in number theory, and he was content about it not having any applications: "No one has yet discovered any warlike purpose to be served by the theory of numbers or relativity, and it seems unlikely that anyone will do so for many years." It is ironic that within just 5 years number theory was the basis of cracking Enigma and relativity theory developing atomic bomb respectively. 
diff --git a/content/english/hpc/number-theory/cryptography.md b/content/english/hpc/number-theory/cryptography.md index 87f58124..0dd500dc 100644 --- a/content/english/hpc/number-theory/cryptography.md +++ b/content/english/hpc/number-theory/cryptography.md @@ -22,13 +22,13 @@ To calculate $d$ and restore the message, the attacker would need to repeat step When doing actual communication, people first exchange their public keys (in any, possibly unsecure way) and then use it to encrypt messages. -This is what web browsers do when establishing connection "https". You can also do it by hand with GPG. +This is what web browsers do when establishing connection "https." You can also do it by hand with GPG. ### Man-in-the-middle There is an issue when establishing initial communication that the attacker could replace it and control the communication. -Between your browser and a bank. "Hey this is a message from a bank". +Between your browser and a bank. "Hey this is a message from a bank." Trust networks. E. g. everyone can trust Google or whoever makes the device or operating system. diff --git a/content/english/hpc/number-theory/hashing.md b/content/english/hpc/number-theory/hashing.md index 0484d173..294573a1 100644 --- a/content/english/hpc/number-theory/hashing.md +++ b/content/english/hpc/number-theory/hashing.md @@ -12,7 +12,7 @@ Hash function is any function that is: * Computed fast — at least in linear time, that is. * Has a limited image — say, 64-bit values. -* "Deterministically-random": if it takes $n$ different values, then the probability of collision of two random hashes is $\frac{1}{n}$ and can't be predicted well without knowing the hash function. +* "Deterministically-random:" if it takes $n$ different values, then the probability of collision of two random hashes is $\frac{1}{n}$ and can't be predicted well without knowing the hash function. One good test is that can't create a collision in any better time than by birthday paradox. 
Square root of the hash space. diff --git a/content/english/hpc/number-theory/inverse.md b/content/english/hpc/number-theory/inverse.md index dbfe1676..ccbb14ea 100644 --- a/content/english/hpc/number-theory/inverse.md +++ b/content/english/hpc/number-theory/inverse.md @@ -18,7 +18,7 @@ mint inv() const { In this section, we are going to discuss some preliminaries before discussing more advanced topics. -In computers, we use the 1st of January, 1970 as the start of the "Unix era", and all time computations are usually done relative to that timestamp. +In computers, we use the 1st of January, 1970 as the start of the "Unix era," and all time computations are usually done relative to that timestamp. We humans also keep track of time relative to some point in the past, which usually has a political or religious significance. At the moment of writing, approximately 63882260594 seconds have passed since 0 AD. @@ -69,7 +69,7 @@ where $\phi(m)$ is called Euler's totient function and is equal to the number of These theorems have a lot of applications. One of them is checking whether a number $n$ is prime or not faster than factoring it. You can pick any base $a$ at random and try to raise it to power $a^{p-1}$ modulo $n$ and check if it is $1$. Such base is called *witness*. -Such probabilistic tests are therefore returning either "no" or "maybe". It may be the case that it just happened to be equal to $1$ but in fact $n$ is composite, in which case you need to repeat the test until you are okay with the false positive probability. Moreover, there exist carmichael numbers, which are composite numbers $n$ that satisfy $a^n \equiv 1 \pmod n$ for all $a$. These numbers are rare, but still [exist](https://oeis.org/A002997). +Such probabilistic tests are therefore returning either "no" or "maybe." 
It may be the case that it just happened to be equal to $1$ but in fact $n$ is composite, in which case you need to repeat the test until you are okay with the false positive probability. Moreover, there exist carmichael numbers, which are composite numbers $n$ that satisfy $a^n \equiv 1 \pmod n$ for all $a$. These numbers are rare, but still [exist](https://oeis.org/A002997). Unless the input is provided by an adversary, the mistake probability will be low. This test is adequate for finding large primes: there are roughly $\frac{n}{\ln n}$ primes among the first $n$ numbers, which is another fact that we are not going to prove. These primes are distributed more or less evenly, so one can just pick a random number and check numbers in sequence, and after checking $O(\ln n)$ numbers one will probably be found. diff --git a/content/english/hpc/parallel/gpu/_index.en.md b/content/english/hpc/parallel/gpu/_index.en.md index aafb7ba1..f08d10c7 100644 --- a/content/english/hpc/parallel/gpu/_index.en.md +++ b/content/english/hpc/parallel/gpu/_index.en.md @@ -73,7 +73,7 @@ CUDA is available for many languages. Nice documentation can be found here: https://documen.tician.de/pycuda/index.html -If you are on Colab, go to Runtime -> Change runtime type -> Hardware accelerator and set it to "GPU". +If you are on Colab, go to Runtime -> Change runtime type -> Hardware accelerator and set it to "GPU." ```python @@ -431,7 +431,7 @@ Well, you don't really need anything more precise than that for deep learning an It is called mixed precision because input matrices are fp16 but multiplication result and accumulator are fp32 matrices. -Probably, the proper name would be "4x4 matrix cores", however NVIDIA marketing team decided to use "tensor cores". +Probably, the proper name would be "4x4 matrix cores," however NVIDIA marketing team decided to use "tensor cores." So, see, this is not exactly fair comparison. 
diff --git a/content/english/hpc/pipelining/_index.md b/content/english/hpc/pipelining/_index.md index 3d7d49b5..8c388a94 100644 --- a/content/english/hpc/pipelining/_index.md +++ b/content/english/hpc/pipelining/_index.md @@ -19,9 +19,9 @@ Parallelism helps in reducing *latency*. It is important, but for now, our main Sharing computations is an art in itself, but for now, we want to learn how to use resources that we already have more efficiently. -While multi-core parallelism is "cheating", many form of parallelism exist "for free". +While multi-core parallelism is "cheating," many form of parallelism exist "for free." -Adapting algorithms for parallel hardware is important for achieving *scalability*. In the first part of this book, we will consider this technique "cheating". We only do optimizations that are truly free, and preferably don't take away resources from other processes that might be running concurrently. +Adapting algorithms for parallel hardware is important for achieving *scalability*. In the first part of this book, we will consider this technique "cheating." We only do optimizations that are truly free, and preferably don't take away resources from other processes that might be running concurrently. --> @@ -62,7 +62,7 @@ You can find many analogies with modern CPUs: 2. There are multiple execution units that can process these instructions simultaneously while sharing other CPU facilities (usually 2-4 execution units). 3. Instructions are processed in pipelined fashion (saving roughly the same number of cycles as the number of years between kindergarten and PhD). 
- + In addition to that, several other aspects also match: diff --git a/content/english/hpc/pipelining/branching.md b/content/english/hpc/pipelining/branching.md index 849e75a0..2a3b81af 100644 --- a/content/english/hpc/pipelining/branching.md +++ b/content/english/hpc/pipelining/branching.md @@ -45,17 +45,17 @@ body: jmp counter ``` -Our goal is to simulate a completely unpredictable branch, and we successfully achieve it: the code takes ~14 CPU cycles per element. For a very rough estimate of what it is supposed to be, we can assume that the branches alternate between "<" and ">=", and the pipeline is mispredicted every other iteration. Then, every two iterations: +Our goal is to simulate a completely unpredictable branch, and we successfully achieve it: the code takes ~14 CPU cycles per element. For a very rough estimate of what it is supposed to be, we can assume that the branches alternate between `<` and `>=`, and the pipeline is mispredicted every other iteration. Then, every two iterations: - We discard the pipeline, which is 19 cycles deep on Zen 2 (i. e. it has 19 stages, each taking one cycle). - We need a memory fetch and a comparison, which costs ~5 cycles. We can check the conditions of even and odd iterations concurrently, so let's assume we only pay it once per 2 iterations. -- In the case of the "<" branch, we need another ~4 cycles to add `a[i]` to a volatile (memory-stored) variable `s`. +- In the case of the `<` branch, we need another ~4 cycles to add `a[i]` to a volatile (memory-stored) variable `s`. Therefore, on average, we need to spend $(4 + 5 + 19) / 2 = 14$ cycles per element, matching what we measured. 
### Branch Prediction -We can replace the hardcoded `50` with a tweakable parameter `P` that effectively sets the probability of the "<" branch: +We can replace the hardcoded `50` with a tweakable parameter `P` that effectively sets the probability of the `<` branch: ```c++ for (int i = 0; i < N; i++) @@ -69,7 +69,7 @@ Now, if we benchmark it for different values of `P`, we get an interesting-looki Its peak is at 50-55%, as expected: branch misprediction is the most expensive thing here. This graph is asymmetrical: it takes just ~1 cycle to only check conditions that are never satisfied (`P = 0`), and ~7 cycles for the sum if the branch is always taken (`P = 100`). -This graph is not unimodal: there is another local minimum at around 85-90%. We spend ~6.15 cycles per element there or about 10-15% faster than when we always take the branch, accounting for the fact that we need to perform fewer additions. Branch misprediction stops affecting the performance at this point because when it happens, not the whole instruction buffer is discarded, but only the operations that were speculatively scheduled. Essentially, that 10-15% mispredict rate is the equilibrium point where we can see far enough in the pipeline not to stall but still save 10-15% on taking the cheaper ">=" branch. +This graph is not unimodal: there is another local minimum at around 85-90%. We spend ~6.15 cycles per element there or about 10-15% faster than when we always take the branch, accounting for the fact that we need to perform fewer additions. Branch misprediction stops affecting the performance at this point because when it happens, not the whole instruction buffer is discarded, but only the operations that were speculatively scheduled. Essentially, that 10-15% mispredict rate is the equilibrium point where we can see far enough in the pipeline not to stall but still save 10-15% on taking the cheaper `>=` branch. 
Note that it costs almost nothing to check for a condition that never or almost never occurs. This is why programmers use runtime exceptions and base case checks so profusely: if they are indeed rare, they don't really cost anything. @@ -86,9 +86,9 @@ for (int i = 0; i < N; i++) std::sort(a, a + n); ``` -We are still processing the same elements, but in a different order, and instead of 14 cycles, it now runs in a little bit more than 4, which is exactly the average of the cost of the pure "<" and ">=" branches. +We are still processing the same elements, but in a different order, and instead of 14 cycles, it now runs in a little bit more than 4, which is exactly the average of the cost of the pure `<` and `>=` branches. -The branch predictor can pick up on much more complicated patterns than just "always left, then always right" or "left-right-left-right". If we just decrease the size of the array $N$ to 1000 (without sorting it), then the branch predictor memorizes the entire sequence of comparisons, and the benchmark again measures at around 4 cycles — in fact, even slightly fewer than in the sorted array case, because in the former case branch predictor needs to spend some time flicking between the "always yes" and "always no" states. +The branch predictor can pick up on much more complicated patterns than just "always left, then always right" or "left-right-left-right." If we just decrease the size of the array $N$ to 1000 (without sorting it), then the branch predictor memorizes the entire sequence of comparisons, and the benchmark again measures at around 4 cycles — in fact, even slightly fewer than in the sorted array case, because in the former case branch predictor needs to spend some time flicking between the "always yes" and "always no" states. 
### Hinting Likeliness of Branches diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index 62f0aa2f..160d455a 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -184,7 +184,7 @@ int abs(int a) { A very common value for strings is the empty string — which is also its default value. You also need to handle them somehow, and the idiomatic thing to do is to assign `nullptr` as the pointer and `0` as the string size, and then check if the pointer is null or if the size is zero at the beginning of every procedure involving strings. -However, this requires a separate branch, which is costly unless most strings are empty. What we can do to get rid of it is to allocate a "zero C-string", which is just a zero byte allocated somewhere, and then simply point all empty strings there. Now all string operations with empty strings have to read this useless zero byte, but this is still much cheaper than a branch misprediction. +However, this requires a separate branch, which is costly unless most strings are empty. What we can do to get rid of it is to allocate a "zero C-string," which is just a zero byte allocated somewhere, and then simply point all empty strings there. Now all string operations with empty strings have to read this useless zero byte, but this is still much cheaper than a branch misprediction. 
**Binary search.** The standard binary search [can be implemented](/hpc/data-structures/binary-search) without branches, and on small arrays (that fit into cache) it works ~4x faster than the branchy `std::lower_bound`: diff --git a/content/english/hpc/pipelining/scheduling.md b/content/english/hpc/pipelining/scheduling.md index da380037..b4857a0c 100644 --- a/content/english/hpc/pipelining/scheduling.md +++ b/content/english/hpc/pipelining/scheduling.md @@ -22,9 +22,9 @@ While complex instruction sets had the benefit, with superscalar processors you Instructions are microcoded. -uOps ("micro-ops", the first letter is meant to be greek letter mu as in us (microsecond), but nobody cares enough to type it). +uOps ("micro-ops," the first letter is meant to be greek letter mu as in us (microsecond), but nobody cares enough to type it). -Each architecture has its own set of "ports", each capable of executing its own set of instructions (uOps, to be more exact). +Each architecture has its own set of "ports," each capable of executing its own set of instructions (uOps, to be more exact). But still, when you use it, it appears and feels like a single instruction. How does CPU achieve that? diff --git a/content/english/hpc/pipelining/tables.md b/content/english/hpc/pipelining/tables.md index d18d99c6..24678270 100644 --- a/content/english/hpc/pipelining/tables.md +++ b/content/english/hpc/pipelining/tables.md @@ -30,7 +30,7 @@ You can get latency and throughput numbers for a specific architecture from spec Some comments: -- Because our minds are so used to the cost model where "more" means "worse", people mostly use *reciprocals* of throughput instead of throughput. +- Because our minds are so used to the cost model where "more" means "worse," people mostly use *reciprocals* of throughput instead of throughput. 
- If a certain instruction is especially frequent, its execution unit could be duplicated to increase its throughput — possibly to even more than one, but not higher than the [decode width](/hpc/architecture/layout). - Some instructions have a latency of 0. This means that these instruction are used to control the scheduler and don't reach the execution stage. They still have non-zero reciprocal throughput because the [CPU front-end](/hpc/architecture/layout) still needs to process them. - Most instructions are pipelined, and if they have the reciprocal throughput of $n$, this usually means that their execution unit can take another instruction after $n$ cycles (and if it is below 1, this means that there are multiple execution units, all capable of taking another instruction on the next cycle). One notable exception is the [integer division](/hpc/arithmetic/division): it is either very poorly pipelined or not pipelined at all. diff --git a/content/english/hpc/simd/_index.md b/content/english/hpc/simd/_index.md index 988e83e8..5e05da8e 100644 --- a/content/english/hpc/simd/_index.md +++ b/content/english/hpc/simd/_index.md @@ -29,7 +29,7 @@ Now, let's add the following magic directive in the very beginning: When compiled and run in the same environment, it finishes in 1.24 seconds. This is almost twice as fast, and we didn't change a single line of code or the optimization level. -What happened here is we provided a little bit of info about the computer on which this code is supposed to be run. Specifically, we told the compiler that the target CPU supports an extension to the x86 instruction set called "AVX2". AVX2 is one of the many so-called "SIMD extensions" for x86. These extensions include instructions that operate on special registers capable of holding 128, 256, or even 512 bits of data using the "single instruction, multiple data" (SIMD) approach. 
Instead of working with a single scalar value, SIMD instructions divide the data in registers into blocks of 8, 16, 32, or 64 bits and perform the same operation on them in parallel, yielding a proportional increase in performance[^power]. +What happened here is we provided a little bit of info about the computer on which this code is supposed to be run. Specifically, we told the compiler that the target CPU supports an extension to the x86 instruction set called "AVX2." AVX2 is one of the many so-called "SIMD extensions" for x86. These extensions include instructions that operate on special registers capable of holding 128, 256, or even 512 bits of data using the "single instruction, multiple data" (SIMD) approach. Instead of working with a single scalar value, SIMD instructions divide the data in registers into blocks of 8, 16, 32, or 64 bits and perform the same operation on them in parallel, yielding a proportional increase in performance[^power]. [^power]: On some CPUs, especially heavy SIMD instructions consume more energy and thus [require downclocking](https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/) to balance off the total power consumption, so the real-time speedup is not always proportional. diff --git a/content/english/hpc/simd/intrinsics.md b/content/english/hpc/simd/intrinsics.md index d702fdbc..f10f7b9d 100644 --- a/content/english/hpc/simd/intrinsics.md +++ b/content/english/hpc/simd/intrinsics.md @@ -156,7 +156,7 @@ typedef int v8si __attribute__ (( vector_size(32) )); Unfortunately, this is not a part of the C or C++ standard, so different compilers use different syntax for that. -There is somewhat of a naming convention, which is to include size and type of elements into the name of the type: in the example above, we defined a "vector of 8 signed integers". But you may choose any name you want, like `vec`, `reg` or whatever. 
The only thing you don't want to do is to name it `vector` because of how much confusion there would be because of `std::vector`. +There is somewhat of a naming convention, which is to include size and type of elements into the name of the type: in the example above, we defined a "vector of 8 signed integers." But you may choose any name you want, like `vec`, `reg` or whatever. The only thing you don't want to do is to name it `vector` because of how much confusion there would be because of `std::vector`. The main advantage of using these types is that for many operations you can use normal C++ operators instead of looking up the relevant intrinsic. diff --git a/content/english/hpc/simd/moving.md b/content/english/hpc/simd/moving.md index e2cf3035..20c03327 100644 --- a/content/english/hpc/simd/moving.md +++ b/content/english/hpc/simd/moving.md @@ -13,7 +13,7 @@ While using the elementwise instructions is easy, the largest challenge with SIM ### Aligned Loads and Stores -Operations of reading and writing the contents of a SIMD register into memory have two versions each: `load` / `loadu` and `store` / `storeu`. The letter "u" here stands for "unaligned". The difference is that the former ones only work correctly when the read / written block fits inside a single [cache line](/hpc/cpu-cache/cache-lines) (and crash otherwise), while the latter work either way, but with a slight performance penalty if the block crosses a cache line. +Operations of reading and writing the contents of a SIMD register into memory have two versions each: `load` / `loadu` and `store` / `storeu`. The letter "u" here stands for "unaligned." The difference is that the former ones only work correctly when the read / written block fits inside a single [cache line](/hpc/cpu-cache/cache-lines) (and crash otherwise), while the latter work either way, but with a slight performance penalty if the block crosses a cache line. 
Sometimes, especially when the "inner" operation is very lightweight, the performance difference becomes significant (at least because you need to fetch two cache lines instead of one). As an extreme example, this way of adding two arrays together: diff --git a/content/english/hpc/simd/reduction.md b/content/english/hpc/simd/reduction.md index 28fb4d9c..078983d2 100644 --- a/content/english/hpc/simd/reduction.md +++ b/content/english/hpc/simd/reduction.md @@ -50,7 +50,7 @@ You can use this approach for for other reductions, such as for finding the mini ### Horizontal Summation -The last part, where we sum up the 8 accumulators stored in a vector register into a single scalar to get the total sum, is called "horizontal summation". +The last part, where we sum up the 8 accumulators stored in a vector register into a single scalar to get the total sum, is called "horizontal summation." Although extracting and adding every scalar one by one only takes a constant number of cycles, it can be computed slightly faster using a [special instruction](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=AVX,AVX2&text=_mm256_hadd_epi32&expand=2941) that adds together pairs of adjacent elements in a register. diff --git a/content/english/hpc/stats.md b/content/english/hpc/stats.md index 2961f4d5..6e436d15 100644 --- a/content/english/hpc/stats.md +++ b/content/english/hpc/stats.md @@ -193,7 +193,7 @@ f(n, m) &= 1 \times (1-\frac{1}{m}) \times (1-\frac{2}{m}) \times ... \times (1- \end{aligned} $$ -This product shrinks pretty quickly with $n$, but it is not clear what value of $m$ is needed to be "safe". Turns out, if $n = O(\sqrt m)$, the probability of collision tends to zero, and anything asymptotically larger guarantees a collision. One can show this with calculus, but we will choose the probability theory way. +This product shrinks pretty quickly with $n$, but it is not clear what value of $m$ is needed to be "safe." 
Turns out, if $n = O(\sqrt m)$, the probability of collision tends to zero, and anything asymptotically larger guarantees a collision. One can show this with calculus, but we will choose the probability theory way. Let's go back to the idea of counting pairs of birthdays and introduce $\frac{n \cdot (n-1)}{2}$ indicators $I_{ij}$ — one for each pair $(i, j)$ of persons — each being equal to $1$ if the birthdays match. The probability and expectation of each indicator is $\frac{1}{m}$. From 7e7cf436f691b526a57e810a264a1894500d56c9 Mon Sep 17 00:00:00 2001 From: Ruslan Sakevych Date: Sun, 13 Mar 2022 23:03:28 -0500 Subject: [PATCH 330/531] Fix number of non-free layers in the merge sort. --- content/english/hpc/external-memory/sorting.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/external-memory/sorting.md b/content/english/hpc/external-memory/sorting.md index e58b2887..12eb7d56 100644 --- a/content/english/hpc/external-memory/sorting.md +++ b/content/english/hpc/external-memory/sorting.md @@ -1,6 +1,7 @@ --- title: External Sorting weight: 4 +published: true --- Now, let's try to design some actually useful algorithms for the new [external memory model](../model). Our goal in this section is to slowly build up more complex things and eventually get to *external sorting* and its interesting applications. @@ -43,7 +44,7 @@ In the external memory model, when we read a block of size $M$, we can sort its ![](../img/k-way.png) -This effectively means that, in terms of IO operations, the first $O(\log M)$ layers of mergesort are free, and there are only $O(\log_2 \frac{N}{B})$ non-zero-cost layers, each mergeable in $O(\frac{N}{B})$ IOPS in total. This brings total I/O complexity to +This effectively means that, in terms of IO operations, the first $O(\log M)$ layers of mergesort are free, and there are only $O(\log_2 \frac{N}{M})$ non-zero-cost layers, each mergeable in $O(\frac{N}{B})$ IOPS in total. 
This brings total I/O complexity to $$ O\left(\frac{N}{B} \log_2 \frac{N}{M}\right) From 352688cf794a64c9c1f76f53523c9fbd49fd4855 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 14 Mar 2022 16:47:55 +0300 Subject: [PATCH 331/531] fix checking if a point is inside --- content/russian/cs/geometry-basic/polygons.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/russian/cs/geometry-basic/polygons.md b/content/russian/cs/geometry-basic/polygons.md index 7537e591..e0a3c5e7 100644 --- a/content/russian/cs/geometry-basic/polygons.md +++ b/content/russian/cs/geometry-basic/polygons.md @@ -80,7 +80,7 @@ $$ In the more general case, there are two popular approaches, both running in $O(n)$. -The first one is based on summing angles. Go through all the vertices in traversal order and consider, one after another, the angles with the vertex at point $P$ and rays passing through adjacent vertices of the polygon. If we sum these oriented angles, we get some value $\theta$. If the point $P$ lies inside the polygon, then $\theta = \pm 2 \theta$, otherwise $\theta = 0$. +The first one is based on summing angles. Go through all the vertices in traversal order and consider, one after another, the angles with the vertex at point $P$ and rays passing through adjacent vertices of the polygon. If we sum these oriented angles, we get some value $\theta$. If the point $P$ lies inside the polygon, then $\theta = \pm 2 \pi$, otherwise $\theta = 0$. The second one counts how many times a ray cast from $P$ crosses the edges of the polygon.
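Note on patch 331: the corrected statement ($\theta = \pm 2\pi$ inside, $\theta = 0$ outside) can be checked with a small C++ sketch of the angle-summation test. This is not code from the repository. The names `Point`, `angle_sum`, and `inside` are made up for illustration, and the sketch assumes a simple polygon and a query point that does not lie on its boundary.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical point type used only for this illustration.
struct Point { double x, y; };

// Sum of the oriented angles at p between rays to consecutive vertices.
// Assumes p is not on the boundary of the polygon.
double angle_sum(const std::vector<Point>& poly, Point p) {
    double theta = 0;
    std::size_t n = poly.size();
    for (std::size_t i = 0; i < n; i++) {
        Point a = poly[i], b = poly[(i + 1) % n];
        double ax = a.x - p.x, ay = a.y - p.y;
        double bx = b.x - p.x, by = b.y - p.y;
        double cross = ax * by - ay * bx; // proportional to sin of the angle
        double dot = ax * bx + ay * by;   // proportional to cos of the angle
        theta += std::atan2(cross, dot);  // oriented angle in (-pi, pi]
    }
    return theta;
}

// The sum is close to +-2*pi for an inner point and close to 0 for an outer
// one, so comparing |theta| against pi tolerates floating-point error.
bool inside(const std::vector<Point>& poly, Point p) {
    const double pi = std::acos(-1.0);
    return std::fabs(angle_sum(poly, p)) > pi;
}
```

For a square with corners at (0, 0) and (2, 2), each edge subtends a right angle from the center, so the angles add up to a full turn, while from a point to the right of the square the positive and negative contributions cancel.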
From 37a5ba70ce32828e16335d35e6c1f46979a50ed5 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 15 Mar 2022 20:25:04 +0300 Subject: [PATCH 332/531] tsp on a plane problem --- content/russian/cs/programming/bayans.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/content/russian/cs/programming/bayans.md b/content/russian/cs/programming/bayans.md index 7d8d773b..d35880cc 100644 --- a/content/russian/cs/programming/bayans.md +++ b/content/russian/cs/programming/bayans.md @@ -302,3 +302,15 @@ def query(y): ``` Your task is to guess the number using at most 10000 attempts. + +## Traveling Salesman + +You are given $3 \cdot 10^5$ points on a plane. Choose any subset of 500 of them and solve the traveling salesman problem for it: find the shortest cycle that passes through all of these points. + + From c55830c6b439f2f8cb0e00ed2dd5a397330b5cc4 Mon Sep 17 00:00:00 2001 From: song-jx <79297685+song-jx@users.noreply.github.com> Date: Thu, 17 Mar 2022 14:40:57 +0800 Subject: [PATCH 333/531] Changed incorrect parameter "-mpopcount" to "-mpopcnt". --- content/english/hpc/compilation/flags.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/compilation/flags.md b/content/english/hpc/compilation/flags.md index 08e83341..1f70a622 100644 --- a/content/english/hpc/compilation/flags.md +++ b/content/english/hpc/compilation/flags.md @@ -1,6 +1,7 @@ --- title: Flags and Targets weight: 2 +published: true --- The first step of getting high performance from the compiler is to ask for it, which is done with over a hundred different compiler options, attributes, and pragmas. @@ -21,7 +22,7 @@ There are also many other optimization flags that are not included even in `-Ofa The next thing you may want to do is to tell the compiler more about the computer(s) this code is supposed to be run on: the smaller the set of platforms is, the better. By default, it will generate binaries that can run on any relatively new (>2000) x86 CPU.
The simplest way to narrow it down is to pass `-march` flag to specify the exact microarchitecture: `-march=haswell`. If you are compiling on the same computer that will run the binary, you can use `-march=native` for auto-detection. -The instruction sets are generally backward-compatible, so it is often enough to just use the name of the oldest microarchitecture you need to support. A more robust approach is to list specific features that the CPU is guaranteed to have: `-mavx2`, `-mpopcount`. When you just want to *tune* the program for a particular machine without using any instructions that may crash it on incompatible CPUs, you can use the `-mtune` flag (by default `-march=x` also implies `-mtune=x`). +The instruction sets are generally backward-compatible, so it is often enough to just use the name of the oldest microarchitecture you need to support. A more robust approach is to list specific features that the CPU is guaranteed to have: `-mavx2`, `-mpopcnt`. When you just want to *tune* the program for a particular machine without using any instructions that may crash it on incompatible CPUs, you can use the `-mtune` flag (by default `-march=x` also implies `-mtune=x`). These options can also be specified for a compilation unit with pragmas instead of compilation flags: From fdeab7545a053c4fa7c9068513f0b1d29cf64626 Mon Sep 17 00:00:00 2001 From: song-jx <79297685+song-jx@users.noreply.github.com> Date: Thu, 17 Mar 2022 14:52:22 +0800 Subject: [PATCH 334/531] Changed incorrect instruction "jump" to "jmp". 
--- content/english/hpc/architecture/functions.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/architecture/functions.md b/content/english/hpc/architecture/functions.md index ec8631f0..0edee3f6 100644 --- a/content/english/hpc/architecture/functions.md +++ b/content/english/hpc/architecture/functions.md @@ -1,6 +1,7 @@ --- title: Functions and Recursion weight: 3 +published: true --- To "call a function" in assembly, you need to [jump](../loops) to its beginning and then jump back. But then two important problems arise: @@ -55,11 +56,11 @@ add rsp, 8 ; "call func" push rip ; <- instruction pointer (although accessing it like that is probably illegal) -jump func +jmp func ; "ret" pop rcx ; <- choose any unused register -jump rcx +jmp rcx ``` The memory region between `rbp` and `rsp` is called a *stack frame*, and this is where local variables of functions are typically stored. It is pre-allocated at the start of the program, and if you push more data on the stack than its capacity (8MB by default on Linux), you encounter a *stack overflow* error. Because modern operating systems don't actually give you memory pages until you read or write to their address space, you can freely specify a very large stack size, which acts more like a limit on how much stack memory can be used, and not a fixed amount every program has to use. From 8e516998e679c068169f9d2a370a0ef96555acf4 Mon Sep 17 00:00:00 2001 From: song-jx <79297685+song-jx@users.noreply.github.com> Date: Thu, 17 Mar 2022 14:53:10 +0800 Subject: [PATCH 335/531] Changed incorrect instruction "jump" to "jmp". 
--- content/english/hpc/architecture/layout.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/architecture/layout.md b/content/english/hpc/architecture/layout.md index 1ab39c82..11735951 100644 --- a/content/english/hpc/architecture/layout.md +++ b/content/english/hpc/architecture/layout.md @@ -1,6 +1,7 @@ --- title: Machine Code Layout weight: 10 +published: true --- Computer engineers like to mentally split the [pipeline of a CPU](/hpc/pipelining) into two parts: the *front-end*, where instructions are fetched from memory and decoded, and the *back-end*, where they are scheduled and finally executed. Typically, the performance is bottlenecked by the execution stage, and for this reason, most of our efforts in this book are going to be spent towards optimizing around the back-end. @@ -126,7 +127,7 @@ normal: ret swap: xchg edi, esi - jump normal + jmp normal ``` This technique is quite handy when handling exceptions cases in general, and in high-level code, you can give the compiler a [hint](/hpc/compilation/situational) that a certain branch is more likely than the other: From 4cede62f8d71dede3b3b9a5f31293dff2f651d8f Mon Sep 17 00:00:00 2001 From: Gianluca Della Vedova Date: Thu, 17 Mar 2022 09:59:58 +0100 Subject: [PATCH 336/531] Change type inside mod_power_of_two I find puzzling to use a float for a value that should be an int (a remainder is always an int) --- content/english/hpc/compilation/contracts.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/compilation/contracts.md b/content/english/hpc/compilation/contracts.md index cedf20dd..2337ddcc 100644 --- a/content/english/hpc/compilation/contracts.md +++ b/content/english/hpc/compilation/contracts.md @@ -196,7 +196,7 @@ int mod_power_of_two(int x, int m) [[ expects: is_power_of_two(m) ]] [[ ensures r: r >= 0 && r < m ]] { - float r = x & (m - 1); + int r = x & (m - 1); [[ assert: r = x % m ]]; return r; } From 
c0bd68c53c8c3de8cd842faf258e010c107c52d0 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 19 Mar 2022 03:45:32 +0300 Subject: [PATCH 337/531] note on storing last elemens in S+ tree --- content/english/hpc/data-structures/s-tree.md | 1 + 1 file changed, 1 insertion(+) diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index 35528c7a..ad9fc2ef 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -548,6 +548,7 @@ Other possible minor optimizations include: - Rewriting the whole thing in assembly, as the compiler seems to struggle with pointer arithmetic. - Using [blending](/hpc/simd/masking) instead of `packs`: you can odd-even shuffle node keys (`[1 3 5 7] [2 4 6 8]`), compare against the search key, and then blend the low 16 bits of the first register mask with the high 16 bits of the second. Blending is slightly faster on many architectures, and it may also help to alternate between packing and blending as they use different subsets of ports. (Thanks to Const-me from HackerNews for [suggesting](https://news.ycombinator.com/item?id=30381912) it.) - Using [popcount](/hpc/simd/shuffling/#shuffles-and-popcount) instead of `tzcnt`: the index `i` is equal to the number of keys less than `x`, so we can compare `x` against all keys, combine the vector mask any way we want, call `maskmov`, and then calculate the number of set bits with `popcnt`. This removes the need to store the keys in any particular order, which lets us skip the permutation step and also use this procedure on the last layer as well. +- Defining the key $i$ as the *maximum* key in the subtree of child $i$ instead of the *minimum* key in the subtree of child $(i + 1)$. The correctness doesn't change, but this guarantees that the result will be stored in the last node we access (and not in the first element of the next neighbor node), which lets us fetch slightly fewer cache lines. 
Note that the current implementation is specific to AVX2 and may require some non-trivial changes to adapt to other platforms. It would be interesting to port it for Intel CPUs with AVX-512 and Arm CPUs with 128-bit NEON, which may require some [trickery](https://github.com/WebAssembly/simd/issues/131) to work. From d308ffd213bb7e3a89ab444fc7915789c12495ba Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 20 Mar 2022 13:41:13 +0300 Subject: [PATCH 338/531] fix link --- content/english/hpc/architecture/functions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/architecture/functions.md b/content/english/hpc/architecture/functions.md index 0edee3f6..24ab9898 100644 --- a/content/english/hpc/architecture/functions.md +++ b/content/english/hpc/architecture/functions.md @@ -94,7 +94,7 @@ Note that the data in the stack is written top-to-bottom. This is just a convent ### Calling Conventions -The people who develop compilers and operating systems eventually came up with [conventions](https://wiki.osdev.org/Calling_Conventions) on how to write and call functions. These conventions enable some important [software engineering marvels](/hpc/compilation/linking/) such as splitting compilation into separate units, re-using already compiled libraries, and even writing them in different programming languages. +The people who develop compilers and operating systems eventually came up with [conventions](https://wiki.osdev.org/Calling_Conventions) on how to write and call functions. These conventions enable some important [software engineering marvels](/hpc/compilation/stages/) such as splitting compilation into separate units, re-using already compiled libraries, and even writing them in different programming languages. 
Consider the following example in C: From c78da5a8fab95cc31500e1b38039fc447590060f Mon Sep 17 00:00:00 2001 From: Lukas Barth Date: Thu, 24 Mar 2022 16:01:22 +0100 Subject: [PATCH 339/531] Fix wrong hazards A (conditional) branch is a control hazard, not a structural hazard. --- content/english/hpc/pipelining/branchless.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index 160d455a..dd2df164 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -93,7 +93,7 @@ This way you can eliminate branching, but this comes at the cost of evaluating * ### When It Is Beneficial -Using predication eliminates [a structural hazard](../hazard) but introduces a data hazard. There is still a pipeline stall, but it is a cheaper one: you only need to wait for `cmov` to be resolved and not flush the entire pipeline in case of a mispredict. +Using predication eliminates [a control hazard](../hazard) but introduces a data hazard. There is still a pipeline stall, but it is a cheaper one: you only need to wait for `cmov` to be resolved and not flush the entire pipeline in case of a mispredict. However, there are many situations when it is more efficient to leave branchy code as it is. This is the case when the cost of computing *both* branches instead of just *one* outweighs the penalty for the potential branch mispredictions. 
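Note on patch 339: the control-versus-data hazard distinction can be illustrated with two versions of the same conditional sum. This is a minimal sketch, not code from the patch. The function names are hypothetical, and whether the second version actually compiles to `cmov` (or gets vectorized) depends on the compiler and flags.

```cpp
#include <cstdint>
#include <vector>

// Branchy version: the CPU has to predict the outcome of (x < threshold),
// and a misprediction flushes the pipeline (a control hazard).
int64_t sum_branchy(const std::vector<int>& a, int threshold) {
    int64_t s = 0;
    for (int x : a)
        if (x < threshold)
            s += x;
    return s;
}

// Predicated version: both alternatives (x and 0) are computed and one of
// them is selected, so the addition only waits for the selection result
// (a data hazard) instead of depending on a branch prediction.
int64_t sum_predicated(const std::vector<int>& a, int threshold) {
    int64_t s = 0;
    for (int x : a)
        s += (x < threshold ? x : 0);
    return s;
}
```

Both functions return the same value. The trade-off described in the hunk is that the predicated form always pays for evaluating both alternatives, which is cheap here but may not be when the branches are expensive.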
From c2de63fb53fe97c480af21dffb9fd9e5c8d252ea Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 24 Mar 2022 18:40:33 +0300 Subject: [PATCH 340/531] fix at&t syntax example --- content/english/hpc/architecture/assembly.md | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/content/english/hpc/architecture/assembly.md b/content/english/hpc/architecture/assembly.md index 325b2962..013d2987 100644 --- a/content/english/hpc/architecture/assembly.md +++ b/content/english/hpc/architecture/assembly.md @@ -117,20 +117,18 @@ There are actually multiple *assemblers* (the programs that produce machine code These syntaxes are also sometimes called *GAS* and *NASM* respectively, by the names of the two primary assemblers that use them (*GNU Assembler* and *Netwide Assembler*). -We used Intel syntax in this chapter and will continue to preferably use it for the rest of the book. For comparison, here is what the summation loop looks like in AT&T asm: +We used Intel syntax in this chapter and will continue to preferably use it for the rest of the book. For comparison, here is how the same `*c = *a + *b` example looks like in AT&T asm: ```asm -loop: - addl (%rax), %edx - addq $4, %rax - cmpq %rcx, %rax - jne loop +movl (%rsi), %eax +addl (%rdi), %eax +movl %eax, (%rdx) ``` The key differences can be summarized as follows: 1. The *last* operand is used to specify the destination. -2. Register names and constants need to be prefixed by `%` and `$` respectively. +2. Registers and constants need to be prefixed by `%` and `$` respectively (e. g. `addl $1, %rdx` increments `rdx`). 3. Memory addressing looks like this: `displacement(%base, %index, scale)`. 4. Both `;` and `#` can be used for line comments, and also `/* */` can be used for block comments. 
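Note on patch 340: for reference, a C++ function that could plausibly correspond to the three-instruction AT&T listing is sketched below. It assumes the System V AMD64 calling convention, under which the first three pointer arguments arrive in `rdi`, `rsi`, and `rdx`. The function name `add` and the parameter order are assumptions, not something the patch states.

```cpp
// Hypothetical source for the listing: a is in rdi, b in rsi, c in rdx.
void add(const int *a, const int *b, int *c) {
    // movl (%rsi), %eax   loads *b
    // addl (%rdi), %eax   adds *a
    // movl %eax, (%rdx)   stores the sum into *c
    *c = *a + *b;
}
```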
From 34b02ef5c8d40285a4f7d0ec705a44ea808d3c8b Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 24 Mar 2022 22:26:33 +0300 Subject: [PATCH 341/531] b-tree article outline --- content/english/hpc/data-structures/b-tree.md | 222 ++- .../hpc/data-structures/img/btree-absl.svg | 1342 +++++++++++++++ .../data-structures/img/btree-absolute.svg | 1462 +++++++++++++++++ .../data-structures/img/btree-relative.svg | 1424 ++++++++++++++++ .../hpc/data-structures/segment-trees.md | 2 +- 5 files changed, 4450 insertions(+), 2 deletions(-) create mode 100644 content/english/hpc/data-structures/img/btree-absl.svg create mode 100644 content/english/hpc/data-structures/img/btree-absolute.svg create mode 100644 content/english/hpc/data-structures/img/btree-relative.svg diff --git a/content/english/hpc/data-structures/b-tree.md b/content/english/hpc/data-structures/b-tree.md index 25440bd0..9f7d5b6c 100644 --- a/content/english/hpc/data-structures/b-tree.md +++ b/content/english/hpc/data-structures/b-tree.md @@ -1,7 +1,227 @@ --- title: Search Trees -weight: 4 +weight: 3 draft: true --- +In the [previous article](../s-tree), we designed *static* B-trees (*S-trees*), and we [briefly discussed](../s-tree/#as-a-dynamic-tree) how to turn them *dynamic* while retaining performance gains from [SIMD](/hpc/simd). + +In this article + +The problem is multi-dimensional. + +Of course, this comparison is not fair, as implementing a dynamic search tree is a more high-dimensional problem. + +We’d also need to implement the update operation, which will not be that efficient, and for which we’d need to sacrifice the fanout factor. But it still seems possible to implement a 10-20x faster std::set and a 3-5x faster absl::btree_set, depending on how you define “faster” — and this is one of the things we’ll attempt to do next. 
+ +Static as + +![](../img/btree-absolute.svg) + +![](../img/btree-relative.svg) + +![](../img/btree-absl.svg) + +When the data set is small, the latency increases in discrete steps: 3.5ns for under 32 elements, 6.5ns, and to 12ns, until it hits the L2 cache (not shown on graphs) and starts increasing more smoothly yet still with noticeable spikes when the tree grows upwards. + +One interesting use case is rope, also known as cord, which is used for wrapping strings in a tree to support mass operations. For example, editing a very large text file. Which is the topic. + +It is common that >90% of operations are lookups. Optimizing searches is important because every other operation starts with locating a key. + +I don't know (yet) why insertions are *that* slow. My guess is that it has something to do with data dependencies between queries. + +I apologize to everyone else, but this is sort of your fault for not using a public benchmark. + +## B− Tree + +[B+ tree](../s-tree/#b-tree-layout-1). + +B− ("B minus") tree. The difference is: + +- We are specifically storing the *last* element. This is needed +- We use a small node size $B=32$. This is needed simd to be efficient (we will discuss other node sizes in the future) +- We don't store any pointers except for the children (while B+ stores one pointer for the next leaf node). + +The difference is that + +### Layout + +To simplify memory all with infinities. + +```c++ +const int B = 32; +int H = 1; // tree height + +const int R = 1e8; // reserve + +alignas(64) int tree[R]; +int n_tree = B; // 31 (+ 1) + 32 for internal nodes and 31 for data nodes +int root = 0; + +for (int i = 0; i < R; i++) + tree[i] = INT_MAX; +``` + +To "allocate" a new node, we simply increase `n_tree` by $B$ if it is a data node or by $2 \cdot B$ if it is an internal node. 
+ +### Searching + +```c++ +typedef __m256i reg; + +reg cmp(reg x, int *node) { + reg y = _mm256_load_si256((reg*) node); + return _mm256_cmpgt_epi32(x, y); +} + +unsigned rank32(reg x, int *node) { + reg m1 = cmp(x, node); + reg m2 = cmp(x, node + 8); + reg m3 = cmp(x, node + 16); + reg m4 = cmp(x, node + 24); + + m1 = _mm256_blend_epi16(m1, m2, 0b01010101); + m3 = _mm256_blend_epi16(m3, m4, 0b01010101); + m1 = _mm256_packs_epi16(m1, m3); + + unsigned mask = _mm256_movemask_epi8(m1); + return __builtin_popcount(mask); +} +``` + +```c++ +int lower_bound(int _x) { + //std::cerr << std::endl << "lb " << _x << std::endl; + unsigned k = root; + reg x = _mm256_set1_epi32(_x); + + for (int h = 0; h < H - 1; h++) { + unsigned i = rank32(x, &tree[k]); + k = tree[k + B + i]; + } + + unsigned i = rank32(x, &tree[k]); + + return tree[k + i]; // what if next block? maybe we store 31 elements? +} +``` + +### Insertions + +```c++ +struct Precalc { + alignas(64) int mask[B][B]; + + constexpr Precalc() : mask{} { + for (int i = 0; i < B; i++) + for (int j = i; j < B - 1; j++) + mask[i][j] = -1; + } +}; + +constexpr Precalc P; +``` + +```c++ +void insert(int *node, int i, int x) { + for (int j = B - 8; j >= 0; j -= 8) { + reg t = _mm256_load_si256((reg*) &node[j]); + reg mask = _mm256_load_si256((reg*) &P.mask[i][j]); + _mm256_maskstore_epi32(&node[j + 1], mask, t); + } + node[i] = x; +} + +// move the second half of a node and fill it with infinities +void move(int *from, int *to) { + const reg infs = _mm256_set1_epi32(INT_MAX); + for (int i = 0; i < B / 2; i += 8) { + reg t = _mm256_load_si256((reg*) &from[B / 2 + i]); + _mm256_store_si256((reg*) &to[i], t); + _mm256_store_si256((reg*) &from[B / 2 + i], infs); // probably not necessary for pointers + } +} +``` + +```c++ +void insert(int _x) { + unsigned sk[20], si[20]; + + unsigned k = root; + reg x = _mm256_set1_epi32(_x); + + for (int h = 0; h < H - 1; h++) { + unsigned i = rank32(x, &tree[k]); + sk[h] = k, si[h] = i; + k = 
tree[k + B + i]; + } + + unsigned i = rank32(x, &tree[k]); + + bool filled = (tree[k + B - 2] != INT_MAX); + bool updated = (tree[k + i] == INT_MAX); + + insert(tree + k, i, _x); + + if (updated) { + for (int h = H - 2; h >= 0; h--) { + int idx = sk[h] + si[h]; + tree[idx] = (tree[idx] < _x ? _x : tree[idx]); + } + } + + if (filled) { + // create a new leaf node + move(tree + k, tree + n_tree); + + int v = tree[k + B / 2 - 1]; // new key to be inserted + int p = n_tree; // pointer to the newly created node + + n_tree += B; + + for (int h = H - 2; h >= 0; h--) { + k = sk[h], i = si[h]; + + filled = (tree[k + B - 3] != INT_MAX); + + // the node already has a correct key (right one) and a correct pointer (left one) + insert(tree + k, i, v); + insert(tree + k + B, i + 1, p); + + if (!filled) + return; + + // create a new internal node + move(tree + k, tree + n_tree); // move keys + move(tree + k + B, tree + n_tree + B); // move pointers + + v = tree[k + B / 2 - 1]; + tree[k + B / 2 - 1] = INT_MAX; + + p = n_tree; + n_tree += 2 * B; + } + + if (filled) { + // tree grows + + tree[n_tree] = v; + + tree[n_tree + B] = root; + tree[n_tree + B + 1] = p; + + root = n_tree; + n_tree += 2 * B; + H++; + } + } +} +``` + +## Optimizations + ... + +## Acknowledgements + +Thanks to Danila Kutenin for meaningful discussions of applicability. 
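Note on patch 341: the SIMD primitives `rank32` and the node-level `insert` in the draft can be cross-checked against plain scalar versions. The sketch below is one reading of what they compute, assuming each node holds `B = 32` keys padded with `INT_MAX`. The names `rank_scalar` and `insert_scalar` are hypothetical and do not appear in the patch.

```cpp
#include <climits>

const int B = 32; // node size, as in the draft

// Scalar counterpart of rank32: the number of keys in the node that are
// strictly less than x (the draft compares with cmpgt and then popcounts).
unsigned rank_scalar(int x, const int *node) {
    unsigned r = 0;
    for (int i = 0; i < B; i++)
        r += (node[i] < x);
    return r;
}

// Scalar counterpart of the masked-store insert: shift the keys at
// positions i..B-2 one slot to the right and write x into slot i.
// The last slot is assumed to hold INT_MAX padding and is overwritten.
void insert_scalar(int *node, int i, int x) {
    for (int j = B - 1; j > i; j--)
        node[j] = node[j - 1];
    node[i] = x;
}
```

Inserting a key into a leaf is then `insert_scalar(node, rank_scalar(x, node), x)`, which keeps the node sorted as long as it is not full. Splitting and parent updates are left out here.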
diff --git a/content/english/hpc/data-structures/img/btree-absl.svg b/content/english/hpc/data-structures/img/btree-absl.svg new file mode 100644 index 00000000..1096c24f --- /dev/null +++ b/content/english/hpc/data-structures/img/btree-absl.svg @@ -0,0 +1,1342 @@ [SVG markup omitted]
diff --git a/content/english/hpc/data-structures/img/btree-absolute.svg b/content/english/hpc/data-structures/img/btree-absolute.svg new file mode 100644 index 00000000..9be62391 --- /dev/null +++ b/content/english/hpc/data-structures/img/btree-absolute.svg @@ -0,0 +1,1462 @@ [SVG markup omitted]
diff --git a/content/english/hpc/data-structures/img/btree-relative.svg b/content/english/hpc/data-structures/img/btree-relative.svg new file mode 100644 index 00000000..b80dc6f3 --- /dev/null +++ b/content/english/hpc/data-structures/img/btree-relative.svg @@ -0,0 +1,1424 @@ [SVG markup omitted]
diff --git a/content/english/hpc/data-structures/segment-trees.md
b/content/english/hpc/data-structures/segment-trees.md index 455ca264..a61dcd71 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -1,6 +1,6 @@ --- title: Segment Trees -weight: 3 +weight: 4 --- The lessons learned from [optimizing](../s-tree) [binary search](../binary-search) can be applied to a broad range of data structures. From 12bd96479ed9efb7bce4c70295d65a38eaa6d255 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 25 Mar 2022 14:53:59 +0300 Subject: [PATCH 342/531] b-tree draft --- content/english/hpc/data-structures/b-tree.md | 243 +++-- .../hpc/data-structures/img/btree-absl.svg | 388 ++++---- .../data-structures/img/btree-absolute.svg | 746 +++++++-------- .../data-structures/img/btree-relative.svg | 885 ++++++++++-------- 4 files changed, 1215 insertions(+), 1047 deletions(-) diff --git a/content/english/hpc/data-structures/b-tree.md b/content/english/hpc/data-structures/b-tree.md index 9f7d5b6c..aed26981 100644 --- a/content/english/hpc/data-structures/b-tree.md +++ b/content/english/hpc/data-structures/b-tree.md @@ -4,68 +4,63 @@ weight: 3 draft: true --- -In the [previous article](../s-tree), we designed *static* B-trees (*S-trees*), and we [briefly discussed](../s-tree/#as-a-dynamic-tree) how to turn them *dynamic* while retaining performance gains from [SIMD](/hpc/simd). +In the [previous article](../s-tree), we designed *static* B-trees to speed up binary searching in sorted arrays, designing S-tree and S+ Tree. In the last section we [briefly discussed](../s-tree/#as-a-dynamic-tree) how to turn them *dynamic* while retaining performance gains from [SIMD](/hpc/simd), making a proof-of-concept. Simply adding pointers to S+ tree. 
-In this article +In this section, we follow up on that promise and design a minimally functional search tree for integer keys, called *B− tree*, that achieves significant improvements over [improvements](#evaluation): 7-18 times faster for large arrays and 3-8 faster for inserts. The [absl::btree](https://abseil.io/blog/20190812-btree) 3-7 times faster for searches and 1.5-2 times faster for with yet ample room for improvement. -The problem is multi-dimensional. +The memory overhead of the structure around 30%. The [final implementation](https://github.com/sslotin/amh-code/blob/main/b-tree/btree-final.cc) is around 150 lines of C. -Of course, this comparison is not fair, as implementing a dynamic search tree is a more high-dimensional problem. - -We’d also need to implement the update operation, which will not be that efficient, and for which we’d need to sacrifice the fanout factor. But it still seems possible to implement a 10-20x faster std::set and a 3-5x faster absl::btree_set, depending on how you define “faster” — and this is one of the things we’ll attempt to do next. - -Static as - -![](../img/btree-absolute.svg) - -![](../img/btree-relative.svg) - -![](../img/btree-absl.svg) - -When the data set is small, the latency increases in discrete steps: 3.5ns for under 32 elements, 6.5ns, and to 12ns, until it hits the L2 cache (not shown on graphs) and starts increasing more smoothly yet still with noticeable spikes when the tree grows upwards. - -One interesting use case is rope, also known as cord, which is used for wrapping strings in a tree to support mass operations. For example, editing a very large text file. Which is the topic. - -It is common that >90% of operations are lookups. Optimizing searches is important because every other operation starts with locating a key. - -I don't know (yet) why insertions are *that* slow. My guess is that it has something to do with data dependencies between queries. 
- -I apologize to everyone else, but this is sort of your fault for not using a public benchmark. +We give more details in th evaluation section. ## B− Tree -[B+ tree](../s-tree/#b-tree-layout-1). +Instead of making small incremental changes, we will design just one data structure in this article, which is based on [B+ tree](../s-tree/#b-tree-layout-1) with a few minor exceptions: -B− ("B minus") tree. The difference is: +- We do not store any pointers except for the children (while B+ stores one pointer for the next leaf node). +- We define key $i$ as the *maximum* key in the subtree of child $i$ instead of the *minimum* key in the subtree of child $(i + 1)$. This removes the need. +- We use a small node size $B=32$. This is needed simd to be efficient (we will discuss other node sizes later). -- We are specifically storing the *last* element. This is needed -- We use a small node size $B=32$. This is needed simd to be efficient (we will discuss other node sizes in the future) -- We don't store any pointers except for the children (while B+ stores one pointer for the next leaf node). +There is some overhead, so it makes sense to use more than one cache line. -The difference is that +Analogous to the B+ tree, we call this modification *B− tree*. ### Layout -To simplify memory all with infinities. +We rely on arena allocation. ```c++ -const int B = 32; -int H = 1; // tree height +const int B = 32; // node size const int R = 1e8; // reserve - alignas(64) int tree[R]; -int n_tree = B; // 31 (+ 1) + 32 for internal nodes and 31 for data nodes -int root = 0; +int root = 0; // where the tree root starts +int n_tree = B; +int H = 1; // tree height +``` + +To further simplify the implementation, we set all array cells with infinities: + +```c++ for (int i = 0; i < R; i++) tree[i] = INT_MAX; ``` +We can do this — this does not affect performance. Memory allocation and initialization is not the bottleneck. 
+ +To save precious cache space, we use [indices instead of pointers](/hpc/cpu-cache/pointers/). +Despite that they are in separate cache lines, it still [makes sense](/hpc/cpu-cache/aos-soa/) to store them close to keys. + +This way, leaf nodes occupy 2 cache lines and waste 1 slot, while internal nodes occupy 4 cache lines and waste 2+1=3 slots. + To "allocate" a new node, we simply increase `n_tree` by $B$ if it is a data node or by $2 \cdot B$ if it is an internal node. ### Searching +We used permutations when we implemented [S-trees](../s-tree/#optimization). Storing values in permuted order will make inserts much harder, so we change the approach. + +Using popcount instead of tzcnt: the index i is equal to the number of keys less than x, so we can compare x against all keys, combine the vector mask any way we want, call maskmov, and then calculate the number of set bits with popcnt. This removes the need to store the keys in any particular order, which lets us skip the permutation step and also use this procedure on the last layer as well. + ```c++ typedef __m256i reg; @@ -89,9 +84,12 @@ unsigned rank32(reg x, int *node) { } ``` +This is also the reason why the "key area" in the nodes should not be contaminated and only store keys padded with infinities — or masked out. + +To implement `lower_bound`, we just use the same procedure, but fetch the pointer after we computed the child number: + ```c++ int lower_bound(int _x) { - //std::cerr << std::endl << "lb " << _x << std::endl; unsigned k = root; reg x = _mm256_set1_epi32(_x); @@ -106,7 +104,17 @@ int lower_bound(int _x) { } ``` -### Insertions +Implementing `lower_bound` is easy, and it doesn't introduce much overhead. The hard part is to implement insertion. + +### Insertion + +Insertion needs a lot of logic, but the good news is that it does not have to be executed frequently. 
+ +Most of the time, all we need is to reach a leaf node and then insert a key into it, moving some other keys one position to the right. + +Occasionally, we also need to split the node and/or update some parents, but this is relatively rare, so let's focus on the most common part. + +To insert efficiently. ```c++ struct Precalc { @@ -131,45 +139,49 @@ void insert(int *node, int i, int x) { } node[i] = x; } +``` + +Next, let's try. To split a node, we need. So let's write another primitive: +```c++ // move the second half of a node and fill it with infinities void move(int *from, int *to) { const reg infs = _mm256_set1_epi32(INT_MAX); for (int i = 0; i < B / 2; i += 8) { reg t = _mm256_load_si256((reg*) &from[B / 2 + i]); _mm256_store_si256((reg*) &to[i], t); - _mm256_store_si256((reg*) &from[B / 2 + i], infs); // probably not necessary for pointers + _mm256_store_si256((reg*) &from[B / 2 + i], infs); } } ``` +Now we need to (very carefully) + ```c++ void insert(int _x) { - unsigned sk[20], si[20]; + // we save the path we visited in case we need to update some of our ancestors + unsigned sk[10], si[10]; unsigned k = root; reg x = _mm256_set1_epi32(_x); for (int h = 0; h < H - 1; h++) { unsigned i = rank32(x, &tree[k]); - sk[h] = k, si[h] = i; + + // check if we need to update the key right away + tree[k + i] = (_x > tree[k + i] ? _x : tree[k + i]); + sk[h] = k, si[h] = i; // and save the path + k = tree[k + B + i]; } unsigned i = rank32(x, &tree[k]); + // we can start computing this check ahead of insertion bool filled = (tree[k + B - 2] != INT_MAX); - bool updated = (tree[k + i] == INT_MAX); insert(tree + k, i, _x); - if (updated) { - for (int h = H - 2; h >= 0; h--) { - int idx = sk[h] + si[h]; - tree[idx] = (tree[idx] < _x ? 
_x : tree[idx]); - } - } - if (filled) { // create a new leaf node move(tree + k, tree + n_tree); @@ -180,16 +192,19 @@ void insert(int _x) { n_tree += B; for (int h = H - 2; h >= 0; h--) { + // for each parent node we repeat this process + // until we reach the root of determine that the node is not split k = sk[h], i = si[h]; filled = (tree[k + B - 3] != INT_MAX); - // the node already has a correct key (right one) and a correct pointer (left one) + // the node already has a correct key (right one) + // and a correct pointer (left one) insert(tree + k, i, v); insert(tree + k + B, i + 1, p); if (!filled) - return; + return; // we're done // create a new internal node move(tree + k, tree + n_tree); // move keys @@ -202,26 +217,128 @@ void insert(int _x) { n_tree += 2 * B; } - if (filled) { - // tree grows + // if we've reached here, this means we've reached the root, and it was split + tree[n_tree] = v; - tree[n_tree] = v; + tree[n_tree + B] = root; + tree[n_tree + B + 1] = p; - tree[n_tree + B] = root; - tree[n_tree + B + 1] = p; + root = n_tree; + n_tree += 2 * B; + H++; + } +} +``` - root = n_tree; - n_tree += 2 * B; - H++; - } +There are many inefficiencies, but luckily they are rarely called. + +## Evaluation + +We need to implement `insert` and `lower_bound`. Deletions, iteration, and other things are not our concern for now. + +Of course, this comparison is not fair, as implementing a dynamic search tree is a more high-dimensional problem. + +Technically, we use `std::multiset` and `absl::btree_multiset` to support repeated keys. + +Keys are uniform, but we should not rely on that fact (e. g. using ). + +It is common that >90% of operations are lookups. Optimizing searches is important because every other operation starts with locating a key. + +(a different set each time) + +We use different points between $10^4$ and $10^7$ in (arount 250 in total). After, we use $10^6$ queries (independently random each time). 
All data is generated uniformly in the range $[0, 2^{30})$ and independent between stages. + +$1.17^k$ and $1.17^{k+1}$. + +It may or may not be representative of your use case. + +As predicted, the performance is much better: + +![](../img/btree-absolute.svg) + +When the data set is small, the latency increases in discrete steps: 3.5ns for under 32 elements, 6.5ns, and to 12ns, until it hits the L2 cache (not shown on graphs) and starts increasing more smoothly yet still with noticeable spikes when the tree grows upwards. + +![](../img/btree-relative.svg) + +I apologize to everyone else, but this is sort of your fault for not using a public benchmark. + +![](../img/btree-absl.svg) + +I don't know (yet) why insertions are *that* slow. My guess is that it has something to do with data dependencies between queries. + +### Possible Optimizations + +Maximum height was 6. + +Compile. I tried it, but couldn't get the compiler to generate optimal code. + +The idiomatic C++ way is to use virtual functions, but we will be explicit: + +```c++ +void (*insert_ptr)(int); +int (*lower_bound_ptr)(int); + +void insert(int x) { + insert_ptr(x); +} + +int lower_bound(int x) { + return lower_bound_ptr(x); +} +``` + +```c++ +template +void insert_impl(int _x) { + // ... +} + +template +void insert_impl(int _x) { + // ... + if (/* tree grows */) { + // ... + insert_ptr = &insert_impl; + lower_bound_ptr = &lower_bound_impl; } } + +template <> +void insert_impl<10>(int x) { + std::cerr << "This depth was not supposed to be reached" << std::endl; + exit(1); +} ``` -## Optimizations +```c++ +insert_ptr = &insert_impl<1>; +lower_bound_ptr = &lower_bound_impl<1>; +``` + +Recursion unrolled. + +### Other Operations + +Going to father and fetching $B$ pointers at a time is faster as it negates [pointer chasing](/hpc/cpu-cache/latency/). + +Pointer to either parent or next node. + +Stack of ancestors. 
+ +Nodes are at least ½ full (because they are created ½ full), except for the root, and, on average, ¾ full assuming random inserts. + +We can't store junk in keys. + +B* split + +If the node is at least half-full, we're done. Otherwise, we try to borrow keys from siblings (no expensive two-pointer merging is necessary: we can just append them to the end/beginning and swap key of the parent). + +If that fails, we can merge the two nodes together, and iteratively delete the key in the parent. + +One interesting use case is *rope*, also known as *cord*, which is used for wrapping strings in a tree to support mass operations. For example, editing a very large text file. Which is the topic. -... +[Skip list](https://en.wikipedia.org/wiki/Skip_list), which [some attempts to vectorize it](https://doublequan.github.io/), although it may achieve higher total throughput in concurrent setting. I have low hope that it can be improved. ## Acknowledgements -Thanks to Danila Kutenin for meaningful discussions of applicability. +Thanks to [Danila Kutenin](https://danlark.org/) from Google for meaningful discussions of applicability and possible replacement in Abseil. 
diff --git a/content/english/hpc/data-structures/img/btree-absl.svg b/content/english/hpc/data-structures/img/btree-absl.svg index 1096c24f..4ed0a949 100644 --- a/content/english/hpc/data-structures/img/btree-absl.svg +++ b/content/english/hpc/data-structures/img/btree-absl.svg
[hunks omitted: regenerated SVG plot markup, not recoverable]
diff --git a/content/english/hpc/data-structures/img/btree-absolute.svg b/content/english/hpc/data-structures/img/btree-absolute.svg index 9be62391..6709908f 100644 --- a/content/english/hpc/data-structures/img/btree-absolute.svg +++ b/content/english/hpc/data-structures/img/btree-absolute.svg
[hunks omitted: regenerated SVG plot markup, not recoverable]
diff --git a/content/english/hpc/data-structures/img/btree-relative.svg b/content/english/hpc/data-structures/img/btree-relative.svg index b80dc6f3..e40210ff 100644 --- a/content/english/hpc/data-structures/img/btree-relative.svg +++ b/content/english/hpc/data-structures/img/btree-relative.svg
[hunks omitted: regenerated SVG plot markup, not recoverable]
From 1b61ba263abf3f98c8b8a258e321527edc7f68dc Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 25 Mar 2022 18:06:26 +0300 Subject: [PATCH 343/531] b-tree intro --- content/english/hpc/data-structures/b-tree.md | 61 +++++++++++++------ 1 file changed, 44 insertions(+), 17 deletions(-) diff --git a/content/english/hpc/data-structures/b-tree.md b/content/english/hpc/data-structures/b-tree.md index aed26981..7f1cb527 100644 --- a/content/english/hpc/data-structures/b-tree.md +++ b/content/english/hpc/data-structures/b-tree.md @@ -4,36 +4,58 @@ weight: 3 draft: true --- -In the [previous article](../s-tree), we designed *static* B-trees to speed up binary searching in sorted arrays, designing S-tree and S+ Tree. In the last section we [briefly discussed](../s-tree/#as-a-dynamic-tree) how to turn them *dynamic* while retaining performance gains from [SIMD](/hpc/simd), making a proof-of-concept. Simply adding pointers to S+ tree. +In the [previous article](../s-tree), we designed and implemented *static* B-trees to speed up binary searching in sorted arrays, and in its [last section](../s-tree/#as-a-dynamic-tree), we briefly discussed how to make them *dynamic* back while retaining the performance gains from [SIMD](/hpc/simd), and validated our predictions by adding and following explicit pointers in the internal nodes of the S+ tree. -In this section, we follow up on that promise and design a minimally functional search tree for integer keys, called *B− tree*, that achieves significant improvements over [improvements](#evaluation): 7-18 times faster for large arrays and 3-8 faster for inserts. 
The [absl::btree](https://abseil.io/blog/20190812-btree) 3-7 times faster for searches and 1.5-2 times faster for with yet ample room for improvement. +In this article, we follow up on that proposition and design a minimally functional search tree for integer keys, [achieving](#evaluation) up to 18x/8x speedup over `std::set` and up to 7x/2x speedup over [`absl::btree`](https://abseil.io/blog/20190812-btree) for `lower_bound` and `insert` queries respectively, with yet ample room for improvement. -The memory overhead of the structure around 30%. The [final implementation](https://github.com/sslotin/amh-code/blob/main/b-tree/btree-final.cc) is around 150 lines of C. +The memory overhead of the structure is around 30%, and the [final implementation](https://github.com/sslotin/amh-code/blob/main/b-tree/btree-final.cc) is under 150 lines of C. -We give more details in th evaluation section. + ## B− Tree -Instead of making small incremental changes, we will design just one data structure in this article, which is based on [B+ tree](../s-tree/#b-tree-layout-1) with a few minor exceptions: +Instead of making small incremental improvements like we usually do in other case studies, in this article, we will implement just one data structure that we name *B− tree*, which is based on the [B+ tree](../s-tree/#b-tree-layout-1), with a few minor differences: + +- Nodes in the B− tree do not store any pointers or meta-information whatsoever except for the pointers to internal node children (while the B+ tree leaf nodes store a pointer to the next leaf node). This lets us perfectly place the keys in the leaf nodes on cache lines. +- We define key $i$ to be the *maximum* key in the subtree of the child $i$ instead of the *minimum* key in the subtree of the child $(i + 1)$. This lets us not fetch any other nodes after we reach a leaf (in the B+ tree, all keys in the leaf node may be less than the search key, so we need to go to the next leaf node to fetch its first element). 
+ +We also use a node size of $B=32$, which is smaller than typical. The reason why it is not $16$, which was [optimal for the S+ tree](s-tree/#modifications-and-further-optimizations), is because we have the additional overhead associated with fetching the pointer, and the benefit of reducing the tree height by ~20% outweighs the cost of processing twice the elements per node, and also because it improves the running time of the `insert` query that needs to perform a costly node split every $\frac{B}{2}$ insertions on average. + + + +### Memory Layout + +Although this is probably not the best software engineering practice, We rely on arena allocation. ```c++ -const int B = 32; // node size - const int R = 1e8; // reserve alignas(64) int tree[R]; +for (int i = 0; i < R; i++) + tree[i] = INT_MAX; +``` + +```c++ +const int B = 32; // node size + int root = 0; // where the tree root starts int n_tree = B; int H = 1; // tree height @@ -41,10 +63,7 @@ int H = 1; // tree height To further simplify the implementation, we set all array cells with infinities: -```c++ -for (int i = 0; i < R; i++) - tree[i] = INT_MAX; -``` +We need these fields to hold temporary values after insertions. We can do this — this does not affect performance. Memory allocation and initialization is not the bottleneck. @@ -55,11 +74,13 @@ This way, leaf nodes occupy 2 cache lines and waste 1 slot, while internal nodes To "allocate" a new node, we simply increase `n_tree` by $B$ if it is a data node or by $2 \cdot B$ if it is an internal node. +`std::set` needs at least 32 bytes. 3 pointers (parent, left, right), plus another 8, while we need ~5.2 *on average* and slightly over 8 in the worst case. (sequential inserts) + ### Searching We used permutations when we implemented [S-trees](../s-tree/#optimization). Storing values in permuted order will make inserts much harder, so we change the approach. 
-Using popcount instead of tzcnt: the index i is equal to the number of keys less than x, so we can compare x against all keys, combine the vector mask any way we want, call maskmov, and then calculate the number of set bits with popcnt. This removes the need to store the keys in any particular order, which lets us skip the permutation step and also use this procedure on the last layer as well. +Using popcount instead of tzcnt: the index i is equal to the number of keys less than x, so we can compare x against all keys, combine the vector mask any way we want, call `maskmov`, and then calculate the number of set bits with popcnt. This removes the need to store the keys in any particular order, which lets us skip the permutation step and also use this procedure on the last layer as well. ```c++ typedef __m256i reg; @@ -252,6 +273,8 @@ $1.17^k$ and $1.17^{k+1}$. It may or may not be representative of your use case. +[Hugepages](/hpc/cpu-cache/paging) are enabled globally for all three algorithms. + As predicted, the performance is much better: ![](../img/btree-absolute.svg) @@ -264,6 +287,8 @@ I apologize to everyone else, but this is sort of your fault for not using a pub ![](../img/btree-absl.svg) +Interestingly, B− tree wins over `absl::btree` even when it only stores one key: it takes around 5ns to figure out branch prediction, while B− tree is branchless. + I don't know (yet) why insertions are *that* slow. My guess is that it has something to do with data dependencies between queries. ### Possible Optimizations @@ -317,6 +342,8 @@ lower_bound_ptr = &lower_bound_impl<1>; Recursion unrolled. +It is possible to get rid of pointers even more. For example, for large trees, we can probably afford a small S+ tree for $16 \cdot 17$ or so elements as the root, which we rebuild from scratch on each infrequent occasion when it changes. 
+ ### Other Operations Going to father and fetching $B$ pointers at a time is faster as it negates [pointer chasing](/hpc/cpu-cache/latency/). From e4fd3076e7e677eec8f96a2ed3196cfebf71e02d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 25 Mar 2022 18:58:50 +0300 Subject: [PATCH 344/531] b-tree layout --- content/english/hpc/data-structures/b-tree.md | 45 +++++++++++-------- 1 file changed, 27 insertions(+), 18 deletions(-) diff --git a/content/english/hpc/data-structures/b-tree.md b/content/english/hpc/data-structures/b-tree.md index 7f1cb527..47f6b943 100644 --- a/content/english/hpc/data-structures/b-tree.md +++ b/content/english/hpc/data-structures/b-tree.md @@ -41,40 +41,49 @@ Analogous to the B+ tree, ### Memory Layout -Although this is probably not the best software engineering practice, - -We rely on arena allocation. +Although this is probably not the best approach in terms of software engineering, we will simply store the entire tree in a large pre-allocated array, without discriminating between leaves and internal nodes: ```c++ -const int R = 1e8; // reserve +const int R = 1e8; alignas(64) int tree[R]; +``` + +We also pre-fill this array with infinities to simplify the implementation: +```c++ for (int i = 0; i < R; i++) tree[i] = INT_MAX; ``` -```c++ -const int B = 32; // node size +(In general, it is technically cheating to compare against `std::set` or other structures that use `new` under the hood, but memory allocation and initialization are not the bottlenecks here, so this does not significantly affect the evaluation.) -int root = 0; // where the tree root starts -int n_tree = B; -int H = 1; // tree height -``` +Both nodes types store their keys sequentially in sorted order and are identified by the index of its first key in the array: + +- A leaf node has up to $(B - 1)$ keys, but is padded to $B$ elements with infinities. 
+- An internal node has up to $(B - 2)$ keys padded to $B$ elements and up to $(B - 1)$ indices of its child nodes, also padded to $B$ elements. -To further simplify the implementation, we set all array cells with infinities: +These design decisions are not arbitrary: -We need these fields to hold temporary values after insertions. +- Padding ensures that leaf nodes occupy exactly 2 cache lines and internal nodes occupy exactly 4 cache lines. +- We specifically use [indices instead of pointers](/hpc/cpu-cache/pointers/) to save cache space. +- We store indices right after the keys even though they are stored in separate cache lines because [we have reasons](/hpc/cpu-cache/aos-soa/). +- We intentionally "waste" one array cell in leaf nodes and $2+1=3$ cells in internal nodes because we need it to store temporary results during a node split. -We can do this — this does not affect performance. Memory allocation and initialization is not the bottleneck. +Initially, we only have one empty leaf node as the root: -To save precious cache space, we use [indices instead of pointers](/hpc/cpu-cache/pointers/). -Despite that they are in separate cache lines, it still [makes sense](/hpc/cpu-cache/aos-soa/) to store them close to keys. +```c++ +const int B = 32; + +int root = 0; // where the keys of the root start +int n_tree = B; // number of allocated array cells +int H = 1; // current tree height +``` -This way, leaf nodes occupy 2 cache lines and waste 1 slot, while internal nodes occupy 4 cache lines and waste 2+1=3 slots. +To "allocate" a new node, we simply increase `n_tree` by $B$ if it is a leaf node or by $2 B$ if it is an internal node. -To "allocate" a new node, we simply increase `n_tree` by $B$ if it is a data node or by $2 \cdot B$ if it is an internal node. +Since new nodes can only be created by splitting a full node, each node except for the root will be at least half full. 
This implies that we need between 4 and 8 bytes per integer element (the internal nodes will contribute $\frac{1}{16}$-th or so to that number), the former being the case when the inserts are sequential, and the latter being the case when the input is adversarial. When the queries are uniformly distributed, the nodes are ~75% full on average, projecting to ~5.2 bytes per element. -`std::set` needs at least 32 bytes. 3 pointers (parent, left, right), plus another 8, while we need ~5.2 *on average* and slightly over 8 in the worst case. (sequential inserts) +B-trees are very memory-efficient compared to the pointer-based binary trees. For example, `std::set` needs at least three pointers (the left child, the right child, and the parent), alone costing $3 \times 8 = 24$ bytes, plus at least another $8$ bytes to store the key and the meta-information due to [structure padding](hpc/cpu-cache/alignment/). ### Searching From d192be2e03ed3072c98dfd374a71a0b9a2554b08 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 25 Mar 2022 19:20:55 +0300 Subject: [PATCH 345/531] b-tree search --- content/english/hpc/data-structures/b-tree.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/content/english/hpc/data-structures/b-tree.md b/content/english/hpc/data-structures/b-tree.md index 47f6b943..35bd8708 100644 --- a/content/english/hpc/data-structures/b-tree.md +++ b/content/english/hpc/data-structures/b-tree.md @@ -22,7 +22,7 @@ that we call *B− tree* Instead of making small incremental improvements like we usually do in other case studies, in this article, we will implement just one data structure that we name *B− tree*, which is based on the [B+ tree](../s-tree/#b-tree-layout-1), with a few minor differences: -- Nodes in the B− tree do not store any pointers or meta-information whatsoever except for the pointers to internal node children (while the B+ tree leaf nodes store a pointer to the next leaf node). 
This lets us perfectly place the keys in the leaf nodes on cache lines. +- Nodes in the B− tree do not store pointers or any metadata whatsoever except for the pointers to internal node children (while the B+ tree leaf nodes store a pointer to the next leaf node). This lets us perfectly place the keys in the leaf nodes on cache lines. - We define key $i$ to be the *maximum* key in the subtree of the child $i$ instead of the *minimum* key in the subtree of the child $(i + 1)$. This lets us not fetch any other nodes after we reach a leaf (in the B+ tree, all keys in the leaf node may be less than the search key, so we need to go to the next leaf node to fetch its first element). We also use a node size of $B=32$, which is smaller than typical. The reason why it is not $16$, which was [optimal for the S+ tree](s-tree/#modifications-and-further-optimizations), is because we have the additional overhead associated with fetching the pointer, and the benefit of reducing the tree height by ~20% outweighs the cost of processing twice the elements per node, and also because it improves the running time of the `insert` query that needs to perform a costly node split every $\frac{B}{2}$ insertions on average. @@ -87,9 +87,11 @@ B-trees are very memory-efficient compared to the pointer-based binary trees. Fo ### Searching -We used permutations when we implemented [S-trees](../s-tree/#optimization). Storing values in permuted order will make inserts much harder, so we change the approach. +When we implemented [S-trees](../s-tree/#optimization), we ended up storing the keys in permuted order due to the intricacies of how the blending/packs instructions work. For the *dynamic tree* problem, storing the keys in permuted order would make inserts much harder to implement, so we will change the approach instead. 
-Using popcount instead of tzcnt: the index i is equal to the number of keys less than x, so we can compare x against all keys, combine the vector mask any way we want, call `maskmov`, and then calculate the number of set bits with popcnt. This removes the need to store the keys in any particular order, which lets us skip the permutation step and also use this procedure on the last layer as well. +An alternative way to think about finding the would-be position of the element `x` in a sorted array is not "the index of the first element that is not less than `x`" but "the number of elements that are less than `x`." This observation generates the following idea: compare the keys against `x`, aggregate the vector masks into a 32-bit mask (where each bit can correspond to any element as long as the mapping is bijective), and then call `popcnt` on it, returning the number of elements less than `x`. + +This trick lets us perform the local search efficiently and without requiring any shuffling: ```c++ typedef __m256i reg; @@ -114,9 +116,9 @@ unsigned rank32(reg x, int *node) { } ``` -This is also the reason why the "key area" in the nodes should not be contaminated and only store keys padded with infinities — or masked out. +Note that, because of this procedure, we have to pad the "key area" with infinities, which prevents us from storing metadata in the vacated cells (unless we are also willing to spend a few cycles to mask it out when loading a SIMD lane). 
-To implement `lower_bound`, we just use the same procedure, but fetch the pointer after we computed the child number: +Now, to implement `lower_bound`, we can descend the tree just like we did in the S+ tree, but fetching the pointer after we compute the child number: ```c++ int lower_bound(int _x) { From a468df83296daefba357abf1e3638506a31b59bc Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 25 Mar 2022 20:27:51 +0300 Subject: [PATCH 346/531] b-tree insertion --- content/english/hpc/data-structures/b-tree.md | 59 +++++++++++-------- 1 file changed, 33 insertions(+), 26 deletions(-) diff --git a/content/english/hpc/data-structures/b-tree.md b/content/english/hpc/data-structures/b-tree.md index 35bd8708..f5602b48 100644 --- a/content/english/hpc/data-structures/b-tree.md +++ b/content/english/hpc/data-structures/b-tree.md @@ -65,7 +65,8 @@ Both nodes types store their keys sequentially in sorted order and are identifie These design decisions are not arbitrary: - Padding ensures that leaf nodes occupy exactly 2 cache lines and internal nodes occupy exactly 4 cache lines. -- We specifically use [indices instead of pointers](/hpc/cpu-cache/pointers/) to save cache space. +- We specifically use [indices instead of pointers](/hpc/cpu-cache/pointers/) to save cache space and make moving them with SIMD faster. + (We will use "pointer" and "index" interchangeably from now on.) - We store indices right after the keys even though they are stored in separate cache lines because [we have reasons](/hpc/cpu-cache/aos-soa/). - We intentionally "waste" one array cell in leaf nodes and $2+1=3$ cells in internal nodes because we need it to store temporary results during a node split. 
@@ -101,15 +102,17 @@ reg cmp(reg x, int *node) { return _mm256_cmpgt_epi32(x, y); } +// returns how many keys are less than x unsigned rank32(reg x, int *node) { reg m1 = cmp(x, node); reg m2 = cmp(x, node + 8); reg m3 = cmp(x, node + 16); reg m4 = cmp(x, node + 24); + // take lower 16 bits from m1/m3 and higher 16 bits from m2/m4 m1 = _mm256_blend_epi16(m1, m2, 0b01010101); m3 = _mm256_blend_epi16(m3, m4, 0b01010101); - m1 = _mm256_packs_epi16(m1, m3); + m1 = _mm256_packs_epi16(m1, m3); // can also use blendv here, but packs is simpler unsigned mask = _mm256_movemask_epi8(m1); return __builtin_popcount(mask); @@ -132,21 +135,17 @@ int lower_bound(int _x) { unsigned i = rank32(x, &tree[k]); - return tree[k + i]; // what if next block? maybe we store 31 elements? + return tree[k + i]; } ``` -Implementing `lower_bound` is easy, and it doesn't introduce much overhead. The hard part is to implement insertion. +Implementing search is easy, and it doesn't introduce much overhead. The hard part is implementing insertion. ### Insertion -Insertion needs a lot of logic, but the good news is that it does not have to be executed frequently. +On the one side, correctly implementing insertion takes a lot of code, but on the other, most of that code is executed very infrequently, so we don't have to care about its performance that much. Most often, all we need to do is to reach the leaf node (which we've already figured out how to do) and then insert a new key into it, moving some suffix of the keys one position to the right. Occasionally, we also need to split the node and/or update some ancestors, but this is relatively rare, so let's focus on the most common execution path first. -Most of the time, all we need is to reach a leaf node and then insert a key into it, moving some other keys one position to the right. - -Occasionally, we also need to split the node and/or update some parents, but this is relatively rare, so let's focus on the most common part. 
- -To insert efficiently. +To insert a key into an array of $(B - 1)$ sorted elements, we can load them in vector registers, and then mask-store them one position to the right using a precomputed mask that tells which elements need to be written for a given `i`: ```c++ struct Precalc { @@ -155,25 +154,30 @@ struct Precalc { constexpr Precalc() : mask{} { for (int i = 0; i < B; i++) for (int j = i; j < B - 1; j++) + // everything from i to B - 2 inclusive needs to be moved mask[i][j] = -1; } }; constexpr Precalc P; -``` -```c++ void insert(int *node, int i, int x) { + // need to iterate right-to-left to not overwrite the first element of the next lane for (int j = B - 8; j >= 0; j -= 8) { + // load the keys reg t = _mm256_load_si256((reg*) &node[j]); + // load the corresponding mask reg mask = _mm256_load_si256((reg*) &P.mask[i][j]); + // mask-write them one position to the right _mm256_maskstore_epi32(&node[j + 1], mask, t); } - node[i] = x; + node[i] = x; // finally, write the element itself } ``` -Next, let's try. To split a node, we need. So let's write another primitive: +There are other ways to do it, some possibly more efficient, but we are going to stop there for now. 
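For reference, here is what the masked loop computes, written as plain scalar code. This is a sketch for intuition rather than a replacement: `insert_scalar` is a hypothetical name, and, like the vectorized version, it assumes that the last cell of the node is free to be overwritten.

```cpp
#include <climits>

constexpr int B = 32;

// scalar model of the node-level insert:
// shift keys i .. B - 2 one cell to the right, then write x into position i
void insert_scalar(int *node, int i, int x) {
    // iterate right-to-left so that no key is overwritten before it is copied
    for (int j = B - 2; j >= i; j--)
        node[j + 1] = node[j];
    node[i] = x;
}
```

The right-to-left order plays the same role here as in the vectorized loop, where each store must not clobber keys that have not been loaded yet.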
+ +When we split a node, we need to move half of the keys to another node, so let's write another primitive that does it: ```c++ // move the second half of a node and fill it with infinities @@ -187,12 +191,15 @@ void move(int *from, int *to) { } ``` -Now we need to (very carefully) +With these two vector functions implemented, we can now very carefully implement insertion: ```c++ void insert(int _x) { - // we save the path we visited in case we need to update some of our ancestors - unsigned sk[10], si[10]; + // the beginning of the procedure is the same as in lower_bound, + // except that we save the path in case we need to update some of our ancestors + unsigned sk[10], si[10]; // k and i on each iteration + // ^------^ We assume that the tree height does not exceed 10 + // (which would require at least 16^10 elements) unsigned k = root; reg x = _mm256_set1_epi32(_x); @@ -200,7 +207,7 @@ void insert(int _x) { for (int h = 0; h < H - 1; h++) { unsigned i = rank32(x, &tree[k]); - // check if we need to update the key right away + // optionally update the key i right away tree[k + i] = (_x > tree[k + i] ? 
_x : tree[k + i]); sk[h] = k, si[h] = i; // and save the path @@ -209,13 +216,13 @@ void insert(int _x) { unsigned i = rank32(x, &tree[k]); - // we can start computing this check ahead of insertion + // we can start computing the is-full check before insertion completes bool filled = (tree[k + B - 2] != INT_MAX); insert(tree + k, i, _x); if (filled) { - // create a new leaf node + // the node needs to be split, so we create a new leaf node move(tree + k, tree + n_tree); int v = tree[k + B / 2 - 1]; // new key to be inserted @@ -224,14 +231,13 @@ void insert(int _x) { n_tree += B; for (int h = H - 2; h >= 0; h--) { - // for each parent node we repeat this process - // until we reach the root of determine that the node is not split + // ascend and repeat until we reach the root or find that the node is not split k = sk[h], i = si[h]; filled = (tree[k + B - 3] != INT_MAX); - // the node already has a correct key (right one) - // and a correct pointer (left one) + // the node already has a correct key (the right one) + // and a correct pointer (the left one) insert(tree + k, i, v); insert(tree + k + B, i + 1, p); @@ -249,7 +255,8 @@ void insert(int _x) { n_tree += 2 * B; } - // if we've reached here, this means we've reached the root, and it was split + // if we reach here, this means we've reached the root, + // and it was split into two, so we need a new root tree[n_tree] = v; tree[n_tree + B] = root; @@ -262,7 +269,7 @@ void insert(int _x) { } ``` -There are many inefficiencies, but luckily they are rarely called. +There are many inefficiencies, but, luckily, the body of `if (filled)` is executed very infrequently, approximately every $\frac{B}{2}$ insertions, and the insertion performance is not really our top priority, so we will just leave it there.
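For intuition about the leaf-split branch, a scalar model of the split step might look like the sketch below. It follows the comment on `move` (move the second half of a node and fill it with infinities) and the line that reads the new parent key from position `B / 2 - 1`. The name `split_leaf_scalar` is hypothetical, and the sketch assumes that the unused cells of the fresh node already hold `INT_MAX`.

```cpp
#include <climits>

constexpr int B = 32;

// scalar model of splitting an overfull leaf that temporarily holds B keys:
// the upper B / 2 keys go to `fresh`, and the vacated cells become "infinities";
// returns the largest key that stays in the left node (the key for the parent)
int split_leaf_scalar(int *node, int *fresh) {
    for (int j = 0; j < B / 2; j++) {
        fresh[j] = node[B / 2 + j];
        node[B / 2 + j] = INT_MAX;
    }
    // assumption: cells B / 2 .. B - 1 of `fresh` are pre-filled with INT_MAX
    return node[B / 2 - 1];
}
```

After the split, both nodes hold `B / 2` keys, so each of them can absorb about `B / 2` more insertions before splitting again, which is where the "every $\frac{B}{2}$ insertions" estimate comes from.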
## Evaluation From 3dc8bcf7a353b56bc6c5a41477876bdb600d80aa Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 25 Mar 2022 21:34:37 +0300 Subject: [PATCH 347/531] b-tree evaluation --- content/english/hpc/data-structures/b-tree.md | 38 +++++++++++-------- 1 file changed, 22 insertions(+), 16 deletions(-) diff --git a/content/english/hpc/data-structures/b-tree.md b/content/english/hpc/data-structures/b-tree.md index f5602b48..931a92c6 100644 --- a/content/english/hpc/data-structures/b-tree.md +++ b/content/english/hpc/data-structures/b-tree.md @@ -88,6 +88,8 @@ B-trees are very memory-efficient compared to the pointer-based binary trees. Fo ### Searching +It is a very common scenario when >90% of operations are lookups, and even if this is not the case, every other tree operation typically begins with locating a key anyway, so we will start with implementing and optimizing the searches. + When we implemented [S-trees](../s-tree/#optimization), we ended up storing the keys in permuted order due to the intricacies of how the blending/packs instructions work. For the *dynamic tree* problem, storing the keys in permuted order would make inserts much harder to implement, so we will change the approach instead. An alternative way to think about finding the would-be position of the element `x` in a sorted array is not "the index of the first element that is not less than `x`" but "the number of elements that are less than `x`." This observation generates the following idea: compare the keys against `x`, aggregate the vector masks into a 32-bit mask (where each bit can correspond to any element as long as the mapping is bijective), and then call `popcnt` on it, returning the number of elements less than `x`. @@ -273,44 +275,46 @@ There are many inefficiencies, but, luckily, the body of `if (filled)` is execut ## Evaluation -We need to implement `insert` and `lower_bound`. Deletions, iteration, and other things are not our concern for now. 
+We have only implemented `insert` and `lower_bound`, so this is what we will measure. -Of course, this comparison is not fair, as implementing a dynamic search tree is a more high-dimensional problem. +We want the evaluation to take a reasonable time, so our benchmark is a loop that alternates between two steps: -Technically, we use `std::multiset` and `absl::btree_multiset` to support repeated keys. +- Increase the structure size from $1.17^k$ to $1.17^{k+1}$ using individual `insert`s and measure the time it took. +- Perform $10^6$ random `lower_bound` queries and measure the time it took. -Keys are uniform, but we should not rely on that fact (e. g. using ). +We start at the size $10^4$ and end at $10^7$, for around $50$ data points in total. We generate the data for both query types uniformly in the $[0, 2^{30})$ range and independently between the stages. Since the data generation process allows for repeated keys, we compared against `std::multiset` and `absl::btree_multiset`, although we still refer to them as `std::set` and `absl::btree` for brevity. We also enable [hugepages](/hpc/cpu-cache/paging) on the system level for all three runs. -It is common that >90% of operations are lookups. Optimizing searches is important because every other operation starts with locating a key. -(a different set each time) + -As predicted, the performance is much better: +The performance of the B− tree matches what we originally predicted — at least for the lookups: ![](../img/btree-absolute.svg) -When the data set is small, the latency increases in discrete steps: 3.5ns for under 32 elements, 6.5ns, and to 12ns, until it hits the L2 cache (not shown on graphs) and starts increasing more smoothly yet still with noticeable spikes when the tree grows upwards. 
+The relative speedup varies with the structure size — 7-18x/3-8x over STL and 3-7x/1.5-2x over Abseil: ![](../img/btree-relative.svg) -I apologize to everyone else, but this is sort of your fault for not using a public benchmark. +Insertions are only 1.5-2 faster than for `absl::btree`, which uses scalar code to do everything. I don't know (yet) why insertions are *that* slow, but my guess is that it has something to do with data dependencies between queries. ![](../img/btree-absl.svg) -Interestingly, B− tree wins over `absl::btree` even when it only stores one key: it takes around 5ns to figure out branch prediction, while B− tree is branchless. +When the structure size is small, the [reciprocal throughput](../s-tree/#comparison-with-stdlower_bound) of `lower_bound` increases in discrete steps: it starts with 3.5ns when there is only the root to visit, then grows to 6.5ns (two nodes), and then to 12ns (three nodes), and then hits the L2 cache (not shown on the graphs) and starts increasing more smoothly, but still with noticeable spikes when the tree height increases. -I don't know (yet) why insertions are *that* slow. My guess is that it has something to do with data dependencies between queries. +Interestingly, B− tree outperforms `absl::btree` even when it only stores a single key: it takes around 5ns stalling on [branch misprediction](/hpc/pipelining/branching/), while (the search in) the B− tree is entirely branchless. ### Possible Optimizations +In our previous optimization efforts. + Maximum height was 6. Compile. I tried it, but couldn't get the compiler to generate optimal code. @@ -364,6 +368,8 @@ It is possible to get rid of pointers even more. For example, for large trees, w ### Other Operations + + Going to father and fetching $B$ pointers at a time is faster as it negates [pointer chasing](/hpc/cpu-cache/latency/). Pointer to either parent or next node. @@ -380,7 +386,7 @@ If the node is at least half-full, we're done. 
Otherwise, we try to borrow keys If that fails, we can merge the two nodes together, and iteratively delete the key in the parent. -One interesting use case is *rope*, also known as *cord*, which is used for wrapping strings in a tree to support mass operations. For example, editing a very large text file. Which is the topic. + [Skip list](https://en.wikipedia.org/wiki/Skip_list), which [some attempts to vectorize it](https://doublequan.github.io/), although it may achieve higher total throughput in concurrent setting. I have low hope that it can be improved. From 04e1d77e614c500ca741b0572cfd96eed35898bd Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 25 Mar 2022 22:05:41 +0300 Subject: [PATCH 348/531] publish b-tree --- content/english/hpc/data-structures/b-tree.md | 42 ++++++++++--------- content/english/hpc/data-structures/s-tree.md | 2 +- 2 files changed, 24 insertions(+), 20 deletions(-) diff --git a/content/english/hpc/data-structures/b-tree.md b/content/english/hpc/data-structures/b-tree.md index 931a92c6..e47ef5a6 100644 --- a/content/english/hpc/data-structures/b-tree.md +++ b/content/english/hpc/data-structures/b-tree.md @@ -1,12 +1,11 @@ --- title: Search Trees weight: 3 -draft: true --- -In the [previous article](../s-tree), we designed and implemented *static* B-trees to speed up binary searching in sorted arrays, and in its [last section](../s-tree/#as-a-dynamic-tree), we briefly discussed how to make them *dynamic* back while retaining the performance gains from [SIMD](/hpc/simd), and validated our predictions by adding and following explicit pointers in the internal nodes of the S+ tree. +In the [previous article](../s-tree), we designed and implemented *static* B-trees to speed up binary searching in sorted arrays. 
In its [last section](../s-tree/#as-a-dynamic-tree), we briefly discussed how to make them *dynamic* back while retaining the performance gains from [SIMD](/hpc/simd) and validated our predictions by adding and following explicit pointers in the internal nodes of the S+ tree. -In this article, we follow up on that proposition and design a minimally functional search tree for integer keys, [achieving](#evaluation) up to 18x/8x speedup over `std::set` and up to 7x/2x speedup over [`absl::btree`](https://abseil.io/blog/20190812-btree) for `lower_bound` and `insert` queries respectively, with yet ample room for improvement. +In this article, we follow up on that proposition and design a minimally functional search tree for integer keys, [achieving](#evaluation) up to 18x/8x speedup over `std::set` and up to 7x/2x speedup over [`absl::btree`](https://abseil.io/blog/20190812-btree) for `lower_bound` and `insert` queries, respectively, with yet ample room for improvement. The memory overhead of the structure is around 30%, and the [final implementation](https://github.com/sslotin/amh-code/blob/main/b-tree/btree-final.cc) is under 150 lines of C. @@ -22,7 +21,7 @@ that we call *B− tree* Instead of making small incremental improvements like we usually do in other case studies, in this article, we will implement just one data structure that we name *B− tree*, which is based on the [B+ tree](../s-tree/#b-tree-layout-1), with a few minor differences: -- Nodes in the B− tree do not store pointers or any metadata whatsoever except for the pointers to internal node children (while the B+ tree leaf nodes store a pointer to the next leaf node). This lets us perfectly place the keys in the leaf nodes on cache lines. +- Nodes in the B− tree do not store pointers or any metadata except for the pointers to internal node children (while the B+ tree leaf nodes store a pointer to the next leaf node). This lets us perfectly place the keys in the leaf nodes on cache lines. 
- We define key $i$ to be the *maximum* key in the subtree of the child $i$ instead of the *minimum* key in the subtree of the child $(i + 1)$. This lets us not fetch any other nodes after we reach a leaf (in the B+ tree, all keys in the leaf node may be less than the search key, so we need to go to the next leaf node to fetch its first element). We also use a node size of $B=32$, which is smaller than typical. The reason why it is not $16$, which was [optimal for the S+ tree](s-tree/#modifications-and-further-optimizations), is because we have the additional overhead associated with fetching the pointer, and the benefit of reducing the tree height by ~20% outweighs the cost of processing twice the elements per node, and also because it improves the running time of the `insert` query that needs to perform a costly node split every $\frac{B}{2}$ insertions on average. @@ -59,12 +58,12 @@ for (int i = 0; i < R; i++) Both nodes types store their keys sequentially in sorted order and are identified by the index of its first key in the array: -- A leaf node has up to $(B - 1)$ keys, but is padded to $B$ elements with infinities. +- A leaf node has up to $(B - 1)$ keys but is padded to $B$ elements with infinities. - An internal node has up to $(B - 2)$ keys padded to $B$ elements and up to $(B - 1)$ indices of its child nodes, also padded to $B$ elements. These design decisions are not arbitrary: -- Padding ensures that leaf nodes occupy exactly 2 cache lines and internal nodes occupy exactly 4 cache lines. +- The padding ensures that leaf nodes occupy exactly 2 cache lines and internal nodes occupy exactly 4 cache lines. - We specifically use [indices instead of pointers](/hpc/cpu-cache/pointers/) to save cache space and make moving them with SIMD faster. (We will use "pointer" and "index" interchangeably from now on.) - We store indices right after the keys even though they are stored in separate cache lines because [we have reasons](/hpc/cpu-cache/aos-soa/). 
@@ -147,7 +146,7 @@ Implementing search is easy, and it doesn't introduce much overhead. The hard pa On the one side, correctly implementing insertion takes a lot of code, but on the other, most of that code is executed very infrequently, so we don't have to care about its performance that much. Most often, all we need to do is to reach the leaf node (which we've already figured out how to do) and then insert a new key into it, moving some suffix of the keys one position to the right. Occasionally, we also need to split the node and/or update some ancestors, but this is relatively rare, so let's focus on the most common execution path first. -To insert a key into an array of $(B - 1)$ sorted elements, we can load them in vector registers, and then mask-store them one position to the right using a precomputed mask that tells which elements need to be written for a given `i`: +To insert a key into an array of $(B - 1)$ sorted elements, we can load them in vector registers and then [mask-store](/hpc/simd/masking) them one position to the right using a precomputed mask that tells which elements need to be written for a given `i`: ```c++ struct Precalc { @@ -303,7 +302,7 @@ The relative speedup varies with the structure size — 7-18x/3-8x over STL and ![](../img/btree-relative.svg) -Insertions are only 1.5-2 faster than for `absl::btree`, which uses scalar code to do everything. I don't know (yet) why insertions are *that* slow, but my guess is that it has something to do with data dependencies between queries. +Insertions are only 1.5-2 faster than for `absl::btree`, which uses scalar code to do everything. I don't know (yet) why insertions are *that* slow, but I guess it has something to do with data dependencies between queries. ![](../img/btree-absl.svg) @@ -313,13 +312,11 @@ Interestingly, B− tree outperforms `absl::btree` even when it only stores a si ### Possible Optimizations -In our previous optimization efforts. 
+In our previous endeavors in data structure optimization, it helped a lot to make as many variables as possible compile-time constants: the compiler can hardcode these constants into the machine code, simplify the arithmetic, unroll all the loops, and do many other nice things for us. -Maximum height was 6. +This would not be a problem at all if our tree were of constant height, but it is not. It is *largely* constant, though: the height rarely changes, and in fact, under the constraints of the benchmark, the maximum height was only 6. -Compile. I tried it, but couldn't get the compiler to generate optimal code. - -The idiomatic C++ way is to use virtual functions, but we will be explicit: +What we can do is pre-compile the `insert` and `lower_bound` functions for several different compile-time constant heights and switch between them as the tree grows. The idiomatic C++ way is to use virtual functions, but I prefer to be explicit and use raw function pointers like this: ```c++ void (*insert_ptr)(int); @@ -334,6 +331,8 @@ int lower_bound(int x) { } ``` +We now define template functions that have the tree height as a parameter, and in the grow-tree block inside the `insert` function, we change the pointers as the tree grows: + ```c++ template void insert_impl(int _x) { @@ -357,18 +356,20 @@ void insert_impl<10>(int x) { } ``` -```c++ + -Recursion unrolled. +I tried but could not get any performance improvement with this, but I still have high hope for this approach because the compiler can (theoretically) remove `sk` and `si`, completely removing any temporary storage and only reading and computing everything once, greatly optimizing the `insert` procedure. + + +Deletions, iteration, and other things are not our concern for now. Going to father and fetching $B$ pointers at a time is faster as it negates [pointer chasing](/hpc/cpu-cache/latency/). @@ -386,10 +387,13 @@ If the node is at least half-full, we're done. 
Otherwise, we try to borrow keys If that fails, we can merge the two nodes together, and iteratively delete the key in the parent. - - [Skip list](https://en.wikipedia.org/wiki/Skip_list), which [some attempts to vectorize it](https://doublequan.github.io/), although it may achieve higher total throughput in concurrent setting. I have low hope that it can be improved. ## Acknowledgements Thanks to [Danila Kutenin](https://danlark.org/) from Google for meaningful discussions of applicability and possible replacement in Abseil. + +--> + + + diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index ad9fc2ef..a6e3ea57 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -584,7 +584,7 @@ My next priorities is to adapt it to segment trees, which I know how to do, and Of course, this comparison is not fair, as implementing a dynamic search tree is a more high-dimensional problem. -We'd also need to implement the update operation, which will not be that efficient, and for which we'd need to sacrifice the fanout factor. But it still seems possible to implement a 10-20x faster `std::set` and a 3-5x faster `absl::btree_set`, depending on how you define "faster" — and this is one of the things we'll attempt to do next. +We'd also need to implement the update operation, which will not be that efficient, and for which we'd need to sacrifice the fanout factor. But it still seems possible to implement a 10-20x faster `std::set` and a 3-5x faster `absl::btree_set`, depending on how you define "faster" — and this is one of the things we'll [attempt to do next](../b-tree). ## Acknowledgements -Thanks to [Danila Kutenin](https://danlark.org/) from Google for meaningful discussions of applicability and possible replacement in Abseil. 
- ---> - +Thanks to [Danila Kutenin](https://danlark.org/) from Google for meaningful discussions of applicability and the usage of B-trees in Abseil. From 39bf49ab371b188c4ea164b537159deb3c2b78f3 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 26 Mar 2022 00:57:55 +0300 Subject: [PATCH 352/531] edit b-tree intro --- content/english/hpc/data-structures/b-tree.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/data-structures/b-tree.md b/content/english/hpc/data-structures/b-tree.md index 8c65f1b8..96d1a08e 100644 --- a/content/english/hpc/data-structures/b-tree.md +++ b/content/english/hpc/data-structures/b-tree.md @@ -5,9 +5,9 @@ weight: 3 In the [previous article](../s-tree), we designed and implemented *static* B-trees to speed up binary searching in sorted arrays. In its [last section](../s-tree/#as-a-dynamic-tree), we briefly discussed how to make them *dynamic* back while retaining the performance gains from [SIMD](/hpc/simd) and validated our predictions by adding and following explicit pointers in the internal nodes of the S+ tree. -In this article, we follow up on that proposition and design a minimally functional search tree for integer keys, [achieving](#evaluation) up to 18x/8x speedup over `std::set` and up to 7x/2x speedup over [`absl::btree`](https://abseil.io/blog/20190812-btree) for `lower_bound` and `insert` queries, respectively, with yet ample room for improvement. +In this article, we follow up on that proposition and design a minimally functional search tree for integer keys, [achieving](#evaluation) up to 18x/8x speedup over `std::set` and up to 7x/2x speedup over [`absl::btree`](https://abseil.io/blog/20190812-btree) for `lower_bound` and `insert` queries, respectively — with yet ample room for improvement. -The memory overhead of the structure is around 30%, and the final implementation is [under 150 lines of C++](https://github.com/sslotin/amh-code/blob/main/b-tree/btree-final.cc). 
+The memory overhead of the structure is around 30% for 32-bit integers, and the final implementation is [under 150 lines of C++](https://github.com/sslotin/amh-code/blob/main/b-tree/btree-final.cc). It can be easily generalized to other arithmetic types and small/fixed-length strings such as hashes, country codes, and stock symbols.
diff --git a/content/english/hpc/algorithms/img/mm-blocked-barplot.svg b/content/english/hpc/algorithms/img/mm-blocked-barplot.svg new file mode 100644 index 00000000..93334ac1 --- /dev/null +++ b/content/english/hpc/algorithms/img/mm-blocked-barplot.svg @@ -0,0 +1,1402 @@ [SVG markup omitted: Matplotlib v3.5.1 figure dated 2022-04-05T01:18:41.689702]
diff --git a/content/english/hpc/algorithms/img/mm-blocked-plot.svg b/content/english/hpc/algorithms/img/mm-blocked-plot.svg new file mode 100644 index 00000000..87dda835 --- /dev/null +++ b/content/english/hpc/algorithms/img/mm-blocked-plot.svg @@ -0,0 +1,1474 @@ [SVG markup omitted: Matplotlib v3.5.1 figure dated 2022-04-05T01:18:54.049300]
diff --git a/content/english/hpc/algorithms/img/mm-kernel-barplot.svg b/content/english/hpc/algorithms/img/mm-kernel-barplot.svg new file mode 100644 index 00000000..834d8b39 --- /dev/null +++ b/content/english/hpc/algorithms/img/mm-kernel-barplot.svg @@ -0,0 +1,1277 @@ [SVG markup omitted: Matplotlib v3.5.1 figure dated 2022-04-05T01:18:16.721432]
diff --git a/content/english/hpc/algorithms/img/mm-kernel-plot.svg b/content/english/hpc/algorithms/img/mm-kernel-plot.svg new file mode 100644 index 00000000..99f9315a --- /dev/null +++ b/content/english/hpc/algorithms/img/mm-kernel-plot.svg @@ -0,0 +1,1385 @@ [SVG markup omitted: Matplotlib v3.5.1 figure dated 2022-04-05T01:18:30.773700]
diff --git a/content/english/hpc/algorithms/img/mm-noalloc.svg b/content/english/hpc/algorithms/img/mm-noalloc.svg new file mode 100644 index 00000000..a4911ea0 --- /dev/null +++ b/content/english/hpc/algorithms/img/mm-noalloc.svg @@ -0,0 +1,1344 @@ [SVG markup omitted: Matplotlib v3.5.1 figure dated 2022-04-05T01:19:35.314892]
diff --git a/content/english/hpc/algorithms/img/mm-vectorized-barplot.svg b/content/english/hpc/algorithms/img/mm-vectorized-barplot.svg new file mode 100644 index 00000000..610d8276 --- /dev/null +++ b/content/english/hpc/algorithms/img/mm-vectorized-barplot.svg @@ -0,0 +1,1140 @@ [SVG markup omitted: Matplotlib v3.5.1 figure dated 2022-04-05T01:17:55.289785]
diff --git a/content/english/hpc/algorithms/img/mm-vectorized-plot.svg b/content/english/hpc/algorithms/img/mm-vectorized-plot.svg new file mode 100644 index 00000000..7374f73f --- /dev/null +++ b/content/english/hpc/algorithms/img/mm-vectorized-plot.svg @@ -0,0 +1,1379 @@ [SVG markup omitted: Matplotlib v3.5.1 figure dated 2022-04-05T01:18:01.560593]
diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index 408c6892..29081c0c 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -1,9 +1,49 @@ --- title: Matrix Multiplication -weight: 4 +weight: 20 draft: true --- +"[Anatomy of High-Performance Matrix
Multiplication](https://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf)" by Kazushige Goto and Robert van de Geijn. + +For reasons that will later become aparent, we only use sizes that are multiples of $48$. 1920 + +Cache associativity strikes again. This is also an issue, but we will not address it for now. + +GCC 13. + +3.5s for 1025 ad 12s for 1024. + +baseline 13.58622 0.5209607970428861 +hugepages 16.749895 0.42256312651512146 +transposed 12.377302 0.5718441708863531 +autovec 3.117215 2.2705806304666187 +vectorized 3.075742 2.301196914435606 +kernel 2.24264 3.1560517960974566 +blocked 0.461477 15.33746643928083 +noalloc 0.408031 17.346446716058338 +nomove 0.303826 23.295860130469414 +blas 0.27489790320396423 25.747333528217077 + +![](../img/mm-vectorized-barplot.svg) + +![](../img/mm-vectorized-plot.svg) + +![](../img/mm-kernel-barplot.svg) + +![](../img/mm-kernel-plot.svg) + +![](../img/mm-blocked-plot.svg) + +![](../img/mm-blocked-barplot.svg) + +![](../img/mm-noalloc.svg) + +![](../img/mm-blas.svg) + +Which is fine, considering that this is not the only thing that CPUs are made for. 
+ +--- ## Case Study: Distance Product From d97421b9a3d47b22fe6ba12591a04edc5e406af9 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 5 Apr 2022 01:41:49 +0300 Subject: [PATCH 370/531] matmul code --- content/english/hpc/algorithms/matmul.md | 137 +++++++++++++++++++++++ 1 file changed, 137 insertions(+) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index 29081c0c..b787ae52 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -25,14 +25,151 @@ noalloc 0.408031 17.346446716058338 nomove 0.303826 23.295860130469414 blas 0.27489790320396423 25.747333528217077 +```c++ +void matmul(const float *a, const float *b, float *c, int n) { + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + c[i * n + j] += a[i * n + k] * b[k * n + j]; +} +``` + +Transpose: + +```c++ +void matmul(const float *a, const float *_b, float *c, int n) { + float *b = new float[n * n]; + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + b[i * n + j] = _b[j * n + i]; + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + c[i * n + j] += a[i * n + k] * b[j * n + k]; // notice indices +} +``` + +```c++ +void matmul(const float *a, const float *_b, float * __restrict__ c, int n) { + // ... 
+} +``` + +```c++ +const int B = 8; // number of elements in a vector +const int vecsize = B * sizeof(float); // size of a vector in bytes +typedef float vector __attribute__ (( vector_size(vecsize) )); + +vector* alloc(int n) { + vector* ptr = (vector*) std::aligned_alloc(vecsize, vecsize * n); + memset(ptr, 0, vecsize * n); + return ptr; +} + +float hsum(vector s) { + float res = 0; + for (int i = 0; i < B; i++) + res += s[i]; + return res; +} + +void matmul(const float *_a, const float *_b, float *c, int n) { + int nB = (n + B - 1) / B; + + vector *a = alloc(n * nB); + vector *b = alloc(n * nB); + + for (int i = 0; i < n; i++) { + for (int j = 0; j < n; j++) { + a[i * nB + j / 8][j % 8] = _a[i * n + j]; + b[i * nB + j / 8][j % 8] = _b[j * n + i]; // <- still transposed + } + } + + for (int i = 0; i < n; i++) { + for (int j = 0; j < n; j++) { + vector s = {0}; + for (int k = 0; k < nB; k++) + s += a[i * nB + k] * b[j * nB + k]; + c[i * n + j] = hsum(s); + } + } +} +``` + ![](../img/mm-vectorized-barplot.svg) ![](../img/mm-vectorized-plot.svg) +```c++ +void kernel(float *a, vector *b, vector *c, int x, int y, int l, int r, int n) { + vector t[6][2]{}; + + for (int k = l; k < r; k++) { + for (int i = 0; i < 6; i++) { + vector alpha = vector{} + a[(x + i) * n + k]; + for (int j = 0; j < 2; j++) + t[i][j] += alpha * b[(k * n + y) / 8 + j]; + } + } + + for (int i = 0; i < 6; i++) + for (int j = 0; j < 2; j++) + c[((x + i) * n + y) / 8 + j] += t[i][j]; +} +``` + +```c++ +void matmul(const float *_a, const float *_b, float *_c, int n) { + int nx = (n + 5) / 6 * 6; + int ny = (n + 15) / 16 * 16; + + float *a = alloc(nx * ny); + float *b = alloc(nx * ny); + float *c = alloc(nx * ny); + + for (int i = 0; i < n; i++) { + memcpy(&a[i * ny], &_a[i * n], 4 * n); + memcpy(&b[i * ny], &_b[i * n], 4 * n); + } + + for (int x = 0; x < nx; x += 6) + for (int y = 0; y < ny; y += 16) + kernel(a, (vector*) b, (vector*) c, x, y, 0, n, ny); + + for (int i = 0; i < n; i++) + memcpy(&_c[i 
* n], &c[i * ny], 4 * n); + + std::free(a); + std::free(b); + std::free(c); +} +``` + ![](../img/mm-kernel-barplot.svg) ![](../img/mm-kernel-plot.svg) +```c++ +const int s3 = 64; +const int s2 = 120; +const int s1 = 240; + +for (int i3 = 0; i3 < ny; i3 += s3) + // now we are working with b[:][i3:i3+s3] + for (int i2 = 0; i2 < nx; i2 += s2) + // now we are working with a[i2:i2+s2][:] + for (int i1 = 0; i1 < ny; i1 += s1) + // now we are working with b[i1:i1+s1][i3:i3+s3] + // this equates to updating c[i2:i2+s2][i3:i3+s3] + // with [l:r] = [i1:i1+s1] + for (int x = i2; x < std::min(i2 + s2, nx); x += 6) + for (int y = i3; y < std::min(i3 + s3, ny); y += 16) + kernel(a, (vector*) b, (vector*) c, x, y, i1, std::min(i1 + s1, n), ny); +``` + ![](../img/mm-blocked-plot.svg) ![](../img/mm-blocked-barplot.svg) From 4f3fb47f84d394b114338f3bfb8dd6fc28ac5bff Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 5 Apr 2022 01:49:56 +0300 Subject: [PATCH 371/531] matmul outline --- content/english/hpc/algorithms/matmul.md | 434 ++--------------------- 1 file changed, 26 insertions(+), 408 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index b787ae52..1a611a52 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -6,6 +6,8 @@ draft: true "[Anatomy of High-Performance Matrix Multiplication](https://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf)" by Kazushige Goto and Robert van de Geijn. +Inspired by "[Programming Parallel Computers](http://ppc.cs.aalto.fi/ch2/)" course. + For reasons that will later become aparent, we only use sizes that are multiples of $48$. 1920 Cache associativity strikes again. This is also an issue, but we will not address it for now. 
@@ -25,6 +27,12 @@ noalloc 0.408031 17.346446716058338 nomove 0.303826 23.295860130469414 blas 0.27489790320396423 25.747333528217077 +$$ +C_{ij} = \sum_{i=1}^{n} A_{ik} \cdot B_{kj} +$$ + +Implement the definition of what we need to do, but using arrays instead of matrices: + ```c++ void matmul(const float *a, const float *b, float *c, int n) { for (int i = 0; i < n; i++) @@ -103,6 +111,15 @@ void matmul(const float *_a, const float *_b, float *c, int n) { ![](../img/mm-vectorized-plot.svg) +## Theoretical Performance + +$$ +\underbrace{4}_{CPUs} \cdot \underbrace{8}_{SIMD} \cdot \underbrace{2}_{1/thr} \cdot \underbrace{3.6 \cdot 10^9}_{cycles/sec} = 230.4 \; GFLOPS \;\; (2.3 \cdot 10^{11}) +$$ + +RAM bandwidth is lower than that + + ```c++ void kernel(float *a, vector *b, vector *c, int x, int y, int l, int r, int n) { vector t[6][2]{}; @@ -180,424 +197,25 @@ for (int i3 = 0; i3 < ny; i3 += s3) Which is fine, considering that this is not the only thing that CPUs are made for. ---- - -## Case Study: Distance Product - -(We are going to speedrun "[Programming Parallel Computers](http://ppc.cs.aalto.fi/ch2/)" course) +### Generalizations Given a matrix $D$, we need to calculate its "min-plus matrix multiplication" defined as: $(D \circ D)_{ij} = \min_k(D_{ik} + D_{kj})$ ----- - -Graph interpretation: -find shortest paths of length 2 between all vertices in a fully-connected weighted graph - -![](https://i.imgur.com/Zf4G7qj.png) - ----- +Graph interpretation: find shortest paths of length 2 between all vertices in a fully-connected weighted graph A cool thing about distance product is that if if we iterate the process and calculate: -$D_2 = D \circ D, \;\; -D_4 = D_2 \circ D_2, \;\; -D_8 = D_4 \circ D_4, \;\; -\ldots$ - -Then we can find all-pairs shortest distances in $O(\log n)$ steps - -(but recall that there are [more direct ways](https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm) to solve it) - ---- - -## V0: Baseline - -Implement the definition of 
what we need to do, but using arrays instead of matrices: - -```cpp -const float infty = std::numeric_limits::infinity(); - -void step(float* r, const float* d, int n) { - for (int i = 0; i < n; ++i) { - for (int j = 0; j < n; ++j) { - float v = infty; - for (int k = 0; k < n; ++k) { - float x = d[n*i + k]; - float y = d[n*k + j]; - float z = x + y; - v = std::min(v, z); - } - r[n*i + j] = v; - } - } -} -``` - -Compile with `g++ -O3 -march=native -std=c++17` - -On our Intel Core i5-6500 ("Skylake," 4 cores, 3.6 GHz) with $n=4000$ it runs for 99s, -which amounts to ~1.3B useful floating point operations per second - ---- - -## Theoretical Performance - $$ -\underbrace{4}_{CPUs} \cdot \underbrace{8}_{SIMD} \cdot \underbrace{2}_{1/thr} \cdot \underbrace{3.6 \cdot 10^9}_{cycles/sec} = 230.4 \; GFLOPS \;\; (2.3 \cdot 10^{11}) +D_2 = D \circ D \\ +D_4 = D_2 \circ D_2 \\ +D_8 = D_4 \circ D_4 \\ +\ldots $$ -RAM bandwidth: 34.1 GB/s (or ~10 bytes per cycle) - - ---- - -## OpenMP - -* We have 4 cores, so why don't we use them? 
-* There are low-level ways of creating threads, but they involve a lot of code -* We will use a high-level interface called OpenMP -* (We will talk about multithreading in much more detail on the next lecture) - -![](https://www.researchgate.net/profile/Mario_Storti/publication/231168223/figure/fig2/AS:393334787985424@1470789729707/The-master-thread-creates-a-team-of-parallel-threads.png =400x) - ----- - -## Multithreading Made Easy - -All you need to know for now is the `#pragma omp parallel for` directive - -```cpp -#pragma omp parallel for -for (int i = 0; i < 10; ++i) { - do_stuff(i); -} -``` - -It splits iterations of a loop among multiple threads - -There are many ways to control scheduling, -but we'll just leave defaults because our use case is simple - - - ----- - -## Warning: Data Races - -This only works when all iterations can safely be executed simultaneously -It's not always easy to determine, but for now following rules of thumb are enough: - -* There must not be any shared data element that is read by X and written by Y -* There must not be any shared data element that is written by X and written by Y - -E. g. 
sum can't be parallelized this way, as threads would modify a shared variable - - ---- - -## Parallel Baseline - -OpenMP is included in compilers: just add `-fopenmp` flag and that's it - -```cpp -void step(float* r, const float* d, int n) { - #pragma omp parallel for - for (int i = 0; i < n; ++i) { - for (int j = 0; j < n; ++j) { - float v = infty; - for (int k = 0; k < n; ++k) { - float x = d[n*i + k]; - float y = d[n*k + j]; - float z = x + y; - v = std::min(v, z); - } - r[n*i + j] = v; - } - } -} -``` - -Runs ~4x times faster, as it should - ---- - -## Memory Bottleneck - -![](https://i.imgur.com/z4d6aez.png =450x) - -(It is slower on macOS because of smaller page sizes) - ----- - -## Virtual Memory - -![](https://www.cs.uic.edu/~jbell/CourseNotes/OperatingSystems/images/Chapter9/9_01_VirtualMemoryLarger.jpg =500x) - ---- - -## V1: Linear Reading - -Just transpose it, as we did with matrices - -```cpp -void step(float* r, const float* d, int n) { - std::vector t(n*n); - #pragma omp parallel for - for (int i = 0; i < n; ++i) { - for (int j = 0; j < n; ++j) { - t[n*j + i] = d[n*i + j]; - } - } - - #pragma omp parallel for - for (int i = 0; i < n; ++i) { - for (int j = 0; j < n; ++j) { - float v = std::numeric_limits::infinity(); - for (int k = 0; k < n; ++k) { - float x = d[n*i + k]; - float y = t[n*j + k]; - float z = x + y; - v = std::min(v, z); - } - r[n*i + j] = v; - } - } -} -``` - ----- - -![](https://i.imgur.com/UwxcEG7.png =600x) - ----- - -![](https://i.imgur.com/2ySfr0V.png =600x) - ---- - -## V2: Instruction-Level Parallelism - -We can apply the same trick as we did with array sum earlier, so that instead of: - -```cpp -v = min(v, z0); -v = min(v, z1); -v = min(v, z2); -v = min(v, z3); -v = min(v, z4); -``` - -We use a few registers and compute minimum simultaneously utilizing ILP: - -```cpp -v0 = min(v0, z0); -v1 = min(v1, z1); -v0 = min(v0, z2); -v1 = min(v1, z3); -v0 = min(v0, z4); -... 
-v = min(v0, v1); -``` - ----- - -![](https://i.imgur.com/ihMC6z2.png) - -Our memory layout looks like this now - ----- - -```cpp -void step(float* r, const float* d_, int n) { - constexpr int nb = 4; - int na = (n + nb - 1) / nb; - int nab = na*nb; - - // input data, padded - std::vector d(n*nab, infty); - // input data, transposed, padded - std::vector t(n*nab, infty); - - #pragma omp parallel for - for (int j = 0; j < n; ++j) { - for (int i = 0; i < n; ++i) { - d[nab*j + i] = d_[n*j + i]; - t[nab*j + i] = d_[n*i + j]; - } - } - - #pragma omp parallel for - for (int i = 0; i < n; ++i) { - for (int j = 0; j < n; ++j) { - // vv[0] = result for k = 0, 4, 8, ... - // vv[1] = result for k = 1, 5, 9, ... - // vv[2] = result for k = 2, 6, 10, ... - // vv[3] = result for k = 3, 7, 11, ... - float vv[nb]; - for (int kb = 0; kb < nb; ++kb) { - vv[kb] = infty; - } - for (int ka = 0; ka < na; ++ka) { - for (int kb = 0; kb < nb; ++kb) { - float x = d[nab*i + ka * nb + kb]; - float y = t[nab*j + ka * nb + kb]; - float z = x + y; - vv[kb] = std::min(vv[kb], z); - } - } - // v = result for k = 0, 1, 2, ... - float v = infty; - for (int kb = 0; kb < nb; ++kb) { - v = std::min(vv[kb], v); - } - r[n*i + j] = v; - } - } -} -``` - ----- - -![](https://i.imgur.com/5uHVRL4.png =600x) - ---- - -## V3: Vectorization - -![](https://i.imgur.com/EG0WjHl.png =400x) - ----- - -```cpp -static inline float8_t min8(float8_t x, float8_t y) { - return x < y ? x : y; -} - -void step(float* r, const float* d_, int n) { - // elements per vector - constexpr int nb = 8; - // vectors per input row - int na = (n + nb - 1) / nb; - - // input data, padded, converted to vectors - float8_t* vd = float8_alloc(n*na); - // input data, transposed, padded, converted to vectors - float8_t* vt = float8_alloc(n*na); - - #pragma omp parallel for - for (int j = 0; j < n; ++j) { - for (int ka = 0; ka < na; ++ka) { - for (int kb = 0; kb < nb; ++kb) { - int i = ka * nb + kb; - vd[na*j + ka][kb] = i < n ? 
d_[n*j + i] : infty; - vt[na*j + ka][kb] = i < n ? d_[n*i + j] : infty; - } - } - } - - #pragma omp parallel for - for (int i = 0; i < n; ++i) { - for (int j = 0; j < n; ++j) { - float8_t vv = f8infty; - for (int ka = 0; ka < na; ++ka) { - float8_t x = vd[na*i + ka]; - float8_t y = vt[na*j + ka]; - float8_t z = x + y; - vv = min8(vv, z); - } - r[n*i + j] = hmin8(vv); - } - } - - std::free(vt); - std::free(vd); -} -``` - ----- - -![](https://i.imgur.com/R3OvLKO.png =600x) - ---- - -## V4: Register Reuse - -* At this point we are actually bottlenecked by memory -* It turns out that calculating one $r_{ij}$ at a time is not optimal -* We can reuse data that we read into registers to update other fields - ----- - -![](https://i.imgur.com/ljvD0ba.png =400x) - ----- - -```cpp -for (int ka = 0; ka < na; ++ka) { - float8_t y0 = vt[na*(jc * nd + 0) + ka]; - float8_t y1 = vt[na*(jc * nd + 1) + ka]; - float8_t y2 = vt[na*(jc * nd + 2) + ka]; - float8_t x0 = vd[na*(ic * nd + 0) + ka]; - float8_t x1 = vd[na*(ic * nd + 1) + ka]; - float8_t x2 = vd[na*(ic * nd + 2) + ka]; - vv[0][0] = min8(vv[0][0], x0 + y0); - vv[0][1] = min8(vv[0][1], x0 + y1); - vv[0][2] = min8(vv[0][2], x0 + y2); - vv[1][0] = min8(vv[1][0], x1 + y0); - vv[1][1] = min8(vv[1][1], x1 + y1); - vv[1][2] = min8(vv[1][2], x1 + y2); - vv[2][0] = min8(vv[2][0], x2 + y0); - vv[2][1] = min8(vv[2][1], x2 + y1); - vv[2][2] = min8(vv[2][2], x2 + y2); -} -``` - -Ugly, but worth it - ----- - -![](https://i.imgur.com/GZvIt8J.png =600x) - ---- - -## V5: More Register Reuse - -![](https://i.imgur.com/amUznoQ.png =400x) - ----- - -![](https://i.imgur.com/24nBJ1Y.png =600x) - ---- - -## V6: Software Prefetching - -![](https://i.imgur.com/zwqa1ZS.png =600x) - ---- - -## V7: Temporal Cache Locality - -![](https://i.imgur.com/29vTLKJ.png) - ----- - -### Z-Curve - -![](https://i.imgur.com/0optLZ3.png) - ----- - -![](https://i.imgur.com/U3GaO5b.png) - ---- +Then we can find all-pairs shortest distances in $O(\log n)$ steps -## Summary 
+(but recall that there are [more direct ways](https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm) to solve it) -* Deal with memory problems first (make sure data fits L3 cache) -* SIMD can get you ~10x speedup -* ILP can get you 2-3x speedup -* Multi-core parallelism can get you $NUM_CORES speedup - (and it can be just one `#pragma omp parallel for` away) +Which is an exercise. From 823b55298830685d3eaa57b2b28d10ea91c92de1 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 5 Apr 2022 17:33:05 +0300 Subject: [PATCH 372/531] matmul intro --- content/english/hpc/algorithms/matmul.md | 64 +++++++++++++++++++----- 1 file changed, 51 insertions(+), 13 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index 1a611a52..c092f138 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -4,17 +4,8 @@ weight: 20 draft: true --- -"[Anatomy of High-Performance Matrix Multiplication](https://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf)" by Kazushige Goto and Robert van de Geijn. - -Inspired by "[Programming Parallel Computers](http://ppc.cs.aalto.fi/ch2/)" course. - -For reasons that will later become aparent, we only use sizes that are multiples of $48$. 1920 - -Cache associativity strikes again. This is also an issue, but we will not address it for now. - -GCC 13. - -3.5s for 1025 ad 12s for 1024. + + +In this case study, we will design and implement several algorithms for matrix multiplication. We start with the naive "for-for-for" algorithm and incrementally improve it, eventually developing an implementation that is 50 times faster and matches the performance of BLAS libraries while being under 40 lines of C. + +We compile our implementations with GCC 13 and run them on Zen 2 clocked at 2GHz. 
+
+## Baseline
+
+The result of multiplying an $l \times n$ matrix $A$ by an $n \times m$ matrix $B$ is an $l \times m$ matrix $C$ calculated as:
 $$
-C_{ij} = \sum_{i=1}^{n} A_{ik} \cdot B_{kj}
+C_{ij} = \sum_{k=1}^{n} A_{ik} \cdot B_{kj}
 $$
-Implement the definition of what we need to do, but using arrays instead of matrices:
+For simplicity, we will only consider *square* matrices, where $l = m = n$.
+
+To implement matrix multiplication, we can just transfer this definition into code, but instead of two-dimensional arrays (aka matrices), we will be using one-dimensional arrays, to be explicit about memory addressing:
 ```c++
 void matmul(const float *a, const float *b, float *c, int n) {
@@ -42,6 +44,14 @@ void matmul(const float *a, const float *b, float *c, int n) {
 }
 ```
+For reasons that will become apparent later, we only use matrix sizes that are multiples of $48$ for benchmarking, but the implementations are still correct for all other sizes.
+
+Compiled with `g++ -O3 -march=native -funroll-loops`, this code runs in ~16.7s for $n = 1920$.
+
+[Cache associativity](/hpc/cpu-cache/associativity/) strikes again. This is also an issue, but we will not address it for now.
+
+3.5s for 1025 and 12s for 1024.
+
 Transpose:
 ```c++
@@ -113,6 +123,8 @@ void matmul(const float *_a, const float *_b, float *c, int n) {
 ## Theoretical Performance
+This CPU importantly supports the [FMA3](https://en.wikipedia.org/wiki/FMA_instruction_set) SIMD extension that we will utilize in the later implementations.
+
 $$
 \underbrace{4}_{CPUs} \cdot \underbrace{8}_{SIMD} \cdot \underbrace{2}_{1/thr} \cdot \underbrace{3.6 \cdot 10^9}_{cycles/sec} = 230.4 \; GFLOPS \;\; (2.3 \cdot 10^{11})
 $$
@@ -197,6 +209,20 @@ for (int i3 = 0; i3 < ny; i3 += s3)
 Which is fine, considering that this is not the only thing that CPUs are made for.
+```c++ +for (int i3 = 0; i3 < n; i3 += s3) + for (int i2 = 0; i2 < n; i2 += s2) + for (int i1 = 0; i1 < n; i1 += s1) + for (int x = i2; x < i2 + s2; x += 6) + for (int y = i3; y < i3 + s3; y += 16) + for (int k = i1; k < i1 + s1; k++) + for (int i = 0; i < 6; i++) + for (int j = 0; j < 2; j++) + c[x * n / 8 + i * n / 8 + y / 8 + j] + += (vector{} + a[x * n + i * n + k]) + * b[n / 8 * k + y / 8 + j]; +``` + ### Generalizations Given a matrix $D$, we need to calculate its "min-plus matrix multiplication" defined as: @@ -219,3 +245,15 @@ Then we can find all-pairs shortest distances in $O(\log n)$ steps (but recall that there are [more direct ways](https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm) to solve it) Which is an exercise. + +Strassen algorithm is only useful for large matrices. + +https://arxiv.org/pdf/1605.01078.pdf + +[cache-oblivious](/hpc/external-memory/oblivious/#matrix-multiplication) algorithms + +## Acknowledgements + +"[Anatomy of High-Performance Matrix Multiplication](https://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf)" by Kazushige Goto and Robert van de Geijn. + +Inspired by "[Programming Parallel Computers](http://ppc.cs.aalto.fi/ch2/)" course. 
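Note on the "Generalizations" section touched by the patch above: the min-plus product and the repeated-squaring idea it describes can be sketched in a few lines. This is a minimal, unoptimized illustration and not code from the patch series. The names `min_plus_square` and `all_pairs_shortest` are hypothetical, and the sketch assumes a row-major $n \times n$ array with zeros on the diagonal, so that a path with fewer edges is never lost when squaring.

```c++
#include <algorithm>
#include <limits>
#include <vector>

// One min-plus "squaring" step (hypothetical helper):
// r[i][j] = min over k of d[i][k] + d[k][j], with d stored row-major.
std::vector<float> min_plus_square(const std::vector<float> &d, int n) {
    const float inf = std::numeric_limits<float>::infinity();
    std::vector<float> r(n * n, inf);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            for (int k = 0; k < n; k++)
                r[i * n + j] = std::min(r[i * n + j], d[i * n + k] + d[k * n + j]);
    return r;
}

// Repeated squaring (hypothetical helper). Assumes d[i][i] == 0: the input
// covers paths of at most 1 edge, and each squaring doubles that bound.
// Shortest paths never need more than n - 1 edges, so we stop there.
std::vector<float> all_pairs_shortest(std::vector<float> d, int n) {
    for (int len = 1; len < n - 1; len *= 2)
        d = min_plus_square(d, n);
    return d;
}
```

For a three-vertex graph with edge weights 1 (between 0 and 1), 2 (between 1 and 2) and 10 (between 0 and 2), `all_pairs_shortest` should report a distance of 3 between vertices 0 and 2. Each squaring costs $O(n^3)$, so the whole procedure is $O(n^3 \log n)$, which is why the text points to Floyd-Warshall as the more direct route.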
From ab322dd710898564821b2f58cf84ada7e171f845 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Tue, 5 Apr 2022 19:39:23 +0300
Subject: [PATCH 373/531] transposet matmul

---
 .../hpc/algorithms/img/column-major.jpg | Bin 0 -> 22004 bytes
 content/english/hpc/algorithms/matmul.md | 61 +++++++++++++-----
 2 files changed, 46 insertions(+), 15 deletions(-)
 create mode 100644 content/english/hpc/algorithms/img/column-major.jpg

diff --git a/content/english/hpc/algorithms/img/column-major.jpg b/content/english/hpc/algorithms/img/column-major.jpg
new file mode 100644
index 0000000000000000000000000000000000000000..675d0b856231c263b12d3a246ab6216fe02af3f0
GIT binary patch
literal 22004
[base85-encoded JPEG payload omitted]

literal 0
HcmV?d00001

diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md
index c092f138..7102bc8c 100644
--- a/content/english/hpc/algorithms/matmul.md
+++ b/content/english/hpc/algorithms/matmul.md
@@ -5,8 +5,6 @@ draft: true
 ---
 ```c++
 void matmul(const float *a, const float *_b, float *c, int n) {
@@ -65,16 +71,26 @@ void matmul(const float *a, const float *_b, float *c, int n) {
     for (int i = 0; i < n; i++)
         for (int j = 0; j < n; j++)
             for (int k = 0; k < n; k++)
-                c[i * n + j] += a[i * n + k] * b[j * n + k]; // notice indices
+                c[i * n + j] += a[i * n + k] * b[j * n + k]; // <- note the indices
 }
 ```
+This code runs in ~12.4s, or about 30% faster.
As we will see in a bit, there are more important benefits to transposing it than just the sequential memory reads. + +## Vectorization + +/hpc/compilation/contracts/#memory-aliasing + +/hpc/simd/auto-vectorization/ + ```c++ void matmul(const float *a, const float *_b, float * __restrict__ c, int n) { // ... } ``` +![](../img/mm-vectorized-barplot.svg) + ```c++ const int B = 8; // number of elements in a vector const int vecsize = B * sizeof(float); // size of a vector in bytes @@ -117,20 +133,27 @@ void matmul(const float *_a, const float *_b, float *c, int n) { } ``` -![](../img/mm-vectorized-barplot.svg) - ![](../img/mm-vectorized-plot.svg) -## Theoretical Performance +[memory bandwidth](/hpc/cpu-cache/bandwidth/) is not the problem. -This CPU importantly supports the [FMA3](https://en.wikipedia.org/wiki/FMA_instruction_set) SIMD extension that we will utilize in the later implementations. +[Cache associativity](/hpc/cpu-cache/associativity/) strikes again. This is also an issue, but we will not address it for now. -$$ -\underbrace{4}_{CPUs} \cdot \underbrace{8}_{SIMD} \cdot \underbrace{2}_{1/thr} \cdot \underbrace{3.6 \cdot 10^9}_{cycles/sec} = 230.4 \; GFLOPS \;\; (2.3 \cdot 10^{11}) -$$ +$1920 = 2^7 \times 3 \times 5$, so it is divisible by a large power of two. + +Slightly slower than. + +3.5s for 1025 ad 12s for 1024. + +However, now we *really* hit the memory limit. + +## Register reuse + +This CPU importantly supports the [FMA3](https://en.wikipedia.org/wiki/FMA_instruction_set) SIMD extension that we will utilize in later implementations. RAM bandwidth is lower than that +The latency of FMA is 5 cycles, while its reciprocal throughput is ½. 
```c++ void kernel(float *a, vector *b, vector *c, int x, int y, int l, int r, int n) { @@ -203,10 +226,18 @@ for (int i3 = 0; i3 < ny; i3 += s3) ![](../img/mm-blocked-barplot.svg) +Avoid moving anything: + ![](../img/mm-noalloc.svg) +$$ +\underbrace{4}_{CPUs} \cdot \underbrace{8}_{SIMD} \cdot \underbrace{2}_{1/thr} \cdot \underbrace{3.6 \cdot 10^9}_{cycles/sec} = 230.4 \; GFLOPS \;\; (2.3 \cdot 10^{11}) +$$ + ![](../img/mm-blas.svg) +We hit about 95. + Which is fine, considering that this is not the only thing that CPUs are made for. ```c++ From f08193cceb6cfd7563d2d558ab77533e6c4b6ded Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 5 Apr 2022 21:43:45 +0300 Subject: [PATCH 374/531] vectorized matmul --- content/english/hpc/algorithms/matmul.md | 147 +++++++++++++++++------ 1 file changed, 109 insertions(+), 38 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index 7102bc8c..09184bce 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -44,7 +44,7 @@ void matmul(const float *a, const float *b, float *c, int n) { For reasons that will become apparent later, we only use matrix sizes that are multiples of $48$ for benchmarking, but the implementations are still correct for all other sizes. We also use [32-bit floats](/hpc/arithmetic/ieee-754) specifically, although it can be [generalized](#generalizations) to other types and operations. -Compiled with `g++ -O3 -march=native -funroll-loops`, the naive approach multiplies two matrices of size $n = 1920$ in ~16.7 seconds. That is approximately $\frac{1920^3}{16.7 \times 10^9} \approx 0.42$ useful operations per nanosecond (GFLOPS), or roughly 5 CPU cycles per multiplication. +Compiled with `g++ -O3 -march=native -ffast-math -funroll-loops`, the naive approach multiplies two matrices of size $n = 1920 = 48 \times 40$ in ~16.7 seconds. 
That is approximately $\frac{1920^3}{16.7 \times 10^9} \approx 0.42$ useful operations per nanosecond (GFLOPS), or roughly 5 CPU cycles per multiplication. ## Transposition @@ -79,76 +79,132 @@ This code runs in ~12.4s, or about 30% faster. As we will see in a bit, there ar ## Vectorization -/hpc/compilation/contracts/#memory-aliasing +Now that we are just sequentially reading the elements of `a` and `b`, multiplying them, and adding the result to an accumulator variable, we can use [SIMD](/hpc/simd/) instructions to speed it up like [any other reduction](/hpc/simd/reduction/). -/hpc/simd/auto-vectorization/ +We can use [GCC vector types](/hpc/simd/intrinsics/#gcc-vector-extensions) to implement it: ```c++ -void matmul(const float *a, const float *_b, float * __restrict__ c, int n) { - // ... -} -``` +// a vector of 256 / 32 = 8 floats +typedef float vec __attribute__ (( vector_size(32) )); -![](../img/mm-vectorized-barplot.svg) - -```c++ -const int B = 8; // number of elements in a vector -const int vecsize = B * sizeof(float); // size of a vector in bytes -typedef float vector __attribute__ (( vector_size(vecsize) )); - -vector* alloc(int n) { - vector* ptr = (vector*) std::aligned_alloc(vecsize, vecsize * n); - memset(ptr, 0, vecsize * n); +// helper function that allocates n vectors and initializes them with zeros +vec* alloc(int n) { + vec* ptr = (vec*) std::aligned_alloc(32, 32 * n); + memset(ptr, 0, 32 * n); return ptr; } -float hsum(vector s) { - float res = 0; - for (int i = 0; i < B; i++) - res += s[i]; - return res; -} - void matmul(const float *_a, const float *_b, float *c, int n) { - int nB = (n + B - 1) / B; + // first, we need to align rows and pad them with zeros + int nB = (n + 7) / 8; // number of 8-element vectors in a row (rounded up) - vector *a = alloc(n * nB); - vector *b = alloc(n * nB); + vec *a = alloc(n * nB); + vec *b = alloc(n * nB); + // move both matrices for (int i = 0; i < n; i++) { for (int j = 0; j < n; j++) { a[i * nB + j / 8][j 
% 8] = _a[i * n + j]; - b[i * nB + j / 8][j % 8] = _b[j * n + i]; // <- still transposed + b[i * nB + j / 8][j % 8] = _b[j * n + i]; // <- b is still transposed } } for (int i = 0; i < n; i++) { for (int j = 0; j < n; j++) { - vector s = {0}; + vec s{}; // initialize the accumulator with zeros + + // vertical summation for (int k = 0; k < nB; k++) s += a[i * nB + k] * b[j * nB + k]; - c[i * n + j] = hsum(s); + + // horizontal summation + for (int k = 0; k < 8; k++) + c[i * n + j] += s[k]; } } } ``` +The performance for $n = 1920$ is now around 2.3 GFLOPS, or another ~4 times higher. + +![](../img/mm-vectorized-barplot.svg) + +This optimization looks neither too complex nor specific to matrix multiplication. Why can't the compiler simply [auto-vectorize](/hpc/simd/auto-vectorization/) the inner loop? It actually can: the only thing preventing that is the possibility that `c` overlaps with either `a` or `b`. The only thing that you need to do is to guarantee that `c` is not [aliased](/hpc/compilation/contracts/#memory-aliasing) with anything by adding the `__restrict__` keyword to it: + + + +```c++ +void matmul(const float *a, const float *_b, float * __restrict__ c, int n) { + // ... +} +``` + +Both manually and auto-vectorized implementations perform roughly the same. + +## Memory efficiency + +Now, what is interesting is that the implementation efficiency depends on the problem size: + ![](../img/mm-vectorized-plot.svg) [memory bandwidth](/hpc/cpu-cache/bandwidth/) is not the problem. [Cache associativity](/hpc/cpu-cache/associativity/) strikes again. This is also an issue, but we will not address it for now. +You can see an even more noticeable dip at $1536 = 2^9 \times 3$. + $1920 = 2^7 \times 3 \times 5$, so it is divisible by a large power of two. Slightly slower than. 3.5s for 1025 ad 12s for 1024. +Now it is clear that we are really bottlenecked by the memory system. + However, now we *really* hit the memory limit. 
## Register reuse +If we + +Here is a proof of concept: + +```c++ +void update(int x, int y) { + int c00 = 0, c01 = 0, c10 = 0, c11 = 0; + + for (int k = 0; k < n; k++) { + int a0 = a[x][k]; + int a1 = a[x + 1][k]; + + int b0 = b[k][y]; + int b1 = b[k][y + 1]; + + c00 += a0 * b0; + c01 += a0 * b0; + c10 += a0 * b0; + c11 += a1 * b1; + } + + c[x][y] += c00; + c[x][y + 1] += c01; + c[x + 1][y] += c10; + c[x + 1][y + 1] += c11; +} +``` + +Before, we were reading $2 n$ elements to update one cell, and now we are reading $4n$ elements to update four cells: that is $\frac{2n / 1}{4n / 4} = 2$ times better in terms of I/O efficiency. + +It also boosts instruction-level parallelism and saves some instructions from execcuting the read instructions. + +We are not going to really try it. Instead, we will generalize it right away. + +Of course, this would not beat SIMD. + +## Micro-kernel + +*micro-kernel*. + This CPU importantly supports the [FMA3](https://en.wikipedia.org/wiki/FMA_instruction_set) SIMD extension that we will utilize in later implementations. RAM bandwidth is lower than that @@ -156,14 +212,14 @@ RAM bandwidth is lower than that The latency of FMA is 5 cycles, while its reciprocal throughput is ½. 
```c++ -void kernel(float *a, vector *b, vector *c, int x, int y, int l, int r, int n) { - vector t[6][2]{}; +void kernel(float *a, vec *b, vec *c, int x, int y, int l, int r, int n) { + vec t[6][2]{}; // will be stored in ymm registers for (int k = l; k < r; k++) { for (int i = 0; i < 6; i++) { - vector alpha = vector{} + a[(x + i) * n + k]; + vec alpha = vec{} + a[(x + i) * n + k]; // broadcast for (int j = 0; j < 2; j++) - t[i][j] += alpha * b[(k * n + y) / 8 + j]; + t[i][j] += alpha * b[(k * n + y) / 8 + j]; // fused multiply-add } } @@ -173,6 +229,8 @@ void kernel(float *a, vector *b, vector *c, int x, int y, int l, int r, int n) { } ``` +## Macro-kernel + ```c++ void matmul(const float *_a, const float *_b, float *_c, int n) { int nx = (n + 5) / 6 * 6; @@ -189,7 +247,7 @@ void matmul(const float *_a, const float *_b, float *_c, int n) { for (int x = 0; x < nx; x += 6) for (int y = 0; y < ny; y += 16) - kernel(a, (vector*) b, (vector*) c, x, y, 0, n, ny); + kernel(a, (vec*) b, (vec*) c, x, y, 0, n, ny); for (int i = 0; i < n; i++) memcpy(&_c[i * n], &c[i * ny], 4 * n); @@ -204,6 +262,10 @@ void matmul(const float *_a, const float *_b, float *_c, int n) { ![](../img/mm-kernel-plot.svg) +There is still a memory bandwidth problem. 
+ +## Blocking + ```c++ const int s3 = 64; const int s2 = 120; @@ -219,7 +281,7 @@ for (int i3 = 0; i3 < ny; i3 += s3) // with [l:r] = [i1:i1+s1] for (int x = i2; x < std::min(i2 + s2, nx); x += 6) for (int y = i3; y < std::min(i3 + s3, ny); y += 16) - kernel(a, (vector*) b, (vector*) c, x, y, i1, std::min(i1 + s1, n), ny); + kernel(a, (vec*) b, (vec*) c, x, y, i1, std::min(i1 + s1, n), ny); ``` ![](../img/mm-blocked-plot.svg) @@ -234,6 +296,13 @@ $$ \underbrace{4}_{CPUs} \cdot \underbrace{8}_{SIMD} \cdot \underbrace{2}_{1/thr} \cdot \underbrace{3.6 \cdot 10^9}_{cycles/sec} = 230.4 \; GFLOPS \;\; (2.3 \cdot 10^{11}) $$ +(and also getting rid of `std::min` in the macro-kernel) + + +[https://www.openblas.net/](OpenBLAS) + +[numpy](/hpc/complexity/languages/#blas) + ![](../img/mm-blas.svg) We hit about 95. @@ -250,11 +319,13 @@ for (int i3 = 0; i3 < n; i3 += s3) for (int i = 0; i < 6; i++) for (int j = 0; j < 2; j++) c[x * n / 8 + i * n / 8 + y / 8 + j] - += (vector{} + a[x * n + i * n + k]) + += (vec{} + a[x * n + i * n + k]) * b[n / 8 * k + y / 8 + j]; ``` -### Generalizations +Register spilling. 
+ +## Generalizations Given a matrix $D$, we need to calculate its "min-plus matrix multiplication" defined as: From c8860104c5e7610c7e03eddda174ade8712f7100 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 5 Apr 2022 23:31:52 +0300 Subject: [PATCH 375/531] matmul memory efficiency --- content/english/hpc/algorithms/matmul.md | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index 09184bce..09bb6f29 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -122,6 +122,9 @@ void matmul(const float *_a, const float *_b, float *c, int n) { c[i * n + j] += s[k]; } } + + std::free(a); + std::free(b); } ``` @@ -147,21 +150,13 @@ Now, what is interesting is that the implementation efficiency depends on the pr ![](../img/mm-vectorized-plot.svg) -[memory bandwidth](/hpc/cpu-cache/bandwidth/) is not the problem. - -[Cache associativity](/hpc/cpu-cache/associativity/) strikes again. This is also an issue, but we will not address it for now. - -You can see an even more noticeable dip at $1536 = 2^9 \times 3$. - -$1920 = 2^7 \times 3 \times 5$, so it is divisible by a large power of two. - -Slightly slower than. +First, the performance (in terms of useful operations per second) increases, as the overhead of the loop management and horizontal reduction decreases. However, at around $n=256$, it starts smoothly decreasing as the matrices stop fitting into the [cache](/hpc/cpu-cache/) ($2 \times 256^2 \times 4 = 512$ KB is the size of the L2 cache), and the performance becomes bottlenecked by the [memory bandwidth](/hpc/cpu-cache/bandwidth/). -3.5s for 1025 ad 12s for 1024. 
+It is also interesting that the naive implementation is mostly on par with the non-vectorized transposed version — and actually slightly better because of the transpose itself — for all but few data points, where the performance deteriorates. This is because of [cache associativity](/hpc/cpu-cache/associativity/): when $n$ is divisible by a large power of two, we are fetching addresses of `b` that all likely map to the same cache line, reducing the effective cache size. This explains the 30% performance dip for $n = 1920 = 2^7 \times 3 \times 5$, and you can see an even more noticeable one for $1536 = 2^9 \times 3$: it is roughly 3 times slower than for $n=1535$. -Now it is clear that we are really bottlenecked by the memory system. +One may think that there would be at least some general performance gain from full sequential reads since we are fetching fewer cache lines, but this is not the case: fetching the first column of `b` is painful, but the next 15 columns will actually be in the same cache lines as the first one, so they will be cached — unless the matrix is so large that it can't even fit `n * cache_line_size` bytes into the cache, which is not the case for all practical problem sizes. -However, now we *really* hit the memory limit. +So, counterintuitively, transposing the matrix doesn't help the memory bandwidth — and in the naive implementation, we are not really bottlenecked by it anyway. But for our vectorize implementation, we certainly are, so let's tackle it. 
## Register reuse From 376d46a118aed91aa8d78a279c4ba327f4bd3ab5 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 5 Apr 2022 23:50:40 +0300 Subject: [PATCH 376/531] matmul register reuse --- content/english/hpc/algorithms/matmul.md | 45 ++++++++++++++---------- 1 file changed, 26 insertions(+), 19 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index 09bb6f29..e90abf3b 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -144,13 +144,15 @@ void matmul(const float *a, const float *_b, float * __restrict__ c, int n) { Both manually and auto-vectorized implementations perform roughly the same. +The performance is bottlenecked by using a single variable. We could use multiple variables similar to other reductions, but we will solve it later anyway. + ## Memory efficiency -Now, what is interesting is that the implementation efficiency depends on the problem size: +Now, what is interesting is that the implementation efficiency depends on the problem size. -![](../img/mm-vectorized-plot.svg) +At first, the performance (in terms of useful operations per second) increases, as the overhead of the loop management and horizontal reduction decreases. Then, at around $n=256$, it starts smoothly decreasing as the matrices stop fitting into the [cache](/hpc/cpu-cache/) ($2 \times 256^2 \times 4 = 512$ KB is the size of the L2 cache), and the performance becomes bottlenecked by the [memory bandwidth](/hpc/cpu-cache/bandwidth/). -First, the performance (in terms of useful operations per second) increases, as the overhead of the loop management and horizontal reduction decreases. However, at around $n=256$, it starts smoothly decreasing as the matrices stop fitting into the [cache](/hpc/cpu-cache/) ($2 \times 256^2 \times 4 = 512$ KB is the size of the L2 cache), and the performance becomes bottlenecked by the [memory bandwidth](/hpc/cpu-cache/bandwidth/). 
+![](../img/mm-vectorized-plot.svg) It is also interesting that the naive implementation is mostly on par with the non-vectorized transposed version — and actually slightly better because of the transpose itself — for all but few data points, where the performance deteriorates. This is because of [cache associativity](/hpc/cpu-cache/associativity/): when $n$ is divisible by a large power of two, we are fetching addresses of `b` that all likely map to the same cache line, reducing the effective cache size. This explains the 30% performance dip for $n = 1920 = 2^7 \times 3 \times 5$, and you can see an even more noticeable one for $1536 = 2^9 \times 3$: it is roughly 3 times slower than for $n=1535$. @@ -160,13 +162,20 @@ So, counterintuitively, transposing the matrix doesn't help the memory bandwidth ## Register reuse -If we +To compute the cell $C[i][j]$, we need to compute the dot product of $A[i][:]$ and $B[:][j]$ (we are using the Python notation here to select rows and columns), which requires fetching $2n$ elements, even when $B$ is stored in column-major order. + +What if we were to compute $C[i:i+2][j:j+2]$, a $2 \times 2$ submatrix of $C$? We would need $A[i:i+2][:]$ and $B[:][j:j+2]$, which is $4n$ elements in total: that is $\frac{2n / 1}{4n / 4} = 2$ times better in terms of I/O efficiency. + +To actually avoid reading more data, we need to read these $2+2$ rows and columns in parallel and update all $2 \times 2$ cells at once using all possible combinations of products. 
Here is a proof of concept: ```c++ -void update(int x, int y) { - int c00 = 0, c01 = 0, c10 = 0, c11 = 0; +void update_2x2(int x, int y) { + int c00 = c[x][y], + c01 = c[x][y + 1], + c10 = c[x + 1][y], + c11 = c[x + 1][y + 1]; for (int k = 0; k < n; k++) { int a0 = a[x][k]; @@ -176,25 +185,21 @@ void update(int x, int y) { int b1 = b[k][y + 1]; c00 += a0 * b0; - c01 += a0 * b0; - c10 += a0 * b0; + c01 += a0 * b1; + c10 += a1 * b0; c11 += a1 * b1; } - c[x][y] += c00; - c[x][y + 1] += c01; - c[x + 1][y] += c10; - c[x + 1][y + 1] += c11; + c[x][y] = c00; + c[x][y + 1] = c01; + c[x + 1][y] = c10; + c[x + 1][y + 1] = c11; } ``` -Before, we were reading $2 n$ elements to update one cell, and now we are reading $4n$ elements to update four cells: that is $\frac{2n / 1}{4n / 4} = 2$ times better in terms of I/O efficiency. - -It also boosts instruction-level parallelism and saves some instructions from execcuting the read instructions. +It also boosts instruction-level parallelism (we don't have to wait between iterations to update the loop state) and saves some cycles from executing the read instructions. -We are not going to really try it. Instead, we will generalize it right away. - -Of course, this would not beat SIMD. +Of course, although better in terms of I/O, this $2 \times 2$ update would not beat our vectorized implementation, so we are not going to try this version in particular and instead will scale the idea right away. ## Micro-kernel @@ -224,7 +229,7 @@ void kernel(float *a, vec *b, vec *c, int x, int y, int l, int r, int n) { } ``` -## Macro-kernel +The rest of the implementaiton: ```c++ void matmul(const float *_a, const float *_b, float *_c, int n) { @@ -261,6 +266,8 @@ There is still a memory bandwidth problem. 
## Blocking +*Macro-kernel* + ```c++ const int s3 = 64; const int s2 = 120; From 4491d76d98814d18f62aa498eababe0bd87f7ac3 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 6 Apr 2022 15:17:30 +0300 Subject: [PATCH 377/531] matmul kernel --- content/english/hpc/algorithms/matmul.md | 42 +++++++++++++++++------- 1 file changed, 31 insertions(+), 11 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index e90abf3b..1f6abce0 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -162,6 +162,8 @@ So, counterintuitively, transposing the matrix doesn't help the memory bandwidth ## Register reuse +Any two cells of A and B are used to update some cell of C. + To compute the cell $C[i][j]$, we need to compute the dot product of $A[i][:]$ and $B[:][j]$ (we are using the Python notation here to select rows and columns), which requires fetching $2n$ elements, even when $B$ is stored in column-major order. What if we were to compute $C[i:i+2][j:j+2]$, a $2 \times 2$ submatrix of $C$? We would need $A[i:i+2][:]$ and $B[:][j:j+2]$, which is $4n$ elements in total: that is $\frac{2n / 1}{4n / 4} = 2$ times better in terms of I/O efficiency. @@ -201,17 +203,22 @@ It also boosts instruction-level parallelism (we don't have to wait between iter Of course, although better in terms of I/O, this $2 \times 2$ update would not beat our vectorized implementation, so we are not going to try this version in particular and instead will scale the idea right away. -## Micro-kernel +## Designing the kernel -*micro-kernel*. +We follow this approach and design a general kernel that updates a $h \times w$ submatrix of C using columns from $l$ to $r$ of $A$ and rows from $l$ to $r$ of $B$ (i. e. not a full computation, but only a partial update — it will be clear why later). 
We have several considerations: -This CPU importantly supports the [FMA3](https://en.wikipedia.org/wiki/FMA_instruction_set) SIMD extension that we will utilize in later implementations. +- In general, if we are updating an $h \times w$ submatrix, we will be fetching $2 \cdot n \cdot (h + w)$ elements to update $h \cdot w$ elements. We want that ratio of $\frac{h \cdot w}{2 \cdot n \cdot (h + w)}$ to be as high as possible, which is achieved with large square-ish submatrices. +- We want to use the [FMA](https://en.wikipedia.org/wiki/FMA_instruction_set) ("fused multiply-add") instructions that are available on all modern x86 architectures. As you can guess from the name, they perform a vector `c += a * b` operation in one go, which is the core of our computation. +- We want to be able to exploit [instruction-level parallelism](/hpc/pipelining/) to achieve better utilization of this instruction. On Zen 2, the `fma` instruction has a latency of 5 cycles and a throughput of 2 per cycle, meaning that we need to concurrently execute at least $5 \times 2 = 10$ of them to fully saturate its execution ports. +- We only have $16$ logical vector registers that we can use as accumulators, and we want to avoid register spill. -RAM bandwidth is lower than that +For these reasons, we settle on a $6 \times 16$ kernel. We process $96$ elements at once, which can be stored in $6 \times 2 = 12$ vector registers (we need some more to store temporary values). We [broadcast](/hpc/simd/moving/#broadcast) an element of A, and then use it to update the first row ($8 + 8$ elements). Then we load the one below it, and so on. When we have updated the last row, we move to the next $6$ elements to the right. -The latency of FMA is 5 cycles, while its reciprocal throughput is ½. 
+The final implementation is simpler than it sounds: ```c++ +// update 6x16 submatrix C[x:x+6][y:y+16] +// using A[x:x+6][l:r] and B[l:r][y:y+16] void kernel(float *a, vec *b, vec *c, int x, int y, int l, int r, int n) { vec t[6][2]{}; // will be stored in ymm registers @@ -229,10 +236,15 @@ void kernel(float *a, vec *b, vec *c, int x, int y, int l, int r, int n) { } ``` -The rest of the implementaiton: +We need `t` so that the compiler stores these elements in vector registers. We could just update the final destinations, but unfortunately, the compiler writes them back to memory, causing a huge slowdown, and wrapping everything in `__restrict__` keywords doesn't help. + +The rest of the implementation is straightforward. Similar to the previous vectorized implementation, we just allocate aligned arrays and call the kernel instead of the innermost loop: ```c++ void matmul(const float *_a, const float *_b, float *_c, int n) { + // to avoid implementing partials, + // we pad the height to a multiple of 6 and the width to a multiple of 16 + int nx = (n + 5) / 6 * 6; int ny = (n + 15) / 16 * 16; @@ -242,7 +254,7 @@ void matmul(const float *_a, const float *_b, float *_c, int n) { for (int i = 0; i < n; i++) { memcpy(&a[i * ny], &_a[i * n], 4 * n); - memcpy(&b[i * ny], &_b[i * n], 4 * n); + memcpy(&b[i * ny], &_b[i * n], 4 * n); // we don't need to transpose b this time } for (int x = 0; x < nx; x += 6) @@ -258,15 +270,19 @@ void matmul(const float *_a, const float *_b, float *_c, int n) { } ``` +This improves the performance by another ~40%: + ![](../img/mm-kernel-barplot.svg) +The speedup is much better (2-3x) on smaller arrays, indicating that there is still a bandwidth problem: + ![](../img/mm-kernel-plot.svg) -There is still a memory bandwidth problem. 
+If you've read the section on [cache-oblivious algorithms](/hpc/external-memory/oblivious/), you know that one universal solution to these types of things is to split matrices in four parts, do eight recursive block matrix multiplications until the matrix fits into cache, and carefully combine the results together. We will follow a different, simpler approach. ## Blocking -*Macro-kernel* +Note that we are reading. ```c++ const int s3 = 64; @@ -286,6 +302,8 @@ for (int i3 = 0; i3 < ny; i3 += s3) kernel(a, (vec*) b, (vec*) c, x, y, i1, std::min(i1 + s1, n), ny); ``` +This part is sometimes called *macro-kernel* (as opposed to the *micro-kernel* that only updates a 6x16 submatrix). + ![](../img/mm-blocked-plot.svg) ![](../img/mm-blocked-barplot.svg) @@ -294,8 +312,10 @@ Avoid moving anything: ![](../img/mm-noalloc.svg) +The theoretical performance limit is: + $$ -\underbrace{4}_{CPUs} \cdot \underbrace{8}_{SIMD} \cdot \underbrace{2}_{1/thr} \cdot \underbrace{3.6 \cdot 10^9}_{cycles/sec} = 230.4 \; GFLOPS \;\; (2.3 \cdot 10^{11}) +\underbrace{8}_{SIMD} \cdot \underbrace{2}_{1/thr} \cdot \underbrace{2 \cdot 10^9}_{cycles/sec} = 32 \; GFLOPS \;\; (3.2 \cdot 10^{10}) $$ (and also getting rid of `std::min` in the macro-kernel) @@ -325,7 +345,7 @@ for (int i3 = 0; i3 < n; i3 += s3) * b[n / 8 * k + y / 8 + j]; ``` -Register spilling. +(Assuming that we are in 2050 and using the 35th version of GCC, which finally manages not to screw up register spilling.) 
## Generalizations From 02aba431230500c968e972019b7aa92f0e03ebfb Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 6 Apr 2022 15:52:04 +0300 Subject: [PATCH 378/531] matmul cache blocking --- content/english/hpc/algorithms/matmul.md | 40 ++++++++++++++++++++---- 1 file changed, 34 insertions(+), 6 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index 1f6abce0..9acbc61a 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -282,12 +282,24 @@ If you've read the section on [cache-oblivious algorithms](/hpc/external-memory/ ## Blocking -Note that we are reading. +An alternative to divide-and-conquer is *cache blocking*: selecting a subset of data, processing it, and then moving on to the next block. Sometimes blocking is hierarchical: we first select a block of data that fits into the L3 cache, then we split it into blocks that fit into the L2 cache, and so on. + +It is less trivial to do for matrices than for arrays, but the trick is like this: + +- Let's select a subset of B that fits into the L3 cache (say, a subset of its columns). +- Now, let's select a submatrix of A that fits into the L2 cache (a subset of its rows). +- Select a submatrix of the previously selected submatrix of B (a subset of its rows) that fits into the L1 cache, and use it to do the kernel update. + +Here is a good [visualization](https://jukkasuomela.fi/cache-blocking-demo/) by Jukka Suomela (it shows different approaches; we use the last one). + +We could have started with A, but this would be slower. Note that during the kernel execution, we read the elements of $A$ at a much lower rate than the elements of $B$: we fetch and broadcast just one element of $A$ and then multiply it with $16$ elements of $B$, so it is more important to keep $B$ in the fastest cache, which is why the last stage selects a block of $B$. 
+ +We can implement it with three more outer `for` loops: ```c++ -const int s3 = 64; -const int s2 = 120; -const int s1 = 240; +const int s3 = 64; // how many columns of B to select +const int s2 = 120; // how many rows of A to select +const int s1 = 240; // how many rows of B to select for (int i3 = 0; i3 < ny; i3 += s3) // now we are working with b[:][i3:i3+s3] @@ -302,13 +314,29 @@ for (int i3 = 0; i3 < ny; i3 += s3) kernel(a, (vec*) b, (vec*) c, x, y, i1, std::min(i1 + s1, n), ny); ``` -This part is sometimes called *macro-kernel* (as opposed to the *micro-kernel* that only updates a 6x16 submatrix). +These outer `for` loops are sometimes called *macro-kernel* (as opposed to the *micro-kernel* that only updates a 6x16 submatrix). + +It completely removes the memory bottleneck: ![](../img/mm-blocked-plot.svg) +The performance is no longer seriously affected by the problem size: + ![](../img/mm-blocked-barplot.svg) -Avoid moving anything: +Notice the dip at $1536$ is still there. Cache associativity affects the effective cache size. We need to adjust the step constants or insert holes into the layout to mitigate this. + +## Optimization + +We need a few more optimizations to reach the performance limit: + +- Remove memory allocation and just operate on the arrays that we are given (note that we don't need to do anything with `a` as we are reading just one element at a time, and we can use unaligned `store` for `c` as we only use it rarely). +- Get rid of the `std::min` so that the size parameters are mostly constant and can be embedded into the machine code. +- Rewrite the micro-kernel using 12 variables (the compiler seems to have a problem with keeping them fully in registers). + +Effectively supporting weird sizes requires a bit more work, and this is the reason why we benchmarked at an array sizes that are divisible by $48 = \frac{6 \cdot 16}{\gcd(6, 16)}$. 
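To make the divisibility requirement concrete, here is a minimal sketch showing that $48$ is the least common multiple of the kernel height ($6$) and width ($16$). The names `gcd_of`, `block`, and `round_up` are hypothetical and are not part of the benchmarked code:

```c++
// Kernel dimensions used in this case study: 6 rows by 16 columns.
constexpr int kernel_h = 6;
constexpr int kernel_w = 16;

// Euclid's algorithm (hypothetical helper, written out to avoid depending on C++17)
constexpr int gcd_of(int a, int b) {
    return b == 0 ? a : gcd_of(b, a % b);
}

// Smallest size divisible by both kernel dimensions: 6 * 16 / gcd(6, 16)
constexpr int block = kernel_h * kernel_w / gcd_of(kernel_h, kernel_w);

// Hypothetical helper: round n up to the nearest multiple of m
constexpr int round_up(int n, int m) {
    return (n + m - 1) / m * m;
}
```

A size such as $n = 1920$ is already a multiple of `block`, so no padding is needed for it, while an arbitrary size would have to be rounded up first.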
+ +Avoiding moving anything pays off: ![](../img/mm-noalloc.svg) From 6553b3f085132e827b2dd207ab702927b8ca89db Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 6 Apr 2022 16:04:13 +0300 Subject: [PATCH 379/531] matmul optimization --- content/english/hpc/algorithms/matmul.md | 23 +++++++++-------------- 1 file changed, 9 insertions(+), 14 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index 9acbc61a..30bdd36c 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -332,32 +332,27 @@ We need a few more optimizations to reach the performance limit: - Remove memory allocation and just operate on the arrays that we are given (note that we don't need to do anything with `a` as we are reading just one element at a time, and we can use unaligned `store` for `c` as we only use it rarely). - Get rid of the `std::min` so that the size parameters are mostly constant and can be embedded into the machine code. -- Rewrite the micro-kernel using 12 variables (the compiler seems to have a problem with keeping them fully in registers). +- Rewrite the micro-kernel by hand using 12 variables (the compiler seems to have a problem with keeping them fully in registers). Effectively supporting weird sizes requires a bit more work, and this is the reason why we benchmarked at an array sizes that are divisible by $48 = \frac{6 \cdot 16}{\gcd(6, 16)}$. -Avoiding moving anything pays off: +Avoiding moving anything pays off. 
These improvements sum up and give us a 50% improvement: ![](../img/mm-noalloc.svg) -The theoretical performance limit is: +We are actually not that far from the theoretical performance limit — which can be calculated as the SIMD lane width times the fma instruction throughput times the clock frequency: $$ -\underbrace{8}_{SIMD} \cdot \underbrace{2}_{1/thr} \cdot \underbrace{2 \cdot 10^9}_{cycles/sec} = 32 \; GFLOPS \;\; (3.2 \cdot 10^{10}) +\underbrace{8}_{SIMD} \cdot \underbrace{2}_{thr.} \cdot \underbrace{2 \cdot 10^9}_{cycles/sec} = 32 \; GFLOPS \;\; (3.2 \cdot 10^{10}) $$ -(and also getting rid of `std::min` in the macro-kernel) - - -[https://www.openblas.net/](OpenBLAS) - -[numpy](/hpc/complexity/languages/#blas) +A more realistic comparison is some practical library, such as [OpenBLAS](https://www.openblas.net/). We just call it from Python using [numpy](/hpc/complexity/languages/#blas), so there may be some minor overhead, but reaching 80% of theoretical performance seems plausible (matrix multiplication is not the only thing that CPUs are made for): ![](../img/mm-blas.svg) -We hit about 95. +We've reached ~93% of BLAS and ~75% of the theoretical performance limit. Which is really great for what is basically 40 lines of C. -Which is fine, considering that this is not the only thing that CPUs are made for. +Interestingly, the whole thing can be rolled into one large `for` loop: ```c++ for (int i3 = 0; i3 < n; i3 += s3) @@ -406,6 +401,6 @@ https://arxiv.org/pdf/1605.01078.pdf ## Acknowledgements -"[Anatomy of High-Performance Matrix Multiplication](https://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf)" by Kazushige Goto and Robert van de Geijn. +The algorithm was originally designed by Kazushige Goto, and it is the basis of GotoBLAS and OpenBLAS. The author himself described it and some other aspects in more detail in "[Anatomy of High-Performance Matrix Multiplication](https://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf)". 
-Inspired by "[Programming Parallel Computers](http://ppc.cs.aalto.fi/ch2/)" course. +The exposition style is inspired by "[Programming Parallel Computers](http://ppc.cs.aalto.fi/)" course by Jukka Suomela, which features a [similar case study](http://ppc.cs.aalto.fi/ch2/) on speeding up the distance product. From 32518f4c003b666546ba393e167556c5ff8fd430 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 6 Apr 2022 16:39:02 +0300 Subject: [PATCH 380/531] floyd algorithm and matmul --- content/english/hpc/algorithms/matmul.md | 45 ++++++++++++++++-------- 1 file changed, 30 insertions(+), 15 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index 30bdd36c..e01544db 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -334,7 +334,9 @@ We need a few more optimizations to reach the performance limit: - Get rid of the `std::min` so that the size parameters are mostly constant and can be embedded into the machine code. - Rewrite the micro-kernel by hand using 12 variables (the compiler seems to have a problem with keeping them fully in registers). -Effectively supporting weird sizes requires a bit more work, and this is the reason why we benchmarked at an array sizes that are divisible by $48 = \frac{6 \cdot 16}{\gcd(6, 16)}$. +Effectively supporting weird sizes requires a bit more work, and this is the reason why we benchmarked at array sizes that are divisible by $48 = \frac{6 \cdot 16}{\gcd(6, 16)}$. We leave the code out, because the change is large and tedious and involves slightly modifying the benchmarking code itself. It is straightforward, but we only implement the version for this particular size, without any safety checks. Cheating on the benchmark. + +https://github.com/sslotin/amh-code/blob/main/matmul/v5-unrolled.cc Avoiding moving anything pays off. 
These improvements sum up and give us a 50% improvement: @@ -350,9 +352,9 @@ A more realistic comparison is some practical library, such as [https://www.open ![](../img/mm-blas.svg) -We've reached ~93% of BLAS and ~75% of the theoretical performance limit. Which is really great for what is basically 40 lines of C. +We've reached ~93% of BLAS and ~75% of the theoretical performance limit. Which is really great for what is essentially just 40 lines of C. -Interestingly, the whole thing can be rolled into one large `for` loop: +Interestingly, the whole thing can be rolled into one large `for` loop (assuming that we are in 2050 and using the 35th version of GCC, which finally manages not to screw up register spilling): ```c++ for (int i3 = 0; i3 < n; i3 += s3) @@ -368,17 +370,21 @@ for (int i3 = 0; i3 < n; i3 += s3) * b[n / 8 * k + y / 8 + j]; ``` -(Assuming that we are in 2050 and using the 35th version of GCC, which finally properly manager not to screwing up with register spilling.) +There is also a way to do fewer arithmetic operations — [the Strassen algorithm](/hpc/external-memory/oblivious/#strassen-algorithm) — but it has a large constant factor, and it is [only efficient for very large matrices](https://arxiv.org/pdf/1605.01078.pdf) ($n > 4000$) for which we typically use multi-threading anyway. ## Generalizations -Given a matrix $D$, we need to calculate its "min-plus matrix multiplication" defined as: +FMA also supports 64-bit floating-point numbers, but it does not support integers: you need to perform addition and multiplication separately, which results in decreased performance. If you know that all intermediate results can be represented exactly as a 32- or 64-bit floating-point number (which is [often the case](/hpc/arithmetic/errors/)), it may be better to convert them to and from floats. + +You can also apply the same trick to other similar computations. 
One example is the "min-plus matrix multiplication" defined as: -$(D \circ D)_{ij} = \min_k(D_{ik} + D_{kj})$ +$$ +(D \circ D)_{ij} = \min_{1 \le k \le n} (D_{ik} + D_{kj}) +$$ -Graph interpretation: find shortest paths of length 2 between all vertices in a fully-connected weighted graph +It is also known as the "distance product" due to its graph interpretation: the result is the matrix of shortest paths of length two between all pairs of vertices in a fully-connected weighted graph. -A cool thing about distance product is that if if we iterate the process and calculate: +A cool thing about the distance product is that if we iterate the process and calculate: $$ D_2 = D \circ D \\ @@ -387,17 +393,26 @@ D_8 = D_4 \circ D_4 \\ \ldots $$ -Then we can find all-pairs shortest distances in $O(\log n)$ steps +Then we can find all-pairs shortest distances in $O(\log n)$ steps: -(but recall that there are [more direct ways](https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm) to solve it) - -Which is an exercise. +```c++ +for (int l = 0; l < logn; l++) + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + d[i][j] = min(d[i][j], d[i][k] + d[k][j]); +``` -Strassen algorithm is only useful for large matrices. +This requires $O(n^3 \log n)$ operations, but if we do these two-edge relaxations in a particular order, we can do it with just one pass, which is known as the [Floyd-Warshall algorithm](https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm): -https://arxiv.org/pdf/1605.01078.pdf +```c++ +for (int k = 0; k < n; k++) + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + d[i][j] = min(d[i][j], d[i][k] + d[k][j]); +``` -[cache-oblivious](/hpc/external-memory/oblivious/#matrix-multiplication) algorithms +As an exercise, try to think of ways of speeding up this "for-for-for" computation. It will be harder than matrix multiplication because you need to perform updates in this particular order. 
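To see that the two approaches agree, here is a small self-contained sketch that computes all-pairs shortest distances both by repeated squaring of the distance product and by the Floyd-Warshall algorithm. It is only an illustration under a few assumptions: the names `matrix`, `inf`, `distance_product`, `apsp_squaring`, and `floyd_warshall` are ours, the weights are non-negative integers, the diagonal is zero, and the squaring version works out of place rather than in place:

```c++
#include <algorithm>
#include <vector>

using matrix = std::vector<std::vector<int>>;

// "No edge" marker, chosen so that inf + inf still fits into a 32-bit int
const int inf = 1000000000;

// Min-plus (distance) product: c[i][j] = min over k of (a[i][k] + b[k][j])
matrix distance_product(const matrix &a, const matrix &b) {
    int n = a.size();
    matrix c(n, std::vector<int>(n, inf));
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                c[i][j] = std::min(c[i][j], a[i][k] + b[k][j]);
    return c;
}

// Repeated squaring: after t products, d covers all paths of at most 2^t edges
// (this relies on d[i][i] = 0, so that shorter paths are covered as well)
matrix apsp_squaring(matrix d) {
    int n = d.size();
    for (int len = 1; len < n - 1; len *= 2)
        d = distance_product(d, d);
    return d;
}

// Floyd-Warshall: a single pass with the intermediate vertex k in the outer loop
matrix floyd_warshall(matrix d) {
    int n = d.size();
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                d[i][j] = std::min(d[i][j], d[i][k] + d[k][j]);
    return d;
}
```

Both functions take the matrix by value, so the input is left untouched and the two results can be compared directly.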
## Acknowledgements From 82ddb7412be5c82bb937185e5c199cb7c418fe23 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 6 Apr 2022 20:32:17 +0300 Subject: [PATCH 381/531] scalar matmul edits --- content/english/hpc/algorithms/matmul.md | 49 ++++++++++++++---------- 1 file changed, 28 insertions(+), 21 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index e01544db..a6237da5 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -17,13 +17,15 @@ nomove 0.303826 23.295860130469414 blas 0.27489790320396423 25.747333528217077 --> -In this case study, we will design and implement several algorithms for matrix multiplication. We start with the naive "for-for-for" algorithm and incrementally improve it, eventually developing an implementation that is 50 times faster and matches the performance of BLAS libraries while being under 40 lines of C. +In this case study, we will design and implement several algorithms for matrix multiplication. -We compile our implementations with GCC 13 and run them on Zen 2 clocked at 2GHz. +We start with the naive "for-for-for" algorithm and incrementally improve it, eventually arriving at a version that is 50 times faster and matches the performance of BLAS libraries while being under 40 lines of C. + +All implementations are compiled with GCC 13 and run on a [Zen 2](https://en.wikichip.org/wiki/amd/microarchitectures/zen_2) CPU clocked at 2GHz. ## Baseline -The result of multiplying an $l \times n$ matrix $A$ by an $n \times m$ matrix $B$ is an $l \times m$ matrix $C$ calculated as: +The result of multiplying an $l \times n$ matrix $A$ by an $n \times m$ matrix $B$ is defined as an $l \times m$ matrix $C$ calculated as $$ C_{ij} = \sum_{k=1}^{n} A_{ik} \cdot B_{kj} @@ -31,7 +33,7 @@ $$ For simplicity, we will only consider *square* matrices, where $l = m = n$. 
-To implement matrix multiplication, we can just transfer this definition into code — but instead of two-dimensional arrays (aka matrices), we will be using one-dimensional arrays, to be explicit about memory addressing: +To implement matrix multiplication, we can simply transfer this definition into code — but instead of two-dimensional arrays (aka matrices), we will be using one-dimensional arrays to be explicit about pointer arithmetic: ```c++ void matmul(const float *a, const float *b, float *c, int n) { @@ -42,17 +44,17 @@ void matmul(const float *a, const float *b, float *c, int n) { } ``` -For reasons that will become apparent later, we only use matrix sizes that are multiples of $48$ for benchmarking, but the implementations are still correct for all other sizes. We also use [32-bit floats](/hpc/arithmetic/ieee-754) specifically, although it can be [generalized](#generalizations) to other types and operations. +For reasons that will become apparent later, we only use matrix sizes that are multiples of $48$ for benchmarking, but the implementations remain correct for all other sizes. We also use [32-bit floats](/hpc/arithmetic/ieee-754) specifically, although all implementations can be easily [generalized](#generalizations) to other data types and operations. -Compiled with `g++ -O3 -march=native -ffast-math -funroll-loops`, the naive approach multiplies two matrices of size $n = 1920 = 48 \times 40$ in ~16.7 seconds. That is approximately $\frac{1920^3}{16.7 \times 10^9} \approx 0.42$ useful operations per nanosecond (GFLOPS), or roughly 5 CPU cycles per multiplication. +Compiled with `g++ -O3 -march=native -ffast-math -funroll-loops`, the naive approach multiplies two matrices of size $n = 1920 = 48 \times 40$ in ~16.7 seconds. Put in perspective, it is approximately $\frac{1920^3}{16.7 \times 10^9} \approx 0.42$ useful operations per nanosecond (GFLOPS), or roughly 5 CPU cycles per multiplication, which doesn't look that good yet. 
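As a sanity check of these figures, the arithmetic can be reproduced with a few lines of C++. This is a minimal sketch: the helper names `gflops` and `cycles_per_op` are made up for illustration, one multiply-add is counted as one useful operation (as in the text), and the 2GHz clock rate is the one stated for the benchmark machine:

```c++
// Useful operations per nanosecond for an n x n multiplication
// that took the given number of seconds
double gflops(double n, double seconds) {
    return n * n * n / (seconds * 1e9);
}

// Average number of clock cycles spent per multiplication,
// given the clock frequency in Hz
double cycles_per_op(double n, double seconds, double hz) {
    return seconds * hz / (n * n * n);
}
```

With $n = 1920$, a running time of 16.7 seconds, and a 2GHz clock, these evaluate to a bit over 0.42 and a bit over 4.7 respectively, which matches the "roughly 5 cycles per multiplication" estimate.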
## Transposition In general, when you optimize an algorithm that processes large quantities of data — and $1920^2 \times 3 \times 4 \approx 42$ MB clearly is a large quantity as it can't fit into any of the [CPU caches](/hpc/cpu-cache) — you should always start with memory before optimizing arithmetic, as it is much more likely to be the bottleneck. -Note that the field $C_{ij}$ can be viewed as the dot product of row $i$ in matrix $A$ and column $j$ in matrix $B$. As we are incrementing the `k` variable in the inner loop above, we are reading the matrix `a` sequentially, but we are jumping over $n$ elements as we iterate over a column of `b`, which is [not as fast](/hpc/cpu-cache/aos-soa). +The field $C_{ij}$ can be seen as the dot product of row $i$ of matrix $A$ and column $j$ of matrix $B$. As we increment `k` in the inner loop above, we are reading the matrix `a` sequentially, but we are jumping over $n$ elements as we iterate over a column of `b`, which is [not as fast](/hpc/cpu-cache/aos-soa) as sequential iteration. -One [well-known optimization](/hpc/external-memory/oblivious/#matrix-multiplication) that mitigates this problem is to either store matrix $B$ in *column-major* order or to *transpose* it before the matrix multiplication — spending $O(n^2)$ additional operations, but ensuring sequential reads in the hot loop: +One [well-known](/hpc/external-memory/oblivious/#matrix-multiplication) optimization that tackles this problem is to store matrix $B$ in *column-major* order — or, alternatively, to *transpose* it before the matrix multiplication. This requires $O(n^2)$ additional operations but ensures sequential reads in the innermost loop: @@ -144,21 +145,27 @@ void matmul(const float *a, const float *_b, float * __restrict__ c, int n) { Both manually and auto-vectorized implementations perform roughly the same. + + ## Memory efficiency -Now, what is interesting is that the implementation efficiency depends on the problem size. 
+What is interesting is that the implementation efficiency depends on the problem size. -At first, the performance (in terms of useful operations per second) increases, as the overhead of the loop management and horizontal reduction decreases. Then, at around $n=256$, it starts smoothly decreasing as the matrices stop fitting into the [cache](/hpc/cpu-cache/) ($2 \times 256^2 \times 4 = 512$ KB is the size of the L2 cache), and the performance becomes bottlenecked by the [memory bandwidth](/hpc/cpu-cache/bandwidth/). +At first, the performance (in terms of useful operations per second) increases as the overhead of the loop management and horizontal reduction decreases. Then, at around $n=256$, it starts smoothly decreasing as the matrices stop fitting into the [cache](/hpc/cpu-cache/) ($2 \times 256^2 \times 4 = 512$ KB is the size of the L2 cache), and the performance becomes bottlenecked by the [memory bandwidth](/hpc/cpu-cache/bandwidth/). ![](../img/mm-vectorized-plot.svg) -It is also interesting that the naive implementation is mostly on par with the non-vectorized transposed version — and actually slightly better because of the transpose itself — for all but few data points, where the performance deteriorates. This is because of [cache associativity](/hpc/cpu-cache/associativity/): when $n$ is divisible by a large power of two, we are fetching addresses of `b` that all likely map to the same cache line, reducing the effective cache size. This explains the 30% performance dip for $n = 1920 = 2^7 \times 3 \times 5$, and you can see an even more noticeable one for $1536 = 2^9 \times 3$: it is roughly 3 times slower than for $n=1535$. +It is also interesting that the naive implementation is mostly on par with the non-vectorized transposed version — and even slightly better because it doesn't need to perform a transposition. 
+ +One might think that there would be some *general* performance gain from doing sequential reads since we are fetching fewer cache lines, but this is not the case: fetching the first column of `b` indeed takes more time, but the next 15 column reads will be in the same cache lines as the first one, so they will be cached — unless the matrix is so large that it can't even fit `n * cache_line_size` bytes into the cache, which is not the case for any practical matrix sizes. -One may think that there would be at least some general performance gain from full sequential reads since we are fetching fewer cache lines, but this is not the case: fetching the first column of `b` is painful, but the next 15 columns will actually be in the same cache lines as the first one, so they will be cached — unless the matrix is so large that it can't even fit `n * cache_line_size` bytes into the cache, which is not the case for all practical problem sizes. +Instead, the performance deteriorates on only a few specific matrix sizes due to the effects of [cache associativity](/hpc/cpu-cache/associativity/): when $n$ is a multiple of a large power of two, we are fetching the addresses of `b` that all likely map to the same cache line, which reduces the effective cache size. This explains the 30% performance dip for $n = 1920 = 2^7 \times 3 \times 5$, and you can see an even more noticeable one for $1536 = 2^9 \times 3$: it is roughly 3 times slower than for $n=1535$. -So, counterintuitively, transposing the matrix doesn't help the memory bandwidth — and in the naive implementation, we are not really bottlenecked by it anyway. But for our vectorize implementation, we certainly are, so let's tackle it. +So, counterintuitively, transposing the matrix doesn't help with caching — and in the naive implementation, we are not really bottlenecked by the memory bandwidth anyway. But our vectorized implementation certainly is, so let's work on its I/O efficiency. 
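One quick way to spot the sizes that are likely to suffer from this effect is to look at the largest power of two that divides $n$. The helper below is a hypothetical illustration (its name is ours), and it is only a rough indicator, not a model of the cache:

```c++
// Largest power of two that divides n (for n > 0):
// in two's complement, n & -n isolates the lowest set bit of n
constexpr int largest_pow2_divisor(int n) {
    return n & -n;
}
```

For the sizes mentioned above, it gives $2^7 = 128$ for $n = 1920$ and $2^9 = 512$ for $n = 1536$, while $n = 1535$ is odd and gives $1$.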
## Register reuse From 55edc44d68bf04054446ee084f575a397a5ee66b Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 6 Apr 2022 21:48:29 +0300 Subject: [PATCH 382/531] matmul kernel --- content/english/hpc/algorithms/matmul.md | 83 ++++++++++++++++-------- 1 file changed, 56 insertions(+), 27 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index a6237da5..64282bd3 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -169,36 +169,41 @@ So, counterintuitively, transposing the matrix doesn't help with caching — and ## Register reuse -Any two cells of A and B are used to update some cell of C. +Using a Python-like notation to refer to submatrices, to compute the cell $C[x][y]$, we need to calculate the dot product of $A[x][:]$ and $B[:][y]$, which requires fetching $2n$ elements, even if we store $B$ in column-major order. -To compute the cell $C[i][j]$, we need to compute the dot product of $A[i][:]$ and $B[:][j]$ (we are using the Python notation here to select rows and columns), which requires fetching $2n$ elements, even when $B$ is stored in column-major order. + -What if we were to compute $C[i:i+2][j:j+2]$, a $2 \times 2$ submatrix of $C$? We would need $A[i:i+2][:]$ and $B[:][j:j+2]$, which is $4n$ elements in total: that is $\frac{2n / 1}{4n / 4} = 2$ times better in terms of I/O efficiency. +To compute $C[x:x+2][y:y+2]$, a $2 \times 2$ submatrix of $C$, we would need two rows from $A$ and two columns from $B$, namely $A[x:x+2][:]$ and $B[:][y:y+2]$, containing $4n$ elements in total, to update four elements instead of one — which is $\frac{2n / 1}{4n / 4} = 2$ times better in terms of I/O efficiency. + + + +To avoid re-fetching data, we need to iterate these rows and columns in parallel and calculate all $2 \times 2$ possible combinations of products. 
Here is a proof of concept: ```c++ -void update_2x2(int x, int y) { - int c00 = c[x][y], - c01 = c[x][y + 1], - c10 = c[x + 1][y], - c11 = c[x + 1][y + 1]; +void kernel_2x2(int x, int y) { + int c00 = 0, c01 = 0, c10 = 0, c11 = 0; for (int k = 0; k < n; k++) { + // read rows int a0 = a[x][k]; int a1 = a[x + 1][k]; + // read columns int b0 = b[k][y]; int b1 = b[k][y + 1]; + // update all combinations c00 += a0 * b0; c01 += a0 * b1; c10 += a1 * b0; c11 += a1 * b1; } + // write the results to C c[x][y] = c00; c[x][y + 1] = c01; c[x + 1][y] = c10; @@ -206,52 +211,74 @@ void update_2x2(int x, int y) { } ``` -It also boosts instruction-level parallelism (we don't have to wait between iterations to update the loop state) and saves some cycles from executing the read instructions. +We can now simply call this kernel on all 2x2 submatrices of $C$, but we won't bother evaluating it: although this algorithm is better in terms of I/O operations, it would still not beat our SIMD-based implementation. Instead, we will extend this approach and develop a similar *vectorized* kernel right away. + + + ## Designing the kernel -We follow this approach and design a general kernel that updates a $h \times w$ submatrix of C using columns from $l$ to $r$ of $A$ and rows from $l$ to $r$ of $B$ (i. e. not a full computation, but only a partial update — it will be clear why later). We have several considerations: +Instead of designing a kernel that computes an $h \times w$ submatrix of $C$ from scratch, we will declare a function that *updates* it using columns from $l$ to $r$ of $A$ and rows from $l$ to $r$ of $B$. For now, this seems like an over-generalization, but this API will be useful later. + + -For these reasons, we settle on a $6 \times 16$ kernel. We process $96$ elements at once, which can be stored in $6 \times 2 = 12$ vector registers (we need some more to store temporary values). 
We [broadcast](/hpc/simd/moving/#broadcast) an element of A, and then use it to update the first row ($8 + 8$ elements). Then we load the one below it, and so on. When we have updated the last row, we move to the next $6$ elements to the right. +To determine $h$ and $w$, we have several performance considerations: + +- In general, to compute an $h \times w$ submatrix, we need to fetch $2 \cdot n \cdot (h + w)$ elements. To optimize the I/O efficiency, we would want the $\frac{h \cdot w}{h + w}$ ratio to be high, which is achieved with large and square-ish submatrices. +- We want to use the [FMA](https://en.wikipedia.org/wiki/FMA_instruction_set) ("fused multiply-add") instruction available on all modern x86 architectures. As you can guess from the name, it performs the `c += a * b` operation — which is the core of a dot product — on 8-element vectors in one go, which saves us from executing vector multiplication and addition separately. +- To achieve better utilization of this instruction, we want to make use of [instruction-level parallelism](/hpc/pipelining/). On Zen 2, the `fma` instruction has a latency of 5 and a throughput of 2, meaning that we need to concurrently execute at least $5 \times 2 = 10$ of them to saturate its execution ports. +- We want to avoid register spill, and we only have $16$ logical vector registers that we can use as accumulators. + +For these reasons, we settle on a $6 \times 16$ kernel. This way, we process $96$ elements at once, which can be stored in $6 \times 2 = 12$ vector registers (we can't use an $8 \times 16$ kernel and use all 16 vector registers because we need some to hold temporary values). 
+ +To update them efficiently, we use the following procedure: + + + ```c++ // update 6x16 submatrix C[x:x+6][y:y+16] // using A[x:x+6][l:r] and B[l:r][y:y+16] void kernel(float *a, vec *b, vec *c, int x, int y, int l, int r, int n) { - vec t[6][2]{}; // will be stored in ymm registers + vec t[6][2]{}; // will be zero-filled and stored in ymm registers for (int k = l; k < r; k++) { for (int i = 0; i < 6; i++) { - vec alpha = vec{} + a[(x + i) * n + k]; // broadcast + // broadcast a[x + i][k] into a register + vec alpha = vec{} + a[(x + i) * n + k]; // converts to a broadcast + // multiply b[k][y:y+16] by it and update t[i][0] and t[i][1] for (int j = 0; j < 2; j++) - t[i][j] += alpha * b[(k * n + y) / 8 + j]; // fused multiply-add + t[i][j] += alpha * b[(k * n + y) / 8 + j]; // converts to an fma } } + // write the results back to C for (int i = 0; i < 6; i++) for (int j = 0; j < 2; j++) c[((x + i) * n + y) / 8 + j] += t[i][j]; } ``` -We need `t` so that the compiler stores these elements in vector registers. We could just update the final destinations, but unfortunately, the compiler re-writes them back to memory, causing a huge slowdown — and wrapping everything in `__restrict__` keywords doesn't help. +We need `t` so that the compiler stores these elements in vector registers. We could just update the final destinations, but, unfortunately, the compiler re-writes them back to memory, causing a slowdown (wrapping everything in `__restrict__` keywords doesn't help). -The rest of the implementaiton is straightforward. Similar to the previous vectorized implementation, we just allocate aligned arrays and call the kernel instead of the innermost loop: +The rest of the implementation is straightforward. 
Similar to the previous vectorized implementation, we just allocate aligned arrays and call the kernel instead of the innermost loop: ```c++ void matmul(const float *_a, const float *_b, float *_c, int n) { - // to avoid implementing partials, - // we pad height to nearest 6 and width to 16 - + // to simplify the implementation, we pad the height and width + // so that they are divisible by 6 and 16 respectively int nx = (n + 5) / 6 * 6; int ny = (n + 15) / 16 * 16; @@ -277,15 +304,15 @@ void matmul(const float *_a, const float *_b, float *_c, int n) { } ``` -This improves the performance by another ~40%: +This improves the benchmark performance, but only by ~40%: ![](../img/mm-kernel-barplot.svg) -The speedup is much better (2-3x) on smaller arrays, indicating that there is still a bandwidth problem: +The speedup is much higher (2-3x) on smaller arrays, indicating that there is still a bandwidth problem: ![](../img/mm-kernel-plot.svg) -If you've read the section on [cache-oblivious algorithms](/hpc/external-memory/oblivious/), you know that one universal solution to these types of things is to split matrices in four parts, do eight recursive block matrix multiplications until the matrix fits into cache, and carefully combine the results together. We will follow a different, simpler approach. +Now, if you've read the section on [cache-oblivious algorithms](/hpc/external-memory/oblivious/), you know that one universal solution to these types of things is to split all matrices into four parts, perform eight recursive block matrix multiplications, and carefully combine the results together. This solution is okay in practice, but there is some [overhead to recursion](/hpc/architecture/functions/), and it also doesn't allow us to fine-tune the algorithm, so instead, we will follow a different, simpler approach. 
## Blocking @@ -419,6 +446,8 @@ for (int k = 0; k < n; k++) d[i][j] = min(d[i][j], d[i][k] + d[k][j]); ``` +Vectorizing the distance product and executing it $O(\log n)$ times is faster than naively executing the Floyd-Warshall algorithm, although not by a lot. + As an exercise, try to think of ways of speeding up this "for-for-for" computation. It will be harder than matrix multiplication because you need to perform updates in this particular order. ## Acknowledgements From d129828bb3d764ccf146eb41df189aac7559a4dc Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 6 Apr 2022 22:17:08 +0300 Subject: [PATCH 383/531] matmul cache blocking --- content/english/hpc/algorithms/matmul.md | 32 +++++++++++------------- 1 file changed, 15 insertions(+), 17 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index 64282bd3..2126daea 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -316,19 +316,20 @@ Now, if you've read the section on [cache-oblivious algorithms](/hpc/external-me ## Blocking -Alternative to divide-and-conquer is *cache blocking*: selecting a subset of data and processing it, and then going to the next block. Sometimes blocking is hierarchical: we first select a block of data that fits into the L3 cache, then we split it into blocks that fit into the L2 cache, and so on. +The *cache-aware* alternative to this divide-and-conquer trick is *cache blocking*: splitting the data into blocks that can fit into the cache and processing them one by one. If we have more than one layer of cache, we can do hierarchical blocking: we first select a block of data that fits into the L3 cache, then we split it into blocks that fit into the L2 cache, and so on. This requires knowing the cache sizes in advance, but it is usually easier to implement and also faster in practice. 
-It is less trivial to do for matrices than for arrays, but the trick is like this: +Cache blocking is less trivial to do with matrices than with arrays, but the general idea is this: -- Let's select a subset of B that fits into the L3 cache (say, a subset of its columns). -- Now, let's select a submatrix of A that fits into the L2 cache (a subset of its rows). -- Select a submatrix of previously selected submatrix of B that fits into the L1 cache, and use it to do the kernel update (a subset of its rows). +- Select a submatrix of $B$ that fits into the L3 cache (say, a subset of its columns). +- Select a submatrix of $A$ that fits into the L2 cache (say, a subset of its rows). +- Select a submatrix of the previously selected submatrix of $B$ (a subset of its rows) that fits into the L1 cache. +- Update the relevant submatrix of $C$ using the kernel. -Here is a good [visualization](https://jukkasuomela.fi/cache-blocking-demo/) by Jukka Suomela (it shows different approaches; we use the last one). +Here is a good [visualization](https://jukkasuomela.fi/cache-blocking-demo/) by Jukka Suomela (it features many different approaches; you are interested in the last one). -We could have started with A, but this would be slower. Note that during the kernel execution, we are reading the elements of $A$ slower than elements of $B$: we are fetching and broadcasting just one element, and then we multiply it with $16$ elements of $B$, so we need to store $B$ in cache, and the last stage be about selecting B in cache. +Note that the decision to start this process with matrix $B$ is not arbitrary. During the kernel execution, we are reading the elements of $A$ much slower than the elements of $B$: we fetch and broadcast just one element of $A$ and then multiply it with $16$ elements of $B$. Therefore, we want $B$ to be in the L1 cache while $A$ can stay in the L2 cache and not the other way around. 
-We can implement it with three more outer `for` loops: +This sounds complicated, but we can implement it with just three more outer `for` loops, which are collectively called *macro-kernel* (and the highly optimized low-level function that updates a 6x16 submatrix is called *micro-kernel*): ```c++ const int s3 = 64; // how many columns of B to select @@ -341,24 +342,21 @@ for (int i3 = 0; i3 < ny; i3 += s3) // now we are working with a[i2:i2+s2][:] for (int i1 = 0; i1 < ny; i1 += s1) // now we are working with b[i1:i1+s1][i3:i3+s3] - // this equates to updating c[i2:i2+s2][i3:i3+s3] - // with [l:r] = [i1:i1+s1] + // and we need to update c[i2:i2+s2][i3:i3+s3] with [l:r] = [i1:i1+s1] for (int x = i2; x < std::min(i2 + s2, nx); x += 6) for (int y = i3; y < std::min(i3 + s3, ny); y += 16) kernel(a, (vec*) b, (vec*) c, x, y, i1, std::min(i1 + s1, n), ny); ``` -These outer `for` loops are sometimes called *macro-kernel* (as opposed to the *micro-kernel* that only updates a 6x16 submatrix). +Cache blocking completely removes the memory bottleneck: -It completely removes the memory bottleneck: - -![](../img/mm-blocked-plot.svg) +![](../img/mm-blocked-barplot.svg) -The performance is no longer seriously affected by the problem size: +The performance is no longer significantly affected by the problem size: -![](../img/mm-blocked-barplot.svg) +![](../img/mm-blocked-plot.svg) -Notice the dip at $1536$ is still there. Cache associativity affects the effective cache size. We need to adjust the step constants or insert holes into the layout to mitigate this. +Notice that the dip at $1536$ is still there: cache associativity still affects the effective cache size. To mitigate this, we can adjust the step constants or insert holes into the layout, but we are not going to bother doing that for now. 
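The blocking scheme from this section can also be sketched in a self-contained, scalar form, which may help when checking the loop structure without the vectorized micro-kernel. In this sketch, `update_block` is a hypothetical stand-in for the $6 \times 16$ kernel, the matrices are assumed to be square and row-major, and the values of `s2` and `s1` are illustrative assumptions rather than tuned constants:

```c++
#include <algorithm>
#include <vector>

// Hypothetical scalar stand-in for the micro-kernel:
// updates c[x0:x1][y0:y1] using columns [l, r) of a and rows [l, r) of b.
// All matrices are n x n and stored in row-major order.
void update_block(const float *a, const float *b, float *c, int n,
                  int x0, int x1, int y0, int y1, int l, int r) {
    for (int x = x0; x < x1; x++)
        for (int k = l; k < r; k++) {
            float t = a[x * n + k]; // one element of A, reused across a row of B
            for (int y = y0; y < y1; y++)
                c[x * n + y] += t * b[k * n + y];
        }
}

// Macro-kernel: the three outer loops select the blocks,
// and the inner call updates the corresponding submatrix of C.
std::vector<float> matmul_blocked(const std::vector<float> &a,
                                  const std::vector<float> &b, int n) {
    const int s3 = 64;  // how many columns of B to select
    const int s2 = 120; // how many rows of A to select (assumed value)
    const int s1 = 240; // how many rows of B to select (assumed value)
    std::vector<float> c(n * n, 0.0f);
    for (int i3 = 0; i3 < n; i3 += s3)
        for (int i2 = 0; i2 < n; i2 += s2)
            for (int i1 = 0; i1 < n; i1 += s1)
                update_block(a.data(), b.data(), c.data(), n,
                             i2, std::min(i2 + s2, n),
                             i3, std::min(i3 + s3, n),
                             i1, std::min(i1 + s1, n));
    return c;
}
```

The scalar version is not expected to be fast. It only shows that the order in which the blocks are visited does not change the result.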
## Optimization From f50135e9fa4cd1937da55eb1df4d5077d26e70df Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 6 Apr 2022 22:57:16 +0300 Subject: [PATCH 384/531] matmul final edits --- content/english/hpc/algorithms/matmul.md | 52 ++++++++++++++---------- 1 file changed, 30 insertions(+), 22 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index 2126daea..e6749b81 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -360,33 +360,39 @@ Notice that the dip at $1536$ is still there: cache associativity still affects ## Optimization -We need a few more optimizations to reach the performance limit: +To approach closer to the performance limit, we need a few more optimizations: -- Remove memory allocation and just operate on the arrays that we are given (note that we don't need to do anything with `a` as we are reading just one element at a time, and we can use unaligned `store` for `c` as we only use it rarely). -- Get rid of the `std::min` so that the size parameters are mostly constant and can be embedded into the machine code. -- Rewrite the micro-kernel by hand using 12 variables (the compiler seems to have a problem with keeping them fully in registers). +- Remove memory allocation and operate on the arrays that are passed to the function. Note that we don't need to do anything with `a` as we are reading just one element at a time, and we can use an unaligned `store` for `c` as we only use it rarely. +- Get rid of the `std::min` so that the size parameters are (mostly) constant and can be embedded into the machine code by the compiler (which also lets it [unroll](/hpc/architecture/loops/) the micro-kernel loop more efficiently without runtime checks). +- Rewrite the micro-kernel by hand using 12 vector variables (the compiler seems to struggle with keeping them in registers and writes them first to temporary storage and only then to $C$). 
+ +These optimizations are straightforward but quite tedious to implement, so we are not going to list [the code](https://github.com/sslotin/amh-code/blob/main/matmul/v5-unrolled.cc) in the article. It also requires some more work to effectively support "weird" matrix sizes, which is why we only run benchmarks for sizes that are multiple of $48 = \frac{6 \cdot 16}{\gcd(6, 16)}$. + + + +These individually small improvements sum up and result in another 50% improvement: ![](../img/mm-noalloc.svg) -We are actually not that far from the theoretical performance limit — which can be calculated as the throughput of the SIMD lane width times the fma instruction times the clock frequency: +We are actually not that far from the theoretical performance limit — which can be calculated as the width of a SIMD lane times the `fma` instruction throughput times the clock frequency: $$ \underbrace{8}_{SIMD} \cdot \underbrace{2}_{thr.} \cdot \underbrace{2 \cdot 10^9}_{cycles/sec} = 32 \; GFLOPS \;\; (3.2 \cdot 10^{10}) $$ -A more realistic comparison is some practical library, such as [https://www.openblas.net/](OpenBLAS). We just call it from Python using [numpy](/hpc/complexity/languages/#blas), so there may be some minor overhead, but reaching 80% of theoretical performance seems plausible (matrix multiplication is not the only thing that CPUs are made for): +It is more useful to compare against some practical library, such as [OpenBLAS](https://www.openblas.net/). The laziest way is to simply invoke matrix multiplication from Python with [numpy](/hpc/complexity/languages/#blas). There may be some minor overhead, but it ends up reaching 80% of the theoretical limit, which seems plausible (this overhead is typical, as matrix multiplication is not the only thing that CPUs are made for): ![](../img/mm-blas.svg) -We've reached ~93% of BLAS and ~75% of the theoretical performance limit. Which is really great for what is essentially just 40 lines of C. 
+We've reached ~93% of BLAS performance and ~75% of the theoretical performance limit, which is really great for what is essentially just 40 lines of C. -Interestingly, the whole thing can be rolled into one large `for` loop (assuming that we are in 2050 and using the 35th version of GCC, which finally properly manager not to screwing up with register spilling.): +Interestingly, the whole thing can be rolled into just one deeply nested `for` loop (assuming that we are in 2050 and using the 35th version of GCC, which finally does not screw up with register spilling.): ```c++ for (int i3 = 0; i3 < n; i3 += s3) @@ -402,21 +408,23 @@ for (int i3 = 0; i3 < n; i3 += s3) * b[n / 8 * k + y / 8 + j]; ``` -There is also a way to do fewer arithmetic operations — [the Strassen algorithm](/hpc/external-memory/oblivious/#strassen-algorithm) — but it has a large constant factor, and it is [only efficient for very large matrices](https://arxiv.org/pdf/1605.01078.pdf) ($n > 4000$) for which we typically use multi-threading anyway. +There is also a way to do fewer arithmetic operations — [the Strassen algorithm](/hpc/external-memory/oblivious/#strassen-algorithm) — but it has a large constant factor, and it is only efficient for [very large matrices](https://arxiv.org/pdf/1605.01078.pdf) ($n > 4000$), where we typically have to use either multiprocessing or some approximate dimensionality-reducing methods anyway. + + ## Generalizations -FMA also supports 64-bit floating point number, but it does not support integers: you need to perform addition and multiplication separately, which projects to decreased performance. If you know that all intermediate results can be represented exactly as a 32- or 64-bit floating-point number (which is [often the case](/hpc/arithmetic/errors/)), it may be better convert them to and from floats. 
+FMA also supports 64-bit floating-point numbers, but it does not support integers: you need to perform addition and multiplication separately, which projects to decreased performance. If you can guarantee that all intermediate results can be represented exactly as a 32- or 64-bit floating-point number (which is [often the case](/hpc/arithmetic/errors/)), it may be better to convert them to and from floats. -You can also apply the same trick to other similar computations. One example is the "min-plus matrix multiplication" defined as: +You can also apply the same trick to other similar computations. One example is the "min-plus matrix multiplication," which is defined as: $$ -(D \circ D)_{ij} = \min_{1 \le k \le n} (D_{ik} + D_{kj}) +(A \circ B)_{ij} = \min_{1 \le k \le n} (A_{ik} + B_{kj}) $$ -It is also known as the "distance product" due to its graph interpretation: the result is the matrix of shortest paths of length two between all pairs of vertices in a fully-connected weighted graph. +It is also known as the "distance product" due to its graph interpretation: when applied to itself $(D \circ D)$, the result is the matrix of shortest paths of length two between all pairs of vertices in a fully-connected weighted graph specified by the edge weight matrix $D$. 
-A cool thing about the distance product is that if if we iterate the process and calculate: +A cool thing about the distance product is that if we iterate the process and calculate $$ D_2 = D \circ D \\ @@ -425,7 +433,7 @@ D_8 = D_4 \circ D_4 \\ \ldots $$ -Then we can find all-pairs shortest distances in $O(\log n)$ steps: +…we can find all-pairs shortest paths in $O(\log n)$ steps: ```c++ for (int l = 0; l < logn; l++) @@ -435,7 +443,7 @@ for (int l = 0; l < logn; l++) d[i][j] = min(d[i][j], d[i][k] + d[k][j]); ``` -This requires $O(n^3 \log n)$ operations, but if we do these two-edge relaxations in a particular order, we can do it with just one pass, which is known as the [Floyd-Warshall algorithm](https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm): +This requires $O(n^3 \log n)$ operations. If we do these two-edge relaxations in a particular order, we can do it with just one pass, which is known as the [Floyd-Warshall algorithm](https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm): ```c++ for (int k = 0; k < n; k++) @@ -444,12 +452,12 @@ for (int k = 0; k < n; k++) d[i][j] = min(d[i][j], d[i][k] + d[k][j]); ``` -Vectorizing the distance product and executing it $O(\log n)$ times is faster than than naively executing the Floyd-Warshall algorithm, although not by a lot. +Interestingly, vectorizing the distance product and executing it $O(\log n)$ times in $O(n^3 \log n)$ total operations is faster than naively executing the Floyd-Warshall algorithm in $O(n^3)$ operations, although not by a lot. -As an exercise, try to think of ways of speeding up this "for-for-for" computation. It will be harder than matrix multiplication because you need to perform updates in this particular order. +As an exercise, try to speed up this "for-for-for" computation. 
It is harder to do than in the matrix multiplication case because you need to perform updates in a particular order, but it is still possible to design a similar kernel and an iteration order that achieves a 30-50x total speedup. ## Acknowledgements -The algorithm was originally designed by Kazushige Goto, and it is the basis of GotoBLAS and OpenBLAS. The author himself described it and some other aspects in more detail in "[Anatomy of High-Performance Matrix Multiplication](https://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf)". +The final algorithm was originally designed by Kazushige Goto, and it is the basis of GotoBLAS and OpenBLAS. The author himself describes it in more detail in "[Anatomy of High-Performance Matrix Multiplication](https://www.cs.utexas.edu/~flame/pubs/GotoTOMS_revision.pdf)". -The exposition style is inspired by "[Programming Parallel Computers](http://ppc.cs.aalto.fi/)" course by Jukka Suomela, which features a [similar case study](http://ppc.cs.aalto.fi/ch2/) on speeding up the distance product. +The exposition style is inspired by the "[Programming Parallel Computers](http://ppc.cs.aalto.fi/)" course by Jukka Suomela, which features a [similar case study](http://ppc.cs.aalto.fi/ch2/) on speeding up the distance product. From 15e65f57a7b32c64d4afdabff0e726a542615fb5 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 6 Apr 2022 23:00:53 +0300 Subject: [PATCH 385/531] publish matmul --- content/english/hpc/algorithms/matmul.md | 3 +-- content/english/hpc/complexity/languages.md | 2 +- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index e6749b81..01159313 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -1,7 +1,6 @@ --- title: Matrix Multiplication weight: 20 -draft: true --- @@ -154,17 +156,17 @@ The performance is bottlenecked by using a single variable. 
We could use multipl What is interesting is that the implementation efficiency depends on the problem size. -At first, the performance (in terms of useful operations per second) increases as the overhead of the loop management and horizontal reduction decreases. Then, at around $n=256$, it starts smoothly decreasing as the matrices stop fitting into the [cache](/hpc/cpu-cache/) ($2 \times 256^2 \times 4 = 512$ KB is the size of the L2 cache), and the performance becomes bottlenecked by the [memory bandwidth](/hpc/cpu-cache/bandwidth/). +At first, the performance (defined as the number of useful operations per second) increases as the overhead of the loop management and the horizontal reduction decreases. Then, at around $n=256$, it starts smoothly decreasing as the matrices stop fitting into the [cache](/hpc/cpu-cache/) ($2 \times 256^2 \times 4 = 512$ KB is the size of the L2 cache), and the performance becomes bottlenecked by the [memory bandwidth](/hpc/cpu-cache/bandwidth/). ![](../img/mm-vectorized-plot.svg) It is also interesting that the naive implementation is mostly on par with the non-vectorized transposed version — and even slightly better because it doesn't need to perform a transposition. -One might think that there would be some *general* performance gain from doing sequential reads since we are fetching fewer cache lines, but this is not the case: fetching the first column of `b` indeed takes more time, but the next 15 column reads will be in the same cache lines as the first one, so they will be cached — unless the matrix is so large that it can't even fit `n * cache_line_size` bytes into the cache, which is not the case for any practical matrix sizes. 
+One might think that there would be some general performance gain from doing sequential reads since we are fetching fewer cache lines, but this is not the case: fetching the first column of `b` indeed takes more time, but the next 15 column reads will be in the same cache lines as the first one, so they will be cached anyway — unless the matrix is so large that it can't even fit `n * cache_line_size` bytes into the cache, which is not the case for any practical matrix sizes. Instead, the performance deteriorates on only a few specific matrix sizes due to the effects of [cache associativity](/hpc/cpu-cache/associativity/): when $n$ is a multiple of a large power of two, we are fetching the addresses of `b` that all likely map to the same cache line, which reduces the effective cache size. This explains the 30% performance dip for $n = 1920 = 2^7 \times 3 \times 5$, and you can see an even more noticeable one for $1536 = 2^9 \times 3$: it is roughly 3 times slower than for $n=1535$. -So, counterintuitively, transposing the matrix doesn't help with caching — and in the naive implementation, we are not really bottlenecked by the memory bandwidth anyway. But our vectorized implementation certainly is, so let's work on its I/O efficiency. +So, counterintuitively, transposing the matrix doesn't help with caching — and in the naive scalar implementation, we are not really bottlenecked by the memory bandwidth anyway. But our vectorized implementation certainly is, so let's work on its I/O efficiency. ## Register reuse @@ -172,7 +174,7 @@ Using a Python-like notation to refer to submatrices, to compute the cell $C[x][ -To compute $C[x:x+2][y:y+2]$, a $2 \times 2$ submatrix of $C$, we would need two rows from $A$ and two columns from $B$, namely $A[x:x+2][:]$ and $B[:][y:y+2]$, containing $4n$ elements in total, to update four elements instead of one — which is $\frac{2n / 1}{4n / 4} = 2$ times better in terms of I/O efficiency. 
+To compute $C[x:x+2][y:y+2]$, a $2 \times 2$ submatrix of $C$, we would need two rows from $A$ and two columns from $B$, namely $A[x:x+2][:]$ and $B[:][y:y+2]$, containing $4n$ elements in total, to update *four* elements instead of *one* — which is $\frac{2n / 1}{4n / 4} = 2$ times better in terms of I/O efficiency. -To avoid re-fetching data, we need to iterate these rows and columns in parallel and calculate all $2 \times 2$ possible combinations of products. Here is a proof of concept: +To avoid fetching data more than once, we need to iterate over these rows and columns in parallel and calculate all $2 \times 2$ possible combinations of products. Here is a proof of concept: ```c++ void kernel_2x2(int x, int y) { @@ -220,7 +222,7 @@ Of course, although better in terms of I/O, this $2 \times 2$ update would not b ## Designing the kernel -Instead of designing a kernel that computes an $h \times w$ submatrix of $C$ from scratch, we will declare a function that *updates* it using columns from $l$ to $r$ of $A$ and rows from $l$ to $r$ of $B$. For now, this seems like an over-generalization, but this API will be useful later. +Instead of designing a kernel that computes an $h \times w$ submatrix of $C$ from scratch, we will declare a function that *updates* it using columns from $l$ to $r$ of $A$ and rows from $l$ to $r$ of $B$. For now, this seems like an over-generalization, but this function interface will prove useful later. - To achieve better utilization of this instruction, we want to make use of [instruction-level parallelism](/hpc/pipelining/). On Zen 2, the `fma` instruction has a latency of 5 and a throughput of 2, meaning that we need to concurrently execute at least $5 \times 2 = 10$ of them to saturate its execution ports. -- We want to avoid register spill, and we only have $16$ logical vector registers that we can use as accumulators. - -For these reasons, we settle on a $6 \times 16$ kernel. 
This way, we process $96$ elements at once, which can be stored in $6 \times 2 = 12$ vector registers (we can't use an $8 \times 16$ kernel and use all 16 vector registers because we need some to hold temporary values). +- We want to avoid register spill (move data to and from registers more than necessary), and we only have $16$ logical vector registers that we can use as accumulators (minus those that we need to hold temporary values). -To update them efficiently, we use the following procedure: +For these reasons, we settle on a $6 \times 16$ kernel. This way, we process $96$ elements at once that are stored in $6 \times 2 = 12$ vector registers. To update them efficiently, we use the following procedure: -These individually small improvements sum up and result in another 50% improvement: +These individually small improvements compound and result in another 50% improvement: ![](../img/mm-noalloc.svg) -We are actually not that far from the theoretical performance limit — which can be calculated as the width of a SIMD lane times the `fma` instruction throughput times the clock frequency: +We are actually not that far from the theoretical performance limit — which can be calculated as the SIMD width times the `fma` instruction throughput times the clock frequency: $$ \underbrace{8}_{SIMD} \cdot \underbrace{2}_{thr.} \cdot \underbrace{2 \cdot 10^9}_{cycles/sec} = 32 \; GFLOPS \;\; (3.2 \cdot 10^{10}) $$ -It is more useful to compare against some practical library, such as [OpenBLAS](https://www.openblas.net/). The laziest way is to simply invoke matrix multiplication from Python with [numpy](/hpc/complexity/languages/#blas). There may be some minor overhead, but it ends up reaching 80% of the theoretical limit, which seems plausible (this overhead is typical, as matrix multiplication is not the only thing that CPUs are made for): +It is more representative to compare against some practical library, such as [OpenBLAS](https://www.openblas.net/). 
The laziest way to do it is to simply [invoke matrix multiplication from NumPy](/hpc/complexity/languages/#blas). There may be some minor overhead due to Python, but it ends up reaching 80% of the theoretical limit, which seems plausible (a 20% overhead is okay: matrix multiplication is not the only thing that CPUs are made for). ![](../img/mm-blas.svg) We've reached ~93% of BLAS performance and ~75% of the theoretical performance limit, which is really great for what is essentially just 40 lines of C. -Interestingly, the whole thing can be rolled into just one deeply nested `for` loop with a BLAS-level of performance (assuming that we're in 2050 and using GCC 35, which finally does not screw up with register spilling): +Interestingly, the whole thing can be rolled into just one deeply nested `for` loop with a BLAS-level of performance (assuming that we're in 2050 and using GCC version 35, which finally stopped screwing up with register spilling): ```c++ for (int i3 = 0; i3 < n; i3 += s3) @@ -407,13 +407,13 @@ for (int i3 = 0; i3 < n; i3 += s3) * b[n / 8 * k + y / 8 + j]; ``` -There is also a way to do fewer arithmetic operations — [the Strassen algorithm](/hpc/external-memory/oblivious/#strassen-algorithm) — but it has a large constant factor, and it is only efficient for [very large matrices](https://arxiv.org/pdf/1605.01078.pdf) ($n > 4000$), where we typically have to use either multiprocessing or some approximate dimensionality-reducing methods anyway. +There is also an approach that performs asymptotically fewer arithmetic operations — [the Strassen algorithm](/hpc/external-memory/oblivious/#strassen-algorithm) — but it has a large constant factor, and it is only efficient for [very large matrices](https://arxiv.org/pdf/1605.01078.pdf) ($n > 4000$), where we typically have to use either multiprocessing or some approximate dimensionality-reducing methods anyway. 
## Generalizations -FMA also supports 64-bit floating-point numbers, but it does not support integers: you need to perform addition and multiplication separately, which projects to decreased performance. If you can guarantee that all intermediate results can be represented exactly as a 32- or 64-bit floating-point number (which is [often the case](/hpc/arithmetic/errors/)), it may be better to convert them to and from floats. +FMA also supports 64-bit floating-point numbers, but it does not support integers: you need to perform addition and multiplication separately, which results in decreased performance. If you can guarantee that all intermediate results can be represented exactly as 32- or 64-bit floating-point numbers (which is [often the case](/hpc/arithmetic/errors/)), it may be faster to just convert them to and from floats. You can also apply the same trick to other similar computations. One example is the "min-plus matrix multiplication," which is defined as: @@ -453,7 +453,7 @@ for (int k = 0; k < n; k++) Interestingly, vectorizing the distance product and executing it $O(\log n)$ times in $O(n^3 \log n)$ total operations is faster than naively executing the Floyd-Warshall algorithm in $O(n^3)$ operations, although not by a lot. -As an exercise, try to speed up this "for-for-for" computation. It is harder to do than in the matrix multiplication case because you need to perform updates in a particular order, but it is still possible to design a similar kernel and an iteration order that achieves a 30-50x total speedup. +As an exercise, try to speed up this "for-for-for" computation. It is harder to do than in the matrix multiplication case because now there is a logical dependency between the iterations, and you need to perform updates in a particular order, but it is still possible to design a similar kernel and a block iteration order that achieves a 30-50x total speedup. 
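To make the relation between the two approaches concrete, here is a minimal scalar sketch that computes all-pairs shortest paths both by repeated squaring of the distance product and by Floyd-Warshall. The names `matrix`, `distance_product`, `apsp_squaring`, and `apsp_floyd_warshall` are made up for this illustration. The sketch assumes a zero diagonal, no negative cycles, and finite weights small enough that the sums do not overflow an `int`:

```c++
#include <algorithm>
#include <vector>

using matrix = std::vector<std::vector<int>>;

// Min-plus ("distance") product of two n x n matrices.
matrix distance_product(const matrix &a, const matrix &b) {
    int n = (int) a.size();
    matrix c(n, std::vector<int>(n));
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            int best = a[i][0] + b[0][j];
            for (int k = 1; k < n; k++)
                best = std::min(best, a[i][k] + b[k][j]);
            c[i][j] = best;
        }
    return c;
}

// Repeated squaring: with a zero diagonal, d covers all paths of
// up to `len` edges, and each squaring doubles `len`.
// A shortest path never needs more than n - 1 edges.
matrix apsp_squaring(matrix d) {
    int n = (int) d.size();
    for (int len = 1; len < n - 1; len *= 2)
        d = distance_product(d, d);
    return d;
}

// Floyd-Warshall: a single pass with k as the outermost loop.
matrix apsp_floyd_warshall(matrix d) {
    int n = (int) d.size();
    for (int k = 0; k < n; k++)
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                d[i][j] = std::min(d[i][j], d[i][k] + d[k][j]);
    return d;
}
```

Both functions should return the same matrix under the stated assumptions. The squaring version does more work, but every `distance_product` call has the same loop structure as matrix multiplication, which is what makes it amenable to the same kernel design.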
## Acknowledgements From b149f0900ce5b63a3c94088152879ee5530f81dc Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 7 Apr 2022 01:42:11 +0300 Subject: [PATCH 387/531] typo --- content/english/hpc/algorithms/matmul.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index e0ebdaac..a5a7b4f2 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -391,7 +391,7 @@ It is more representative to compare against some practical library, such as [Op We've reached ~93% of BLAS performance and ~75% of the theoretical performance limit, which is really great for what is essentially just 40 lines of C. -Interestingly, the whole thing can be rolled into just one deeply nested `for` loop with a BLAS-level of performance (assuming that we're in 2050 and using GCC version 35, which finally stopped screwing up with register spilling): +Interestingly, the whole thing can be rolled into just one deeply nested `for` loop with a BLAS level of performance (assuming that we're in 2050 and using GCC version 35, which finally stopped screwing up with register spilling): ```c++ for (int i3 = 0; i3 < n; i3 += s3) @@ -409,8 +409,6 @@ for (int i3 = 0; i3 < n; i3 += s3) There is also an approach that performs asymptotically fewer arithmetic operations — [the Strassen algorithm](/hpc/external-memory/oblivious/#strassen-algorithm) — but it has a large constant factor, and it is only efficient for [very large matrices](https://arxiv.org/pdf/1605.01078.pdf) ($n > 4000$), where we typically have to use either multiprocessing or some approximate dimensionality-reducing methods anyway. - - ## Generalizations FMA also supports 64-bit floating-point numbers, but it does not support integers: you need to perform addition and multiplication separately, which results in decreased performance. 
If you can guarantee that all intermediate results can be represented exactly as 32- or 64-bit floating-point numbers (which is [often the case](/hpc/arithmetic/errors/)), it may be faster to just convert them to and from floats. From 1d039027db5e184c4d0b4b4824ddfbd119ae1f62 Mon Sep 17 00:00:00 2001 From: Daniel Paleka Date: Thu, 7 Apr 2022 13:30:12 +0200 Subject: [PATCH 388/531] Typo in argmin.md --- content/english/hpc/algorithms/argmin.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/algorithms/argmin.md b/content/english/hpc/algorithms/argmin.md index ccd9f140..0a9531c1 100644 --- a/content/english/hpc/algorithms/argmin.md +++ b/content/english/hpc/algorithms/argmin.md @@ -3,7 +3,7 @@ title: Argmin with SIMD weight: 7 --- -Computing the *minimum* of an array [easily vectorizable](/hpc/simd/reduction), as it is not different from any other reduction: in AVX2, you just need to use a convenient `_mm256_min_epi32` intrinsic as the inner operation. It computes the minimum of two 8-element vectors in one cycle — even faster than in the scalar case, which requires at least a comparison and a conditional move. +Computing the *minimum* of an array is [easily vectorizable](/hpc/simd/reduction), as it is not different from any other reduction: in AVX2, you just need to use a convenient `_mm256_min_epi32` intrinsic as the inner operation. It computes the minimum of two 8-element vectors in one cycle — even faster than in the scalar case, which requires at least a comparison and a conditional move. Finding the *index* of that minimum element (*argmin*) is much harder, but it is still possible to vectorize very efficiently. In this section, we design an algorithm that computes the argmin (almost) at the speed of computing the minimum and ~15x faster than the naive scalar approach. 
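For reference, the naive scalar approach that serves as the baseline can be sketched as follows. This is an illustrative version rather than the exact benchmark code: the `std::vector<int>` signature is an assumption, the array is assumed to be non-empty, and ties are resolved in favor of the first minimum:

```c++
#include <vector>

// Naive scalar argmin: returns the index of the first minimum element.
// Assumes that the array is non-empty.
int argmin(const std::vector<int> &a) {
    int k = 0;
    for (int i = 1; i < (int) a.size(); i++)
        if (a[i] < a[k]) // strict comparison keeps the earliest minimum
            k = i;
    return k;
}
```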
From 965c76bb87126d51013dbbe8e181fa439c638138 Mon Sep 17 00:00:00 2001 From: Alex Saveau Date: Sat, 9 Apr 2022 14:33:35 -0700 Subject: [PATCH 389/531] Fix extra word typo --- content/english/hpc/arithmetic/errors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/errors.md b/content/english/hpc/arithmetic/errors.md index f2e0fbf6..df62e91d 100644 --- a/content/english/hpc/arithmetic/errors.md +++ b/content/english/hpc/arithmetic/errors.md @@ -125,7 +125,7 @@ $$ f(x, y) = x^2 - y^2 = (x + y) \cdot (x - y) $$ -In this one, it is easy to show that the error is be bound by $\epsilon \cdot |x - y|$. It is also faster because it needs 2 additions and 1 multiplication: one fast addition more and one slow multiplication less compared to the original. +In this one, it is easy to show that the error is bound by $\epsilon \cdot |x - y|$. It is also faster because it needs 2 additions and 1 multiplication: one fast addition more and one slow multiplication less compared to the original. ### Kahan Summation From a211cf62040495eddefa3c88f46b2206b513fd86 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 10 Apr 2022 19:52:42 +0300 Subject: [PATCH 390/531] bugfix --- content/russian/cs/tree-structures/treap.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/russian/cs/tree-structures/treap.md b/content/russian/cs/tree-structures/treap.md index dd3417dd..724ed15f 100644 --- a/content/russian/cs/tree-structures/treap.md +++ b/content/russian/cs/tree-structures/treap.md @@ -199,7 +199,7 @@ struct Node { Вместо того, чтобы модифицировать и `merge`, и `split` под наши хотелки, напишем вспомогательную функцию `upd`, которую будем вызывать при обновлении детей вершины: ```c++ -void sum(Node* v) { return v ? v->sum : 0; } +int sum(Node* v) { return v ? 
v->sum : 0; } // обращаться по пустому указателю нельзя -- выдаст ошибку void upd(Node* v) { v->sum = sum(v->l) + sum(v->r) + v->val; } From cbd4948a082bc4959dfc565a2cc99041753d03b9 Mon Sep 17 00:00:00 2001 From: Alex Saveau Date: Sun, 10 Apr 2022 13:32:20 -0700 Subject: [PATCH 391/531] Fix possible typo? I'm pretty sure this should say not. --- content/english/hpc/external-memory/policies.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/external-memory/policies.md b/content/english/hpc/external-memory/policies.md index 1ff0e724..4cb36bdd 100644 --- a/content/english/hpc/external-memory/policies.md +++ b/content/english/hpc/external-memory/policies.md @@ -33,7 +33,7 @@ $$ The main idea of the proof is to consider the worst case scenario. For LRU it would be the repeating series of $\frac{M}{B}$ distinct blocks: each block is new and so LRU has 100% cache misses. Meanwhile, $OPT_{M/2}$ would be able to cache half of them (but not more, because it only has half the memory). Thus $LRU_M$ needs to fetch double the number of blocks that $OPT_{M/2}$ does, which is basically what is expressed in the inequality, and anything better for $LRU$ would only weaken it. -![Dimmed are the blocks cached by OPT (but note cached by LRU)](../img/opt.png) +![Dimmed are the blocks cached by OPT (but not cached by LRU)](../img/opt.png) This is a very relieving result. It means that, at least in terms of asymptotic I/O complexity, you can just assume that the eviction policy is either LRU or OPT — whichever is easier for you — do complexity analysis with it, and the result you get will normally transfer to any other reasonable cache replacement policy. 
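The worst case for LRU can be reproduced with a small simulation. The sketch below counts cache misses for LRU and for the optimal policy (evict the block whose next use is farthest in the future). The helpers `lru_misses` and `opt_misses` are hypothetical, the cache capacity is measured in blocks and assumed to be at least one, and the access pattern used in the usage note (a cycle over one more block than the LRU cache can hold) is an assumption chosen to force a miss on every access:

```c++
#include <algorithm>
#include <cstddef>
#include <list>
#include <unordered_map>
#include <vector>

// Number of misses of an LRU cache that holds `capacity` blocks.
int lru_misses(const std::vector<int> &accesses, int capacity) {
    std::list<int> order; // front = most recently used
    std::unordered_map<int, std::list<int>::iterator> where;
    int misses = 0;
    for (int block : accesses) {
        auto it = where.find(block);
        if (it != where.end()) {
            order.erase(it->second); // hit: re-inserted at the front below
        } else {
            misses++;
            if ((int) order.size() == capacity) {
                where.erase(order.back()); // evict the least recently used
                order.pop_back();
            }
        }
        order.push_front(block);
        where[block] = order.begin();
    }
    return misses;
}

// Number of misses of the optimal (clairvoyant) policy with `capacity` blocks.
// Straightforward quadratic implementation, meant for small inputs only.
int opt_misses(const std::vector<int> &accesses, int capacity) {
    std::vector<int> cache;
    int misses = 0;
    for (std::size_t t = 0; t < accesses.size(); t++) {
        int block = accesses[t];
        if (std::find(cache.begin(), cache.end(), block) != cache.end())
            continue; // hit
        misses++;
        if ((int) cache.size() < capacity) {
            cache.push_back(block);
            continue;
        }
        // evict the cached block whose next use is farthest in the future
        std::size_t victim = 0, farthest = 0;
        for (std::size_t i = 0; i < cache.size(); i++) {
            std::size_t next = std::find(accesses.begin() + t + 1,
                                         accesses.end(), cache[i])
                               - accesses.begin();
            if (next > farthest) {
                farthest = next;
                victim = i;
            }
        }
        cache[victim] = block;
    }
    return misses;
}
```

For example, cycling over 9 distinct blocks with an LRU cache of 8 blocks makes every access a miss, while the number of misses still stays within twice that of the optimal policy with 4 blocks.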
From 6e13a8d7a027ad4dc486e7b82335e766d8137c59 Mon Sep 17 00:00:00 2001 From: Alex Saveau Date: Mon, 11 Apr 2022 00:26:23 -0700 Subject: [PATCH 392/531] Fix code typo --- content/english/hpc/cpu-cache/paging.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/cpu-cache/paging.md b/content/english/hpc/cpu-cache/paging.md index fad39a54..684fcd65 100644 --- a/content/english/hpc/cpu-cache/paging.md +++ b/content/english/hpc/cpu-cache/paging.md @@ -53,7 +53,7 @@ always [madvise] never #include void *ptr = std::aligned_alloc(page_size, array_size); -madvise(pre, array_size, MADV_HUGEPAGE); +madvise(ptr, array_size, MADV_HUGEPAGE); ``` You can only request a memory region to be allocated using huge pages if it has the corresponding alignment. From fc5fb2c45ee664d270bc65ca78e40a3a0aaaffbf Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 11 Apr 2022 17:58:33 +0300 Subject: [PATCH 393/531] fix approximate logarithm formula --- content/english/hpc/arithmetic/rsqrt.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/content/english/hpc/arithmetic/rsqrt.md b/content/english/hpc/arithmetic/rsqrt.md index 06659136..63b2799a 100644 --- a/content/english/hpc/arithmetic/rsqrt.md +++ b/content/english/hpc/arithmetic/rsqrt.md @@ -77,13 +77,13 @@ $$ \log_2 x = e_x + \log_2 (1 + m_x) \approx e_x + m_x + \sigma $$ -Now, having this approximation in mind and defining $L=23$ as the number of mantissa bits in a `float` and $B=127$ for the exponent bias, when we reinterpret the bit-pattern of $x$ as an integer $I_x$, we get +Now, having this approximation in mind and defining $L=2^{23}$ (the number of mantissa bits in a `float`) and $B=127$ (the exponent bias), when we reinterpret the bit-pattern of $x$ as an integer $I_x$, we get $$ \begin{aligned} -I_x &= L(e_x + B + m_x) -\\ &= L(e_x + m_x + \sigma +B-\sigma ) -\\ &\approx L\log_2 (x) + L (B-\sigma ) +I_x &= L \cdot (e_x + B + m_x) +\\ &= L \cdot (e_x + m_x + \sigma 
+B-\sigma ) +\\ &\approx L \cdot \log_2 (x) + L \cdot (B-\sigma ) \end{aligned} $$ From bb31ad26a9cb50c350a24104c2d734704ea72e2f Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 11 Apr 2022 18:30:22 +0300 Subject: [PATCH 394/531] exponent bias --- content/english/hpc/arithmetic/ieee-754.md | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/content/english/hpc/arithmetic/ieee-754.md b/content/english/hpc/arithmetic/ieee-754.md index ae624add..6b1e2a24 100644 --- a/content/english/hpc/arithmetic/ieee-754.md +++ b/content/english/hpc/arithmetic/ieee-754.md @@ -15,7 +15,7 @@ When we designed our [DIY floating-point type](../float), we omitted quite a lot - What happens if we increment the largest representable number? - Can we somehow detect if one of the above three happened? -Most of the early computers didn't have floating-point arithmetic, and when vendors started adding floating-point coprocessors, they had slightly different visions for what answers to those questions should be. Diverse implementations made it difficult to use floating-point arithmetic reliably and portably — particularly for people developing compilers. +Most of the early computers didn't support floating-point arithmetic, and when vendors started adding floating-point coprocessors, they had slightly different visions for what the answers to these questions should be. Diverse implementations made it difficult to use floating-point arithmetic reliably and portably — especially for the people who develop compilers. In 1985, the Institute of Electrical and Electronics Engineers published a standard (called [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754)) that provided a formal specification of how floating-point numbers should work, which was quickly adopted by the vendors and is now used in virtually all general-purpose computers. 
@@ -27,6 +27,15 @@ Similar to our handmade float implementation, hardware floats use one bit for si One of the reasons why they are stored in this exact order is that it is easier to compare and sort them: you can use mostly the same comparator circuit as for [unsigned integers](../integer), except for maybe flipping some bits in case one of the numbers is negative. +For the same reason, the exponent is *biased:* the actual value is 127 less than the stored unsigned integer, which lets us also cover the values less than one (with negative exponents). In the example above: + +$$ +(-1)^0 \times 2^{01111100_2 - 127} \times (1 + 2^{-2}) += 2^{124 - 127} \times 1.25 += \frac{1.25}{8} += 0.15625 +$$ + IEEE 754 and a few consequent standards define not one but *several* representations that differ in sizes, most notably: | Type | Sign | Exponent | Mantissa | Total bits | Approx. decimal digits | @@ -46,11 +55,11 @@ Their availability ranges from chip to chip: - Half-precision arithmetic only supports a small subset of operations and is generally used for machine learning applications, especially neural networks, because they tend to do a large amount of calculation, but don't require a high level of precision. - Half-precision is being gradually replaced by bfloat, which trades off 3 mantissa bits to have the same range as single-precision, enabling interoperability with it. It is mostly being adopted by specialized hardware: TPUs, FPGAs, and GPUs. The name stands for "[Brain](https://en.wikipedia.org/wiki/Google_Brain) float." -Lower precision types need less memory bandwidth to move them around and usually take fewer cycles to operate on (e. g. the division instruction may take $x$, $y$, or $z$ cycles depending on the type), which is why they are preferred when error tolerance allows it. +Lower-precision types need less memory bandwidth to move them around and usually take fewer cycles to operate on (e. g. 
the division instruction may take $x$, $y$, or $z$ cycles depending on the type), which is why they are preferred when error tolerance allows it. Deep learning, emerging as a very popular and computationally-intensive field, created a huge demand for low-precision matrix multiplication, which led to manufacturers developing separate hardware or at least adding specialized instructions that support these types of computations — most notably, Google developing a custom chip called TPU (*tensor processing unit*) that specializes on multiplying 128-by-128 bfloat matrices, and NVIDIA adding "tensor cores," capable of performing 4-by-4 matrix multiplication in one go, to all their newer GPUs. -Apart from their sizes, most of the behavior is exactly the same between all floating-point types, which we will now clarify. +Apart from their sizes, most of the behavior is the same between all floating-point types, which we will now clarify. ## Handling Corner Cases From 436ffa7b608309d8a2246f403d2c95557bbb7d76 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 11 Apr 2022 19:10:37 +0300 Subject: [PATCH 395/531] comments about bit tricks in fast rsqrt --- content/english/hpc/arithmetic/rsqrt.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/arithmetic/rsqrt.md b/content/english/hpc/arithmetic/rsqrt.md index 63b2799a..9817e5a9 100644 --- a/content/english/hpc/arithmetic/rsqrt.md +++ b/content/english/hpc/arithmetic/rsqrt.md @@ -77,7 +77,7 @@ $$ \log_2 x = e_x + \log_2 (1 + m_x) \approx e_x + m_x + \sigma $$ -Now, having this approximation in mind and defining $L=2^{23}$ (the number of mantissa bits in a `float`) and $B=127$ (the exponent bias), when we reinterpret the bit-pattern of $x$ as an integer $I_x$, we get +Now, having this approximation in mind and defining $L=2^{23}$ (the number of mantissa bits in a `float`) and $B=127$ (the exponent bias), when we reinterpret the bit-pattern of $x$ as an integer $I_x$, we essentially 
get $$ \begin{aligned} @@ -87,9 +87,11 @@ I_x &= L \cdot (e_x + B + m_x) \end{aligned} $$ +(Multiplying a number by $L=2^{23}$ is equivalent to left-shifting it by 23.) + When you tune $\sigma$ to minimize the mean square error, this results in a surprisingly accurate approximation. -![](../img/approx.svg) +![Reinterpreting a floating-point number $x$ as an integer (blue) compared to its scaled and shifted logarithm (gray)](../img/approx.svg) Now, expressing the logarithm from the approximation, we get From 95899a63c97b582a7b93ceb66369d27cd854c3e0 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 11 Apr 2022 19:12:10 +0300 Subject: [PATCH 396/531] more precise wording --- content/english/hpc/arithmetic/rsqrt.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/rsqrt.md b/content/english/hpc/arithmetic/rsqrt.md index 9817e5a9..0fa4d209 100644 --- a/content/english/hpc/arithmetic/rsqrt.md +++ b/content/english/hpc/arithmetic/rsqrt.md @@ -87,7 +87,7 @@ I_x &= L \cdot (e_x + B + m_x) \end{aligned} $$ -(Multiplying a number by $L=2^{23}$ is equivalent to left-shifting it by 23.) +(Multiplying an integer by $L=2^{23}$ is equivalent to left-shifting it by 23.) When you tune $\sigma$ to minimize the mean square error, this results in a surprisingly accurate approximation. 
From aec8d782b10e76df76c6483d34fb05a4f988462a Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 11 Apr 2022 20:40:12 +0300 Subject: [PATCH 397/531] fix variable names in dp example --- .../english/hpc/external-memory/locality.md | 49 +++++++++++-------- 1 file changed, 28 insertions(+), 21 deletions(-) diff --git a/content/english/hpc/external-memory/locality.md b/content/english/hpc/external-memory/locality.md index a26ff70f..569d9437 100644 --- a/content/english/hpc/external-memory/locality.md +++ b/content/english/hpc/external-memory/locality.md @@ -47,44 +47,51 @@ In practice, there is still some overhead associated with the recursion, and for ### Dynamic Programming -Similar reasoning can be applied to the implementations of dynamic programming algorithms but leading to the reverse result. Consider the classic knapsack problem, where we got $n$ items with integer costs $c_i$, and we need to pick a subset of items with the maximum total cost that does not exceed a given constant $w$. +Similar reasoning can be applied to the implementations of dynamic programming algorithms but leading to the reverse result. Consider the classic *knapsack problem:* given $N$ items with positive integer costs $c_i$, pick a subset of items with the maximum total cost that does not exceed a given constant $W$. -The way to solve it is to introduce the *state* $f[i, k]$, which corresponds to the maximum total cost not exceeding $k$ that can be achieved having already considered and excluded the first $i$ items. The state can be updated in $O(1)$ time per entry if consider either taking or not taking the $i$-th item and using further states of the dynamic to compute the optimal decision for each state. +The way to solve it is to introduce the *state* $f[n, w]$, which corresponds to the maximum total cost not exceeding $w$ that can be achieved using only the first $n$ items. 
These values can be computed in $O(1)$ time per entry if we consider either taking or not taking the $n$-th item and using the previous states of the dynamic to make the optimal decision. -Python has a handy `lru_cache` decorator, which can be used for implementing it with memoized recursion: +Python has a handy `lru_cache` decorator which can be used for implementing it with memoized recursion: ```python @lru_cache -def f(i, k): - if i == n or k == 0: +def f(n, w): + # check if we have no items to choose + if n == 0: return 0 - if w[i] > k: - return f(i + 1, k) - return max(f(i + 1, k), c[i] + f(i + 1, k - w[i])) + + # check if we can't pick the last item (note zero-based indexing) + if c[n - 1] > w: + return f(n - 1, w) + + # otherwise, we can either pick the last item or not + return max(f(n - 1, w), c[n - 1] + f(n - 1, w - c[n - 1])) ``` -When computing $f[n, w]$, the recursion may visit up to $O(n \cdot w)$ different states, which is asymptotically efficient, but rather slow in reality. Even after nullifying the overhead of Python recursion and all the hash table queries required for the LRU cache to work, it would still be slow because it does random I/O throughout most of the execution. +When computing $f[N, W]$, the recursion may visit up to $O(N \cdot W)$ different states, which is asymptotically efficient, but rather slow in reality. Even after nullifying the overhead of Python recursion and all the [hash table queries](../policies/#implementing-caching) required for the LRU cache to work, it would still be slow because it does random I/O throughout most of the execution. What we can do instead is to create a two-dimensional array for the dynamic and replace the recursion with a nice nested loop like this: ```cpp -int f[N + 1][W + 1]; +int f[N + 1][W + 1] = {0}; // this zero-fills the array -for (int i = n - 1; i >= 0; i++) - for (int k = 0; k <= W; k++) - f[i][k] = w[i] > k ? 
f[i + 1][k] : max(f[i + 1][k], c[i] + f[i + 1][k - w[i]]); +for (int n = 1; n <= N; n++) + for (int w = 0; w <= W; w++) + f[n][w] = c[n - 1] > w ? + f[n - 1][w] : + max(f[n - 1][w], c[n - 1] + f[n - 1][w - c[n - 1]]); ``` -Notice that we are only using the previous layer of the dynamic to calculate the next one. This means that if we can store one layer in the cache, we would only need to write $O(\frac{n \cdot w}{B})$ blocks in external memory. +Notice that we are only using the previous layer of the dynamic to calculate the next one. This means that if we can store one layer in the cache, we would only need to write $O(\frac{N \cdot W}{B})$ blocks in external memory. -Moreover, if we only need the answer, we don't actually have to store the whole 2d array but only the last layer. This lets us use just $O(w)$ memory by maintaining a single array of $w$ values. To simplify the code, we can slightly change the dynamic to store a binary value: whether it is possible to get the sum of exactly $k$ using the items that we have already considered. This dynamic is even faster to compute: +Moreover, if we only need the answer, we don't actually have to store the whole 2d array but only the last layer. This lets us use just $O(W)$ memory by maintaining a single array of $W$ values. To simplify the code, we can slightly change the dynamic to store a binary value: whether it is possible to get the sum of exactly $w$ using the items that we have already considered. 
This dynamic is even faster to compute: ```cpp -bool f[W + 1] = {}; // this zero-fills the array +bool f[W + 1] = {0}; f[0] = 1; -for (int i = 0; i < n; i++) - for (int x = W - a[i]; x >= 0; x--) - f[x + a[i]] |= f[x]; +for (int n = 0; n < N; n++) + for (int x = W - c[n]; x >= 0; x--) + f[x + c[n]] |= f[x]; ``` As a side note, now that it only uses simple bitwise operations, it can be optimized further by using a bitset: @@ -92,8 +99,8 @@ As a side note, now that it only uses simple bitwise operations, it can be optim ```cpp std::bitset b; b[0] = 1; -for (int i = 0; i < n; i++) - b |= b << c[i]; +for (int n = 0; n < N; n++) + b |= b << c[n]; ``` Surprisingly, there is still some room for improvement, and we will come back to this problem later. From 9872a11b931c184f51ce01e076d6b9adb1bbe690 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 11 Apr 2022 20:46:50 +0300 Subject: [PATCH 398/531] change wording --- content/english/hpc/cpu-cache/bandwidth.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/cpu-cache/bandwidth.md b/content/english/hpc/cpu-cache/bandwidth.md index 88b547ad..a28570f5 100644 --- a/content/english/hpc/cpu-cache/bandwidth.md +++ b/content/english/hpc/cpu-cache/bandwidth.md @@ -38,7 +38,7 @@ All CPU cache layers are placed on the same microchip as the processor, so the b ![](../img/boost.svg) -This detail comes into play when comparing algorithm implementations. Unless the dataset fits entirely in the cache, the relative performance of the two implementations may be different depending on the CPU clock rate because the RAM remains unaffected by it, while everything else does. +This detail comes into play when comparing algorithm implementations. When the working dataset fits in the cache, the relative performance of the two implementations may be different depending on the CPU clock rate because the RAM remains unaffected by it (while everything else does not). 
For this reason, it is [advised](/hpc/profiling/noise) to keep the clock rate fixed, and as the turbo boost isn't stable enough, we run most of the benchmarks in this book at plain 2GHz. From 69390b1012b84459f33b279f2e4646a0ed41f357 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 12 Apr 2022 17:45:08 +0300 Subject: [PATCH 399/531] typo --- content/english/hpc/external-memory/list-ranking.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/english/hpc/external-memory/list-ranking.md b/content/english/hpc/external-memory/list-ranking.md index cf5d9929..6d7c0053 100644 --- a/content/english/hpc/external-memory/list-ranking.md +++ b/content/english/hpc/external-memory/list-ranking.md @@ -50,11 +50,11 @@ List ranking is especially useful in graph algorithms. For example, we can obtain the Euler tour of a tree in external memory by constructing a linked list from the tree that corresponds to its Euler tour and then applying the list ranking algorithm — the ranks of each node will be the same as its index $tin_v$ in the Euler tour. To construct this list, we need to: -- split each undirected tree edge into two directed ones; -- duplicate the parent node for each up-edge (because list nodes can only have one incoming edge, but we visit some tree vertices multiple times); +- split each undirected edge into two directed ones; +- duplicate the parent node for each up-edge (because list nodes can only have one incoming edge, but we visit some vertices multiple times); - route each such node either to the "next sibling," if it has one, or otherwise to its own parent; - and then finally break the resulting cycle at the root. This general technique is called *tree contraction*, and it serves as the basis for a large number of tree algorithms. -Exactly the same approach can be applied to parallel algorithms, and we will convert that much more deeply in part 2. 
+The same approach can be applied to parallel algorithms, and we will cover that much more deeply in part II. From 0b9d2bb532003b65c1ae4bc9f5477bd2f4a5ddf4 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 12 Apr 2022 17:48:21 +0300 Subject: [PATCH 400/531] link to strassen algorithm implementation paper --- content/english/hpc/external-memory/oblivious.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/external-memory/oblivious.md b/content/english/hpc/external-memory/oblivious.md index 5e4650b2..a0327855 100644 --- a/content/english/hpc/external-memory/oblivious.md +++ b/content/english/hpc/external-memory/oblivious.md @@ -198,7 +198,7 @@ $$ T(N) = O\left(\frac{(\sqrt{M})^2}{B} \cdot \left(\frac{N}{\sqrt M}\right)^3\right) = O\left(\frac{N^3}{B\sqrt{M}}\right) $$ -This is better than just $O(\frac{N^3}{B})$ and by quite a lot. +This is better than just $O(\frac{N^3}{B})$, and by quite a lot. ### Strassen Algorithm @@ -237,7 +237,7 @@ $$ You can verify these formulas with simple substitution if you feel like it. -As far as I know, none of the mainstream optimized linear algebra libraries use the Strassen algorithm, although there are some prototype implementations that are efficient for matrices larger than 4000 or so. +As far as I know, none of the mainstream optimized linear algebra libraries use the Strassen algorithm, although there are [some prototype implementations](https://arxiv.org/pdf/1605.01078.pdf) that are efficient for matrices larger than 2000 or so. This technique can and actually has been extended multiple times to reduce the asymptotic even further by considering more submatrix products. As of 2020, current world record is $O(n^{2.3728596})$. Whether you can multiply matrices in $O(n^2)$ or at least $O(n^2 \log^k n)$ time is an open problem. 
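As a small illustration of the "simple substitution" check mentioned in the Strassen hunk above, the sketch below applies the seven-product scheme to a single 2×2 integer matrix so that it can be compared against the schoolbook product. The products are written in the commonly cited textbook form; the names (`Mat2`, `naive2x2`, `strassen2x2`, `m1` to `m7`) are assumptions of this example and may differ from the notation used in the article.

```cpp
#include <array>

using Mat2 = std::array<std::array<long long, 2>, 2>;

// Schoolbook 2x2 product: 8 multiplications.
Mat2 naive2x2(const Mat2 &a, const Mat2 &b) {
    Mat2 c{};  // value-initialized to zeros
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            for (int k = 0; k < 2; k++)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Strassen's scheme: 7 multiplications and 18 additions/subtractions.
Mat2 strassen2x2(const Mat2 &a, const Mat2 &b) {
    long long m1 = (a[0][0] + a[1][1]) * (b[0][0] + b[1][1]);
    long long m2 = (a[1][0] + a[1][1]) * b[0][0];
    long long m3 = a[0][0] * (b[0][1] - b[1][1]);
    long long m4 = a[1][1] * (b[1][0] - b[0][0]);
    long long m5 = (a[0][0] + a[0][1]) * b[1][1];
    long long m6 = (a[1][0] - a[0][0]) * (b[0][0] + b[0][1]);
    long long m7 = (a[0][1] - a[1][1]) * (b[1][0] + b[1][1]);

    Mat2 c{};
    c[0][0] = m1 + m4 - m5 + m7;
    c[0][1] = m3 + m5;
    c[1][0] = m2 + m4;
    c[1][1] = m1 - m2 + m3 + m6;
    return c;
}
```

In the recursive algorithm the scalar entries are replaced by submatrices, which is where the saved eighth multiplication pays off.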
From c5b7bd4b85ab1a90c25400c073181df687978377 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 12 Apr 2022 19:11:58 +0300 Subject: [PATCH 401/531] note about kernel design choices --- content/english/hpc/algorithms/matmul.md | 25 ++++++++++++++++++++++++ 1 file changed, 25 insertions(+) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index a5a7b4f2..c692a227 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -272,6 +272,31 @@ void kernel(float *a, vec *b, vec *c, int x, int y, int l, int r, int n) { We need `t` so that the compiler stores these elements in vector registers. We could just update their final destinations in `c`, but, unfortunately, the compiler re-writes them back to memory, causing a slowdown (wrapping everything in `__restrict__` keywords doesn't help). +After unrolling these loops and hoisting `b` out of the `i` loop (`b[(k * n + y) / 8 + j]` does not depend on `i` and can be loaded once and reused in all 6 iterations), the compiler generates something more similar to this: + + + +```c++ +for (int k = l; k < r; k++) { + __m256 b0 = _mm256_load_ps((__m256*) &b[k * n + y]); + __m256 b1 = _mm256_load_ps((__m256*) &b[k * n + y + 8]); + + __m256 a0 = _mm256_broadcast_ps((__m128*) &a[x * n + k]); + t00 = _mm256_fmadd_ps(a0, b0, t00); + t01 = _mm256_fmadd_ps(a0, b1, t01); + + __m256 a1 = _mm256_broadcast_ps((__m128*) &a[(x + 1) * n + k]); + t10 = _mm256_fmadd_ps(a1, b0, t10); + t11 = _mm256_fmadd_ps(a1, b1, t11); + + // ... +} +``` + +We are using $12+3=15$ vector registers and a total of $6 \times 3 + 2 = 20$ instructions to perform $16 \times 6 = 96$ updates. Assuming that there are no other bottlenecks, we should be hitting the throughput of `_mm256_fmadd_ps`. + +Note that this kernel is architecture-specific. 
If we didn't have `fma`, or if its throughput/latency were different, or if the SIMD width was 128 or 512 bits, we would have made different design choices. Multi-platform BLAS implementations ship [many kernels](https://github.com/xianyi/OpenBLAS/tree/develop/kernel), each written in assembly by hand and optimized for a particular architecture. + The rest of the implementation is straightforward. Similar to the previous vectorized implementation, we just move the matrices to memory-aligned arrays and call the kernel instead of the innermost loop: ```c++ From 473fe8562d44b769d27a0b8c8229f281eea2d3b3 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 12 Apr 2022 23:40:21 +0300 Subject: [PATCH 402/531] mlp clarifications --- content/english/hpc/cpu-cache/mlp.md | 2 +- content/english/hpc/cpu-cache/prefetching.md | 8 ++++---- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/content/english/hpc/cpu-cache/mlp.md b/content/english/hpc/cpu-cache/mlp.md index 11c5b660..95dfa4cb 100644 --- a/content/english/hpc/cpu-cache/mlp.md +++ b/content/english/hpc/cpu-cache/mlp.md @@ -3,7 +3,7 @@ title: Memory-Level Parallelism weight: 5 --- -Memory requests can overlap in time: while you wait for a read request to complete, you can send a few others, which will be executed concurrently with it. This is the reason why [linear iteration](../bandwidth) is so much faster than [pointer jumping](../latency): the CPU knows which memory locations it needs to fetch next and sends memory requests far ahead of time. +Memory requests can overlap in time: while you wait for a read request to complete, you can send a few others, which will be executed concurrently with it. This is the main reason why [linear iteration](../bandwidth) is so much faster than [pointer jumping](../latency): the CPU knows which memory locations it needs to fetch next and sends memory requests far ahead of time. 
The number of concurrent memory operations is large but limited, and it is different for different types of memory. When designing algorithms and especially data structures, you may want to know this number, as it limits the amount of parallelism your computation can achieve. diff --git a/content/english/hpc/cpu-cache/prefetching.md b/content/english/hpc/cpu-cache/prefetching.md index 8ccdea6b..3001389c 100644 --- a/content/english/hpc/cpu-cache/prefetching.md +++ b/content/english/hpc/cpu-cache/prefetching.md @@ -70,7 +70,7 @@ There is some overhead to computing the next address, but for arrays large enoug ![](../img/sw-prefetch.svg) -Interestingly, we can prefetch more than just two elements ahead, making use of this pattern in the LCG function: +Interestingly, we can prefetch more than just one element ahead, making use of this pattern in the LCG function: $$ \begin{aligned} @@ -82,17 +82,17 @@ $$ \end{aligned} $$ -Hence, in order to load `D` elements ahead, we can do this: +Hence, to load the `D`-th element ahead, we can do this: ```cpp __builtin_prefetch(&q[((1 << D) * k + (1 << D) - 1) % n]); ``` -Ignoring some issues such as the integer overflow, this way we can reduce the latency arbitrarily close to the cost of computing the next index (which in this case is dominated by the [modulo operation](/hpc/arithmetic/division)). +If we execute this request on every iteration, we will be simultaneously prefetching `D` elements ahead on average, increasing the throughput by `D` times. Ignoring some issues such as the integer overflow when `D` is too large, this way, we can reduce the average latency arbitrarily close to the cost of computing the next index (which, in this case, is dominated by the [modulo operation](/hpc/arithmetic/division)). ![](../img/sw-prefetch-others.svg) -Note that this is an artificial example, and you actually fail more often than not when trying to insert software prefetching into practical programs. 
This is largely due to the fact that you need to issue a separate memory instruction that may compete for resources with the others. At the same time, hardware prefetching is 100% harmless as it only activates when the memory and cache buses are not busy. +Note that this is an artificial example, and you actually fail more often than not when trying to insert software prefetching into practical programs. This is largely because you need to issue a separate memory instruction that may compete for resources with the others. At the same time, hardware prefetching is 100% harmless as it only activates when the memory and cache buses are not busy. You can also specify a specific level of cache the data needs to be brought to when doing software prefetching — when you aren't sure if you will be using it and don't want to kick out what is already in the L1 cache. You can use it with the `_mm_prefetch` intrinsic, which takes an integer value as the second parameter, specifying the cache level. This is useful in combination with [non-temporal loads and stores](../bandwidth#bypassing-the-cache). From 2a0cf6808d345a51c19de689c8b206f30d1ae92d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 12 Apr 2022 23:47:11 +0300 Subject: [PATCH 403/531] prefetching edits --- content/english/hpc/cpu-cache/prefetching.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/english/hpc/cpu-cache/prefetching.md b/content/english/hpc/cpu-cache/prefetching.md index 3001389c..4f5a7545 100644 --- a/content/english/hpc/cpu-cache/prefetching.md +++ b/content/english/hpc/cpu-cache/prefetching.md @@ -30,9 +30,9 @@ for (int i = 0; i + 16 < N; i += 16) { } ``` -There is no point in making a graph because the latency is flat: 3ns regardless of the array size. 
Even though the instruction scheduler still can't tell what we are going to fetch next, the memory prefetcher can detect a pattern just by looking at the memory accesses and start loading the next cache line ahead of time, leveling out its latency. +There is no point in making a graph because it would be just flat: the latency is 3ns regardless of the array size. Even though the instruction scheduler still can't tell what we are going to fetch next, the memory prefetcher can detect a pattern just by looking at the memory accesses and start loading the next cache line ahead of time, mitigating the latency. -Hardware prefetching is usually powerful enough for most cases, but it only detects simple patterns. You can iterate forward and backward over multiple arrays in parallel, perhaps with small-to-medium strides, but that's about it. For anything more complex, the prefetcher won't figure out what's happening, and we need to help it out ourselves. +Hardware prefetching is smart enough for most use cases, but it only detects simple patterns. You can iterate forward and backward over multiple arrays in parallel, perhaps with small-to-medium strides, but that's about it. For anything more complex, the prefetcher won't figure out what's happening, and we need to help it out ourselves. ### Software Prefetching @@ -88,7 +88,7 @@ Hence, to load the `D`-th element ahead, we can do this: __builtin_prefetch(&q[((1 << D) * k + (1 << D) - 1) % n]); ``` -If we execute this request on every iteration, we will be simultaneously prefetching `D` elements ahead on average, increasing the throughput by `D` times. Ignoring some issues such as the integer overflow when `D` is too large, this way, we can reduce the average latency arbitrarily close to the cost of computing the next index (which, in this case, is dominated by the [modulo operation](/hpc/arithmetic/division)). 
+If we execute this request on every iteration, we will be simultaneously prefetching `D` elements ahead on average, increasing the throughput by `D` times. Ignoring some issues such as the integer overflow when `D` is too large, we can reduce the average latency arbitrarily close to the cost of computing the next index (which, in this case, is dominated by the [modulo operation](/hpc/arithmetic/division)). ![](../img/sw-prefetch-others.svg) From 68ae398833ab4e47918c4e20f013a333729b0bb9 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 13 Apr 2022 10:54:28 +0300 Subject: [PATCH 404/531] column -> cell --- content/english/hpc/cpu-cache/aos-soa.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/cpu-cache/aos-soa.md b/content/english/hpc/cpu-cache/aos-soa.md index 048271db..d5765339 100644 --- a/content/english/hpc/cpu-cache/aos-soa.md +++ b/content/english/hpc/cpu-cache/aos-soa.md @@ -99,8 +99,8 @@ As the performance on smaller arrays sizes is not affected, this clearly has som From the performance analysis point of view, all data in RAM is physically stored in a two-dimensional array of tiny capacitor cells, which is split into rows and columns. To read or write any cell, you need to perform one, two, or three actions: 1. Read the contents of a row in a *row buffer*, which temporarily discharges the capacitors. -2. Read or write a specific column in this buffer. -3. Write the contents of a row buffer back into the capacitors, so that the data is preserved, and the row buffer can be used for other memory accesses. +2. Read or write a specific cell in this buffer. +3. Write the contents of a row buffer back into the capacitors so that the data is preserved and the row buffer can be used for other memory accesses. Here is the punchline: you don't have to perform steps 1 and 3 between two memory accesses that correspond to the same row — you can just use the row buffer as a temporary cache. 
These three actions take roughly the same time, so this optimization makes long sequences of row-local accesses run thrice as fast compared to dispersed access patterns. From 50ffb1c9324e9d62433f178ba62494070c9b1afd Mon Sep 17 00:00:00 2001 From: Alex Saveau Date: Fri, 15 Apr 2022 11:33:19 -0700 Subject: [PATCH 405/531] Fix missing word --- content/english/hpc/data-structures/binary-search.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index ff9f73b4..36bb5059 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -9,7 +9,7 @@ Instead, the most fascinating showcases of performance engineering are multifold -In this article, we focus on such fundamental algorithm — *binary search* — and implement two of its variants that are, depending on the problem size, up to 4x faster than `std::lower_bound`, while being under just 15 lines of code. +In this article, we focus on one such fundamental algorithm — *binary search* — and implement two of its variants that are, depending on the problem size, up to 4x faster than `std::lower_bound`, while being under just 15 lines of code. The first algorithm achieves that by removing [branches](/hpc/pipelining/branching), and the second also optimizes the memory layout to achieve better [cache system](/hpc/cpu-cache) performance. This technically disqualifies it from being a drop-in replacement for `std::lower_bound` as it needs to permute the elements of the array before it can start answering queries — but I can't recall a lot of scenarios where you obtain a sorted array but can't afford to spend linear time on preprocessing. 
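To give a flavor of what "removing branches" in the binary search hunk above can look like, here is a minimal sketch of a lower-bound search whose loop body has no data-dependent `if`. It is an illustration written for this note, not the implementation from the article: the function name `branchless_lower_bound` and its signature are assumptions, and whether the compiler actually emits a conditional move instead of a branch depends on the compiler and its flags.

```cpp
#include <cstddef>

// Returns the index of the first element of the sorted array a[0..n) that is
// not less than x, or n if there is no such element.
std::size_t branchless_lower_bound(const int *a, std::size_t n, int x) {
    if (n == 0)
        return 0;
    const int *base = a;
    std::size_t len = n;
    while (len > 1) {
        std::size_t half = len / 2;
        // The comparison result (0 or 1) is used as a multiplier, so the
        // search window moves right without an explicit if statement.
        base += (base[half - 1] < x) * half;
        len -= half;
    }
    // base now points at the answer if one exists, otherwise at the last element.
    return static_cast<std::size_t>(base - a) + (*base < x);
}
```

The number of loop iterations depends only on `n`, not on the data, which is what makes the control flow predictable.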
From 35016003c29a455a023f56118f5a9a0cf9c48072 Mon Sep 17 00:00:00 2001 From: Elk Cloner <28754537+elkcl@users.noreply.github.com> Date: Sat, 16 Apr 2022 17:34:45 +0300 Subject: [PATCH 406/531] Fix the link to the z-function in the suffix array article MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- content/russian/cs/string-structures/suffix-array.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/russian/cs/string-structures/suffix-array.md b/content/russian/cs/string-structures/suffix-array.md index 80d2b129..25b90a3e 100644 --- a/content/russian/cs/string-structures/suffix-array.md +++ b/content/russian/cs/string-structures/suffix-array.md @@ -136,7 +136,7 @@ vector suffix_array(vector &s) { ### The Kasai, Arimura, Arikawa, Lee, Park Algorithm -In practice the algorithm goes by just about any name except its original one (*Kasai's algorithm*, *the algorithm of five Koreans*, etc.). It is used to compute $lcp$ in linear time. To the author, the idea behind the algorithm seems somewhat similar to the [z-function](string-searching). +In practice the algorithm goes by just about any name except its original one (*Kasai's algorithm*, *the algorithm of five Koreans*, etc.). It is used to compute $lcp$ in linear time. To the author, the idea behind the algorithm seems somewhat similar to the [z-function](/cs/string-searching/z-function). **Claim.** Suppose we have already built the suffix array and computed $lcp[i]$. 
Тогда: From 656f10fb82d03cb22d928566a5a67e7f6a8fcbd6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Apr 2022 05:28:45 +0300 Subject: [PATCH 407/531] bugfix --- content/russian/cs/layer-optimizations/_index.md | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/content/russian/cs/layer-optimizations/_index.md b/content/russian/cs/layer-optimizations/_index.md index 492473b5..2456aa4c 100644 --- a/content/russian/cs/layer-optimizations/_index.md +++ b/content/russian/cs/layer-optimizations/_index.md @@ -10,10 +10,7 @@ date: 2021-08-29 **Задача.** Даны $n$ точек на прямой, отсортированные по своей координате $x_i$. Нужно найти $m$ отрезков, покрывающих все точки, минимизировав при этом сумму квадратов их длин. -**Базовое решение** — это следующая динамика: - -- $f[i, j]$ = минимальная стоимость покрытия $i$ первых точек, используя не более $j$ отрезков. -- Переход — перебор всех возможных последних отрезков, то есть +**Базовое решение** — определить состояние динамики $f[i, j]$ как минимальную стоимость покрытия $i$ первых точек используя не более $j$ отрезков. 
Пересчитывать её можно перебором всех возможных последних отрезков: $$ f[i, j] = \min_{k < i} \{f[k, j-1] + (x_{i-1}-x_k)^2 \} @@ -30,7 +27,7 @@ int cost(int i, int j) { } for (int i = 0; i <= m; i++) - f[0][k] = 0; // если нам не нужно ничего покрывать, то всё и так хорошо + f[0][i] = 0; // если нам не нужно ничего покрывать, то всё и так хорошо // все остальные f предполагаем равными бесконечности for (int i = 1; i <= n; i++) From 85bc919acc8cb33a7a09e0d37d973cef0548e7bf Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Apr 2022 05:54:04 +0300 Subject: [PATCH 408/531] fix divide and conquer dp --- .../layer-optimizations/divide-and-conquer.md | 31 +++++++++---------- 1 file changed, 14 insertions(+), 17 deletions(-) diff --git a/content/russian/cs/layer-optimizations/divide-and-conquer.md b/content/russian/cs/layer-optimizations/divide-and-conquer.md index 61a7304a..c5e218db 100644 --- a/content/russian/cs/layer-optimizations/divide-and-conquer.md +++ b/content/russian/cs/layer-optimizations/divide-and-conquer.md @@ -8,44 +8,43 @@ published: true *Эта статья — одна из [серии](../). Рекомендуется сначала прочитать все предыдущие.* -Посмотрим на формулу пересчета динамики для базового решения: +Посмотрим на формулу пересчета динамики из базового решения: $$ f[i, j] = \min_{k < i} \{f[k, j-1] + (x_{i-1}-x_k)^2 \} $$ -Обозначим за $opt[i, j]$ оптимальный $k$ для данного состояния — то есть от выражения выше. Для однозначности, если оптимальный индекс не один, то выберем среди них самый правый. +Обозначим за $opt[i, j]$ оптимальный $k$ для данного состояния — то есть аргминимум от выражения выше. Для однозначности, если оптимальный индекс не один, то выберем среди них самый правый. 
-Конкретно в задаче покрытия точек отрезками, можно заметить следующее: +Конкретно в задаче покрытия точек отрезками можно заметить следующее: $$ -opt[i, j] \leq opt[i+1, j] +opt[i + 1, j] \leq opt[i, j] $$ -Интуиция такая: когда мы сдвигаем i вправо, то точка, с которой может начинаться последняя группа, не может уменьшаться. +Интуация такая: если нам нужно покрыть больший префикс точек, то начало последнего отрезка точно не будет раньше. -### Идея +### Алгоритм -Пусть мы уже знаем $opt[i, l]$ и $opt[i, r]$ и хотим посчитать $opt[i, j]$ для какого-то $j$ между $l$ и $r$. Тогда, воспользовавшись неравенством выше, мы можем сузить отрезок поиска оптимального индекса для $j$ со всего отрезка $[0, i-1]$ до $[opt[i, l], opt[i, r]]$. +Пусть мы уже знаем $opt[l, k]$ и $opt[r, k]$ и хотим посчитать $opt[i, k]$ для какого-то $i$ между $l$ и $r$. Тогда, воспользовавшись неравенством выше, мы можем сузить отрезок поиска оптимального индекса для $i$ со всего отрезка $[0, i - 1]$ до $[opt[l, k], opt[r, k]]$. -Будем делать следующее: заведем рекурсивную функцию, которая считает динамики для отрезка $[l, r]$, зная, что их $opt$ лежат между $l'$ и $r'$. Эта функция просто берет середину отрезка $[l, r]$ и линейным проходом считает ответ для неё, а затем рекурсивно запускается от половин, передавая в качестве границ $[l', opt]$ и $[opt, r']$ соответственно. - -### Реализация - -Один $k$-тый слой целиком пересчитывается из $(k-1)$-го следующим образом: +Будем делать следующее: заведем рекурсивную функцию, которая считает динамики для отрезка $[l, r]$ на $k$-том слое, зная, что их $opt$ лежат между $l'$ и $r'$. 
Эта функция просто берет середину отрезка $[l, r]$ и линейным проходом считает ответ для неё, а затем рекурсивно запускается от половин, передавая в качестве границ $[l', opt]$ и $[opt, r']$ соответственно: ```c++ +// [ l, r] -- какие динамики на k-том слое посчитать +// [_l, _r] -- где могут быть их ответы void solve(int l, int r, int _l, int _r, int k) { if (l > r) return; // отрезок пустой -- выходим int opt = _l, t = (l + r) / 2; + // считаем ответ для f[t][k] for (int i = _l; i <= min(_r, t); i++) { int val = f[i + 1][k - 1] + cost(i, t - 1); if (val < f[t][k]) f[t][k] = val, opt = i; } - solve(l, t - 1, _l, opt, k); - solve(t + 1, r, opt, _r, k); + solve(l, t - 1, _l, opt, k); + solve(t + 1, r, opt, _r, k); } ``` @@ -56,8 +55,6 @@ for (int k = 1; k <= m; k++) solve(0, n - 1, 0, n - 1, k); ``` -### Асимптотика - Так как отрезок $[l, r]$ на каждом вызове уменьшается примерно в два раза, глубина рекурсии будет $O(\log n)$. Так как отрезки поиска для всех элементов на одном «уровне» могут пересекаться разве что только по границам, то суммарно на каждом уровне поиск проверит $O(n)$ различных индексов. Соответственно, пересчет всего слоя займет $O(n \log n)$ операций вместо $O(n^2)$ в базовом решении. -Таким образом, мы улучшили асимптотику до $O(n m \log n)$. +Таким образом, мы улучшили асимптотику до $O(n \cdot m \cdot \log n)$. 
From d5c5fb5a62c2a5645d9473dda6bec8eb7430a39f Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Apr 2022 05:56:29 +0300 Subject: [PATCH 409/531] fix knuth dp criterion --- content/russian/cs/layer-optimizations/knuth.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/russian/cs/layer-optimizations/knuth.md b/content/russian/cs/layer-optimizations/knuth.md index 5c49dbe6..8a184d2d 100644 --- a/content/russian/cs/layer-optimizations/knuth.md +++ b/content/russian/cs/layer-optimizations/knuth.md @@ -9,13 +9,13 @@ prerequisites: Предыдущий метод оптимизации опирался на тот факт, что $opt[i, j] \leq opt[i, j + 1]$. -Асимптотику можно ещё улучшить, заметив, что $opt$ монотонен ещё и по первому параметру: +Асимптотику можно ещё улучшить, заметив, что $opt$ монотонен также и по второму параметру: $$ -opt[i-1, j] \leq opt[i, j] \leq opt[i, j+1] +opt[i - 1, j] \leq opt[i, j] \leq opt[i, j + 1] $$ -В задаче про покрытие отрезками это выполняется примерно по той же причине: если нам нужно покрывать меньше точек, то новый оптимальный последний отрезок будет начинаться не позже старого. +В задаче про покрытие отрезками это выполняется примерно по той же причине: если нам доступно больше отрезков, то последний отрезок в оптимальном решении точно не будет длиннее, чем раньше. ### Алгоритм From ac8906113eee302e9ee6b681909a56d391cf5bb3 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Apr 2022 06:15:41 +0300 Subject: [PATCH 410/531] mark drafts in toc --- themes/algorithmica/assets/style.sass | 5 +++++ themes/algorithmica/layouts/partials/sidebar.html | 4 ++-- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/themes/algorithmica/assets/style.sass b/themes/algorithmica/assets/style.sass index fe3ebaeb..0a42a2d6 100644 --- a/themes/algorithmica/assets/style.sass +++ b/themes/algorithmica/assets/style.sass @@ -157,6 +157,11 @@ body &::before content: counter(chapter-counter) "." counter(section-counter) ". 
" font-weight: bold + + .draft, .draft a + color: $dimmed + + #wrapper width: 100% diff --git a/themes/algorithmica/layouts/partials/sidebar.html b/themes/algorithmica/layouts/partials/sidebar.html index 2276957a..816887f5 100644 --- a/themes/algorithmica/layouts/partials/sidebar.html +++ b/themes/algorithmica/layouts/partials/sidebar.html @@ -24,13 +24,13 @@ {{ if isset .Params "part" }}

  • {{.Params.Part}}
  • {{ end }} -
  • {{ .Title }}
  • {{ if .IsSection }}
      {{ range .Pages }} -
    1. {{ .Title }}
    2. {{ end }} From 16a9a52c12e777103d06cb52728aadc8fcb5c4ce Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Apr 2022 07:22:48 +0300 Subject: [PATCH 411/531] inversions edits --- content/russian/cs/sequences/inversions.md | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/content/russian/cs/sequences/inversions.md b/content/russian/cs/sequences/inversions.md index f18d1f4a..2fbec7d9 100644 --- a/content/russian/cs/sequences/inversions.md +++ b/content/russian/cs/sequences/inversions.md @@ -4,13 +4,18 @@ title: Число инверсий weight: 5 authors: - Сергей Слотин +draft: true --- -Пусть у нас есть некоторая перестановка $p$ (какая-то последовательность чисел от $1$ до $n$, где все числа встречаются ровно один раз). *Инверсией* называется пара индексов $i$ и $j$ такая, что $i < j$ и $p_i > p_j$. Требуется найти количество инверсий в данной перестановке. +**Определение.** *Инверсией* в перестановке $p$ называется пара индексов $i$ и $j$ такая, что $i < j$ и $p_i > p_j$. -## Наивный алгоритм +Например: -Эта задача легко решается за $O(n^2)$ обычным перебором всех пар индексов и проверкой каждого на инверсию: +- в перестановке $[1, 2, 3]$ инверсий нет, +- в $[1, 3, 2]$ одна инверсия ($3 \leftrightarrow 2$), +- в $[3, 2, 1]$ три инверсии ($3 \leftrightarrow 2$, $3 \leftrightarrow 1$ и $2 \leftrightarrow 1$). + +В этой статье мы рассмотрим, как находить количество инверсий в перестановке. Эта задача легко решается за $O(n^2)$ обычным перебором всех пар индексов и проверкой каждого на инверсию: ```cpp int count_inversions(int *p, int n) { @@ -23,6 +28,8 @@ int count_inversions(int *p, int n) { } ``` +Решить её быстрее сложнее. + ## Сортировкой слиянием Внезапно эту задачу можно решить сортировкой слиянием, слегка модифицировав её. 
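Review note on the inversions patch: the merge-sort approach named in the last heading can be sketched as follows. This is a minimal sketch, not the article's own code; the names `sort_and_count` and `count_inversions_fast` are hypothetical. While merging two sorted halves, every time an element of the right half is emitted before the remaining elements of the left half, it forms an inversion with each of them.

```cpp
#include <cstdint>
#include <vector>

// Sorts a[l, r) in place and returns the number of inversions inside it.
// buf is scratch space of the same size as a.
static std::int64_t sort_and_count(std::vector<int> &a, std::vector<int> &buf, int l, int r) {
    if (r - l < 2)
        return 0; // zero or one element: nothing to count
    int m = l + (r - l) / 2;
    std::int64_t inv = sort_and_count(a, buf, l, m) + sort_and_count(a, buf, m, r);
    int i = l, j = m, k = l;
    while (i < m && j < r) {
        if (a[j] < a[i]) {
            // a[j] is smaller than all m - i elements still left in the left half
            inv += m - i;
            buf[k++] = a[j++];
        } else {
            buf[k++] = a[i++];
        }
    }
    while (i < m) buf[k++] = a[i++];
    while (j < r) buf[k++] = a[j++];
    for (k = l; k < r; k++) a[k] = buf[k];
    return inv;
}

// Hypothetical wrapper: takes the permutation by value so the caller's copy stays intact.
std::int64_t count_inversions_fast(std::vector<int> a) {
    std::vector<int> buf(a.size());
    return sort_and_count(a, buf, 0, (int) a.size());
}
```

On the three permutations listed in the patch, `[1, 2, 3]`, `[1, 3, 2]` and `[3, 2, 1]`, this returns 0, 1 and 3 respectively, in $O(n \log n)$ time.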
From b402d342b998a1a13d17eea48781845321abcae4 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Apr 2022 07:23:07 +0300 Subject: [PATCH 412/531] quickselect edits --- content/russian/cs/sequences/quickselect.md | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/content/russian/cs/sequences/quickselect.md b/content/russian/cs/sequences/quickselect.md index b1606bbd..7e83a267 100644 --- a/content/russian/cs/sequences/quickselect.md +++ b/content/russian/cs/sequences/quickselect.md @@ -1,12 +1,12 @@ --- -# TODO: реализация title: Порядковые статистики weight: 4 +draft: true --- Если в [начале предыдущей главы](/cs/interactive/binary-search) мы искали число элементов массива, меньших $x$ — также известное как индекс этого элемента в отсортированном массиве — то теперь нас интересует обратная задача: узнать, какой элемент $k$-тый по возрастанию. -Если массив уже отсортирован, то задача тривиальная — просто берем $k$-тый элемент. Иначе мы его можем отсортировать, но на это потребуется $O(n \log n)$ операций — и мы знаем, что используя только сравнения быстрее не получится. +Если массив уже отсортирован, то задача тривиальная: просто берем $k$-тый элемент. Иначе мы его можем отсортировать, но на это потребуется $O(n \log n)$ операций — и мы знаем, что если мы используем только сравнения, быстрее не получится. Есть другой подход — мы можем модифицировать алгоритм быстрой сортировки. @@ -26,4 +26,17 @@ weight: 4 Подумав над тем, что размер отрезка каждый раз убывает приблизительно в 2 раза, над ограниченностью суммы $n + \frac{n}{2} + \frac{n}{4} + \ldots = 2 \cdot n$, и немного помахав руками, получаем, что алгоритм работает за $O(n)$. + + В C++ этот алгоритм уже реализован и доступен как `nth_element`. 
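Review note on the quickselect patch: the removed `TODO` asked for an implementation, so here is one possible sketch. It is not the author's code; the name `kth_element`, the fixed seed and the three-way partition are assumptions made for illustration. It assumes `0 <= k < a.size()` and 0-based `k`.

```cpp
#include <random>
#include <utility>
#include <vector>

// Returns the k-th smallest element (0-indexed) of a.
// a is taken by value, so the caller's array is not reordered.
int kth_element(std::vector<int> a, int k) {
    std::mt19937 rng(12345); // fixed seed only to keep the sketch reproducible
    int l = 0, r = (int) a.size() - 1;
    while (l < r) {
        int pivot = a[std::uniform_int_distribution<int>(l, r)(rng)];
        // three-way partition of a[l..r] into [< pivot][== pivot][> pivot]
        int lt = l, i = l, gt = r;
        while (i <= gt) {
            if (a[i] < pivot)
                std::swap(a[i++], a[lt++]);
            else if (a[i] > pivot)
                std::swap(a[i], a[gt--]);
            else
                i++;
        }
        // now a[l..lt-1] < pivot, a[lt..gt] == pivot, a[gt+1..r] > pivot
        if (k < lt)
            r = lt - 1;       // the answer is among the smaller elements
        else if (k > gt)
            l = gt + 1;       // the answer is among the larger elements
        else
            return pivot;     // k falls into the block equal to the pivot
    }
    return a[l];
}
```

Unlike quicksort, only one side is kept after each partition, which is where the expected $O(n)$ bound from the article comes from. In production code `std::nth_element` is likely the better choice.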
From c10ebb35240390d9bd7fd69769c8b65ed4f0cdfe Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Apr 2022 07:23:21 +0300 Subject: [PATCH 413/531] sequence compression --- content/russian/cs/sequences/_index.md | 3 +- content/russian/cs/sequences/compression.md | 50 ++++++++++++++------- 2 files changed, 35 insertions(+), 18 deletions(-) diff --git a/content/russian/cs/sequences/_index.md b/content/russian/cs/sequences/_index.md index d02ed49b..6888831d 100644 --- a/content/russian/cs/sequences/_index.md +++ b/content/russian/cs/sequences/_index.md @@ -1,7 +1,6 @@ --- title: Последовательности weight: 4 -draft: true --- -В этой главе рассматриваются некоторые алгоритмы на неотсортированных последовательностях. +В этой главе рассматриваются алгоритмы для неотсортированных последовательностей. diff --git a/content/russian/cs/sequences/compression.md b/content/russian/cs/sequences/compression.md index 332011b3..58686d5c 100644 --- a/content/russian/cs/sequences/compression.md +++ b/content/russian/cs/sequences/compression.md @@ -3,46 +3,64 @@ title: Сжатие координат authors: - Сергей Слотин weight: -1 -draft: true +date: 2022-04-20 --- +Часто бывает полезно преобразовать последовательность чисел либо каких-то других объектов в промежуток последовательных целых чисел — например, чтобы использовать её элементы как индексы в массиве либо какой-нибудь другой структуре. -## Сжатие координат -Это общая идея, которая может оказаться полезной. Пусть, есть $n$ чисел $a_1,\ldots,a_n$. Хотим, преобразовать $a_i$ так, чтобы равные остались равными, разные остались разными, но все они были от 0 до $n-1$. Для этого надо отсортировать числа, удалить повторяющиеся и заменить каждое $a_i$ на его индекс в отсортированном массиве. 
+Эта задача эквивалентна нумерации элементов множества, что можно сделать за $O(n)$ через хэш-таблицу: +```c++ +vector compress(vector a) { + unordered_map m; -``` -int a[n], all[n]; -for (int i = 0; i < n; ++i) { - cin >> a[i]; - all[i] = a[i]; + for (int &x : a) { + if (m.count(x)) + x = m[x]; + else + m[x] = m.size(); + } + + return a; } -sort(all, all + n); -m = unique(all, all + n) - all; // теперь m - число различных координат -for (int i = 0; i < n; ++i) - a[i] = lower_bound(all, all + m, x[i]) - all; ``` -```cpp +Элементам будут присвоены номера в порядке их первого вхождения в последовательность. Если нужно сохранить *порядок*, присвоив меньшим элементам меньшие номера, то задача становится чуть сложнее, и её можно решить разными способами. + +Как вариант, можно отсортировать массив, а затем два раза пройтись по нему с хэш-таблицей — в первый раз заполняя её, а во второй раз сжимая сам массив: + +```c++ vector compress(vector a) { + vector b = a; + sort(b.begin(), b.end()); + unordered_map m; - for (int x : a) - if (m.count(x)) + + for (int x : b) + if (!m.count(x)) m[x] = m.size(); + for (int &x : a) x = m[x]; + return a; } ``` +Также можно выкинуть из отсортированного массива дупликаты (за линейное время), а затем использовать его для нахождения индекса каждого элемента исходного массива бинарным поиском: -```cpp +```c++ vector compress(vector a) { vector b = a; + sort(b.begin(), b.end()); b.erase(unique(b.begin(), b.end()), b.end()); + for (int &x : a) x = int(lower_bound(b.begin(), b.end(), x) - b.begin()); + return a; } ``` + +Оба подхода работают за $O(n \log n)$. Используйте тот, который больше нравится. 
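Review note on the compression patch: in the first added listing, the `else` branch seems to record the new number in the map without writing it back to `x`, so the first occurrence of each value would stay uncompressed. Below is a self-contained sketch of both variants described in the patch. The function names are hypothetical, and `try_emplace` assumes C++17.

```cpp
#include <algorithm>
#include <unordered_map>
#include <vector>

// Numbers elements in order of first occurrence, O(n) expected time.
std::vector<int> compress_by_first_occurrence(std::vector<int> a) {
    std::unordered_map<int, int> id;
    for (int &x : a) {
        // inserts (x, current map size) only if x has not been seen yet;
        // the size is read before the insertion takes place
        auto it = id.try_emplace(x, (int) id.size()).first;
        x = it->second;
    }
    return a;
}

// Smaller values get smaller numbers, O(n log n).
std::vector<int> compress_preserving_order(std::vector<int> a) {
    std::vector<int> b = a;
    std::sort(b.begin(), b.end());
    b.erase(std::unique(b.begin(), b.end()), b.end()); // drop duplicates
    for (int &x : a)
        x = int(std::lower_bound(b.begin(), b.end(), x) - b.begin());
    return a;
}
```

For the input `{10, 5, 10, 7}` the first function gives `{0, 1, 0, 2}` and the second gives `{2, 0, 2, 1}`.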
From ad0c2aa70cfb6e6d3622174e8cbd6fee8399bba7 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Apr 2022 07:28:06 +0300 Subject: [PATCH 414/531] quicksort edits --- content/russian/cs/sorting/quicksort.md | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-) diff --git a/content/russian/cs/sorting/quicksort.md b/content/russian/cs/sorting/quicksort.md index f3a6a5d6..e6494cd3 100644 --- a/content/russian/cs/sorting/quicksort.md +++ b/content/russian/cs/sorting/quicksort.md @@ -7,13 +7,18 @@ draft: true Быстрая сортировка заключается в том, что на каждом шаге мы находим опорный элемент, все элементы, которые меньше его кидаем в левую часть, остальные в правую, а затем рекурсивно спускаемся в обе части. ```cpp +// partition - функция разбивающие элементы +// на меньшие и больше/равные a[index], +// при этом функция возвращает границу разбиения +void partition(int l, int r, int p) { + +} + void quicksort(int l, int r){ if (l < r){ int index = (l + r) / 2; /* index - индекс опорного элемента для начала сделаем его равным середине отрезка*/ - index = divide(l, r, index); /* divide - функция разбивающие элементы - на меньшие и больше/равные a[index], - при этом функция возвращает границу разбиения*/ + index = partition(l, r, index); quicksort(l, index); quicksort(index + 1, r); } @@ -25,8 +30,6 @@ void quicksort(int l, int r){ Существуют несколько выходов из этой ситуации : -2. Давайте если быстрая сортировка работает долго, то запустим любую другую сортировку за $NlogN$. - -3. Давайте делить массив не на две, а на три части(меньше, равны, больше). - -4. Чтобы избавиться от проблемы с максимумом/минимумом в середине, давайте **брать случайный элемент**. +1. Давайте если быстрая сортировка работает долго, то запустим любую другую сортировку за $NlogN$. +2. Давайте делить массив не на две, а на три части(меньше, равны, больше). +3. Чтобы избавиться от проблемы с максимумом/минимумом в середине, давайте **брать случайный элемент**. 
From 78d207d2d08787ecfecfafb25dfe6adaf347a03c Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Apr 2022 07:31:36 +0300 Subject: [PATCH 415/531] fix ru broken links --- content/russian/cs/algebra/matmul.md | 2 +- content/russian/cs/basic-structures/iterators.md | 4 ++-- content/russian/cs/matching/matching-problems.md | 2 +- content/russian/cs/spanning-trees/kruskal.md | 2 +- content/russian/cs/spanning-trees/safe-edge.md | 2 +- content/russian/cs/string-searching/manacher.md | 2 +- content/russian/cs/string-structures/palindromic-tree.md | 2 +- content/russian/cs/string-structures/suffix-array.md | 4 ++-- content/russian/cs/tree-structures/treap.md | 2 +- 9 files changed, 11 insertions(+), 11 deletions(-) diff --git a/content/russian/cs/algebra/matmul.md b/content/russian/cs/algebra/matmul.md index bc5ca593..8a633bea 100644 --- a/content/russian/cs/algebra/matmul.md +++ b/content/russian/cs/algebra/matmul.md @@ -188,7 +188,7 @@ matrix binpow(matrix a, int p) { Эту технику можно применить и к другим динамикам, где нужно посчитать количество способов что-то сделать — иногда очень неочевидными способами. -Например, можно решить такую задачу: найти количество строк длины $k \approx 10^{18}$, не содержащих данные маленькие запрещённые подстроки. Для этого нужно построить граф «легальных» переходов в [Ахо-Корасике](/cs/automata/aho-corasick), возвести его матрицу смежности в $k$-тую степень и просуммировать в нём первую строчку. +Например, можно решить такую задачу: найти количество строк длины $k \approx 10^{18}$, не содержащих данные маленькие запрещённые подстроки. Для этого нужно построить граф «легальных» переходов в [Ахо-Корасике](/cs/string-structures/aho-corasick), возвести его матрицу смежности в $k$-тую степень и просуммировать в нём первую строчку. В некоторых изощрённых случаях в матричном умножении вместо умножения и сложения нужно использовать другие операции, которые ведут себя как умножение и сложение. 
Пример задачи: «найти путь от $s$ до $t$ с минимальным весом ребра, использующий ровно $k$ переходов»; здесь нужно возводить в $(k-1)$-ую степень матрицу весов графа, и вместо и сложения, и умножения использовать минимум из двух весов. diff --git a/content/russian/cs/basic-structures/iterators.md b/content/russian/cs/basic-structures/iterators.md index b2d8269f..c048e0b6 100644 --- a/content/russian/cs/basic-structures/iterators.md +++ b/content/russian/cs/basic-structures/iterators.md @@ -71,7 +71,7 @@ for (int x : c) ### Алгоритмы из STL -Например, итераторы `std::vector` относятся к `random_access_iterator`, и если вызвать функцию `lower_bound` из стандартной библиотеки, то она произведет [бинарный поиск](../../ordered-search/binary-search) по элементам (предполагая, что они отсортированы в порядке неубывания): +Например, итераторы `std::vector` относятся к `random_access_iterator`, и если вызвать функцию `lower_bound` из стандартной библиотеки, то она произведет [бинарный поиск](/cs/interactive/binary-search/) по элементам (предполагая, что они отсортированы в порядке неубывания): ```cpp vector a = {1, 2, 3, 5, 8, 13}; @@ -93,4 +93,4 @@ array a = {4, 2, 1, 3}; cout << *min_element(a.begin(), a.end()) << endl; ``` -Подробнее про разные полезные алгоритмы STL можно прочитать в [ликбезе по C++](../../programming/cpp). + diff --git a/content/russian/cs/matching/matching-problems.md b/content/russian/cs/matching/matching-problems.md index cedfe69d..cd14e54e 100644 --- a/content/russian/cs/matching/matching-problems.md +++ b/content/russian/cs/matching/matching-problems.md @@ -81,6 +81,6 @@ $$ Пусть у вершин левой доли есть какие-то веса, и нам нужно набрать максимальное паросочетание минимального веса. -Выясняется, что можно просто отсортировать вершины левой доли по весу и пытаться в таком порядке добавлять их в паросочетание стандартным алгоритмом Куна. 
Для доказательства этого факта читатель может прочитать про [жадный алгоритм Радо-Эдмондса](/cs/greedy/matroid), частным случаем которого является такая модификация алгоритма Куна. +Выясняется, что можно просто отсортировать вершины левой доли по весу и пытаться в таком порядке добавлять их в паросочетание стандартным алгоритмом Куна. Для доказательства этого факта читатель может прочитать про [жадный алгоритм Радо-Эдмондса](/cs/combinatorial-optimization/matroid), частным случаем которого является такая модификация алгоритма Куна. Аналогичную задачу, но когда у *ребер* есть веса, проще всего решать сведением к нахождению [потока минимальной стоимости](/cs/flows/mincost-maxflow). diff --git a/content/russian/cs/spanning-trees/kruskal.md b/content/russian/cs/spanning-trees/kruskal.md index ddb9cabf..1f4c98a4 100644 --- a/content/russian/cs/spanning-trees/kruskal.md +++ b/content/russian/cs/spanning-trees/kruskal.md @@ -34,4 +34,4 @@ for (auto [a, b, w] : edges) { } ``` -Раз остовные деревья являются частным случаем [матроида](/cs/greedy/matroid), то алгоритм Краскала является частным случаем алгоритма Радо-Эдмондса. +Раз остовные деревья являются частным случаем [матроида](/cs/combinatorial-optimization/matroid), то алгоритм Краскала является частным случаем алгоритма Радо-Эдмондса. diff --git a/content/russian/cs/spanning-trees/safe-edge.md b/content/russian/cs/spanning-trees/safe-edge.md index cc7138c9..19f97006 100644 --- a/content/russian/cs/spanning-trees/safe-edge.md +++ b/content/russian/cs/spanning-trees/safe-edge.md @@ -24,4 +24,4 @@ weight: 1 - Если веса всех рёбер различны, то остов будет уникален. - Минимальный остов является также и остовом с минимальным произведением весов рёбер (замените веса всех рёбер на их логарифмы). - Минимальный остов является также и остовом с минимальным весом самого тяжелого ребра. -- Остовные деревья — частный случай [матроидов](/cs/greedy/matroid). 
+- Остовные деревья — частный случай [матроидов](/cs/combinatorial-optimization/matroid). diff --git a/content/russian/cs/string-searching/manacher.md b/content/russian/cs/string-searching/manacher.md index 8954b653..16d32ccb 100644 --- a/content/russian/cs/string-searching/manacher.md +++ b/content/russian/cs/string-searching/manacher.md @@ -32,7 +32,7 @@ vector pal_array(string s) { Тот же пример $s = aa\dots a$ показывает, что данная реализация работает за $O(n^2)$. -Для оптимизации применим идею, знакомую из алгоритма [z-функции](string-searching): при инициализации $t_i$ будем пользоваться уже посчитанными $t$. А именно, будем поддерживать $(l, r)$ — интервал, соответствующий самому правому из найденных подпалиндромов. Тогда мы можем сказать, что часть наибольшего палиндрома с центром в $s_i$, которая лежит внутри $s_{l:r}$, имеет радиус хотя бы $\min(r-i, \; t_{l+r-i})$. Первая величина равна длине, дальше которой произошел бы выход за пределы $s_{l:r}$, а вторая — значению радиуса в позиции, зеркальной относительно центра палиндрома $s_{l:r}$. +Для оптимизации применим идею, знакомую из алгоритма [z-функции](/cs/string-searching/z-function/): при инициализации $t_i$ будем пользоваться уже посчитанными $t$. А именно, будем поддерживать $(l, r)$ — интервал, соответствующий самому правому из найденных подпалиндромов. Тогда мы можем сказать, что часть наибольшего палиндрома с центром в $s_i$, которая лежит внутри $s_{l:r}$, имеет радиус хотя бы $\min(r-i, \; t_{l+r-i})$. Первая величина равна длине, дальше которой произошел бы выход за пределы $s_{l:r}$, а вторая — значению радиуса в позиции, зеркальной относительно центра палиндрома $s_{l:r}$. 
```c++ diff --git a/content/russian/cs/string-structures/palindromic-tree.md b/content/russian/cs/string-structures/palindromic-tree.md index 3d70c76b..9b57534a 100644 --- a/content/russian/cs/string-structures/palindromic-tree.md +++ b/content/russian/cs/string-structures/palindromic-tree.md @@ -19,7 +19,7 @@ weight: 3 Будем поддерживать наибольший суффикс-палиндром. Когда мы будем дописывать очередной символ $c$, нужно найти наибольший суффикс этого палиндрома, который может быть дополнен символом $c$ — это и будет новый наидлиннейший суффикс-палиндром. -Для этого поступим аналогично [алгоритму Ахо-Корасик](aho-corasick): будем поддерживать для каждого палиндрома суффиксную ссылку $l(v)$, ведущую из $v$ в её наибольший суффикс-палиндром. При добавлении очередного символа, будем подниматься по суффиксным ссылкам, пока не найдём вершину, из которой можно совершить нужный переход. +Для этого поступим аналогично [алгоритму Ахо-Корасик](../aho-corasick): будем поддерживать для каждого палиндрома суффиксную ссылку $l(v)$, ведущую из $v$ в её наибольший суффикс-палиндром. При добавлении очередного символа, будем подниматься по суффиксным ссылкам, пока не найдём вершину, из которой можно совершить нужный переход. Если в подходящей вершине этого перехода не существовало, то нужно создать новую вершину, и для неё тоже понадобится своя суффиксная ссылка. Чтобы найти её, будем продолжать подниматься по суффиксным ссылкам предыдущего суффикс-палиндрома, пока не найдём второе такое место, которое мы можем дополнить символом $c$. 
diff --git a/content/russian/cs/string-structures/suffix-array.md b/content/russian/cs/string-structures/suffix-array.md index 25b90a3e..a7b90768 100644 --- a/content/russian/cs/string-structures/suffix-array.md +++ b/content/russian/cs/string-structures/suffix-array.md @@ -22,7 +22,7 @@ weight: 100 ![Сортировка всех суффиксов строки «mississippi$»](../img/sa-sort.png) -**Где это может быть полезно.** Пусть вы хотите основать ещё один поисковик, и чтобы получить финансирование, вам нужно сделать хоть что-то минимально работающее — хотя бы просто научиться искать по ключевому слову документы, включающие его, а также позиции их вхождения (в 90-е это был бы уже довольно сильный MVP). Простыми алгоритмами — [полиномиальными хешами](/cs/hashing), [z- и префикс-функцией](/cs/string-searching) и даже [Ахо-Корасиком](/cs/automata/aho-corasick) — это сделать быстро нельзя, потому что на каждый раз нужно проходиться по всем данным, а суффиксными структурами — можно. +**Где это может быть полезно.** Пусть вы хотите основать ещё один поисковик, и чтобы получить финансирование, вам нужно сделать хоть что-то минимально работающее — хотя бы просто научиться искать по ключевому слову документы, включающие его, а также позиции их вхождения (в 90-е это был бы уже довольно сильный MVP). Простыми алгоритмами — [полиномиальными хешами](/cs/hashing), [z- и префикс-функцией](/cs/string-searching) и даже [Ахо-Корасиком](../aho-corasick) — это сделать быстро нельзя, потому что на каждый раз нужно проходиться по всем данным, а суффиксными структурами — можно. В случае с суффиксным массивом можно сделать следующее: сконкатенировать все строки-документы с каким-нибудь внеалфавитным разделителем (`$`), построить по ним суффиксный массив, а дальше для каждого запроса искать бинарным поиском первый суффикс в суффиксном массиве, который меньше искомого слова, а также последний, который меньше. Все суффиксы между этими двумя будут включать искомую строку как префикс. 
@@ -132,7 +132,7 @@ vector suffix_array(vector &s) { Тогда есть мотивация посчитать массив `lcp$` в котором окажутся наибольшие общие префиксы соседних суффиксов, а после как-нибудь считать минимумы на отрезках в этом массиве (например, с помощью [разреженной таблицы](/cs/range-queries/sparse-table)). -Осталось придумать способ быстро посчитать массив `lcp`. Можно воспользоваться идеей из построения суффиксного массива за $O(n \log^2 n)$: с помощью [хешей](hashing) и бинпоиска находить `lcp` для каждой пары соседей. Такой метод работает за $O(n \log n)$, но является не самым удобным и популярным. +Осталось придумать способ быстро посчитать массив `lcp`. Можно воспользоваться идеей из построения суффиксного массива за $O(n \log^2 n)$: с помощью [хешей](/cs/hashing/polynomial/) и бинпоиска находить `lcp` для каждой пары соседей. Такой метод работает за $O(n \log n)$, но является не самым удобным и популярным. ### Алгоритм Касаи, Аримуры, Арикавы, Ли, Парка diff --git a/content/russian/cs/tree-structures/treap.md b/content/russian/cs/tree-structures/treap.md index 724ed15f..ad11c794 100644 --- a/content/russian/cs/tree-structures/treap.md +++ b/content/russian/cs/tree-structures/treap.md @@ -100,7 +100,7 @@ $$ Примечательно, что ожидаемая глубина вершин зависит от их позиции: вершина из середины должна быть примерно в два раза глубже, чем крайняя. -**Упражнение.** Выведите по аналогии с этим рассуждением асимптотику [quicksort](/cs/sorting/quicksort). +**Упражнение.** Выведите по аналогии с этим рассуждением асимптотику quicksort. 
## Реализация From d184936628da9db13363466ce12f91f7c1af4660 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Apr 2022 07:39:32 +0300 Subject: [PATCH 416/531] fix hpc broken links --- content/english/hpc/algorithms/prefix.md | 2 +- content/english/hpc/architecture/assembly.md | 2 +- content/english/hpc/architecture/indirect.md | 2 +- content/english/hpc/cpu-cache/paging.md | 2 +- content/english/hpc/data-structures/b-tree.md | 4 ++-- content/english/hpc/data-structures/s-tree.md | 2 +- content/english/hpc/pipelining/branchless.md | 4 ++-- content/english/hpc/pipelining/throughput.md | 2 +- content/english/hpc/simd/shuffling.md | 2 +- 9 files changed, 11 insertions(+), 11 deletions(-) diff --git a/content/english/hpc/algorithms/prefix.md b/content/english/hpc/algorithms/prefix.md index 5e31570d..f07daaf3 100644 --- a/content/english/hpc/algorithms/prefix.md +++ b/content/english/hpc/algorithms/prefix.md @@ -61,7 +61,7 @@ for (int l = 0; l < logn; l++) We can prove that this algorithm works by induction: if on $k$-th iteration every element $a_i$ is equal to the sum of the $(i - 2^k, i]$ segment of the original array, then after adding $a_{i - 2^k}$ to it, it will be equal to the sum of $(i - 2^{k+1}, i]$. After $O(\log n)$ iterations, the array will turn into its prefix sum. -To implement it in SIMD, we could use [permutations](/hpc/simd/shuffles) to place $i$-th element against $(i-2^k)$-th, but they are too slow. Instead, we will use the `sll` ("shift lanes left") instruction that does exactly that and also replaces the unmatched elements with zeros: +To implement it in SIMD, we could use [permutations](/hpc/simd/shuffling) to place $i$-th element against $(i-2^k)$-th, but they are too slow. 
Instead, we will use the `sll` ("shift lanes left") instruction that does exactly that and also replaces the unmatched elements with zeros: ```c++ typedef __m128i v4i; diff --git a/content/english/hpc/architecture/assembly.md b/content/english/hpc/architecture/assembly.md index 013d2987..5c981547 100644 --- a/content/english/hpc/architecture/assembly.md +++ b/content/english/hpc/architecture/assembly.md @@ -57,7 +57,7 @@ Most instructions write their result into the first operand, which can also be i There are also 32-, 16-bit and 8-bit registers that have similar names (`rax` → `eax` → `ax` → `al`). They are not fully separate but *aliased*: the first 32 bits of `rax` are `eax`, the first 16 bits of `eax` are `ax`, and so on. This is made to save die space while maintaining compatibility, and it is also the reason why basic type casts in compiled programming languages are usually free. -These are just the *general-purpose* registers that you can, with [some exceptions](../functions), use however you like in most instructions. There is also a separate set of registers for [floating-point arithmetic](/hpc/arithmetic/float), a bunch of very wide registers used in [vector extensions](/hpc/simd), and a few special ones that are needed for [control flow](../jumps), but we'll get there in time. +These are just the *general-purpose* registers that you can, with [some exceptions](../functions), use however you like in most instructions. There is also a separate set of registers for [floating-point arithmetic](/hpc/arithmetic/float), a bunch of very wide registers used in [vector extensions](/hpc/simd), and a few special ones that are needed for [control flow](../loops), but we'll get there in time. **Constants** are just integer or floating-point values: `42`, `0x2a`, `3.14`, `6.02e23`. They are more commonly called *immediate values* because they are embedded right into the machine code. 
Because it may considerably increase the complexity of the instruction encoding, some instructions don't support immediate values or allow just a fixed subset of them. In some cases, you have to load a constant value into a register and then use it instead of an immediate value. diff --git a/content/english/hpc/architecture/indirect.md b/content/english/hpc/architecture/indirect.md index ce6e86b8..487b81e3 100644 --- a/content/english/hpc/architecture/indirect.md +++ b/content/english/hpc/architecture/indirect.md @@ -106,7 +106,7 @@ During a virtual method call, that offset field is fetched from the instance of Of course, this adds some overhead: -- You may need to spend another 15 cycles or so for the same pipeline flushing reasons as for [branch misprediction](../pipelining). +- You may need to spend another 15 cycles or so for the same pipeline flushing reasons as for [branch misprediction](/hpc/pipelining). - The compiler most likely won't be able to inline the function call itself. - Class size increases by a couple of bytes or so (this is implementation-specific). - The binary size itself increases a little bit. diff --git a/content/english/hpc/cpu-cache/paging.md b/content/english/hpc/cpu-cache/paging.md index 684fcd65..3e6cfd8f 100644 --- a/content/english/hpc/cpu-cache/paging.md +++ b/content/english/hpc/cpu-cache/paging.md @@ -81,7 +81,7 @@ Enabling huge pages also improves [latency](../latency) by up to 10-15% for arra In general, enabling huge pages is a good idea when you have any sort of sparse reads, as they usually slightly improve and ([almost](../aos-soa)) never hurt performance. -That said, you shouldn't rely on huge pages if possible, as they aren't always available due to either hardware or computing environment restrictions. There are [many](../cache-lines) [other](../hw-prefetching) [reasons](../aos-soa) why grouping data accesses spatially may be beneficial, which automatically solves the paging problem. 
+That said, you shouldn't rely on huge pages if possible, as they aren't always available due to either hardware or computing environment restrictions. There are [many](../cache-lines) [other](../prefetching) [reasons](../aos-soa) why grouping data accesses spatially may be beneficial, which automatically solves the paging problem. -The usual disclaimer: the CPU is a [Zen 2](https://www.7-cpu.com/cpu/Zen2.html), the RAM is a [DDR4-2666](http://localhost:1313/hpc/cpu-cache/), and the compiler we will be using by default is Clang 10. The performance on your machine may be different, so I highly encourage to [go and test it](https://godbolt.org/z/14rd5Pnve) for yourself. +The usual disclaimer: the CPU is a [Zen 2](https://www.7-cpu.com/cpu/Zen2.html), the RAM is a [DDR4-2666](/hpc/cpu-cache/), and the compiler we will be using by default is Clang 10. The performance on your machine may be different, so I highly encourage to [go and test it](https://godbolt.org/z/14rd5Pnve) for yourself.
      {{.Title}}
      @@ -20,7 +25,9 @@ - + From 5bb09004d6024361734f78e4518b2fb829a7b103 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 21 Apr 2022 16:02:16 +0300 Subject: [PATCH 422/531] search string translations --- themes/algorithmica/i18n/en.toml | 9 +++++++++ themes/algorithmica/i18n/ru.toml | 9 +++++++++ 2 files changed, 18 insertions(+) diff --git a/themes/algorithmica/i18n/en.toml b/themes/algorithmica/i18n/en.toml index 9aae4777..6fa12340 100644 --- a/themes/algorithmica/i18n/en.toml +++ b/themes/algorithmica/i18n/en.toml @@ -15,6 +15,15 @@ other = "updated" [sections] other = "sections" +[search] +other = "Search this book…" + +[searchCountPrefix] +other = "Found" + +[searchCountSuffix] +other = "pages" + [prerequisites] other = "prerequisites" diff --git a/themes/algorithmica/i18n/ru.toml b/themes/algorithmica/i18n/ru.toml index 5e96226c..08d47b66 100644 --- a/themes/algorithmica/i18n/ru.toml +++ b/themes/algorithmica/i18n/ru.toml @@ -21,6 +21,15 @@ other = "обновлено" [sections] other = "статьи раздела" +[search] +other = "Поиск по сайту…" + +[searchCountPrefix] +other = "Найдено" + +[searchCountSuffix] +other = "страниц" + [prerequisites] other = "пререквизиты" From 641a7d6dd401360a778594035d1ddc62ee55d21a Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 21 Apr 2022 16:02:32 +0300 Subject: [PATCH 423/531] add lunr --- themes/algorithmica/static/scripts/lunr.multi.min.js | 1 + themes/algorithmica/static/scripts/lunr.ru.min.js | 1 + themes/algorithmica/static/scripts/lunr.stemmer.support.min.js | 1 + 3 files changed, 3 insertions(+) create mode 100644 themes/algorithmica/static/scripts/lunr.multi.min.js create mode 100644 themes/algorithmica/static/scripts/lunr.ru.min.js create mode 100644 themes/algorithmica/static/scripts/lunr.stemmer.support.min.js diff --git a/themes/algorithmica/static/scripts/lunr.multi.min.js b/themes/algorithmica/static/scripts/lunr.multi.min.js new file mode 100644 index 00000000..6f417304 --- /dev/null +++ 
b/themes/algorithmica/static/scripts/lunr.multi.min.js @@ -0,0 +1 @@ +!function(e,t){"function"==typeof define&&define.amd?define(t):"object"==typeof exports?module.exports=t():t()(e.lunr)}(this,function(){return function(e){e.multiLanguage=function(){for(var t=Array.prototype.slice.call(arguments),i=t.join("-"),r="",n=[],s=[],p=0;p=W.limit)return!1;W.cursor++}return!0}function t(){for(;!W.out_grouping(S,1072,1103);){if(W.cursor>=W.limit)return!1;W.cursor++}return!0}function w(){b=W.limit,_=b,e()&&(b=W.cursor,t()&&e()&&t()&&(_=W.cursor))}function i(){return _<=W.cursor}function u(e,n){var r,t;if(W.ket=W.cursor,r=W.find_among_b(e,n)){switch(W.bra=W.cursor,r){case 1:if(t=W.limit-W.cursor,!W.eq_s_b(1,"а")&&(W.cursor=W.limit-t,!W.eq_s_b(1,"я")))return!1;case 2:W.slice_del()}return!0}return!1}function o(){return u(h,9)}function s(e,n){var r;return W.ket=W.cursor,!!(r=W.find_among_b(e,n))&&(W.bra=W.cursor,1==r&&W.slice_del(),!0)}function c(){return s(g,26)}function m(){return!!c()&&(u(C,8),!0)}function f(){return s(k,2)}function l(){return u(P,46)}function a(){s(v,36)}function p(){var e;W.ket=W.cursor,(e=W.find_among_b(F,2))&&(W.bra=W.cursor,i()&&1==e&&W.slice_del())}function d(){var e;if(W.ket=W.cursor,e=W.find_among_b(q,4))switch(W.bra=W.cursor,e){case 1:if(W.slice_del(),W.ket=W.cursor,!W.eq_s_b(1,"н"))break;W.bra=W.cursor;case 2:if(!W.eq_s_b(1,"н"))break;case 3:W.slice_del()}}var _,b,h=[new n("в",-1,1),new n("ив",0,2),new n("ыв",0,2),new n("вши",-1,1),new n("ивши",3,2),new n("ывши",3,2),new n("вшись",-1,1),new n("ившись",6,2),new n("ывшись",6,2)],g=[new n("ее",-1,1),new n("ие",-1,1),new n("ое",-1,1),new n("ые",-1,1),new n("ими",-1,1),new n("ыми",-1,1),new n("ей",-1,1),new n("ий",-1,1),new n("ой",-1,1),new n("ый",-1,1),new n("ем",-1,1),new n("им",-1,1),new n("ом",-1,1),new n("ым",-1,1),new n("его",-1,1),new n("ого",-1,1),new n("ему",-1,1),new n("ому",-1,1),new n("их",-1,1),new n("ых",-1,1),new n("ею",-1,1),new n("ою",-1,1),new n("ую",-1,1),new n("юю",-1,1),new 
n("ая",-1,1),new n("яя",-1,1)],C=[new n("ем",-1,1),new n("нн",-1,1),new n("вш",-1,1),new n("ивш",2,2),new n("ывш",2,2),new n("щ",-1,1),new n("ющ",5,1),new n("ующ",6,2)],k=[new n("сь",-1,1),new n("ся",-1,1)],P=[new n("ла",-1,1),new n("ила",0,2),new n("ыла",0,2),new n("на",-1,1),new n("ена",3,2),new n("ете",-1,1),new n("ите",-1,2),new n("йте",-1,1),new n("ейте",7,2),new n("уйте",7,2),new n("ли",-1,1),new n("или",10,2),new n("ыли",10,2),new n("й",-1,1),new n("ей",13,2),new n("уй",13,2),new n("л",-1,1),new n("ил",16,2),new n("ыл",16,2),new n("ем",-1,1),new n("им",-1,2),new n("ым",-1,2),new n("н",-1,1),new n("ен",22,2),new n("ло",-1,1),new n("ило",24,2),new n("ыло",24,2),new n("но",-1,1),new n("ено",27,2),new n("нно",27,1),new n("ет",-1,1),new n("ует",30,2),new n("ит",-1,2),new n("ыт",-1,2),new n("ют",-1,1),new n("уют",34,2),new n("ят",-1,2),new n("ны",-1,1),new n("ены",37,2),new n("ть",-1,1),new n("ить",39,2),new n("ыть",39,2),new n("ешь",-1,1),new n("ишь",-1,2),new n("ю",-1,2),new n("ую",44,2)],v=[new n("а",-1,1),new n("ев",-1,1),new n("ов",-1,1),new n("е",-1,1),new n("ие",3,1),new n("ье",3,1),new n("и",-1,1),new n("еи",6,1),new n("ии",6,1),new n("ами",6,1),new n("ями",6,1),new n("иями",10,1),new n("й",-1,1),new n("ей",12,1),new n("ией",13,1),new n("ий",12,1),new n("ой",12,1),new n("ам",-1,1),new n("ем",-1,1),new n("ием",18,1),new n("ом",-1,1),new n("ям",-1,1),new n("иям",21,1),new n("о",-1,1),new n("у",-1,1),new n("ах",-1,1),new n("ях",-1,1),new n("иях",26,1),new n("ы",-1,1),new n("ь",-1,1),new n("ю",-1,1),new n("ию",30,1),new n("ью",30,1),new n("я",-1,1),new n("ия",33,1),new n("ья",33,1)],F=[new n("ост",-1,1),new n("ость",-1,1)],q=[new n("ейше",-1,1),new n("н",-1,2),new n("ейш",-1,1),new n("ь",-1,3)],S=[33,65,8,232],W=new r;this.setCurrent=function(e){W.setCurrent(e)},this.getCurrent=function(){return W.getCurrent()},this.stem=function(){return w(),W.cursor=W.limit,!(W.cursor=i&&(e-=i,t[e>>3]&1<<(7&e)))return 
this.cursor++,!0}return!1},in_grouping_b:function(t,i,s){if(this.cursor>this.limit_backward){var e=r.charCodeAt(this.cursor-1);if(e<=s&&e>=i&&(e-=i,t[e>>3]&1<<(7&e)))return this.cursor--,!0}return!1},out_grouping:function(t,i,s){if(this.cursors||e>3]&1<<(7&e)))return this.cursor++,!0}return!1},out_grouping_b:function(t,i,s){if(this.cursor>this.limit_backward){var e=r.charCodeAt(this.cursor-1);if(e>s||e>3]&1<<(7&e)))return this.cursor--,!0}return!1},eq_s:function(t,i){if(this.limit-this.cursor>1),f=0,l=o0||e==s||c)break;c=!0}}for(;;){var _=t[s];if(o>=_.s_size){if(this.cursor=n+_.s_size,!_.method)return _.result;var b=_.method();if(this.cursor=n+_.s_size,b)return _.result}if((s=_.substring_i)<0)return 0}},find_among_b:function(t,i){for(var s=0,e=i,n=this.cursor,u=this.limit_backward,o=0,h=0,c=!1;;){for(var a=s+(e-s>>1),f=0,l=o=0;m--){if(n-l==u){f=-1;break}if(f=r.charCodeAt(n-1-l)-_.s[m])break;l++}if(f<0?(e=a,h=l):(s=a,o=l),e-s<=1){if(s>0||e==s||c)break;c=!0}}for(;;){var _=t[s];if(o>=_.s_size){if(this.cursor=n-_.s_size,!_.method)return _.result;var b=_.method();if(this.cursor=n-_.s_size,b)return _.result}if((s=_.substring_i)<0)return 0}},replace_s:function(t,i,s){var e=s.length-(i-t),n=r.substring(0,t),u=r.substring(i);return r=n+s+u,this.limit+=e,this.cursor>=i?this.cursor+=e:this.cursor>t&&(this.cursor=t),e},slice_check:function(){if(this.bra<0||this.bra>this.ket||this.ket>this.limit||this.limit>r.length)throw"faulty slice operation"},slice_from:function(r){this.slice_check(),this.replace_s(this.bra,this.ket,r)},slice_del:function(){this.slice_from("")},insert:function(r,t,i){var s=this.replace_s(r,t,i);r<=this.bra&&(this.bra+=s),r<=this.ket&&(this.ket+=s)},slice_to:function(){return this.slice_check(),r.substring(this.bra,this.ket)},eq_v_b:function(r){return this.eq_s_b(r.length,r)}}}},r.trimmerSupport={generateTrimmer:function(r){var t=new RegExp("^[^"+r+"]+"),i=new RegExp("[^"+r+"]+$");return function(r){return"function"==typeof 
r.update?r.update(function(r){return r.replace(t,"").replace(i,"")}):r.replace(t,"").replace(i,"")}}}}}); \ No newline at end of file From 4ffb00832e101ca478e2f973cef4afdff72b82aa Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 21 Apr 2022 16:02:50 +0300 Subject: [PATCH 424/531] build search index --- themes/algorithmica/layouts/_default/list.searchindex.json | 5 +++++ 1 file changed, 5 insertions(+) create mode 100644 themes/algorithmica/layouts/_default/list.searchindex.json diff --git a/themes/algorithmica/layouts/_default/list.searchindex.json b/themes/algorithmica/layouts/_default/list.searchindex.json new file mode 100644 index 00000000..6310c263 --- /dev/null +++ b/themes/algorithmica/layouts/_default/list.searchindex.json @@ -0,0 +1,5 @@ +{{- $.Scratch.Add "searchindex" slice -}} +{{- range $index, $element := .Site.Pages -}} + {{- $.Scratch.Add "searchindex" (dict "id" $index "title" $element.Title "path" $element.RelPermalink "content" $element.Plain) -}} +{{- end -}} +{{- $.Scratch.Get "searchindex" | jsonify -}} From c387bd73ba6a942b8b922b07ea019eafbb4672d6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 21 Apr 2022 16:03:16 +0300 Subject: [PATCH 425/531] implement search --- config.yaml | 9 ++ .../algorithmica/layouts/_default/baseof.html | 1 + .../algorithmica/layouts/partials/head.html | 88 ++++++++++++++++++- .../algorithmica/layouts/partials/search.html | 6 ++ 4 files changed, 103 insertions(+), 1 deletion(-) create mode 100644 themes/algorithmica/layouts/partials/search.html diff --git a/config.yaml b/config.yaml index 7e4ca1b7..8fb26a1c 100644 --- a/config.yaml +++ b/config.yaml @@ -8,6 +8,15 @@ outputFormats: baseName: index mediaType: text/html isHTML: true + SearchIndex: + mediaType: "application/json" + baseName: "searchindex" + isPlainText: true + notAlternative: true +outputs: + home: + - HTML + - SearchIndex markup: goldmark: footnote: false # katex conflict diff --git a/themes/algorithmica/layouts/_default/baseof.html 
b/themes/algorithmica/layouts/_default/baseof.html index f9056521..dbe71ede 100644 --- a/themes/algorithmica/layouts/_default/baseof.html +++ b/themes/algorithmica/layouts/_default/baseof.html @@ -6,6 +6,7 @@
      {{- partial "buttons.html" . -}}
      + {{ partial "search.html" . }} {{- partial "header.html" . -}}
      {{- block "main" . }}{{- end }} diff --git a/themes/algorithmica/layouts/partials/head.html b/themes/algorithmica/layouts/partials/head.html index f87a8873..2f4c3c46 100644 --- a/themes/algorithmica/layouts/partials/head.html +++ b/themes/algorithmica/layouts/partials/head.html @@ -10,6 +10,11 @@ + + + + + {{ $dark := resources.Get "dark.sass" | toCSS | minify | fingerprint }} @@ -18,22 +23,100 @@ console.log("Toggling sidebar visibility") var sidebar = document.getElementById('sidebar') var wrapper = document.getElementById('wrapper') - if (sidebar.classList.contains('sidebar-toggled') || window.getComputedStyle(sidebar).display == 'block') { + if (sidebar.classList.contains('sidebar-toggled') || window.getComputedStyle(sidebar).display == 'block') { sidebar.classList.toggle('sidebar-hidden') wrapper.classList.toggle('sidebar-hidden') } sidebar.classList.add('sidebar-toggled') wrapper.classList.add('sidebar-toggled') } + function switchTheme(theme) { console.log("Changing theme:", theme) document.getElementById('theme').href = (theme == 'dark' ? "{{ $dark.RelPermalink }}" : "") document.getElementById('syntax-theme').href = (theme == 'dark' ? 
'/syntax-dark.css' : '/syntax.css') localStorage.setItem('theme', theme) } + + async function toggleSearch() { + console.log("Toggling search") + + var searchDiv = document.getElementById('search') + if (window.getComputedStyle(searchDiv).display == 'none') { + searchDiv.style.display = 'block' + window.scrollTo({ top: 0 }); + } else { + searchDiv.style.display = 'none' + } + + if (!index) { + console.log("Fetching index") + const response = await fetch('/searchindex.json') + const pages = await response.json() + index = lunr(function() { + this.use(lunr.multiLanguage('en', 'ru')) + this.field('title', { + boost: 5 + }) + this.field('content', { + boost: 1 + }) + pages.forEach(function(doc) { + this.add(doc) + articles.push(doc) + }, this) + }) + console.log("Ready to search") + } + } + + var articles = [] + var index = undefined + + function search() { + var query = document.getElementById('search-bar').value + var resultsDiv = document.getElementById('search-results') + var countDiv = document.getElementById('search-count') + + if (query == '') { + resultsDiv.innerHTML = '' + countDiv.innerHTML = '' + return + } + + var results = index.search(query) + + countDiv.innerHTML = '{{ T "searchCountPrefix" }} ' + results.length + ' {{ T "searchCountSuffix" }}' + + let resultList = '' + + for (const n in results) { + const item = articles[results[n].ref] + resultList += '
    3. ' + item.title + '

      ' + const text = item.content + + const contextLimit = 80 + + if (text.includes(query)) { + const start = text.indexOf(query) + if (start > contextLimit) + resultList += '…' + resultList += text.substring(start - contextLimit, start) + + '' + query + '' + text.substring(start + query.length, start + query.length + contextLimit) + + } else { + resultList += text.substring(0, contextLimit * 2) + } + resultList += '…

    4. ' + } + + resultsDiv.innerHTML = resultList + } + if (localStorage.getItem('theme') == 'dark') { switchTheme('dark') } + window.addEventListener('load', function() { var el = document.getElementById("active-element") //console.log(el) @@ -46,6 +129,7 @@ toggleSidebar() }*/ }) + window.addEventListener('scroll', function() { var menu = document.getElementById('menu') if (window.scrollY < 120) { @@ -56,8 +140,10 @@ menu.classList.add('scrolled') } }) + window.addEventListener('keydown', function(e) { if (e.altKey) { return } + if (document.activeElement.tagName == 'INPUT') { return } if (e.key == 'ArrowLeft') { document.getElementById('prev-article').click() } else if (e.key == 'ArrowRight') { diff --git a/themes/algorithmica/layouts/partials/search.html b/themes/algorithmica/layouts/partials/search.html new file mode 100644 index 00000000..ee853dfa --- /dev/null +++ b/themes/algorithmica/layouts/partials/search.html @@ -0,0 +1,6 @@ + From 849e9d1e652b60e4c7bdc8e7d35ba37ed5741ffc Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 21 Apr 2022 16:03:21 +0300 Subject: [PATCH 426/531] search styling --- themes/algorithmica/assets/style.sass | 29 ++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/themes/algorithmica/assets/style.sass b/themes/algorithmica/assets/style.sass index 0a42a2d6..a6835c1e 100644 --- a/themes/algorithmica/assets/style.sass +++ b/themes/algorithmica/assets/style.sass @@ -222,7 +222,34 @@ menu .title opacity: 1 transition: opacity 0.1s - + +#search + display: none + font-family: $font-interface + + input + width: 100% + padding: 6px + + color: $font-color + + background: $code-background + border: $code-border + + #search-count + margin-top: 8px + color: $dimmed + + #search-results + margin-top: 6px + border-bottom: $borders + + li + list-style: none + margin: 12px 6px + + p + margin-top: 0 /* .github From 882df601131e76b86687c2c843b58252c5faec46 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: 
Thu, 21 Apr 2022 16:22:12 +0300 Subject: [PATCH 427/531] update readme --- README.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 171f5406..959dc025 100644 --- a/README.md +++ b/README.md @@ -1,8 +1,10 @@ # Algorithmica v3 -Algorithmica is a free and open web book about Computer Science. +Algorithmica is an open-access web book dedicated to the art and science of computing. -If you are concerned with editing, please read the [contributing guide](https://ru.algorithmica.org/contributing/) (in Russian). +You can contribute via [Prose](https://prose.io/) by clicking on the pencil icon on the top right on any page or by editing its source directly on GitHub. We use a slightly different Markdown dialect, so if you are not sure that the change is correct (e. g. editing an intricate LaTeX formula), you can install [Hugo](https://gohugo.io/) and build the site locally — or just create a pull request, and a preview link will be automatically generated for you. + +If you happen to speak Russian, please also read the [contributing guidelines](https://ru.algorithmica.org/contributing/). --- @@ -16,11 +18,11 @@ Key technical changes from the [previous version](https://github.com/algorithmic * Rich metadata support (language, sections, TOCs, authors...) 
* Automated global table of contents * Theming support +* Search support (Lunr) Short-term todo list: -* Search with lunr -* Themes (especially a better dark theme) -* Minor style adjustments for mobile and print versions +* Style adjustments for mobile and print versions * A pdf version of the whole website +* Meta-information support (for Google Scholar and social media) * [Sticky table of contents](https://css-tricks.com/table-of-contents-with-intersectionobserver/) From 75443b0d15f22a3d5f765621d5919ee1aaf2e9a6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 25 Apr 2022 17:13:20 +0300 Subject: [PATCH 428/531] consistent spelling --- content/russian/cs/sequences/compression.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/russian/cs/sequences/compression.md b/content/russian/cs/sequences/compression.md index 58686d5c..5b469fec 100644 --- a/content/russian/cs/sequences/compression.md +++ b/content/russian/cs/sequences/compression.md @@ -8,7 +8,7 @@ date: 2022-04-20 Часто бывает полезно преобразовать последовательность чисел либо каких-то других объектов в промежуток последовательных целых чисел — например, чтобы использовать её элементы как индексы в массиве либо какой-нибудь другой структуре. 
-Эта задача эквивалентна нумерации элементов множества, что можно сделать за $O(n)$ через хэш-таблицу: +Эта задача эквивалентна нумерации элементов множества, что можно сделать за $O(n)$ через хеш-таблицу: ```c++ vector compress(vector a) { From aeef2db22cf8692463b39fbdc8f321c080f4356a Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 26 Apr 2022 13:37:36 +0300 Subject: [PATCH 429/531] fix integer overflow issue --- content/russian/cs/modular/reciprocal.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/russian/cs/modular/reciprocal.md b/content/russian/cs/modular/reciprocal.md index 5d0e34e9..7b966de3 100644 --- a/content/russian/cs/modular/reciprocal.md +++ b/content/russian/cs/modular/reciprocal.md @@ -99,7 +99,7 @@ $$ ax + my = 1 \iff ax \equiv 1 \iff x \equiv a^{-1} \pmod m $$ int inv(int a, int m) { if (a == 1) return 1; - return (1 - inv(m % a, a) * m) / a + m; + return (1 - 1ll * inv(m % a, a) * m) / a + m; } ``` From 238e3987c9f1c6d6e2716e8bb1b8ce4a9a2cdfee Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 27 Apr 2022 00:01:51 +0300 Subject: [PATCH 430/531] number theory intro --- content/english/hpc/number-theory/_index.md | 32 +++++++++++++++++++-- 1 file changed, 30 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/number-theory/_index.md b/content/english/hpc/number-theory/_index.md index f4936581..d532bcfd 100644 --- a/content/english/hpc/number-theory/_index.md +++ b/content/english/hpc/number-theory/_index.md @@ -4,10 +4,38 @@ weight: 7 draft: true --- -In 1940, British mathematician Godfrey Harold Hardy published a famous essay titled [A Mathematician's Apology](https://en.wikipedia.org/wiki/A_Mathematician%27s_Apology) where he discusses the notion that mathematics should be pursued for its own sake rather than for the sake of its applications. As a 62-year-old, he saw the devastation caused by first world war, and was amidst the second one. +In 1940, a British mathematician [G. H. 
Hardy](https://en.wikipedia.org/wiki/G._H._Hardy) published a famous essay titled "[A Mathematician's Apology](https://en.wikipedia.org/wiki/A_Mathematician%27s_Apology)" discussing the notion that mathematics should be pursued for its own sake rather than for the sake of its applications. -A scientist faces a moral dilemma because some of its inventions may do more harm than good. One can find calm in pursuing useless math. Hardy himself specialized in number theory, and he was content about it not having any applications: "No one has yet discovered any warlike purpose to be served by the theory of numbers or relativity, and it seems unlikely that anyone will do so for many years." +I personally don't agree — and I wrote this book partially to show that there are way too few people working on practical algorithm design instead of theoretical computer science — but I understand where Hardy is coming from. Being 62 years old, he witnessed the devastation caused by the First and the ongoing Second World War that was greatly amplified by the weaponization of science. + +As a number theorist, Hardy finds calm working in a "useless" field and not having to face any moral dilemmas, writing: + +> No one has yet discovered any warlike purpose to be served by the theory of numbers or relativity, and it seems unlikely that anyone will do so for many years. + +Ironically, this statement was proved very wrong just 5 years later with the development of the atomic bomb, which would not have been possible without the [understanding](https://en.wikipedia.org/wiki/Einstein%E2%80%93Szil%C3%A1rd_letter) of relativity, and the inception of computer-era cryptography, which extensively builds on number theory. 
+ + From cf6e133d384586f41c59c859fcf1130afd57ef28 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 27 Apr 2022 17:30:50 +0300 Subject: [PATCH 431/531] ignoreIndexing conflicted with drafts --- themes/algorithmica/layouts/partials/sidebar.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/themes/algorithmica/layouts/partials/sidebar.html b/themes/algorithmica/layouts/partials/sidebar.html index 816887f5..652a1f1b 100644 --- a/themes/algorithmica/layouts/partials/sidebar.html +++ b/themes/algorithmica/layouts/partials/sidebar.html @@ -24,7 +24,7 @@ {{ if isset .Params "part" }}
    5. {{.Params.Part}}
    6. {{ end }} -
    7. {{ .Title }}
    8. {{ if .IsSection }} From 47df9a54170f32812ac3043aace1ef7f2df2027a Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 27 Apr 2022 17:42:27 +0300 Subject: [PATCH 432/531] modular arithmetic intro --- .../english/hpc/number-theory/img/clock.gif | Bin 0 -> 2331 bytes content/english/hpc/number-theory/inverse.md | 137 +++++++++++++++--- 2 files changed, 116 insertions(+), 21 deletions(-) create mode 100644 content/english/hpc/number-theory/img/clock.gif diff --git a/content/english/hpc/number-theory/img/clock.gif b/content/english/hpc/number-theory/img/clock.gif new file mode 100644 index 0000000000000000000000000000000000000000..0d0c65556eafa788280115c63496ef6024ecce91 GIT binary patch literal 2331 zcmV+$3FP)iNk%w1Vc7uL0K@GD4WmdGa(nH9|*Us54 z-qPRF-rs=0Bk0WH0OjT-*zn1e?CI+7BKPvpKmrIR0Sq8OKtNZkc=+_~BP1jM1BN3E zI5@K)fU$G)U@0_FuZKfu0vyoD`*aAIyFVqLK|81+<`Sk$ObyQdJS8L ztyr}<%AWYh)JIT`W83DPBDciH13dh>+B?Q=l)fVdJOsSsqbI^(3?F_eg1|(pjUQ45 z1-Xml$t4Vk+`1#?M7y1}e1;+T1Od!5`U-I1fV7UGMG>D$jfFQ1$0i0iNE-X40-_6& zD~CO!>ebD=?^5yo;^SuFCu{VD^Dl?)!Ek$1Z(;gnfT0tKbNLP)dY08La2vgOazJ)3 z?4yTQHJ&`?z69l?-WK`k=iYSv#lYVGeO(l2plS&!=pKXX9r&1Z(O5{(EB7*yEC`gI;oB&@f7~a5QjV-8%T8uK+;}09R7x?WHtuA}NsiR?;-z?2X7WduV@4s1LBgP!rVaxPWI#?# zikXrYzorebS z00Sft@IV42*vS)?T!yERXx$|?kr{xFDL?{%Qux)NDiOMwXb-dsKmZdF6w{@LiV%_l zrDlYyAd&(gz;VHn!0HI1l3Hy42{gs3Ym=97%WMJ#^h&~U7LCAw1J_DRVIs}Gg(?XN z#>MgtPnlNvv11^>BPz>bjFIWGnFmO5rKR~RP0no%izuh*x ztqu%IE`7Lr+3T4jjNJZO=2C`@8n3 zCw1_`XB*K2jP)4%yF)SODcR7+87kK9vJEE4>FA#KPH-Idna_WN2o3y5kiG#12LTOO zU_KH!DGH7Ne5Covkc5Y{2DEAr2=s#qr1q)ot;>NqdDhebH;D@7tpE`mNjn7Sz!I3G zd_@W&$M{!6rASOy%9FvY;s*elRL_A7u-DKYU_hp&Y6hO-7p5GbLyag1*RSPKXg>x zAA1(aLAs%Iwlky)AE!vpF|v`?upA^wm$^hv5{Qz#Bbc2%1@dHv;h_G z1x~6x1e-2|rzwF(Api=FWhW!&k@8kROf|p(IHX#*Ko!t;>NB3pGYJDsQX9q1rfUE= z**uzww|AERK!s*tKs}4~km_kPFC4|2h3>H!5fp$n(4<rGff{KT#$)fed6Ii9&$FoN-i0F~n3u5u>Lz7`KK_a81M51`6zR zu_Y2wabC4*Rg;<#FP2fF3GM1Qb4bi>-GwG*sE8Z$#5WnxG_JEVlqgx*)4|pCb!%l3 zVUgO$!+tKY{aY+w%lXjJLDr#_4Gv8=yV;S}=(2Vt!cZx)g#d7trZIh)GJEG(066uj 
zTa#cGNO%T!UMaJ;MS??Ucm7aa%_J%*raC39f=*7 zfZ62OXtky(9MS#Wmpe=gH!X=6SKhIw9}QNPB<)S3=AEboOb@}PWz~vH^^YESo<=o}onvC_O%sbGWUvR=!e~F(*n@<&M^(J+ zUq0ezKl}D#2@S$>PkY=dHMhD8o$Yle+m_Box4h>~?|R$&-uTY9zW2@Ve$Orh06Qq$ BG)w>h literal 0 HcmV?d00001 diff --git a/content/english/hpc/number-theory/inverse.md b/content/english/hpc/number-theory/inverse.md index aec428fe..beb56611 100644 --- a/content/english/hpc/number-theory/inverse.md +++ b/content/english/hpc/number-theory/inverse.md @@ -3,39 +3,79 @@ title: Modular Inverse weight: 1 --- -```c++ -mint inv() const { - uint t = x; - uint res = 1; - while (t != 1) { - uint z = mod / t; - res = (ull) res * (mod - z) % mod; - t = mod - t * z; - } - return res; -} -``` + + +Computers usually store time as the number of seconds that have passed since the 1st of January, 1970 — the start of the "Unix era" — and use these timestamps in all computations that have to do with time. + +We humans also keep track of time relative to some point in the past, which usually has a political or religious significance. For example, at the moment of writing, approximately 63882260594 seconds have passed since 0 AD. + +But unlike computers, we do not always need *all* that information. Depending on the task at hand, the relevant part may be that it's 2 pm right now and it's time to go to dinner, or that it's Thursday and so Subway's sub of the day is an Italian BMT. Instead of the whole timestamp, we use its *remainer* containing just the information we need: it is much easier to deal with 1- or 2-digit numbers than 11-digit ones. + +### Modular Arithmetic + +Two integers $a$ and $b$ are said to be *congruent* modulo $m$ if $m$ divides their difference: + +$$ +m \mid (a - b) \; \Longleftrightarrow \; a \equiv b \pmod m +$$ + +Congruence modulo $m$ is an equivalence relation, which splits all integers into equivalence classes, called *residues*. 
Each residue class modulo $m$ may be represented by any one of its members, although we commonly use the smallest nonnegative integer of that class (equal to the remainder $x \bmod m$ for all nonnegative $x$). + + + +*Modular arithmetic* studies these sets of residues, which are fundamental for number theory. -Modular arithmetic studies the way these sets of remainders behave, and it has fundamental applications in number theory, cryptography and data compression. +**Problem.** Today is Thursday. What day of the week will it be exactly a year from now? +If we enumerate each day of the week starting with Monday from $0$ to $6$ inclusive, Thursday gets number $3$. To find out what day it is going to be in a year from now, we need to add $365$ to it and then reduce modulo $7$. Conveniently, $365 \bmod 7 = 1$, so we know that it will be Friday unless it is a leap year (in which case it will be Saturday). -Consider the following problem: our "week" now consists of $m$ days, and we cycle through it with a steps of $a > 0$. How many distinct days there will be? +**Problem.** Our "week" now consists of $m$ days, and our year consists of $a$ days (no leap years). How many distinct days of the week will there be among one, two, three and so on whole years from now? -Let's assume that the first day is always Monday. At some point the sequence of day is going to cycle. The days will be representable as $k a \mod m$, so we need to find the first $k$ such as $k a$ is divisible by $m$. In the case of $m=7$, $m$ is prime, so the cycle length will be 7 exactly for any $a$. +For simplicity, assume that today is Monday, so that the initial day number $d_0$ is zero and after each year, it changes to + +$$ +d_{k + 1} = (d_k + a) \bmod m +$$ + +After $k$ years, it will be + +$$ +d_k = k \cdot a \bmod m +$$ -Now, if $m$ is not prime, but it is still coprime with $a$. For $ka$ to be divisible by $m$, $k$ needs to be divisible by $m$. In general, the answer is $\frac{m}{gcd(a, m)}$.
For example, if the week is 10 days long, if the starting number is even, then it will cycle through all even numbers, and if the number is 5, then it will only cycle between 0 and 5. Otherwise it will go through all 10 remainders. +Since there are only $m$ days in a week, at some point it will be Monday again, and the sequence of day numbers is going to cycle. The number of distinct days is the length of this cycle, so we need to find the smallest positive $k$ such that + +$$ +k \cdot a \equiv 0 \pmod m +$$ + +First of all, if $a \equiv 0$, it will be eternal Monday. We now assume the non-trivial case of $a \not \equiv 0$. + +For a seven-day week, $m = 7$ is prime. There is no positive $k$ smaller than $m$ such that $k \cdot a$ is divisible by $m$ because $m$ cannot be decomposed into such a product by the definition of primality. So, if $m$ is prime, we will cycle through all $m$ days of the week. + +If $m$ is not prime, but $a$ is *coprime* with it (that is, $a$ and $m$ have no common divisors other than $1$), then the answer is still $m$ for the same reason: the divisors of $a$ do not help in zeroing out the product any faster. + +If $a$ and $m$ share some divisors, then it is only possible to get residues that are also divisible by them. For example, if the week is $m = 10$ days long, and the year has $a = 42$ or any other even number of days, then we will cycle through all even day numbers, and if the number of days is a multiple of $5$, then we will only oscillate between $0$ and $5$. Otherwise, we will go through all the $10$ remainders. + +Therefore, in general, the answer is $\frac{m}{\gcd(a, m)}$, where $\gcd(a, m)$ is the [greatest common divisor](/hpc/algorithms/gcd/) of $a$ and $m$. ### Fermat's Theorem @@ -65,6 +105,17 @@ $$ where $\phi(m)$ is called Euler's totient function and is equal to the number of residues of $m$ that is coprime with it. In particular case of when $m$ is prime, $\phi(p) = p - 1$ and we get Fermat's theorem, which is just a special case.
+Несколько причин: + +Это выражение довольно легко вбивать (1e9+7). +Простое число. +Достаточно большое. +int не переполняется при сложении. +long long не переполняется при умножении. +Кстати, 10^9 + 910 +9 + +9 обладает всеми теми же свойствами. Иногда используют и его. + ### Primality Testing These theorems have a lot of applications. One of them is checking whether a number $n$ is prime or not faster than factoring it. You can pick any base $a$ at random and try to raise it to power $a^{p-1}$ modulo $n$ and check if it is $1$. Such base is called *witness*. @@ -105,8 +156,27 @@ int binpow(int a, int n) { } ``` +179.64 + This helps if `n` or `mod` is a constant. +```c++ +int inverse(int _a) { + long long a = _a, r = 1; + + #pragma GCC unroll(30) + for (int l = 0; l < 30; l++) { + if ( (M - 2) >> l & 1 ) + r = r * a % M; + a = a * a % M; + } + + return r; +} +``` + +171.68 + ### Modular Division "Normal" operations also apply to residues: +, -, *. But there is an issue with division, because we can't just bluntly divide two numbers: $\frac{8}{2} = 4$, but $\frac{8 \\% 5 = 3}{2 \\% 5 = 2} \neq 4$. @@ -180,8 +250,33 @@ int gcd(int a, int b, int &x, int &y) { y = x1; return d; } + +int inverse(int a) { + int x, y; + gcd(a, M, x, y); + if (x < 0) + x += M; + return x; +} ``` +159.28 + +```c++ +int inverse(int a) { + int b = M, x = 1, y = 0; + while (a != 1) { + y -= b / a * x; + b %= a; + swap(a, b); + swap(x, y); + } + return x < 0 ? x + M : x; +} +``` + +134.33 + Another application is the exact division modulo $2^k$. **Exercise**. Try to adapt the technique for binary GCD. 
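A quick way to sanity-check the inverse routines touched by this patch is to compare two approaches against each other. The sketch below is not the code from the patch. It assumes a prime modulus `M = 1e9 + 7` (the constant mentioned in the notes above), and the names `inverse_fermat` and `inverse_euclid` are made up for illustration.

```c++
#include <utility>  // std::swap

const int M = 1000000007;  // assumed prime modulus, not fixed by the patch

// Inverse via Fermat's theorem: a^(M-2) mod M, using binary exponentiation.
// Requires M to be prime and a not divisible by M.
int inverse_fermat(int a) {
    long long base = a % M, r = 1;
    for (int n = M - 2; n > 0; n >>= 1) {
        if (n & 1)
            r = r * base % M;
        base = base * base % M;
    }
    return (int) r;
}

// Inverse via the iterative extended Euclidean algorithm.
// Invariant: a == x * a0 (mod M) and b == y * a0 (mod M), where a0 is the input.
// Requires 0 < a < M and gcd(a, M) == 1, otherwise the loop divides by zero.
int inverse_euclid(int a) {
    int b = M, x = 1, y = 0;
    while (a != 1) {
        y -= b / a * x;
        b %= a;
        std::swap(a, b);
        std::swap(x, y);
    }
    return x < 0 ? x + M : x;
}
```

For any `a` with `0 < a < M`, both functions should return the same residue, and `(long long) a * inverse_euclid(a) % M` should equal 1.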
From 0d4d13729d5662f715d258fb37bebba5ada2928c Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 27 Apr 2022 17:54:18 +0300 Subject: [PATCH 433/531] reorganize hpc number theory --- .../english/hpc/number-theory/cryptography.md | 2 +- .../hpc/number-theory/error-correction.md | 2 +- .../hpc/number-theory/exponentiation.md | 70 ++++++++ content/english/hpc/number-theory/finite.md | 2 +- content/english/hpc/number-theory/inverse.md | 170 +----------------- content/english/hpc/number-theory/modular.md | 105 +++++++++++ .../english/hpc/number-theory/montgomery.md | 14 +- 7 files changed, 192 insertions(+), 173 deletions(-) create mode 100644 content/english/hpc/number-theory/exponentiation.md create mode 100644 content/english/hpc/number-theory/modular.md diff --git a/content/english/hpc/number-theory/cryptography.md b/content/english/hpc/number-theory/cryptography.md index 0dd500dc..e552372a 100644 --- a/content/english/hpc/number-theory/cryptography.md +++ b/content/english/hpc/number-theory/cryptography.md @@ -1,6 +1,6 @@ --- title: Cryptography -weight: 6 +weight: 7 draft: true --- diff --git a/content/english/hpc/number-theory/error-correction.md b/content/english/hpc/number-theory/error-correction.md index 91f1f472..e8774ed8 100644 --- a/content/english/hpc/number-theory/error-correction.md +++ b/content/english/hpc/number-theory/error-correction.md @@ -1,6 +1,6 @@ --- title: Error Correction -weight: 4 +weight: 6 draft: true --- diff --git a/content/english/hpc/number-theory/exponentiation.md b/content/english/hpc/number-theory/exponentiation.md new file mode 100644 index 00000000..f82af3e6 --- /dev/null +++ b/content/english/hpc/number-theory/exponentiation.md @@ -0,0 +1,70 @@ +--- +title: Binary Exponentiation +weight: 2 +--- + +### Binary Exponentiation + +To perform the Fermat test, we need to raise a number to power $n-1$, preferrably using less than $n-2$ modular multiplications. 
We can use the fact that multiplication is associative: + +$$ +\begin{aligned} + a^{2k} &= (a^k)^2 +\\ a^{2k + 1} &= (a^k)^2 \cdot a +\end{aligned} +$$ + +We essentially group it like this: + +$$ +a^8 = (aaaa) \cdot (aaaa) = ((aa)(aa))((aa)(aa)) +$$ + +This allows using only $O(\log n)$ operations (or, more specifically, at most $2 \cdot \log_2 n$ modular multiplications). + +```c++ +int binpow(int a, int n) { + int res = 1; + while (n) { + if (n & 1) + res = res * a % mod; + a = a * a % mod; + n >>= 1; + } + return res; +} +``` + +179.64 + +This helps if `n` or `mod` is a constant. + +```c++ +int inverse(int _a) { + long long a = _a, r = 1; + + #pragma GCC unroll(30) + for (int l = 0; l < 30; l++) { + if ( (M - 2) >> l & 1 ) + r = r * a % M; + a = a * a % M; + } + + return r; +} +``` + +171.68 + + +Несколько причин: + +Это выражение довольно легко вбивать (1e9+7). +Простое число. +Достаточно большое. +int не переполняется при сложении. +long long не переполняется при умножении. +Кстати, 10^9 + 910 +9 + +9 обладает всеми теми же свойствами. Иногда используют и его. + diff --git a/content/english/hpc/number-theory/finite.md b/content/english/hpc/number-theory/finite.md index fbef0015..cae2f2ef 100644 --- a/content/english/hpc/number-theory/finite.md +++ b/content/english/hpc/number-theory/finite.md @@ -1,6 +1,6 @@ --- title: Finite Fields -weight: 3 +weight: 5 draft: true --- diff --git a/content/english/hpc/number-theory/inverse.md b/content/english/hpc/number-theory/inverse.md index beb56611..c0d9df08 100644 --- a/content/english/hpc/number-theory/inverse.md +++ b/content/english/hpc/number-theory/inverse.md @@ -1,121 +1,8 @@ --- -title: Modular Inverse -weight: 1 +title: Extended Euclidean Algorithm +weight: 3 --- - - -Computers usually store time as the number of seconds that have passed since the 1st of January, 1970 — the start of the "Unix era" — and use these timestamps in all computations that have to do with time. 
- -We humans also keep track of time relative to some point in the past, which usually has a political or religious significance. For example, at the moment of writing, approximately 63882260594 seconds have passed since 0 AD. - -But unlike computers, we do not always need *all* that information. Depending on the task at hand, the relevant part may be that it's 2 pm right now and it's time to go to dinner, or that it's Thursday and so Subway's sub of the day is an Italian BMT. Instead of the whole timestamp, we use its *remainer* containing just the information we need: it is much easier to deal with 1- or 2-digit numbers than 11-digit ones. - -### Modular Arithmetic - -Two integers $a$ and $b$ are said to be *congruent* modulo $m$ if $m$ divides their difference: - -$$ -m \mid (a - b) \; \Longleftrightarrow \; a \equiv b \pmod m -$$ - -Congruence modulo $m$ is an equivalence relation, which splits all integers into equivalence classes, called *residues*. Each residue class modulo $m$ may be represented by any one of its members — although we commonly use the smallest nonnegative integer of that class (equal to the remainder $x \bmod m$ for all nonnegative $x$). - - - -*Modular arithmetic* studies these sets of residues, which are fundamental for number theory. - -**Problem.** Today is Thursday. What day of the week it will be exactly in a year? - -If we enumerate each day of the week starting with Monday from $0$ to $6$ inclusive, Thursday gets number $3$. To find out what day it is going to be in a year from now, we need to add $365$ to it and then reduce modulo $7$. Conveniently, $365 \bmod 7 = 1$, so we know that it will be Friday unless it is a leap year (in which case it will be Saturday). - -**Problem.** Our "week" now consists of $m$ days, and our year consists of $a$ days (no leap years). How many distinct days of the week there will be among one, two, three and so on whole years from now? 
- -For simplicity, assume that today is Monday, so that the initial day number $d_0$ is zero and after each year, it changes to - -$$ -d_{k + 1} = (d_k + a) \bmod m -$$ - -After $k$ years, it will be - -$$ -d_k = k \cdot a \bmod m -$$ - -Since there are only $m$ days in a week, at some point it will be Monday again, and the sequence of day numbers is going to cycle. The number of distinct days is the length of this cycle, so we need to find the smallest $k$ such that - -$$ -k \cdot a \equiv 0 \pmod m -$$ - -First of all, if $a \equiv 0$, it will be ethernal Monday. We now assume the non-trivial case of $a \not \equiv 0$. - -For a seven-day week, $m = 7$ is prime. There is no $k$ smaller than $m$ such that $k \cdot a$ is divisible by $m$ because $m$ can not be decomposed in such a product by the definition of primality. So, if $m$ is prime, we will cycle through all of $m$ week days. - -If $m$ is not prime, but $a$ is *coprime* with it (that is, $a$ and $m$ do not have common divisors), then the answer is still $m$ for the same reason: the divisors of $a$ do not help in zeroing out the product any faster. - -If $a$ and $m$ share some divisors, then it is only possible to get residues that are also divisible by them. For example, if the week is $m = 10$ days long, and the year has $a = 42$ or any other even number of days, then we will cycle through all even day numbers, and if the number of days is a multiple of $5$, then we will only oscillate between $0$ and $5$. Otherwise, we will go through all the $10$ remainders. - -Therefore, in general, the answer is $\frac{m}{\gcd(a, m)}$, where $\gcd(a, m)$ is the [greatest common divisor](/hpc/algorithms/gcd/) of $a$ and $m$. - -### Fermat's Theorem - -Now, consider what happens if instead of adding a number $a$, we repeatedly multiply by it, that is, write numbers in the form $a^n \mod m$. Since these are all finite numbers there is going to be a cycle, but what will its length be? 
If $p$ is prime, it turns out, all of them. - -**Theorem.** $a^p \equiv a \pmod p$ for all $a$ that are not multiple of $p$. - -**Proof**. Let $P(x_1, x_2, \ldots, x_n) = \frac{k}{\prod (x_i!)}$ be the *multinomial coefficient*, that is, the number of times the element $a_1^{x_1} a_2^{x_2} \ldots a_n^{x_n}$ would appear after the expansion of $(a_1 + a_2 + \ldots + a_n)^k$. Then - -$$ -\begin{aligned} -a^p &= (\underbrace{1+1+\ldots+1+1}_\text{$a$ times})^p & -\\\ &= \sum_{x_1+x_2+\ldots+x_a = p} P(x_1, x_2, \ldots, x_a) & \text{(by defenition)} -\\\ &= \sum_{x_1+x_2+\ldots+x_a = p} \frac{p!}{x_1! x_2! \ldots x_a!} & \text{(which terms will not be divisible by $p$?)} -\\\ &\equiv P(p, 0, \ldots, 0) + \ldots + P(0, 0, \ldots, p) & \text{(everything else will be canceled)} -\\\ &= a -\end{aligned} -$$ - -and then dividing by $a$ gives us the Fermat's theorem. - -Note that this is only true for prime $p$. Euler's theorem handles the case of arbitary $m$, and states that - -$$ -a^{\phi(m)} \equiv 1 \pmod m -$$ - -where $\phi(m)$ is called Euler's totient function and is equal to the number of residues of $m$ that is coprime with it. In particular case of when $m$ is prime, $\phi(p) = p - 1$ and we get Fermat's theorem, which is just a special case. - -Несколько причин: - -Это выражение довольно легко вбивать (1e9+7). -Простое число. -Достаточно большое. -int не переполняется при сложении. -long long не переполняется при умножении. -Кстати, 10^9 + 910 -9 - +9 обладает всеми теми же свойствами. Иногда используют и его. - ### Primality Testing These theorems have a lot of applications. One of them is checking whether a number $n$ is prime or not faster than factoring it. You can pick any base $a$ at random and try to raise it to power $a^{p-1}$ modulo $n$ and check if it is $1$. Such base is called *witness*. @@ -124,59 +11,6 @@ Such probabilistic tests are therefore returning either "no" or "maybe." 
It may Unless the input is provided by an adversary, the mistake probability will be low. This test is adequate for finding large primes: there are roughly $\frac{n}{\ln n}$ primes among the first $n$ numbers, which is another fact that we are not going to prove. These primes are distributed more or less evenly, so one can just pick a random number and check numbers in sequence, and after checking $O(\ln n)$ numbers one will probably be found. -### Binary Exponentiation - -To perform the Fermat test, we need to raise a number to power $n-1$, preferrably using less than $n-2$ modular multiplications. We can use the fact that multiplication is associative: - -$$ -\begin{aligned} - a^{2k} &= (a^k)^2 -\\ a^{2k + 1} &= (a^k)^2 \cdot a -\end{aligned} -$$ - -We essentially group it like this: - -$$ -a^8 = (aaaa) \cdot (aaaa) = ((aa)(aa))((aa)(aa)) -$$ - -This allows using only $O(\log n)$ operations (or, more specifically, at most $2 \cdot \log_2 n$ modular multiplications). - -```c++ -int binpow(int a, int n) { - int res = 1; - while (n) { - if (n & 1) - res = res * a % mod; - a = a * a % mod; - n >>= 1; - } - return res; -} -``` - -179.64 - -This helps if `n` or `mod` is a constant. - -```c++ -int inverse(int _a) { - long long a = _a, r = 1; - - #pragma GCC unroll(30) - for (int l = 0; l < 30; l++) { - if ( (M - 2) >> l & 1 ) - r = r * a % M; - a = a * a % M; - } - - return r; -} -``` - -171.68 - ### Modular Division "Normal" operations also apply to residues: +, -, *. But there is an issue with division, because we can't just bluntly divide two numbers: $\frac{8}{2} = 4$, but $\frac{8 \\% 5 = 3}{2 \\% 5 = 2} \neq 4$. 
diff --git a/content/english/hpc/number-theory/modular.md b/content/english/hpc/number-theory/modular.md new file mode 100644 index 00000000..92e0c687 --- /dev/null +++ b/content/english/hpc/number-theory/modular.md @@ -0,0 +1,105 @@ +--- +title: Modular Arithmetic +weight: -1 +--- + + + + +Computers usually store time as the number of seconds that have passed since the 1st of January, 1970 — the start of the "Unix era" — and use these timestamps in all computations that have to do with time. + +We humans also keep track of time relative to some point in the past, which usually has a political or religious significance. For example, at the moment of writing, approximately 63882260594 seconds have passed since 0 AD. + +But unlike computers, we do not always need *all* that information. Depending on the task at hand, the relevant part may be that it's 2 pm right now and it's time to go to dinner, or that it's Thursday and so Subway's sub of the day is an Italian BMT. Instead of the whole timestamp, we use its *remainer* containing just the information we need: it is much easier to deal with 1- or 2-digit numbers than 11-digit ones. + +**Problem.** Today is Thursday. What day of the week it will be exactly in a year? + +If we enumerate each day of the week starting with Monday from $0$ to $6$ inclusive, Thursday gets number $3$. To find out what day it is going to be in a year from now, we need to add $365$ to it and then reduce modulo $7$. Conveniently, $365 \bmod 7 = 1$, so we know that it will be Friday unless it is a leap year (in which case it will be Saturday). + +**Definition.** Two integers $a$ and $b$ are said to be *congruent* modulo $m$ if $m$ divides their difference: + +$$ +m \mid (a - b) \; \Longleftrightarrow \; a \equiv b \pmod m +$$ + +For example, day 42 of the year is 161 119 = 17 \times 7. + +Congruence modulo $m$ is an equivalence relation, which splits all integers into equivalence classes, called *residues*. 
Each residue class modulo $m$ may be represented by any one of its members — although we commonly use the smallest nonnegative integer of that class (equal to the remainder $x \bmod m$ for all nonnegative $x$). + + + +*Modular arithmetic* studies these sets of residues, which are fundamental for number theory. + +**Problem.** Our "week" now consists of $m$ days, and our year consists of $a$ days (no leap years). How many distinct days of the week there will be among one, two, three and so on whole years from now? + +For simplicity, assume that today is Monday, so that the initial day number $d_0$ is zero and after each year, it changes to + +$$ +d_{k + 1} = (d_k + a) \bmod m +$$ + +After $k$ years, it will be + +$$ +d_k = k \cdot a \bmod m +$$ + +Since there are only $m$ days in a week, at some point it will be Monday again, and the sequence of day numbers is going to cycle. The number of distinct days is the length of this cycle, so we need to find the smallest $k$ such that + +$$ +k \cdot a \equiv 0 \pmod m +$$ + +First of all, if $a \equiv 0$, it will be ethernal Monday. Now, assuming the non-trivial case of $a \not \equiv 0$: + +- For a seven-day week, $m = 7$ is prime. There is no $k$ smaller than $m$ such that $k \cdot a$ is divisible by $m$ because $m$ can not be decomposed in such a product by the definition of primality. So, if $m$ is prime, we will cycle through all of $m$ week days. +- If $m$ is not prime, but $a$ is *coprime* with it (that is, $a$ and $m$ do not have common divisors), then the answer is still $m$ for the same reason: the divisors of $a$ do not help in zeroing out the product any faster. +- If $a$ and $m$ share some divisors, then it is only possible to get residues that are also divisible by them. 
For example, if the week is $m = 10$ days long, and the year has $a = 42$ or any other even number of days, then we will cycle through all even day numbers, and if the number of days is a multiple of $5$, then we will only oscillate between $0$ and $5$. Otherwise, we will go through all the $10$ remainders. + +Therefore, in general, the answer is $\frac{m}{\gcd(a, m)}$, where $\gcd(a, m)$ is the [greatest common divisor](/hpc/algorithms/gcd/) of $a$ and $m$. + +### Fermat's Theorem + +Now, consider what happens if instead of adding a number $a$, we repeatedly multiply by it, that is, write numbers in the form $a^n \mod m$. Since these are all finite numbers there is going to be a cycle, but what will its length be? If $p$ is prime, it turns out, all of them. + +**Theorem.** $a^p \equiv a \pmod p$ for all $a$ that are not multiple of $p$. + +**Proof**. Let $P(x_1, x_2, \ldots, x_n) = \frac{k}{\prod (x_i!)}$ be the *multinomial coefficient*, that is, the number of times the element $a_1^{x_1} a_2^{x_2} \ldots a_n^{x_n}$ would appear after the expansion of $(a_1 + a_2 + \ldots + a_n)^k$. Then + +$$ +\begin{aligned} +a^p &= (\underbrace{1+1+\ldots+1+1}_\text{$a$ times})^p & +\\\ &= \sum_{x_1+x_2+\ldots+x_a = p} P(x_1, x_2, \ldots, x_a) & \text{(by defenition)} +\\\ &= \sum_{x_1+x_2+\ldots+x_a = p} \frac{p!}{x_1! x_2! \ldots x_a!} & \text{(which terms will not be divisible by $p$?)} +\\\ &\equiv P(p, 0, \ldots, 0) + \ldots + P(0, 0, \ldots, p) & \text{(everything else will be canceled)} +\\\ &= a +\end{aligned} +$$ + +and then dividing by $a$ gives us the Fermat's theorem. + +Note that this is only true for prime $p$. Euler's theorem handles the case of arbitary $m$, and states that + +$$ +a^{\phi(m)} \equiv 1 \pmod m +$$ + +where $\phi(m)$ is called Euler's totient function and is equal to the number of residues of $m$ that is coprime with it. In particular case of when $m$ is prime, $\phi(p) = p - 1$ and we get Fermat's theorem, which is just a special case. 
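The two claims in this file, that the walk $d_k = k \cdot a \bmod m$ visits exactly $\frac{m}{\gcd(a, m)}$ distinct days and that $a^p \equiv a \pmod p$ for prime $p$, are easy to check by brute force. The following is a minimal sketch under the assumption that all inputs are small positive integers. The function names are hypothetical and do not appear in the article.

```c++
#include <numeric>  // std::gcd (C++17)
#include <set>

// Counts the distinct values of (k * a) mod m for k = 1, 2, 3, ...
// by simulating the walk until it returns to zero (Monday).
int distinct_days(int a, int m) {
    std::set<int> seen;
    int d = 0;
    do {
        d = (d + a) % m;
        seen.insert(d);
    } while (d != 0);
    return (int) seen.size();
}

// Closed-form answer from the text: m / gcd(a, m).
int cycle_length(int a, int m) {
    return m / std::gcd(a, m);
}

// Computes a^n mod m by repeated multiplication.
// This takes O(n) steps and is meant for illustration only.
long long naive_pow(long long a, long long n, long long m) {
    long long r = 1 % m;
    for (long long i = 0; i < n; i++)
        r = r * (a % m) % m;
    return r;
}
```

With a ten-day week and a 42-day year, `distinct_days(42, 10)` and `cycle_length(42, 10)` should agree, and `naive_pow(a, p, p)` should equal `a % p` whenever `p` is prime.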
diff --git a/content/english/hpc/number-theory/montgomery.md b/content/english/hpc/number-theory/montgomery.md index e784dfaf..233e355d 100644 --- a/content/english/hpc/number-theory/montgomery.md +++ b/content/english/hpc/number-theory/montgomery.md @@ -1,6 +1,6 @@ --- title: Montgomery Multiplication -weight: 2 +weight: 4 --- When we talked about [integers](../integer) in general, we discussed how to perform division and modulo by multiplication, and, unsurprisingly, in modular arithmetic 90% of its time is spent calculating modulo. Apart from using the general tricks described in the previous article, there is another method specifically for modular arithmetic, called *Montgomery multiplication*. @@ -79,6 +79,9 @@ Since $x < n \cdot n < r \cdot n$ (as $x$ is a product of multiplicatio) and $q Here is an equivalent C implementation for 64-bit integers: ```c++ +typedef unsigned long long u64; +typedef __uint128_t u128; + u64 reduce(u128 x) { u64 q = u64(x) * nr; u64 m = ((u128) q * n) >> 64; @@ -134,7 +137,6 @@ Transforming a number into the space is just a multiplication inside the space o ### Complete Implementation ```c++ -// TODO fix me and prettify me struct montgomery { u64 n, nr; @@ -148,6 +150,9 @@ struct montgomery { u64 q = u64(x) * nr; u64 m = ((u128) q * n) >> 64; u64 xhi = (x >> 64); + //cout << u64(x>>64) << " " << u64(x) << " " << q << endl; + //cout << u64(m>>64) << " " << u64(m) << endl; + //exit(0); if (xhi >= m) return (xhi - m); else @@ -163,3 +168,8 @@ struct montgomery { } }; ``` + +```c++ +montgomery m(n); +m.transform(x); +``` \ No newline at end of file From 013fd0109e05d69a4a30b5b86db74ffa6e784adf Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 27 Apr 2022 21:01:34 +0300 Subject: [PATCH 434/531] publish modular arithmetic --- content/english/hpc/number-theory/_index.md | 1 - .../{inverse.md => euclid-extended.md} | 32 ++------- .../hpc/number-theory/exponentiation.md | 9 ++- content/english/hpc/number-theory/modular.md | 66 
++++++++++++++----- .../english/hpc/number-theory/montgomery.md | 1 + 5 files changed, 65 insertions(+), 44 deletions(-) rename content/english/hpc/number-theory/{inverse.md => euclid-extended.md} (51%) diff --git a/content/english/hpc/number-theory/_index.md b/content/english/hpc/number-theory/_index.md index d532bcfd..bb6a8b3c 100644 --- a/content/english/hpc/number-theory/_index.md +++ b/content/english/hpc/number-theory/_index.md @@ -1,7 +1,6 @@ --- title: Number Theory weight: 7 -draft: true --- In 1940, a British mathematician [G. H. Hardy](https://en.wikipedia.org/wiki/G._H._Hardy) published a famous essay titled "[A Mathematician's Apology](https://en.wikipedia.org/wiki/A_Mathematician%27s_Apology)" discussing the notion that mathematics should be pursued for its own sake rather than for the sake of its applications. diff --git a/content/english/hpc/number-theory/inverse.md b/content/english/hpc/number-theory/euclid-extended.md similarity index 51% rename from content/english/hpc/number-theory/inverse.md rename to content/english/hpc/number-theory/euclid-extended.md index c0d9df08..ea01588c 100644 --- a/content/english/hpc/number-theory/inverse.md +++ b/content/english/hpc/number-theory/euclid-extended.md @@ -1,41 +1,21 @@ --- title: Extended Euclidean Algorithm weight: 3 +draft: true --- -### Primality Testing -These theorems have a lot of applications. One of them is checking whether a number $n$ is prime or not faster than factoring it. You can pick any base $a$ at random and try to raise it to power $a^{p-1}$ modulo $n$ and check if it is $1$. Such base is called *witness*. - -Such probabilistic tests are therefore returning either "no" or "maybe." It may be the case that it just happened to be equal to $1$ but in fact $n$ is composite, in which case you need to repeat the test until you are okay with the false positive probability. Moreover, there exist carmichael numbers, which are composite numbers $n$ that satisfy $a^n \equiv 1 \pmod n$ for all $a$. 
These numbers are rare, but still [exist](https://oeis.org/A002997). - -Unless the input is provided by an adversary, the mistake probability will be low. This test is adequate for finding large primes: there are roughly $\frac{n}{\ln n}$ primes among the first $n$ numbers, which is another fact that we are not going to prove. These primes are distributed more or less evenly, so one can just pick a random number and check numbers in sequence, and after checking $O(\ln n)$ numbers one will probably be found. - -### Modular Division - -"Normal" operations also apply to residues: +, -, *. But there is an issue with division, because we can't just bluntly divide two numbers: $\frac{8}{2} = 4$, but $\frac{8 \\% 5 = 3}{2 \\% 5 = 2} \neq 4$. - -To perform division, we need to find an element that will behave itself like the reciprocal $\frac{1}{a} = a^{-1}$, and instead of "division" multiply by it. This element is called a *modular inverse*. +If the modulo is not prime, then we can still get by calculating $\phi(m)$ and invoking Euler's theorem. But calculating $\phi(m)$ is as difficult as factoring it, which is not fast. There is a more general method. -If the modulo is a prime number, then the solution is $a^{-1} \equiv a^{p-2}$, which follows directly from Fermat's theorem by dividing the equivalence by $a$: +Note that this is only true for prime $p$. Euler's theorem handles the case of arbitary $m$, and states that $$ -a^p \equiv a \implies a^{p-1} \equiv 1 \implies a^{p-2} \equiv a^{-1} +a^{\phi(m)} \equiv 1 \pmod m $$ -This means that $a^{p-2}$ "behaves" like $a^{-1}$ which is what we need. - -You can calculate $a^{p-2}$ in $O(\log p)$ time using binary exponentiation: +where $\phi(m)$ is called [Euler's totient function](https://en.wikipedia.org/wiki/Euler%27s_totient_function) and is equal to the number of residues of $m$ that is coprime with it. In particular case of when $m$ is prime, $\phi(p) = p - 1$ and we get Fermat's theorem, which is just a special case. 
-```c++ -int inv(int x) { - return binpow(x, mod - 2); -} -``` - -If the modulo is not prime, then we can still get by calculating $\phi(m)$ and invoking Euler's theorem. But calculating $\phi(m)$ is as difficult as factoring it, which is not fast. There is a more general method. - -### Extended Euclidean Algorithm +--- *Extended Euclidean algorithm* apart from finding $g = \gcd(a, b)$ also finds integers $x$ and $y$ such that diff --git a/content/english/hpc/number-theory/exponentiation.md b/content/english/hpc/number-theory/exponentiation.md index f82af3e6..68142c30 100644 --- a/content/english/hpc/number-theory/exponentiation.md +++ b/content/english/hpc/number-theory/exponentiation.md @@ -1,9 +1,16 @@ --- title: Binary Exponentiation weight: 2 +draft: true --- -### Binary Exponentiation +You can calculate $a^{p-2}$ in $O(\log p)$ time using binary exponentiation: + +```c++ +int inv(int x) { + return binpow(x, mod - 2); +} +``` To perform the Fermat test, we need to raise a number to power $n-1$, preferrably using less than $n-2$ modular multiplications. We can use the fact that multiplication is associative: diff --git a/content/english/hpc/number-theory/modular.md b/content/english/hpc/number-theory/modular.md index 92e0c687..b6045a3a 100644 --- a/content/english/hpc/number-theory/modular.md +++ b/content/english/hpc/number-theory/modular.md @@ -20,11 +20,13 @@ Computers usually store time as the number of seconds that have passed since the We humans also keep track of time relative to some point in the past, which usually has a political or religious significance. For example, at the moment of writing, approximately 63882260594 seconds have passed since 0 AD. -But unlike computers, we do not always need *all* that information. Depending on the task at hand, the relevant part may be that it's 2 pm right now and it's time to go to dinner, or that it's Thursday and so Subway's sub of the day is an Italian BMT. 
Instead of the whole timestamp, we use its *remainer* containing just the information we need: it is much easier to deal with 1- or 2-digit numbers than 11-digit ones. +But unlike computers, we do not always need *all* that information. Depending on the task at hand, the relevant part may be that it's 2 pm right now and it's time to go to dinner, or that it's Thursday, and so Subway's sub of the day is an Italian BMT. Instead of the whole timestamp, we use its *remainder* containing just the information we need: it is much easier to deal with 1- or 2-digit numbers than 11-digit ones. -**Problem.** Today is Thursday. What day of the week it will be exactly in a year? +**Problem.** Today is Thursday. What day of the week will be exactly in a year? -If we enumerate each day of the week starting with Monday from $0$ to $6$ inclusive, Thursday gets number $3$. To find out what day it is going to be in a year from now, we need to add $365$ to it and then reduce modulo $7$. Conveniently, $365 \bmod 7 = 1$, so we know that it will be Friday unless it is a leap year (in which case it will be Saturday). +If we enumerate each day of the week, starting with Monday, from $0$ to $6$ inclusive, Thursday gets number $3$. To find out what day it is going to be in a year from now, we need to add $365$ to it and then reduce modulo $7$. Conveniently, $365 \bmod 7 = 1$, so we know that it will be Friday unless it is a leap year (in which case it will be Saturday). + +### Residues **Definition.** Two integers $a$ and $b$ are said to be *congruent* modulo $m$ if $m$ divides their difference: @@ -32,9 +34,9 @@ $$ m \mid (a - b) \; \Longleftrightarrow \; a \equiv b \pmod m $$ -For example, day 42 of the year is 161 119 = 17 \times 7. +For example, the 42nd day of the year is the same weekday as the 161st since $(161 - 42) = 119 = 17 \times 7$. -Congruence modulo $m$ is an equivalence relation, which splits all integers into equivalence classes, called *residues*. 
Each residue class modulo $m$ may be represented by any one of its members — although we commonly use the smallest nonnegative integer of that class (equal to the remainder $x \bmod m$ for all nonnegative $x$). +Congruence modulo $m$ is an equivalence relation that splits all integers into equivalence classes called *residues*. Each residue class modulo $m$ may be represented by any one of its members — although we commonly use the smallest nonnegative integer of that class (equal to the remainder $x \bmod m$ for all nonnegative $x$). + +![](../img/gcd-dependency1.png) Modern processors can execute many instructions in parallel, essentially meaning that the true "cost" of this computation is roughly the sum of latencies on its critical path. In this case, it is the total latency of `diff`, `abs`, `ctz`, and `shift`. We can decrease this latency using the fact that we can actually calculate `ctz` using just `diff = a - b`, because a negative number divisible by $2^k$ still has $k$ zeros at the end. 
This lets us not wait for `max(diff, -diff)` to be computed first, resulting in a shorter graph like this: -@@ + + +![](../img/gcd-dependency2.png) Hopefully you will be less confused when you think about how the final code will be executed: diff --git a/content/english/hpc/algorithms/img/gcd-dependency1.png b/content/english/hpc/algorithms/img/gcd-dependency1.png new file mode 100644 index 0000000000000000000000000000000000000000..4e58904c19b0720e58dd26977fc20d0b837a9eb9 GIT binary patch literal 14837
[base85-encoded binary patch data omitted]
+ +![](../img/prefix-sum.png) ### Other operations From 6b3df0447d0d22ba55d4ad7496f7a3965b7f8ff2 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 7 May 2022 00:46:05 +0300 Subject: [PATCH 442/531] typo --- content/english/hpc/architecture/functions.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/architecture/functions.md
b/content/english/hpc/architecture/functions.md index f7a74cc6..02614f94 100644 --- a/content/english/hpc/architecture/functions.md +++ b/content/english/hpc/architecture/functions.md @@ -16,7 +16,7 @@ Both of these concerns can be solved by having a dedicated location in memory wh The hardware stack works the same way software stacks do and is similarly implemented as just two pointers: - The *base pointer* marks the start of the stack and is conventionally stored in `rbp`. -- The *stack pointer* marks the last element on the stack and is conventionally stored in `rsp`. +- The *stack pointer* marks the last element of the stack and is conventionally stored in `rsp`. When you need to call a function, you push all your local variables onto the stack (which you can also do in other circumstances, e. g. when you run out of registers), push the current instruction pointer, and then jump to the beginning of the function. When exiting from a function, you look at the pointer stored on top of the stack, jump there, and then carefully read all the variables stored on the stack back into their registers. From 2808896da9ff9168952ef3e4b5725dd172dd400f Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 7 May 2022 00:46:13 +0300 Subject: [PATCH 443/531] extra space --- content/english/hpc/algorithms/argmin.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/algorithms/argmin.md b/content/english/hpc/algorithms/argmin.md index 0a9531c1..2089d083 100644 --- a/content/english/hpc/algorithms/argmin.md +++ b/content/english/hpc/algorithms/argmin.md @@ -164,7 +164,7 @@ int argmin(int *a, int n) { The compiler [optimized the machine code layout](/hpc/architecture/layout), and the CPU is now able to execute the loop at around 2 GFLOPS — a slight but sizeable improvement from 1.5 GFLOPS of the non-hinted loop. 
-Here is the idea: if we are only updating the minimum a dozen or so times during the entire computation, we can ditch all the vector-blending and index updating and just maintain the minimum and regularly check if it has changed. Inside this check, we can use however slow method of updating the argmin we want because it will only be called a few times. +Here is the idea: if we are only updating the minimum a dozen or so times during the entire computation, we can ditch all the vector-blending and index updating and just maintain the minimum and regularly check if it has changed. Inside this check, we can use however slow method of updating the argmin we want because it will only be called a few times. To implement it with SIMD, all we need to do on each iteration is a vector load, a comparison, and a test-if-zero: From ece7674101f421484943c6df14c142e30059abde Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sat, 7 May 2022 03:29:13 +0300 Subject: [PATCH 444/531] typo --- content/english/hpc/pipelining/branchless.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index f84627b5..280498b1 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -41,7 +41,7 @@ sar ebx, 31 ; t >>= 31 imul eax, ebx ; x *= t ``` -Another, more complicated way to implement this whole sequence is to convert this sign bit into a mask and then use bitwise `and` instead of multiplication: `((a[i] - 50) >> 31 - 1) & a`. This makes the whole sequence one cycle faster, considering that, unlike other instructions, `imul` takes 3 cycles: +Another, more complicated way to implement this whole sequence is to convert this sign bit into a mask and then use bitwise `and` instead of multiplication: `((a[i] - 50) >> 31 - 1) & a[i]`. 
This makes the whole sequence one cycle faster, considering that, unlike other instructions, `imul` takes 3 cycles: ```nasm mov ebx, eax ; t = x From e6d9601a8dcb8a41d6776f57cde82e0abfe22732 Mon Sep 17 00:00:00 2001 From: yatancuyu <45235844+yatancuyu@users.noreply.github.com> Date: Wed, 11 May 2022 15:40:33 +0300 Subject: [PATCH 445/531] Add missing return value --- content/russian/cs/graph-traversals/cycle.md | 1 + 1 file changed, 1 insertion(+) diff --git a/content/russian/cs/graph-traversals/cycle.md b/content/russian/cs/graph-traversals/cycle.md index 5347e9cd..7a274da1 100644 --- a/content/russian/cs/graph-traversals/cycle.md +++ b/content/russian/cs/graph-traversals/cycle.md @@ -60,6 +60,7 @@ int dfs(int v, int p = -1) { } } } + return -1; } ``` From 912a24172441950b850629814c0893ebea8c6915 Mon Sep 17 00:00:00 2001 From: yatancuyu <45235844+yatancuyu@users.noreply.github.com> Date: Wed, 11 May 2022 15:45:14 +0300 Subject: [PATCH 446/531] Prevent infinite loop --- content/russian/cs/graph-traversals/connectivity.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/russian/cs/graph-traversals/connectivity.md b/content/russian/cs/graph-traversals/connectivity.md index 45ceec28..17628308 100644 --- a/content/russian/cs/graph-traversals/connectivity.md +++ b/content/russian/cs/graph-traversals/connectivity.md @@ -31,7 +31,7 @@ void dfs(int v, int num) { int num = 0; for (int v = 0; v < n; v++) if (!component[v]) - dfs(v, num++); + dfs(v, ++num); ``` After this, the variable `num` will hold the number of connected components, and the array `component` will hold the component number of each vertex, which can be used, for example, to quickly check whether a path exists between a given pair of vertices.
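The `connectivity.md` hunk above appears to rely on `component[v] == 0` meaning "not visited yet". With `num++` the first component would be labelled 0, which is indistinguishable from unvisited, so the traversal could recurse forever on any edge. Below is a minimal self-contained sketch of the patched loop. The globals `g` and `component` and the wrapper `label_components` are assumptions made for illustration and are not taken from the article.

```cpp
#include <vector>

// Hypothetical globals for illustration; the patched article defines its own.
std::vector<std::vector<int>> g;   // adjacency lists of an undirected graph
std::vector<int> component;        // 0 means "not visited yet"

// Marks every vertex reachable from v with the label num (num must be nonzero).
void dfs(int v, int num) {
    component[v] = num;
    for (int u : g[v])
        if (!component[u])
            dfs(u, num);
}

// Labels each vertex with a component number starting from 1
// and returns the number of connected components.
int label_components() {
    int n = (int) g.size();
    component.assign(n, 0);
    int num = 0;
    for (int v = 0; v < n; v++)
        if (!component[v])
            dfs(v, ++num); // pre-increment: labels start at 1, never 0
    return num;
}
```

Under these assumptions, two vertices `a` and `b` are connected exactly when `component[a] == component[b]` after `label_components()` has run.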
From 63526ca0348b0abd5359d53cd91f24c92ddbf654 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 13 May 2022 16:59:54 +0300 Subject: [PATCH 447/531] slides theme --- assets/slides.sass | 50 +++ config.yaml | 6 +- content/english/hpc/slides/01-intro/_index.md | 297 ++++++++++++++++++ content/english/hpc/slides/_index.md | 10 + 4 files changed, 360 insertions(+), 3 deletions(-) create mode 100644 content/english/hpc/slides/01-intro/_index.md create mode 100644 content/english/hpc/slides/_index.md diff --git a/assets/slides.sass b/assets/slides.sass index e69de29b..671ababe 100644 --- a/assets/slides.sass +++ b/assets/slides.sass @@ -0,0 +1,50 @@ +$font-text: 'Source Sans', serif !default +$font-code: 'Inconsolata', monospace !default +$font-headings: 'Garamond', serif !default + +$borders: 1px solid #eaecef !default + +/* fonts */ +@font-face + font-family: 'CMU' + src: url(fonts/cmu.woff2) + +@font-face + font-family: 'Merriweather' + src: url(fonts/merriweather.woff2) + +@font-face + font-family: 'Inconsolata' + src: url(fonts/inconsolata.woff2) + +@font-face + font-family: 'Garamond' + src: url(fonts/garamond.woff2) + +@font-face + font-family: "Open Sans" + src: url(fonts/opensans.woff2) + +@font-face + font-family: "Source Sans" + src: url(fonts/sourcesans.ttf) + +@font-face + font-family: "Crimson" + src: url(fonts/crimson.ttf) + +body + font-family: $font-text + font-size: 24px + +h1 + font-size: 2em + text-align: center + margin-top: 0 + margin-bottom: 20px + +h2 + font-size: 1.5em + +h3 + font-size: 1.25em diff --git a/config.yaml b/config.yaml index 8fb26a1c..1f196de4 100644 --- a/config.yaml +++ b/config.yaml @@ -42,8 +42,8 @@ languages: params: repo: "https://github.com/algorithmica-org/algorithmica" reveal_hugo: - theme: white + #theme: white slide_number: true transition: none - #custom_theme: "slides.sass" - #custom_theme_compile: true + custom_theme: "slides.sass" + custom_theme_compile: true diff --git 
a/content/english/hpc/slides/01-intro/_index.md b/content/english/hpc/slides/01-intro/_index.md new file mode 100644 index 00000000..492ceb6a --- /dev/null +++ b/content/english/hpc/slides/01-intro/_index.md @@ -0,0 +1,297 @@ +--- +title: Why Go Beyond Big O? +outputs: [Reveal] +--- + +# Performance Engineering + +Sergey Slotin + +$x + y$ + +May 7, 2022 + +--- + +### About me + +- Former [competitive programmer](https://codeforces.com/profile/sslotin) +- Created [Algorithmica.org](https://ru.algorithmica.org/cs) and "co-founded" [Tinkoff Generation](https://algocode.ru/) +- Wrote [Algorithms for Modern Hardware](https://en.algorithmica.org/hpc/), on which these lectures are based +- Twitter: [@sergey_slotin](https://twitter.com/sergey_slotin); Telegram: [@bydlokoder](https://t.me/bydlokoder); anywhere else: @sslotin + +---- + +### About this mini-course + +- Low-level algorithm optimization +- Two days, six lectures +- **Day 1:** CPU architecture & assembly, pipelining, SIMD programming +- **Day 2:** CPU caches & memory, binary search, tree data structures +- Prerequisites: CS 102, C/C++ +- No assignments, but you are encouraged to reproduce case studies: https://github.com/sslotin/amh-code + +--- + +## Lecture 0: Why Go Beyond Big O + +*(AMH chapter 1)* + +--- + +## The RAM Model of Computation + +- There is a set of *elementary operations* (read, write, add, multiply, divide) +- Each operation is executed sequentially and has some constant *cost* +- Running time ≈ sum of all elementary operations weighted by their costs + +---- + +![](https://en.algorithmica.org/hpc/complexity/img/cpu.png =400x) + +- The “elementary operations” of a CPU are called *instructions* +- Their “costs” are called *latencies* (measured in cycles) +- Instructions modify the state of the CPU stored in a number of *registers* +- To convert to real time, sum up all latencies of executed instructions and divide by the *clock frequency* (the number of cycles a particular CPU does per second) +-
Clock speed is volatile, so counting cycles is more useful for analytical purposes + +---- + +![](https://external-preview.redd.it/6PIp0RLbdWFGFUOT6tFuufpMlplgWdnXWOmjuqkpMMU.jpg?auto=webp&s=9bed495f3dbb994d7cdda33cc114aba1cebd30e2 =400x) + +http://ithare.com/infographics-operation-costs-in-cpu-clock-cycles/ + +---- + +### Asymptotic complexity + +![](https://en.algorithmica.org/hpc/complexity/img/complexity.jpg =400x) + +For sufficiently large $n$, we only care about asymptotic complexity: $O(n) = O(1000 \cdot n)$ + +$\implies$ The costs of basic ops don't matter since they don't affect complexity + +But can we handle "sufficiently large" $n$? + +--- + +When complexity theory was developed, computers were different + +![](https://upload.wikimedia.org/wikipedia/commons/thumb/4/4e/Eniac.jpg/640px-Eniac.jpg =500x) + +Bulky, costly, and fundamentally slow (due to speed of light) + +---- + +![](https://researchresearch-news-wordpress-media-live.s3.eu-west-1.amazonaws.com/2022/02/microchip_fingertip-738x443.jpg =500x) + +Micro-scale circuits allow signals to propagate faster + +---- + + + +

      + +---- + +The development of microchips and photolithography enabled: + +- higher clock rates +- the ability to scale the production +- **much** lower material and power usage (= lower cost) + +---- + +![](https://upload.wikimedia.org/wikipedia/commons/4/49/MOS_6502AD_4585_top.jpg =500x) + +MOS Technology 6502 (1975), Atari 2600 (1977), Apple II (1977), Commodore 64 (1982) + +---- + +Also a clear path to improvement: just make lenses stronger and chips smaller + +**Moore’s law:** transistor count doubles every two years. + +---- + +**Dennard scaling:** reducing die dimensions by 30% + +- doubles the transistor density ($0.7^2 \approx 0.5$) +- increases the clock speed by 40% ($\frac{1}{0.7} \approx 1.4$) +- leaves the overall *power density* the same + (we have a mechanical limit on how much heat can be dissipated) + +$\implies$ Each new "generation" should have roughly the same total cost, but 40% higher clock and twice as many transistors + +(which can be used e. g. to add new instructions or increase the word size) + +---- + +Around 2005, Dennard scaling stopped — due to *leakage* issues: + +- transistors became very small +- $\implies$ their magnetic fields started to interfere with the neighboring circuitry +- $\implies$ unnecessary heating and occasional bit flipping +- $\implies$ have to increase voltage to fix it +- $\implies$ have to reduce clock frequency to balance off power consumption + +---- + +![](https://en.algorithmica.org/hpc/complexity/img/dennard.ppm =600x) + +A limit on the clock speed + +--- + +Clock rates have plateaued, but we still have more transistors to use: + +- **Pipelining:** overlapping the execution of sequential instructions to keep different parts of the CPU busy +- **Out-of-order execution:** no waiting for the previous instructions to complete +- **Superscalar processing:** adding duplicates of execution units +- **Caching:** adding layers of faster memory on the chip to speed up RAM access +- **SIMD:** adding instructions 
that handle a block of 128, 256, or 512 bits of data +- **Parallel computing:** adding multiple identical cores on a chip +- **Distributed computing:** multiple chips in a motherboard or multiple computers +- **FPGAs** and **ASICs:** using custom hardware to solve a specific problem + +---- + +![](https://en.algorithmica.org/hpc/complexity/img/die-shot.jpg =500x) + +For modern computers, the “let’s count all operations” approach for predicting algorithm performance is off by several orders of magnitude + +--- + +### Matrix multiplication + +```python +n = 1024 + +a = [[random.random() + for row in range(n)] + for col in range(n)] + +b = [[random.random() + for row in range(n)] + for col in range(n)] + +c = [[0 + for row in range(n)] + for col in range(n)] + +for i in range(n): + for j in range(n): + for k in range(n): + c[i][j] += a[i][k] * b[k][j] +``` + +630 seconds or 10.5 minutes to multiply two $1024 \times 1024$ matrices in plain Python + +~880 cycles per multiplication + +---- + +```java +public class Matmul { + static int n = 1024; + static double[][] a = new double[n][n]; + static double[][] b = new double[n][n]; + static double[][] c = new double[n][n]; + + public static void main(String[] args) { + Random rand = new Random(); + + for (int i = 0; i < n; i++) { + for (int j = 0; j < n; j++) { + a[i][j] = rand.nextDouble(); + b[i][j] = rand.nextDouble(); + c[i][j] = 0; + } + } + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + c[i][j] += a[i][k] * b[k][j]; + } +} +``` + +Java needs 10 seconds, 63 times faster + +~13 cycles per multiplication + +---- + +```c +#define n 1024 +double a[n][n], b[n][n], c[n][n]; + +int main() { + for (int i = 0; i < n; i++) { + for (int j = 0; j < n; j++) { + a[i][j] = (double) rand() / RAND_MAX; + b[i][j] = (double) rand() / RAND_MAX; + } + } + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + c[i][j] += a[i][k] * b[k][j]; + + return 0; +} +``` 
+ +`GCC -O3` needs 9 seconds, but if we include `-march=native` and `-ffast-math`, the compiler vectorizes the code, and it drops down to 0.6s. + +---- + +```python +import time +import numpy as np + +n = 1024 + +a = np.random.rand(n, n) +b = np.random.rand(n, n) + +start = time.time() + +c = np.dot(a, b) + +duration = time.time() - start +print(duration) +``` + +BLAS needs ~0.12 seconds +(~5x over auto-vectorized C and ~5250x over plain Python) diff --git a/content/english/hpc/slides/_index.md b/content/english/hpc/slides/_index.md new file mode 100644 index 00000000..794e67a6 --- /dev/null +++ b/content/english/hpc/slides/_index.md @@ -0,0 +1,10 @@ +--- +title: Slides +ignoreIndexing: true +weight: 1000 +draft: true +--- + +This is an attempt to make a university course out of the book. + +Work in progress. From 498100cf79e8d6d512edcedbb09e274f40030d38 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 13 May 2022 19:00:33 +0300 Subject: [PATCH 448/531] typos --- content/english/hpc/external-memory/locality.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/external-memory/locality.md b/content/english/hpc/external-memory/locality.md index 569d9437..eca83766 100644 --- a/content/english/hpc/external-memory/locality.md +++ b/content/english/hpc/external-memory/locality.md @@ -23,7 +23,7 @@ In this article, we continue designing algorithms for the external memory model In this context, we can talk about the degree of cache reuse primarily in two ways: -- *Temporal locality* refers to the repeated access of the same data within a relatively small time duration, such that the data likely remains cached between the requests. +- *Temporal locality* refers to the repeated access of the same data within a relatively small time period, such that the data likely remains cached between the requests. 
- *Spatial locality* refers to the use of elements relatively close to each other in terms of their memory locations, such that they are likely fetched in the same memory block. In other words, temporal locality is when it is likely that this same memory location will soon be requested again, while spatial locality is when it is likely that a nearby location will be requested right after. @@ -136,7 +136,7 @@ $$ t[k][i] = \min(t[k-1][i], t[k-1][i+2^{k-1}]) $$ -Now, there are two design choices to make: whether the log-size $k$ should be the first or the second dimension, and whether to iterate over $k$ and then $i$ or the other way around. This means that there are of $2×2=4$ ways to build it, and here is the optimal one: +Now, there are two design choices to make: whether the log-size $k$ should be the first or the second dimension, and whether to iterate over $k$ and then $i$ or the other way around. This means that there are $2×2=4$ ways to build it, and here is the optimal one: ```cpp int mn[logn][maxn]; From 457960740ed133df92f23d0002176a66c8abd923 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 13 May 2022 19:48:07 +0300 Subject: [PATCH 449/531] "great-grandfather" --- content/english/hpc/data-structures/binary-search.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 56f1609a..d9a3dcf6 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -175,7 +175,7 @@ With prefetching, the performance on large arrays becomes roughly the same: ![](../img/search-branchless-prefetch.svg) -The graph still grows faster as the branchy version also prefetches "grandchildren," "grand-grandchildren," and so on — although the usefulness of each new speculative read diminishes exponentially as the prediction is less and less likely to be correct. 
+The graph still grows faster as the branchy version also prefetches "grandchildren," "great-grandchildren," and so on — although the usefulness of each new speculative read diminishes exponentially as the prediction is less and less likely to be correct. In the branchless version, we could also fetch ahead by more than one layer, but the number of fetches we'd need also grows exponentially. Instead, we will try a different approach to optimize memory operations. @@ -359,9 +359,9 @@ This observation extends to the grand-children of node $k$ — they are also sto \end{aligned} --> -Their cache line can also be fetched with one instruction. Interesting… what if we continue this, and instead of fetching direct children, we fetch ahead as many descendants as we can cramp into one cache line? That would be $\frac{64}{4} = 16$ elements, our grand-grand-grandchildren with indices from $16k$ to $(16k + 15)$. +Their cache line can also be fetched with one instruction. Interesting… what if we continue this, and instead of fetching direct children, we fetch ahead as many descendants as we can cram into one cache line? That would be $\frac{64}{4} = 16$ elements, our great-great-grandchildren with indices from $16k$ to $(16k + 15)$. -Now, if we prefetch just one of these 16 elements, we will probably only get some but not all of them, as they may cross a cache line boundary. We can prefetch the first *and* the last element, but to get away with just one memory request, we need to notice that the index of the first element, $16k$, is divisible by $16$, so its memory address will be the base address of the array plus something divisible by $16 \cdot 4 = 64$, the cache line size. If the array were to begin on a cache line, then these $16$ grand-gran-grandchildren elements will be guaranteed to be on a single cache line, which is just what we needed. 
+Now, if we prefetch just one of these 16 elements, we will probably only get some but not all of them, as they may cross a cache line boundary. We can prefetch the first *and* the last element, but to get away with just one memory request, we need to notice that the index of the first element, $16k$, is divisible by $16$, so its memory address will be the base address of the array plus something divisible by $16 \cdot 4 = 64$, the cache line size. If the array were to begin on a cache line, then these $16$ great-great-grandchildren elements will be guaranteed to be on a single cache line, which is just what we needed. Therefore, we only need to [align](/hpc/cpu-cache/alignment) the array: From 01f16643633967f4b2ac68f889316263e5239b8e Mon Sep 17 00:00:00 2001 From: hectonit <48787141+hectonit@users.noreply.github.com> Date: Sun, 15 May 2022 19:57:02 +0300 Subject: [PATCH 450/531] Update fenwick.md Wrong variable naming --- content/russian/cs/range-queries/fenwick.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/russian/cs/range-queries/fenwick.md b/content/russian/cs/range-queries/fenwick.md index f07a1ed4..9e37fc8d 100644 --- a/content/russian/cs/range-queries/fenwick.md +++ b/content/russian/cs/range-queries/fenwick.md @@ -84,7 +84,7 @@ int sum (int r1, int r2) { int res = 0; for (int i = r1; i > 0; i -= i & -i) for (int j = r2; j > 0; j -= j & -j) - ans += t[i][j]; + res += t[i][j]; return res; } ``` From 0339dbbd098c1cfd2943d443cbdc0b78ee6849f6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Sun, 15 May 2022 22:39:35 +0300 Subject: [PATCH 451/531] elaborate on b-tree insert performance --- content/english/hpc/data-structures/b-tree.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/data-structures/b-tree.md b/content/english/hpc/data-structures/b-tree.md index 122e1c8e..d69a814e 100644 --- a/content/english/hpc/data-structures/b-tree.md +++ b/content/english/hpc/data-structures/b-tree.md @@ 
-305,7 +305,7 @@ The relative speedup varies with the structure size — 7-18x/3-8x over STL and ![](../img/btree-relative.svg) -Insertions are only 1.5-2 faster than for `absl::btree`, which uses scalar code to do everything. I don't know (yet) why insertions are *that* slow, but I guess it has something to do with data dependencies between queries. +Insertions are only 1.5-2x faster than for `absl::btree`, which uses scalar code to do everything. My best guess is that insertions are *that* slow because of a data dependency: since the tree nodes may change, the CPU can't start processing the next query before the previous one finishes (the [true latency](../s-tree/#comparison-with-stdlower_bound) of both queries is roughly equal and about 3x the reciprocal throughput of `lower_bound`). ![](../img/btree-absl.svg) From eefefe42b7db3cdb8dc5ab74cbfc864e08ad0dff Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 16 May 2022 08:55:12 +0300 Subject: [PATCH 452/531] elaborate on why ctz of negative diff works --- content/english/hpc/algorithms/gcd.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/algorithms/gcd.md b/content/english/hpc/algorithms/gcd.md index b9e9007a..63efdec9 100644 --- a/content/english/hpc/algorithms/gcd.md +++ b/content/english/hpc/algorithms/gcd.md @@ -207,7 +207,7 @@ Let's draw the dependency graph of this loop: Modern processors can execute many instructions in parallel, essentially meaning that the true "cost" of this computation is roughly the sum of latencies on its critical path. In this case, it is the total latency of `diff`, `abs`, `ctz`, and `shift`. -We can decrease this latency using the fact that we can actually calculate `ctz` using just `diff = a - b`, because a negative number divisible by $2^k$ still has $k$ zeros at the end. 
This lets us not wait for `max(diff, -diff)` to be computed first, resulting in a shorter graph like this: +We can decrease this latency using the fact that we can actually calculate `ctz` using just `diff = a - b`, because a [negative number](../hpc/arithmetic/integer/#signed-integers) divisible by $2^k$ still has $k$ zeros at the end of its binary representation. This lets us not wait for `max(diff, -diff)` to be computed first, resulting in a shorter graph like this: + +**Definition.** The *representative* $\bar x$ of a number $x$ in the Montgomery space is defined as $$ \bar{x} = x \cdot r \bmod n $$ -Note that the transformation is actually such a multiplication that we want to optimize, so it is still an expensive operation. However, we will only need to transform a number into the space once, perform as many operations as we want efficiently in that space and at the end transform the final result back, which should be profitable if we are doing lots of operations modulo $n$. +Computing this transformation involves a multiplication and a modulo — an expensive operation that we wanted to optimize away in the first place — which is why we don't use this method for general modular multiplication and only for long sequences of operations where transforming numbers to and from the Montgomery space is worth it. + + -Inside the Montgomery space addition, substraction and checking for equality is performed as usual ($x \cdot r + y \cdot r \equiv (x + y) \cdot r \bmod n$). However, this is not the case for multiplication. Denoting multiplication in Montgomery space as $*$ and normal multiplication as $\cdot$, we expect the result to be: +Inside the Montgomery space, addition, subtraction, and checking for equality are performed as usual: + +$$ +x \cdot r + y \cdot r \equiv (x + y) \cdot r \bmod n +$$ + +However, this is not the case for multiplication. 
Denoting multiplication in the Montgomery space as $*$ and the "normal" multiplication as $\cdot$, we expect the result to be: $$ \bar{x} * \bar{y} = \overline{x \cdot y} = (x \cdot y) \cdot r \bmod n $$ -But the normal multiplication will give us: +But the normal multiplication in the Montgomery space yields: $$ \bar{x} \cdot \bar{y} = (x \cdot y) \cdot r \cdot r \bmod n $$ -Therefore the multiplication in the Montgomery space is defined as +Therefore, the multiplication in the Montgomery space is defined as $$ \bar{x} * \bar{y} = \bar{x} \cdot \bar{y} \cdot r^{-1} \bmod n $$ -This means that whenever we multiply two numbers, after the multiplication we need to *reduce* them. Therefore, we need to have an efficient way of calculating $x \cdot r^{-1} \bmod n$. +This means that, after we normally multiply two numbers in the Montgomery space, we need to *reduce* the result by multiplying it by $r^{-1}$ and taking the modulo — and there is an efficient way to do this particular operation. ### Montgomery reduction -Assume that $r=2^{64}$, the modulo $n$ is 64-bit and the number $x$ we need to reduce (multiply by $r^{-1}$) is 128-bit (the product of two 64-bit numbers). +Assume that $r=2^{32}$, the modulo $n$ is 32-bit, and the number $x$ we need to reduce (multiply by $r^{-1}$ and take it modulo $n$) is the 64-bit product of two 32-bit numbers. -Because $\gcd(n, r) = 1$, we know that there are two numbers $r^{-1}$ and $n'$ in the $[0, n)$ range such that +By definition, $\gcd(n, r) = 1$, so we know that there are two numbers $r^{-1}$ and $n'$ in the $[0, n)$ range such that $$ r \cdot r^{-1} + n \cdot n' = 1 $$ -and both $r^{-1}$ and $n'$ can be computed using the extended Euclidean algorithm. +and both $r^{-1}$ and $n'$ can be computed using the [extended Euclidean algorithm](../euclid-extended). 
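As a rough illustration of the identity $r \cdot r^{-1} + n \cdot n' = 1$ and of the definition $\bar{x} * \bar{y} = \bar{x} \cdot \bar{y} \cdot r^{-1} \bmod n$, here is a minimal Python sketch. It is not part of the patch: the helper names (`extended_gcd`, `to_montgomery`, `montgomery_multiply`) and the modulus `998244353` are assumptions chosen only for the demonstration. It also uses the slow `%` operator throughout, so it checks the algebra rather than showing the fast reduction. The raw coefficients returned by the extended Euclidean algorithm may be negative, so the sketch normalizes `r_inv` into $[0, n)$ after verifying the identity.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y


n = 998244353  # hypothetical odd 32-bit modulus, chosen only for illustration
r = 2**32      # r as in the text; gcd(n, r) = 1 because n is odd

# Coefficients of the identity r * r_inv + n * n_prime = 1 (one of them is negative here).
g, r_inv, n_prime = extended_gcd(r, n)
assert g == 1 and r * r_inv + n * n_prime == 1

r_inv %= n  # normalize the inverse of r into [0, n)


def to_montgomery(x):
    """Representative of x in the Montgomery space: x * r mod n."""
    return x * r % n


def montgomery_multiply(x_bar, y_bar):
    """Multiplication in the Montgomery space, written with a plain (slow) modulo."""
    return x_bar * y_bar * r_inv % n


x, y = 123456789, 987654321
product_bar = montgomery_multiply(to_montgomery(x), to_montgomery(y))

# The result is the representative of x * y ...
assert product_bar == to_montgomery(x * y % n)
# ... and multiplying by r_inv once more transforms it back.
assert product_bar * r_inv % n == x * y % n
```

The point of the actual Montgomery reduction is to replace the `* r_inv % n` step with shifts and multiplications modulo $r$; this sketch only confirms what that step has to compute.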
-Using this identity we can express $r \cdot r^{-1}$ as $(-n \cdot n' + 1)$ and write $x \cdot r^{-1}$ as +Using this identity, we can express $r \cdot r^{-1}$ as $(-n \cdot n' + 1)$ and write $x \cdot r^{-1}$ as $$ \begin{aligned} @@ -75,7 +122,13 @@ def reduce(x): return a ``` -Since $x < n \cdot n < r \cdot n$ (as $x$ is a product of multiplicatio) and $q \cdot n < r \cdot n$, we know that $-n < (x - q \cdot n) / r < n$. Therefore the final modulo operation can be implemented using a single bound check and addition. +Since $x < n \cdot n < r \cdot n$ and $q \cdot n < r \cdot n$, we know that + +$$ +-n < (x - q \cdot n) / r < n +$$ + +Therefore, the final modulo operation can be implemented using a single bound check and addition. Here is an equivalent C implementation for 64-bit integers: @@ -138,39 +191,86 @@ Transforming a number into the space is just a multiplication inside the space o ### Complete Implementation ```c++ +typedef __uint32_t u32; +typedef __uint64_t u64; + struct montgomery { - u64 n, nr; + u32 n, nr; - montgomery(u64 n) : n(n) { - nr = 1; + constexpr montgomery(u32 n) : n(n), nr(1) { for (int i = 0; i < 6; i++) nr *= 2 - n * nr; } - u64 reduce(u128 x) { - u64 q = u64(x) * nr; - u64 m = ((u128) q * n) >> 64; - u64 xhi = (x >> 64); - //cout << u64(x>>64) << " " << u64(x) << " " << q << endl; - //cout << u64(m>>64) << " " << u64(m) << endl; - //exit(0); - if (xhi >= m) - return (xhi - m); - else - return (xhi - m) + n; + u32 reduce(u64 x) const { + u32 q = u32(x) * nr; + u32 m = ((u64) q * n) >> 32; + u32 xhi = (x >> 32); + return xhi + n - m; + + // if you need + // u32 t = xhi - m; + // return xhi >= m ? 
t : t + n; } - u64 mult(u64 x, u64 y) { - return reduce((u128) x * y); + u32 multiply(u32 x, u32 y) const { + return reduce((u64) x * y); } - u64 transform(u64 x) { - return (u128(x) << 64) % n; + u32 transform(u32 x) const { + return (u64(x) << 32) % n; } }; ``` ```c++ montgomery m(n); -m.transform(x); -``` \ No newline at end of file + +a = m.transform(a); +b = m.transform(b); +c = m.multiply(a, b); +c = m.reduce(c); +``` + +```c++ +int inverse(int _a) { + u32 a = space.transform(_a); + u32 r = space.transform(1); + + int n = M - 2; + while (n) { + if (n & 1) + r = space.multiply(r, a); + a = space.multiply(a, a); + n >>= 1; + } + + return space.reduce(r); +} +``` + +SIMD + +166.79 ns + +207.04 ns + +```c++ +constexpr montgomery space(M); + +int inverse(int _a) { + u64 a = space.transform(_a); + u64 r = space.transform(1); + + #pragma GCC unroll(30) + for (int l = 0; l < 30; l++) { + if ( (M - 2) >> l & 1 ) + r = space.multiply(r, a); + a = space.multiply(a, a); + } + + return space.reduce(r); +} +``` + +**Exercise.** Implement efficient *modular* [matix multiplication](/hpc/algorithms/matmul). From acfb5c857b2bf915adf6f28c17bc2d2ba5adef91 Mon Sep 17 00:00:00 2001 From: Project Nayuki Date: Wed, 18 May 2022 05:40:26 +0000 Subject: [PATCH 455/531] Improved spelling and word choice. 
--- content/english/hpc/_index.md | 4 ++-- content/english/hpc/architecture/assembly.md | 6 +++--- content/english/hpc/architecture/functions.md | 8 ++++---- content/english/hpc/architecture/isa.md | 2 +- content/english/hpc/architecture/layout.md | 6 +++--- content/english/hpc/architecture/loops.md | 4 ++-- content/english/hpc/arithmetic/division.md | 2 +- content/english/hpc/arithmetic/float.md | 2 +- content/english/hpc/compilation/_index.md | 2 +- content/english/hpc/complexity/_index.md | 2 +- content/english/hpc/complexity/hardware.md | 10 +++++----- content/english/hpc/complexity/languages.md | 4 ++-- content/english/hpc/external-memory/sorting.md | 2 +- content/english/hpc/pipelining/_index.md | 8 ++++---- content/english/hpc/pipelining/branchless.md | 4 ++-- content/english/hpc/pipelining/hazards.md | 4 ++-- content/english/hpc/pipelining/tables.md | 4 ++-- content/english/hpc/pipelining/throughput.md | 4 ++-- 18 files changed, 39 insertions(+), 39 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index 92d0cd91..942c9f6a 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -33,7 +33,7 @@ A "release" for an open-source book like this essentially means: - mostly freezing the table of contents (except for the case studies), - doing one final round of heavy copyediting (hopefully, with the help of a professional editor — I still haven’t figured out how commas work in English), - drawing illustrations (I stole a lot of those that are currently displayed), -- making a print-optimized pdf and figuring out the best way to distribute it. +- making a print-optimized PDF and figuring out the best way to distribute it. After that, I will mostly be fixing errors and only doing some minor edits reflecting the changes in technology or new algorithm advancements. 
The e-book/printed editions will most likely be sold on a "pay what you want" basis, and in any case, the web version will always be fully available online. @@ -51,7 +51,7 @@ However, as the book is still evolving, it is probably not the best idea to star There are two highly impactful textbooks on which most computer science courses are built. Both are undoubtedly outstanding, but [one of them](https://en.wikipedia.org/wiki/The_Art_of_Computer_Programming) is 50 years old, and [the other](https://en.wikipedia.org/wiki/Introduction_to_Algorithms) is 30 years old, and [computers have changed a lot](/hpc/complexity/hardware) since then. Asymptotic complexity is not the sole deciding factor anymore. In modern practical algorithm design, you choose the approach that makes better use of different types of parallelism available in the hardware over the one that theoretically does fewer raw operations on galaxy-scale inputs. -And yet, the computer science curricula in most colleges completely ignore this shift. Although there are some great courses that aim to correct that — such as "[Performance Engineering of Software Systems](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2018/)" from MIT, "[Programming Parallel Computers](https://ppc.cs.aalto.fi/)" from Aalto University, and some non-academic ones like Denis Bakhvalov's "[Performance Ninja](https://github.com/dendibakh/perf-ninja)" — most computer science graduates still treat the hardware like something from the 90s. +And yet, the computer science curricula in most colleges completely ignore this shift. 
Although there are some great courses that aim to correct that — such as "[Performance Engineering of Software Systems](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2018/)" from MIT, "[Programming Parallel Computers](https://ppc.cs.aalto.fi/)" from Aalto University, and some non-academic ones like Denis Bakhvalov's "[Performance Ninja](https://github.com/dendibakh/perf-ninja)" — most computer science graduates still treat the hardware like something from the 1990s. What I really want to achieve is that performance engineering becomes taught right after introduction to algorithms. Writing the first comprehensive textbook on the subject is a large part of it, and this is why I rush to finish it by the summer so that the colleges can pick it up in the next academic year. But creating a new course requires more than that: you need a balanced curriculum, course infrastructure, lecture slides, lab assignments… so for some time after finishing the main book, I will be working on course materials and tools for *teaching* performance engineering — and I'm looking forward to collaborating with other people who want to make it a reality as well. 
diff --git a/content/english/hpc/architecture/assembly.md b/content/english/hpc/architecture/assembly.md index 5c981547..00c7caac 100644 --- a/content/english/hpc/architecture/assembly.md +++ b/content/english/hpc/architecture/assembly.md @@ -19,7 +19,7 @@ Jumping right into it, here is how you add two numbers (`*c = *a + *b`) in Arm a ldr w0, [x0] ; load 4 bytes from wherever x0 points into w0 ldr w1, [x1] ; load 4 bytes from wherever x1 points into w1 add w0, w0, w1 ; add w0 with w1 and save the result to w0 -str w0, [x2] ; write contents of w0 to wherever x2 points/ +str w0, [x2] ; write contents of w0 to wherever x2 points ``` Here is the same operation in x86 assembly: @@ -33,7 +33,7 @@ mov DWORD PTR [rdx], eax ; write contents of eax to wherever rdx points Assembly is very simple in the sense that it doesn't have many syntactical constructions compared to high-level programming languages. From what you can observe from the examples above: -- A program is a sequence of instructions, each written as its name followed by a variable amount of operands. +- A program is a sequence of instructions, each written as its name followed by a variable number of operands. - The `[reg]` syntax is used for "dereferencing" a pointer stored in a register, and on x86 you need to prefix it with size information (`DWORD` here means 32 bit). - The `;` sign is used for line comments, similar to `#` and `//` in other languages. @@ -55,7 +55,7 @@ Most instructions write their result into the first operand, which can also be i **Registers** are named `rax`, `rbx`, `rcx`, `rdx`, `rdi`, `rsi`, `rbp`, `rsp`, and `r8`-`r15` for a total of 16 of them. The "letter" ones are named like that for historical reasons: `rax` is "accumulator," `rcx` is "counter," `rdx` is "data" and so on — but, of course, they don't have to be used only for that. -There are also 32-, 16-bit and 8-bit registers that have similar names (`rax` → `eax` → `ax` → `al`). 
They are not fully separate but *aliased*: the first 32 bits of `rax` are `eax`, the first 16 bits of `eax` are `ax`, and so on. This is made to save die space while maintaining compatibility, and it is also the reason why basic type casts in compiled programming languages are usually free. +There are also 32-, 16-bit and 8-bit registers that have similar names (`rax` → `eax` → `ax` → `al`). They are not fully separate but *aliased*: the lowest 32 bits of `rax` are `eax`, the lowest 16 bits of `eax` are `ax`, and so on. This is made to save die space while maintaining compatibility, and it is also the reason why basic type casts in compiled programming languages are usually free. These are just the *general-purpose* registers that you can, with [some exceptions](../functions), use however you like in most instructions. There is also a separate set of registers for [floating-point arithmetic](/hpc/arithmetic/float), a bunch of very wide registers used in [vector extensions](/hpc/simd), and a few special ones that are needed for [control flow](../loops), but we'll get there in time. diff --git a/content/english/hpc/architecture/functions.md b/content/english/hpc/architecture/functions.md index 02614f94..412fc027 100644 --- a/content/english/hpc/architecture/functions.md +++ b/content/english/hpc/architecture/functions.md @@ -18,7 +18,7 @@ The hardware stack works the same way software stacks do and is similarly implem - The *base pointer* marks the start of the stack and is conventionally stored in `rbp`. - The *stack pointer* marks the last element of the stack and is conventionally stored in `rsp`. -When you need to call a function, you push all your local variables onto the stack (which you can also do in other circumstances, e. g. when you run out of registers), push the current instruction pointer, and then jump to the beginning of the function. 
When exiting from a function, you look at the pointer stored on top of the stack, jump there, and then carefully read all the variables stored on the stack back into their registers. +When you need to call a function, you push all your local variables onto the stack (which you can also do in other circumstances, e.g. when you run out of registers), push the current instruction pointer, and then jump to the beginning of the function. When exiting from a function, you look at the pointer stored on top of the stack, jump there, and then carefully read all the variables stored on the stack back into their registers. -By convention, a function should take its arguments in `rdi`, `rsi`, `rdx`, `rcx`, `r8`, `r9` (and the rest in the stack if that wasn't enough), put the return value into `rax`, and then return. Thus, `square`, being a simple one-argument function, can be implemented like this: +By convention, a function should take its arguments in `rdi`, `rsi`, `rdx`, `rcx`, `r8`, `r9` (and the rest in the stack if those weren't enough), put the return value into `rax`, and then return. Thus, `square`, being a simple one-argument function, can be implemented like this: ```nasm square: ; x = edi, ret = eax @@ -190,7 +190,7 @@ distance: ret ``` -This is better, but we are still implicitly accessing stack memory: you need to push and pop the instruction pointer on each function call. In simple cases like this, we can *inline* function calls by stitching callee's code into the caller and resolving conflicts over registers. In our example: +This is better, but we are still implicitly accessing stack memory: you need to push and pop the instruction pointer on each function call. In simple cases like this, we can *inline* function calls by stitching the callee's code into the caller and resolving conflicts over registers. 
In our example: ```nasm distance: diff --git a/content/english/hpc/architecture/isa.md b/content/english/hpc/architecture/isa.md index a1a4e66c..4862efb3 100644 --- a/content/english/hpc/architecture/isa.md +++ b/content/english/hpc/architecture/isa.md @@ -14,7 +14,7 @@ Abstractions help us in reducing all this complexity down to a single *interface Hardware engineers love abstractions too. An abstraction of a CPU is called an *instruction set architecture* (ISA), and it defines how a computer should work from a programmer's perspective. Similar to software interfaces, it gives computer engineers the ability to improve on existing CPU designs while also giving its users — us, programmers — the confidence that things that worked before won't break on newer chips. -An ISA essentially defines how the hardware should interpret the machine language. Apart from instructions and their binary encodings, ISA importantly defines counts, sizes, and purposes of registers, the memory model, and the input/output model. Similar to software interfaces, ISAs can be extended too: in fact, they are often updated, mostly in a backward-compatible way, to add new and more specialized instructions that can improve performance. +An ISA essentially defines how the hardware should interpret the machine language. Apart from instructions and their binary encodings, an ISA importantly defines the counts, sizes, and purposes of registers, the memory model, and the input/output model. Similar to software interfaces, ISAs can be extended too: in fact, they are often updated, mostly in a backward-compatible way, to add new and more specialized instructions that can improve performance. 
 ### RISC vs CISC
 
diff --git a/content/english/hpc/architecture/layout.md b/content/english/hpc/architecture/layout.md
index 11735951..9ddebfd5 100644
--- a/content/english/hpc/architecture/layout.md
+++ b/content/english/hpc/architecture/layout.md
@@ -16,7 +16,7 @@ During the **fetch** stage, the CPU simply loads a fixed-size chunk of bytes fro
-Next comes the **decode** stage: the CPU looks at this chunk of bytes, discards everything that comes before the instruction pointer, and splits the rest of them into instructions. Machine instructions are encoded using a variable amount of bytes: something simple and very common like `inc rax` takes one byte, while some obscure instruction with encoded constants and behavior-modifying prefixes may take up to 15. So, from a 32-byte block, a variable number of instructions may be decoded, but no more than a certain machine-dependant limit called the *decode width*. On my CPU (a [Zen 2](https://en.wikichip.org/wiki/amd/microarchitectures/zen_2)), the decode width is 4, which means that on each cycle, up to 4 instructions can be decoded and passed to the next stage.
+Next comes the **decode** stage: the CPU looks at this chunk of bytes, discards everything that comes before the instruction pointer, and splits the rest of them into instructions. Machine instructions are encoded using a variable number of bytes: something simple and very common like `inc rax` takes one byte, while some obscure instruction with encoded constants and behavior-modifying prefixes may take up to 15. So, from a 32-byte block, a variable number of instructions may be decoded, but no more than a certain machine-dependent limit called the *decode width*. On my CPU (a [Zen 2](https://en.wikichip.org/wiki/amd/microarchitectures/zen_2)), the decode width is 4, which means that on each cycle, up to 4 instructions can be decoded and passed to the next stage.
 The stages work in a pipelined fashion: if the CPU can tell (or [predict](/hpc/pipelining/branching/)) which instruction block it needs next, then the fetch stage doesn't wait for the last instruction in the current block to be decoded and loads the next one right away.
 
@@ -49,12 +49,12 @@ The instructions are stored and fetched using largely the same [memory system](/
 The instruction cache is crucial in situations when you either
 
 - don't know what instructions you are going to execute next, and need to fetch the next block with [low latency](/hpc/cpu-cache/latency),
-- or executing a long sequence of verbose-but-quick-to-process instructions, and need [high bandwidth](/hpc/cpu-cache/bandwidth).
+- or are executing a long sequence of verbose-but-quick-to-process instructions, and need [high bandwidth](/hpc/cpu-cache/bandwidth).
 
 The memory system can therefore become the bottleneck for programs with large machine code. This consideration limits the applicability of the optimization techniques we've previously discussed:
 
 - [Inlining functions](../functions) is not always optimal, because it reduces code sharing and increases the binary size, requiring more instruction cache.
-- [Unrolling loops](../loops) is only beneficial up to some extent, even if the number of loops is known during compile-time: at some point, the CPU would have to fetch both instructions and data from the main memory, in which case it will likely be bottlenecked by the memory bandwidth.
+- [Unrolling loops](../loops) is only beneficial up to some extent, even if the number of iterations is known during compile-time: at some point, the CPU would have to fetch both instructions and data from the main memory, in which case it will likely be bottlenecked by the memory bandwidth.
 - Huge [code alignments](#code-alignment) increase the binary size, again requiring more instruction cache. Spending one more cycle on fetch is a minor penalty compared to missing the cache and waiting for the instructions to be fetched from the main memory.
 
 Another aspect is that placing frequently used instruction sequences on the same [cache lines](/hpc/cpu-cache/cache-lines) and [memory pages](/hpc/cpu-cache/paging) improves [cache locality](/hpc/external-memory/locality). To improve instruction cache utilization, you should group hot code with hot code and cold code with cold code, and remove dead (unused) code if possible. If you want to explore this idea further, check out Facebook's [Binary Optimization and Layout Tool](https://engineering.fb.com/2018/06/19/data-infrastructure/accelerate-large-scale-applications-with-bolt/), which was recently [merged](https://github.com/llvm/llvm-project/commit/4c106cfdf7cf7eec861ad3983a3dd9a9e8f3a8ae) into LLVM.
diff --git a/content/english/hpc/architecture/loops.md b/content/english/hpc/architecture/loops.md
index b441ae67..9dc1faba 100644
--- a/content/english/hpc/architecture/loops.md
+++ b/content/english/hpc/architecture/loops.md
@@ -23,11 +23,11 @@ Assembly doesn't have if-s, for-s, functions, or other control flow structures t
 
 **Jump** moves the instruction pointer to a location specified by its operand. This location may be either an absolute address in memory, relative to the current address or even [computed during runtime](../indirect). To avoid the headache of managing these addresses directly, you can mark any instruction with a string followed by `:`, and then use this string as a label which gets replaced by the relative address of this instruction when converted to machine code.
 
-Labels can be any strings, but compilers don't get creative and [typically](https://godbolt.org/z/T45x8GKa5) just use the line numbers in the source code and function names with their signatures when picking names for labels.
+Labels can be any string, but compilers don't get creative and [typically](https://godbolt.org/z/T45x8GKa5) just use the line numbers in the source code and function names with their signatures when picking names for labels.
 
 **Unconditional** jump `jmp` can only be used to implement `while (true)` kind of loops or stitch parts of a program together. A family of **conditional** jumps is used to implement actual control flow.
 
-It is reasonable to think that these conditions are computed as `bool`-s somewhere and passed to conditional jumps as operands: after all, this is how it works in programming languages. But that is not how it is implemented in hardware. Conditional operations use a special `FLAGS` register, which first needs to be populated by executing instructions that perform some kind of checks.
+It is reasonable to think that these conditions are computed as `bool`-s somewhere and passed to conditional jumps as operands: after all, this is how it works in programming languages. But that is not how it is implemented in hardware. Conditional operations use a special `FLAGS` register, which first needs to be populated by executing instructions that perform some kind of check.
 
 In our example, `cmp rax, rcx` compares the iterator `rax` with the end-of-array pointer `rcx`. This updates the FLAGS register, and now it can be used by `jne loop`, which looks up a certain bit there that tells whether the two values are equal or not, and then either jumps back to the beginning or continues to the next instruction, thus breaking the loop.
diff --git a/content/english/hpc/arithmetic/division.md b/content/english/hpc/arithmetic/division.md
index e3f699db..ad1cf525 100644
--- a/content/english/hpc/arithmetic/division.md
+++ b/content/english/hpc/arithmetic/division.md
@@ -45,7 +45,7 @@ You can also divide 128-bit integer (stored in `rdx:rax`) by a 64-bit integer:
 
 ```nasm
 div(u128, u64): ; a = rdi + rsi, b = rdx
-    mov rcx, rdx ;
+    mov rcx, rdx
     mov rax, rdi
     mov rdx, rsi
     div edx
diff --git a/content/english/hpc/arithmetic/float.md b/content/english/hpc/arithmetic/float.md
index cda42944..70217a91 100644
--- a/content/english/hpc/arithmetic/float.md
+++ b/content/english/hpc/arithmetic/float.md
@@ -139,7 +139,7 @@ $$
 \{ \pm \; (1 + m) \cdot 2^e \; | \; m = \frac{x}{2^{32}}, \; x \in [0, 2^{32}) \}
 $$
 
-Since $m$ is now a nonnegative value, we will now make it unsigned integer, and instead add a separate boolean field for the sign of the number:
+Since $m$ is now a nonnegative value, we will now make it unsigned integer, and instead add a separate Boolean field for the sign of the number:
 
 ```cpp
 struct fp {
diff --git a/content/english/hpc/compilation/_index.md b/content/english/hpc/compilation/_index.md
index cbc0f691..07b0e07f 100644
--- a/content/english/hpc/compilation/_index.md
+++ b/content/english/hpc/compilation/_index.md
@@ -8,4 +8,4 @@ The main benefit of [learning assembly language](../architecture/assembly) is no
 
 There are rare cases where we *really* need to switch to handwritten assembly for maximal performance, but most of the time compilers are capable of producing near-optimal code all by themselves. When they do not, it is usually because the programmer knows more about the problem than what can be inferred from the source code, but failed to communicate this extra information to the compiler.
 
-In this chapter, we will discuss the intricacies of getting compiler to do exactly what we want and gathering useful information that can guide further optimizations.
+In this chapter, we will discuss the intricacies of getting the compiler to do exactly what we want and gathering useful information that can guide further optimizations.
diff --git a/content/english/hpc/complexity/_index.md b/content/english/hpc/complexity/_index.md
index 69cebf4c..c537c4ce 100644
--- a/content/english/hpc/complexity/_index.md
+++ b/content/english/hpc/complexity/_index.md
@@ -11,7 +11,7 @@ Complexity is an old concept. It was [systematically formulated](http://www.cs.a
 
 ### Classical Complexity Theory
 
-The "elementary operations" of a CPU are called *instructions*, and their "costs" are called *latencies*. Instructions are stored in *memory* and executed one by one by the processor, which has some internal *state* stored in a number of *registers*. One of these registers is the *instruction pointer* that indicates the address of the next instruction to read and execute. Each instruction changes the state of the processor in a certain way (including moving the instruction pointer), possibly modifies the main memory, and takes a different amount of *CPU cycles* to complete before the next one can be started.
+The "elementary operations" of a CPU are called *instructions*, and their "costs" are called *latencies*. Instructions are stored in *memory* and executed one by one by the processor, which has some internal *state* stored in a number of *registers*. One of these registers is the *instruction pointer* that indicates the address of the next instruction to read and execute. Each instruction changes the state of the processor in a certain way (including moving the instruction pointer), possibly modifies the main memory, and takes a different number of *CPU cycles* to complete before the next one can be started.
 
 To estimate the real running time of a program, you need to sum all latencies for its executed instructions and divide it by the *clock frequency*, that is, the number of cycles a particular CPU does per second.
diff --git a/content/english/hpc/complexity/hardware.md b/content/english/hpc/complexity/hardware.md
index 1d59d101..d1c950b6 100644
--- a/content/english/hpc/complexity/hardware.md
+++ b/content/english/hpc/complexity/hardware.md
@@ -4,9 +4,9 @@ weight: 1
 ignoreIndexing: true
 ---
 
-The main disadvantage of the supercomputers of the 1960s wasn't that they were slow — relatively speaking, they weren't — but that they were giant, complex to use, and so expensive that only the governments of the world superpowers could afford them. Their size was the reason they were so expensive: they required a lot of custom components that had to be very carefully assembled in the macro-world, by people holding advanced degrees in electrical engineering, in a process that couldn't be up-scaled for mass production.
+The main disadvantage of the supercomputers of the 1960s wasn't that they were slow — relatively speaking, they weren't — but that they were giant, complex to use, and so expensive that only the governments of the world superpowers could afford them. Their size was the reason they were so expensive: they required a lot of custom components that had to be very carefully assembled in the macro-world, by people holding advanced degrees in electrical engineering, in a process that couldn't be scaled up for mass production.
 
-The turning point was the development of *microchips* — single, tiny, complete circuits — which revolutionized the industry and turned out to be probably the most important invention of the 20th century. What was a multimillion-dollar cupboard of computing machinery in 1965 could in 1975 fit on a [4×4 mm slice of silicon](https://en.wikipedia.org/wiki/MOS_Technology_6502)[^size] that you can buy for $25. This dramatic improvement in affordability started the home computer revolution during the following decade, with computers like Apple II, Atari 2600, Commodore 64, and IBM PC becoming available to the masses.
+The turning point was the development of *microchips* — single, tiny, complete circuits — which revolutionized the industry and turned out to be probably the most important invention of the 20th century. What was a multimillion-dollar cupboard of computing machinery in 1965 could in 1975 fit on a [4mm × 4mm slice of silicon](https://en.wikipedia.org/wiki/MOS_Technology_6502)[^size] that you can buy for $25. This dramatic improvement in affordability started the home computer revolution during the following decade, with computers like Apple II, Atari 2600, Commodore 64, and IBM PC becoming available to the masses.
 
 [^size]: Actual sizes of CPUs are about centimeter-scale because of power management, heat dissipation, and the need to plug it into the motherboard without excessive swearing.
 
@@ -17,7 +17,7 @@ Microchips are "printed" on a slice of crystalline silicon using a process calle
 1. growing and slicing a [very pure silicon crystal](https://en.wikipedia.org/wiki/Wafer_(electronics)),
 2. covering it with a layer of [a substance that dissolves when photons hit it](https://en.wikipedia.org/wiki/Photoresist),
 3. hitting it with photons in a set pattern,
-4. chemically [etching](https://en.wikipedia.org/wiki/Etching_(microfabrication)) the now exposed parts,
+4. chemically [etching](https://en.wikipedia.org/wiki/Etching_(microfabrication)) the now-exposed parts,
 5. removing the remaining photoresist,
 
 …and then performing another 40-50 steps over several months to complete the rest of the CPU.
@@ -56,11 +56,11 @@ Throughout most of the computing history, optical shrinking was the main driving
 
 Both Dennard scaling and Moore's law are not actual laws of physics, but just observations made by savvy engineers. They are both destined to stop at some point due to fundamental physical limitations, the ultimate one being the size of silicon atoms. In fact, Dennard scaling already did — due to power issues.
-Thermodynamically, a computer is just a very efficient device for converting electrical power into heat. This heat eventually needs to be removed, and there are physical limits to how much power you can dissipate from a millimeter-scale crystal. Computer engineers, aiming to maximize performance, essentially just choose the maximum possible clock rate so that the overall power consumption stays the same. If transistors become smaller, they have less capacity, meaning less required voltage to flip them, which in turn allows increasing the clock rate.
+Thermodynamically, a computer is just a very efficient device for converting electrical power into heat. This heat eventually needs to be removed, and there are physical limits to how much power you can dissipate from a millimeter-scale crystal. Computer engineers, aiming to maximize performance, essentially just choose the maximum possible clock rate so that the overall power consumption stays the same. If transistors become smaller, they have less capacitance, meaning less required voltage to flip them, which in turn allows increasing the clock rate.
 
 Around 2005–2007, this strategy stopped working because of *leakage* effects: the circuit features became so small that their magnetic fields started to make the electrons in the neighboring circuitry move in directions they are not supposed to, causing unnecessary heating and occasional bit flipping.
 
-The only way to mitigate this is to increase voltage; and to balance off power consumption you need to reduce clock frequency, which in turn makes the whole process progressively less profitable as transistor density increases. At some point, clock rates could no longer be increased by scaling, and the miniaturization trend started to slow down.
+The only way to mitigate this is to increase the voltage; and to balance off power consumption you need to reduce clock frequency, which in turn makes the whole process progressively less profitable as transistor density increases. At some point, clock rates could no longer be increased by scaling, and the miniaturization trend started to slow down.
, but for now, you can assume that the CPU maintains a buffer of pending instructions up to some distance in the future, and executes them as soon as the values of its operands are computed and there is an execution unit available.
+You can only take advantage of superscalar processing if the stream of instructions contains groups of logically independent operations that can be processed separately. The instructions don't always arrive in the most convenient order, so, when possible, modern CPUs can execute them *out of order* to improve overall utilization and minimize pipeline stalls. How this magic works is a topic for a more advanced discussion, but for now, you can assume that the CPU maintains a buffer of pending instructions up to some distance in the future, and executes them as soon as the values of its operands are computed and there is an execution unit available.
 
 ### An Education Analogy
 
 Consider how our education system works:
 
 1. Topics are taught to groups of students instead of individuals as broadcasting the same things to everyone at once is more efficient.
-2. An intake of students is split into groups lead by different teachers; assignments and other course materials are shared between groups.
+2. An intake of students is split into groups led by different teachers; assignments and other course materials are shared between groups.
 3. Each year the same course is taught to a new intake so that the teachers are kept busy.
 These innovations greatly increase the *throughput* of the whole system, although the *latency* (time to graduation for a particular student) remains unchanged (and maybe increases a little bit because personalized tutoring is more effective).
diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md
index 280498b1..0f87da83 100644
--- a/content/english/hpc/pipelining/branchless.md
+++ b/content/english/hpc/pipelining/branchless.md
@@ -32,7 +32,7 @@ Suddenly, the loop now takes ~7 cycles per element instead of the original ~14.
 
 But wait… shouldn't there still be a branch? How does `(a[i] < 50)` map to assembly?
 
-There are no boolean types in assembly, nor any instructions that yield either one or zero based on the result of the comparison, but we can compute it indirectly like this: `(a[i] - 50) >> 31`. This trick relies on the [binary representation of integers](/hpc/arithmetic/integer), specifically on the fact that if the expression `a[i] - 50` is negative (implying `a[i] < 50`), then the highest bit of the result will be set to one, which we can then extract using a right-shift.
+There are no Boolean types in assembly, nor any instructions that yield either one or zero based on the result of the comparison, but we can compute it indirectly like this: `(a[i] - 50) >> 31`. This trick relies on the [binary representation of integers](/hpc/arithmetic/integer), specifically on the fact that if the expression `a[i] - 50` is negative (implying `a[i] < 50`), then the highest bit of the result will be set to one, which we can then extract using a right-shift.
 
 ```nasm
 mov ebx, eax ; t = x
@@ -101,7 +101,7 @@ In our example, the branchy code wins when the branch can be predicted with a pr
 
 ![](../img/branchy-vs-branchless.svg)
 
-This 75% threshold is commonly used by the compilers as a heuristic for determining whether to use the `cmov` or not. Unfortunately, this probability is usually unknown at the compile-time, so it needs to be provided in one of several ways:
+This 75% threshold is commonly used by the compilers as a heuristic for determining whether to use the `cmov` or not. Unfortunately, this probability is usually unknown at the compile time, so it needs to be provided in one of several ways:
 
 - We can use [profile-guided optimization](/hpc/compilation/situational/#profile-guided-optimization) which will decide for itself whether to use predication or not.
 - We can use [likeliness attributes](../branching#hinting-likeliness-of-branches) and [compiler-specific intrinsics](/hpc/compilation/situational) to hint at the likeliness of branches: `__builtin_expect_with_probability` in GCC and `__builtin_unpredictable` in Clang.
diff --git a/content/english/hpc/pipelining/hazards.md b/content/english/hpc/pipelining/hazards.md
index 02a0869d..d4a2d7df 100644
--- a/content/english/hpc/pipelining/hazards.md
+++ b/content/english/hpc/pipelining/hazards.md
@@ -20,6 +20,6 @@ Different hazards have different penalties:
 
 - In structural hazards, you have to wait (usually one more cycle) until the execution unit is ready. They are fundamental bottlenecks on performance and can't be avoided — you have to engineer around them.
 - In data hazards, you have to wait for the required data to be computed (the latency of the *critical path*). Data hazards are solved by restructuring computations so that the critical path is shorter.
-- In control hazards, you generally have to flush the entire pipeline and start over, wasting whole 15-20 cycles. They are solved by either removing branches completely, or making them predictable so that the CPU can effectively *speculate* on what is going to be executed next.
+- In control hazards, you generally have to flush the entire pipeline and start over, wasting a whole 15-20 cycles. They are solved by either removing branches completely, or making them predictable so that the CPU can effectively *speculate* on what is going to be executed next.
 
-As they have very different impact on performance, we are going to go in the reversed order and start with the more grave ones.
+As they have very different impacts on performance, we are going to go in the reversed order and start with the more grave ones.
diff --git a/content/english/hpc/pipelining/tables.md b/content/english/hpc/pipelining/tables.md
index 24678270..5f69c579 100644
--- a/content/english/hpc/pipelining/tables.md
+++ b/content/english/hpc/pipelining/tables.md
@@ -14,7 +14,7 @@ In this context, it makes sense to use two different "[costs](/hpc/complexity)"
-You can get latency and throughput numbers for a specific architecture from special documents called [instruction tables](https://www.agner.org/optimize/instruction_tables.pdf). Here are some samples values for my Zen 2 (all specified for 32-bit operands, if there is any difference):
+You can get latency and throughput numbers for a specific architecture from special documents called [instruction tables](https://www.agner.org/optimize/instruction_tables.pdf). Here are some sample values for my Zen 2 (all specified for 32-bit operands, if there is any difference):
 
 | Instruction | Latency | RThroughput |
 |-------------|---------|:------------|
@@ -34,7 +34,7 @@ Some comments:
 - If a certain instruction is especially frequent, its execution unit could be duplicated to increase its throughput — possibly to even more than one, but not higher than the [decode width](/hpc/architecture/layout).
 - Some instructions have a latency of 0. This means that these instruction are used to control the scheduler and don't reach the execution stage. They still have non-zero reciprocal throughput because the [CPU front-end](/hpc/architecture/layout) still needs to process them.
 - Most instructions are pipelined, and if they have the reciprocal throughput of $n$, this usually means that their execution unit can take another instruction after $n$ cycles (and if it is below 1, this means that there are multiple execution units, all capable of taking another instruction on the next cycle). One notable exception is the [integer division](/hpc/arithmetic/division): it is either very poorly pipelined or not pipelined at all.
-- Some instructions have variable latency, depending on not only the size, but also the values of the operands. For memory operations (including fused ones like `add`), latency is usually specified for the best case (an L1 cache hit).
+- Some instructions have variable latency, depending on not only the size, but also the values of the operands. For memory operations (including fused ones like `add`), the latency is usually specified for the best case (an L1 cache hit).
 
 There are many more important little details, but this mental model will suffice for now.
 
diff --git a/content/english/hpc/pipelining/throughput.md b/content/english/hpc/pipelining/throughput.md
index 27789b28..03562291 100644
--- a/content/english/hpc/pipelining/throughput.md
+++ b/content/english/hpc/pipelining/throughput.md
@@ -6,7 +6,7 @@ weight: 4
 Optimizing for *latency* is usually quite different from optimizing for *throughput*:
 
 - When optimizing data structure queries or small one-time or branchy algorithms, you need to [look up the latencies](../tables) of its instructions, mentally construct the execution graph of the computation, and then try to reorganize it so that the critical path is shorter.
-- When optimizing hot loops and large-dataset algorithms, you need to look up the throughputs of its instructions, count how many times each one is used per iteration, determine which of them is the bottleneck, and then try to restructure the loop so that it is used less often.
+- When optimizing hot loops and large-dataset algorithms, you need to look up the throughputs of their instructions, count how many times each one is used per iteration, determine which of them is the bottleneck, and then try to restructure the loop so that it is used less often.
 
 The last advice only works for *data-parallel* loops, where each iteration is fully independent of the previous one. When there is some interdependency between consecutive iterations, there may potentially be a pipeline stall caused by a [data hazard](../hazards) as the next iteration is waiting for the previous one to complete.
 
@@ -64,7 +64,7 @@ If an instruction has a latency of $x$ and a throughput of $y$, then you would n
 
 This technique is mostly used with [SIMD](/hpc/simd) and not in scalar code. You can [generalize](/hpc/simd/reduction) the code above and compute sums and other reductions faster than the compiler.
 
-In general, when optimizing loops, you usually have just one or a few *execution ports* that you want to utilize to their fullest, and you engineer the rest of the loop around them. As different instructions may use different sets of ports, it is not always clear which one is going to be the overused. In situations like this, [machine code analyzers](/hpc/profiling/mca) can be very helpful for finding bottlenecks of small assembly loops.
+In general, when optimizing loops, you usually have just one or a few *execution ports* that you want to utilize to their fullest, and you engineer the rest of the loop around them. As different instructions may use different sets of ports, it is not always clear which one is going to be overused. In situations like this, [machine code analyzers](/hpc/profiling/mca) can be very helpful for finding the bottlenecks of small assembly loops.
diff --git a/content/english/hpc/architecture/assembly.md b/content/english/hpc/architecture/assembly.md
index 00c7caac..de94e4cf 100644
--- a/content/english/hpc/architecture/assembly.md
+++ b/content/english/hpc/architecture/assembly.md
@@ -128,7 +128,7 @@ movl %eax, (%rdx)
 The key differences can be summarized as follows:
 
 1. The *last* operand is used to specify the destination.
-2. Registers and constants need to be prefixed by `%` and `$` respectively (e. g. `addl $1, %rdx` increments `rdx`).
+2. Registers and constants need to be prefixed by `%` and `$` respectively (e.g., `addl $1, %rdx` increments `rdx`).
 3. Memory addressing looks like this: `displacement(%base, %index, scale)`.
 4. Both `;` and `#` can be used for line comments, and also `/* */` can be used for block comments.
 
diff --git a/content/english/hpc/architecture/functions.md b/content/english/hpc/architecture/functions.md
index 412fc027..3f98a381 100644
--- a/content/english/hpc/architecture/functions.md
+++ b/content/english/hpc/architecture/functions.md
@@ -18,7 +18,7 @@ The hardware stack works the same way software stacks do and is similarly implem
 - The *base pointer* marks the start of the stack and is conventionally stored in `rbp`.
 - The *stack pointer* marks the last element of the stack and is conventionally stored in `rsp`.
 
-When you need to call a function, you push all your local variables onto the stack (which you can also do in other circumstances, e.g. when you run out of registers), push the current instruction pointer, and then jump to the beginning of the function. When exiting from a function, you look at the pointer stored on top of the stack, jump there, and then carefully read all the variables stored on the stack back into their registers.
+When you need to call a function, you push all your local variables onto the stack (which you can also do in other circumstances; e.g., when you run out of registers), push the current instruction pointer, and then jump to the beginning of the function. When exiting from a function, you look at the pointer stored on top of the stack, jump there, and then carefully read all the variables stored on the stack back into their registers.
@@ -249,7 +249,7 @@ Apart from requiring much less memory, which is good for fitting into the CPU ca
 
 To improve the performance further, we can:
 
-- manually optimize the index arithmetic (e. g. noticing that we need to multiply `v` by `2` either way),
+- manually optimize the index arithmetic (e.g., noticing that we need to multiply `v` by `2` either way),
 - replace division by two with an explicit binary shift (because [compilers aren't always able to do it themselves](/hpc/compilation/contracts/#arithmetic)),
 - and, most importantly, get rid of [recursion](/hpc/architecture/functions) and make the implementation fully iterative.
 
@@ -724,7 +724,7 @@ This makes both queries much slower — especially the reduction — but this sh
 
 **Minimum** is a nice exception where the update query can be made slightly faster if the new value of the element is less than the current one: we can skip the horizontal reduction part and just update $\log_B n$ nodes using a scalar procedure.
 
-This works very fast when we mostly have such updates, which is the case e. g. for the sparse-graph Dijkstra algorithm when we have more edges than vertices. For this problem, the wide segment tree can serve as an efficient fixed-universe min-heap.
+This works very fast when we mostly have such updates, which is the case, e.g., for the sparse-graph Dijkstra algorithm when we have more edges than vertices. For this problem, the wide segment tree can serve as an efficient fixed-universe min-heap.
 **Lazy propagation** can be done by storing a separate array for the delayed operations in a node. To propagate the updates, we need to go top to bottom (which can be done by simply reversing the direction of the `for` loop and using `k >> (h * b)` to calculate the `h`-th ancestor), [broadcast](/hpc/simd/moving/#broadcast) and reset the delayed operation value stored in the parent of the current node, and apply it to all values stored in the current node with SIMD.
 
diff --git a/content/english/hpc/external-memory/hierarchy.md b/content/english/hpc/external-memory/hierarchy.md
index 35670da9..f0ca9c65 100644
--- a/content/english/hpc/external-memory/hierarchy.md
+++ b/content/english/hpc/external-memory/hierarchy.md
@@ -40,8 +40,8 @@ Everything up to the RAM level is called *volatile memory* because it does not p
 
 From fastest to slowest:
 
-- **CPU registers**, which are the zero-time access data cells CPU uses to store all its intermediate values, can also be thought of as a memory type. There is only a limited number of them (e. g. 16 "general purpose" ones), and in some cases, you may want to use all of them for performance reasons.
-- **CPU caches.** Modern CPUs have multiple layers of cache (L1, L2, often L3, and rarely even L4). The lowest layer is shared between cores and is usually scaled with their number (e. g. a 10-core CPU should have around 10M of L3 cache).
+- **CPU registers**, which are the zero-time access data cells CPU uses to store all its intermediate values, can also be thought of as a memory type. There is only a limited number of them (e.g., just 16 "general purpose" ones), and in some cases, you may want to use all of them for performance reasons.
+- **CPU caches.** Modern CPUs have multiple layers of cache (L1, L2, often L3, and rarely even L4). The lowest layer is shared between cores and is usually scaled with their number (e.g., a 10-core CPU should have around 10M of L3 cache).
- **Random access memory,** which is the first scalable type of memory: nowadays you can rent machines with half a terabyte of RAM on the public clouds. This is the one where most of your working data is supposed to be stored. The CPU cache system has an important concept of a *cache line*, which is the basic unit of data transfer between the CPU and the RAM. The size of a cache line is 64 bytes on most architectures, meaning that all main memory is divided into blocks of 64 bytes, and whenever you request (read or write) a single byte, you are also fetching all its 63 cache line neighbors whether your want them or not. diff --git a/content/english/hpc/external-memory/oblivious.md b/content/english/hpc/external-memory/oblivious.md index a0327855..93c4f2fc 100644 --- a/content/english/hpc/external-memory/oblivious.md +++ b/content/english/hpc/external-memory/oblivious.md @@ -118,7 +118,7 @@ It seems like we can't do better, but it turns out we can. ### Algorithm -Cache-oblivious matrix multiplication relies on essentially the same trick as the transposition. We need to divide the data until it fits into lowest cache (i. e. $N^2 \leq M$). For matrix multiplication, this equates to using this formula: +Cache-oblivious matrix multiplication relies on essentially the same trick as the transposition. We need to divide the data until it fits into lowest cache (i.e., $N^2 \leq M$). 
For matrix multiplication, this equates to using this formula: $$ \begin{pmatrix} diff --git a/content/english/hpc/external-memory/virtual.md b/content/english/hpc/external-memory/virtual.md index 6535283d..92bb454c 100644 --- a/content/english/hpc/external-memory/virtual.md +++ b/content/english/hpc/external-memory/virtual.md @@ -19,7 +19,7 @@ Virtual memory gives each process the impression that it fully controls a contig To achieve this, the memory address space is divided into *pages* (typically 4KB in size), which are the base units of memory that the programs can request from the operating system. The memory system maintains a special hardware data structure called the *page table*, which contains the mappings of virtual page addresses to the physical ones. When a process accesses data using its virtual memory address, the memory system calculates its page number (by right-shifting it by $12$ if $4096=2^{12}$ is the page size), looks up in the page table that its physical address is, and forwards the read or write request to where that data is actually stored. -Since the address translation needs to be done for each memory request, and the number of memory pages itself may be large (e. g. 16G RAM / 4K page size = 4M pages), address translation poses a difficult problem in itself. One way to speed it up is to use a special cache for the page table itself called *translation lookaside buffer* (TLB), and the other is to [increase the page size](/hpc/cpu-cache/paging) so that the total number of memory pages is made smaller at the cost of reduced granularity. +Since the address translation needs to be done for each memory request, and the number of memory pages itself may be large (e.g., 16G RAM / 4K page size = 4M pages), address translation poses a difficult problem in itself. 
One way to speed it up is to use a special cache for the page table itself called *translation lookaside buffer* (TLB), and the other is to [increase the page size](/hpc/cpu-cache/paging) so that the total number of memory pages is made smaller at the cost of reduced granularity. -Interleaving the stages of execution is a general idea in digital electronics, and it is applied not only in the main CPU pipeline, but also on the level of separate instructions and [memory](/hpc/cpu-cache/mlp). Most execution units have their own little pipelines, and can take another instruction just one or two cycles after the previous one. If a certain instruction is frequently used, it makes sense to duplicate its execution unit also, and also place frequently jointly used instructions on the same execution unit: e. g. not using the same for arithmetic and memory operation. +Interleaving the stages of execution is a general idea in digital electronics, and it is applied not only in the main CPU pipeline, but also on the level of separate instructions and [memory](/hpc/cpu-cache/mlp). Most execution units have their own little pipelines, and can take another instruction just one or two cycles after the previous one. If a certain instruction is frequently used, it makes sense to duplicate its execution unit also, and also place frequently jointly used instructions on the same execution unit: e.g., not using the same for arithmetic and memory operation. ### Microcode diff --git a/content/english/hpc/pipelining/throughput.md b/content/english/hpc/pipelining/throughput.md index 03562291..0b596404 100644 --- a/content/english/hpc/pipelining/throughput.md +++ b/content/english/hpc/pipelining/throughput.md @@ -84,7 +84,7 @@ Bandwidth is the rate at which data can be read or stored. For the purpose of de In the previous version, we have an inherently sequential chain of operations in the innermost loop. We accumulate the minimum in variable v by a sequence of min operations. 
There is no way to start the second operation before we know the result of the first operation; there is no room for parallelism here: -The result will be clearly the same, but we are calculating the operations in a different order. In essence, we split the work in two independent parts, calculating the minimum of odd elements and the minimum of even elements, and finally combining the results. If we calculate the odd minimum v0 and even minimum v1 in an interleaved manner, as shown above, we will have more opportunities for parallelism. For example, the 1st and 2nd operation could be calculated simultaneously in parallel (or they could be executed in a pipelined fashion in the same execution unit). Once these results are available, the 3rd and 4th operation could be calculated simultaneously in parallel, etc. We could potentially obtain a speedup of a factor of 2 here, and naturally the same idea could be extended to calculating e.g. 4 minimums in an interleaved fashion. +The result will be clearly the same, but we are calculating the operations in a different order. In essence, we split the work in two independent parts, calculating the minimum of odd elements and the minimum of even elements, and finally combining the results. If we calculate the odd minimum v0 and even minimum v1 in an interleaved manner, as shown above, we will have more opportunities for parallelism. For example, the 1st and 2nd operation could be calculated simultaneously in parallel (or they could be executed in a pipelined fashion in the same execution unit). Once these results are available, the 3rd and 4th operation could be calculated simultaneously in parallel, etc. We could potentially obtain a speedup of a factor of 2 here, and naturally the same idea could be extended to calculating, e.g., 4 minimums in an interleaved fashion. 
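A minimal sketch of the interleaved computation described above, assuming a plain `int` array. The function name `interleaved_min` and its signature are hypothetical and not taken from the original code. Each accumulator forms its own dependency chain, which is what gives the CPU room to overlap the operations:

```c++
#include <algorithm>
#include <cassert>
#include <cstddef>

// Minimum of a[0..n-1] computed with two independent accumulators.
// v0 follows the elements at even positions and v1 those at odd positions,
// so two consecutive min operations never depend on each other.
int interleaved_min(const int *a, std::size_t n) {
    assert(n > 0); // the minimum of an empty range is not defined here
    int v0 = a[0], v1 = a[0];
    std::size_t i = 0;
    for (; i + 1 < n; i += 2) {
        v0 = std::min(v0, a[i]);     // dependency chain 1
        v1 = std::min(v1, a[i + 1]); // dependency chain 2, independent of chain 1
    }
    if (i < n) // odd length: one element is left over
        v0 = std::min(v0, a[i]);
    return std::min(v0, v1); // combine the two partial results
}
```

The same pattern should extend to four or more accumulators. Whether it actually yields a speedup depends on the hardware and on what the compiler already does on its own.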
Instruction-level parallelism is automatic Now that we know how to reorganize calculations so that there is potential for parallelism, we will need to know how to realize the potential. For example, if we have these two operations in the C++ code, how do we tell the computer that the operations can be safely executed in parallel? diff --git a/content/english/hpc/profiling/_index.md b/content/english/hpc/profiling/_index.md index 0b7ca30f..ceca0f2f 100644 --- a/content/english/hpc/profiling/_index.md +++ b/content/english/hpc/profiling/_index.md @@ -10,7 +10,7 @@ There are many different types of profilers. I like to think about them by analo - When objects are on a micrometer scale, they use optical microscopes. - When objects are on a nanometer scale, and light no longer interacts with them, they use electron microscopes. -- When objects are smaller than that (e. g. the insides of an atom), they resort to theories and assumptions about how things work (and test these assumptions using intricate and indirect experiments). +- When objects are smaller than that (e.g., the insides of an atom), they resort to theories and assumptions about how things work (and test these assumptions using intricate and indirect experiments). 
Similarly, there are three main profiling techniques, each operating by its own principles, having distinct areas of applicability, and allowing for different levels of precision: diff --git a/content/english/hpc/profiling/benchmarking.md b/content/english/hpc/profiling/benchmarking.md index d873ca62..dd543bcc 100644 --- a/content/english/hpc/profiling/benchmarking.md +++ b/content/english/hpc/profiling/benchmarking.md @@ -59,7 +59,7 @@ Although *efficient* in terms of execution speed, C and C++ are not the most *pr One way to improve modularity and reusability is to separate all testing and analytics code from the actual implementation of the algorithm, and also make it so that different versions are implemented in separate files, but have the same interface. -In C/C++, you can do this by creating a single header file (e. g. `gcd.hh`) with a function interface and all its benchmarking code in `main`: +In C/C++, you can do this by creating a single header file (e.g., `gcd.hh`) with a function interface and all its benchmarking code in `main`: ```c++ int gcd(int a, int b); // to be implemented @@ -93,7 +93,7 @@ int main() { } ``` -Then you create many implementation files for each algorithm version (e. g. `v1.cc`, `v2.cc` and so on, or some meaningful names if applicable) that all include that single header file: +Then you create many implementation files for each algorithm version (e.g., `v1.cc`, `v2.cc`, and so on, or some meaningful names if applicable) that all include that single header file: ```c++ #include "gcd.hh" diff --git a/content/english/hpc/profiling/events.md b/content/english/hpc/profiling/events.md index 71ae9cd3..eb2ba613 100644 --- a/content/english/hpc/profiling/events.md +++ b/content/english/hpc/profiling/events.md @@ -93,7 +93,7 @@ Overhead Command Shared Object Symbol 0.80% run libc-2.33.so [.] rand ``` -Note that, for each function, just its *overhead* is listed and not the total running time (e. g. 
`setup` includes `std::__introsort_loop` but only its own overhead is accounted as 3.43%). There are tools for constructing [flame graphs](https://www.brendangregg.com/flamegraphs.html) out of perf reports to make them more clear. You also need to account for possible inlining, which is apparently what happened with `std::lower_bound` here. Perf also tracks shared libraries (like `libc`) and, in general, any other spawned processes: if you want, you can launch a web browser with perf and see what's happening inside. +Note that, for each function, just its *overhead* is listed and not the total running time (e.g., `setup` includes `std::__introsort_loop` but only its own overhead is accounted as 3.43%). There are tools for constructing [flame graphs](https://www.brendangregg.com/flamegraphs.html) out of perf reports to make them more clear. You also need to account for possible inlining, which is apparently what happened with `std::lower_bound` here. Perf also tracks shared libraries (like `libc`) and, in general, any other spawned processes: if you want, you can launch a web browser with perf and see what's happening inside. Next, you can "zoom in" on any of these functions, and, among others things, it will offer to show you its disassembly with an associated heatmap. For example, here is the assembly for `query`: diff --git a/content/english/hpc/profiling/mca.md b/content/english/hpc/profiling/mca.md index 4634ba25..99cfe2ed 100644 --- a/content/english/hpc/profiling/mca.md +++ b/content/english/hpc/profiling/mca.md @@ -40,7 +40,7 @@ First, it outputs general information about the loop and the hardware: - It "ran" the loop 100 times, executing 400 instructions in total in 108 cycles, which is the same as executing $\frac{400}{108} \approx 3.7$ [instructions per cycle](/hpc/complexity/hardware) on average (IPC). - The CPU is theoretically capable of executing up to 6 instructions per cycle ([dispatch width](/hpc/architecture/layout)). 
- Each cycle in theory can be executed in 0.8 cycles on average ([block reciprocal throughput](/hpc/pipelining/tables)). -- The "uOps" here are the micro-operations that CPU splits each instruction into (e. g. fused load-add is composed of two uOps). +- The "uOps" here are the micro-operations that the CPU splits each instruction into (e.g., fused load-add is composed of two uOps). Then it proceeds to give information about each individual instruction: diff --git a/content/english/hpc/profiling/noise.md b/content/english/hpc/profiling/noise.md index 8dcdb032..74ff0272 100644 --- a/content/english/hpc/profiling/noise.md +++ b/content/english/hpc/profiling/noise.md @@ -87,7 +87,7 @@ for (int i = 0; i < N; i++) checksum ^= lower_bound(checksum ^ q[i]); ``` -It usually makes the most difference in algorithms with possible pipeline stall issues, e. g. when comparing branchy and branch-free algorithms. +It usually makes the most difference in algorithms with possible pipeline stall issues, e.g., when comparing branchy and branch-free algorithms. **Cold cache.** Another source of bias is the *cold cache effect*, when memory reads initially take longer time because the required data is not in cache yet. @@ -130,7 +130,7 @@ The issues we've described produce *bias* in measurements: they consistently giv These type of issues are caused by side effects and some sort of external noise, mostly due to noisy neighbors and CPU frequency scaling: - If you benchmark a compute-bound algorithm, measure its performance in cycles using `perf stat`: this way it will be independent of clock frequency, fluctuations of which is usually the main source of noise. -- Otherwise, set core frequency to the what you expect it to be and make sure nothing interferes with it. On Linux you can do it with `cpupower` (e. g. `sudo cpupower frequency-set -g powersave` to put it to minimum or `sudo cpupower frequency-set -g ondemand` to enable turbo boost). 
I use a [convenient GNOME shell extension](https://extensions.gnome.org/extension/1082/cpufreq/) that has a separate button to do it. +- Otherwise, set core frequency to what you expect it to be and make sure nothing interferes with it. On Linux you can do it with `cpupower` (e.g., `sudo cpupower frequency-set -g powersave` to put it to minimum or `sudo cpupower frequency-set -g ondemand` to enable turbo boost). I use a [convenient GNOME shell extension](https://extensions.gnome.org/extension/1082/cpufreq/) that has a separate button to do it. - If applicable, turn hyper-threading off and attach jobs to specific cores. Make sure no other jobs are running on the system, turn off networking and try not to fiddle with the mouse. You can't remove noises and biases completely. Even a program's name can affect its speed: the executable's name ends up in an environment variable, environment variables end up on the call stack, and so the length of the name affects stack alignment, which can result in data accesses slowing down due to crossing cache line or memory page boundaries. diff --git a/content/english/hpc/profiling/simulation.md b/content/english/hpc/profiling/simulation.md index 2f6c6dc6..75401b8a 100644 --- a/content/english/hpc/profiling/simulation.md +++ b/content/english/hpc/profiling/simulation.md @@ -50,7 +50,7 @@ Mispred rate: 22.0% ( 22.5% + 0.0% ) We've fed Cachegrind exactly the same example code as in [the previous section](../events): we create an array of a million random integers, sort it, and then perform a million binary searches on it. Cachegrind shows roughly the same numbers as perf does, except that that perf's measured numbers of memory reads and branches are slightly inflated due to [speculative execution](/hpc/pipelining): they really happen in hardware and thus increment hardware counters, but are discarded and don't affect actual performance, and thus ignored in the simulation.
-Cachegrind only models the first (`D1` for data, `I1` for instructions) and the last (`LL`, unified) levels of cache, the characteristics of which are inferred from the system. It doesn't limit you in any way as you can also set them from the command line, e. g. to model the L2 cache: `--LL=,,`. +Cachegrind only models the first (`D1` for data, `I1` for instructions) and the last (`LL`, unified) levels of cache, the characteristics of which are inferred from the system. It doesn't limit you in any way as you can also set them from the command line, e.g., to model the L2 cache: `--LL=,,`. It seems like it only slowed down our program so far and hasn't provided us any information that `perf stat` couldn't. To get more out of it than just the summary info, we can inspect a special file with profiling info, which it dumps by default in the same directory named as `cachegrind.out.`. It is human-readable, but is expected to be read via the `cg_annotate` command: diff --git a/content/english/hpc/simd/intrinsics.md b/content/english/hpc/simd/intrinsics.md index e091ddb6..4e9c6804 100644 --- a/content/english/hpc/simd/intrinsics.md +++ b/content/english/hpc/simd/intrinsics.md @@ -95,7 +95,7 @@ for (int i = 0; i < 100; i += 4) { The main challenge of using SIMD is getting the data into contiguous fixed-sized blocks suitable for loading into registers. In the code above, we may in general have a problem if the length of the array is not divisible by the block size. There are two common solutions to this: -1. We can "overshoot" by iterating over the last incomplete segment either way. To make sure we don't segfault by trying to read from or write to a memory region we don't own, we need to pad the arrays to the nearest block size (typically with some "neutral" element, e. g. zero). +1. We can "overshoot" by iterating over the last incomplete segment either way.
To make sure we don't segfault by trying to read from or write to a memory region we don't own, we need to pad the arrays to the nearest block size (typically with some "neutral" element, e.g., zero). 2. Make one iteration less and write a little loop in the end that calculates the remainder normally (with scalar operations). Humans prefer #1 because it is simpler and results in less code, and compilers prefer #2 because they don't really have another legal option. @@ -135,7 +135,7 @@ Also, some of the intrinsics don't map to a single instruction but a short seque diff --git a/content/english/hpc/simd/reduction.md b/content/english/hpc/simd/reduction.md index 078983d2..c67c1942 100644 --- a/content/english/hpc/simd/reduction.md +++ b/content/english/hpc/simd/reduction.md @@ -3,7 +3,7 @@ title: Sums and Other Reductions weight: 3 --- -*Reduction* (also known as *folding* in functional programming) is the action of computing the value of some associative and commutative operation (i.e. $(a \circ b) \circ c = a \circ (b \circ c)$ and $a \circ b = b \circ a$) over a range of arbitrary elements. +*Reduction* (also known as *folding* in functional programming) is the action of computing the value of some associative and commutative operation (i.e., $(a \circ b) \circ c = a \circ (b \circ c)$ and $a \circ b = b \circ a$) over a range of arbitrary elements. The simplest example of reduction is calculating the sum an array: @@ -68,7 +68,7 @@ int hsum(__m256i x) { } ``` -There are [other similar instructions](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#techs=AVX,AVX2&ig_expand=3037,3009,5135,4870,4870,4872,4875,833,879,874,849,848,6715,4845&text=horizontal), e. g. for integer multiplication or calculating absolute differences between adjacent elements (used in image processing). 
+There are [other similar instructions](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#techs=AVX,AVX2&ig_expand=3037,3009,5135,4870,4870,4872,4875,833,879,874,849,848,6715,4845&text=horizontal), e.g., for integer multiplication or calculating absolute differences between adjacent elements (used in image processing). There is also one specific instruction, `_mm_minpos_epu16`, that calculates the horizontal minimum and its index among eight 16-bit integers. This is the only horizontal reduction that works in one go: all others are computed in multiple steps. diff --git a/content/english/hpc/slides/01-intro/_index.md b/content/english/hpc/slides/01-intro/_index.md index 492ceb6a..615a89aa 100644 --- a/content/english/hpc/slides/01-intro/_index.md +++ b/content/english/hpc/slides/01-intro/_index.md @@ -151,7 +151,7 @@ Also a clear path to improvement: just make lenses stronger and chips smaller $\implies$ Each new "generation" should have roughly the same total cost, but 40% higher clock and twice as many transistors -(which can be used e. g. to add new instructions or increase the word size) +(which can be used, e.g., to add new instructions or increase the word size) ---- diff --git a/content/english/hpc/stats.md b/content/english/hpc/stats.md index 6e436d15..15d81e39 100644 --- a/content/english/hpc/stats.md +++ b/content/english/hpc/stats.md @@ -18,7 +18,7 @@ A **random variable** is any variable whose value depends on an outcome of a ran 2. $\forall x \in X, 0 \leq P \leq 1$. 3. $\sum_{x \in X} P(x) = 1$. -For example, consider a random variable $X$ with $k$ discrete states (e. g. the result of a die toss). We can place a *uniform distribution* on $X$ — that is, make each of its states equally likely — by setting its probability distribution to: +For example, consider a random variable $X$ with $k$ discrete states (e.g., the result of a die toss). 
We can place a *uniform distribution* on $X$ — that is, make each of its states equally likely — by setting its probability distribution to: $$ P(x=x_i) = \frac{1}{k} @@ -121,7 +121,7 @@ The last transition is true because it is a sum of harmonic series. ### Order Statistics -There is a slight modification of quicksort called quickselect that allows finding the $k$-th smallest element in $O(n)$ time, which is useful when we need to quickly compute order statistics, e. g. medians or 75-th quantiles. +There is a slight modification of quicksort called quickselect that allows finding the $k$-th smallest element in $O(n)$ time, which is useful when we need to quickly compute order statistics; e.g., medians or 75-th quantiles. 1. Select a random element $p$ from the array. 2. Partition the array into two arrays $L$ and $R$ using the predicate $a_i > p$. From 3bb8fad0b2f4c9bfeca09d3dfd8e2c9d24763184 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 18 May 2022 10:57:34 +0300 Subject: [PATCH 459/531] amount/number and much/many --- content/english/hpc/algorithms/gcd.md | 2 +- content/english/hpc/arithmetic/float.md | 2 +- content/english/hpc/arithmetic/ieee-754.md | 4 ++-- content/english/hpc/compilation/precalc.md | 2 +- content/english/hpc/cpu-cache/alignment.md | 2 +- content/english/hpc/data-structures/binary-search.md | 2 +- content/english/hpc/external-memory/locality.md | 2 +- content/english/hpc/external-memory/sorting.md | 4 ++-- content/english/hpc/profiling/benchmarking.md | 2 +- content/english/hpc/simd/shuffling.md | 2 +- 10 files changed, 12 insertions(+), 12 deletions(-) diff --git a/content/english/hpc/algorithms/gcd.md b/content/english/hpc/algorithms/gcd.md index 63efdec9..d56be8f7 100644 --- a/content/english/hpc/algorithms/gcd.md +++ b/content/english/hpc/algorithms/gcd.md @@ -135,7 +135,7 @@ int gcd(int a, int b) { Let's run it, and… it sucks. The difference in speed compared to `std::gcd` is indeed 2x, but on the other side of the equation. 
This is mainly because of all the branching needed to differentiate between the cases. Let's start optimizing. -First, let's replace all divisions by 2 with divisions by whichever highest power of 2 we can. We can do it efficiently with `__builtin_ctz`, the "count trailing zeros" instruction available on modern CPUs. Whenever we are supposed to divide by 2 in the original algorithm, we will call this function instead, which will give us the exact amount to right-shift the number by. Assuming that the we are dealing with large random numbers, this is expected to decrease the number of iterations by almost a factor 2, because $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots \to 2$. +First, let's replace all divisions by 2 with divisions by whichever highest power of 2 we can. We can do it efficiently with `__builtin_ctz`, the "count trailing zeros" instruction available on modern CPUs. Whenever we are supposed to divide by 2 in the original algorithm, we will call this function instead, which will give us the exact number of bits to right-shift the number by. Assuming that we are dealing with large random numbers, this is expected to decrease the number of iterations by almost a factor of 2, because $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots \to 2$. Second, we can notice that condition 2 can now only be true once — in the very beginning — because every other identity leaves at least one of the numbers odd. Therefore we can handle this case just once in the beginning and not consider it in the main loop.
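The two optimizations above could be sketched as follows. This is an illustrative version, not necessarily the implementation the article arrives at: the name `binary_gcd` is hypothetical, the inputs are assumed to be non-negative, and `__builtin_ctz` is the GCC/Clang builtin, whose result is undefined for zero (hence the early returns).

```c++
#include <algorithm>
#include <cassert>

// Binary GCD in which every run of divisions by 2 is replaced by a single
// shift, and the "both numbers are even" case is handled once up front.
int binary_gcd(int a, int b) {
    assert(a >= 0 && b >= 0); // assumption: non-negative inputs only
    if (a == 0) return b;
    if (b == 0) return a;
    int az = __builtin_ctz(a);    // number of trailing zero bits in a
    int bz = __builtin_ctz(b);    // number of trailing zero bits in b
    int shift = std::min(az, bz); // common power of two, restored at the end
    a >>= az;                     // from here on, a is odd
    b >>= bz;                     // and so is b
    while (a != b) {
        if (a > b) std::swap(a, b);
        b -= a;                   // odd minus odd is even, and nonzero since a != b
        b >>= __builtin_ctz(b);   // strip all factors of two at once
    }
    return a << shift;
}
```

For example, `binary_gcd(12, 18)` strips the common factor of two, reduces the odd parts 3 and 9 to 3, and shifts the result back to 6.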
diff --git a/content/english/hpc/arithmetic/float.md b/content/english/hpc/arithmetic/float.md index 70217a91..dcc33039 100644 --- a/content/english/hpc/arithmetic/float.md +++ b/content/english/hpc/arithmetic/float.md @@ -9,7 +9,7 @@ The users of floating-point arithmetic deserve one of these IQ bell curve memes - Then they discover that `0.1 + 0.2 != 0.3` or some other quirk like that, freak out, start thinking that some random error term is added to every computation, and for many years avoid any real data types completely. - Then they finally man up, read the specification of how IEEE-754 floats work and start using them appropriately. -Too many people are unfortunately still at stage 2, breeding various misconceptions about floating-point arithmetic — thinking that it is fundamentally imprecise and unstable, and slower than integer arithmetic. +Unfortunately, too many people are still at stage 2, breeding various misconceptions about floating-point arithmetic — thinking that it is fundamentally imprecise and unstable, and slower than integer arithmetic. ![](../img/iq.svg) diff --git a/content/english/hpc/arithmetic/ieee-754.md b/content/english/hpc/arithmetic/ieee-754.md index 06d58e4d..65cc5f48 100644 --- a/content/english/hpc/arithmetic/ieee-754.md +++ b/content/english/hpc/arithmetic/ieee-754.md @@ -52,7 +52,7 @@ Their availability ranges from chip to chip: - Most CPUs support single- and double-precision — which is what `float` and `double` types refer to in C. - Extended formats are exclusive to x86, and are available in C as the `long double` type, which falls back to double precision on arm. The choice of 64 bits for mantissa is so that every `long long` integer can be represented exactly. There is also a 40-bit format that similarly allocates 32 mantissa bits. - Quadruple as well as the 256-bit "octuple" formats are only used for specific scientific computations and are not supported by general-purpose hardware. 
-- Half-precision arithmetic only supports a small subset of operations and is generally used for machine learning applications, especially neural networks, because they tend to do a large amount of calculation, but don't require a high level of precision. +- Half-precision arithmetic only supports a small subset of operations and is generally used for applications such as machine learning, especially neural networks, because they tend to perform large amounts of calculations but don't require high levels of precision. - Half-precision is being gradually replaced by bfloat, which trades off 3 mantissa bits to have the same range as single-precision, enabling interoperability with it. It is mostly being adopted by specialized hardware: TPUs, FGPAs, and GPUs. The name stands for "[Brain](https://en.wikipedia.org/wiki/Google_Brain) float." Lower-precision types need less memory bandwidth to move them around and usually take fewer cycles to operate on (e.g., the division instruction may take $x$, $y$, or $z$ cycles depending on the type), which is why they are preferred when error tolerance allows it. @@ -77,7 +77,7 @@ This is a complex mechanism that deserves an article of its own, but since this ### NaNs, Zeros and Infinities -Floating-point arithmetic often deals with noisy, real-world data, and exceptions there are much more common than in the integer case. For this reason, the default behavior is different. Instead of crashing, the result is substituted with a special value without interrupting the executing, unless the programmer explicitly wants to. +Floating-point arithmetic often deals with noisy, real-world data. Exceptions there are much more common than in the integer case, and for this reason, the default behavior when handling them is different. Instead of crashing, the result is substituted with a special value without interrupting the program execution (unless the programmer explicitly wants it to). 
The first type of such value is the two infinities: a positive and a negative one. They are generated if the result of an operation can't fit within the representable range, and they are treated as such in arithmetic. diff --git a/content/english/hpc/compilation/precalc.md b/content/english/hpc/compilation/precalc.md index 29b31cd6..4a7cb7b7 100644 --- a/content/english/hpc/compilation/precalc.md +++ b/content/english/hpc/compilation/precalc.md @@ -37,7 +37,7 @@ constexpr int fibonacci(int n) { } ``` -There used to be much more limitations in earlier C++ standards, like you could not use any sort of state inside them and had to rely on recursion, so the whole process felt more like Haskell programming rather than C++. Since C++17, you can even compute static arrays using the imperative style, which is useful for precomputing lookup tables: +There used to be many more limitations in earlier C++ standards, like you could not use any sort of state inside them and had to rely on recursion, so the whole process felt more like Haskell programming rather than C++. Since C++17, you can even compute static arrays using the imperative style, which is useful for precomputing lookup tables: ```c++ struct Precalc { diff --git a/content/english/hpc/cpu-cache/alignment.md b/content/english/hpc/cpu-cache/alignment.md index 83d62310..59579467 100644 --- a/content/english/hpc/cpu-cache/alignment.md +++ b/content/english/hpc/cpu-cache/alignment.md @@ -77,7 +77,7 @@ This potentially wastes space but saves a lot of CPU cycles. This trade-off is m ### Optimizing Member Order -Padding is only inserted before a not-yet-aligned member or at the end of the structure. By changing the ordering of members in a structure, it is possible to change the required amount of padding bytes and the total size of the structure. +Padding is only inserted before a not-yet-aligned member or at the end of the structure. 
By changing the ordering of members in a structure, it is possible to change the required number of padding bytes and the total size of the structure. In the previous example, we could reorder the structure members like this: diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 7c408228..48bf07b4 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -313,7 +313,7 @@ Here we query the array of $[1, …, 8]$ for the lower bound of $x=4$. We compar The trick is to notice that, unless the answer is the last element of the array, we compare $x$ against it at some point, and after we've learned that it is not less than $x$, we start comparing $x$ against elements to the left, and all these comparisons evaluate true (that is, leading to the right). Therefore, to restore the answer, we just need to "cancel" some number of right turns. -This can be done in an elegant way by observing that the right turns are recorded in the binary representation of $k$ as 1-bits, and so we just need to find the number of trailing ones in the binary representation and right-shift $k$ by exactly that amount. To do this, we can invert the number (`~k`) and call the "find first set" instruction: +This can be done in an elegant way by observing that the right turns are recorded in the binary representation of $k$ as 1-bits, and so we just need to find the number of trailing ones in the binary representation and right-shift $k$ by exactly that. 
To do this, we can invert the number (`~k`) and call the "find first set" instruction: ```c++ int lower_bound(int x) { diff --git a/content/english/hpc/external-memory/locality.md b/content/english/hpc/external-memory/locality.md index eca83766..e61cb5a3 100644 --- a/content/english/hpc/external-memory/locality.md +++ b/content/english/hpc/external-memory/locality.md @@ -174,7 +174,7 @@ The AoS layout is usually preferred for data structures, but SoA still has good This difference in design is important in data processing applications. For example, databases can be either *row-* or *column-oriented* (also called *columnar*): -- *Row-oriented* storage formats are used when you need to search for a limited amount of objects in a large dataset and fetch all or most of their fields. Examples: PostgreSQL, MongoDB. +- *Row-oriented* storage formats are used when you need to search for a limited number of objects in a large dataset and/or fetch all or most of their fields. Examples: PostgreSQL, MongoDB. - *Columnar* storage formats are used for big data processing and analytics, where you need to scan through everything anyway to calculate certain statistics. Examples: ClickHouse, Hbase. Columnar formats have the additional advantage that you can only read the fields that you need, as different fields are stored in separate external memory regions. diff --git a/content/english/hpc/external-memory/sorting.md b/content/english/hpc/external-memory/sorting.md index c7effc46..299da78f 100644 --- a/content/english/hpc/external-memory/sorting.md +++ b/content/english/hpc/external-memory/sorting.md @@ -34,7 +34,7 @@ So far the examples have been simple, and their analysis doesn't differ too much In the standard RAM model, the asymptotic complexity would be multiplied $k$, since we would need to perform $O(k)$ comparisons to fill each next element. 
But in the external memory model, since everything we do in-memory doesn't cost us anything, its asymptotic complexity would not change as long as we can fit $(k+1)$ full blocks in memory, that is, if $k = O(\frac{M}{B})$. -Remember [the $M \gg B$ assumption](../model) when we introduced the computational model? If we have $M \geq B^{1+ε}$ for $\epsilon > 0$, then we can fit any sub-polynomial amount of blocks in memory, certainly including $O(\frac{M}{B})$. This condition is called *tall cache assumption*, and it is usually required in many other external memory algorithms. +Remember [the $M \gg B$ assumption](../model) when we introduced the computational model? If we have $M \geq B^{1+ε}$ for $\epsilon > 0$, then we can fit any sub-polynomial number of blocks in memory, certainly including $O(\frac{M}{B})$. This condition is called *tall cache assumption*, and it is usually required in many other external memory algorithms. ### Merge Sorting @@ -58,7 +58,7 @@ Half of a page ago we have learned that in the external memory model, we can mer Let's sort each block of size $M$ in-memory just as we did before, but during each merge stage, we will split sorted blocks not just in pairs to be merged, but take as many blocks we can fit into our memory during a $k$-way merge. This way the height of the merge tree would be greatly reduced, while each layer would still be done in $O(\frac{N}{B})$ IOPS. -How many sorted arrays can we merge at once? Exactly $k = \frac{M}{B}$, since we need memory for one block for each array. Since the total amount of layers will be reduced to $\log_{\frac{M}{B}} \frac{N}{M}$, the total complexity will be reduced to +How many sorted arrays can we merge at once? Exactly $k = \frac{M}{B}$, since we need memory for one block for each array. 
Since the total number of layers will be reduced to $\log_{\frac{M}{B}} \frac{N}{M}$, the total complexity will be reduced to $$ SORT(N) \stackrel{\text{def}}{=} O\left(\frac{N}{B} \log_{\frac{M}{B}} \frac{N}{M} \right) diff --git a/content/english/hpc/profiling/benchmarking.md b/content/english/hpc/profiling/benchmarking.md index dd543bcc..2be61235 100644 --- a/content/english/hpc/profiling/benchmarking.md +++ b/content/english/hpc/profiling/benchmarking.md @@ -186,4 +186,4 @@ plt.plot(ns, [x / y for x, y in zip(baseline, results)]) plt.show() ``` -Once established, this workflow makes you iterate much faster and just focus on optimizing the algorithm itself. +Once established, this workflow makes you iterate much faster and focus on optimizing the algorithm itself. diff --git a/content/english/hpc/simd/shuffling.md b/content/english/hpc/simd/shuffling.md index 111c34d5..6ff3b749 100644 --- a/content/english/hpc/simd/shuffling.md +++ b/content/english/hpc/simd/shuffling.md @@ -175,7 +175,7 @@ The general idea of our algorithm is as follows: - use this mask to index a lookup table that returns a permutation moving the elements that satisfy the predicate to the beginning of the vector (in their original order); - use the `_mm256_permutevar8x32_epi32` intrinsic to permute the values; - write the whole permuted vector to the buffer — it may have some trailing garbage, but its prefix is correct; -- calculate the population count of the scalar mask and move the buffer pointer by that amount. +- calculate the population count of the scalar mask and move the buffer pointer by that number. 
First, we need to precompute the permutations: From 893772a2538f1592fb1fdc55611267a7effd5868 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 18 May 2022 11:49:06 +0300 Subject: [PATCH 460/531] fix eytzinger example (tnx @tmp-coder) --- content/english/hpc/data-structures/binary-search.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 48bf07b4..babe0092 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -286,7 +286,9 @@ This function takes the current node number `k`, recursively writes out all elem Despite being recursive, it is actually quite fast as all the memory reads are sequential, and the memory writes are only in $O(\log n)$ different memory blocks at a time. -Note that the Eytzinger array is one-indexed — this will be important for performance later. You can put in the zeroth element the value that you want to be returned in the case when the lower bound doesn't exist (similar to `a.end()` for `std::lower_bound`). +Note that this traversal and the resulting permutation are not exactly equivalent to the "tree" of vanilla binary search: for example, the left child subtree may be larger than the right child subtree — and even more than just by one node — but it doesn't matter since both approaches result in the same logarithmic tree depth. + +Also note that the Eytzinger array is one-indexed — this will be important for performance later. You can put in the zeroth element the value that you want to be returned in the case when the lower bound doesn't exist (similar to `a.end()` for `std::lower_bound`). 
### Search Implementation @@ -302,18 +304,18 @@ The only problem arises when we need to restore the index of the resulting eleme ``` array: 1 2 3 4 5 6 7 8 -eytzinger: 4 2 5 1 6 3 7 8 +eytzinger: 5 3 7 2 4 6 8 1 1st range: --------------- k := 1 2nd range: ------- k := 2*k (=2) 3rd range: --- k := 2*k + 1 (=5) -4th range: - k := 2*k + 1 (=11) +4th range: - k := 2*k (=10) ``` -Here we query the array of $[1, …, 8]$ for the lower bound of $x=4$. We compare it against $4$, $2$, and $5$, go left-right-right, and end up with $k = 11$, which isn't even a valid array index. +Here we query the array of $[1, …, 8]$ for the lower bound of $x=4$. We compare it against $5$, $3$, and $4$, go left-right-left, and end up with $k = 10$, which isn't even a valid array index. The trick is to notice that, unless the answer is the last element of the array, we compare $x$ against it at some point, and after we've learned that it is not less than $x$, we start comparing $x$ against elements to the left, and all these comparisons evaluate true (that is, leading to the right). Therefore, to restore the answer, we just need to "cancel" some number of right turns. -This can be done in an elegant way by observing that the right turns are recorded in the binary representation of $k$ as 1-bits, and so we just need to find the number of trailing ones in the binary representation and right-shift $k$ by exactly that. To do this, we can invert the number (`~k`) and call the "find first set" instruction: +This can be done in an elegant way by observing that the right turns are recorded in the binary representation of $k$ as 1-bits, and so we just need to find the number of trailing 1s in the binary representation and right-shift $k$ by exactly that number of bits. 
To do this, we can invert the number (`~k`) and call the "find first set" instruction:
 
 ```c++
 int lower_bound(int x) {
From b82fb8fa10e5eaac97e6111016f9886464b3135c Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Thu, 19 May 2022 13:44:01 +0300
Subject: [PATCH 461/531] on optimizing latency and efficiency

---
 content/english/hpc/complexity/levels.md | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/content/english/hpc/complexity/levels.md b/content/english/hpc/complexity/levels.md
index 281bdea2..b1e29e2e 100644
--- a/content/english/hpc/complexity/levels.md
+++ b/content/english/hpc/complexity/levels.md
@@ -40,3 +40,25 @@ Programmers can be put in several "levels" in terms of their software optimizati
 In this book, we expect that the average reader is somewhere around stage 1, and hopefully by the end of it will get to 4.
 
 You should also go through these levels when designing algorithms. First get it working in the first place, then select a bunch of reasonably asymptotically optimal algorithm. Then think about how they are going to work in terms of their memory operations or ability to execute in parallel (even if you consider single-threaded programs, there is still going to be plenty of parallelism inside a core, so this model is extremely ), and then proceed toward actual implementation. Avoid premature optimization, as Knuth once said.
+
+---
+
+For most web services, efficiency doesn't matter, but *latency* does.
+
+Increasing efficiency is not how it is done nowadays.
+
+A pageview usually generates somewhere on the order of 0.1 to 1 cent. This is a typical rate at which you monetize user attention. Say, if I simply installed AdSense, I'd be getting something like that — depending on where most of my readers are from and how many of them are using an ad blocker.
+
+At the same time, a server with a dedicated core and 1GB of RAM (which is an absurdly large amount of resources for a simple web service) costs around one millionth per second when amortized. You could fetch 100 photos with that.
+
+Amazon had an experiment where they A/B tested their service with artificial delays and found out that a 100ms delay decreased revenue. The same holds for most other services: say, if users lose their "flow" on Twitter, they are likely to start thinking about something else and leave. If the delay at Google is more than a few seconds, people will just think that Google isn't working and quit.
+
+Minimization of latency can usually be done with parallel computing, which is why distributed systems focus more on scalability. This part of the book is concerned with improving the *efficiency* of algorithms, which lowers latency as a by-product.
+
+However, there are still use cases when there is a trade-off between quality and cost of servers.
+
+- Search is hierarchical. There are usually many layers of more accurate but slower models. The more documents you rank on each layer, the better the final quality.
+- Games. They are more enjoyable on a large scale, but the required computational power also increases. This includes AI.
+- AI workloads — those that have large quantities of data such as language models. Heavier models require more compute. The bottleneck in them is not the amount of data, but efficiency.
+
+Inherently sequential algorithms, or cases when the resources are constrained. Ctrl+f'ing a large PDF is painful. Factorization.
From 25333d550985213cdde5a743f5d6e4862207e4ce Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Fri, 20 May 2022 06:37:07 +0300
Subject: [PATCH 462/531] estimating performance engineering impact

---
 content/english/hpc/complexity/levels.md | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/content/english/hpc/complexity/levels.md b/content/english/hpc/complexity/levels.md
index b1e29e2e..e2e8b58f 100644
--- a/content/english/hpc/complexity/levels.md
+++ b/content/english/hpc/complexity/levels.md
@@ -62,3 +62,17 @@ However, there are still use cases when there is a trade-off between quality and
 - AI workloads — those that have large quantities of data such as language models. Heavier models require more compute. The bottleneck in them is not the amount of data, but efficiency.
 
 Inherently sequential algorithms, or cases when the resources are constrained. Ctrl+f'ing a large PDF is painful. Factorization.
+
+## Estimating the impact
+
+Sometimes the optimization needs to happen in the calling layer.
+
+SIMDJSON speeds up JSON parsing, but it may be better to not use JSON in the first place.
+
+Protobuf or flat binary formats.
+
+There is also a chicken and egg problem: people don't use an approach that much because it is slow and not feasible.
+
+Cost to implement, bugs, maintainability. It is perfectly fine that most software in the world is inefficient.
+
+What does it mean to be a better programmer? Faster programs? Faster speed of work? Fewer bugs? It is a combination of those.
From bf8a1e817963151180f6f7352e3d979b1f6f7f33 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 20 May 2022 06:50:32 +0300 Subject: [PATCH 463/531] how to read this book --- content/english/hpc/complexity/levels.md | 2 ++ content/english/hpc/preface.md | 16 ++++++++++++++++ 2 files changed, 18 insertions(+) diff --git a/content/english/hpc/complexity/levels.md b/content/english/hpc/complexity/levels.md index e2e8b58f..981d467c 100644 --- a/content/english/hpc/complexity/levels.md +++ b/content/english/hpc/complexity/levels.md @@ -76,3 +76,5 @@ There is also a chicken and egg problem: people don't use an approach that much Cost to implement, bugs, maintainability. It is perfectly fine that most software in the world is inefficient. What does it mean to be a better programmer? Faster programs? Faster speed of work? Fewer bugs? It is a combination of those. + +Implementing compiler optimizations or databases are examples of high-leverage activities because they act as a tax on everything else — which is why you see most people writing books on these particular topics rather than software optimization in general. diff --git a/content/english/hpc/preface.md b/content/english/hpc/preface.md index 28adae07..2e18e715 100644 --- a/content/english/hpc/preface.md +++ b/content/english/hpc/preface.md @@ -19,6 +19,22 @@ There are a lot of forward references I couldn't get rid of. Read some of the SIMD and memory chapter first. +Chapter 1 is a "why you should care" sort of read. + +Chapter 2 is an introduction to computer architectures from the perspective of performance. There is a high chance that you already know it from a college course, but I still advise to read it to get into context, as we will cover assembly-level optimization techniques there. + +Chapter 3 is where experienced programmers should start from. + +Chapter 4 discusses compilation with the example of C++ and GCC/Clang. Chapter 5 discusses language-agnostic profiling methods. You are free to skip both. 
+
+Chapter 6 discusses arithmetic and chapter 7 discusses modular arithmetic and its applications. They also act as a sort of reference for algorithms in the case studies.
+
+Chapter 8 introduces the external memory model and how the memory system works. Chapter 9 follows up with experimental studies of how it can affect performance.
+
+Chapter 10 discusses SIMD programming, which is a major part. It is not *that* intertwined with the previous ones, and if you are feeling comfortable, I'd suggest that you start reading with it because it will teach you powerful techniques right away.
+
+Chapters 11-12 contain case studies of complex algorithms. Performance engineering is a practical field, so you should learn from major examples.
+
 The first 5 chapters build up general understanding of performance. Chapters 6-10 go deeper into modern features. Arithmetic, number theory (the techniques that are also relevant outside of it). Some are theoretic, and then applied in practice.
From 22ad3b1ff984da97081d26b4ef60f9b7c7137a24 Mon Sep 17 00:00:00 2001
From: Sergey Slotin
Date: Mon, 23 May 2022 10:25:34 +0300
Subject: [PATCH 464/531] fix formatting

---
 .../russian/cs/factorization/eratosthenes.md | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/content/russian/cs/factorization/eratosthenes.md b/content/russian/cs/factorization/eratosthenes.md
index 02e72c0e..acf47749 100644
--- a/content/russian/cs/factorization/eratosthenes.md
+++ b/content/russian/cs/factorization/eratosthenes.md
@@ -12,10 +12,10 @@ published: true
 
 Основная идея соответствует названию алгоритма: запишем ряд чисел $1, 2,\ldots, n$, а затем будем вычеркивать
 
-* сначала числа, делящиеся на $2$, кроме самого числа $2$,
-* потом числа, делящиеся на $3$, кроме самого числа $3$,
-* с числами, делящимися на $4$, ничего делать не будем — мы их уже вычёркивали,
-* потом продолжим вычеркивать числа, делящиеся на $5$, кроме самого числа $5$,
+- сначала числа, делящиеся на $2$, кроме
самого числа $2$, +- потом числа, делящиеся на $3$, кроме самого числа $3$, +- с числами, делящимися на $4$, ничего делать не будем — мы их уже вычёркивали, +- потом продолжим вычеркивать числа, делящиеся на $5$, кроме самого числа $5$, …и так далее. @@ -23,10 +23,10 @@ published: true ```c++ vector sieve(int n) { - vector is_prime(n+1, true); + vector is_prime(n + 1, true); for (int i = 2; i <= n; i++) if (is_prime[i]) - for (int j = 2*i; j <= n; j += i) + for (int j = 2 * i; j <= n; j += i) is_prime[j] = false; return is_prime; } @@ -49,7 +49,6 @@ $$ У исходного алгоритма асимптотика должна быть ещё лучше. Чтобы найти её точнее, нам понадобятся два факта про простые числа: 1. Простых чисел от $1$ до $n$ примерно $\frac{n}{\ln n}$ . - 2. Простые числа распределены без больших «разрывов» и «скоплений», то есть $k$-тое простое число примерно равно $k \ln k$. Мы можем упрощённо считать, что число $k$ является простым с «вероятностью» $\frac{1}{\ln n}$. Тогда, время работы алгоритма можно более точнее оценить как @@ -65,11 +64,11 @@ $$ ## Линейное решето -Основная проблема решета Эратосфена состоит в том, что некоторые числа мы будем помечать как составные несколько раз — а именно столько раз, сколько у них различных простых делителей. Чтобы достичь линейного времени работы, нам нужно придумать способ, как рассматривать все составные числа ровно один раз. +Основная проблема решета Эратосфена состоит в том, что некоторые числа мы будем помечать как составные несколько раз — столько, сколько у них различных простых делителей. Чтобы достичь линейного времени работы, нам нужно придумать способ, как рассматривать все составные числа ровно один раз. Обозначим за $d(k)$ минимальный простой делитель числа $k$ и заметим следующий факт: у составного числа $k$ есть единственное представление $k = d(k) \cdot r$, и при этом у числа $r$ нет простых делителей меньше $d(k)$. 
-Идея оптимизации состоит в том, чтобы перебирать этот $r$, и для каждого перебирать только нужные множители — а именно все от $2$ до $d(r)$ включительно. +Идея оптимизации состоит в том, чтобы перебирать этот $r$, и для каждого перебирать только нужные множители — а именно, все от $2$ до $d(r)$ включительно. ### Алгоритм From a80dc4f9e0aeb28a873c5c3347626477cb45ac42 Mon Sep 17 00:00:00 2001 From: Timofey Date: Tue, 24 May 2022 13:11:02 +0300 Subject: [PATCH 465/531] Update products.md --- content/russian/cs/geometry-basic/products.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/russian/cs/geometry-basic/products.md b/content/russian/cs/geometry-basic/products.md index a4e1a3d5..ca0a5dd3 100644 --- a/content/russian/cs/geometry-basic/products.md +++ b/content/russian/cs/geometry-basic/products.md @@ -1,6 +1,7 @@ --- title: Скалярное и векторное произведение weight: 2 +published: true --- Помимо очевидных сложения, вычитания и умножения на константу, у векторов можно ввести и свои особенные операции, которые нам упростят жизнь. @@ -42,7 +43,7 @@ $$ Так же, как и со скалярным произведением, доказательство координатной формулы оставляется упражнением читателю. Если кто-то захочет это сделать: это следует из линейности обоих произведений (что в свою очередь тоже нужно доказать) и разложения и разложения по базисным векторам $\overline{(0, 1)}$ и $\overline{(1, 0)}$. 
-Геометрически, это ориентированный объем параллелограмма, натянутого на вектора $a$ и $b$: +Геометрически, это ориентированная площадь параллелограмма, натянутого на вектора $a$ и $b$: ![](../img/cross.jpg) From 9ef98a1c9f5b4a103e68d9db4d44634753e9378e Mon Sep 17 00:00:00 2001 From: Timofey Date: Tue, 24 May 2022 13:27:57 +0300 Subject: [PATCH 466/531] http://www.gramota.ru/slovari/dic/?word=%D0%B2%D0%B5%D0%BA%D1%82%D0%BE%D1%80&all=x --- content/russian/cs/geometry-basic/vectors.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/russian/cs/geometry-basic/vectors.md b/content/russian/cs/geometry-basic/vectors.md index 05051396..ee1a052a 100644 --- a/content/russian/cs/geometry-basic/vectors.md +++ b/content/russian/cs/geometry-basic/vectors.md @@ -1,6 +1,7 @@ --- -title: Точки и векторы +title: Точки и вектора weight: 1 +published: true --- Отрезок, для которого указано, какой из его концов считается началом, а какой концом, называется *вектором*. Вектор на плоскости можно задать двумя числами — его координатами по горизонтали и вертикали. From 689bc2ee0615285809b0086df388dc5b3dafcfbc Mon Sep 17 00:00:00 2001 From: Timofey Date: Tue, 24 May 2022 14:47:45 +0300 Subject: [PATCH 467/531] =?UTF-8?q?=D0=9E=D0=BF=D0=B8=D1=81=D0=BA=D0=B0?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- content/russian/cs/geometry-basic/products.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/russian/cs/geometry-basic/products.md b/content/russian/cs/geometry-basic/products.md index ca0a5dd3..488dbca6 100644 --- a/content/russian/cs/geometry-basic/products.md +++ b/content/russian/cs/geometry-basic/products.md @@ -41,7 +41,7 @@ $$ a \times b = |a| \cdot |b| \cdot \sin \theta = x_a y_b - y_a x_b $$ -Так же, как и со скалярным произведением, доказательство координатной формулы оставляется упражнением читателю. 
Если кто-то захочет это сделать: это следует из линейности обоих произведений (что в свою очередь тоже нужно доказать) и разложения и разложения по базисным векторам $\overline{(0, 1)}$ и $\overline{(1, 0)}$. +Так же, как и со скалярным произведением, доказательство координатной формулы оставляется упражнением читателю. Если кто-то захочет это сделать: это следует из линейности обоих произведений (что в свою очередь тоже нужно доказать) и разложения по базисным векторам $\overline{(0, 1)}$ и $\overline{(1, 0)}$. Геометрически, это ориентированная площадь параллелограмма, натянутого на вектора $a$ и $b$: @@ -66,7 +66,7 @@ int operator^(r a, r b) { return a.x*b.y - b.x*a.y; } Скалярное и векторное произведения тесно связаны с углами между векторами и могут использоваться для подсчета величин вроде ориентированных углов и площадей, которые обычно используются для разных проверок. -Когда они уже реализованы, использовать произведения гораздо проще, чем опираться на алгебру. Например, можно легко угол между двумя векторами, подставив в знакомый нам `atan2` векторное и скалярное произведение: +Когда они уже реализованы, использовать произведения гораздо проще, чем опираться на алгебру. 
Например, можно легко вычислить угол между двумя векторами, подставив в знакомый нам `atan2` векторное и скалярное произведение: ```c++ double angle(r a, r b) { From 2638bf74c962ab81b0515318e832d0a9451e4b7c Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 24 May 2022 21:04:56 +0300 Subject: [PATCH 468/531] fix formatting --- content/russian/cs/interactive/answer-search.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/russian/cs/interactive/answer-search.md b/content/russian/cs/interactive/answer-search.md index 28e4b4bc..0b38ce24 100644 --- a/content/russian/cs/interactive/answer-search.md +++ b/content/russian/cs/interactive/answer-search.md @@ -66,7 +66,7 @@ int solve() { Здесь, в отличие от предыдущей задачи, кажется, существует прямое решение с формулой. Но вместо того, чтобы о нем думать, можно просто свести задачу к обратной. Давайте подумаем, как по числу минут $t$ (ответу) понять, сколько листов напечатается за это время? Очень легко: $$ -\lfloor\frac{t}{x}\rfloor + \lfloor\frac{t}{y}\rfloor +\left \lfloor \frac{t}{x} \right \rfloor + \left \lfloor \frac{t}{y} \right \rfloor $$ -Ясно, что за $0$ минут $n$ листов распечатать нельзя, а за $xn$ минут один только первый принтер успеет напечатать $n$ листов. Поэтому $0$ и $xn$ — это подходящие изначальные границы для бинарного поиска. +Ясно, что за $0$ минут $n$ листов распечатать нельзя, а за $x \cdot n$ минут один только первый принтер успеет напечатать $n$ листов. Поэтому $0$ и $xn$ — это подходящие изначальные границы для бинарного поиска. 
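
Reviewer note on the `answer-search.md` hunk above (the two-printer problem): the binary search over the answer that the text describes could be sketched as follows. This is only an illustration under stated assumptions, not code from the patched file: the names `printed` and `min_time` are hypothetical, and it assumes $n \ge 1$ and that $x \cdot n$ fits into 64 bits.

```c++
#include <cstdint>

// Sheets printed in t minutes by two printers that need
// x and y minutes per sheet (hypothetical helper, not from the article).
uint64_t printed(uint64_t t, uint64_t x, uint64_t y) {
    return t / x + t / y; // floor(t / x) + floor(t / y)
}

// Smallest t such that at least n sheets are printed, assuming n >= 1.
uint64_t min_time(uint64_t n, uint64_t x, uint64_t y) {
    // invariant: l minutes are not enough, r minutes are enough
    uint64_t l = 0, r = x * n;
    while (r - l > 1) {
        uint64_t m = l + (r - l) / 2;
        if (printed(m, x, y) >= n)
            r = m; // m minutes suffice, so the answer is at most m
        else
            l = m; // m minutes do not suffice
    }
    return r;
}
```

The bounds are the ones named in the text: $0$ minutes are never enough for $n \ge 1$, and $x \cdot n$ minutes always are, since the first printer alone finishes by then.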
From f81a9ea614b579811ba6b8e7edf9a112c601b441 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 24 May 2022 21:05:09 +0300 Subject: [PATCH 469/531] factorization code --- .../english/hpc/algorithms/factorization.md | 243 +++++++++++++++++- 1 file changed, 242 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/algorithms/factorization.md b/content/english/hpc/algorithms/factorization.md index 4ff8061d..7c2d8aa7 100644 --- a/content/english/hpc/algorithms/factorization.md +++ b/content/english/hpc/algorithms/factorization.md @@ -14,7 +14,6 @@ Integer factorization is interesting because of RSA problem. - Less than 10^100: Quadratic Sieve - More than 10^100: General Number Field Sieve - and do other computations such as computing the greatest common multiple (given that it is not even so that ) (since $\gcd(n, r) = 1$) For all methods, we will implement `find_factor` function which returns one divisor ot 1. You can apply it recurively to get the factorization, so whatever asymptotic you had won't affect it: @@ -32,6 +31,23 @@ vector factorize(u64 n) { } ``` +0.056024 +2043.968140 + +```c++ +typedef __uint16_t u16; +typedef __uint32_t u32; +typedef __uint64_t u64; +typedef __uint128_t u128; + +u64 find_factor(u64 n) { + for (u64 d = 2; d * d <= n; d++) + if (n % d == 0) + return d; + return 1; +} +``` + ## Trial division This is the most basic algorithm to find a prime factorization. @@ -193,3 +209,228 @@ This is exactly the type of problem when we need specific knowledge, because we ## Further optimizations Существуют также [субэкспоненциальные](https://ru.wikipedia.org/wiki/%D0%A4%D0%B0%D0%BA%D1%82%D0%BE%D1%80%D0%B8%D0%B7%D0%B0%D1%86%D0%B8%D1%8F_%D1%86%D0%B5%D0%BB%D1%8B%D1%85_%D1%87%D0%B8%D1%81%D0%B5%D0%BB#%D0%A1%D1%83%D0%B1%D1%8D%D0%BA%D1%81%D0%BF%D0%BE%D0%BD%D0%B5%D0%BD%D1%86%D0%B8%D0%B0%D0%BB%D1%8C%D0%BD%D1%8B%D0%B5_%D0%B0%D0%BB%D0%B3%D0%BE%D1%80%D0%B8%D1%82%D0%BC%D1%8B), но не полиномиальные алгоритмы факторизации. 
Человечество [умеет](https://en.wikipedia.org/wiki/Integer_factorization_records) факторизовывать числа порядка $2^{200}$.
+
+
+---
+
+If you have limited time, you should probably compute as much forward as possible, and then half the time computing the other.
+
+How to optimize for the *average* case is unclear.
+
+0.087907
+3964.321045
+
+```c++
+u64 find_factor(u64 n) {
+    if (n % 2 == 0)
+        return 2;
+    for (u64 d = 3; d * d <= n; d += 2)
+        if (n % d == 0)
+            return d;
+    return 1;
+}
+```
+
+0.199740
+7615.217773
+
+```c++
+u64 find_factor(u64 n) {
+    for (u64 d : {2, 3, 5})
+        if (n % d == 0)
+            return d;
+    u64 increments[] = {0, 4, 6, 10, 12, 16, 22, 24};
+    for (u64 d = 7; d * d <= n; d += 30) {
+        for (u64 k = 0; k < 8; k++) {
+            u64 x = d + increments[k];
+            if (n % x == 0)
+                return x;
+        }
+    }
+    return 1;
+}
+```
+
+19430.058594
+
+```c++
+const int N = (1 << 16);
+
+struct Precalc {
+    u16 primes[6542]; // # of primes under N=2^16
+
+    constexpr Precalc() : primes{} {
+        bool marked[N] = {};
+        int n_primes = 0;
+
+        for (int i = 2; i < N; i++) {
+            if (!marked[i]) {
+                primes[n_primes++] = i;
+                for (int j = 2 * i; j < N; j += i)
+                    marked[j] = true;
+            }
+        }
+    }
+};
+
+constexpr Precalc P{};
+
+u64 find_factor(u64 n) {
+    for (u16 p : P.primes)
+        if (n % p == 0)
+            return p;
+    return 1;
+}
+```
+
+352997.656250
+
+```c++
+u64 magic[6542];
+magic[n_primes++] = u64(-1) / i + 1;
+
+u64 find_factor(u64 n) {
+    for (u64 m : P.magic)
+        if (m * n < m)
+            return u64(-1) / m + 1;
+    return 1;
+}
+```
+
+Except that it is constant, so the speedup should be twice as much.
+
+---
+
+```c++
+u64 find_factor(u64 n) {
+    while (true) {
+        if (u64 g = gcd(randint(2, n - 1), n); g != 1)
+            return g;
+    }
+}
+```
+
+99.292641
+25720.164062 almost 15x slower
+
+```c++
+u64 f(u64 x, u64 a, u64 mod) {
+    return ((u128) x * x + a) % mod;
+}
+
+u64 diff(u64 a, u64 b) {
+    // a and b are unsigned and so is their difference, so we can't just call abs(a - b)
+    return a > b ?
a - b : b - a;
+}
+
+u64 rho(u64 n, u64 x0 = 2, u64 a = 1) {
+    u64 x = x0, y = x0, g = 1;
+    while (g == 1) {
+        x = f(x, a, n);
+        y = f(y, a, n);
+        y = f(y, a, n);
+        g = gcd(diff(x, y), n);
+    }
+    return g;
+}
+
+u64 find_factor(u64 n) {
+    return rho(n);
+}
+```
+
+56.745281
+
+```c++
+u64 rho(u64 n, u64 x0 = 2, u64 a = 1) {
+    u64 x = x0, y = x0;
+
+    for (int l = 256; l < (1 << 20); l *= 2) {
+        x = y;
+        for (int i = 0; i < l; i++) {
+            y = f(y, a, n);
+            if (u64 g = gcd(diff(x, y), n); g != 1)
+                return g;
+        }
+    }
+
+    return 1;
+}
+```
+
+426.389160
+
+```c++
+const int M = 1024;
+
+u64 rho(u64 n, u64 x0 = 2, u64 a = 1) {
+    u64 x = x0, y = x0, p = 1;
+
+    for (int l = M; l < (1 << 20); l *= 2) {
+        x = y;
+        for (int i = 0; i < l; i += M) {
+            for (int j = 0; j < M; j++) {
+                y = f(y, a, n);
+                p = (u128) p * diff(x, y) % n;
+            }
+            if (u64 g = gcd(p, n); g != 1)
+                return g;
+        }
+    }
+
+    return 1;
+}
+```
+
+2948.260986
+
+```c++
+struct Montgomery {
+    u64 n, nr;
+
+    Montgomery(u64 n) : n(n) {
+        nr = 1;
+        for (int i = 0; i < 6; i++)
+            nr *= 2 - n * nr;
+    }
+
+    u64 reduce(u128 x) const {
+        u64 q = u64(x) * nr;
+        u64 m = ((u128) q * n) >> 64;
+        return (x >> 64) + n - m;
+    }
+
+    u64 multiply(u64 x, u64 y) {
+        return reduce((u128) x * y);
+    }
+};
+
+u64 f(u64 x, u64 a, Montgomery m) {
+    return m.multiply(x, x) + a;
+}
+
+const int M = 1024;
+
+u64 rho(u64 n, u64 x0 = 2, u64 a = 1) {
+    Montgomery m(n);
+    u64 y = x0;
+
+    for (int l = M; l < (1 << 20); l *= 2) {
+        u64 x = y, p = 1;
+        for (int i = 0; i < l; i += M) {
+            for (int j = 0; j < M; j++) {
+                y = f(y, a, m);
+                p = m.multiply(p, diff(x, y));
+            }
+            if (u64 g = gcd(p, n); g != 1)
+                return g;
+        }
+    }
+
+    return 1;
+}
+```
+
+There are slightly more errors because we are a bit loose with modular arithmetic here. The error rate grows higher when we increase and decrease (due to overflows).
+ +788.4861246275735 From 968abd50c4b267ab6a7b27e991b02267c1d08518 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 24 May 2022 23:02:47 +0300 Subject: [PATCH 470/531] factorization intro --- .../english/hpc/algorithms/factorization.md | 70 +++++++++++++------ 1 file changed, 48 insertions(+), 22 deletions(-) diff --git a/content/english/hpc/algorithms/factorization.md b/content/english/hpc/algorithms/factorization.md index 7c2d8aa7..8baf4aaf 100644 --- a/content/english/hpc/algorithms/factorization.md +++ b/content/english/hpc/algorithms/factorization.md @@ -4,42 +4,59 @@ weight: 3 draft: true --- -Integer factorization is interesting because of RSA problem. +The problem of factoring integers into primes is central to computational [number theory](/hpc/number-theory/). It has been [studied](https://www.cs.purdue.edu/homes/ssw/chapter3.pdf) since at least the 3rd century BC, and [many methods](https://en.wikipedia.org/wiki/Category:Integer_factorization_algorithms) have been developed that are efficient for different inputs. -"How big are your numbers?" determines the method to use: +In this case study, we specifically consider the factorization of *word-sized* integers: those on the order of $10^9$ and $10^{18}$. Untypical for this book, in this one, you may actually learn an asymptotically better algorithm: we start with a few basic approaches, and then gradually build up to the $O(\sqrt[4]{n})$-time *Pollard's rho algorithm* and optimize it to the point where it can factorize 60-bit semiprimes in 0.3-0.4ms, which is almost 4x faster than the previous state-of-the-art. -- Less than 2^16 or so: Lookup table. -- Less than 2^70 or so: Richard Brent's modification of Pollard's rho algorithm. 
-- Less than 10^50: Lenstra elliptic curve factorization -- Less than 10^100: Quadratic Sieve -- More than 10^100: General Number Field Sieve + -and do other computations such as computing the greatest common multiple (given that it is not even so that ) (since $\gcd(n, r) = 1$) +### Benchmark -For all methods, we will implement `find_factor` function which returns one divisor ot 1. You can apply it recurively to get the factorization, so whatever asymptotic you had won't affect it: +For all methods, we will implement `find_factor` function that takes a positive integer $n$ and returns either its smallest divisor (or `1` if the number is prime): ```c++ -typedef uint32_t u32; -typedef uint64_t u64; +// I don't feel like typing "unsigned long long" each time +typedef __uint16_t u16; +typedef __uint32_t u32; +typedef __uint64_t u64; typedef __uint128_t u128; +u64 find_factor(u64 n); +``` + +To find full factorization, you can apply it to $n$, reduce it, and continue until a new factor can no longer be found: + +```c++ vector factorize(u64 n) { - vector res; - while (int d = find_factor(n); d > 1) // does it work? - res.push_back(d); - return res; + vector factorization; + do { + u64 d = find_factor(n); + factorization.push_back(d); + n /= d; + } while (d != 1); + return factorization; } ``` +Since after each removed factor the problem becomes considerably smaller and simpler, the worst-case running time of full factorization is equal to the worst-case running time of a `find_factor` call. + +For many factorization algorithms, including those presented in this article, the running time scales with the least prime factor. Therefore, to provide worst-case input, we use *semiprimes:* products of two prime numbers $p \le q$ that are on the same order of magnitude. To generate a $k$-bit semiprime, we generate two random $\lfloor k / 2 \rfloor$-bit primes. 
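The benchmark input described above can be generated with a short helper. The sketch below is only an illustration: the names `is_prime`, `random_prime`, and `random_semiprime` are hypothetical and do not come from the patch, the primality test is plain trial division (adequate for halves of up to 32 bits), and it assumes $k \ge 4$ so that each half has at least two bits.

```c++
#include <cstdint>
#include <random>

typedef uint64_t u64;

// simple trial-division primality test, adequate for the small halves used here
bool is_prime(u64 x) {
    if (x < 2)
        return false;
    for (u64 d = 2; d * d <= x; d++)
        if (x % d == 0)
            return false;
    return true;
}

// draws random values with exactly `bits` bits until one of them is prime
u64 random_prime(int bits, std::mt19937_64 &rng) {
    std::uniform_int_distribution<u64> dist(u64(1) << (bits - 1), (u64(1) << bits) - 1);
    while (true) {
        u64 x = dist(rng);
        if (is_prime(x))
            return x;
    }
}

// product of two random floor(k / 2)-bit primes
u64 random_semiprime(int k, std::mt19937_64 &rng) {
    int half = k / 2;
    return random_prime(half, rng) * random_prime(half, rng);
}
```

Note that the product of two $\lfloor k / 2 \rfloor$-bit primes has either $2 \lfloor k / 2 \rfloor - 1$ or $2 \lfloor k / 2 \rfloor$ bits, so "$k$-bit" is meant loosely here.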
+ +Since some of the algorithms are inherently randomized, we also tolerate a small (<1%) percentage of errors, although they can be reduced to almost zero without significant performance penalties. + +### Trial division + +Trial division was first described by Fibonacci in 1202. Although it was probably known to animals. Perhaps some animals can factor? The scientific priority probably belongs to dinosaurs or ancient fish trying to divvy stuff up. + +I tried finding references to who invented trial division, but probably it was known to animals long before to split into equal parts. + 0.056024 2043.968140 ```c++ -typedef __uint16_t u16; -typedef __uint32_t u32; -typedef __uint64_t u64; -typedef __uint128_t u128; - u64 find_factor(u64 n) { for (u64 d = 2; d * d <= n; d++) if (n % d == 0) @@ -48,8 +65,6 @@ u64 find_factor(u64 n) { } ``` -## Trial division - This is the most basic algorithm to find a prime factorization. We divide by each possible divisor $d$. @@ -434,3 +449,14 @@ u64 rho(u64 n, u64 x0 = 2, u64 a = 1) { There are slightly more errors because we are a bit loose with modular arithmetic here. The error rate grows higher when we increase and decrease (due to overflows). 788.4861246275735 + +### Larger Numbers + +"How big are your numbers?" determines the method to use: + + +- Less than 2^16 or so: Lookup table. +- Less than 2^70 or so: Richard Brent's modification of Pollard's rho algorithm. 
+- Less than 10^50: Lenstra elliptic curve factorization +- Less than 10^100: Quadratic Sieve +- More than 10^100: General Number Field Sieve From 6f0850d4fc30ba584623340bc8ad3a774b8ff8e9 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 25 May 2022 14:10:08 +0300 Subject: [PATCH 471/531] centered codeblock --- content/english/hpc/arithmetic/newton.md | 6 +++--- content/english/hpc/data-structures/binary-search.md | 12 ++++++------ content/russian/cs/numerical/newton.md | 6 +++--- themes/algorithmica/assets/style.sass | 8 +++++++- .../_default/_markup/render-codeblock-center.html | 3 +++ 5 files changed, 22 insertions(+), 13 deletions(-) create mode 100644 themes/algorithmica/layouts/_default/_markup/render-codeblock-center.html diff --git a/content/english/hpc/arithmetic/newton.md b/content/english/hpc/arithmetic/newton.md index 38bcddda..de42104c 100644 --- a/content/english/hpc/arithmetic/newton.md +++ b/content/english/hpc/arithmetic/newton.md @@ -68,9 +68,9 @@ The algorithm converges for many functions, although it does so reliably and pro Let's run a few iterations of Newton's method to find the square root of $2$, starting with $x_0 = 1$, and check how many digits it got correct after each iteration: -
      -1
      -1.5
      +
      +1.0000000000000000000000000000000000000000000000000000000000000
      +1.5000000000000000000000000000000000000000000000000000000000000
       1.4166666666666666666666666666666666666666666666666666666666675
       1.4142156862745098039215686274509803921568627450980392156862745
       1.4142135623746899106262955788901349101165596221157440445849057
      diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md
      index babe0092..8a4924ea 100644
      --- a/content/english/hpc/data-structures/binary-search.md
      +++ b/content/english/hpc/data-structures/binary-search.md
      @@ -302,12 +302,12 @@ while (k <= n)
       
       The only problem arises when we need to restore the index of the resulting element, as $k$ may end up not pointing to a leaf node. Here is an example of how that can happen:
       
      -```
      -    array:  1 2 3 4 5 6 7 8
      -eytzinger:  5 3 7 2 4 6 8 1
      -1st range:  ---------------  k := 1
      -2nd range:  -------          k := 2*k      (=2)
      -3rd range:      ---          k := 2*k + 1  (=5)
      +```center
      +    array:  1 2 3 4 5 6 7 8                     
      +eytzinger:  5 3 7 2 4 6 8 1                     
      +1st range:  ---------------  k := 1             
      +2nd range:  -------          k := 2*k      (=2) 
      +3rd range:      ---          k := 2*k + 1  (=5) 
       4th range:        -          k := 2*k      (=10)
       ```
       
      diff --git a/content/russian/cs/numerical/newton.md b/content/russian/cs/numerical/newton.md
      index 248e1b4e..5426cff5 100644
      --- a/content/russian/cs/numerical/newton.md
      +++ b/content/russian/cs/numerical/newton.md
      @@ -66,9 +66,9 @@ double sqrt(double n) {
       
        Let's run Newton's method to find the square root of $2$, starting with $x_0 = 1$, and see how many leading digits are correct after each iteration:
       
      -
      -1
      -1.5
      +
      +1.0000000000000000000000000000000000000000000000000000000000000
      +1.5000000000000000000000000000000000000000000000000000000000000
       1.4166666666666666666666666666666666666666666666666666666666675
       1.4142156862745098039215686274509803921568627450980392156862745
       1.4142135623746899106262955788901349101165596221157440445849057
      diff --git a/themes/algorithmica/assets/style.sass b/themes/algorithmica/assets/style.sass
      index a6835c1e..eb5e2410 100644
      --- a/themes/algorithmica/assets/style.sass
      +++ b/themes/algorithmica/assets/style.sass
      @@ -492,7 +492,13 @@ pre
         padding-left: 8px
         font-size: 0.85em
         text-align: left
      -  
      +
      +pre.center-pre
      +  text-align: center
      +  font-size: 1em
      +  background: none
      +  border: none
      +
       .highlight
         margin: 0px
       
      diff --git a/themes/algorithmica/layouts/_default/_markup/render-codeblock-center.html b/themes/algorithmica/layouts/_default/_markup/render-codeblock-center.html
      new file mode 100644
      index 00000000..d263bb5a
      --- /dev/null
      +++ b/themes/algorithmica/layouts/_default/_markup/render-codeblock-center.html
      @@ -0,0 +1,3 @@
      +
      +{{.Inner}}
      +
      From 7297d591846a63f1615ec5415db99d0e5d447e26 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 25 May 2022 14:11:59 +0300 Subject: [PATCH 472/531] bump hugo version --- netlify.toml | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/netlify.toml b/netlify.toml index 1b5ed16e..fb612037 100644 --- a/netlify.toml +++ b/netlify.toml @@ -2,7 +2,7 @@ command = "hugo --gc --minify" [context.production.environment] -HUGO_VERSION = "0.87.0" +HUGO_VERSION = "0.96.0" HUGO_ENV = "production" HUGO_ENABLEGITINFO = "true" @@ -10,20 +10,20 @@ HUGO_ENABLEGITINFO = "true" command = "hugo --gc --minify --enableGitInfo" [context.split1.environment] -HUGO_VERSION = "0.87.0" +HUGO_VERSION = "0.96.0" HUGO_ENV = "production" [context.deploy-preview] command = "hugo --gc --minify --buildFuture -b $DEPLOY_PRIME_URL" [context.deploy-preview.environment] -HUGO_VERSION = "0.87.0" +HUGO_VERSION = "0.96.0" [context.branch-deploy] command = "hugo --gc --minify -b $DEPLOY_PRIME_URL" [context.branch-deploy.environment] -HUGO_VERSION = "0.87.0" +HUGO_VERSION = "0.96.0" [context.next.environment] HUGO_ENABLEGITINFO = "true" From 251dd08c54db23dac6a977a84ea4a60ced3c9532 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 25 May 2022 16:38:32 +0300 Subject: [PATCH 473/531] wheel and lookup factorization --- .../english/hpc/algorithms/factorization.md | 259 ++++++++---------- 1 file changed, 118 insertions(+), 141 deletions(-) diff --git a/content/english/hpc/algorithms/factorization.md b/content/english/hpc/algorithms/factorization.md index 8baf4aaf..9f7958ed 100644 --- a/content/english/hpc/algorithms/factorization.md +++ b/content/english/hpc/algorithms/factorization.md @@ -15,7 +15,7 @@ Unlike other case studies of this book, in this one you will actually learn an a ### Benchmark -For all methods, we will implement `find_factor` function that takes a positive integer $n$ and returns either its smallest divisor (or `1` if the number is prime): +For all methods, 
we will implement `find_factor` function that takes a positive integer $n$ and returns any of its non-trivial divisors (or `1` if the number is prime): ```c++ // I don't feel like typing "unsigned long long" each time @@ -45,35 +45,30 @@ Since after each removed factor the problem becomes considerably smaller and sim For many factorization algorithms, including those presented in this article, the running time scales with the least prime factor. Therefore, to provide worst-case input, we use *semiprimes:* products of two prime numbers $p \le q$ that are on the same order of magnitude. To generate a $k$-bit semiprime, we generate two random $\lfloor k / 2 \rfloor$-bit primes. -Since some of the algorithms are inherently randomized, we also tolerate a small (<1%) percentage of errors, although they can be reduced to almost zero without significant performance penalties. +Since some of the algorithms are inherently randomized, we also tolerate a small (<1%) percentage of false negative errors (when `find_factor` returns `1` despite number $n$ being composite), although this rate can be reduced to almost zero without significant performance penalties. ### Trial division -Trial division was first described by Fibonacci in 1202. Although it was probably known to animals. Perhaps some animals can factor? The scientific priority probably belongs to dinosaurs or ancient fish trying to divvy stuff up. + + +The most basic approach is to try every number less than $n$ as a divosor: ```c++ u64 find_factor(u64 n) { - for (u64 d = 2; d * d <= n; d++) + for (u64 d = 2; d < n; d++) if (n % d == 0) return d; return 1; } ``` -This is the most basic algorithm to find a prime factorization. - -We divide by each possible divisor $d$. -We can notice, that it is impossible that all prime factors of a composite number $n$ are bigger than $\sqrt{n}$. -Therefore, we only need to test the divisors $2 \le d \le \sqrt{n}$, which gives us the prime factorization in $O(\sqrt{n})$. 
- -The smallest divisor has to be a prime number. -We remove the factor from the number, and repeat the process. -If we cannot find any divisor in the range $[2; \sqrt{n}]$, then the number itself has to be prime. +One simple optimization is to notice that it is enough to only check divisors that do not exceed $\sqrt n$. This works because if $n$ is divided by $d > \sqrt n$, then it is also divided by $\frac{n}{d} < \sqrt n$, so we can don't have to check it separately. ```c++ u64 find_factor(u64 n) { @@ -84,13 +79,43 @@ u64 find_factor(u64 n) { } ``` +In our benchmark, $n$ is a semiprime, and we always find the lesser divisor, so both $O(n)$ and $O(\sqrt n)$ implementations perform the same and are able to factorize ~2k 30-bit numbers per second, while taking whole ~20 seconds to factorize a single 60-bit number. + +### Lookup Table + +Nowadays, you can type `factor 57` in your Linux terminal or Google search bar to get the factorization of any number. But before computers were invented, it was more practical to use *factorization tables:* special books containing factorizations of the first $N$ numbers. + +We can also use this approach to compute these lookup tables [during compile time](/hpc/compilation/precalc/). To save space, it is convenient to only store the smallest divisor of a number, requiring just one byte for a 16-bit integer: + +```c++ +template +struct Precalc { + unsigned char divisor[N]; + + constexpr Precalc() : divisor{} { + for (int i = 0; i < N; i++) + divisor[i] = 1; + for (int i = 2; i * i < N; i++) + if (divisor[i] == 1) + for (int k = i * i; k < N; k += i) + divisor[k] = i; + } +}; + +constexpr Precalc P{}; + +u64 find_factor(u64 n) { + return P.divisor[n]; +} +``` + +This approach can process 3M 16-bit integers per second, although it [probably gets slower](../hpc/cpu-cache/bandwidth/) for larger numbers. 
While it requires just a few milliseconds and 64KB of memory to calculate and store the divisors of the first $2^{16}$ numbers, it does not scale well for larger inputs. + ### Wheel factorization -This is an optimization of the trial division. -The idea is the following. -Once we know that the number is not divisible by 2, we don't need to check every other even number. -This leaves us with only $50\%$ of the numbers to check. -After checking 2, we can simply start with 3 and skip every other number. +To save paper space, pre-computer era factorization tables typically excluded numbers divisible by 2 and 5: in decimal numeral system, you can quickly determine whether a number is divisible by 2 or 5 (by looking at its last digit) and keep dividing the number $n$ by 2 or 5 while it is possible, eventually arriving to some entry in the factorization table. This makes the factorization table just ½ × ⅘ = 0.4 its original size. + +We can apply a similar trick to trial division, first checking if the number is divisible by $2$, and then only check for odd divisors: ```c++ u64 find_factor(u64 n) { @@ -103,24 +128,27 @@ u64 find_factor(u64 n) { } ``` -This method can be extended. -If the number is not divisible by 3, we can also ignore all other multiples of 3 in the future computations. -So we only need to check the numbers $5, 7, 11, 13, 17, 19, 23, \dots$. -We can observe a pattern of these remaining numbers. -We need to check all numbers with $d \bmod 6 = 1$ and $d \bmod 6 = 5$. -So this leaves us with only $33.3\%$ percent of the numbers to check. -We can implement this by checking the primes 2 and 3 first, and then start checking with 5 and alternatively skip 1 or 3 numbers. +With 50% fewer divisions to do, this algorithm works twice as fast, but it can be extended. If the number is not divisible by $3$, we can also ignore all multiples of $3$, and the same goes for all other divisors. 
+ +The problem is, as we increase the number of primes to exclude, it becomes less straightforward to iterate only over the numbers not divisible by them as they follow an irregular pattern — unless the number of primes is small. For example, if we consider $2$, $3$, and $5$, then, among the first $90$ numbers, we only need to check: + +```center +(1,) 7, 11, 13, 17, 19, 23, 29, +31, 37, 41, 43, 47, 49, 53, 59, +61, 67, 71, 73, 77, 79, 83, 89… +``` + +You can notice a pattern: the sequence repeats itself every $30$ numbers because remainder modulo $2 \times 3 \times 5 = 30$ is all we need to determine whether a number is divisible by $2$, $3$, or $5$. This means that we only need to check $8$ specific numbers in every $30$, proportionally improving the performance: ```c++ u64 find_factor(u64 n) { for (u64 d : {2, 3, 5}) if (n % d == 0) return d; - u64 increments[] = {0, 4, 6, 10, 12, 16, 22, 24}; - u64 sum = 30; - for (u64 d = 7; d * d <= n; d += sum) { - for (u64 k = 0; k < 8; k++) { - u64 x = d + increments[k]; + u64 offsets[] = {0, 4, 6, 10, 12, 16, 22, 24}; + for (u64 d = 7; d * d <= n; d += 30) { + for (u64 offset : offsets) { + u64 x = d + offset; if (n % x == 0) return x; } @@ -129,38 +157,80 @@ u64 find_factor(u64 n) { } ``` -We can extend this even further. -Here is an implementation for the prime number 2, 3 and 5. -It's convenient to use an array to store how much we have to skip. +As expected, it works $\frac{30}{8} = 3.75$ times faster than the naive trial division, processing about 7.6k 30-bit numbers per second. The performance can be improved by considering more primes, but the returns are diminishing: adding a new prime $p$ reduces the number of iterations by $\frac{1}{p}$, but increases the size of the skip-list by a factor of $p$, requiring proportionally more memory. 
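The offsets used in the listing above do not have to be written by hand. As a rough sketch, the hypothetical `Wheel` struct below (not part of the patch) derives them for any small set of basis primes and then runs the same kind of trial division. It assumes C++17 for `std::gcd`, a positive $n$, and a basis small enough that the product of its primes fits comfortably in memory as a list of offsets.

```c++
#include <cstdint>
#include <numeric>
#include <utility>
#include <vector>

typedef uint64_t u64;

struct Wheel {
    std::vector<u64> basis, offsets;
    u64 period = 1, start = 2;

    explicit Wheel(std::vector<u64> primes) : basis(std::move(primes)) {
        // the pattern of candidates repeats with this period
        for (u64 p : basis)
            period *= p;
        // first candidate: the smallest number above 1 coprime with the period
        while (std::gcd(start, period) != 1)
            start++;
        // offsets of all candidates within one period, relative to `start`
        for (u64 x = start; x < start + period; x++)
            if (std::gcd(x, period) == 1)
                offsets.push_back(x - start);
    }

    // returns a non-trivial divisor of n, or 1 if n is prime
    u64 find_factor(u64 n) const {
        for (u64 p : basis)
            if (n % p == 0 && n != p)
                return p;
        for (u64 d = start;; d += period) {
            for (u64 offset : offsets) {
                u64 x = d + offset;
                if (x * x > n) // no divisor up to sqrt(n) was found
                    return 1;
                if (n % x == 0)
                    return x;
            }
        }
    }
};
```

For the basis $\{2, 3, 5\}$ this yields a period of $30$, a first candidate of $7$, and the offsets $0, 4, 6, 10, 12, 16, 22, 24$, matching the hand-written array. Unlike the listing above, the sketch checks `x * x > n` for every candidate rather than once per period, which keeps it correct for larger bases at the cost of an extra comparison.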
-### Lookup table +### Precomputed Primes -We will choose to store smallest factors of first $2^16$ — because this way they all fit in just one byte, so we are sort of saving on memory here. +If we keep increasing the number of primes we exclude in wheel factorization, we eventually exclude all composite numbers and only check for prime factors. In this case, we don't need this array of offsets, but we need to precompute primes, which we can do during compile time like this: ```c++ -template +const int N = (1 << 16); + struct Precalc { - char divisor[N]; + u16 primes[6542]; // # of primes under N=2^16 - constexpr Precalc() : divisor{} { - for (int i = 0; i < N; i++) - divisor[i] = 1; - for (int i = 2; i * i < N; i++) - if (divisor[i] == 1) - for (int k = i * i; k < N; k += i) - divisor[k] = i; + constexpr Precalc() : primes{} { + bool marked[N] = {}; + int n_primes = 0; + + for (int i = 2; i < N; i++) { + if (!marked[i]) { + primes[n_primes++] = i; + for (int j = 2 * i; j < N; j += i) + marked[j] = true; + } + } } }; -constexpr Precalc precalc{}; +constexpr Precalc P{}; + +u64 find_factor(u64 n) { + for (u16 p : P.primes) + if (n % p == 0) + return p; + return 1; +} +``` + +This approach lets us process almost 20k 30-bit integers per second, but it does not work for larger (64-bit) numbers unless they have small ($< 2^{16}$) factors. Note that this is actually an asymptotic optimization: there are $O(\frac{n}{\ln n})$ primes among the first $n$ numbers, so this algorithm performs $O(\frac{\sqrt n}{\ln \sqrt n})$ operations, while wheel factorization only eliminates a large but fixed fraction of divisors. If we extend it to 64-bit numbers and precompute every prime under $2^{32}$ (storing which would require several hundred megabytes of memory), the relative speedup would grow by a factor of $\frac{\ln \sqrt{n^2}}{\ln \sqrt n} = 2 \cdot \frac{1/2}{1/2} \cdot \frac{\ln n}{\ln n} = 2$. 
+ +All variants of trial division, including this one, are bottlenecked by the speed of integer division, which can be [optimized](../hpc/arithmetic/division/) if we know the divisors in advice and allow for some precomputation: + +```c++ +// ...precomputation is the same as before, +// but we store the reciprocal instead of the prime number itself +u64 magic[6542]; +// for each prime i: +magic[n_primes++] = u64(-1) / i + 1; u64 find_factor(u64 n) { - return precalc.divisor[n]; + for (u64 m : P.magic) + if (m * n < m) + return u64(-1) / m + 1; + return 1; } ``` +This makes the algorithm ~18x faster: we can now process ~350k 30-bit numbers per second. This is actually the most efficient algorithm we have + + +$\tilde{O}(\sqrt n)$ territory + ### Pollard's Rho Algorithm +--- + +```c++ +u64 find_factor(u64 n) { + while (true) { + if (u64 g = gcd(randint(2, n - 1), n); g != 1) + return g; + } +} +``` + + The algorithm is probabilistic. This means that it may or may not work. You would also need to Ро-алгоритм Полларда — рандомизированный алгоритм факторизации целых чисел, работающий за время $O(n^\frac{1}{4})$ и основывающийся не следствии из парадокса дней рождений: @@ -232,99 +302,6 @@ If you have limited time, you should probably compute as much forward as possibl How to optimize for the *average* case is unclear. 
-0.087907 -3964.321045 - -```c++ -u64 find_factor(u64 n) { - if (n % 2 == 0) - return 2; - for (u64 d = 3; d * d <= n; d += 2) - if (n % d == 0) - return d; - return 1; -} -``` - -0.199740 -7615.217773 - -```c++ -u64 find_factor(u64 n) { - for (u64 d : {2, 3, 5}) - if (n % d == 0) - return d; - u64 increments[] = {0, 4, 6, 10, 12, 16, 22, 24}; - for (u64 d = 7; d * d <= n; d += 30) { - for (u64 k = 0; k < 8; k++) { - u64 x = d + increments[k]; - if (n % x == 0) - return x; - } - } - return 1; -} -``` - -19430.058594 - -```c++ -const int N = (1 << 16); - -struct Precalc { - u16 primes[6542]; // # of primes under N=2^16 - - constexpr Precalc() : primes{} { - bool marked[N] = {}; - int n_primes = 0; - - for (int i = 2; i < N; i++) { - if (!marked[i]) { - primes[n_primes++] = i; - for (int j = 2 * i; j < N; j += i) - marked[j] = true; - } - } - } -}; - -constexpr Precalc P{}; - -u64 find_factor(u64 n) { - for (u16 p : P.primes) - if (n % p == 0) - return p; - return 1; -} -``` - -352997.656250 - -```c++ -u64 magic[6542]; -magic[n_primes++] = u64(-1) / i + 1; - -u64 find_factor(u64 n) { - for (u64 m : P.magic) - if (m * n < m) - return u64(-1) / m + 1; - return 1; -} -``` - -Except that it is contant, so the speedup should be twice as much. 
- ---- - -```c++ -u64 find_factor(u64 n) { - while (true) { - if (u64 g = gcd(randint(2, n - 1), n); g != 1) - return g; - } -} -``` - 99.292641 25720.164062 almost 15x slower From dd88f5e0bdc1fbf03ac4f62c35ac458b085eeafb Mon Sep 17 00:00:00 2001 From: arnu152 <36503815+arnu152@users.noreply.github.com> Date: Wed, 25 May 2022 17:36:10 +0200 Subject: [PATCH 474/531] Fix a typo in the prefix sum code sample --- content/english/hpc/algorithms/prefix.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/algorithms/prefix.md b/content/english/hpc/algorithms/prefix.md index f07daaf3..81d31900 100644 --- a/content/english/hpc/algorithms/prefix.md +++ b/content/english/hpc/algorithms/prefix.md @@ -76,7 +76,7 @@ v4i prefix(v4i x) { // x = 1, 3, 5, 7 // + 0, 0, 1, 3 // = 1, 3, 6, 10 - return s; + return x; } ``` @@ -91,7 +91,7 @@ v8i prefix(v8i x) { x = _mm256_add_epi32(x, _mm256_slli_si256(x, 8)); x = _mm256_add_epi32(x, _mm256_slli_si256(x, 16)); // <- this does nothing // x = 1, 3, 6, 10, 5, 11, 18, 26 - return s; + return x; } ``` From 88b757a7ceb792b7ca1435a06f4e17617e443381 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 25 May 2022 19:45:20 +0300 Subject: [PATCH 475/531] pollard rho --- .../english/hpc/algorithms/factorization.md | 147 ++++++++---------- content/english/hpc/algorithms/img/rho.jpg | Bin 0 -> 14570 bytes 2 files changed, 67 insertions(+), 80 deletions(-) create mode 100644 content/english/hpc/algorithms/img/rho.jpg diff --git a/content/english/hpc/algorithms/factorization.md b/content/english/hpc/algorithms/factorization.md index 9f7958ed..90a1bf43 100644 --- a/content/english/hpc/algorithms/factorization.md +++ b/content/english/hpc/algorithms/factorization.md @@ -195,7 +195,7 @@ u64 find_factor(u64 n) { This approach lets us process almost 20k 30-bit integers per second, but it does not work for larger (64-bit) numbers unless they have small ($< 2^{16}$) factors. 
Note that this is actually an asymptotic optimization: there are $O(\frac{n}{\ln n})$ primes among the first $n$ numbers, so this algorithm performs $O(\frac{\sqrt n}{\ln \sqrt n})$ operations, while wheel factorization only eliminates a large but fixed fraction of divisors. If we extend it to 64-bit numbers and precompute every prime under $2^{32}$ (storing which would require several hundred megabytes of memory), the relative speedup would grow by a factor of $\frac{\ln \sqrt{n^2}}{\ln \sqrt n} = 2 \cdot \frac{1/2}{1/2} \cdot \frac{\ln n}{\ln n} = 2$.
 
-All variants of trial division, including this one, are bottlenecked by the speed of integer division, which can be [optimized](../hpc/arithmetic/division/) if we know the divisors in advice and allow for some precomputation:
+All variants of trial division, including this one, are bottlenecked by the speed of integer division, which can be [optimized](/hpc/arithmetic/division/) if we know the divisors in advance and allow for some precomputation. In particular, we can use the [Lemire division check](/hpc/arithmetic/division/#lemire-reduction):
 
 ```c++
 // ...precomputation is the same as before,
@@ -212,14 +212,13 @@ u64 find_factor(u64 n) {
 }
 ```
 
-This makes the algorithm ~18x faster: we can now process ~350k 30-bit numbers per second. This is actually the most efficient algorithm we have
-
-
-$\tilde{O}(\sqrt n)$ territory
+This makes the algorithm ~18x faster: we can now process ~350k 30-bit numbers per second. This is actually the most efficient algorithm we have for this number range. While it can probably be optimized even further by performing these checks in parallel with [SIMD](/hpc/simd), we will stop here and consider a different, asymptotically better approach.
 
 ### Pollard's Rho Algorithm
 
----
+
-### Brent's Method
+To construct this sequence, we need a "seemingly random" function that maps the set of remainders modulo $n$ to itself. A typical choice is $f(x) = x^2 + 1 \bmod n$.
-Another idea is to accumulate the product and instead of calculating GCD on each step to calculate it every log n steps.
+Now, consider a graph where each vertex $x$ has an edge pointing to $f(x)$. Such graphs are called *functional*. The "trajectory" of any element (the path we walk starting from that element and following the edges) eventually loops around. This trajectory resembles the Greek letter $\rho$ (rho), which is how the algorithm got its name.
 
-### Optimizing division
+![](../img/rho.jpg)
 
-The next step is to actually apply Montgomery Multiplication.
+Apart from this trick, Pollard's rho algorithm relies on a consequence of the birthday paradox: we need to add $O(\sqrt{n})$ random numbers from $1$ to $n$ to a set until we get a collision.
 
-This is exactly the type of problem when we need specific knowledge, because we have 64-bit modulo by not-compile-constants, and compiler can't really do much to optimize it.
+Now, consider the trajectory of some element $x_0$: {$x_0$, $f(x_0)$, $f(f(x_0))$, $\ldots$}.
 
-...
+Make another sequence out of it by virtually taking each element modulo $p$, the smaller of the prime divisors of $n$.
 
-## Further optimizations
+**Lemma.** The expected length of the cycle in that sequence is $O(\sqrt[4]{n})$.
 
-There also exist [subexponential](https://ru.wikipedia.org/wiki/%D0%A4%D0%B0%D0%BA%D1%82%D0%BE%D1%80%D0%B8%D0%B7%D0%B0%D1%86%D0%B8%D1%8F_%D1%86%D0%B5%D0%BB%D1%8B%D1%85_%D1%87%D0%B8%D1%81%D0%B5%D0%BB#%D0%A1%D1%83%D0%B1%D1%8D%D0%BA%D1%81%D0%BF%D0%BE%D0%BD%D0%B5%D0%BD%D1%86%D0%B8%D0%B0%D0%BB%D1%8C%D0%BD%D1%8B%D0%B5_%D0%B0%D0%BB%D0%B3%D0%BE%D1%80%D0%B8%D1%82%D0%BC%D1%8B), but not polynomial, factorization algorithms. Humanity [is able](https://en.wikipedia.org/wiki/Integer_factorization_records) to factorize numbers on the order of $2^{200}$.
+**Proof.** Each time we walk a new edge, we generate a random number. It has some chance of looping around.
+As $p$ is the smaller divisor, $p \leq \sqrt n$. Now we need to plug it into the [birthday paradox](https://en.wikipedia.org/wiki/Birthday_problem): we need to add $O(\sqrt{p}) = O(\sqrt[4]{n})$ elements to the set to get a collision.
 
----
+Another observation: the lengths of the "tail" and the cycle are equal in expectation, since when we loop around, we land on any vertex of the path we have walked with equal probability.
 
-If you have limited time, you should probably compute as much forward as possible, and then half the time computing the other.
+Now, if we find a cycle in this sequence (that is, $i$ and $j$ such that $f^i(x_0) \equiv f^j(x_0) \pmod p$), we can find some divisor of $n$ using the $\gcd$ trick: $\gcd(|f^i(x_0) - f^j(x_0)|, n)$ is divisible by $p$ and, unless the two values also coincide modulo $n$, less than $n$.
 
-How to optimize for the *average* case is unclear.
+Floyd's cycle-finding algorithm
 
-99.292641
-25720.164062 almost 15x slower
+The algorithm itself just finds a loop in this sequence using Floyd's algorithm, also known as the "tortoise and hare" technique: we maintain two pointers $i$ and $j$ ($i = 2j$) and check that $f^i(x_0) \equiv f^j(x_0) \pmod p$, which is equivalent to checking $\gcd(|f^i(x_0) - f^j(x_0)|, n) \neq 1$.
 
 ```c++
-u64 f(u64 x, u64 a, u64 mod) {
-    return ((u128) x * x + a) % mod;
+u64 f(u64 x, u64 mod) {
+    return ((u128) x * x + 1) % mod;
 }
 
 u64 diff(u64 a, u64 b) {
@@ -315,7 +271,7 @@ u64 diff(u64 a, u64 b) {
     return a > b ? a - b : b - a;
 }
 
-u64 rho(u64 n, u64 x0 = 2, u64 a = 1) {
+u64 find_factor(u64 n) {
     u64 x = x0, y = x0, g = 1;
     while (g == 1) {
         x = f(x, a, n);
@@ -325,16 +281,16 @@ u64 rho(u64 n, u64 x0 = 2, u64 a = 1) {
     }
     return g;
 }
-
-u64 find_factor(u64 n) {
-    return rho(n);
-}
 ```
 
-56.745281
+While it processes only ~25k 30-bit numbers per second, almost 15 times slower than the fastest algorithm we have, it dramatically outperforms every $\tilde{O}(\sqrt n)$ algorithm for 60-bit numbers, processing around 90 of them per second.
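In this revision of the patch, the Floyd-based listing is shown only through diff hunks, and the new `find_factor(u64 n)` signature no longer declares the `x0` and `a` that its body still uses. For reference, here is a minimal self-contained sketch of the same technique. The name `rho_floyd` is hypothetical, the seed `x0` and the constant `a` are explicit parameters, `std::gcd` from C++17 stands in for the article's `gcd`, and the function returns `n` itself when the two pointers meet modulo `n` without revealing a factor.

```c++
#include <cstdint>
#include <numeric>

typedef uint64_t u64;
typedef __uint128_t u128;

// the "seemingly random" map x -> x^2 + a (mod n)
u64 f(u64 x, u64 a, u64 mod) {
    return ((u128) x * x + a) % mod;
}

// absolute difference of two unsigned numbers
u64 diff(u64 a, u64 b) {
    return a > b ? a - b : b - a;
}

// Floyd's variant of Pollard's rho: returns a divisor of n greater than 1,
// which is n itself if this particular run fails (or if n is prime)
u64 rho_floyd(u64 n, u64 x0 = 2, u64 a = 1) {
    u64 x = x0, y = x0, g = 1;
    while (g == 1) {
        x = f(x, a, n);          // tortoise: one step
        y = f(f(y, a, n), a, n); // hare: two steps
        g = std::gcd(diff(x, y), n);
    }
    return g;
}
```

As a small worked example, for $n = 8051 = 83 \cdot 97$ with $x_0 = 2$, the trajectory starts with $5, 26, 677, 7474, 2839, 871$. The third iteration compares $677$ and $871$, whose difference $194 = 2 \cdot 97$ shares the factor $97$ with $n$.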
+
+### Pollard-Brent Algorithm
+
+Floyd's cycle-finding algorithm has a problem in that it performs more iterator increments than necessary. One way to solve it is to remember the values that the faster iterator visits and compute the GCD using the difference of $x_i$ and $x_{\lfloor i / 2 \rfloor}$, but it can also be done without extra memory using this trick:
 
 ```c++
-u64 rho(u64 n, u64 x0 = 2, u64 a = 1) {
+u64 find_factor(u64 n, u64 x0 = 2, u64 a = 1) {
     u64 x = x0, y = x0;
 
     for (int l = 256; l < (1 << 20); l *= 2) {
@@ -350,12 +306,14 @@ u64 rho(u64 n, u64 x0 = 2, u64 a = 1) {
 }
 ```
 
-426.389160
+It actually does *not* improve performance and even makes it ~1.5x *slower*, which probably has something to do with the fact that $x$ is stale. It spends most of the time computing the GCD and not advancing the iterator; in fact, the asymptotic complexity of the algorithm is currently $O(\sqrt[4]{n} \log n)$ because of it.
+
+We can remove the logarithm from the asymptotic complexity using the fact that if one of $a$ and $b$ contains the factor $p$, then $a \cdot b \bmod n$ will also contain it, so instead of computing $\gcd(a, n)$ and $\gcd(b, n)$ separately, we can compute $\gcd(a \cdot b \bmod n, n)$. This way, by batching the GCD computations in groups of $M = O(\log n)$, we remove the $\log n$ factor from the asymptotic complexity:
 
 ```c++
 const int M = 1024;
 
-u64 rho(u64 n, u64 x0 = 2, u64 a = 1) {
+u64 find_factor(u64 n, u64 x0 = 2, u64 a = 1) {
     u64 x = x0, y = x0, p = 1;
 
     for (int l = M; l < (1 << 20); l *= 2) {
@@ -374,7 +332,13 @@ u64 rho(u64 n, u64 x0 = 2, u64 a = 1) {
 }
 ```
 
-2948.260986
+It now works at 425 factorizations per second, bottlenecked by the speed of the modulo operation.
+
+### Optimizing Modulo
+
+The next step is to actually apply [Montgomery multiplication](/hpc/number-theory/montgomery/).
+
+This is exactly the type of problem where we need specific knowledge: the modulus is a 64-bit value that is not a compile-time constant, so the compiler can't really do much to optimize it.
 
 ```c++
 struct Montgomery {
@@ -403,7 +367,7 @@ u64 f(u64 x, u64 a, Montgomery m) {
 
 const int M = 1024;
 
-u64 rho(u64 n, u64 x0 = 2, u64 a = 1) {
+u64 find_factor(u64 n, u64 x0 = 2, u64 a = 1) {
     Montgomery m(n);
     u64 y = x0;
 
@@ -423,15 +387,38 @@ u64 rho(u64 n, u64 x0 = 2, u64 a = 1) {
 }
 ```
 
+It processes around 3000 numbers per second, which is ~3.8 times faster than what the [PARI](https://pari.math.u-bordeaux.fr/) library can do (invoked via [sage](https://doc.sagemath.org/html/en/reference/structure/sage/structure/factorization.html)).
+
+### Further Optimization
+
+There might be a way to .
+
+It may be beneficial to start multiplying only after a certain threshold, since it is unlikely that we enter a cycle at the very beginning.
+
+It may be worth it to run a few versions in parallel and stop whichever finishes first. If we do $p$ independent runs, the first of them is expected to finish $\sqrt p$ times faster. This can be done either with scalar code, taking advantage of there being multiple execution ports for multiplication, or with [SIMD](/hpc/simd) instructions that do 4 or 8 multiplications in parallel.
+
+I would not be surprised to see another 3x improvement and a throughput of 10k/sec.
+
+If you have limited time, you should probably compute as much forward as possible, and then half the time computing the other.
+
+How to optimize for the *average* case is unclear.
+
+### Reducing Errors
+
 There are slightly more errors because we are a bit loose with modular arithmetic here. The error rate grows higher when we increase and decrease (due to overflows).
 
-788.4861246275735
+Our implementation has an error rate of less than 0.7%, but it grows if the numbers are smaller than $10^{18}$.
+
+Since Pollard's rho algorithm is randomized, you need to account for errors. There are several possible sources:
+
+- Factors not being found (you need to perform a primality test and start again if the number turns out to be composite).
+- The `p` variable can get zeroed out (you need to either restart or roll back and redo the batch iteration by iteration).
+- Overflows in Montgomery multiplication (our implementation is pretty loose). ### Larger Numbers "How big are your numbers?" determines the method to use: - - Less than 2^16 or so: Lookup table. - Less than 2^70 or so: Richard Brent's modification of Pollard's rho algorithm. - Less than 10^50: Lenstra elliptic curve factorization diff --git a/content/english/hpc/algorithms/img/rho.jpg b/content/english/hpc/algorithms/img/rho.jpg new file mode 100644 index 0000000000000000000000000000000000000000..d7f01ad81ee48c90ae02e9b248cc880cc3665e9e GIT binary patch literal 14570 zcmc(F1y~$=vTvh?hssp2TOum2oNl|ySqaI!QDN`;1FCwfPvsna2p(gJu&KJw5;G>Z)H={pw-nVFkcakdc=GU|?W?H;*6SVIGhIkPs0; zhzLj^5C|C=2?Y%o9Ss!~jR@xnCN3E<1vwcpDJdl_8v`XZ3k@kLBmYwt4o)5(9twtM zA_82(Y}`Ctzug1|85tQ36^#%bosf%)l#1(L-X1yuY$TW!SR^siZ@660CEFGPkU0mJVJpzM*LqfyCBN7rnd`wFIl#-g8mtRm=R9sT} zxwfvp0o>Ts+|}LF+t)uZI0TuP{5~~3Gds7sw*F&db8CBN_xR-W?EK>L>iXwzdcgp2 ze^Ki{HT!RRVL$2x3l9$m5Bg0n7+BXwfy0JJpyEWtkx&B}**~G?@<+mzjL)h0f=t8x z5{mcMVH^dYmS>gj_&3%5w`RYlSirxf*?%hbfApFI(BNPmjR%Jfhyj5PeIO9_0IXkSW0XdO%$>}gh;dD= z5sf_nxXOe+Z<}n1z2u>n{x3K!6O>pdyFGN~vfcT%NMhsw6Y`>TQm1Uxi+j{@R-gV2 zI_M4A{L+_}mrd5SV#~%U82cNaF=W5pdL4Y)sGiO{W&@XK_Z9{_02;F1*(4v+;wngxNi^DG1< zzUX`l50m3-d_JY3kS2$6tWw7KhDtgVn(OR0G|4nBd=WLP4*-f%?p4gmwRFCn!lm*N z;`8Ka;VyMk=qr;xxMfFaVWS4plLt*dmUUZ%`aG*-_l3R%+{;8ZX% zK~&*S`ar?)qx$Vgl5I*1>8=WW;gWD|L*(nk{kKbs;RnVcToHauGL>>{WQ!Re0~x7d zTaXtfVIXRHh{5G<$vP1BgsDTsjwA@=aIsRdFXx@v(lMS)51s0@fn}Y)|0!-1-1N5Q zV9Ax|i;f8SA2GX-rymdTo<|YEAar@<#JJD*dDx2+*G-ohEwjAj!G>W}gNXRwYTt3DecpcFu z4cLPXQEo}>1sA0ADx$m0-Z$X1wP4FrsQ)p6-tEn?)!gyY}BO8V*QWip+kaLwWM3lzO zN~0NH=DeI21LpIjG2h9Li|#j1&K{RL`u&WR!Jz(zucAoTKzjCF6rX5lv6uod2BM)U z!0UVfx=_omoE5K_dvNBT=WR^za@nvdgHZA_j$<;AA1y)6mts$r9Cemb#~f;1=AtEU zMs*M_yZDVeq>mAVf+JtZS5Y!fJ z&UT`89GW))Tt$HaFt>~i)wQ+r7rJtk^El$d#a(MW-+I=sG#%eK5ee*RFsC2)M^<<} zSq(o=`;_{&oF)Xpx=@=WcdL3O!5d}|;>SL^T7kH`r0W^|L@5>727NMq&@!rulaLEb 
z1~tC#u)gqXaY0HzX;gwTfVV*P5ZbXOk75uLlfQt0%B6v`?PL5F^gA@%xNPS0ptV-RWF8PmuLZFROTHFy9_EE^KGn@^(ZgU2bvT2&y@KYi4vG-Shs zx%01bdvtai${1fYKYe|WfWJS`{qgaon$B%K;LY${Ju9PW{fxC=-5H z6xx3rt1K(*dI8qBY^c%K86mwmDR$e44rDIbmLQV`6o2!fKS!pWBKQ(yUP}|bGxh_O z@Bq8eswf}knqw2YD{jdKYZcWSoh&}mkMggvYlJvqV%EoHU!{WARl zvgLEoeKllB{wfN6#7Z=&I0(5A%Kkn1?XzmAdcX*EuV@vubeBo5! zBo^d_Fr9l$Ro{)0*pFL??oW>o+n$~tytpt~xmF&o zVRfH)Mc6iMSm}v04YDucU1%D4bMz3KR`(tNR7;1etS$}$&9^@@<3f?0ga!!@7$1QC z_BM|LRK4nB{8-ykiwR`bO}*Hor-$W29KzV*#616FOm({pid2JeW*oF=tNxd_+TSHKlGZgU?Dyebw;UK|X~!I*Aeu zQ!*m=2dA zf8B=-oGW*dMd79jkKYBO%I7ce9at%aIa8b-6Yh6k`R$4cp6N)!J=FjLI3$k5v9t_z zc3rRxwO=I8D+kEsOT`U&5Z4&kzZYB8am!(;?(XvF*%}b*E?TI$$sF&n>HO(v5G#&= ztM+e-hGtl+bdj5(T;j{g0%qcj65^=VN)Kv`eRJTLAgt~PbqB{%=R<+wC>Cq zJyXOL87(i0%3K~+={KNi9IJaCs$|HmJz0S7;pH}ZqzCV z2s#x>-Ja`Fu{f|{Boa}D(00YU?FRsGfChW8un;J?^_f+Hr{=`wgCv2iZ^g30X2_ zCYI{#=_nNz88f6_hQCf+QDSa#O;w|gVx2)6a3 zu%B!d^Y7)=or3Bt4O|?yc~_c1hUMm4625U#bq*efylynxZYmT{-^np@=dh3Dd_`*^nyX=Ly1|fWOjU~Os4I3*Tq*P&u0ghX1if7G=YwieLR72N9{@LF!o5sJI7Pi_ zo+Av4C%dTyU07M}TF zYgzeyJo&qQn!70>zcc!lpC=DMwBYxuP@V%+Z5bM^x(*y|g#K@hJ*CV0cme`jv>LYE z%Q2u{1FA^r?{(W1^i2ccR1lpaOh!4T@XBdcm!D7-#e&6DJAu11x{fzDPSNZ}z)rRW z{pc_e)r13>%HI^mVLs93ylPdbvFHH^3a+TSpRMS3Sv&00m;MnlC6N)ILw6zL+e$!t zh=5h~A;S&5H9=+p4b1`b+15#+3W<^}i#$)9P>y;MM?qZtWyp$*tbqi1_85qrBGR_< zeJii4b)L-lp1e)}*Oocep%9tZU1S?H@?>E5H-4~5#&zcQOiD!*Wps`M^jfOa%n&b;D6&MQK1lnL{$%tSXyFS4WLCPwJ%neh64Zr z0E6@8FNFz^$a+JoA&gaF=Kx!KFMIOFmWxYPON`@A%=xi`G|NOeW6VjdH4>s$)(em6 zc^`sfX>i1TW#xf2JL@{Q$5&)Oc_4Op7qNEz<7HLViq^vz+MCHMub~Z8hVdsz)(}w?4_DNsr9boE~laFG-`#J)KyvbZdtr-xdeT(gJb`vj@GdFT_4kIjNJ&v zY}j9w8f&jt1qN=Fm))ug2PLzTwaS^=D5a9|r%ntQZ6cOa1(g=5!c+E^1l99{o*V&- zq^HaY0N_J&T2_W@MI%>J9DjcbGx#$V@cs?!?K^NQO(Q#9lQepzr;2EGoP=6Z(y1O( zTSZM%J^skaT~RTpM%-I%>XN}B1!&v^A#C9ogH8Z1lV^EKYi=dD50Ox+ryW1qNg`P%F$PlDNo2$OeW9Z$2{tVTDmjW z(PKzrsAH+BkDLh+z@tO|v>pjxJgW57F^2m*8&f8?`c}tg9eP_kYk6Xk*tNSyEfc$W z8q17MlHNom^Zng|Cd#sN$u@W{uJ(I9(U11YLViMw)uXe{*0 
z*D)zUYSNB7Yd6RpH8_Mt?&r~cOQd}jHJqEG+5yd#5_hxSL#~w0n|Xol0;w^wsG!)&as9S`(&p65*a~;v^M2{g#|jHkthI}L7jieLr<;D1 zUL6#Kd=m0f9)7({dJ}N|p#F^4nvd)y8JoW3I4^Rdmqs*KxW4G2=(R-WnnrLQtwkAf+qU7h|_tzvUr+rfoZ|$n-D#z9kDVJnhTzp(l0J&^4=W?g!9p!BK)Ym2X#zw_( z?TCSbPl9kHR8d7B7IEGgHF2V!^s%~$2pNH z^0Kvd-7xh1tyIyrOSZri8~fDM;0Gw&DV5QpWTj8zWj2P{Dn)ObtKso$55zWVebjnK z_%4p9s8X%KC|Hh)>Gbu~%hXfMP(u*QEoNy$g<~<9SW>`HOFJz+v(41iD`dhJSxet(R0uW#np` z%Io)QkSY0U!GuydxRY!QtJQ4JG!?Af%GHiK-j)&?O5b5#YF>(~4VUxb+h>rD8XvN4 z4H+@k-rPaz%$((|_xCJT#s4I2c}nyVw-yT*Ex{87SQXY=hQ_%qcQ4_-;B9^3Y%LB4 z5%@;^s$VAVkzA`^(gZb8v^i%c3dWnel#A~3p5?@m{bI?;} z?4Emq2Yk|A$M&^EtYYv!08xrjyx@?1dC_TpidEwKI+#W3QtI3&rYJxNMdzO+gN@7K z9?`uW!F?a!^jHcm>=o(8Av15<9rRPeFF4M%Fs-bBda?-2(*34Hi5pIg2UZqeaP^R` zZ-|4Gk+U&|UrZdFWw1>s*XuW2)Ua^IN~`&66#qSgEx?YQxH=k6Ob`YdHZ6S$u~l}% zP~Ikt^7VB1aY5ZO*K_RR*UkW~%*juhr5)cS65C2IJEieKgDJg7JciGtg;vxrLfmzN z=qjpmk4TR*Dv<{J`?}L9Ru zem-1^3fHhmVkYWfbwa=i)(`)`y6o>9qun`b1WJxo)$3-R^Ti2&6|v*?KlJe74Tdhu zW+_0Oo){G0k5A@%OG1Oca1FV>xr2d&mWqlQ9vtY3Lmi^W4fVV$Z{J{Fn9IGzycw<5 zba|XjQTW}|hk101=8GXtf^SovTSkuqj$QRK+J=;#jJ?*2f;79@Tq14+m#9=4UQ?oCztVUXGMgJ~5zLX_TaCO6_rOy-))p-IgVf~?@*6g$4cv$PXkK-+(V6ljUH4}hA8S^< zb+$A3%gjyH~nDBhwnqg;Btu@1b>aCk(JjA?NTT50!oeu_aI&FSkut6AEeq$C#!Hr1Hjb1vb*)ye8d( zPY@&?I9_UE2yG=QUe*nmHWf}&hodWWtOv-hDerM%qE~taFQbHI*Jhtw8rWW_`@iA- z(dFX;8qXiRWo|7uL^kKVGVAhq7frq}*IBxs2a{FSATpD9HTC@;X4-ZJJfEAP_cfrH zx=*RH4H6g???dlJS-CShU7dDCYDWnxr@DU5zp4)S`OP$SZWTU$por>2RN$*PB86&r z>;O!U)5w=TL~haVX$he*WzTMyFD5DpfAo3N7>`|uWdI)1xXzDJj(M2Zn_nuihd63< z)vC#UTod;=HB_?l0N%g4KpuhS}m?(N>a#Z5oB;ryE8>!v&h9Q-|K z{}FLnYU%MYGq!NSlHjx)zhT=zo+KOY_ADDMi^wfQm_3&yY?WUyxI5C5MEYjp>jKlA zP?pgPsAh;^x5x9W_h|=$^=E~9s!a@{CKZ#I(r|tvZF>Y=I?h|+^)>6!4vRTMxZ~yd zdsZ-eG#C(p+fWMSpMRNl>Yun}&)pupg`8$YMva6$CSYcDWm>)zF8ol;$ErIU~k*p+$Q?a;Jj$u)2!4V z_QEyQx!|WhszB~*m=N(Vr0Hffv?mDO&mM#8>{U^Bax?F>DY)qcfmA>cmK0_VFztUU za>Yy#M$YPR`U*2QG6kM9-_&3OdnklKKy~IIQ%n%}DX+y90-}tH&;dF-cm8o_1W!dB zvR_}pwyrcuseNFBl^%Y1y*m_62S* z4)fYPJBl2J{A10^Q6xFhab@Oei 
zF4X%&L3u^GV~Jc5xmE1-rdY7&x#)+mi=|YOPD76z9Q+Zv)w&vtJmNgZpq5U8YpvMt!p%bk}@p{V^K^Km2rt^N>qr zRo4UmBSvucjD@OiS)D%>XxhBbm%M(+&F=xQ@lt~xh81yG*o#&<@R`^=0Gddn;tpE( zzmmzS8qDhKYP_6Xn(w!_<~m4tO5jj^MqQNa$6ZAah8~4(rY;T6&GjvOew=4ptd~WJ z?A+!+U<)rT1n16KTG)zwUZ8&f=vVm$GQWygNv1Hyb*)k7Div86#bLf5gWE2*MUBNm z#{-2&r%YZptea)A+xQr?9qXd(_6AhbDISCb>S6v=&yG7U5)(**%*rDpCOq>@v+3T)@crpgL5>@>;eo)d#9B-Jcjxbh zi+-H8W|?Iu6n97`_H!+Qz;o$nJZ#Cw{M_FURKWTxV;6X#>#P=DLMw}g8Z4z#Q{To9 z1rQ}U0*~K<*2M`8;J#Hlxxbv;Qp_yqo)gN)Z2aSuZl0ju+3%Hnu z5k6n%c({jd`rf*wZvXn&Ww}16tJyH*(D~Wf%!bUQ4f@wz2)_Q0#FLo7k~NHfsog1Ksf%YYp8Lq{IE)PK79Q_UnEe0riRP?ce-8Puf zT|_DfQz9OqNkI`bF+(egleII4^>|vmLpj7%JxKH_Q(EwEr)MSYkk8Hva{T}YZfR+I zM}&NIk+0+3IB0bV9O@J=tzw(5GI9s)ta*^wg{9mu&^%0}$GzuX1;H)T`w8i6wMwsB zqjn>_ZExl>7`!hXntf3_$?Q6-Z6LUSX%s){-DvYHw$9%);c#nv{L>f*#6sd%t2RC%X9ph1W?H5 ztykZ1Ua_-84Ke!8Q|51P z#^)jjLxzDwd4eJ9aWWdVOp?`=dYeigZG`Z>w|%)zNr)_>$0QpyM!f&o-h9OQ6=ld2 ze83XpS}-Dxz$8=g8Q@ffdX)jy)tDAmv9EM6GL%m0fdFOG5|&Q%j6gYQZtID0$l;Jl=nOXz+=DBBaqU1b&zJ zUF{lvoE&eohWpwZv5g3dr=WINif;|Aj%5$Pn=2BX$-q)!zSxagr&B^}m0P!ntS%kR z3yL6~XpVK~`h5E)I3nvj`~GPjnA+bNfB09MD2QYaKxCaa*ScE3p<*85S(p$iXJp4$qo7%2>2)0ZSud6@*~iC3z`JO9xn`a2sfZPuBciB*tm z6yx(KL53lIvpLJu*5;6Kv`1Puap*YBe6T3A2e|O%)u|C?YV;U&DyfrU2NI+7vCxns$YQP zw7=7Ffv5tn-%B3IyugxJjgjqHCgu0LW#dJnls{tlAOS0XDTO8f?ob>q`y6k4CzUlR zeltxyBR?5tqp2xjO$JUt|Ad=wJc-KDPdki=L4X3MmBD!J2(+MU@NMb=FddnE730EH zvMUtz{h)Za`7y|MFxmyL>uR4ZZ_N;4+cfL)XG)v1-bvE7VU~5T#~8@CXtuw|BB_C+ zv|M8>pAq*BkPQ<%RQ@M1y#Fo*fY%-UGr6x&>}jCfxZln|l7-mShsw1^6#%~D7!=Jr zPK3CnenZ{ji=&$@F5BR$3Ud7uBGkVc(f;f82-cD(B0?{k+fzQiZ3y0U=m!ws)9=-X zZ&~V+rdRvoPMZ=u8|O|mX-HBvxXezBWP625tnW#8Yji7bB6}4X69pC^&HcBZ?$sw} z=46eE=Cyl(7^EaxxNH=FV#tr>m;CtmeEZM&4yNl0`j~`=dlBR$*)H#foPT!=QFI$V zjd{f8%H2lCO-nK24_<}K^5fx`I3o-Xz!Ngpe;v90R3#Oq0lRkidEYf#;=MP|PU|_} zM$K+mucP)iY}$NW_|D2KS2J!+JgpKq7+m0<;#!My*!yWCTsXn<9H7IkQ-xJ~yuttf J1JJ|l{{rESw}}7% literal 0 HcmV?d00001 From e796f1669013f46eb3e7d31db9f227534b78a2ed Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 25 May 2022 
19:50:36 +0300 Subject: [PATCH 476/531] pollard refactor --- .../english/hpc/algorithms/factorization.md | 33 ++++++++++--------- 1 file changed, 18 insertions(+), 15 deletions(-) diff --git a/content/english/hpc/algorithms/factorization.md b/content/english/hpc/algorithms/factorization.md index 90a1bf43..18d46824 100644 --- a/content/english/hpc/algorithms/factorization.md +++ b/content/english/hpc/algorithms/factorization.md @@ -271,12 +271,13 @@ u64 diff(u64 a, u64 b) { return a > b ? a - b : b - a; } +const u64 SEED = 42; + u64 find_factor(u64 n) { - u64 x = x0, y = x0, g = 1; + u64 x = SEED, y = SEED, g = 1; while (g == 1) { - x = f(x, a, n); - y = f(y, a, n); - y = f(y, a, n); + x = f(f(x, n), n); // advance x twice + y = f(y, n); // advance y once g = gcd(diff(x, y)); } return g; @@ -290,13 +291,13 @@ While it processes 25k 30-bit numbers — almost 15 times slower than the fastes Floyd's cycle-finding algorithm has a problem in that it does more iterator increments than necessary. 
One way to solve it is to memorize the values that the faster iterator visits and compute the gcd using the difference of $x_i$ and $x_{\lfloor i / 2 \rfloor}$, but it can also be done without extra memory using this trick: ```c++ -u64 find_factor(u64 n, u64 x0 = 2, u64 a = 1) { - u64 x = x0, y = x0; +u64 find_factor(u64 n) { + u64 x = SEED; for (int l = 256; l < (1 << 20); l *= 2) { - x = y; + u64 y = x; for (int i = 0; i < l; i++) { - y = f(y, a, n); + x = f(x, n); if (u64 g = gcd(diff(x, y), n); g != 1) return g; } @@ -313,14 +314,14 @@ We can remove the logarithm from the asymptotic using the fact that if one of $a ```c++ const int M = 1024; -u64 find_factor(u64 n, u64 x0 = 2, u64 a = 1) { - u64 x = x0, y = x0, p = 1; +u64 find_factor(u64 n) { + u64 x = SEED; for (int l = M; l < (1 << 20); l *= 2) { - x = y; + u64 y = x, p = 1; for (int i = 0; i < l; i += M) { for (int j = 0; j < M; j++) { - y = f(y, a, n); + y = f(y, n); p = (u128) p * diff(x, y) % n; } if (u64 g = gcd(p, n); g != 1) @@ -340,6 +341,8 @@ The next step is to actually apply [Montgomery Multiplication](/hpc/number-theor This is exactly the type of problem when we need specific knowledge, because we have 64-bit modulo by not-compile-constants, and compiler can't really do much to optimize it. +We do not need to convert numbers out of Montgomery representation before computing the GCD. 
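
A short justification of this claim, as a sketch under the usual assumptions: the Montgomery form of $x$ is $\bar{x} = x \cdot R \bmod n$ with $R = 2^{64}$, and $n$ is odd. The symbols $\bar{x}$ and $R$ are notation introduced here for illustration and are not taken from the code:

$$
\gcd(R, n) = 1 \implies \gcd(\bar{x}, n) = \gcd(x \cdot R \bmod n, \; n) = \gcd(x \cdot R, \; n) = \gcd(x, n)
$$

The same argument should cover the stray factors of $R^{-1}$ that Montgomery multiplication introduces, since $R^{-1}$ is also coprime with $n$.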
+ ```c++ struct Montgomery { u64 n, nr; @@ -369,13 +372,13 @@ const int M = 1024; u64 find_factor(u64 n, u64 x0 = 2, u64 a = 1) { Montgomery m(n); - u64 y = x0; + u64 x = SEED; for (int l = M; l < (1 << 20); l *= 2) { - u64 x = y, p = 1; + u64 y = x, p = 1; for (int i = 0; i < l; i += M) { for (int j = 0; j < M; j++) { - y = f(y, a, m); + x = f(x, m); p = m.multiply(p, diff(x, y)); } if (u64 g = gcd(p, n); g != 1) From 54fe1ba3afb88fdd5b2ec9a041c74be42469afb5 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 25 May 2022 20:23:30 +0300 Subject: [PATCH 477/531] trial division edits --- .../english/hpc/algorithms/factorization.md | 50 +++++++++++-------- 1 file changed, 30 insertions(+), 20 deletions(-) diff --git a/content/english/hpc/algorithms/factorization.md b/content/english/hpc/algorithms/factorization.md index 18d46824..9e886375 100644 --- a/content/english/hpc/algorithms/factorization.md +++ b/content/english/hpc/algorithms/factorization.md @@ -6,7 +6,7 @@ draft: true The problem of factoring integers into primes is central to computational [number theory](/hpc/number-theory/). It has been [studied](https://www.cs.purdue.edu/homes/ssw/chapter3.pdf) since at least the 3rd century BC, and [many methods](https://en.wikipedia.org/wiki/Category:Integer_factorization_algorithms) have been developed that are efficient for different inputs. -In this case study, we specifically consider the factorization of *word-sized* integers: those on the order of $10^9$ and $10^{18}$. Untypical for this book, in this one, you may actually learn an asymptotically better algorithm: we start with a few basic approaches, and then gradually build up to the $O(\sqrt[4]{n})$-time *Pollard's rho algorithm* and optimize it to the point where it can factorize 60-bit semiprimes in 0.3-0.4ms, which is almost 4x faster than the previous state-of-the-art. +In this case study, we specifically consider the factorization of *word-sized* integers: those on the order of $10^9$ and $10^{18}$. 
Untypical for this book, in this one, you may actually learn an asymptotically better algorithm: we start with a few basic approaches and gradually build up to the $O(\sqrt[4]{n})$-time *Pollard's rho algorithm* and optimize it to the point where it can factorize 60-bit semiprimes in 0.3-0.4ms and almost 4 times faster than the previous state-of-the-art. -The most basic approach is to try every number less than $n$ as a divosor: +The most basic approach is to try every integer smaller than $n$ as a divisor: ```c++ u64 find_factor(u64 n) { @@ -68,7 +68,7 @@ u64 find_factor(u64 n) { } ``` -One simple optimization is to notice that it is enough to only check divisors that do not exceed $\sqrt n$. This works because if $n$ is divided by $d > \sqrt n$, then it is also divided by $\frac{n}{d} < \sqrt n$, so we can don't have to check it separately. +We can notice that if $n$ is divided by $d < \sqrt n$, then it is also divided by $\frac{n}{d} > \sqrt n$, and there is no need to check for it separately. This lets us stop trial division early and only check for potential divisors that do not exceed $\sqrt n$: ```c++ u64 find_factor(u64 n) { @@ -79,13 +79,13 @@ u64 find_factor(u64 n) { } ``` -In our benchmark, $n$ is a semiprime, and we always find the lesser divisor, so both $O(n)$ and $O(\sqrt n)$ implementations perform the same and are able to factorize ~2k 30-bit numbers per second, while taking whole ~20 seconds to factorize a single 60-bit number. +In our benchmark, $n$ is a semiprime, and we always find the lesser divisor, so both $O(n)$ and $O(\sqrt n)$ implementations perform the same and are able to factorize ~2k 30-bit numbers per second — while taking whole 20 seconds to factorize a single 60-bit number. ### Lookup Table Nowadays, you can type `factor 57` in your Linux terminal or Google search bar to get the factorization of any number. 
But before computers were invented, it was more practical to use *factorization tables:* special books containing factorizations of the first $N$ numbers. -We can also use this approach to compute these lookup tables [during compile time](/hpc/compilation/precalc/). To save space, it is convenient to only store the smallest divisor of a number, requiring just one byte for a 16-bit integer: +We can also use this approach to compute these lookup tables [during compile time](/hpc/compilation/precalc/). To save space, we can store only the smallest divisor of a number. Since the smallest divisor does not exceed the $\sqrt n$, we need just one byte per a 16-bit integer: ```c++ template @@ -109,13 +109,13 @@ u64 find_factor(u64 n) { } ``` -This approach can process 3M 16-bit integers per second, although it [probably gets slower](../hpc/cpu-cache/bandwidth/) for larger numbers. While it requires just a few milliseconds and 64KB of memory to calculate and store the divisors of the first $2^{16}$ numbers, it does not scale well for larger inputs. +With this approach, we can process 3M 16-bit integers per second, although it would probably [get slower](../hpc/cpu-cache/bandwidth/) for larger numbers. While it requires just a few milliseconds and 64KB of memory to calculate and store the divisors of the first $2^{16}$ numbers, it does not scale well for larger inputs. ### Wheel factorization -To save paper space, pre-computer era factorization tables typically excluded numbers divisible by 2 and 5: in decimal numeral system, you can quickly determine whether a number is divisible by 2 or 5 (by looking at its last digit) and keep dividing the number $n$ by 2 or 5 while it is possible, eventually arriving to some entry in the factorization table. This makes the factorization table just ½ × ⅘ = 0.4 its original size. 
+To save paper space, pre-computer era factorization tables typically excluded numbers divisible by $2$ and $5$, making the factorization table ½ × ⅘ = 0.4 of its original size. In the decimal numeral system, you can quickly determine whether a number is divisible by $2$ or $5$ (by looking at its last digit) and keep dividing the number $n$ by $2$ or $5$ while it is possible, eventually arriving at some entry in the factorization table. -We can apply a similar trick to trial division, first checking if the number is divisible by $2$, and then only check for odd divisors: +We can apply a similar trick to trial division by first checking if the number is divisible by $2$ and then only considering odd divisors: ```c++ u64 find_factor(u64 n) { @@ -128,9 +128,11 @@ u64 find_factor(u64 n) { } ``` -With 50% fewer divisions to do, this algorithm works twice as fast, but it can be extended. If the number is not divisible by $3$, we can also ignore all multiples of $3$, and the same goes for all other divisors. +With 50% fewer divisions to perform, this algorithm works twice as fast. -The problem is, as we increase the number of primes to exclude, it becomes less straightforward to iterate only over the numbers not divisible by them as they follow an irregular pattern — unless the number of primes is small. For example, if we consider $2$, $3$, and $5$, then, among the first $90$ numbers, we only need to check: +This method can be extended: if the number is not divisible by $3$, we can also ignore all multiples of $3$, and the same goes for all other divisors. The problem is, as we increase the number of primes to exclude, it becomes less straightforward to iterate only over the numbers not divisible by them as they follow an irregular pattern — unless the number of primes is small. 
+ +For example, if we consider $2$, $3$, and $5$, then, among the first $90$ numbers, we only need to check: ```center (1,) 7, 11, 13, 17, 19, 23, 29, @@ -138,7 +140,7 @@ The problem is, as we increase the number of primes to exclude, it becomes less 61, 67, 71, 73, 77, 79, 83, 89… ``` -You can notice a pattern: the sequence repeats itself every $30$ numbers because remainder modulo $2 \times 3 \times 5 = 30$ is all we need to determine whether a number is divisible by $2$, $3$, or $5$. This means that we only need to check $8$ specific numbers in every $30$, proportionally improving the performance: +You can notice a pattern: the sequence repeats itself every $30$ numbers. This is not surprising since the remainder modulo $2 \times 3 \times 5 = 30$ is all we need to determine whether a number is divisible by $2$, $3$, or $5$. This means that we only need to check $8$ numbers with specific remainders out of every $30$, proportionally improving the performance: ```c++ u64 find_factor(u64 n) { @@ -157,11 +159,11 @@ u64 find_factor(u64 n) { } ``` -As expected, it works $\frac{30}{8} = 3.75$ times faster than the naive trial division, processing about 7.6k 30-bit numbers per second. The performance can be improved by considering more primes, but the returns are diminishing: adding a new prime $p$ reduces the number of iterations by $\frac{1}{p}$, but increases the size of the skip-list by a factor of $p$, requiring proportionally more memory. +As expected, it works $\frac{30}{8} = 3.75$ times faster than the naive trial division, processing about 7.6k 30-bit numbers per second. The performance can be improved further by considering more primes, but the returns are diminishing: adding a new prime $p$ reduces the number of iterations by $\frac{1}{p}$ but increases the size of the skip-list by a factor of $p$, requiring proportionally more memory. 
### Precomputed Primes -If we keep increasing the number of primes we exclude in wheel factorization, we eventually exclude all composite numbers and only check for prime factors. In this case, we don't need this array of offsets, but we need to precompute primes, which we can do during compile time like this: +If we keep increasing the number of primes in wheel factorization, we eventually exclude all composite numbers and only check for prime factors. In this case, we don't need this array of offsets but just the array of primes: ```c++ const int N = (1 << 16); @@ -193,9 +195,11 @@ u64 find_factor(u64 n) { } ``` -This approach lets us process almost 20k 30-bit integers per second, but it does not work for larger (64-bit) numbers unless they have small ($< 2^{16}$) factors. Note that this is actually an asymptotic optimization: there are $O(\frac{n}{\ln n})$ primes among the first $n$ numbers, so this algorithm performs $O(\frac{\sqrt n}{\ln \sqrt n})$ operations, while wheel factorization only eliminates a large but fixed fraction of divisors. If we extend it to 64-bit numbers and precompute every prime under $2^{32}$ (storing which would require several hundred megabytes of memory), the relative speedup would grow by a factor of $\frac{\ln \sqrt{n^2}}{\ln \sqrt n} = 2 \cdot \frac{1/2}{1/2} \cdot \frac{\ln n}{\ln n} = 2$. +This approach lets us process almost 20k 30-bit integers per second, but it does not work for larger (64-bit) numbers unless they have small ($< 2^{16}$) factors. + +Note that this is actually an asymptotic optimization: there are $O(\frac{n}{\ln n})$ primes among the first $n$ numbers, so this algorithm performs $O(\frac{\sqrt n}{\ln \sqrt n})$ operations, while wheel factorization only eliminates a large but constant fraction of divisors. 
If we extend it to 64-bit numbers and precompute every prime under $2^{32}$ (storing which would require several hundred megabytes of memory), the relative speedup would grow by a factor of $\frac{\ln \sqrt{n^2}}{\ln \sqrt n} = 2 \cdot \frac{1/2}{1/2} \cdot \frac{\ln n}{\ln n} = 2$. -All variants of trial division, including this one, are bottlenecked by the speed of integer division, which can be [optimized](/hpc/arithmetic/division/) if we know the divisors in advice and allow for some precomputation. In particular, we can use [Lemire division check](/hpc/arithmetic/division/#lemire-reduction): +All variants of trial division, including this one, are bottlenecked by the speed of integer division, which can be [optimized](/hpc/arithmetic/division/) if we know the divisors in advance and allow for some additional precomputation. In our case, it is suitable to use [the Lemire division check](/hpc/arithmetic/division/#lemire-reduction): ```c++ // ...precomputation is the same as before, @@ -212,7 +216,7 @@ u64 find_factor(u64 n) { } ``` -This makes the algorithm ~18x faster: we can now process ~350k 30-bit numbers per second. This is actually the most efficient algorithm we have for this number range. While it can probably be even further optimized by performing these checks in parallel with [SIMD](/hpc/simd), we will stop there and consider a different, asymptotically better approach. +This makes the algorithm ~18x faster: we can now factorize **~350k** 30-bit numbers per second, which is actually the most efficient algorithm we have for this number range. While it can probably be optimized even further by performing these checks in parallel with [SIMD](/hpc/simd), we will stop there and try a different, asymptotically better approach. ### Pollard's Rho Algorithm @@ -235,6 +239,8 @@ By itself, this algorithm is just an esoteric way of computing factorization, bu --> +Pollard's rho algorithm is a randomized $O(\sqrt[4]{n})$ integer factorization algorithm.
+ To construct this sequence, we need a "seemingly random" function that maps the remainders of $n$. Typical choice is $f(x) = (x + 1)^2 \mod n$. Now, consider a graph where each vertex $x$ has an edge pointing to $f(x)$. Such graphs are called *functional*. The "trajectory" of any element — the path we walk starting from that element and following edges — eventually loop around. This trajectory resembles the greek letter $\rho$ (rho), which is why the algorithm is named so. @@ -427,3 +433,7 @@ Since Pollard's rho algorithm is randomized, you need to account for errors. The - Less than 10^50: Lenstra elliptic curve factorization - Less than 10^100: Quadratic Sieve - More than 10^100: General Number Field Sieve + +Requiring about 100KB of memory. + +6542 * 8 From 002b4aece30f3a63c2dc06a8a1b016afb55c4904 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 25 May 2022 21:46:07 +0300 Subject: [PATCH 478/531] pollard rho description --- .../english/hpc/algorithms/factorization.md | 47 +++++++++++++------ 1 file changed, 32 insertions(+), 15 deletions(-) diff --git a/content/english/hpc/algorithms/factorization.md b/content/english/hpc/algorithms/factorization.md index 9e886375..d44ca6af 100644 --- a/content/english/hpc/algorithms/factorization.md +++ b/content/english/hpc/algorithms/factorization.md @@ -237,35 +237,49 @@ It also searches for a factor, but it does so by repeatedly trying to compute th By itself, this algorithm is just an esoteric way of computing factorization, but can be made useful. If, instead of random numbers, we apply this $\gcd$ trick to a particular number sequence, we get a $O(n^\frac{1}{4})$ approach known as Pollard's rho algorithm. +Apart from this trick, Pollard's rho algorithm relies on a consequence from the Birthday paradox: we need to add $O(\sqrt{n})$ random numbers from $1$ to $n$ to a set until we get a collision. + --> -Pollard's rho algorithm is a randomized $O(\sqrt[4]{n})$ integer factorization algorithm. 
+Pollard's rho algorithm is a randomized $O(\sqrt[4]{n})$ integer factorization algorithm that makes use of the [birthday paradox](https://en.wikipedia.org/wiki/Birthday_problem): one only needs to draw $\Theta(\sqrt{n})$ random numbers between $1$ and $n$ to get a collision with high probability. -To construct this sequence, we need a "seemingly random" function that maps the remainders of $n$. Typical choice is $f(x) = (x + 1)^2 \mod n$. +Consider some function $f(x)$ that takes a remainder $x \in [0, n)$ and maps it to some other remainder of $n$ in a way that that seems random from the number theory point of view. Specifically, we will use $f(x) = x^2 + 1 \bmod n$, which is random enough for our purposes. -Now, consider a graph where each vertex $x$ has an edge pointing to $f(x)$. Such graphs are called *functional*. The "trajectory" of any element — the path we walk starting from that element and following edges — eventually loop around. This trajectory resembles the greek letter $\rho$ (rho), which is why the algorithm is named so. +Now, consider a graph where each number-vertex $x$ has an edge pointing to $f(x)$. Such graphs are called *functional*. In functional graphs, the "trajectory" of any element — the path we walk if we start from that element and keep following the edges — is a path that eventually loops around (because the set of vertices is limited, and at some point we have to go to a vertex we have already visited). -![](../img/rho.jpg) +![The trajectory of an element resembles the greek letter ρ (rho), which is what the algorithm is named after](../img/rho.jpg) -Apart from this trick, Pollard's rho algorithm relies on a consequence from the Birthday paradox: we need to add $O(\sqrt{n})$ random numbers from $1$ to $n$ to a set until we get a collision. +Consider a trajectory of some particular element $x_0$: -Now, consider a trajectory of some element $x_0$: {$x_0$, $f(x_0)$, $f(f(x_0))$, $\ldots$}. 
+$$ +x_0, \; f(x_0), \; f(f(x_0)), \; \ldots +$$ -Make another sequence out of it, virtually taking each element modulo $p$, the lesser of prime divisors of $n$. +Now, let's make another sequence out of this one by reducing each element modulo $p$, the smallest prime divisor of $n$. -**Lemma.** The expected length in that sequence is $O(\sqrt[4]{n})$. +**Lemma.** The expected length of that sequence before it turns into a cycle is $O(\sqrt[4]{n})$. -**Proof.** Each time we walk a new edge, we generate a random number. It has some chance if looping around. +**Proof:** Since $p$ is the smallest divisor, $p \leq \sqrt n$. Each time we follow a new edge, we essentially generate a random number between $0$ and $p$ (we treat $f$ as a "deterministically-random" function). The birthday paradox states that we only need to generate $O(\sqrt p) = O(\sqrt[4]{n})$ numbers until we get a collision and thus enter a loop. -As $p$ is the lesser divisor, $p \leq \sqrt n$. Now we need to plug it into the [Birthday paradox](https://en.wikipedia.org/wiki/Birthday_problem): we need to add $O(\sqrt{p}) = O(\sqrt[4]{n})$ elements to the set to get a collision, which means that the. +Since we don't know $p$, this mod-$p$ sequence is only imaginary, but if we find a cycle in it, that is, $i$ and $j$ such that -Another observation: the length of the "tail" and the cycle is equal in expectation, since when we loop around, we choose any vertex of the path we walked independently. +$$ +f^i(x_0) \equiv f^j(x_0) \pmod p +$$ + +then we can also find $p$ itself as + +$$ +p = \gcd(|f^i(x_0) - f^j(x_0)|, n) +$$ -Now, if we find a cycle in this sequence — $i$ and $j$ such that $f^i(x_0) \equiv f^j(x_0) \pmod p$ — we can find some divisor of $n$ using the $\gcd$ trick: $\gcd(|f^i(x_0) - f^j(x_0)|, n)$ would be less than $n$ and divisible by $p$.
+The algorithm itself just finds this cycle and $p$ using this GCD trick and Floyd's "[tortoise and hare](https://en.wikipedia.org/wiki/Cycle_detection#Floyd's_tortoise_and_hare)" algorithm: we maintain two pointers $i$ and $j = 2i$ and check that -Floyd's cycle-finding algorithm +$$ +\gcd(|f^i(x_0) - f^j(x_0)|, n) \neq 1 +$$ -The algorithm itself just finds a loop in this sequence using the Ford algorithms, also known as the "hare and turtle" technique: we maintain two pointers $i$ and $j$ ($i = 2j$) and check that $f^i(x_0) \equiv f^j(x_0) \pmod p$, which is equivalent to checking $\gcd(|f^i(x_0) - f^j(x_0)|, n) \neq 1$. +which is equivalent to comparing $f^i(x_0)$ and $f^j(x_0)$ modulo $p$. Since $j$ (hare) is increasing at twice the rate of $i$ (tortoise), their difference is increasing by $1$ each iteration and eventually will become equal to (or a multiple of) the cycle length, with $i$ and $j$ pointing to the same elements. And as we proved half a page ago, reaching a cycle would only require $O(\sqrt[4]{n})$ iterations: ```c++ u64 f(u64 x, u64 mod) { @@ -290,7 +304,7 @@ u64 find_factor(u64 n) { } ``` -While it processes 25k 30-bit numbers — almost 15 times slower than the fastest algorithm we have — it drammatically outperforms every $\tilde{O}(\sqrt n)$ algorithm for 60-bit numbers, processing around 90 of them per second. +While it processes only ~25k 30-bit integers per second (almost 15 times slower than the fastest algorithm we have), it dramatically outperforms every $\tilde{O}(\sqrt n)$ algorithm for 60-bit numbers, factorizing around 90 of them per second. ### Pollard-Brent Algorithm @@ -412,6 +426,9 @@ If you have limited time, you should probably compute as much forward as possibl How to optimize for the *average* case is unclear. +Another observation: the lengths of the "tail" and the cycle are equal in expectation, since when we loop around, we choose any vertex of the path we walked independently.
+ + ### Reducing Errors There are slightly more errors because we are a bit loose with modular arithmetic here. The error rate grows higher when we increase and decrease (due to overflows). From 428407e09d0461d13b55a5bae555d98bea75d320 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 26 May 2022 15:32:28 +0300 Subject: [PATCH 479/531] factorization edits --- .../english/hpc/algorithms/factorization.md | 76 ++++++++----------- 1 file changed, 33 insertions(+), 43 deletions(-) diff --git a/content/english/hpc/algorithms/factorization.md b/content/english/hpc/algorithms/factorization.md index d44ca6af..fd61d441 100644 --- a/content/english/hpc/algorithms/factorization.md +++ b/content/english/hpc/algorithms/factorization.md @@ -1,7 +1,6 @@ --- title: Integer Factorization weight: 3 -draft: true --- The problem of factoring integers into primes is central to computational [number theory](/hpc/number-theory/). It has been [studied](https://www.cs.purdue.edu/homes/ssw/chapter3.pdf) since at least the 3rd century BC, and [many methods](https://en.wikipedia.org/wiki/Category:Integer_factorization_algorithms) have been developed that are efficient for different inputs. @@ -241,7 +240,11 @@ Apart from this trick, Pollard's rho algorithm relies on a consequence from the --> -Pollard's rho algorithm is a randomized $O(\sqrt[4]{n})$ integer factorization algorithm that makes use of the [birthday paradox](https://en.wikipedia.org/wiki/Birthday_problem): one only needs to draw $\Theta(\sqrt{n})$ random numbers between $1$ and $n$ to get a collision with high probability. +Pollard's rho is a randomized $O(\sqrt[4]{n})$ integer factorization algorithm that makes use of the [birthday paradox](https://en.wikipedia.org/wiki/Birthday_problem): + +> One only needs to draw $d = \Theta(\sqrt{n})$ random numbers between $1$ and $n$ to get a collision with high probability. 
+ +You can look up formal proof on Wikipedia, but the informal reasoning behind it is that that each of $d$ added numbers has a chance of approximately $\frac{d}{n}$ of colliding with anythin else, meaning that the expected number of collisions is $\frac{d^2}{n}$. If $d$ is asymptotically smaller than $\sqrt n$, then this ratio grows to zero as $n$ rises and to infinity otherwise. Consider some function $f(x)$ that takes a remainder $x \in [0, n)$ and maps it to some other remainder of $n$ in a way that that seems random from the number theory point of view. Specifically, we will use $f(x) = x^2 + 1 \bmod n$, which is random enough for our purposes. @@ -308,7 +311,9 @@ While it processes only ~25k 30-bit integers — almost 15 times slower than the ### Pollard-Brent Algorithm -Floyd's cycle-finding algorithm has a problem in that it does more iterator increments than necessary. One way to solve it is to memorize the values that the faster iterator visits and compute the gcd using the difference of $x_i$ and $x_{\lfloor i / 2 \rfloor}$, but it can also be done without extra memory using this trick: +Floyd's cycle-finding algorithm has a problem in that it moves iterators more than necessary: at least half of the vertices are visited one additional time by the slower iterator. + +One way to solve it is to memorize the values $x_i$ that the faster iterator visits and every two iterations compute the GCD using the difference of $x_i$ and $x_{\lfloor i / 2 \rfloor}$, but it can also be done without extra memory using a different principle: the tortoise doesn't move on every iteration, but it gets reset to the value of the faster iterator when the iteration number becomes a power of two. 
This lets us save additional iterations while still using the same GCD trick to compare $x_i$ and $x_{2^{\lfloor \log_2 i \rfloor}}$ on each iteration: ```c++ u64 find_factor(u64 n) { @@ -327,9 +332,11 @@ u64 find_factor(u64 n) { } ``` -It actually does *not* improve performance and even makes it ~1.5x *slower*, which probably has something to do with the fact that $x$ is stale. It spends most of the time computing the GCD and not advancing the iterator — in fact, the asymptotic of the algorithm is currently $O(\sqrt[4]{n} \log n)$ because of it. +Note that we also set an upper limit on the number of iterations so that the algorithm finishes in reasonable time and returns `1` if $n$ turns out to be a prime. + +It actually does *not* improve performance and even makes the algorithm ~1.5x *slower*, which probably has something to do with the fact that $x$ is stale. It spends most of the time computing the GCD and not advancing the iterator — in fact, the asymptotic of the algorithm is currently $O(\sqrt[4]{n} \log n)$ because of it. -We can remove the logarithm from the asymptotic using the fact that if one of $a$ and $b$ contains factor $p$, then $a \cdot b \bmod n$ will also contain it, so instead of computing $\gcd(a, n)$ and $\gcd(b, n)$, we can compute $\gcd(a \cdot b \bmod n, n)$. This way, we can group the calculations of GCP in groups of $M = O(\log n)$, we remove $\log n$ out of the asymptotic: +Instead of [optimizing the GCD itself](../gcd), we can optimize the number of its invocations. We can use the fact that if one of $a$ and $b$ contains factor $p$, then $a \cdot b \bmod n$ will also contain it, so instead of computing $\gcd(a, n)$ and $\gcd(b, n)$, we can compute $\gcd(a \cdot b \bmod n, n)$. 
This way, we can group the calculations of GCP in groups of $M = O(\log n)$, we remove $\log n$ out of the asymptotic: ```c++ const int M = 1024; @@ -357,11 +364,7 @@ It now works at 425 factorizations per second, bottlenecked by the speed of modu ### Optimizing Modulo -The next step is to actually apply [Montgomery Multiplication](/hpc/number-theory/montgomery/). - -This is exactly the type of problem when we need specific knowledge, because we have 64-bit modulo by not-compile-constants, and compiler can't really do much to optimize it. - -We do not need to convert numbers out of Montgomery representation before computing the GCD. +The final step is to apply [Montgomery multiplication](/hpc/number-theory/montgomery/): the modulo is constant, so we can perform all computations — advancing the iterator, multiplication, and even computing the GCD — in the Montgomery space where reduction is cheap. ```c++ struct Montgomery { @@ -410,47 +413,34 @@ u64 find_factor(u64 n, u64 x0 = 2, u64 a = 1) { } ``` -It processes around 3000 per second, which is ~3.8 faster than what [PARI](https://pari.math.u-bordeaux.fr/) library can do (invocated via [sage](https://doc.sagemath.org/html/en/reference/structure/sage/structure/factorization.html)). - -### Further Optimization - -There might be a way to . - -It may be beneficial to start multiplying only after a certain threshold since there is little probability that we enter a cycle in the beginning. - -It may be worth it to run a few versions in parallel and stop whichever finishes first. If we run $p$ runs, it is expected to finish $\sqrt p$ times faster. Either scalar code and taking advantage of there being multiple execution ports for multiplication, or using [SIMD](/hpc/simd) instructions to do 4 or 8 multiplications in parallel. - -Would not be surprised to see another 3x improvement and throughputs of 10k/sec. 
- -If you have limited time, you should probably compute as much forward as possible, and then half the time computing the other. - -How to optimize for the *average* case is unclear. - -Another observation: the length of the "tail" and the cycle is equal in expectation, since when we loop around, we choose any vertex of the path we walked independently. +This implementation can processes around 3k 60-bit integers per second, which is ~3.8 faster than the [PARI](https://pari.math.u-bordeaux.fr/) library (invoked via [sage](https://doc.sagemath.org/html/en/reference/structure/sage/structure/factorization.html)). +### Further Improvements -### Reducing Errors +I belive there is still a lot of potential for optimization in our implementation of the Pollard's algorithm: -There are slightly more errors because we are a bit loose with modular arithmetic here. The error rate grows higher when we increase and decrease (due to overflows). +- There is probably be a better cycle-finding algorithm that exploits the fact that the graph is random. It is currently bottlenecked by advancing the iterator (the latency of Montgomery multiplication is much higher than its reciprocal throughput), and while we do that, we could calculate more than one multiplication of the values we've seen to detect a loop sooner. On the other hand, there is little chance that we enter the loop in within the first few iterations, so we may just advance the iterator for some time before starting the trials with the GCD trick. +- If we run $p$ independent instances of the algorithm with different seeds in parallel and stop when one of them finds the answer, it would finish $\sqrt p$ times faster (try to prove it). We don't have to use multiple cores for that: there is a lot of untapped [instruction-level parallelism](/hpc/pipelining/), so we could run two or three pairs of operations on the same thread, or use [SIMD](/hpc/simd) instructions to perform 4 or 8 multiplications in parallel. 
-Our implementation has less than 0.7% error rate, but it grows higher if the numbers are lower than $10^{18}$. +I would not be surprised to see another 3x improvement and a throughput of ~10k/sec. -Since Pollard's rho algorithm is randomized, you need to account for errors. There may be several sources: + -- Factors not being found (need to perform a primality test and start again if it's negative). -- The `p` variable can get zeroed out (need to either restart or roll back and do it iteration-by-iteration). -- Overflows in Montgomery multiplication (our implementation is pretty loose). +Another aspect that we need to handle in a practical implementation is possible errors. Our current implementation has a 0.7% error rate which grows higher if the numbers are lower than $10^{18}$. They come from three main sources: -### Larger Numbers +- Factors simply not being found (the algorithm is inherently randomized, and there is no guarantee that they will be found). In this case, we need to perform a primality test and optionally start again. +- The `p` variable becoming zero (because both $p$ and $q$ can get into the product). It becomes increasingly more likely as we decrease size of the inputs or increase the constant `M`. In this case, we need to either restart the process or (better) roll back the last $M$ iterations and perform the trials one-by-one. +- Overflows in the Montgomery multiplication. Our current implementation is pretty loose with them, and if $n$ is large, we need to add more `x > mod ? x - mod : x` kind of statements to deal with overflows. -"How big are your numbers?" determines the method to use: +These issues become less important if we exclude small numbers and numbers with small prime factors using the algorithms we've implemented before. In general the optimal approach should depend on the size of the numbers: -- Less than 2^16 or so: Lookup table. -- Less than 2^70 or so: Richard Brent's modification of Pollard's rho algorithm. 
-- Less than 10^50: Lenstra elliptic curve factorization -- Less than 10^100: Quadratic Sieve -- More than 10^100: General Number Field Sieve +- Smaller than $2^{16}$: use a lookup table +- Smaller than $2^{32}$: use a list of precomputed primes with a fast divsibility check +- Smaller than $2^{64}$ or so: use Pollard's rho algorithm with Montgomery multiplication +- Smaller than $10^{50}$: switch to [Lenstra elliptic curve factorization](https://en.wikipedia.org/wiki/Lenstra_elliptic-curve_factorization) +- Smaller than $10^{100}$: switch to [Quadratic Sieve](https://en.wikipedia.org/wiki/Quadratic_sieve) +- Larger than $10^{100}$: switch to [General Number Field Sieve](https://en.wikipedia.org/wiki/General_number_field_sieve) -Requiring about 100KB of memory. + -6542 * 8 +If you [implement](https://github.com/sslotin/amh-code/tree/main/factor) some of these ideas, please [let me know](http://sereja.me/). From 19143a513bdc88a564391fa4b71f5d01e3ef6a0b Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 26 May 2022 18:44:58 +0300 Subject: [PATCH 480/531] factorization improvements --- .../english/hpc/algorithms/factorization.md | 21 ++++++++++--------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/content/english/hpc/algorithms/factorization.md b/content/english/hpc/algorithms/factorization.md index fd61d441..7fc51f93 100644 --- a/content/english/hpc/algorithms/factorization.md +++ b/content/english/hpc/algorithms/factorization.md @@ -362,7 +362,7 @@ u64 find_factor(u64 n) { It now works at 425 factorizations per second, bottlenecked by the speed of modulo. -### Optimizing Modulo +### Optimizing the Modulo The final step is to apply [Montgomery multiplication](/hpc/number-theory/montgomery/): the modulo is constant, so we can perform all computations — advancing the iterator, multiplication, and even computing the GCD — in the Montgomery space where reduction is cheap. 
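For readers of this hunk who have not seen the linked article, the following is a minimal sketch of what "reduction is cheap" means in the Montgomery space. It is not the `Montgomery` struct that the patch touches: the name `MontgomerySketch` and its helpers `to_mont` and `from_mont` are hypothetical, the modulus is assumed to be odd, and `__uint128_t` is assumed to be available (a GCC/Clang extension).

```c++
#include <cassert>
#include <cstdint>

typedef uint64_t u64;
typedef __uint128_t u128;  // assumption: compiler provides a 128-bit integer type

// Hypothetical helper (not the struct from the patch): arithmetic modulo an odd n,
// with each number x stored as x * 2^64 mod n ("Montgomery form").
struct MontgomerySketch {
    u64 n;     // the modulus, must be odd
    u64 ninv;  // inverse of n modulo 2^64

    explicit MontgomerySketch(u64 mod) : n(mod), ninv(1) {
        assert(n % 2 == 1);
        // Newton's iteration: every step doubles the number of correct low bits (1 -> 64)
        for (int i = 0; i < 6; i++)
            ninv *= 2 - n * ninv;
    }

    // Returns x * 2^-64 mod n in the range [0, n) for any x < n * 2^64,
    // using two multiplications and no division
    u64 reduce(u128 x) const {
        u64 q = u64(x) * ninv;              // q * n has the same low 64 bits as x
        u64 m = u64(((u128) q * n) >> 64);  // high half of q * n
        u64 hi = u64(x >> 64);              // high half of x
        return hi >= m ? hi - m : hi - m + n;
    }

    // Slow conversion into Montgomery form (uses a real 128-bit modulo, done once per input)
    u64 to_mont(u64 x) const { return u64((((u128) x) << 64) % n); }

    // Conversion back to the ordinary representation
    u64 from_mont(u64 x) const { return reduce(x); }

    // Product of two numbers that are both in Montgomery form; the result is also in that form
    u64 multiply(u64 a, u64 b) const { return reduce((u128) a * b); }
};
```

With such a helper, advancing the iterator and accumulating the product only call `multiply`, and the expensive `%` appears only when the inputs are converted. Since $2^{64}$ is coprime to an odd $n$, a value in Montgomery form shares the same common divisors with $n$ as the original value, which is presumably why the GCD can be computed without converting back.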
@@ -413,26 +413,27 @@ u64 find_factor(u64 n, u64 x0 = 2, u64 a = 1) { } ``` -This implementation can processes around 3k 60-bit integers per second, which is ~3.8 faster than the [PARI](https://pari.math.u-bordeaux.fr/) library (invoked via [sage](https://doc.sagemath.org/html/en/reference/structure/sage/structure/factorization.html)). +This implementation can processes around 3k 60-bit integers per second, which is ~3.8 faster than what [PARI](https://pari.math.u-bordeaux.fr/) / [SageMath](https://doc.sagemath.org/html/en/reference/structure/sage/structure/factorization.html)'s `factor` function measures. ### Further Improvements -I belive there is still a lot of potential for optimization in our implementation of the Pollard's algorithm: +**Optimizations.** There is still a lot of potential for optimization in our implementation of the Pollard's algorithm: -- There is probably be a better cycle-finding algorithm that exploits the fact that the graph is random. It is currently bottlenecked by advancing the iterator (the latency of Montgomery multiplication is much higher than its reciprocal throughput), and while we do that, we could calculate more than one multiplication of the values we've seen to detect a loop sooner. On the other hand, there is little chance that we enter the loop in within the first few iterations, so we may just advance the iterator for some time before starting the trials with the GCD trick. -- If we run $p$ independent instances of the algorithm with different seeds in parallel and stop when one of them finds the answer, it would finish $\sqrt p$ times faster (try to prove it). We don't have to use multiple cores for that: there is a lot of untapped [instruction-level parallelism](/hpc/pipelining/), so we could run two or three pairs of operations on the same thread, or use [SIMD](/hpc/simd) instructions to perform 4 or 8 multiplications in parallel. 
+- We could probably use a better cycle-finding algorithm, exploiting the fact that the graph is random. For example, there is little chance that we enter the loop in within the first few iterations (the length of the cycle and the path we walk before entering it should be equal in expectation since before we loop around, we choose the vertex of the path we've walked independently), so we may just advance the iterator for some time before starting the trials with the GCD trick. +- Our current approach is bottlenecked by advancing the iterator (the latency of Montgomery multiplication is much higher than its reciprocal throughput), and while we are waiting for it to complete, we could perform more than just one trial using the previous values. +- If we run $p$ independent instances of the algorithm with different seeds in parallel and stop when one of them finds the answer, it would finish $\sqrt p$ times faster (the reasoning is similar to the Birthday paradox; try to prove it yourself). We don't have to use multiple cores for that: there is a lot of untapped [instruction-level parallelism](/hpc/pipelining/), so we could concurrently run two or three of the same operations on the same thread, or use [SIMD](/hpc/simd) instructions to perform 4 or 8 multiplications in parallel. -I would not be surprised to see another 3x improvement and a throughput of ~10k/sec. +I would not be surprised to see another 3x improvement and a throughput of ~10k/sec. If you [implement](https://github.com/sslotin/amh-code/tree/main/factor) some of these ideas, please [let me know](http://sereja.me/). -Another aspect that we need to handle in a practical implementation is possible errors. Our current implementation has a 0.7% error rate which grows higher if the numbers are lower than $10^{18}$. They come from three main sources: +**Errors.** Another aspect that we need to handle in a practical implementation is possible errors. 
Our current implementation has a 0.7% error rate for 60-bit integers, and it grows higher if the numbers are lower. These errors come from three main sources: -- Factors simply not being found (the algorithm is inherently randomized, and there is no guarantee that they will be found). In this case, we need to perform a primality test and optionally start again. +- A cycle simply not being found (the algorithm is inherently random, and there is no guarantee that it will be found). In this case, we need to perform a primality test and optionally start again. - The `p` variable becoming zero (because both $p$ and $q$ can get into the product). It becomes increasingly more likely as we decrease size of the inputs or increase the constant `M`. In this case, we need to either restart the process or (better) roll back the last $M$ iterations and perform the trials one-by-one. - Overflows in the Montgomery multiplication. Our current implementation is pretty loose with them, and if $n$ is large, we need to add more `x > mod ? x - mod : x` kind of statements to deal with overflows. -These issues become less important if we exclude small numbers and numbers with small prime factors using the algorithms we've implemented before. In general the optimal approach should depend on the size of the numbers: +**Larger numbers.** These issues become less important if we exclude small numbers and numbers with small prime factors using the algorithms we've implemented before. In general, the optimal approach should depend on the size of the numbers: - Smaller than $2^{16}$: use a lookup table - Smaller than $2^{32}$: use a list of precomputed primes with a fast divsibility check @@ -443,4 +444,4 @@ These issues become less important if we exclude small numbers and numbers with -If you [implement](https://github.com/sslotin/amh-code/tree/main/factor) some of these ideas, please [let me know](http://sereja.me/). 
+The last three approaches are very different from what we've been doing and require much more advanced number theory, and they deserve an article (or a full-length university course) of their own. From 709340d509d45719c5f9d76432d273a1d84d44c5 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 26 May 2022 19:22:45 +0300 Subject: [PATCH 481/531] pollard edits --- .../english/hpc/algorithms/factorization.md | 44 +++++++++---------- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/content/english/hpc/algorithms/factorization.md b/content/english/hpc/algorithms/factorization.md index 7fc51f93..07bf7408 100644 --- a/content/english/hpc/algorithms/factorization.md +++ b/content/english/hpc/algorithms/factorization.md @@ -244,11 +244,11 @@ Pollard's rho is a randomized $O(\sqrt[4]{n})$ integer factorization algorithm t > One only needs to draw $d = \Theta(\sqrt{n})$ random numbers between $1$ and $n$ to get a collision with high probability. -You can look up formal proof on Wikipedia, but the informal reasoning behind it is that that each of $d$ added numbers has a chance of approximately $\frac{d}{n}$ of colliding with anythin else, meaning that the expected number of collisions is $\frac{d^2}{n}$. If $d$ is asymptotically smaller than $\sqrt n$, then this ratio grows to zero as $n$ rises and to infinity otherwise. +The reasoning behind it is that each of the $d$ added element has a $\frac{d}{n}$ chance of colliding with some other element, implying that the expected number of collisions is $\frac{d^2}{n}$. If $d$ is asymptotically smaller than $\sqrt n$, then this ratio grows to zero as $n \to \infty$, and to infinity otherwise. -Consider some function $f(x)$ that takes a remainder $x \in [0, n)$ and maps it to some other remainder of $n$ in a way that that seems random from the number theory point of view. Specifically, we will use $f(x) = x^2 + 1 \bmod n$, which is random enough for our purposes. 
+Consider some function $f(x)$ that takes a remainder $x \in [0, n)$ and maps it to some other remainder of $n$ in a way that seems random from the number theory point of view. Specifically, we will use $f(x) = x^2 + 1 \bmod n$, which is random enough for our purposes. -Now, consider a graph where each number-vertex $x$ has an edge pointing to $f(x)$. Such graphs are called *functional*. In functional graphs, the "trajectory" of any element — the path we walk if we start from that element and keep following the edges — is a path that eventually loops around (because the set of vertices is limited, and at some point we have to go to a vertex we have already visited). +Now, consider a graph where each number-vertex $x$ has an edge pointing to $f(x)$. Such graphs are called *functional*. In functional graphs, the "trajectory" of any element — the path we walk if we start from that element and keep following the edges — is a path that eventually loops around (because the set of vertices is limited, and at some point, we have to go to a vertex we have already visited). ![The trajectory of an element resembles the greek letter ρ (rho), which is what the algorithm is named after](../img/rho.jpg) @@ -258,11 +258,11 @@ $$ x_0, \; f(x_0), \; f(f(x_0)), \; \ldots $$ -Now, let's make another sequence out of this one by reducing each element modulo $p$, the smallest prime divisor of $n$. +Let's make another sequence out of this one by reducing each element modulo $p$, the smallest prime divisor of $n$. -**Lemma.** The expected length of that sequence before it turns into a cycle is $O(\sqrt[4]{n})$. +**Lemma.** The expected length of the reduced sequence before it turns into a cycle is $O(\sqrt[4]{n})$. -**Proof:** Since $p$ is the smallest divisor, $p \leq \sqrt n$. Each time we follow a new edge, we essentially generate a random number between $0$ and $p$ (we treat $f$ as a "deterministically-random" function). 
The birthday paradox states that we only need to generate $O(\sqrt p) = O(\sqrt[4]{n})$ numers until we get a collision and thus enter a loop. +**Proof:** Since $p$ is the smallest divisor, $p \leq \sqrt n$. Each time we follow a new edge, we essentially generate a random number between $0$ and $p$ (we treat $f$ as a "deterministically-random" function). The birthday paradox states that we only need to generate $O(\sqrt p) = O(\sqrt[4]{n})$ numbers until we get a collision and thus enter a loop. Since we don't know $p$, this mod-$p$ sequence is only imaginary, but if find a cycle in it — that is, $i$ and $j$ such that @@ -307,13 +307,13 @@ u64 find_factor(u64 n) { } ``` -While it processes only ~25k 30-bit integers — almost 15 times slower than the fastest algorithm we have — it drammatically outperforms every $\tilde{O}(\sqrt n)$ algorithm for 60-bit numbers, factorizing around 90 of them per second. +While it processes only ~25k 30-bit integers — which is almost 15 times slower than by checking each prime using a fast division trick — it dramatically outperforms every $\tilde{O}(\sqrt n)$ algorithm for 60-bit numbers, factorizing around 90 of them per second. ### Pollard-Brent Algorithm Floyd's cycle-finding algorithm has a problem in that it moves iterators more than necessary: at least half of the vertices are visited one additional time by the slower iterator. -One way to solve it is to memorize the values $x_i$ that the faster iterator visits and every two iterations compute the GCD using the difference of $x_i$ and $x_{\lfloor i / 2 \rfloor}$, but it can also be done without extra memory using a different principle: the tortoise doesn't move on every iteration, but it gets reset to the value of the faster iterator when the iteration number becomes a power of two. 
This lets us save additional iterations while still using the same GCD trick to compare $x_i$ and $x_{2^{\lfloor \log_2 i \rfloor}}$ on each iteration: +One way to solve it is to memorize the values $x_i$ that the faster iterator visits and, every two iterations, compute the GCD using the difference of $x_i$ and $x_{\lfloor i / 2 \rfloor}$. But it can also be done without extra memory using a different principle: the tortoise doesn't move on every iteration, but it gets reset to the value of the faster iterator when the iteration number becomes a power of two. This lets us save additional iterations while still using the same GCD trick to compare $x_i$ and $x_{2^{\lfloor \log_2 i \rfloor}}$ on each iteration: ```c++ u64 find_factor(u64 n) { @@ -332,11 +332,11 @@ u64 find_factor(u64 n) { } ``` -Note that we also set an upper limit on the number of iterations so that the algorithm finishes in reasonable time and returns `1` if $n$ turns out to be a prime. +Note that we also set an upper limit on the number of iterations so that the algorithm finishes in a reasonable amount of time and returns `1` if $n$ turns out to be a prime. -It actually does *not* improve performance and even makes the algorithm ~1.5x *slower*, which probably has something to do with the fact that $x$ is stale. It spends most of the time computing the GCD and not advancing the iterator — in fact, the asymptotic of the algorithm is currently $O(\sqrt[4]{n} \log n)$ because of it. +It actually does *not* improve performance and even makes the algorithm ~1.5x *slower*, which probably has something to do with the fact that $x$ is stale. It spends most of the time computing the GCD and not advancing the iterator — in fact, the time requirement of this algorithm is currently $O(\sqrt[4]{n} \log n)$ because of it. -Instead of [optimizing the GCD itself](../gcd), we can optimize the number of its invocations. 
We can use the fact that if one of $a$ and $b$ contains factor $p$, then $a \cdot b \bmod n$ will also contain it, so instead of computing $\gcd(a, n)$ and $\gcd(b, n)$, we can compute $\gcd(a \cdot b \bmod n, n)$. This way, we can group the calculations of GCP in groups of $M = O(\log n)$, we remove $\log n$ out of the asymptotic: +Instead of [optimizing the GCD itself](../gcd), we will optimize the number of its invocations. We can use the fact that if one of $a$ and $b$ contains factor $p$, then $a \cdot b \bmod n$ will also contain it, so instead of computing $\gcd(a, n)$ and $\gcd(b, n)$, we can compute $\gcd(a \cdot b \bmod n, n)$. This way, we can group the calculations of GCP in groups of $M = O(\log n)$ we remove $\log n$ out of the asymptotic: ```c++ const int M = 1024; @@ -360,11 +360,11 @@ u64 find_factor(u64 n) { } ``` -It now works at 425 factorizations per second, bottlenecked by the speed of modulo. +Now it performs 425 factorizations per second, bottlenecked by the speed of modulo. ### Optimizing the Modulo -The final step is to apply [Montgomery multiplication](/hpc/number-theory/montgomery/): the modulo is constant, so we can perform all computations — advancing the iterator, multiplication, and even computing the GCD — in the Montgomery space where reduction is cheap. +The final step is to apply [Montgomery multiplication](/hpc/number-theory/montgomery/). Since the modulo is constant, we can perform all computations — advancing the iterator, multiplication, and even computing the GCD — in the Montgomery space where reduction is cheap: ```c++ struct Montgomery { @@ -413,7 +413,7 @@ u64 find_factor(u64 n, u64 x0 = 2, u64 a = 1) { } ``` -This implementation can processes around 3k 60-bit integers per second, which is ~3.8 faster than what [PARI](https://pari.math.u-bordeaux.fr/) / [SageMath](https://doc.sagemath.org/html/en/reference/structure/sage/structure/factorization.html)'s `factor` function measures. 
+This implementation can processes around 3k 60-bit integers per second, which is ~3.8 faster than what [PARI](https://pari.math.u-bordeaux.fr/) / [SageMath's `factor`](https://doc.sagemath.org/html/en/reference/structure/sage/structure/factorization.html) function measures. ### Further Improvements @@ -423,24 +423,24 @@ This implementation can processes around 3k 60-bit integers per second, which is - Our current approach is bottlenecked by advancing the iterator (the latency of Montgomery multiplication is much higher than its reciprocal throughput), and while we are waiting for it to complete, we could perform more than just one trial using the previous values. - If we run $p$ independent instances of the algorithm with different seeds in parallel and stop when one of them finds the answer, it would finish $\sqrt p$ times faster (the reasoning is similar to the Birthday paradox; try to prove it yourself). We don't have to use multiple cores for that: there is a lot of untapped [instruction-level parallelism](/hpc/pipelining/), so we could concurrently run two or three of the same operations on the same thread, or use [SIMD](/hpc/simd) instructions to perform 4 or 8 multiplications in parallel. -I would not be surprised to see another 3x improvement and a throughput of ~10k/sec. If you [implement](https://github.com/sslotin/amh-code/tree/main/factor) some of these ideas, please [let me know](http://sereja.me/). +I would not be surprised to see another 3x improvement and throughput of ~10k/sec. If you [implement](https://github.com/sslotin/amh-code/tree/main/factor) some of these ideas, please [let me know](http://sereja.me/). **Errors.** Another aspect that we need to handle in a practical implementation is possible errors. Our current implementation has a 0.7% error rate for 60-bit integers, and it grows higher if the numbers are lower. 
These errors come from three main sources: - A cycle simply not being found (the algorithm is inherently random, and there is no guarantee that it will be found). In this case, we need to perform a primality test and optionally start again. -- The `p` variable becoming zero (because both $p$ and $q$ can get into the product). It becomes increasingly more likely as we decrease size of the inputs or increase the constant `M`. In this case, we need to either restart the process or (better) roll back the last $M$ iterations and perform the trials one-by-one. +- The `p` variable becoming zero (because both $p$ and $q$ can get into the product). It becomes increasingly more likely as we decrease size of the inputs or increase the constant `M`. In this case, we need to either restart the process or (better) roll back the last $M$ iterations and perform the trials one by one. - Overflows in the Montgomery multiplication. Our current implementation is pretty loose with them, and if $n$ is large, we need to add more `x > mod ? x - mod : x` kind of statements to deal with overflows. **Larger numbers.** These issues become less important if we exclude small numbers and numbers with small prime factors using the algorithms we've implemented before. 
In general, the optimal approach should depend on the size of the numbers: -- Smaller than $2^{16}$: use a lookup table -- Smaller than $2^{32}$: use a list of precomputed primes with a fast divsibility check -- Smaller than $2^{64}$ or so: use Pollard's rho algorithm with Montgomery multiplication -- Smaller than $10^{50}$: switch to [Lenstra elliptic curve factorization](https://en.wikipedia.org/wiki/Lenstra_elliptic-curve_factorization) -- Smaller than $10^{100}$: switch to [Quadratic Sieve](https://en.wikipedia.org/wiki/Quadratic_sieve) -- Larger than $10^{100}$: switch to [General Number Field Sieve](https://en.wikipedia.org/wiki/General_number_field_sieve) +- Smaller than $2^{16}$: use a lookup table; +- Smaller than $2^{32}$: use a list of precomputed primes with a fast divisibility check; +- Smaller than $2^{64}$ or so: use Pollard's rho algorithm with Montgomery multiplication; +- Smaller than $10^{50}$: switch to [Lenstra elliptic curve factorization](https://en.wikipedia.org/wiki/Lenstra_elliptic-curve_factorization); +- Smaller than $10^{100}$: switch to [Quadratic Sieve](https://en.wikipedia.org/wiki/Quadratic_sieve); +- Larger than $10^{100}$: switch to [General Number Field Sieve](https://en.wikipedia.org/wiki/General_number_field_sieve). From ab5ffcb7135a3848720535b47694d95acb27d504 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 26 May 2022 19:44:35 +0300 Subject: [PATCH 482/531] elaborate on benchmarking --- content/english/hpc/algorithms/factorization.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/algorithms/factorization.md b/content/english/hpc/algorithms/factorization.md index 07bf7408..acfd0b0c 100644 --- a/content/english/hpc/algorithms/factorization.md +++ b/content/english/hpc/algorithms/factorization.md @@ -5,7 +5,7 @@ weight: 3 The problem of factoring integers into primes is central to computational [number theory](/hpc/number-theory/). 
It has been [studied](https://www.cs.purdue.edu/homes/ssw/chapter3.pdf) since at least the 3rd century BC, and [many methods](https://en.wikipedia.org/wiki/Category:Integer_factorization_algorithms) have been developed that are efficient for different inputs. -In this case study, we specifically consider the factorization of *word-sized* integers: those on the order of $10^9$ and $10^{18}$. Untypical for this book, in this one, you may actually learn an asymptotically better algorithm: we start with a few basic approaches and gradually build up to the $O(\sqrt[4]{n})$-time *Pollard's rho algorithm* and optimize it to the point where it can factorize 60-bit semiprimes in 0.3-0.4ms and almost 4 times faster than the previous state-of-the-art. +In this case study, we specifically consider the factorization of *word-sized* integers: those on the order of $10^9$ and $10^{18}$. Untypical for this book, in this one, you may actually learn an asymptotically better algorithm: we start with a few basic approaches and gradually build up to the $O(\sqrt[4]{n})$-time *Pollard's rho algorithm* and optimize it to the point where it can factorize 60-bit semiprimes in 0.3-0.4ms and ~3 times faster than the previous state-of-the-art. + *Instrumentation* is an overcomplicated term that means inserting timers and other tracking code into programs. The simplest example is using the `time` utility in Unix-like systems to measure the duration of execution for the whole program. More generally, we want to know *which parts* of the program need optimization. 
There are tools shipped with compilers and IDEs that can time designated functions automatically, but it is more robust to do it by hand using any methods of interacting with time that the language provides: From 1cd629fa9dde73de0d810890effbc4c7cdac4db8 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 10 Jun 2022 15:32:41 +0300 Subject: [PATCH 486/531] add anagrams problem --- content/russian/cs/programming/bayans.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/content/russian/cs/programming/bayans.md b/content/russian/cs/programming/bayans.md index d35880cc..9faf6139 100644 --- a/content/russian/cs/programming/bayans.md +++ b/content/russian/cs/programming/bayans.md @@ -307,6 +307,10 @@ def query(y): Given $3 \cdot 10^5$ points on a plane, choose any subset of 500 of them and solve the traveling salesman problem for it: find the shortest cycle that passes through all of these points. +## Anagrams + +Find the first substring of the string $s$ that is an anagram (a permutation of the characters) of the string $t$, in $O(n)$ time. + -Due to difficulties in [refraining the compiler from cheating](/hpc/profiling/noise/), the code snippets in this article are slightly simplified for exposition purposes. Check the [code repository](https://github.com/sslotin/amh-code/tree/main/cpu-cache) if you want to reproduce them yourself. +Due to difficulties in [preventing the compiler from optimizing away unused values](/hpc/profiling/noise/), the code snippets in this article are slightly simplified for exposition purposes. Check the [code repository](https://github.com/sslotin/amh-code/tree/main/cpu-cache) if you want to reproduce them yourself.
### Acknowledgements diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 8a4924ea..6e73d32d 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -9,7 +9,7 @@ Instead, the most fascinating showcases of performance engineering are multifold -In this article, we focus on one such fundamental algorithm — *binary search* — and implement two of its variants that are, depending on the problem size, up to 4x faster than `std::lower_bound`, while being under just 15 lines of code. +In this section, we focus on one such fundamental algorithm — *binary search* — and implement two of its variants that are, depending on the problem size, up to 4x faster than `std::lower_bound`, while being under just 15 lines of code. The first algorithm achieves that by removing [branches](/hpc/pipelining/branching), and the second also optimizes the memory layout to achieve better [cache system](/hpc/cpu-cache) performance. This technically disqualifies it from being a drop-in replacement for `std::lower_bound` as it needs to permute the elements of the array before it can start answering queries — but I can't recall a lot of scenarios where you obtain a sorted array but can't afford to spend linear time on preprocessing. @@ -401,7 +401,7 @@ Also, note that the last few prefetch requests are actually not needed, and in f This prefetching technique allows us to read up to four elements ahead, but it doesn't really come for free — we are effectively trading off excess memory [bandwidth](/hpc/cpu-cache/bandwidth) for reduced [latency](/hpc/cpu-cache/latency). If you run more than one instance at a time on separate hardware threads or just any other memory-intensive computation in the background, it will significantly [affect](/hpc/cpu-cache/sharing) the benchmark performance. -But we can do better. 
Instead of fetching four cache lines at a time, we could fetch four times *fewer* cache lines. And in the [next article](../s-tree), we will explore the approach. +But we can do better. Instead of fetching four cache lines at a time, we could fetch four times *fewer* cache lines. And in the [next section](../s-tree), we will explore the approach. -When you fetch anything from memory, there is always some non-zero latency before the data arrives. Moreover, the request doesn't go directly to its ultimate storage location, but it first goes through an incredibly complex system of address translation units and caching layers designed to both help in memory management and reduce the latency. +When you fetch anything from memory, there is always some latency before the data arrives. Moreover, the request doesn't go directly to its ultimate storage location, but it first goes through a complex system of address translation units and caching layers designed to both help in memory management and reduce the latency. Therefore, the only correct answer to this question is "it depends" — primarily on where the operands are stored: diff --git a/content/english/hpc/external-memory/hierarchy.md b/content/english/hpc/external-memory/hierarchy.md index f0ca9c65..da1f5bb6 100644 --- a/content/english/hpc/external-memory/hierarchy.md +++ b/content/english/hpc/external-memory/hierarchy.md @@ -58,7 +58,7 @@ There are other caches inside CPUs that are used for something other than data. ### Non-Volatile Memory -While the data cells in CPU caches and the RAM only gently store just a few electrons (that periodically leak and need to be periodically refreshed), the data cells in *non-volatile memory* types store hundreds of them. This lets the data to be persisted for prolonged periods of time without power but comes at the cost of performance and durability — because when you have more electrons, you also have more opportunities for them colliding with silicon atoms. 
+While the data cells in CPU caches and the RAM only gently store just a few electrons (that periodically leak and need to be periodically refreshed), the data cells in *non-volatile memory* types store hundreds of them. This lets the data persist for prolonged periods of time without power but comes at the cost of performance and durability, because when you have more electrons, you also have more opportunities for them to collide with silicon atoms. diff --git a/content/english/hpc/pipelining/_index.md b/content/english/hpc/pipelining/_index.md index e18a31cc..aab72d79 100644 --- a/content/english/hpc/pipelining/_index.md +++ b/content/english/hpc/pipelining/_index.md @@ -5,7 +5,7 @@ weight: 3 When programmers hear the word *parallelism*, they mostly think about *multi-core parallelism*, the practice of explicitly splitting a computation into semi-independent *threads* that work together to solve a common problem. -This type of parallelism is mainly about reducing *latency* and achieving *scalability*, but not about improving *efficiency*. You can solve a problem ten times as big with a parallel algorithm, but it would take at least ten times as much computational resources. Although parallel hardware is becoming [ever more abundant](/hpc/complexity/hardware), and parallel algorithm design is becoming an increasingly important area, for now, we will consider the use of more than one CPU core cheating. +This type of parallelism is mainly about reducing *latency* and achieving *scalability*, but not about improving *efficiency*. You can solve a problem ten times as big with a parallel algorithm, but it would take at least ten times as many computational resources. Although parallel hardware is becoming [ever more abundant](/hpc/complexity/hardware) and parallel algorithm design is becoming an increasingly important area, for now, we will limit ourselves to considering only a single CPU core.
But there are other types of parallelism, already existing inside a CPU core, that you can use *for free*. diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index 0f87da83..d7416f35 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -28,7 +28,7 @@ for (int i = 0; i < N; i++) s += (a[i] < 50) * a[i]; ``` -Suddenly, the loop now takes ~7 cycles per element instead of the original ~14. Also, the performance remains constant if we change `50` to some other threshold, so it doesn't depend on the branch probability. +The loop now takes ~7 cycles per element instead of the original ~14. Also, the performance remains constant if we change `50` to some other threshold, so it doesn't depend on the branch probability. But wait… shouldn't there still be a branch? How does `(a[i] < 50)` map to assembly? @@ -182,7 +182,7 @@ int abs(int a) { **Strings.** Oversimplifying things, an `std::string` is comprised of a pointer to a null-terminated char array (also known as "C-string") allocated somewhere on the heap and one integer containing the string size. -A very common value for strings is the empty string — which is also its default value. You also need to handle them somehow, and the idiomatic thing to do is to assign `nullptr` as the pointer and `0` as the string size, and then check if the pointer is null or if the size is zero at the beginning of every procedure involving strings. +A common value for strings is the empty string — which is also its default value. You also need to handle them somehow, and the idiomatic thing to do is to assign `nullptr` as the pointer and `0` as the string size, and then check if the pointer is null or if the size is zero at the beginning of every procedure involving strings. However, this requires a separate branch, which is costly unless most strings are empty. 
What we can do to get rid of it is to allocate a "zero C-string," which is just a zero byte allocated somewhere, and then simply point all empty strings there. Now all string operations with empty strings have to read this useless zero byte, but this is still much cheaper than a branch misprediction. @@ -216,7 +216,7 @@ That there are no substantial reasons why compilers can't do this on their own, --> -**Data-parallel programming.** Branchless programming is very important for [SIMD](/hpc/simd) applications, including GPU programming, because they don't have branching in the first place. +**Data-parallel programming.** Branchless programming is very important for [SIMD](/hpc/simd) applications because they don't have branching in the first place. In our array sum example, if you remove the `volatile` type qualifier from the accumulator, the compiler becomes able to [vectorize](/hpc/simd/auto-vectorization) the loop: diff --git a/content/english/hpc/pipelining/tables.md b/content/english/hpc/pipelining/tables.md index 5f69c579..ad90c400 100644 --- a/content/english/hpc/pipelining/tables.md +++ b/content/english/hpc/pipelining/tables.md @@ -33,7 +33,7 @@ Some comments: - Because our minds are so used to the cost model where "more" means "worse," people mostly use *reciprocals* of throughput instead of throughput. - If a certain instruction is especially frequent, its execution unit could be duplicated to increase its throughput — possibly to even more than one, but not higher than the [decode width](/hpc/architecture/layout). - Some instructions have a latency of 0. This means that these instructions are used to control the scheduler and don't reach the execution stage. They still have non-zero reciprocal throughput because the [CPU front-end](/hpc/architecture/layout) still needs to process them.
-- Most instructions are pipelined, and if they have the reciprocal throughput of $n$, this usually means that their execution unit can take another instruction after $n$ cycles (and if it is below 1, this means that there are multiple execution units, all capable of taking another instruction on the next cycle). One notable exception is the [integer division](/hpc/arithmetic/division): it is either very poorly pipelined or not pipelined at all. +- Most instructions are pipelined, and if they have the reciprocal throughput of $n$, this usually means that their execution unit can take another instruction after $n$ cycles (and if it is below 1, this means that there are multiple execution units, all capable of taking another instruction on the next cycle). One notable exception is [integer division](/hpc/arithmetic/division): it is either very poorly pipelined or not pipelined at all. - Some instructions have variable latency, depending on not only the size, but also the values of the operands. For memory operations (including fused ones like `add`), the latency is usually specified for the best case (an L1 cache hit). There are many more important little details, but this mental model will suffice for now. From 59ca0451a59c0b3c81e1e542d2f8aff3588207c6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Mon, 18 Jul 2022 01:17:15 +0300 Subject: [PATCH 490/531] four new theoretical problems --- content/russian/cs/programming/bayans.md | 37 ++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/content/russian/cs/programming/bayans.md b/content/russian/cs/programming/bayans.md index 9faf6139..aee5deda 100644 --- a/content/russian/cs/programming/bayans.md +++ b/content/russian/cs/programming/bayans.md @@ -311,6 +311,43 @@ def query(y): Find the first substring of the string $s$ that is an anagram (a permutation of the characters) of the string $t$, in $O(n)$ time.
+## Functional graph + +You are given a directed graph with $n < 10^5$ vertices in which every vertex has exactly one outgoing edge. Answer $q < 10^5$ queries of the form "which vertex do we reach if we start at vertex $v_i$ and make $k_i < 10^{18}$ transitions" in $O(q + n)$ time. + +## Asynchronous hat + +Seryozha and his $(n - 1)$ friends decided to play "the hat game," in which one player has a limited time to explain as many words as possible so that their partner can guess them. + +Every player must interact with every other player once; usually the game goes like this: + +- the 1st player explains words to the 2nd for one minute, +- the 2nd player explains words to the 3rd, +- ..., +- the $n$-th player explains words to the 1st, +- the 1st player explains words to the 3rd, +- the 2nd player explains words to the 4th… + +…and so on, until the $(n-1)$-th player finishes explaining words to the $(n-2)$-th. + +If many friends have gathered, the game can take quite a while. Seryozha wants to know the minimum time it can take if pairs of participants are allowed to play simultaneously and in any order. + +For a given $n \le 500$, find the minimum amount of time $k$ and a schedule that achieves it. + +## Random coffee + +The company you work at employs an unknown number of people: anywhere from one to infinity, with equal probability. To fight loneliness, every employee takes part in "random coffee": each week you meet a random person from the company to have coffee and talk about anything. + +You have taken part in random coffee $n$ times and talked to $k$ different people (to some of them more than once). What is the most likely number of people working at the company? + +## Mafia + +"Mafia" is played by 13 people, 10 of whom are civilians and 3 are mafia. All roles are dealt using a standard deck of playing cards: 10 red and 3 black cards are picked and shuffled in advance, and whoever draws a black card is mafia. All the cards are distinct and known to everyone. The game starts with a daytime vote.
+ +How can the civilians guarantee a win? From f3fb1ae8eceaaf73d231763b3bcf0fb3f4b964eb Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 19 Jul 2022 01:10:19 +0300 Subject: [PATCH 494/531] typos --- content/english/hpc/algorithms/gcd.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/algorithms/gcd.md b/content/english/hpc/algorithms/gcd.md index 7941edd0..6a4f8ca7 100644 --- a/content/english/hpc/algorithms/gcd.md +++ b/content/english/hpc/algorithms/gcd.md @@ -252,9 +252,9 @@ int gcd(int a, int b) { } ``` -It runs in 91ns — which is good enough to leave it there. +It runs in 91ns, which is good enough to leave it there. -If somebody wants to try to shove off a few more nanoseconds by re-writing assembly by hand or trying a lookup table to save a few last iterations, please [let me know](http://sereja.me/). +If somebody wants to try to shave off a few more nanoseconds by rewriting the assembly by hand or trying a lookup table to save a few last iterations, please [let me know](http://sereja.me/).
### Acknowledgements From 9d626692f78d3e173644d1bbbf8dbbca7d9c2d79 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 19 Jul 2022 01:28:13 +0300 Subject: [PATCH 495/531] improve wording --- content/english/hpc/algorithms/matmul.md | 2 +- content/english/hpc/cpu-cache/alignment.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index 02c68f36..5f2847d2 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -438,7 +438,7 @@ There is also an approach that performs asymptotically fewer arithmetic operatio FMA also supports 64-bit floating-point numbers, but it does not support integers: you need to perform addition and multiplication separately, which results in decreased performance. If you can guarantee that all intermediate results can be represented exactly as 32- or 64-bit floating-point numbers (which is [often the case](/hpc/arithmetic/errors/)), it may be faster to just convert them to and from floats. -You can also apply the same trick to other similar computations. One example is the "min-plus matrix multiplication," which is defined as: +This approach can also be applied to some similar-looking computations. One example is the "min-plus matrix multiplication" defined as: $$ (A \circ B)_{ij} = \min_{1 \le k \le n} (A_{ik} + B_{kj}) diff --git a/content/english/hpc/cpu-cache/alignment.md b/content/english/hpc/cpu-cache/alignment.md index 59579467..e9c5f4d3 100644 --- a/content/english/hpc/cpu-cache/alignment.md +++ b/content/english/hpc/cpu-cache/alignment.md @@ -185,4 +185,4 @@ int load(int *p) { } ``` -Compilers usually don't do that because this is not technically always legal: that 4th byte may be on a memory page that you don't own, so the operating system won't let you load it even if you are going to discard it right away.
+Compilers usually don't do that because it's technically not legal: that 4th byte may be on a memory page that you don't own, so the operating system won't let you load it even if you are going to discard it right away. From 05f05c5b4eb587ff533769f3fca83486b0307890 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 19 Jul 2022 03:27:19 +0300 Subject: [PATCH 496/531] elaborating on eytzinger layout --- .../hpc/data-structures/binary-search.md | 33 ++++++++++--------- 1 file changed, 18 insertions(+), 15 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 6e73d32d..d2f237cb 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -248,7 +248,7 @@ Apart from being compact, it has some nice properties, like that all even-number Here is how this layout looks when applied to binary search: -![](../img/eytzinger.png) +![Note that the tree is slightly imbalanced (because the last layer is contiguous)](../img/eytzinger.png) When searching in this layout, we just need to start from the first element of the array, and then on each iteration jump to either $2 k$ or $(2k + 1)$, depending on how the comparison went: @@ -278,15 +278,15 @@ void eytzinger(int k = 1) { } ``` -This function takes the current node number `k`, recursively writes out all elements to the left of the middle of the search interval, writes out the current element we'd compare against, and then recursively writes out all the elements on the right. It seems a bit complicated, but to convince ourselves that it works, we only need three observations: +This function takes the current node number `k`, recursively writes out all elements to the left of the middle of the search interval, writes out the current element we'd compare against, and then recursively writes out all the elements on the right.
It seems a bit complicated, but to convince yourself that it works, you only need three observations: - It writes exactly `n` elements as we enter the body of `if` for each `k` from `1` to `n` just once. - It writes out sequential elements from the original array as it increments the `i` pointer each time. -- By the time we write the element at node `k`, we have already written all the elements to its left (exactly `i`). +- By the time we write the element at node `k`, we will have already written all the elements to its left (exactly `i`). -Despite being recursive, it is actually quite fast as all the memory reads are sequential, and the memory writes are only in $O(\log n)$ different memory blocks at a time. +Despite being recursive, it is actually quite fast as all the memory reads are sequential, and the memory writes are only in $O(\log n)$ different memory blocks at a time. The permutation is both logically and computationally harder to maintain, though: adding an element to a sorted array only requires shifting a suffix of its elements one position to the right, while an Eytzinger array practically needs to be rebuilt from scratch. -Note that this traversal and the resulting permutation are not exactly equivalent to the "tree" of vanilla binary search: for example, the left child subtree may be larger than the right child subtree — and even more than just by one node — but it doesn't matter since both approaches result in the same logarithmic tree depth. +Note that this traversal and the resulting permutation are not exactly equivalent to the "tree" of vanilla binary search: for example, the left child subtree may be larger than the right child subtree — up to twice as large — but it doesn't matter much since both approaches result in the same $\lceil \log_2 n \rceil$ tree depth. Also note that the Eytzinger array is one-indexed — this will be important for performance later.
You can put in the zeroth element the value that you want to be returned in the case when the lower bound doesn't exist (similar to `a.end()` for `std::lower_bound`). @@ -300,22 +300,25 @@ while (k <= n) k = 2 * k + (t[k] < x); ``` -The only problem arises when we need to restore the index of the resulting element, as $k$ may end up not pointing to a leaf node. Here is an example of how that can happen: +The only problem arises when we need to restore the index of the resulting element, as $k$ does not directly point to it. Consider this example (its corresponding tree is listed above): ```center - array: 1 2 3 4 5 6 7 8 -eytzinger: 5 3 7 2 4 6 8 1 -1st range: --------------- k := 1 -2nd range: ------- k := 2*k (=2) -3rd range: --- k := 2*k + 1 (=5) -4th range: - k := 2*k (=10) + array: 0 1 2 3 4 5 6 7 8 9 +eytzinger: 6 3 7 1 5 8 9 0 2 4 +1st range: ------------------- k := 1 +2nd range: ------------- k := 2*k = 2 (6 ≥ 3) +3rd range: ------- k := 2*k = 4 (3 ≥ 3) +4th range: --- k := 2*k + 1 = 9 (1 < 3) +5th range: - k := 2*k + 1 = 19 (2 < 3) ``` -Here we query the array of $[1, …, 8]$ for the lower bound of $x=4$. We compare it against $5$, $3$, and $4$, go left-right-left, and end up with $k = 10$, which isn't even a valid array index. + -The trick is to notice that, unless the answer is the last element of the array, we compare $x$ against it at some point, and after we've learned that it is not less than $x$, we start comparing $x$ against elements to the left, and all these comparisons evaluate true (that is, leading to the right). Therefore, to restore the answer, we just need to "cancel" some number of right turns. +Here we query the array of $[0, …, 9]$ for the lower bound of $x=3$. We compare it against $6$, $3$, $1$, and $2$, go left-left-right-right, and end up with $k = 19$, which isn't even a valid array index. 
-This can be done in an elegant way by observing that the right turns are recorded in the binary representation of $k$ as 1-bits, and so we just need to find the number of trailing 1s in the binary representation and right-shift $k$ by exactly that number of bits. To do this, we can invert the number (`~k`) and call the "find first set" instruction: +The trick is to notice that, unless the answer is the last element of the array, we compare $x$ against it at some point, and after we've learned that it is not less than $x$, we go left exactly once and then keep going right until we reach a leaf (because we will only be comparing $x$ against lesser elements). Therefore, to restore the answer, we just need to "cancel" some number of right turns and then one more. + +This can be done in an elegant way by observing that the right turns are recorded in the binary representation of $k$ as 1-bits, and so we just need to find the number of trailing 1s in the binary representation and right-shift $k$ by exactly that number of bits plus one. To do this, we can invert the number (`~k`) and call the "find first set" instruction: ```c++ int lower_bound(int x) { From c98fcddab8225ab707a71820da7ff45e744da04d Mon Sep 17 00:00:00 2001 From: song-jx <79297685+song-jx@users.noreply.github.com> Date: Tue, 19 Jul 2022 22:05:52 +0800 Subject: [PATCH 497/531] Fixed a problem that could cause out of bounds. Calling add(32, 0) when N = 33 will be out of bounds. 
--- content/english/hpc/data-structures/segment-trees.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index f4c6fb7f..e98c16cb 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -594,7 +594,7 @@ constexpr int offset(int h) { int s = 0, n = N; while (h--) { s += (n + B - 1) / B * B; - n /= B; + n = (n + B - 1) / B; } return s; } From dad89c8d3155433d45875a057039bcc944ca98f8 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 19 Jul 2022 18:23:56 +0300 Subject: [PATCH 498/531] new branchless binary search --- .../hpc/data-structures/binary-search.md | 61 ++++++++++++++++--- 1 file changed, 51 insertions(+), 10 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index d2f237cb..f2e61ffb 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -3,6 +3,8 @@ title: Binary Search weight: 1 --- + + While improving the speed of user-facing applications is the end goal of performance engineering, people don't really get excited over 5-10% improvements in some databases. Yes, this is what software engineers are paid for, but these types of optimizations tend to be too intricate and system-specific to be readily generalized to other software. Instead, the most fascinating showcases of performance engineering are multifold optimizations of textbook algorithms: the kinds that everybody knows and deemed so simple that it would never even occur to try to optimize them in the first place. These optimizations are simple and instructive and can very much be adopted elsewhere. And they are surprisingly not as rare as you'd think. 
@@ -71,7 +73,7 @@ int lower_bound(int x) { Find the middle element of the search range, compare it to `x`, shrink the range in half. Beautiful in its simplicity. -A similar approach is employed by `std::lower_bound`, except that it needs to be more generic to support containers with non-random-access iterators and thus uses the first element and the size of the search interval instead of the two of its ends. Implementations from both [Clang](https://github.com/llvm-mirror/libcxx/blob/78d6a7767ed57b50122a161b91f59f19c9bd0d19/include/algorithm#L4169) and [GCC](https://github.com/gcc-mirror/gcc/blob/d9375e490072d1aae73a93949aa158fcd2a27018/libstdc%2B%2B-v3/include/bits/stl_algobase.h#L1023) use this metaprogramming monstrosity: +A similar approach is employed by `std::lower_bound`, except that it needs to be more generic to support containers with non-random-access iterators and thus uses the first element and the size of the search interval instead of the two of its ends. To this end, implementations from both [Clang](https://github.com/llvm-mirror/libcxx/blob/78d6a7767ed57b50122a161b91f59f19c9bd0d19/include/algorithm#L4169) and [GCC](https://github.com/gcc-mirror/gcc/blob/d9375e490072d1aae73a93949aa158fcd2a27018/libstdc%2B%2B-v3/include/bits/stl_algobase.h#L1023) use this metaprogramming monstrosity: ```c++ template @@ -131,23 +133,60 @@ Now, let's try to get rid of these obstacles one by one. ## Removing Branches -We can replace branching with [predication](/hpc/pipelining/branchless). To do this, we need to adopt the STL approach and rewrite the loop using the first element and the size of the search interval — instead of its first and last element. This way we only need to update the first element of the search interval with a `cmov` instruction and halve its size on each iteration: +We can replace branching with [predication](/hpc/pipelining/branchless). 
To make the task easier, we can adopt the STL approach and rewrite the loop using the first element and the size of the search interval (instead of its first and last element): ```c++ int lower_bound(int x) { int *base = t, len = n; while (len > 1) { int half = len / 2; - base = (base[half] < x ? &base[half] : base); + if (base[half - 1] < x) { + base += half; + len = len - half; + } else { + len = half; + } + } + return *base; +} +``` + +Note that, on each iteration, `len` is essentially just halved and then either floored or ceiled, depending on how the comparison went. This conditional update seems unnecessary; to avoid it, we can simply say that it's always ceiled: + +```c++ +int lower_bound(int x) { + int *base = t, len = n; + while (len > 1) { + int half = len / 2; + if (base[half - 1] < x) + base += half; + len -= half; // = ceil(len / 2) + } + return *base; +} +``` + +This way, we only need to update the first element of the search interval with a [conditional move](/hpc/pipelining/branchless/) and halve its size on each iteration: + +```c++ +int lower_bound(int x) { + int *base = t, len = n; + while (len > 1) { + int half = len / 2; + base += (base[half - 1] < x) * half; // will be replaced with a "cmov" len -= half; } - return *(base + (*base < x)); + return *base; } ``` -Note that this loop is not always equivalent to the standard binary search — it always rounds *up* the size of the search interval, so it accesses slightly different elements and may perform one comparison more than what is needed. We do this to make the number of iterations constant and remove the need for branching completely, although it does require an awkward `(*base < x)` check at the end. + -As typical for predication, this trick is very fragile to compiler optimizations. 
It doesn't make a difference on Clang — for some reason, it replaces the ternary operator with a branch anyway — but it works fine on GCC (9.3), yielding a 2.5-3x improvement on small arrays: + + +Note that this loop is not always equivalent to the standard binary search. Since it always rounds *up* the size of the search interval, it accesses slightly different elements and may perform one comparison more than needed. Apart from simplifying computations on each iteration, it also makes the number of iterations constant if the array size is constant, removing branch mispredictions completely. + +As typical for predication, this trick is very fragile to compiler optimizations — depending on the compiler and how the function is invoked, it may still leave a branch or generate suboptimal code. It works fine on Clang 10, yielding a 2.5-3x improvement on small arrays: + + ![](../img/search-branchless.svg) @@ -162,15 +201,17 @@ int lower_bound(int x) { int *base = t, len = n; while (len > 1) { int half = len / 2; - __builtin_prefetch(&base[(len - half) / 2]); - __builtin_prefetch(&base[half + (len - half) / 2]); - base = (base[half] < x ?
&base[half] : base); len -= half; + __builtin_prefetch(&base[len / 2 - 1]); + __builtin_prefetch(&base[half + len / 2 - 1]); + base += (base[half - 1] < x) * half; } - return *(base + (*base < x)); + return *base; } ``` + + With prefetching, the performance on large arrays becomes roughly the same: ![](../img/search-branchless-prefetch.svg) From 3b2037f968fec31bf7f0ffef74a80df432338a51 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Jul 2022 05:31:43 +0300 Subject: [PATCH 499/531] simplify code --- content/english/hpc/data-structures/segment-trees.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index e98c16cb..90435a38 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -593,8 +593,8 @@ constexpr int height(int n) { constexpr int offset(int h) { int s = 0, n = N; while (h--) { - s += (n + B - 1) / B * B; n = (n + B - 1) / B; + s += n * B; } return s; } From b8e8ede0ad7a040478df5d985c5bdf417758385b Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Jul 2022 05:37:35 +0300 Subject: [PATCH 500/531] change wording --- content/english/hpc/data-structures/segment-trees.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/english/hpc/data-structures/segment-trees.md b/content/english/hpc/data-structures/segment-trees.md index 90435a38..9ad14608 100644 --- a/content/english/hpc/data-structures/segment-trees.md +++ b/content/english/hpc/data-structures/segment-trees.md @@ -603,14 +603,14 @@ constexpr int H = height(N); alignas(64) int t[offset(H)]; // an array for storing nodes ``` -This way we effectively reduce the height of the tree by approximately $\frac{\log_B n}{\log_2 n} = \log_2 B$ times ($\sim4$ times if $B = 16$), but it becomes non-trivial to implement in-node operations efficiently. 
For our problem, we have two main options: +This way, we effectively reduce the height of the tree by approximately $\frac{\log_B n}{\log_2 n} = \log_2 B$ times ($\sim4$ times if $B = 16$), but it becomes non-trivial to implement in-node operations efficiently. For our problem, we have two main options: 1. We could store $B$ *sums* in each node (for each of its $B$ children). 2. We could store $B$ *prefix sums* in each node (the $i$-th being the sum of the first $(i + 1)$ children). If we go with the first option, the `add` query would be largely the same as in the bottom-up segment tree, but the `sum` query would need to add up to $B$ scalars in each node it visits. And if we go with the second option, the `sum` query would be trivial, but the `add` query would need to add `x` to some suffix on each node it visits. -In either case, one operation will perform $O(\log_B n)$ operations, touching just one scalar in each node, while the other will perform $O(B \cdot \log_B n)$ operations, touching up to $B$ scalars in each node. However, it is 21st century, and we can use [SIMD](/hpc/simd) to accelerate the slower operation. Since there are no fast [horizontal reductions](/hpc/simd/reduction) in SIMD instruction sets, but it is easy to add a vector to a vector, we will choose the second approach and store prefix sums in each node. +In either case, one operation would perform $O(\log_B n)$ operations, touching just one scalar in each node, while the other would perform $O(B \cdot \log_B n)$ operations, touching up to $B$ scalars in each node. We can, however, use [SIMD](/hpc/simd) to accelerate the slower operation, and since there are no fast [horizontal reductions](/hpc/simd/reduction) in SIMD instruction sets, but it is easy to add a vector to a vector, we will choose the second approach and store prefix sums in each node. 
This makes the `sum` query extremely fast and easy to implement: @@ -623,7 +623,7 @@ int sum(int k) { } ``` -The `add` query is more complicated and slower. We need to add a number to only a suffix of a node, and we can do this by [masking out](/hpc/simd/masking) the positions that need not be modified. +The `add` query is more complicated and slower. We need to add a number only to a suffix of a node, and we can do this by [masking out](/hpc/simd/masking) the positions that should not be modified. We can pre-calculate a $B \times B$ array corresponding to $B$ such masks that tell, for each of $B$ positions within a node, whether a certain prefix sum value needs to be updated or not: From 6a06a065e37eb052a4774a2415cfe9fd356acfce Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Jul 2022 05:48:13 +0300 Subject: [PATCH 501/531] adjust header padding --- themes/algorithmica/assets/style.sass | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/themes/algorithmica/assets/style.sass b/themes/algorithmica/assets/style.sass index eb5e2410..b91f9a5f 100644 --- a/themes/algorithmica/assets/style.sass +++ b/themes/algorithmica/assets/style.sass @@ -187,10 +187,10 @@ menu display: flex font-family: $font-headings - height: 30px + height: 26px background-color: $background justify-content: space-between - padding: 12px + padding: 14px margin: 0 text-align: center From 1c8c455097f458dcb86f67e40c77fd1ca15830c6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Jul 2022 05:57:43 +0300 Subject: [PATCH 502/531] change simd titles --- content/english/hpc/_index.md | 4 ++-- content/english/hpc/simd/moving.md | 2 +- content/english/hpc/simd/reduction.md | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index 942c9f6a..7a0068ff 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -163,8 +163,8 @@ Planned table of contents: 9.11. AoS and SoA 10. 
SIMD Parallelism 10.1. Intrinsics and Vector Types - 10.2. Loading and Writing Data - 10.3. Sums and Other Reductions + 10.2. Moving Data + 10.3. Reductions 10.4. Masking and Blending 10.5. In-Register Shuffles 10.6. Auto-Vectorization diff --git a/content/english/hpc/simd/moving.md b/content/english/hpc/simd/moving.md index 948c31c4..72cbbd33 100644 --- a/content/english/hpc/simd/moving.md +++ b/content/english/hpc/simd/moving.md @@ -1,5 +1,5 @@ --- -title: Loading and Writing Data +title: Moving Data aliases: [/hpc/simd/vectorization] weight: 2 --- diff --git a/content/english/hpc/simd/reduction.md b/content/english/hpc/simd/reduction.md index c67c1942..5a0ace1e 100644 --- a/content/english/hpc/simd/reduction.md +++ b/content/english/hpc/simd/reduction.md @@ -1,5 +1,5 @@ --- -title: Sums and Other Reductions +title: Reductions weight: 3 --- From af2c2b90dedcd2ab701cff977383529244219dbc Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Jul 2022 06:11:31 +0300 Subject: [PATCH 503/531] improving search --- themes/algorithmica/assets/style.sass | 3 +++ themes/algorithmica/layouts/partials/head.html | 1 + 2 files changed, 4 insertions(+) diff --git a/themes/algorithmica/assets/style.sass b/themes/algorithmica/assets/style.sass index b91f9a5f..00a420cf 100644 --- a/themes/algorithmica/assets/style.sass +++ b/themes/algorithmica/assets/style.sass @@ -236,6 +236,9 @@ menu background: $code-background border: $code-border + &:focus + outline: 1px solid $dimmed + #search-count margin-top: 8px color: $dimmed diff --git a/themes/algorithmica/layouts/partials/head.html b/themes/algorithmica/layouts/partials/head.html index 2f4c3c46..c5013dba 100644 --- a/themes/algorithmica/layouts/partials/head.html +++ b/themes/algorithmica/layouts/partials/head.html @@ -45,6 +45,7 @@ if (window.getComputedStyle(searchDiv).display == 'none') { searchDiv.style.display = 'block' window.scrollTo({ top: 0 }); + document.getElementById('search-bar').focus() } else { 
searchDiv.style.display = 'none' } From 72e00452f0bbd5d30d436ff207fc3f91ee3c678d Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Jul 2022 07:47:49 +0300 Subject: [PATCH 504/531] spmd --- content/english/hpc/_index.md | 2 +- content/english/hpc/simd/_index.md | 2 +- .../english/hpc/simd/auto-vectorization.md | 26 ++++++++++++++----- 3 files changed, 21 insertions(+), 9 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index 7a0068ff..8d73bcb0 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -167,7 +167,7 @@ Planned table of contents: 10.3. Reductions 10.4. Masking and Blending 10.5. In-Register Shuffles - 10.6. Auto-Vectorization + 10.6. Auto-Vectorization and SPMD 11. Algorithm Case Studies 11.1. Binary GCD (11.2. Prime Number Sieves) diff --git a/content/english/hpc/simd/_index.md b/content/english/hpc/simd/_index.md index 5e05da8e..50f6e3ed 100644 --- a/content/english/hpc/simd/_index.md +++ b/content/english/hpc/simd/_index.md @@ -43,6 +43,6 @@ In particular, AVX2 has instructions for working with 256-bit registers, while b ![](img/intel-extensions.webp) -Compilers often do a good job rewriting simple loops with SIMD instructions, like in the case above. This optimization is called [auto-vectorization](auto-vectorization), and it is the preferred way to use SIMD. +Compilers often do a good job rewriting simple loops with SIMD instructions, like in the case above. This optimization is called [auto-vectorization](auto-vectorization), and it is the most popular way of using SIMD. The problem is that it only works with certain types of loops, and even then it often yields suboptimal results. To understand its limitations, we need to get our hands dirty and explore this technology on a lower level, which is what we are going to do in this chapter. 
diff --git a/content/english/hpc/simd/auto-vectorization.md b/content/english/hpc/simd/auto-vectorization.md index 5fc568c3..b7b8a45f 100644 --- a/content/english/hpc/simd/auto-vectorization.md +++ b/content/english/hpc/simd/auto-vectorization.md @@ -1,15 +1,17 @@ --- -title: Auto-Vectorization +title: Auto-Vectorization and SPMD weight: 10 --- -SIMD-parallelism is most often used for *embarrassingly parallel* computations: the kinds where all you do is apply some elementwise function to all elements of an array and write it back somewhere else. In this setting, you don't even need to know how SIMD works: the compiler is perfectly capable of optimizing such loops by itself — you just need to be aware that such optimization exists and that it usually yields a 5-10x speedup. +SIMD parallelism is most often used for *embarrassingly parallel* computations: the kinds where all you do is apply some elementwise function to all elements of an array and write it back somewhere else. In this setting, you don't even need to know how SIMD works: the compiler is perfectly capable of optimizing such loops by itself — you just need to be aware that such optimization exists and that it usually yields a 5-10x speedup. -Doing nothing and relying on auto-vectorization is actually the preferred way of using SIMD. Whenever you can, you should always stick with the scalar code for its simplicity and maintainability. But often even the loops that seem straightforward to vectorize are not optimized because of some technical nuances. [As in many other cases](/hpc/compilation/contracts), the compiler may need some additional input from the programmer as he may know a bit more about the problem than what can be inferred from static analysis. +Doing nothing and relying on auto-vectorization is actually the most popular way of using SIMD. In fact, in many cases, it is even advised to stick with the plain scalar code for its simplicity and maintainability. 
+ +But often even the loops that seem straightforward to vectorize are not optimized because of some technical nuances. [As in many other cases](/hpc/compilation/contracts), the compiler may need some additional input from the programmer as he may know a bit more about the problem than what can be inferred from static analysis. ### Potential Problems -Consider the "a + b" example: +Consider the "a + b" example we [started with](../intrinsics/#simd-intrinsics): ```c++ void sum(int *a, int *b, int *c, int n) { @@ -47,8 +49,18 @@ for (int i = 0; i < n; i++) To help the compiler eliminate this corner case, we can use the `alignas` specifier on static arrays and the `std::assume_aligned` function to mark pointers aligned. -**Checking if vectorization happened.** In either case, it is useful to check if the compiler vectorized the loop the way you intended. You can either [compiling it to assembly](/hpc/compilation/stages) and look for blocks for instructions that start with a "v" or add the `-fopt-info-vec-optimized` compiler flag so that the compiler indicates where auto-vectorization is happening and what SIMD width is being used. If you swap `optimized` for `missed` or `all`, you may also get some reasoning behind why it is not happening in other places. +**Checking if vectorization happened.** In either case, it is useful to check if the compiler vectorized the loop the way you intended. You can either [compile it to assembly](/hpc/compilation/stages) and look for blocks of instructions that start with a "v" or add the `-fopt-info-vec-optimized` compiler flag so that the compiler indicates where auto-vectorization is happening and what SIMD width is being used. If you swap `optimized` for `missed` or `all`, you may also get some reasoning behind why it is not happening in other places. 
---- +There are [many other ways](https://software.intel.com/sites/default/files/m/4/8/8/2/a/31848-CompilerAutovectorizationGuide.pdf) of telling the compiler exactly what we mean, but in especially complex cases — e.g., when there are a lot of branches or function calls inside the loop — it is easier to go one level of abstraction down and vectorize manually. + +### SPMD + +There is a neat compromise between auto-vectorization and the manual use of SIMD intrinsics: "single program, multiple data" (SPMD). This is a model of computation in which the programmer writes what appears to be a regular serial program, but that is actually executed in parallel on the hardware. + +The programming experience is largely the same, and there is still the fundamental limitation in that the computation must be data-parallel, but SPMD ensures that the vectorization will happen regardless of the compiler and the target CPU architecture. It also allows for the computation to be automatically parallelized across multiple cores and, in some cases, even offloaded to other types of parallel hardware. + +There is support for SPMD in some modern languages ([Julia](https://docs.julialang.org/en/v1/base/base/#Base.SimdLoop.@simd)), multiprocessing APIs ([OpenMP](https://www.openmp.org/spec-html/5.0/openmpsu42.html)), and specialized compilers (Intel [ISPC](https://ispc.github.io/)), but it has seen the most success in the context of GPU programming where both problems and hardware are massively parallel. + +We will cover this model of computation in much more depth in Part 2. -There are [many other ways](https://software.intel.com/sites/default/files/m/4/8/8/2/a/31848-CompilerAutovectorizationGuide.pdf) of telling the compiler what we meant exactly, but in especially complex cases — when inside the loop there are a lot of branches or some functions are called — it is easier to go down to the intrinsics level and write it yourself. 
+ From af8d237fcc253fd5a1d32d281e26fa94f2cae948 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Jul 2022 08:39:36 +0300 Subject: [PATCH 505/531] update index --- content/english/hpc/_index.md | 28 ++++++++++++++++------------ 1 file changed, 16 insertions(+), 12 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index 8d73bcb0..a1ff7f42 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -39,11 +39,11 @@ After that, I will mostly be fixing errors and only doing some minor edits refle **Pre-ordering / financially supporting the book.** Due to my unfortunate citizenship and place of birth, you can't — that is, until I find a way that at the same time complies with international sanctions, doesn't sponsor [the war](https://en.wikipedia.org/wiki/2022_Russian_invasion_of_Ukraine), and won't put me in prison for tax evasion. -So, don't bother. If you want to support this book, just share the articles you like on link aggregators and social media and help fix typos — that would be enough. +So, don't bother. If you want to support this book, just share it and help fix typos — that would be enough. **Translations.** The website has a separate functionality for creating and managing translations — and I've already been contacted by some nice people willing to translate the book into Italian and Chinese (and I will personally translate at least some of it into my native Russian). -However, as the book is still evolving, it is probably not the best idea to start translating it at least until Part I is finished. That said, you are very much encouraged to make translations of any articles and publish them in your blogs — just send me the link so that we can merge it back when a centralized translation process starts. +However, as the book is still evolving, it is probably not the best idea to start translating it at least until Part I is finished. 
That said, you are very much encouraged to make translations of any articles and publish them in your blogs — just send me the link so that we can merge it back when centralized translation starts. **"Translating" the Russian version.** The articles hosted at [ru.algorithmica.org/cs/](https://ru.algorithmica.org/cs/) are not about advanced performance engineering but mostly about classical computer science algorithms — without discussing how to speed them up beyond asymptotic complexity. Most of the information there is not unique and already exists in English on some other places on the internet: for example, the similar-spirited [cp-algorithms.com](https://cp-algorithms.com/). @@ -51,7 +51,7 @@ However, as the book is still evolving, it is probably not the best idea to star There are two highly impactful textbooks on which most computer science courses are built. Both are undoubtedly outstanding, but [one of them](https://en.wikipedia.org/wiki/The_Art_of_Computer_Programming) is 50 years old, and [the other](https://en.wikipedia.org/wiki/Introduction_to_Algorithms) is 30 years old, and [computers have changed a lot](/hpc/complexity/hardware) since then. Asymptotic complexity is not the sole deciding factor anymore. In modern practical algorithm design, you choose the approach that makes better use of different types of parallelism available in the hardware over the one that theoretically does fewer raw operations on galaxy-scale inputs. -And yet, the computer science curricula in most colleges completely ignore this shift. 
Although there are some great courses that aim to correct that — such as "[Performance Engineering of Software Systems](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2018/)" from MIT, "[Programming Parallel Computers](https://ppc.cs.aalto.fi/)" from Aalto University, and some non-academic ones like Denis Bakhvalov's "[Performance Ninja](https://github.com/dendibakh/perf-ninja)" — most computer science graduates still treat the hardware like something from the 1990s. +And yet, the computer science curricula in most colleges completely ignore this shift. Although there are some great courses that aim to correct that — such as "[Performance Engineering of Software Systems](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2018/)" from MIT, "[Programming Parallel Computers](https://ppc.cs.aalto.fi/)" from Aalto University, and some non-academic ones like Denis Bakhvalov's "[Performance Ninja](https://github.com/dendibakh/perf-ninja)" — most computer science graduates still treat modern hardware like something from the 1990s. What I really want to achieve is that performance engineering becomes taught right after introduction to algorithms. Writing the first comprehensive textbook on the subject is a large part of it, and this is why I rush to finish it by the summer so that the colleges can pick it up in the next academic year. But creating a new course requires more than that: you need a balanced curriculum, course infrastructure, lecture slides, lab assignments… so for some time after finishing the main book, I will be working on course materials and tools for *teaching* performance engineering — and I'm looking forward to collaborating with other people who want to make it a reality as well. @@ -76,7 +76,7 @@ Competitive programming is, in my opinion, misguided. 
They are doing useless thi The first part covers the basics of computer architecture and optimization of single-threaded algorithms. -It walks through the main CPU optimization topics such as caching, SIMD and pipelining, and provides brief examples in C++, followed by large case studies where we usually achieve a significant speedup over some STL algorithm or data structure. +It walks through the main CPU optimization topics such as caching, SIMD, and pipelining, and provides brief examples in C++, followed by large case studies where we usually achieve a significant speedup over some STL algorithm or data structure. Planned table of contents: @@ -94,7 +94,7 @@ Planned table of contents: 1.4. Functions and Recursion 1.5. Indirect Branching 1.6. Machine Code Layout - 1.7. Interrupts and System Calls + 1.7. System Calls 1.8. Virtualization 3. Instruction-Level Parallelism 3.1. Pipeline Hazards @@ -215,7 +215,7 @@ Among the cool things that we will speed up: - optimal Karatsuba Algorithm - optimal FFT -This work is largely based on blog posts, research papers, conference talks and other work authored by a lot of people: +This work is largely based on blog posts, research papers, conference talks, and other work authored by a lot of people: - [Agner Fog](https://agner.org/optimize/) - [Daniel Lemire](https://lemire.me/en/#publications) @@ -248,29 +248,33 @@ This work is largely based on blog posts, research papers, conference talks and - [Creel](https://www.youtube.com/c/WhatsACreel) Volume: 450-600 pages -Release date: Q2 2022 +Release date: Q3 2022 ### Part II: Parallel Algorithms -Concurrency, models of parallelism, green threads and concurrent runtimes, cache coherence, synchronization primitives, OpenMP, reductions, scans, list ranking and graph algorithms, lock-free data structures, heterogeneous computing, CUDA, kernels, warps, blocks, matrix multiplication and sorting. 
+Concurrency, models of parallelism, green threads, concurrent runtimes, cache coherence, synchronization primitives, OpenMP, reductions, scans, list ranking, graph algorithms, lock-free data structures, heterogeneous computing, CUDA, kernels, warps, blocks, matrix multiplication, sorting. Volume: 150-200 pages -Release date: 2023? +Release date: 2023-2024? ### Part III: Distributed Computing -Communication-constrained algorithms, message passing, actor model, partitioning, MapReduce, consistency and reliability at scale, storage, compression, scheduling and cloud computing, distributed deep learning. +(I might need some help from here on.) + +Metworking, message passing, actor model, communication-constrained algorithms, distributed primitives, all-reduce, MapReduce, stream processing, query planning, storage, sharding, compression, consistency, reliability, scheduling, cloud computing. Release date: ??? (more likely to be completed than not) ### Part IV: Compilers and Domain-Specific Architectures -LLVM IR, compiler optimizations, JIT-compilation, Cython, JAX, Numba, Julia, OpenCL, DPC++ and oneAPI, XLA, Verilog, FPGAs, ASICs, TPUs and other AI accelerators. +(TODO: come up with a better title — one that emphasizes that this part is mainly about the software-hardware boundary and not PL/IC design.) + +LLVM IR, compiler optimizations & back-end, interpreters, JIT-compilation, Cython, JAX, Numba, Julia, OpenCL, DPC++, oneAPI, XLA, (basic) Verilog, FPGAs, ASICs, TPUs and other AI accelerators. Release date: ??? (less likely to be completed than not) ### Disclaimer: Technology Choices -The examples in this book use C++, GCC, x86-64, CUDA, and Spark, although the underlying principles we aim to convey are not specific to them. +The examples in this book use C++, GCC, x86-64, CUDA, and Spark, although the underlying principles conveyed are not specific to them. 
To clear my conscience, I'm not happy with any of these choices: these technologies just happen to be the most widespread and stable at the moment and thus more helpful to the reader. I would have respectively picked C / Rust, LLVM, arm, OpenCL, and Dask; maybe there will be a 2nd edition in which some of the tech stack is changed. From 20b8479c5ac2ed627cd86baa12e4e6656074c8ae Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Jul 2022 22:52:44 +0300 Subject: [PATCH 506/531] update hpc index --- content/english/hpc/_index.md | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index a1ff7f42..8c5e9ef2 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -239,6 +239,10 @@ This work is largely based on blog posts, research papers, conference talks, and - [Geoff Langdale](https://branchfree.org/) - [Matt Kulukundis](https://twitter.com/JuvHarlequinKFM) - [Georg Sauthoff](https://gms.tf/) +- [Danila Kutenin](https://danlark.org/author/kutdanila/) +- [Ivica Bogosavljević](https://johnysswlab.com/author/ibogi/) +- [Matt Pharr](https://pharr.org/matt/) +- [Jan Wassenberg](https://research.google/people/JanWassenberg/) - [Marshall Lochbaum](https://mlochbaum.github.io/publications.html) - [Pavel Zemtsov](https://pzemtsov.github.io/) - [Nayuki](https://www.nayuki.io/category/programming) @@ -252,22 +256,22 @@ Release date: Q3 2022 ### Part II: Parallel Algorithms -Concurrency, models of parallelism, green threads, concurrent runtimes, cache coherence, synchronization primitives, OpenMP, reductions, scans, list ranking, graph algorithms, lock-free data structures, heterogeneous computing, CUDA, kernels, warps, blocks, matrix multiplication, sorting. 
+Concurrency, models of parallelism, context switching, green threads, concurrent runtimes, cache coherence, synchronization primitives, OpenMP, reductions, scans, list ranking, graph algorithms, lock-free data structures, heterogeneous computing, CUDA, kernels, warps, blocks, matrix multiplication, sorting. Volume: 150-200 pages Release date: 2023-2024? ### Part III: Distributed Computing -(I might need some help from here on.) + -Metworking, message passing, actor model, communication-constrained algorithms, distributed primitives, all-reduce, MapReduce, stream processing, query planning, storage, sharding, compression, consistency, reliability, scheduling, cloud computing. +Metworking, message passing, actor model, communication-constrained algorithms, distributed primitives, all-reduce, MapReduce, stream processing, query planning, storage, sharding, compression, distributed databases, consistency, reliability, scheduling, workflow engines, cloud computing. Release date: ??? (more likely to be completed than not) -### Part IV: Compilers and Domain-Specific Architectures +### Part IV: Software & Hardware -(TODO: come up with a better title — one that emphasizes that this part is mainly about the software-hardware boundary and not PL/IC design.) + LLVM IR, compiler optimizations & back-end, interpreters, JIT-compilation, Cython, JAX, Numba, Julia, OpenCL, DPC++, oneAPI, XLA, (basic) Verilog, FPGAs, ASICs, TPUs and other AI accelerators. @@ -277,4 +281,4 @@ Release date: ??? (less likely to be completed than not) The examples in this book use C++, GCC, x86-64, CUDA, and Spark, although the underlying principles conveyed are not specific to them. -To clear my conscience, I'm not happy with any of these choices: these technologies just happen to be the most widespread and stable at the moment and thus more helpful to the reader. 
I would have respectively picked C / Rust, LLVM, arm, OpenCL, and Dask; maybe there will be a 2nd edition in which some of the tech stack is changed. +To clear my conscience, I'm not happy with any of these choices: these technologies just happen to be the most widespread and stable at the moment and thus more helpful to the reader. I would have respectively picked C / Rust / [Carbon?](https://github.com/carbon-language/carbon-lang), LLVM, arm, OpenCL, and Dask; maybe there will be a 2nd edition in which some of the tech stack is changed. From 6b522385797429bf1a1b5c0295f33ac73350e1a1 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 20 Jul 2022 23:55:19 +0300 Subject: [PATCH 507/531] edit number theory intro --- content/english/hpc/number-theory/_index.md | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/content/english/hpc/number-theory/_index.md b/content/english/hpc/number-theory/_index.md index 6812e14c..d66f85fd 100644 --- a/content/english/hpc/number-theory/_index.md +++ b/content/english/hpc/number-theory/_index.md @@ -3,17 +3,15 @@ title: Number Theory weight: 7 --- -*Disclaimer: this chapter is a very early draft that is probably not worth reading yet.* - In 1940, a British mathematician [G. H. Hardy](https://en.wikipedia.org/wiki/G._H._Hardy) published a famous essay titled "[A Mathematician's Apology](https://en.wikipedia.org/wiki/A_Mathematician%27s_Apology)" discussing the notion that mathematics should be pursued for its own sake rather than for the sake of its applications. -I personally don't agree — and I wrote this book partially to show that there are way too few people working on practical algorithm design instead of theoretical computer science — but I understand where Hardy is coming from. Being 62 years old, he witnessed the devastation caused by the First and the ongoing Second World War that was greatly amplified by the weaponization of science. 
+Similar to mathematics, the various fields of computer science also form a spectrum, with mathematical logic and computability theory on one end and web programming and application development on the other. I assume that you, the reader, are more on the applied side: this book was written to show that there are way too few people working on practical algorithm design instead of theoretical computer science — and since you got to Chapter 7, you probably also believe in that statement. -As a number theorist, Hardy finds calm working in a "useless" field and not having to face any moral dilemmas, writing: +But, regardless of the personal views on the matter, one can see where Hardy is coming from. Being 62 years old at the moment of writing, he witnessed the devastation caused by the First and the ongoing Second World War — which was greatly amplified by the weaponization of science. As a number theorist, Hardy finds calm working in a "useless" field and not having to face any moral dilemmas, writing: > No one has yet discovered any warlike purpose to be served by the theory of numbers or relativity, and it seems unlikely that anyone will do so for many years. -Ironically, this statement was proved very wrong just 5 years later with the development of the atomic bomb, which would not have been possible without the [understanding](https://en.wikipedia.org/wiki/Einstein%E2%80%93Szil%C3%A1rd_letter) of relativity, and the inception of computer-era cryptography, which extensively builds on number theory. +Ironically, this statement was proved very wrong just 5 years later with the development of the atomic bomb, which would not have been possible without the [understanding](https://en.wikipedia.org/wiki/Einstein%E2%80%93Szil%C3%A1rd_letter) of relativity, and the inception of computer-era cryptography, which extensively builds on number theory — the computational aspect of which is the topic of this chapter. 
@@ -54,7 +54,7 @@ $$ \bar{x} = x \cdot r \bmod n $$ -Computing this transformation involves a multiplication and a modulo — an expensive operation that we wanted to optimize away in the first place — which is why we don't use this method for general modular multiplication and only long sequences of operations where transforming numbers to and from the Montgomery space is worth it. +Computing this transformation involves a multiplication and a modulo — an expensive operation that we wanted to optimize away in the first place — which is why we only use this method when the overhead of transforming numbers to and from the Montgomery space is worth it and not for general modular multiplication. @@ -287,6 +287,6 @@ int inverse(int _a) { } ``` -While vanilla binary exponentiation with a compiler-generated fast modulo trick requires ~170ns per `inverse` call, this implementation takes ~166ns, going down to ~158s we omit `transform` and `reduce` (a reasonable use case in modular arithmetic is for `inverse` to be used as a subprocedure in a bigger computation). This is a small improvement, but Montgomery multiplication becomes much more advantageous for SIMD applications and larger data types. +While vanilla binary exponentiation with a compiler-generated fast modulo trick requires ~170ns per `inverse` call, this implementation takes ~166ns, going down to ~158s we omit `transform` and `reduce` (a reasonable use case is for `inverse` to be used as a subprocedure in a bigger modular computation). This is a small improvement, but Montgomery multiplication becomes much more advantageous for SIMD applications and larger data types. **Exercise.** Implement efficient *modular* [matix multiplication](/hpc/algorithms/matmul). 
From a05f571a1762a1f6f8d8b6b329cdbde03b2f56a6 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 21 Jul 2022 02:49:51 +0300 Subject: [PATCH 510/531] move acknowledgements section --- content/english/hpc/_index.md | 58 +++++++++++++++++++---------------- 1 file changed, 31 insertions(+), 27 deletions(-) diff --git a/content/english/hpc/_index.md b/content/english/hpc/_index.md index 8c5e9ef2..ed71792a 100644 --- a/content/english/hpc/_index.md +++ b/content/english/hpc/_index.md @@ -215,7 +215,35 @@ Among the cool things that we will speed up: - optimal Karatsuba Algorithm - optimal FFT -This work is largely based on blog posts, research papers, conference talks, and other work authored by a lot of people: +Volume: 450-600 pages +Release date: Q3 2022 + +### Part II: Parallel Algorithms + +Concurrency, models of parallelism, context switching, green threads, concurrent runtimes, cache coherence, synchronization primitives, OpenMP, reductions, scans, list ranking, graph algorithms, lock-free data structures, heterogeneous computing, CUDA, kernels, warps, blocks, matrix multiplication, sorting. + +Volume: 150-200 pages +Release date: 2023-2024? + +### Part III: Distributed Computing + + + +Metworking, message passing, actor model, communication-constrained algorithms, distributed primitives, all-reduce, MapReduce, stream processing, query planning, storage, sharding, compression, distributed databases, consistency, reliability, scheduling, workflow engines, cloud computing. + +Release date: ??? (more likely to be completed than not) + +### Part IV: Software & Hardware + + + +LLVM IR, compiler optimizations & back-end, interpreters, JIT-compilation, Cython, JAX, Numba, Julia, OpenCL, DPC++, oneAPI, XLA, (basic) Verilog, FPGAs, ASICs, TPUs and other AI accelerators. + +Release date: ??? 
(less likely to be completed than not) + +### Acknowledgements + +The book is largely based on blog posts, research papers, conference talks, and other work authored by a lot of people: - [Agner Fog](https://agner.org/optimize/) - [Daniel Lemire](https://lemire.me/en/#publications) @@ -245,38 +273,14 @@ This work is largely based on blog posts, research papers, conference talks, and - [Jan Wassenberg](https://research.google/people/JanWassenberg/) - [Marshall Lochbaum](https://mlochbaum.github.io/publications.html) - [Pavel Zemtsov](https://pzemtsov.github.io/) +- [Gustavo Duarte](https://manybutfinite.com/) +- [Nyaan](https://nyaannyaan.github.io/library/) - [Nayuki](https://www.nayuki.io/category/programming) - [InstLatX64](https://twitter.com/InstLatX64) - [ridiculous_fish](https://ridiculousfish.com/blog/) - [Z boson](https://stackoverflow.com/users/2542702/z-boson) - [Creel](https://www.youtube.com/c/WhatsACreel) -Volume: 450-600 pages -Release date: Q3 2022 - -### Part II: Parallel Algorithms - -Concurrency, models of parallelism, context switching, green threads, concurrent runtimes, cache coherence, synchronization primitives, OpenMP, reductions, scans, list ranking, graph algorithms, lock-free data structures, heterogeneous computing, CUDA, kernels, warps, blocks, matrix multiplication, sorting. - -Volume: 150-200 pages -Release date: 2023-2024? - -### Part III: Distributed Computing - - - -Metworking, message passing, actor model, communication-constrained algorithms, distributed primitives, all-reduce, MapReduce, stream processing, query planning, storage, sharding, compression, distributed databases, consistency, reliability, scheduling, workflow engines, cloud computing. - -Release date: ??? 
(more likely to be completed than not) - -### Part IV: Software & Hardware - - - -LLVM IR, compiler optimizations & back-end, interpreters, JIT-compilation, Cython, JAX, Numba, Julia, OpenCL, DPC++, oneAPI, XLA, (basic) Verilog, FPGAs, ASICs, TPUs and other AI accelerators. - -Release date: ??? (less likely to be completed than not) - ### Disclaimer: Technology Choices The examples in this book use C++, GCC, x86-64, CUDA, and Spark, although the underlying principles conveyed are not specific to them. From 19bb6305fb564080bc8f0e8995bfeb51038116bd Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Fri, 22 Jul 2022 01:49:24 +0300 Subject: [PATCH 511/531] links to floyd-warshall --- content/english/hpc/algorithms/matmul.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/algorithms/matmul.md b/content/english/hpc/algorithms/matmul.md index 5f2847d2..cf976045 100644 --- a/content/english/hpc/algorithms/matmul.md +++ b/content/english/hpc/algorithms/matmul.md @@ -474,9 +474,9 @@ for (int k = 0; k < n; k++) d[i][j] = min(d[i][j], d[i][k] + d[k][j]); ``` -Interestingly, vectorizing the distance product and executing it $O(\log n)$ times in $O(n^3 \log n)$ total operations is faster than naively executing the Floyd-Warshall algorithm in $O(n^3)$ operations, although not by a lot. +Interestingly, similarly vectorizing the distance product and executing it $O(\log n)$ times ([or possibly fewer](https://arxiv.org/pdf/1904.01210.pdf)) in $O(n^3 \log n)$ total operations is faster than naively executing the Floyd-Warshall algorithm in $O(n^3)$ operations, although not by a lot. -As an exercise, try to speed up this "for-for-for" computation. It is harder to do than in the matrix multiplication case because now there is a logical dependency between the iterations, and you need to perform updates in a particular order, but it is still possible to design a similar kernel and a block iteration order that achieves a 30-50x total speedup. 
+As an exercise, try to speed up this "for-for-for" computation. It is harder to do than in the matrix multiplication case because now there is a logical dependency between the iterations, and you need to perform updates in a particular order, but it is still possible to design [a similar kernel and a block iteration order](https://github.com/sslotin/amh-code/blob/main/floyd/blocked.cc) that achieves a 30-50x total speedup. ## Acknowledgements From fd9bdbea9477ed7e4e0c749f2967bf5997bb73a8 Mon Sep 17 00:00:00 2001 From: Rinat Valiullov <9755333+RinatValiullov@users.noreply.github.com> Date: Tue, 26 Jul 2022 01:54:42 +0500 Subject: [PATCH 512/531] fix typo (duplicate text) --- content/russian/cs/sorting/bubble.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/russian/cs/sorting/bubble.md b/content/russian/cs/sorting/bubble.md index 2d9af9b5..38fa5c8a 100644 --- a/content/russian/cs/sorting/bubble.md +++ b/content/russian/cs/sorting/bubble.md @@ -1,9 +1,10 @@ --- title: Сортировка пузырьком weight: 1 +published: true --- -Наш первый подход будет заключаться в следующем: обозначим за $n$ длину массива и $n$ раз пройдёмся раз пройдемся по нему слева направо, меняя два соседних элемента, если первый больше второго. +Наш первый подход будет заключаться в следующем: обозначим за $n$ длину массива и $n$ раз пройдёмся по нему слева направо, меняя два соседних элемента, если первый больше второго. Каждую итерацию максимальный элемент «всплывает» как пузырек к концу массива — отсюда и название. 
From 326755608c2464b2fddf960cf972b03d2f8a684f Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 28 Jul 2022 09:05:21 +0300 Subject: [PATCH 513/531] underline eytzinger search example --- .../english/hpc/data-structures/binary-search.md | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index f2e61ffb..7401712e 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -343,7 +343,7 @@ while (k <= n) The only problem arises when we need to restore the index of the resulting element, as $k$ does not directly point to it. Consider this example (its corresponding tree is listed above): -```center + + +
      +    array:  0 1 2 3 4 5 6 7 8 9                           
      +eytzinger:  6 3 7 1 5 8 9 0 2 4                           
      +1st range:  -------------------  k := 1                    
      +2nd range:  -------------        k := 2*k     = 2   (6 ≥ 3)
      +3rd range:  -------              k := 2*k     = 4   (3 ≥ 3)
      +4th range:      ---              k := 2*k + 1 = 9   (1 < 3)
      +5th range:        -              k := 2*k + 1 = 19  (2 < 3)
      +
      From da216d6c81f59d334f3c9c26cf2ce768871314bb Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 28 Jul 2022 09:17:31 +0300 Subject: [PATCH 514/531] fix example --- .../english/hpc/data-structures/binary-search.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 7401712e..85f9ef52 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -354,13 +354,13 @@ eytzinger: 6 3 7 1 5 8 9 0 2 4 -->
      -    array:  0 1 2 3 4 5 6 7 8 9                           
      -eytzinger:  6 3 7 1 5 8 9 0 2 4                           
      -1st range:  -------------------  k := 1                    
      -2nd range:  -------------        k := 2*k     = 2   (6 ≥ 3)
      -3rd range:  -------              k := 2*k     = 4   (3 ≥ 3)
      -4th range:      ---              k := 2*k + 1 = 9   (1 < 3)
      -5th range:        -              k := 2*k + 1 = 19  (2 < 3)
      +    array:  0 1 2 3 4 5 6 7 8 9                            
      +eytzinger:  6 3 7 1 5 8 9 0 2 4                            
      +1st range:  ------------?------  k := 2*k     = 2   (6 ≥ 3)
      +2nd range:  ------?------        k := 2*k     = 4   (3 ≥ 3)
      +3rd range:  --?----              k := 2*k + 1 = 9   (1 < 3)
      +4th range:      ?--              k := 2*k + 1 = 19  (2 < 3)
      +5th range:        !                                        
       
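Editorial note on the example above: the trace ends at `k = 19`, which is past the end of the array, so the answer still has to be recovered from `k`. One way to do this, sketched below under some assumptions, is to cancel the trailing right turns (the trailing 1-bits of `k`) together with the left turn that precedes them. The function name `lower_bound_index`, the hard-coded array `t`, and the size `n` are illustrative choices for this note and are not taken from the patched file. `__builtin_ffs` is a GCC/Clang builtin.

```c++
// Eytzinger layout of the sorted array 0..9 from the example, 1-indexed:
// t[0] is unused padding so that the children of node k are 2k and 2k + 1.
const int n = 10;
int t[n + 1] = {0, 6, 3, 7, 1, 5, 8, 9, 0, 2, 4};

// Returns the Eytzinger index of the first element that is not less than x,
// or 0 if every element is less than x.
int lower_bound_index(int x) {
    int k = 1;
    while (k <= n)
        k = 2 * k + (t[k] < x); // left child (2k) if t[k] >= x, right child (2k + 1) otherwise
    // Each right turn appended a 1-bit and each left turn a 0-bit.
    // Drop the trailing 1-bits and the 0-bit before them to return to
    // the last node at which the descent turned left.
    k >>= __builtin_ffs(~k);
    return k;
}
```

For the search key `3` from the example, the loop stops at `k = 19` (binary `10011`), the shift removes three bits, and the function returns index `2`, where `t[2] == 3`.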
      From 0d811cc49a1784a813f071a2aeb5755e5dfd958a Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Thu, 28 Jul 2022 13:50:38 +0300 Subject: [PATCH 515/531] add s-tree rank example --- content/english/hpc/data-structures/s-tree.md | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/content/english/hpc/data-structures/s-tree.md b/content/english/hpc/data-structures/s-tree.md index d241aed5..875f72ec 100644 --- a/content/english/hpc/data-structures/s-tree.md +++ b/content/english/hpc/data-structures/s-tree.md @@ -102,7 +102,19 @@ int i = __builtin_ffs(mask) - 1; // now i is the number of the correct child node ``` -Unfortunately, the compilers are not smart enough yet to auto-vectorize this code, so we need to manually vectorize it with intrinsics: +Unfortunately, the compilers are not smart enough to [auto-vectorize](/hpc/simd/auto-vectorization/) this code yet, so we have to optimize it manually. In AVX2, we can load 8 elements, compare them against the search key, producing a [vector mask](/hpc/simd/masking/), and then extract the scalar mask from it with `movemask`. Here is a minimized illustrated example of what we want to do: + +```center + y = 4 17 65 103 + x = 42 42 42 42 + y ≥ x = 00000000 00000000 11111111 11111111 + ├┬┬┬─────┴────────┴────────┘ +movemask = 0011 + ┌─┘ + ffs = 3 +``` + +Since we are limited to processing 8 elements at a time (half our block / cache line size), we have to split the elements into two groups and then combine the two 8-bit masks. To do this, it will be slightly easier to swap the condition for `x > y` and compute the inverted mask instead: ```c++ typedef __m256i reg; @@ -114,7 +126,7 @@ int cmp(reg x_vec, int* y_ptr) { } ``` -This function works for 8-element vectors, which is half our block / cache line size. 
To process the entire block, we need to call it twice and then combine the masks: +Now, to process the entire block, we need to call it twice and combine the masks: ```c++ int mask = ~( @@ -123,7 +135,7 @@ int mask = ~( ); ``` -Now, to descend down the tree, we use `ffs` on that mask to get the correct child number and just call the `go` function we defined earlier: +To descend down the tree, we use `ffs` on that mask to get the correct child number and just call the `go` function we defined earlier: ```c++ int i = __builtin_ffs(mask) - 1; From f01a7d3df6e6a885fb5b63376df5a0399981bbe2 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Anti=20R=C3=A4is?= Date: Fri, 29 Jul 2022 18:23:51 +0300 Subject: [PATCH 516/531] Improve wording. --- content/english/hpc/profiling/noise.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/profiling/noise.md b/content/english/hpc/profiling/noise.md index 74ff0272..243f3600 100644 --- a/content/english/hpc/profiling/noise.md +++ b/content/english/hpc/profiling/noise.md @@ -1,6 +1,7 @@ --- title: Getting Accurate Results weight: 10 +published: true --- It is not an uncommon for there to be two library algorithm implementations, each maintaining its own benchmarking code, and each claiming to be faster than the other. This confuses everyone involved, especially the users, who have to somehow choose between the two. @@ -111,7 +112,7 @@ for (int i = 0; i < N; i++) checksum ^= lower_bound(q[i]); ``` -It is also sometimes convenient to combine the warm-up run with answer validation, it if is more complicated than just computing some sort of checksum. +It is also sometimes convenient to combine the warm-up run with answer validation, if it is more complicated than just computing some sort of checksum. **Over-optimization.** Sometimes the benchmark is outright erroneous because the compiler just optimized the benchmarked code away. 
To prevent the compiler from cutting corners, you need to add checksums and either print them somewhere or add the `volatile` qualifier, which also prevents any sort of interleaving of loop iterations. From 20d53920f54959981cbab0c17b877a4025763cf4 Mon Sep 17 00:00:00 2001 From: Pasha Date: Sun, 31 Jul 2022 16:33:16 +0300 Subject: [PATCH 517/531] =?UTF-8?q?=D0=BD=D0=B5=D1=81=D0=BE=D0=B3=D0=BB?= =?UTF-8?q?=D0=B0=D1=81=D0=BE=D0=B2=D0=B0=D0=BD=D0=BE=D1=81=D1=82=D1=8C=20?= =?UTF-8?q?=D1=81=D0=BB=D0=BE=D0=B2=20=D0=B2=20=D0=BA=D0=BE=D0=BD=D1=86?= =?UTF-8?q?=D0=B5=20=D0=B0=D0=B1=D0=B7=D0=B0=D1=86=D0=B0?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- content/russian/cs/decomposition/scanline.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/content/russian/cs/decomposition/scanline.md b/content/russian/cs/decomposition/scanline.md index 4c9bcdf0..3bc99afd 100644 --- a/content/russian/cs/decomposition/scanline.md +++ b/content/russian/cs/decomposition/scanline.md @@ -1,11 +1,12 @@ --- title: Сканирующая прямая authors: -- Сергей Слотин + - Сергей Слотин prerequisites: -- /cs/range-queries -- /cs/segment-tree + - /cs/range-queries + - /cs/segment-tree weight: 1 +published: true --- Метод сканирующей прямой (англ. *scanline*) заключается в сортировке точек на координатной прямой либо каких-то абстрактных «событий» по какому-то признаку и последующему проходу по ним. @@ -22,7 +23,7 @@ weight: 1 Это решение можно улучшить. Отсортируем интересные точки по возрастанию координаты и пройдем по ним слева направо, поддерживая количество отрезков `cnt`, которые покрывают данную точку. Если в данной точке начинается отрезок, то надо увеличить `cnt` на единицу, а если заканчивается, то уменьшить. После этого пробуем обновить ответ на задачу текущим значением `cnt`. 
-Как такое писать: нужно представить интересные точки в виде структур с полями «координата» и «тип» (начало / конец) и отсортировать со своим компаратором. Удобно начало отрезка обозначать +1, а конец -1, чтобы просто прибавлять к `cnt` это значение и на разбирать случае. +Как такое писать: нужно представить интересные точки в виде структур с полями «координата» и «тип» (начало / конец) и отсортировать со своим компаратором. Удобно начало отрезка обозначать +1, а конец -1, чтобы просто прибавлять к `cnt` это значение и не разбивать на случаи. Единственный нюанс — если координаты двух точек совпали, чтобы получить правильный ответ, сначала надо рассмотреть все начала отрезков, а только потом концы (чтобы при обновлении ответа в этой координате учлись и правые, и левые граничные отрезки). From 6661563a59217abbe6f69c38d27a6af2cd69aeb4 Mon Sep 17 00:00:00 2001 From: Iago-lito Date: Fri, 5 Aug 2022 16:39:01 +0200 Subject: [PATCH 518/531] Update integer.md --- content/english/hpc/arithmetic/integer.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/english/hpc/arithmetic/integer.md b/content/english/hpc/arithmetic/integer.md index 47f5bd32..686db686 100644 --- a/content/english/hpc/arithmetic/integer.md +++ b/content/english/hpc/arithmetic/integer.md @@ -93,7 +93,7 @@ This seems like an important architecture aspect, but in most cases, it doesn't - Little-endian has the advantage that you can cast a value to a smaller type (e.g., `long long` to `int`) by just loading fewer bytes, which in most cases means doing nothing — thanks to *register aliasing*, `eax` refers to the first 4 bytes of `rax`, so conversion is essentially free. It is also easier to read values in a variety of type sizes — while on big-endian architectures, loading an `int` from a `long long` array would require shifting the pointer by 2 bytes. 
- Big-endian has the advantage that higher bytes are loaded first, which in theory can make highest-to-lowest routines such as comparisons and printing faster. You can also perform certain checks such as finding out whether a number is negative by only loading its first byte. -Big-endian is also more "natural" — this is how we write binary numbers on paper — but the advantage of having faster type conversions outweigh it. For this reason, little-endian is used by default on most hardware, although some CPUs are "bi-endian" and can be configured to switch modes on demand. +Big-endian is also more "natural" — this is how we write binary numbers on paper — but the advantage of having faster type conversions outweights it. For this reason, little-endian is used by default on most hardware, although some CPUs are "bi-endian" and can be configured to switch modes on demand. ### 128-bit Integers From 387715b6c648a722b2fce506aedf2b79a38d18aa Mon Sep 17 00:00:00 2001 From: psn2706 <69345823+psn2706@users.noreply.github.com> Date: Wed, 10 Aug 2022 23:19:44 +0300 Subject: [PATCH 519/531] Correction of typos --- content/russian/cs/persistent/persistent-array.md | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/content/russian/cs/persistent/persistent-array.md b/content/russian/cs/persistent/persistent-array.md index e476c355..018c287a 100644 --- a/content/russian/cs/persistent/persistent-array.md +++ b/content/russian/cs/persistent/persistent-array.md @@ -2,8 +2,9 @@ title: Структуры с откатами weight: 1 authors: -- Сергей Слотин -date: 2021-09-12 + - Сергей Слотин +date: {} +published: true --- Состояние любой структуры как-то лежит в памяти: в каких-то массивах, или в более общем случае, по каким-то определенным адресам в памяти. Для простоты, пусть у нас есть некоторый массив $a$ размера $n$, и нам нужно обрабатывать запросы присвоения и чтения, а также иногда откатывать изменения обратно. 
@@ -20,7 +21,7 @@ int a[N]; stack< pair > s; void change(int k, int x) { - l.push({k, a[k]}); + s.push({k, a[k]}); a[k] = x; } @@ -84,7 +85,7 @@ void rollback() { ```cpp int t = 0; -vector versions[N]; +vector< pair > versions[N]; void change(int k, int x) { versions[k].push_back({t++, x}); From 155891c5ed8502decd64047a21736c6285d8edcd Mon Sep 17 00:00:00 2001 From: zh Wang Date: Fri, 12 Aug 2022 05:36:02 +0800 Subject: [PATCH 520/531] Fix typo --- content/english/hpc/profiling/noise.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/english/hpc/profiling/noise.md b/content/english/hpc/profiling/noise.md index 243f3600..b1b186ae 100644 --- a/content/english/hpc/profiling/noise.md +++ b/content/english/hpc/profiling/noise.md @@ -128,10 +128,10 @@ https://github.com/sosy-lab/benchexec The issues we've described produce *bias* in measurements: they consistently give advantage to one algorithm over the other. There are other types of possible problems with benchmarking that result in either unpredictable skews or just completely random noise, thus increasing *variance*. -These type of issues are caused by side effects and some sort of external noise, mostly due to noisy neighbors and CPU frequency scaling: +These types of issues are caused by side effects and some sort of external noise, mostly due to noisy neighbors and CPU frequency scaling: - If you benchmark a compute-bound algorithm, measure its performance in cycles using `perf stat`: this way it will be independent of clock frequency, fluctuations of which is usually the main source of noise. -- Otherwise, set core frequency to the what you expect it to be and make sure nothing interferes with it. On Linux you can do it with `cpupower` (e.g., `sudo cpupower frequency-set -g powersave` to put it to minimum or `sudo cpupower frequency-set -g ondemand` to enable turbo boost). 
I use a [convenient GNOME shell extension](https://extensions.gnome.org/extension/1082/cpufreq/) that has a separate button to do it. +- Otherwise, set core frequency to what you expect it to be and make sure nothing interferes with it. On Linux you can do it with `cpupower` (e.g., `sudo cpupower frequency-set -g powersave` to put it to minimum or `sudo cpupower frequency-set -g ondemand` to enable turbo boost). I use a [convenient GNOME shell extension](https://extensions.gnome.org/extension/1082/cpufreq/) that has a separate button to do it. - If applicable, turn hyper-threading off and attach jobs to specific cores. Make sure no other jobs are running on the system, turn off networking and try not to fiddle with the mouse. You can't remove noises and biases completely. Even a program's name can affect its speed: the executable's name ends up in an environment variable, environment variables end up on the call stack, and so the length of the name affects stack alignment, which can result in data accesses slowing down due to crossing cache line or memory page boundaries. From adcdf626b1408ad0635ace91dc2a1facdad127d1 Mon Sep 17 00:00:00 2001 From: ar1emicus <87391584+ar1emicus@users.noreply.github.com> Date: Tue, 16 Aug 2022 03:52:35 +0500 Subject: [PATCH 521/531] Update sqrt-structures.md --- content/russian/cs/range-queries/sqrt-structures.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/content/russian/cs/range-queries/sqrt-structures.md b/content/russian/cs/range-queries/sqrt-structures.md index bac0da16..8d2cfd6f 100644 --- a/content/russian/cs/range-queries/sqrt-structures.md +++ b/content/russian/cs/range-queries/sqrt-structures.md @@ -1,10 +1,11 @@ --- title: Корневые структуры authors: -- Сергей Слотин -- Иван Сафонов + - Сергей Слотин + - Иван Сафонов weight: 6 -date: 2021-09-13 +date: {} +published: true --- Корневые оптимизации можно использовать много для чего, в частности в контексте структур данных. 
@@ -68,6 +69,7 @@ void upd(int l, int r, int x) { l += c; } else { + b[l / c] += x; a[l] += x; l++; } @@ -111,8 +113,8 @@ vector< vector > blocks; // возвращает индекс блока и индекс элемента внутри блока pair find_block(int pos) { int idx = 0; - while (blocks[idx].size() >= pos) - pos -= blocks[idx--].size(); + while (blocks[idx].size() <= pos) + pos -= blocks[idx++].size(); return {idx, pos}; } ``` From b80dafe5a8efe0389b1395919dc1770df7408d9f Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Tue, 16 Aug 2022 07:39:08 +0300 Subject: [PATCH 522/531] code style --- content/russian/cs/range-queries/sqrt-structures.md | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/content/russian/cs/range-queries/sqrt-structures.md b/content/russian/cs/range-queries/sqrt-structures.md index 8d2cfd6f..25fe3b5e 100644 --- a/content/russian/cs/range-queries/sqrt-structures.md +++ b/content/russian/cs/range-queries/sqrt-structures.md @@ -4,8 +4,7 @@ authors: - Сергей Слотин - Иван Сафонов weight: 6 -date: {} -published: true +date: 2022-08-16 --- Корневые оптимизации можно использовать много для чего, в частности в контексте структур данных. @@ -24,16 +23,15 @@ published: true ```c++ // c это и количество блоков, и также их размер; оно должно быть чуть больше корня const int maxn = 1e5, c = 330; -int a[maxn], b[c]; -int add[c]; +int a[maxn], b[c], add[c]; for (int i = 0; i < n; i++) b[i / c] += a[i]; ``` -Заведем также массив `add` размера $\sqrt n$, который будем использовать для отложенной операции прибавления на блоке. Будем считать, что реальное значение $i$-го элемента равно `a[i] + add[i / c]`. +Заведем также массив `add` размера $\sqrt n$, который будем использовать для отложенной операции прибавления на блоке: будем считать, что реальное значение $i$-го элемента равно `a[i] + add[i / c]`. 
-Теперь мы можем отвечать на запросы первого типа за $O(\sqrt n)$ на запрос: +Теперь мы можем отвечать на запросы первого типа за $O(\sqrt n)$ операций на запрос: 1. Для всех блоков, лежащих целиком внутри запроса, просто возьмём уже посчитанные суммы и сложим. 2. Для блоков, пересекающихся с запросом только частично (их максимум два — правый и левый), проитерируемся по нужным элементам и поштучно прибавим к ответу. @@ -69,7 +67,7 @@ void upd(int l, int r, int x) { l += c; } else { - b[l / c] += x; + b[l / c] += x; a[l] += x; l++; } From a9e98c13f2373883145b951af55cc881671e1804 Mon Sep 17 00:00:00 2001 From: Vladislav Shirshakov Date: Tue, 16 Aug 2022 18:59:47 +0500 Subject: [PATCH 523/531] =?UTF-8?q?=D0=94=D0=BB=D1=8F=20=D0=BF=D0=B5=D1=80?= =?UTF-8?q?=D0=B5=D0=BC=D0=B5=D0=BD=D0=BD=D0=BE=D0=B9=20=D0=BD=D0=B5=20?= =?UTF-8?q?=D1=83=D0=BA=D0=B0=D0=B7=D0=B0=D0=BD=20=D1=82=D0=B8=D0=BF?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- content/russian/cs/sorting/selection.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/russian/cs/sorting/selection.md b/content/russian/cs/sorting/selection.md index b47f2320..30854b5f 100644 --- a/content/russian/cs/sorting/selection.md +++ b/content/russian/cs/sorting/selection.md @@ -1,6 +1,7 @@ --- title: Сортировка выбором weight: 2 +published: true --- Похожим методом является **сортировка выбором** (минимума или максимума). 
@@ -10,7 +11,7 @@ weight: 2 ```cpp void selection_sort(int *a, int n) { for (int k = 0; k < n - 1; k++) - for (j = k + 1; j < n; j++) + for (int j = k + 1; j < n; j++) if (a[k] > a[j]) swap(a[j], a[k]); } From 7fd943e685a0d3ab4c9073cd704bdb25f2455606 Mon Sep 17 00:00:00 2001 From: Sergey Slotin Date: Wed, 17 Aug 2022 09:40:56 +0300 Subject: [PATCH 524/531] improve wording in branchless programming section --- content/english/hpc/pipelining/branchless.md | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/content/english/hpc/pipelining/branchless.md b/content/english/hpc/pipelining/branchless.md index d7416f35..31bd5a39 100644 --- a/content/english/hpc/pipelining/branchless.md +++ b/content/english/hpc/pipelining/branchless.md @@ -91,7 +91,7 @@ $$ This way you can eliminate branching, but this comes at the cost of evaluating *both* branches and the `cmov` itself. Because evaluating the ">=" branch costs nothing, the performance is exactly equal to [the "always yes" case](../branching/#branch-prediction) in the branchy version. -### When It Is Beneficial +### When Predication Is Beneficial Using predication eliminates [a control hazard](../hazards) but introduces a data hazard. There is still a pipeline stall, but it is a cheaper one: you only need to wait for `cmov` to be resolved and not flush the entire pipeline in case of a mispredict. @@ -180,11 +180,11 @@ int abs(int a) { ### Larger Examples -**Strings.** Oversimplifying things, an `std::string` is comprised of a pointer to a null-terminated char array (also known as "C-string") allocated somewhere on the heap and one integer containing the string size. +**Strings.** Oversimplifying things, an `std::string` is comprised of a pointer to a null-terminated `char` array (also known as a "C-string") allocated somewhere on the heap and one integer containing the string size. -A common value for strings is the empty string — which is also its default value. 
You also need to handle them somehow, and the idiomatic thing to do is to assign `nullptr` as the pointer and `0` as the string size, and then check if the pointer is null or if the size is zero at the beginning of every procedure involving strings. +A common value for a string is the empty string — which is also its default value. You also need to handle them somehow, and the idiomatic approach is to assign `nullptr` as the pointer and `0` as the string size, and then check if the pointer is null or if the size is zero at the beginning of every procedure involving strings. -However, this requires a separate branch, which is costly unless most strings are empty. What we can do to get rid of it is to allocate a "zero C-string," which is just a zero byte allocated somewhere, and then simply point all empty strings there. Now all string operations with empty strings have to read this useless zero byte, but this is still much cheaper than a branch misprediction. +However, this requires a separate branch, which is costly (unless the majority of strings are either empty or non-empty). To remove the check and thus also the branch, we can allocate a "zero C-string," which is just a zero byte allocated somewhere, and then simply point all empty strings there. Now all string operations with empty strings have to read this useless zero byte, but this is still much cheaper than a branch misprediction. **Binary search.** The standard binary search [can be implemented](/hpc/data-structures/binary-search) without branches, and on small arrays (that fit into cache) it works ~4x faster than the branchy `std::lower_bound`: @@ -193,10 +193,10 @@ int lower_bound(int x) { int *base = t, len = n; while (len > 1) { int half = len / 2; - base = (base[half] < x ? 
&base[half] : base); + base += (base[half - 1] < x) * half; // will be replaced with a "cmov" len -= half; } - return *(base + (*base < x)); + return *base; } ``` @@ -218,7 +218,7 @@ That there are no substantial reasons why compilers can't do this on their own, **Data-parallel programming.** Branchless programming is very important for [SIMD](/hpc/simd) applications because they don't have branching in the first place. -In our array sum example, if you remove the `volatile` type qualifier from the accumulator, the compiler becomes able to [vectorize](/hpc/simd/auto-vectorization) the loop: +In our array sum example, removing the `volatile` type qualifier from the accumulator allows the compiler to [vectorize](/hpc/simd/auto-vectorization) the loop: ```c++ /* volatile */ int s = 0; @@ -230,7 +230,7 @@ for (int i = 0; i < N; i++) It now works in ~0.3 per element, which is mainly [bottlenecked by the memory](/hpc/cpu-cache/bandwidth). -The compiler is usually able to vectorize any loop that doesn't have branches or dependencies between the iterations — and some specific deviations from that, such as [reductions](/hpc/simd/reduction) or simple loops that contain just one if-without-else. Vectorization of anything more complex is a very nontrivial problem, which may involve various techniques such as [masking](/hpc/simd/masking) and [in-register permutations](/hpc/simd/shuffling). +The compiler is usually able to vectorize any loop that doesn't have branches or dependencies between the iterations — and some specific small deviations from that, such as [reductions](/hpc/simd/reduction) or simple loops that contain just one if-without-else. Vectorization of anything more complex is a very nontrivial problem, which may involve various techniques such as [masking](/hpc/simd/masking) and [in-register permutations](/hpc/simd/shuffling). -When you fetch anything from memory, there is always some latency before the data arrives. 
Moreover, the request doesn't go directly to its ultimate storage location, but it first goes through a complex system of address translation units and caching layers designed to both help in memory management and reduce the latency. +When you fetch anything from memory, there is always some latency before the data arrives. Moreover, the request doesn't go directly to its ultimate storage location, but it first goes through a complex system of address translation units and caching layers designed to both help in memory management and reduce latency. Therefore, the only correct answer to this question is "it depends" — primarily on where the operands are stored: @@ -27,7 +27,7 @@ Therefore, the only correct answer to this question is "it depends" — primaril - If it was accessed recently, it is probably *cached* and will take less than that to fetch, depending on how long ago it was accessed — it could be ~50 cycles for the slowest layer of cache and around 4-5 cycles for the fastest. - But it could also be stored on some type of *external memory* such as a hard drive, and in this case, it will take around 5ms, or roughly $10^7$ cycles (!) to access it. -Such high variance of memory performance is caused by the fact that memory hardware doesn't follow the same [laws of silicon scaling](/hpc/complexity/hardware) as CPU chips do. Memory is still improving through other means, but if 50 years ago memory timings were roughly on the same scale with the instruction latencies, nowadays they lag far behind. +Such a high variance of memory performance is caused by the fact that memory hardware doesn't follow the same [laws of silicon scaling](/hpc/complexity/hardware) as CPU chips do. Memory is still improving through other means, but if 50 years ago memory timings were roughly on the same scale with the instruction latencies, nowadays they lag far behind. 
![](img/memory-vs-compute.png) diff --git a/content/english/hpc/external-memory/hierarchy.md b/content/english/hpc/external-memory/hierarchy.md index da1f5bb6..26dfc144 100644 --- a/content/english/hpc/external-memory/hierarchy.md +++ b/content/english/hpc/external-memory/hierarchy.md @@ -58,7 +58,7 @@ There are other caches inside CPUs that are used for something other than data. ### Non-Volatile Memory -While the data cells in CPU caches and the RAM only gently store just a few electrons (that periodically leak and need to be periodically refreshed), the data cells in *non-volatile memory* types store hundreds of them. This lets the data to persist for prolonged periods of time without power but comes at the cost of performance and durability — because when you have more electrons, you also have more opportunities for them colliding with silicon atoms. +While the data cells in CPU caches and the RAM only gently store just a few electrons (that periodically leak and need to be periodically refreshed), the data cells in *non-volatile memory* types store hundreds of them. This lets the data persist for prolonged periods of time without power but comes at the cost of performance and durability — because when you have more electrons, you also have more opportunities for them to collide with silicon atoms. diff --git a/content/english/hpc/external-memory/model.md b/content/english/hpc/external-memory/model.md index 35cba4ea..9ab86eba 100644 --- a/content/english/hpc/external-memory/model.md +++ b/content/english/hpc/external-memory/model.md @@ -18,7 +18,7 @@ Similar in spirit, in the *external memory model*, we simply ignore every operat In this model, we measure the performance of an algorithm in terms of its high-level *I/O operations*, or *IOPS* — that is, the total number of blocks read or written to external memory during execution. 
-We will mostly focus on the case where the internal memory is RAM and external memory is SSD or HDD, although the underlying analysis techniques that we will develop are applicable to any layer in the cache hierarchy. Under these settings, reasonable block size $B$ is about 1MB, internal memory size $M$ is usually a few gigabytes, and $N$ is up to a few terabytes. +We will mostly focus on the case where the internal memory is RAM and the external memory is SSD or HDD, although the underlying analysis techniques that we will develop are applicable to any layer in the cache hierarchy. Under these settings, reasonable block size $B$ is about 1MB, internal memory size $M$ is usually a few gigabytes, and $N$ is up to a few terabytes. ### Array Scan diff --git a/content/english/hpc/number-theory/modular.md b/content/english/hpc/number-theory/modular.md index 47310780..3d05e2f9 100644 --- a/content/english/hpc/number-theory/modular.md +++ b/content/english/hpc/number-theory/modular.md @@ -100,7 +100,7 @@ $$ $$ \begin{aligned} a^p &= (\underbrace{1+1+\ldots+1+1}_\text{$a$ times})^p & -\\\ &= \sum_{x_1+x_2+\ldots+x_a = p} P(x_1, x_2, \ldots, x_a) & \text{(by defenition)} +\\\ &= \sum_{x_1+x_2+\ldots+x_a = p} P(x_1, x_2, \ldots, x_a) & \text{(by definition)} \\\ &= \sum_{x_1+x_2+\ldots+x_a = p} \frac{p!}{x_1! x_2! 
\ldots x_a!} & \text{(which terms will not be divisible by $p$?)} \\\ &\equiv P(p, 0, \ldots, 0) + \ldots + P(0, 0, \ldots, p) & \text{(everything else will be canceled)} \\\ &= a From 88ed77156863353ad37e486195ce0ba3ef682afb Mon Sep 17 00:00:00 2001 From: trasua Date: Mon, 5 Sep 2022 15:59:32 +0700 Subject: [PATCH 529/531] fix typo --- content/english/hpc/data-structures/binary-search.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/data-structures/binary-search.md b/content/english/hpc/data-structures/binary-search.md index 85f9ef52..6426ddde 100644 --- a/content/english/hpc/data-structures/binary-search.md +++ b/content/english/hpc/data-structures/binary-search.md @@ -1,6 +1,7 @@ --- title: Binary Search weight: 1 +published: true --- @@ -184,7 +185,7 @@ int lower_bound(int x) { Note that this loop is not always equivalent to the standard binary search. Since it always rounds *up* the size of the search interval, it accesses slightly different elements and may perform one comparison more than needed. Apart from simplifying computations on each iteration, it also makes the number of iterations constant if the array size is constant, removing branch mispredictions completely. -As typical for predication, this trick is very fragile to compiler optimizations — depending on the compiler and how the funciton is invoked, it may still leave a branch or generate suboptimal code. It works fine on Clang 10, yielding a 2.5-3x improvement on small arrays: +As typical for predication, this trick is very fragile to compiler optimizations — depending on the compiler and how the function is invoked, it may still leave a branch or generate suboptimal code. 
It works fine on Clang 10, yielding a 2.5-3x improvement on small arrays: From 4e00ee7cc5769cd650be6d215e0efacf2f14de51 Mon Sep 17 00:00:00 2001 From: novikov-vladimir <99834014+novikov-vladimir@users.noreply.github.com> Date: Fri, 11 Nov 2022 19:57:50 +0300 Subject: [PATCH 530/531] =?UTF-8?q?=D0=90=D0=B2=D1=82=D0=BE=D0=BC=D0=B0?= =?UTF-8?q?=D1=82=D0=BD=D1=8B=D0=B9=20=D0=BF=D0=B5=D1=80=D0=B5=D1=85=D0=BE?= =?UTF-8?q?=D0=B4=20=D0=B4=D0=BE=D0=BB=D0=B6=D0=B5=D0=BD=20=D0=B2=D0=B5?= =?UTF-8?q?=D1=81=D1=82=D0=B8=20=D0=B2=20=D0=B2=D0=B5=D1=80=D1=88=D0=B8?= =?UTF-8?q?=D0=BD=D1=83,=20=D1=81=D0=BE=D0=BE=D1=82=D0=B2=D0=B5=D1=82?= =?UTF-8?q?=D1=81=D1=82=D0=B2=D1=83=D1=8E=D1=89=D1=83=D1=8E=20=D0=BC=D0=B0?= =?UTF-8?q?=D0=BA=D1=81=D0=B8=D0=BC=D0=B0=D0=BB=D1=8C=D0=BD=D0=BE=D0=BC?= =?UTF-8?q?=D1=83=20=D0=BF=D1=80=D0=B8=D0=BD=D0=B8=D0=BC=D0=B0=D0=B5=D0=BC?= =?UTF-8?q?=D0=BE=D0=BC=D1=83=20=D0=B1=D0=BE=D1=80=D0=BE=D0=BC=20=D1=81?= =?UTF-8?q?=D1=83=D1=84=D1=84=D0=B8=D0=BA=D1=81=D1=83=20(=D0=BD=D0=B5=20?= =?UTF-8?q?=D0=BC=D0=B8=D0=BD=D0=B8=D0=BC=D0=B0=D0=BB=D1=8C=D0=BD=D0=BE?= =?UTF-8?q?=D0=BC=D1=83).?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- content/russian/cs/string-structures/aho-corasick.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/content/russian/cs/string-structures/aho-corasick.md b/content/russian/cs/string-structures/aho-corasick.md index 369f5171..2ca1da65 100644 --- a/content/russian/cs/string-structures/aho-corasick.md +++ b/content/russian/cs/string-structures/aho-corasick.md @@ -1,10 +1,11 @@ --- title: Алгоритм Ахо-Корасик authors: -- Сергей Слотин + - Сергей Слотин weight: 2 prerequisites: -- trie + - trie +published: true --- Представим, что мы работаем журналистами в некотором авторитарном государстве, контролирующем СМИ, и в котором время от времени издаются законы, запрещающие упоминать определенные политические события или использовать определенные слова. 
Как эффективно реализовать подобную цензуру программно? @@ -36,7 +37,7 @@ prerequisites: **Определение.** *Суффиксная ссылка* $l(v)$ ведёт в вершину $u \neq v$, которая соответствует наидлиннейшему принимаемому бором суффиксу $v$. -**Определение.** *Автоматный переход* $\delta(v, c)$ ведёт в вершину, соответствующую минимальному принимаемому бором суффиксу строки $v + c$. +**Определение.** *Автоматный переход* $\delta(v, c)$ ведёт в вершину, соответствующую максимальному принимаемому бором суффиксу строки $v + c$. **Наблюдение.** Если переход и так существует в боре (будем называть такой переход *прямым*), то автоматный переход будет вести туда же. From 0fa54119101693a9670972a3c27657d2ee1c59d1 Mon Sep 17 00:00:00 2001 From: DavideGianessi <118054693+DavideGianessi@users.noreply.github.com> Date: Sat, 12 Nov 2022 12:49:11 +0100 Subject: [PATCH 531/531] typo --- content/english/hpc/number-theory/montgomery.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/english/hpc/number-theory/montgomery.md b/content/english/hpc/number-theory/montgomery.md index 669e39ba..0eeef0b0 100644 --- a/content/english/hpc/number-theory/montgomery.md +++ b/content/english/hpc/number-theory/montgomery.md @@ -1,6 +1,7 @@ --- title: Montgomery Multiplication weight: 4 +published: true --- Unsurprisingly, a large fraction of computation in [modular arithmetic](../modular) is often spent on calculating the modulo operation, which is as slow as [general integer division](/hpc/arithmetic/division/) and typically takes 15-20 cycles, depending on the operand size. @@ -287,6 +288,6 @@ int inverse(int _a) { } ``` -While vanilla binary exponentiation with a compiler-generated fast modulo trick requires ~170ns per `inverse` call, this implementation takes ~166ns, going down to ~158s we omit `transform` and `reduce` (a reasonable use case is for `inverse` to be used as a subprocedure in a bigger modular computation). 
This is a small improvement, but Montgomery multiplication becomes much more advantageous for SIMD applications and larger data types. +While vanilla binary exponentiation with a compiler-generated fast modulo trick requires ~170ns per `inverse` call, this implementation takes ~166ns, going down to ~158ns if we omit `transform` and `reduce` (a reasonable use case is for `inverse` to be used as a subprocedure in a bigger modular computation). This is a small improvement, but Montgomery multiplication becomes much more advantageous for SIMD applications and larger data types. **Exercise.** Implement efficient *modular* [matix multiplication](/hpc/algorithms/matmul).
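The hunk above refers to `transform` and `reduce` without showing them. As a rough illustration of what such procedures might look like, here is a minimal sketch of Montgomery arithmetic, assuming an odd modulus below $2^{31}$ and $r = 2^{32}$. The `Montgomery` struct, its `multiply` method, and the `u32`/`u64` aliases are illustrative names chosen here, not necessarily the ones used in the patched file.

```c++
#include <cstdint>

typedef uint32_t u32;
typedef uint64_t u64;

// Hypothetical helper, not taken from the patch: Montgomery arithmetic
// for an odd modulus n < 2^31, with r = 2^32.
struct Montgomery {
    u32 n;  // the modulus
    u32 nr; // inverse of n modulo 2^32

    explicit Montgomery(u32 n) : n(n), nr(1) {
        // Newton's iteration: each step doubles the number of correct low bits
        for (int i = 0; i < 5; i++)
            nr *= 2 - n * nr;
    }

    // Maps x < n * 2^32 to x * r^(-1) mod n, fully reduced to [0, n)
    u32 reduce(u64 x) const {
        u32 q = u32(x) * nr;             // q = x * n^(-1) mod 2^32
        u32 m = u32((u64(q) * n) >> 32); // high half of q * n
        u32 t = u32(x >> 32) + n - m;    // lies in [1, 2n - 1]
        return t >= n ? t - n : t;
    }

    // Product of two values that are already in Montgomery form
    u32 multiply(u32 x, u32 y) const {
        return reduce(u64(x) * y);
    }

    // Converts x into Montgomery form: x * r mod n
    u32 transform(u32 x) const {
        return u32((u64(x) << 32) % n);
    }
};
```

In this sketch, `transform` is the only place that uses a real modulo operation, which is why a longer computation would convert its inputs once, stay in Montgomery form for all intermediate products, and call `reduce` only on the final result.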