Benchmarking is hard... (c)
- All tests were run on Gentoo Linux with an AMD 3700X.
- tiny-skia SSE2 is built with `-Ctarget-cpu=x86-64`.
- tiny-skia AVX is built with `-Ctarget-cpu=haswell`.
- Skia v85.
- Skia SSE2 is built using clang with `-march=x86-64`.
- Skia AVX is built using clang with `-march=haswell`.
- cairo v1.16.0.
- cairo is built using gcc with `-march=native`, because it's the system one.
- The raqote version can be found in `Cargo.lock`.
- raqote is built with `-Ctarget-cpu=x86-64`. Testing with `haswell` doesn't change the results much.

All measurements are in nanoseconds. Lower is better.
blend.rs
Filling a shape with a solid color.
Mode/Library | tiny-skia SSE2 | tiny-skia AVX | Skia SSE2 | Skia AVX | cairo | raqote |
---|---|---|---|---|---|---|
clear | 51,020 | 41,785 | 45,898 | 50,203 | 62,331 | 1,043,659 |
source | 51,467 | 41,287 | 46,195 | 50,949 | 62,548 | 1,114,946 |
destination | 63 | 67 | 5,309 | 5,318 | 4 | 980,993 |
source_over | 492,146 | 355,726 | 490,423 | 261,943 | 212,838 | 2,645,207 |
destination_over | 582,450 | 427,992 | 583,660 | 306,542 | 236,970 | 3,446,445 |
source_in | 588,029 | 413,983 | 577,464 | 291,008 | 1,104,817 | 1,163,684 |
destination_in | 518,637 | 379,905 | 570,325 | 293,412 | 1,103,758 | 1,193,288 |
source_out | 600,771 | 420,907 | 576,125 | 297,537 | 1,122,571 | 1,182,724 |
destination_out | 519,200 | 379,237 | 576,521 | 299,481 | 556,088 | 1,193,241 |
source_atop | 577,346 | 426,021 | 603,723 | 307,222 | 663,254 | 1,501,419 |
destination_atop | 552,738 | 406,357 | 595,771 | 306,610 | 1,220,442 | 1,511,405 |
xor | 581,027 | 435,475 | 601,201 | 310,439 | 671,439 | 1,634,236 |
plus | 508,457 | 358,197 | 559,063 | 286,504 | 122,496 | 4,892,311 |
modulate | 522,697 | 379,620 | 570,734 | 290,988 | - | - |
screen | 523,308 | 389,701 | 598,637 | 312,507 | 3,837,010 | 1,459,500 |
overlay | 826,240 | 514,436 | 733,158 | 385,684 | 3,412,192 | 7,016,443 |
darken | 631,307 | 434,548 | 610,493 | 317,215 | 3,579,931 | 5,230,384 |
lighten | 633,910 | 434,123 | 621,340 | 319,867 | 3,584,409 | 5,255,202 |
color_dodge | 1,794,356 | 1,002,832 | 780,556 | 679,996 | 5,151,488 | 9,634,447 |
color_burn | 1,878,874 | 1,055,831 | 861,359 | 708,962 | 5,007,181 | 9,617,102 |
hard_light | 804,040 | 508,038 | 734,878 | 370,307 | 3,442,584 | 7,034,417 |
soft_light | 2,410,827 | 1,412,907 | 1,225,756 | 986,403 | 5,900,415 | 11,941,630 |
difference | 650,915 | 440,554 | 632,797 | 326,285 | 3,936,284 | 5,718,776 |
exclusion | 530,865 | 399,089 | 618,645 | 316,235 | 3,842,131 | 6,082,393 |
multiply | 586,406 | 434,776 | 627,202 | 316,841 | 3,608,817 | 5,986,364 |
hue | 2,675,939 | 2,008,062 | 1,705,745 | 1,413,791 | 7,517,902 | 13,716,827 |
saturation | 2,825,041 | 2,006,886 | 1,676,187 | 1,411,117 | 7,443,261 | 13,752,382 |
color | 2,350,240 | 1,655,401 | 1,431,804 | 1,115,800 | 6,070,058 | 10,537,391 |
luminosity | 2,360,829 | 1,599,196 | 1,351,584 | 1,090,402 | 6,124,294 | 10,488,916 |
`destination` is faster in tiny-skia because we exit immediately, while Skia uses a null blitter, so edge processing still takes place.
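To illustrate why `destination` can be nearly free, here is a minimal sketch of a span blender with an early exit. It is not the tiny-skia implementation: `PremulRgba8`, `BlendMode` and `blend_span` are hypothetical names, and only four Porter-Duff modes are covered.

```rust
/// A premultiplied RGBA8 pixel (hypothetical type, for illustration only).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct PremulRgba8 {
    pub r: u8,
    pub g: u8,
    pub b: u8,
    pub a: u8,
}

/// A small, hypothetical subset of the blend modes from the table above.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum BlendMode {
    Clear,
    Source,
    Destination,
    SourceOver,
}

/// Multiplies two 8-bit values treated as fractions of 255, with rounding.
fn mul_div_255(a: u8, b: u8) -> u8 {
    let t = a as u32 * b as u32 + 128;
    ((t + (t >> 8)) >> 8) as u8
}

/// Blends a solid `src` color into every pixel of `dst`.
pub fn blend_span(mode: BlendMode, src: PremulRgba8, dst: &mut [PremulRgba8]) {
    match mode {
        // The result is the destination itself, so there is nothing to do.
        BlendMode::Destination => {}
        // Both modes ignore the destination and reduce to a plain overwrite.
        BlendMode::Clear => dst.fill(PremulRgba8 { r: 0, g: 0, b: 0, a: 0 }),
        BlendMode::Source => dst.fill(src),
        // result = src + dst * (1 - src.a), on premultiplied channels.
        BlendMode::SourceOver => {
            let inv = 255 - src.a;
            for p in dst.iter_mut() {
                p.r = src.r.saturating_add(mul_div_255(p.r, inv));
                p.g = src.g.saturating_add(mul_div_255(p.g, inv));
                p.b = src.b.saturating_add(mul_div_255(p.b, inv));
                p.a = src.a.saturating_add(mul_div_255(p.a, inv));
            }
        }
    }
}
```

In a real rasterizer the early exit would ideally happen before path edges are built at all, which is presumably where the difference with a null blitter comes from.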
fill_aa.rs
Test/Library | tiny-skia SSE2 | tiny-skia AVX | Skia SSE2 | Skia AVX | cairo | raqote |
---|---|---|---|---|---|---|
fill | 663,415 | 542,188 | 495,359 | 348,326 | 538,842 | 1,520,830 |
memset_fill.rs
Shape filling by overwriting original pixels. No blending.
Test/Library | tiny-skia SSE2 | tiny-skia AVX | Skia SSE2 | Skia AVX | cairo | raqote |
---|---|---|---|---|---|---|
opaque | 56,173 | 45,190 | 45,392 | 47,662 | 45,804 | 2,432,875 |
source | 56,542 | 50,158 | 50,433 | 50,663 | 48,839 | 678,795 |
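As a rough sketch of what "overwriting without blending" means at the span level, the function below writes a packed color over a horizontal run of pixels. The `u32` pixel layout and the `fill_span` name are assumptions made for this example, not the API of any of the benchmarked libraries.

```rust
/// Overwrites pixels `x0..x1` on row `y` of a `width`-wide buffer.
/// Each pixel is a packed 32-bit color (assumed layout, for illustration).
pub fn fill_span(pixels: &mut [u32], width: usize, y: usize, x0: usize, x1: usize, color: u32) {
    // Select the row, then overwrite the requested range in one call.
    let row = &mut pixels[y * width..(y + 1) * width];
    row[x0..x1].fill(color);
}
```

Because no destination pixel is read, the cost is dominated by memory writes, which is consistent with the results being close across most libraries in this test.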
fill_rect.rs
Fills a rectangle with a solid color.
Test/Library | tiny-skia SSE2 | tiny-skia AVX | Skia SSE2 | Skia AVX | cairo | raqote |
---|---|---|---|---|---|---|
basic | 699,878 | 499,512 | 711,992 | 348,918 | 203,921 | 3,672,295 |
with AA | 744,711 | 541,858 | 735,547 | 371,368 | 192,178 | 2,087,047 |
with AA and transform | 369,684 | 284,392 | 321,105 | 191,670 | 175,061 | 891,996 |
The last test simply falls back to path filling in Skia/tiny-skia.
Strangely, Skia is pretty slow in this task. Not sure why.
fill_all.rs
Filling the whole canvas with a color.
Test/Library | tiny-skia SSE2 | tiny-skia AVX | Skia SSE2 | Skia AVX | cairo | raqote |
---|---|---|---|---|---|---|
fill | 64,213 | 64,916 | 1,112,417 | 554,113 | 285,316 | 46,560 |
spiral.rs
A spiral has a lot of short horizontal strides which are not CPU-friendly when a rendering backend is designed for long horizontal strides.
Test/Library | tiny-skia SSE2 | tiny-skia AVX | Skia SSE2 | Skia AVX | cairo | raqote |
---|---|---|---|---|---|---|
stroke | 1,658,247 | 1,602,847 | 1,205,407 | 1,174,229 | 3,161,763 | 5,707,383 |
hairline.rs
Draws a large spiral using a subpixel stroke width.
Test/Library | tiny-skia SSE2 | tiny-skia AVX | Skia SSE2 | Skia AVX | cairo | raqote |
---|---|---|---|---|---|---|
no AA | 1,391,101 | 1,345,418 | 1,022,641 | 1,027,160 | 2,469,764 | - |
with AA | 3,363,682 | 3,084,936 | 1,846,240 | 1,868,304 | 13,457,819 | - |
- `raqote` doesn't support hairline stroking.
- Not sure why `cairo` is so slow with AA.
gradients.rs
Test/Library | tiny-skia SSE2 | tiny-skia AVX | Skia SSE2 | Skia AVX | cairo | raqote |
---|---|---|---|---|---|---|
linear, two stops, pad | 1,417,980 | 1,223,505 | 993,781 | 571,583 | 2,458,368 | 3,379,723 |
linear, two stops, reflect | 1,724,305 | 1,391,261 | 1,248,226 | 684,310 | 2,449,727 | 3,301,529 |
linear, two stops, repeat | 1,633,253 | 1,294,505 | 1,043,289 | 599,061 | 2,445,711 | 3,115,416 |
linear, three stops, evenly spread | 2,068,385 | 1,940,036 | 1,806,614 | 781,344 | 2,413,454 | 4,021,338 |
linear, three stops, unevenly spread | 2,068,394 | 1,939,826 | 1,805,379 | 687,479 | 2,423,142 | 3,412,176 |
simple radial | 2,293,883 | 2,205,554 | 2,050,437 | 805,376 | 4,704,141 | 5,531,178 |
two point radial | 2,700,221 | 2,516,614 | 1,943,448 | 1,083,230 | 4,709,760 | 13,454,676 |
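For readers unfamiliar with the `pad`, `reflect` and `repeat` modes in the table, the sketch below shows one common way to map a gradient parameter `t` back into `[0, 1]` and to interpolate between two stops. `SpreadMode`, `apply_spread` and `lerp_color` are hypothetical names, and the benchmarked libraries may implement this differently.

```rust
/// How a gradient behaves outside its start and end points.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum SpreadMode {
    Pad,
    Reflect,
    Repeat,
}

/// Maps an arbitrary gradient parameter `t` into the `[0, 1]` range.
pub fn apply_spread(t: f32, mode: SpreadMode) -> f32 {
    match mode {
        // Clamp: the edge colors extend forever.
        SpreadMode::Pad => t.clamp(0.0, 1.0),
        // Keep only the fractional part: 0 -> 1, 0 -> 1, ...
        SpreadMode::Repeat => t - t.floor(),
        // Triangle wave: 0 -> 1 -> 0 -> 1, ...
        SpreadMode::Reflect => {
            let u = t.rem_euclid(2.0);
            if u > 1.0 { 2.0 - u } else { u }
        }
    }
}

/// Linearly interpolates between two RGBA colors for a two-stop gradient.
pub fn lerp_color(c0: [f32; 4], c1: [f32; 4], t: f32) -> [f32; 4] {
    let mut out = [0.0; 4];
    for i in 0..4 {
        out[i] = c0[i] + (c1[i] - c0[i]) * t;
    }
    out
}
```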
pattern.rs
Test/Library | tiny-skia SSE2 | tiny-skia AVX | Skia SSE2 | Skia AVX | cairo | raqote |
---|---|---|---|---|---|---|
plain (nearest, no ts) | 2,633,745 | 2,263,839 | 1,315,079 | 1,122,982 | 785,550 | 1,865,327 |
lq (bilinear, with ts) | 8,602,786 | 4,865,226 | 4,484,023 | 2,646,523 | 17,612,685 | 24,906,379 |
hq (bicubic/gauss, with ts) | 27,967,739 | 14,760,398 | 12,386,848 | 9,364,356 | 162,771,632 | - |
Note that `raqote` doesn't support high quality filtering.
And yes, cairo is really that slow. Not sure why.
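The gap between the `nearest` and `bilinear` rows is partly inherent: nearest sampling reads one source pixel per output pixel, while bilinear reads four and blends them. A minimal single-channel sketch, assuming pixel centers at integer + 0.5 and edge clamping (`GrayImage` is a hypothetical type, not part of any benchmarked library):

```rust
/// Hypothetical single-channel image with pixel centers at integer + 0.5.
pub struct GrayImage {
    pub width: usize,
    pub height: usize,
    pub data: Vec<f32>,
}

impl GrayImage {
    /// Reads a pixel, clamping coordinates to the image edges.
    fn at(&self, x: isize, y: isize) -> f32 {
        let xc = x.clamp(0, self.width as isize - 1) as usize;
        let yc = y.clamp(0, self.height as isize - 1) as usize;
        self.data[yc * self.width + xc]
    }

    /// Nearest-neighbor sampling: one read per sample.
    pub fn sample_nearest(&self, x: f32, y: f32) -> f32 {
        self.at(x.floor() as isize, y.floor() as isize)
    }

    /// Bilinear sampling: four reads and three interpolations per sample.
    pub fn sample_bilinear(&self, x: f32, y: f32) -> f32 {
        // Shift so that integer coordinates land on pixel centers.
        let (fx, fy) = (x - 0.5, y - 0.5);
        let (x0, y0) = (fx.floor(), fy.floor());
        let (tx, ty) = (fx - x0, fy - y0);
        let (ix, iy) = (x0 as isize, y0 as isize);
        let top = self.at(ix, iy) * (1.0 - tx) + self.at(ix + 1, iy) * tx;
        let bottom = self.at(ix, iy + 1) * (1.0 - tx) + self.at(ix + 1, iy + 1) * tx;
        top * (1.0 - ty) + bottom * ty
    }
}
```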
clip.rs
Test/Library | tiny-skia SSE2 | tiny-skia AVX | Skia SSE2 | Skia AVX | cairo | raqote |
---|---|---|---|---|---|---|
clip path | 2,371,153 | 2,066,654 | 579,724 | 299,567 | 336,489 | 4,964,327 |
clip path AA | 2,480,372 | 2,152,685 | 898,489 | 605,572 | 367,626 | 3,396,255 |
`tiny-skia` uses just a simple alpha mask for clipping, while Skia has a very complicated but much faster algorithm.
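A simple alpha-mask clip can be pictured as one coverage byte per pixel that scales whatever coverage the rasterizer produces. The sketch below assumes that model; `ClipMask` and its methods are hypothetical and do not mirror the tiny-skia API.

```rust
/// Multiplies two 8-bit values treated as fractions of 255, with rounding.
fn mul_div_255(a: u8, b: u8) -> u8 {
    let t = a as u32 * b as u32 + 128;
    ((t + (t >> 8)) >> 8) as u8
}

/// Hypothetical 8-bit clip mask: one byte per pixel,
/// 0 = fully clipped, 255 = fully visible.
pub struct ClipMask {
    width: usize,
    data: Vec<u8>,
}

impl ClipMask {
    /// Creates a mask that clips nothing.
    pub fn new(width: usize, height: usize) -> Self {
        ClipMask { width, data: vec![255; width * height] }
    }

    /// Sets the mask value of a single pixel.
    pub fn set(&mut self, x: usize, y: usize, value: u8) {
        self.data[y * self.width + x] = value;
    }

    /// Scales the coverage of a span starting at `(x, y)` by the mask.
    pub fn apply(&self, x: usize, y: usize, coverage: &mut [u8]) {
        let start = y * self.width + x;
        let mask = &self.data[start..start + coverage.len()];
        for (c, m) in coverage.iter_mut().zip(mask) {
            *c = mul_div_255(*c, *m);
        }
    }
}
```

The trade-off of this approach is that every drawn pixel pays for a mask read and a multiply, even when the clip is a simple shape.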
png_io.rs
We are comparing raw `png` crate performance with the `tiny-skia` premultiplying/demultiplying code to check how much overhead it adds.
Test/Library | tiny-skia | png |
---|---|---|
Decode RGB | 128,367 | 67,293 |
Decode RGBA | 109,752 | 90,793 |
Encode RGBA | 302,154 | 275,042 |
RGB (without alpha) is slower, since we have to decode an image into an RGB buffer and then transform it into an RGBA buffer. Waiting for image-png/#239.
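The extra work measured here can be sketched as three small passes over the pixel data: expanding RGB to RGBA, premultiplying after decoding, and demultiplying before encoding. The functions below are an illustration under that assumption, with hypothetical names and rounding choices, not the code tiny-skia actually uses.

```rust
/// Expands tightly packed RGB8 data into RGBA8 with an opaque alpha channel.
pub fn rgb_to_rgba(rgb: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(rgb.len() / 3 * 4);
    for px in rgb.chunks_exact(3) {
        out.extend_from_slice(&[px[0], px[1], px[2], 255]);
    }
    out
}

/// Converts straight-alpha RGBA8 to premultiplied alpha, in place.
pub fn premultiply(rgba: &mut [u8]) {
    for px in rgba.chunks_exact_mut(4) {
        let a = px[3] as u32;
        for c in &mut px[..3] {
            // color * alpha / 255, rounded to nearest.
            *c = ((*c as u32 * a + 127) / 255) as u8;
        }
    }
}

/// Converts premultiplied RGBA8 back to straight alpha, in place.
pub fn demultiply(rgba: &mut [u8]) {
    for px in rgba.chunks_exact_mut(4) {
        let a = px[3] as u32;
        if a == 0 {
            // A fully transparent pixel carries no color information.
            px[..3].fill(0);
            continue;
        }
        for c in &mut px[..3] {
            // color * 255 / alpha, rounded to nearest and clamped to 255.
            *c = ((*c as u32 * 255 + a / 2) / a).min(255) as u8;
        }
    }
}
```

Note that the round trip is lossy for translucent pixels, since premultiplication discards low bits of the color channels.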
We support only Linux. The benchmark may work on other OSes, but it will require a lot of preparation (building Skia and cairo).
You have to install cairo first and build Skia from source (see below).
Run:
export SKIA_DIR="/path/to/skia"
export SKIA_LIB_DIR="/path/to/skia/out/Shared"
export LD_LIBRARY_PATH="/path/to/skia/out/Shared"
cargo bench
You will need `git`, `clang`, `ninja` and Python 2.
On Windows, use `clang-cl` and `clang-cl++` for `cc` and `cxx` instead.
git clone https://skia.googlesource.com/skia.git
cd skia
git fetch --all
git checkout -b m85 origin/chrome/m85
python2 tools/git-sync-deps # this will download about 3 GiB of code
bin/gn gen out/Shared --args='
is_official_build=false
is_component_build=true
is_debug=false
cc="clang"
cxx="clang++"
extra_cflags_cc=["-march=native", "-DSK_FORCE_RASTER_PIPELINE_BLITTER"]
werror=false
paragraph_gms_enabled=false
paragraph_tests_enabled=false
skia_enable_android_utils=false
skia_enable_discrete_gpu=false
skia_enable_gpu=false
skia_enable_nvpr=false
skia_enable_particles=false
skia_enable_pdf=false
skia_enable_skottie=false
skia_enable_skrive=false
skia_enable_skshaper=false
skia_enable_sksl_interpreter=false
skia_enable_skvm_jit=false
skia_enable_tools=false
skia_use_expat=false
skia_use_gl=false
skia_use_harfbuzz=false
skia_use_icu=false
skia_use_libgifcodec=false
skia_use_libheif=false
skia_use_libjpeg_turbo_decode=false
skia_use_libjpeg_turbo_encode=false
skia_use_libwebp_decode=false
skia_use_libwebp_encode=false
skia_use_lua=false
skia_use_piex=false'
ninja -C out/Shared