Monday, 1 February 2016

Memcached vs Redis (phpredis vs Predis) vs XCache as a cache engine for PHP in 2016

Have you ever wondered which engine is best for caching in 2016? Whether you should switch from one to another? Whether Redis, the youngest, is better than its competitors? Whether XCache is still alive now that PHP 5.5 has a built-in opcode cache? The short answer is:

In terms of raw speed XCache is the clear winner. Redis and Memcached are roughly equal. But in most cases the differences caused by the choice of engine are irrelevant compared to the differences caused by how we use it.

First, we should ask ourselves: what is a cache? How do we use it? What do we expect from it?

For me a cache is a set of independent key-value pairs that can have an expiration time. Keys can be deleted at any time (randomly, by an LRU mechanism, or by a complete flush), and the application should simply recreate them and keep going without errors or consistency problems. So, for example, a cache is a bad place for counters, unless you are perfectly willing to accept that a counter can disappear at any time. Storing keys that hold metadata about other keys can also lead to consistency problems when one key is deleted and the other is not (example: implementing some sort of tagging mechanism for cache keys).

So all I expect from a cache driver is that it implements an interface similar to this one:

interface Cache {
    public function get($key, $default = null);
    public function set($key, $value, $expiration = 0);
    public function remember($key, $closure, $expiration = 0);
    public function delete($key);
    public function flush();
}
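
To make the contract concrete, here is a minimal sketch of a driver that satisfies this interface using a plain PHP array. The class name `ArrayCache` is hypothetical, and the semantics of `$expiration` (a TTL in seconds, with 0 meaning "never expires") are my assumption, since the interface itself does not define them.

```php
<?php
// Hypothetical in-memory driver, for illustration only.
interface Cache {
    public function get($key, $default = null);
    public function set($key, $value, $expiration = 0);
    public function remember($key, $closure, $expiration = 0);
    public function delete($key);
    public function flush();
}

class ArrayCache implements Cache {
    // key => [value, expiresAt]; expiresAt is a Unix timestamp, 0 = no expiry
    private $items = [];

    public function get($key, $default = null) {
        if (!array_key_exists($key, $this->items)) {
            return $default;
        }
        list($value, $expiresAt) = $this->items[$key];
        if ($expiresAt !== 0 && $expiresAt <= time()) {
            unset($this->items[$key]); // expired: behave like a miss
            return $default;
        }
        return $value;
    }

    public function set($key, $value, $expiration = 0) {
        // Assumption: $expiration is a TTL in seconds, 0 means "forever".
        $expiresAt = $expiration > 0 ? time() + $expiration : 0;
        $this->items[$key] = [$value, $expiresAt];
    }

    public function remember($key, $closure, $expiration = 0) {
        // A unique sentinel tells a miss apart from a stored null.
        $miss = new stdClass();
        $value = $this->get($key, $miss);
        if ($value === $miss) {
            $value = $closure();
            $this->set($key, $value, $expiration);
        }
        return $value;
    }

    public function delete($key) {
        unset($this->items[$key]);
    }

    public function flush() {
        $this->items = [];
    }
}
```

Note that `remember()` simply recreates the value whenever the key is missing, which is exactly the "keys can vanish at any time" behaviour described above.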
OK, enough introduction (for now), let's jump to some test results:



Write(short value):
Engine Avg[ms] Stddev[ms] Difference from fastest[%]
xcache 0,003 0 100,00%
redis 0,022 0 733,33%
memcached 0,022 0,002 733,33%
redis 0,023 0,001 766,67%
predis 0,035 0 1166,67%

Desktop env = Debian Jessie (https://www.turnkeylinux.org/nginx-php-fastcgi) on a physical machine (Intel Pentium G850, 8 GB RAM).

Redis and Memcached are equally fast within the error margin, but about 7 times slower than XCache. Connecting to the Redis server through the Predis PHP library is about 60% slower than connecting through the phpredis extension (0.035 ms vs 0.022 ms). But those were just "out of the box configuration" tests, which means TCP connections to "localhost". Both Redis and Memcached also offer Unix "socket" connections.
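
For reference, the "Difference from fastest[%]" column in the table above is consistent with dividing each average by the smallest average and multiplying by 100. The function name below is hypothetical; the sample averages are copied from that table, with decimal commas written as dots.

```php
<?php
// Recomputes the "Difference from fastest[%]" column from average times.
function differenceFromFastest(array $avgMs) {
    $fastest = min($avgMs);
    $result = [];
    foreach ($avgMs as $engine => $avg) {
        // 100% means "as fast as the fastest engine".
        $result[$engine] = round($avg / $fastest * 100, 2);
    }
    return $result;
}

// Averages in milliseconds, taken from the first write table.
$averages = ['xcache' => 0.003, 'memcached' => 0.022, 'predis' => 0.035];
foreach (differenceFromFastest($averages) as $engine => $percent) {
    printf("%-10s %8.2f%%\n", $engine, $percent);
}
```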


Write(short value):
Engine Connection Avg[ms] Stddev[ms] Difference from fastest[%]
xcache - 0,003 0 100,00%
memcached socket 0,013 0,001 433,33%
redis socket 0,015 0,001 500,00%
redis localhost 0,022 0 733,33%
memcached localhost 0,022 0,002 733,33%
predis localhost 0,035 0 1166,67%
predis socket 0,042 0,004 1400,00%

Switching from a TCP/IP connection to a Unix socket cuts the time by roughly 30-40% (0.022 ms down to 0.013-0.015 ms). Well, not for the Predis client library, but let's just say this library didn't do any better in the other tests…
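
As a rough sketch of what the two connection types look like in client code: both the phpredis and Memcached extensions accept a Unix socket path in place of a host name. The socket paths and the helper names below are assumptions for illustration; the paths must match whatever is configured on the servers. The code returns null when an extension is missing or the server is unreachable.

```php
<?php
// Hypothetical socket paths; adjust them to your server configuration.
const REDIS_SOCKET = '/var/run/redis/redis.sock';
const MEMCACHED_SOCKET = '/var/run/memcached/memcached.sock';

// Returns [host, port] for the given engine; a socket path uses port 0.
function endpoint($engine, $useSocket) {
    $sockets = ['redis' => REDIS_SOCKET, 'memcached' => MEMCACHED_SOCKET];
    $ports = ['redis' => 6379, 'memcached' => 11211]; // default TCP ports
    if (!isset($ports[$engine])) {
        throw new InvalidArgumentException("Unknown engine: $engine");
    }
    return $useSocket ? [$sockets[$engine], 0] : ['127.0.0.1', $ports[$engine]];
}

function connectRedis($useSocket) {
    if (!extension_loaded('redis')) {
        return null;
    }
    list($host, $port) = endpoint('redis', $useSocket);
    $redis = new Redis();
    try {
        // phpredis treats a host that is a filesystem path as a Unix socket.
        $ok = $useSocket ? $redis->connect($host) : $redis->connect($host, $port);
    } catch (Exception $e) {
        return null;
    }
    return $ok ? $redis : null;
}

function connectMemcached($useSocket) {
    if (!extension_loaded('memcached')) {
        return null;
    }
    list($host, $port) = endpoint('memcached', $useSocket);
    $memcached = new Memcached();
    // For a Unix socket the port passed to addServer() must be 0.
    $memcached->addServer($host, $port);
    return $memcached;
}
```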


Read(short value):
Engine Connection Avg[ms] Stddev[ms] Difference from fastest[%]
xcache - 0,002 0 100,00%
memcached socket 0,012 0,001 600,00%
redis socket 0,013 0 650,00%
memcached localhost 0,019 0,001 950,00%
redis localhost 0,021 0 1050,00%
predis localhost 0,033 0 1650,00%
predis socket 0,038 0,005 1900,00%

Read performance is similar to write performance. XCache wins, Memcached and Redis are almost equal when connected the same way, and sockets are faster than localhost. Predis is at the far end.


Those were tests of short values; let's try writing and reading 20 kB strings (nothing too special in real-life cache usage).

Write(20Kb string):
Engine Connection Avg[ms] Stddev[ms] Difference from fastest[%]
xcache - 0,009 0 100,00%
redis socket 0,032 0,003 355,56%
redis localhost 0,043 0,004 477,78%
predis socket 0,045 0,002 500,00%
memcached socket 0,121 0,015 1344,44%
memcached localhost 0,125 0,013 1388,89%
predis localhost 39,982 0,013 444244,44%


Read(20Kb string):
Engine Connection Avg[ms] Stddev[ms] Difference from fastest[%]
xcache - 0,006 0 100,00%
redis socket 0,025 0,001 416,67%
redis localhost 0,037 0,002 616,67%
predis socket 0,052 0,006 866,67%
memcached socket 0,062 0,01 1033,33%
memcached localhost 0,07 0,007 1166,67%
predis localhost 0,111 0,011 1850,00%

XCache still wins and the Redis results are similar to those for short strings, but what happened to Memcached? Its reads are almost 2.5 times slower than Redis, and its writes almost 4 times slower!
The PHP Memcached extension compresses values longer than 2000 bytes (with default settings).

So let's try disabling compression in the Memcached client:
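
A minimal sketch of how compression can be switched off in the PHP Memcached extension, using its `Memcached::OPT_COMPRESSION` option. The helper name, host and port are assumptions (11211 is the default Memcached port); the function returns null when the extension is not installed.

```php
<?php
// Returns a Memcached client with compression switched off, or null when
// the memcached extension is not installed.
function memcachedWithoutCompression($host = '127.0.0.1', $port = 11211) {
    if (!extension_loaded('memcached')) {
        return null;
    }
    $client = new Memcached();
    // Compression is on by default; turning it off stores values as-is.
    $client->setOption(Memcached::OPT_COMPRESSION, false);
    $client->addServer($host, $port);
    return $client;
}

$client = memcachedWithoutCompression();
if ($client !== null) {
    // A 20 kB string, similar in size to the value used in the tests.
    $client->set('long_value', str_repeat('x', 20 * 1024), 60);
}
```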


Write(20Kb string):
Engine Connection Avg[ms] Stddev[ms] Difference from fastest[%]
xcache - 0,009 0 100,00%
memcached_nocompression socket 0,027 0,001 300,00%
redis socket 0,032 0,003 355,56%
memcached_nocompression localhost 0,036 0,001 400,00%
redis localhost 0,043 0,004 477,78%
predis socket 0,045 0,002 500,00%
memcached socket 0,121 0,015 1344,44%
memcached localhost 0,125 0,013 1388,89%
predis localhost 39,982 0,013 444244,44%


Read(20Kb string):
Engine Connection Avg[ms] Stddev[ms] Difference from fastest[%]
xcache - 0,006 0 100,00%
memcached_nocompression socket 0,021 0,001 350,00%
redis socket 0,025 0,001 416,67%
memcached_nocompression localhost 0,033 0,001 550,00%
redis localhost 0,037 0,002 616,67%
predis socket 0,052 0,006 866,67%
memcached socket 0,062 0,01 1033,33%
memcached localhost 0,07 0,007 1166,67%
predis localhost 0,111 0,011 1850,00%


All those tests were performed on a single hardware configuration. I wondered what impact different conditions would have on the results, so I tried more environments:
- desktop: the one already used
- desktop10concurrent: the one already used, but with 10 concurrent requests
- desktopVM: the same computer as desktop, but with Windows 10 as host and Debian in VirtualBox as guest
- laptopVM: another computer with Windows 10 as host and Debian in VirtualBox as guest
- PHP on laptopVM connecting to Redis/Memcached on desktop over 100 Mb/s Ethernet (connection = 'network')


Write(short value):
Env Engine Connection Avg[ms] Stddev[ms] Difference[%]
desktop xcache - 0,003 0 100,00%
desktop memcached_nocompression socket 0,013 0 433,33%
desktop memcached socket 0,013 0,001 433,33%
desktop redis socket 0,015 0,001 500,00%
desktop memcached_nocompression localhost 0,021 0,001 700,00%
desktop redis localhost 0,022 0 733,33%
desktop memcached localhost 0,022 0,002 733,33%
desktop predis localhost 0,035 0 1166,67%
desktop predis socket 0,042 0,004 1400,00%
desktop10concurrent xcache - 0,005 0,001 166,67%
desktop10concurrent memcached socket 0,014 0,002 466,67%
desktop10concurrent memcached_nocompression socket 0,014 0,002 466,67%
desktop10concurrent redis socket 0,016 0,003 533,33%
desktop10concurrent memcached localhost 0,022 0,004 733,33%
desktop10concurrent memcached_nocompression localhost 0,022 0,004 733,33%
desktop10concurrent redis localhost 0,023 0,003 766,67%
desktopVM xcache - 0,004 0 133,33%
desktopVM memcached_nocompression socket 0,016 0,002 533,33%
desktopVM memcached socket 0,017 0,001 566,67%
desktopVM redis socket 0,022 0,002 733,33%
desktopVM memcached localhost 0,026 0,002 866,67%
desktopVM memcached_nocompression localhost 0,026 0,004 866,67%
desktopVM redis localhost 0,031 0,002 1033,33%
desktopVM predis socket 0,044 0,005 1466,67%
desktopVM predis localhost 0,093 0,009 3100,00%
laptopVM xcache - 0,004 0 133,33%
laptopVM memcached socket 0,012 0 400,00%
laptopVM memcached_nocompression socket 0,012 0 400,00%
laptopVM redis socket 0,016 0,001 533,33%
laptopVM memcached_nocompression localhost 0,017 0,003 566,67%
laptopVM memcached localhost 0,017 0 566,67%
laptopVM redis localhost 0,02 0,001 666,67%
laptopVM memcached_nocompression network 0,554 0,02 18466,67%
laptopVM redis network 0,557 0,025 18566,67%
laptopVM memcached network 0,562 0,023 18733,33%


Read(short value):
Env Engine Connection Avg[ms] Stddev[ms] Difference[%]
desktop xcache - 0,002 0 100,00%
desktop memcached socket 0,012 0,001 600,00%
desktop memcached_nocompression socket 0,012 0 600,00%
desktop redis socket 0,013 0 650,00%
desktop memcached_nocompression localhost 0,019 0,001 950,00%
desktop memcached localhost 0,019 0,001 950,00%
desktop redis localhost 0,021 0 1050,00%
desktop predis localhost 0,033 0 1650,00%
desktop predis socket 0,038 0,005 1900,00%
desktop10concurrent xcache - 0,004 0 200,00%
desktop10concurrent memcached_nocompression socket 0,014 0,002 700,00%
desktop10concurrent memcached socket 0,015 0,003 750,00%
desktop10concurrent redis socket 0,015 0,004 750,00%
desktop10concurrent memcached_nocompression localhost 0,019 0,001 950,00%
desktop10concurrent memcached localhost 0,02 0,002 1000,00%
desktop10concurrent redis localhost 0,024 0,005 1200,00%
desktopVM xcache - 0,004 0,002 200,00%
desktopVM memcached_nocompression socket 0,015 0,002 750,00%
desktopVM memcached socket 0,015 0,002 750,00%
desktopVM redis socket 0,02 0,003 1000,00%
desktopVM memcached_nocompression localhost 0,023 0,003 1150,00%
desktopVM memcached localhost 0,025 0,004 1250,00%
desktopVM redis localhost 0,03 0,003 1500,00%
desktopVM predis socket 0,038 0,002 1900,00%
desktopVM predis localhost 0,089 0,005 4450,00%
laptopVM xcache - 0,003 0 150,00%
laptopVM memcached socket 0,01 0 500,00%
laptopVM memcached_nocompression socket 0,012 0,004 600,00%
laptopVM memcached localhost 0,014 0 700,00%
laptopVM redis socket 0,014 0,001 700,00%
laptopVM memcached_nocompression localhost 0,015 0,003 750,00%
laptopVM redis localhost 0,019 0,001 950,00%
laptopVM memcached_nocompression network 0,543 0,007 27150,00%
laptopVM memcached network 0,545 0,008 27250,00%
laptopVM redis network 0,547 0,014 27350,00%


Write(20Kb string):
Env Engine Connection Avg[ms] Stddev[ms] Difference[%]
desktop xcache - 0,009 0 100,00%
desktop memcached_nocompression socket 0,027 0,001 300,00%
desktop redis socket 0,032 0,003 355,56%
desktop memcached_nocompression localhost 0,036 0,001 400,00%
desktop redis localhost 0,043 0,004 477,78%
desktop predis socket 0,045 0,002 500,00%
desktop memcached socket 0,121 0,015 1344,44%
desktop memcached localhost 0,125 0,013 1388,89%
desktop predis localhost 39,982 0,013 444244,44%
desktop10concurrent xcache - 0,013 0,001 144,44%
desktop10concurrent memcached_nocompression socket 0,028 0,004 311,11%
desktop10concurrent memcached_nocompression localhost 0,039 0,006 433,33%
desktop10concurrent memcached socket 0,115 0,005 1277,78%
desktop10concurrent memcached localhost 0,125 0,013 1388,89%
desktopVM xcache - 0,01 0,002 111,11%
desktopVM memcached_nocompression socket 0,047 0,011 522,22%
desktopVM redis socket 0,052 0,003 577,78%
desktopVM memcached_nocompression localhost 0,061 0,005 677,78%
desktopVM predis socket 0,072 0,002 800,00%
desktopVM redis localhost 0,076 0,006 844,44%
desktopVM memcached socket 0,133 0,005 1477,78%
desktopVM memcached localhost 0,141 0,008 1566,67%
desktopVM predis localhost 40,098 0,233 445533,33%
laptopVM xcache - 0,011 0 122,22%
laptopVM memcached_nocompression localhost 0,032 0 355,56%
laptopVM memcached_nocompression socket 0,033 0 366,67%
laptopVM redis socket 0,039 0,001 433,33%
laptopVM redis localhost 0,047 0,002 522,22%
laptopVM memcached localhost 0,102 0,001 1133,33%
laptopVM memcached socket 0,102 0,002 1133,33%
laptopVM memcached network 2,165 0,036 24055,56%
laptopVM redis network 2,96 0,03 32888,89%
laptopVM memcached_nocompression network 3,382 0,036 37577,78%


Read(20Kb string):
Env Engine Connection Avg[ms] Stddev[ms] Difference[%]
desktop xcache - 0,006 0 100,00%
desktop memcached_nocompression socket 0,021 0,001 350,00%
desktop redis socket 0,025 0,001 416,67%
desktop memcached_nocompression localhost 0,033 0,001 550,00%
desktop redis localhost 0,037 0,002 616,67%
desktop predis socket 0,052 0,006 866,67%
desktop memcached socket 0,062 0,01 1033,33%
desktop memcached localhost 0,07 0,007 1166,67%
desktop predis localhost 0,111 0,011 1850,00%
desktop10concurrent xcache - 0,009 0,002 150,00%
desktop10concurrent memcached_nocompression socket 0,023 0,005 383,33%
desktop10concurrent memcached_nocompression localhost 0,034 0,003 566,67%
desktop10concurrent memcached socket 0,056 0,001 933,33%
desktop10concurrent memcached localhost 0,069 0,004 1150,00%
desktopVM xcache - 0,007 0,001 116,67%
desktopVM memcached_nocompression socket 0,03 0,008 500,00%
desktopVM redis socket 0,038 0,003 633,33%
desktopVM memcached_nocompression localhost 0,048 0,002 800,00%
desktopVM redis localhost 0,059 0,004 983,33%
desktopVM memcached socket 0,065 0,005 1083,33%
desktopVM predis socket 0,074 0,006 1233,33%
desktopVM memcached localhost 0,081 0,005 1350,00%
desktopVM predis localhost 0,098 0,005 1633,33%
laptopVM xcache - 0,008 0 133,33%
laptopVM memcached_nocompression socket 0,022 0,001 366,67%
laptopVM memcached_nocompression localhost 0,027 0,001 450,00%
laptopVM redis socket 0,027 0 450,00%
laptopVM redis localhost 0,033 0,001 550,00%
laptopVM memcached socket 0,053 0,001 883,33%
laptopVM memcached localhost 0,058 0,001 966,67%
laptopVM memcached network 2,205 0,053 36750,00%
laptopVM redis network 3,154 0,097 52566,67%
laptopVM memcached_nocompression network 3,162 0,041 52700,00%


We can draw the first conclusions:


1. Don't use the Predis PHP library as your Redis client if you have the option. It is generally slower than phpredis, and it has some bug when saving values longer than 8 kB over TCP/IP (about 40 ms per write in these tests).

2. Prefer Unix socket connections over TCP connections to localhost.

3. Writing or reading a short value on localhost takes on the order of ten microseconds (e.g. 0.012 ms). Doing the same over the network (100 Mb/s) takes at least 0.540 ms! I haven't had the opportunity to run my benchmarking scripts on production-grade servers with 1 Gb/s Ethernet, but in general I haven't seen responses faster than 0.110 ms on them. So the best thing you can do is to place your cache on the same machine as PHP.

4. Writing or reading long values over the network (100 Mb/s) takes milliseconds. Using compression speeds this up over the network (from 3.1 ms to 2.2 ms in my benchmark).

5. Redis and Memcached configured in a similar way are pretty much equal; you can choose either. On localhost XCache is the fastest, but it has its own limitations.


In real-world scenarios we mostly store complex data (arrays, objects), not simple strings. Our test object will now be a 50-element array in which each element contains 50 elements: a simulation of a cached database result (50 rows with 50 columns each).


Write(complex data):
Env Engine Connection Avg[ms] Stddev[ms] Difference[%]
desktop xcache - 0,496 0,05 100,00%
desktop memcached_nocompression localhost 0,597 0,048 120,36%
desktop memcached_nocompression socket 0,598 0,051 120,56%
desktop redis localhost 0,612 0,053 123,39%
desktop redis socket 0,614 0,054 123,79%
desktop memcached socket 0,684 0,055 137,90%
desktop memcached localhost 0,699 0,056 140,93%
desktop predis socket 0,705 0,098 142,14%
desktop predis localhost 39,73 0,164 8010,08%


Read(complex data):
Env Engine Connection Avg[ms] Stddev[ms] Difference[%]
desktop xcache - 0,607 0,028 100,00%
desktop redis socket 0,689 0,037 113,51%
desktop redis localhost 0,697 0,037 114,83%
desktop memcached_nocompression socket 0,703 0,068 115,82%
desktop memcached_nocompression localhost 0,707 0,038 116,47%
desktop memcached socket 0,74 0,042 121,91%
desktop predis socket 0,758 0,084 124,88%
desktop memcached localhost 0,766 0,048 126,19%
desktop predis localhost 0,931 0,114 153,38%


Wow! Suddenly the read time rose from 0.006 ms to 0.607 ms (for XCache), while still on localhost. There is almost no difference between XCache, Redis and memcached_nocompression. The explanation is serialization. Remember the "long 20 kB string" that we stored in the previous tests? It was this very test data, already serialized, so the serialization time was not counted in those benchmarks (and there was no unserialization either).

The Memcached and phpredis PHP extensions use standard PHP serialization by default. Let's try other options:

Write(complex data):
Env Engine Connection Serialization Avg[ms] Stddev[ms] Difference[%]
desktop xcache - msgpack 0,085 0,005 100,00%
desktop xcache - igbinary 0,093 0,001 109,41%
desktop memcached_nocompression socket msgpack 0,115 0,017 135,29%
desktop redis socket msgpack 0,116 0,01 136,47%
desktop memcached_nocompression localhost msgpack 0,12 0,01 141,18%
desktop memcached_nocompression socket igbinary 0,124 0,009 145,88%
desktop redis socket igbinary 0,125 0,01 147,06%
desktop redis localhost msgpack 0,126 0,009 148,24%
desktop memcached_nocompression localhost igbinary 0,134 0,012 157,65%
desktop redis localhost igbinary 0,136 0,008 160,00%
desktop memcached socket msgpack 0,153 0,012 180,00%
desktop predis socket msgpack 0,163 0,023 191,76%
desktop memcached localhost msgpack 0,166 0,02 195,29%
desktop memcached socket igbinary 0,178 0,017 209,41%
desktop predis socket igbinary 0,183 0,026 215,29%
desktop memcached localhost igbinary 0,184 0,021 216,47%
desktop xcache - serialize 0,496 0,05 583,53%
desktop memcached_nocompression localhost default(serialize) 0,597 0,048 702,35%
desktop memcached_nocompression socket default(serialize) 0,598 0,051 703,53%
desktop redis localhost default(serialize) 0,612 0,053 720,00%
desktop redis socket default(serialize) 0,614 0,054 722,35%
desktop redis socket serialize 0,614 0,056 722,35%
desktop redis localhost serialize 0,624 0,04 734,12%
desktop memcached socket default(serialize) 0,684 0,055 804,71%
desktop memcached localhost default(serialize) 0,699 0,056 822,35%
desktop predis socket serialize 0,705 0,098 829,41%
desktop predis localhost igbinary 39,728 0,165 46738,82%
desktop predis localhost msgpack 39,728 0,164 46738,82%
desktop predis localhost serialize 39,73 0,164 46741,18%


Read(complex data):
Env Engine Connection Serialization Avg[ms] Stddev[ms] Difference[%]
desktop xcache - igbinary 0,376 0,008 100,00%
desktop redis socket igbinary 0,403 0,02 107,18%
desktop redis localhost igbinary 0,408 0,022 108,51%
desktop memcached_nocompression localhost igbinary 0,409 0,028 108,78%
desktop memcached_nocompression socket igbinary 0,417 0,025 110,90%
desktop memcached socket igbinary 0,43 0,052 114,36%
desktop memcached localhost igbinary 0,444 0,053 118,09%
desktop xcache - msgpack 0,488 0,039 129,79%
desktop predis socket igbinary 0,512 0,081 136,17%
desktop xcache - serialize 0,607 0,028 161,44%
desktop memcached_nocompression localhost msgpack 0,614 0,031 163,30%
desktop redis socket msgpack 0,615 0,027 163,56%
desktop redis localhost msgpack 0,615 0,021 163,56%
desktop memcached_nocompression socket msgpack 0,626 0,054 166,49%
desktop memcached socket msgpack 0,635 0,037 168,88%
desktop memcached localhost msgpack 0,64 0,05 170,21%
desktop predis localhost igbinary 0,65 0,099 172,87%
desktop redis socket default(serialize) 0,689 0,037 183,24%
desktop redis localhost default(serialize) 0,697 0,037 185,37%
desktop redis localhost serialize 0,702 0,034 186,70%
desktop memcached_nocompression socket default(serialize) 0,703 0,068 186,97%
desktop redis socket serialize 0,704 0,038 187,23%
desktop memcached_nocompression localhost default(serialize) 0,707 0,038 188,03%
desktop memcached socket default(serialize) 0,74 0,042 196,81%
desktop predis socket msgpack 0,741 0,075 197,07%
desktop predis socket serialize 0,758 0,084 201,60%
desktop memcached localhost default(serialize) 0,766 0,048 203,72%
desktop predis localhost msgpack 0,863 0,105 229,52%
desktop predis localhost serialize 0,931 0,114 247,61%

It seems that we are benchmarking serialization methods, not cache engines. For our test data, writing with the igbinary or msgpack serializer is over 5 times faster (~0.1 ms instead of ~0.6 ms). For reading (which is, surprisingly, slower than writing), igbinary is the winner (0.4 ms instead of 0.7 ms).
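
To check this effect on your own data, something along the following lines can time the serializers in isolation, with no cache engine involved. This is only a sketch, not the benchmark script used for the tables above: the function names, the shape of the generated data and the iteration count are all assumptions. The igbinary and msgpack functions are used only if those extensions are installed.

```php
<?php
// Builds a 50x50 array resembling a cached database result (assumed shape).
function makeTestData($rows = 50, $cols = 50) {
    $data = [];
    for ($r = 0; $r < $rows; $r++) {
        $row = [];
        for ($c = 0; $c < $cols; $c++) {
            $row["col$c"] = "value $r-$c";
        }
        $data[] = $row;
    }
    return $data;
}

// Maps a serializer name to its [encode, decode] function names.
function availableSerializers() {
    $list = ['serialize' => ['serialize', 'unserialize']];
    if (function_exists('igbinary_serialize')) {
        $list['igbinary'] = ['igbinary_serialize', 'igbinary_unserialize'];
    }
    if (function_exists('msgpack_pack')) {
        $list['msgpack'] = ['msgpack_pack', 'msgpack_unpack'];
    }
    return $list;
}

// Returns average encode/decode times in milliseconds per serializer.
function benchmarkSerializers(array $data, $iterations = 200) {
    $results = [];
    foreach (availableSerializers() as $name => $functions) {
        list($encode, $decode) = $functions;

        $start = microtime(true);
        for ($i = 0; $i < $iterations; $i++) {
            $blob = $encode($data);
        }
        $writeMs = (microtime(true) - $start) * 1000 / $iterations;

        $start = microtime(true);
        for ($i = 0; $i < $iterations; $i++) {
            $decode($blob);
        }
        $readMs = (microtime(true) - $start) * 1000 / $iterations;

        $results[$name] = [
            'write_ms' => $writeMs,
            'read_ms' => $readMs,
            'bytes' => strlen($blob),
        ];
    }
    return $results;
}

foreach (benchmarkSerializers(makeTestData()) as $name => $r) {
    printf("%-10s write %.3f ms, read %.3f ms, %d bytes\n",
        $name, $r['write_ms'], $r['read_ms'], $r['bytes']);
}
```

In the clients themselves the serializer is typically chosen with an option such as `Redis::OPT_SERIALIZER` or `Memcached::OPT_SERIALIZER`, provided the extension was built with support for the serializer you want.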

Final conclusions:


1. Don't use the standard PHP serialization method. I suggest using igbinary.

2. Don't use Predis; use phpredis.

3. In general, try to avoid serialization, especially if you have a local cache (in our example we could make 40 local requests and still be faster than a single one with deserialization). Also, maybe cache your output HTML instead of the DB data used to generate it? Try to avoid unnecessary complexity in your data.


4. Differences between cache engines are measured in single microseconds. Differences caused by serialization methods and network architecture are measured in milliseconds, so those are what you should really focus on.



Disclaimer:
- Remember, it's best to run benchmarks on your own setup. For example, you may get completely different results on virtual machines in the cloud, or when using Docker.