  • get performance

    Hi, I am processing a large dataset and it looks like I am struggling with Redis lookup performance. There are around 600M keys. 4 MGET calls take more than 40 ms in a multithreaded (50 threads) environment with a Redis connection pool (50 connections) on my MacBook Pro. I am using Jedis, by the way.

    Every MGET fetches results for 100 keys of the same pattern (I read somewhere on a forum that this is the recommended number).

    I'd like to know if I can pipeline those 4 MGET calls to get some performance gain.
    I couldn't find how to do that using RedisTemplate and Jedis. Any hint?

    Also, are there any significant differences in speed among the available drivers (Jedis, JRedis, etc.)? Is there a benchmark available?

    Thanks in advance,
    Milan
    Last edited by clandestino_bgd; Aug 6th, 2013, 07:25 AM.
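For reference, the batching described above (keys split into fixed-size MGET groups) can be sketched as a plain helper. This is only an illustration; `KeyBatches` is a hypothetical name, not part of Jedis or Spring Data Redis:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: split a key list into fixed-size batches,
// e.g. 100 keys per MGET call as described in the post.
public class KeyBatches {

    public static <T> List<List<T>> partition(List<T> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<List<T>>();
        for (int i = 0; i < keys.size(); i += batchSize) {
            // copy the subList view so each batch is independent of the source list
            batches.add(new ArrayList<T>(keys.subList(i, Math.min(i + batchSize, keys.size()))));
        }
        return batches;
    }
}
```

Each resulting batch would then be handed to one MGET call.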

  • #2
    We just released 1.1 M2, which has some new methods on RedisTemplate for executing commands in a pipeline and retrieving the results. See documentation: http://static.springsource.org/sprin....html#pipeline



    • #3
      Originally posted by jencompgeek View Post
      We just released 1.1 M2, which has some new methods on RedisTemplate for executing commands in a pipeline and retrieving the results. See documentation: http://static.springsource.org/sprin....html#pipeline
      Thanks, I've tried the pipeline, but it seems even slower than the multi-get.

      Here is the code:


      Code:
      List<Object> result = redisTemplate.executePipelined(new RedisCallback() {
          public Object doInRedis(RedisConnection connection) throws DataAccessException {
              StringRedisConnection stringRedisConn = (StringRedisConnection) connection;
              // In pipeline mode these calls return null; the real results are
              // collected and returned by executePipelined.
              stringRedisConn.mGet(keys1Array);
              stringRedisConn.mGet(keys2Array);
              stringRedisConn.mGet(keys3Array);
              stringRedisConn.mGet(keys4Array);
              return null;
          }
      });

      List<String> stats1 = (List<String>) result.get(0);
      List<String> stats2 = (List<String>) result.get(1);
      List<String> stats3 = (List<String>) result.get(2);
      List<String> stats4 = (List<String>) result.get(3);

      compared to the pure (not pipelined) multi-get version:
      Code:
      List<String> stats1 = redisTemplate.opsForValue().multiGet(keys1);
      List<String> stats2 = redisTemplate.opsForValue().multiGet(keys2);
      List<String> stats3 = redisTemplate.opsForValue().multiGet(keys3);
      List<String> stats4 = redisTemplate.opsForValue().multiGet(keys4);
      Both code sections are executed by multiple threads (up to 10 at a time).
      I have the Jedis connection pool configured like this:

      Code:
      <bean id="jedisPoolConfig" class="redis.clients.jedis.JedisPoolConfig">
          <property name="maxActive" value="10"/>
          <property name="maxIdle" value="10"/>
          <property name="testOnBorrow" value="false"/>
          <property name="maxWait" value="15000"/>
          <property name="minEvictableIdleTimeMillis" value="300000"/>
          <property name="numTestsPerEvictionRun" value="3"/>
          <property name="timeBetweenEvictionRunsMillis" value="60000"/>
          <property name="whenExhaustedAction" value="1"/>
      </bean>
      As you can see, I have 4 multi-gets per request, each with 1000 keys (I am using a Spring Batch commit interval of that size).
      My Redis instance is local on my Mac:

      Code:
      redis 127.0.0.1:6379> DBSIZE
      (integer) 86517797
      Keys and Values are strings.

      Could you please suggest anything that could improve my current performance of 600 ms for 4 multi-gets of 1000 keys?
      Multithreading doesn't help much; as far as I can see, 1, 4, or 8 threads give very similar results.

      Thanks very much in advance,
      Milan
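The unchecked casts on the pipelined results above could be centralized in a small helper. This is a hedged sketch, assuming executePipelined returns its results in submission order; `PipelineResults` is a hypothetical class, not a Spring Data API:

```java
import java.util.List;

// Hypothetical helper: extract the i-th pipelined result from the List<Object>
// returned by executePipelined, assuming results arrive in submission order.
public class PipelineResults {

    @SuppressWarnings("unchecked")
    public static List<String> stringListAt(List<Object> results, int index) {
        Object value = results.get(index);
        if (value != null && !(value instanceof List)) {
            throw new IllegalStateException(
                    "Expected a List at index " + index + ", got " + value.getClass());
        }
        return (List<String>) value;
    }
}
```

This keeps the cast (and a sanity check) in one place instead of repeating it per result.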



      • #4
        You might try switching from Jedis to Lettuce (https://github.com/wg/lettuce). Jedis pipelining doesn't actually send any commands to Redis until you close the pipeline, while Lettuce uses Netty for asynchronous I/O and will send the command right away (receiving the results asynchronously). I'm not sure how much it will help given that you are only sending 4 commands, but it's definitely worth a try.

        I too haven't seen much, if any, improvement using multiple threads in some performance testing I've done involving batch RPOPs from a queue.

        Also, have you tried reducing the batch size? Especially with Lettuce, you may find there's a point where it's more performant to make more requests for fewer keys. 1000 seems reasonable to me, but it might also be worth a try.
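The batch-size suggestion can be explored with a tiny sweep harness. This is only a sketch: `Fetcher` is an invented stand-in for whatever issues the MGET (a real run would delegate to RedisTemplate), and the timing is deliberately coarse:

```java
import java.util.List;

// Sketch: time fetching the same total key set with a given batch size.
// Fetcher is a hypothetical stand-in for an MGET call.
public class BatchSizeSweep {

    public interface Fetcher {
        List<String> fetch(List<String> keys);
    }

    // Returns elapsed millis for fetching all keys in batches of batchSize.
    public static long timeBatches(List<String> keys, int batchSize, Fetcher fetcher) {
        long start = System.nanoTime();
        for (int i = 0; i < keys.size(); i += batchSize) {
            fetcher.fetch(keys.subList(i, Math.min(i + batchSize, keys.size())));
        }
        return (System.nanoTime() - start) / 1000000L;
    }
}
```

Calling timeBatches with several batch sizes (e.g. 100, 250, 1000) against the same key set would show whether a sweet spot exists.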



        • #5
          Originally posted by jencompgeek View Post
          You might try switching from Jedis to Lettuce (https://github.com/wg/lettuce). Jedis pipelining doesn't actually send any commands to Redis until you close the pipeline, while Lettuce uses Netty for asynchronous I/O and will send the command right away (receiving the results asynchronously). I'm not sure how much it will help given that you are only sending 4 commands, but it's definitely worth a try.

          I too haven't seen much, if any, improvement using multiple threads in some performance testing I've done involving batch RPOPs from a queue.

          Also, have you tried reducing the batch size? Especially with Lettuce, you may find there's a point where it's more performant to make more requests for fewer keys. 1000 seems reasonable to me, but it might also be worth a try.
          I've tried Lettuce with the default config (no connection pool defined). In sequential multi-get (4 calls, 1000 keys each) it shows a 10% improvement compared to Jedis, but it is still > 0.5 sec per batch. Pipelined, it is much worse (2 times slower), and it eventually dies due to:

          Code:
          java.util.concurrent.ExecutionException: org.springframework.data.redis.connection.RedisPipelineException: Pipeline contained one or more invalid commands; nested exception is org.springframework.dao.InvalidDataAccessApiUsageException: Connection closed
          I also tried with smaller batches (4x100 vs. 4x1000). It doesn't help.
          Any other hint on how to significantly improve performance?
          I can't really believe that it takes 0.5-0.6 seconds to get 4x1000 numbers as strings.
          Redis takes ~3 GB; my batch runs with -Xmx8g on 4 cores and an SSD.

          Thanks in advance,
          Milan
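If the "Connection closed" failure above is transient, one generic workaround (not a Spring Data or Lettuce feature, just an illustrative sketch) is a simple retry wrapper around the pipelined call:

```java
import java.util.concurrent.Callable;

// Hypothetical retry wrapper for a transient failure such as the
// "Connection closed" pipeline error. Not a Spring Data API.
public class Retry {

    public static <T> T withRetries(Callable<T> call, int maxAttempts) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e; // retry on any failure; narrow the catch in real code
            }
        }
        throw last; // all attempts failed
    }
}
```

In real code the catch clause should be narrowed to the specific transient exception, and a backoff between attempts is usually worthwhile.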



          • #6
            How exactly are you measuring the performance? I wrote this test to try switching between different drivers and pipelined vs. non-pipelined. It runs batches of 10 concurrent threads doing 4 MGETs with 1000 keys each against a DB with 8 million keys. Of course, my machine sounds less powerful than yours and I'm not doing any other concurrent activity (are any of your other threads accessing Redis during these MGETs?), so I'm sure it's a different environment, but my numbers were definitely better than 600 ms.

            One thing I did notice was that performance improved significantly after the first batch of 10 threads was executed. This was true of every driver, both pipelined and non-pipelined. I wonder if you are just measuring the first results after JVM startup?

            Here are my numbers (in millis) and my test (the numbers are an informal gathering of data points as they scrolled by on my screen). The first range is from the first batch of threads (before the sleep); the second range is based on numbers from several subsequent batches. You can see that pipelined Lettuce gave the best performance for me after the first batch of threads. I let it run for a while but couldn't reproduce the connection failure.

            Lettuce:
            209-215, 28-37 non-pipelined
            293-313, 16-33 pipelined

            Jedis (non-pooled):
            200-215, 28-76 non-pipelined
            367-376, 24-51 pipelined

            Jedis (pooled with your config):
            202-219, 26-34 non-pipelined
            370-386, 37-55 pipelined

            SRP:
            210-260, 33-39 non-pipelined
            328-341, 25-52 pipelined

            Code:
            package org.springframework.data.redis.core;
            
            import java.util.ArrayList;
            import java.util.HashMap;
            import java.util.List;
            import java.util.Map;
            
            import org.junit.Before;
            import org.junit.Test;
            import org.junit.runner.RunWith;
            import org.springframework.beans.factory.annotation.Autowired;
            import org.springframework.dao.DataAccessException;
            import org.springframework.data.redis.connection.RedisConnection;
            import org.springframework.data.redis.connection.RedisConnectionFactory;
            import org.springframework.data.redis.connection.StringRedisConnection;
            import org.springframework.test.context.ContextConfiguration;
            import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;
            
            /**
             * To test the different scenarios, switch between multiGet and executePipelined in the TestTask
             * 
             * To test different drivers, switch out the XML file in the ContextConfiguration annotation below
             * 
             * Run this test against a database populated with the populate method.
             * 
             */
            @RunWith(SpringJUnit4ClassRunner.class)
            @ContextConfiguration("classpath:org/springframework/data/redis/connection/srp/SrpConnectionIntegrationTests-context.xml")
            public class ForumTests {
            
            	private RedisTemplate<String, String> redisTemplate;
            
            	@Autowired
            	private RedisConnectionFactory factory;
            
            	private static final int BATCH_SIZE = 1000;
            
            	private static final int BATCH_NUMBER = 4;
            
            	private int counter = 0;
            
            	@Before
            	public void setUp() {
            		this.redisTemplate = new StringRedisTemplate(factory);
            		redisTemplate.afterPropertiesSet();
            		//populate();
            	}
            
            	@Test
            	public void testIt() throws InterruptedException {
            		for (int p = 0; p < 100; p++) {
            			for (int i = 0; i < 10; i++) {
            				Thread th = new Thread(new TestTask(counter));
            				counter += BATCH_SIZE * BATCH_NUMBER;
            				th.start();
            			}
            			// Let all threads die before starting again
            			Thread.sleep(2000);
            		}
            	}
            
            	private class TestTask implements Runnable {
            		private int startIndex;
            
            		public TestTask(int startIndex) {
            			this.startIndex = startIndex;
            		}
            
            		public void run() {
            			multiGet(startIndex);
            			//executePipelined(startIndex);
            		}
            	}
            
            	private void populate() {
            		// dbsize = 8001000
            		for (int t = 0; t < 8000; t++) {
            			Map<String, String> keysAndValues = new HashMap<String, String>();
            			for (int i = 0; i < 1000; i++) {
            				keysAndValues.put("key" + counter++, "100");
            			}
            			redisTemplate.opsForValue().multiSet(keysAndValues);
            		}
            	}
            
            	private List<String> getKeys(int startIndex) {
            		List<String> keys = new ArrayList<String>();
            		for (int i = startIndex; i < (startIndex + BATCH_SIZE); i++) {
            			keys.add("key" + i);
            		}
            		return keys;
            	}
            
            	private void multiGet(int startIndex) {
            		long time = System.currentTimeMillis();
            		for (int i = 0; i < BATCH_NUMBER; i++) {
            			redisTemplate.opsForValue().multiGet(getKeys(startIndex + (i * BATCH_SIZE)));
            		}
            		System.out.println("Get time: " + (System.currentTimeMillis() - time));
            	}
            
            	private void executePipelined(int startIndex) {
            		final List<String[]> keyArrays = new ArrayList<String[]>();
            		for (int i = 0; i < BATCH_NUMBER; i++) {
            			keyArrays.add(getKeys(startIndex + (i * BATCH_SIZE)).toArray(new String[BATCH_SIZE]));
            		}
            		long time = System.currentTimeMillis();
            		redisTemplate.executePipelined(new RedisCallback() {
            			public Object doInRedis(RedisConnection connection) throws DataAccessException {
            				StringRedisConnection stringRedisConn = (StringRedisConnection) connection;
            				for (String[] keys : keyArrays) {
            					stringRedisConn.mGet(keys);
            				}
            				return null;
            			}
            		});
            		System.out.println("Get time: " + (System.currentTimeMillis() - time));
            	}
            }
            If you can create a similar isolated test that performs as poorly for you, I'd be happy to take a look at it. Perhaps I'm not quite emulating your program? Otherwise, you might try asking on Stack Overflow or in the Redis user groups for additional suggestions.
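The warmup effect described in this post suggests discarding the first few runs before recording numbers. A minimal sketch of such a harness follows (invented names; the task would wrap one batch of MGETs):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch: run a task repeatedly, discard warmup iterations, and report the
// remaining timings so JIT warmup and pool creation don't skew the numbers.
public class WarmupTimer {

    public static List<Long> timeMillis(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // results discarded: JIT compilation, connection setup, etc.
        }
        List<Long> timings = new ArrayList<Long>();
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            timings.add((System.nanoTime() - start) / 1000000L);
        }
        return Collections.unmodifiableList(timings);
    }
}
```

This mirrors the observation above: the first batch of threads is effectively a warmup run and should be reported separately (or dropped).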



            • #7
              Hi, thanks very much for looking into this.
              I am getting similar results to yours when I run your test, and my code looks very much the same.
              However, you are testing with a smallish DB (~8M keys), which is 10x less than what I have (please see the DBSIZE in my earlier post).
              So I modified your code to insert 60M keys instead of 8M (and my key strings are 15 chars + an integer, compared to your "key" + integer). The results are exactly like the ones I've experienced before, but with huge extremes:

              average: 689 ms
              max: 4340 ms
              min: 140 ms

              This is with pooled Jedis (my config), no pipeline, and your test class.
              I am running it on my MacBook Pro and watching my Java process in VisualVM (all looks good: threads, memory, CPU).
              Could you please try that and advise what could be dramatically decreasing performance?

              Thanks,
              Milan
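Summary figures like the average/max/min quoted above can be gathered with a small helper; this is an illustrative sketch, not part of the test class:

```java
import java.util.List;

// Small helper to summarize a list of per-batch timings (millis) into
// average/max/min figures like those quoted in the post.
public class LatencyStats {
    public final double average;
    public final long max;
    public final long min;

    public LatencyStats(List<Long> timingsMillis) {
        if (timingsMillis.isEmpty()) {
            throw new IllegalArgumentException("no timings");
        }
        long sum = 0, hi = Long.MIN_VALUE, lo = Long.MAX_VALUE;
        for (long t : timingsMillis) {
            sum += t;
            hi = Math.max(hi, t);
            lo = Math.min(lo, t);
        }
        this.average = (double) sum / timingsMillis.size();
        this.max = hi;
        this.min = lo;
    }
}
```

Reporting max alongside the average matters here: a 4340 ms outlier against a 689 ms average points at occasional stalls rather than uniformly slow calls.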
