How effective are rate limiters in OpenFlow-enabled switches?

Recently I ran some experiments on the effectiveness of the rate limiters that are available in OpenFlow-enabled switches and can be controlled through an OpenFlow controller. Each egress port of a switch consists of multiple queues (eight in our case). A packet leaving through a port is first assigned to one of these queues and then transmitted from it. Using the OpenFlow protocol, you can attach rate limiters to these queues to cap the bandwidth each queue may use. How the queues are implemented can differ completely from switch to switch. Some switches also prioritize the queues, for example giving queue #0 higher priority than queue #7. In that case, if the traffic in queue #0 fully utilizes the port, the flow in queue #7 is starved: no packet leaves queue #7, because queue #0 always has outstanding packets.
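To make the starvation case concrete, here is a minimal sketch of strict-priority scheduling in Python. It is not how any particular switch implements its queues. The function name, the queue contents and the number of transmit slots are all made up for illustration.

```python
from collections import deque

def strict_priority_drain(queues, slots):
    """Send up to `slots` packets, always serving the lowest-numbered
    non-empty queue first (hypothetical strict-priority model)."""
    sent = {qid: 0 for qid in queues}
    for _ in range(slots):
        for qid in sorted(queues):
            if queues[qid]:
                queues[qid].popleft()
                sent[qid] += 1
                break  # one packet per transmit slot
    return sent

# Assumed setup: queue 0 stays backlogged, queue 7 has 10 packets waiting.
queues = {0: deque(range(1000)), 7: deque(range(10))}
sent = strict_priority_drain(queues, slots=100)
print(sent)  # {0: 100, 7: 0}
```

As long as queue 0 never runs empty, queue 7 never gets a transmit slot, which is exactly the starvation described above.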

Engineers and researchers can use these rate limiters for QoS. QoS is a particular challenge in distributed systems, where multiple users share one cluster of machines.

Up to this point, one might think that rate limiters solve the whole problem, but a closer look at the behavior of network workloads shows that they don’t. There are two different kinds of workloads. In the first, the rate is fairly constant and does not fluctuate much. In the second, flows are so bursty that at one moment no packets are in flight, and at the next a burst of packets is headed to a single destination.

With all that said, I have found that rate limiters fall short when handling bursty workloads. At the start of a flow, the switch takes milliseconds to fully cap it. That is discouraging for applications whose traffic consists entirely of bursty flows, because a short burst can be over before the cap takes effect. As a result, for such traffic our rate limiters had no effect on differentiating network flows.
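A rough back-of-the-envelope sketch shows why a delay of a few milliseconds matters. It assumes a simplified model in which the flow runs at full line rate until the cap takes effect. The 1 Gbit/s port speed, the 5 ms reaction time and the burst sizes are hypothetical numbers chosen for illustration, not measurements from my experiments.

```python
def bytes_sent_before_cap(burst_bytes, line_rate_bps, reaction_s):
    """Bytes of a burst that leave at line rate before the limiter takes effect.

    Assumption (hypothetical model): the flow runs at full line rate until
    the cap kicks in after `reaction_s` seconds.
    """
    line_rate_bytes_per_s = line_rate_bps / 8
    return min(burst_bytes, line_rate_bytes_per_s * reaction_s)

# Hypothetical numbers, not measurements: 1 Gbit/s port, 5 ms reaction time.
LINE_RATE_BPS = 1_000_000_000
REACTION_S = 0.005

for burst in (500_000, 5_000_000):  # burst sizes in bytes
    escaped = bytes_sent_before_cap(burst, LINE_RATE_BPS, REACTION_S)
    print(f"{burst} byte burst: {escaped / burst:.1%} sent uncapped")
```

Under these assumed numbers, a 500 KB burst is gone entirely before the limiter reacts, so the cap never touches it.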

I need to clarify that all my observations are based on one specific switch model (a Pronto switch running Pica8 OS). You may not see the same results on other OpenFlow-enabled switches, such as those from Cisco or NEC. It would be awesome if somebody with access to other switch models ran similar experiments and shared the outcome.

After two weeks of investigation, I finally got an answer from the engineering team at Pica8. They also acknowledged that the rate limiters cannot handle bursty flows. Their suggested solution is to use meters, which were introduced in a recent version of OpenFlow (v1.3). I haven’t tested meters yet, but I’m eager to see whether they really make a difference.