Application availability and Auto-scaling with AVI – Benchmarking

So the last couple of days have been spent reading up on the latest ADC updates as part of the WhatMatrix.com category for ADCs, which is coming soon! As part of that I've looked at a lot of the traditional ADC vendors in the market, which come from the physical appliance world and have since entered the virtual plane. Most of them have simply "converted" their physical appliance to a virtual appliance, with the same feature set, the same UI and so on, running on a virtualized platform.

The problem with most of them is that they are just the same appliances running on a hypervisor, not "virtualization-aware". Avi Networks is one of the vendors in the ADC market space that is virtualization-aware: they have no legacy physical appliance and operate only in the virtual layer.

Now I have written about AVI previously:
http://msandbu.org/a-closer-look-at-avi-networks/
http://msandbu.org/avinetworks-architecture/

So this post is going to focus more on the scalability issue. Most ADC vendors whose products are just a physical appliance converted to a virtual appliance suffer from the same issue: scaling. Now imagine a regular virtual appliance configured with a set of different virtual servers:

[Image: virtual appliance with multiple virtual servers]

Most ADC setups can be configured with HA (high availability) features that handle failover if the appliance goes down. Other solutions also offer active/active data-path scenarios or clustering using ECMP.

But since they are not virtualization-aware, they have no way to automatically scale out if, for instance, a lot of traffic is going to a particular service and the ADC appliance is running out of resources. In that case we would manually need to set up a new ADC in a clustered environment or give the existing appliance more resources. This is also a consequence of their architecture: the management and data layers are often combined in one appliance rather than disaggregated into separate appliances.

Now Avi has a different approach. Because its architecture separates these layers and integrates with the virtual environment, it has a lot of different capabilities to execute actions in the virtualization layer.

[Image: Avi architecture with Controller and Service Engines]

They have a Controller which can integrate with vCenter to allow automatic deployment of Service Engines (which handle all packet processing) into the virtualized environment. We can also scale out Service Engines manually to add more packet-processing resources if we want to. This can be done via the console at the virtual service level:

[Screenshot: scale-out options for a virtual service]

Here we can scale out or migrate to another Service Engine. Scale-out will deploy another Service Engine appliance in the VMware environment. The default scaling method in Avi is manual, but there is also the option of automatic scaling.
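
For scripting this, the Controller also exposes its functionality over a REST API. A minimal sketch of triggering a manual scale-out from the shell, assuming a hypothetical controller address and credentials, and the scale-out action endpoint as I recall it from the Avi API docs (verify against your Controller version):

# look up the virtual service to get its uuid (name "my-vs" is made up)
curl -sk -u admin:password \
  "https://controller.example.local/api/virtualservice?name=my-vs"

# trigger a manual scale-out for that virtual service;
# body requirements may vary by Controller version
curl -sk -u admin:password -X POST \
  "https://controller.example.local/api/virtualservice/<vs-uuid>/scaleout"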

[Image: scaling decision tree]

Source: https://kb.avinetworks.com/wp-content/uploads/2015/12/scaling-decision-tree.png

So in the Service Engine group we can define how the Service Engines should behave during congestion and how far they are allowed to scale.

[Screenshot: Service Engine group settings]
By default it uses an HA mode called Elastic HA (N + M buffer). In this mode, each virtual service typically starts on one Service Engine (since the minimum value N is set to 1), while buffer capacity exists in aggregate to handle one (M = 1) Service Engine failure.
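
As a rough sketch of adjusting the same settings through the REST API: the usual pattern is GET the group object, edit it, and PUT it back. Field names like buffer_se and min_scaleout_per_vs are my reading of the Avi API docs and may differ between versions, and the group name "Default-Group" is assumed:

# fetch the Service Engine group object; the collection wrapper is
# stripped with jq so the bare object can be edited in segroup.json
curl -sk -u admin:password \
  "https://controller.example.local/api/serviceenginegroup?name=Default-Group" \
  | jq '.results[0]' > segroup.json

# after editing buffer_se / min_scaleout_per_vs in segroup.json,
# PUT it back using the uuid from the object
curl -sk -u admin:password -X PUT -H "Content-Type: application/json" \
  -d @segroup.json \
  "https://controller.example.local/api/serviceenginegroup/<segroup-uuid>"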

Within the limitations of layer 2, the primary Service Engine will handle all incoming requests. In most cases, such as with web traffic, the incoming requests are typically small compared to the outbound response traffic.

[Image: traffic flow with layer 2 scale-out]

The primary Service Engine handles the requests and begins to forward a small part of the traffic to the other Service Engine at layer 2; the other Service Engine then terminates the TCP connection and establishes a connection with the backend web server in the server pool.

There is also the option of using BGP with ECMP, which operates at layer 3 and allows for active/active SE packet processing with up to 32 Service Engines (depending on what the upstream router supports), so we are no longer limited to layer 2 and a single primary Service Engine.
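
To illustrate the ECMP idea in isolation: this is what equal-cost multipath looks like with plain Linux routing, where traffic to a VIP prefix is hashed across two next-hops. A BGP-capable upstream router does the same with the routes it learns from each Service Engine. This is a generic iproute2 example, not Avi-specific, and all addresses are made up:

# two equal-cost next-hops for the VIP prefix, so flows
# are spread across both paths
ip route add 192.0.2.0/24 \
    nexthop via 10.0.0.11 weight 1 \
    nexthop via 10.0.0.12 weight 1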

Now to test the performance of the Service Engines, I decided to see how well a single Service Engine instance performs for a virtual service. I know that a lot of ADC vendors enforce SSL transaction limits on their appliances, so it would be interesting to see how well this was going to perform.

I used a simple Apache Bench (ab) test with the following attributes:

ab -n 100000 -c 200 -f TLS1.2 https://mycustomurl/

(I specified TLS 1.2 with the -f flag to ensure the test uses the latest secure protocol; note that ab wants its options before the URL and needs a path on the URL. The virtual service itself was set up with an HTTP back-end, because I wanted Avi Networks to do the SSL offloading for the service.)
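
Before benchmarking it is worth sanity-checking that the virtual service actually negotiates TLS 1.2 on the VIP. A quick check with openssl, using the same hostname as in the ab command above:

# confirm the negotiated protocol and cipher on the virtual service
echo | openssl s_client -connect mycustomurl:443 -tls1_2 2>/dev/null \
  | grep -E "Protocol|Cipher"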

[Screenshot: benchmark results]

Not too shabby: up to 770 transactions per second using a 2048-bit key and TLS 1.2. From the logs I can see that the backend IIS server is getting pretty stressed, but it is not reaching peak level; perfmon also shows it spiking on CPU, which is why the application response time is so slow at times. It is still not maxing out. (Note, however, that this was with full logging and analytics enabled in Vantage; my guess is that I could easily get a higher number by turning down logging, or by running the Apache benchmark from multiple nodes, as sketched below.)
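
A minimal sketch of that multi-node idea, assuming a few hypothetical client hosts with ab installed and SSH key access already set up:

# run the same benchmark from several load-generator nodes in parallel,
# to rule out a single ab client as the bottleneck
for node in bench01 bench02 bench03; do
  ssh "$node" 'ab -n 100000 -c 200 -f TLS1.2 https://mycustomurl/' &
done
wait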

[Screenshot: backend server metrics]

Meanwhile, the primary Service Engine for this virtual service was still not having much difficulty processing the requests:

[Screenshot: Service Engine utilization]

NOTE: A single Service Engine is by default equipped with 2 vCPUs and 2048 MB of memory. This can be adjusted in the Service Engine group settings. As mentioned, this traffic is processed by the primary Service Engine, and because of the N + M HA mode I have another Service Engine in standby. So I wanted to do another test: I adjusted the settings in the default Service Engine group to 4 vCPUs and 4 GB of memory and ran the benchmark again. (It should also be noted that it is recommended to reserve the SE CPUs in VMware; this can be done within the Service Engine group.)
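
Following the same GET-modify-PUT pattern as earlier, resizing the Service Engines could look like this. Again, vcpus_per_se and memory_per_se are my assumed field names from the Avi API, so check them against your version:

# assumed field names; bump the SE size in the group object fetched
# earlier, then PUT segroup-resized.json back to the
# serviceenginegroup endpoint as before
jq '.vcpus_per_se = 4 | .memory_per_se = 4096' segroup.json > segroup-resized.json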

[Screenshot: benchmark results with the resized Service Engine]

So now we are up to 1250 transactions per second, and the backend IIS server is using 98% of its 2 vCPUs, which is why we are not getting higher numbers from this single virtual machine instance. So with a single Service Engine we can get up to 1250 SSL transactions per second, which is a lot higher than what the competition does on current licenses.

So this has been part one on Avi: a deeper dive into the Service Engine architecture and traffic flow, and a first look at the performance of a single Service Engine. Stay tuned for more! Next up I'm going to show ECMP and a scale-out architecture with active/active Service Engines.
