
KVM Autotest Performance Regression Testing



Performance subtests

network

block

  • iometer (Windows) (not pushed upstream)
  • ffsb (Linux)
  • qemu_io (host) (not pushed upstream)

Environment setup

Tuning implemented in Autotest

  • Change host services/configuration files (e.g. grub.conf, ifcfg-eth0) when a reboot is needed

    Use an Autotest server-side test, or apply the change through the host kickstart. Alternatively, execute the change at the beginning of testing, or do it in a dedicated subtest.

  • Change host services/configuration files (e.g. grub.conf, ifcfg-eth0) when no reboot is needed

    Add the commands or a script to "pre_command" and recover with "post_command", or execute the change at the beginning of testing and revert it at the end. A tests.cfg sketch follows this list.
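
As an illustration, a no-reboot change can be wired into tests.cfg through "pre_command"/"post_command". This is only a sketch; the sysctl values are hypothetical examples:

# hypothetical tests.cfg fragment: apply a host setting before the test,
# restore it afterwards (values are examples only)
pre_command = "sysctl -w net.ipv4.conf.all.arp_filter=1"
post_command = "sysctl -w net.ipv4.conf.all.arp_filter=0"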

General Tuning

add "elevator=dealine intel_iommu=off" in gurb configuration file on host.
make sure Hyper-Threading is off on host.
stop unrealted services on both host and guest side.
- [[stop_services.sh | http://amosk.info/pub/html/stop_services.sh ]]
- [[stop_service_off.sh | http://amosk.info/pub/html/stop_service_off.sh]] (need reboot)
bootup guest with runlevel 3 through modify file /etc/inittab.
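
A minimal sketch of the grub change, assuming a legacy RHEL-style /boot/grub/grub.conf (adjust the path and the match expression for your system):

# back up, then append the parameters to every kernel line
cp /boot/grub/grub.conf /boot/grub/grub.conf.bak
sed -i '/^[[:space:]]*kernel /s/$/ elevator=deadline intel_iommu=off/' /boot/grub/grub.conf
# the new parameters take effect on the next boot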

Network Tuning

  • Run the test on a private LAN. Since the host has multiple NICs, set arp_filter=1 by adding the following to /etc/sysctl.conf on the host (see the sketch after this list for applying the settings):
net.ipv4.conf.default.arp_filter = 1
net.ipv4.conf.all.arp_filter = 1
  • Disable netfilter on bridges:
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
  • Bridge settings: set the bridge forward delay to 0 (here "switch" is the bridge name):
brctl setfd switch 0
  • Pinning:

Autotest supports NUMA pinning. Set "numanode=-1" in tests.cfg and the vcpu threads, vhost_net threads, and VM memory will be pinned to the last NUMA node. To pin other processes to a NUMA node, use numactl and taskset.

memory: numactl -m $n $cmdline
cpu: taskset -p $cpu_mask $thread_id
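
To make the sysctl entries above take effect without a reboot (standard sysctl usage):

# sysctl -p                              # reload /etc/sysctl.conf
# sysctl net.ipv4.conf.all.arp_filter    # verify the new value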

The following steps are a manual pinning guide.

1. The first level of pinning is to use NUMA pinning when starting the guest.
e.g. numactl -c 1 -m 1 qemu-kvm -smp 2 -m 4G <> (pins guest memory and CPUs to NUMA node 1)

2. For a single-instance test, try a one-to-one mapping of vCPUs to physical cores.
e.g.
Get the guest vCPU thread IDs (see the sketch after step 5), then:
# taskset -p 40 $vcpus1  (pins the vcpu1 thread to physical cpu #6)
# taskset -p 80 $vcpus2  (pins the vcpu2 thread to physical cpu #7)

3. To pin vhost on the host, get the vhost PID and then use taskset to pin it to the same socket.
e.g.
taskset -p 20 $vhost  (pins the vhost thread to physical cpu #5)

4. In the guest, pin the IRQ to one core and netperf to another.
1) Make sure irqbalance is off - `service irqbalance stop`
2) Find the interrupts - `cat /proc/interrupts`
3) Find the affinity mask for the interrupt(s) - `cat /proc/irq/<irq#>/smp_affinity`
4) Change the value to match the proper core; make sure the value is a CPU mask.
e.g. pin the IRQs to the first core ($virtio0_input/$virtio0_output hold the IRQ numbers of the virtio0-input/virtio0-output interrupts):
   echo 01 > /proc/irq/$virtio0_input/smp_affinity
   echo 01 > /proc/irq/$virtio0_output/smp_affinity
5) Pin netserver to another core.
e.g.
taskset -p 02 $(pidof netserver)

5. For the host-to-guest scenario, to get maximum performance, run netperf on a different core of the same NUMA node as the guest.
e.g.
numactl -m 1 netperf -T 4 (pins netperf to physical cpu #4)
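
To find the guest vCPU thread IDs used in steps 2 and 3, one option is to list the threads of the qemu-kvm process (a sketch, assuming a single qemu-kvm process; thread names vary by QEMU version):

# qemu_pid=$(pidof qemu-kvm)
# ps -p $qemu_pid -L -o tid,comm    # vcpu threads appear among the listed threads

The vhost kernel thread for step 3 usually shows up in ps output under the name vhost-$qemu_pid.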

Block Tuning

  • Run a tuned profile on the host.
# tuned-adm profile enterprise-storage  (for RHEL 6)
# service tuned start  (for RHEL 5)
  • Make sure cache=none is set in the qemu-kvm command line (see the sketch after this list).
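
For reference, cache=none is set per drive on the qemu-kvm command line; a hypothetical excerpt (the disk path is illustrative):

# qemu-kvm ... -drive file=/images/guest.img,if=virtio,cache=none ...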

Execute testing

  • Submit jobs on the Autotest server, executing only netperf.guest_exhost three times.

tests.cfg:

only netperf.guest_exhost
client = 192.168.122.2 (external host ip)
variants:
    - repeat1:
    - repeat2:
    - repeat3:

Result files:

# cd /usr/local/autotest/results/8-debug_user/192.168.122.1/
# find .|grep RHS
kvm.repeat1.r61.virtio_blk.smp2.virtio_net.RHEL.6.1.x86_64.netperf.exhost_guest/results/netperf-result.RHS
kvm.repeat2.r61.virtio_blk.smp2.virtio_net.RHEL.6.1.x86_64.netperf.exhost_guest/results/netperf-result.RHS
kvm.repeat3.r61.virtio_blk.smp2.virtio_net.RHEL.6.1.x86_64.netperf.exhost_guest/results/netperf-result.RHS
  • Submit the same job in another environment (different packages) with the same configuration.

Result files:

# cd /usr/local/autotest/results/9-debug_user/192.168.122.1/
# find .|grep RHS
kvm.repeat1.r61.virtio_blk.smp2.virtio_net.RHEL.6.1.x86_64.netperf.exhost_guest/results/netperf-result.RHS
kvm.repeat2.r61.virtio_blk.smp2.virtio_net.RHEL.6.1.x86_64.netperf.exhost_guest/results/netperf-result.RHS
kvm.repeat3.r61.virtio_blk.smp2.virtio_net.RHEL.6.1.x86_64.netperf.exhost_guest/results/netperf-result.RHS

Analyze results

  • Config file: perf.conf
[ntttcp]
result_file_pattern = .*.RHS
ignore_col = 1
avg_update =

[netperf]
result_file_pattern = .*.RHS
ignore_col = 2
avg_update = 4,2,3|14,5,12|15,6,13

[iozone]
result_file_pattern =
  • Execute regression.py to compare the two results:
Log in to the Autotest server, then:
# cd /usr/local/autotest/client/tools
# python regression.py netperf /usr/local/autotest/results/8-debug_user/192.168.122.1/ /usr/local/autotest/results/9-debug_user/192.168.122.1/
  • T-test:
scipy: http://www.scipy.org/
t-test: http://en.wikipedia.org/wiki/Student's_t-test
Two Python modules (scipy and numpy) are needed.

An unpaired t-test is used to compare the two samples; check the p-value to see whether a regression exists. If the difference between the two samples is statistically significant (p <= 0.05), a '+' or '-' is prepended to the p-value ('+': avg_sample1 < avg_sample2, '-': avg_sample1 > avg_sample2).
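
To see what the comparison does under the hood, the same unpaired t-test can be run directly with scipy.stats.ttest_ind (the sample values below are made up for illustration):

python -c "
from scipy import stats
sample1 = [938.2, 942.1, 939.7]   # hypothetical throughput, env 1
sample2 = [921.4, 925.8, 919.9]   # hypothetical throughput, env 2
t, p = stats.ttest_ind(sample1, sample2)
print('t=%.3f p=%.4f' % (t, p))
"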

Regression results:

[[netperf.html | http://amosk.info/pub/html/netperf.html]]
- Every Avg line represents the average value based on *$n* repetitions of the same test,
  and the following SD line represents the standard deviation across the *$n* repetitions.
- The standard deviation is displayed as a percentage of the average.
- The significance of the difference between the two averages is calculated using an unpaired t-test
  that takes into account the SD of the averages.
- The paired t-test is computed for the averages of the same category.
[[netperf.avg.html | http://amosk.info/pub/html/netperf.avg.html]]
- Raw data that the averages are based on.