
KVM Autotest Performance Regression Testing

kongove edited this page Mar 29, 2012 · 8 revisions

Environment setup

Tuning implementation in Autotest

  • Changes to host services or configuration files (e.g. grub.conf, ifcfg-eth0) that require a reboot

    Use an Autotest server-side test, or apply the change through the host kickstart.

  • Changes to guest services or configuration files (e.g. grub.conf, ifcfg-eth0) that require a reboot

    Apply the change at the beginning of testing, or do it in a separate subtest.

  • Changes to host services or configuration files (e.g. grub.conf, ifcfg-eth0) that do not require a reboot

    Add the commands or script to "pre_command" and undo them with "post_command". Alternatively, apply the change at the beginning of testing and revert it at the end.

  • Changes to guest services or configuration files (e.g. grub.conf, ifcfg-eth0) that do not require a reboot

    Apply the change at the beginning of testing.

General Tuning

Add "elevator=deadline intel_iommu=off" to the grub configuration file on the host.
Make sure Hyper-Threading is off on the host.
Stop unrelated services on both the host and the guest.
- http://amosk.info/pub/html/stop_services.sh
- http://amosk.info/pub/html/stop_service_off.sh (needs a reboot)
Boot the guest into runlevel 3 by editing /etc/inittab.
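
After the host reboots, it is worth confirming that the kernel arguments actually took effect. A minimal sketch, assuming the arguments are checked against the text of /proc/cmdline; the helper name `missing_kernel_args` is hypothetical and not part of Autotest:

```python
def missing_kernel_args(cmdline, required=("elevator=deadline", "intel_iommu=off")):
    """Return the required arguments that are absent from a kernel command line."""
    present = set(cmdline.split())  # arguments are separated by whitespace
    return [arg for arg in required if arg not in present]
```

On the host, pass it the contents of /proc/cmdline, for example `missing_kernel_args(open("/proc/cmdline").read())`; an empty list means both arguments are active.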

Network Tuning

  • Run the test on a private LAN. Because the host has multiple NICs, set arp_filter=1 by adding the following to /etc/sysctl.conf on the host:
net.ipv4.conf.default.arp_filter = 1
net.ipv4.conf.all.arp_filter = 1
  • Disable netfilter on bridges
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
  • Bridge setting: set the forward delay of the bridge (named "switch" here) to 0:
brctl setfd switch 0
  • Pinning:
1. The first level of pinning is NUMA pinning when starting the guest.
e.g. numactl -c 1 -m 1 qemu-kvm -smp 2 -m 4G <> (pins guest memory and CPUs to NUMA node 1)

2. For a single-instance test, try a one-to-one mapping of vCPUs to physical cores.
e.g.
Get the guest vCPU thread IDs, then:
# taskset -p 40 $vcpus1  (pins the vcpu1 thread to physical CPU #6)
# taskset -p 80 $vcpus2  (pins the vcpu2 thread to physical CPU #7)

3. Pin vhost on the host: get the vhost PID, then use taskset to pin it to the same socket.
e.g.
taskset -p 20 $vhost (pins the vhost thread to physical CPU #5)

4. In the guest, pin the IRQ to one core and netperf to another.
1) Make sure irqbalance is off - `service irqbalance stop`
2) Find the interrupts - `cat /proc/interrupts`
3) Find the affinity mask for the interrupt(s) - `cat /proc/irq/<irq#>/smp_affinity`
4) Change the value to match the proper core. Make sure the value is a CPU mask.
e.g. pin the IRQs to the first core (keep the space before ">", otherwise the shell reads "01>" as a file descriptor redirection):
   echo 01 > /proc/irq/<irq# of virtio0-input>/smp_affinity
   echo 01 > /proc/irq/<irq# of virtio0-output>/smp_affinity
5) Pin netserver to another core.
e.g.
taskset -p 02 $netserver  (taskset -p takes the PID of netserver, not the process name)

5. For the host-to-guest scenario, to get maximum performance, run netperf on different cores of the same NUMA node as the guest.
e.g.
numactl -m 1 netperf -T 4 (pins netperf to physical CPU #4)
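
The values passed to `taskset -p` and written to smp_affinity above are hexadecimal bit masks in which bit N selects CPU N. The sketch below shows the conversion in both directions; the function names `cpu_mask` and `cpus_in_mask` are illustrative only, and the comma-separated mask groups that the kernel uses on machines with more than 32 CPUs are not handled:

```python
def cpu_mask(cpus):
    """Return the hexadecimal affinity mask that selects the given CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers start at 0")
        mask |= 1 << cpu  # bit N of the mask stands for CPU N
    return format(mask, "02x")


def cpus_in_mask(mask_hex):
    """Return the CPU numbers selected by a hexadecimal affinity mask."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]
```

For example, `cpu_mask([6])` gives the `40` used for vcpu1 above, and `cpus_in_mask("20")` gives `[5]`, the CPU used for vhost.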

Block Tuning

  • Run a tuning profile on the host.
# tuned-adm profile enterprise-storage (for rhel6 project)
# service tuned start  (for rhel5 project)
  • Make sure cache=none is set in the qemu-kvm command.
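
The cache=none requirement can be checked mechanically before a run. This is a rough sketch that assumes the disks are given with `-drive` options and that the command line is already split into arguments; the helper name `drives_missing_cache_none` and the image paths in the usage note are made up for illustration, and file names containing escaped commas (",,") are not handled:

```python
def drives_missing_cache_none(argv):
    """Return the -drive option strings that do not contain cache=none."""
    missing = []
    for i, arg in enumerate(argv[:-1]):
        if arg == "-drive":
            options = argv[i + 1].split(",")  # e.g. file=...,if=virtio,cache=none
            if "cache=none" not in options:
                missing.append(argv[i + 1])
    return missing
```

A command string can be split with `shlex.split` first, e.g. `drives_missing_cache_none(shlex.split("qemu-kvm -drive file=/images/a.qcow2,cache=none"))` returns an empty list.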

Execute testing

  • Submit jobs on the Autotest server; execute only netperf.guest_exhost, three times.

tests.cfg:

only netperf.guest_exhost
# external host IP
client = 192.168.122.2
variants:
    - repeat1:
    - repeat2:
    - repeat3:

Result files:

# cd /usr/local/autotest/results/8-debug_user/192.168.122.1/
# find .|grep RHS
kvm.repeat1.r61.virtio_blk.smp2.virtio_net.RHEL.6.1.x86_64.netperf.exhost_guest/results/netperf-result.RHS
kvm.repeat2.r61.virtio_blk.smp2.virtio_net.RHEL.6.1.x86_64.netperf.exhost_guest/results/netperf-result.RHS
kvm.repeat3.r61.virtio_blk.smp2.virtio_net.RHEL.6.1.x86_64.netperf.exhost_guest/results/netperf-result.RHS
  • Submit the same job in another environment (different packages) with the same configuration.

Result files:

# cd /usr/local/autotest/results/9-debug_user/192.168.122.1/
# find .|grep RHS
kvm.repeat1.r61.virtio_blk.smp2.virtio_net.RHEL.6.1.x86_64.netperf.exhost_guest/results/netperf-result.RHS
kvm.repeat2.r61.virtio_blk.smp2.virtio_net.RHEL.6.1.x86_64.netperf.exhost_guest/results/netperf-result.RHS
kvm.repeat3.r61.virtio_blk.smp2.virtio_net.RHEL.6.1.x86_64.netperf.exhost_guest/results/netperf-result.RHS
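
The `find . | grep RHS` step can also be done from Python, which is handy when both result directories need to be checked for the same number of files before comparing them. A minimal sketch, assuming the file name (not the whole path) is matched against the same `.*.RHS` pattern that perf.conf uses as result_file_pattern; `find_result_files` is a hypothetical helper, not part of regression.py:

```python
import os
import re


def find_result_files(results_dir, pattern=r".*.RHS"):
    """Return the sorted paths under results_dir whose file name matches pattern."""
    regex = re.compile(pattern)
    matches = []
    for dirpath, _dirnames, filenames in os.walk(results_dir):
        for name in filenames:
            if regex.match(name):  # match() anchors at the start of the name
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

With the jobs above, calling it on each of the two result directories should return three paths apiece, one per repeat.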

Result analysis

  • Config file: perf.conf
[ntttcp]
result_file_pattern = .*.RHS
ignore_col = 1
avg_update =

[netperf]
result_file_pattern = .*.RHS
ignore_col = 2
avg_update = 4,2,3|14,5,12|15,6,13

[iozone]
result_file_pattern =
  • Execute regression.py to compare the two results:
Log in to the Autotest server, then run:
# cd /usr/local/autotest/client/tools
# python regression.py netperf /usr/local/autotest/results/8-debug_user/192.168.122.1/ /usr/local/autotest/results/9-debug_user/192.168.122.1/
  • T-test:
scipy: http://www.scipy.org/
t-test: http://en.wikipedia.org/wiki/Student's_t-test
Two Python modules, scipy and numpy, are required.

An unpaired t-test is used to compare the two samples; check the p-value to see whether a regression exists. If the difference between the two samples is statistically significant (p <= 0.05), a '+' or '-' is added before the p-value ('+': avg_sample1 < avg_sample2, '-': avg_sample1 > avg_sample2).
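
To make the arithmetic behind the p-value visible, here is a standard-library sketch of a two-sided unpaired t-test with pooled variance, plus the '+'/'-' flag rule described above. It is not the code of regression.py, which relies on scipy (where `scipy.stats.ttest_ind(sample1, sample2)` performs this kind of test); the function names are hypothetical, and the p-value is obtained by numerically integrating the t density, which is adequate for small samples and moderate t values. Samples with zero variance in both groups are not handled.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def unpaired_ttest(sample1, sample2):
    """Two-sided unpaired t-test with pooled variance; returns (t, p)."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    t = (mean(sample1) - mean(sample2)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Integrate the density from 0 to |t| with Simpson's rule.
    upper = abs(t)
    steps = 2 * max(1000, math.ceil(upper * 100))  # even count, step width <= 0.005
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    p = 1 - 2 * (total * h / 3)  # two-sided tail probability
    return t, max(p, 0.0)


def flagged_pvalue(sample1, sample2, alpha=0.05):
    """Format the p-value, prefixed with '+' or '-' when p <= alpha."""
    _t, p = unpaired_ttest(sample1, sample2)
    flag = ""
    if p <= alpha:
        flag = "+" if mean(sample1) < mean(sample2) else "-"
    return "%s%.3f" % (flag, p)
```

With only three repeats per environment the test has four degrees of freedom, so a difference has to be large relative to the run-to-run spread before it is flagged.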

Regression results:

== Avg1 SD Augment Rate =================================================================================================================================================
    TCP_STREAM
    size|sessions|throughput|   cpu|normalize|  #tx-pkts|  #rx-pkts|    #tx-byts|    #rx-byts|#re-trans|#tx-intr|#rx-intr|  #io_exit|  #irq_inj|#tpkt/#exit|#rpkt/#irq
    ....
    2048|       2|  14699.17| 31.73|   463.19|   1291715|   3373041|    85296481|110477745596|        0|      20| 1763433|   1074349|   1851644|       1.20|      1.82
%SD     |     0.0|       0.6|   0.0|      0.8|       0.3|       4.7|         0.4|         0.6|      0.0|     0.0|     1.5|       1.7|       1.3|        0.0|       5.5
    2048|       4|  15935.68| 34.30|   464.66|   1171103|   2888728|    77388754|119722177477|        0|      18| 1637434|    727207|   1734915|       1.61|      1.67
%SD     |     0.0|       0.3|   1.7|      1.5|       2.8|       6.4|         2.7|         0.3|      0.0|     0.0|     1.5|       3.0|       1.8|        0.0|       6.0

== Avg2 SD Augment Rate =================================================================================================================================================
    ....

== AvgS Augment Rate ====================================================================================================================================================
    TCP_STREAM
    size|sessions|throughput|   cpu|normalize|  #tx-pkts|  #rx-pkts|    #tx-byts|    #rx-byts|#re-trans|#tx-intr|#rx-intr|  #io_exit|  #irq_inj|#tpkt/#exit|#rpkt/#irq
      64|       1|    865.39| 28.13|    30.76|   2774791|   5390191|   183136811|  6846238102|        0|      42| 3733152|   2325894|   3778832|       1.19|      1.43
      64|       1|    883.44| 28.34|    31.18|   2790826|   5436099|   184195088|  6984651542|        0|      42| 3845410|   2372988|   3892147|       1.17|      1.40
%       |    +0.0|      +2.1|  +0.7|     +1.4|      +0.6|      +0.9|        +0.6|        +2.0|     +0.0|    +0.0|    +3.0|      +2.0|      +3.0|       -1.7|      -2.1
    ...
    2048|       2|  14699.17| 31.73|   463.19|   1291715|   3373041|    85296481|110477745596|        0|      20| 1763433|   1074349|   1851644|       1.20|      1.82
    2048|       2|  14346.11| 31.30|   458.22|   1312073|   3819360|    86626802|107855006456|        0|      19| 1901794|   1060233|   1995777|       1.24|      1.90
%       |    +0.0|      -2.4|  -1.4|     -1.1|      +1.6|     +13.2|        +1.6|        -2.4|     +0.0|    -5.0|    +7.8|      -1.3|      +7.8|       +3.3|      +4.4
    2048|       4|  15935.68| 34.30|   464.66|   1171103|   2888728|    77388754|119722177477|        0|      18| 1637434|    727207|   1734915|       1.61|      1.67
    2048|       4|  22884.08| 33.65|   686.88|   1867010|   4378656|   123294638|171931411098|        0|      28| 2520051|   1456781|   2643759|       1.31|      1.62
%       |    +0.0|     +43.6|  -1.9|    +47.8|     +59.4|     +51.6|       +59.3|       +43.6|     +0.0|   +55.6|   +53.9|    +100.3|     +52.4|      -18.6|      -3.0

    ....
== T-test Pvalue ========================================================================================================================================================
     TCP_STREAM
     size|sessions|throughput|   cpu|normalize|  #tx-pkts|  #rx-pkts|    #tx-byts|    #rx-byts|#re-trans|#tx-intr|#rx-intr|  #io_exit|  #irq_inj|#tpkt/#exit|#rpkt/#irq
     2048|       2|  13953.87| 30.77|   454.35|   3232541|  10341241|   213369507|105343585903|        0|      49| 3770714|   2111756|   3863355|       1.34|      2.15
     2048|       2|  14357.55| 31.23|   559.89|   1226498|   2752682|    80993060|107872176441|        0|      19| 1584401|   1072178|   1661943|       1.14|      1.66
%PV      |   0.343|     0.815| 0.117|   +0.422|     0.153|     0.135|      -0.043|       0.843|    0.343|   0.147|   0.116|     0.110|     0.116|      0.261|     0.076
     2048|       4|  15600.51| 32.31|   483.23|   1649537|   5302229|   108985553|117370893553|        0|      26| 2419493|   1275629|   2503394|       1.28|      1.75
     2048|       4|  15667.72| 32.80|   479.08|   1569538|   4784146|   103697876|117843215754|        0|      25| 2243108|   1177907|   2328012|       1.34|      1.72
%PV      |   0.343|     0.937| 0.513|    0.897|     0.912|     0.893|       0.912|       0.944|    0.343|   0.888|   0.882|     0.859|     0.885|      0.089|     0.950
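
The "%" rows in the AvgS Augment Rate section are consistent with the relative change from the first average to the second, in percent. The sketch below reproduces them under that assumption; `augment_rate` is an illustrative name rather than a function taken from regression.py, and the handling of a zero baseline is inferred only from the "#re-trans" column (0 to 0 shown as +0.0):

```python
def augment_rate(avg1, avg2):
    """Relative change from avg1 to avg2 in percent, formatted like the '%' rows."""
    if avg1 == 0:
        # Assumption: a zero baseline is reported as +0.0, as in the '#re-trans' column.
        rate = 0.0
    else:
        rate = (avg2 - avg1) / avg1 * 100
    return "%+.1f" % rate
```

For the 2048-byte, 4-session row, `augment_rate(15935.68, 22884.08)` returns `+43.6`, matching the throughput column in the table.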