orcas-elite/resilience-simulator


MiSim Microservice Resilience Simulator

This simulator was created as part of the Fachstudie (student research project) "Simulation-based Resilience Prediction of Microservice Architectures" at the Reliable Software Systems Research Group of the Institute of Software Technology at the University of Stuttgart.

It allows the simulation of microservice architectures with regard to resilience and is based on the DesmoJ framework for discrete-event modelling and simulation.

Installation

To run the simulator, download the DesmoJ binary from SourceForge and include it in the project.

Execution

The simulation only works if the relative path ./Report exists in the execution directory. With the following file structure ...

project/
|--- Examples/
    |--- architecture_model.json
    |--- experiment_model.json
    |--- ...
|--- Report/
    |--- css/
    |--- js/
    |--- ...
|--- MiSim.jar
|--- ...

... use the following command to run a simulation:

java -jar MiSim.jar -a ./Examples/architecture_model.json -e ./Examples/experiment_model.json -p
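The precondition above can be checked before the simulator is launched. The following Python sketch is a hypothetical wrapper and not part of MiSim: the function name build_misim_command and its parameters are assumptions, and the flags are copied unchanged from the command above. It refuses to build the command when ./Report is missing from the working directory.

```python
from pathlib import Path


def build_misim_command(workdir, architecture, experiment, jar="MiSim.jar"):
    """Return the java command shown above, or raise if ./Report is missing.

    Hypothetical helper: only the flags -a, -e and -p from the README are
    used; nothing else about the command line is assumed.
    """
    workdir = Path(workdir)
    report_dir = workdir / "Report"
    if not report_dir.is_dir():
        raise FileNotFoundError(f"{report_dir} must exist before running")
    return ["java", "-jar", jar, "-a", architecture, "-e", experiment, "-p"]
```

A caller could pass the returned list to subprocess.run(cmd, cwd=workdir, check=True) so that ./Report resolves relative to the project directory.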

Architectural Model

The architectural model is required as input for the simulator and is stored in a JSON file. The following is a simple example of an architectural model:

{
  "microservices": [
    {
      "name": "B",
      "instances": 2,
      "patterns": [],
      "capacity": 1000,
      "operations": [
        {
          "name": "b1",
          "demand": 160,
          "circuitBreaker": {
            "rollingWindow": 10,
            "requestVolumeThreshold": 4,
            "errorThresholdPercentage": 0.5,
            "sleepWindow": 5,
            "timeout": 1
          },
          "dependencies": [
            {
              "service": "D",
              "operation": "d1",
              "probability": 1.0
            },
            {
              "service": "E",
              "operation": "e1",
              "probability": 1.0
            }
          ]
        }
      ]
    },
    {
      "name": "A",
      "instances": 2,
      "patterns": [],
      "capacity": 1000,
      "operations": [
        {
          "name": "a2",
          "demand": 120,
          "circuitBreaker": null,
          "dependencies": [
            {
              "service": "C",
              "operation": "c2",
              "probability": 1.0
            }
          ]
        },
        {
          "name": "a1",
          "demand": 140,
          "circuitBreaker": null,
          "dependencies": [
            {
              "service": "B",
              "operation": "b1",
              "probability": 1.0
            },
            {
              "service": "C",
              "operation": "c1",
              "probability": 1.0
            }
          ]
        }
      ]
    },
    {
      "name": "D",
      "instances": 1,
      "patterns": [],
      "capacity": 1000,
      "operations": [
        {
          "name": "d1",
          "demand": 170,
          "circuitBreaker": null,
          "dependencies": []
        }
      ]
    },
    {
      "name": "E",
      "instances": 2,
      "patterns": [],
      "capacity": 1000,
      "operations": [
        {
          "name": "e2",
          "demand": 180,
          "circuitBreaker": null,
          "dependencies": []
        },
        {
          "name": "e1",
          "demand": 430,
          "circuitBreaker": null,
          "dependencies": []
        }
      ]
    },
    {
      "name": "C",
      "instances": 1,
      "patterns": [],
      "capacity": 1000,
      "operations": [
        {
          "name": "c1",
          "demand": 192,
          "circuitBreaker": {
            "rollingWindow": 10,
            "requestVolumeThreshold": 4,
            "errorThresholdPercentage": 0.5,
            "sleepWindow": 5,
            "timeout": 1
          },
          "dependencies": [
            {
              "service": "E",
              "operation": "e2",
              "probability": 1.0
            }
          ]
        },
        {
          "name": "c2",
          "demand": 90,
          "circuitBreaker": null,
          "dependencies": []
        }
      ]
    }
  ]
}

Description

The model contains architectural information about the microservices of the system, their operations and dependencies and the resilience patterns they implement.

  • name: Name of the microservice
  • instances: Number of instances of this microservice
  • capacity: CPU capacity of each instance in MHz
  • patterns: Array of resilience patterns implemented in this microservice. Each object in the array describes one resilience pattern
    • name: The name of the pattern. As of now, the only supported pattern is Resource Limiter
    • arguments: An array of parameters for the pattern
  • operations: Array of objects that describe the operations this microservice can perform
    • name: Name of the operation
    • demand: CPU demand of this operation in MHz
    • circuitBreaker: Contains the following parameters that configure the implemented circuit breaker: rollingWindow, requestVolumeThreshold, errorThresholdPercentage, sleepWindow and timeout. If the operation does not implement a circuit breaker, the value is null
    • dependencies: Array of objects, each describing a dependency of this operation on another operation
      • service: Name of the microservice to which the other operation belongs
      • operation: Name of the other operation on which this operation depends
      • probability: The probability that this operation will call the other operation (decimal number between 0 and 1)
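Because dependencies refer to other operations by name, a typo in a service or operation name is easy to miss. The following Python sketch checks a loaded architectural model for dependencies that point to an unknown operation or carry a probability outside the range from 0 to 1. The function name find_dangling_dependencies is hypothetical, and how the simulator itself reacts to such entries is not covered here.

```python
def find_dangling_dependencies(model):
    """List dependencies with an unknown target or an invalid probability.

    Hypothetical helper; it only relies on the keys described above.
    """
    # Every (service, operation) pair that the model actually defines.
    known = {
        (service["name"], operation["name"])
        for service in model["microservices"]
        for operation in service["operations"]
    }
    problems = []
    for service in model["microservices"]:
        for operation in service["operations"]:
            source = f'{service["name"]}.{operation["name"]}'
            for dep in operation["dependencies"]:
                target = (dep["service"], dep["operation"])
                if target not in known:
                    problems.append(f"{source}: unknown target {target[0]}.{target[1]}")
                if not 0.0 <= dep["probability"] <= 1.0:
                    problems.append(f"{source}: probability {dep['probability']} not in [0, 1]")
    return problems
```

The model argument would typically come from json.load applied to the architecture file. An empty result means every dependency resolves to a defined operation.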

Experiment Model

The experiment model contains meta information for the simulation and information about the experiment.

This is an example of an experiment model:

{
  "simulation_meta_data": {
    "experiment_name": "ABCDE Experiment",
    "model_name": "Schema ABCDE",
    "duration": 50,
    "report": "",
    "datapoints": 50,
    "seed": 979
  },
  "request_generators": [
    {
      "microservice": "A",
      "operation": "a1",
      "interval": 0.25
    },
    {
      "microservice": "A",
      "operation": "a2",
      "interval": 0.25
    }
  ],
  "chaosmonkeys": [
    {
      "microservice": "E",
      "instances": 1,
      "time": 10
    }
  ]
}

Simulation Meta Data

The simulation_meta_data object holds the meta data needed for the simulation.

  • experiment_name: Name of the experiment
  • model_name: Name of the used model
  • duration: The duration of the experiment in seconds, must be an integer
  • report: The simulator creates a report at the end of the simulation. Leave this field empty for a detailed report, set it to "minimalistic" for a minimalistic version of the report, or set it to "none" to disable the report
  • datapoints: The number of datapoints for the charts in the report. The simulator records statistics at every datapoint. If the value is 0, no charts are created. If it is -1, the simulator records a datapoint at every simulated second
  • seed: A seed for the randomly generated events in the simulator. Leave this field empty if you want random experiments or set the value to an integer to use a seed
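The rules in this list can be checked before a simulation is started. The following Python sketch is a hypothetical validator, not part of the simulator. It assumes that the duration must be positive, that datapoints is an integer of at least -1, and that an "empty" seed may be either an empty string or null; the source does not state these details.

```python
ALLOWED_REPORTS = {"", "minimalistic", "none"}


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def check_meta_data(meta):
    """Return a list of violations of the rules listed above (hypothetical helper)."""
    errors = []
    duration = meta.get("duration")
    # Assumption: a duration of zero or less is not meaningful.
    if not _is_int(duration) or duration <= 0:
        errors.append("duration must be a positive integer (seconds)")
    if meta.get("report", "") not in ALLOWED_REPORTS:
        errors.append('report must be "", "minimalistic" or "none"')
    datapoints = meta.get("datapoints")
    if not _is_int(datapoints) or datapoints < -1:
        errors.append("datapoints must be an integer >= -1")
    seed = meta.get("seed")
    # Assumption: "empty" means a missing key, null or an empty string.
    if seed not in (None, "") and not _is_int(seed):
        errors.append("seed must be an integer or left empty")
    return errors
```

Applied to the simulation_meta_data object of the example experiment model, the function returns an empty list.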

Request Generators

The request_generators array holds objects that describe how initial requests to the microservices of the system are generated to start the simulation.

  • microservice: Name of the microservice to which the request should be sent
  • operation: Name of the operation which should be performed
  • interval: Time interval in seconds at which these requests are created
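The interval translates directly into a request rate, which helps when estimating the load an experiment puts on a service. The following Python sketch uses the key names from the example experiment model and assumes that each generator fires at a fixed interval for the whole duration. The function name request_load is hypothetical, and the total is only an approximation because the exact start and end behaviour of the generators is not specified here.

```python
def request_load(experiment):
    """Per generator: (target, requests per second, approximate total requests).

    Assumption: each generator fires at a fixed interval for the whole duration.
    """
    duration = experiment["simulation_meta_data"]["duration"]
    load = []
    for generator in experiment["request_generators"]:
        target = f'{generator["microservice"]}.{generator["operation"]}'
        rate = 1.0 / generator["interval"]  # requests per simulated second
        load.append((target, rate, int(duration / generator["interval"])))
    return load
```

For the example experiment (duration 50, interval 0.25), this gives a rate of 4 requests per second and roughly 200 requests for each of A.a1 and A.a2.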

Chaosmonkeys

The chaosmonkeys array holds objects that describe chaos monkeys, which shut down instances of the specified microservices during the simulation.

  • microservice: Name of the microservice of which you want to shut down a number of instances during the simulation
  • instances: Number of instances you want to shut down during the simulation
  • time: Time point (in seconds) at which you want to shut down the instances of the specified microservice
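To see how many instances of each microservice are left once all chaos monkeys have fired, the experiment model can be combined with the architectural model. The following Python sketch uses the key names from the two example models. The function name surviving_instances is hypothetical, and clamping the count at zero when more instances are shut down than exist is an assumption, not documented simulator behaviour.

```python
def surviving_instances(architecture, experiment):
    """Instances left per microservice after all chaos monkeys have fired."""
    remaining = {
        service["name"]: service["instances"]
        for service in architecture["microservices"]
    }
    for monkey in experiment["chaosmonkeys"]:
        name = monkey["microservice"]
        if name not in remaining:
            raise ValueError(f"chaos monkey targets unknown microservice {name!r}")
        # Assumption: shutting down more instances than exist leaves zero.
        remaining[name] = max(0, remaining[name] - monkey["instances"])
    return remaining
```

In the example models, microservice E starts with 2 instances and loses 1 at second 10, so 1 instance of E remains for the rest of the simulation.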