Simulating HASH's Cloud Infrastructure
This simulation demonstrates how HASH's cloud infrastructure responds to user requests to run simulations on hCloud. hCloud executes each simulation in a dedicated Kubernetes pod with a fixed allocation of CPU and memory resources.
Requests are generated according to real-world data. The request distribution was uploaded as a dataset, distribution.csv, and represents the proportion of daily requests received in each hour of the day. Simulations are executed as soon as they are received, unless the cluster does not currently have enough compute resources available. In that case, the request is queued until adequate resources become available. Compute resources become available when another simulation completes, or when the autoscaler adds a new compute node to the cluster.
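The run-or-queue behaviour described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the simulation's actual code: the `Cluster` class and its method names are hypothetical, and serving the queue in first-in, first-out order is an assumption, since the description does not say in which order queued requests are started.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical cluster state: free capacity and the per-pod allocation."""
    free_cpu: float
    free_memory: float
    pod_cpu: float
    pod_memory: float

    def __post_init__(self):
        self.queue = deque()  # requests waiting for resources (FIFO is an assumption)
        self.running = []     # requests currently executing in a pod

    def _fits(self):
        # A pod fits only if both CPU and memory are available.
        return self.free_cpu >= self.pod_cpu and self.free_memory >= self.pod_memory

    def _start(self, request_id):
        self.free_cpu -= self.pod_cpu
        self.free_memory -= self.pod_memory
        self.running.append(request_id)

    def _drain(self):
        # Start queued requests for as long as a pod still fits.
        while self.queue and self._fits():
            self._start(self.queue.popleft())

    def submit(self, request_id):
        # Run immediately if nothing is waiting and a pod fits, otherwise queue.
        if not self.queue and self._fits():
            self._start(request_id)
        else:
            self.queue.append(request_id)

    def release(self, request_id):
        # A simulation completed: return its resources, then serve the queue.
        self.running.remove(request_id)
        self.free_cpu += self.pod_cpu
        self.free_memory += self.pod_memory
        self._drain()

    def add_capacity(self, cpu, memory):
        # The autoscaler added a node: grow free capacity, then serve the queue.
        self.free_cpu += cpu
        self.free_memory += memory
        self._drain()
```

Both ways of freeing resources, a completed simulation (`release`) and a newly added node (`add_capacity`), end by draining the queue, which mirrors the two cases named in the text.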
The parameters controlling this simulation may be tuned through the globals.json file. These parameters are:
- requests_per_day: the number of user requests received per day.
- experiment_time_seconds: a triangular distribution which specifies the duration of each simulation run. This distribution is parameterised by a min, mode and max, in seconds.
- node_specs: the number of cpu cores and gigabytes of memory in each compute node in the cluster.
- pod_specs: the number of cpu cores and gigabytes of memory allocated to the pod for each simulation run.
- autoscale_bounds: the autoscaler automatically adds and removes nodes in an effort to maintain cluster utilisation between the given bounds. If cluster utilisation exceeds max_util, then one node is added to the cluster. Similarly, if utilisation drops below min_util, then one node is removed from the cluster. At each time step, at most one node will be removed or added.
- autoscale_delay_minutes: the delay before a request to add a node to the cluster is fulfilled. Removing a node has no delay.
- initial_num_nodes: the starting number of nodes in the cluster.
- min_num_nodes: the minimum number of nodes required to be in the cluster at all times.
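To make the autoscaler rules concrete, here is a minimal Python sketch of a single autoscaler step. The function name, its arguments and the list of pending arrival steps are all hypothetical. It also makes two assumptions the description leaves open: utilisation is passed in as a single fraction (the text does not say whether it is based on CPU, memory or both), and a new node is requested on every step in which utilisation is above max_util, even if earlier requests are still pending.

```python
def autoscale_step(step, num_nodes, pending_arrivals, utilisation,
                   min_util, max_util, delay_minutes, min_num_nodes):
    """Apply one autoscaler tick and return (num_nodes, pending_arrivals).

    pending_arrivals holds the steps at which previously requested
    nodes will join the cluster (one step is one minute).
    """
    # Nodes requested earlier join once their delay has elapsed.
    arrived = [t for t in pending_arrivals if t <= step]
    pending = [t for t in pending_arrivals if t > step]
    num_nodes += len(arrived)

    # At most one scaling action per step.
    if utilisation > max_util:
        # Scale up: the node only arrives after the configured delay.
        pending.append(step + delay_minutes)
    elif utilisation < min_util and num_nodes > min_num_nodes:
        # Scale down: takes effect immediately, but never below the minimum.
        num_nodes -= 1

    return num_nodes, pending
```

For example, with a delay of 5 minutes, a scale-up requested at step 0 leaves the node count unchanged until step 5, while a scale-down is visible in the same step that triggers it.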
Each step in the simulation represents one minute.
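Because the request distribution is hourly and run durations are given in seconds, both need converting to one-minute steps. The sketch below shows one plausible way to do this; the function names are hypothetical, and spreading each hour's requests evenly over its 60 steps and rounding durations up to whole steps are assumptions rather than documented behaviour. How the simulation turns an expected rate into a whole number of requests is not specified here.

```python
import math
import random


def expected_requests_per_step(requests_per_day, hourly_proportions, step):
    """Expected number of requests arriving during the given one-minute step.

    hourly_proportions is a list of 24 fractions summing to 1, as in
    distribution.csv (how that file is loaded is not shown here).
    """
    hour = (step // 60) % 24  # hour of the day this step falls in
    # Assumption: requests are spread evenly over the 60 steps of the hour.
    return requests_per_day * hourly_proportions[hour] / 60


def sample_duration_steps(min_s, mode_s, max_s, rng=random):
    """Draw a run duration from the triangular distribution, in steps."""
    # random.triangular takes its arguments in the order (low, high, mode).
    seconds = rng.triangular(min_s, max_s, mode_s)
    # Assumption: round up, so every run occupies at least one step.
    return max(1, math.ceil(seconds / 60))
```

With a uniform distribution and requests_per_day set to 1440, the expected rate is about one request per step at any time of day.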
The Analysis tab contains several plots illustrating how the cluster evolves as experiment requests are received and executed.
- Experiment Pods shows the number of actively running and queued simulation experiments at each time step.
- Cluster CPU Usage shows the total number of CPU cores in the cluster and the current utilisation at each time step.
- Cluster Memory Usage shows the total amount of memory in the cluster and the current utilisation at each time step.
- Number of Nodes shows the number of compute nodes in the cluster at each time step.