This runs workloads that are labeled as "integration-test". The apiserver and
scheduler are only started once per unique configuration, followed by each
workload using that configuration. This makes execution faster. In contrast to
benchmarking, we care less about starting with a clean slate for each test.