Today we released version 0.7 of the Hazelcast Simulator tool. It is our production simulator used to test Hazelcast and Hazelcast-based applications in clustered environments.
Simulator Communication Protocol
The biggest change in this release is the new Simulator Communication Protocol. It makes each Simulator component (Coordinator, Agents, Workers) individually addressable and the communication between them language agnostic. This is a big milestone in the support for our Hazelcast clients in C#, C++, Python etc. We also created a Simulator Worker Implementation Guide which will be published soon.
A SimulatorAddress for each component is also a prerequisite for implementing script-based resilience testing. With this upcoming feature we will be able to inject disturbances into an ongoing Simulator run to trigger migrations and split-brain handling.
An additional benefit is the bi-directional communication between all Simulator components. It allowed us to switch most internal systems from polling to pushing their data directly to the Coordinator. This reduced the latency of the failure detection and improved the continuous performance monitoring.
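The push model described above can be illustrated with a minimal sketch. All class and method names here (SketchCoordinator, SketchWorker, onFailure, the address string) are hypothetical and are not part of the Simulator code base; the point is only that a Worker hands a failure to the Coordinator the moment it happens, instead of waiting for the next polling interval.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class PushVersusPollSketch {

    // Hypothetical Coordinator: collects failures as soon as they are pushed to it.
    static final class SketchCoordinator {
        final List<String> failures = new ArrayList<>();

        void onFailure(String message) {
            failures.add(message);
        }
    }

    // Hypothetical Worker: knows its own address and a listener to push data to.
    static final class SketchWorker {
        private final String address;
        private final Consumer<String> failureListener;

        SketchWorker(String address, Consumer<String> failureListener) {
            this.address = address;
            this.failureListener = failureListener;
        }

        // Pushes the failure immediately; nobody has to poll the Worker for it.
        void fail(String reason) {
            failureListener.accept(address + ": " + reason);
        }
    }

    public static void main(String[] args) {
        SketchCoordinator coordinator = new SketchCoordinator();
        SketchWorker worker = new SketchWorker("agent1/worker2", coordinator::onFailure);
        worker.fail("OutOfMemoryError");
        System.out.println(coordinator.failures);
    }
}
```

With polling, the detection delay is bounded by the polling interval; with a push like this it is bounded only by the transport latency.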
Simulator Test Framework
We’ve cleaned up the Simulator Test Framework, which defines the API for a Simulator Test.
There are new annotations to inject common objects into your test class:
@InjectProbe is now mandatory to get a Probe injected. The old behavior was to inject a probe into every field of the type Probe. The annotation also provides a field to override the Probe name (if not defined the field name will be used).
We merged all Probes into a single implementation, based on HdrHistogram. This removes the whole Probe configuration part from the TestSuite file, as well as the different interfaces. You can define if a Probe should be used for the throughput calculation of the test via the @InjectProbe annotation. By default it will just be used to record latency values.
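The injection rules described above can be sketched in a few lines of reflection code. This is a self-contained illustration, not Simulator's implementation: SketchInjectProbe and SketchProbe are hypothetical stand-ins for the real annotation and Probe type, and the attribute names name and useForThroughput are assumptions chosen to mirror the behavior described in the text.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Simulator's @InjectProbe; not the real annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface SketchInjectProbe {
    String name() default "";                 // empty means: fall back to the field name
    boolean useForThroughput() default false; // default: only record latency values
}

// Hypothetical stand-in for a Probe; it only remembers how it was configured.
final class SketchProbe {
    final String name;
    final boolean usedForThroughput;

    SketchProbe(String name, boolean usedForThroughput) {
        this.name = name;
        this.usedForThroughput = usedForThroughput;
    }
}

final class ProbeInjectionSketch {

    // Injects a probe into every annotated SketchProbe field; unannotated fields are skipped.
    static List<SketchProbe> inject(Object test) {
        List<SketchProbe> injected = new ArrayList<>();
        for (Field field : test.getClass().getDeclaredFields()) {
            SketchInjectProbe annotation = field.getAnnotation(SketchInjectProbe.class);
            if (annotation == null || field.getType() != SketchProbe.class) {
                continue;
            }
            String name = annotation.name().isEmpty() ? field.getName() : annotation.name();
            SketchProbe probe = new SketchProbe(name, annotation.useForThroughput());
            try {
                field.setAccessible(true);
                field.set(test, probe);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Could not inject probe into " + field, e);
            }
            injected.add(probe);
        }
        return injected;
    }

    // Example test class with two injected probes and one field that is left alone.
    static final class ExampleTest {
        @SketchInjectProbe(useForThroughput = true)
        SketchProbe putLatency;

        @SketchInjectProbe(name = "customGet")
        SketchProbe getLatency;

        SketchProbe notInjected; // no annotation, so it stays null
    }

    public static void main(String[] args) {
        ExampleTest test = new ExampleTest();
        inject(test);
        System.out.println(test.putLatency.name + " throughput=" + test.putLatency.usedForThroughput);
        System.out.println(test.getLatency.name + " throughput=" + test.getLatency.usedForThroughput);
        System.out.println("unannotated field injected: " + (test.notInjected != null));
    }
}
```

Making the annotation mandatory means a test can hold a Probe-typed field for its own purposes without the framework overwriting it.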
We also created more abstract IWorker classes for the @RunWithWorker annotation (which is the recommended way to write a Simulator Test).
|Class name||Abstract methods||Description|
This is the simplest implementation which supports just a single operation.
Has a single, built-in
Useful for simple Simulator Tests with a single operation and a fast
Supports a single operation like the
Has a single, built-in
Useful if you have a single operation and need more control over the measured code block, e.g. if you do expensive operations in your
This is the basic implementation for multiple operations. You have to pass an
Useful for most Simulator Tests with multiple operations and a fast
Supports multiple operations like the
Has a single, built-in
Useful if you have multiple operations and need more control over the measured code block, e.g. if you do expensive operations in your
If you defined an operation
Useful if you have multiple operations and need a separate
Implements the Hazelcast interface
Useful if you have multiple operations for asynchronous map operations.
Useful if you decide dynamically what your Simulator Worker should do at runtime and you need Workers which do nothing.
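To make the idea of a single-operation worker more concrete, here is a minimal sketch of the pattern: a base class owns the run loop and the timing, and the test only implements timeStep(). The class SketchSingleOperationWorker is hypothetical and is not one of Simulator's IWorker classes. The fixed iteration count and the plain list of latencies are simplifying assumptions standing in for the real stop condition and probe.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class; it is not one of Simulator's IWorker implementations.
abstract class SketchSingleOperationWorker implements Runnable {
    private final List<Long> latenciesNanos = new ArrayList<>();
    private final int iterations;

    SketchSingleOperationWorker(int iterations) {
        this.iterations = iterations;
    }

    // The single operation under test; checked exceptions are allowed.
    protected abstract void timeStep() throws Exception;

    @Override
    public final void run() {
        for (int i = 0; i < iterations; i++) {
            long started = System.nanoTime();
            try {
                timeStep();
            } catch (Exception e) {
                throw new IllegalStateException("timeStep() failed in iteration " + i, e);
            }
            // The whole timeStep() call is measured, so it should contain only the operation.
            latenciesNanos.add(System.nanoTime() - started);
        }
    }

    final List<Long> latenciesNanos() {
        return latenciesNanos;
    }
}

final class WorkerSketch {

    // Example worker whose single operation just increments a counter.
    static final class CountingWorker extends SketchSingleOperationWorker {
        int counter;

        CountingWorker(int iterations) {
            super(iterations);
        }

        @Override
        protected void timeStep() {
            counter++;
        }
    }

    public static void main(String[] args) {
        CountingWorker worker = new CountingWorker(1000);
        worker.run();
        System.out.println("operations: " + worker.counter
                + ", latency samples: " + worker.latenciesNanos().size());
    }
}
```

Because the base class times the entire timeStep() call, this shape fits tests where the method contains little besides the measured operation. If expensive setup happens inside timeStep(), a variant that lets the test start and stop the measurement itself is the better fit.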
- Added a new command line tool simulator-wizard to ease the installation and creation of working directories.
- Added new property CLOUD_PROVIDER=local to run Simulator with a minimum setup on the local machine.
- Added new property AGENT_PORT to configure the Agent port.
- Added new properties HAZELCAST_PORT_RANGE to configure the Hazelcast cluster ports.
- Added command line parameters --targetCount to control the load generation.
- Added command line parameter --licenseKey to set an Enterprise License key for a Simulator run.
- Added configuration option to create Workers with different Hazelcast versions and configuration (via
KeyLocality.SHARED which generates random keys from the same range on all Workers.
- Resolved a lot of SonarQube issues in test classes.
- Adapted test classes to changes in the Simulator Test Framework.
- Optimized some tests to scale better with a high number of Workers.
ExtractorMapTest to test queries with attribute extractors.
MapPutAllOnTheFlyTest to test the performance of
NetworkTest to test the performance of the Hazelcast IO system.
PartitionServiceMBeanTest to test method calling via
LongTestPhasesTest to test timeout detection of
MapTimeToLiveTest which instantiated a wrong class if used with local
- Created common base class for DomainObject classes to reduce code duplication.
- Removed HTTP tests from the default test module, because they needed a lot of dependencies and didn’t work out of the box.
- Reduced the technical debt of the project.
- Simulator is now compliant with the company code coverage requirements.
- Simulator CheckStyle configuration is aligned with Hazelcast main project.
- Added logging of which Worker is the “first worker”, which is used for the global test phases.
- Cleanup of the user-lib directory to avoid clashes with previous runs in static setups.
- Moved upload of Hazelcast JARs from Provisioner to Coordinator to support different Hazelcast versions on each Worker.
- Symlinks are now resolved in file upload methods.
- Allowed the timeStep() methods of the abstract IWorker implementations to throw checked exceptions.
- Simplified the Probes to a single implementation based on HdrHistogram.
- Added interval latency snapshots.
- Moved the throughput calculation to the Workers, so network latency is no longer part of the calculation.
- Added latency values to continuous performance monitoring.
- Switched continuous performance logging to milliseconds for latency values over one second. Removed the (performance not available) logging, since it makes the output of mixed TestSuites harder to read.
- Agents are now started and stopped by the Coordinator, so there are no more failures if Agents were killed or are still running from another Simulator version.
- Pulled out harakiri-monitor as a standalone command line tool which is used by Provisioner and Coordinator.
- Made use of ThreadSpawner to reduce internal code duplication and to parallelize some code paths.
CloudProviderUtils to have a single location for
- Implemented and improved the fail-fast behavior of TestCaseRunner, so a TestSuite is aborted properly on a critical failure.
- Added fetching of tags to GitSupport so new versions can automatically be built.
- Replaced SLF4J and Logback with a Log4j bridge, to unify log configuration (and enable easy logging of Netty on Agent and Worker).
- Removed property
TestContainer (we have the built-in probes to monitor progress).
- Removed the --list command from Provisioner, which does the same as a simple
-XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints to the JVM arguments for JFR profiler.
- Fixed multiple issues with the live performance monitoring.
- Fixed retrieval of Enterprise JARs when --enterpriseEnabled is used with Maven version specification.
- Fixed mounting of ephemeral devices on EC2 instances with default AMIs.
- Fixed GCE setup related files and settings.
- Fixed an incompatibility between Hazelcast Simulator and Hazelcast version 3.2.
- Removed the need to parse the hazelcast.xml file on the Coordinator to retrieve the Hazelcast port, since this failed if the file wasn’t compatible with the out-of-the-box version of Hazelcast in Simulator.
- Removed the need for HostAddressPicker which failed in complex network setups.
We increased code coverage by 37.7% to 94.5% and added 681 new tests. We resolved all 340 SonarQube issues, reduced the technical debt by 38 days and reduced the code duplication by 0.2%. We aligned some CheckStyle rules between project XML and SonarQube.
The figures in the screenshots are a bit lower, since the 0.6 tag was pushed some days after the release, when the code quality had already improved again. You can compare the exact numbers by having a look at the Simulator 0.6 Release. The drops in code coverage at the end were caused by a test configuration error.
Try Simulator for yourself, get started today!