Take Control of Your Hazelcast C++ Objects

Starting with version 3.6.2, the Hazelcast C++ client introduces a new API concept: “Raw Pointer” adaptors. You can now take ownership of the memory of the objects returned by the C++ API and manage that memory in your own library. The new API also adds late deserialization for APIs that return multiple values, such as IMap::values and IMap::keySet.

Each container has a new adaptor class whose name starts with RawPointer and which provides access to the raw pointers of the created objects. These adaptor classes are in the hazelcast::client::adaptor namespace and are listed below:

  • RawPointerList
  • RawPointerQueue
  • RawPointerTransactionalMultiMap
  • RawPointerMap
  • RawPointerSet
  • RawPointerTransactionalQueue
  • RawPointerMultiMap
  • RawPointerTransactionalMap

Note that these adaptor classes do not create new structures in the cluster. To construct an adaptor, simply pass the Hazelcast container (e.g., the IMap) that you obtained from the cluster as the constructor argument.
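To make this relationship concrete, here is a minimal, self-contained sketch that models it without the Hazelcast client. ClusterMap, MapHandle, and RawPointerView are hypothetical stand-ins for the map held in the cluster, the IMap handle, and a raw pointer adaptor, respectively. Values are plain ints serialized as strings, and std::shared_ptr and std::unique_ptr replace boost::shared_ptr and std::auto_ptr so that the code builds with any C++11 compiler. This is not how the client is implemented; it only illustrates that wrapping a handle adds a second view over the same data.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a map held in the cluster: values are kept serialized.
struct ClusterMap {
    std::map<int, std::string> entries;
};

// Hypothetical stand-in for IMap<int, int>: a handle onto the cluster map.
class MapHandle {
public:
    explicit MapHandle(std::shared_ptr<ClusterMap> remote) : remote_(remote) {
        assert(remote_ && "a handle needs a cluster map to point at");
    }

    void put(int key, int value) { remote_->entries[key] = std::to_string(value); }

    // Traditional style: the deserialized value comes back in a shared_ptr.
    std::shared_ptr<int> get(int key) const {
        const std::string *bytes = find(key);
        return bytes ? std::make_shared<int>(std::stoi(*bytes)) : std::shared_ptr<int>();
    }

    // Looks up the serialized form; returns a null pointer if the key is absent.
    const std::string *find(int key) const {
        std::map<int, std::string>::const_iterator it = remote_->entries.find(key);
        return it == remote_->entries.end() ? nullptr : &it->second;
    }

private:
    std::shared_ptr<ClusterMap> remote_;
};

// Hypothetical stand-in for a raw pointer adaptor: it wraps an existing handle
// and returns values that the caller owns. It creates nothing in the "cluster".
class RawPointerView {
public:
    explicit RawPointerView(const MapHandle &map) : map_(map) {}

    std::unique_ptr<int> get(int key) const {
        const std::string *bytes = map_.find(key);
        return std::unique_ptr<int>(bytes ? new int(std::stoi(*bytes)) : nullptr);
    }

private:
    MapHandle map_; // a copy of the handle, still pointing at the same ClusterMap
};
```

In this model, a value written through the MapHandle is immediately visible through a RawPointerView built from it, and the ClusterMap still contains a single entry per key.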

Why Do You Need to Use the Raw Pointer Adaptors?

The Hazelcast C++ API (e.g., the IMap) traditionally returns the objects as boost::shared_ptr objects. If you use this API and you want to keep this returned object in your library, how do you do it?

boost::shared_ptr does not allow the memory of the object to be released manually. You can get a pointer, but the memory is still managed by the shared_ptr. Once the last shared_ptr that references the object is destroyed, the object is destroyed as well. But what if you need to use this object later?

One solution to this problem is keeping a boost::shared_ptr in a list (strongly referenced) to prevent object destruction and to provide the pointer of the object to your library. However, you may have a legacy library where you want to manage the lifecycle of the object (the library deletes the object when no longer in use). This would be a problem, since the boost::shared_ptr will try to delete the object as well.

boost::shared_ptr<V> get(const K &key)
….

{
  boost::shared_ptr<V> temp = map.get(key);
  passThePointerToTheLibrary(temp.get());
} // object temp is destroyed

Line 5: map.get constructs a value object, temp. Line 6: The pointer to the object is passed to your library. Line 7: When the scope of temp ends, the object is destroyed. When the legacy library later tries to use the object, the behavior is undefined and the program will most likely crash, because the memory for the object has already been released. To prevent this, you would need to construct a copy of the temp object and pass the copy to the library at line 6, at the cost of an extra copy of the object.
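The copy-based workaround can be sketched as follows. This is a self-contained illustration under stated assumptions, not code from the Hazelcast client: Employee, LegacyLibrary, getFromMap, and handOverCopy are hypothetical names, getFromMap stands in for map.get, and std::shared_ptr is used in place of boost::shared_ptr so the sketch needs only the standard library.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical value type, for illustration only.
struct Employee {
    int id;
    std::string name;
    Employee(int i, const std::string &n) : id(i), name(n) {}
};

// Hypothetical legacy library: it takes ownership of the pointer it is given
// and deletes the object itself when it no longer needs it.
class LegacyLibrary {
public:
    LegacyLibrary() : owned_(nullptr) {}
    ~LegacyLibrary() { delete owned_; }
    LegacyLibrary(const LegacyLibrary &) = delete;
    LegacyLibrary &operator=(const LegacyLibrary &) = delete;

    void passThePointerToTheLibrary(Employee *e) {
        assert(e != nullptr);
        delete owned_; // drop whatever the library held before
        owned_ = e;
    }

    const Employee *current() const { return owned_; }

private:
    Employee *owned_;
};

// Stand-in for map.get(key): the value comes back inside a shared_ptr.
std::shared_ptr<Employee> getFromMap(int key) {
    return std::make_shared<Employee>(key, "employee-" + std::to_string(key));
}

// The workaround: give the library its own heap-allocated copy, because the
// object managed by the shared_ptr disappears when temp goes out of scope.
void handOverCopy(LegacyLibrary &library, int key) {
    std::shared_ptr<Employee> temp = getFromMap(key);
    library.passThePointerToTheLibrary(new Employee(*temp)); // the extra copy
} // temp and its object are destroyed here; the library's copy survives
```

Passing temp.get() instead of the copy would leave the library holding a dangling pointer and would also lead to a double delete, which is why the copy is needed in this model.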

Raw pointer adaptors overcome this limitation by eliminating the need to keep the boost::shared_ptr. Let’s see an example:

IMap<int, Employee> originalMap = client.getMap<int, Employee>("MyMap");
client::adaptor::RawPointerMap<int, Employee> employees(originalMap);

{
    std::auto_ptr<Employee> employee = employees.get(3);
    passThePointerToTheLibrary(employee.release()); // passes the ownership of the object to the library, the auto_ptr pointer becomes NULL.
} // The auto_ptr is destroyed but the object is NOT destroyed.

Line 1: Gets the map that we are interested in. Line 2: Adapts our map to the raw pointer map interface. Both originalMap and employees use the same map “MyMap” in the cluster. Line 5: We get the employee with id 3 as a pointer. We can use this object the same way as we use any std::auto_ptr. Line 6: employee.release() releases the memory ownership of the object, so the std::auto_ptr no longer manages the object’s memory. The internal pointer in the employee variable becomes NULL after this call. The library gets the ownership of the object, and it is now the library’s responsibility to release the allocated memory when it no longer needs it. Line 7: The employee auto_ptr is destroyed, but since its internal pointer is NULL at this point, this has no effect on the constructed object. The library can continue to use the object without any problems.
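The ownership hand-off itself can be tried out without a cluster. The sketch below is an assumption-laden illustration: Tracked, getOwned, and transferToLibrary are hypothetical names, getOwned stands in for employees.get, and std::unique_ptr is used because std::auto_ptr was removed in C++17 (release() behaves the same way for this purpose). A live-object counter makes it visible that the object outlives the smart pointer.

```cpp
#include <cassert>
#include <memory>

// Hypothetical value type that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    int id;
    explicit Tracked(int i) : id(i) { ++alive; }
    Tracked(const Tracked &other) : id(other.id) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Stand-in for employees.get(key): the caller receives an owning pointer.
std::unique_ptr<Tracked> getOwned(int key) {
    return std::unique_ptr<Tracked>(new Tracked(key));
}

// Mirrors the scoped block in the example above: take the object, release
// ownership, and let the (now empty) smart pointer go out of scope.
Tracked *transferToLibrary(int key) {
    std::unique_ptr<Tracked> employee = getOwned(key);
    Tracked *raw = employee.release();  // ownership leaves the smart pointer
    assert(employee.get() == nullptr);  // its internal pointer is now null
    return raw;
} // employee is destroyed here, but the object is NOT destroyed
```

Whoever receives the returned pointer is responsible for deleting it; until then Tracked::alive stays at 1, and no copy of the object was made along the way.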

Late Deserialization

Deserialization transforms the binary data received from the cluster into objects. It is a costly process (in terms of delay and CPU usage), so it is desirable to minimize the number of deserializations.

Let’s have a look at the APIs that return multiple objects:

std::vector<V> IMap::values();

Let’s assume that this API returns 100 objects. You would have to wait for all 100 objects to be deserialized before you can continue. If each deserialization takes X milliseconds, the call to values() will not return for at least 100 * X milliseconds. (Some parallelization is possible here, but let’s assume that deserialization is performed sequentially.) Now assume that you will actually use only five of the 100 objects returned. The deserialization of the other 95 objects would then be wasted work. Furthermore, since this API returns the objects in a vector of objects, you need to make a copy if you want to keep an object in your library.

Using the raw pointer-based API solves this situation. The cost of deserialization of an item is delayed until the item is actually accessed. The raw pointer API uses the DataArray and EntryArray interfaces that allow late deserialization of objects. The entry in the returned array is deserialized only when it is accessed. Please see the example code below:

// No deserialization here
std::auto_ptr<hazelcast::client::DataArray<V> > vals = map.values();

// deserializes the item at index 0, assuming that there is at least one item in the array
const V *value = vals->get(0);

// no deserialization here since it was already deserialized
value = vals->get(0);

// no deserialization here since it was already deserialized
value = (*vals)[0];

// releases the value so that you can keep this object pointer somewhere else in your application
std::auto_ptr<V> releasedValue = vals->release(0);

// deserialization occurs again since the value was released already
value = vals->get(0);

Line 2: Gets all the values in the map. This may be a long list of values. None of them is deserialized at this point; only the binary data is obtained. Line 5: Accesses the first item in the list, at index 0. During this access, the binary data for the first item is deserialized. The memory ownership of this new object belongs to the DataArray (vals). Line 8: Accessing the first entry again does not cause any deserialization. Line 11: This line also accesses the first item in the array, this time using the [] operator. No deserialization occurs here because the object was already deserialized. Line 14: This line releases the memory ownership of the first object in the array. From this point on, it is your responsibility to manage the released object. Line 17: Accessing an item that was already released causes its binary data to be deserialized again.
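The behavior walked through above can be modeled in a few lines of standard C++. The following is only a sketch of the idea, not the implementation of DataArray: LazyIntArray is a hypothetical class, the "binary data" is a decimal string per item, std::unique_ptr replaces std::auto_ptr, and a counter is added purely to make deserializations observable. The choice that release() deserializes an item that has not been accessed yet is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a late-deserializing array of ints.
class LazyIntArray {
public:
    explicit LazyIntArray(const std::vector<std::string> &serialized)
        : data_(serialized), cache_(serialized.size()), deserializations_(0) {}

    std::size_t size() const { return data_.size(); }

    // Deserializes the item on first access; later calls reuse the cached
    // object. The array keeps ownership of the returned object.
    const int *get(std::size_t index) {
        assert(index < data_.size());
        if (!cache_[index]) {
            cache_[index].reset(new int(std::stoi(data_[index])));
            ++deserializations_;
        }
        return cache_[index].get();
    }

    const int *operator[](std::size_t index) { return get(index); }

    // Hands ownership of the item to the caller. The cache slot becomes empty,
    // so the next access has to deserialize the binary data again.
    std::unique_ptr<int> release(std::size_t index) {
        get(index); // assumption: deserialize first if not done yet
        return std::move(cache_[index]);
    }

    int deserializations() const { return deserializations_; }

private:
    std::vector<std::string> data_;            // serialized form, always kept
    std::vector<std::unique_ptr<int> > cache_; // deserialized objects, if any
    int deserializations_;
};
```

With 100 items of which only five are accessed, this model performs five deserializations instead of 100, and a released item costs one more deserialization only if it is accessed again.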

In this blog post, we introduced the new raw pointer adaptor classes, which give you better control over the memory of the objects returned by the Hazelcast C++ API. You can see more examples at these links: