Adaptive Cloud Publish-Subscribe Services for Latency-Constrained Applications
With the advent of very high-speed connections, modern web applications, smartphones, and mobile applications, large-scale internet services have become ubiquitous and are part of our daily lives. In many of these services, such as social media, users not only consume content but also contribute to producing it. In addition, users want to be informed dynamically of changes to the content in which they are interested, notably by means of push notifications.
The publish/subscribe model is an efficient paradigm that can be leveraged in these contexts, as it provides an abstraction that logically decouples content producers (publishers) from content consumers (subscribers). Publish/subscribe is typically provided as a service with which subscribers register interest in (subscribe to) the content that they want to receive. Then, as publishers generate content and submit it to the service in the form of publications, the service determines which subscribers should receive each publication and forwards it to them. While multiple variants of the publish/subscribe paradigm have been described in the literature, this thesis is centered on topic-based publish/subscribe, which enjoys widespread usage in large-scale commercial systems.
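The subscribe, publish, and forward steps described above can be illustrated with a minimal single-process sketch. It is not the service developed in this thesis: the class name `TopicBroker`, its methods, and the callback-based delivery are assumptions made purely for illustration, and a real deployment would forward publications over the network.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A subscriber is modeled as a callback taking (topic, payload).
Handler = Callable[[str, str], None]


class TopicBroker:
    """Toy in-memory topic-based publish/subscribe service (hypothetical)."""

    def __init__(self) -> None:
        # Maps each topic name to the handlers subscribed to it.
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register interest in a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, payload: str) -> int:
        """Forward a publication to every subscriber of the topic.

        Returns the number of subscribers that received it.
        """
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, payload)
        return len(handlers)
```

The publisher only names a topic and never refers to a subscriber, which is the decoupling the paradigm provides: the broker alone holds the mapping from topics to recipients.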
Supporting large-scale topic-based publish/subscribe applications raises research challenges, notably regarding scalability and load balancing, as some applications built on these systems can generate high message volumes. In addition, some applications impose further constraints: in multiplayer online games (MOG), for example, publication delivery latencies must be kept below a given threshold, which can be particularly challenging when clients are distributed around the world. The cloud can be leveraged in these contexts, as a publish/subscribe service deployed in the cloud can benefit from the large pool of resources that the cloud provides in several geographical regions.
This thesis proposes a set of contributions in the general area of scaling cloud-based topic-based publish/subscribe systems. Our first contribution, Dynamoth, provides a scalable topic-based publish/subscribe service that is tailored for latency-constrained applications. It provides a hierarchical scalability and load balancing model that exploits the intrinsic characteristics of the topic-based publish/subscribe paradigm. Dynamoth also provides availability and fault tolerance in the event of server failures, and offers several levels of reliability and ordering guarantees. Our second contribution, MultiPub, provides a global-scale topic-based publish/subscribe service tailored to applications that have many clients around the world and strict latency constraints. It allows latency constraints to be specified, and then continuously ensures that these constraints are satisfied (if possible) by generating optimal configurations of cloud deployments spanning several of the available regions. As cloud usage incurs bandwidth-related costs, and as different cloud regions exhibit different costs, MultiPub also attempts to reduce such costs by selecting the most cost-efficient configuration that respects the latency constraints. Finally, our third contribution, DynFilter, proposes a game-oriented topic-based publish/subscribe service that aims at limiting bandwidth usage in multiplayer and massively multiplayer online games. As DynFilter is game-specific, it exploits the conceptual spatial model of such games to dynamically inhibit the dissemination of publications that are of lesser importance in a game setting, in order to achieve target bandwidth savings.
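The selection step attributed to MultiPub above (choose the most cost-efficient configuration that respects the latency constraint) can be sketched as follows. This is an illustrative simplification, not MultiPub's actual algorithm: the `Configuration` record, its fields, and the function name are hypothetical. The sketch also assumes that an estimated latency and cost are already known for each candidate, and that when no candidate meets the constraint the lowest-latency one is returned.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Configuration:
    """Hypothetical candidate deployment across cloud regions."""
    regions: Tuple[str, ...]
    latency_ms: float      # assumed: estimated delivery latency
    cost_per_hour: float   # assumed: estimated bandwidth-related cost


def select_configuration(
    candidates: Iterable[Configuration], max_latency_ms: float
) -> Configuration:
    """Return the cheapest candidate that satisfies the latency bound.

    If no candidate satisfies the bound, fall back to the candidate
    with the lowest latency (an assumption of this sketch).
    """
    pool = list(candidates)
    if not pool:
        raise ValueError("no candidate configurations")
    feasible = [c for c in pool if c.latency_ms <= max_latency_ms]
    if feasible:
        return min(feasible, key=lambda c: c.cost_per_hour)
    return min(pool, key=lambda c: c.latency_ms)
```

Filtering on the latency constraint first and minimizing cost second treats latency as a hard requirement and cost as the objective, which matches the priority stated above.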
All of our experiments are run in the context of multiplayer online games, as the topic-based publish/subscribe paradigm fits well into the architectural model of such games. In addition, such games are a good example of highly distributed, latency-constrained systems. As running experiments in the cloud is a challenging task, this thesis provides, as an additional contribution, a set of tools that were developed to assist in running large-scale, highly distributed cloud-based experiments. Among these contributions is a full, reusable implementation of our Dynamoth platform, built according to software engineering principles.
In summary, we believe that this thesis provides novel contributions to the scalability of cloud-based topic-based publish/subscribe systems.