By Stephen Munnings | Dec 22, 2017
*In this article we are using "publish-subscribe" in the context of software architecture.
What is Publish-Subscribe?
Publish-subscribe is a messaging facility. It describes a particular form of communication between software modules or components. The name is chosen to reflect the most significant characteristics of this communication paradigm.
In simple communications, software modules communicate directly with each other using structures and media that are understood by all parties. As communication needs have become more complex or demanding, other communication architectures have evolved. Publish-subscribe is one of many such architectures.
The Central Ideas of Publish-Subscribe
- Software components do not necessarily know who they are communicating with.
- Producers of data publish that data to the system as a whole.
- Consumers of data subscribe to and receive data from the system as a whole.
- Information is labelled so that software modules can identify the available information. This label is often referred to as the topic.
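The central ideas above can be sketched in a few lines of Python. This is a minimal in-process illustration, not a real messaging system: the `Broker` class, its method names, and the topic strings are all hypothetical, and a production system would add networking, threading, and error handling.

```python
from collections import defaultdict
from typing import Any, Callable

Callback = Callable[[str, Any], None]


class Broker:
    """Hypothetical in-process stand-in for 'the system as a whole'."""

    def __init__(self) -> None:
        # Maps each topic label to the callbacks subscribed to it.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        """Register interest in a topic; the subscriber never names a publisher."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, data: Any) -> int:
        """Hand data to every current subscriber of the topic.

        Returns the number of subscribers that received it. The publisher
        does not know who those subscribers are.
        """
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(topic, data)
        return len(callbacks)


received = []
broker = Broker()
broker.subscribe("news/weather", lambda topic, data: received.append((topic, data)))

delivered = broker.publish("news/weather", "sunny")
ignored = broker.publish("news/sports", "nobody is listening to this topic")
```

The second `publish` call reaches no one, which illustrates that a publisher can send data without any consumer being present.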
Other Common Publish-Subscribe Ideas
Note: Not all publish-subscribe systems make use of these concepts.
A central software module takes care of managing and matching all the data, the publishing, and the subscribing. This module is commonly termed the “broker”. A broker may itself be a network of cooperating software modules. The software modules that use the services of the broker are referred to as clients.
Clients that publish and subscribe often “register” with the broker so that it can manage communication paths, authenticate clients, and handle other housekeeping tasks.
Message delivery to subscribers can be filtered by content as opposed to topic. Content filtering can be used instead of, or in conjunction with, the topic. Only a few publish-subscribe systems implement it.
Data can be “persistent”, meaning that the last published data for a particular topic will be available to subscribers that register on the system after the publisher last published the data.
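Content-based filtering, as described above, might look like the following sketch. It assumes messages are plain dictionaries and that each subscriber supplies an optional predicate; the `ContentBroker` class and the `celsius` field are hypothetical names chosen for illustration.

```python
from typing import Callable

Predicate = Callable[[dict], bool]
Handler = Callable[[dict], None]


def accept_all(message: dict) -> bool:
    """Default predicate: plain topic-based delivery with no content filter."""
    return True


class ContentBroker:
    """Hypothetical broker that filters by topic and then by message content."""

    def __init__(self) -> None:
        # Each entry is (topic, predicate, handler).
        self._subscriptions = []

    def subscribe(self, topic: str, handler: Handler,
                  predicate: Predicate = accept_all) -> None:
        self._subscriptions.append((topic, predicate, handler))

    def publish(self, topic: str, message: dict) -> None:
        for sub_topic, predicate, handler in self._subscriptions:
            # Deliver only when the topic matches AND the content passes the filter.
            if sub_topic == topic and predicate(message):
                handler(message)


everything = []
alerts = []
broker = ContentBroker()
broker.subscribe("sensors/temperature", everything.append)
broker.subscribe("sensors/temperature", alerts.append,
                 predicate=lambda message: message["celsius"] > 30)

broker.publish("sensors/temperature", {"celsius": 21})
broker.publish("sensors/temperature", {"celsius": 35})
```

Here the filter is used in conjunction with the topic: both subscribers share a topic, but only the second message passes the alert subscriber's content filter.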
Example: Home Heating System
Here is a small example of a system that could be built using publish-subscribe: a home heating system with a temperature sensor, a heating unit, and a central control unit.
The data (topics) are:
- Temperature in the house. Published by the temperature sensor and subscribed to by the central control unit. Has the value of the current in-house temperature.
- Heating Request. Published by the control unit and subscribed to by the heating unit. Has the value of either “On” or “Off”.
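The two topics above can be wired together as in the sketch below. It is an in-process illustration under several assumptions that are not part of the example itself: the topic strings, the 20 °C setpoint, and the simple "heat when below the setpoint" rule are all hypothetical.

```python
from collections import defaultdict

TEMPERATURE = "house/temperature"          # hypothetical topic name
HEATING_REQUEST = "house/heating_request"  # hypothetical topic name


class Broker:
    """Hypothetical in-process broker keyed by topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, value):
        for callback in list(self._subscribers[topic]):
            callback(value)


class ControlUnit:
    """Subscribes to the temperature topic and publishes heating requests."""

    def __init__(self, broker, setpoint_celsius=20.0):  # setpoint is an assumption
        self._broker = broker
        self._setpoint = setpoint_celsius
        broker.subscribe(TEMPERATURE, self.on_temperature)

    def on_temperature(self, celsius):
        request = "On" if celsius < self._setpoint else "Off"
        self._broker.publish(HEATING_REQUEST, request)


class HeatingUnit:
    """Subscribes to heating requests; knows nothing about the sensor."""

    def __init__(self, broker):
        self.state = "Off"
        self.history = []
        broker.subscribe(HEATING_REQUEST, self.on_request)

    def on_request(self, request):
        self.state = request
        self.history.append(request)


broker = Broker()
control = ControlUnit(broker)
heater = HeatingUnit(broker)

# The temperature sensor is just a publisher of readings.
for reading in (18.5, 19.0, 21.5):
    broker.publish(TEMPERATURE, reading)
```

None of the three components holds a reference to another; each one only knows the broker and the topic labels.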
Advantages of Publish-Subscribe
- Loose Coupling: Each client does not have to know about the other clients and can focus entirely on the data in question.
- Dynamic Network Topology: Individual clients can join and leave the system and their location in the network is not normally relevant.
- Scalability: The transport is usually TCP/IP, so a system can run locally on a single machine or be distributed across the cloud. Publish-subscribe does not magically solve data volume issues, but experience has shown that considerable data volume can be managed quite well. Any data volume improvements tend to be centralized in the broker modules.
- Existing Solutions: While not an advantage of the publish-subscribe system itself, there are a number of existing solutions, some of which are open-source, so developers do not need to rewrite the whole system from scratch. They can concentrate on the functions of the client and simply make use of the communications framework.
Disadvantages of Publish-Subscribe
The largest disadvantages are a side effect of the main advantage, loose coupling, and can generally be identified as message delivery issues.
- Client Presence: Since neither the publishers nor the subscribers know of the presence or absence of cooperating clients, it is possible for data to go missing unless mechanisms are implemented to handle this.
- Decreased Data Flexibility: Modifying the publisher and the structure of the published data is more difficult to coordinate.
- Data Volume Instabilities: Handling large amounts of data can cause network instability and related issues.
What You Need to Use Publish-Subscribe
- An underlying network that can be used for the communications.
- Access to a broker module or system – this can be local or external.
- Code that allows clients to communicate with the broker. Usually this takes the form of libraries that clients call through a documented API.
- All the other obvious things, such as the equipment to run the client software on, the required software development tools, etc.
Popular Publish-Subscribe Systems
- MQTT: This is an open, public protocol standard. A large selection of brokers exists, including open-source brokers (such as Eclipse Mosquitto) and licensed brokers (such as IBM WebSphere MQ). Some brokers are currently running in the cloud as public-hosted test servers.
- DDS: Another open, public protocol standard. However, this protocol is “peer-to-peer” and does not rely on broker services. A “brokered” implementation is possible using DDS but is not necessary.
- Amazon SNS: A paid commercial implementation managed by Amazon as part of the AWS ecosystem.
- XEP-0060: This is an extension to the XMPP protocol for generic publish-subscribe functionality. XMPP is a free open protocol standard with clients, servers and libraries available in the open-source domain.
Table 1: Comparison of publish-subscribe systems
Table 1 Notes
1) There are both free and commercial implementations.
2) While brokers can be created, the DDS implementation is for peer-to-peer communications with discovery mechanisms.
3) In the specification, all examples of using the protocol are shown as XML snippets.
Creating Your Own System
If you don’t want to use an existing system, you can roll your own. The advantage is that you can do it exactly the way you want to do it. The disadvantage is that it will probably take longer, and be more painful, to implement.
Common Options in Many Publish-Subscribe Systems
Almost all brokers use a client registration mechanism. This can be very simple or more complex. This facility is used for security (clients must pass authentication), for keeping track of subscriptions and published data (including purging of obsolete information), and other bookkeeping tasks.
When a publishing client loses connection with the broker, it can be accidental or deliberate. In some cases, subscribers still want to know the last published information on a particular topic. If a data persistence option is in effect, the broker will be able to forward the last published data on a particular topic to subscribers, even when the publisher is no longer connected to the system.
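One way a broker could keep the last published value available is sketched below. The `RetainingBroker` class and the topic name are hypothetical; real systems differ in how and when they retain data.

```python
class RetainingBroker:
    """Hypothetical broker that remembers the last value published per topic."""

    def __init__(self):
        self._subscribers = {}  # topic -> list of callbacks
        self._retained = {}     # topic -> last published value

    def publish(self, topic, data):
        # Remember the value first, so it outlives the publisher's connection.
        self._retained[topic] = data
        for callback in list(self._subscribers.get(topic, [])):
            callback(data)

    def subscribe(self, topic, callback):
        self._subscribers.setdefault(topic, []).append(callback)
        # A late subscriber immediately receives the last known value, if any.
        if topic in self._retained:
            callback(self._retained[topic])


broker = RetainingBroker()
broker.publish("house/temperature", 19.5)
broker.publish("house/temperature", 20.0)
# Assume the publisher disconnects here; the broker still holds 20.0.

late_inbox = []
broker.subscribe("house/temperature", late_inbox.append)
```

Only the most recent value is retained in this sketch, so the late subscriber sees 20.0 but not the earlier 19.5.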
When clients lose connection with the broker it can be a short-term communication problem. In some cases, there are mechanisms that allow a client to re-connect with the broker and re-establish the same session that was already in progress.
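Session re-establishment could work roughly as in the sketch below, assuming the broker keys each session on a client-chosen identifier and keeps the subscriptions when the connection drops. The `SessionBroker` class and the identifier `thermostat-1` are hypothetical, and this sketch deliberately does not queue messages published while the client is offline.

```python
class SessionBroker:
    """Hypothetical broker that keeps a client's subscriptions across reconnects."""

    def __init__(self):
        self._sessions = {}   # client_id -> set of topics (survives disconnects)
        self._connected = {}  # client_id -> callback (only while connected)

    def connect(self, client_id, callback):
        """Attach a client; return True if an earlier session was resumed."""
        resumed = client_id in self._sessions
        self._sessions.setdefault(client_id, set())
        self._connected[client_id] = callback
        return resumed

    def disconnect(self, client_id):
        # Drop the connection but keep the session state.
        self._connected.pop(client_id, None)

    def subscribe(self, client_id, topic):
        self._sessions[client_id].add(topic)

    def publish(self, topic, data):
        for client_id, topics in self._sessions.items():
            callback = self._connected.get(client_id)
            if topic in topics and callback is not None:
                callback(topic, data)


inbox = []


def on_message(topic, data):
    inbox.append((topic, data))


broker = SessionBroker()
first_connect_resumed = broker.connect("thermostat-1", on_message)
broker.subscribe("thermostat-1", "house/temperature")

broker.disconnect("thermostat-1")
broker.publish("house/temperature", 18.0)  # missed: client is offline

reconnect_resumed = broker.connect("thermostat-1", on_message)
broker.publish("house/temperature", 19.0)  # delivered without re-subscribing
```

After reconnecting, the client receives data again without repeating its subscription, because the broker recognised the identifier and reused the stored session.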
When you have loosely coupled systems depending on networked communications, there is the possibility of data being lost in transit. An often overlooked, but possibly critical, adjunct is some form of delivery guarantee. Some publish-subscribe systems have configurable delivery guarantee levels, commonly referred to as “Quality of Service”. In some systems a degree of data loss is acceptable; in others it can be fatal.
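The difference between two delivery guarantee levels can be illustrated with a sketch that contrasts a single best-effort send with a send that retries until it is acknowledged. Everything here is hypothetical: `FlakyLink` simulates a transport that loses a fixed number of messages, and the retry limit of five is an arbitrary assumption.

```python
from typing import Callable


class FlakyLink:
    """Hypothetical transport that drops the first `drops` messages it is given."""

    def __init__(self, receiver: Callable[[str], None], drops: int) -> None:
        self._receiver = receiver
        self._drops_left = drops

    def send(self, message: str) -> bool:
        """Return True if the receiver got the message (an acknowledgement)."""
        if self._drops_left > 0:
            self._drops_left -= 1
            return False  # lost in transit, no acknowledgement
        self._receiver(message)
        return True


def send_best_effort(link: FlakyLink, message: str) -> bool:
    """Send once and accept whatever happens."""
    return link.send(message)


def send_with_retry(link: FlakyLink, message: str, max_attempts: int = 5) -> int:
    """Resend until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if link.send(message):
            return attempt
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")


best_effort_inbox = []
send_best_effort(FlakyLink(best_effort_inbox.append, drops=2), "On")

retry_inbox = []
attempts_used = send_with_retry(FlakyLink(retry_inbox.append, drops=2), "On")
```

The best-effort message is lost, while the retried one arrives on the third attempt. In a real network the acknowledgement itself can be lost, in which case retrying produces duplicates that the receiver must tolerate or detect.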
Publish-subscribe is not a magic solution for every system. However, when it matches the needs of the system in question, it can be a very attractive tool in your arsenal. We hope that this article has helped you get a good high-level understanding of what publish-subscribe is and how it can meet your needs.