Designing Semantic Publish/Subscribe Networks using Super-Peers
Consider a P2P network which manages metadata about publications, and a user of this network, Bob, who is interested in the new publications of some specific authors, e.g., Koubarakis and Nejdl. With conventional P2P file sharing networks like Gnutella or Kazaa, this is difficult: queries that include either "Koubarakis" or "Nejdl" in the search string return all publications by these authors, and Bob has to filter out the new publications each time. With an RDF-based P2P network like Edutella, this is somewhat easier, because Bob can formulate a single query that combines a disjunction on the attribute dc:creator (i.e., dc:creator includes "Nejdl" or dc:creator includes "Koubarakis") with a constraint on the date attribute (i.e., dc:date > 2003), so that only publications from 2004 onwards are returned. Still, this is not quite what Bob wants, because each time he runs this query he gets all publications from 2004 onwards, including the ones he has already seen.
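The limitation of the one-shot query can be illustrated with a small sketch. It is not Edutella's query language or API; the names Publication, matches_query and run_query, as well as the sample catalog, are hypothetical and serve only to show that re-running the same query returns previously seen results again.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    """Hypothetical stand-in for a metadata record (dc:creator, dc:date)."""
    title: str
    creators: tuple
    year: int


def matches_query(pub, authors=("Nejdl", "Koubarakis"), after_year=2003):
    """Disjunction on dc:creator combined with the constraint dc:date > 2003."""
    has_author = any(a in c for c in pub.creators for a in authors)
    return has_author and pub.year > after_year


def run_query(catalog):
    """Evaluate the query against everything currently in the catalog."""
    return [p for p in catalog if matches_query(p)]


# Made-up sample data, for illustration only.
catalog = [
    Publication("Paper A", ("W. Nejdl",), 2004),
    Publication("Paper B", ("M. Koubarakis",), 2002),
    Publication("Paper C", ("Someone Else",), 2004),
]

first = run_query(catalog)
catalog.append(Publication("Paper D", ("M. Koubarakis",), 2004))
second = run_query(catalog)  # contains Paper A again, plus the new Paper D
```

In this sketch the second result repeats "Paper A", so Bob would still have to separate new results from old ones himself.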
What Bob really needs from his P2P file sharing network are publish/subscribe capabilities:
- Advertising: Peers send information about the content they will publish. For example, a Hannover peer announces that it will make available all L3S publications, including publications from Nejdl, and a Crete peer announces that it will do the same for Koubarakis' group.
- Subscribing: Peers send subscriptions to the network, defining the kind of documents they want to retrieve. Bob's profile would then express his subscription for Nejdl and Koubarakis papers. The network might store these subscriptions near the peers which will provide these resources, in our case near the Hannover and the Crete peer.
- Notifying: Peers notify the network whenever new resources become available. These resources should be forwarded to all peers whose subscription profiles match them, so Bob should regularly receive all new publications from Nejdl and Koubarakis.
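The three capabilities above can be sketched as a single in-memory object. This is a minimal illustration under strong assumptions, not the architecture of the paper: the class PubSubNetwork and its methods are hypothetical, matching is reduced to set intersection on author names, and the whole network is collapsed into one process.

```python
from collections import defaultdict


class PubSubNetwork:
    """Hypothetical, single-process model of advertise / subscribe / notify."""

    def __init__(self):
        self.advertisements = defaultdict(set)  # peer -> authors it announces
        self.subscriptions = defaultdict(set)   # subscriber -> authors of interest
        self.inbox = defaultdict(list)          # subscriber -> (peer, title) pairs

    def advertise(self, peer, authors):
        """A peer announces which authors' publications it will provide."""
        self.advertisements[peer].update(authors)

    def subscribe(self, subscriber, authors):
        """A peer registers the kind of documents it wants to receive."""
        self.subscriptions[subscriber].update(authors)

    def providers_for(self, subscriber):
        """Peers whose advertisements overlap the subscription, i.e. the
        places near which the subscription might be stored."""
        wanted = self.subscriptions[subscriber]
        return sorted(p for p, offered in self.advertisements.items()
                      if offered & wanted)

    def notify(self, peer, title, authors):
        """Forward a new resource to every subscriber whose profile matches."""
        for subscriber, wanted in self.subscriptions.items():
            if wanted & set(authors):
                self.inbox[subscriber].append((peer, title))


net = PubSubNetwork()
net.advertise("hannover", {"Nejdl"})
net.advertise("crete", {"Koubarakis"})
net.subscribe("bob", {"Nejdl", "Koubarakis"})
net.notify("hannover", "New L3S paper", {"Nejdl"})
net.notify("crete", "Unrelated paper", {"Someone Else"})
```

Here Bob receives only the resource that matches his profile, and he receives it once, at the time it is published, rather than on every repeated query.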
We assume two types of nodes: super-peers and peers. A peer is a typical network node that wants to advertise and publish its data and/or subscribe to data owned by others. A super-peer is a node with more capabilities than a peer (e.g., more CPU power and bandwidth). Staying online for long periods of time is another desirable property for super-peers. In our architecture, super-peers are organized in a separate network which we call the super-peer backbone and are responsible for processing notifications, advertisements and subscriptions. Peers connect to super-peers in a star-like fashion, providing content and content metadata. Each peer is connected to a single super-peer which is its access point to the rest of the network and its services. Once connected, a peer can disconnect, reconnect or even migrate to a different super-peer. Our super-peers are arranged in the HyperCuP topology. This is the solution adopted in the Edutella infrastructure because of its special characteristics regarding broadcasts and network partitioning.
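To give some intuition for why a hypercube-style backbone is attractive for broadcasts, the following sketch models the super-peers as the corners of a complete binary hypercube. This is a simplification and an assumption, not the HyperCuP protocol itself: node identifiers are plain integers, neighbors differ in exactly one bit, and a node that receives a message along dimension d forwards it only along dimensions higher than d. The function name broadcast is hypothetical.

```python
from collections import deque


def broadcast(origin, dims):
    """Simulate a broadcast in a complete binary hypercube with 2**dims nodes.

    Returns a dict mapping each reached node to the number of times the
    message arrived there, and the total number of messages sent.
    """
    received = {origin: 0}           # the origin creates the message itself
    queue = deque([(origin, -1)])    # (node, dimension the message arrived on)
    messages = 0
    while queue:
        node, via = queue.popleft()
        # Forward only along dimensions higher than the incoming one,
        # so that no node is reached twice.
        for d in range(via + 1, dims):
            neighbor = node ^ (1 << d)   # flip bit d to get the neighbor
            received[neighbor] = received.get(neighbor, 0) + 1
            messages += 1
            queue.append((neighbor, d))
    return received, messages


received, messages = broadcast(origin=5, dims=3)
```

Under these assumptions, a backbone of eight super-peers is covered with seven messages and every super-peer other than the origin receives the message exactly once.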
Who becomes a super-peer? As an example, super-peers can be centrally managed by a company that owns and runs the overlay to offer a service (e.g., a content provider). A more challenging design is one in which super-peers are normal peers that either volunteer to play the role of a super-peer for a time window (e.g., because they will receive a number of privileges in return) or are forced by the system to become super-peers periodically in order to be able to use the services of the overlay. This is an area where some interesting research has been carried out recently. One such line of work introduces the concept of altruistic peers, namely peers with the following characteristics: (a) they stay online for long periods and (b) they are willing to offer a significant portion of their resources to speed up the performance of the network. Although that work does not use the term super-peer directly, the concepts of super-peers and altruistic peers are related: one can view altruistic peers as one kind of super-peers in a P2P network.
