On Decentralized and Distributed Edges
This is the first part of a tech blog post series I’d like to start on the topic of decentralized and distributed IoT computing. At ThingForward we spend a lot of time writing protocol and communication software and firmware for small sensor devices, and IoT brings up an architecture model that differs in part from the one we find in web app architectures. We already held a webinar on this topic last month, and we’d like to investigate it a bit further.
So what’s the goal here? We’d like to build an IoT edge architecture pattern in which edge components (such as gateways) and cloud backend components are connected on the basis of a decentralized networking architecture, while forming a distributed application, a Dapp. But first we need to clarify what “decentralized” and “distributed” mean in terms of an IoT architecture model.
[Figure: several clients, all connecting to a single node labeled “Centralized Application”]
In a centralized application, all clients connect to a central node. This does not mean a single server, but a single endpoint, such as a DNS record that points to one or more IP addresses, where load balancers usually terminate the traffic and distribute it to the upstream hosts serving the client requests.
Within the application’s architecture there are typically many servers and services involved, so most apps are highly structured from an architecture perspective. But from a 10,000 ft internet view, that structure is hidden behind an API, and all clients use this API. Most transactional web applications are built this way. They are typically deployed across several availability zones to provide failover in case of an outage, because in this centralized model an outage at the API level affects all clients.
Centralized applications are easy to build, fast to deploy and straightforward to operate, because everything is located close together. But some applications are so large that they need to be spread across different locations.
[Figure: clients connecting to several separate nodes of a decentralized application]
In a decentralized application, the clients do not depend on the services of a single node or endpoint. Control can be spread across a number of nodes, and clients can connect to any of these nodes (or are redirected between sessions). One example of this is content distribution in a content delivery network. Clients connect to a node that is located nearby (in terms of network hops and speed), so that content is transferred faster than from a single, centralized master.
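To make the “connect to a nearby node” idea concrete, here is a minimal Python sketch. It is only an illustration under our own assumptions: the node names, the latency numbers and the `fake_probe` function are hypothetical stand-ins for a real measurement such as a ping or a timed connection attempt.

```python
from typing import Callable, Iterable

def pick_nearest_node(nodes: Iterable[str],
                      measure_latency_ms: Callable[[str], float]) -> str:
    """Return the node with the lowest measured latency."""
    latencies = {node: measure_latency_ms(node) for node in nodes}
    if not latencies:
        raise ValueError("no nodes available")
    # min() over the dict keys, ordered by their measured latency
    return min(latencies, key=latencies.get)

# Hypothetical measurements; a real client would probe the network instead.
SAMPLE_LATENCIES_MS = {"node-eu": 18.0, "node-us": 95.0, "node-asia": 210.0}

def fake_probe(node: str) -> float:
    """Stand-in for a real latency probe (assumption for this sketch)."""
    return SAMPLE_LATENCIES_MS[node]

nearest = pick_nearest_node(SAMPLE_LATENCIES_MS, fake_probe)
```

With the sample numbers above, the client would end up talking to `node-eu`. Real content networks usually make this decision on the DNS or routing level rather than in the client, so treat this purely as a mental model.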
Another example is the current state of IoT deployments. Fog devices such as small sensor and actuator nodes are in most cases not directly connected to the internet today. Being constrained devices, connected to constrained networks, they often do not include a TCP/IP stack. They connect by various methods (e.g. radio) to gateway nodes, which in turn do have TCP/IP and speak web protocols to backend services. So it’s a number of small devices talking to a gateway, and a number of gateways talking to a single cloud service: layered, decentralized.
One could say that all gateway nodes of a vendor’s product form “the edge” of this product. But these individual gateway nodes do not talk to each other, despite having the capabilities and capacity to do other sorts of computations or services. Letting them do so would lead to some sort of distributed application:
[Figure: nodes of a distributed application, connected directly to each other]
Distributed applications are on the rise because a basic computational challenge has been solved: how a number of nodes can elect one of themselves as a leader node and reach a consensus about a shared state. That state can be anything from a simple online/offline flag to a complete dataset.
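As a rough illustration of leader election and agreement on a shared state, here is a toy Python sketch. It is deliberately simplified and based on our own assumptions: every node already knows which peers are reachable, the highest node ID wins, and a value only counts as agreed when a strict majority of the whole cluster backs it. Production systems use full consensus protocols such as Raft or Paxos, which also deal with lost messages and conflicting views.

```python
from collections import Counter
from typing import Dict, Optional

def elect_leader(alive: Dict[int, bool]) -> Optional[int]:
    """Pick the highest reachable node ID, but only if a majority is reachable."""
    reachable = [node_id for node_id, is_up in alive.items() if is_up]
    if len(reachable) * 2 <= len(alive):
        return None  # no majority: refuse to elect, avoids a split brain
    return max(reachable)

def agree_on_state(votes: Dict[int, str], cluster_size: int) -> Optional[str]:
    """Return the value backed by a strict majority of the cluster, else None."""
    if not votes:
        return None
    value, count = Counter(votes.values()).most_common(1)[0]
    return value if count * 2 > cluster_size else None

# Hypothetical three-node cluster in which node 3 is unreachable.
leader = elect_leader({1: True, 2: True, 3: False})
state = agree_on_state({1: "online", 2: "online", 3: "offline"}, cluster_size=3)
```

In this example node 2 becomes the leader and the agreed state is "online". If only one of the three nodes were reachable, `elect_leader` would return `None` instead of electing a leader without a majority.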
Popular examples of distributed applications are blockchain-based technologies such as Bitcoin and Ethereum. Bitcoin is not a single application in the sense that one could ask where it is deployed or in which data center it is running. Bitcoin is a network, where ALL nodes are part of the computing model. Distributed applications are harder to maintain in terms of application changes and compatibility, but offer many advantages, e.g. in terms of fault tolerance (the impact of outages can be absorbed by the network). Additionally, it becomes virtually impossible to shut the whole thing down, or to take it over.
Blockchain-based cryptocurrencies are a world of their own, and we’re not going to look deeper into them. But the advantages and possibilities of truly distributed applications are promising. So what do we mean when speaking about the decentralized, distributed edge?
IoT Edge Computing
Regarding IoT architectures, we see roughly three types of components as players:
- Sensor Devices are constrained and barely have the capacity to fulfill their immediate purpose, that is detecting or triggering something with as little battery power as possible. They’re not (yet) capable of taking on computational or storage assignments in larger communication structures, although they will become more powerful over time.
- Gateways are typically mains-powered devices with capabilities comparable to servers, most of them smaller, some larger. They are connected to the internet (typically behind a NAT) and can take part in larger architectures. They form the “edge”, where functionality can be located (e.g. analytics and filtering, but also metering and possibly billing/transactions).
- Backend or Cloud services are scalable and powerful, at least as of now. Who knows what all of this will look like with billions of devices offloading their sensor data to the cloud? Moving functionality between edge gateways and cloud backends could therefore be a key factor in upcoming IoT architecture scenarios.
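To show what “moving functionality to the edge” could mean in practice, here is a small Python sketch of a gateway that reduces a batch of raw sensor readings to a compact summary before anything is sent to the backend. The `Reading` type, the field names and the alert threshold are assumptions made up for this example, not part of any existing product.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List

@dataclass
class Reading:
    sensor_id: str  # hypothetical sensor identifier
    value: float    # e.g. a temperature in degrees Celsius

def summarize_at_edge(readings: List[Reading], alert_above: float) -> dict:
    """Reduce raw readings to a small payload the gateway can forward."""
    if not readings:
        return {"count": 0, "mean": None, "alerts": []}
    values = [r.value for r in readings]
    return {
        "count": len(values),
        "mean": round(mean(values), 2),
        # Only sensors above the threshold are reported individually.
        "alerts": [r.sensor_id for r in readings if r.value > alert_above],
    }

batch = [Reading("s1", 20.0), Reading("s2", 22.0), Reading("s3", 42.0)]
summary = summarize_at_edge(batch, alert_above=40.0)
```

Instead of three raw readings, the backend only receives one summary with a count, a mean and the list of sensors that need attention. The same function could just as well run in the cloud, which is exactly the kind of flexibility we are after.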
We’d like to enable this and interconnect Gateways and Backends to form a private, distributed IoT application space, built upon the decentralized IoT architecture of IPv4 and IPv6 networks that are in place now. What components should it contain?
- First of all we’d like to have private networking. Nodes should be able to connect on a peer-to-peer basis and thus form the basis for any distributed application.
- Distributed Asynchronous Messaging should allow applications on these nodes to publish data to others as well as subscribe to events coming from other nodes.
- File Exchange would be nice, using some form of shared file system.
- Using a distributed database (SQL or NoSQL), applications can manage structured data.
- When it comes to transactional data, a Distributed Ledger (a.k.a. Blockchain) would be great. It does not need to be suitable for cryptocurrency or mining etc., but we believe that cryptographically secured ledgers are going to be an inherent part of future IoT architectures.
- When it comes to applications, we’d like to have a distributed compute model, e.g. delivered by containers, ideally combined with membership and failure detection.
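To give a first idea of what “cryptographically secured ledger” means, here is a minimal Python sketch of a hash-chained, append-only list. Each entry stores the SHA-256 hash of its predecessor, so changing an old entry invalidates the chain. This only demonstrates tamper evidence on a single node; it contains no networking, no consensus and no mining, and the payload fields are made up for the example.

```python
import hashlib
import json
from typing import List

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first entry

def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the payload together with the previous entry's hash."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def append_entry(ledger: List[dict], payload: dict) -> None:
    """Append a payload, linking it to the current tail of the ledger."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    ledger.append({"prev": prev_hash, "payload": payload,
                   "hash": entry_hash(prev_hash, payload)})

def verify(ledger: List[dict]) -> bool:
    """Recompute every hash and check that the links are intact."""
    prev_hash = GENESIS_HASH
    for entry in ledger:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True

ledger: List[dict] = []
append_entry(ledger, {"meter": "gw-1", "kwh": 3})  # hypothetical metering data
append_entry(ledger, {"meter": "gw-2", "kwh": 5})
```

Calling `verify(ledger)` returns `True` for the untouched ledger. As soon as someone changes a value in an earlier entry, the recomputed hash no longer matches and `verify` returns `False`.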
How do we start? In the upcoming blog posts we’re going to build this solution, hopefully with all of the above parts. Expect much code and command line fu :) Looking forward to it!