The world and its dog have gone crazy for blockchain (and related technologies, which is a whole other post). The past two years have been a hype roller coaster for the technology, with stories of equally insane valuations, technical proposals, media exposure, regulatory nightmares, frauds, and unrealized dreams.
But what is the technology, and why could it be of interest to developers like yourselves?
Before I get started, speaking about "blockchain" is like speaking about "programming" in that there are multiple differences in paradigms, approaches, and patterns, most of which their communities are still defining. In some ways, this is why I find the technology so compelling, but it makes it hard to define the technology completely.
For simplicity, I use Ethereum as the reference technology in this post, because it is more interesting to developers, sparked the broader use of blockchain beyond Bitcoin, and other approaches use it, replicate it, or compete with it.
Blockchain's Basic Definition
My absolute basic, abstract definition of blockchain is:
A cryptographically secure distributed ledger of transactions that is generally immutable and keeps all instances in the cluster informed of changes.
I know people will want to dispute this, change it, and add to it, but I'd like to stick with broad strokes right now.
Let's take a quick tangent as to what "cryptographically secure" means before continuing. All blockchains use public-key cryptography to sign transactions, associate them with an origin, and achieve consensus on the state of the network.
Blockchains also often use a chain of hashes to bundle transactions for extra security and some efficiency improvements. The hash chains are then written to blocks that contain a reference to previous blocks to maintain a reliable sequence of events.
The exact implementation of this process varies wildly from protocol to protocol. For Ethereum, you can read the yellow paper for more details.
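To make the "reference to previous blocks" idea concrete, here is a minimal sketch in Python. It is a toy, not Ethereum's block format: the field names are made up for illustration, and SHA-256 stands in for the Keccak-256 hash that Ethereum uses.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents deterministically (SHA-256 as a stand-in)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def add_block(chain: list, transactions: list) -> dict:
    """Append a block that records the hash of the block before it."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    block = {
        "index": len(chain),
        "previous_hash": previous,  # hypothetical field name
        "transactions": transactions,
    }
    chain.append(block)
    return block


def is_valid(chain: list) -> bool:
    """Check that every block still points at the real hash of its predecessor."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
add_block(chain, ["alice pays bob 5"])
add_block(chain, ["bob pays carol 2"])
add_block(chain, ["carol pays dave 1"])
print(is_valid(chain))  # True

# Rewriting history in an early block breaks the link from the next block.
chain[0]["transactions"] = ["alice pays bob 500"]
print(is_valid(chain))  # False
```

Changing any old block changes its hash, so every later block would have to be rewritten too. That is where the "generally immutable" part of the definition comes from.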
Comparing Blockchain to Distributed Systems
For those of you with experience in distributed systems and especially distributed databases, this definition might sound somewhat familiar. I am often surprised how little the blockchain community is aware of fundamental distributed system concepts, thinking that everything they are doing is something new and original, but hey, that's technologists for you!
Where things start to differ is in the definition of a cluster. In "traditional" distributed systems, the time-honored formula of 2n+1 applies in defining how many nodes to add to a cluster (a cluster of 2n+1 nodes can lose n of them and still have a majority), and getting the right number takes time. Too few and the cluster may not cope with demand; too many and replicating data across the cluster can be too slow.
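As a rough illustration of the 2n+1 rule, assuming a simple majority-quorum cluster where nodes fail by crashing (the function names here are my own, not from any particular system):

```python
def nodes_needed(tolerated_failures: int) -> int:
    """Cluster size required to survive the given number of failed nodes."""
    return 2 * tolerated_failures + 1


def quorum(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes can fail while a majority is still reachable."""
    return (cluster_size - 1) // 2


for size in (3, 4, 5):
    print(size, quorum(size), tolerated_failures(size))
# 3 2 1
# 4 3 1
# 5 3 2
```

Note that going from three nodes to four buys no extra fault tolerance, which is one reason odd cluster sizes are the norm.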
With the Ethereum blockchain, there is only one cluster (well, actually four, but the others are test networks). When you write and read transactions to it, they are replicated across the entire network, typically around 20,000 nodes at the time of writing.
Herein lies most of the point of blockchain technologies: by having all nodes and data public, you create a decentralized network with mutual trust in that data, and a consensus algorithm (which varies from protocol to protocol) that determines whether a submitted transaction is valid and should subsequently be written.
This also leads to the biggest problem with many blockchain protocols (but not all), in that large-scale replication makes transactions slow. There are numerous supplements to existing blockchain protocols and entirely new protocols that look to solve this, but it's not entirely solved and production-tested yet.
While there are dozens of super interesting projects built on blockchain (often Ethereum, or something like it) that look to challenge the entire contemporary computing stack (I wrote another post on this subject), until the community solves this speed and scalability issue, they are mostly experiments and proofs of concept.
Public Versus Private Blockchains
There is another option to consider, and that's private blockchains, with Hyperledger being one of the most mature and established, though there are others. These protocols offer you many of the features of a public blockchain (decentralized consensus, for example), but let you control who runs instances, where they run, and the number of instances. Private blockchains are somewhat controversial in the community: in many opinions they go against the "point," while others believe that they're too much like a conventional distributed system to have any point.
Still, the decentralized consensus nature versus the conventional account and access setup is potentially compelling.
With all this aside, what does the Ethereum blockchain consist of? Quite a lot actually, which makes it more complex/interesting, depending on your perspective.
A simple illustration is hard to come by, but I recommend you take a look at (and be overwhelmed by) this comprehensive image created by LeeJThomas. The fundamental components to gain an overview, in my mind, are:
Solidity smart contracts: The 'magic sauce' that Ethereum introduced to the world. These allow you to run simple applications attached to the blockchain. Other blockchain protocols have their own smart contract languages; this repository has a good summary.
The Ethereum Virtual Machine (EVM): The contracts are compiled into bytecode, which the EVM reads and executes. The EVM is sandboxed and isolated from the host machine.
Swarm and Whisper: Working with the EVM, Whisper provides communication channels between applications running on the network (called DApps), and Swarm provides storage for the application code and any data written by an application.
Integration tools: There are a handful of widely adopted tools for handling migration and deployment of smart contracts. Due to the complete immutability of transactions and contracts, you can't update a contract once you have deployed it. There are a variety of techniques to work around this limitation, but an interesting aspect of blockchain programming is that there's much more impetus to get it right the first time around.
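To give a feel for what "compiled into bytecode, which the EVM reads and executes" means, here is a toy stack machine in Python. The EVM is stack-based, but everything else here is a simplification: the opcode names and the list-based "bytecode" are hypothetical stand-ins, not real EVM encoding.

```python
def run(bytecode: list) -> list:
    """Execute a tiny instruction list and return the final stack."""
    stack = []
    pc = 0  # program counter: index of the instruction being executed
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == "PUSH":
            pc += 1  # the value to push follows the opcode
            stack.append(bytecode[pc])
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "STOP":
            break
        else:
            raise ValueError(f"unknown opcode: {op!r}")
        pc += 1
    return stack


# (2 + 3) * 4, written the way a compiler might emit it for a stack machine.
program = ["PUSH", 2, "PUSH", 3, "ADD", "PUSH", 4, "MUL", "STOP"]
print(run(program))  # [20]
```

The interpreter only ever touches its own stack, which is the sandboxing idea in miniature: the program has no way to reach the host machine unless the virtual machine offers an instruction for it.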
Another interesting aspect of Ethereum and other blockchains is how you pay for access. Every smart contract deployed to the main Ethereum network consumes "gas" when it runs, again showing that efficiency and "getting it right" are paramount.
You can think of this cost as somewhat equivalent to paying for services on your cloud host of choice. It's hard to make a direct cost comparison, but while cloud providers don't always encourage efficient usage, the blockchain does; your code directly relates to cost.
To gain gas, you need ether, the token of the Ethereum network, which you can buy or trade from others, or you provide computing resources (mining) to the network and gain ether for doing so.
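The arithmetic behind "your code directly relates to cost" can be sketched as gas used multiplied by the price per unit of gas. The 21,000 gas figure is the base cost of a plain ether transfer; the 20 gwei gas price and the 50,000 gas contract call are assumed values for illustration only, and the sketch ignores the finer details of how the network sets prices.

```python
WEI_PER_GWEI = 10**9    # gas prices are usually quoted in gwei
WEI_PER_ETHER = 10**18  # wei is the smallest unit of ether


def fee_in_wei(gas_used: int, gas_price_gwei: int) -> int:
    """Transaction fee: units of gas consumed times the price per unit."""
    return gas_used * gas_price_gwei * WEI_PER_GWEI


def wei_to_ether(wei: int) -> float:
    """Convert wei to ether for display purposes."""
    return wei / WEI_PER_ETHER


GAS_PRICE_GWEI = 20  # assumed price, it changes with network demand

simple_transfer = fee_in_wei(gas_used=21_000, gas_price_gwei=GAS_PRICE_GWEI)
contract_call = fee_in_wei(gas_used=50_000, gas_price_gwei=GAS_PRICE_GWEI)  # hypothetical usage

print(simple_transfer)                # 420000000000000
print(wei_to_ether(simple_transfer))  # 0.00042
print(contract_call > simple_transfer)  # True
```

Every extra operation a contract performs adds to its gas usage, so a less efficient contract costs its callers more on every single call.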
What Does Blockchain Mean to Developers?
This post was a brief and broad introduction to the blockchain, and I promised to say why developers should care about it.
Some fans and critics say that the blockchain space right now is like the nascent internet days in the late '90s. Everyone is competing for a half-baked, over-hyped idea that few understand, think is needed, or think is possible. Anyone who is old enough to remember the dot-com bubble of the late '90s and its eventual burst may remember the number of promising and dumb ideas that failed, leaving in their wake a mixture of much better ideas and ideas that managed to survive and define the modern internet.
Time will tell if the same happens with blockchain, and thankfully we are slowly filtering the noise of the ICO madness of late 2017 in time to readdress some of the original ideals of blockchain. These ideals were that too much power and influence lay in the hands of too few, and by decentralizing as much as possible, we create the internet that everyone envisioned in the first place.
Again, some critics say that some blockchain protocols and networks are way more centralized than they allude to and that it's already too late.
But hey, if some of the ideas in this article encouraged you to try to work on something different, then find a project that interests you and get involved. These are early days, and everyone can still play a part.
Thanks to Júlio Santos for reviewing this post for me.