What is a Blockchain Node Provider? Why Do I Need One?

Deric Cheng
October 12, 2021
Technical

If you're new to blockchain development, there are plenty of unfamiliar concepts around nodes and the role they play in your blockchain stack. What is a blockchain node? Why is running your own Ethereum node difficult? What is a node provider and why do I need one? What are the differences between providers such as Infura, Alchemy, and Quicknode?

Believe me, we've been there. It's super confusing. Here's a quick rundown of what you need to know.

What is a node in blockchain?

Let's start from the basics! A node is essentially a program running on a single computer that allows you to connect with the rest of the blockchain network. It connects with other nodes to send information back and forth, checks that transactions sent between people are valid, and stores important information about the state of the blockchain.

In particular, one of the peculiarities of a blockchain network is that the network is essentially composed of only nodes: that is, the physical hardware running a blockchain such as Ethereum or Bitcoin is just the collection of all the nodes around the world being run by individual people. There's no master server or single source of truth - that's why it's decentralized!

Finally, it's important to note that there's no way to access the information on a blockchain without using a node. Can't be done. Think of it like the browser for the blockchain.

The “blockchain” is just a collection of computers (nodes) run by individuals, collectively taking part in verifying the state of this blockchain following a certain set of rules.

You interact with a node by sending requests to and receiving responses from it via an API. For instance, you can send a request like the following, assuming you're running a node on your computer on port 8545:

-- CODE language-js line-numbers -- curl localhost:8545 -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":0}'

Try it out online using the Alchemy Composer! This request asks your node to return the latest block number - that is, the number of the most recently produced block on the network - by calling the eth_blockNumber method. Here's an example response:

-- CODE language-js line-numbers -- { "jsonrpc": "2.0", "id": 0, "result": "0xa1c054"}

As you can see, the most recent block in this case is 0xa1c054, which translates to block 10600532 in decimal form.
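
Ethereum nodes return quantities like this as hexadecimal strings, so you'll usually convert them before showing them to anyone. Here's a minimal sketch of that conversion in JavaScript; `buildRequest` and `parseBlockNumber` are illustrative helper names, not part of any library, and the response object is simply the example from above:

```javascript
// Build the same JSON-RPC payload used in the curl example above.
function buildRequest(method, params = [], id = 0) {
  return { jsonrpc: "2.0", method, params, id };
}

// Convert the hex-encoded "result" field into a decimal number.
function parseBlockNumber(response) {
  if (response.error) {
    throw new Error(`RPC error: ${response.error.message}`);
  }
  return parseInt(response.result, 16);
}

const request = buildRequest("eth_blockNumber");
const response = { jsonrpc: "2.0", id: 0, result: "0xa1c054" };

console.log(JSON.stringify(request));
console.log(parseBlockNumber(response)); // 10600532
```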

Why is running a node difficult?

There are a few things that make developing against your own node particularly annoying. Let's cover some of the reasons:

Nodes take a long time to set up - up to weeks!

The bane of any developer is spending a lot of time setting up for a tool that doesn't directly contribute to what they're trying to build, and nodes are among the worst offenders.

There are typically two main categories of nodes - light nodes and full nodes. Light nodes sync only the block headers and rely on full nodes to answer many queries, while full nodes keep the entire state of a blockchain - every transaction that's ever been created. Most queries work with light nodes, but full nodes are the backbone of the blockchain - they're necessary to serve most information.

Light nodes have gotten simpler to run over time, but they still require installing the node program, setting configuration variables, downloading block headers, and checking ports and health to ensure they're running properly. If you can get your first light node running in under 30 minutes, ping us on Discord and we'll get you a custom badge for your efforts 👏👏.

Full nodes are even worse: the biggest issue is that a full node needs to download every block from block 0 to the latest, starting from scratch, and replay every block and transaction ever submitted by anybody. For Ethereum mainnet, that's over 10 million blocks and on the order of billions of transactions. That can literally take weeks of syncing.

Note: In Ethereum, there's one more type of node, the archive node, which is useful for historical lookups. We won't cover archive nodes in depth here.

Nodes have to be managed - by you!

Get ready for the dev-ops project from hell. Just a quick overview:

  • Nodes need to be upgraded every few weeks, and occasionally rebuilt from scratch in the case of hard forks and node client upgrades.
  • Most nodes weren't designed with reliability in mind: certain queries (such as eth_getLogs) can involve running through millions of blocks and transactions, and often time out or crash a node - we call them "queries of death". So you'll have to keep a close eye on the health of your node and wake up to debug it at 3 AM quite often.
  • Individual nodes can fall behind the network for various reasons - peering and connection issues, getting stranded on outdated branches, issues with internal state. If they’re behind, your users will get served stale data but not realize it - which can be a dangerously bad experience.
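
One way to catch the stale-data problem is to compare the block height each node reports and flag any node that trails the highest one. The sketch below is illustrative only: `findLaggingNodes`, the node names, and the `maxLag` threshold of 3 blocks are all assumptions, and a real setup would collect these heights by calling `eth_blockNumber` on each node.

```javascript
// Given a map of node name -> hex block number (as returned by
// eth_blockNumber), return the nodes that trail the highest reported
// block by more than maxLag blocks. maxLag = 3 is an arbitrary example.
function findLaggingNodes(heightsByNode, maxLag = 3) {
  const heights = Object.entries(heightsByNode).map(([name, hex]) => ({
    name,
    height: parseInt(hex, 16),
  }));
  const tip = Math.max(...heights.map((n) => n.height));
  return heights
    .filter((n) => tip - n.height > maxLag)
    .map((n) => ({ name: n.name, behindBy: tip - n.height }));
}

// Hypothetical readings from three nodes.
const lagging = findLaggingNodes({
  "node-a": "0xa1c054",
  "node-b": "0xa1c053",
  "node-c": "0xa1c040",
});
console.log(lagging); // only node-c, which is 20 blocks behind
```

Keep in mind that the highest block among your own nodes is only a proxy for the real network tip: if all of your nodes stall together, a check like this won't notice.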

Scaling past a single node is tricky!

A single node is fine and dandy when you're building a personal project (even if it does keep on crashing on you intermittently). But what happens when you can't make your node server large enough to keep up with the requests you're sending it?

"I'll just run two nodes and set up a load balancer between them!" you might suggest. That's what we thought too! Unfortunately, this setup is actually super difficult to keep consistent, because different nodes "see" the latest state of the blockchain in slightly different ways, leading to inconsistent data and other user-facing problems.

Picture this: we have two nodes syncing separately behind a load balancer. Node A thinks that the latest block is block 5, and Node B thinks it's block 4. This is a totally normal situation - because the most recent information propagates through the network slowly, some nodes will always be ahead of others.

You: Hey Mr. Load Balancer, what's the latest block you see?

Mr. LB: (Sends request to Node A and returns the response): The latest block on the network is block 5.

You: Thanks! Can you share with me the information in block 5?

Mr. LB: (Sends request to Node B this time): Sorry, I don't know about block 5. Please try again.

In a real-world situation, imagine a user buying an NFT on your app. They might purchase an NFT while sending requests to Node A - but when their queries start going to Node B, it might appear as if their transactions never happened! Consistency issues like this are pervasive and very tricky to solve - especially when you scale to dozens of nodes.
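
The conversation with Mr. Load Balancer can be reproduced with a small simulation. This is a toy model, not real node software: `makeNode` and `makeRoundRobinBalancer` are hypothetical helpers, and each "node" is just an object that knows about blocks up to some height.

```javascript
// A toy node that only knows blocks up to `latest`.
function makeNode(name, latest) {
  return {
    name,
    blockNumber: () => latest,
    getBlock: (n) => (n <= latest ? { number: n, servedBy: name } : null),
  };
}

// A round-robin balancer: each call goes to the next node in turn.
function makeRoundRobinBalancer(nodes) {
  let next = 0;
  return (call) => {
    const node = nodes[next];
    next = (next + 1) % nodes.length;
    return call(node);
  };
}

const nodeA = makeNode("A", 5); // has already seen block 5
const nodeB = makeNode("B", 4); // one block behind
const send = makeRoundRobinBalancer([nodeA, nodeB]);

const latest = send((node) => node.blockNumber()); // routed to Node A
const block = send((node) => node.getBlock(latest)); // routed to Node B

console.log(latest); // 5
console.log(block); // null, because Node B has not seen block 5 yet
```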

What is a node provider?

Node providers are essentially teams (like us!) that offer a way to access the information on a blockchain without having to run your own node! In essence, instead of sending your requests to a local node, you send them over the internet to a provider that offers an identical API and runs fully synced, up-to-date nodes that are available 24/7.

If you remember the eth_blockNumber request above, this is what the same request looks like when sent to a provider:

-- CODE language-js line-numbers -- curl https://eth-mainnet.alchemyapi.io/v2/your-api-key -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":0}'

We just swapped the endpoint! No other changes are necessary.
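
The same swap works from application code. In the sketch below, `makeRpcClient` is a hypothetical helper rather than a library function, and the transport is injected so the example runs without a network connection: `fakeTransport` stands in for a real HTTP POST and always returns the canned response from earlier.

```javascript
// Create a tiny JSON-RPC client bound to one endpoint URL.
// `transport(url, payload)` must return a Promise of the parsed response.
function makeRpcClient(url, transport) {
  let id = 0;
  return async function call(method, params = []) {
    const payload = { jsonrpc: "2.0", method, params, id: id++ };
    const response = await transport(url, payload);
    if (response.error) throw new Error(response.error.message);
    return response.result;
  };
}

// Stand-in transport that records the URL and returns a canned answer.
const seenUrls = [];
async function fakeTransport(url, payload) {
  seenUrls.push(url);
  return { jsonrpc: "2.0", id: payload.id, result: "0xa1c054" };
}

// Only the URL differs between the local node and the provider.
const local = makeRpcClient("http://localhost:8545", fakeTransport);
const hosted = makeRpcClient(
  "https://eth-mainnet.alchemyapi.io/v2/your-api-key",
  fakeTransport
);

const results = Promise.all([
  local("eth_blockNumber"),
  hosted("eth_blockNumber"),
]);
results.then((values) => console.log(values, seenUrls));
```

In a real app, the transport would POST the payload as JSON to the URL and parse the JSON reply; everything else in the client stays the same whichever endpoint you point it at.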

A solid node provider will offer, at the very minimum:

  • Access to regularly updated light and full nodes, with alerts, so you don't have to worry about forks or network changes.
  • Access to archive nodes for historical transaction data (only Alchemy does this for free!)
  • Scalability and reliability: nodes should be available whenever you want, as much as you want them.
  • Consistency: providers should handle tricky edge cases like the latest-block problem above. This is oftentimes an issue when using Infura or other providers.

I'm running my dApp locally and it works great! Why do I need a node provider?

Until you're ready to send traffic to a public testnet or mainnet, you don't need a node provider! A local version of the blockchain for testing (as provided by Hardhat or Truffle / Ganache) is all you need to build and test your project.

Once you want to deploy your application to a live chain, node providers become a critical piece of your development workflow.

First, you'll need a way to deploy your smart contract via a transaction to the blockchain - you can only do that via a node on the blockchain. That means running your own node, or sending your transaction to a provider.

Second, your application will likely need to continue pulling information about the blockchain to update its internal state. That information also goes through a node or node provider - and you'll want that channel to be reliable and synced properly so that you're not serving stale or broken data to users.

What is Alchemy and how does it differ from other node providers?

If you've gotten to this point, you've picked up the gist of why we exist! We're essentially a blockchain node provider with extremely high reliability, fantastic customer support, and lots of developer tools. Those three features are critical to our enterprise customers, and they're the reason 70% of the top apps in blockchain send their traffic through us.

If you're interested in how Alchemy's Supernode infrastructure solves consistency issues like the ones described above, you can learn more here.

There are several additional things that set us apart from other node providers in the space:

  • Developer Tools: How do you know what requests your users are sending? If their requests fail, how can you view / debug them? What's going on with transactions you've sent that are waiting to be mined? We provide a number of tools in our Alchemy Dashboard that allow you to analyze traffic on your dApp that otherwise would involve trawling pages of logs.
  • Push Notifications: What do you do if you want to be alerted whenever an Ethereum user you're following (for example, Vitalik Buterin) makes a transaction? You could write a script to read every block and search for a specific address and run it 24/7 - or you could use a tool like Alchemy Notify, a proprietary tool that sends push notifications (webhooks) for events on the blockchain.
  • Enhanced APIs: What if you want to search for all transactions made by a single ETH user? Though this might be simple in a SQL database, it's prohibitively difficult in blockchain - you basically need to scan every transaction in the blockchain (again, in the billions on ETH mainnet!) to see if it includes one address. We've built several Enhanced APIs that allow you to make this query and other similar queries instantaneously.
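
To see why the address lookup is so expensive without an index, here is a sketch of the naive scan described in the last two bullets. The blocks and transactions are simplified, made-up data, and `findTransactionsFor` is a hypothetical helper; the point is that the work grows with every transaction in every block you scan.

```javascript
// Naively scan every transaction in every block for one address.
// Addresses are compared case-insensitively, since hex addresses may
// appear in mixed case.
function findTransactionsFor(address, blocks) {
  const target = address.toLowerCase();
  const matches = [];
  for (const block of blocks) {
    for (const tx of block.transactions) {
      const from = tx.from.toLowerCase();
      // Contract-creation transactions have no `to` address.
      const to = tx.to ? tx.to.toLowerCase() : null;
      if (from === target || to === target) {
        matches.push({ blockNumber: block.number, hash: tx.hash });
      }
    }
  }
  return matches;
}

// Made-up sample data for illustration only.
const blocks = [
  { number: 1, transactions: [{ hash: "0x01", from: "0xAAA", to: "0xBBB" }] },
  { number: 2, transactions: [{ hash: "0x02", from: "0xCCC", to: null }] },
  { number: 3, transactions: [{ hash: "0x03", from: "0xBBB", to: "0xaaa" }] },
];

console.log(findTransactionsFor("0xaaa", blocks));
```

With three blocks this is instant; with the billions of transactions on ETH mainnet, every lookup would repeat the full scan unless something has indexed the data ahead of time.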

How do I get started?

Using Alchemy as a node provider is insanely simple - in fact, it should only be a single line of code to get started! If you've been using web3.js or ethers.js, it's as simple as creating a free Alchemy account, generating an API key, and replacing the instantiation with something like this:

-- CODE language-js line-numbers -- const { createAlchemyWeb3 } = require("@alch/alchemy-web3"); const web3 = createAlchemyWeb3("https://eth-mainnet.alchemyapi.io/v2/your-api-key");

If you'd like a full tutorial, check out our Getting Started With Alchemy documentation here!

And finally, we're always available to help 24/7 on our Alchemy Discord. Stop by and say hi - we'd love to help you on your journey to becoming a full-fledged blockchain developer!


