1. What we actually do
Think of blockchain as a massive distributed database where anyone can write data, yet everyone needs to read it quickly and consistently. Our team makes these enormous datasets (millions of transactions spread across terabytes of blockchain data) efficiently accessible to users. We deliver this data as promptly and consistently as possible while continuously optimizing for performance.
- In short: We build and scale the infrastructure that powers blockchain development for thousands of engineers and millions of end-users.
- Our team ensures that blockchain data—both historical and real-time—is processed, enhanced, and made instantly accessible through our specialised data pipeline.
- Think of us as creating the reliable foundation that developers need to build stable applications in the often unpredictable world of blockchain.
- Our systems handle real-time, high-throughput demands and petabytes of data, operating with multi-region availability and millisecond latency goals.
- Concrete examples:
- We recently overhauled our data processing pipeline, roughly halving the time it takes for fresh blockchain data (i.e. the most recently mined blocks) to become visible and available to our customers.
- To store and serve blockchain data efficiently, we rely on our proprietary multi-layer storage system. One of our latest efforts was migrating a significant portion of this blockchain data into a new storage layer, which cut our cloud costs substantially whilst also improving performance for serving data.
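To make the multi-layer idea concrete, here is a minimal sketch of a tiered read path: a fast, expensive layer is tried first, with fallback to a cheaper, slower layer and promotion on access. The layer names and in-memory dicts are illustrative assumptions, not our actual storage system.

```python
# Hypothetical sketch of a multi-layer (tiered) storage read path.
# "hot" and "cold" are stand-ins for real backends (e.g. SSD cache
# vs. object storage); the dict-based implementation is illustrative.

class TieredStore:
    def __init__(self):
        self.hot = {}   # fast, expensive layer for recent blocks
        self.cold = {}  # cheap, slower layer for historical blocks

    def put(self, block_number, data, recent=False):
        if recent:
            self.hot[block_number] = data
        else:
            self.cold[block_number] = data

    def get(self, block_number):
        # Read-through: serve from the hot layer when possible,
        # otherwise fetch from cold storage and promote on access.
        if block_number in self.hot:
            return self.hot[block_number]
        data = self.cold.get(block_number)
        if data is not None:
            self.hot[block_number] = data  # promote for future reads
        return data

store = TieredStore()
store.put(100, b"header:100", recent=True)
store.put(1, b"header:1")
assert store.get(1) == b"header:1"
assert 1 in store.hot  # promoted after the first cold read
```

The cost win in such a design comes from keeping only access-heavy data on the expensive layer while the long tail of historical blocks lives on cheaper storage.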
2. What problems we solve and why they’re hard (and fun)
- Consistency, scalability, and low latency: balancing all three at once
- Handling failovers automatically in systems that cannot go down
- Building distributed systems that meet banking-level data integrity standards, but with the flexibility of Web3
- Optimizing performance for a wide variety of workloads and patterns of reading blockchain data
- Tail latency is our sworn enemy, and we battle it daily
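Why tail latency rather than average latency? A quick sketch shows how a handful of slow requests barely moves the mean but dominates p99, which is what customers actually feel. The numbers below are made up for illustration.

```python
# Illustrative: averages hide the tail. 98 fast requests plus 2 slow
# outliers leave the mean low, while p99 captures the outliers.
import statistics

def percentile(samples, p):
    # Simple nearest-rank style percentile for the sketch.
    ordered = sorted(samples)
    idx = min(len(ordered) - 1, int(round(p / 100 * (len(ordered) - 1))))
    return ordered[idx]

latencies_ms = [5] * 98 + [250, 400]  # hypothetical request latencies
mean = statistics.mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
# mean stays near 11 ms, p50 is 5 ms, but p99 jumps to 250 ms
```

This is why we track and attack high percentiles directly instead of relying on averages.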
Example challenges:
- Our data processing pipeline enhances blockchain data for customers, but here's the challenge: when one component slows down, everything downstream feels it. This means customers see delays in their blockchain data - not acceptable.
Even though blockchain data is inherently sequential, we built parallel synchronization mechanisms that kick in when needed. This lets us process new blocks simultaneously when our customer-facing layers need to catch up, keeping data flowing smoothly even during processing hiccups.
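The core trick described above is separating parallel work from ordered commit: blocks can be enriched concurrently as long as results are handed to the customer-facing layer strictly in block order. Here is a minimal sketch of that pattern; the function names and the enrichment stub are hypothetical, not our pipeline's API.

```python
# Hedged sketch: process blocks in parallel, commit in order.
# ThreadPoolExecutor.map yields results in input order even though
# the work itself runs concurrently, so downstream readers never
# observe a gap in the block sequence.
from concurrent.futures import ThreadPoolExecutor

def enrich(block_number):
    # Stand-in for the real, IO/CPU-heavy enrichment of a raw block.
    return {"number": block_number, "enriched": True}

def catch_up(start, end, workers=8):
    committed = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for block in pool.map(enrich, range(start, end)):
            committed.append(block["number"])  # strictly in-order commit
    return committed

assert catch_up(100, 110) == list(range(100, 110))
```

The same shape works with any executor or queue: parallelism for throughput during catch-up, a single ordered commit point for consistency.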
- We recently faced an interesting scaling problem when one customer's legitimate use case - a spike in requests for blockchain logs - started impacting others. These resource-intensive operations were increasing latency across different workloads.
After a quick temporary fix, we took a step back and reimagined our storage architecture. By migrating this specific data type to its own dedicated storage instance, we effectively solved the "noisy neighbour" problem. This gave us the performance isolation we needed while keeping every customer's workload responsive - a win-win solution that required some creative thinking.
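The isolation idea boils down to routing by workload type: resource-intensive log queries go to a dedicated instance so they can no longer contend with everything else. A minimal sketch, with hypothetical backend names:

```python
# Illustrative routing layer for the "noisy neighbour" fix: heavy log
# scans get their own storage instance, all other request types share
# the primary store. Backend names are assumptions for the sketch.

class Router:
    def __init__(self):
        self.backends = {
            "logs": "logs-store",       # dedicated instance for log queries
            "default": "primary-store", # everything else
        }

    def backend_for(self, request_type):
        return self.backends.get(request_type, self.backends["default"])

router = Router()
assert router.backend_for("logs") == "logs-store"
assert router.backend_for("blocks") == "primary-store"
```

Because the split happens at routing time, a spike in log traffic saturates only the dedicated instance, and latency for other workloads stays flat.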
3. How we make decisions in the Node team
- We value open debates and data-driven decisions