Around IT in 256 seconds

By Tomasz Nurkiewicz
Podcast for developers, testers, SREs... and their managers. I explain complex and convoluted technologies in a clear way, avoiding buzzwords and hype. Never longer than 4 minutes and 16 seconds. Because software development does not require hours of lectures, dev advocates' slide decks and hand waving. For those of you who want to combat FOMO while brushing your teeth. 256 seconds is plenty of time. If I can't explain something within this time frame, it's either too complex, or I don't understand it myself.

By Tomasz Nurkiewicz. Java Champion, CTO, trainer, O'Reilly author, blogger
Where to listen
Apple Podcasts
Google Podcasts
Overcast
RadioPublic
Spotify

#88: SLI, SLO and SLA: a number, a threshold and a legal document respectively
Many people, when asked about SLA, simply shout 99%. The correct answer to that question is probably a long, boring PDF, written by lawyers. Yes, SLA is a legal obligation. Not a metric or a number. You probably meant SLI or SLO. Read more: Get the new episode straight to your mailbox:
October 03, 2022
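To make the "number versus threshold" distinction from episode #88 concrete, here is a minimal sketch. The request counts and the 99% target are made-up values, and the function name is hypothetical; the SLA itself stays a document and has no place in code.

```python
def availability_sli(successful_requests, total_requests):
    """SLI: a measured number, here the percentage of successful requests."""
    return successful_requests * 100 / total_requests


# SLO: a threshold the team commits to internally (hypothetical value).
SLO_PERCENT = 99.0

# Hypothetical measurements for one reporting window.
sli = availability_sli(successful_requests=9_950, total_requests=10_000)
meets_slo = sli >= SLO_PERCENT

print(f"SLI = {sli}%, SLO = {SLO_PERCENT}%, met: {meets_slo}")
```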
#87: Artificial neural networks: imitating human brain to solve problems like humans
An artificial neural network is a computer algorithm somewhat inspired by our brains. Superficially, our brain is a network of neurons connected with each other and communicating via electrical impulses. Artificial intelligence experts implemented a similar concept purely in software. An artificial neuron is basically a function that takes a set of inputs and has an output. Just like the biological one. By connecting hundreds of such neurons in a network, we can observe quite intelligent behaviours. For example, artificial neural networks can recognize what’s in the image. Or quite the opposite - generate images from text.
September 27, 2022
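Episode #87 describes an artificial neuron as a function with several inputs and one output. A minimal sketch of that idea follows, assuming the common weighted-sum-plus-sigmoid formulation, which the description itself does not specify. The weights and bias are arbitrary illustration values; in a real network they are learned during training.

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, squashed by a sigmoid into (0, 1)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))


# Arbitrary example values, not taken from the episode.
output = neuron([1.0, 0.0], weights=[0.5, -0.5], bias=0.0)
print(output)
```

Connecting neurons into a network means feeding the outputs of one layer of such functions into the inputs of the next.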
#86: Proof of stake: how to cut global energy usage by 0.2%
A few weeks ago, the Ethereum blockchain moved from proof-of-work to a proof-of-stake algorithm. This step alone reduced global energy consumption by 0.2%. That is as much as the energy usage of Austria. At this point, Ethereum, the second largest blockchain after Bitcoin, is using barely as much electricity as a few hundred households. How is that possible? How does the proof-of-stake algorithm work, avoiding catastrophic energy waste?
September 19, 2022
#85: Genetic algorithm: natural selection helps to solve coding problems
A genetic algorithm is a heuristic approach to solving complex computational problems. This includes various optimizations, especially around scheduling and design. For example, NASA designed a radio antenna for their spacecraft using a genetic algorithm. Its shape is quite complicated, like nothing that could be designed by hand. So how do genetic algorithms work their way to the solution? Well, they are inspired by the natural selection process in living creatures (!)
September 13, 2022
#84: Non-fungible token (NFT): digital, decentralized art market
Non-fungible tokens, NFTs for short, are financial instruments implemented on top of the blockchain. They can be bought and sold, just like cryptocurrencies. However, unlike bitcoins, each NFT is unique and traded individually, whereas bitcoins or ether are interchangeable, just like hundred-dollar bills. So what makes each NFT unique? Why would you purchase this particular NFT rather than the other one? Well, an NFT has an associated piece of data. That data is typically a hyperlink to a digital piece of art. What you actually purchase is… well… that link?
August 29, 2022
#83: Real-time bidding: how online tracking helps serving ads
We all know this feeling. You search for a hotel in Paris and you keep getting ads for hotels and flights for weeks to come. Or something even scarier. You visit a blog post highlighting the first symptoms of a pregnancy. An hour later every single website on the planet advertises diapers and baby formulas. How is that possible? How do they know? And how did we get into this dystopia? All of this became possible with real-time bidding. The billion-dollar industry that tracks our every movement.
August 23, 2022
#82: MongoDB: the most popular NoSQL database
MongoDB is a NoSQL database. Precisely speaking, it’s a document-oriented database. It stores arbitrarily complex key-value objects. For example, in a single Car object you can store as much information as you want. Not only license plate or manufacturing year. But also information about each individual part, history of repairs, insurance and all owners. No matter how much information you want to keep, you just put that in a single, easily accessible document. Contrast that to relational databases, where each relationship has to be modelled as a separate table. So the same Car would have been spread across tens of tables. Imagine all these SQL JOINs! No wonder MongoDB is one of the most popular databases.
August 16, 2022
#81: Quarkus: supersonic, subatomic Java (guest: Holly Cummins)
Quarkus is supersonic, subatomic Java. What does that mean? It means it’s Java, but really, really small. And really, really fast. Quarkus is a runtime framework which gives you access to programming models you’re probably familiar with. Like MicroProfile, JAX-RS, CDI dependency injection. And also to ones you’re probably less familiar with, like reactive programming. Author: Holly Cummins
August 05, 2022
#80: Ethereum: a distributed virtual machine for exchanging money and bored apes
Ethereum is a network of computers with no central trusted authority. They achieve consensus by running a computation-intensive algorithm, known as proof-of-work. The agreed state is added to an append-only ledger, known as blockchain. Yes, Ethereum is yet another blockchain. And it’s used to track transactions in a cryptocurrency, known as Ether. However, unlike Bitcoin, it’s much more than a simple log. Bitcoin accounts simply hold currency. Ethereum accounts can run programs as well. The Ethereum network is actually one huge computer!
July 04, 2022
#79: QUIC: what makes HTTP/3 faster
QUIC can be thought of as the third fundamental protocol of the Internet. Next to UDP and TCP/IP. Let’s talk a little bit about these two. They both build upon IP, Internet Protocol. IP supports exchanging packets of data between two machines having… IP addresses. UDP adds ports. A port is a logical concept. It’s simply a number within one machine that identifies a certain process. Thanks to ports, many different processes on the same machine can exchange data.
June 30, 2022
#78: Stuxnet: computer virus that you can admire
Stuxnet was probably one of the most sophisticated pieces of software ever built. I can easily imagine a Hollywood movie about it. A computer program that could change the course of history. Ironically, Stuxnet was a computer virus. A virus that infected 200 thousand machines. But it activated and caused damage on only a fraction of them.
June 20, 2022
#77: DDoS: take down a server, one request at a time
A denial-of-service attack tries to take down a server by sending specially-crafted requests. The simplest form of this attack is just sending a lot of requests in a short period of time. But more sophisticated methods are possible. For example, sending a single unusual request that overwhelms the server. One such example is a ZIP bomb, which I’ll explain later. But the most widespread technique requires a large number of attacking servers. Also known as distributed denial-of-service. DDoS for short.
June 13, 2022
#76: 12th Factor App: portable and resilient services start here. Part 8-12/12
In part 2 of the Twelve-Factor App, we’ll explore the second half of the principles. Be sure to listen to the previous episode as well. We still have only four minutes, so let’s go!
June 06, 2022
#75: 12th Factor App: portable and resilient services start here. Part 1-7/12
Twelve-Factor App is a set of design guidelines defined by Heroku. These guidelines are best suited for cloud-native, portable and resilient services. In this episode, I’ll explain the first seven principles. I have four minutes left, so let’s go!
May 31, 2022
#74: SOAP: (not really) Simple Object Access Protocol
SOAP, formerly known as Simple Object Access Protocol, is a messaging standard. SOAP is very broad and general. Technically, it can support request-response, as well as fire-and-forget communication. The underlying protocol is typically HTTP, but there’s nothing against using message brokers. Or even good old SMTP. You know, the one for exchanging e-mails. The communication happens through XML messages. These messages are well-defined and structured. XML schema is agreed upon before any communication.
May 16, 2022
#73: Neo4j: all your data as a graph?
Neo4j is a NoSQL database engine. What makes it different is the unusual data model. In Neo4j everything is modelled as a graph. A graph is a collection of nodes connected with edges. A typical example is a graph of friends on a social media website. Or a network of movies and actors. But it turns out many problems can be efficiently modelled as graphs. Like a customer having orders, each order has items. Or insurance, connected to a certain car and an accident. So what makes Neo4j special?
May 10, 2022
#72: React.js: library that won frontends?
React.js is a JavaScript library for building dynamic user interfaces. React applications are built on top of reusable components. Components encapsulate look and feel, logic and state. Also, React has quite an advanced state propagation mechanism. In simple words, it means that the user interface is very responsive and consistent. To improve developer experience, React typically uses JSX. An extension to JavaScript language. Let’s dive deeper into why React.js became the most popular web framework. Or library. Or both. Depends who you ask.
May 06, 2022
#71: Erlang: let it crash!
Erlang is a programming language designed for highly scalable, fault-tolerant systems. Its primary use case used to be telecommunication. But these days it powers some of the biggest distributed systems. For example, half-billion WhatsApp users. The unique features of Erlang allow it to achieve amazing availability. A typical enterprise system may be unavailable for, let's say, a few hours per year. This means 99.9% availability. Systems written in Erlang may even reach so-called nine nines. Or 99.9999999%. It means the system is unavailable for roughly 31.5 milliseconds. Per year. How is that possible?
April 26, 2022
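The availability figures quoted for episode #71 can be checked with a few lines of arithmetic. This is a sketch that assumes a 365-day year; the function name is made up for illustration. `Decimal` is used so that the tiny "nine nines" fraction is not distorted by floating-point rounding.

```python
from decimal import Decimal

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000, ignoring leap years


def downtime_seconds_per_year(availability_percent):
    """Allowed downtime per year for an availability given as a string, e.g. '99.9'."""
    unavailable_fraction = 1 - Decimal(availability_percent) / 100
    return SECONDS_PER_YEAR * unavailable_fraction


three_nines_hours = downtime_seconds_per_year("99.9") / 3600
nine_nines_millis = downtime_seconds_per_year("99.9999999") * 1000

print(f"99.9%       -> {three_nines_hours} hours per year")
print(f"99.9999999% -> {nine_nines_millis} milliseconds per year")
```

So 99.9% leaves a bit under nine hours of downtime per year, while nine nines leaves about 31.5 milliseconds.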
#70: CRDT: Conflict-free Replicated Data Type (guest: Martin Kleppmann)
Hello everyone! My name is Martin Kleppmann. I’m a researcher at the University of Cambridge. And I would like to tell you briefly about the technology called CRDTs. So, CRDT stands for Conflict-free Replicated Data Type. It’s a type of data structure that you can use to build collaboration software. So think software like Google Docs for example. Or Figma. Or Trello. Or a TODO list that syncs between your computer and your phone. You can build this type of software using CRDTs.
April 12, 2022
#69: DevOps: not a job position, but culture and mindset
DevOps is a movement to bridge the gap between developers and operations teams. Traditionally, these two groups were separate and rarely interacted with each other. Developers didn’t quite understand how software is deployed and managed. Operations teams, on the other hand, treated software as a black box. DevOps encourages synergy between these two roles. Developers should take responsibility for their software. Including how it runs and behaves on production. Ops should understand the software they run. But more importantly, they should adopt well-established software engineering principles. For example, automation, auditing, testing, and fast feedback. Ideally, devs and ops should work together in a single team, toward a common goal.
February 14, 2022
#68: ACID transactions: don't corrupt your data
Transactions in SQL databases are rock-solid. By reading and modifying data within a transaction we limit the risk of data corruption. Actually, there’s an acronym describing transactions: ACID. Which stands for: atomicity, consistency, isolation and durability. A good database engine follows these properties religiously. NoSQL engines, on the other hand, trade ACID properties for availability or speed. Of course, this is a gross simplification. Anyway, the NoSQL crowd coined another acronym: BASE. Which stands for: basically available, soft state and eventually consistent. We’ll leave BASE for another episode.
February 01, 2022
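As an illustration of the atomicity part of ACID from episode #68, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The `account` table, the names and the amounts are invented for the example. The transfer credits one account and then fails while debiting the other, and the whole transaction is rolled back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('alice', 100), ('bob', 0)")
conn.commit()


def transfer(connection, sender, receiver, amount):
    # Used as a context manager, the connection commits when the block
    # succeeds and rolls back when an exception escapes from it.
    with connection:
        connection.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, receiver),
        )
        connection.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, sender),
        )


try:
    transfer(conn, "alice", "bob", 500)  # Alice only has 100: CHECK fails
except sqlite3.IntegrityError:
    pass  # the credit to Bob has been rolled back together with the debit

balances = dict(conn.execute("SELECT name, balance FROM account"))
print(balances)
```

Without the transaction, Bob would have kept the 500 that Alice never had.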
#67: Version control systems: auditing source code, tracking bugs and experimenting
Version control systems, like git, serve two purposes. First of all, they allow collaborating on the same code by multiple developers. Collaboration is needed for any non-trivial project. Secondly, they keep the history of changes. Modification history allows tracking bug fixes and regressions. That, and many other applications of version control, will become obvious in a second.
January 25, 2022
#66: Aspect-oriented programming: another level of code modularization
DRY, or don’t repeat yourself, is a common principle in programming. That’s why we invented functions and objects. But some sources of duplication are really hard to get rid of. Well, sometimes it’s even hard to realize there’s duplication in the first place! Common examples are logging, validation, checking security, starting a transaction. Often, these are one-liners that are too simple to extract. Too mundane to bother. And too ubiquitous to forget.
January 18, 2022
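The logging example from episode #66 can be sketched with a Python decorator. This is not Spring AOP or any other AOP framework, just a small stand-in for the same idea: the cross-cutting concern lives in one place and is attached to functions from the outside. The logger name, `logged` and `add_tax` are hypothetical.

```python
import functools
import logging

logger = logging.getLogger("aspects")  # hypothetical logger name


def logged(func):
    """A tiny 'aspect': log every call without touching the function body."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("calling %s args=%r kwargs=%r", func.__name__, args, kwargs)
        result = func(*args, **kwargs)
        logger.info("%s returned %r", func.__name__, result)
        return result

    return wrapper


@logged
def add_tax(net_price, rate=0.23):
    # Business logic only; no logging code in here.
    return round(net_price * (1 + rate), 2)


logging.basicConfig(level=logging.INFO)
print(add_tax(100))
```

Validation, security checks or transaction handling could be attached the same way, each as its own decorator.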
#65: Zero Downtime deployment: If it hurts, do it more often
Remember the days when deploying a new version of your application required downtime? If your application is particularly important, you might have had to schedule a maintenance window. Or perform the deployment in the middle of the night to avoid disruption. Today’s tools and DevOps practices allow deploying tens or even hundreds of times per day. With no downtime, and no noticeable disruption. Sometimes every commit is deployed automatically to production within minutes. How’s all this possible?
January 10, 2022
#64: TypeScript: will it entirely replace JavaScript?
TypeScript is a programming language, a superset of JavaScript. This means any valid JavaScript program is also valid TypeScript. But not vice-versa! TypeScript adds a ton of features, addressing the shortcomings of JavaScript. The most important one is optional static typing, including null-safety. The fact that you can take any JavaScript code and turn it into TypeScript by simply changing a file extension is crucial. It means you can gradually start using TypeScript’s features without rewriting your whole application.
January 03, 2022
#63: Logging libraries: auditing and troubleshooting your application
You can’t look at your application all the time. Instead, it should leave some sort of persistent trace. Such an audit log can be examined later on. However, it’s the responsibility of the application itself to log appropriately. But more importantly, the data it logs for later must be well-structured. Simply printing random words to a console is no longer sufficient.
December 27, 2021
#62: Object-relational mapping: hiding vs. introducing complexity
Object-relational mapping, ORM for short, simplifies access to relational databases. Such frameworks help with developing applications without writing SQL. SQL was supposed to be easy to use for non-programmers. That’s part of the reason why SQL is so verbose. However, writing complex joins by hand is hard. Also, typically, once you fetch data from your database, you immediately translate it to objects. So why not build a universal framework for such mapping? Like, object-relational mapping?
December 20, 2021
#61: Spring framework: 2 decades of building Java applications
Spring framework is probably the most popular and most successful application framework for Java. Writing a server or a web application before Spring was cumbersome. And it required an insane amount of boilerplate. Even in the already bloated Java language. This framework was created sort of as a by-product of a book by Rod Johnson, back in 2003. He wanted to build an alternative to the heavyweight Enterprise JavaBeans standard. What was just an idea grew into one of the largest ecosystems for Java.
December 15, 2021
#60: Haskell: purely functional and statically typed programming language
Haskell is a purely-functional programming language. It is also statically and strongly typed. Haskell takes these characteristics to the extreme. For example, doing any input/output is considered impure from a functional programming point of view. So in some books, a simple “Hello, world” example appears as late as in chapter… 9.
December 07, 2021
#59: How compilers work: from source to execution
A compiler is an application that turns text into an executable program. It’s quite extraordinary how much work these complex pieces of software are doing. Pretty much every compiler works by executing several phases. Each phase takes the output of the previous one as its input, to finally produce the runnable code. Let’s take a journey through the compiler internals.
November 29, 2021
#58: Consumer-driven Contracts: TDD between services
Consumer-driven Contracts is an approach to testing integration between services. In a distributed system, many components talk to each other. Typically via request/response protocols or message queues. The client must know and understand the API provided by the server. What kind of endpoints are available, what formats, request/response schema. Without consumer-driven contracts (CDC for short), we are often reckless when it comes to testing. Maybe we have a bunch of smoke tests against a mocked server. Maybe we copy-paste typical responses from the server’s documentation. But both client and server can evolve, breaking the integration in unexpected ways. CDC attempts to codify the API without explicit schema and coordination.
November 22, 2021
#57: Kotlin: Much more than 'better Java'
Kotlin is a programming language that runs mainly on the Java Virtual Machine. This means it’s fully interoperable with Java and even other JVM languages. Developers can gradually rewrite their applications from Java to Kotlin. Or use Java libraries and frameworks inside Kotlin. But why bother with a new language? Kotlin has plenty of improvements over good old Java. Sometimes it’s placed between Java and Scala in terms of capabilities. It seems more modern, agile, and powerful.
November 16, 2021
#56: Test-driven development: It's not about testing
Test-driven development (TDD for short) means developing software by writing tests first. I hope you all write unit and integration tests. But do you write them before the actual production code? This approach to software development is just that. You must write a failing test first. And you are not allowed to write even a single line of production code without a failing test. Think about it. If all your tests are green, it is forbidden to write production code. Everything must start from a red test.
November 02, 2021
#55: Percentages, percentage points and basis points: understand your metrics
You might find this topic weird, but understanding percentages is crucial not only in banking. What does it mean when disk space decreased by 10 percent? How to scientifically measure relative system load? And how to sound smart when applying for a mortgage? You’ll learn all that in the next four minutes.
October 25, 2021
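The three terms in the title of episode #55 are easy to mix up, so here is a small worked example. The interest rate moving from 4% to 5% is an invented scenario and the function names are hypothetical; the only fixed fact used is that one percentage point equals 100 basis points.

```python
def relative_change_percent(old, new):
    """How much `new` differs from `old`, as a percentage of `old`."""
    return (new - old) / old * 100


def percentage_point_change(old_percent, new_percent):
    """Plain difference between two values that are already percentages."""
    return new_percent - old_percent


def basis_points(percentage_points):
    """1 percentage point = 100 basis points."""
    return percentage_points * 100


# Hypothetical mortgage interest rate going from 4% to 5%.
old_rate, new_rate = 4, 5
points = percentage_point_change(old_rate, new_rate)
print(f"{points} percentage point(s)")
print(f"{basis_points(points)} basis points")
print(f"{relative_change_percent(old_rate, new_rate)}% relative increase")
```

The same change is 1 percentage point, 100 basis points and a 25% relative increase, depending on which unit is used.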
#54: Immutability: from data structures to data centers
Immutability means that when something was once created, it can’t be changed. This concept is tremendously important across our whole industry. Probably you’ve heard about immutable data structures. Let’s take an immutable list as an example. If you create such a list with a few items, you can’t add more items to that list. It’s written in stone. Any action attempting to modify that list returns a modified copy. The original instance is left intact. Modifying a single item, adding or removing, sorting - each of these operations returns a copy.
October 19, 2021
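The immutable list from episode #54 can be approximated in Python with a tuple, the language's built-in immutable sequence. A short sketch: every "modification" produces a new tuple, and attempting to change the original in place raises an error.

```python
original = (3, 1, 2)

# Each operation returns a new tuple; `original` is never touched.
appended = original + (4,)
without_first = original[1:]
sorted_copy = tuple(sorted(original))

mutation_error = None
try:
    original[0] = 99  # in-place modification is not allowed
except TypeError as error:
    mutation_error = error

print(original, appended, without_first, sorted_copy)
print("in-place change rejected:", mutation_error)
```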
#53: CDN: Content Delivery Network: global scale caching
CDN is a set of geographically distributed servers for fast content delivery. Without CDN all requests are routed to your own server, located somewhere in the world. For example, in San Francisco. If your visitor lives in Australia, the experience is rather poor. But now imagine the traffic to your website is proxied through a global caching layer. Your visitor in Australia downloads data from an edge server nearby. A different visitor in Cape Town, South Africa, will be routed to a completely different CDN server. The routing is done by the CDN itself, typically via DNS. It’s transparent to your visitors. Of course, all CDN servers contain the same data. Moreover, pretty much no one contacts your own server in San Francisco. Only the CDN network itself. Technically, visitors don’t even know the address of your origin server! They use domain name like and DNS routes to appropriate cache server.
October 11, 2021
#52: How computers work: from electrons to Electron
Today I’d like to explain how computers work. From the ground up, grossly simplifying. It all starts with an electric field. It’s a place where charged particles, like electrons, are attracted or repelled. The electricity flows through a piece of wire because of the difference in electric field potential between the wire’s ends. This difference is known as voltage.
October 04, 2021
#51: Cloud computing: more than renting servers per minute
Cloud computing is a broad term. In general, it refers to using hardware and software managed by someone else. Typically with very flexible pricing: we only pay for what we use and for the time we use it. We don’t build data centers ourselves. We don’t buy large servers and provision them. We simply rent a server on a per-minute basis. The cloud provider has a pool of servers and they are reused and recycled. Once we are done, we no longer pay and some other customer can use that same server. Just like we don’t own a taxi. We merely pay for kilometers and minutes. When the server breaks for some reason, the provider takes care of repairs and replacements. We simply, almost transparently, get a new machine.
September 27, 2021
#50: Property-based testing: find bugs automatically by generating thousands of test cases
Property-based testing is an approach to automatically test software against well-defined rules. We don’t specify desired output for a few inputs. Instead, we merely define properties that should always hold. It’s best explained with an example. How do you make sure that your compression algorithm works? Ordinary unit tests verify a handful of inputs that you came up with. If you are an experienced developer, you will include edge cases. I mean, the weirdest inputs, like an empty string or a long sequence of the same character. And what are the properties of a good compression algorithm? Its output should take less space, obviously. But even more importantly, a lossless algorithm should be capable of decompression. What if I told you, there is software that can check these properties automatically? With thousands of randomized tests?
September 21, 2021
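The compression example from episode #50 can be sketched with nothing but the standard library. This is a hand-rolled illustration, not a real property-based testing library: dedicated tools also shrink failing inputs and generate smarter edge cases. `zlib` stands in for "your compression algorithm", and the helper names, trial count and seed are arbitrary.

```python
import random
import zlib


def random_bytes(rng, max_length=200):
    """Generate a random input, including the empty one."""
    length = rng.randint(0, max_length)
    return bytes(rng.randrange(256) for _ in range(length))


def check_round_trip(trials=1000, seed=42):
    """Property: decompressing compressed data always yields the original."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(trials):
        data = random_bytes(rng)
        assert zlib.decompress(zlib.compress(data)) == data, data
    return trials


print("round-trip property held for", check_round_trip(), "random inputs")

# The "takes less space" property only holds for compressible input,
# such as a long run of the same character.
repetitive = b"a" * 1000
print(len(repetitive), "->", len(zlib.compress(repetitive)), "bytes")
```

Note that purely random bytes usually do not get smaller when compressed, which is why the size property is checked only on a repetitive input here.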
#49: Functional programming: academic research or new hope for the industry?
Functional programming means programming using functions. See, I need much less than 256 seconds for that! Unfortunately, this definition is as useful as saying that object-oriented programming means programming with objects. So let’s dive deeper. First of all, I mean pure functions as defined by mathematicians. In math, a function always returns the same output for a given input. A length of a string is a function. A form validator is typically a function as well. For the same form inputs it always returns the same result: valid or invalid. On the other hand, returning the current date for a given location is not a function. Or reading a file.
September 13, 2021
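The examples from episode #49 translate almost directly into code. In this sketch the validation rules (an `@` in the e-mail, a password of at least 8 characters) are invented for illustration; what matters is which functions depend only on their arguments.

```python
from datetime import datetime


def string_length(text):
    """Pure: the same text always gives the same length."""
    return len(text)


def is_valid_form(form):
    """Pure: the result depends only on the submitted fields (made-up rules)."""
    return "@" in form.get("email", "") and len(form.get("password", "")) >= 8


def current_time():
    """Impure: the result changes between calls, although the input does not."""
    return datetime.now()


form = {"email": "alice@example.com", "password": "correct horse"}
print(string_length("hello"), is_valid_form(form), is_valid_form(form))
print(current_time())
```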
#48: Distributed tracing: find bottlenecks in complex systems
Life used to be simple. In a traditional monolithic application, when a failure occurred, you could easily find the problem. When an exception bubbles up, it appears throughout all stack frames. You can easily examine which methods or functions were invoked from each other. You can see application layers involved. Moreover, it’s fairly easy to profile performance bottlenecks. Answering these questions becomes much harder when there are multiple systems involved.
September 07, 2021
#47: Terraform: managing infrastructure as code
Terraform is fairly low-level software for managing your infrastructure. For instance, it’s used to create and provision cloud instances, networks and software. Unlike traditional tools in this area, Terraform is declarative. It means you don’t define step-by-step, imperative guides. Essentially, scripting your infrastructure with Bash or Python. Instead, you define the desired, final infrastructure state. For example, how many hosts, how they should be connected, what kind of software and packages they should contain. Once you apply this configuration, Terraform takes all the necessary steps to fulfill your needs. Here’s how it works in more detail.
July 05, 2021
#46: Kubernetes: Orchestrating large-scale deployments
Kubernetes is a platform for managing various workloads inside containers. Before I jump into a definition, let’s describe the problems it tries to solve. Imagine your application consists of several components. It can be microservices, multi-layer application, etc. Each type of component needs to be deployed on multiple servers. First of all, to support fault tolerance, but also to achieve horizontal scaling. Doing this by hand is quite problematic. Manually tracking which servers should host which components is tedious and error-prone. You need to take into account:
* CPU and memory requirements of each component
* discoverability (where each component is located)
* provisioning (different components need different libraries and packages)
* scaling out and migrating from broken servers
* and so on, and so forth
June 29, 2021
#45: Node.js: running JavaScript on the server (!)
The JavaScript language is primarily used inside your web browser. Your computer downloads a JavaScript file and executes it on your machine. But if you want to build a dynamic website, you need a server-side language. Like PHP, Java, Python, etc. Programs written in these languages handle incoming requests and produce dynamic HTML. HTML that varies, depending on the request, who is asking and what data is available in the underlying database. But for more than a decade we have also been able to use JavaScript on the server. The same language can be used for a very different purpose. Namely, listening for and handling web requests. But also implementing command-line utilities and one-off scripts. This became possible after extracting the JavaScript engine from the Chrome browser.
June 21, 2021
#44: RESTful APIs: much more than JSON over HTTP
REST is an architectural style of communication, based on HTTP. It was proposed in the year 2000 by Roy Fielding. In his dissertation he describes the way systems should communicate, embracing fundamental features of HTTP. He puts emphasis on: statelessness, support for caching, uniform representation and self-discoverability. APIs that adhere to these principles are called RESTful. This academic paper is quite abstract so I’ll focus on what it means in the enterprise. Also, it’s much easier to understand what RESTful API is when contrasted to SOAP. And GraphQL released recently.
June 15, 2021
#43: Public-key cryptography: math invention that revolutionized the Internet
Disclaimer: this podcast is not about cryptocurrencies. I despise them. Instead, we’ll talk about asymmetric encryption. One of the most wonderful math discoveries of the 20th century. Before the 1970s all cryptographic algorithms were symmetric. This means that the same key must be used to encrypt and decrypt data. That sounds rather obvious. If you encrypt a file with a password, you must use the same password to decrypt it. But there’s one problem. Imagine Bob wants to e-mail an encrypted file to Alice. Sadly, Eve can read all communication between Alice and Bob. File was encrypted, so no worries? Well, it’s not only encrypted, but also worthless. Alice doesn’t have a password. And how is Bob supposed to provide that password if Eve can spy on all communication channels?
June 07, 2021
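To show how asymmetric encryption sidesteps the password problem from episode #43, here is a toy sketch. The description does not name an algorithm; textbook RSA is used here as a well-known example, with tiny primes and no padding, so it is for illustration only and offers no real security. Alice publishes `(e, n)`, Bob encrypts with it, and only Alice's private `(d, n)` can decrypt. Requires Python 3.8+ for the modular inverse via `pow`.

```python
# Toy key generation on Alice's side (real keys use primes of 1024+ bits).
p, q = 61, 53
n = p * q                    # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent: modular inverse of e

public_key = (e, n)          # safe to send over a channel Eve can read
private_key = (d, n)         # never leaves Alice's machine


def encrypt(message, key):
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(ciphertext, key):
    exponent, modulus = key
    return pow(ciphertext, exponent, modulus)


secret = 65                                  # Bob's message, a number below n
ciphertext = encrypt(secret, public_key)     # Bob needs only the public key
recovered = decrypt(ciphertext, private_key)
print(secret, "->", ciphertext, "->", recovered)
```

Eve sees the public key and the ciphertext, but without `d` she cannot run the last step.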
#42: Flow control and backpressure: slowing down to remain stable
Imagine two independent systems communicating with each other. One producing data and the other consuming it. There must be some place where data is buffered. Just in case the producer generated some data but the consumer is currently busy. For example, incoming requests, messages, packets - must wait. Sooner or later, this buffer overflows and either starts dropping data or crashes altogether. Moreover, large buffers imply growing latency between production and consumption. The consumer is perceived as less responsive because data waited for a long time in queue. Especially when nothing is prioritized, so first come, first served. Also known as FIFO, first in, first out.
May 31, 2021
#41: Unicode: can you see these: Æ, 爱 and 🚀?
Computers speak bits and bytes. Numbers in general. They don’t understand images, poems and JSON. When we say “hello”, it needs to be encoded to numbers. Conveniently, each character becomes one number. A number can then be stored, transferred and rendered on another computer. Therefore, everyone needs to agree which numbers represent which characters. The first commonly used attempt was called ASCII. American Standard Code for Information Interchange. In short, it’s a table of 128 symbols and their respective numbers. For example, lower-case h is 104, whereas exclamation mark is 33. There’s one problem here. 128 symbols. 7 bits. Of course, it’s an American Standard. So it ignores the existence of any other country and alphabet. Read more: Get the new episode straight to your mailbox:
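The character-to-number mapping is easy to check. A minimal sketch in Python; the helper name `to_ascii_codes` is made up for this example.

```python
def to_ascii_codes(text):
    """Encode text as ASCII and return one number per character."""
    # str.encode raises UnicodeEncodeError for characters outside ASCII.
    return list(text.encode("ascii"))

print(to_ascii_codes("hello!"))  # [104, 101, 108, 108, 111, 33]

try:
    to_ascii_codes("Æ")
except UnicodeEncodeError:
    print("Æ has no ASCII number")
```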
May 24, 2021
#40: Docker: more than a process, less than a VM
When two processes run on the same machine, they are somewhat isolated. For example, they cannot read each other’s memory. However, they still share the same file system, libraries, network ports. And hardware: CPU and memory. Docker allows running processes with greater isolation on a Linux machine. Processes like: web servers, databases or web applications. Traditionally, to achieve better isolation, virtual machines were used. A virtual machine is essentially an operating system started inside of another operating system. For example, Windows running inside Linux. Typically you run a few VMs on a single host. Unfortunately, a virtual machine has an overhead. It takes several seconds to start and uses a significant amount of memory. Docker is somewhere in between. Better isolation than plain processes, but not quite a VM. Read more: Get the new episode straight to your mailbox:
May 18, 2021
#39: DNS: one of the fundamental protocols of the Internet
Domain name system (DNS for short) is one of the fundamental protocols of the Internet. In the Internet all communication is routed through IP addresses. Traditionally, these addresses consist of four numbers, like Each and every server, as well as your computer, is identified using such an address. But we no longer remember phone numbers, let alone IP addresses! Remembering that the aforementioned IP belongs to is tedious. DNS is often compared to a global phone book. A phone book that maps easy to remember domains like or to IP addresses - usable by machines. Without DNS the Internet could technically work. Just like you could use your phone without contacts, memorizing all phone numbers. DNS servers not only free us from remembering IP addresses. They know all of them. Read more: Get the new episode straight to your mailbox:
May 11, 2021
#38: HTTP cookies: from saving shopping cart to online tracking
Before we fully appreciate how important HTTP cookies are, let’s imagine the web without them. HTTP is inherently stateless. This means that the HTTP server is neither allowed nor able to store any context between requests. It has no memory of prior questions from the same client. Contrary to stateful protocols like FTP or SSH. They have a concept of a long-running session. If you change the working directory during a session, subsequent commands take that into account. This is not the case for HTTP. For example, imagine you just logged in to Gmail to see the list of unread e-mails. Now you click the most important one, from the Nigerian prince. Sadly, the server has no idea you are the person who just logged in. You must log in again. And again. This is where cookies help tremendously. Read more: Get the new episode straight to your mailbox:
March 30, 2021
#37: Fallacies of distributed computing
Fallacies of distributed computing are a set of myths we believe, when designing complex systems. And what is a distributed system? Well, if your application is split into hundreds of microservices, it’s distributed. Or if you have a single application, scaled horizontally to hundreds of instances. Or… If you have a monolith connecting to a database on the other node. This is a distributed system as well! OK, we have 200 seconds left and 8 fallacies to cover. Let’s go! Read more: Get the new episode straight to your mailbox:
March 22, 2021
#36: Microservices architecture: principles and how to break them
Microservices are contrasted with a monolith. A single, large application that implements the whole system. Typically hard to understand, develop, test and deploy. Monoliths tend to become a big ball of mud with each component referencing every other. The idea behind microservices is to split your complex system into multiple independent applications. Small and agile. They communicate with each other via APIs but are otherwise highly decoupled. The independence and decoupling have many aspects: deployment, languages and frameworks, storage, organization. Most importantly, each microservice should be self-sufficient to a reasonable degree. Let’s discuss what it means and how often these aspects are violated. Read more: Get the new episode straight to your mailbox:
March 16, 2021
#35: Reactive programming: from spreadsheets to modern web frameworks
To understand what reactive programming is, let’s contrast it to imperative programming. Imperative programs can be read top-to-bottom, with occasional jumps. Jumps are if statements, loops and procedure calls. Program is executed line by line. If you see x = y + z, the expression on the right is evaluated once. Then the symbol on the left is modified. If you change the value of y or z in the next line, obviously, it won’t affect x. Compare it to a spreadsheet. Yes, an Excel file. It’s obvious that changing any cell immediately propagates to all cells that depend on it, right? The process continues until all affected cells are updated. Essentially, every spreadsheet is internally represented by a dependency graph. We declare which pieces of data depend on which. The rest happens automatically. This approach to developing software is called… reactive programming. Read more: Get the new episode straight to your mailbox:
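The spreadsheet behaviour can be sketched in a few lines. This is a toy, not how any particular reactive framework works: the `Cell` class and its methods are invented for illustration, and the naive propagation may recompute a cell more than once when dependencies form a diamond.

```python
class Cell:
    """A spreadsheet-like cell: a plain value or a formula over other cells."""

    def __init__(self, value=None, formula=None, inputs=()):
        self.formula = formula
        self.inputs = list(inputs)
        self.dependents = []
        # Register this cell in the dependency graph of its inputs.
        for cell in self.inputs:
            cell.dependents.append(self)
        self.value = value
        if formula is not None:
            self._recompute()

    def set(self, value):
        """Change a plain value and propagate it to every dependent cell."""
        self.value = value
        for cell in self.dependents:
            cell._recompute()

    def _recompute(self):
        self.value = self.formula(*(c.value for c in self.inputs))
        for cell in self.dependents:
            cell._recompute()

y = Cell(2)
z = Cell(3)
x = Cell(formula=lambda a, b: a + b, inputs=(y, z))   # x = y + z
total = Cell(formula=lambda a: a * 10, inputs=(x,))   # total = x * 10

y.set(5)  # unlike an imperative assignment, x and total follow automatically
print(x.value, total.value)  # 8 80
```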
March 02, 2021
#34: SQL joins
In relational databases, data is kept in relations, commonly known as tables. Simplifying, when data is normalized, it’s not duplicated. For example, when storing books and authors, you don’t keep an author’s name next to a book record. Instead, you use a so-called foreign key that references the author in another table. Thanks to this level of indirection, books by the same author do not store repeated information. This has many benefits and one, huge drawback. In order to look up a book together with a corresponding author you must somehow correlate these two tables. This is called joining. Read more: Get the new episode straight to your mailbox:
February 22, 2021
#33: OAuth 2.0
OAuth 2.0 is a standardized authorization protocol. In this episode, I’ll explain just one use case of it: the authorization code flow. It allows a server-side application to act on behalf of a user of another service. For example, a 3rd party application can post on Twitter on your account. Historically, to do this, the application had to store your Twitter credentials. Not only did you have to reveal your Twitter password, but that application also had to store it in plain text. Such an approach has multiple flaws. First of all, if the application is not entirely honest, it can now do anything on your behalf. Including changing your password and stealing your online account. But even if you trust the 3rd party application, it can still be hacked. Your password, together with thousands or millions of others, is compromised. Read more: Get the new episode straight to your mailbox:
February 16, 2021
#32: (Cryptographic) hash function
Sometimes you need to split arbitrary objects into a fixed number of groups. For example, storing a record into one out of many database nodes. Or saving a cookie in a hash table. Or distributing jobs among multiple workers. In all of these cases you later want to know which node, bucket or worker was chosen. Also, data should be split evenly. You don’t want one node or worker to be overloaded. The above properties are implemented by a so-called hash function. It’s an algorithm that takes arbitrary input and produces fixed-length output. A number. For the same input, often called a message, it always produces the same output, known as a hash. Ideally, different messages should produce different hashes. Even better, two slightly different messages should produce wildly different hashes. In practice, hash collisions must happen. After all, we are mapping arbitrarily large messages into a fixed-length hash. Often 32- or 64-bit. Read more: Get the new episode straight to your mailbox:
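Picking a bucket with a hash function might look as follows. A minimal sketch, assuming SHA-256 from Python's standard `hashlib` and a made-up helper `bucket_for`; taking the first 8 bytes of the digest as a 64-bit number is just one convenient choice.

```python
import hashlib

def bucket_for(message, buckets):
    """Map an arbitrary string to one of `buckets` groups, deterministically."""
    digest = hashlib.sha256(message.encode("utf-8")).digest()
    # Interpret the first 8 bytes as a 64-bit number, then fold it into range.
    return int.from_bytes(digest[:8], "big") % buckets

# The same message always lands in the same bucket...
print(bucket_for("user-42", 8) == bucket_for("user-42", 8))  # True

# ...and many messages spread across all buckets.
counts = [0] * 8
for i in range(10_000):
    counts[bucket_for(f"user-{i}", 8)] += 1
print(counts)
```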
February 08, 2021
#31: Redis
Redis is quite a versatile NoSQL, key-value database. Or in-memory cache. Or pub/sub broker. With transactions, stored procedures and fast replication. It’s quite universal. Anyway, the main use-case for Redis is caching. Internally the whole dataset must fit in memory. Redis can optionally persist data on disk, but all online operations happen entirely in memory. This makes Redis extremely fast. It’s often used as an alternative to the widespread Memcached server. Read more: Get the new episode straight to your mailbox:
February 01, 2021
#30: Linear Regression
Linear regression is one of the simplest machine learning algorithms. But also quite useful. It takes a bunch of existing, known observations and tries to predict what new observations will look like. Think about forecasting or finding trends. It says “linear” because the algorithm essentially finds a straight line that most closely follows the observations. OK, let’s take a concrete example. Imagine you are selling your apartment. What is the right price for it? Well, you compare it to similar apartments in your neighborhood. If someone sells the exact same flat across the street, your price should be very similar. If another flat is sold, but 10% larger, expect its price to be 10% higher as well. Yet another flat is half the size of yours. So expect its price to be just 50% of your estimated asking price. Sounds reasonable? Read more: Get the new episode straight to your mailbox:
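Finding that straight line takes only a few lines of arithmetic. The sketch below uses ordinary least squares in plain Python; the function name `fit_line` and the flat sizes and prices are invented for illustration, not real market data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical flats sold nearby: size in square meters, price in thousands.
sizes = [30, 50, 60, 80]
prices = [150, 250, 300, 400]

slope, intercept = fit_line(sizes, prices)
my_flat = 55
print(slope * my_flat + intercept)  # 275.0
```

Real observations never sit exactly on a line, so the fitted line is the one with the smallest total squared distance to the points.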
January 18, 2021
#29: Time synchronization
Clocks are important to computers. Computers need to order events in a way understandable to humans. Every computer has a bunch of internal counters, like CPU ticks. But they only work within one machine. We need a way to have a reliable, global clock, that is synchronized between many computers. Why, exactly? Well, imagine you are selling tickets to The Rolling Stones concert. They sometimes sell within a few seconds. First come, first served. But who was first, if selling happens asynchronously in multiple data centers? Fans shouldn’t be penalized for being routed to a server with higher latency. So, instead, we use timestamps. Late messages may still be treated as earlier ones if a transaction timestamp says so. Obviously we can’t rely on the client’s clock. It’s too easy to change your laptop’s time and see Mick Jagger from the front row. But how do we make sure servers aren’t lying the same way? Even unintentionally? This is where NTP, network time protocol, comes into play. Read more: Get the new episode straight to your mailbox:
January 12, 2021
#28: Event sourcing
Event sourcing is an alternative technique of storing business data. Rather than updating a single database record, every change is captured in an immutable, append-only log. We never overwrite existing data. Instead, we create and store an event that represents what exactly has changed. From the business perspective. In order to recreate the current state of an entity we must go through all the events and reconstruct it from history. Event sourcing brings better auditing and debugging. Also, storing changes can be faster because it requires inserting a new record rather than updating an existing one. Read more: Get the new episode straight to your mailbox:
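To make the idea concrete, here is a minimal sketch with a bank account as the entity. The event kinds and the `Event`, `apply_event` and `replay` names are assumptions for this example; a real system would keep the log in durable storage rather than a Python list.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    """An immutable fact describing what changed, in business terms."""
    kind: str    # "deposited" or "withdrawn"
    amount: int

def apply_event(balance, event):
    """Compute the next state from the current state and one event."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")

def replay(events):
    """Rebuild the current state by going through the whole history."""
    balance = 0
    for event in events:
        balance = apply_event(balance, event)
    return balance

log = []  # append-only: events are added, never updated or deleted
log.append(Event("deposited", 100))
log.append(Event("withdrawn", 30))
log.append(Event("deposited", 5))

print(replay(log))      # 75
print(replay(log[:2]))  # 70, the state as it was after the second event
```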
January 05, 2021
#27: Proof-of-work algorithm in blockchain
Let’s try to cheat the blockchain. If my wallet has exactly one bitcoin, I can’t spend it twice. Once it’s written into an immutable blockchain, everyone knows my wallet is empty. However, what if I purposefully create and announce two blocks at the same time? With the same parent block. For example, in one of the blocks there’s 1 bitcoin from my wallet spent on drugs. In the other block I spent that same bitcoin on unlicensed firearms. You know, things you do with cryptocurrencies. There are two competing blocks, each having a different version of the history. There is no central database so the blockchain network has no way of figuring out which block is valid and which is not. Well, they both technically are. Read more: Get the new episode straight to your mailbox:
December 29, 2020
#26: Blockchain
Blockchain is a technology used for storing data without a central database. Data is organized in an ever-growing list of blocks with each block referencing the previous one. Like a linked list. Once a block is added to this list, it can’t be modified. The integrity is guaranteed by including a cryptographic hash of the previous block. If the previous block changes, all subsequent blocks need to change as well. You can’t simply modify history. This is similar to the operations on your bank account. However, the idea behind the blockchain is to maintain integrity without a central authority, like a bank. Data is distributed among peers. No node is distinguished and some number of nodes can even be hostile. Blockchain tolerates up to 50% of nodes purposefully trying to cheat the system. Everything happens using a peer-to-peer network with no central backbone whatsoever. At what cost can all of this be achieved? Read more: Get the new episode straight to your mailbox:
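The linked list of hashes can be sketched without any networking. This toy assumes SHA-256 and an all-zero hash for the first block; the block layout and the `build_chain` and `is_valid` helpers are made up for illustration and leave out consensus entirely.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the very first block

def block_hash(prev_hash, data):
    """Hash of a block covers its data and the hash of the previous block."""
    return hashlib.sha256((prev_hash + data).encode("utf-8")).hexdigest()

def build_chain(records):
    chain, prev = [], GENESIS
    for data in records:
        current = block_hash(prev, data)
        chain.append({"prev_hash": prev, "data": data, "hash": current})
        prev = current
    return chain

def is_valid(chain):
    """Recompute every hash and check that each block points at its parent."""
    prev = GENESIS
    for block in chain:
        if block["prev_hash"] != prev:
            return False
        if block["hash"] != block_hash(block["prev_hash"], block["data"]):
            return False
        prev = block["hash"]
    return True

chain = build_chain(["Alice pays Bob 1", "Bob pays Carol 1"])
print(is_valid(chain))  # True

chain[0]["data"] = "Alice pays Eve 1"  # try to rewrite history
print(is_valid(chain))  # False
```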
December 22, 2020
#25: High-frequency trading
According to some estimates, as much as half of the trading volume on the American stock exchange is generated by computers. Specifically, computer programs that make trading decisions in a split second. They may buy stock to sell it a few milliseconds later. Even with minimal profit on each trade, this process repeated thousands of times per day can make a solid return. How do such systems work? There are multiple strategies, but most of them require extremely fast algorithms running close to the physical stock exchange. The speed is crucial and that’s what makes HFT so interesting. A trading bot can easily read social media and within microseconds decide whether particular news is good or bad. That can lead to a stock going up or down. For example, a president tweets about a new special tax relief for the pharmaceutical industry. A computer program almost instantaneously buys some stocks from the pharma companies and sells them seconds later. Before other computers do the same. Human traders stand no chance. Read more: Get the new episode straight to your mailbox:
December 14, 2020
#24: Service discovery
In the old days an application consisted of a monolithic backend and a database. Once they were deployed their location never changed. So the only piece of configuration was the address of the database almost hardcoded into the monolith. These days an application is split into hundreds of microservices talking to each other. Probably too many services, probably talking too much. But that’s a different story. Anyway, the environment became much more dynamic. Services come and go, orchestration frameworks are deploying them on different machines all the time. TCP/IP ports are random, instances are scaled up and down frequently. Sometimes automatically. New hosts are provisioned, old ones are shut down. Whole data centers are added. Under such circumstances we can no longer hard-code anything. When one service wants to talk to the other, it must somehow figure out where that service currently lives. It needs a mechanism to dynamically discover that service in an ever-changing environment. Read more: Get the new episode straight to your mailbox:
December 08, 2020
#23: Garbage collection
Creating new objects, arrays or strings is so straightforward that we often forget what happens underneath. And I don’t mean trying to figure out what this refers to in JavaScript objects. I mean: memory management. On each request we create a ton of objects. A server can easily allocate hundreds of megabytes of memory. Per second. Memory is cheap and there’s a lot of it. But it’s not infinite. How come we can simply call new Object() over and over again, taking more and more memory from our computer? Many objects are no longer needed a few milliseconds after they’re created. What happens to the memory they occupy? We take for granted what was thought to be almost impossible: automatic memory management. Read more: Get the new episode straight to your mailbox:
November 30, 2020
#22: Moore's Law
It's a common misconception that Moore's law is dead. That's because many believe it's about the speed of a CPU. But in reality Gordon Moore meant the number of transistors, not the clock frequency. And also, it's not even a law. Just an observation that holds true after half a century. OK, so what does this "law" state? Gordon Moore, before co-founding Intel, noticed that the number of transistors in a CPU doubles every two years. This means exponential growth. Which is a lot. So why are these transistors important? Read more: Get the new episode straight to your mailbox:
November 23, 2020
#21: SSE and WebSockets
HTTP is historically request-response-driven. This means a server is idle as long as no one asks it to do something. Typically fetching data or accepting some form. In reality, we’d often like to receive data from the server without any request. Typically to subscribe to some server-side updates. For example, displaying a current price on the stock exchange that changes many times per second. Or when waiting for some asynchronous process to complete. Traditionally this could be achieved with a few hacks. The most obvious and the worst one is busy-waiting. You simply keep asking the server over and over again periodically. More frequent requests result in a lot of excessive network traffic. Less frequent requests increase latency, so it’s no longer real-time communication. A slightly smarter approach is long-polling. In this implementation, you periodically ask the server whether there is some new data. To avoid excessive round-trips, the server doesn’t respond until some update is available. Or, after a timeout, it sends back an empty response and the loop continues. Read more: Get the new episode straight to your mailbox:
November 03, 2020
#20: Chaos engineering
We tend to focus on testing happy paths and expected edge cases. But how do you make sure that your system can survive minor infrastructure and network failures, as well as application bugs? Especially in a microservice or serverless environment, where there are tons of moving parts. Too many times I've seen systems fail miserably because some minor dependency was malfunctioning. For example, you have a tiny service that displays a small social widget on your website. When that service is down, the rest of the website should work. But without proper care and testing, you may end up with a global HTTP 503 failure. Code reviews and unit tests are fine, but the ultimate test is... turning off that service in production. And making sure the rest actually works. This is called chaos engineering. Believe it or not, many organizations do practice deliberately injecting faults into production. Now, turning off a service's instance in production is probably the easiest test you can conduct. The client must catch an exception and handle the failure gracefully. Sometimes by retrying, hoping to reach another healthy instance. Sometimes by returning a fallback value that's less relevant or up-to-date. Ideally, the end-user should not realize one of the services is down. Of course, that would mean that a failed service is not needed at all and can be shut down forever. So in practice, we expect a visible, but insignificant degradation in service quality. Read more: Get the new episode straight to your mailbox:
October 26, 2020
#19: GraalVM
GraalVM is a set of tools that aim to improve the performance and interoperability of the Java Virtual Machine. Taking advantage of GraalVM not only makes your apps run faster. It also allows running different languages like JavaScript or Python with superb speed. GraalVM consists of quite a few projects, so let's dive in. The most groundbreaking technology is the JIT compiler. To recap, JIT is responsible for translating abstract bytecode into low-level machine code. JIT is the reason why Java is actually quite fast these days. Your code is compiled behind the scenes into heavily optimized CPU instructions. Unfortunately, this wonderful piece of software was buried deep in the Java VM. The JIT codebase in C++ turned out to be too complex to maintain anymore. So someone thought: what if we rewrite the JIT compiler in Java? Sounds crazy. But as a matter of fact, JIT is essentially a pure function that takes bytecode as input and returns machine code as output. Byte array in, byte array out. That's how GraalVM was born. Now you can plug a JIT compiler written in Java into a JVM. Suddenly the codebase became much more maintainable and developer friendly. GraalVM's JIT compiler quickly outperformed the legacy JIT compiler. Essentially it is now much easier to write optimized machine code generation. But it turned out this was just the beginning. Read more: Get the new episode straight to your mailbox:
October 19, 2020
#18: JIT - Just-in-time compilation
Source code can be executed in two ways. Language implementations in general either interpret or compile it. In order to run an interpreted program, you need one extra binary: an interpreter. Interpretation is simple: you read source code line by line and execute it. Compilation is much harder. A special program called a compiler reads your source code ahead of time (AOT) and translates it into machine code. After this translation your program is standalone. You don't need a compiler to run it. Only you and your CPU. Turns out this distinction is not that clear at all these days. Almost every language implementation performs compilation behind the scenes. And many languages that have a compiler produce code that needs an interpreter anyway. What? Read more: Get the new episode straight to your mailbox:
October 12, 2020
#17: BPM: Business process modeling
[...] All of this complexity is somewhat hidden by a BPM framework. First of all, the process is drawn. Using a special notation known as BPMN. This is actually quite natural. You use arrows to show how an insurance claim changes state over time - and why. Then the diagram is translated into fairly standard XML. Now those pesky developers need to fill in the gaps. I mean, writing code that does some logic. For example sending an SMS when a claim enters a certain state. Or transferring money when a transition happens from one state to another. [...] Read more: Get the new episode straight to your mailbox:
October 06, 2020
#16: Akka
Akka is a toolkit for building highly scalable, concurrent applications. It's written in Scala and based on the ideas from Erlang. Its approach to achieving concurrency is quite radical. Rather than mutexes, semaphores and shared memory, Akka uses the so-called _actor model_. An actor is a small, stateful object that doesn't expose traditional methods. Instead, actors exchange asynchronous messages with each other. There is no other way to interact with an actor. If you want an actor to do something or give you some information, message passing is the only way. Send a message, and the actor will receive it at some point in time, consume it and optionally send a response back. Read more: Get the new episode straight to your mailbox:
September 22, 2020
#15: Mutation testing
Imagine I wrote a script that takes your codebase and removes a random line. Fairly simple. Or maybe some more subtle change, like replacing plus with minus operator? Or switching `x` and `y` parameters with each other? OK, so now my script builds your project. Most of the time it will fail the compilation or test phase. But what if the build succeeds? Well, apparently your test suite is not covering some lines? OK, but what if my script only removes or alters lines covered by tests? How is it possible that the build still succeeds? Turns out your tests aren't as good as you think. And I just described mutation testing that discovers that. Read more: Get the new episode straight to your mailbox:
September 14, 2020
#14: Static, Dynamic, Strong and Weak Type Systems
When choosing or learning a new programming language, type system should be your first question. How strict is that language when types don't really match? Will there be a conservative, slow and annoying compiler? Or maybe a fast feedback loop, often resulting in crashes at runtime? And also, is the language runtime trusting you know what you are doing, even if you don't? Or maybe it's babysitting you, making it hard to write fast, low-level code? Believe it or not, I just described static, dynamic, weak and strong typing. Read more: Get the new episode straight to your mailbox:
August 31, 2020
#13: Cassandra
Cassandra is an open-source NoSQL database. It's heavily optimized for writes, but also has intriguing read capabilities. Cassandra has near-linear scalability. In terms of the CAP theorem it favours availability over consistency. Interestingly, despite the NoSQL label, Cassandra tables have a strict schema. Also, Cassandra Query Language is similar to SQL. Read more: Get the new episode straight to your mailbox:
August 18, 2020
#12: Continuous integration, delivery and deployment
Typically, more than one developer is working on the same codebase. How do they share their work? The simplest approach is a common Dropbox folder. This has several drawbacks. Mainly, we risk breaking others' work with our half-done features. So we came up with version control systems. Read more: Get new episode straight to your mailbox:
August 11, 2020
#11: MapReduce
MapReduce is a programming model for processing large amounts of data. It works best when you have a relatively simple program, but data is spread across thousands of servers. MapReduce was invented and popularized by Google. I'll talk about MapReduce in general and Hadoop in particular. Read more: Get new episode straight to your mailbox:
August 04, 2020
#10: HTTP protocol
HTTP protocol is fundamental to the Internet. It's a simple request-response protocol where the request is initiated by the client, typically a web browser. Read more: Get new episode straight to your mailbox:
July 27, 2020
#9: Retrying failures
I find it quite fascinating how many failures in complex systems could be avoided if we simply... tried again. So how do you retry effectively, so that your systems are much more fault-tolerant and less brittle? Read more: Get new episode straight to your mailbox:
July 21, 2020
#8: Kafka's design
Kafka is not a message broker. However, it can be used as such very effectively. Instead, I'd like to think about it as a very peculiar database. A database where inserts are insanely fast and sequential reads are preferred and very fast as well. In this episode I am focusing on the architecture and internals of Kafka. The best way to understand Kafka is by examining how it works. Read more: Get new episode straight to your mailbox:
July 14, 2020
#7: Speed of light
Speed of light is not as abstract to us, software engineers, as you might think. If you are deploying to the cloud or if you want to squeeze every bit of performance in your app, speed of light holds you back. Light travels at an unbelievable speed of three hundred thousand kilometers per second. That's more than 7 times around the globe in one second. Is this relevant in our industry?  Read more: Get new episode straight to your mailbox:
July 06, 2020
#6: Little's law
Little's law is an astounding equation that's dead simple, yet it can bring an amazing insight into what your distributed system is capable of. Read more: Newsletter: More resources: * Little's law: * John Little: * Node.js and CPU intensive requests: * My talk where I mention Little's law (from 23:03:
June 30, 2020
#5: asm.js and WebAssembly
Read more: Newsletter: More resources: * asm.js: * WebAssembly: * Compiling C/C++/Rust/... to asm.js via LLVM backend: * Quake in the browser (asm.js): * Unity Engine in the browser (WebAssembly):
June 16, 2020
#4: Serverless
Read more: Newsletter: 4th edition of the newsletter, apart from transcript, contains GraphQL scalability tricks, enjoy! More resources: * AWS Lambda: * Google Cloud Functions: * Azure Functions: * Spring Cloud Function:
June 08, 2020
#3: GraphQL
Read more: Newsletter: More resources: Official GraphQL website: Curated collection of resources: GitHub's API explorer using GraphQL: Facebook's API explorer using GraphQL: Visual GraphQL explorer for any API: A series of my blog posts about GraphQL in Java:,,
June 02, 2020
#2: Service Mesh
Notable implementations of service mesh: * * More details: * What's a service mesh? And why do I need one? ( * What's a service mesh? ( * InfoQ ( * Service Mesh Landscape ( * Service Mesh Comparison ( Read more at: Most episodes are originally much longer. If you wish to hear full, director's cut version, check out my mailing list: I will also notify you about new episodes and add some extra content like transcripts. Suggest your topics:
May 26, 2020
#1: Circuit Breaker
Circuit breaker is a design pattern that prevents cascading failures in distributed systems. More details: and Circuit breaker implementations: * (Java) * (.NET) * (Go) * (Scala/Akka) * (JavaScript) This episode was originally twice as long. If you wish to hear full, director's cut version, check out my mailing list: I will also notify you about new episodes and add some extra content. Suggest your topics:
May 12, 2020
#0: Meta
I explain software development in no more than 4 minutes and 16 seconds. Notifications of new episodes: User voice: suggest topics: Which programming languages count from 1: Background music from
April 27, 2020