
The GeekNarrator
By Kaivalya Apte
The GeekNarrator podcast is hosted by Kaivalya Apte, a software engineer who loves to talk about technology, technical interviews, self-improvement, best practices and hustle.
Connect with Kaivalya Apte www.linkedin.com/in/kaivalya-apte-2217221a
Tech blogs: kaivalya-apte.medium.com/
Wanna talk? Book a slot here: calendly.com/speakwithkv/hey
Enjoy the show and please follow to get more updates. Also please don’t forget to rate and review the show.
Cheers


How do vector (search) databases work? ft: turbopuffer
For memberships: join this channel as a member here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join
Summary:
In this conversation, Kaivalya Apte and Simon Eskildsen talk about vector databases, particularly focusing on TurboPuffer. They discuss the importance of vector search, embeddings, and the challenges associated with building efficient search engines. The conversation covers various aspects such as cost considerations, chunking strategies, multi-tenancy, and performance optimization. Simon shares insights on the future of vector search and the significance of observability and metrics in database performance. The discussion emphasizes the need for practical application and experimentation in understanding these technologies.
Chapters:
00:00 Introduction to Vector Databases
10:34 Understanding Vectors and Embeddings
15:03 Example: Designing a Search Engine for Podcasts
27:53 Scaling Challenges in Vector Search
36:46 Indexing and Querying in TurboPuffer
38:12 Understanding Indexing and Query Planning
45:45 Exploring Index Types and Their Performance
50:27 Data Ingestion and Embedding Retrieval
54:19 Use Cases and Challenges in Vector Search
01:01:22 Metrics and Observability in Vector Databases
01:03:52 Future Trends in Vector Search and Databases
References:
How to build a database on Object Storage? https://youtu.be/RFmajOeUKnE
Turbopuffer: https://turbopuffer.com/
Continuous Recall measurement: https://turbopuffer.com/blog/continuous-recall
Turbopuffer architecture: https://turbopuffer.com/architecture
Apr 07, 2025 | 01:08:59

Are your Data Pipelines Complex?
The GeekNarrator memberships can be joined here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join
Membership will get you access to member only videos, exclusive notes and monthly 1:1 with me.
Here you can see all the member only videos: https://www.youtube.com/playlist?list=UUMO_mGuY4g0mggeUGM6V1osdA
------------------------------------------------------------------------------------------------------------------------------------------------------------------
About this episode:
------------------------------------------------------------------------------------------------------------------------------------------------------------------
In this conversation, Jacopo and Ciro discuss their journey in building Bauplan, a platform designed to simplify data management and enhance developer experience. They explore the challenges faced in data bottlenecks, the integration of development and production environments, and the unique approach of Bauplan using serverless functions and Git-like versioning for data. The discussion also touches on scalability, handling large data workloads, and the critical aspects of reproducibility and compliance in data management.
Chapters:
00:00 Introduction
03:00 The Data Bottleneck: Challenges in Data Management
06:14 Bridging Development and Production: The Need for Integration
09:06 Serverless Functions and Git for Data
17:03 Developer Experience: Reducing Complexity in Data Management
19:45 The Role of Functions in Data Pipelines: A New Paradigm
23:40 Building Robust Data Solutions: Versioning and Parameters
30:13 Optimizing Data Processing: Bauplan Runtime
46:46 Understanding Control Planes and Data Management
48:51 Ensuring Robustness in Data Pipelines
52:38 Data Quality and Testing Mechanisms
54:43 Branching and Collaboration in Data Development
57:09 Scalability and Resource Management in Data Functions
01:01:13 Handling Large Data Workloads and Use Cases
01:09:05 Reproducibility and Compliance in Data Management
01:16:46 Future Directions in Data Engineering and Use Cases
Links and References:
Bauplan website: https://www.bauplanlabs.com
Apr 07, 2025 | 01:23:28

Can Math simplify incremental compute?
In this episode of The GeekNarrator podcast, Lalith Suresh, CEO of Feldera, joins us to share insights on incremental view maintenance and its significance in modern data processing.
We discuss the challenges posed by distributed systems, the mathematical foundation of DBSP, and how Feldera's architecture addresses these challenges. Performance optimization, handling late events, the future of stream processing, and the importance of SQL in creating efficient data workflows: it's all in here.
Chapters
00:00 Introduction to Incremental View Maintenance
06:30 Challenges in Distributed Systems
11:46 Batch Processing vs Stream Processing
16:27 Understanding DBSP: The Mathematical Foundation
27:46 Architecture of Feldera and Data Flow
39:23 Partitioning and Storage Layer in Feldera
42:51 Understanding Co-Design Storage Layers
45:52 Foreground and Background Workers in DBSP
49:16 Tuning Background Workers for Performance
49:41 Synchronous Compute Model and View Propagation
51:35 Zsets and Batch Processing in Stream Workloads
54:00 Data Model Optimization in Feldera
57:22 Handling Late Events and Lateness in Feldera
01:01:18 Watermarks and Lateness Annotations
01:04:20 Error Handling and Idempotency in Feldera
01:11:05 Feldera's Differentiators and Future Roadmap
Apr 06, 2025 | 01:17:14

Redpanda - High Performance Streaming Platform for Data Intensive Applications
The GeekNarrator memberships can be joined here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join
Membership will get you access to member only videos, exclusive notes and monthly 1:1 with me.
Here you can see all the member only videos: https://www.youtube.com/playlist?list=UUMO_mGuY4g0mggeUGM6V1osdA
------------------------------------------------------------------------------------------------------------------------------------------------------------------
About this episode:
------------------------------------------------------------------------------------------------------------------------------------------------------------------
In this conversation, Alex from Redpanda discusses his engineering background, the challenges faced in reliability engineering, and the journey of building a better streaming system. He emphasizes the importance of understanding latency and performance in engineering systems, the market position of Redpanda in relation to Kafka, and the complexities involved in optimizing codebases for better performance. Alex also discusses Redpanda's architecture, focusing on its thread architecture, memory allocation mechanics, and the importance of protocol correctness. He highlights how Redpanda stands out in the data systems landscape by eliminating unnecessary complexities and optimizing performance across various latency spectrums.
The discussion also touches on the future of data processing, emphasizing the shift towards agentic workloads and the integration of analytical and operational layers.
Chapters
00:00 Introduction
11:07 Building a Better Streaming System
19:10 Market Position and Competition
25:06 Optimizing Latency and Performance
32:38 Understanding Complexity in Codebases
33:36 Thread Architecture and Concurrency Models
39:39 Memory Allocation Mechanics
47:31 Protocol Correctness and Optimization Strategies
56:27 Redpanda's Unique Position in Data Systems
01:02:05 The Future of Data Processing and Agentic Workloads
Blogs:
TPC buffers: https://www.redpanda.com/blog/tpc-buffers
https://www.redpanda.com/blog/always-on-production-memory-profiling-seastar
https://www.redpanda.com/blog/end-to-end-data-pipelines-types-benefits-and-process
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Like building real stuff?
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Link to other playlists. LIKE, SHARE and SUBSCRIBE
------------------------------------------------------------------------------------------------------------------------------------------------------------------
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#streaming #kafka #redpanda #c++ #databasesystems #SQL #distributedsystems #memoryallocation #garbagecollection
Mar 14, 2025 | 01:05:29

Hosted PostgreSQL on bare metal and unikernels
The GeekNarrator memberships can be joined here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join
Membership will get you access to member only videos, exclusive notes and monthly 1:1 with me.
Here you can see all the member only videos: https://www.youtube.com/playlist?list=UUMO_mGuY4g0mggeUGM6V1osdA
------------------------------------------------------------------------------------------------------------------------------------------------------------------
About this episode:
------------------------------------------------------------------------------------------------------------------------------------------------------------------
In this episode, we talk to Søren Schmidt, Co-Founder and CEO of Prisma, about the evolution of Prisma from a backend as a service to a popular ORM and now to Prisma Postgres. He shares insights into the challenges faced during this journey, the importance of user feedback, and the innovative architecture of Prisma Postgres, which leverages micro VMs for performance optimization. The conversation also touches on the complexities of managing data centers and the strategies employed to ensure a seamless user experience. Søren also discusses the details of Postgres snapshots, their impact on performance, and the mechanisms for fault tolerance. He explains how Pulse change data capture works and how Prisma Postgres simplifies database management for users.
Chapters
00:00 Introduction to Prisma and Its Evolution
03:00 The Journey from ORM to Prisma Postgres
06:00 Simplifying Database Management
09:01 Understanding Prisma Postgres Architecture
12:12 The Role of Accelerate in Query Routing
14:51 Optimizing Query Processing with Micro VMs
18:12 Maintaining Postgres Integrity in a Micro VM Environment
21:07 User Experience and Community Feedback
23:57 Challenges of Data Center Management
27:09 Cold Starts and Performance Optimization
34:30 Understanding Snapshots in Postgres
38:55 Snapshot Mechanisms and Fault Tolerance
44:09 Change Data Capture with Pulse
55:07 Transitioning to Prisma Postgres
58:45 Community and Getting Started with Prisma Postgres
Some blogs worth checking out:
https://www.prisma.io/blog/prisma-postgres-the-future-of-serverless-databases
https://www.prisma.io/blog/cloudflare-unikernels-and-bare-metal-life-of-a-prisma-postgres-query
https://www.prisma.io/blog/announcing-prisma-postgres-early-access
Prisma Postgres relies heavily on the Unikraft project. There is a good introductory talk here: https://www.youtube.com/watch?v=n4wOyAuNhl0
And some very technical papers here: https://unikraft.org/community/papers
The best way to get started with Prisma Postgres is to go straight to https://www.prisma.io/
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Like building real stuff?
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
------------
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Mar 14, 2025 | 01:00:04

eBPF and continuous profiling with Frederic
The GeekNarrator memberships can be joined here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join
Membership will get you access to member only videos, exclusive notes and monthly 1:1 with me.
Here you can see all the member only videos: https://www.youtube.com/playlist?list=UUMO_mGuY4g0mggeUGM6V1osdA
------------------------------------------------------------------------------------------------------------------------------------------------------------------
About this episode:
------------------------------------------------------------------------------------------------------------------------------------------------------------------
In this episode, Kaivalya Apte and Frederic Branczyk talk about observability, focusing on continuous profiling and the role of eBPF. They discuss the evolution of profiling techniques, the importance of systematic data collection, and the challenges faced in maintaining low overhead while gathering detailed performance metrics.
Frederic shares insights from his extensive experience with Prometheus and Kubernetes, emphasizing the transformative impact of continuous profiling on software performance optimization. This conversation delves into the intricacies of eBPF (Extended Berkeley Packet Filter) and its applications in profiling and performance analysis. The discussion covers the capabilities of eBPF in extending the kernel safely, the mechanisms of user space profiling, and the handling of process terminations. It also explores memory and network profiling techniques, the challenges of profiling in different programming environments, and the limitations of eBPF in certain use cases.
The conversation concludes with valuable resources for those interested in learning more about eBPF and profiling techniques.
Chapters:
00:00 Introduction to Observability and Profiling
01:17 Frederic's Background and Expertise
02:11 The Importance of Continuous Profiling
06:46 The Value of Continuous Profiling
11:20 Understanding Profiling Data
19:09 Data Structures and Performance in Profiling
32:35 The Role of eBPF in Profiling
42:48 Introduction to eBPF and Its Capabilities
48:32 User Space Profiling and Memory Management
51:39 Handling Process Termination and Agent Recovery
55:27 Memory and Network Profiling Techniques
01:01:33 Profiling in Different Programming Environments
01:11:47 Use Cases and Limitations of eBPF in Profiling
01:13:54 Resources for Learning eBPF and Profiling Techniques
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Like building real stuff?
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Link to other playlists. LIKE, SHARE and SUBSCRIBE
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
Mar 14, 2025 | 01:17:46

Patterns of Distributed Systems with Unmesh Joshi
The GeekNarrator memberships can be joined here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join
Membership will get you access to member only videos, exclusive notes and monthly 1:1 with me.
Here you can see all the member only videos: https://www.youtube.com/playlist?list=UUMO_mGuY4g0mggeUGM6V1osdA
------------------------------------------------------------------------------------------------------------------------------------------------------------------
About this episode:
------------------------------------------------------------------------------------------------------------------------------------------------------------------
In this conversation, Unmesh Joshi discusses the patterns of distributed systems. He emphasizes the importance of understanding the context in which patterns are applied, the need to read code to grasp their implementation, and the common pitfalls that developers face when applying patterns without a clear understanding of the underlying problems.
Chapters
00:00 Introduction to Distributed Systems and Patterns
05:39 Understanding Patterns in Distributed Systems
19:23 Bridging Theory and Practice in Distributed Systems
28:56 The Role of Developers in Understanding Patterns
31:58 Understanding Patterns in Software Development
40:58 The Human Aspect of Software Design
44:37 Iterative Development and Real-World Applications
49:03 The Future of Patterns in Cloud-Native Systems
55:07 Common Misunderstandings of Distributed Patterns
Interesting quotes:
"Patterns capture wisdom of generations."
"Reading code is the best way to understand."
"Patterns help you see beyond abstractions."
"Understanding patterns helps bridge the gap."
"Expert generalists can operate across verticals."
"There are no simple systems in the cloud era."
"Patterns can add complexity if misunderstood."
"Patterns are always useful within a context."
"Design and development are human activities."
"The deconstruction of databases is happening."
"Paxos is the most misunderstood pattern."
Unmesh Joshi: https://in.linkedin.com/in/unmesh-joshi-9487635
Catalog of Patterns: https://martinfowler.com/articles/patterns-of-distributed-systems/
I hope you liked the episode. If you did, please like, share and subscribe.
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Like building real stuff?
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Link to other playlists. LIKE, SHARE and SUBSCRIBE
------------------------------------------------------------------------------------------------------------------------------------------------------------------
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#distributedsystems #patterns #softwarearchitecture #consensus #algorithms #coding #patterns #softwaredevelopment #ThoughtWorks #softwareengineering #cloud #computing #software
Feb 12, 2025 | 58:14

AWS Aurora Distributed SQL internals with Marc Brooker
The GeekNarrator memberships can be joined here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join
Membership will get you access to member only videos, exclusive notes and monthly 1:1 with me.
Here you can see all the member only videos: https://www.youtube.com/playlist?list=UUMO_mGuY4g0mggeUGM6V1osdA
------------------------------------------------------------------------------------------------------------------------------------------------------------------
About this episode:
------------------------------------------------------------------------------------------------------------------------------------------------------------------
In this episode of The GeekNarrator podcast, host Kaivalya Apte interviews Marc Brooker, a distinguished engineer at AWS, about Aurora DSQL. They discuss Marc's journey at AWS, the evolution of Aurora DSQL, and the customer-centric approach that led to its development.
Marc explains the choice of PostgreSQL as the foundation for DSQL, the architecture of the database, and the importance of snapshot isolation and concurrency control. The conversation goes into the technical aspects of DSQL, including the write process and how atomicity is maintained. It also goes deep into the intricacies of database design, focusing on fault tolerance, replication strategies, and the role of Firecracker VMs in enhancing scalability. Marc discusses the architecture of Aurora DSQL, emphasizing the importance of transaction management, the challenges of active-active deployments, and the trade-offs involved in database design. The discussion also highlights various use cases for Aurora DSQL, including its suitability for microservices and serverless architectures, while addressing scenarios where it may not be the best fit.
Chapters
00:00 Introduction to Aurora DSQL and Marc Brooker's Journey
03:38 The Evolution of Aurora DSQL at AWS
09:24 Customer-Centric Development and Technological Enablers
12:50 Why PostgreSQL? The Choice Behind DSQL
16:39 High-Level Architecture of DSQL
22:07 Understanding Snapshot Isolation and Concurrency Control
28:45 The Write Process and Atomicity in DSQL
38:50 Designing Fault Tolerance in Databases
47:38 Replication and Transaction Commit Strategies
54:35 Active-Active Deployment and Fault Tolerance
01:00:14 Role of Firecracker VM in Scalability
01:09:27 Use Cases and Trade-offs of Aurora DSQL
Marc's Blog: https://brooker.co.za/blog/
Marc on Aurora DSQL : https://brooker.co.za/blog/2024/12/03/aurora-dsql.html
AWS's documentation on Aurora DSQL : https://aws.amazon.com/rds/aurora/dsql/features/
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Like building real stuff?
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Link to other playlists. LIKE, SHARE and SUBSCRIBE
------------------------------------------------------------------------------------------------------------------------------------------------------------------
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#sql #postgres #databasesystems #aws #awsdevelopers #spanner #google #cockroachdb #yugabytedb #cap #scalability #WAL #DistributedSystems #Cloud #aurora
Jan 24, 2025 | 01:14:56

Power of #Duckdb with Postgres: pg_duckdb
The GeekNarrator memberships can be joined here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join
Membership will get you access to member only videos, exclusive notes and monthly 1:1 with me.
Here you can see all the member only videos: https://www.youtube.com/playlist?list=UUMO_mGuY4g0mggeUGM6V1osdA
------------------------------------------------------------------------------------------------------------------------------------------------------------------
About this episode:
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hey folks - In this episode we have Jelte with us, who is the main contributor to the pg_duckdb project, which is a postgres extension to add the #duckdb power to our beloved #postgresql.
We will try to understand how it works, why it is needed, and what the future of pg_duckdb looks like.
If you love #Postgres or #Duckdb or just understanding #database internals then this episode will give you pretty solid insights into Postgres query processing, Duckdb analytics, Postgres extension ecosystem and so on.
Basics:
pg_duckdb is a Postgres extension that embeds DuckDB's columnar-vectorized analytics engine and features into Postgres. We recommend using pg_duckdb to build high performance analytics and data-intensive applications.
Chapters:
00:00 Introduction to pg_duckdb
03:40 Understanding the Integration of DuckDB with Postgres
06:23 Architecture of pg_duckdb: Query Processing Explained
10:02 Configuring DuckDB for Analytics Queries
15:37 Managing Workloads: Transactional vs. Analytical
21:02 Observability and Debugging in DuckDB
25:58 Data Deletion and GDPR Compliance
30:46 Schema Management and Migration Challenges
33:14 Managing Schema Changes in Databases
35:21 Upgrading Database Extensions
36:33 Enhancing Data Reading Methods
38:33 Future Features and Improvements
45:54 Use Cases for pg_duckdb
50:03 Challenges in Building the Extension
55:25 Getting Involved with pg_duckdb
Important links:
The duckdb discord server, which has a pg_duckdb channel inside it: https://discord.duckdb.org/
repo: https://github.com/duckdb/pg_duckdb
good-first-issue issues: https://github.com/duckdb/pg_duckdb/issues?q=sort%3Aupdated-desc+is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Like building real stuff?
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Link to other playlists. LIKE, SHARE and SUBSCRIBE
------------------------------------------------------------------------------------------------------------------------------------------------------------------
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#sql #postgres #databasesystems
Jan 22, 2025 | 01:00:19

DBOS internals - Build reliable backends 10x faster
The GeekNarrator memberships can be joined here: https://www.youtube.com/channel/UC_mGuY4g0mggeUGM6V1osdA/join
Membership will get you access to member only videos, exclusive notes and monthly 1:1 with me.
Here you can see all the member only videos: https://www.youtube.com/playlist?list=UUMO_mGuY4g0mggeUGM6V1osdA
------------------------------------------------------------------------------------------------------------------------------------------------------------------
About this episode:
------------------------------------------------------------------------------------------------------------------------------------------------------------------
In this episode we are talking to Peter and Qian, co-founders of DBOS. The conversation covers the challenges of creating fault-tolerant applications, the architecture of DBOS, and how it addresses reliability at multiple layers.
Chapters:
00:00 Introduction to the GeekNarrator Podcast
00:29 Meet the Co-Founders of DBOS
01:25 The Core Problem: Building Reliable Systems
02:05 How DBOS Solves Reliability Issues
04:29 Understanding DBOS Architecture
06:09 Deep Dive into DBOS Library
08:36 Postgres and State Management
18:31 Handling Parallel Steps and Performance Concerns
26:00 Observability and Version Control
30:18 Running Multiple Code Versions
30:58 Managing Workflow Versions
32:03 Surgery on Workflow States
33:15 Library Annotations and Durable Execution
34:24 Migrating to the Cloud Version
37:23 Handling Email Workflows
42:41 Transactional Guarantees with Postgres
48:44 Technical Challenges and Multi-Tenancy
54:12 Real-World Use Cases and Benefits
59:45 Conclusion and Final Thoughts
Some important links:
- Main website: https://www.dbos.dev/
- DBOS docs: https://docs.dbos.dev/
- Open-source DBOS Transact libraries:
  - Python: https://github.com/dbos-inc/dbos-transact-py
  - TypeScript: https://github.com/dbos-inc/dbos-transact-ts
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Like building real stuff?
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
------------------------------------------------------------------------------------------------------------------------------------------------------------------
Link to other playlists. LIKE, SHARE and SUBSCRIBE
------------------------------------------------------------------------------------------------------------------------------------------------------------------
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
Jan 04, 2025
01:01:58

Database Trends and More with Peter Zaitsev
Deep Dive into Databases with Peter Zaitsev | The GeekNarrator Podcast
Join host Kaivalya Apte and special guest Peter Zaitsev from Percona on this episode of the GeekNarrator Podcast. They discuss Peter's fascinating journey into the world of databases, founding Percona, and the evolution of open source database solutions. Topics include the rise of PostgreSQL, the comparison between MySQL and PostgreSQL, database observability, the impact of cloud and Kubernetes on database management, licensing changes in popular databases like Redis, and career advice for database administrators and developers. Stay tuned for insights on the future of databases, observability strategies, and the role of AI in database management.
00:00 Introduction and Guest Welcome
00:14 Peter's Journey into Databases
04:15 The Rise of PostgreSQL vs MySQL
18:17 Challenges in Managing Database Clusters
24:36 Common Developer Mistakes with Databases
30:59 MongoDB's Success and Future
34:53 Redis and Licensing Changes
37:07 Elastic's License Change and Its Impact
38:25 Redis Fork and Industry Collaboration
40:27 Kubernetes and Cloud-Native Databases
47:47 Challenges in Database Upgrades and Migrations
54:58 Load Testing and Observability
01:09:02 Future of Database Administration and Development
01:15:13 Conclusion and Final Thoughts
Become a member of The GeekNarrator to get access to member only videos, notes and monthly 1:1 with me.
Like building stuff? Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
Jan 04, 2025
01:04:14

How would you design a database on Object Storage?
Join Kaivalya Apte and Simon Hørup Eskildsen from Turbopuffer as they talk about the complexities of building a database on top of object storage. Discover the key challenges, the nuances of various storage formats, and the critical trade-offs involved.
Learn from Simon's rich experience, from his time at Shopify to creating Turbopuffer. This episode covers everything—from approaches to write-ahead logs to multi-tenancy and object storage advancements. Perfect for database enthusiasts and those keen on first-principles thinking!
00:00 Introduction
00:17 Simon's Background and Journey to Turbopuffer
02:42 Challenges in Database Scalability
04:21 Experimenting with Vector Databases
05:02 Cost Implications of Vector Databases
05:52 Architectural Considerations for Search Workloads
07:39 Building a Database on Object Storage
16:14 Designing a Simple Database on Object Storage
26:01 Handling Multiple Writers and Consistency
31:26 Trade-offs in Write Operations
32:36 Optimizing MySQL Write Performance
34:03 Batching Writes in Object Storage
35:08 Time-Based vs Size-Based Batching
36:32 Understanding Amplification in Databases
42:26 Challenges with Cold Queries
44:02 Building and Persisting B-Trees
50:53 Separating Workloads in Databases
56:07 Multi-Tenancy Challenges
01:00:39 Choosing Storage Formats
01:06:10 Key Innovations in Object Storage Databases
Important links:
- https://github.com/sirupsen/napkin-math (numbers)
- https://turbopuffer.com/
- https://turbopuffer.com/architecture
- https://sirupsen.com/napkin/problem-10-mysql-transactions-per-second
- https://sirupsen.com (my blog, napkin math)
- https://sirupsen.com/subscribe (napkin math newsletter)
- https://github.com/rkyv/rkyv (rkyv, Rust)
Become a member of The GeekNarrator to get access to member only videos, notes and monthly 1:1 with me.
Like building stuff? Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
Dec 02, 2024
01:08:26

Practical Systems Learning & Verification with Jack Vanlightly
Welcome to The GeekNarrator podcast! In this episode, host Kaivalya Apte goes deeper into the practical applications of formal methods with Jack Vanlightly, a principal technologist at Confluent. With years of experience in distributed systems, Jack discusses his journey and how formal methods have been instrumental in system design verification and bug detection. The conversation covers Jack's background, his process of using formal methods, the significance of modelling, verification, documentation, and systems learning, as well as the future evolution of tooling and its applications. Tune in to understand the intricacies of how formal methods can transform your approach to distributed systems!
Chapters:
00:00 Introduction to the episode
00:37 Meet Jack Vanlightly: Principal Technologist at Confluent
02:17 Jack's Journey into Distributed Systems
04:29 Discovering the Power of Formal Methods
08:11 Modeling and Simulation in Formal Methods
13:43 Verification and Safety Properties
19:02 Documentation and Communication Challenges
20:43 Formal Methods as a Systems Learning Tool
24:26 Practical Applications and Case Studies
56:38 Future of Formal Verification and Closing Thoughts
Jack's Blog: https://jack-vanlightly.com/
Become a member of The GeekNarrator to get access to member only videos, notes and monthly 1:1 with me.
Like building stuff? Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
Dec 02, 2024
01:01:08

Database Internals - NileDB Postgres re-engineered for multitenant apps
Database Internals - NileDB: Postgres Re-engineered for Multitenant Apps with Gwen Shapira
Join us in this episode as we dive deep into the intricacies of NileDB, a groundbreaking database re-engineered for multi-tenant applications. Our special guest, Gwen Shapira, co-founder of NileDB and a notable figure in the database field, shares her insights and technical know-how on solving the common challenges faced by multitenant SaaS applications. From the benefits of using Postgres as the underlying database to the unique tenant isolation features of NileDB, we cover it all. Don't miss out on learning about AI native capabilities, handling schema migrations, and ensuring zero downtime data migrations.
Chapters:
00:00 Introduction
07:19 Challenges in Multi-Tenant Databases
11:09 Tenant Isolation and NileDB's Approach
34:16 Necessary Modifications for Tenant Data
37:47 Zero Downtime Data Migrations
44:32 Handling Schema Migrations
59:11 AI Use Cases and Vector Embedding Storage
59:51 Technical and Non-Technical Learnings from Building Nile
01:05:03 Future Plans and Upcoming Features
NileDB: https://www.thenile.dev/
Blog: https://www.thenile.dev/blog
Gwen's Linkedin: https://www.linkedin.com/in/gwenshapira
Gwen's Twitter: https://twitter.com/gwenshap
#postgres #sql #ai
Become a member of The GeekNarrator to get access to member only videos, notes and monthly 1:1 with me.
Like building stuff? Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
Nov 07, 2024
01:06:45

Building a continuous profiler with Frederic from Polar Signals
Building a Continuous Profiler with Frederic from Polar Signals | Geek Narrator Podcast
In this episode we chat with Frederic from Polar Signals. We dive deep into the intricacies of building a continuous profiler, the challenges faced, and the unique solutions developed by Polar Signals. Frederic shares insights from his background in observability and discusses the innovations in FrostDB, a custom columnar database designed for high-performance query and storage of profiling data.
Chapters:
00:00 Introduction
00:29 Frederic's Background
03:40 What is Continuous Profiling?
06:56 Challenges in Data Collection
18:22 Profiling Data Ingestion and Storage Architecture
27:23 Querying Data
28:52 High Cardinality Data and Cost Optimization
23:39 Tenant Isolation and Load Management
41:24 Performance Optimizations
46:02 Testing & Deterministic Simulation
50:33 Technical and Organizational Learnings
54:32 Future of Polar Signals
56:21 Conclusion
You can check more about Polar Signals here: https://www.polarsignals.com/
Become a member of The GeekNarrator to get access to member only videos, notes and monthly 1:1 with me.
Like building stuff? Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#distributedsystems #systemdesign
Oct 19, 2024
57:09

Database Internals - SlateDB with Chris Riccomini
Welcome back to another episode! Today, I have a special guest, Chris Riccomini, joining me to delve into the exciting world of databases. In this episode, we focus on SlateDB, a new and innovative database that's making waves in the tech community. We'll cover a wide range of topics, including the architecture of SlateDB, its internals, design decisions, and some fascinating use cases. Chris, a seasoned software engineer with a background at LinkedIn and WePay, shares his journey and the motivations behind creating SlateDB. 🎙️
Chapters:
00:00 Introduction to the Topic and Guest
01:58 Chris Riccomini's Background and Experience
04:19 The Genesis of SlateDB
04:54 Understanding SlateDB's Architecture
10:22 The Rise of Object Storage in Databases
13:43 Exploring SlateDB's Features and Trade-offs
32:54 Understanding Latency Trade-offs
34:12 Exploring Storage Formats and Manifest Files
37:25 Caching Strategies and Optimizations in SlateDB
50:21 Consistency Guarantees and Transactionality
52:36 Integration and Resource Management in SlateDB
56:04 Future Prospects and Use Cases for SlateDB
SlateDB: https://slatedb.io/
More about Chris: https://cnr.sh/
Like building stuff? Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#distributedsystems #systemdesign #formalmethods
Oct 11, 2024
01:01:45

System Design the formal way with FizzBee
In this video I talk to Jayaprabhakar Kadarkarai aka JP who is the founder of FizzBee. FizzBee is a design specification language and model checker to help developers verify their design before writing even a single line of implementation code.
We discuss where it applies, what the benefits are, how it works, and many other interesting challenges, with examples.
Chapters:
00:00 Introduction
01:13 Challenges in Designing Distributed Systems
03:13 Understanding Design Specification Languages
04:00 The Value of Structured Design Documents
09:00 When to Use Design Specification Languages
21:27 Modeling a Travel Booking System
22:51 Ensuring Atomicity in Distributed Systems
26:09 Handling Failures and Consistency
34:45 Refinement in System Design
35:38 Balancing Abstraction and Implementation
37:53 Common Pitfalls in Modeling and Implementation
40:02 Challenges in System Design and Implementation
40:12 Two-Way Feedback in System Design
41:01 Performance Considerations in Implementation
41:36 Importance of Solid Design Blueprints
41:56 Model-Based Testing and Continuous Integration
43:27 Updating Design Documentation
44:38 Simulation Testing vs. Model Checking
45:32 Design Issues and Formal Verification
49:51 Applying Formal Verification to Existing Systems
55:35 Common Design Problems and Solutions
01:07:57 Future Enhancements in Design Specification Tools
01:12:50 Getting Started with FizzBee
FizzBee : https://fizzbee.io/
Get in touch with JP: https://www.linkedin.com/in/jayaprabhakar
Like building stuff? Try out CodeCrafters and build amazing real world systems like Redis, Kafka, Sqlite. Use the link below to signup and get 40% off on paid subscription.
https://app.codecrafters.io/join?via=geeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#distributedsystems #systemdesign #formalmethods
Sep 22, 2024
01:16:23

Learnings from building Open Source Distributed Systems with Kishore Gopalakrishna
In this episode of The GeekNarrator podcast, hosted by Kaivalya Apte, we welcome a special guest, Kishore Gopalakrishna from StarTree, co-author of Apache Pinot and other notable projects. Kishore shares his extensive experience in building real-time analytics and streaming systems, including Apache Pinot, Espresso, Apache Helix, and ThirdEye. The episode delves into the motivations and challenges behind creating these systems, the innovations they brought to distributed systems, and the impact of community on open-source projects. Kishore also discusses the evolution of testing methodologies, cost optimizations in transactional and analytical systems, and key considerations for companies evaluating real-time analytics solutions.
Don't miss this in-depth conversation packed with valuable insights for both seasoned developers and tech enthusiasts!
Chapters:
00:00 Introduction
03:13 Building Distributed Systems at LinkedIn
08:57 Testing and Challenges in Distributed Systems
30:50 Advantages of Columnar Storage
33:04 The Importance of Upserts
34:24 Building a Strong Open Source Community
41:10 Challenges and Lessons in System Design
51:35 Real-Time Analytics: Do You Need It?
StarTree: https://startree.ai/
Apache Pinot: https://pinot.apache.org/
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#distributedsystems #kafka #s3 #streaming #realtimeanalytics #database #pinot #startree
Aug 27, 2024
01:00:24

WarpStream: A drop-in replacement for Kafka
In this episode of The GeekNarrator podcast, host Kaivalya Apte interviews Ryan and Richie, the founders of WarpStream. They discuss the architecture, benefits, and core functionalities of WarpStream, a drop-in replacement for Apache Kafka. The conversation covers their experience with Kafka, the design decisions behind WarpStream, and the operational challenges it addresses. They also delve into the seamless migration process, the scalability, and cost benefits, the integration with the Kafka ecosystem, and potential future features. This episode is a must-watch for developers and tech enthusiasts interested in modern, distributed data streaming solutions.
Chapters:
00:00 Introduction
02:27 Introducing WarpStream: A Kafka Replacement
11:07 Deep Dive into WarpStream's Architecture
35:42 Exploring Kafka's Ordering Guarantees
36:52 Handling Buffering and Compaction
38:44 Efficient Data Reading and File Caching
44:06 WarpStream's Flexibility and Cost Efficiency
01:06:59 Future Features
Links:
WarpStream : https://www.warpstream.com/
Blog: https://www.warpstream.com/blog
X:
Ryan: https://x.com/ryanworl
Richard Artoul: https://x.com/richardartoul
Kaivalya Apte: https://x.com/thegeeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#distributedsystems #kafka #s3 #streaming
Jul 19, 2024
01:12:14

XTDB - An Immutable SQL Database
Exploring XTDB with Jeremy Taylor & Malcolm Sparks: An In-Depth Dive into Immutability and Database Internals
In this episode of the Geek Narrator Podcast, host Kaivalya is joined by Jeremy Taylor and Malcolm Sparks from Juxt to explore XTDB, an immutable database designed to handle complex historical and financial data with precision. They delve into the architecture, internal mechanics, and use cases while discussing the importance of immutability.
This episode covers everything you need to know about XTDB and its capabilities. Whether you're a developer interested in databases or someone curious about data management and history tracking, this discussion offers invaluable insights.
Chapters:
00:00 Introduction
02:51 Challenges with General Purpose Databases
11:50 XTDB: A New Approach to Databases
31:56 Understanding Kafka and XTDB Integration
36:06 Querying and Indexing in XTDB
40:31 Temporal Data Management and Use Cases
54:52 Deployment and User Experience
XTDB: https://xtdb.com/
XTDB Github: https://github.com/xtdb/xtdb
Juxt: https://www.juxt.pro/
Juxt Github: https://github.com/juxt
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#sql #kafka #datastorage #immutable
Jul 19, 2024
01:06:55

Testing Distributed Systems the right way ft. Will Wilson
In this episode of The GeekNarrator podcast, host Kaivalya Apte dives into the complexities of testing distributed systems with Will Wilson from Antithesis. If you’re grappling with the challenges of testing databases, micro-services, and distributed systems, this episode is a must-watch. Will Wilson demystifies the concept of deterministic simulation testing, shares insights about its advantages over conventional testing methods, and explains how Antithesis helps developers ensure software reliability. Learn about the various strategies and techniques used to identify and resolve bugs, and explore how deterministic simulation can transform your software testing approach. Perfect for developers, engineers, and tech enthusiasts who are keen on improving their testing methodologies for complex systems.
Chapters:
00:00 Introduction
03:04 Limitations of Conventional Testing Methods
04:09 Understanding Deterministic Simulation Testing
08:07 Implementing Deterministic Simulation Testing
14:30 Real-World Example: Chat Application
19:56 Antithesis Hypervisor and Determinism
27:06 Defining Properties and Assertions
38:34 Optimizing Snapshot Efficiency
40:44 Understanding Isolation in CI/CD Pipelines
43:39 Strategies for Effective Bug Detection
47:59 Exploring Program State Trees
51:17 Heuristics and Fuzzing Techniques
01:01:56 Mocking Third-Party APIs
01:05:54 Handling Long-Running Tests
01:09:06 Classifying and Prioritizing Bugs
01:15:35 Future Plans and Closing Remarks
References:
Hypervisor: https://antithesis.com/blog/deterministic_hypervisor/
AFL : https://github.com/google/AFL
Antithesis website: https://antithesis.com/
Follow me on Linkedin and Twitter: https://www.linkedin.com/in/kaivalyaapte/ and https://twitter.com/thegeeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#distributedsystems #databases #microservices #antithesis #fuzzer #testing
Jul 19, 2024
01:17:31

Turso - SQLite for production
Exploring Turso with Glauber Costa: Insights on SQLite for Production
In this episode of The GeekNarrator podcast, host Kaivalya Apte interviews Glauber Costa, founder and CEO of TursoDB. They discuss the inception of TursoDB, Glauber's background in Linux kernel development, and the journey from unikernel projects to founding a database company. Glauber explains TursoDB's enhancements to SQLite for production use, including native replication, schema management, and vector search capabilities. The conversation dives deep into use cases, architecture, and the benefits of a multi-tenant database design. Learn about TursoDB’s future plans and essential insights for developers.
Chapters:
00:00 Introduction
05:05 The Birth of Turso
08:02 Challenges and Pivot to libSQL
17:12 SQLite for Production: Enhancements and Features
22:02 Replication and Backup Solutions
23:38 Enterprise-Level Features and Multi-Tenancy
25:55 User Experience and Simplicity of TursoDB
33:14 Handling Network Failures and Monitoring
36:35 Native Replication in SQLite
37:52 Virtualizing the Write-Ahead Log
39:20 Replication Mechanisms
41:31 Primary and Replica Dynamics
46:51 Multi-Tenancy and Scalability
53:33 Schema Changes and Migrations
58:51 Vector Search Capabilities
01:02:13 Future Roadmap and Features
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#distributedsystems #databases #sqlite #sql
Jul 19, 2024
01:04:54

Taking Postgres to the next level with Neon
Deep Dive into Serverless Databases with Neon: Featuring Heikki Linnakangas
In this episode of the Geek Narrator podcast, host Kaivalya Apte is joined by Heikki Linnakangas, co-founder of Neon, to explore the innovative world of serverless databases. They discuss Neon's unique approach to separating compute and storage, the benefits of serverless architecture for modern applications, and dive into various compelling use cases. They also cover Neon's architectural features like branching, auto-scaling, and auto-suspend, making it a powerful tool for both developers and enterprises. Whether you're curious about multi-tenancy, fault tolerance, or developer productivity, this episode offers insightful knowledge about leveraging Neon's capabilities for your next project.
00:00 Introduction
00:53 The Birth of Neon: Why It Was Created
02:16 Understanding Serverless Databases
07:06 Neon's Architecture: Separation of Compute and Storage
09:59 Exploring Branching in Neon
18:21 Auto Scaling and Handling Spikes in Traffic
20:17 The Challenge of Multiple Writers in Distributed Systems
22:51 Auto Suspend: Cost-Effective Database Management
26:02 Optimizing Cold Start Times
27:14 Balancing Cost and Performance
28:52 Replication and Durability
30:32 Understanding the Storage Layer
34:02 Custom LSM Tree Implementation
36:21 Fault Tolerance and Failover
07:00 Developer Productivity and Use Cases
42:56 Migration and Tooling
48:35 Future Roadmap and User Experience
50:28 Conclusion and Final Thoughts
Neon website: https://neon.tech/
Follow me on Linkedin and Twitter: https://www.linkedin.com/in/kaivalyaapte/ and https://twitter.com/thegeeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#PostgreSQL #SQL #RDBMS #NEON
Jul 19, 2024
50:47

Scaling Derived Data for Planet-Scale Applications at LinkedIn
In this video I speak with Felix GV, a Principal Staff Engineer at LinkedIn who has made major contributions to the data infrastructure at LinkedIn, including VeniceDB.
This episode will give you a good understanding of why we need a new database for storing "Derived Data" in a low latency, high performance manner, which is very important for Machine Learning workloads.
Chapters:
00:00 Introduction
01:42 The Evolution of LinkedIn's Databases
03:15 Challenges with Voldemort and the Birth of VeniceDB
08:42 Understanding Derived Data
13:33 Planet-Scale Applications and Multi-Region Support
17:40 Writing Data into VeniceDB
22:53 Merging Data in VeniceDB
40:31 Understanding the Architecture
40:47 Components of the Write Path
41:56 Leader and Follower Architecture
43:58 Partitioning and DaVinci Client
47:57 Read Patterns and Client Options
54:25 Fault Tolerance and Recommender Systems
01:01:19 Kafka Integration and Deployment
01:06:56 Roadmap and Future Improvements
Important links:
VeniceDB blog: https://www.linkedin.com/blog/engineering/open-source/open-sourcing-venice-linkedin-s-derived-data-platform
VeniceDB docs: https://venicedb.org/
Qcon: https://youtu.be/pJeg4V3JgYo?si=vblGUxp5fNdKPHoC
#kafka #linkedin #venicedb #Rocksdb
Jun 05, 2024 · 01:12:42

SuperCharging PostgreSQL for Search and Analytics - ParadeDB (Philippe Noël)
In this video I speak with Philippe Noël about ParadeDB, an Elasticsearch alternative built on Postgres that modernizes the features of Elasticsearch's product suite, starting with real-time search and analytics.
I hope you enjoy the episode and learn about the product.
Chapters:
00:00 Introduction
01:12 Challenges with Elasticsearch and the Need for ParadeDB
02:29 Why Postgres?
06:30 Technical Details of ParadeDB's Search Functionality
18:25 Analytics Capabilities of ParadeDB
24:00 Understanding ParadeDB Queries and Transactions
24:22 Application Logic and Data Workflows
25:14 Using PG Cron for Data Migration
30:05 Scaling Reads and Writes in Postgres
31:53 High Availability and Distributed Systems
34:31 Isolation of Workloads
39:38 Database Upgrades and Migrations
41:21 Using ParadeDB Extensions and Distributions
43:02 Observability and Monitoring
44:42 Upcoming Features and Roadmap
46:34 Final Thoughts
Important links:
GitHub: https://github.com/paradedb/paradedb
Website: https://paradedb.com
Docs: https://docs.paradedb.com/
Blog: https://blog.paradedb.com
#postgresql #datafusion #parquet #sql #OLAP #apachearrow #database #systemdesign #elasticsearch
Jun 05, 2024 · 46:59

Modern OLAP Database System Design with FDAP (Andrew Lamb)
In this video I speak with Andrew Lamb, Staff Software Engineer @InfluxDB. We discuss the FDAP (Flight, DataFusion, Arrow, Parquet) stack for modern OLAP database system design. Andrew shares insights into why the FDAP stack is so powerful for designing and implementing a modern OLAP database.
Chapters:
00:00 Introduction
01:48 Understanding Analytics: Transactional vs Analytical Databases
04:41 The Genesis and Goals of the FDAP Stack
09:31 Decoding FDAP: Flight, DataFusion, Arrow, and Parquet
12:40 Apache Parquet: Revolutionizing Columnar Storage
17:18 Apache Arrow: The In-Memory Game Changer
23:51 Interoperability and Migration with Apache Arrow
27:10 Comparing Apache Parquet and Arrow
28:26 Exploring Data Mutability in Analytic Systems
29:19 Handling Data Updates and Deletions
29:24 The Role of Immutable Storage in Analytics
30:42 Optimizing Data Storage and Mutation Strategies
34:20 Introducing Flight: Simplifying Data Transfer
35:02 Deep Dive into Flight's Benefits and SQL Support
39:20 Unpacking DataFusion's SQL Support and Extensibility
46:12 The Interplay of FDAP Components in Analytics
51:49 Future Directions and Innovations in Data Analytics
56:04 Concluding Thoughts on FDAP and Its Impact
FDAP Stack: https://www.influxdata.com/glossary/fdap-stack/
FDAP Blog: https://www.influxdata.com/blog/flight-datafusion-arrow-parquet-fdap-architecture-influxdb/
InfluxDB: https://www.influxdata.com/
#datafusion #parquet #sql #OLAP #apachearrow #database #systemdesign
Jun 05, 2024 · 56:49

The ultimate multi-model Database, SurrealDB with Pratim Bhosale
In this video, Pratim Bhosale, Developer Advocate at SurrealDB, and I talk about SurrealDB, a multi-model database that aims to make developers' lives easier by letting them focus mainly on the business logic and not on the database choice. The following chapters will help you understand what a multi-model database is and how SurrealDB shines.
Chapters:
00:00 Introduction
01:48 The Genesis of SurrealDB
03:59 SurrealDB's Mission and Use Cases
07:34 Understanding Multi-Model Databases
10:30 Deep Dive into SurrealDB's Architecture
33:09 Deployment and Getting Started with SurrealDB
34:31 Future Developments and Use Case Considerations
43:51 Final Thoughts and How to Get Started
Important links:
Install SurrealDB
https://sdb.li/4bqwn38
SurrealDB Docs:
https://sdb.li/3wxjoxx
SurrealDB Website:
https://sdb.li/3JMK7JI
Surrealist:
https://sdb.li/4b7wcdh
SurrealDB GitHub:
https://sdb.li/3JRPNlE
#surrealdb #elasticsearch #search #vectorsearch #acid #databases #sql #joins #indexes #graphdatabase
Jun 05, 2024 · 46:09

Demystifying Real-time Analytics, Search and Hybrid Search with Dhruba, CTO @Rockset
In this video, I talk to Dhruba, CTO @Rockset, about search and realtime analytics. We discuss the deep internals of Rockset, its architecture, and why it is a great fit for search and realtime analytics use cases.
Chapters:
00:00 Introduction
02:45 The Evolution of Data Systems: From Hadoop to Rockset
07:30 Understanding Rockset: Real-Time Analytics and Search Defined
12:01 The Technical Edge: Rockset vs. Elasticsearch
18:16 Deep Dive into Rockset's Architecture and Internals
28:21 Partitioning, Hashing, and Data Distribution in Rockset
36:56 Exploring Hot Storage and Cache Layers
37:40 Why Hot Storage is Essential for Low Latency
39:05 Optimizing Data Storage with Compression and Delta Encoding
39:49 Balancing Cost and Performance in Data Storage
41:50 The Power of Converged Indexing in Rockset
45:50 Efficient Query Execution and Index Management
54:51 Leveraging Mutability for Real-Time Analytics
59:24 Deep Dive into Query Processing and Optimization
01:04:21 Understanding Joins and Reporting Queries in Rockset
01:12:23 Future Directions and Vector Search Innovations
Index Conference: https://rockset.com/index-conf/
#rockset #elasticsearch #search #vectorsearch #realtime #databases #sql #joins #indexes
May 17, 2024 · 01:14:53

Rapidly Simulate Production Traffic ft. Michael Drogalis
In this episode we explore how to Rapidly Simulate Production Traffic with Michael Drogalis, using his creation ShadowTraffic. I am sure you will be able to relate to all the different problems mentioned in this episode and like how ShadowTraffic aims to solve those problems.
I hope you like this conversation.
Chapters:
00:00 Welcome to The Geek Narrator Podcast: Exploring Deep Tech
00:18 The Challenge of Simulating Production Traffic
00:59 Introducing Shadow Traffic: A Solution to Data Simulation
02:34 Understanding the Problem Space of Data Simulation
06:03 How Shadow Traffic Works: A Deep Dive
08:17 The Power of Declarative Data Generation with Shadow Traffic
10:40 Shadow Traffic's Architecture and Deployment
13:02 Configuring Load Testing and Throttling with Shadow Traffic
15:47 Testing and Validation in Shadow Traffic
20:42 Mimicking Production Data Distribution with Shadow Traffic
26:48 Innovative Features for Stream Processing Testing
28:47 Shadow Traffic: Adding Faults to Data for Robust Testing
29:04 Antithesis and Shadow Traffic: A Synergistic Approach
32:46 The Challenge of Generating Realistic Test Data
40:04 Enhancing Observability in Data Generation
41:50 Customer-Driven Roadmap and Future Vision
45:27 Closing Thoughts
ShadowTraffic: https://shadowtraffic.io/
Contact Michael: https://shadowtraffic.io/contact.html
#kafka #s3 #postgres #testing #streamprocessing #loadtesting #chaostesting #demo
May 17, 2024 · 47:02

High Performance with GraalVM - Alina Yurenko
If you're involved in the Java space, chances are you've come across #GraalVM. And for those active in the tech community, you might have heard about the recent 1BRC challenge initiated by Gunnar Morling.
GraalVM truly showcased its capabilities in this challenge, sparking my curiosity. That's why I reached out to Alina to delve deeper into GraalVM, exploring its features and uncovering how it excels in such endeavors. And here we are, talking about GraalVM.
Chapters:
00:00 Introduction
01:47 GraalVM's Impact on the 1BRC Challenge and Its Features
04:34 Exploring GraalVM's Core Features and Benefits
08:34 Real-World Success Stories: GraalVM in Action
16:18 Understanding Native Image Compilation with GraalVM
20:34 Framework Compatibility and GraalVM Integration
25:04 Testing and Integration with GraalVM
25:26 Exploring Testing and Development with GraalVM
25:58 Best Practices for Developing with GraalVM
28:11 Migrating to GraalVM: Strategies and Considerations
31:25 Performance Optimization in GraalVM
35:15 Building and Resource Considerations for GraalVM
38:45 Expanding Horizons: Polyglot Programming with GraalVM
43:15 Future Directions and Limitations of GraalVM
47:40 Engaging the Java Community: GraalVM's Impact
50:21 Getting Started with GraalVM: Resources and Recommendations
References and Links:
- The GraalVM website with docs, downloads, guides: https://www.graalvm.org/
- Nicolai Parlog's "Modern Java in Action" demo: https://github.com/nipafx/modern-java-demo
- My native version of Nicolai's demo: https://github.com/alina-yur/native-modern-java-demo
- For news, follow GraalVM: https://twitter.com/graalvm
#Java #jvm #graalvm #highperformance #JITcompiler #AOT #nativeimage #security #rust #c++
May 17, 2024 · 52:18

Taming TimeSeries Data with QuestDB - Javier Ramirez
In this episode I talk to Javier Ramirez from QuestDB about everything QuestDB. This episode is a great resource for understanding how QuestDB works, its architecture, what it is optimised for, and what's upcoming on the roadmap.
If you have timeseries data and need a simple yet highly scalable solution, #QuestDB is a great option.
Chapters:
00:00 Introduction
03:04 Understanding QuestDB: Origins and Use Cases
09:21 Deep Dive into QuestDB's Architecture and Data Ingestion
19:07 Optimizing Data Reads and Writes in QuestDB
28:40 Exploring Data Granularity and Partitioning in QuestDB
29:29 Optimizing Query Performance with Partition Strategies
30:26 Handling Data Ingestion and Query Efficiency
32:58 In-depth Look at Data Duplication and Ingestion Performance
34:55 Understanding Compression and Its Impact on Performance
38:51 Replication and Data Distribution Strategies
47:10 Observability and Metrics in QuestDB
50:57 Future Developments and Enhancements in QuestDB
58:45 Closing Remarks
Links:
QuestDB: https://questdb.io/
Github: https://github.com/questdb/questdb
===============================================================================
For discount on the below courses:
Appsync: https://appsyncmasterclass.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Testing serverless: https://testserverlessapps.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Production-Ready Serverless: https://productionreadyserverless.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Use the button, Add Discount and enter "geeknarrator" discount code to get 20% discount.
===============================================================================
#questdb #sql #timeseries #timeseriesanalysis #databases #highscale #scaleup #performance #parquet #S3 #replication #writeaheadlog #wal #durability #columnstore
Apr 18, 2024 · 59:06

Beat the CAP Theorem : Make Distributed consistency simple
In this episode I talk to Andras Gerlits, who founded omniledger.io. Andras has a very interesting view on how distributed consistency should work, one that can remove several bottlenecks in maintaining it.
He argues that getting rid of a global wall clock and using causality to approach distributed consistency helps you build resilient, simple and performant systems. We go deeper into how that can be achieved and how the product works.
Chapters:
00:00 Introduction
00:52 Andras's Journey into Distributed Consistency
03:04 The Evolution of Data Consistency in Banking and Beyond
08:04 Introducing Client-Centric Consistency
10:36 Exploring the Standard Model of Distributed Consistency
16:01 Redefining Strong Consistency with a Relativistic Approach
34:25 Practical Implications of Client-Centric Consistency in Banking
36:20 Mitigating Latencies and Partitions in Distributed Systems
41:08 Exploring System Reliability and Availability
41:52 Tuning System Properties for Specific Use Cases
43:07 Comparing Standard and New Models for Data Management
45:08 Understanding Local Progress and Mutex-Free Updates
47:23 Deep Dive into Token-Based Ordering and Global Calibration
58:30 Introducing OmniLedger: A New Approach to Distributed Consistency
01:02:41 Performance Optimizations and Tunable Consistency
01:08:20 Ideal Use Cases and Potential Limitations of OmniLedger
01:14:30 Future Directions and Closing Thoughts
Links:
Our website:
https://omniledger.io
A long-form essay on the thinking behind our model:
https://medium.com/p/5e397cb12e63
A demo of transactionality:
https://www.youtube.com/watch?v=XJSSjY4szZE
I think my blog in general might be interesting to some:
https://medium.com/@andrasgerlits
The science-paper with all its mathematical rigour:
https://www.researchgate.net/publication/359578461_Continuous_Integration_of_Data_Histories_into_Consistent_Namespaces
#databases #sql #consistency #distributedsystems
Apr 09, 2024 · 01:16:52

A Graph Database That You Can Embed - KuzuDB
In this video I talk to Semih Salihoglu about KuzuDB: a highly scalable, extremely fast, easy-to-use embeddable graph database.
Chapters:
00:00 Introduction
00:40 The Genesis of KuzuDB: From Academic Research to Startup
06:40 Graph Databases 101: Understanding the Basics and Beyond
10:24 When to Opt for a Graph Database: Use Cases and Advantages
19:16 KuzuDB vs. Traditional Databases: A Comparative Analysis
24:39 Inside KuzuDB: Optimizations and Data Ingestion Explained
31:08 Exploring Query Optimizations in Graph Databases
31:34 The Relational Nature of Graph Databases
33:33 Factorization: A Key Optimization Technique
38:50 Integrating New Data Sources and Handling Joins
43:39 Optimizing Write Operations and Index Management
50:23 Comparing Kuzu with Other Graph Databases
58:50 Future Developments and Vision for Kuzu
Important links:
- History of DBMSs and IDS, the first database in history, which had a graph-based model: https://dl.acm.org/doi/abs/10.1145/1147376.1147382 (a good paper by a CS historian and a must-read for everyone interested in the birth of databases as a field)
- Blog on what every GDBMS should do and the vision of Kùzu: https://blog.kuzudb.com/post/what-every-gdbms-should-do-and-vision/
- The user survey paper that got Semih into GDBMSs: https://arxiv.org/pdf/1709.03188.pdf
- Blog on factorization: https://blog.kuzudb.com/post/factorization/
- Kùzu's RDFGraphs feature: https://docs.kuzudb.com/rdf-graphs/
Mar 27, 2024 · 01:01:37

Restate - making distributed systems simple with Stephan Ewen
In this video, I talk to Stephan Ewen from Restate, who is widely known from the world of Apache Flink. We talk about the problems in the world of distributed systems and the complex solutions developers have to deal with. These solutions make the architecture so complex that it eventually creates reliability, observability and delivery-velocity problems. Restate aims to solve this by making resilience and durability for your services, functions and RPC a lot simpler.
Chapters:
00:00 Introduction
00:45 Introducing Restate: A Solution for Distributed System Challenges
01:22 Deep Dive into Restate with Stephan: From Apache Flink to Building Resilient Systems
06:04 The Complexities of Distributed Systems and How Restate Addresses Them
15:49 The Vision of Restate: Simplifying Developer Experience in Distributed Systems
24:42 Integrating Restate into Your Architecture: A User's Perspective
33:16 Exploring Restate: The Durable Service Mesh
33:32 The Power of Restate in Handling Transactions
34:26 Restate's Role in Service Communication and Durability
35:40 Deep Dive into Restate's Mechanisms and Benefits
38:04 Practical Example: Email Pipeline with Restate
39:40 Understanding Restate's Log and Event Handling
58:43 Restate's Unique Features and Programming Model
01:04:22 Final Thoughts on Restate's Impact and Deployment
Restate: https://restate.dev/
#distributedsystems #faulttolerance #reliability #resilience
Mar 22, 2024 · 01:05:34

Volt Active Data: Low Latency Stream processing
In this episode of The GeekNarrator podcast, our host Kaivalya talks to Seeta Somagani from Volt Active Data, a low latency stream processing platform. They discuss fascinating topics about what low latency stream processing means, the different guarantees that Volt Active Data provides, and the various problems it can solve. They delve into the evolution of VoltDB to Volt Active Data, real-time data processing use cases, the high-level architecture, and how the platform effectively addresses high-concurrency challenges. This is a must-listen for anyone interested in understanding latency critical applications, data processing, and high performance computing.
Chapters:
00:00 Welcome to The GeekNarrator Podcast with Special Guest from Volt Active Data
00:41 Introduction
01:45 The Evolution of VoltDB to Volt Active Data
06:13 Exploring Real-Time Data Processing and Use Cases
08:25 Addressing High-Concurrency Challenges in Various Industries
12:57 High-Level Architecture of Volt Active Data
19:26 Understanding Stored Procedures and Data Processing in Volt
22:48 Practical Application: Tracking Data Usage with Volt Active Data
25:16 Diving into Replicated and Partitioned Tables
25:44 Exploring Event Processing and Exporting
26:57 Understanding Stored Procedures and Performance
29:03 Partitioning Strategies and Recommendations
31:39 Ensuring Determinism in Stored Procedures
35:02 Handling Complex Requirements with Compound Procedures
37:25 Fault Tolerance and Data Replication Strategies
40:44 Exploring Use Cases for VoltActiveData
43:30 The Future of Streaming and VoltActiveData's Role
47:05 Closing Remarks and How to Learn More
Volt Active Data: https://www.voltactivedata.com/use-cases/activesd-streaming-data/
#sql #streamprocessing #java #acid
Mar 08, 2024 · 48:09

TigerBeetle: World’s Fastest Financial Transactions Database
In an enlightening episode of the GeekNarrator Podcast, host Kaivalya Apte and TigerBeetle's CEO, Joran, delve deep into the world of online transaction processing (OLTP). They discuss the origin, unique architecture, and innovative methodologies behind TigerBeetle, a database tailored to efficiently handle high-volume transaction systems. The podcast explores the system's key features such as efficient scalability, performance-oriented design, and optimized memory usage, demonstrating its robustness in handling business transactions and accounting. It also elucidates TigerBeetle’s adaptability to various domains beyond finance, like energy management and gaming, while highlighting the rigorous testing it undergoes for impeccable quality assurance.
Chapters:
00:00 Introduction
01:19 Joran's Journey into Databases
03:59 Understanding Financial Transaction Databases
07:41 The Evolution of OLTP and OLAP
16:13 The Need for a New Database: TigerBeetle
16:53 Performance and Safety Features of TigerBeetle
28:49 The Importance of Safety in Financial Transactions
36:49 Changing Developer Experience with TigerBeetle
41:43 Understanding the CPU and Memory Bandwidth
42:12 The Importance of Data Format Language
43:27 The Concept of Serialization and its Impact
46:23 The Architecture of TigerBeetle
46:29 The Role of Replicated State Machine
48:18 The Importance of Consensus in Replication
50:20 The Structure of TigerBeetle
50:37 The Importance of Log in Systems
50:51 Understanding the State in Replicated State Machine
52:55 The Role of LSM in TigerBeetle
53:55 The Impact of Compaction Process on Performance
57:06 The Importance of Predictability in Software
01:06:15 The Read and Write Path in TigerBeetle
01:14:46 Potential Use Cases for TigerBeetle
01:17:09 Understanding the Limitations of TigerBeetle
===============================================================================
For discount on the below courses:
Appsync: https://appsyncmasterclass.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Testing serverless: https://testserverlessapps.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Production-Ready Serverless: https://productionreadyserverless.com/?affiliateId=41c07a65-24c8-4499-af3c-b853a3495003
Use the button, Add Discount and enter "geeknarrator" discount code to get 20% discount.
===============================================================================
Follow me on Linkedin and Twitter: https://www.linkedin.com/in/kaivalyaapte/ and https://twitter.com/thegeeknarrator
If you like this episode, please hit the like button and share it with your network.
Also please subscribe if you haven't yet.
Database internals series: https://youtu.be/yV_Zp0Mi3xs
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Stay Curious! Keep Learning!
#tigerbeetledb #databases #acid #olap #oltp #postgres #mysql
Feb 23, 2024 · 01:19:39

Clean Code Adventures with Uncle Bob
In this episode, we dive deep into the world of clean code with a pioneer of the field, Uncle Bob. We explore the nuances and the art of writing effective and efficient code. The conversation covers the nitty-gritty of writing and editing code, from breaking down large functions, to principles such as the Single Responsibility Principle and the Dependency Inversion Principle, to balancing the DRY (Don't Repeat Yourself) principle. Uncle Bob also shares insights on testing, error handling, naming conventions, and dealing with different kinds of duplication in code, and recommends resources and books that every programmer should read.
Chapters:
00:00 Introduction and Welcome
00:06 The Importance of Code Quality
00:29 Introducing Robert Martin (Uncle Bob)
01:39 Uncle Bob's Journey in Programming
02:34 Discussion on Functional Design and New Book
03:52 The Evolution of Software Development
04:28 Revisiting the Clean Code Book
04:49 The Impact of Hardware Changes on Software
06:13 The Evolution of Programming Languages
07:33 The Importance of Code Structure and Organization
09:07 The Impact of Microservices and Open Source
11:14 The Role of Modular Programming
22:07 The Importance of Naming in Code
26:31 The Role of Functions in Code
34:12 The Role of Switch Statements in Code
42:36 The Importance of Immutability
51:00 Dealing with Complex Steps in Programming
51:21 Implementing State Machines in Programming
51:46 The Pragmatic Approach to Programming
53:01 Understanding Error Handling in Programming
54:08 The Challenge of Exception Handling
57:27 The Importance of Log Messages in Debugging
01:03:05 The Dilemma of Code Duplication
01:05:51 The Intricacies of Error Handling
01:07:40 The Role of Abstraction in Programming
01:13:55 The Importance of Testing in Programming
01:19:43 The Challenges of Mocking in Testing
01:25:11 The Essence of Programming: Discipline, Ethics, and Standards
Book Recommendations:
Tidy First: https://www.oreilly.com/library/view/tidy-first/9781098151232/
Design Patterns: https://www.amazon.de/-/en/Erich-Gamma/dp/0201633612
Analysis Patterns: https://martinfowler.com/books/ap.html
Structured Analysis and System Specification: https://www.amazon.de/-/en/Tom-Demarco/dp/0138543801
Fundamental Algorithms: https://www.amazon.com/Art-Computer-Programming-Vol-Fundamental/dp/0201896834
Sorting and Searching: https://www.amazon.de/-/en/Donald-Knuth/dp/0201896850
Structure and Interpretation of Computer Programs: https://web.mit.edu/6.001/6.037/sicp.pdf
Feb 17, 2024 · 01:34:01

Durable async/await with Dominik Tornow
In this episode of The GeekNarrator podcast, Kaivalya Apte talks with Dominik Tornow, the founder and CEO of Resonate. They explore Durable Async-Await, an interesting concept in the distributed systems world, along with other nuances of distributed programming. Dominik also talks about how Resonate was built to simplify distributed systems, with a focus on observability and usability, and about its future direction. The conversation concludes with a discussion of different concurrency models and the future of distributed systems.
Chapters:
00:00 Introduction and Guest Background
02:44 Understanding Async Await
10:25 Challenges with Current Async Await Model
12:53 Introducing Resonate: A Solution for Distributed Async Await
13:34 Practical Application: E-commerce Example
24:57 Understanding the Role of the Platform in Distributed Systems
30:12 Dealing with Partial Failures in Distributed Systems
39:44 Getting Started with Resonate
40:40 Introduction to Resonate and its Simplicity
41:09 Getting Started with Resonate: Installation and Setup
42:22 Understanding the Durability Aspect of Resonate
42:49 Exploring the Resonate Durable Promise Server
44:10 Scaling Up: Introducing Workers into the System
48:35 The Importance of Open Standards in Resonate
50:17 Exploring the Integration Capabilities of Durable Promises
01:04:31 Understanding the Role of Timeouts in Durable Promises
01:07:29 The Future of Resonate: Challenges and Upcoming Features
01:13:04 Understanding the Limitations of Durable Promises
01:14:51 Wrapping Up: Final Thoughts on Resonate and Durable Promises
References:
A note on Distributed Systems: https://scholar.harvard.edu/files/waldo/files/waldo-94.pdf
Thinking in Distributed Systems: https://dtornow.gumroad.com/l/distributed-systems
McCarthy's paper: https://www-formal.stanford.edu/jmc/recursive/recursive.html
Feb 11, 2024 · 01:15:48

Observability Engineering with Liz Fong-Jones
Join host Kaivalya Apte in this episode of The GeekNarrator podcast as he discusses observability engineering with Liz Fong-Jones, Field CTO at Honeycomb. They delve into the importance of observability for software engineers, the role of Honeycomb in popularizing the concept, and how observability has evolved over the years. Liz shares her experience of moving from being an SRE at Google to advocating for observability at Honeycomb, and her journey from developer advocate to Field CTO. They discuss the definitions of and misconceptions about observability, and explain Service-Level Objectives (SLOs), Service-Level Indicators (SLIs), and the challenges they solve. Tune in for an informative and in-depth conversation on observability engineering.
Chapters:
00:00 Introduction
00:08 Understanding Observability Engineering
00:37 Guest Introduction: Liz Fong-Jones
00:53 Liz's Journey to Field CTO at Honeycomb
27:38 Understanding Site Reliability Workbook Materials
27:57 Identifying Critical User Journeys
29:49 Different Types of Services and Their SLOs
33:05 Setting Up SLOs: Granularity and Number
42:42 Understanding Service Level Indicators (SLIs)
50:26 Common Mistakes in Setting Up SLOs
52:09 Cultivating an Observability-Driven Development Culture
References:
Observability Engineering: https://www.oreilly.com/library/view/observability-engineering/9781492076438/
Google SRE book: https://sre.google/books/
Feb 03, 2024 · 54:38

Messaging and Streaming with Apache Pulsar - with Matteo Merli
In this video I talk about Apache Pulsar with Matteo Merli, CTO at StreamNative. This episode gives good insight into how Apache Pulsar works and, more importantly, how it differs from Apache Kafka, the most popular pub/sub and streaming platform. For example: what makes 1 million topics possible? Why is rebalancing not required? How does the decoupled storage and compute architecture work? How does it use the concept of subscriptions to avoid retaining data unnecessarily?
And much more...
Chapters:
00:00 Introduction and Guest Introduction
00:08 Understanding Apache Pulsar and its Origin
01:22 The Problem Apache Pulsar was Designed to Solve
02:35 The Evolution of Apache Pulsar
05:15 Understanding Basic Concepts of Apache Pulsar
09:27 Deep Dive into Apache Pulsar's Architecture
21:16 Understanding the Flow of Data in Apache Pulsar
28:54 Understanding Subscriptions in Apache Pulsar
31:57 Understanding End-to-End Latency and Subscription Creation
32:32 Broker's Role and Handling Metadata
33:05 Memory Management and Consumer Handling
34:07 Message Processing and Flow Control
34:32 Message Storage and Retrieval
36:00 Comparing Pulsar with Kafka
43:52 Understanding Multi-Tenancy in Pulsar
49:17 Exploring Tiered Storage and Future Developments
Important links:
StreamNative: https://streamnative.io/
Apache Pulsar: https://pulsar.apache.org/
Matteo Merli: https://twitter.com/merlimat
Jan 27, 2024 · 01:03:47

VictoriaMetrics internals - Making monitoring simple and reliable at massive scale
Deep Dive into VictoriaMetrics with Alex and Roman
Join this insightful discussion with VictoriaMetrics creators Alex and Roman on The GeekNarrator podcast, hosted by Kaivalya Apte. This episode explores the internals of VictoriaMetrics, a highly scalable monitoring solution and time series database. Discover the origins of VictoriaMetrics, understand how it evolved, and learn about its architecture and functionality. From the concept of time series and the use of consistent hashing for data distribution to real-world applications, it's all packed into this conversation.
00:00 Introduction
01:52 The Genesis of VictoriaMetrics
02:18 The Journey from Postgres to Clickhouse
03:19 The Transition from Prometheus to VictoriaMetrics
05:08 The Birth and Evolution of VictoriaMetrics
13:01 The Architecture of VictoriaMetrics
20:10 Data Ingestion and Integration in VictoriaMetrics
29:15 Understanding the VictoriaMetrics Architecture
30:30 Comparing Shared Storage and Object Store
31:00 Designing the VictoriaMetrics Architecture
32:01 The Role of Object Storage
36:15 The Importance of Indexing
43:19 Understanding the Ingestion Process
45:46 Exploring the Select Process
55:55 Future Plans for VictoriaMetrics
Important Links:
1. Architecture Overview: https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#architecture-overview
2. How ClickHouse Inspired Us to Build a High Performance Time Series Database
https://altinity.com/wp-content/uploads/2021/11/How-ClickHouse-Inspired-Us-to-Build-a-High-Performance-Time-Series-Database.pdf
3. Frequently asked questions.
https://docs.victoriametrics.com/FAQ.html
Jan 20, 2024 · 01:03:00

TiDB Internals with Li Shen
Join us on a deep dive into the intricacies of TiDB with Li Shen from PingCAP. In this episode, Li Shen provides a comprehensive exploration of TiDB, its unique features, and how it tackles the scalability and reliability issues commonly associated with MySQL.
If you're dealing with struggles in your MySQL cluster and seeking a more dependable and scalable system, TiDB might be the solution for you. This conversation touches on various aspects of this cutting-edge database, its operational mechanism, use case scenarios, and how it's optimized for different workloads.
Key topics include: the architecture of TiDB, the journey of data from API to storage node, embracing analytical use cases, the importance of database reliability, and the process of migrating to TiDB. Dive in now!
00:00 Introduction and Welcome
02:47 Defining TiDB: A Distributed SQL Database
04:55 The Role of MySQL Compatibility in TiDB
05:54 Primary Use Cases for TiDB
09:38 Understanding the Data Ingestion Process in TiDB
16:52 Understanding Indexing in TiDB
23:01 Pushing Down Table Scans and Partial Aggregation
24:39 Introduction to the Columnar Extension: TiFlash
24:54 Understanding Data Replication and Learner Nodes
26:23 Ensuring Strong Consistency in Data
27:12 Balancing Transactional and Analytical Use Cases
27:57 Understanding Data Replication and Consistency Model
28:42 Exploring the TiFlash Storage Layer
28:54 Understanding High Concurrency Insert and Update
32:09 Exploring the Read Path and Caching Mechanism
37:50 Understanding the Importance of High Reliability
43:01 Exploring Migration from Other Databases
48:01 Comparing TiDB with Other Distributed SQL Databases
52:21 Identifying Use Cases Where TiDB Might Not Be the Best Choice
Stay Curious! Keep Learning!
Jan 20, 2024 · 54:40

AI Powered Database optimisation with Andy Pavlo, OtterTune
In this video I discuss database tuning and optimisation with Andy Pavlo of OtterTune.
Andy is an Associate Professor with Indefinite Tenure of Databaseology in the Computer Science Department at Carnegie Mellon University. His research interest is in database management systems, specifically main memory systems, self-driving / autonomous architectures, transaction processing systems, and large-scale data analytics.
00:00 Introduction and Welcome
01:31 Understanding Database Optimization
05:48 Understanding When Database Tuning is Needed
08:45 Understanding Database Optimization Difficulties
16:16 Understanding Default Settings in Databases
22:35 Role of Machine Learning in Database Tuning
22:38 Introduction to OtterTune
28:36 Data Collection for Machine Learning Model
35:25 Deployment and Data Collection Process
38:03 Admitting the Limitations of Current Model
38:53 Challenges in Predicting Performance Improvements
39:28 The Importance of Data Collection Over Time
39:52 Avoiding Weekend and Holiday Tuning
40:05 Introducing New Features for Database Comparison
42:09 Provisioning Recommendations and Performance Predictions
43:03 The Importance of Telemetry in Understanding Database Performance
44:01 Handling Dramatic Changes in Database Workloads
44:48 Preparing for Predictable Traffic Spikes
48:13 The Importance of Testing in Database Optimization
53:33 The Future of Database Optimization
55:50 Common Mistakes in Database Management
01:09:15 The Future of Holistic Database Tuning
Links:
OtterTune: https://ottertune.com/
Andy Pavlo: https://www.cs.cmu.edu/~pavlo/
CMU YouTube: https://www.youtube.com/@UCHnBsf2rH-K7pn09rb3qvkA
Resources:
CMU: https://15799.courses.cs.cmu.edu/spring2022/schedule.html
OtterTune blog: https://ottertune.com/blog
Jan 14, 2024 · 01:14:56

Duckdb Internals with Mark Raasveldt
Deep Dive into DuckDB with CTO Mark Raasveldt
In this episode of The GeekNarrator podcast, host Kaivalya Apte talks with Mark Raasveldt, CTO of DuckDB Labs, about his journey from database enthusiast to creating DuckDB. They delve into how DuckDB, an analytical database, differs from other databases, its design decisions, its internal mechanisms, and much more. The episode also highlights the advantages of DuckDB in analytics, the motivation behind its ACID compliance, and how DuckDB handles ingestion, transaction isolation, mutations, and queries. Join in to learn how your data workloads can benefit from DuckDB.
00:00 Introduction and Guest Introduction
00:44 Guest's Journey into Databases
03:40 The Birth of DuckDB
04:30 Challenges with Existing Databases
05:15 Technical Difficulties
05:16 Why Existing Databases Fall Short for Data Scientists
09:16 The Role of SQLite and Its Limitations
13:59 Defining DuckDB
16:48 Comparing DuckDB with Other Analytical Databases
19:50 Deployment Models for DuckDB
22:47 Data ingestion into DuckDB
22:51 Data Ingestion in DuckDB
30:24 How DuckDB Handles Updates and Mutations
35:35 Understanding Column Granularity and Rewrites
35:58 Implications of Compression on Data Updates
36:38 Trade-offs in Row Group Size
37:32 Benefits of Column Storage Model
38:15 Row Groups and Parallelism
39:02 Choosing Row Group Size: An Experimental Approach
40:00 Handling Data Type Changes in Columns
41:00 Internal Data Structures in DuckDB
42:21 Reading Data: Point Lookups, Aggregations, and Joins
47:22 Optimization for Full Table Scans
53:49 Understanding ACID Compliance in DuckDB
55:49 Multi-Version Concurrency Control (MVCC) in DuckDB
59:50 Use Cases and Applications of DuckDB
01:01:42 The Story Behind DuckDB's Name
01:02:34 Future Vision for DuckDB
References:
DuckDB: https://duckdb.org/
Mark's blog: https://mytherin.github.io/
Cheers,
The GeekNarrator
Dec 06, 2023 · 01:04:08

ScyllaDB internals with Felipe Mendes
In this episode we talk about ScyllaDB internals with Felipe Mendes.
Chapters:
0:00 ScyllaDB internals with Felipe Mendes
07:51 Write Path - API to Storage
11:40 What makes it faster than Cassandra?
13:39 Optimisations: Seastar, shard per core architecture
15:49 Optimisations: No Garbage collection and Custom Cache Implementation
18:15 Optimisations: Scheduling groups and IO priority classes
20:07 Optimisations: IO scheduler
22:55 Benefits of shard per core architecture
30:16 Write path - How is a coordinator chosen?
38:20 Read path
39:27 Read path optimisations - Index Caching
41:48 Shard vs Partition
43:10 Shard per core architecture tradeoff
44:03 Observability of Database
References:
ScyllaDB architecture: https://opensource.docs.scylladb.com/stable/architecture/
Seastar: https://seastar.io/
ScyllaDB Caching: https://www.scylladb.com/2018/07/26/how-scylla-data-cache-works/
Shard per core architecture: https://www.scylladb.com/product/technology/shard-per-core-architecture/
Database performance at Scale: https://www.scylladb.com/2023/10/02/introducing-database-performance-at-scale-a-free-open-source-book/
Nov 25, 2023 · 54:40

Graph Database Internals: @neo4j with Michael Hunger
In this episode I talk to Michael Hunger from Neo4j about Graph Database Internals (Neo4j)
Chapters:
0:00 Introduction and historical context
20:51 Data Modelling
25:16 Problem with SQL for Graph Model
26:21 Cypher - Query Language
28:23 Write Path
31:36 Neo4j Storage Layer
33:51 Graph API on top of Relational Model vs Native Graph Databases
37:05 Create Node Relationships
40:42 What makes Graph Database's performance better?
46:00 Partitioning Strategy
53:20 Read path
59:27 Schema Migration
01:04:41 Graph database use cases
Nov 09, 2023 · 01:09:21

RUST vs C++, Java, Go with Micah Wylde
In this episode I talk to Micah Wylde about why #Rust could be the best choice for writing distributed systems and how it compares to #C++, #Java and #Go.
Chapters:
00:00 Introduction
03:48 History of Systems Programming
09:42 Is C++ coming back?
13:31 Problems with C++
16:24 Problems with Java
25:18 Problems with Go
31:21 Why did you choose Rust?
35:19 What makes Rust better?
41:49 Rust cannot save you from logical bugs
44:02 Problems in the context of Stream Processing
48:10 Challenges with Rust
51:28 Learning Rust
54:10 Future of Rust
56:41 A Summary
Blog mentioned in the discussion: https://www.arroyo.dev/blog/rust-for-data-infra
Oct 21, 2023 · 58:36

Becoming a better engineer - John Crickett
Hello Everyone,
In this podcast I have invited John Crickett, who has been a Software Engineer for 27 years and has vast experience across a variety of tech stacks. He is known for his newsletter "Coding Challenges", which helps developers build real-world applications and become better engineers.
00:00 Introduction
01:17 What made you start Coding Challenges?
03:21 What made you start learning Rust?
04:08 How should Software Engineers Prioritise learning? What should they learn? How would they know?
12:20 How to become a better engineer?
14:05 Knowing your passion: but how?
17:43 Should LeetCode be part of interviews? When does it (and doesn't it) make sense?
25:39 System Design interviews
29:38 Building as a community.
More about Coding Challenges : https://codingchallenges.fyi
Join the discord server: https://discord.com/invite/zv4RKDcEKV
Connect with John : https://www.linkedin.com/in/johncrickett/
Oct 15, 2023 · 33:43

YugabyteDB Internals with Franck Pachot
Hey Everyone,
In this video I talk to Franck Pachot about the internals of YugabyteDB. Franck has joined the show previously to talk about general database internals, and it's again a pleasure to host him and talk about Distributed SQL, YugabyteDB, ACID properties, PostgreSQL compatibility, etc.
Chapters:
00:00 Introduction
01:26 What does Cloud Native means?
02:57 What is Distributed SQL?
03:47 Is DistributedSQL also based on Sharding?
05:44 What problem does DistributedSQL solve?
07:32 Writes - Behind the scenes.
10:59 Reads: Behind the scenes.
17:01 BTrees vs LSM: How is the data written to disk?
25:02 Why RocksDB?
29:52 How is data stored? Key Value?
33:56 Transactions: Complexity, SQL vs NoSQL
42:51 MVCC in YugabyteDB: How does it work?
45:08 Default Transaction Isolation level in YugabyteDB
51:57 Fault Tolerance & High Availability in Yugabyte
56:48 Thoughts on Postgres Compatibility and Future of Distributed SQL
01:03:53 Use cases not suitable for YugabyteDB
Previous videos:
Database Internals:
Part1: https://youtu.be/DiLA0Ri6RfY?si=ToGv9NwjdyDE4LHO
Part2: https://youtu.be/IW4cpnpVg7E?si=ep2Yb-j_eaWxvRwc
Geo Distributed Applications: https://youtu.be/JQfnMp0OeTA?si=Rf2Y36-gnpQl18yj
Postgres Compatibility: https://youtu.be/2dtu_Ki9TQY?si=rcUk4tiBmlsFPYzY
I hope you liked this episode. Please hit the like button and subscribe to the channel for more.
Popular playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
Franck's Twitter and LinkedIn: https://twitter.com/FranckPachot and https://www.linkedin.com/in/franckpachot/
Connect and follow here: https://twitter.com/thegeeknarrator and https://www.linkedin.com/in/kaivalyaapte/
Keep learning and growing.
Cheers,
The GeekNarrator
Oct 05, 2023 | 01:08:10

Accelerating Postgres Queries with Epsio - Gilad Kleinman
Hey Everyone, In this video I talk to Gilad Kleinman, CEO and Co-Founder of epsio.io, about Epsio and how it helps companies run queries faster and cheaper.
Chapters:
00:00 Introduction
02:09 Defining the problem statement
07:17 What is Epsio?
09:58 How does Epsio change my architecture?
12:59 Use of CDC
14:05 Where is the query result stored? (Foreign data wrappers)
15:40 What permissions does Epsio need?
16:43 How does Epsio parse a query and create a virtual table?
24:15 Consistency model of Epsio
27:48 How do I know if Epsio is suitable for me?
31:41 How does it compare with Caching?
35:59 What metrics are available with Epsio?
38:32 What other databases does Epsio support (or will support)?
40:47 How to know more about Epsio?
41:37 Pricing model of Epsio
Read more about Epsio: https://www.epsio.io/
Docs: https://docs.epsio.io/
Foreign data wrappers: https://wiki.postgresql.org/wiki/Foreign_data_wrappers
Other playlists:
Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA-
Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17
Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d
Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN
I hope you like this episode. Please hit the like button if you did, and subscribe to the channel if you haven't.
Cheers,
The GeekNarrator
Aug 23, 2023 | 44:14