Monday Morning Data Chat
By Ternary Data
Monday Morning Data Chat - Oct 10, 2022
#173 - Bart Vandekerckhove - Data Security Deep Dive
Bart Vandekerckhove (Raito) joins us to chat about data security, and the challenges data teams face when using traditional IAM technology and workflows for data access/security management.
#172 - Yali Sassoon - Using LLMs to Support the Analytics Workflow
It seems like LLMs are taking the analytics world by storm. But how do you use them to support the analytics workflow? Yali Sassoon (CTO, Co-founder of Snowplow) joins us to chat about this and much more.
We'll also likely dive into behavioral analytics and more.
Snowplow: https://snowplow.io/
#171 - David Yaffe & John Kutay - The State of Streaming and Change Data Capture
David Yaffe (Estuary) and John Kutay (Striim) join the show to chat about the state of streaming and change data capture (CDC) in 2024 and beyond. There is a lot to cover and learn in this show.
Estuary: https://estuary.dev/
Striim: https://www.striim.com/
#170 - Solomon Kahn - Customer-Facing Data Products, Why A/B Testing is a Waste of Time, and More
Data products are all the rage. But what are they? And what the heck does "customer-facing" mean? Will your old-school BI tool handle customer-facing needs? Solomon Kahn joins us to chat about customer-facing data products and much more. He's one of the people we consider to be at the bleeding edge of modern analytics and data products, so definitely check this out.
Delivery Layer: https://www.deliverylayer.com/
LinkedIn: https://www.linkedin.com/in/solomonkahn/
#169 - Katharine Jarmul - Are We Solving the "Right" Problems with AI?
Katharine Jarmul is an AI/ML privacy and security expert, and the author of Practical Data Privacy. She joins us to chat about whether we are solving the "right" problems with AI/ML/data science, exploring what "safe", "responsible", and "ethical" AI means, and much more.
#168 - Cedric Chin & Sam Taylor - Communicating Sophisticated Stuff to Stakeholders
Cedric Chin (CommonCog) and Sam Taylor join us to chat about communicating sophisticated stuff to stakeholders, statistical process control, XmR charts, and a ton more.
CommonCog: https://commoncog.com/
XMRIT: https://xmrit.com/
#167 - Martin Musiol - Generative AI: Navigating the Course to the AGI Future
Martin Musiol (Founder and CEO of generativeAI.net and GenAI Lead for Europe at Infosys) joins us to chat about all things generative AI, and his new book, "Generative AI: Navigating the Course to the Artificial General Intelligence Future."
#166 - Tony Baer - The Outlook for Generative AI in 2024 (and Beyond)
Veteran data industry analyst Tony Baer joins us to chat about his outlook for generative AI in 2024 and beyond.
Tony goes way deeper than most people in his analysis, and if you're interested in where things are going with generative AI, you'd better tune in to this.
#165 - Ethan Aaron - Is Data a Job or a Skill?
Ethan Aaron (CEO of Portable) joins us to chat about whether data is a job or a skill, what's stopping companies from running their analytics out of a Google Sheet, and much more.
#164 - Jean-Georges Perrin - Data Mesh, Data Contracts, Modern Data Engineering Standards, Bitol, and More
Jean-Georges Perrin joins the show to chat about data mesh, data contracts, modern data engineering standards, Bitol (an open standard project with the Linux Foundation), data architecture, and much more. This is a wide-reaching discussion. Enjoy!
#163 - Joe Reis and Matt Housley - The Demise of the Modern Data Stack & Listener Q&A
Joe Reis and Matt Housley are back for another listener Q&A. They chat about the demise of the Modern Data Stack, architecture, data modeling, AI, and much more.
#162 - Scott Taylor - Explaining "Value" to the Business
Our favorite Valentine's Week guest and all-around love doctor, Scott Taylor, joins the show to chat about explaining "value" to the business, puppets, and much more.
#161 - Michel Tricot - AI's Impact on Traditional Data Practices and More!
Michel Tricot (CEO of Airbyte) joins the show to chat about the impact of AI on traditional data practices (e.g. ETL/ELT), building a company, and much more.
#160 - Benn Stancil - 2024 Predictions, GenAI and Product Development, etc
Benn Stancil joins the show to chat about 2024 predictions, how GenAI impacts product development, writing, and more.
Please note - Benn's audio has occasional Internet-related problems during the talk.
#159 - Dave Langer - Python in Excel, Data Science without "Data Scientists"
Dave Langer returns to the show to chat about Python in Excel, data science in SMBs without "data scientists", and much more.
#158 - Joe Reis & Matt Housley - Ask Us Anything
It's Joe and Matt today, taking listener questions and ranting about whatever's on their minds.
#157 - Alex Gallego - The Streaming Data Renaissance, Open Formats, More
Alex Gallego (founder and CEO of Redpanda) joins the show to chat about the streaming data renaissance, why open formats for tiered storage are the future of data, and much more.
#156 - Mike Ferguson - Top Key Trends in Data Management and Analytics
Mike Ferguson (Managing Director, Intelligent Business Strategies, Chairman of Big Data London) joins the show to chat about the top key trends he's seeing in data management and analytics - GenAI, architecture modernization, FinOps, and much more.
#155 - Tristan Handy - Data Ecosystems, Moats, Semantic Layers, and More
Tristan Handy (CEO of dbt Labs) joins the show to chat about the data tooling landscape, business moats, semantic layers, the data engineering ecosystem, and much more. We covered a ton of ground in an hour and probably could've kept going for another hour. Enjoy!
#154 - Sol Rashidi - Getting Business Value From Data, the CXO playbook, AWS ReInvent, and more
Sol Rashidi is a heavy hitter in the enterprise data space, having been CXO at Estee Lauder, Sony, Merck, and more. She joins us to chat about getting business value from data, the CXO playbook, AWS ReInvent, and more.
#153 - Sarah Nagy - Automating Analytics with Generative AI
Sarah Nagy (CEO, Seek.ai) joins us to chat about automating analytics with generative AI, the generative AI space in general, and much more.
#152 - Dave McComb - Knowledge Graphs, Semantics, and More
Dave McComb (Semantic Arts) is a pioneer in the use of knowledge graphs and semantics in data management. He joins to chat about these topics, and much more.
#151 - Kai Zenner - The EU AI Act
Kai Zenner joins us to chat about all things EU AI Act. If you've wanted to learn about this upcoming piece of critical regulation, tune in.
#150 - Nadine Farah - Apache Hudi Deep Dive
Nadine Farah joins the show to chat about Apache Hudi's core primitives: indexing, CDC, table services, faster UPSERTs, incremental processing framework, and more.
#149 - Why is Data Security so Hard? w/ Yoav Cohen
Yoav Cohen (co-founder & CTO at Satori) joins the show to chat about why data security is hard, strategies companies use to deal with analytics over sensitive data, security and compliance requirements that data teams need to meet, and much more.
#148 - Data Conference Recap (Coalesce, Gitex Dubai, DEWCon) w/ Kevin Hu
Our favorite nerd sniper, Kevin Hu (CEO of Metaplane), joins the show to help us recap some major conferences we attended last week. Lots of data news, gossip, anecdotes, and more.
#147 - Data Warehouses and Semantics Deep Dive, SDF, and more w/ Lukas Schulte (SDF)
Why are semantics important for a data warehouse? Lukas Schulte joins us to chat about why semantics are important, the heterogeneity of data systems, how semantics relate to SQL compilers, his project SDF, and much more.
Please be aware that this discussion will get into the nitty-gritty and technical weeds of all things data.
#146 - Improving Your Health and Wellness - Techie Edition w/ Colleen Fotsch
This is a bit of a different episode, but it's a topic that is long overdue for discussion. Between long hours sitting in front of a monitor, "hustle culture", and prevalent alcohol and drug use, our profession is literally killing us. The negative effects on health and wellness among techies are insane. We've seen our friends go to the ER from stress, diet, and lifestyle-related emergencies. We've lost other friends along the way.
Colleen Fotsch is uniquely qualified to discuss this issue. She is used to operating at the highest levels of sports, being an NCAA D1-champion swimmer, multiple-time CrossFit Games athlete and coach, and former US Bobsled team member. She also works as a data analyst and part-time coach for Opex, a leader in fitness education coaching (she's Joe's coach). It's time we wake up and look at how we can improve our health and wellness, and bring our best selves to our work and life.
Colleen's IG: https://www.instagram.com/colleenfotsch/?hl=en
#145 - Data Engineering AMA w/ Matt Housley and Joe Reis
Matt Housley and Joe Reis chat about where data engineering is going, and take audience questions.
#144 - Data Career Advice w/ Matt Housley, Chris Tabb, and Joe Reis
The data job market is certainly evolving. Matt, Chris, and Joe have a candid chat and AMA about career advice going into 2024.
#143 - The Future of Generative AI in Data Analytics w/ Amit Prakash
Amit Prakash (CTO/Co-Founder of Thoughtspot) joins the show to chat about the future of generative AI in data analytics. Thoughtspot has been a leader in searchable analytics, and it will be interesting to get Amit's take on where the field of analytics is heading next.
#142 - Incentivizing Devs to Pursue Open-Source Projects w/ Max Howell
Max Howell created Homebrew, one of the most popular open-source software (OSS) packages on the planet. He's also the founder of tea.xyz, which is helping incentivize developers to pursue their OSS projects.
In this episode, we chat about the realities and future of OSS, how developers can be remunerated for their OSS projects, and much more.
Tea: tea.xyz
#141 - Data Vendors and Grifters w/ Aaron Hunsaker
Aaron Hunsaker joins Matthew Housley and me to chat about data grifters, dealing with vendors, how data people should converse with "the business", and much more. Aaron doesn't hold back.
Enjoy this very candid in-person chat.
#140 - The Power of 3 (Math Nerds, Professors, and Authors) w/ Hala Nelson
Hala Nelson joins the show to chat about writing books, teaching math, and much more. It's not often we get three math nerds, professors, and authors in the same conversation, and this is a lot of fun. Enjoy!
#139 - Streaming Data Processing Deep Dive w/ David Yaffe and Johnny Graettinger (Estuary)
David Yaffe and Johnny Graettinger (both from Estuary) join the show to do a deep dive into streaming data processing. We also cover how to scale change data capture (CDC) and where transformations belong in data pipelines.
Estuary: https://estuary.dev/
Gazette: https://gazette.readthedocs.io/en/latest/
Note - Joe's audio was having issues for this episode. Apologies.
#138 - The Rise and Importance of Business Language w/ John O'Gorman
Multiple products, versions, platforms, targets, technologies, formats, and locales? How do you make sense of the "multiple of multiples" challenge from a technical perspective? The "language of the business" and data in all its structured, semi-structured, and 'unstructured' forms helps drive this home.
John O'Gorman has world-class expertise in language, semantics, and tying this together for the business. We hope you learn something new from this episode.
#137 - Why Apache Iceberg Won the Table Format War, Data Mesh, and More w/ Brian Olsen
Brian Olsen joins us again to chat about why Apache Iceberg won the table format war. We also finish our chat from last time about Data Mesh. #dataengineering #datalake #datamesh
#136 - Programming Languages for Data Science, and Why Your BI Team is Your Best Bet for Data Science w/ Dave Langer
David Langer joins the show to chat about programming languages for data science, and how BI teams have a unique advantage in helping introduce data science into their organizations.
#135 - Dataframe Deep Dive w/ Devin Petersohn
Devin Petersohn (Modin, Ponder) knows a thing or two about dataframes, having done his PhD thesis on them, among other related achievements. We'll talk about all things dataframes, both high level and in the weeds. If you've ever wanted to learn about dataframes, this is the discussion for you.
#134 - Should Your Business Chase Generative AI? w/ Andreas Welsch
Andreas Welsch (Chief AI Strategist, Host of the Intelligence Briefing) joins the show to discuss the change management required to succeed with Generative AI in today's business world, prompt engineering, and more.
#133 - Intro to Data Contracts w/ Andrew Jones
Data contracts are all the rage right now. Andrew Jones originated the term "data contracts", so who better to chat with? We discuss data contracts - how they originated, how they were implemented, what they are, and why you should care. #data #datacontracts #dataengineering
#132 - Data Collaboration From the Outside-In w/ Andrew Padilla
Data collaboration is hard. Andrew Padilla chats about how to effectively address data collaboration at a common level across your organization, then apply the relevant parts to your internal orgs. #data
#131 - The Importance of Actionable Data to Inform Decision-Making w/ Joe Perez
Joe Perez ("Dr. Joe") joins the show to chat about the importance of actionable data to inform decision-making. We also discuss the various ways disparate data are brought together into a cohesive data warehouse, and how there should be a finite, measurable, deployable strategy.
#130 - Data Modeling in 2023 w/ Colin Zima
Colin Zima joins the show to chat about what data modeling should look like in 2023 and beyond. We'll chat about real ETL, semantic layers, the troubles with BI, and much more. #datamodeling #data #dataengineering #analytics
#129 - Putting Data Products at the Center of Data Management w/ Saket Saurabh
Saket Saurabh joins the show to chat about taking a data product-centric view to reinventing data management. #data #dataproducts #datamanagement #dataengineering
#128 - Big & Small Data in 2023 w/ Joe Reis & Matt Housley
There's a lot of debate on big and small data. For systems and compute, some say "Big Data is Dead", while others challenge this notion. In AI and ML, big tech companies can pour tons of money and data into building massive LLMs, while open source provides compelling "small data" alternatives to the LLM walled gardens.
So which is it? Will Big Data reign supreme or will small data become more popular? Matt and I riff on these topics and more.
#data #dataengineering #chatgpt #ai #bigdata
#127 - Product Management as a Data Scientist w/ Santona Tuli
Santona Tuli discusses the product management aspects of being a data scientist - product and stakeholder management, due diligence of requirements gathering, and developing a strategy before implementing DS/ML pipelines. You know, the fun stuff ;)
#126 - Data Virtualization Hot Takes w/ Brian Olsen
Brian Olsen joins the show to chat about data virtualization. Does it put the heat on vendors that want to lock you in with proprietary storage? Will virtualization and federation enable data mesh, allowing you to move your data at will? Tune in and learn more! #datamesh #dataengineering #data
#125 - The Art of Developer Relations w/ Tim Berglund
#124 - The Rise of the Semantic Layer in the Modern Data Stack w/ Dave Mariani
Dave Mariani (Founder/CTO at AtScale) joins the show to chat about the rise of the semantic layer in the modern data stack. Dave is a Silicon Valley veteran with a ton of experience building companies, amazing tech stacks, and everything in between. #semanticlayer #data #dataengineering #atscale