DataTalks.Club

By DataTalks.Club

DataTalks.Club - the place to talk about data!

Making Sense of Data Engineering Acronyms and Buzzwords - Natalie Kwong

Sep 11, 2021 · 01:00:21

Building Machine Learning Products - Reem Mahmoud

Links:

  • LinkedIn: https://www.linkedin.com/in/reemmahmoud/recent-activity/all/
  • Website: https://topmate.io/reem_mahmoud


Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Mar 16, 2024 · 56:48

Make an Impact Through Volunteering Open Source Work - Sara EL-ATEIF

We talked about:

  • Sara’s background
  • On being a Google PhD fellow
  • Sara’s volunteer work
  • Finding AI volunteer work
  • Sara’s Fruit Punch challenge
  • How to take part in AI challenges
  • AI Wonder Girls
  • Hackathons
  • Things people often miss in AI projects and hackathons
  • Getting creative
  • Fostering your social media
  • Tips on applying for volunteer projects
  • Why it’s worth doing volunteer projects
  • Opportunities for data engineers and students
  • Sara’s newsletter suggestions


Links:

  • Dev and AI hackathons: https://devpost.com/
  • Healthcare-focused challenges: https://grand-challenge.org/challenges/
  • Volunteering in projects (AI4Good): https://www.fruitpunch.ai/
  • Volunteering in projects (AI4Good) 2: https://www.omdena.com/
  • Twitter: https://twitter.com/el_ateifSara
  • Instagram: https://www.instagram.com/saraelateif/
  • LinkedIn: https://www.linkedin.com/in/sara-el-ateif/
  • Youtube: www.youtube.com/@elateifsara


Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Feb 23, 2024 · 55:56

Accelerating The Job Hunt for The Perfect Job in Tech - Sarah Mestiri

We talked about:

  • Sarah’s background
  • How Sarah became a coach and found her niche
  • Sarah’s clients
  • How Sarah helps her clients find the perfect job
  • Finding a specialization
  • Informational interviews
  • Building a connection for mutual benefit
  • The networking strategy
  • Listing your projects in the CV
  • The importance of doing research yourself and establishing your interests
  • How to land a part-time job when the company wants full-time
  • Age is not a factor
  • Applying for jobs after finishing a course and the importance of sharing your learnings
  • Sarah’s resource recommendations


Links:

  • LinkedIn: https://www.linkedin.com/in/sarahmestiri/
  • Website: https://thrivingcareermoms.com/
  • Personal Website: https://www.sarahmestiri.com/
  • Youtube channel: https://www.youtube.com/@thrivingcareermoms444

Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Feb 02, 2024 · 53:05

Machine Learning Engineering in Finance - Nemanja Radojkovic

We talked about:

  • Nemanja’s background
  • When Nemanja first worked as a data person
  • Typical problems that ML Ops folks solve in the financial sector
  • What Nemanja currently does as an ML Engineer
  • The obstacle of implementing new things in financial sector companies
  • Going through the hurdles of DevOps
  • Working with an on-premises cluster
  • “ML Ops on a Shoestring” (you don’t need fancy stuff to start with ML Ops)
  • Tactical solutions
  • Platform work and code work
  • Programming and soft skills needed to be an ML Engineer
  • The challenges of transitioning from electrical engineering and sales to ML Ops
  • The ML Ops tech stack for beginners
  • Working on projects to determine which skills you need


Links:

  • LinkedIn: https://www.linkedin.com/in/radojkovic/

Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Jan 31, 2024 · 53:11

Stock Market Analysis with Python and Machine Learning - Ivan Brigida

We talked about:

  • Ivan’s background
  • How Ivan became interested in investing
  • Getting financial data to run simulations
  • Open, High, Low, Close, Volume
  • Risk management strategy
  • Testing your trading strategies
  • Sticking to your strategy
  • Important metrics and remembering about trading fees
  • Important features
  • Deployment
  • How DataTalks.Club courses helped Ivan
  • Ivan’s site and course sign-up


Links:

  • Exploring Finance APIs: https://pythoninvest.com/long-read/exploring-finance-apis
  • Python Invest Blog Articles: https://pythoninvest.com/blog


Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Jan 24, 2024 · 55:31

Bayesian Modeling and Probabilistic Programming - Rob Zinkov

We talked about:

  • Rob’s background
  • Going from software engineering to Bayesian modeling
  • Frequentist vs Bayesian modeling approach
  • About integrals
  • Probabilistic programming and samplers
  • MCMC and Hakaru
  • Language vs library
  • Encoding dependencies and relationships into a model
  • Stan, HMC (Hamiltonian Monte Carlo), and NUTS
  • Sources for learning about Bayesian modeling
  • Reaching out to Rob


Links:

  • Book 1: https://bayesiancomputationbook.com/welcome.html
  • Book/Course: https://xcelab.net/rm/statistical-rethinking/

Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Jan 22, 2024 · 54:16

Navigating Challenges and Innovations in Search Technologies - Atita Arora

We talked about:


  • Atita’s background
  • How NLP relates to search
  • Atita’s experience with Lucidworks and OpenSource Connections
  • Atita’s experience with Qdrant and vector databases
  • Utilizing vector search
  • Major changes to search Atita has noticed throughout her career
  • RAG (Retrieval-Augmented Generation)
  • Building a chatbot out of transcripts with LLMs
  • Ingesting the data and evaluating the results
  • Keeping humans in the loop
  • Application of vector databases for machine learning
  • Collaborative filtering
  • Atita’s resource recommendations


Links:

  • LinkedIn: https://www.linkedin.com/in/atitaarora/
  • Twitter: https://x.com/atitaarora
  • Github: https://github.com/atarora
  • Human-in-the-Loop Machine Learning: https://www.manning.com/books/human-in-the-loop-machine-learning
  • Relevant Search: https://www.manning.com/books/relevant-search
  • Let's learn about Vectors: https://hub.superlinked.com/
  • Langchain: https://python.langchain.com/docs/get_started/introduction
  • Qdrant blog: https://blog.qdrant.tech/
  • OpenSource Connections Blog: https://opensourceconnections.com/blog/

Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Dec 27, 2023 · 57:00

The Entrepreneurship Journey: From Freelancing to Starting a Company - Adrian Brudaru

We talked about:

  • Adrian’s background
  • The benefits of freelancing
  • Having an agency vs freelancing
  • What led Adrian to switch over from freelancing
  • The conception of DLT (Growth Full Stack)
  • The investment required to start a company
  • Growth through the provision of services
  • Growth through teaching (product-market fit)
  • Moving on to creating docs
  • Adrian’s current role
  • Strategic partnerships and community growth through DocDB
  • Plans for the future of DLT
  • DLT vs Airbyte vs Fivetran
  • Adrian’s resource recommendations


Links:

  • Adrian's LinkedIn: https://www.linkedin.com/in/data-team/
  • Twitter: https://twitter.com/dlt_library
  • Github: https://github.com/dlt-hub/dlt
  • Website: https://dlthub.com/docs/intro


Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Dec 19, 2023 · 56:22

Become a Data Freelancer - Dimitri Visnadi

We talked about:

  • Dimitri’s background
  • The first steps of transitioning into freelance
  • Working with recruiters (contracting)
  • Deciding on what to charge for your services
  • Establishing your network
  • Self-marketing
  • Contracting vs freelancing
  • Which channel is better for those starting out?
  • Cutting out the middleman
  • Where to look for clients and how to vet them
  • The different ways of getting into freelancing
  • Going back to a full-time job after freelancing
  • Common mistakes freelancers make
  • Dimitri’s resource suggestions
  • Reaching out to Dimitri


Links:

  • LinkedIn profile: http://www.linkedin.com/in/visnadi
  • The DataFreelancer website: https://thedatafreelancer.com/


Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Dec 17, 2023 · 55:13

AI for Digital Health - Maria Bruckert

We talked about:


  • Maria’s background
  • Deciding to go into telecare (healthcare)
  • Current difficulties in healthcare
  • Getting into the healthcare industry as a lifestyle brand
  • The importance of a plan B and being flexible
  • What is SQIN and the importance of communication
  • Going from lipstick to skin health analysis
  • The importance of community and broadening your audience
  • The importance of feedback and communicating benefits
  • The current state and growth of SQIN
  • Convincing investors and the importance of proving profitability
  • Maria’s role at SQIN
  • Balancing a newborn child and a new company


Links:

  • Free ML Engineering course: http://mlzoomcamp.com
  • Join DataTalks.Club: https://datatalks.club/slack.html
  • Our events: https://datatalks.club/events.html
Dec 04, 2023 · 50:25

Cracking the Code: Machine Learning Made Understandable - Christoph Molnar

We talked about:

  • Christoph’s background
  • Kaggle and other competitions
  • How Christoph became interested in interpretable machine learning
  • Interpretability vs Accuracy
  • Christoph’s current competition engagement
  • How Christoph chooses topics for books
  • Why Christoph started the writing journey with a book
  • Self-publishing vs via a publisher
  • Christoph’s other books
  • What is conformal prediction?
  • Christoph’s book on SHAP
  • Explainable AI vs Interpretable AI
  • Working alone vs with other people
  • Christoph’s other engagements and how to stay hands-on
  • Keeping a logbook
  • Does one have to be an expert on the topic to write a book about it?
  • Writing in the open and other feedback gathering methods
  • Advice for those who want to be technical writers
  • Self-publishing tools
  • Finding Christoph online


Links:

  • LinkedIn: https://www.linkedin.com/in/christoph-molnar/
  • Website: https://christophmolnar.com/


Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Nov 26, 2023 · 51:59

The Unwritten Rules for Success in Machine Learning - Jack Blandin

We talked about:

  • Jack’s background
  • Transitioning from IC to management
  • Lessons not taught in traditional school
  • The importance of people’s perception, trust, and respect
  • How soft skills are relevant to machine learning
  • How to put on a salesman hat in machine learning management
  • The importance of visuals and building a POC as fast as possible
  • 1st Rule of Machine Learning – don’t be afraid to start without machine learning
  • The importance of understanding the reality that data represents
  • The importance of putting yourself in the shoes of customers
  • The importance of software engineering skills in machine learning
  • Where to find Jack’s content
  • Jack’s next venture

Links:


  • Jack's LinkedIn profile: https://www.linkedin.com/in/jackblandin/

Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Nov 20, 2023 · 50:26

From a Research Scientist at Amazon to a Machine learning/AI Consultant - Verena Webber

Links:

  • Mini sound bath: https://www.youtube.com/watch?v=g-lDrcSqcrQ


Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Nov 10, 2023 · 54:55

From Marketing to Product Owner in Search - Lera Kaimashnіkova

We talked about:

  • Lera’s background
  • Lera’s move from Ukraine to Germany
  • The transition from Marketing to Product Ownership
  • The importance of communication and one-on-ones
  • The role of Product Owner
  • Utilizing Scrum as a Product Owner
  • Building teams and cross-functionality
  • Lera’s experience learning about search
  • The importance of having both technical knowledge and business context
  • Open developer positions at AUTODOC
  • What experience Lera came to AUTODOC with
  • How marketing skills helped Lera in her current role
  • Lera’s resource recommendations
  • Everything is possible



Links:

  • Post: https://www.linkedin.com/posts/leracaiman_elasticsearch-ecommerce-activity-7106615081588674560-5WQO


Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Nov 05, 2023 · 55:14

Collaborative Data Science in Business - Ioannis Mesionis

Links:

  • LinkedIn: https://www.linkedin.com/in/ioannis-mesionis/
  • Github: https://github.com/ioannismesionis
  • Website: https://ioannismesionis.github.io/



Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Oct 27, 2023 · 55:50

Bridging Data Science and Healthcare - Eleni Stamatelou

Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Oct 20, 2023 · 54:02

DataTalks.Club Anniversary Interview - Alexey Grigorev, Johanna Bayer

Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Oct 12, 2023 · 57:45

Data Engineering for Fraud Prevention - Angela Ramirez

We talked about:

  • Angela's background
  • Angela's role at Sam's Club
  • The usefulness of knowing ML as a data engineer
  • Angela's career path
  • Transitioning from data analyst to data engineer/system designer
  • Best practices for system design and data engineering
  • Working with document databases
  • Working with network-based databases
  • Detecting fraud with a network-based database
  • Selecting the database type to work with
  • Neo4j vs Postgres
  • The importance of having software engineering knowledge in data engineering
  • Data quality check tooling
  • The greatest challenges in data engineering
  • Debugging and finding the root cause of a failed job
  • What kinds of tools Angela uses on a daily basis
  • Working with external data sources
  • Angela's resource recommendations


Links:

  • LinkedIn: https://www.linkedin.com/in/aramirez1305/
  • Twitter: https://twitter.com/angelamaria__r
  • Github: https://github.com/aramir62
  • Previous podcast talk: https://twitter.com/i/spaces/1OwGWwZAZDnGQ?s=20


Free ML Engineering course: http://mlzoomcamp.com

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Oct 06, 2023 · 54:14

From Data Manager to Data Architect - Loïc Magnien

We talked about:

  • Loïc's background
  • Data management
  • Loïc's transition to data engineer
  • Challenges in the transition to data engineering
  • What is a data architect?
  • The output of a data architect's work
  • Establishing metrics and dimensions
  • The importance of communication
  • Setting up best practices for the team
  • Staying relevant and tech-watching
  • Setting up specifications for a pipeline
  • Be agile, create a POC, iterate ASAP, and build reusable templates
  • Reaching out to Loïc for questions


Links:

  • Loïc's LinkedIn: https://www.linkedin.com/in/loicmagnien/


Free ML Engineering course: http://mlzoomcamp.com

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Sep 29, 2023 · 56:42

Pragmatic and Standardized MLOps - Maria Vechtomova

We talked about:

  • Maria's background
  • Marvelous MLOps
  • Maria's definition of MLOps
  • Alternate team setups without a central MLOps team
  • Pragmatic vs non-pragmatic MLOps
  • Must-have ML tools (categories)
  • Maturity assessment
  • What to start with in MLOps
  • Standardized MLOps
  • Convincing DevOps to implement
  • Understanding what the tools are used for instead of knowing all the tools
  • Maria's next project plans
  • Is LLM Ops a thing?
  • What Ahold Delhaize does
  • Resource recommendations to learn more about MLOps
  • The importance of data engineering knowledge for ML engineers

Links:

  • LinkedIn: https://www.linkedin.com/company/marvelous-mlops/
  • Website: https://marvelousmlops.substack.com/

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Sep 08, 2023 · 53:43

Democratizing Causality - Aleksander Molak

We talked about:

  • Aleksander's background
  • Aleksander as a Causal Ambassador
  • Using causality to make decisions
  • Counterfactuals and Judea Pearl
  • Meta-learners vs classical ML models
  • Average treatment effect
  • Reducing causal bias, the super efficient estimator, and model uplifting
  • Metrics for evaluating a causal model vs a traditional ML model
  • Is the added complexity of a causal model worth implementing?
  • Utilizing LLMs in causal models (text as outcome)
  • Text as treatment and style extraction
  • The viability of A/B tests in causal models
  • Graphical structures and nonparametric identification
  • Aleksander's resource recommendations

Links:


  • The Book of Why: https://amzn.to/3OZpvBk
  • Causal Inference and Discovery in Python: https://amzn.to/46Pperr
  • Book's GitHub repo: https://github.com/PacktPublishing/Causal-Inference-and-Discovery-in-Python
  • The Battle of Giants: Causality vs NLP (PyData Berlin 2023): https://www.youtube.com/watch?v=Bd1XtGZhnmw
  • New Frontiers in Causal NLP (papers repo): https://bit.ly/3N0TFTL


Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Aug 25, 2023 · 56:00

Mastering Data Engineering as a Remote Worker - José María Sánchez Salas

We talked about:

  • José's background
  • How José relocated to Norway and his schedule
  • Tech companies in Norway and José's role
  • Challenges of working as a remote data engineer
  • José's newsletter on how to make use of data
  • The process of making data useful
  • Where José gets inspiration for his newsletter
  • Dealing with burnout
  • When in Norway, do as the Norwegians do
  • The legalities of working remotely in Norway
  • The benefits of working remotely


Links:

  • LinkedIn: https://www.linkedin.com/in/jmssalas
  • Github: https://github.com/jmssalas
  • Website & Newsletter: https://jmssalas.com


Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Aug 18, 2023 · 46:31

The Good, the Bad and the Ugly of GPT - Sandra Kublik

We talked about:

  • Sandra's background
  • Making a YouTube channel to break into the LLM space
  • The business cases for LLMs
  • LLMs as amplifiers
  • The benefits of keeping a human in the loop when using LLMs (AI limitations)
  • Using LLMs as assistants
  • Building an app that uses an LLM
  • Prompt whisperers and how to improve your prompts
  • Sandra's 7-day LLM experiment
  • Sandra's LLM content recommendations
  • Finding Sandra online


Links:

  • LinkedIn: https://www.linkedin.com/in/sandrakublik/
  • Twitter: https://twitter.com/sandra_kublik
  • Youtube: https://www.youtube.com/@sandra_kublik


Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Aug 04, 2023 · 50:53

LLMs for Everyone - Meryem Arik

We talked about:


  • Meryem's background
  • The constant evolution of startups
  • How Meryem became interested in LLMs
  • What is an LLM (generative vs non-generative models)?
  • Why LLMs are important
  • Open source models vs API models
  • What TitanML does
  • How fine-tuning a model helps in LLM use cases
  • Fine-tuning generative models
  • How generative models change the landscape of human work
  • How to adjust models over time
  • Vector databases and LLMs
  • How to choose an open source LLM or an API
  • Measuring input data quality
  • Meryem's resource recommendations


Links:

  • Website: https://www.titanml.co/
  • Beta docs: https://titanml.gitbook.io/iris-documentation/overview/guide-to-titanml...
  • Using llama2.0 in TitanML Blog: https://medium.com/@TitanML/the-easiest-way-to-fine-tune-and-inference-llama-2-0-8d8900a57d57
  • Discord: https://discord.gg/83RmHTjZgf
  • Meryem's LinkedIn: https://www.linkedin.com/in/meryemarik/


Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Jul 28, 2023 · 55:29

Investing in Open-Source Data Tools - Bela Wiertz

We talked about:

  • Bela's background
  • Why startups even need investors
  • Why open source is a viable go-to-market strategy
  • Building a bottom-up community
  • The investment thesis for the TKM Family Office and the blurriness of the funding round naming convention
  • Angel investors vs VC Funds vs family offices
  • Bela's investment criteria and GitHub stars as a metric
  • Inbound sourcing, outbound sourcing, and investor networking
  • Making a good impression on an investor
  • Balancing open and closed source parts of a product
  • The future of open source
  • Recent successes of open source companies
  • Bela's resource recommendations


Links:


  • Understand who is engaging with your open source project article: https://www.crowd.dev/
  • Top 6 Books on Developer Community Building: https://www.crowd.dev/post/top-6-books-on-developer-community-building
  • Which open source software metrics matter: https://www.bvp.com/atlas/measuring-the-engagement-of-an-open-source-software-community#Which-open-source-software-metrics-matter


Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jul 21, 2023 · 54:58

Why Machine Learning Design is Broken - Valerii Babushkin

Links:


  • Book: https://www.manning.com/books/machine-learning-system-design?utm_source=AGMLBookcamp&utm_medium=affiliate&utm_campaign=book_babushkin_machine_4_25_23&utm_content=twitter
  • Discount: poddatatalks21 (35% off)
  • Evidently: https://www.evidentlyai.com/
  • Article: https://medium.com/people-ai-engineering/design-documents-for-ml-models-bbcd30402ff7


Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jul 14, 2023 · 51:20

Interpretable AI and ML - Polina Mosolova

We talked about:

  • Polina's background
  • How common it is for PhD students to build ML pipelines end-to-end
  • Simultaneous PhD and industry experience
  • Support from both the academic and industry sides
  • How common the industrial PhD setup is and how to get into one
  • Organizational trust theory
  • How price relates to trust
  • How trust relates to explainability
  • The importance of actionability
  • Explainability vs interpretability vs actionability
  • Complex glass box models
  • Does the explainability of a model follow explainability?
  • What explainable AI brings to customers and end users
  • Can all trust be turned into a KPI?

Links:


  • LinkedIn: https://www.linkedin.com/in/polina-mosolova/
  • Neural Additive Models paper: https://proceedings.neurips.cc/paper/2021/file/251bd0442dfcc53b5a761e050f8022b8-Paper.pdf
  • Neural Basis Model paper: https://arxiv.org/pdf/2205.14120.pdf
  • Interpretable Feature Spaces paper: https://kdd.org/exploration_files/vol24issue1_1._Interpretable_Feature_Spaces_revised.pdf
Jul 07, 2023 · 52:48

From Scratch to Success: Building an MLOps Team and ML Platform - Simon Stiebellehner

We talked about:

  • Simon's background
  • What MLOps is and what it isn't
  • Skills needed to build an ML platform that serves 100s of models
  • Ranking the importance of skills
  • The point where you should think about building an ML platform
  • The importance of processes in ML platforms
  • Weighing your options with SaaS platforms
  • The exploratory setup, experiment tracking, and model registry
  • What comes after deployment?
  • Stitching tools together to create an ML platform
  • Keeping data governance in mind when building a platform
  • What comes first – the model or the platform?
  • Do MLOps engineers need to have deep knowledge of how models work?
  • Is API design important for MLOps?
  • Simon's recommendations for furthering MLOps knowledge


Links:

  • LinkedIn: https://www.linkedin.com/in/simonstiebellehner/
  • Github: https://github.com/stiebels
  • Medium: https://medium.com/@sistel

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jun 30, 2023 · 53:33

From MLOps to DataOps - Santona Tuli

We talked about:

  • Santona's background
  • Focusing on data workflows
  • Upsolver vs DBT
  • ML pipelines vs Data pipelines
  • MLOps vs DataOps
  • Tools used for data pipelines and ML pipelines
  • The “modern data stack” and today's data ecosystem
  • Staging the data and the concept of a “lakehouse”
  • Transforming the data after staging
  • What happens after the modeling phase
  • Human-centric vs Machine-centric pipeline
  • Applying skills learned in academia to ML engineering
  • Crafting user personas based on real stories
  • A framework of curiosity
  • Santona's book and resource recommendations


Links:

  • LinkedIn: https://www.linkedin.com/in/santona-tuli/
  • Upsolver website: upsolver.com
  • Why we built a SQL-based solution to unify batch and stream workflows: https://www.upsolver.com/blog/why-we-built-a-sql-based-solution-to-unify-batch-and-stream-workflows


Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jun 23, 2023 · 53:05

Data Developer Relations - Hugo Bowne-Anderson

We talked about:

  • Hugo's background
  • Why do tools and the companies that run them have wildly different names?
  • Hugo's other projects besides Metaflow
  • Transitioning from educator to DevRel
  • What is DevRel?
  • DevRel vs Marketing
  • How DevRel coordinates with developers
  • How DevRel coordinates with marketers
  • What skills a DevRel needs
  • The challenges that come with being an educator
  • Becoming a good writer: nature vs nurture
  • Hugo's approach to writing and suggestions
  • Establishing a goal for your content
  • Choosing a form of media for your content
  • Is DevRel intercompany or intracompany?
  • The Vanishing Gradients podcast
  • Finding Hugo online


Links:

  • Hugo Bowne-Anderson's GitHub: http://hugobowne.github.io/
  • Vanishing Gradients: https://vanishinggradients.fireside.fm/
  • MLOps and DevOps: Why Data Makes It Different: https://www.oreilly.com/radar/mlops-and-devops-why-data-makes-it-different/
  • Evaluate Metaflow for free, right from your Browser: https://outerbounds.com/sandbox/


Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html


Jun 16, 2023 · 50:51

Lessons Learned from Freelancing and Working in a Start-up - Antonis Stellas

We talked about:

  • Antonis' background
  • The pros and cons of working for a startup
  • Useful skills for working at a startup and the Lean way to work
  • How Antonis joined the DataTalks.Club community
  • Suggestions for students joining the MLOps course
  • Antonis contributing to Evidently AI
  • How Antonis started freelancing
  • Getting your first clients on Upwork
  • Pricing your work as a freelancer
  • The process after getting approved by a client
  • Wearing many hats as a freelancer and while working at a startup
  • Other suggestions for getting clients as a freelancer
  • Antonis' thoughts on the Data Engineering course
  • Antonis' resource recommendations

Links:

  • Lean Startup by Eric Ries: https://theleanstartup.com/
  • Lean Analytics: https://leananalyticsbook.com/
  • Designing Machine Learning Systems by Chip Huyen: https://www.oreilly.com/library/view/designing-machine-learning/9781098107956/
  • Kafka streaming with Python, tutorial video by Kris Jenkins: https://youtu.be/jItIQ-UvFI4


Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Jun 09, 2023 · 50:31

Data Access Management - Bart Vandekerckhove

We talked about:

  • Bart's background
  • What is data governance?
  • Data dictionaries and data lineage
  • Data access management
  • How to learn about data governance
  • What skills are needed to do data governance effectively
  • When an organization needs to start thinking about data governance
  • Good data access management processes
  • Data masking and the importance of automating data access
  • DPO and CISO roles
  • How data access management works with a data mesh approach
  • Avoiding the role explosion problem
  • The importance of data governance integration in DataOps
  • Terraform as a stepping stone to data governance
  • How Raito can help an organization with data governance
  • Open-source data governance tools

Links:

  • LinkedIn: https://www.linkedin.com/in/bartvandekerckhove/
  • Twitter: https://twitter.com/Bart_H_VDK
  • GitHub: https://github.com/raito-io
  • Website: https://www.raito.io/
  • Data Mesh Learning Slack: https://data-mesh-learning.slack.com/join/shared_invite/zt-1qs976pm9-ci7lU8CTmc4QD5y4uKYtAA#/shared-invite/email
  • DataQG Website: https://dataqg.com/
  • DataQG Slack: https://dataqgcommunitygroup.slack.com/join/shared_invite/zt-12n0333gg-iTZAjbOBeUyAwWr8I~2qfg#/shared-invite/email
  • DMBOK (Data Management Body of Knowledge): https://www.dama.org/cpages/body-of-knowledge
  • DMBOK Wheel describing the data governance activities: https://www.dama.org/cpages/dmbok-2-wheel-images


Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jun 02, 202350:29
Data Strategy: Key Principles and Best Practices - Boyan Angelov


We talked about:


  • Boyan's background
  • What is data strategy?
  • Due diligence and establishing a common goal
  • Designing a data strategy
  • Impact assessment, portfolio management, and DataOps
  • Data products
  • DataOps, Lean, and Agile
  • Data Strategist vs Data Science Strategist
  • The skills one needs to be a data strategist
  • How does one become a data strategist?
  • Data strategist as a translator
  • Transitioning from a Data Strategist role to a CTO
  • Using ChatGPT as a writing co-pilot
  • Using ChatGPT as a starting point
  • How ChatGPT can help in data strategy
  • Pitching a data strategy to a stakeholder
  • Setting baselines in a data strategy
  • Boyan's book recommendations

Links:


  • LinkedIn: https://www.linkedin.com/in/angelovboyan/
  • Twitter: https://twitter.com/thinking_code
  • GitHub: https://github.com/boyanangelov
  • Website: https://boyanangelov.com/


Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

May 26, 202355:49
Practical Data Privacy - Katharine Jarmul


We talked about:

  • Katharine's background
  • Katharine's ML privacy startup
  • GDPR, CCPA, and the “opt-in as the default” approach
  • What is data privacy?
  • Finding Katharine's book – Practical Data Privacy
  • The various definitions of data privacy and “user profiles”
  • Privacy engineering and privacy-enhancing technologies
  • Why data privacy is important
  • What is differential privacy?
  • The importance of keeping privacy in mind when designing systems
  • Data privacy on the example of ChatGPT
  • Katharine's resource suggestions for learning about data privacy


Links:

  • LinkedIn: https://www.linkedin.com/in/katharinejarmul/
  • Twitter: https://twitter.com/kjam

Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

May 19, 202357:44
Building Scalable and Reliable Machine Learning Systems - Arseny Kravchenko


We talked about:

  • Arseny's background
  • Working on machine learning in startups
  • What is Machine Learning System Design?
  • Constraints and requirements
  • Known unknowns vs unknown unknowns (Design stage)
  • Writing a design document
  • Technical problems vs product-oriented problems
  • The solution part of the Design Document
  • What motivated Arseny to write a book on ML System Design
  • Examples of a Design Document in the book
  • The types of readers for ML System Design
  • Working with the co-author
  • Reacting to constraints and feedback when writing a book
  • Arseny's favorite chapter of the book
  • Other resources where you can learn about ML System Design
  • Twitter Giveaway


Links:

  • Book: https://www.manning.com/books/machine-learning-system-design?utm_source=AGMLBookcamp&utm_medium=affiliate&utm_campaign=book_babushkin_machine_4_25_23&utm_content=twitter
  • Discount: poddatatalks21 (35% off)


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

May 12, 202350:59
Building an Open-Source NLP Tool - Johannes Hötter


We talked about:

  • Johannes’s background
  • Johannes’s Open Source Spotlight demos – Refinery and Bricks
  • The difficulties of working with natural language processing (NLP)
  • Incorporating ChatGPT into a process as a heuristic
  • What is Bricks?
  • The process of starting a startup – Kern
  • Making the decision to go with open source
  • Pros and cons of launching as open source
  • Kern’s business model
  • Working with enterprises
  • Johannes as a salesperson
  • The team at Kern
  • Johannes’s role at Kern
  • How Johannes and Henrik separate responsibilities at Kern
  • Working with very niche use cases
  • The short story of how Kern got its funding
  • Johannes’s resource recommendation


Links:

  • Refinery's GitHub repo: https://github.com/code-kern-ai/refinery
  • Bricks' GitHub repo: https://github.com/code-kern-ai/bricks
  • Bricks Open Source Spotlight demo: https://www.youtube.com/watch?v=r3rXzoLQy2U
  • Refinery Open Source Spotlight demo: https://www.youtube.com/watch?v=LlMhN2f7YDg
  • Discord: https://discord.com/invite/qf4rGCEphW
  • Kern's website: https://www.kern.ai


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Apr 21, 202356:27
Navigating Industrial Data Challenges - Rosona Eldred


We talked about:

  • Rosona’s background
  • How mathematics knowledge helps in industry
  • What is industrial data?
  • Setting up an industrial process using blue paint
  • Internet companies’ data vs industrial data
  • Explaining industrial processes using packing peanuts
  • Why productive industry needs data
  • Measuring product qualities
  • How data specialists use industrial data
  • Defining and measuring sustainability
  • Using data in reactionary measures to changing regulations
  • Types of industrial data
  • Solving problems and optimizing with industrial data
  • Industrial solvers
  • Tiny data vs Big data in productive industry
  • The advantages of coming from academia into productive industry
  • Materials and resources for industrial data
  • Women in industry
  • Why Rosona decided to shift to industrial data


Links:

  • Kaggle dataset: https://www.kaggle.com/datasets/paresh2047/uci-semcom






Apr 14, 202353:22
Mastering Self-Learning in Machine Learning - Aaisha Muhammad


We talked about:

  • Aaisha’s background
  • How homeschooling affects self-study
  • Deciding on what to learn about
  • Establishing whether a resource is good
  • How Aaisha focuses on learning
  • Deciding on what kind of project to build
  • Finding research materials
  • Aaisha’s experience with the DataTalks.Club ML Zoomcamp
  • ML Zoomcamp projects
  • Aaisha’s interest in bioinformatics
  • Keeping motivated with deadlines
  • Notes and time-tracking tools
  • Drawbacks to self-studying
  • Aaisha’s interest in machine learning
  • Aaisha’s least favorite part of ML Zoomcamp
  • Helping people as a way to learn
  • Using ChatGPT as a “study group”
  • Is it possible to use self-studying to learn high-level topics?
  • Switching topics to avoid burnout
  • Aaisha’s resource recommendations


Links:

  • LinkedIn: https://www.linkedin.com/in/aaisha-muhammad/
  • Twitter: https://twitter.com/ZealousMushroom
  • GitHub: https://github.com/AaishaMuhammad
  • Website: http://www.aaishamuhammad.co.za/

Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Apr 07, 202351:02
The Secret Sauce of Data Science Management - Shir Meir Lador


We talked about:

  • Shir’s background
  • Debrief culture
  • The responsibilities of a group manager
  • Defining the success of a DS manager
  • The three pillars of data science management
  • Managing up
  • Managing down
  • Managing across
  • Managing data science teams vs business teams
  • Scrum teams, brainstorming, and sprints
  • The most important skills and strategies for DS and ML managers
  • Making sure proofs of concept get into production


Links:

  • The secret sauce of data science management: https://www.youtube.com/watch?v=tbBfVHIh-38
  • Lessons learned leading AI teams: https://blogs.intuit.com/2020/06/23/lessons-learned-leading-ai-teams/
  • How to avoid conflicts and delays in the AI development process (Part I): https://blogs.intuit.com/2020/12/08/how-to-avoid-conflicts-and-delays-in-the-ai-development-process-part-i/
  • How to avoid conflicts and delays in the AI development process (Part II): https://blogs.intuit.com/2021/01/06/how-to-avoid-conflicts-and-delays-in-the-ai-development-process-part-ii/
  • Leading AI teams deck: https://drive.google.com/drive/folders/1_CnqjugtsEbkIyOUKFHe48BeRttX0uJG
  • Leading AI teams video: https://www.youtube.com/watch?app=desktop&v=tbBfVHIh-38


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Mar 31, 202348:43
SE4ML - Software Engineering for Machine Learning - Nadia Nahar


We talked about:

  • Nadia’s background
  • Academic research in software engineering
  • Design patterns
  • Software engineering for ML systems
  • Problems that people in industry have with software engineering and ML
  • Communication issues and setting requirements
  • Artifact research in open source products
  • Product vs model
  • Nadia’s open source product dataset
  • Failure points in machine learning projects
  • Finding solutions to issues using Nadia’s dataset and experience
  • The problem of siloing data scientists and other structure issues
  • The importance of documentation and checklists
  • Responsible AI
  • How data scientists and software engineers can work in an Agile way


Links:

  • Model Card: https://arxiv.org/abs/1810.03993
  • Datasheets: https://arxiv.org/abs/1803.09010
  • Factsheets: https://arxiv.org/abs/1808.07261
  • Research Paper: https://www.cs.cmu.edu/~ckaestne/pdf/icse22_seai.pdf
  • Arxiv version: https://arxiv.org/pdf/2110.


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Mar 24, 202353:40
Starting a Consultancy in the Data Space - Aleksander Kruszelnicki


We talked about:

  • Aleksander’s background
  • The difficulty of selling data stack as a service
  • How Aleksander got into consulting
  • The Mom Test – extracting feedback from people
  • User interviews
  • Why Aleksander’s data stack as a service startup was not viable
  • How Aleksander decided to switch to consulting
  • Finding clients to consult
  • Figuring out how to position your services
  • Geographical limitations
  • Figuring out your target audience
  • The importance of networking and marketing
  • Pricing your services
  • The pitfalls of daily and hourly pricing and how to balance incentives
  • Is Germany a good place to found a company?
  • Aleksander’s book recommendations


Links:

  • LinkedIn: https://www.linkedin.com/in/alkrusz/
  • Twitter: https://twitter.com/alkrusz
  • Website: www.leukos.io


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Mar 17, 202352:28
Biohacking for Data Scientists and ML Engineers - Ruslan Shchuchkin


We talked about:

  • Ruslan’s background
  • Fighting procrastination and perfectionism
  • What is biohacking?
  • The role of dopamine and other hormones in daily life
  • How meditation can help
  • The influence light has on our bodies
  • Behavioral biohacking
  • Daylight lamps and using light to wake up
  • Sleep cycles
  • How nutrition affects productivity
  • Measuring productivity
  • Examples of unsuccessful biohacking attempts
  • Stoicism, voluntary discomfort, and self-challenges
  • Biohacking risks and ways to prevent them
  • Coffee and tea biohacking
  • Using self-reflection and tracking to measure results
  • Mindset shifting
  • Stoicism book recommendation
  • Work/life balance
  • Ruslan’s biohacking resource recommendation


Links:

  • LinkedIn: https://www.linkedin.com/in/ruslanshchuchkin/


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html


Mar 10, 202352:58
Analytics for a Better World - Parvathy Krishnan


We talked about:

  • Parvathy’s background
  • Brainstorming sessions with nonprofits to establish data maturity
  • Example of an Analytics for a Better World project
  • The overall data maturity situation of nonprofits vs private sector
  • Solving the skill gap
  • Publicly available content
  • The Analytics for a Better World Academy
  • The Academy’s target audience
  • How researchers can work with Analytics for a Better World
  • Improving data maturity in nonprofit organizations
  • People, processes, and technology
  • Typical tools that Analytics for a Better World recommends to nonprofits
  • Profiles in nonprofits
  • Does Analytics for a Better World have a need for data engineers?
  • The Analytics for a Better World team
  • Factors that help organizations become more data-driven
  • Parvathy’s resource recommendations


Links:

  • LinkedIn: https://www.linkedin.com/in/parvathykrishnank/
  • Twitter: https://twitter.com/ABWInstitute
  • GitHub: https://github.com/Analytics-for-a-Better-World
  • Website: https://analyticsbetterworld.org/


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html


Mar 03, 202354:35
Accelerating the Adoption of AI through Diversity - Dânia Meira


We talked about: 

  • Dânia’s background
  • Founding the AI Guild
  • Datalift Summit
  • Coming up with meetup topics
  • Diversity in Berlin
  • Other types of diversity besides gender
  • The pitfalls of lacking diversity
  • Creating an environment where people can safely share their experiences
  • How the AI Guild helps organizations become more diverse
  • How the AI Guild finds women in the fields of AI and data science
  • Advice for people in underrepresented groups
  • Organizing a welcoming environment and creating a code of conduct
  • AI Guild’s consulting work and community
  • AI Guild team
  • Dânia’s resource recommendations
  • Upcoming Datalift Summit


Links:

  • Call for Speakers for the #datalift summit (Berlin, 14 to 16 June 2023): https://eu1.hubs.ly/H02RXvX0
  • Coded Bias documentary on Netflix: https://www.netflix.com/de/title/81328723#:~:text=This%20documentary%20investigates%20the%20bias,flaws%20in%20facial%20recognition%20technology.
  • Book Weapons of Math Destruction by Cathy O'Neil: https://en.wikipedia.org/wiki/Weapons_of_Math_Destruction
  • Book Lean In by Sheryl Sandberg: https://en.wikipedia.org/wiki/Lean_In


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Feb 24, 202357:01
Staff AI Engineer - Tatiana Gabruseva


We talked about:

  • Tatiana’s background
  • Going from academia to healthcare to the tech industry
  • What staff engineers do
  • Transferring skills from academia to industry and learning new ones
  • The importance of having mentors
  • Skipping junior and mid-level straight into the staff role
  • Convincing employers that you can take on a lead role
  • Seeing failure as a learning opportunity
  • Preparing for coding interviews
  • Preparing for behavioral and system design interviews
  • The importance of having a network and doing mock interviews
  • How much do staff engineers work on building pipelines, data science, ETL, MLOps, etc.?
  • Context switching
  • Advice for those going from academia to industry
  • The most exciting thing about working as an AI staff engineer
  • Tatiana’s book recommendations


Links:

  • LinkedIn: https://www.linkedin.com/in/tatigabru/
  • Twitter: https://twitter.com/tatigabru
  • GitHub: https://github.com/tatigabru
  • Website: http://tatigabru.com/


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Feb 17, 202355:24
The Journey of a Data Generalist: From Bioinformatics to Freelancing - Jekaterina Kokatjuhha


We talked about:

  • Jekaterina’s background
  • How Jekaterina started freelancing
  • Jekaterina’s initial ways of getting freelancing clients
  • How being a generalist helped Jekaterina’s career
  • Connecting business and data
  • How Jekaterina’s LinkedIn posts helped her get clients
  • Jekaterina’s work in fundraising
  • Cohorts and KPIs
  • Improving communication between the data and business teams
  • Motivating every link in the company’s chain
  • The cons of freelancing
  • Balancing projects and networking
  • The importance of enjoying what you do
  • Growing the client base
  • Working in the office vs working remotely
  • Jekaterina’s advice for people who feel stuck
  • Jekaterina’s resource recommendations

Links:

  • Jekaterina's LinkedIn: https://www.linkedin.com/in/jekaterina-kokatjuhha/

Join DataTalks.Club: https://datatalks.club/slack.html

Feb 11, 202352:18
Navigating Career Changes in Machine Learning - Chris Szafranek


We talked about:

  • Chris’s background
  • Switching careers multiple times
  • Freedom at companies
  • Chris’s role as an internal consultant
  • Chris’s sabbatical
  • ChatGPT
  • How being a generalist helped Chris in his career
  • The cons of being a generalist and the importance of T-shaped expertise
  • The importance of learning things you’re interested in
  • Tips to enjoy learning new things
  • Recruiting generalists
  • The job market for generalists vs for specialists
  • Narrowing down your interests
  • Chris’s book recommendations


Links:

  • Lex Fridman: science, philosophy, media, AI (especially earlier episodes): https://www.youtube.com/lexfridman
  • Andrej Karpathy, former Senior Director of AI at Tesla, who's now focused on teaching and sharing his knowledge: https://www.youtube.com/@AndrejKarpathy
  • Beautifully done videos on engineering of things in the real world: https://www.youtube.com/@RealEngineering
  • Chris' website: https://szafranek.net/
  • Zalando Tech Radar: https://opensource.zalando.com/tech-radar/
  • Modal Labs, new way of deploying code to the cloud, also useful for testing ML code on GPUs: https://modal.com
  • Excellent Twitter account to follow to learn more about prompt engineering for ChatGPT: https://twitter.com/goodside
  • Image prompts for Midjourney: https://twitter.com/GuyP
  • Machine Learning Workflows in Production - Krzysztof Szafranek: https://www.youtube.com/watch?v=CO4Gqd95j6k
  • From Data Science to DataOps: https://datatalks.club/podcast/s11e03-from-data-science-to-dataops.html


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html


Feb 03, 202355:36
Preparing for a Data Science Interview - Luke Whipps


We talked about:

  • Luke’s background
  • Luke’s podcast - AI Game Changers
  • How Luke helps people get jobs
  • What’s changed in the recruitment market over the last 6 months
  • Getting ready for the interview process
  • Stage “zero” – the filter between the candidate and the company
  • Preparing for the introduction stage – research and communication
  • Reviewing the fundamentals during preparation
  • Preparing for the technical part of the interview
  • Establishing the hiring company’s expectations
  • Depth vs breadth
  • Overly theoretical and mathematical questions in interviews
  • Bombing (failing) in the middle of an interview
  • Applying to different roles within the same company
  • Luke’s resource recommendations


Links:

  • Luke's LinkedIn: https://www.linkedin.com/in/lukewhipps/


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html


Jan 27, 202354:17
Indie Hacking - Pauline Clavelloux


We talked about:

  • Pauline’s background
  • Pauline’s work as a manager at IBM
  • What is indie hacking?
  • Pauline’s initial indie hacking projects
  • Getting ready for launch
  • Responsibilities and challenges in indie hacking
  • Pauline’s latest indie hacking project
  • Going live and marketing
  • Challenges with Unreal Me
  • Staying motivated with indie hacking projects
  • Skills Pauline picked up while doing indie hacking projects
  • Balancing a day job and indie hacking
  • Micro SaaS and AboutStartup.io
  • How Pauline comes up with ideas for projects
  • Going from an idea on paper to building a project
  • Pauline’s Twitter success
  • Connecting with Pauline online
  • Pauline’s indie hacking inspiration
  • Pauline’s resource recommendation


Links:

  • Website: https://wintopy.io/
  • Pauline's Twitter: https://twitter.com/Pauline_Cx
  • Pauline's LinkedIn: https://www.linkedin.com/in/paulineclavelloux/ 
  • Blog about indie hacking: https://aboutstartup.io


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jan 20, 202351:03
Doing Software Engineering in Academia - Johanna Bayer


We talked about:

  • Johanna’s background
  • Open science course and reproducible papers
  • Research software engineering
  • Convincing a professor to work on software instead of papers
  • The importance of reproducible analysis
  • Why academia is behind on software engineering
  • The problems with open science publishing in academia
  • The importance of standard coding practices
  • How Johanna got into research software engineering
  • Effective ways of learning software engineering skills
  • Providing data and analysis for your project
  • Johanna’s initial experience with software engineering in a project
  • Working with sensitive data and the nuances of publishing it
  • How often Johanna does hackathons, open source, and freelancing
  • Social media as a source of repos and Johanna’s favorite communities
  • Contributing to Git repos
  • Publishing in the open in academia vs industry
  • Johanna’s book and resource recommendations
  • Conclusion


Links:

  • The Society of Research Software Engineering, plus regional chapters: https://society-rse.org/
  • The RSE Association of Australia and New Zealand: https://rse-aunz.github.io/
  • Research Software Engineers (RSEs), the people behind research software: https://de-rse.org/en/index.html
  • The Software Sustainability Institute: https://www.software.ac.uk/
  • The Carpentries (beginner git and programming courses): https://carpentries.org/
  • The Turing Way Book of Reproducible Research: https://the-turing-way.netlify.app/welcome


Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Jan 13, 202349:49