Roger Peng: Sustaining data science - in classrooms, code, and conversations
Michael, Hadley, and Wes welcome Roger Peng, professor of statistics and data science at UT Austin and co-host of Not So Standard Deviations. Together they trace Roger's journey from early R adopter to pioneering online educator and prolific podcaster. The conversation ranges from the accidental rise of "data science" as a field, to the tension between research papers and software maintenance, to what makes for meaningful, lasting creative work.
What's Inside:
- Roger's first analysis project and what it taught him about authorship and data
- Roger's advice for students testing the waters in data science
- Why software has become the unifying language of modern statistics
- The origins of "data science" as a field and a label
- Reflections on Coursera, MOOCs, and opening education to the world
- What keeps a podcast (and a career) going strong after a decade-plus
--------
45:05
Mine Çetinkaya-Rundel: Teaching in the AI era - and keeping students engaged
In this conversation, Mine Çetinkaya-Rundel, data science educator at Duke University and Posit, joins Michael, Hadley, and Wes to talk about teaching data science in a time when AI can write the code for you. Mine shares her journey from actuarial science to academia, the teaching philosophy behind the "whole game" approach, and her experiments using LLMs for instant student feedback. Along the way, the group dives into the joys and risks of coding by hand, the role of open source in the classroom, and what it's like to work across both the R and Python communities.
What's Inside:
- How a career in actuarial science led Mine to the world of data science and teaching
- The "whole game" approach to learning and how it helps students stay motivated
- Building an LLM-powered feedback tool for low-stakes assignments
- Balancing AI assistance with the need for hands-on coding experience
- The shared DNA of R and Python scientific computing communities
- The hidden value of live coding, pair programming, and seeing the process, not just the output
--------
54:47
Wes McKinney: Part 2 - The open source hustle and an insider view of Positron
In part two of our conversation with Wes McKinney, we dig into the challenges and realities of sustaining open source development. Wes shares how funding actually works (or doesn't), why corporate buy-in is essential, and what it's like building tools across languages, communities, and IDEs. We also talk about the Apache Software Foundation's role in open governance and the origin of the Positron IDE.
What's Inside:
- Why passion isn't enough for open source to scale
- Apache Arrow's origin story and how it was pitched
- How open governance enables trust between competitors
- The thinking behind Positron, Posit's next-gen IDE
- Polyglot programming: designing tools that bridge the R/Python divide
- LLMs and data UX: why modern IDEs need to serve both humans and models
- Day-to-day coding, advising, investing, and context-switching
- Metalheads unite
--------
26:33
Wes McKinney: Part 1 - Building Pandas, Arrow, and a speedrunning legacy
Wes McKinney's fingerprints are all over the modern data stack, from inventing pandas to co-creating Arrow. But before all that, Wes was organizing speedrun communities and hacking together better ways to wrangle datasets in finance. In this conversation, he shares his origin story and what makes good tools good. Stay tuned for part 2, coming soon.
What's Inside:
- How frustration with data work led Wes to build pandas (and leave a PhD)
- A nostalgic dive into the GoldenEye speedrunning scene
- Why read_csv performance is a deeply personal crusade
- Lessons from convincing friends to quit finance and go open source
- Founding startups, launching Arrow, and the Ibis origin story
- The beauty of letting contributors take the reins
- Shout-out to Philip Cloud, pandas' resident pun master
- Why open communities win, and what it takes to build them
--------
23:22
Spreadsheets, bikes, and the accidental empire of R packages - with Hadley Wickham
Before Hadley Wickham became a pillar of modern data science, he was a spreadsheet-loving teenager making databases for his dad's job. In this episode, he reflects on the early days of his involvement with R, the birth of the tidyverse, and how real-world unpredictability (like a bear in a field) shapes data science.
What's Inside:
- Hadley's first brush with R code ... inside a Word doc
- Consulting as a grad student, and learning what people really want from stats
- How messy Excel sheets inspired the tidy data revolution
- Writing R packages as a form of self-defense (and productivity)
- The secret sauce of building the tidyverse team
- On focus, burnout, and saying "no" to GitHub pull requests
- Current obsession: using LLMs to make data science faster, easier, and more fun
- How writing books is a form of tidying ideas, and how a Shiny textbook led to a custom bike
A Posit podcast for data science junkies, anomaly hunters, and those who play outside the confidence interval. Hosted by Michael Chow, with co-hosts Wes McKinney & Hadley Wickham.