Python Polars: The Definitive Guide

Introduction

Welcome to the official website of the book Python Polars: The Definitive Guide by Jeroen Janssens and Thijs Nieuwdorp. The book is scheduled to be published by O’Reilly in February 2025.

The code is idiomatic, formatted using black, thoroughly tested, and sprinkled with helpful callouts.

Most visualizations are created with the plotnine package.

We cover many related topics such as encodings and floating point representations.

We dive deep into the concept of expressions, the building blocks of every query.

The book contains many tips, tricks, and warnings based on our own real-world experience with Polars.

Each chapter ends with useful takeaways.

Even the Great Tables package makes an appearance.

We collaborated with NVIDIA and Dell Technologies to benchmark Polars on the GPU.

Get Free Sample Chapter

To get a good idea of what the book is all about, you can read the first chapter for free. This chapter discusses what Polars is, explains why you should use it, and demonstrates its capabilities through a showcase. Enter your name and email address below to receive an email with a link to the PDF (4.7 MB).

Feel free to unsubscribe from the newsletter once you’ve got the PDF. Stay subscribed if you’d like to receive future updates about our book and other resources related to Polars.

Book Description

Unlock the power of Polars, a Python package for transforming, analyzing, and visualizing data. In this hands-on guide, Jeroen Janssens and Thijs Nieuwdorp walk you through every feature of Polars, showing you how to use it for real-world tasks like data wrangling, exploratory data analysis, building pipelines, and more.

Whether you’re a seasoned data professional or new to data science, you’ll quickly master Polars’ expressive API and its underlying concepts. You don’t need to have experience with pandas, but if you do, this book will help you make a seamless transition. The many practical examples and real-world datasets are available on GitHub, so you can easily follow along.

  • Process data from CSV, Parquet, spreadsheets, databases, and the cloud
  • Get a solid understanding of Expressions, the building blocks of every query
  • Handle complex data types, including text, time, and nested structures
  • Use both eager and lazy APIs, and know when to use each
  • Visualize your data with Altair, hvPlot, plotnine, and Great Tables
  • Extend Polars with your own Python functions and Rust plugins
  • Leverage GPU acceleration to boost performance even further

Praise

Jeroen and Thijs have done an excellent job–not only teaching you the ins and outs of Polars but also helping you unlearn habits from other tools like pandas. They really bring out the power of expressions, which are key to using Polars effectively, guiding you toward a more declarative, functional approach to data processing. As you work through this book, I’m sure you’ll gain a deep understanding of Polars and discover fresh ways to approach data processing.

Ritchie Vink, Creator of Polars (excerpt from the Foreword)

Polars has become a rising star in the Python data ecosystem, showing what’s possible in a next-generation data frame library. Jeroen and Thijs have written a timely and essential resource to help you take advantage of everything Polars has to offer.

Wes McKinney, Creator of pandas, Principal Architect, Posit PBC

Polars has brought a ton of much-needed innovation to the data frame world with its much more streamlined API and efficient implementation. As a result, the capabilities of data analysis in Python are pushed to new heights. We also greatly enjoy Ritchie and team as a part of the Amsterdam data ecosystem.

I greatly respect Jeroen’s commitment to teaching data science in an accessible way, whether it be on the command line or elsewhere. His and Thijs’ book is a testament to this commitment and I recommend it to the data science community.

Hannes Mühleisen, Co-Creator of DuckDB

As a client working closely with Thijs and Jeroen on migrating a data pipeline to Polars, we were initially skeptical, but we soon experienced the speed and intuitiveness of Polars and its API. While Jeroen and Thijs worked late hours to make progress with their book, we directly benefited from the improvements in our pipeline. We hope this book helps you along the way and that you find all the little gems Polars has to offer—while being lazy, of course!

Marnix van Lieshout and Bram Timmers, Data Scientists at Alliander

This book will change how you think about data analysis. Jeroen and Thijs have done a phenomenal job including all kinds of comparisons, diagrams, and examples. Polars has an incredible amount of functionality, and it’s clear they’ve put great care into organizing and breaking all the pieces down. I appreciate their focus on data visualization throughout the book, and the inclusion of table styling!

Michael Chow, Principal Software Engineer at Posit PBC, Co-maintainer of Great Tables

This book cleverly demystifies Polars’ powerful ecosystem. Thijs and Jeroen seamlessly guide you through the theoretical foundations and hands-on examples, making complex concepts accessible without sacrificing depth. Whether you’re migrating from pandas or starting fresh with Polars, this guide provides the roadmap you need to realise your data workflows could be running at a far more bear-able speed.

Hella Haanstra, Machine Learning Engineer at Xomnia

When I first interacted with Polars it was so early days that I made the PR for the .pipe() method on DataFrames. I was pleased with the speedups that Polars gave me, but I was massively impressed by the API. It just felt like such good taste and a great direction for the future. Fast forward a few years, and today Polars has become an established tool with so many features that the ecosystem was in dire need of a guide. This book gives us just that. It is a guide, but also a reference!

Vincent D. Warmerdam, Data person, Co-founder of calmcode

Python Polars: The Definitive Guide manages to offer a comprehensive overview of everything Polars has to offer, while also providing a great learning experience in the form of excellent code examples. Truly a great resource!

Stijn de Gooijer, Core contributor to Polars

Polars is emerging as one of the leading data frameworks in Python, especially for time series analysis and forecasting. It is now fully integrated with libraries like Nixtla’s MLForecast and StatsForecast, allowing for the creation of forecasts at scale with high performance.

Jeroen and Thijs have done an excellent job of establishing a solid foundation for both new practitioners who wish to learn how to process data with Python and experienced users looking to transition from pandas to Polars.

Rami Krispin, Senior Manager Data Science and Engineering at Apple

The depth that Jeroen and Thijs went into in order to produce this phenomenally good book is impressive. There’s some pretty good Polars books out there, but this is the best one. They don’t just repeat what’s in the user guide, they go above and beyond: in-depth explanations of expressions, (friendly) comparisons with other tools, an example of how to go beyond what Polars offers out-of-the-box with a geocoding plugin! Whether you’re new to Polars or want to improve your understanding of it, I wholeheartedly recommend this book.

Marco Gorelli, Senior Software Engineer at Quansight, Core contributor to Polars and pandas, Creator of Narwhals

Frequently Asked Questions

Q: Why is there no polar bear on the cover?

A: The polar bear is already featured on another O’Reilly book. But don’t worry, Jeroen and Thijs are actually quite proud of their Iberian Lynx.

About the Authors

Jeroen Janssens is a senior developer relations engineer at Posit, PBC. His expertise lies in visualizing data, implementing machine learning models, and building solutions using Python, R, JavaScript, and Bash. He’s passionate about open source and sharing knowledge. Previously, Jeroen was at Xomnia, where he first learned about Polars. He is the author of Data Science at the Command Line (O’Reilly, 2021). Jeroen holds a PhD in machine learning from Tilburg University and an MSc in artificial intelligence from Maastricht University. He lives with his wife and two kids in Rotterdam, the Netherlands. Learn more on his website.

Thijs Nieuwdorp is the lead data scientist at Xomnia in Amsterdam. His interest in the interaction between human and computer led him to an education in artificial intelligence at Radboud University, after which he dove straight into the field of data science. At Xomnia, he witnessed the birth of Polars, as Ritchie Vink started working on it during his employment there, and he has been using it in his projects ever since. He enjoys figuring out complex data problems, optimizing existing solutions, and putting them to good use by implementing them into business processes. Outside work, Thijs enjoys exploring our world through hiking and traveling, and exploring other worlds through books, games, and movies. He lives in Amsterdam with his partner, Paula. Learn more on his website.