“Duckplyr: Seamlessly integrating duckdb with R and the tidyverse – posit::conf(2023)”

duckplyr brings DuckDB, a high-performance database that integrates with Arrow, to R, and it promises to be much faster than the traditional way of doing things. With talented specialists, decades of experience, and a focus on test-first development, duckplyr aims to make your dplyr code run faster. Plus, it’s designed to be a drop-in replacement for dplyr, so you can achieve better results in less time. Exciting times ahead! 🔥🦆 #DuckDB #positconf2023

Key Takeaways 🚀

| Key Point | Description |
| --- | --- |
| High-Performance Database | DuckDB is a high-performance database that integrates with various programming languages such as Python, Java, JavaScript, Rust, Swift, Scala, and more. |
| Deep Integration with Arrow | The database integrates deeply with Arrow, allowing for efficient computation and processing of large datasets. |
| duckplyr Speed | duckplyr offers significantly faster processing times than traditional methods, making it a valuable tool for data analysis. |
| Full Compatibility with dplyr | duckplyr aims to be a full drop-in replacement for dplyr, supporting all data types and R functions. |

Introduction 📊

My name is Kirill, and today I present the first results from my collaboration with DuckDB Labs and Posit. I wouldn’t be here today if not for Hannes, Tom, Davis, Hadley, Andrew, and Kevin. Thank you for joining me in this presentation. DuckDB is a high-performance database that integrates with various programming languages, including Python, Java, JavaScript, Rust, Swift, Scala, and more. It also offers deep integration with Arrow, enabling efficient computation and processing of large datasets.

Why Choose duckplyr? 📈

Ease of Reading and Writing Code 📋

I get a lot of joy from writing and reading duckplyr code. Even if you’re not familiar with the syntax, you can read from top to bottom and understand each individual step. The SQL version of a similar query can be more complicated, and the order in which its clauses are written does not match the order in which the operations are applied.
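
To make the top-to-bottom reading concrete, here is a minimal sketch rather than code from the talk. It uses plain dplyr and R’s built-in mtcars data set, both chosen here purely for illustration; since duckplyr is described as a drop-in replacement, the assumption is that the same pipeline would run unchanged with duckplyr attached. The SQL in the comment is a hand-written equivalent, included only to show how its written order (SELECT first) differs from the order in which the steps apply.

```r
library(dplyr)  # assumption: library(duckplyr) could stand in here as a drop-in

# Each verb is one step, applied in the order it is written.
result <- mtcars %>%
  filter(cyl > 4) %>%                           # 1. keep cars with more than 4 cylinders
  group_by(gear) %>%                            # 2. group by number of gears
  summarise(mean_mpg = mean(mpg), n = n()) %>%  # 3. average mpg and row count per group
  arrange(desc(mean_mpg))                       # 4. sort by average mpg, highest first

# A hand-written SQL equivalent (illustrative only): the clauses are written
# SELECT-first, although filtering and grouping happen before the SELECT list.
#   SELECT gear, AVG(mpg) AS mean_mpg, COUNT(*) AS n
#   FROM mtcars
#   WHERE cyl > 4
#   GROUP BY gear
#   ORDER BY mean_mpg DESC;

print(result)
```

Reading the pipeline from the first verb to the last gives the execution order directly, which is the readability point made above.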

Feature Matrix Comparison 📊

Comparing tools such as dplyr, dctb, and duckplyr across a feature matrix, we find that duckplyr offers a nice balance between functionality and speed.

The Power of duckplyr 🦆

| Feature | Description |
| --- | --- |
| Full Compatibility with dplyr | duckplyr is designed to be a full drop-in replacement for dplyr, supporting all data types and R functions. |
| Test-Driven Development | duckplyr follows a test-driven development approach, reusing existing test cases from the dplyr package. |
| Performance Benefits | duckplyr outperforms traditional methods, offering significantly faster processing times. |

Unlike other systems such as data.table and DT, duckplyr aims to ensure that all your dplyr code keeps working.

Challenges and Future Roadmap 🛠️

While duckplyr offers extensive compatibility with dplyr, some areas still need improvement. For example, custom R functions and certain data types are not yet fully supported. The team at DuckDB Labs is continuously working on the compatibility and performance of duckplyr, and they welcome feedback and input from users to further improve the system.

Conclusion 🎯

In conclusion, duckplyr presents a promising solution for data analysis, offering speed, efficiency, and integration with existing R workflows. The team at DuckDB Labs is committed to improving the system and working toward full compatibility with dplyr, making it a go-to choice for handling large datasets.


If you have any questions or feedback, feel free to find the team in the lobby. Thank you for your attention and support!
