A Python library to run analytics workloads with the performance of Rust, the flexibility of Python and O(1) cost in moving data between the two. Uses Apache Arrow in-memory format and respective query engine DataFusion.
Like pyspark, it allows you to build a plan through SQL or a DataFrame API against in-memory data, parquet or CSV files, run it in a multi-threaded environment, and obtain the result back in Python.
…