Add support for count_distinct aggregate function #69

@rohitrastogi

Fenic currently supports the count() aggregate function and the drop_duplicates() DataFrame method, but does not yet support the count_distinct() aggregate function.

This issue is about adding support for count_distinct() to Fenic. Fenic represents expressions in a logical expression tree, which it then transpiles into Polars expressions for execution.


🛠️ What needs to be done

  1. Add a new expression class

    • Create a CountDistinctExpr class that extends AggregateExpr (which in turn extends LogicalExpr).
    • Reference other aggregate expression definitions here:
      aggregate.py
  2. Transpile the expression to Polars

  3. Expose a user-facing API

    • Add a new count_distinct function to Fenic’s public API with a clean docstring and appropriate typing.
    • Follow the style and conventions in this file:
      builtin.py
  4. Write unit tests
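For the unit tests in step 4, a plain-Python reference can help pin down the expected counts before comparing against Fenic's output. The helper name and the edge cases below are illustrative assumptions, as is the choice to ignore nulls.

```python
from typing import Any, Iterable


def expected_count_distinct(values: Iterable[Any]) -> int:
    """Reference count of distinct non-null values (hypothetical test helper)."""
    return len({v for v in values if v is not None})


# Hypothetical edge cases worth covering in the unit tests.
cases = {
    "duplicates": (["a", "b", "a"], 2),
    "with_nulls": ([1, None, 1, 2], 2),
    "all_nulls": ([None, None], 0),
    "empty": ([], 0),
}

for name, (values, expected) in cases.items():
    assert expected_count_distinct(values) == expected, name
```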


Feel free to ask questions or open a draft PR early if you're unsure about anything. We're happy to help!
