Conversation

@mary-cleaton (Contributor) commented Nov 17, 2025

Description

Add type hints to all functions. In particular, please can you check:

  • Type hints are correct (e.g. Spark session is pyspark.sql.SparkSession not spark.sql.SparkSession).
  • Type hints are consistent (e.g. cutpoints is always Dict[str, Union[List[float], None]], input_variables is always Dict[str, Any]).
  • Type hinting is removed from docstrings in both Args and Returns sections.
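
As a rough illustration of the first two checks, here is a minimal sketch. The function names and the two type aliases are hypothetical and not taken from this PR; the aliases are just one possible way to keep the cutpoints and input_variables hints identical across signatures. The Spark session hint refers to pyspark.sql.SparkSession and is imported for type checkers only, so the sketch runs without pyspark installed.

```python
from typing import TYPE_CHECKING, Any, Dict, List, Union

if TYPE_CHECKING:
    # Only needed by type checkers, so this sketch runs without pyspark installed.
    from pyspark.sql import SparkSession

# Hypothetical aliases (not part of this PR): defining each hint once keeps it
# identical in every signature that uses it.
Cutpoints = Dict[str, Union[List[float], None]]
InputVariables = Dict[str, Any]


def variables_with_cutpoints(cutpoints: Cutpoints) -> List[str]:
    """Return the sorted names of variables that have cutpoints defined."""
    return sorted(name for name, points in cutpoints.items() if points is not None)


def get_app_name(spark: "SparkSession") -> str:
    """Return the application name of the given Spark session."""
    return spark.sparkContext.appName
```

Whether aliases are worth introducing here is a judgment call; writing the full hint out in every signature works equally well as long as it is spelled the same way each time.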

I have also standardised the docstrings where I noticed they were non-standard, and updated the docstring styling (mostly by reducing the indent from four spaces to two, to reduce file length). Standardising the docstrings included standardising the subheading order, which as per our chosen style should be:

  • Args
  • Dependencies
  • Provisos
  • Returns
  • Attribution
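
For illustration only, a docstring following this order might look like the sketch below. The function is hypothetical, and the exact heading format (a trailing colon, section contents indented by two spaces) is an assumption about the chosen style rather than a copy of it. Types appear only in the signature, not in Args or Returns.

```python
from typing import List


def scale_values(values: List[float], factor: float) -> List[float]:
    """Multiply each value by a constant factor.

    Args:
      values: Numbers to scale.
      factor: Multiplier applied to every value.

    Dependencies:
      None.

    Provisos:
      An empty list returns an empty list.

    Returns:
      A new list holding the scaled values.

    Attribution:
      Hypothetical example, not taken from the codebase.
    """
    # Build a new list so the caller's input is left unchanged.
    return [value * factor for value in values]
```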

Type of change

  • Bug fix - non-breaking change
  • New feature - non-breaking change
  • Breaking change - backwards incompatible change, changes expected behaviour
  • Non-user facing change, structural change, dev functionality, docs ...

Checklist:

  • I have performed a self-review of my own code.
  • I have commented my code appropriately, focusing on explaining my design decisions (explain why, not how).
  • I have made corresponding changes to the documentation (comments, docstrings, etc.).
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have updated the change log.

Peer review

Any new code includes all the following:

  • Documentation: docstrings and comments have been added/updated.
  • Style guidelines: New code conforms to the project's contribution guidelines.
  • Functionality: The code works as expected, handles expected edge cases and exceptions are handled appropriately.
  • Complexity: The code is not overly complex, logic has been split into appropriately sized functions, etc.
  • Test coverage: Unit tests cover essential functions for a reasonable range of inputs and conditions. Added and existing tests pass on my machine.

Review comments

Suggestions should be tailored to the code that you are reviewing. Provide context.
Be critical and clear, but not mean. Ask questions and set actions.

These might include:
  • bugs that need fixing (does it work as expected? and does it work with other code that it is likely to interact with?)
  • alternative methods (could it be written more efficiently or with more clarity?)
  • documentation improvements (does the documentation reflect how the code actually works?)
  • additional tests that should be implemented
    • Do the tests effectively assure that it works correctly? Are there additional edge cases/negative tests to be considered?
  • code style improvements (could the code be written more clearly?)

Further reading: code review best practices

@mary-cleaton requested a review from a team November 17, 2025 10:48
@mary-cleaton self-assigned this Nov 17, 2025
@mary-cleaton added the documentation, enhancement, python and pyspark labels Nov 17, 2025
@mary-cleaton linked an issue Nov 17, 2025 that may be closed by this pull request
Originally coded this using the union operator (|). However, that syntax was only added in Python 3.10, and this repo still supports Python 3.8 and 3.9, so revert to typing.Union. Once the repo no longer supports those versions, this can be changed back to the union operator, which is less verbose and easier to read.
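
A minimal sketch of the difference, using a hypothetical helper that is not part of this PR: the typing.Union spelling below works on Python 3.8 and 3.9, while the commented-out signature shows the Python 3.10+ equivalent that this commit reverts.

```python
from typing import List, Union


# Python 3.10+ spelling, which fails when the function is defined on 3.8/3.9:
# def first_cutpoint(points: List[float] | None) -> float | None:
def first_cutpoint(points: Union[List[float], None]) -> Union[float, None]:
    """Return the first cutpoint, or None when there are no cutpoints."""
    if not points:
        # Covers both a missing list (None) and an empty list.
        return None
    return points[0]
```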
@mary-cleaton marked this pull request as ready for review November 24, 2025 10:37
@mary-cleaton requested a review from a team as a code owner November 24, 2025 10:37
@mary-cleaton marked this pull request as draft December 4, 2025 15:26
@mary-cleaton (Contributor, Author) commented:

Wait until #63 is merged, then rebase and add type hints to the run script. After that, this can be published and marked ready for review.

Labels

  • documentation: Improvements or additions to documentation
  • enhancement: New feature or request
  • pyspark: Pull requests that update pyspark code
  • python: Pull requests that update python code

Development

Successfully merging this pull request may close these issues.

Add type hints

2 participants