-
Notifications
You must be signed in to change notification settings - Fork 1
Feat/test calc njklm values #68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Tests added
Test to be run again as it fails initially
Test for calculate_njklm_values has been added and passed.
| expected_output = spark.createDataFrame( | ||
| [ | ||
| (1.0, 5.0, 4.0, 0.0), | ||
| ], | ||
| ["col1", "col2", "col3", "col4"], | ||
| ) | ||
|
|
||
| expected_output = expected_output.toPandas() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test will be much more efficient if this was defined in Pandas rather than defined in PySpark and converted to Pandas.
To hard-code a dataset in Pandas, do this:
expected_output = pd.DataFrame(
[
list_of_row_1_values,
list_of_row_2_values,
etc.
],
list_of_column_names
]There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw in the docstring/comments that the function is taking a Spark DataFrame. Then processed it and then return Pandas df?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but you don't have to make the test inputs and outputs the way that the function makes them. :)
| def test_get_scaled_labelled_x_star(): | ||
| """ | ||
| Tests that makes get_scaled_labelled_x_star returns the correct output | ||
| with appropriate inputs | ||
| """ | ||
|
|
||
| # Arrange | ||
|
|
||
| # Act | ||
|
|
||
| # Assert | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this here? It shouldn't be in this branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New branch created for this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you removed this test entirely when tidying this up. It should remain as a test shell in this branch.
| Tests that label_x_star() gives the correct output when provided with | ||
| appropriate inputs. | ||
| """ | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been altered but shouldn't have been.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes. I need to create another branch for the Tests that label_x_star
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please make sure that on this branch this test is just a test shell.
| test_input = spark.createDataFrame( | ||
| [ | ||
| (1.0, 2.0, 1.0, 0.0), | ||
| (0.0, 1.0, 1.0, 0.0), | ||
| (0.0, 1.0, 1.0, 0.0), | ||
| (0.0, 1.0, 1.0, 0.0), | ||
| ], | ||
| ["col1", "col2", "col3", "col4"], | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any test input should reflect a real (though small and artificial) test case. This doesn't really do this for two reasons:
- The column names are uninformative and don't reflect what real column names would look like in the input.
- This function takes a table of delta comparisons as it's input. It's impossible for a delta comparison table to contain 2.0 as it's an indicator matrix (of an indicator matrix). So, the
test_inputhere is actually impossible.
Please take a look at the output of compare_deltas function in .../src/indicator_matrix/indicator_matrix.py to see what the input of this function should look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is much better!
If this is exactly the same dataset as that used as the test output for test_compare_deltas, it could be defined once as a fixture in conftest.py and re-used.
mary-cleaton
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see code-specific comments and address these before resubmitting for review. Thanks!
Description
A test has been added for calculate_njklm_values in test_matrix_a_star.py.
This is a feature change and does not break any existing functionality
Type of change
You can delete options that are not relevant.
Checklist:
Peer review
Any new code includes all the following:
Review comments
Suggestions should be tailored to the code that you are reviewing. Provide context.
Be critical and clear, but not mean. Ask questions and set actions.
These might include:
that it is likely to interact with?)
works correctly? Are there additional edge cases/ negative tests to be considered?
Further reading: code review best practices