🌐 AI搜索 & 代理 主页
Skip to content

Conversation

@DeaMariaLeon
Copy link
Member

@DeaMariaLeon DeaMariaLeon commented Aug 13, 2025

Reference Issues/PRs

Towards #26595

What does this implement/fix? Explain your changes.

This is part of "Display the (public) fitted attributes".

Any other comments?

Example
Screenshot 2025-08-20 at 18 14 43

@github-actions
Copy link

github-actions bot commented Aug 13, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 2005a4e. Link to the linter CI: here

@DeaMariaLeon DeaMariaLeon changed the title WIP: Display the shape of outgoing data structures WIP: Display the number of outgoing data structures Aug 18, 2025
@DeaMariaLeon DeaMariaLeon changed the title WIP: Display the number of outgoing data structures WIP: Display the number of output features Aug 18, 2025
@DeaMariaLeon DeaMariaLeon marked this pull request as ready for review August 21, 2025 09:03
@DeaMariaLeon
Copy link
Member Author

I wonder if I can have feedback before I add/fix more tests.
@glemaitre

@DeaMariaLeon DeaMariaLeon changed the title WIP: Display the number of output features ENH: Display the number of output features Aug 21, 2025
@glemaitre glemaitre self-requested a review August 22, 2025 09:03
@glemaitre
Copy link
Member

I see that we have to handle one specific case:

image

We have an internal PassThrough transformer that forward the input feature as-is and this it means that we should mention that the output feature are the same as n_features_in_.

@jeremiedbb
Copy link
Member

I find the block a bit big, it takes as much space as the estimator itself. I was also thinking that having the input features would be nice but then it really starts to take a lot of space around the estimator. So I wondered if the features could be intermediate blocks in the diagram, representing both the output features from the previous estimator and the input features for the next estimator. Something like this
output_features

This way the diagram alternates estimator blocks and data blocks
Then the text would be different obviously, like "16 features", or even the full shape ?

In addition, in this PR or in a following one, the data block could be unfold to show the feature names if available.

@glemaitre
Copy link
Member

One feedback of @ogrisel IRL is to directly show the feature names using the same pattern than "Parameters".

I personally agree with @jeremiedbb feedback: I would like something smaller. Also write now, we have to mention "output features" instead of simply "features" because of the ambiguity input/output when attached to the estimator. So the proposal to make the "feature" being blocks leaving on their own is nice I think because there is not ambiguity anymore.

@DeaMariaLeon
Copy link
Member Author

I'll work on this, thanks for the feedback. Just:

One feedback of @ogrisel IRL is to directly show the feature names using the same pattern than "Parameters".

Should I add the feature names on this PR? I remember @glemaitre saying that they should be added on a separate PR.

@glemaitre
Copy link
Member

Should I add the feature names on this PR?

I want to dissociate it at first but since we are going to create a new block, it might be better to have directly the feature names as well.

Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It really looks good. Now I just think that we need to work more in depth for the tests but the code is clean.

max-height: 10em;
overflow: auto;
scrollbar-width: thin;
padding: .25em .5em;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be inline with other tables. FYI, we don't need to take care about the vertical scrolling bar. The text will not overlap with it.

Suggested change
padding: .25em .5em;
padding: .25em 0.1rem;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glemaitre this change makes on a jupyter notebook to have a higher space for the other tables. It would look like this:
Screenshot 2025-11-28 at 11 50 50

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I included the change anyway.


def test_features_number_not_included():
est = LogisticRegression(C=10.0, fit_intercept=False)
assert "Output features" not in estimator_html_repr(est)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's create a test_features.py file to add those new test. I think that you need to update those tests. I think that we can test:

  • Make sure that we don't show anything when the estimator is not fitted
  • Make sure that we have have the boxes when the estimator is fitted
  • Make sure that we have the right feature names in the table
  • Make sure that it have the expected number of "X features" when having a column transformer in a pipeline
  • Make sure that thing works even with Transformer that do not provide the full scikit-learn API. In short, we should add a sklearn.utils._testing.MinimalTransformer in a Pipeline and check that it does not crash.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done - but I'm using estimator_html_repr to test. I wonder what think about that.

#
print(
"accuracy of the best model from randomized search: "
"accuracy of the best model from randomized search: "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should remember to revert this change for the final merge ;)


estimator_html_repr(model)


Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wrote the tests that follow.
WDYT?

@glemaitre
Copy link
Member

It will be worth adding a changelog for this improvement as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants