Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
52 changes: 38 additions & 14 deletions beets/autotag/match.py
Original file line number Diff line number Diff line change
Expand Up @@ -236,11 +236,30 @@ def _add_candidate(
)


def _parse_search_terms_with_fallbacks(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing that you still want to use this helper method after reading my comments, I'm now concerned about my ability to understand the code.

I honestly have an issue trying to understand what this function does. I failed to do this by reading the code, so I had to manually run search in order to see what it does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got a simple mp3 track:

$ tag track.mp3
IDv2 tag info for track.mp3
TALB=album
TIT2=title

Previously

import artist search album/title search resulting search terms
album None None - album
album "Artist" "" - album
album "" "Album" - album
album "Artist" "Album" Artist - Album
singleton None None - title
singleton "Artist" "" Artist - title
singleton "" "Title" - Title
singleton "Artist" "Title" Artist - Title

I'm actually surprised to see that album search requires both terms. Apparently, I misunderstood even the simple if not (search_artist and search_album) condition 😁.

Now

import artist search album/title search resulting search terms
album None None - album
album "Artist" "" Artist -
album "" "Album" - Album
album "Artist" "Album" Artist - Album
singleton None None - title
singleton "Artist" "" Artist -
singleton "" "Title" - Title
singleton "Artist" "Title" Artist - Title

I take back what I said regarding 'we're keeping the previous functionality'. What you've got here aligns with my expectation: we only need one of the search terms to be present for the search to override item data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I also had issue with understanding the logic especially since it was different in singletons and albums. Have you seen the test I added?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did, see the rest of the comments

*pairs: tuple[str | None, str],
) -> tuple[str, ...]:
"""Given pairs of (search term, fallback), return a tuple of
search terms. If **all** search terms are empty, returns the fallback. Otherwise,
return the search terms even if some are empty.

Examples:
(("", "F1"), ("B", "F2")) -> ("", "B")
(("", "F1"), ("", "F2")) -> ("F1", "F2")
(("A", "F1"), (None, "F2")) -> ("A", "")
((None, "F1"), (None, "F2")) -> ("F1", "F2")
"""
if any(term for term, _ in pairs):
return tuple(term or "" for term, _ in pairs)
else:
return tuple(fallback for _, fallback in pairs)


def tag_album(
items,
search_artist: str | None = None,
search_album: str | None = None,
Comment on lines 260 to 261
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we handle only strings here:

    search_artist: str,
    search_album: str,

And update the tag_item and tag_album calls in beets.importer.tasks to explicitly provide empty strings.

search_ids: list[str] = [],
search_ids: list[str] | None = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice use case for a Sequence with a default of empty tuple:

Suggested change
search_ids: list[str] | None = None,
search_ids: Sequence[str] = (),

This way, you can simplify the following:

-    if search_ids:
-        for search_id in search_ids:
+    for search_id in search_ids:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would than also allow for non list entries. I remember we opted to not use Sequence[str] in some other places because of this reason.

# both valid:
search_ids: Sequence[str] = "fooo"
search_ids: Sequence[str] = ["fooo", "baar"]

) -> tuple[str, str, Proposal]:
"""Return a tuple of the current artist name, the current album
name, and a `Proposal` containing `AlbumMatch` candidates.
Expand Down Expand Up @@ -294,23 +313,24 @@ def tag_album(
Proposal(list(candidates.values()), rec),
)

# Search terms.
if not (search_artist and search_album):
# No explicit search terms -- use current metadata.
search_artist, search_album = cur_artist, cur_album
log.debug("Search terms: {} - {}", search_artist, search_album)
# Manually provided search terms or fallbacks.
_search_artist, _search_album = _parse_search_terms_with_fallbacks(
(search_artist, cur_artist),
(search_album, cur_album),
)
Comment on lines +317 to +320
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my comment above, we don't need to handle Nones any more, so this can be simplified to

if not search_artist and not search_album:
    search_artist, search_album = cur_artist, cur_album

log.debug("Search terms: {} - {}", _search_artist, _search_album)

# Is this album likely to be a "various artist" release?
va_likely = (
(not consensus["artist"])
or (search_artist.lower() in VA_ARTISTS)
or (_search_artist.lower() in VA_ARTISTS)
or any(item.comp for item in items)
)
log.debug("Album might be VA: {}", va_likely)

# Get the results from the data sources.
for matched_candidate in metadata_plugins.candidates(
items, search_artist, search_album, va_likely
items, _search_artist, _search_album, va_likely
):
_add_candidate(items, candidates, matched_candidate)

Expand All @@ -322,7 +342,7 @@ def tag_album(


def tag_item(
item,
item: Item,
search_artist: str | None = None,
search_title: str | None = None,
search_ids: list[str] | None = None,
Expand Down Expand Up @@ -364,14 +384,18 @@ def tag_item(
else:
return Proposal([], Recommendation.none)

# Search terms.
search_artist = search_artist or item.artist
search_title = search_title or item.title
log.debug("Item search terms: {} - {}", search_artist, search_title)
# Manually provided search terms or fallbacks.
_search_artist, _search_title = _parse_search_terms_with_fallbacks(
(search_artist, item.artist),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, item.artist and item.title types are not present. Let's add them:

class LibModel(dbcore.Model["Library"]):
    ...
    path: bytes
    artist: str

...

class Item(LibModel):
    title: str

(search_title, item.title),
)
log.debug("Item search terms: {} - {}", _search_artist, _search_title)

# Get and evaluate candidate metadata.
for track_info in metadata_plugins.item_candidates(
item, search_artist, search_title
item,
_search_artist,
_search_title,
):
dist = track_distance(item, track_info, incl_artist=True)
candidates[track_info.track_id] = hooks.TrackMatch(dist, track_info)
Expand Down
27 changes: 19 additions & 8 deletions beets/metadata_plugins.py
Original file line number Diff line number Diff line change
Expand Up @@ -35,17 +35,21 @@ def find_metadata_source_plugins() -> list[MetadataSourcePlugin]:


@notify_info_yielded("albuminfo_received")
def candidates(*args, **kwargs) -> Iterable[AlbumInfo]:
def candidates(
items: Sequence[Item], artist: str, album: str, va_likely: bool
) -> Iterable[AlbumInfo]:
"""Return matching album candidates from all metadata source plugins."""
for plugin in find_metadata_source_plugins():
yield from plugin.candidates(*args, **kwargs)
yield from plugin.candidates(
items=items, artist=artist, album=album, va_likely=va_likely
)


@notify_info_yielded("trackinfo_received")
def item_candidates(*args, **kwargs) -> Iterable[TrackInfo]:
"""Return matching track candidates fromm all metadata source plugins."""
def item_candidates(item: Item, artist: str, title: str) -> Iterable[TrackInfo]:
"""Return matching track candidates from all metadata source plugins."""
for plugin in find_metadata_source_plugins():
yield from plugin.item_candidates(*args, **kwargs)
yield from plugin.item_candidates(item=item, artist=artist, title=title)


def album_for_id(_id: str) -> AlbumInfo | None:
Expand Down Expand Up @@ -157,15 +161,22 @@ def candidates(

@abc.abstractmethod
def item_candidates(
self, item: Item, artist: str, title: str
self,
item: Item,
artist: str,
title: str,
) -> Iterable[TrackInfo]:
"""Return :py:class:`TrackInfo` candidates that match the given track.

Used in the autotag functionality to search for tracks.

:param item: Track item
:param artist: Track artist
:param title: Track title
:param artist: Track artist, either a search manually provided or
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's undo this - the plugin is not supposed to know where this data comes from. This is going to get out of sync given how plugins may extend it (and it probably is given the fromfilename plugin).

Copy link
Contributor Author

@semohr semohr Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue I see here, is that there are two places to get this information from. E.g. either a plugin gets it from item.artist or from artist. For me, until looking into the source, it was not clear what are the differences. Giving some more information here makes this a bit more clear in my opinion. We can change the wording but we should definitely mention this somehow.

A bit in the line of how are artist and title more special in this case as every other field defined in item?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main question is: do these two fields always come from the item?

Track artist, either a search manually provided or preprocessed from the item

preprocessed from the item. If no metadata is available an empty string
is passed.
:param title: Track title, either a search manually provided or
preprocessed from the item. If no metadata is available an empty string
is passed.
"""
raise NotImplementedError

Expand Down
5 changes: 5 additions & 0 deletions docs/changelog.rst
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,11 @@ Other changes:
- The documentation chapter :doc:`dev/paths` has been moved to the "For
Developers" section and revised to reflect current best practices (pathlib
usage).
- Standardized ``search_*`` parameter handling in autotag matchers. Manual album
and singleton searches now behave consistently: when a user does not specify a
search query in the prompt, the system defaults to using the corresponding
value from the metadata. This was already the case for albums but not for
singletons.

2.5.1 (October 14, 2025)
------------------------
Expand Down
40 changes: 40 additions & 0 deletions test/autotag/test_match.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
import pytest

from beets.autotag.match import _parse_search_terms_with_fallbacks


class TestSearchTermHandling:
@pytest.mark.parametrize(
"input_pairs, expected",
[
(
(("A", "F1"), ("B", "F2")),
("A", "B"),
),
(
(("", "F1"), ("B", "F2")),
("", "B"),
),
(
(("", "F1"), ("", "F2")),
("F1", "F2"),
),
(
(("A", "F1"), (None, "F2")),
("A", ""),
),
(
((None, "F1"), (None, "F2")),
("F1", "F2"),
),
],
)
def test_search_parsing(self, input_pairs, expected):
result = _parse_search_terms_with_fallbacks(*input_pairs)
assert result == expected

# Should also apply for the reversed order of inputs
reversed_pairs = tuple(reversed(input_pairs))
reversed_expected = tuple(reversed(expected))
reversed_result = _parse_search_terms_with_fallbacks(*reversed_pairs)
assert reversed_result == reversed_expected