Skip to content

Conversation

@snejus
Copy link
Member

@snejus snejus commented Nov 15, 2025

Refactor item tagging and fix several underlying issues.

Fixes

  • Synchronise all artist list fields. Notably, artist_sort / artists_sort and artist_credit / artists_credit fields have not been synchronised.
  • Fix overwrite_null configuration which was previously ignored for fields defined in autotag/__init__.py::SPECIAL_FIELDS.

Updates

  • Move metadata application logic into Match objects: add Match.apply_metadata, AlbumMatch.apply_metadata, AlbumMatch.apply_album_metadata, and TrackMatch.apply_metadata; callers now use those methods instead of legacy free functions.
  • Remove legacy functions from beets.autotag.__init__ (apply_item_metadata, apply_album_metadata, apply_metadata) and related globals (SPECIAL_FIELDS, log), and export only core types (AlbumInfo, AlbumMatch, TrackInfo, TrackMatch, Proposal, Recommendation, tag_album, tag_item).
  • Add structured metadata facilities to Info and subclasses:
    • Info.type class property and nullable_fields for per-type 'overwrite_null' config.
    • Info.raw_data and Info.item_data computed properties to apply artist_credit rules, filter nulls, and map media-specific field names.
    • AlbumInfo and TrackInfo extend raw_data/item_data behavior to handle album/track specifics (date zeroing, tracktotal, mb_releasetrackid, per-disc numbering).
  • Introduce TrackInfo.merge_with_album to merge track-level data with album-level fallback for a final item payload.
  • Move correct_list_fields to hooks.py and update it to keep unmapped / non-media single/list fields in sync (artist <-> artists, albumtype <-> albumtypes, etc.).
  • Wire changes through the codebase:
    • Pass Item objects into TrackMatch in match.tag_item to enable item-level metadata application.
    • Replace calls to removed autotag apply functions with Match.apply_metadata invocations in beets/importer/tasks.py, beetsplug/bpsync.py, and beetsplug/mbsync.py.
    • Update importer logic to set album artist fallbacks for albumartists / mb_albumartistids when missing.
  • Add and update tests:
    • New test/autotag/test_hooks.py and test/autotag/test_match.py to validate new data mapping, list field synchronization, overwrite behavior, and assignment logic.

@snejus snejus requested a review from a team as a code owner November 15, 2025 16:05
@snejus snejus requested review from henry-oberholtzer and semohr and removed request for a team November 15, 2025 16:05
@github-actions
Copy link

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `beets/autotag/hooks.py:48-57` </location>
<code_context>
+def correct_list_fields(data: JSONDict) -> JSONDict:
</code_context>

<issue_to_address>
**issue (bug_risk):** The correct_list_fields function mutates its input dictionary in place.

This dual behavior may cause unexpected side effects for callers. Recommend copying the input or clearly documenting the mutation.
</issue_to_address>

### Comment 2
<location> `beets/autotag/hooks.py:88-89` </location>
<code_context>
 class AttrDict(dict[str, V]):
     """Mapping enabling attribute-style access to stored metadata values."""

     def copy(self) -> Self:
-        return deepcopy(self)
+        return self.__class__(**deepcopy(self))

     def __getattr__(self, attr: str) -> V:
</code_context>

<issue_to_address>
**suggestion:** AttrDict.copy uses **deepcopy(self), which may not preserve all dict semantics.

deepcopy(self) may fail for keys that are not valid identifiers or contain special characters. Using self.__class__(deepcopy(dict(self))) ensures all keys are handled correctly.

```suggestion
    def copy(self) -> Self:
        return self.__class__(deepcopy(dict(self)))
```
</issue_to_address>

### Comment 3
<location> `beets/autotag/hooks.py:120-129` </location>
<code_context>
+    def nullable_fields(cls) -> set[str]:
+        return set(config["overwrite_null"][cls.type.lower()].as_str_seq())
+
     @cached_property
     def name(self) -> str:
         raise NotImplementedError

+    @cached_property
+    def raw_data(self) -> JSONDict:
+        """Provide metadata with artist credits applied when configured."""
+        data = self.copy()
+        if config["artist_credit"]:
+            data.update(
+                artist=self.artist_credit or self.artist,
+                artists=self.artists_credit or self.artists,
+            )
+        return correct_list_fields(data)
+
+    @cached_property
+    def item_data(self) -> JSONDict:
+        """Metadata for items with field mappings and exclusions applied.
+
</code_context>

<issue_to_address>
**issue (bug_risk):** item_data uses dict union with data | {v: data.pop(k) ...}, which mutates data.

Using data.pop(k) within the dict comprehension alters the original data dictionary, which could cause issues if data is accessed elsewhere. To avoid side effects, create a new dictionary without mutating data.
</issue_to_address>

### Comment 4
<location> `test/autotag/test_hooks.py:186-194` </location>
<code_context>
+        assert self.items[0].month == 0
+        assert self.items[0].day == 0
+
+    def test_missing_date_applies_nothing(self):
+        self.items = [Item(year=1, month=2, day=3)]
+        self.info.update(year=None, month=None, day=None)
+
+        self._apply()
+
+        assert self.items[0].year == 1
+        assert self.items[0].month == 2
+        assert self.items[0].day == 3
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding a test for partial date updates (e.g., only year or only month present).

Adding tests for cases where only some date fields are updated will help verify correct handling of partial updates and prevent unintended data changes.

```suggestion
    def test_missing_date_applies_nothing(self):
        self.items = [Item(year=1, month=2, day=3)]
        self.info.update(year=None, month=None, day=None)

        self._apply()

        assert self.items[0].year == 1
        assert self.items[0].month == 2
        assert self.items[0].day == 3

    def test_partial_date_update_year_only(self):
        self.items = [Item(year=1, month=2, day=3)]
        self.info.update(year=2020, month=None, day=None)

        self._apply()

        assert self.items[0].year == 2020
        assert self.items[0].month == 0
        assert self.items[0].day == 0

    def test_partial_date_update_month_only(self):
        self.items = [Item(year=1, month=2, day=3)]
        self.info.update(year=None, month=5, day=None)

        self._apply()

        assert self.items[0].year == 1
        assert self.items[0].month == 5
        assert self.items[0].day == 0

    def test_partial_date_update_day_only(self):
        self.items = [Item(year=1, month=2, day=3)]
        self.info.update(year=None, month=None, day=15)

        self._apply()

        assert self.items[0].year == 1
        assert self.items[0].month == 2
        assert self.items[0].day == 15
```
</issue_to_address>

### Comment 5
<location> `test/autotag/test_hooks.py:162-167` </location>
<code_context>
+            },
+        ]
+
+    def test_autotag_items(self):
+        self._apply()
+
+        keys = self.expected_tracks[0].keys()
+        get_values = operator.itemgetter(*keys)
+
+        applied_data = [
+            dict(zip(keys, get_values(dict(i)))) for i in self.items
+        ]
+
+        assert applied_data == self.expected_tracks
+
+    def test_artist_credit_prefers_artist_over_albumartist_credit(self):
</code_context>

<issue_to_address>
**suggestion (testing):** Consider adding a test for 'from_scratch' import configuration.

Please add a test that enables 'from_scratch' and checks that items are cleared before metadata is applied.

```suggestion
    def test_from_scratch_clears_items_before_applying_metadata(self):
        # Enable 'from_scratch' configuration
        self.config['from_scratch'] = True

        # Simulate items with pre-existing metadata
        for item in self.items:
            item.artist = "PreExistingArtist"
            item.title = "PreExistingTitle"

        # Assert items have pre-existing metadata
        for item in self.items:
            assert item.artist == "PreExistingArtist"
            assert item.title == "PreExistingTitle"

        # Apply autotag with 'from_scratch' enabled
        self._apply()

        # Assert items are cleared before metadata is applied
        for item in self.items:
            assert item.artist != "PreExistingArtist"
            assert item.title != "PreExistingTitle"

        # Assert metadata is correctly applied
        keys = self.expected_tracks[0].keys()
        get_values = operator.itemgetter(*keys)
        applied_data = [
            dict(zip(keys, get_values(dict(i)))) for i in self.items
        ]
        assert applied_data == self.expected_tracks

    def test_artist_credit_prefers_artist_over_albumartist_credit(self):
        self.info.tracks[0].update(artist="oldArtist", artist_credit=None)

        self._apply(artist_credit=True)

        assert self.items[0].artist == "oldArtist"
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@codecov
Copy link

codecov bot commented Nov 15, 2025

Codecov Report

❌ Patch coverage is 94.78261% with 6 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (introduce-info-name-property@0b570a9). Learn more about missing BASE report.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
beets/autotag/hooks.py 96.73% 1 Missing and 2 partials ⚠️
beetsplug/bpsync.py 50.00% 2 Missing ⚠️
beets/importer/tasks.py 92.30% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@                       Coverage Diff                       @@
##             introduce-info-name-property    #6165   +/-   ##
===============================================================
  Coverage                                ?   67.48%           
===============================================================
  Files                                   ?      136           
  Lines                                   ?    18524           
  Branches                                ?     3118           
===============================================================
  Hits                                    ?    12500           
  Misses                                  ?     5365           
  Partials                                ?      659           
Files with missing lines Coverage Δ
beets/autotag/__init__.py 72.72% <ø> (ø)
beets/autotag/match.py 77.07% <100.00%> (ø)
beetsplug/mbsync.py 82.05% <100.00%> (ø)
beets/importer/tasks.py 90.64% <92.30%> (ø)
beetsplug/bpsync.py 19.04% <50.00%> (ø)
beets/autotag/hooks.py 98.50% <96.73%> (ø)
🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@snejus snejus requested a review from a team November 15, 2025 16:07
@snejus snejus force-pushed the introduce-info-name-property branch from 8b417f1 to 5f8ca35 Compare November 15, 2025 16:14
@snejus snejus force-pushed the simplify-item-tagging branch from 1cca6b5 to 1ef5f76 Compare November 15, 2025 16:14
@snejus snejus force-pushed the introduce-info-name-property branch from 5f8ca35 to 9514f7d Compare November 15, 2025 17:25
@snejus snejus force-pushed the simplify-item-tagging branch 3 times, most recently from c766ec8 to 56126ad Compare November 16, 2025 16:31
Consolidate multiple granular test methods in ApplyTest into a single
comprehensive test that validates all applied metadata at once. This
improves test maintainability and clarity by:

- Replacing ~20 individual test methods with one data-driven test
- Using expected data dictionaries to validate all fields together
- Removing ApplyCompilationTest class (covered by va=True in main test)
- Keeping focused tests for edge cases (artist_credit, date handling)
- Switching from BeetsTestCase to standard TestCase for speed
- Adding operator import for efficient data extraction

The new approach makes it easier to validate all applied metadata at once.
@snejus snejus force-pushed the introduce-info-name-property branch from 9514f7d to 0b570a9 Compare November 17, 2025 10:21
@snejus snejus force-pushed the simplify-item-tagging branch from 56126ad to a32c45e Compare November 17, 2025 10:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants