-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Centralise common autotagger search functionality #5982
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Reviewer's GuideAdds a shared SearchApiMetadataSourcePlugin abstraction to centralize query building, normalization, limit injection, error handling, and logging, then refactors all existing metadata source plugins and related tests to adopt the new hooks. Sequence diagram for unified search flow in metadata source pluginssequenceDiagram
participant Plugin as SearchApiMetadataSourcePlugin
participant UserCode as Importer/Autotagger
participant API as External API (e.g. Spotify, Deezer)
UserCode->>Plugin: candidates(items, artist, album, va_likely)
Plugin->>Plugin: get_search_filters(query_type, items, artist, name, va_likely)
Plugin->>Plugin: _search_api(query_type, query, filters)
Plugin->>Plugin: get_search_response(SearchParams)
Plugin->>API: Perform API request
API-->>Plugin: Return results
Plugin-->>UserCode: Return AlbumInfo/TrackInfo objects
Class diagram for the new SearchApiMetadataSourcePlugin abstraction and plugin refactoringclassDiagram
class SearchApiMetadataSourcePlugin {
+config
+_log
+get_search_filters(query_type, items, artist, name, va_likely)
+get_search_response(params)
+_search_api(query_type, query, filters)
+_get_candidates(query_type, ...)
+candidates(items, artist, album, va_likely)
+item_candidates(item, artist, title)
}
class SearchParams {
+query_type: QueryType
+query: str
+filters: dict[str, str]
}
class IDResponse {
+id: str
}
SearchApiMetadataSourcePlugin <|-- SpotifyPlugin
SearchApiMetadataSourcePlugin <|-- DeezerPlugin
SearchApiMetadataSourcePlugin <|-- MusicBrainzPlugin
SearchApiMetadataSourcePlugin <|-- DiscogsPlugin
SearchParams <.. SearchApiMetadataSourcePlugin
IDResponse <.. SearchApiMetadataSourcePlugin
class SpotifyPlugin {
+get_search_response(params)
}
class DeezerPlugin {
+get_search_response(params)
}
class MusicBrainzPlugin {
+get_search_filters(...)
+get_search_response(params)
}
class DiscogsPlugin {
+get_search_filters(...)
+get_search_response(params)
}
Class diagram for SearchParams and its usageclassDiagram
class SearchParams {
+query_type: QueryType
+query: str
+filters: dict[str, str]
}
SearchParams <.. SearchApiMetadataSourcePlugin: uses
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
|
@sourcery-ai review |
And centralise common search functionality inside the parent class
7234d74 to
1a6fb85
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes - here's some feedback:
- Consider marking get_search_filters as abstract so that each plugin must explicitly implement its own query-building logic instead of relying on the current no-op default.
- Align error handling and retry logic by moving the Spotify 401 re-authentication and other exception management into the base _search_api (or a shared hook) to ensure consistent behavior across all plugins.
- The QueryType alias is currently only defined under TYPE_CHECKING and may not be available at runtime—move its definition out of the TYPE_CHECKING guard to avoid missing references in annotations.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider marking get_search_filters as abstract so that each plugin must explicitly implement its own query-building logic instead of relying on the current no-op default.
- Align error handling and retry logic by moving the Spotify 401 re-authentication and other exception management into the base _search_api (or a shared hook) to ensure consistent behavior across all plugins.
- The QueryType alias is currently only defined under TYPE_CHECKING and may not be available at runtime—move its definition out of the TYPE_CHECKING guard to avoid missing references in annotations.
## Individual Comments
### Comment 1
<location> `beetsplug/discogs.py:246` </location>
<code_context>
- return filter(None, map(self.get_album_info, releases))
+ return f"{artist} - {name}", {"type": "release"}
+
+ def get_search_response(self, params: SearchParams) -> Sequence[IDResponse]:
+ """Returns a list of AlbumInfo objects for a discogs search query."""
+ limit = params.filters.pop("limit")
+ results = self.discogs_client.search(params.query, **params.filters)
+ results.per_page = limit
+ return [r.data for r in results.page(1)]
@cache
</code_context>
<issue_to_address>
Potential issue with 'limit' key removal from filters.
Using pop without a default will raise KeyError if 'limit' is missing. Use params.filters.pop('limit', None) to prevent exceptions.
</issue_to_address>
### Comment 2
<location> `beetsplug/discogs.py:241` </location>
<code_context>
+ name: str,
+ va_likely: bool,
+ ) -> tuple[str, dict[str, str]]:
+ if va_likely:
+ artist = items[0].artist
- try:
</code_context>
<issue_to_address>
Possible unintended override of artist for VA releases.
Using items[0].artist for VA releases could be inaccurate if items include multiple artists. Please verify that this logic aligns with the expected handling of VA releases.
</issue_to_address>
### Comment 3
<location> `beetsplug/spotify.py:437` </location>
<code_context>
- )
- return response_data
+ response.raise_for_status()
+ except requests.exceptions.HTTPError:
+ if response.status_code == 401:
+ self._authenticate()
+ return self.get_search_response(params)
+
+ raise
+
+ return response.json().get(f"{params.query_type}s", {}).get("items", [])
</code_context>
<issue_to_address>
Recursive call to get_search_response on authentication failure could lead to infinite recursion.
Add a retry limit or alternative handling to prevent stack overflow from repeated authentication failures.
</issue_to_address>
### Comment 4
<location> `test/plugins/test_musicbrainz.py:1043` </location>
<code_context>
+ lambda *_, **__: {"recording": self.RECORDING},
+ )
candidates = list(mb.item_candidates(Item(), "hello", "there"))
</code_context>
<issue_to_address>
Test only covers successful candidate retrieval; missing error/edge case coverage.
Please add tests for scenarios like empty results, exceptions, or malformed data from the search API to verify error handling and logging.
Suggested implementation:
```python
mbid = "d2a6f856-b553-40a0-ac54-a321e8e2da99"
RECORDING = {"title": "foo", "id": mbid, "length": 42}
@pytest.fixture
def plugin_config(self):
"musicbrainzngs.search_recordings",
lambda *_, **__: {"recording-list": [self.RECORDING]},
)
monkeypatch.setattr(
def test_item_candidates_empty_results(monkeypatch):
# Simulate empty results from the search API
monkeypatch.setattr(
"musicbrainzngs.search_recordings",
lambda *_, **__: {"recording-list": []},
)
mb = MyMusicBrainzPlugin()
candidates = list(mb.item_candidates(Item(), "hello", "there"))
assert candidates == [], "Expected no candidates for empty results"
def test_item_candidates_exception(monkeypatch, caplog):
# Simulate an exception from the search API
def raise_exception(*_, **__):
raise RuntimeError("API failure")
monkeypatch.setattr(
"musicbrainzngs.search_recordings",
raise_exception,
)
mb = MyMusicBrainzPlugin()
with caplog.at_level("ERROR"):
candidates = list(mb.item_candidates(Item(), "hello", "there"))
assert candidates == [], "Expected no candidates when exception occurs"
assert any("API failure" in m for m in caplog.text.splitlines()), "Expected error to be logged"
def test_item_candidates_malformed_data(monkeypatch, caplog):
# Simulate malformed data from the search API
monkeypatch.setattr(
"musicbrainzngs.search_recordings",
lambda *_, **__: {"unexpected-key": "unexpected-value"},
)
mb = MyMusicBrainzPlugin()
with caplog.at_level("ERROR"):
candidates = list(mb.item_candidates(Item(), "hello", "there"))
assert candidates == [], "Expected no candidates for malformed data"
assert any("malformed" in m.lower() or "unexpected" in m.lower() for m in caplog.text.splitlines()), "Expected malformed data error to be logged"
```
- You may need to adjust the error handling and logging in your plugin code (`item_candidates`) to ensure that exceptions and malformed data are properly caught and logged as errors.
- The test assumes the existence of a `MyMusicBrainzPlugin` class and an `Item` class, and that logging is performed using Python's standard logging module.
- If your plugin uses a different logging mechanism, update the tests accordingly.
</issue_to_address>
### Comment 5
<location> `test/plugins/test_musicbrainz.py:995` </location>
<code_context>
- RECORDING = {"title": "foo", "id": "bar", "length": 42}
+ RECORDING = {"title": "foo", "id": mbid, "length": 42}
@pytest.fixture
def plugin_config(self):
</code_context>
<issue_to_address>
No tests for album candidate search with new abstraction.
Please add or update tests for album candidate search with the new abstraction, covering both VA and non-VA albums.
Suggested implementation:
```python
mbid = "d2a6f856-b553-40a0-ac54-a321e8e2da99"
RECORDING = {"title": "foo", "id": mbid, "length": 42}
ALBUM_MBID = "1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d"
VA_ALBUM_MBID = "va1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d"
ALBUM = {
"title": "Test Album",
"id": ALBUM_MBID,
"artist-credit": [{"artist": {"name": "Test Artist"}}],
"release-group": {"type": "Album"},
"release-list": [],
}
VA_ALBUM = {
"title": "VA Compilation",
"id": VA_ALBUM_MBID,
"artist-credit": [{"artist": {"name": "Various Artists"}}],
"release-group": {"type": "Album"},
"release-list": [],
}
```
```python
@pytest.fixture
def plugin_config(self):
# Existing setup for recordings
monkeypatch.setattr(
"musicbrainzngs.search_recordings",
lambda *_, **__: {"recording-list": [self.RECORDING]},
)
monkeypatch.setattr(
"musicbrainzngs.get_recording_by_id",
lambda *_, **__: {"recording": self.RECORDING},
)
# Setup for album candidate search
monkeypatch.setattr(
"musicbrainzngs.search_releases",
lambda *_, **__: {"release-list": [self.ALBUM, self.VA_ALBUM]},
)
monkeypatch.setattr(
"musicbrainzngs.get_release_by_id",
lambda id, *_, **__: {"release": self.ALBUM if id == self.ALBUM_MBID else self.VA_ALBUM},
)
candidates = list(mb.item_candidates(Item(), "hello", "there"))
def test_album_candidate_search_non_va(self, plugin_config):
# Test non-VA album candidate search
candidates = list(mb.album_candidates("Test Album", "Test Artist"))
assert any(c["id"] == self.ALBUM_MBID for c in candidates)
assert all("Various Artists" not in c["artist-credit"][0]["artist"]["name"] for c in candidates)
def test_album_candidate_search_va(self, plugin_config):
# Test VA album candidate search
candidates = list(mb.album_candidates("VA Compilation", "Various Artists"))
assert any(c["id"] == self.VA_ALBUM_MBID for c in candidates)
assert any("Various Artists" in c["artist-credit"][0]["artist"]["name"] for c in candidates)
```
- Ensure that the `mb.album_candidates` abstraction exists and is implemented in your codebase. If not, you will need to implement it to match the new abstraction for album candidate search.
- If your test class uses `self`, make sure the test methods are part of a class and that the setup assigns the album/VA album attributes to `self`.
- Adjust the monkeypatching and test logic to fit your actual plugin and test structure.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| def get_search_response(self, params: SearchParams) -> Sequence[IDResponse]: | ||
| """Returns a list of AlbumInfo objects for a discogs search query.""" | ||
| limit = params.filters.pop("limit") | ||
| results = self.discogs_client.search(params.query, **params.filters) | ||
| results.per_page = limit | ||
| return [r.data for r in results.page(1)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Potential issue with 'limit' key removal from filters.
Using pop without a default will raise KeyError if 'limit' is missing. Use params.filters.pop('limit', None) to prevent exceptions.
| if va_likely: | ||
| artist = items[0].artist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question (bug_risk): Possible unintended override of artist for VA releases.
Using items[0].artist for VA releases could be inaccurate if items include multiple artists. Please verify that this logic aligns with the expected handling of VA releases.
| except requests.exceptions.HTTPError: | ||
| if response.status_code == 401: | ||
| self._authenticate() | ||
| return self.get_search_response(params) | ||
|
|
||
| raise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
issue (bug_risk): Recursive call to get_search_response on authentication failure could lead to infinite recursion.
Add a retry limit or alternative handling to prevent stack overflow from repeated authentication failures.
| lambda *_, **__: {"recording": self.RECORDING}, | ||
| ) | ||
|
|
||
| candidates = list(mb.item_candidates(Item(), "hello", "there")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): Test only covers successful candidate retrieval; missing error/edge case coverage.
Please add tests for scenarios like empty results, exceptions, or malformed data from the search API to verify error handling and logging.
Suggested implementation:
mbid = "d2a6f856-b553-40a0-ac54-a321e8e2da99"
RECORDING = {"title": "foo", "id": mbid, "length": 42}
@pytest.fixture
def plugin_config(self):
"musicbrainzngs.search_recordings",
lambda *_, **__: {"recording-list": [self.RECORDING]},
)
monkeypatch.setattr(
def test_item_candidates_empty_results(monkeypatch):
# Simulate empty results from the search API
monkeypatch.setattr(
"musicbrainzngs.search_recordings",
lambda *_, **__: {"recording-list": []},
)
mb = MyMusicBrainzPlugin()
candidates = list(mb.item_candidates(Item(), "hello", "there"))
assert candidates == [], "Expected no candidates for empty results"
def test_item_candidates_exception(monkeypatch, caplog):
# Simulate an exception from the search API
def raise_exception(*_, **__):
raise RuntimeError("API failure")
monkeypatch.setattr(
"musicbrainzngs.search_recordings",
raise_exception,
)
mb = MyMusicBrainzPlugin()
with caplog.at_level("ERROR"):
candidates = list(mb.item_candidates(Item(), "hello", "there"))
assert candidates == [], "Expected no candidates when exception occurs"
assert any("API failure" in m for m in caplog.text.splitlines()), "Expected error to be logged"
def test_item_candidates_malformed_data(monkeypatch, caplog):
# Simulate malformed data from the search API
monkeypatch.setattr(
"musicbrainzngs.search_recordings",
lambda *_, **__: {"unexpected-key": "unexpected-value"},
)
mb = MyMusicBrainzPlugin()
with caplog.at_level("ERROR"):
candidates = list(mb.item_candidates(Item(), "hello", "there"))
assert candidates == [], "Expected no candidates for malformed data"
assert any("malformed" in m.lower() or "unexpected" in m.lower() for m in caplog.text.splitlines()), "Expected malformed data error to be logged"- You may need to adjust the error handling and logging in your plugin code (
item_candidates) to ensure that exceptions and malformed data are properly caught and logged as errors. - The test assumes the existence of a
MyMusicBrainzPluginclass and anItemclass, and that logging is performed using Python's standard logging module. - If your plugin uses a different logging mechanism, update the tests accordingly.
| RECORDING = {"title": "foo", "id": "bar", "length": 42} | ||
| RECORDING = {"title": "foo", "id": mbid, "length": 42} | ||
|
|
||
| @pytest.fixture |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (testing): No tests for album candidate search with new abstraction.
Please add or update tests for album candidate search with the new abstraction, covering both VA and non-VA albums.
Suggested implementation:
mbid = "d2a6f856-b553-40a0-ac54-a321e8e2da99"
RECORDING = {"title": "foo", "id": mbid, "length": 42}
ALBUM_MBID = "1a2b3c4d-5e6f-7a8b-9c0d-1e2f3a4b5c6d"
VA_ALBUM_MBID = "va1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d"
ALBUM = {
"title": "Test Album",
"id": ALBUM_MBID,
"artist-credit": [{"artist": {"name": "Test Artist"}}],
"release-group": {"type": "Album"},
"release-list": [],
}
VA_ALBUM = {
"title": "VA Compilation",
"id": VA_ALBUM_MBID,
"artist-credit": [{"artist": {"name": "Various Artists"}}],
"release-group": {"type": "Album"},
"release-list": [],
} @pytest.fixture
def plugin_config(self):
# Existing setup for recordings
monkeypatch.setattr(
"musicbrainzngs.search_recordings",
lambda *_, **__: {"recording-list": [self.RECORDING]},
)
monkeypatch.setattr(
"musicbrainzngs.get_recording_by_id",
lambda *_, **__: {"recording": self.RECORDING},
)
# Setup for album candidate search
monkeypatch.setattr(
"musicbrainzngs.search_releases",
lambda *_, **__: {"release-list": [self.ALBUM, self.VA_ALBUM]},
)
monkeypatch.setattr(
"musicbrainzngs.get_release_by_id",
lambda id, *_, **__: {"release": self.ALBUM if id == self.ALBUM_MBID else self.VA_ALBUM},
)
candidates = list(mb.item_candidates(Item(), "hello", "there"))
def test_album_candidate_search_non_va(self, plugin_config):
# Test non-VA album candidate search
candidates = list(mb.album_candidates("Test Album", "Test Artist"))
assert any(c["id"] == self.ALBUM_MBID for c in candidates)
assert all("Various Artists" not in c["artist-credit"][0]["artist"]["name"] for c in candidates)
def test_album_candidate_search_va(self, plugin_config):
# Test VA album candidate search
candidates = list(mb.album_candidates("VA Compilation", "Various Artists"))
assert any(c["id"] == self.VA_ALBUM_MBID for c in candidates)
assert any("Various Artists" in c["artist-credit"][0]["artist"]["name"] for c in candidates)- Ensure that the
mb.album_candidatesabstraction exists and is implemented in your codebase. If not, you will need to implement it to match the new abstraction for album candidate search. - If your test class uses
self, make sure the test methods are part of a class and that the setup assigns the album/VA album attributes toself. - Adjust the monkeypatching and test logic to fit your actual plugin and test structure.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5982 +/- ##
==========================================
+ Coverage 66.49% 66.59% +0.10%
==========================================
Files 117 117
Lines 18119 18102 -17
Branches 3070 3071 +1
==========================================
+ Hits 12048 12055 +7
+ Misses 5415 5390 -25
- Partials 656 657 +1
🚀 New features to boost your workflow:
|
Add a unified search abstraction across metadata source plugins.
Summary:
SearchApiMetadataSourcePluginwithSearchParams,get_search_filters, andget_search_responsehooks to standardize album/track searches._search_apiand query construction logic in Deezer, Spotify, MusicBrainz, and Discogs plugins with the new shared implementation.search_limitvia a common path (added automatically as a filter) and applies optional ASCII normalization once.get_search_filters.Summary by Sourcery
Centralize and standardize autotagger search functionality by introducing a shared SearchApiMetadataSourcePlugin and refactoring existing metadata source plugins to use it
New Features:
Enhancements:
Tests: