Skip to content

Conversation

@amogus07
Copy link
Contributor

Description

closes #6123

To Do

  • Documentation
  • Changelog
  • Tests

@amogus07 amogus07 requested a review from a team as a code owner October 24, 2025 12:46
@github-actions
Copy link

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes - here's some feedback:

Blocking issues:

  • Detected subprocess function 'Popen' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'. (link)

General comments:

  • util/io.py has grown very large—consider decomposing it into smaller, concern-focused modules (e.g. separate path utilities from file operations) to improve readability and reduce merge conflict surface.
  • Review the all declarations in beets/util/init.py (and other moved modules) to ensure they export only the intended public API after the refactor.
  • Add module-level docstrings to the new util/exceptions.py and util/misc.py to clarify each file's responsibility and aid future maintainers.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- util/io.py has grown very large—consider decomposing it into smaller, concern-focused modules (e.g. separate path utilities from file operations) to improve readability and reduce merge conflict surface.
- Review the __all__ declarations in beets/util/__init__.py (and other moved modules) to ensure they export only the intended public API after the refactor.
- Add module-level docstrings to the new util/exceptions.py and util/misc.py to clarify each file's responsibility and aid future maintainers.

## Individual Comments

### Comment 1
<location> `beets/util/__init__.py:32` </location>
<code_context>
-
-import beets
-from beets.util import hidden
+from beets.util.exceptions import HumanReadabaleErrorArgs, HumanReadableError
+from beets.util.io import (
+    CHAR_REPLACE,
</code_context>

<issue_to_address>
**issue (typo):** Typo in 'HumanReadabaleErrorArgs' import name.

Please update the import to 'HumanReadableErrorArgs' for consistency.

```suggestion
from beets.util.exceptions import HumanReadableErrorArgs, HumanReadableError
```
</issue_to_address>

### Comment 2
<location> `beets/util/exceptions.py:11-14` </location>
<code_context>
+    from logging import Logger
+
+
+class HumanReadabaleErrorArgs(TypedDict):
+    reason: BaseException | bytes | str
+    verb: str
</code_context>

<issue_to_address>
**issue (typo):** Typo in 'HumanReadabaleErrorArgs' class name.

Please rename the class to 'HumanReadableErrorArgs'.

```suggestion
class HumanReadableErrorArgs(TypedDict):
    reason: BaseException | bytes | str
    verb: str
    tb: NotRequired[str | None]
```
</issue_to_address>

### Comment 3
<location> `beets/util/io.py:835-842` </location>
<code_context>
    proc: subprocess.Popen[bytes] = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        stdin=devnull,
        close_fds=platform.system() != "Windows",
        shell=shell,
    )
</code_context>

<issue_to_address>
**security (python.lang.security.audit.dangerous-subprocess-use-audit):** Detected subprocess function 'Popen' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

*Source: opengrep*
</issue_to_address>

### Comment 4
<location> `beets/util/io.py:226` </location>
<code_context>
def mkdirall(path: bytes) -> None:
    """Make all the enclosing directories of path (like mkdir -p on the
    parent).
    """
    ancestor: bytes
    for ancestor in ancestry(path):
        if not os.path.isdir(syspath(ancestor)):
            try:
                os.mkdir(syspath(ancestor))
            except OSError as exc:
                raise FilesystemError(
                    reason=exc,
                    verb="create",
                    paths=(ancestor,),
                    tb=traceback.format_exc(),
                )

</code_context>

<issue_to_address>
**suggestion (code-quality):** Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))

```suggestion
                ) from exc
```
</issue_to_address>

### Comment 5
<location> `beets/util/io.py:298` </location>
<code_context>
def components(path: AnyStr) -> list[AnyStr]:
    """Return a list of the path components in path. For instance:

       >>> components(b'/a/b/c')
       ['a', 'b', 'c']

    The argument should *not* be the result of a call to `syspath`.
    """
    comps: list[AnyStr] = []
    ances: list[AnyStr] = ancestry(path)
    for anc in ances:
        comp: AnyStr = os.path.basename(anc)
        if comp:
            comps.append(comp)
        else:  # root
            comps.append(anc)

    last: AnyStr = os.path.basename(path)
    if last:
        comps.append(last)

    return comps

</code_context>

<issue_to_address>
**issue (code-quality):** Use named expression to simplify assignment and conditional [×2] ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
</issue_to_address>

### Comment 6
<location> `beets/util/io.py:412` </location>
<code_context>
def remove(path: PathLike, soft: bool = True) -> None:
    """Remove the file. If `soft`, then no error will be raised if the
    file does not exist.
    """
    str_path: str = syspath(path)
    if not str_path or (soft and not os.path.exists(str_path)):
        return
    try:
        os.remove(str_path)
    except OSError as exc:
        raise FilesystemError(
            reason=exc,
            verb="delete",
            paths=(str_path,),
            tb=traceback.format_exc(),
        )

</code_context>

<issue_to_address>
**suggestion (code-quality):** Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))

```suggestion
        ) from exc
```
</issue_to_address>

### Comment 7
<location> `beets/util/io.py:437` </location>
<code_context>
def copy(path: bytes, dest: bytes, replace: bool = False) -> None:
    """Copy a plain file. Permissions are not copied. If `dest` already
    exists, raises a FilesystemError unless `replace` is True. Has no
    effect if `path` is the same as `dest`. Paths are translated to
    system paths before the syscall.
    """
    if samefile(path, dest):
        return
    str_path: str = syspath(path)
    str_dest: str = syspath(dest)
    if not replace and os.path.exists(str_dest):
        raise FilesystemError(
            reason="file exists", verb="copy", paths=(str_path, str_dest)
        )
    try:
        _ = shutil.copyfile(str_path, str_dest)
    except OSError as exc:
        raise FilesystemError(
            reason=exc,
            verb="copy",
            paths=(str_path, str_dest),
            tb=traceback.format_exc(),
        )

</code_context>

<issue_to_address>
**suggestion (code-quality):** Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))

```suggestion
        ) from exc
```
</issue_to_address>

### Comment 8
<location> `beets/util/io.py:465` </location>
<code_context>
def move(path: bytes, dest: bytes, replace: bool = False) -> None:
    """Rename a file. `dest` may not be a directory. If `dest` already
    exists, raises an OSError unless `replace` is True. Has no effect if
    `path` is the same as `dest`. Paths are translated to system paths.
    """
    if os.path.isdir(syspath(path)):
        raise FilesystemError(
            reason="source is directory", verb="move", paths=(path, dest)
        )
    if os.path.isdir(syspath(dest)):
        raise FilesystemError(
            reason="destination is directory", verb="move", paths=(path, dest)
        )
    if samefile(path, dest):
        return
    if os.path.exists(syspath(dest)) and not replace:
        raise FilesystemError(
            reason="file exists", verb="rename", paths=(path, dest)
        )

    # First, try renaming the file.
    try:
        os.replace(syspath(path), syspath(dest))
    except OSError:
        # Copy the file to a temporary destination.
        basename: bytes = os.path.basename(bytestring_path(dest))
        dirname: bytes = os.path.dirname(bytestring_path(dest))
        tmp: IO[bytes] = tempfile.NamedTemporaryFile(
            suffix=".beets",
            prefix=f".{os.fsdecode(basename)}.",
            dir=syspath(dirname),
            delete=False,
        )
        try:
            with open(syspath(path), "rb") as f:
                # mypy bug:
                # - https://github.com/python/mypy/issues/15031
                # - https://github.com/python/mypy/issues/14943
                # Fix not yet released:
                # - https://github.com/python/mypy/pull/14975
                shutil.copyfileobj(f, tmp)  # type: ignore[misc]
        finally:
            tmp.close()

        try:
            # Copy file metadata
            shutil.copystat(syspath(path), tmp.name)
        except OSError:
            # Ignore errors because it doesn't matter too much.  We may be on a
            # filesystem that doesn't support this.
            pass

        # Move the copied file into place.
        tmp_filename: str = tmp.name
        try:
            os.replace(tmp_filename, syspath(dest))
            tmp_filename = ""
            os.remove(syspath(path))
        except OSError as exc:
            raise FilesystemError(
                reason=exc,
                verb="move",
                paths=(path, dest),
                tb=traceback.format_exc(),
            )
        finally:
            if tmp_filename:
                os.remove(tmp_filename)

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Extract code out into function ([`extract-method`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/extract-method/))
- Use `contextlib`'s `suppress` method to silence an error ([`use-contextlib-suppress`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-contextlib-suppress/))
- Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))
</issue_to_address>

### Comment 9
<location> `beets/util/io.py:522` </location>
<code_context>
def link(path: bytes, dest: bytes, replace: bool = False) -> None:
    """Create a symbolic link from path to `dest`. Raises an OSError if
    `dest` already exists, unless `replace` is True. Does nothing if
    `path` == `dest`.
    """
    if samefile(path, dest):
        return

    if os.path.exists(syspath(dest)) and not replace:
        raise FilesystemError(
            reason="file exists", verb="rename", paths=(path, dest)
        )
    try:
        os.symlink(syspath(path), syspath(dest))
    except NotImplementedError:
        # raised on python >= 3.2 and Windows versions before Vista
        raise FilesystemError(
            reason="OS does not support symbolic links.link",
            verb="link",
            paths=(path, dest),
            tb=traceback.format_exc(),
        )
    except OSError as exc:
        raise FilesystemError(
            reason=exc,
            verb="link",
            paths=(path, dest),
            tb=traceback.format_exc(),
        )

</code_context>

<issue_to_address>
**issue (code-quality):** Explicitly raise from a previous error [×2] ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))
</issue_to_address>

### Comment 10
<location> `beets/util/io.py:553` </location>
<code_context>
def hardlink(path: bytes, dest: bytes, replace: bool = False) -> None:
    """Create a hard link from path to `dest`. Raises an OSError if
    `dest` already exists, unless `replace` is True. Does nothing if
    `path` == `dest`.
    """
    if samefile(path, dest):
        return

    if os.path.exists(syspath(dest)) and not replace:
        raise FilesystemError(
            reason="file exists", verb="rename", paths=(path, dest)
        )
    try:
        os.link(syspath(path), syspath(dest))
    except NotImplementedError:
        raise FilesystemError(
            reason="OS does not support hard links.link",
            verb="hardlink",
            paths=(path, dest),
            tb=traceback.format_exc(),
        )
    except OSError as exc:
        if exc.errno == errno.EXDEV:
            raise FilesystemError(
                reason="Cannot hard link across devices.link",
                verb="hardlink",
                paths=(path, dest),
                tb=traceback.format_exc(),
            )
        else:
            raise FilesystemError(
                reason=exc,
                verb="link",
                paths=(path, dest),
                tb=traceback.format_exc(),
            )

</code_context>

<issue_to_address>
**issue (code-quality):** Explicitly raise from a previous error [×2] ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))
</issue_to_address>

### Comment 11
<location> `beets/util/io.py:779` </location>
<code_context>
def str2bool(value: str) -> bool:
    """Returns a boolean reflecting a human-entered string."""
    return value.lower() in ("yes", "1", "true", "t", "y")

</code_context>

<issue_to_address>
**suggestion (code-quality):** Use set when checking membership of a collection of literals ([`collection-into-set`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/collection-into-set/))

```suggestion
    return value.lower() in {"yes", "1", "true", "t", "y"}
```
</issue_to_address>

### Comment 12
<location> `beets/util/io.py:801-803` </location>
<code_context>
def plurality(objs: Iterable[T]) -> tuple[T, int]:
    """Given a sequence of hashble objects, returns the object that
    is most common in the set and the its number of appearance. The
    sequence must contain at least one object.
    """
    c: Counter[T] = Counter(objs)
    if not c:
        raise ValueError("sequence must be non-empty")
    return c.most_common(1)[0]

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Use named expression to simplify assignment and conditional ([`use-named-expression`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/use-named-expression/))
- Lift code into else after jump in control flow ([`reintroduce-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/reintroduce-else/))
- Swap if/else branches ([`swap-if-else-branches`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/swap-if-else-branches/))
</issue_to_address>

### Comment 13
<location> `beets/util/io.py:868-875` </location>
<code_context>
@cache
def get_max_filename_length() -> int:
    """Attempt to determine the maximum filename length for the
    filesystem containing `path`. If the value is greater than `limit`,
    then `limit` is used instead (to prevent errors when a filesystem
    misreports its capacity). If it cannot be determined (e.g., on
    Windows), return `limit`.
    """
    length: int
    if length := config["max_filename_length"].get(int):
        return length

    limit: int = MAX_FILENAME_LENGTH
    if hasattr(os, "statvfs"):
        try:
            res: os.statvfs_result = os.statvfs(config["directory"].as_str())
        except OSError:
            return limit
        return min(res[9], limit)
    else:
        return limit

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Swap if/else branches ([`swap-if-else-branches`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/swap-if-else-branches/))
- Remove unnecessary else after guard condition ([`remove-unnecessary-else`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/remove-unnecessary-else/))
</issue_to_address>

### Comment 14
<location> `beets/util/io.py:884-889` </location>
<code_context>
def open_anything() -> str:
    """Return the system command that dispatches execution to the correct
    program.
    """
    sys_name: str = platform.system()
    base_cmd: str
    if sys_name == "Darwin":
        base_cmd = "open"
    elif sys_name == "Windows":
        base_cmd = "start"
    else:  # Assume Unix
        base_cmd = "xdg-open"
    return base_cmd

</code_context>

<issue_to_address>
**issue (code-quality):** We've found these issues:

- Simplify conditional into switch-like form ([`switch`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/switch/))
- Lift return into if ([`lift-return-into-if`](https://docs.sourcery.ai/Reference/Default-Rules/refactorings/lift-return-into-if/))
</issue_to_address>

### Comment 15
<location> `beetsplug/musicbrainz.py:827` </location>
<code_context>
    def _search_api(
        self,
        query_type: Literal["recording", "release"],
        filters: dict[str, str],
    ) -> list[JSONDict]:
        """Perform MusicBrainz API search and return results.

        Execute a search against the MusicBrainz API for recordings or releases
        using the provided criteria. Handles API errors by converting them into
        MusicBrainzAPIError exceptions with contextual information.
        """
        filters = {
            k: _v for k, v in filters.items() if (_v := v.lower().strip())
        }
        self._log.debug(
            "Searching for MusicBrainz {}s with: {!r}", query_type, filters
        )
        try:
            method = getattr(musicbrainzngs, f"search_{query_type}s")
            res = method(limit=self.config["search_limit"].get(), **filters)
        except musicbrainzngs.MusicBrainzError as exc:
            raise MusicBrainzAPIError(
                reason=exc,
                verb=f"{query_type} search",
                query=filters,
                tb=traceback.format_exc(),
            )
        return res[f"{query_type}-list"]

</code_context>

<issue_to_address>
**suggestion (code-quality):** Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))

```suggestion
            ) from exc
```
</issue_to_address>

### Comment 16
<location> `beetsplug/musicbrainz.py:881` </location>
<code_context>
    def album_for_id(
        self, album_id: str
    ) -> beets.autotag.hooks.AlbumInfo | None:
        """Fetches an album by its MusicBrainz ID and returns an AlbumInfo
        object or None if the album is not found. May raise a
        MusicBrainzAPIError.
        """
        self._log.debug("Requesting MusicBrainz release {}", album_id)
        if not (albumid := self._extract_id(album_id)):
            self._log.debug("Invalid MBID ({}).", album_id)
            return None

        try:
            res = musicbrainzngs.get_release_by_id(albumid, RELEASE_INCLUDES)

            # resolve linked release relations
            actual_res = None

            if res["release"].get("status") == "Pseudo-Release":
                actual_res = _find_actual_release_from_pseudo_release(res)

        except musicbrainzngs.ResponseError:
            self._log.debug("Album ID match failed.")
            return None
        except musicbrainzngs.MusicBrainzError as exc:
            raise MusicBrainzAPIError(
                reason=exc,
                verb="get release by ID",
                query=albumid,
                tb=traceback.format_exc(),
            )

        # release is potentially a pseudo release
        release = self.album_info(res["release"])

        # should be None unless we're dealing with a pseudo release
        if actual_res is not None:
            actual_release = self.album_info(actual_res["release"])
            return _merge_pseudo_and_actual_album(release, actual_release)
        else:
            return release

</code_context>

<issue_to_address>
**suggestion (code-quality):** Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))

```suggestion
            ) from exc
```
</issue_to_address>

### Comment 17
<location> `beetsplug/musicbrainz.py:914` </location>
<code_context>
    def track_for_id(
        self, track_id: str
    ) -> beets.autotag.hooks.TrackInfo | None:
        """Fetches a track by its MusicBrainz ID. Returns a TrackInfo object
        or None if no track is found. May raise a MusicBrainzAPIError.
        """
        if not (trackid := self._extract_id(track_id)):
            self._log.debug("Invalid MBID ({}).", track_id)
            return None

        try:
            res = musicbrainzngs.get_recording_by_id(trackid, TRACK_INCLUDES)
        except musicbrainzngs.ResponseError:
            self._log.debug("Track ID match failed.")
            return None
        except musicbrainzngs.MusicBrainzError as exc:
            raise MusicBrainzAPIError(
                reason=exc,
                verb="get recording by ID",
                query=trackid,
                tb=traceback.format_exc(),
            )
        return self.track_info(res["recording"])

</code_context>

<issue_to_address>
**suggestion (code-quality):** Explicitly raise from a previous error ([`raise-from-previous-error`](https://docs.sourcery.ai/Reference/Default-Rules/suggestions/raise-from-previous-error/))

```suggestion
            ) from exc
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines +835 to +842
proc: subprocess.Popen[bytes] = subprocess.Popen(
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
stdin=devnull,
close_fds=platform.system() != "Windows",
shell=shell,
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security (python.lang.security.audit.dangerous-subprocess-use-audit): Detected subprocess function 'Popen' without a static string. If this data can be controlled by a malicious actor, it may be an instance of command injection. Audit the use of this call to ensure it is not controllable by an external resource. You may consider using 'shlex.escape()'.

Source: opengrep


def str2bool(value: str) -> bool:
"""Returns a boolean reflecting a human-entered string."""
return value.lower() in ("yes", "1", "true", "t", "y")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion (code-quality): Use set when checking membership of a collection of literals (collection-into-set)

Suggested change
return value.lower() in ("yes", "1", "true", "t", "y")
return value.lower() in {"yes", "1", "true", "t", "y"}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant