Skip to content

Conversation

@Myranae
Copy link
Contributor

@Myranae Myranae commented Oct 30, 2025

Summary:

Overview
Implements a catalog hash system to track changes in pnpm catalog dependencies (Wonder Blocks, React, etc.) across Perseus packages. This ensures packages are correctly rebuilt when their catalog dependencies are updated, even if the package code itself hasn't changed.

Key Features

  1. Automatic Hash Updates
  • pnpm update-catalog-hashes - CLI command to update all catalog hashes
  • pnpm update-catalog-hashes --dry-run - Preview changes without writing
  • pnpm update-catalog-hashes --verbose - Show detailed dependency information
  1. Integrated into Existing Workflows
  • Sync Script: utils/sync-dependencies.ts now automatically updates catalog hashes after syncing
  • Pre-publish Check: utils/pre-publish-check-ci.ts verifies all hashes are current before publishing
  1. Smart Filtering
  • Only includes prodDeps and peerDeps catalogs (excludes devDeps)
  • Skips unpublished packages (root package, packages in changeset ignore list)
  • Only processes packages in the packages/ directory

Non-Test Files Added

  • utils/get-catalog-deps-hash.ts - Hash calculation logic
  • utils/maybe-update-catalog-hash.ts - Single package update logic
  • utils/update-catalog-hashes.ts - Main orchestration function
  • utils/update-catalog-hashes-cli.ts - CLI entry point
  • utils/verify-catalog-hashes.ts - Pre-publish verification

Files Modified

  • package.json - Added string-hash dependency and update-catalog-hashes script
  • README.md - Added documentation for catalog hashes
  • utils/sync-dependencies.ts - Auto-update hashes after syncing
  • utils/pre-publish-check-ci.ts - Verify hashes before publishing
  • All 12 packages in packages/ - Added initial catalogHash values

Dependencies

Issue: LEMS-3328

Test plan:

  • Confirm all checks pass
  • Manually run sync-deps after pulling Frontend and see if hashes are updated (only works if there are new depdencies)

@Myranae Myranae self-assigned this Oct 30, 2025
@Myranae Myranae changed the title Add Catalog Hash System for Dependency Tracking [wip] Add Catalog Hash System for Dependency Tracking Oct 30, 2025
@github-actions
Copy link
Contributor

github-actions bot commented Oct 30, 2025

🗄️ Schema Change: No Changes ✅

@github-actions
Copy link
Contributor

github-actions bot commented Oct 30, 2025

Size Change: 0 B

Total Size: 497 kB

ℹ️ View Unchanged
Filename Size
packages/kas/dist/es/index.js 20.8 kB
packages/keypad-context/dist/es/index.js 1 kB
packages/kmath/dist/es/index.js 5.98 kB
packages/math-input/dist/es/index.js 99.2 kB
packages/math-input/dist/es/strings.js 1.61 kB
packages/perseus-core/dist/es/index.item-splitting.js 13.1 kB
packages/perseus-core/dist/es/index.js 22.4 kB
packages/perseus-editor/dist/es/index.js 96.6 kB
packages/perseus-linter/dist/es/index.js 7.21 kB
packages/perseus-score/dist/es/index.js 9.2 kB
packages/perseus-utils/dist/es/index.js 403 B
packages/perseus/dist/es/index.js 204 kB
packages/perseus/dist/es/strings.js 7.73 kB
packages/pure-markdown/dist/es/index.js 1.39 kB
packages/simple-markdown/dist/es/index.js 6.71 kB

compressed-size-action

@github-actions
Copy link
Contributor

github-actions bot commented Oct 30, 2025

🛠️ Item Splitting: No Changes ✅

@github-actions
Copy link
Contributor

github-actions bot commented Oct 30, 2025

npm Snapshot: Published

Good news!! We've packaged up the latest commit from this PR (60d98d2) and published it to npm. You
can install it using the tag PR3006.

Example:

pnpm add @khanacademy/perseus@PR3006

If you are working in Khan Academy's frontend, you can run the below command.

./dev/tools/bump_perseus_version.ts -t PR3006

If you are working in Khan Academy's webapp, you can run the below command.

./dev/tools/bump_perseus_version.js -t PR3006

Copy link
Collaborator

@jeremywiebe jeremywiebe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this Tamara! I have a few thoughts before you land this:

  • we duplicate types in several places, could we centralize those so that they can't drift out of sync?
  • the utils/ folder has an internal/ subfolder. Would some of these files belong in there? I think this utils/ folder should only contain executable files and any supporting files could go into internal/.
  • The PR description is a bit verbose. I don't think there's value in the description outlining changed files as that is data that git will give us. It would also be nice if you could state the problem this PR solves at the outset. I think it's solving the problem of packages not being published when we bump catalog dependencies, right? Could you make that clearer in the description?

import {updateCatalogHashes} from "./update-catalog-hashes";

function printHelp() {
console.log("Usage: update-catalog-hashes [options]");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
console.log("Usage: update-catalog-hashes [options]");
console.log("Usage: update-catalog-hashes-cli.ts [options]");

Comment on lines +28 to +31
console.log("Examples:");
console.log(" update-catalog-hashes");
console.log(" update-catalog-hashes --dry-run");
console.log(" update-catalog-hashes --verbose");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These aren't useful unless they show more complex usages of flags. As it stands, there's only two flags and they're pretty common, so examples here seems like just extra noise.

*/
export function updateCatalogHashes(isDryRun: boolean, verbose = false): void {
// Get the set of packages that we don't publish, per the changeset config.
const {ignore = []} = loadChangesetConfig();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer we look at each package.json's private flag and/or the publishConfig. These are npm-ecosystem flags that document if the package should be published. Although the ignore config option is supported, they document that it is meant to be a temporary solution during dev.

THIS FEATURE IS DESIGNED FOR TEMPORARY USE TO ALLOW CHANGES TO BE MERGED WITHOUT PUBLISHING THEM - If you want to stop a package from being published at all, set private: true in its package.json.

Comment on lines +42 to +45
const packageJsonPaths = fastGlob.sync("**/package.json", {
absolute: true,
ignore: ["**/node_modules/**", "**/dist/**"],
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works fine, but one option that would be easier to maintain (and not require maintaining an ignore list is to use git to figure out which files to work with. Only package.json files that are checked into git should be considered... so git ls-files "package.json" "**/package.json" would find all package.json files that are tracked... meaning we'd ignore everything in any node_module/ or dist/ folder automatically.

const unpublishedPackages = new Set<string>(ignore);
// The root package is always considered unpublished,
// but is not included in the changeset config ignore list.
unpublishedPackages.add(ROOT_PACKAGE_NAME);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We won't need this if we change to obey the package's private: true flag as our root package.json has this (correctly) set.

Comment on lines +37 to +40
console.log("\n🔍 Verifying catalog hashes...");
const catalogHashResult = verifyCatalogHashes();
if (!catalogHashResult.success) {
console.error("\n❌ Catalog hash verification failed:\n");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we prefix these messages with an extra newline (\n)?


const ROOT_PACKAGE_NAME = "perseus";

type PackageJson = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we duplicate this type in at least one other file. Maybe we should have a types.ts file in this utils/ folder that hosts these types?

Comment on lines +69 to +70
Catalog hashes ensure packages are rebuilt when their catalog dependencies (Wonder Blocks, React, etc.) are updated. These hashes are automatically updated when running `utils/sync-dependencies.ts`. If you manually add catalog dependencies to a package.json, run `pnpm update-catalog-hashes` to update the hashes. The pre-publish check will verify all hashes are current before releasing.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this is a mostly clear explanation, thanks!

Just to check my understanding, are these hashes used to trigger a publish of affected packages or just rebuilding (I'm not trying to be pedantic, but I'm unclear what we mean when we say "rebuilt")

"sloc": "^0.2.1",
"storybook": "^9.0.4",
"storybook-multilevel-sort": "^2.0.1",
"string-hash": "^1.1.3",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, we could use globalThis.crypto.subtle to do the hashing (via digest) as that's built-in to Node.

Comment on lines +308 to +309
// Assert - Empty string hashes to "5381" with string-hash
expect(result).toBe("5381");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thought: We document this "magic" value in several of these tests. I wonder if it'd be clearer if we had a known string value that we use if the input string is empty. ie. what if this assertion were: expect(result).toBe("--empty--").

What are your thoughts? :)

Copy link
Collaborator

@jeremywiebe jeremywiebe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this Tamara! I have a few thoughts before you land this:

  • we duplicate types in several places, could we centralize those so that they can't drift out of sync?
  • the utils/ folder has an internal/ subfolder. Would some of these files belong in there? I think this utils/ folder should only contain executable files and any supporting files could go into internal/.
  • The PR description is a bit verbose. I don't think there's value in the description outlining changed files as that is data that git will give us. It would also be nice if you could state the problem this PR solves at the outset. I think it's solving the problem of packages not being published when we bump catalog dependencies, right? Could you make that clearer in the description?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants