Yarn2 PackageManager tries to fetch Metadata for Workspace packages #10915

@flash-me

Describe the bug

TL;DR:
ORT tries to fetch metadata via yarn npm info for packages that are maintained in the same Yarn workspace, and fails because they are not published.

Rough idea of our setup: a Yarn 2+ monorepo (currently Yarn 4.9.1) with a bunch of workspace-only packages.

── mainapp
── plugins
   ├── aut
   ├── jum
   ├── meg
   ├── per
   ├── pro
   ├── REA
   └── sca

Every subfolder below plugins is a package used by mainapp.

These packages are not publicly available. Some of them are not published at all.
The packagemanagers plugin correctly detects that these packages are part of the workspace and skips them.

Sample output (original paths redacted)

19:02:45.848 [main] INFO  org.ossreviewtoolkit.plugins.packagemanagers.node.NodePackageManagerDetection - Skipping '/home/runner/work/plugins/..../package.json' as it is part of a workspace implicitly handled by [YARN2].
19:02:45.849 [main] INFO  org.ossreviewtoolkit.plugins.packagemanagers.node.NodePackageManagerDetection - Skipping '/home/runner/work/plugins/..../package.json' as it is part of a workspace implicitly handled by [YARN2].
19:02:45.849 [main] INFO  org.ossreviewtoolkit.plugins.packagemanagers.node.NodePackageManagerDetection - Skipping '/home/runner/work/plugins/..../package.json' as it is part of a workspace implicitly handled by [YARN2].
19:02:45.849 [main] INFO  org.ossreviewtoolkit.plugins.packagemanagers.node.NodePackageManagerDetection - Skipping '/home/runner/work/plugins/..../package.json' as it is part of a workspace implicitly handled by [YARN2].
19:02:45.849 [main] INFO  org.ossreviewtoolkit.plugins.packagemanagers.node.NodePackageManagerDetection - Skipping '/home/runner/work/plugins/..../package.json' as it is part of a workspace implicitly handled by [YARN2].
19:02:45.849 [main] INFO  org.ossreviewtoolkit.plugins.packagemanagers.node.NodePackageManagerDetection - Skipping '/home/runner/work/plugins/..../package.json' as it is part of a workspace implicitly handled by [YARN2].
19:02:45.849 [main] INFO  org.ossreviewtoolkit.plugins.packagemanagers.node.NodePackageManagerDetection - Skipping '/home/runner/work/plugins/..../package.json' as it is part of a workspace implicitly handled by [YARN2].

Where it starts to fail

Right after, the Analyzer starts doing its job, in particular this log snippet:

19:02:45.911 [main] INFO  org.ossreviewtoolkit.analyzer.Analyzer - Calling before resolution hooks for 1 manager(s).
19:02:45.939 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running 'yarn --version' in '/home/runner/work/foo/bar'...
19:02:46.340 [main] INFO  org.ossreviewtoolkit.analyzer.PackageManagerRunner - Starting Yarn 2+ analysis.
19:02:46.353 [DefaultDispatcher-worker-1] INFO  org.ossreviewtoolkit.analyzer.PackageManager - Using Yarn 2+ to resolve dependencies for path 'package.json'...
19:02:46.358 [DefaultDispatcher-worker-1] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running 'yarn install' in '/home/runner/work/foo/bar'...
19:04:14.084 [DefaultDispatcher-worker-1] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running 'yarn workspaces list --json' in '/home/runner/work/foo/bar'...
19:04:14.740 [DefaultDispatcher-worker-1] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running 'yarn info --all --recursive --manifest --virtuals --json' in '/home/runner/work/foo/bar'...
19:05:23.008 [DefaultDispatcher-worker-3] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running 'yarn npm info --json @types/request@2.48.13 @types/resolve@1.20.2 @types/retry@0.12.2 @types/semver@7.5.8 @types/send@0.17.5 @types/serve-index@1.9.4 @types/serve-static@1.15.8 @types/set-cookie-parser@2.4.10 @types/sockjs@0.3.36 @types/ssh2-streams@0.1.12 @types/ssh2@0.5.52 @types/ssh2@1.15.5 @types/stack-utils@2.0.3 @types/styled-jsx@2.2.9 

The last two lines in particular are my pain points. First, details about the workspace are listed with

yarn info --all --recursive --manifest --virtuals --json

⚠️⚠️⚠️⚠️ followed by the crashing step that fetches the Metadata of all packages. ⚠️⚠️⚠️⚠️

For this, every entry in the yarn.lock is processed.

Even the workspace-only packages (mentioned above) are passed to yarn npm info --json

which obviously crashes with

{"type":"error","name":35,"displayName":"YN0035","indent":"","data":"  \u001b[38;5;111mResponse Code\u001b[39m: \u001b[38;5;220m404\u001b[39m (Not Found)"}
{"type":"error","name":35,"displayName":"YN0035","indent":"","data":"  \u001b[38;5;111mRequest Method\u001b[39m: GET"}
{"type":"error","name":35,"displayName":"YN0035","indent":"","data":"  \u001b[38;5;111mRequest URL\u001b[39m: \u001b[38;5;170mhttps://registry.yarnpkg.com/@foo%2fbar\u001b[39m"}
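
To make the error output above easier to read, here is a minimal Python sketch that strips the ANSI color codes and collects the "key: value" pairs from the error lines. It only assumes the line-delimited JSON shape shown above; the function names parse_yarn_errors and is_not_found are hypothetical helpers and not part of ORT or Yarn.

```python
import json
import re

# Matches ANSI color sequences such as ESC[38;5;111m and ESC[39m.
ANSI_COLOR = re.compile(r"\x1b\[[0-9;]*m")


def parse_yarn_errors(ndjson: str) -> dict[str, str]:
    """Collect 'key: value' pairs from the error lines of Yarn's JSON output."""
    fields: dict[str, str] = {}
    for line in ndjson.splitlines():
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("type") != "error":
            continue
        # Drop the color codes, then split on the first ': ' only,
        # so that a URL in the value is kept intact.
        text = ANSI_COLOR.sub("", event.get("data", "")).strip()
        key, sep, value = text.partition(": ")
        if sep:
            fields[key] = value
    return fields


def is_not_found(fields: dict[str, str]) -> bool:
    """True if the registry answered with HTTP 404."""
    return fields.get("Response Code", "").startswith("404")
```

Fed with the three lines above, this should yield the response code, the request method and the request URL as plain strings, which makes it obvious that the registry was asked for a package it does not know.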

To Reproduce

Steps to reproduce the behavior:

Try to scan any up-to-date Yarn repository, for example backstage.

Expected behavior

I would clearly expect that packages from the same yarn workspace are also skipped by the analyzer, as they are in the previous step.
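
To illustrate the expected behavior, here is a rough Python sketch of the filtering I have in mind. This is not ORT's actual code (ORT is written in Kotlin), just the idea. It assumes that yarn workspaces list --json prints one JSON object per line with a "name" field, and that locators look like "name@protocol:reference" with workspace packages using the "workspace:" protocol. All function names and the sample data are hypothetical.

```python
import json


def workspace_names(workspaces_ndjson: str) -> set[str]:
    """Read package names from line-delimited 'yarn workspaces list --json' output."""
    names: set[str] = set()
    for line in workspaces_ndjson.splitlines():
        line = line.strip()
        if line:
            names.add(json.loads(line)["name"])
    return names


def split_locator(locator: str) -> tuple[str, str]:
    """Split '@scope/name@npm:1.2.3' into ('@scope/name', 'npm:1.2.3')."""
    # Search from index 1 so that the leading '@' of a scope is skipped.
    at = locator.index("@", 1)
    return locator[:at], locator[at + 1:]


def registry_packages(locators: list[str], workspaces: set[str]) -> list[str]:
    """Keep only the locators that are worth passing to 'yarn npm info'."""
    result = []
    for locator in locators:
        name, reference = split_locator(locator)
        # Assumption: workspace packages use the 'workspace:' protocol.
        # The name check is a fallback and would also skip a registry
        # package that happens to share its name with a workspace.
        if reference.startswith("workspace:") or name in workspaces:
            continue
        result.append(locator)
    return result
```

With a filter like this in front of the metadata step, only packages that can actually exist in a registry would be requested, and the workspace-only ones would be skipped just like in the detection step.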

Environment

Output of the ort requirements command (ensure to remove any sensitive information manually):

  • ORT version: 69.0.0
  • Java version: 21
  • OS: Linux (GitHub Runner ubuntu-24.04)

I have only known what ORT is for two days, so I can't tell which information might be relevant, but here is an excerpt of the config file:

ort:
  deniedProcessEnvironmentVariablesSubstrings:
    - 'key'
    - 'pass'
    - 'pwd'
  enableRepositoryPackageConfigurations: true
  enableRepositoryPackageCurations: true
  forceOverwrite: false
  packageConfigurationProviders:
    - type: 'DefaultDir'
      id: 'DefaultDir'
      enabled: true
      options: {}
  packageCurationProviders:
    - type: 'DefaultDir'
      id: 'DefaultDir'
      enabled: true
      options: {}
    - type: 'DefaultFile'
      id: 'DefaultFile'
      enabled: true
      options: {}
  severeIssueThreshold: 'ERROR'
  severeRuleViolationThreshold: 'ERROR'

  scanner:
    skip_excluded: true
    scanners:
      SCANOSS:
        options:
          apiUrl: '<REDACTED>'
          writeToStorage: false
          enablePathObfuscation: false
        secrets:
          apiKey: '<REDACTED>'

analyzer:
  skip_excluded: true
  allow_dynamic_versions: false

advisor:
  skip_excluded: true

downloader:
  allowMovingRevisions: false
  includedLicenseCategories: []
  skipExcluded: false
  sourceCodeOrigins:
    - 'VCS'
    - 'ARTIFACT'

Additional context

I also looked into the Yarn2.kt code.
It really seems that the workspace packages are not excluded from the list that is passed to yarn npm info.

Labels: analyzer (About the analyzer tool)