@MangoIV
Collaborator

@MangoIV MangoIV commented Oct 1, 2025

Previously the executable name of the external command was passed to external commands as the first argument.

This behaviour was adapted from cargo, which does this for reasons internal to Rust that do not affect GHC Haskell and are even orthogonal to patterns in common use in Haskell.

Additionally, it complicates the 'simple' case which is what we should optimize for when building such a feature.

The previous use case (one executable that serves multiple external subcommands) is still possible by the following means:

  • using a wrapper around the executable
  • using a symlink and checking argv[0] in the executable
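
As a rough illustration of the symlink approach, here is a minimal sketch of one executable that picks its behaviour from the name it was invoked under. The subcommand names `foo` and `bar` and the helpers `dropExe` and `subcommandFor` are hypothetical; only `getProgName` and `getArgs` are existing `base` functions.

```haskell
import Data.List (stripPrefix)
import System.Environment (getArgs, getProgName)
import System.Exit (die)

-- Hypothetical subcommands served by one binary that is installed (or
-- symlinked) as both cabal-foo and cabal-bar.
data Subcommand = Foo | Bar
  deriving (Eq, Show)

-- Drop a trailing ".exe", which shows up in the program name on Windows.
dropExe :: String -> String
dropExe name =
  maybe name reverse (stripPrefix (reverse ".exe") (reverse name))

-- Map the name the program was invoked under to a subcommand.
subcommandFor :: String -> Maybe Subcommand
subcommandFor progName =
  case stripPrefix "cabal-" (dropExe progName) of
    Just "foo" -> Just Foo
    Just "bar" -> Just Bar
    _ -> Nothing

main :: IO ()
main = do
  progName <- getProgName
  args <- getArgs
  case subcommandFor progName of
    Just cmd -> putStrLn ("running " ++ show cmd ++ " with " ++ show args)
    Nothing -> die ("unrecognised program name: " ++ progName)
```

The wrapper approach would look the same, except that each wrapper passes the subcommand explicitly instead of relying on the program name.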

Include the following checklist in your PR:

@MangoIV MangoIV force-pushed the mangoiv/change-subcommands branch 2 times, most recently from 47ea2c6 to 50e1bfa Compare October 1, 2025 16:32
Collaborator

@geekosaur geekosaur left a comment

There's an issue here which hasn't been brought up yet, which is one of the reasons the program name is passed as the first parameter.

See the caveat on getProgName. On Windows in particular, there is no way to retrieve argv[0], so anything relying on it will have issues.

This also plays into a potential future change where invocation via cabal would be marked. (Likely by something like how login on Unix marks login shells: leading '-'.) The reason we would want this is the potential for future versions of the external command mechanism to receive bits of project or cabal state via the environment. (We already pass $CABAL / %CABAL%, but tooling should already use that to ensure they are running the right cabal.) External commands would need some "key" indicating that the additional information is available. We don't do this yet because we're waiting for feedback from users of the external command mechanism about what information they would need from cabal.

All of which means I'm not sure we can get away with not passing the name. At minimum, we might need to replace it with a --run-from-cabal parameter or other mechanism to indicate that additional information is available, and potentially parameterized by the actual run name to work around getProgName issues on non-POSIX (which would put us right back where we are now).

@MangoIV
Collaborator Author

MangoIV commented Oct 1, 2025

There are two possible workarounds listed. The other one (just using multiple executables) even aligns very nicely with how Haskell projects are usually structured and doesn't incur a lot of overhead. Does that one not work on Windows either?

On the topic of getting more information: if you're setting an environment variable anyway, then why is it not possible to just check for the presence of that variable?

@geekosaur
Collaborator

$CABAL itself isn't enough, as other tools already use it for this purpose. A separate envar might be.

If two tools share most of their implementation, multiple driver programs are possible but may lead to complex library entry points. This is why Unix likes to use (sym)links in that case, but that doesn't fly on Windows and probably other OSes that aren't Unix-like (there's ongoing work to port ghc and build tools to Haiku, for example).

@geekosaur
Collaborator

geekosaur commented Oct 1, 2025

You haven't updated the external command tests (PackageTests/ExternalCommandExitCode/cabal.test.hs, PackageTests/ExternalCommand/cabal.test.hs, PackageTests/ExternalCommandHelp/cabal.test.hs).

@MangoIV
Collaborator Author

MangoIV commented Oct 1, 2025

You haven't updated the external command test

I didn't see it failing locally, must've slipped through the one million lines of output

If two tools share most of their implementation, multiple driver programs are possible but may lead to complex library entry points.

That's why I said that in Haskell it is super common to make the executable just a shim that invokes the library. In my opinion we don't have to look too closely at what other people are doing there, because they have other backgrounds. The default case is that it's very simple to add new executables to Haskell projects, and even if that's not the case for some hypothetical project (I know of none that has this problem), it's not so big a hurdle that it should force other users to take the hurdles they have to take now (or make anything impossible).

$CABAL itself isn't enough, as other tools already use it for this purpose. A separate envar might be.

Yes, I propose not setting that variable and instead using a longer, more rarely used one like $CABAL_EXTERNAL_COMMAND_CONTEXT, which at the same time would avoid breaking executables that currently look for $CABAL.
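
To make the "check for the presence of a variable" idea concrete, here is a minimal sketch of how an external command might detect that it was started by cabal. The variable name is only the one floated in this comment, so it is an assumption that cabal would set it; `invokedViaCabal` is a hypothetical helper.

```haskell
import Data.Maybe (isJust)
import System.Environment (getEnvironment)

-- Name taken from the suggestion above; cabal does not necessarily set it.
contextVar :: String
contextVar = "CABAL_EXTERNAL_COMMAND_CONTEXT"

-- True when the marker variable is present in the given environment.
invokedViaCabal :: [(String, String)] -> Bool
invokedViaCabal environment = isJust (lookup contextVar environment)

main :: IO ()
main = do
  environment <- getEnvironment
  putStrLn $
    if invokedViaCabal environment
      then "started by cabal; extra context may be available"
      else "started directly; no cabal context"
```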

@geekosaur
Collaborator

Yes, I propose not setting that variable and instead using a longer, more rarely used one like $CABAL_EXTERNAL_COMMAND_CONTEXT, which at the same time would avoid breaking executables that currently look for $CABAL.

We still need to set it, because if a user runs a different cabal than the default one on $PATH (I do this somewhat regularly, for example) we need to inform the external command which one to use to e.g. query project information. $CABAL is the standard for this.
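
A minimal sketch of what this means on the tool side, assuming the tool simply falls back to whatever `cabal` is on `$PATH` when `$CABAL` is unset. `cabalExecutable` is a hypothetical helper, not part of any cabal API.

```haskell
import Data.Maybe (fromMaybe)
import System.Environment (getEnvironment)

-- Pick the cabal executable: the one named by $CABAL if it is set,
-- otherwise whatever "cabal" resolves to on $PATH.
cabalExecutable :: [(String, String)] -> FilePath
cabalExecutable environment = fromMaybe "cabal" (lookup "CABAL" environment)

main :: IO ()
main = do
  environment <- getEnvironment
  putStrLn ("would invoke: " ++ cabalExecutable environment)
```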

@Bodigrim
Collaborator

Bodigrim commented Oct 1, 2025

We still need to set it, because if a user runs a different cabal than the default one on $PATH (I do this somewhat regularly, for example) we need to inform the external command which one to use to e.g. query project information. $CABAL is the standard for this.

I doubt it's ever wise to use $CABAL to query project information or similar. Why would anyone do it? Cabal CLI interface isn't stable, and certainly running a new process, checking its version and parsing the output is too much work, when you can simply use cabal-install as a library instead.

@geekosaur
Collaborator

That may tie your program to a specific version. In the opposite direction, this is why I think we'll be exporting more information in the future. Also, future versions of cabal will have more stable interfaces for this, whose design will come from this and other users (e.g. HLS); we already have a ticket and some early design work for a proposed cabal status, and some work on extending cabal path.

@MangoIV
Collaborator Author

MangoIV commented Oct 1, 2025

Am I misunderstanding this, though? I think this is orthogonal to the issue at hand: the additional argument wouldn't carry any information about the status of cabal; the environment variable would.

@geekosaur
Collaborator

If the additional parameter were to stay, it would be the same as the argv[0] which isn't always accessible on non-Unix. It might instead be shifted into the environment.

@geekosaur
Collaborator

Some care is needed with the environment, though; it's already pretty sizable on some systems, and there is a limit on argv+envp on anything other than FreeBSD.

@Bodigrim
Collaborator

Bodigrim commented Oct 1, 2025

That may tie your program to a specific version.

Which is something under my control as a developer, so it's good. With $CABAL, on the other hand, I have no idea what I'm dealing with: What is the version? Was it patched somehow? Can I execute it as a child process without hitting a lock of some sort? As a third-party developer, I'm very certain about my preference here. But we digress.

@mniip

mniip commented Oct 1, 2025

There's an issue here which hasn't been brought up yet, which is one of the reasons the program name is passed as the first parameter.

See the caveat on getProgName. On Windows in particular, there is no way to retrieve argv[0], so anything relying on it will have issues.

I feel like you're severely misunderstanding the docs here. getProgName works on Windows like a charm, in the sense that it outputs the filename of the program as it appeared in the first word of the CreateProcess invocation (i.e. pre symlink resolution).

Here's what it looks like:

PS C:\Users\mniip\Desktop\prog> cat .\GetProgName.hs
import System.Environment

main :: IO ()
main = getProgName >>= putStrLn
PS C:\Users\mniip\Desktop\prog> ghc .\GetProgName.hs -o cabal-getprogname.exe
[2 of 2] Linking cabal-getprogname.exe
PS C:\Users\mniip\Desktop\prog> New-Item -Path cabal-getprogname-symlink.exe -ItemType SymbolicLink -Value cabal-getprogname.exe


    Directory: C:\Users\mniip\Desktop\prog


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a---l         10/1/2025  11:05 PM              0 cabal-getprogname-symlink.exe


PS C:\Users\mniip\Desktop\prog> fsutil hardlink create cabal-getprogname-hardlink.exe cabal-getprogname.exe
Hard link created for C:\Users\mniip\Desktop\prog\cabal-getprogname-hardlink.exe <<===>> C:\Users\mniip\Desktop\prog\cabal-getprogname.exe
PS C:\Users\mniip\Desktop\prog> cabal getprogname
cabal-getprogname.exe
PS C:\Users\mniip\Desktop\prog> cabal getprogname-symlink
cabal-getprogname-symlink.exe
PS C:\Users\mniip\Desktop\prog> cabal getprogname-hardlink
cabal-getprogname-hardlink.exe

@MangoIV MangoIV force-pushed the mangoiv/change-subcommands branch from 50e1bfa to 2fbabfe Compare October 1, 2025 21:19
@geekosaur
Collaborator

I feel like you're severely misunderstanding the docs here. getProgName works on Windows like a charm, in the sense that it outputs the filename of the program as it appeared in the first word of the CreateProcess invocation (i.e. pre symlink resolution).

It still leaves out cabal-specific changes to it to indicate that it has been run from cabal and extra information may be available (#11232 (review)); it only provides the executable name that was run.

@mniip

mniip commented Oct 1, 2025

It still leaves out cabal-specific changes to it to indicate that it has been run from cabal and extra information may be available (#11232 (review)); it only provides the executable name that was run.

Yes. This is a completely orthogonal issue though to whether cabal copies the command name in argv[1] or not. This PR isn't about making the perfect cabal external commands system, it's about changing that one specific aspect.

@MangoIV MangoIV force-pushed the mangoiv/change-subcommands branch from 2fbabfe to cdd2f7a Compare October 1, 2025 21:40
@ulysses4ever
Collaborator

Thank you for submitting the patch @MangoIV! I agree with your arguments and I wasn't able to discern serious counter-arguments from the discussion above. So I'm in support of the change.

Why is it a draft?

@MangoIV
Collaborator Author

MangoIV commented Oct 1, 2025

Why is it a draft?

There are two missing checkboxes that I haven't gone through yet (I want to adjust missing docs).

@geekosaur
Collaborator

You still also have a typo in your changelog entry and two tests that still need to be fixed (I saw only one fixed in the last push).

@MangoIV
Collaborator Author

MangoIV commented Oct 1, 2025

yeah that's also why it's still draft, sorry. I will of course go through this. :)

@ulysses4ever
Collaborator

No worries, I was just making sure it's not an oversight.

@MangoIV
Collaborator Author

MangoIV commented Oct 2, 2025

As per request of @mpickering I have announced this change on discourse

https://discourse.haskell.org/t/cabal-install-external-command-feature-without-passing-the-subcommands-name-as-argv-1/13062

in order for users to be able to chime in if they expect breakage.

@MangoIV MangoIV force-pushed the mangoiv/change-subcommands branch from cdd2f7a to 9280ae0 Compare October 22, 2025 09:36
@MangoIV
Collaborator Author

MangoIV commented Oct 23, 2025

I would like to note that validate.sh does not reproduce the error, which is really weird.

@MangoIV MangoIV force-pushed the mangoiv/change-subcommands branch 3 times, most recently from 041ad3a to 017a12e Compare October 23, 2025 10:47
@MangoIV MangoIV force-pushed the mangoiv/change-subcommands branch from 017a12e to 18f1a27 Compare October 23, 2025 10:59
@MangoIV MangoIV marked this pull request as ready for review October 23, 2025 11:02
@MangoIV MangoIV requested a review from geekosaur October 23, 2025 11:02
@MangoIV MangoIV force-pushed the mangoiv/change-subcommands branch 2 times, most recently from fe1818f to fcbd59c Compare October 23, 2025 11:40
  cabal_exe <- getExecutablePath
- let new_env = ("CABAL", cabal_exe) : cur_env
- result <- try $ createProcess ((proc exec (name : cmdArgs)){env = Just new_env})
+ let new_env = ("CABAL_EXTERNAL_CABAL_PATH", cabal_exe) : cur_env
Collaborator

I am wondering if this needs to duplicate the "CABAL" part. Perhaps CABAL_EXTERNAL_PATH would be better.

Also, I think this probably needs a deprecation cycle since there are already external commands which may use CABAL.

Collaborator Author

What do you think would be a sensible way of deprecating it? Not changing anything and emitting a warning? Or doing something while keeping the name? I don't think either of these will be useful.

Collaborator

I don't think reporting anything is worthwhile, as there's no way for us to know what actually uses it. Release notes, possibly a note in the manual, and keep it until (or possibly through, given we're having to release on ghc's rapid release cycle) 3.18.

Collaborator

Also, both release notes and manual should announce up front when the backward compatibility use will be removed.

Collaborator Author

as I said, there's no way of making it "backward compatible" because there's no way to pass and not pass it as the first argument at the same time.

I did already change it in the manual and made it part of the changelog, so I thought the latter would be used to make it appear in the release notes?

Where else would I add a note on this topic?

I have also announced the change on discourse and I am aware that at least two (major) users of the feature are appreciative of the change rather than afraid of being broken.

Collaborator

What are the "two major users"? The Discourse post doesn't have any meaningful feedback, sadly; only a pat on the back from Matt... https://discourse.haskell.org/t/cabal-install-external-command-feature-without-passing-the-subcommands-name-as-argv-1/13062

Collaborator Author

No „meaningful feedback“ in this case means nobody is complaining, right? That would be really good!

Previously the executable name of the external command was passed to external commands as the
first argument.

This behaviour was adapted from cargo, which does this for reasons that are internal
to Rust and do not affect GHC Haskell, and that are even orthogonal to patterns that see common use in
Haskell.

Additionally, it complicates the 'simple' case which is what we should optimize for when building
such a feature.

The previous use case (one executable that serves multiple external subcommands) is still possible
by the following means:
- using a wrapper around the executable
- using a symlink and checking argv[0] in the executable

Additionally, the variable `$CABAL` that was set by `cabal-install` was renamed to `CABAL_EXTERNAL_CABAL_PATH`. This has two reasons:
1. it makes migration easier for users of the external command feature that were previously expecting the name of the executable
   to appear in `argv[1]`
2. it does not unnecessarily pollute the environment variable namespace as it turns out some other tools have been and are already
   using this name, historically

Resolves haskell#10275
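
The change described above can be sketched as follows. This is an illustration only, not the code from the patch: `oldArgs`, `newArgs` and `newEnv` are hypothetical helpers that mirror the shape of the `createProcess` call in the diff, and the example values in `main` are made up.

```haskell
-- Arguments passed to the external command before this change:
-- the command name came first, followed by the user's arguments.
oldArgs :: String -> [String] -> [String]
oldArgs name cmdArgs = name : cmdArgs

-- Arguments passed after this change: only the user's arguments.
newArgs :: String -> [String] -> [String]
newArgs _name cmdArgs = cmdArgs

-- Environment after this change, using the variable name from this entry.
newEnv :: FilePath -> [(String, String)] -> [(String, String)]
newEnv cabalExe curEnv = ("CABAL_EXTERNAL_CABAL_PATH", cabalExe) : curEnv

main :: IO ()
main = do
  -- Hypothetical command name and arguments, for illustration only.
  print (oldArgs "fmt" ["--check"])
  print (newArgs "fmt" ["--check"])
  print (newEnv "/usr/bin/cabal" [("HOME", "/home/user")])
```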
@MangoIV MangoIV force-pushed the mangoiv/change-subcommands branch from fcbd59c to c002960 Compare October 24, 2025 14:00
@MangoIV
Collaborator Author

MangoIV commented Oct 27, 2025

Small survey.

Tools that do not work with the current behaviour but would with the new behaviour:

  • cabal-fmt
  • cabal-audit
  • cabal-gild

Tools that do work with the current behaviour, whose authors would prefer the new behaviour, and which would not be broken by the new behaviour:

  • cabal-matrix
  • cabal-add

Tools that do work with the current behaviour and would not be broken by the new behaviour (but whose authors I did not ask about this change):

  • doctest (cabal doctest)

I am currently not aware of any tools that rely on the old behaviour to keep working.

@mpickering
Collaborator

mpickering commented Oct 27, 2025

One important one is doctest - https://github.com/sol/doctest?tab=readme-ov-file#cabal-integration

Could you add that to your survey?

@MangoIV
Collaborator Author

MangoIV commented Oct 27, 2025

added @mpickering :)

6 participants