@mazunki
Contributor

@mazunki mazunki commented Oct 23, 2025

It works! All tests are passing!

@mazunki
Contributor Author

mazunki commented Oct 23, 2025

Regarding std::launder, see https://eel.is/c++draft/ptr.launder. We used to rely on undefined behaviour here.

Thanks compiler:)

@mazunki mazunki mentioned this pull request Oct 23, 2025
@alfreb
Contributor

alfreb commented Oct 26, 2025

Love it!

@alfreb-scalemem

Ok, looks like the std::launder is the only thing I'm tripping over here. I think it's worth having a quick discussion on that here, to make sure at least two of us understand this new thing before we start using it. Besides that I think it all looks good!

@mazunki
Contributor Author

mazunki commented Oct 27, 2025

C++ distinguishes addresses from objects at a conceptual (compile-time) level. Objects have lifetimes, which just /happen/ to exist at addresses. This matters when it comes to optimizations. std::launder helps us by giving us the object existing at an address.

Suppose you do something like

int *a1 = malloc(sizeof(int));
int *a2 = a1;
*a1 = 42;
printf("%d\n", *a2);  // 42

a1 = realloc(a1, 16*sizeof(int));  // assuming we get the same address back
*a1 = 1337;

printf("%d\n", *a2);  // undefined behaviour

The first print statement is guaranteed to print 42, but in the second one the value read through a2 may never have been updated to match, because we never claimed that a2 refers to the same object. The compiler would have been free to internally convert it to a value-copy instead of a pointer which is dereferenced later.

Instead, if we did

a1 = realloc(a1, 16*sizeof(int));  // assuming we get the same address back
*a1 = 1337;

a2 = std::launder(a2);
printf("%d\n", *a2);  // this is fine now :)

we are no longer relying on undefined behaviour. With a2 = std::launder(a2) we're basically saying "please give me the object that resides at that address now". The compiler is no longer allowed to optimize the address away, guaranteeing a print of 1337.

Of course, realloc() can actually move the pointer: this is just an example where we pretend it remains the same with a larger allocation at the end (which for small size differences is probably happening anyway, but that's beside the point). In the code, we are using placement-new which puts us in control of the address of the allocation.

@alfreb-scalemem

Wait - if you are getting the same pointer back here, which your explanation depends on - I don't see why this is UB.

A) You get the same pointer back. Now a1 == a2. They are both int*
B) You get a different pointer back - launder doesn't help with that at all.

#include <cstdlib>
#include <cstdint>
#include <cstdio>
#include <new>

int main()
{
  int *a1 = (int *)malloc(sizeof(int));
  int *a2 = a1;
  *a1 = 42;
  printf("%d\n", *a2); // 42

  a1 = (int *)realloc(a1, 16 * sizeof(int));  // We have to cast to int*, otherwise the assignment won't work.
  *a1 = 1337;

  printf("First addr: 0x%p Second addr: 0x%p \n", a1, a2);

  a2 = std::launder(a2); // No effect?
  printf("%d\n", *a2); // undefined behaviour if the pointer changed - the old pointer is freed.
}

Compiling this locally with clang++ -Wall -Wextra -std=c++23 -o test_cpp23 test.cpp gives no warnings and the output is basically garbage because the pointer changes after realloc:

❯ ./test_cpp23
42
First addr: 0x0x60000133c180 Second addr: 0x0x600000438020 
-1995145184

I think you need to find a different example here. And the original warning I assume you got, which prompted you to introduce launder in the first place, would be very helpful. I think placement new might be more relevant, but not sure.

@mazunki
Contributor Author

mazunki commented Oct 28, 2025

Wait - if you are getting the same pointer back here, which your explanation depends on - I don't see why this is UB.

Because the compiler has no guarantee that the same address is referring to the same object.

We know it is, this is what we intended; but realloc() here is really no different to free() + malloc(), in which case the semantics become a bit more apparent:

int *addr = (int*) malloc(sizeof(int));  // the cast is required in C++
int *bananas = addr;

*bananas = 42;
printf("%d\n", *addr); // 42


int *apples = addr;  // this being up here causes UB
addr = (int*) realloc(addr, sizeof(int));  // even assuming the same address, it might not refer to the same object

// int *apples = addr;  // if it was down here it'd be fine
// addr = std::launder(addr);  // or use launder instead

*apples = 1337;
printf("%d\n", *addr);

a2 = std::launder(a2); // No effect?

You're right in that a2 = std::launder(a2) has no effect value-wise or pointer-wise, but it refreshes the object that the compiler is looking at for a2. We can read it as a hint to the compiler more so than a CPU instruction.

printf("%d\n", *a2); // undefined behaviour if the pointer changed - the old pointer is freed.

We are using placement-new, so this is not a concern. We are in control of the address returned.

If it's not clear, I can draw up a placement-new example, but the syntax is a bit more awkward than simply using malloc/realloc. The key insight here is that addresses and objects are conceptually different things, even if both the value at the symbol and its address are identical.

@mazunki
Contributor Author

mazunki commented Oct 28, 2025

#include <new>
#include <cstdio>
#include <print>

struct Count { int n; };

int main() {
    const Count* counter = new const Count{3};
    int apples = counter->n;

    new (const_cast<Count*>(counter)) const Count{5};

    int bananas = counter->n; // UB
    std::println("apples={}", apples);
    std::println("bananas={}", bananas);
}

$ clang++ -Wall -Wextra -Wpedantic -O3 -std=c++23 -fsanitize=address,undefined main.cpp
$ ./a.out
apples=3
bananas=5

=================================================================
==7668==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4 byte(s) in 1 object(s) allocated from:
    #0 0x562bad2c87c1 in operator new(unsigned long) (/home/maz/nyaa/launder/a.out+0x15e7c1)
    #1 0x562bad2c9c6d in main (/home/maz/nyaa/launder/a.out+0x15fc6d)

SUMMARY: AddressSanitizer: 4 byte(s) leaked in 1 allocation(s).
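
For comparison, here is a minimal sketch of the same example with the read routed through std::launder and the allocation freed, so that LeakSanitizer has nothing to report. The function name read_after_replace and the variable fresh are made up for illustration; this is my reading of the lifetime rules, not code from the PR.

```cpp
#include <cassert>
#include <new>

struct Count { int n; };

// Hypothetical helper: returns the value read after the object is replaced.
int read_after_replace() {
    const Count* counter = new const Count{3};
    assert(counter->n == 3);  // reading the original object is fine

    // Reuse the storage for a new const object. As far as I can tell, a
    // const complete object is not "transparently replaceable", so `counter`
    // does not automatically refer to the new object.
    new (const_cast<Count*>(counter)) const Count{5};

    // std::launder returns a pointer to the object that lives there now.
    const Count* fresh = std::launder(counter);
    int bananas = fresh->n;

    delete fresh;  // release the allocation through the laundered pointer
    return bananas;
}
```

Calling read_after_replace() from main and printing the result should give 5 without relying on what the optimizer assumes about the old const object.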

@mazunki
Contributor Author

mazunki commented Oct 28, 2025

And the original warning I assume you got, which prompted you to introduce launder in the first place, would be very helpful

The warning which made me look into this is aligned_storage_t being deprecated. I'm trying to get a warning about the UB itself, but can't find a way to have the compiler tell me, other than through runtime sanitizers.

Related: https://www.think-cell.com/assets/en/career/talks/pdf/think-cell_talk_lifetime.pdf
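
To make the aligned_storage_t connection concrete, here is a rough sketch of the usual replacement pattern: an alignas(T) std::byte buffer, placement-new to create the object, and std::launder to get a usable pointer back from the buffer. The Slot class and slot_demo function are hypothetical names for illustration only and are not the code in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical one-slot container standing in for a member of type
// std::aligned_storage_t<sizeof(T), alignof(T)>.
template <typename T>
class Slot {
public:
    Slot() = default;
    Slot(const Slot&) = delete;
    Slot& operator=(const Slot&) = delete;
    ~Slot() { reset(); }

    template <typename... Args>
    T& emplace(Args&&... args) {
        reset();
        // Placement-new starts the lifetime of a T inside the byte buffer.
        T* p = ::new (static_cast<void*>(storage_)) T(std::forward<Args>(args)...);
        engaged_ = true;
        return *p;
    }

    T& get() {
        assert(engaged_);
        // storage_ is an array of std::byte, not a T. The cast alone gives a
        // pointer with the right address; std::launder makes it point at the
        // T object that currently lives there.
        return *std::launder(reinterpret_cast<T*>(storage_));
    }

    void reset() {
        if (engaged_) {
            get().~T();  // end the lifetime of the stored object
            engaged_ = false;
        }
    }

private:
    alignas(T) std::byte storage_[sizeof(T)];
    bool engaged_ = false;
};

// Stores 3, replaces it with 5 in the same storage, and reads it back.
int slot_demo() {
    Slot<int> slot;
    slot.emplace(3);
    slot.emplace(5);
    return slot.get();
}
```

The pointer returned by placement-new could also be cached instead of calling std::launder on every access; the sketch recomputes it from the buffer because that mirrors how aligned_storage_t was typically used.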

@mazunki
Contributor Author

mazunki commented Oct 28, 2025

This example is only leaking sometimes for me, lol

#include <new>
#include <cstdio>
#include <memory>  // std::destroy_at
#include <print>

int main() {
    const int* ptr = new const int(11); // provenance A
    std::destroy_at(ptr);

    void* raw = const_cast<void*>(static_cast<const void*>(ptr));

    int* new_ptr = ::new (raw) int(42); // provenance B

    std::print("*new_ptr = {}\n", *new_ptr);       // okay
    std::print("*ptr = {}\n", *ptr);               // UB
    std::print("*ptr = {}\n", *std::launder(ptr)); // okay
}

Interesting bit: if I use println instead of print, it always fails. Gotta love some UB.

clang++ -Wall -Wextra -Wpedantic -O3 -std=c++23 -fsanitize=address,undefined -ggdb3 main.cpp
