How not to use shared_ptr to this

A customer has code like this:

struct B;
struct A {
    std::shared_ptr<B> b;
};

struct B {
   B(A& a);
};

B::B(A& a) {
    a.b = std::shared_ptr<B>(this);
}

int main() {
    A a;
    auto b = std::shared_ptr<B>(new B(a));

    return 0;
}

The application (most of which is not shown here) runs as expected, but when main exits, we get a double free crash. The reason is that the instance of B that this program creates is managed by two unrelated shared_ptrs. The first shared_ptr is created in main, and the second is created in the B constructor (and then moved into A::b).

In a situation where you have more than one shared_ptr managing the same object, use-after-free and double free bugs occur whenever any code tries to use one of the shared_ptrs after another shared_ptr has already freed the managed object. That is what is happening here, though it’s not obvious because the pointer usages that lead to this crash are invisible. That is, none of the statements shown here actually crash; it’s rather the compiler-generated destructor invocations that set the failure in motion. It is up to C++ users to be aware of how shared_ptr works and to follow best practices to avoid its limitations.

But let’s say that for whatever reasons, we don’t immediately spot the problems with the above code. Perhaps we know all about shared_ptr pitfalls, but rather than standing in plain sight like in this blog post, the problems are lurking deep in a large codebase. Then how might debugging proceed? Running under a debugger and checking the stack trace at the point of the abort implicates the relevant classes.

#0  0x00007ffff7aa8664 in __pthread_kill_implementation () from /lib64/libc.so.6
#1  0x00007ffff7a4fc4e in raise () from /lib64/libc.so.6
#2  0x00007ffff7a37902 in abort () from /lib64/libc.so.6
#3  0x00007ffff7a38767 in __libc_message_impl.cold () from /lib64/libc.so.6
#4  0x00007ffff7ab27e5 in malloc_printerr () from /lib64/libc.so.6
#5  0x00007ffff7ab4def in _int_free () from /lib64/libc.so.6
#6  0x00007ffff7ab754e in free () from /lib64/libc.so.6
#7  0x00000000004019ab in std::_Sp_counted_ptr<B*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x4172d0) at /usr/include/c++/14/bits/shared_ptr_base.h:428
#8  0x000000000040136b in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x4172d0) at /usr/include/c++/14/bits/shared_ptr_base.h:346
#9  0x00000000004014c9 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fffffffdcf8, __in_chrg=<optimized out>) at /usr/include/c++/14/bits/shared_ptr_base.h:1069
#10 0x0000000000401428 in std::__shared_ptr<B, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fffffffdcf0, __in_chrg=<optimized out>) at /usr/include/c++/14/bits/shared_ptr_base.h:1525
#11 0x0000000000401444 in std::shared_ptr<B>::~shared_ptr (this=0x7fffffffdcf0, __in_chrg=<optimized out>) at /usr/include/c++/14/bits/shared_ptr.h:175
#12 0x00000000004011cd in A::~A (this=0x7fffffffdcf0, __in_chrg=<optimized out>) at main.cpp:13
#13 0x000000000040129d in main () at main.cpp:33

The stack trace shows we abort when freeing a shared_ptr<B> inside ~A. If we add some logging to the destructors involved:

A::~A() {
    std::cout << "Inside A destructor.\n";
}
B::~B() {
    std::cout << "Inside B destructor.\n";
}

Then the application outputs:

Inside B destructor.
Inside A destructor.
Inside B destructor.
free(): double free detected in tcache 2
Aborted (core dumped)

This reveals that B’s destructor is in fact running twice! We can learn more by using AddressSanitizer, or preferably a debugger. GDB produces the following stack trace for the first invocation of B’s destructor.

#0  B::~B (this=0x4172b0, __in_chrg=<optimized out>) at main.cpp:25
#1  0x000000000040199e in std::_Sp_counted_ptr<B*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x4172f0) at /usr/include/c++/14/bits/shared_ptr_base.h:428
#2  0x000000000040136b in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x4172f0) at /usr/include/c++/14/bits/shared_ptr_base.h:346
#3  0x00000000004014c9 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fffffffdce8, __in_chrg=<optimized out>) at /usr/include/c++/14/bits/shared_ptr_base.h:1069
#4  0x0000000000401428 in std::__shared_ptr<B, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fffffffdce0, __in_chrg=<optimized out>) at /usr/include/c++/14/bits/shared_ptr_base.h:1525
#5  0x0000000000401444 in std::shared_ptr<B>::~shared_ptr (this=0x7fffffffdce0, __in_chrg=<optimized out>) at /usr/include/c++/14/bits/shared_ptr.h:175
#6  0x0000000000401291 in main () at main.cpp:33

We can tell from frames 5 and 6 that this is where local variable b is being destroyed in main. This finishes successfully. Next comes the destruction of local variable a in main, with the following stack trace:

#0  A::~A (this=0x7fffffffdcf0, __in_chrg=<optimized out>) at main.cpp:12
#1  0x000000000040129d in main () at main.cpp:33

As part of A’s destruction, A’s shared_ptr<B> field gets destroyed. This leads to the double free of the B instance. Note that GDB allows us to see that the this pointer is the same this time as before (this=0x4172b0):

#0  B::~B (this=0x4172b0, __in_chrg=<optimized out>) at main.cpp:25
#1  0x000000000040199e in std::_Sp_counted_ptr<B*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x4172d0) at /usr/include/c++/14/bits/shared_ptr_base.h:428
#2  0x000000000040136b in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x4172d0) at /usr/include/c++/14/bits/shared_ptr_base.h:346
#3  0x00000000004014c9 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7fffffffdcf8, __in_chrg=<optimized out>) at /usr/include/c++/14/bits/shared_ptr_base.h:1069
#4  0x0000000000401428 in std::__shared_ptr<B, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7fffffffdcf0, __in_chrg=<optimized out>) at /usr/include/c++/14/bits/shared_ptr_base.h:1525
#5  0x0000000000401444 in std::shared_ptr<B>::~shared_ptr (this=0x7fffffffdcf0, __in_chrg=<optimized out>) at /usr/include/c++/14/bits/shared_ptr.h:175
#6  0x00000000004011cd in A::~A (this=0x7fffffffdcf0, __in_chrg=<optimized out>) at main.cpp:13
#7  0x000000000040129d in main () at main.cpp:33

To be precise, the above stack trace showed the second invocation of B’s destructor, not the free, which happens after the destructor returns. Of course, much of these stack traces is outside the scope of the language standard anyway, so they look different on different platforms, but I show some here anyway for reference.

What to do instead

The main takeaway is “Never pass this as the argument to the shared_ptr constructor.” But what can you do instead when you want to get a shared pointer to this?

std::enable_shared_from_this

One option is to inherit std::enable_shared_from_this<T> where T is your class. This base class gives you a function shared_from_this which returns a shared_ptr to this. A crucial limitation of this approach is that this function must not be called inside your constructors.

Refactor

Another good albeit open-ended approach is to just give up on the idea. Considering the above example, if our application requires that an instance of A have a reference to an instance of B, we can satisfy that requirement some other way. In particular, we could accomplish the pointer assignment inside the caller of B’s constructor, instead of inside B’s constructor itself.

Possible refactor:

struct B;
struct A {
    std::shared_ptr<B> b;
    ~A();
};

struct B {
    B();
    ~B();
};

B::B() {
    // Removed bad statement.
    // We could still use the A for other purposes here if we wanted to,
    // but there's no need in this example, so I removed the parameter.
}

int main() {
    A a;
    auto b = std::shared_ptr<B>(new B());
    a.b = b;

    return 0;
}

This version of the code achieves in spirit the same thing but without ever needing a shared_ptr to this. Arguably, it has the added benefit of removing the bad practice of leaking this from a constructor.

Addendum

I feel I should mention that I typically prefer make_shared over the shared_ptr constructor, but I used the latter in this post for consistency.


Posted

in

,

by

Tags: