Python Debug Resources
In the following examples I refer to one of my code snippets named atomic.py.
I also use my preconfigured environment, where all the apps and commands used here are already installed.
Do not hesitate to check out both of my repos to test the following commands.
Using GDB with Python
Using GDB can save you from editing your code to add debug statements.
In the following command, I first run my script in the background and then directly attach GDB to the PID of the running Python process:
python python/atomic.py & gdb python $!
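Once GDB is attached, and assuming the CPython GDB extensions (python-gdb.py) and the matching debug symbols are available (my preconfigured environment ships them), you can inspect the Python-level state of the process with commands such as:
(gdb) py-bt        # Python-level backtrace of the current thread
(gdb) py-list      # Python source around the current frame
(gdb) py-locals    # local variables of the current Python frame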
Retrieve the Current Exception in PDB
pdb stores the exception type and value in __exception__.
You can retrieve this exception by simply evaluating it at the prompt:
(Pdb) __exception__
You can print the exception part of a traceback in pdb with:
(Pdb) import traceback; print("".join(traceback.format_exception_only(*__exception__)))
This trick is useful when you deal with code like the following snippet:
try:
    do_something()
except:
    do_something_else()
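To try this trick out, here is a minimal, hypothetical reproducer (the function names are illustrative, they do not come from atomic.py). Run it with python -m pdb reproducer.py, step over the call to do_something() with next, and pdb stops on the caught exception; __exception__ can then be printed as shown above:
def do_something():
    # Raise an exception that the bare except below silently swallows.
    raise ValueError("boom")

def do_something_else():
    pass

try:
    do_something()
except:
    do_something_else()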
Compile Python Code
Sometimes it can be useful to compile your Python code into a standalone binary. To do that you can simply use PyInstaller.
Compiling the code of atomic.py:
$ pyinstaller atomic.py
The previous command generates a dist folder containing something like the following (truncated) output:
dist
└── atomic
├── _internal
│ ├── base_library.zip
│ ├── lib-dynload
│ │ ├── _bisect.cpython-312-x86_64-linux-gnu.so
│ │ ├── _blake2.cpython-312-x86_64-linux-gnu.so
│ │ ├── _bz2.cpython-312-x86_64-linux-gnu.so
│ │ ├── _codecs_cn.cpython-312-x86_64-linux-gnu.so
...
│ │ ├── _datetime.cpython-312-x86_64-linux-gnu.so
│ │ ├── _decimal.cpython-312-x86_64-linux-gnu.so
│ │ ├── _hashlib.cpython-312-x86_64-linux-gnu.so
│ │ ├── _heapq.cpython-312-x86_64-linux-gnu.so
│ │ ├── _lzma.cpython-312-x86_64-linux-gnu.so
│ │ ├── _md5.cpython-312-x86_64-linux-gnu.so
│ │ ├── _multibytecodec.cpython-312-x86_64-linux-gnu.so
...
│ │ ├── _random.cpython-312-x86_64-linux-gnu.so
│ │ └── zlib.cpython-312-x86_64-linux-gnu.so
│ ├── libbz2.so.1.0
│ ├── libcrypto.so.1.1
│ ├── liblzma.so.5
│ ├── libpython3.12.so.1.0
│ └── libz.so.1
└── atomic
3 directories, 44 files
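The resulting executable is dist/atomic/atomic. Assuming the build went fine, you can run it directly to check that the bundled binary behaves like the original script:
$ ./dist/atomic/atomic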
Disassemble Compiled Python Code
Compiling and disassembling Python code allows you to implement some sort of linter, for example one that identifies file handling done at the C level with fopen.
Imagine you want to migrate existing blocking code to async code, maybe by using asyncio, and you first want to identify which parts of your code could be a source of blocking IO. Compiling and then disassembling the Python code can help you locate the code you need to rework.
Your code may rely on underlying libraries that host the blocking IO; by compiling and disassembling it, you turn it into low-level code with direct visibility into these hidden blocking calls.
Based on the binary generated in the previous section, the following command disassembles it:
$ objdump -d dist/atomic/atomic
dist/atomic/atomic: file format elf64-x86-64
Disassembly of section .init:
0000000000402000 <.init>:
402000: 48 83 ec 08 sub $0x8,%rsp
402004: 48 8b 05 ed bf 00 00 mov 0xbfed(%rip),%rax # 40dff8 <dlerror@plt+0xbad8>
40200b: 48 85 c0 test %rax,%rax
40200e: 74 05 je 402015 <__strcat_chk@plt-0x1b>
402010: e8 ab 02 00 00 callq 4022c0 <__gmon_start__@plt>
402015: 48 83 c4 08 add $0x8,%rsp
402019: c3 retq
Disassembly of section .plt:
0000000000402020 <__strcat_chk@plt-0x10>:
402020: ff 35 e2 bf 00 00 pushq 0xbfe2(%rip) # 40e008 <dlerror@plt+0xbae8>
402026: ff 25 e4 bf 00 00 jmpq *0xbfe4(%rip) # 40e010 <dlerror@plt+0xbaf0>
40202c: 0f 1f 40 00 nopl 0x0(%rax)
0000000000402030 <__strcat_chk@plt>:
402030: ff 25 e2 bf 00 00 jmpq *0xbfe2(%rip) # 40e018 <dlerror@plt+0xbaf8>
402036: 68 00 00 00 00 pushq $0x0
40203b: e9 e0 ff ff ff jmpq 402020 <__strcat_chk@plt-0x10>
0000000000402040 <getenv@plt>:
402040: ff 25 da bf 00 00 jmpq *0xbfda(%rip) # 40e020 <dlerror@plt+0xbb00>
402046: 68 01 00 00 00 pushq $0x1
40204b: e9 d0 ff ff ff jmpq 402020 <__strcat_chk@plt-0x10>
...
0000000000402460 <fopen@plt>:
402460: ff 25 ca bd 00 00 jmpq *0xbdca(%rip) # 40e230 <dlerror@plt+0xbd10>
402466: 68 43 00 00 00 pushq $0x43
40246b: e9 b0 fb ff ff jmpq 402020 <__strcat_chk@plt-0x10>
...
407e14: e8 47 a6 ff ff callq 402460 <fopen@plt>
407e19: 48 81 c4 08 10 00 00 add $0x1008,%rsp
407e20: 5d pop %rbp
407e21: 41 5c pop %r12
407e23: c3 retq
407e24: 0f 1f 40 00 nopl 0x0(%rax)
407e28: 8b 05 6e 64 00 00 mov 0x646e(%rip),%eax # 40e29c <dlerror@plt+0xbd7c>
407e2e: 83 f8 ff cmp $0xffffffff,%eax
407e31: 74 3d je 407e70 <dlerror@plt+0x5950>
407e33: 85 c0 test %eax,%eax
407e35: 0f 85 84 00 00 00 jne 407ebf <dlerror@plt+0x599f>
4086ee: e8 0d 99 ff ff callq 402000 <__strcat_chk@plt-0x30>
4086f3: 48 85 ed test %rbp,%rbp
4086f6: 74 1e je 408716 <dlerror@plt+0x61f6>
4086f8: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
4086ff: 00
408700: 4c 89 ea mov %r13,%rdx
408703: 4c 89 f6 mov %r14,%rsi
408706: 44 89 ff mov %r15d,%edi
408709: 41 ff 14 dc callq *(%r12,%rbx,8)
40870d: 48 83 c3 01 add $0x1,%rbx
408711: 48 39 eb cmp %rbp,%rbx
408714: 75 ea jne 408700 <dlerror@plt+0x61e0>
408716: 48 83 c4 08 add $0x8,%rsp
40871a: 5b pop %rbx
40871b: 5d pop %rbp
40871c: 41 5c pop %r12
40871e: 41 5d pop %r13
408720: 41 5e pop %r14
408722: 41 5f pop %r15
408724: c3 retq
408725: 90 nop
408726: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
40872d: 00 00 00
408730: f3 c3 repz retq
Disassembly of section .fini:
0000000000408734 <.fini>:
408734: 48 83 ec 08 sub $0x8,%rsp
408738: 48 83 c4 08 add $0x8,%rsp
40873c: c3 retq
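In practice you rarely read the whole listing; filtering the objdump output for the symbol you care about is usually enough. For example, to spot the PLT stub and the call sites of fopen shown above:
$ objdump -d dist/atomic/atomic | grep 'fopen@plt'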
USDT (User Statically-Defined Tracing) probes & Python
Imagine you run a Python server and you want to monitor the threads of this running process.
First, let's start the server:
$ python3 -m http.server
Serving HTTP on 0.0.0.0 port 8000 (http://0.0.0.0:8000/) ...
You can use bpftrace to list all the USDT probes available for this running Python server:
$ sudo bpftrace -lp <pid> "usdt:*"
...
usdt:/proc/920960/root/usr/lib64/libc.so.6:libc:mutex_timedlock_acquired
usdt:/proc/920960/root/usr/lib64/libc.so.6:libc:mutex_timedlock_entry
usdt:/proc/920960/root/usr/lib64/libc.so.6:libc:pthread_create
usdt:/proc/920960/root/usr/lib64/libc.so.6:libc:pthread_join
usdt:/proc/920960/root/usr/lib64/libc.so.6:libc:pthread_join_ret
usdt:/proc/920960/root/usr/lib64/libc.so.6:libc:pthread_start
usdt:/proc/920960/root/usr/lib64/libc.so.6:libc:rdlock_acquire_read
usdt:/proc/920960/root/usr/lib64/libc.so.6:libc:rdlock_entry
usdt:/proc/920960/root/usr/lib64/libc.so.6:libc:rwlock_destroy
...
Replace <pid> with the PID of the running Python server you want to inspect; you can retrieve this PID with ps ax | grep python.
The previous command lists all the USDT probes you can attach to.
We want to observe threads, so let's focus on them:
$ sudo bpftrace -p <pid> -e \
'usdt:/proc/<pid>/root/usr/lib64/libc.so.6:libc:pthread*
{ printf("%s %u [%u] %u %s\n", comm, pid, cpu, elapsed, probe); }'
Attaching 4 probes...
python3 <pid> [3] 1546728119 usdt:/proc/921304/root/usr/lib64/libc.so.6:libc:pthread_create
python3 <pid> [1] 1546799291 usdt:/proc/921304/root/usr/lib64/libc.so.6:libc:pthread_start
In the example above we can observe that a thread is created and started when I submit a new request to this running server (wget http://0.0.0.0:8000/).
Many USDT probes can be monitored, which can be useful to debug your apps.
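As a further sketch (assuming the same libc probe path as in the listing above), you can aggregate instead of printing every event, for example counting thread creations per process name until you hit Ctrl-C:
$ sudo bpftrace -p <pid> -e 'usdt:/proc/<pid>/root/usr/lib64/libc.so.6:libc:pthread_create { @creates[comm] = count(); }'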
Using BCC to trace a PID launched in a Python virtualenv
Sometimes we run Python inside a virtualenv; in this case the shared objects, and therefore the available probes, can differ from a plain system launch.
The snippet below is an example of this problem. The deadlock BCC tool is not able to properly find the pthread_mutex_unlock symbol because of the virtualenv usage:
$ tox -e venv -- python reproducer.py
$ ps ax | grep reproducer
1535989 pts/2 Sl+ 0:27 /usr/bin/python3 -P /usr/bin/tox -e venv -- python reproducer.py
1536055 pts/2 Sl+ 0:00 /home/hberaud/dev/oslo.log/.tox/venv/bin/python reproducer.py
1536892 pts/3 S+ 0:00 grep --color=auto --exclude-dir=.bzr --exclude-dir=CVS --exclude-dir=.git --exclude-dir=.hg --exclude-dir=.svn --exclude-dir=.idea --exclude-dir=.tox reprod
$ sudo /usr/share/bcc/tools/deadlock 1536055
In file included from /virtual/main.c:11:
In file included from include/linux/sched.h:12:
In file included from arch/x86/include/asm/current.h:10:
In file included from include/linux/cache.h:6:
In file included from arch/x86/include/asm/cache.h:5:
In file included from include/linux/linkage.h:8:
In file included from arch/x86/include/asm/linkage.h:6:
arch/x86/include/asm/ibt.h:77:8: warning: 'nocf_check' attribute ignored; use -fcf-protection to enable the attribute [-Wignored-attributes]
extern __noendbr u64 ibt_save(bool disable);
^
arch/x86/include/asm/ibt.h:32:34: note: expanded from macro '__noendbr'
#define __noendbr __attribute__((nocf_check))
^
arch/x86/include/asm/ibt.h:78:8: warning: 'nocf_check' attribute ignored; use -fcf-protection to enable the attribute [-Wignored-attributes]
extern __noendbr void ibt_restore(u64 save);
^
arch/x86/include/asm/ibt.h:32:34: note: expanded from macro '__noendbr'
#define __noendbr __attribute__((nocf_check))
^
2 warnings generated.
could not determine address of symbol pthread_mutex_unlock in /usr/bin/python3.12. Failed to attach to symbol: pthread_mutex_unlock
Is --binary argument missing?
Starting from there, we can try to pass more arguments to this BCC tool to help it find what it needs. Here is how to retrieve such details in a virtualenv context:
$ ldd /home/hberaud/dev/oslo.log/.tox/venv/bin/python
linux-vdso.so.1 (0x00007ffdc33ab000)
libpython3.12.so.1.0 => /lib64/libpython3.12.so.1.0 (0x00007fbdd7200000)
libc.so.6 => /lib64/libc.so.6 (0x00007fbdd701e000)
libm.so.6 => /lib64/libm.so.6 (0x00007fbdd6f3d000)
/lib64/ld-linux-x86-64.so.2 (0x00007fbdd7905000)
$ nm /lib64/ld-linux-x86-64.so.2 | grep mutex
0000000000012da0 t rtld_mutex_dummy
000000000001a610 t __rtld_mutex_init
0000000000033a50 d ___rtld_mutex_lock
0000000000033a48 d ___rtld_mutex_unlock
$ nm /lib64/libc.so.6 | grep mutex | grep unlock
00000000000937b0 t __GI___pthread_mutex_unlock
0000000000093690 t __GI___pthread_mutex_unlock_usercnt
00000000000937b0 t ___pthread_mutex_unlock
0000000000093230 t __pthread_mutex_unlock_full
00000000000937b0 T __pthread_mutex_unlock@GLIBC_2.2.5
00000000000937b0 T pthread_mutex_unlock@@GLIBC_2.2.5
0000000000093690 t __pthread_mutex_unlock_usercnt
First we inspect the shared objects used by the Python binary of our virtualenv.
Then we search for the mutex unlock symbols. /lib64/libc.so.6 exports pthread_mutex_unlock@@GLIBC_2.2.5, which is exactly the symbol the deadlock tool tries to trace.
Let's see if we can use it.
$ sudo /usr/share/bcc/tools/deadlock 1536055 --binary /lib64/libc.so.6
In file included from /virtual/main.c:11:
In file included from include/linux/sched.h:12:
In file included from arch/x86/include/asm/current.h:10:
In file included from include/linux/cache.h:6:
In file included from arch/x86/include/asm/cache.h:5:
In file included from include/linux/linkage.h:8:
In file included from arch/x86/include/asm/linkage.h:6:
arch/x86/include/asm/ibt.h:77:8: warning: 'nocf_check' attribute ignored; use -fcf-protection to enable the attribute [-Wignored-attributes]
extern __noendbr u64 ibt_save(bool disable);
^
arch/x86/include/asm/ibt.h:32:34: note: expanded from macro '__noendbr'
#define __noendbr __attribute__((nocf_check))
^
arch/x86/include/asm/ibt.h:78:8: warning: 'nocf_check' attribute ignored; use -fcf-protection to enable the attribute [-Wignored-attributes]
extern __noendbr void ibt_restore(u64 save);
^
arch/x86/include/asm/ibt.h:32:34: note: expanded from macro '__noendbr'
#define __noendbr __attribute__((nocf_check))
^
2 warnings generated.
Tracing... Hit Ctrl-C to end.
Bingo! By passing the binary path to the deadlock BCC tool, the tracer now works as expected. We are able to detect potential deadlocks in our Python process.
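A quick way to double-check which libc the traced process actually has mapped, and therefore which path to pass to --binary, is to look at its memory mappings (the PID below is the one from the example above):
$ grep 'libc.so' /proc/1536055/maps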
Using perf to profile a Python process
If the PYTHONPERFSUPPORT environment variable is set to a nonzero value, it enables support for the Linux perf profiler (available since Python 3.12), so Python calls can be detected by it.
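The content of sample.py is not shown here; any CPU-bound script will do. A minimal, hypothetical stand-in:
def busy_loop(n):
    # Deliberately CPU-bound so perf has something to sample.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    for _ in range(50):
        busy_loop(1_000_000)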
$ PYTHONPERFSUPPORT=1 python sample.py
Then we can record the profile of our running Python process using the perf record command:
$ perf record -F 9999 -g -o perf.data -p $(ps ax | grep python | grep sample | awk '{print $1}')
Once the recording is finished, we can generate a report from the recorded profile:
$ perf report -g -i perf.data
Or, create a flamegraph from your app:
$ perf script flamegraph -a -F 99 sleep 60
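If your perf build does not ship the flamegraph script, the classic pipeline from Brendan Gregg's FlameGraph repository (see the first link below) gives the same result, assuming stackcollapse-perf.pl and flamegraph.pl are on your PATH:
$ perf script -i perf.data | stackcollapse-perf.pl | flamegraph.pl > flamegraph.svg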
More interesting details can be found here:
- https://www.brendangregg.com/perf.html
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/monitoring_and_managing_system_status_and_performance/getting-started-with-flamegraphs_monitoring-and-managing-system-status-and-performance
- https://docs.python.org/3/howto/perf_profiling.html