Discussion:
Bug#844494: reopen: Should close stderr when becoming multiplex master
Julien Palard
2016-11-16 09:45:47 UTC
Permalink
Package: openssh-client
Version: 1:7.3p1-3+b1
Severity: normal

Dear Maintainer,

I think that the bug
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=714526 is still
live, after upgrading to 1:7.3p1-3+b1 (should have been fixed still
1:7.2p2-6).

I straced SSH with and without `ControlMaster` and extracted revelant
bits of information, redacting it, and removing useless lines. First,
without `ControlMaster`:

```
# Here, Python 3 Popen prepares pipes for stdin, stdout, and stderr:
7880 1479228055.059186 pipe2([8, 9], O_CLOEXEC) = 0
7880 1479228055.059226 pipe2([10, 11], O_CLOEXEC) = 0
7880 1479228055.059258 pipe2([12, 13], O_CLOEXEC) = 0
# Then ssh is cloned:
7880 1479228055.060158 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f7bfd7fa9d0) = 7949
7949 1479228055.121259 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fd3f2a63650) = 7954
7954 1479228055.121135 execve("/bin/sh", ["/bin/sh", "-c", "exec ssh ***@gate.redacted -W dev2.redacted:22"], [/* 16 vars */] <unfinished ...>
# Normal exit
7949 1479228055.592488 +++ exited with 0 +++
7871 1479228055.592519 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=7949, si_uid=1003, si_status=0, si_utime=0, si_stime=2} ---
7954 1479228055.592786 exit_group(129) = ?
# Python receive EOF on stdout AND stderr
7880 1479228055.694455 read(12, <unfinished ...>
7880 1479228055.694480 <... read resumed> "", 32768) = 0
7880 1479228055.738380 read(10, <unfinished ...>
7880 1479228055.738410 <... read resumed> "", 32768) = 0
# So, as the process is exited, AND stdout/stderr had their EOF, Python 3 subprocess.Popen.communicate allows itself to wait for it:
7880 1479228055.924898 wait4(7949, <unfinished ...>
7880 1479228055.925334 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 7949
```

Now, with `ControlMaster`:

```
# Same pipes for stdin, stdout, and stderr from Python Popen:
6749 1479227922.771783 pipe2([8, 9], O_CLOEXEC) = 0
6749 1479227922.771831 pipe2([10, 11], O_CLOEXEC) = 0
6749 1479227922.771865 pipe2([12, 13], O_CLOEXEC) = 0
# A bunch of clones, SSH starting its master process:
6749 1479227922.774839 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fa1957fa9d0) = 6824
6824 1479227922.818219 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff4c6f11650) = 6827
6827 1479227922.820940 execve("/bin/sh", ["/bin/sh", "-c", "exec ssh ***@redacted -W redacted:22"], [/* 16 vars */] <unfinished ...>
6824 1479227923.204918 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff4c6f11650) = 6856
6856 1479227923.205441 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff4c6f11650) = 6857
6857 1479227923.206004 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff4c6f11650) = 6858
# Normal exit after work is done:
6824 1479227923.335999 +++ exited with 0 +++
6749 1479227923.336042 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=6824, si_uid=1003, si_status=0, si_utime=0, si_stime=0} ---
# Python is getting EOF on STDOUT
6749 1479227923.336086 read(10, "", 32768) = 0
# But here, process is a zombie (dead but not awaited), after like a minute,
# I send a SIGINT and a SIGQUIT to the Python process, so everyone dies and finally:
6749 1479227986.358448 read(12, "", 32768) = 0
6827 1479227986.328027 exit_group(130) = ?
6858 1479227986.329254 exit_group(255) = ?
6749 1479227986.578489 wait4(6824, <unfinished ...>
6749 1479227986.578546 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6824
```

So here it looks like the bug is exactly the same as before: stdout is
closed but stderr is kept open. As Python's
``subprocess.Popen.communicate`` want to wait for both to be closed before
waiting for the process, it yields to stuck Python process and zombies.

-- System Information:
Debian Release: 8.6
APT prefers stable
APT policy: (900, 'stable'), (200, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 3.16.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.utf8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages openssh-client depends on:
ii adduser 3.113+nmu3
ii dpkg 1.17.27
ii libc6 2.24-5
ii libedit2 3.1-20140620-2
ii libgssapi-krb5-2 1.14.3+dfsg-2
ii libselinux1 2.3-2
ii libssl1.0.2 1.0.2j-1
ii passwd 1:4.2-3+deb8u1
ii zlib1g 1:1.2.8.dfsg-2+b1

Versions of packages openssh-client recommends:
ii xauth 1:1.0.9-1

Versions of packages openssh-client suggests:
pn keychain <none>
pn libpam-ssh <none>
pn monkeysphere <none>
pn ssh-askpass <none>

-- no debconf information
Debian Bug Tracking System
2018-10-21 10:09:12 UTC
Permalink
Your message dated Sun, 21 Oct 2018 10:04:53 +0000
with message-id <E1gEAbN-0007is-***@fasolo.debian.org>
and subject line Bug#844494: fixed in openssh 1:7.9p1-1
has caused the Debian Bug report #844494,
regarding reopen: Should close stderr when becoming multiplex master
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ***@bugs.debian.org
immediately.)
--
844494: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=844494
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems
Loading...