guix substitute: backtrace when the network is disabled during substitution

  • Open
  • quality assurance status badge
Details
2 participants
  • Ludovic Courtès
  • Maxime Devos
Owner
unassigned
Submitted by
Maxime Devos
Severity
important
M
M
Maxime Devos wrote on 20 Jun 2021 04:07
(address . bug-guix@gnu.org)
1fa57b9edcd62f9bea334688da9c4f009507b268.camel@telenet.be
X-Debbugs-CC: ludo@gnu.org 48903@debbugs.gnu.org maxim.cournoyer@gmail.com
tags: + substituter

Ludovic Courtès schreef op vr 11-06-2021 om 17:09 [+0200]:
Toggle quote (21 lines)
> Hi Maxim{,e}!
>
> Maxime Devos <maximedevos@telenet.be> skribis:
>
> > > 122.8 MB will be downloaded
> > > libreoffice-6.4.7.2 117.1MiB 344KiB/s 03:04 [######### ] 52.7%guix substitute: error: TLS error in procedure 'read_from_session_record_port': Error decoding the received TLS packet.
> > > substitution of /gnu/store/44h13hn5zssfppz67vydxcf95qsc8qfw-libreoffice-6.4.7.2 failed
> > > guix package: error: some substitutes for the outputs of derivation `/gnu/store/9f8sffldy39mprihx6xgrs7hys9j75jm-libreoffice-6.4.7.2.drv' failed (usually happens due to networking issues); try `--fallback' to build derivation from source
> > > --8<---------------cut here---------------end--------------->8---
> > >
> >
> > I often have the same problem when I do "guix package -u".
> > (Same error message, same package libreoffice, same derivation)
> > (Usually libreoffice, sometimes with other packages as well.)
>
> As a first step, can you reproduce the bug like this:
>
> while echo substitute /gnu/store/44h13hn5zssfppz67vydxcf95qsc8qfw-libreoffice-6.4.7.2 /tmp/t | guix substitute --substitute ; do chmod -R +w /tmp/t && rm -rf /tmp/t; done
>
> ?

I cannot reproduce either, with

$ guix describe
Generatie 97 23:40:15 19 jun 2021 (huidig)
guix 3aabe51
tak: master
commit: 3aabe51e8c09b9a2a87c03c40e3cc0f90d531bfd
nonguix d81564f
tak: master
commit: d81564f21e7d8800e6f6187fe2e1f6476e06bc30

so I wondered whether it is a networking issue,
so I disabled & re-enabled wireless networking during the substitution
and encountered a (to my knowledge) previously-unknown backtrace:

$ while (echo substitute /gnu/store/44h13hn5zssfppz67vydxcf95qsc8qfw-libreoffice-6.4.7.2 /tmp/t && echo substitute /gnu/store/j0j7z6ckarjs9yi77sncszbmdgy38s70-guix-1.3.0-4.4985a42 /tmp/u) | guix
substitute --substitute ; do chmod -R +w /tmp/t && rm -rf /tmp/t && chmod -R +w /tmp/u && rm -rf /tmp/u ; done

libreoffice-6.4.7.2 117.1MiB 1.7MiB/s 01:08 [##################] 100.0%

guix-1.3.0-4.4985a42 36.1MiB 223KiB/s 01:53 [############ ] 67.9%Backtrace:
In guix/serialization.scm:
468:33 19 (read "/tmp/u/share/guile/site/3.0" _)
468:33 18 (read "/tmp/u/share/guile/site/3.0/gnu" _)
468:33 17 (read "/tmp/u/share/guile/site/3.0/gnu/packages" _)
442:24 16 (read "/tmp/u/share/guile/site/3.0/gnu/packages/crates…" …)
525:24 15 (_ "/tmp/u/share/guile/site/3.0/gnu/packages/crates-io…" …)
In ice-9/ports.scm:
433:17 14 (call-with-output-file _ _ #:binary _ #:encoding _)
In guix/serialization.scm:
247:20 13 (dump #<input: string 7f0d0b7d3d90> #<output: /tmp/u/s…> …)
In unknown file:
12 (get-bytevector-n! #<input: string 7f0d0b7d3d90> # 0 #)
In gcrypt/hash.scm:
223:13 11 (read! #vu8(32 32 40 97 114 103 117 109 101 110 116 # …) …)
In unknown file:
10 (get-bytevector-n! #<input: string 7f0d0b7d3e00> # 0 #)
In lzlib.scm:
501:4 9 (lzread! #<lz-decoder 7f0d0b7d5ef0> #<input: string 7f…> …)
In unknown file:
8 (get-bytevector-n #<input: string 7f0d0b7d3e70> 65537)
In guix/progress.scm:
368:31 7 (read! _ _ _)
In unknown file:
6 (get-bytevector-n! #<input: string 7f0d0a15e4d0> # 0 #)
In web/response.scm:
95:2 5 (read! _ _ _)
In ice-9/boot-9.scm:
1685:16 4 (raise-exception _ #:continuable? _)
1685:16 3 (raise-exception _ #:continuable? _)
1780:13 2 (_ #<&compound-exception components: (#<&error> #<&irri…>)
1685:16 1 (raise-exception _ #:continuable? _)
1685:16 0 (raise-exception _ #:continuable? _)

ice-9/boot-9.scm:1685:16: In procedure raise-exception:
Throw to key `bad-response' with args `("EOF while reading response body: ~a bytes of ~a" (26017808 37842870))'.

This seems like a separate issue from #48903 though,
so I opened a new bug report.

Greetings,
Maxime.
-----BEGIN PGP SIGNATURE-----

iI0EABYKADUWIQTB8z7iDFKP233XAR9J4+4iGRcl7gUCYM8heBccbWF4aW1lZGV2
b3NAdGVsZW5ldC5iZQAKCRBJ4+4iGRcl7qrFAP9FxwTW6dNL8SA2IbZIiCmPbvCb
NocaTz3OnZMQiWWh+QEAoLRwuN7YppeSdBTBXw9Srco+cLG/RLYO7OFufqno0go=
=qRxg
-----END PGP SIGNATURE-----


L
L
Ludovic Courtès wrote on 22 Jun 2021 07:06
(name . Maxime Devos)(address . maximedevos@telenet.be)(address . 49132@debbugs.gnu.org)
87pmwe2dv9.fsf@gnu.org
Hi,

Maxime Devos <maximedevos@telenet.be> skribis:

Toggle quote (6 lines)
> so I disabled & re-enabled wireless networking during the substitution
> and encountered a (to my knowledge) previously-unknown backtrace:
>
> $ while (echo substitute /gnu/store/44h13hn5zssfppz67vydxcf95qsc8qfw-libreoffice-6.4.7.2 /tmp/t && echo substitute /gnu/store/j0j7z6ckarjs9yi77sncszbmdgy38s70-guix-1.3.0-4.4985a42 /tmp/u) | guix
> substitute --substitute ; do chmod -R +w /tmp/t && rm -rf /tmp/t && chmod -R +w /tmp/u && rm -rf /tmp/u ; done

[...]

Toggle quote (20 lines)
> In lzlib.scm:
> 501:4 9 (lzread! #<lz-decoder 7f0d0b7d5ef0> #<input: string 7f…> …)
> In unknown file:
> 8 (get-bytevector-n #<input: string 7f0d0b7d3e70> 65537)
> In guix/progress.scm:
> 368:31 7 (read! _ _ _)
> In unknown file:
> 6 (get-bytevector-n! #<input: string 7f0d0a15e4d0> # 0 #)
> In web/response.scm:
> 95:2 5 (read! _ _ _)
> In ice-9/boot-9.scm:
> 1685:16 4 (raise-exception _ #:continuable? _)
> 1685:16 3 (raise-exception _ #:continuable? _)
> 1780:13 2 (_ #<&compound-exception components: (#<&error> #<&irri…>)
> 1685:16 1 (raise-exception _ #:continuable? _)
> 1685:16 0 (raise-exception _ #:continuable? _)
>
> ice-9/boot-9.scm:1685:16: In procedure raise-exception:
> Throw to key `bad-response' with args `("EOF while reading response body: ~a bytes of ~a" (26017808 37842870))'.

Ah indeed, this is poorly handled.

I’m not really sure how to address it. I/O ports are a nice abstraction
as it allows you to transparently read “streams” from any medium, but as
always, that also comes with opacity where the call site is not supposed
to know what kind of exceptions might be thrown deep down.

Thoughts?

Ludo’.
L
L
Ludovic Courtès wrote on 28 Jun 2021 14:15
control message for bug #49132
(address . control@debbugs.gnu.org)
875yxxwv1b.fsf@gnu.org
severity 49132 important
quit
M
M
Maxime Devos wrote on 8 Jun 2023 14:17
Re: guix substitute: backtrace when the network is disabled during substitution
423696ef-97b8-f67a-58e5-795a0df5a343@telenet.be
Toggle quote (8 lines)
> Ah indeed, this is poorly handled.
>
> I’m not really sure how to address it. I/O ports are a nice abstraction
> as it allows you to transparently read “streams” from any medium, but as
> always, that also comes with opacity where the call site is not supposed
> to know what kind of exceptions might be thrown deep down.
>
> Thoughts?
About 'as always, [...]’: [citation needed]. AFAICT, nowhere does Guile
documentation state they they aren't supposed to know.
Also, that seems a bad supposition to me if it prevents fixing the bug.
I would just ignore that 'not supposed to’.
I think this supposition needs some adjustment in order to be practical:
guix/scripts/substitute.scm has opened the network connection (via
http-client), so guix/scripts/substitute.scm is responsible for the
connection (unless it delegates of course), so
guix/scripts/substitute.scm is supposed to know what to do about
exceptions involving that connection (unless it delegates).
That there are some intermediate modules before things are actually read
from the port is irrelevant -- substitute.scm opened the port, is
responsible for the port and knows best how to handle exceptional
situations involving that port.
Nothing lower-level has the right context/information to make a good
decision on how to handle the exception, so no delegation possible. As
such, guix/scripts/substitute.scm should do it.
It's not 100% clear from the backtrace, but it appears that the
exception (from guix/scripts/substitute.scm perspective) happens at:
;; Procedure: process-substitution
;; Unpack the Nar at INPUT into DESTINATION.
(define cpu-usage
(with-cpu-usage-monitoring
(restore-file hashed destination
#:dump-file (if (and destination-in-store?
deduplicate?)
dump-file/deduplicate*
dump-file))))
This should then be wrapped in an error handler catching 'bad-response',
maybe reporting it, and switching over the next URL.
Greetings,
Maxime.
Attachment: OpenPGP_signature
?