* read.c does not close gzipped files; leaks memory
@ 2018-02-23 19:28 Wolfgang Müller
2018-02-23 22:13 ` Ingo Schwarze
0 siblings, 1 reply; 2+ messages in thread
From: Wolfgang Müller @ 2018-02-23 19:28 UTC (permalink / raw)
To: tech
[-- Attachment #1.1: Type: text/plain, Size: 361 bytes --]
Hi,
I'm using mandoc on a system with a significant quantity of gzipped
manpages, and noticed its memory usage being unusually high. After
investigating, I found that in read_whole_file, mandoc opens the gzipped
file, but then never closes it before returning from the function.
Find attached a patch which solves this issue.
--
Wolfgang Müller
[-- Attachment #1.2: mandoc-gzleak.patch --]
[-- Type: text/x-diff, Size: 777 bytes --]
--- read.c.orig 2018-02-18 07:27:07.124135724 +0000
+++ read.c 2018-02-18 07:34:02.832965838 +0000
@@ -556,6 +556,7 @@
gzFile gz;
size_t off;
ssize_t ssz;
+ int end;
if (fstat(fd, &st) == -1) {
mandoc_vmsg(MANDOCERR_FILE, curp, 0, 0,
@@ -598,6 +599,7 @@
*with_mmap = 0;
off = 0;
+ end = 0;
fb->sz = 0;
fb->buf = NULL;
for (;;) {
@@ -614,7 +616,8 @@
read(fd, fb->buf + (int)off, fb->sz - off);
if (ssz == 0) {
fb->sz = off;
- return 1;
+ end = 1;
+ break;
}
if (ssz == -1) {
mandoc_vmsg(MANDOCERR_FILE, curp, 0, 0,
@@ -624,6 +627,12 @@
off += (size_t)ssz;
}
+ if (curp->gzip)
+ gzclose(gz);
+
+ if (end)
+ return 1;
+
free(fb->buf);
fb->buf = NULL;
return 0;
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 2+ messages in thread
* Re: read.c does not close gzipped files; leaks memory
2018-02-23 19:28 read.c does not close gzipped files; leaks memory Wolfgang Müller
@ 2018-02-23 22:13 ` Ingo Schwarze
0 siblings, 0 replies; 2+ messages in thread
From: Ingo Schwarze @ 2018-02-23 22:13 UTC (permalink / raw)
To: Wolfgang Mueller; +Cc: tech
Hello Wolfgang,
Wolfgang Mueller wrote on Fri, Feb 23, 2018 at 07:28:10PM +0000:
> I'm using mandoc on a system with a significant quantity of gzipped
> manpages, and noticed its memory usage being unusually high. After
> investigating, I found that in read_whole_file, mandoc opens the gzipped
> file, but then never closes it before returning from the function.
>
> Find attached a patch which solves this issue.
Thanks a lot for noticing the bug, for tracking down the root cause,
for providing a concise explanation, and for even providing a patch.
That is very helpful!
I committed your patch with a number of tweaks, see below.
The main problem with your version was that it also closed the
original file descriptor, which this function is not supposed to do.
Besides, error reporting was incomplete.
Yours,
Ingo
P.S.
I'm surprised to see that the zlib API is such a horrible mess -
and that the manual page is even worse:
- Documentation of gzread(3) doesn't mention how errors are reported.
Finding out required experimentation (or reading the code).
- The mixture of reporting some errors via gzerror(3) and others
via errno does not look like reasonable design. It results
in complicated and ugly application code.
- The mixture of some functions returning gzerror values, some
passing them out via pointer arguments, and the competing
concept of retrieving them after the fact via gzerror(3) feels
like redundant and badly inconsistent API design and causes
application code to look inconsistent, too.
- What gzerror(3) reports is not particularly helpful. Invalidly
changed bytes in the middle of a file merely result in "data error" -
arguably accurate, but mildly helpful at best, whereas an empty
file results in "buffer error", which is totally misleading.
- The manual page explicitly recommends gzerror(3) after gzclose(3)
failure, but that actually crashes (with segfault, bus error or
similar) because gzclose(3) trashes the gzFile object even on
failure.
- The function you actually need after gzclose(3) failure,
zError(3), is weirdly named and almost undocumented.
- There doesn't appear to be a way to disable the behaviour of
opening the file uncompressed when it is not in zlib format.
But what can we do, we have to use the API as it is. Probably it was
a mistake to rely on zlib, given how ill-designed it is, the resulting
ugliness taints the core mandoc code. But now it seems too late to
throw out zlib support again, i guess too many people already got
used to having support for gzipped manual pages...
Log Message:
-----------
After opening a file with gzdopen(3), we have to call gzclose(3) or
we leak memory internally used by zlib to keep compression state.
Bug reported by Wolfgang Mueller <vehk at vehk dot de> who also
provided an incomplete patch, part of which i'm using in this commit.
Modified Files:
--------------
mandoc:
read.c
Revision Data
-------------
Index: read.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/read.c,v
retrieving revision 1.193
retrieving revision 1.194
diff -Lread.c -Lread.c -u -p -r1.193 -r1.194
--- read.c
+++ read.c
@@ -556,6 +556,7 @@ read_whole_file(struct mparse *curp, con
gzFile gz;
size_t off;
ssize_t ssz;
+ int gzerrnum, retval;
if (fstat(fd, &st) == -1) {
mandoc_vmsg(MANDOCERR_FILE, curp, 0, 0,
@@ -583,9 +584,22 @@ read_whole_file(struct mparse *curp, con
}
if (curp->gzip) {
+ /*
+ * Duplicating the file descriptor is required
+ * because we will have to call gzclose(3)
+ * to free memory used internally by zlib,
+ * but that will also close the file descriptor,
+ * which this function must not do.
+ */
+ if ((fd = dup(fd)) == -1) {
+ mandoc_vmsg(MANDOCERR_FILE, curp, 0, 0,
+ "dup: %s", strerror(errno));
+ return 0;
+ }
if ((gz = gzdopen(fd, "rb")) == NULL) {
mandoc_vmsg(MANDOCERR_FILE, curp, 0, 0,
"gzdopen: %s", strerror(errno));
+ close(fd);
return 0;
}
} else
@@ -598,6 +612,7 @@ read_whole_file(struct mparse *curp, con
*with_mmap = 0;
off = 0;
+ retval = 0;
fb->sz = 0;
fb->buf = NULL;
for (;;) {
@@ -614,19 +629,29 @@ read_whole_file(struct mparse *curp, con
read(fd, fb->buf + (int)off, fb->sz - off);
if (ssz == 0) {
fb->sz = off;
- return 1;
+ retval = 1;
+ break;
}
if (ssz == -1) {
- mandoc_vmsg(MANDOCERR_FILE, curp, 0, 0,
- "read: %s", strerror(errno));
+ if (curp->gzip)
+ (void)gzerror(gz, &gzerrnum);
+ mandoc_vmsg(MANDOCERR_FILE, curp, 0, 0, "read: %s",
+ curp->gzip && gzerrnum != Z_ERRNO ?
+ zError(gzerrnum) : strerror(errno));
break;
}
off += (size_t)ssz;
}
- free(fb->buf);
- fb->buf = NULL;
- return 0;
+ if (curp->gzip && (gzerrnum = gzclose(gz)) != Z_OK)
+ mandoc_vmsg(MANDOCERR_FILE, curp, 0, 0, "gzclose: %s",
+ gzerrnum == Z_ERRNO ? strerror(errno) :
+ zError(gzerrnum));
+ if (retval == 0) {
+ free(fb->buf);
+ fb->buf = NULL;
+ }
+ return retval;
}
static void
--
To unsubscribe send an email to tech+unsubscribe@mandoc.bsd.lv
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2018-02-23 22:13 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-23 19:28 read.c does not close gzipped files; leaks memory Wolfgang Müller
2018-02-23 22:13 ` Ingo Schwarze
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).