Discussion:
[landley/toybox] xargs: exec ... Argument list too long (#40)
(too old to reply)
Rob Landley
2016-08-10 18:28:58 UTC
Permalink
Raw Message
.... stat ....|, yields the error |Argument list too long|.
Some googlfu shows that this is likely related to the argument exceeding
the kernels ARG_MAX value.
Androids ARG_MAX according to this 131072
<https://github.com/android/platform_bionic/blob/master/libc/kernel/uapi/linux/limits.h>.
The Linux kernel removed that limit in 2007
(https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b6a2fea39318
went into the 2.6.22 release).

That said, let's cd to a directory with a lot of files in it:

$ find . | wc
68971 69024 3159459
$ find . | toybox xargs echo
xargs: exec echo: Argument list too long

And I'm seeing a limit here too. Hmmm...
Manually limiting the xargs argument length using |-s bytes| fixes this
for me.
The kernel should not be enforcing this? Maybe something else is? There
isn't an obvious ulimit for this... Ah, in include/uapi/linux/limits.h:

#define ARG_MAX 131072 /* # bytes of args + environ for exec() */

And that's used in fs/exec.c:

/*
* We've historically supported up to 32 pages (ARG_MAX)
* of argument strings even with small stacks
*/
if (size <= ARG_MAX)
return page;

/*
* Limit to 1/4-th the stack size for the argv+env strings.
* This ensures that:
* - the remaining binfmt code will not run out of stack space,
* - the program will have a reasonable amount of stack left
* to work from.
*/
rlim = current->signal->rlim;
if (size > ACCESS_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4) {
put_page(page);
return NULL;
}

Ok, so that's still there as a MINIMUM, and the maximum is 1/4 the
current stack size ulimit. According to ulimit -s that defaults to 8192
kilobytes (at least on xubuntu 14.04), so 1/4 of 8 megs is 2 megs. And...

$ find . | head -n 40000 | wc
40000 40002 1853620
$ find . | head -n 40000 | toybox xargs echo | wc
1 40002 1853620

Yup, the _actual_ default limit is way more than 131072, and that
default is arbitrarily adjustable even for normal users:

$ ulimit -Hs
unlimited
$ ulimit -s 9999999
$ find . | toybox xargs echo | wc
1 69023 3159436

(Huh, what happens if I set the stack size limit to _less_ than 131072
and then try to exec stuff? Not trying it on my work netbook just now,
thanks...)
I've looked at the code for xarg in toybox but could not figure out yet
how the max argument length is determined.
It didn't enforce a default maximum length limit because as far as I
knew the kernel hadn't required one for 9 years. I worked out the math
for enforcing such a limit back when I first did xargs:

http://landley.net/notes-2011.html#17-12-2011

So it's not hard to add a default, but what the default should be isn't
obvious. Back when I noticed the kernel had changed its logic I
_checked_ that I could feed it more than 131072 bytes, and I could, so I
removed the limit. (Not just from xargs, but from find in commit
aa784b09a9fb and possibly elsewhere.)
Busybox doesn't seem to run into this issue using the same command on
the same device. Busybox seems to have some sanity checks
<https://git.busybox.net/busybox/tree/findutils/xargs.c#n529>, adjusting
the argument length if necessary.
The sanity checks are there (-s is implemented), toybox just doesn't
have a default value if you don't specify one.
Could there be an issue with how toybox's |xarg| determines the default
argument length on Android?
See above.

There's a sysconf(_SC_ARG_MAX), which does appear to be calculating this
properly (at least built against glibc, musl is giving me 131072 and I
still haven't got a host bionic toolchain because building bionic
outside of AOSP was nontrivial when I looked at it).

Oh well, use the sysconf and then file a bug against musl. (Elliott will
probably notice this, and if not he'll notice the commit.)

Thanks for the heads up,

Rob
enh
2016-08-10 20:08:04 UTC
Permalink
Raw Message
Post by Rob Landley
.... stat ....|, yields the error |Argument list too long|.
Some googlfu shows that this is likely related to the argument exceeding
the kernels ARG_MAX value.
Androids ARG_MAX according to this 131072
<https://github.com/android/platform_bionic/blob/master/libc/kernel/uapi/linux/limits.h>.
The Linux kernel removed that limit in 2007
(https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b6a2fea39318
went into the 2.6.22 release).
$ find . | wc
68971 69024 3159459
$ find . | toybox xargs echo
xargs: exec echo: Argument list too long
And I'm seeing a limit here too. Hmmm...
Manually limiting the xargs argument length using |-s bytes| fixes this
for me.
The kernel should not be enforcing this? Maybe something else is? There
#define ARG_MAX 131072 /* # bytes of args + environ for exec() */
/*
* We've historically supported up to 32 pages (ARG_MAX)
* of argument strings even with small stacks
*/
if (size <= ARG_MAX)
return page;
/*
* Limit to 1/4-th the stack size for the argv+env strings.
* - the remaining binfmt code will not run out of stack space,
* - the program will have a reasonable amount of stack left
* to work from.
*/
rlim = current->signal->rlim;
if (size > ACCESS_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4) {
put_page(page);
return NULL;
}
Ok, so that's still there as a MINIMUM, and the maximum is 1/4 the
current stack size ulimit. According to ulimit -s that defaults to 8192
kilobytes (at least on xubuntu 14.04), so 1/4 of 8 megs is 2 megs. And...
$ find . | head -n 40000 | wc
40000 40002 1853620
$ find . | head -n 40000 | toybox xargs echo | wc
1 40002 1853620
Yup, the _actual_ default limit is way more than 131072, and that
$ ulimit -Hs
unlimited
$ ulimit -s 9999999
$ find . | toybox xargs echo | wc
1 69023 3159436
(Huh, what happens if I set the stack size limit to _less_ than 131072
and then try to exec stuff? Not trying it on my work netbook just now,
thanks...)
I've looked at the code for xarg in toybox but could not figure out yet
how the max argument length is determined.
It didn't enforce a default maximum length limit because as far as I
knew the kernel hadn't required one for 9 years. I worked out the math
http://landley.net/notes-2011.html#17-12-2011
So it's not hard to add a default, but what the default should be isn't
obvious. Back when I noticed the kernel had changed its logic I
_checked_ that I could feed it more than 131072 bytes, and I could, so I
removed the limit. (Not just from xargs, but from find in commit
aa784b09a9fb and possibly elsewhere.)
Busybox doesn't seem to run into this issue using the same command on
the same device. Busybox seems to have some sanity checks
<https://git.busybox.net/busybox/tree/findutils/xargs.c#n529>, adjusting
the argument length if necessary.
The sanity checks are there (-s is implemented), toybox just doesn't
have a default value if you don't specify one.
Could there be an issue with how toybox's |xarg| determines the default
argument length on Android?
See above.
There's a sysconf(_SC_ARG_MAX), which does appear to be calculating this
properly (at least built against glibc, musl is giving me 131072 and I
still haven't got a host bionic toolchain because building bionic
outside of AOSP was nontrivial when I looked at it).
yeah, bionic currently just returns ARG_MAX. looks lik glibc divides
the stack by 4 as you describe above.

running

#include <unistd.h>
int main() {
return sysconf(_SC_ARG_MAX)/1024;
}

under strace gets me

getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
exit_group(2048) = ?
Post by Rob Landley
Oh well, use the sysconf and then file a bug against musl. (Elliott will
probably notice this, and if not he'll notice the commit.)
yeah, this is as good a bionic bug reporting forum as any :-)
Post by Rob Landley
Thanks for the heads up,
Rob
_______________________________________________
Toybox mailing list
http://lists.landley.net/listinfo.cgi/toybox-landley.net
--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
enh
2016-08-13 21:36:36 UTC
Permalink
Raw Message
Post by enh
Post by Rob Landley
.... stat ....|, yields the error |Argument list too long|.
Some googlfu shows that this is likely related to the argument exceeding
the kernels ARG_MAX value.
Androids ARG_MAX according to this 131072
<https://github.com/android/platform_bionic/blob/master/libc/kernel/uapi/linux/limits.h>.
The Linux kernel removed that limit in 2007
(https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b6a2fea39318
went into the 2.6.22 release).
$ find . | wc
68971 69024 3159459
$ find . | toybox xargs echo
xargs: exec echo: Argument list too long
And I'm seeing a limit here too. Hmmm...
Manually limiting the xargs argument length using |-s bytes| fixes this
for me.
The kernel should not be enforcing this? Maybe something else is? There
#define ARG_MAX 131072 /* # bytes of args + environ for exec() */
/*
* We've historically supported up to 32 pages (ARG_MAX)
* of argument strings even with small stacks
*/
if (size <= ARG_MAX)
return page;
/*
* Limit to 1/4-th the stack size for the argv+env strings.
* - the remaining binfmt code will not run out of stack space,
* - the program will have a reasonable amount of stack left
* to work from.
*/
rlim = current->signal->rlim;
if (size > ACCESS_ONCE(rlim[RLIMIT_STACK].rlim_cur) / 4) {
put_page(page);
return NULL;
}
Ok, so that's still there as a MINIMUM, and the maximum is 1/4 the
current stack size ulimit. According to ulimit -s that defaults to 8192
kilobytes (at least on xubuntu 14.04), so 1/4 of 8 megs is 2 megs. And...
$ find . | head -n 40000 | wc
40000 40002 1853620
$ find . | head -n 40000 | toybox xargs echo | wc
1 40002 1853620
Yup, the _actual_ default limit is way more than 131072, and that
$ ulimit -Hs
unlimited
$ ulimit -s 9999999
$ find . | toybox xargs echo | wc
1 69023 3159436
(Huh, what happens if I set the stack size limit to _less_ than 131072
and then try to exec stuff? Not trying it on my work netbook just now,
thanks...)
I've looked at the code for xarg in toybox but could not figure out yet
how the max argument length is determined.
It didn't enforce a default maximum length limit because as far as I
knew the kernel hadn't required one for 9 years. I worked out the math
http://landley.net/notes-2011.html#17-12-2011
So it's not hard to add a default, but what the default should be isn't
obvious. Back when I noticed the kernel had changed its logic I
_checked_ that I could feed it more than 131072 bytes, and I could, so I
removed the limit. (Not just from xargs, but from find in commit
aa784b09a9fb and possibly elsewhere.)
Busybox doesn't seem to run into this issue using the same command on
the same device. Busybox seems to have some sanity checks
<https://git.busybox.net/busybox/tree/findutils/xargs.c#n529>, adjusting
the argument length if necessary.
The sanity checks are there (-s is implemented), toybox just doesn't
have a default value if you don't specify one.
Could there be an issue with how toybox's |xarg| determines the default
argument length on Android?
See above.
There's a sysconf(_SC_ARG_MAX), which does appear to be calculating this
properly (at least built against glibc, musl is giving me 131072 and I
still haven't got a host bionic toolchain because building bionic
outside of AOSP was nontrivial when I looked at it).
yeah, bionic currently just returns ARG_MAX. looks lik glibc divides
the stack by 4 as you describe above.
running
#include <unistd.h>
int main() {
return sysconf(_SC_ARG_MAX)/1024;
}
under strace gets me
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
exit_group(2048) = ?
Post by Rob Landley
Oh well, use the sysconf and then file a bug against musl. (Elliott will
probably notice this, and if not he'll notice the commit.)
yeah, this is as good a bionic bug reporting forum as any :-)
fixed in AOSP. (took a while to kill code in the internal tree making
bad assumptions about ARG_MAX...)
Post by enh
Post by Rob Landley
Thanks for the heads up,
Rob
_______________________________________________
Toybox mailing list
http://lists.landley.net/listinfo.cgi/toybox-landley.net
--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
Rob Landley
2016-08-15 07:06:12 UTC
Permalink
Raw Message
Post by enh
Post by enh
Post by Rob Landley
Oh well, use the sysconf and then file a bug against musl. (Elliott will
probably notice this, and if not he'll notice the commit.)
yeah, this is as good a bionic bug reporting forum as any :-)
fixed in AOSP. (took a while to kill code in the internal tree making
bad assumptions about ARG_MAX...)
I've gotta add the measuring code back to find.

There's also this fun little corner case from susv4's xargs page
Post by enh
The xargs utility shall limit the command line length such that when
the command line is invoked, the combined argument and environment
lists (see the exec family of functions in the System Interfaces
volume of POSIX.1-2008) shall not exceed {ARG_MAX}-2048 bytes.
Within this constraint, if neither the -n nor the -s option is
specified, the default command line length shall be at least
{LINE_MAX}.
So they want a gratuitous -2048 constant in there. I don't know if this
is because they're _not_ measuring the space used by the actual argv[]
and envp[] pointer arrays, or because they're leaving space for the new
command to add extra environment data to itself, or...?

I'm leaning towards doing another "nuts to your white mice" at posix's
magic constant here and just actually measuring the real-world things
they never mention the existence of. Or I could just hardwire a random
default like they suggest and let people -s specify it if they know what
they're doing. (Alas, I've run out of weekend...)

Rob
enh
2017-09-20 20:59:05 UTC
Permalink
Raw Message
this bug was reported again internally this week (within a day of a
year and a month). patch for xargs attached.



Subject: [PATCH] Fix xargs to obey POSIX's ARG_MAX restrictions.

This avoids "xargs: exec echo: Argument list too long" errors in practice.

find(1) needs to be fixed too, but that's a bit more complicated and a working
xargs provides a workaround.

Bug: http://b/65818597
Test: find /proc | strace -f -e execve ./toybox xargs echo > /dev/null
---
lib/lib.c | 13 +++++++++++++
lib/lib.h | 1 +
toys/posix/xargs.c | 14 +++++++++++++-
3 files changed, 27 insertions(+), 1 deletion(-)
Post by Rob Landley
Post by enh
Post by enh
Post by Rob Landley
Oh well, use the sysconf and then file a bug against musl. (Elliott will
probably notice this, and if not he'll notice the commit.)
yeah, this is as good a bionic bug reporting forum as any :-)
fixed in AOSP. (took a while to kill code in the internal tree making
bad assumptions about ARG_MAX...)
I've gotta add the measuring code back to find.
There's also this fun little corner case from susv4's xargs page
Post by enh
The xargs utility shall limit the command line length such that when
the command line is invoked, the combined argument and environment
lists (see the exec family of functions in the System Interfaces
volume of POSIX.1-2008) shall not exceed {ARG_MAX}-2048 bytes.
Within this constraint, if neither the -n nor the -s option is
specified, the default command line length shall be at least
{LINE_MAX}.
So they want a gratuitous -2048 constant in there. I don't know if this
is because they're _not_ measuring the space used by the actual argv[]
and envp[] pointer arrays, or because they're leaving space for the new
command to add extra environment data to itself, or...?
I'm leaning towards doing another "nuts to your white mice" at posix's
magic constant here and just actually measuring the real-world things
they never mention the existence of. Or I could just hardwire a random
default like they suggest and let people -s specify it if they know what
they're doing. (Alas, I've run out of weekend...)
Rob
--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
enh
2017-09-21 15:40:43 UTC
Permalink
Raw Message
i should probably have also said (since it was contentious before)
that i went with the -2048 not because i have a use for it, but
because it's POSIX, it's what all existing implementations do, it's
trivial, and it's at least a year (or "never") until i can update the
toybox on a device if we do ever find someone relying on this safety
margin.

(the guy who owns the NDK and has to regularly test on ancient devices
may thank me for this some day.)
Post by enh
this bug was reported again internally this week (within a day of a
year and a month). patch for xargs attached.
Subject: [PATCH] Fix xargs to obey POSIX's ARG_MAX restrictions.
This avoids "xargs: exec echo: Argument list too long" errors in practice.
find(1) needs to be fixed too, but that's a bit more complicated and a working
xargs provides a workaround.
Bug: http://b/65818597
Test: find /proc | strace -f -e execve ./toybox xargs echo > /dev/null
---
lib/lib.c | 13 +++++++++++++
lib/lib.h | 1 +
toys/posix/xargs.c | 14 +++++++++++++-
3 files changed, 27 insertions(+), 1 deletion(-)
Post by Rob Landley
Post by enh
Post by enh
Post by Rob Landley
Oh well, use the sysconf and then file a bug against musl. (Elliott will
probably notice this, and if not he'll notice the commit.)
yeah, this is as good a bionic bug reporting forum as any :-)
fixed in AOSP. (took a while to kill code in the internal tree making
bad assumptions about ARG_MAX...)
I've gotta add the measuring code back to find.
There's also this fun little corner case from susv4's xargs page
Post by enh
The xargs utility shall limit the command line length such that when
the command line is invoked, the combined argument and environment
lists (see the exec family of functions in the System Interfaces
volume of POSIX.1-2008) shall not exceed {ARG_MAX}-2048 bytes.
Within this constraint, if neither the -n nor the -s option is
specified, the default command line length shall be at least
{LINE_MAX}.
So they want a gratuitous -2048 constant in there. I don't know if this
is because they're _not_ measuring the space used by the actual argv[]
and envp[] pointer arrays, or because they're leaving space for the new
command to add extra environment data to itself, or...?
I'm leaning towards doing another "nuts to your white mice" at posix's
magic constant here and just actually measuring the real-world things
they never mention the existence of. Or I could just hardwire a random
default like they suggest and let people -s specify it if they know what
they're doing. (Alas, I've run out of weekend...)
Rob
--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
Rob Landley
2017-09-23 13:14:11 UTC
Permalink
Raw Message
Post by enh
i should probably have also said (since it was contentious before)
that i went with the -2048 not because i have a use for it, but
because it's POSIX, it's what all existing implementations do, it's
trivial, and it's at least a year (or "never") until i can update the
toybox on a device if we do ever find someone relying on this safety
margin.
(the guy who owns the NDK and has to regularly test on ancient devices
may thank me for this some day.)
Sorry, my turn to be sick, but I should be able to catch up this weekend...

Rob
enh
2017-09-25 17:09:32 UTC
Permalink
Raw Message
i did find some time (no pun intended) to look at the equivalent bug
in find on friday, but spent it all on a couple of ratholes i
discovered trying to set up a test :-(
Post by Rob Landley
Post by enh
i should probably have also said (since it was contentious before)
that i went with the -2048 not because i have a use for it, but
because it's POSIX, it's what all existing implementations do, it's
trivial, and it's at least a year (or "never") until i can update the
toybox on a device if we do ever find someone relying on this safety
margin.
(the guy who owns the NDK and has to regularly test on ancient devices
may thank me for this some day.)
Sorry, my turn to be sick, but I should be able to catch up this weekend...
Rob
--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
Rob Landley
2017-09-29 23:53:07 UTC
Permalink
Raw Message
And after deleting the mailing list archive again, dreamhost restored
from a 3 day old backup that doesn't contain this entire thread.

Wheee...

Rob
Did you look at my xargs patch in this thread? It handles all that. I
think the find patch is pretty similar, I just haven't had time to set
up a working test case. (Or rather I fell into ratholes while doing so.)
Sorry, yes I did get time to look at this, I just haven't _fixed_ it
yet.
The problem is that environment variables and arguments both fit in the
same space, which is dynamic at 1/4 the stack size limit as mentioned at
http://lists.landley.net/pipermail/toybox-landley.net/2016-August/008592.html
<http://lists.landley.net/pipermail/toybox-landley.net/2016-August/008592.html>
last time around.
Which means that _any_ static size we enforce isn't guaranteed to work,
if you have a lot of environment variables defined (or just one really
big one). Environment varaibles can even consume the space to the point
you can't launch a _single_ argument. (If it's less than PATH_MAX, we
have a problem.)
So I think it has to probe, and be dynamic, and at least not _loop_ when
find produces a single long name that's more than the remaining
environment space to launch a child. (I.E. test the corner case that the
maximum number of args we can fit is zero.)
Space consumption should be exactly sizeof(char *)*(argc+1) +
sizeof(char *)+(envc+1) + the strlen+1 of all the argv[] + the strlen+1
of all the envp[]. I need to test that's still the case (it used to be
when the limit was hardwired 32*4096).
Sorry for the delay, still juggling way more things than I have energy
for. :(
Rob
Rob Landley
2017-10-01 22:31:59 UTC
Permalink
Raw Message
Did you look at my xargs patch in this thread?
Not closely enough. (Description implied it was the same as last time.)
It handles all that. I think the find patch is pretty similar, I just
haven't had time to set up a working test case. (Or rather I fell into
ratholes while doing so.)
I think this year has been one big rathole for me. :)

Ok, first I should have invoked my "I took too long, apply the submitted
patch as-is, clean it up later" rule a week ago. (It conflicted with
stuff in lib.h and lib.c I was doing, but I've reverted that and can
re-apply after.)

Done and pushed, BUT: I think it's still the wrong fix. I think my
fundamental objection is:

$ seq 1 200000 | tr '\n' x | xargs
xargs: argument line too long
$ echo $(seq 1 200000 | tr '\n' x) | wc
1 1 1288896

Why does the first _not_ work when the second can? (I note that's the
Ubuntu xargs failing; I think toybox should do better than that.)

You're calling _SC_ARG_MAX which at least on musl returns hardwired
131072, when it should return 1/4 of the stack size limit. (cc'd Rich to
see what he thinks. See the URLs at the end for earlier discussion of
digging into the kernel, the real limit the kernel enforces these days
is 1/4 the stack size ulimit, which defaults to 8388608 on ubuntu 14.04,
1/4 of which is 2 megabytes. I dunno if that's because this netbook has
8 gigs though? It might scale, haven't checked...) Fixing libc to probe
the real size is an acceptable fix, otherwise toybox should do it.

The guaranteed 2k space left over makes sense, although the "can a
single file still advance" issue should be able to cut into that as
necessary. Which would argue for it being PATH_MAX (I.E. the historical
4k) space reserved. But reality can still blow through that, filenames
go arbitrarily deep (or non-filename arguments, as in the above
example). So 2k's no sillier than anything else. :)

And tests/xargs.test needs size tests making sure we calculate the
corner cases exactly. "printf '%0*d' $LEN 0" is probably useful here.
And yes, it works in toybox printf. :)

Rob

P.S. One more FUN corner case is that ulimit -s is a soft limit, the
hard limit is unlimited. So a normal user can ulimit -s 999999999 and
it's accepted. So xargs _could_ finagle this itself if it wanted, at
least for the "one big argument that doesn't fit by itself" case,
although here there probably be dragons. xargs -f anyone? :)
Sorry, yes I did get time to look at this, I just haven't _fixed_ it
yet.
The problem is that environment variables and arguments both fit in the
same space, which is dynamic at 1/4 the stack size limit as mentioned at
http://lists.landley.net/pipermail/toybox-landley.net/2016-August/008592.html
<http://lists.landley.net/pipermail/toybox-landley.net/2016-August/008592.html>
last time around.
Which means that _any_ static size we enforce isn't guaranteed to work,
if you have a lot of environment variables defined (or just one really
big one). Environment varaibles can even consume the space to the point
you can't launch a _single_ argument. (If it's less than PATH_MAX, we
have a problem.)
So I think it has to probe, and be dynamic, and at least not _loop_ when
find produces a single long name that's more than the remaining
environment space to launch a child. (I.E. test the corner case that the
maximum number of args we can fit is zero.)
Space consumption should be exactly sizeof(char *)*(argc+1) +
sizeof(char *)+(envc+1) + the strlen+1 of all the argv[] + the strlen+1
of all the envp[]. I need to test that's still the case (it used to be
when the limit was hardwired 32*4096).
Sorry for the delay, still juggling way more things than I have energy
for. :(
Rob
enh
2017-10-02 15:20:39 UTC
Permalink
Raw Message
Post by Rob Landley
Did you look at my xargs patch in this thread?
Not closely enough. (Description implied it was the same as last time.)
It handles all that. I think the find patch is pretty similar, I just
haven't had time to set up a working test case. (Or rather I fell into
ratholes while doing so.)
I think this year has been one big rathole for me. :)
Ok, first I should have invoked my "I took too long, apply the submitted
patch as-is, clean it up later" rule a week ago. (It conflicted with
stuff in lib.h and lib.c I was doing, but I've reverted that and can
re-apply after.)
Done and pushed, BUT: I think it's still the wrong fix. I think my
$ seq 1 200000 | tr '\n' x | xargs
xargs: argument line too long
$ echo $(seq 1 200000 | tr '\n' x) | wc
1 1 1288896
Why does the first _not_ work when the second can? (I note that's the
Ubuntu xargs failing; I think toybox should do better than that.)
You're calling _SC_ARG_MAX which at least on musl returns hardwired
131072, when it should return 1/4 of the stack size limit. (cc'd Rich to
see what he thinks. See the URLs at the end for earlier discussion of
digging into the kernel, the real limit the kernel enforces these days
is 1/4 the stack size ulimit, which defaults to 8388608 on ubuntu 14.04,
1/4 of which is 2 megabytes. I dunno if that's because this netbook has
8 gigs though? It might scale, haven't checked...) Fixing libc to probe
the real size is an acceptable fix, otherwise toybox should do it.
this seems to be a clear case of a musl bug, though, not something
toybox should be trying to reimplement itself.
Post by Rob Landley
The guaranteed 2k space left over makes sense, although the "can a
single file still advance" issue should be able to cut into that as
necessary. Which would argue for it being PATH_MAX (I.E. the historical
4k) space reserved. But reality can still blow through that, filenames
go arbitrarily deep (or non-filename arguments, as in the above
example). So 2k's no sillier than anything else. :)
And tests/xargs.test needs size tests making sure we calculate the
corner cases exactly. "printf '%0*d' $LEN 0" is probably useful here.
And yes, it works in toybox printf. :)
Rob
P.S. One more FUN corner case is that ulimit -s is a soft limit, the
hard limit is unlimited. So a normal user can ulimit -s 999999999 and
it's accepted. So xargs _could_ finagle this itself if it wanted, at
least for the "one big argument that doesn't fit by itself" case,
although here there probably be dragons. xargs -f anyone? :)
Sorry, yes I did get time to look at this, I just haven't _fixed_ it
yet.
The problem is that environment variables and arguments both fit in the
same space, which is dynamic at 1/4 the stack size limit as mentioned at
http://lists.landley.net/pipermail/toybox-landley.net/2016-August/008592.html
<http://lists.landley.net/pipermail/toybox-landley.net/2016-August/008592.html>
last time around.
Which means that _any_ static size we enforce isn't guaranteed to work,
if you have a lot of environment variables defined (or just one really
big one). Environment varaibles can even consume the space to the point
you can't launch a _single_ argument. (If it's less than PATH_MAX, we
have a problem.)
So I think it has to probe, and be dynamic, and at least not _loop_ when
find produces a single long name that's more than the remaining
environment space to launch a child. (I.E. test the corner case that the
maximum number of args we can fit is zero.)
Space consumption should be exactly sizeof(char *)*(argc+1) +
sizeof(char *)+(envc+1) + the strlen+1 of all the argv[] + the strlen+1
of all the envp[]. I need to test that's still the case (it used to be
when the limit was hardwired 32*4096).
Sorry for the delay, still juggling way more things than I have energy
for. :(
Rob
--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
Loading...