Discussion:
New tests for dirname and wc
Rob Landley
2012-11-26 04:39:41 UTC
The discussion keeps wandering off the list. :)
Understood. Setting the environment variable in the test script makes
sense, I'll add a comment.
? The environment is only read in order to check for the locale. It
might be better to use locale(1) though.
I've gotten distracted a bit: I got a "touch" contribution that needed
a lot of cleanup (I'm about 2/3 of the way done), and then I need to
clean up "cut", and I'm properly fixing "cp", which needed the
equivalent of readlink -f in order to do duplicate detection (I can
catch "cp * ." by checking stat info, but infinite recursion ala
"cp -R .. ." requires absolute path comparisons with symlinks resolved),
and when I poked at readlink -f it was unfinished, and then the test
suite failed on the rewrite, and the test suite actually needs to be
extended for the new -m and -e options...
The usual. :)
:)
I note I just checked in the touch cleanup, and blogged a chunk of the
rationale at http://landley.net/notes-2012.html#23-11-2012 but when I
really get the bit in my teeth it's hard to write up my rationale for
doing stuff _or_ to do the test suite. Both of which are kinda good
things to do.

I had to redo touch (from where I left off cleaning it up) because:

A) getdate() didn't do what I thought it did.

It hasn't got any built in date-guessing formats, you have to point it
at a file of them. There doesn't seem to be a system standard file or
anything built into libc, so it was kinda useless. The man page for the
gnu/dammit version of date says -d can guess an arbitrary date code,
but no idea how.

So I've got to go to the two specific date formats posix lists for -d
and -t, but I don't have to open code it like the initial contribution
did, I can use strptime() out of libc.
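
A minimal sketch of the strptime() approach mentioned above, not the code that was checked in. The helper names parse_d_format() and parse_t_format() are made up for illustration. Only the full-length CCYYMMDDhhmm[.SS] variant of the -t form is handled, and the optional fraction and time zone suffix of the -d form are left for the caller to deal with.

```c
#define _XOPEN_SOURCE 700
#include <stddef.h>
#include <string.h>
#include <time.h>

// Repeated here so the sketch still compiles if the feature-test macro
// above is seen too late to take effect (it matches the libc prototype).
char *strptime(const char *restrict s, const char *restrict fmt,
               struct tm *restrict tm);

// Hypothetical helper: parse the touch -d form "YYYY-MM-DDThh:mm:SS".
// Returns a pointer to the first unparsed character (for example the
// start of a fraction), or NULL if the string does not match.
static const char *parse_d_format(const char *s, struct tm *tm)
{
  memset(tm, 0, sizeof(*tm));
  tm->tm_isdst = -1;
  return strptime(s, "%Y-%m-%dT%H:%M:%S", tm);
}

// Hypothetical helper: parse the touch -t form "CCYYMMDDhhmm" with an
// optional ".SS" suffix. Returns 0 on success, -1 on a mismatch.
static int parse_t_format(const char *s, struct tm *tm)
{
  const char *end;

  memset(tm, 0, sizeof(*tm));
  tm->tm_isdst = -1;
  end = strptime(s, "%Y%m%d%H%M", tm);
  if (!end) return -1;
  if (*end == '.') end = strptime(end, ".%S", tm);

  // Anything left over means the string was not in the expected form
  return (end && !*end) ? 0 : -1;
}
```

The resulting struct tm can then go through mktime() to get the seconds value that utime() and friends want.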

B) posix says touch should handle multiple files on the command line,
and this only handled one argument.

I wanted to use loopfiles(), but that means opening the file to get a
filehandle, and touch should be able to change the date of chmod 000
files, meaning it can't use futimes but has to use the non-filehandle
version.

Posix also says [.frac] I.E. fractions of a second support. (Computers
are fast now.)
The man page has utime() and utimes() on it and utimes() supports
microsecond resolution. Modern linux supports nanoseconds but
microseconds is probably fine for now. (Eh, I could switch to
utimensat(AT_FDCWD, ...) at some point.)

C) It only supported about half the posix command line arguments, it
needs -a, -c, and -d.

Note that supporting -a and -m means it needs to stat the existing file
to get the current date, modify it, and write it back out.

I also wince when I see repeated code, and try to sit back and find a
way to just have _one_ call to utime() or similar. I don't always
manage it but I do stop and try.
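
As a rough illustration of the stat-then-write-back idea (an assumption about the shape of the code, not the actual toybox implementation), the sketch below keeps whichever timestamp was not requested and ends in a single utimes() call. The TOUCH_ATIME/TOUCH_MTIME bits and touch_times() are hypothetical names, and the sub-second part of the existing timestamps is dropped for brevity.

```c
#define _DEFAULT_SOURCE
#include <sys/stat.h>
#include <sys/time.h>

#define TOUCH_ATIME 1  // hypothetical flag bit: update access time (-a)
#define TOUCH_MTIME 2  // hypothetical flag bit: update modification time (-m)

// Set the requested timestamps on path to "when", keeping the other one.
// Works by path, so the file never has to be opened.
// Returns 0 on success, -1 on error (errno set by stat() or utimes()).
static int touch_times(const char *path, int which, struct timeval when)
{
  struct stat st;
  struct timeval tv[2];  // tv[0] = access time, tv[1] = modification time

  if (stat(path, &st)) return -1;

  // Start from the file's current times (whole seconds only in this sketch)
  tv[0].tv_sec = st.st_atime;
  tv[0].tv_usec = 0;
  tv[1].tv_sec = st.st_mtime;
  tv[1].tv_usec = 0;

  // Overwrite just the ones that were asked for
  if (which & TOUCH_ATIME) tv[0] = when;
  if (which & TOUCH_MTIME) tv[1] = when;

  // One call site, whichever combination of -a and -m was given
  return utimes(path, tv);
}
```

Passing TOUCH_ATIME|TOUCH_MTIME covers the default case where neither option was given.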

I had to dig a bit to find out that the T "time designator" is a
literal T.

I've still got a couple todo items: allow space as a time designator,
allow , as a fraction indicator. (*shrug* Posix.)
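
Those two todo items could look something like the sketch below. It is only a guess at one way to do it: parse_frac_nsec() and is_time_designator() are hypothetical helpers, and the fraction is returned in nanoseconds simply because that is easy to scale down afterwards.

```c
#include <ctype.h>

// Hypothetical helper: parse an optional fraction such as ".5" or ",25"
// that follows the seconds field, and return it in nanoseconds.
// *end is set to the first character after the fraction.
static long parse_frac_nsec(const char *s, const char **end)
{
  long nsec = 0, scale = 100000000;  // value of the first fractional digit

  if (*s == '.' || *s == ',') {
    s++;
    while (isdigit((unsigned char)*s)) {
      nsec += (*s - '0') * scale;  // digits past the 9th contribute 0
      scale /= 10;
      s++;
    }
  }
  *end = s;
  return nsec;
}

// True if c may sit between the date and the time: a literal T or a space.
static int is_time_designator(char c)
{
  return c == 'T' || c == ' ';
}
```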
Conditional on the config symbol enabling UTF-8 support. (That
introduces the possibility of internationalization causing regressions
in other commands, but it does mean they asked for it.)
The unanswerable question is whether UTF-8 support should be enabled by
default. (The bigger the test suite is the more my answer leans towards
yes.)
I think that I would also prefer a default of enabled i18n. Is there a
way to check in test scripts for enabled configuration variables?
Yes, the "optional" shell function. See sort.test or tail.test for
examples, if the config symbol isn't set (in the current build
directory) calls to "testing" after that are NOPed out until the next
call to optional.
(Probably what I should have it do instead is return true/false and use
shell if statements. Right now you can't test for more than one symbol
at a time...)
Thanks. So I made a patch for implementing the option. I was not sure
where exactly to put the setlocale call. Please feel free to move it
where it belongs. Attached is a hg exported patch.
I have applied it. I've got some pending cleanups, but it's revealed
yet more structural limitations in my stuff.

I don't like #ifdefs in C code, I prefer if (CFG_BLAH) {stuff} with
CFG_BLAH becoming a constant 0 and dead code elimination dropping out
{stuff} when configured out. That way it's always syntax checked.

(Note that the CFG_ macros will always be #defined, they're just 1 or
0. So an #ifdef on them will always be true. You can either #if
CFG_BLAH or #ifdef CONFIG_BLAH to actually chop stuff out with the
preprocessor.)

In addition to the CFG_BLAH macros I also made USE_BLAH() macros which
resolve to the contents of the parentheses when BLAH is enabled, and
nothing when BLAH is disabled.
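
A small self-contained sketch of the behavior described above. The macros are written by hand here as stand-ins for the generated ones, and BLAH and OTHER are placeholder symbol names, so treat the details as assumptions rather than as the real generated header.

```c
// Stand-ins for generated config macros, with BLAH switched off
// and OTHER switched on.
#define CFG_BLAH 0
#define USE_BLAH(...)
#define CFG_OTHER 1
#define USE_OTHER(...) __VA_ARGS__

// Option string assembled from pieces that vanish when configured out.
// With the settings above this is "ocwl".
static const char optstring[] = USE_BLAH("b") USE_OTHER("o") "cwl";

static int features(void)
{
  int count = 0;

  // Always parsed and type checked; the optimizer can drop the body
  // when CFG_BLAH is the constant 0.
  if (CFG_BLAH) count += 1;
  if (CFG_OTHER) count += 2;

#ifdef CFG_BLAH
  count += 4;  // pitfall: compiled in, because CFG_BLAH is defined (as 0)
#endif
#if CFG_BLAH
  count += 8;  // compiled out: #if tests the value, not the definition
#endif

  return count;
}
```

With these settings features() adds up the 2 from CFG_OTHER and the 4 from the misleading #ifdef, which is the trap the parenthetical note warns about.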

The downside of letting dead code elimination do the work is that if
the headers are conditionally #included, the function calls have no
prototype. I'm not sure if having the headers unconditionally included
is better, or moving them to lib/portability.h and having an #else case
in the ifdef that copies the relevant prototypes from posix. Probably
unconditional inclusion until somebody complains, then fix it up in
portability.h.

Another problem is that if I stick a USE_BLAH("x") around the config
options my header generation infrastructure is only producing headers
for stuff that's enabled. I.E.

USE_WC(NEWTOY(wc, USE_TOYBOX_I18N("m")"cwl", TOYFLAG_USR|TOYFLAG_BIN))

When I switch off TOYBOX_I18N I don't get FLAG_m defined.

(What I _should_ do is #define FLAG_m to 0 when it's disabled. But
teaching the script to do that, with sed, is complicated. :P )
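
To illustrate why a zero-valued flag would help, here is a hand-written sketch of what such a generated header could contain when the "m" option is configured out. The bit values assume flags are numbered from the right end of the option string, and wants_chars() and wants_lines() are hypothetical callers.

```c
// Hypothetical flag macros for an option string of "mcwl" with the
// "m" option configured out.
#define FLAG_l 1
#define FLAG_w 2
#define FLAG_c 4
#define FLAG_m 0  // disabled: still defined, but no bit can ever match it

// Compiles whether or not "m" is enabled. With FLAG_m == 0 the test is
// a constant false, so the code it guards can be eliminated.
static int wants_chars(unsigned optflags)
{
  return (optflags & FLAG_m) != 0;
}

// An enabled flag behaves as usual.
static int wants_lines(unsigned optflags)
{
  return (optflags & FLAG_l) != 0;
}
```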

So yeah: working on it. :)
I mean that I am painfully aware of my lack of expertise here, but am
trying to close tabs before opening new ones. :)
For example, I dunno if daemontools or runit are the same thing. (And I
try to keep track of where each one came from. sysv init was from
system v, there was a bsd one that kinda fell out of use, upstart was
from ubuntu, systemd was from <strike>Pointy Hair Linux</strike> Red
Hat, android has some home-grown thing, and last I heard gentoo was
threatening to write its own.)
Daemontools and Runit are different (but have the same ideas).
Daemontools are by djb. Runit is by Gerrit Pape. Daemontools are not
really an init but a tool to manage (supervise) services.
I wrote that down. At some point I need to dig into all this and figure
out what to do, but that research project can wait until I've emptied a
couple of the open cans of worms.
(Who puts worms in cans? I'm guessing this is a fishing metaphor...)
I have yet to open this can of worms. Possibly I can just ignore it and
do sysv init.
Oh, and let's not forget the careful failure to actually say anything
http://refspecs.linuxfoundation.org/LSB_4.1.0/LSB-Core-generic/LSB-Core-generic/tocsysinit.html
A standard that doesn't specify. Sigh...
I mean, is there any problem for someone who does not like sysv init to
use their own init instead of toybox's?
Not at all, but at the same time I'd like to provide something people
find useful. "I don't actually expect people to use this tool" means I
probably shouldn't include that tool. (I included oneit because _I_ use
it. And because it solves the problem of booting a single-purpose
system that exists to run one process, such as a single command shell.
Yes you can do init=/myshellscript.sh which I've done a lot (modulo the
bug that the period seems to confuse the kernel's command line argument
parsing), but init has to reap zombies for orphan processes and that
won't, so zombie processes pile up in the resulting system. Plus you
can't ctrl-c out of it.)
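
The zombie-reaping job mentioned above can be sketched roughly as follows. This is not oneit's source, just an assumed minimal shape: run_and_reap() is a hypothetical name, and a real PID 1 would also have to deal with signals and the controlling terminal.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Run argv as a child and reap every process that exits until that
// child itself is done. Returns the child's exit code, or -1 on error
// or if the child was killed by a signal.
// When this runs as PID 1, wait() also returns orphans that were
// reparented to it, so they do not linger as zombies.
static int run_and_reap(char *const argv[])
{
  pid_t child = fork(), pid;
  int status = 0;

  if (child < 0) return -1;
  if (!child) {
    execvp(argv[0], argv);
    _exit(127);  // only reached if the exec failed
  }

  // wait() returns whichever child exited; keep going until it is ours
  for (;;) {
    pid = wait(&status);
    if (pid == child) break;
    if (pid < 0 && errno != EINTR) return -1;
  }

  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A plain shell script used as init never makes that wait() call for processes it did not start itself, which is where the pile of zombies comes from.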
Yeah, it's really fun to ping stuff without job control...
ping -c 3
(And as soon as you forget the -c once, it's reboot time. Honestly, -c
10 or similar should be the _default_, with an option to enable endless
pinging.)
I also discovered "ping -c" some time ago. On Windows ping seems to be
better behaved in that sense...
You're still a typo away from a bricked box. Been there, done that.
(With runit you'd pack your script into /etc/runit/1, have a zombie
killer in /etc/runit/3 and one agetty as a service started by something
like "exec agetty 38400 tty1 linux". Ok, that's kind of more work than
using oneit.)
I like oneit, it's a good tool for what it does. However, it's
intentionally very limited. Going to a full system management daemon...
there are a number of possible approaches. Figuring out what to do is a
major research project.
I dislike systemd if for no other reason than Red Hat's attempts to
shove an ever-growing katamari down people's throats, over the
objections of Ubuntu and Gentoo and so on. (Plenty of other reasons, of
course. PID 1 should not have dependencies. Sucking udev into it is not
an improvement, and Linus Torvalds himself has flamed Kay Sievers
recently http://article.gmane.org/gmane.linux.kernel/1369384 .)
Anyway, "not that".
I should take a look at upstart, which came first. It would be
something to be compatible with. But from the quick glances I've done,
it also seems kinda overcomplicated.
Given that Android already has more deployments than Linux and is still
in the exponential growth part of its S-curve, I _need_ to be familiar
with whatever it is they're doing. Toybox should either provide a good
clean replacement for it, or easily work with it (interact with the
unmodified android init).
I'm reluctant to pick favorites among daemontools and runit and
gentoo's thing and so on because I'd like to make a good, simple,
standard general-purpose tool. Back in 2005 when I rewrote the init in
busybox, this was the System V init. (This rewrite never made it
upstream because Erik was switching CVS over to SVN and nobody could
check anything in for over a month, during which time I switched
laptops and the code got buried in backup directories. But I got pretty
deeply familiar with how init actually works during the rewrite, which
led me to do oneit.)
As long as I don't find systemd or upstart in toybox I'm happy. I think
that I feel similarly about systemd as you. dbus doesn't seem to be the
right IPC mechanism for an init to me...
People were trying to move dbus in-kernel at one point:

http://www.phoronix.com/scan.php?page=news_item&px=ODYwNA

It struck me as a defense mechanism more than a design...
It may solve some problems,
but seems to add a great deal of complexity. I kind of feel similarly
about UEFI and BIOS.
The musl maintainer was pondering doing a kernel as his next big
project, and I tried to point him at doing a bootloader. You can scale
up something like uboot with a scheduler and posix system call API, but
right now there's no way to chop Linux _down_ to just what you need.

I ranted about that years ago:

http://www.mentby.com/rob-landley/what-happened-to-linux-tiny.html
I have no idea about the init system of Android.
Gentoo's openrc builds on top of sysvinit and I think it would not make
much sense to add it to toybox. As mentioned previously daemontools are
actually not an init. And I kind of wanted to demonstrate that runit is
still very simple. I didn't want to suggest that you should add
functionality related to it.
I can do a sysvinit. It would be really nice if there was SOME
standard, somewhere...

Mostly I want to get a clear idea of the problem people are trying to
solve, and whether or not there's any installed base of expertise out
there I can leverage. (What are the expectations here? What do
sysadmins sit down and mangle today without having to learn new stuff?
I dunno, I need to talk to some sysadmins...)
Since then, people have started demanding more out of their init.
Mostly parallelism to speed up the boot process, but also paranoid
developers are TERRIFIED that their daemons will exit and need to be
relaunched, even though they trust PID 1 to stay up and if PID 1 ever
does go down the kernel will panic and how is this different from the
other daemon? Really, if random processes are exiting your system has a
_problem_ and needs to be _fixed_, creating infrastructure to silently
recover from things that shouldn't happen is hiding the problem...
Most daemons should write something to the logs when an error occurs or
they are being restarted.
Expecting good daemons is like expecting good apps. You cope with what
you have.
But as you said not restarting them automatically
would give greater incentives to fix them. And of course I agree that
the PID 1 should be as simple as possible.
(Sorry for the remark: runit's process supervisor restarts processes
when they exit. This is for example used to implement service
dependencies.)
Eh, the OOM killer can kill stuff. Bad things happen, there should be a
recovery mechanism. (It might be to detect it and "shutdown -r now".)

Again, I'm not comfortable with my understanding of the problem space,
and I dunno what "being agnostic" means in this circumstance. I need to
provide the infrastructure to support unknown use cases, but in the
absence of a standard _or_ consensus on what that infrastructure is,
the lack of knowledge of the use cases is disturbing.
Felix
This problem requires further study, and I need to clear space
first.
Rob
Rob
David Seikel
2012-11-26 05:39:43 UTC
/me picks two things out of this long email to comment on.
Post by Rob Landley
A) getdate() didn't do what I thought it did.
It hasn't got any built in date-guessing formats, you have to point
it at a file of them. There doesn't seem to be a system standard file
or anything built into libc, so it was kinda useless. The man page
for the gnu/dammit version of date says -d can guess an arbitrary
date code, but no idea how.
So I've got to go to the two specific date formats posix lists for
-d and -t, but I don't have to open code it like the initial
contribution did, I can use strptime() out of libc.
B) posix says touch should handle multiple files on the command
line, and this only handled one argument.
I wanted to use loopfiles(), but that means opening the file to get
a filehandle, and touch should be able to change the date of chmod
000 files, meaning it can't use futimes but has to use the
non-filehandle version.
Posix also says [.frac] I.E. fractions of a second support.
(Computers are fast now.)
The man page has utime() and utimes() on it and utimes() supports
microsecond resolution. Modern linux supports nanoseconds but
microseconds is probably fine for now. (Eh, I could switch to
utimensat(AT_FDCWD, ...) at some point.)
C) It only supported about half the posix command line arguments, it
needs -a, -c, and -d.
Note that supporting -a and -m means it needs to stat the existing
file to get the current date, modify it, and write it back out.
I also wince when I see repeated code, and try to sit back and find
a way to just have _one_ call to utime() or similar. I don't always
manage it but I do stop and try.
I had to dig a bit to find out that the T "time designator" is a
literal T.
I've still got a couple todo items: allow space as a time
designator, allow , as a fraction indicator. (*shrug* Posix.)
I would guess for those human languages that swap the usage of
commas and periods as fraction / thousands separators.

<useless anecdote>

In the SecondLife (SL) based virtual world work I do, the major clone of
the server side is called OpenSim. OpenSim has the option of creating
your virtual world from a disparate collection of sim servers run by a
motley collection of random people. A sim is the basic unit of
geography in SL, 256x256 meters, starting at 0 meters, and extending
all the way up to entirely useless heights. So for instance, one sim
server could be in USA, with the server itself set to the USA variation
of English as its locale, while another sim server could be in
Denmark, with its locale set to Danish.

English and Danish are two languages where the use of commas and
periods for fraction / thousands separator is swapped. OpenSim
developers like to do crazy things, like use human readable wire
protocols, coz they drink the "humans being able to read wire protocols
that 99.999999% of the time only computers need to read is a great idea
for the 0.000001% of the time that a human needs to read it to debug it"
kool-aid. Some of the things OpenSim sends as human readable wire
protocol get filtered through the server's locale. Did I mention the
OpenSim developers are crazy?

The end result is that if you teleport from this Danish sim to that USA
sim, the Danish sim says "send this avatar to a height of 20,123
meters" (a little over 20 meters, a typical ground level), and the
USA sim server thinks "Hmmm, 20 thousand meters high, OK" and tosses the
poor avatar waaaay up in the air. There is never anything up that
high, and the avatar has a loooong fall to the ground where they
thought they were going.

Getting commas and periods right is important for humans, computers
should not care so much on the damn wire. Getting it wrong can be
painful and messy.

</useless anecdote>
Post by Rob Landley
I can do a sysvinit. It would be really nice if there was SOME
standard, somewhere...
Mostly I want to get a clear idea of the problem people are trying
to solve, and whether or not there's any installed base of expertise
out there I can leverage. (What are the expectations here? What do
sysadmins sit down and mangle today without having to learn new
stuff? I dunno, I need to talk to some sysadmins...)
Long ago, for busybox, I did a Sys V init that tried to stick to what
LSB said should be done. I can't recall why it was not accepted into
busybox. Possibly someone had done their own favourite style of init
and Sys V init was not desired. shrugs

It was described as "Clean room implementation of runlevel/init.d/SYS V
init applets for busybox, aiming for LSB compliance."

/me digs through his ancient projects ...

http://urunlevel.sourceforge.net/ Oh ick that looks awful on a wide
screen.

If nothing else, it might provide one person's take on the LSB
"standard", even if it was from seven years ago. One feature I added
was the ability to write init "scripts" in C as well as the usual shell
scripts. Toybox probably does not want that.
--
A big old stinking pile of genius that no one wants
coz there are too many silver coated monkeys in the world.
Rob Landley
2012-11-26 17:28:14 UTC
Post by David Seikel
/me picks two things out of this long email to comment on.
I copied the whole thing because who knows how long ago the
conversation wandered off the list? It's easier for me to check the
list archive than to dig through the compost heap of my inbox, so I
like having toybox stuff copied here.
Post by David Seikel
Post by Rob Landley
I've still got a couple todo items: allow space as a time
designator, allow , as a fraction indicator. (*shrug* Posix.)
I would guess for those human languages that swap the usage of
commas and periods as fraction / thousands separators.
I expect so.

Mostly I'm not bothering with that because it belongs at the GUI level:
we're not translating the names "sort", "touch", "kill", or "nice" to
other languages, why translate their arguments? It wasn't there for the
first 30 years of these tools, adds a large amount of complexity for a
relatively small amount of gain, changing the alphabetization of "sort"
broke more than one build when that suddenly happened, I'm not
competent to _test_ this so if I did implement it I'd have no way of
knowing if I'd done it right...

But if SUSv4 is going to require a synonym and it's easy to do, sure,
why not?
Post by David Seikel
<useless anecdote>
In the SecondLife (SL) based virtual world work I do, the major clone of
the server side is called OpenSim. OpenSim has the option of creating
your virtual world from a disparate collection of sim servers run by a
motley collection of random people. A sim is the basic unit of
geography in SL, 256x256 meters, starting at 0 meters, and extending
all the way up to entirely useless heights. So for instance, one sim
server could be in USA, with the server itself set to the USA variation
of English as its locale, while another sim server could be in
Denmark, with its locale set to Danish.
English and Danish are two languages where the use of commas and
periods for fraction / thousands separator is swapped. OpenSim
developers like to do crazy things, like use human readable wire
protocols, coz they drink the "humans being able to read wire protocols
that 99.999999% of the time only computers need to read is a great idea
for the 0.000001% of the time that a human needs to read it to debug it"
kool-aid.
I side with them on this: word size, endianness, alignment, padding,
how the compiler's decided to represent C bitfields this week, anything
to do with floating point representation...

That said, translating a wire protocol is nuts. The comma/period is
part of the protocol. You can say decimal numbers are represented by
"123!456" and if that's what you're doing just be _consistent_.
Post by David Seikel
Some of the things OpenSim sends as human readable wire
protocol get filtered through the server's locale. Did I mention the
OpenSim developers are crazy?
That is crazy, yes.
Post by David Seikel
The end result is that if you teleport from this Danish sim to that USA
sim, the Danish sim says "send this avatar to a height of 20,123
meters" (a little over 20 meters, a typical ground level), and the
USA sim server thinks "Hmmm, 20 thousand meters high, OK" and tosses the
poor avatar waaaay up in the air. There is never anything up that
high, and the avatar has a loooong fall to the ground where they
thought they were going.
Getting commas and periods right is important for humans, computers
should not care so much on the damn wire. Getting it wrong can be
painful and messy.
The moral of the story is "pick one and be consistent". And I have.
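
One way to make "pick one and be consistent" concrete is a parser that never consults the locale at all. The sketch below is purely illustrative (parse_wire_decimal() is a hypothetical name and has nothing to do with OpenSim's actual code): the wire format is fixed as digits, an optional period, and more digits, and anything else is rejected instead of being guessed at.

```c
#include <ctype.h>

// Hypothetical wire-format parser: digits, optional '.', digits.
// It does not depend on setlocale(), so "20.123" means the same thing
// on every host. Returns 0 on success, -1 if the text is not in
// exactly that form.
static int parse_wire_decimal(const char *s, double *out)
{
  double value = 0, scale = 1;
  int digits = 0;

  // Integer part
  for (; isdigit((unsigned char)*s); s++, digits++)
    value = value * 10 + (*s - '0');

  // Optional fraction, always introduced by a period
  if (*s == '.') {
    for (s++; isdigit((unsigned char)*s); s++, digits++) {
      scale /= 10;
      value += (*s - '0') * scale;
    }
  }

  // No digits at all, or trailing junk such as a comma, is an error
  if (!digits || *s) return -1;
  *out = value;
  return 0;
}
```

A sender that formats "20,123" gets an error back from this instead of an avatar at 20 kilometers.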

I admit to a certain amount of discomfort going "the language I already
speak is coincidentally the standard for posix programming". This is a
decision in need of more defending than average.

That said, Linus Torvalds was born speaking Swedish, and learned
English to compute with. Ingo Molnar is Hungarian. Pavel Machek's home
page even says http://atrey.karlin.mff.cuni.cz/~pavel/ "English is my
second language." The OpenVZ guys are in Moscow. Years back I handed
off BusyBox to Denys Vlasenko who is from the Ukraine, and uClibc is
currently maintained by Bernhard who was in Austria last I checked.
Mark Shuttleworth's from South Africa (although one of their 11
official languages is English so I dunno what he grew up speaking).
Marcelo Tosatti is from Brazil, Thomas Gleixner is German, VFAT
filesystem maintainer Ogawa Hirofumi is from Japan...

Outside of Linux: Guido Van Rossum is Dutch but Python development
occurs in English. Here's an interview with Ruby inventor Yukihiro
Matsumoto... in English:
PHP inventor Rasmus Lerdorf
was born in Greenland but tweets in English https://twitter.com/rasmus

There's a theme here. All these people picked a language to collaborate
in. Leveraging that theme to simplify the toybox code is kinda obvious.

But there's "simplifying" and then there's "not serving the users".
Still not sure where to draw the line, mostly waiting for people to
complain and thus volunteer to test stuff. :)
Post by David Seikel
Post by Rob Landley
I can do a sysvinit. It would be really nice if there was SOME
standard, somewhere...
Mostly I want to get a clear idea of the problem people are trying
to solve, and whether or not there's any installed base of expertise
out there I can leverage. (What are the expectations here? What do
sysadmins sit down and mangle today without having to learn new
stuff? I dunno, I need to talk to some sysadmins...)
Long ago, for busybox, I did a Sys V init that tried to stick to what
LSB said should be done. I can't recall why it was not accepted into
busybox. Possibly someone had done their own favourite style of init
and Sys V init was not desired. shrugs
It was described as "Clean room implementation of runlevel/init.d/SYS V
init applets for busybox, aiming for LSB compliance."
/me digs through his ancient projects ...
http://urunlevel.sourceforge.net/ Oh ick that looks awful on a wide
screen.
If nothing else, it might provide one person's take on the LSB
"standard", even if it was from seven years ago. One feature I added
was the ability to write init "scripts" in C as well as the usual shell
scripts. Toybox probably does not want that.
Not so much, no.

If sysv init is the way to go, writing one is easy enough. Working out
the correct design is generally the hard part, implementation is
comparatively trivial. (Of course you learn stuff during the
implementation that can affect the design. Something about plans not
surviving contact with the enemy. :)

Rob
Rob Landley
2012-11-26 04:39:41 UTC
Permalink
The discussion keeps wandering off the list. :)
Understood. Setting the environment variable in the test script
makes
sense, I'll add a comment.
? The environment is only read in order to check for the locale.
It
might
better to use locale (1) though.
I've gotten distracted a bit: I got a "which" contribution that
needed
a lot of cleanup (I'm about 2/3 of the way done), and then I need to
clean up "cut", and I'm properly fixing "cp", which needed the
equivalent of readlink -f in order to do duplicate detection (I can
catch "cp * ." by checking stat info, but infinite recursion ala
"cp -R
.. ." requires absolute path comparisons with symlinks resolved),
and
when I poked at readlink -f it was unfinished, and then the test
suite
failed on the rewrite, and the test suite actually needs to be
extended
for the new -m and -e options...
The usual. :)
:)
I note I just checked in the which cleanup, and blogged a chunk of the
rationale at http://landley.net/notes-2012.html#23-11-2012 but when I
really get the bit in my teeth it's hard to write up my rationale for
doing stuff _or_ to do the test suite. Both of which are kinda good
things to do.

I had to redo which (from where I left off cleaning it up) because:

A) getdate() didn't do what I thought it did.

It hasn't got any built in date-guessing formats, you have to point it
at a file of them. There doesn't seem to be a system standard file or
anything built into libc, so it was kinda useless. The man page for the
gnu/dammit version of date says -d can guess an arbitrary date code,
but no idea how.

So I've got to go to the two specific date formats posix lists for -d
and -t, but I don't have to open code it like the initial contribution
did, I can use strptime() out of libc.

B) posix says touch should handle multiple files on the command line,
and this only handled one argument.

I wanted to use loopfiles(), but that means opening the file to get a
filehandle, and touch should be able to change the date of chmod 000
files, meaning it can't use futimes but has to use the non-filehandle
version.

Posix also says [.frac] I.E. fractions of a second support. (Computers
are fast now.)
The man page has utime() and utimes() on it and utimes() supports
milisecond resolution. Modern linux supports nanoseconds but
miliseconds is probably fine for now. (Eh, I could switch to
utimensat(AT_CWD, ...) at some point.)

C) It only supported about half the posix command line arguments, it
needs -a, -c, and -d.

Note that supporting -a and -m means it needs to stat the existing file
to get the current date, modify it, and write it back out.

I also wince when I see repeated code, and try to sit back and find a
way to just have _one_ call to utime() or similar. I don't always
manage it but I do stop and try.

I had to dig a bit to find out that the T "time designator" is a
literal T.

I've still got a couple todo items: allow space as a time designator,
allow , as a fraction indicator. (*shrug* Posix.)
Conditional on the config symbol enabling UTF-8 support. (That
introduces the possibility of internationalization causing
regressions
in other commands, but it does mean they asked for it.)
The unanswerable question is whether UTF-8 support should be
enabled by
default. (The bigger the test suite is the more my answer leans
towards
yes.)
I think that I would also prefer a default of enabled i18n. Is
there a
way to check in test scripts for enabled configuration variables?
Yes, the "optional" shell function. See sort.test or tail.test for
examples, if the config symbol isn't set (in the current build
directory) calls to "testing" after that are NOPed out until the
next
call to optional.
(Probably what I should have it do instead is return true/false and
use
shell if statements. Right now you can't test for more than one
symbol
at a time...)
Thanks. So I made a patch for implementing the option. I was not sure
where
to exactly put the setlocale call. Please feel free to move it, where
it
belongs. Attached is a hg exported patch.
I have applied it. I've got some pending cleanups, but it's revealed
yet more structural limitations in my stuff.

I don't like #ifdefs in C code, I prefer if (CFG_BLAH) {stuff} with
CFG_BLAH becoming a constant 0 and dead code elimination dropping out
{stuff} when configured out. That way it's always syntax checked.

(Note that the CFG_ macros will always be #defined, they're just 1 or
0. So an #ifdef on them will always be true. You can either #if
CFG_BLAH or #ifdef CONFIG_BLAH to actually chop stuff out with the
preprocessor.)

In addition to the CFG_BLAH macros I also made USE_BLAH() macros which
resolve to the contents of the parentheses when BLAH is enabled, and
nothing when BLAH is disabled.

The downside of letting dead code elimination do the work is that if
the headers are conditionally #included, the function calls have no
prototype. I'm not sure if having the headers unconditionally included
is better, or moving them to lib/portability.h and having an #else case
in the ifdef that copies the relevant prototypes from posix. Probably
unconditional inclusion until somebody complains, then fix it up in
portability.h.

Another problem is that if I stick a USE_BLAH("x") around the config
options my header generation infrastructure is only producing headers
for stuff that's enabled. I.E.

USE_WC(NEWTOY(wc, USE_TOYBOX_I18N("m")"cwl", TOYFLAG_USR|TOYFLAG_BIN))

When I switch off TOYBOX_I18N I don't get FLAG_m defined.

(What I _should_ do is #define FLAG_M to 0 when it's disabled. But
teaching the script to do that, with sed, is complicated. :P )

So yeah: working on it. :)
I mean that I am painfully aware of my lack of expertise here,
but
am
trying to close tabs before opening new ones. :)
For example, I dunno if daemontools or runit are the same thing.
(And I
try to keep track of where each one came from. susv init was
from
system v, there was a bsd one that kinda fell out of use,
upstart
was
from ubuntu, systemd was from <strike>Pointy Hair
Linux</strike> Red
Hat, android has some home-grown thing, and last I heard gentoo
was
threatening to write its own.
Daemontools and Runit are different (but have the same ideas).
Daemontools
are by djb. Runit is by Gerrit Pape. Daemontools are not really a
init
but a tool to manage (supervise) services.
I wrote that down. At some point I need to dig into all this and
figure
out what to do, but that research project can wait until I've
emptied a
couple of the open cans of worms.
(Who puts worms in cans? I'm guessing this is a fishing metaphor...)
Post by Rob Landley
I have yet to open this can of worms. Possibly I can just
ignore it
Post by Rob Landley
and
do sysv init.
Oh, and let's not forget the careful failure to actually say
anything
http://refspecs.linuxfoundation.org/LSB_4.1.0/LSB-Core-generic/LSB-Core-generic/tocsysinit.html
Post by Rob Landley
A standard that doesn't specify. Sigh...
I mean, is there any problem for someone who does not like sysv init to
use their own init instead of toybox's?
Not at all, but at the same time I'd like to provide something people
find useful. "I don't actually expect people to use this tool" means I
probably shouldn't include that tool. (I included oneit because _I_ use
it. And because it solves the problem of booting a single-purpose
system that exists to run one process, such as a single command shell.
Yes you can do init=/myshellscript.sh which I've done a lot (modulo the
bug that the period seems to confuse the kernel's command line argument
parsing), but init has to reap zombies for orphan processes and that
won't, so zombie processes pile up in the resulting system. Plus you
can't ctrl-c out of it.)
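
The zombie-reaping duty mentioned above can be sketched in a few lines of C. This is not oneit's actual code, just a minimal illustration under the assumption that the caller is PID 1 (so orphans get reparented to it); run_and_reap() is a name made up for the example.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Launch argv as a child, then keep collecting every process that exits
 * (when running as PID 1 that includes reparented orphans) until the
 * child itself is done. Returns the child's exit code, 128+signal if it
 * was killed, or -1 on error. */
int run_and_reap(char *const argv[])
{
  pid_t child = fork();
  int status;

  if (child < 0) return -1;
  if (!child) {
    execvp(argv[0], argv);
    _exit(127);                    /* only reached if exec failed */
  }

  for (;;) {
    pid_t pid = wait(&status);     /* blocks until any child exits */

    if (pid == child) {
      if (WIFEXITED(status)) return WEXITSTATUS(status);
      return 128 + WTERMSIG(status);
    }
    if (pid < 0 && errno != EINTR) return -1;
    /* Any other pid was an orphan; wait() already reaped it. */
  }
}
```

A plain init=/myshellscript.sh never calls wait() for processes it didn't start, which is why the zombies pile up.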
Yeah, it's really fun to ping stuff without job control...
ping -c 3
(And as soon as you forget the -c once, it's reboot time. Honestly, -c
10 or similar should be the _default_, with an option to enable endless
pinging.)
I also discovered "ping -c" some time ago. On Windows ping seems to be
better behaved in that sense...
You're still a typo away from a bricked box. Been there, done that.
(With runit you'd pack your script into /etc/runit/1, have a zombie
killer in /etc/runit/3 and one agetty as a service started by something
like "exec agetty 38400 tty1 linux". Ok, that's kind of more work than
using oneit.)
I like oneit, it's a good tool for what it does. However, it's
intentionally very limited. Going to a full system management daemon...
there are a number of possible approaches. Figuring out what to do is a
major research project.
I dislike systemd if for no other reason than Red Hat's attempts to
shove an ever-growing katamari down people's throats, over the
objections of Ubuntu and Gentoo and so on. (Plenty of other reasons, of
course. PID 1 should not have dependencies. Sucking udev into it is not
an improvement, and Linus Torvalds himself has flamed Kay Sievers
recently http://article.gmane.org/gmane.linux.kernel/1369384 .) Anyway,
"not that".
I should take a look at upstart, which came first. It would be
something to be compatible with. But from the quick glances I've done,
it also seems kinda overcomplicated.
Given that Android already has more deployments than Linux and is still
in the exponential growth part of its S-curve, I _need_ to be familiar
with whatever it is they're doing. Toybox should either provide a good
clean replacement for it, or easily work with it (interact with the
unmodified android init).
I'm reluctant to pick favorites among daemontools and runit and
gentoo's thing and so on because I'd like to make a good, simple,
standard general-purpose tool. Back in 2005 when I rewrote the init in
busybox, this was the System V init. (This rewrite never made it
upstream because Erik was switching CVS over to SVN and nobody could
check anything in for over a month, during which time I switched
laptops and the code got buried in backup directories. But I got pretty
deeply familiar with how init actually works during the rewrite, which
led me to do oneit.)
As long as I don't find systemd or upstart in toybox I'm happy. I think
that I feel similarly about systemd as you. dbus doesn't seem to be the
right IPC mechanism for an init to me...
People were trying to move dbus in-kernel at one point:

http://www.phoronix.com/scan.php?page=news_item&px=ODYwNA

It struck me as a defense mechanism more than a design...
It may solve some problems,
but seems to add a great deal of complexity. I kind of feel similarly
about UEFI and BIOS.
The musl maintainer was pondering doing a kernel as his next big
project, and I tried to point him at doing a bootloader. You can scale
up something like uboot with a scheduler and posix system call API, but
right now there's no way to chop Linux _down_ to just what you need.

I ranted about that years ago:

http://www.mentby.com/rob-landley/what-happened-to-linux-tiny.html
I have no idea about the init system of Android.
Gentoo's openrc builds on top of sysvinit and I think it would not make
much sense to add it to toybox. As mentioned previously daemontools are
actually not an init. And I kind of wanted to demonstrate that runit is
still very simple. I didn't want to suggest that you should add
functionality related to it.
I can do a sysvinit. It would be really nice if there was SOME
standard, somewhere...

Mostly I want to get a clear idea of the problem people are trying to
solve, and whether or not there's any installed base of expertise out
there I can leverage. (What are the expectations here? What do
sysadmins sit down and mangle today without having to learn new stuff?
I dunno, I need to talk to some sysadmins...)
Since then, people have started demanding more out of their init.
Mostly parallelism to speed up the boot process, but also paranoid
developers are TERRIFIED that their daemons will exit and need to be
relaunched, even though they trust PID 1 to stay up and if PID 1 ever
does go down the kernel will panic and how is this different from the
other daemon? Really, if random processes are exiting your system has a
_problem_ and needs to be _fixed_, creating infrastructure to silently
recover from things that shouldn't happen is hiding the problem...
Most daemons should write something to the logs when an error occurs or
they are being restarted.
Expecting good daemons is like expecting good apps. You cope with what
you have.
But as you said not restarting them automatically
would give greater incentives to fix them. And of course I agree that
the PID 1 should be as simple as possible.
(Sorry for the remark: runit's process supervisor restarts processes
when they exit. This is for example used to implement service
dependencies.)
Eh, the OOM killer can kill stuff. Bad things happen, there should be a
recovery mechanism. (It might be to detect it and "shutdown -r now".)

Again, I'm not comfortable with my understanding of the problem space,
and I dunno what "being agnostic" means in this circumstance. I need to
provide the infrastructure to support unknown use cases, but in the
absence of a standard _or_ consensus on what that infrastructure is,
the lack of knowledge of the use cases is disturbing.
Felix
This problem requires further study, and I need to clear space
first.
Rob
Rob
David Seikel
2012-11-26 05:39:43 UTC
/me picks two things out of this long email to comment on.
Post by Rob Landley
A) getdate() didn't do what I thought it did.
It hasn't got any built in date-guessing formats, you have to point
it at a file of them. There doesn't seem to be a system standard file
or anything built into libc, so it was kinda useless. The man page
for the gnu/dammit version of date says -d can guess an arbitrary
date code, but no idea how.
So I've got to go to the two specific date formats posix lists for
-d and -t, but I don't have to open code it like the initial
contribution did, I can use strptime() out of libc.
B) posix says touch should handle multiple files on the command
line, and this only handled one argument.
I wanted to use loopfiles(), but that means opening the file to get
a filehandle, and touch should be able to change the date of chmod
000 files, meaning it can't use futimes but has to use the
non-filehandle version.
Posix also says [.frac] I.E. fractions of a second support.
(Computers are fast now.)
The man page has utime() and utimes() on it and utimes() supports
microsecond resolution. Modern linux supports nanoseconds but
microseconds is probably fine for now. (Eh, I could switch to
utimensat(AT_FDCWD, ...) at some point.)
C) It only supported about half the posix command line arguments, it
needs -a, -c, and -d.
Note that supporting -a and -m means it needs to stat the existing
file to get the current date, modify it, and write it back out.
I also wince when I see repeated code, and try to sit back and find
a way to just have _one_ call to utime() or similar. I don't always
manage it but I do stop and try.
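
The stat-modify-write approach for -a and -m, funneled into a single utimes() call, might look roughly like this. It is a sketch under stated assumptions, not the code that was checked in: touch_times() and its parameters are hypothetical, and the preserved timestamp is kept to whole seconds for brevity.

```c
#include <sys/stat.h>
#include <sys/time.h>

/* Set atime and/or mtime of path to "when" by name, without opening the
 * file, so it works on a chmod 000 file the caller owns.
 * set_atime/set_mtime mirror touch's -a and -m; the timestamp that is
 * not being set is read back with stat() and written out unchanged.
 * Returns 0 on success, -1 on error (errno comes from stat/utimes). */
int touch_times(const char *path, struct timeval when,
                int set_atime, int set_mtime)
{
  struct stat st;
  struct timeval tv[2];

  if (stat(path, &st)) return -1;

  /* Start from the existing times (whole seconds only in this sketch). */
  tv[0].tv_sec = st.st_atime;
  tv[0].tv_usec = 0;
  tv[1].tv_sec = st.st_mtime;
  tv[1].tv_usec = 0;

  if (set_atime) tv[0] = when;
  if (set_mtime) tv[1] = when;

  return utimes(path, tv);   /* the one call that writes both times */
}
```

If the switch to utimensat() ever happens, its UTIME_OMIT marker can leave one timestamp alone without the stat() round trip.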
I had to dig a bit to find out that the T "time designator" is a
literal T.
I've still got a couple todo items: allow space as a time
designator, allow , as a fraction indicator. (*shrug* Posix.)
I would guess for those human languages that swap the usage of
commas and periods as fraction / thousands separators.
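
For reference, the -d parsing described above (literal T or a space as the time designator, '.' or ',' before the fraction) can be sketched with strptime() plus a small hand-rolled fraction loop. parse_touch_date() is a made-up name, and the sketch assumes no timezone suffix is present.

```c
#define _XOPEN_SOURCE 700   /* needed for strptime() on glibc */
#include <ctype.h>
#include <string.h>
#include <time.h>

/* Parse "YYYY-MM-DDThh:mm:SS[.frac]" (a space may replace the T, and a
 * ',' may replace the '.') into broken-down time plus nanoseconds.
 * Returns 0 on success, -1 if the string doesn't match. */
int parse_touch_date(const char *s, struct tm *tm, long *nsec)
{
  const char *p;
  long scale = 100000000;   /* worth of the first fraction digit, in ns */

  memset(tm, 0, sizeof(*tm));
  *nsec = 0;

  /* Try the literal T first, then the space variant. Note that a space
   * in a strptime() format also matches zero whitespace characters. */
  p = strptime(s, "%Y-%m-%dT%H:%M:%S", tm);
  if (!p) p = strptime(s, "%Y-%m-%d %H:%M:%S", tm);
  if (!p) return -1;

  /* Optional fraction: digits past the ninth are silently dropped. */
  if (*p == '.' || *p == ',') {
    if (!isdigit((unsigned char)*++p)) return -1;
    for (; isdigit((unsigned char)*p); p++, scale /= 10)
      *nsec += (*p - '0') * scale;
  }

  return *p ? -1 : 0;   /* anything left over is rejected in this sketch */
}
```

The fraction has to be handled by hand because strptime() has no conversion for sub-second digits.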

<useless anecdote>

In the SecondLife (SL) based virtual world work I do, the major clone of
the server side is called OpenSim. OpenSim has the option of creating
your virtual world from a disparate collection of sim servers run by a
motley collection of random people. A sim is the basic unit of
geography in SL, 256x256 meters, starting at 0 meters, and extending
all the way up to entirely useless heights. So for instance, one sim
server could be in USA, with the server itself set to the USA variation
of English as its locale, while another sim server could be in
Denmark, with its locale set to Danish.

English and Danish are two languages where the use of commas and
periods for fraction / thousands separator is swapped. OpenSim
developers like to do crazy things, like use human readable wire
protocols, coz they drink the "humans being able to read wire protocols
that 99.999999% of the time only computers need to read is a great idea
for the 0.000001% of the time that a human needs to read it to debug it"
kool-aid. Some of the things OpenSim sends as human readable wire
protocol get filtered through the server's locale. Did I mention the
OpenSim developers are crazy?

The end result is that if you teleport from this Danish sim to that USA
sim, the Danish sim says "send this avatar to a height of 20,123
meters" (a little over 20 meters, a typical ground level), and the
USA sim server thinks "Hmmm, 20 thousand meters high, OK" and tosses the
poor avatar waaaay up in the air. There is never anything up that
high, and the avatar has a loooong fall to the ground where they
thought they were going.

Getting commas and periods right is important for humans, computers
should not care so much on the damn wire. Getting it wrong can be
painful and messy.

</useless anecdote>
Post by Rob Landley
I can do a sysvinit. It would be really nice if there was SOME
standard, somewhere...
Mostly I want to get a clear idea of the problem people are trying
to solve, and whether or not there's any installed base of expertise
out there I can leverage. (What are the expectations here? What do
sysadmins sit down and mangle today without having to learn new
stuff? I dunno, I need to talk to some sysadmins...)
Long ago, for busybox, I did a Sys V init that tried to stick to what
LSB said should be done. I can't recall why it was not accepted into
busybox. Possibly someone had done their own favourite style of init
and Sys V init was not desired. shrugs

It was described as "Clean room implementation of runlevel/init.d/SYS V
init applets for busybox, aiming for LSB compliance."

/me digs through his ancient projects ...

http://urunlevel.sourceforge.net/ Oh ick that looks awful on a wide
screen.

If nothing else, it might provide one person's take on the LSB
"standard", even if it was from seven years ago. One feature I added
was the ability to write init "scripts" in C as well as the usual shell
scripts. Toybox probably does not want that.
--
A big old stinking pile of genius that no one wants
coz there are too many silver coated monkeys in the world.
Rob Landley
2012-11-26 17:28:14 UTC
Post by David Seikel
/me picks two things out of this long email to comment on.
I copied the whole thing because who knows how long ago the
conversation wandered off the list? It's easier for me to check the
list archive than to dig through the compost heap of my inbox, so I
like having toybox stuff copied here.
Post by David Seikel
Post by Rob Landley
I've still got a couple todo items: allow space as a time
designator, allow , as a fraction indicator. (*shrug* Posix.)
I would guess for those human languages that swap the usage of
commas and periods as fraction / thousands separators.
I expect so.

Mostly I'm not bothering with that because it belongs at the GUI level:
we're not translating the names "sort", "touch", "kill", or "nice" to
other languages, why translate their arguments? It wasn't there for the
first 30 years of these tools, adds a large amount of complexity for a
relatively small amount of gain, changing the alphabetization of "sort"
broke more than one build when that suddenly happened, I'm not
competent to _test_ this so if I did implement it I'd have no way of
knowing if I'd done it right...

But if SUSv4 is going to require a synonym and it's easy to do, sure,
why not?
Post by David Seikel
<useless anecdote>
In the SecondLife (SL) based virtual world work I do, the major clone of
the server side is called OpenSim. OpenSim has the option of creating
your virtual world from a disparate collection of sim servers run by a
motley collection of random people. A sim is the basic unit of
geography in SL, 256x256 meters, starting at 0 meters, and extending
all the way up to entirely useless heights. So for instance, one sim
server could be in USA, with the server itself set to the USA variation
of English as its locale, while another sim server could be in
Denmark, with its locale set to Danish.
English and Danish are two languages where the use of commas and
periods for fraction / thousands separator is swapped. OpenSim
developers like to do crazy things, like use human readable wire
protocols, coz they drink the "humans being able to read wire protocols
that 99.999999% of the time only computers need to read is a great idea
for the 0.000001% of the time that a human needs to read it to debug it"
kool-aid.
I side with them on this: word size, endianness, alignment, padding,
how the compiler's decided to represent C bitfields this week, anything
to do with floating point representation...

That said, translating a wire protocol is nuts. The comma/period is
part of the protocol. You can say decimal numbers are represented by
"123!456" and if that's what you're doing just be _consistent_.
Post by David Seikel
Some of the things OpenSim sends as human readable wire
protocol get filtered through the server's locale. Did I mention the
OpenSim developers are crazy?
That is crazy, yes.
Post by David Seikel
The end result is that if you teleport from this Danish sim to that USA
sim, the Danish sim says "send this avatar to a height of 20,123
meters" (a little over 20 meters, a typical ground level), and the
USA sim server thinks "Hmmm, 20 thousand meters high, OK" and tosses the
poor avatar waaaay up in the air. There is never anything up that
high, and the avatar has a loooong fall to the ground where they
thought they were going.
Getting commas and periods right is important for humans, computers
should not care so much on the damn wire. Getting it wrong can be
painful and messy.
The moral of the story is "pick one and be consistent". And I have.
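
To make "pick one and be consistent" concrete for the wire-protocol case: strtod() and printf() follow the LC_NUMERIC decimal separator, so a protocol parser that must not care about locale can read the digits itself. A minimal sketch, assuming non-negative values with '.' as the only separator; parse_wire_decimal() is a hypothetical helper, not anything from OpenSim or toybox.

```c
/* Parse a non-negative decimal such as "20.123", accepting '.' as the
 * only separator no matter what setlocale() was called with.
 * Returns 0 and stores the value in *out, or -1 on malformed input
 * (which includes "20,123"). */
int parse_wire_decimal(const char *s, double *out)
{
  double digits = 0, divisor = 1;
  int count = 0;

  /* Integer part. */
  for (; *s >= '0' && *s <= '9'; s++, count++)
    digits = digits * 10 + (*s - '0');

  /* Optional fraction: keep accumulating digits, remember the scale. */
  if (*s == '.') {
    for (s++; *s >= '0' && *s <= '9'; s++, count++) {
      digits = digits * 10 + (*s - '0');
      divisor *= 10;
    }
  }

  if (*s || !count) return -1;   /* trailing junk or no digits at all */

  *out = digits / divisor;
  return 0;
}
```

With this, the Danish server's "20,123" is rejected as malformed instead of being silently read as twenty thousand and change.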

I admit to a certain amount of discomfort going "the language I already
speak is coincidentally the standard for posix programming". This is a
decision in need of more defending than average.

That said, Linus Torvalds was born speaking Swedish, and learned
English to compute with. Ingo Molnar is Hungarian. Pavel Machek's home
page even says http://atrey.karlin.mff.cuni.cz/~pavel/ "English is my
second language." The OpenVZ guys are in Moscow. Years back I handed
off BusyBox to Denys Vlasenko who is from the Ukraine, and uClibc is
currently maintained by Bernhard who was in Austria last I checked.
Mark Shuttleworth's from South Africa (although one of their 11
official languages is English so I dunno what he grew up speaking).
Marcelo Tosatti is from Brazil, Thomas Gleixner is German, VFAT
filesystem maintainer Ogawa Hirofumi is from Japan...

Outside of Linux: Guido Van Rossum is Dutch but Python development
occurs in English. Here's an interview with Ruby inventor Yukihiro
Matsumoto... in English: http://youtu.be/IGG4l_QEzog PHP inventor
Rasmus Lerdorf was born in Greenland but tweets in English
https://twitter.com/rasmus

There's a theme here. All these people picked a language to collaborate
in. Leveraging that theme to simplify the toybox code is kinda obvious.

But there's "simplifying" and then there's "not serving the users".
Still not sure where to draw the line, mostly waiting for people to
complain and thus volunteer to test stuff. :)
Post by David Seikel
Post by Rob Landley
I can do a sysvinit. It would be really nice if there was SOME
standard, somewhere...
Mostly I want to get a clear idea of the problem people are trying
to solve, and whether or not there's any installed base of expertise
out there I can leverage. (What are the expectations here? What do
sysadmins sit down and mangle today without having to learn new
stuff? I dunno, I need to talk to some sysadmins...)
Long ago, for busybox, I did a Sys V init that tried to stick to what
LSB said should be done. I can't recall why it was not accepted into
busybox. Possibly someone had done their own favourite style of init
and Sys V init was not desired. shrugs
It was described as "Clean room implementation of runlevel/init.d/SYS V
init applets for busybox, aiming for LSB compliance."
/me digs through his ancient projects ...
http://urunlevel.sourceforge.net/ Oh ick that looks awful on a wide
screen.
If nothing else, it might provide one person's take on the LSB
"standard", even if it was from seven years ago. One feature I added
was the ability to write init "scripts" in C as well as the usual shell
scripts. Toybox probably does not want that.
Not so much, no.

If sysv init is the way to go, writing one is easy enough. Working out
the correct design is generally the hard part, implementation is
comparatively trivial. (Of course you learn stuff during the
implementation that can affect the design. Something about plans not
surviving contact with the enemy. :)

Rob
