Discussion:
[PATCH] Support the %N coreutils/busybox extension to date.
enh
2017-04-15 18:17:04 UTC
Used by the toybox touch test suite (and thus necessary to pass the toybox
tests on a toybox-only system).
---
tests/date.test | 10 ++++++++++
toys/posix/date.c | 50 +++++++++++++++++++++++++++++++++++++++-----------
2 files changed, 49 insertions(+), 11 deletions(-)
Rob Landley
2017-04-16 20:28:28 UTC
Post by enh
Used by the toybox touch test suite (and thus necessary to pass the toybox
tests on a toybox-only system).
Did I ever actually post my rant on this topic? Or just compose it and
save as draft.

This is really really libc's job, it's a pity the gnu guys didn't
coordinate with the glibc developers. (On the other hand, I suppose
that's why glibc wasn't worse.)

Yes "struct tm" doesn't have nanoseconds, and they don't want to make it
bigger because binary compatibility. But lp64 says int is 32 bits, so
they could stick the nanoseconds in the middle 30 bits of tm_isdst (old
0 and 1 inputs only flip the bottom bit, -1 sets the top bit, middle 30
bits are basically unused and can store 0-999999999; old code that sets
the field to -1 means nanoseconds are over 999999999 and you can treat
that as 0). Then strptime and strftime could both have access to
nanoseconds and could both implement %N. It's a little awkward but more
or less backwards compatible.
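
A minimal sketch of the bit layout described above, assuming a 32-bit int. The helper names (tm_set_nsec, tm_get_nsec, tm_get_dst) are made up for illustration; no libc implements this. Bit 0 stays the DST flag, bit 31 stays the sign bit, and bits 1 through 30 carry the nanoseconds. A field that old code set to -1 decodes to a value above 999999999 and is read back as 0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical helpers for illustration only; no libc uses this layout. */
_Static_assert(sizeof(int) == 4, "sketch assumes a 32-bit int");

#define NSEC_LIMIT 999999999u
#define KEEP_MASK  0x80000001u  /* sign bit (31) and DST bit (0) */
#define NSEC_MASK  0x3fffffffu  /* the 30 bits in between */

/* Store nanoseconds in bits 1..30, leaving bits 0 and 31 untouched. */
static void tm_set_nsec(struct tm *tm, uint32_t nsec)
{
  uint32_t bits = (uint32_t)tm->tm_isdst;

  assert(nsec <= NSEC_LIMIT);
  bits = (bits & KEEP_MASK) | (nsec << 1);
  memcpy(&tm->tm_isdst, &bits, sizeof(bits));
}

/* Read nanoseconds back; out-of-range values (such as the all-ones
 * pattern left by tm_isdst = -1) are treated as 0. */
static uint32_t tm_get_nsec(const struct tm *tm)
{
  uint32_t nsec = ((uint32_t)tm->tm_isdst >> 1) & NSEC_MASK;

  return nsec > NSEC_LIMIT ? 0 : nsec;
}

/* Recover the classic tri-state: -1 unknown, 0 standard time, 1 DST. */
static int tm_get_dst(const struct tm *tm)
{
  uint32_t bits = (uint32_t)tm->tm_isdst;

  if (bits & 0x80000000u) return -1;
  return (int)(bits & 1);
}
```

The awkward part is visible here: old code that tests tm_isdst > 0 directly would see a nonzero field whenever nanoseconds are stored, so callers would have to go through something like tm_get_dst.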

I already have date.c accepting nanoseconds input in its
@unixtime.fraction format. I pondered doing it for the
MMDDhhmm[[CC]YY][.ss] one but it doesn't OUTPUT that format which is
just sad. (Be symmetrical, people!) I've pondered _adding_ a flag to
make date output that, but "toybox tests require toybox commands" is a
bridge I haven't crossed yet.

Speaking of which, this patch adds %N to output but not to input, so
date can still only _set_ nanoseconds using the @123456789.123456789
syntax. And you're not checking that "%%N is a literal %N". (Sigh. If
we're going there, may I draw your attention to next_printf() in
lib/lib.c? Example usage in seq and stat.)
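
To illustrate the %%N point, here is a rough sketch of a scanner that finds the first real %N in a format string while treating %% as a literal percent sign. It is not the patch's code and does not use next_printf(); find_nano_escape is a hypothetical name, and accepting only decimal digits between % and N is a simplifying assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: return a pointer to the '%' that starts the first
 * %N (optionally %<digits>N) escape in fmt, or NULL if there is none.
 * "%%" is consumed as a literal percent, so "%%N" does not match. */
static const char *find_nano_escape(const char *fmt)
{
  assert(fmt != NULL);

  while (*fmt) {
    const char *start;

    if (*fmt != '%') {
      fmt++;
      continue;
    }
    start = fmt++;
    if (*fmt == '%') {          /* literal percent, skip both characters */
      fmt++;
      continue;
    }
    while (isdigit((unsigned char)*fmt)) fmt++;   /* optional width */
    if (*fmt == 'N') return start;
    if (*fmt) fmt++;            /* some other escape, skip its letter */
  }

  return NULL;
}
```

With this, "%S.%N" matches at the second percent sign, while "%%N" returns NULL and would be left for strftime() to turn into a literal "%N".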

It's kind of sad "ls" can't show nanoseconds. Or "stat". I understand
why (nanoseconds were added when makefile dependencies started breaking
as computers got faster and multiple build things happened in the same
second; nothing else actually _needed_ them in the filesystem) but it's
still half-assed.

Grumble grumble...

Rob
Rob Landley
2017-04-16 20:51:23 UTC
Post by Rob Landley
It's kind of sad "ls" can't show nanoseconds.
It's ls --full-time (which implies -l)

Added to the todo heap...

Rob
enh
2017-04-16 20:51:07 UTC
Post by Rob Landley
Post by enh
Used by the toybox touch test suite (and thus necessary to pass the toybox
tests on a toybox-only system).
Did I ever actually post my rant on this topic? Or just compose it and
save as draft.
This is really really libc's job, it's a pity the gnu guys didn't
coordinate with the glibc developers. (On the other hand, I suppose
that's why glibc wasn't worse.)
yeah, definitely a mess.

note that i only did this because (a) existing tests were using it and (b)
busybox has it too. i've been assuming (perhaps incorrectly) that busybox
also tried to avoid needlessly implementing GNU extensions unless they were
actually useful to enough people.
Post by Rob Landley
Yes "struct tm" doesn't have nanoseconds, and they don't want to make it
bigger because binary compatibility. But lp64 says int is 32 bits, so
they could stick the nanoseconds in the middle 30 bits of tm_isdst (old
0 and 1 inputs only flip the bottom bit, -1 sets the top bit, middle 30
bits are basically unused and can store 0-999999999; old code that sets
the field to -1 means nanoseconds are over 999999999 and you can treat
that as 0). Then strptime and strftime could both have access to
nanoseconds and could both implement %N. It's a little awkward but more
or less backwards compatible.
I already have date.c accepting nanoseconds input in its
@unixtime.fraction format. I pondered doing it for the
MMDDhhmm[[CC]YY][.ss] one but it doesn't OUTPUT that format which is
just sad. (Be symmetrical, people!) I've pondered _adding_ a flag to
make date output that, but "toybox tests require toybox commands" is a
bridge I haven't crossed yet.
Speaking of which, this patch adds %N to output but not to input, so
date can still only _set_ nanoseconds using the @123456789.123456789
syntax.
isn't setting via arbitrary format a toybox extension?
Post by Rob Landley
And you're not checking that "%%N is a literal %N". (Sigh. If
we're going there, may I draw your attention to next_printf() in
lib/lib.c? Example usage in seq and stat.)
okay, i'll switch it over (assuming we actually want %N).
Post by Rob Landley
It's kind of sad "ls" can't show nanoseconds. Or "stat". I understand
why (nanoseconds were added when makefile dependencies started breaking
as computers got faster and multiple build things happened in the same
second; nothing else actually _needed_ them in the filesystem) but it's
still half-assed.
~$ ls -ld --time-style=full-iso /
drwxr-xr-x 25 root root 4096 2017-04-10 08:08:16.298419414 -0700 /
~$ stat /
File: ‘/’
Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: fc01h/64513d Inode: 2 Links: 25
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2017-04-16 13:35:50.436148625 -0700
Modify: 2017-04-10 08:08:16.298419414 -0700
Change: 2017-04-10 08:08:16.298419414 -0700
Birth: -
~$

i'm happy to rewrite the tests instead... i've never used %N personally,
and would just naturally use stat for this kind of thing.

as you say, the asymmetry is ugly, and the divergence from libc is ugly, so
if busybox added this "just because" maybe we shouldn't have it at all?
Post by Rob Landley
Grumble grumble...
Rob
--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
Rob Landley
2017-04-16 23:22:19 UTC
Post by enh
yeah, definitely a mess.
note that i only did this because (a) existing tests were using it and
(b) busybox has it too. i've been assuming (perhaps incorrectly) that
busybox also tried to avoid needlessly implementing GNU extensions
unless they were actually useful to enough people.
As far as I can tell the way people do this is indeed date %N. I've
tried to do this before, but adding extra escapes to both input and
output via an overlay mechanism turns out to be a hard problem to do right.

(Really I want a generic mechanism to wrap arbitrary printf/scanf style
things, but the problem space isn't necessarily set up that way. Can I
chop strptime parsing into multiple strptime calls that _only_ modify
certain fields each time while leaving the others alone? I was reading
the posix page and man page stuff about that at one point, something
said it only modified specific fields other than the ones you
specified... but alas, I got distracted halfway through...)

Checking my notes, I was attempting to add %s support to the input path.
Except that %s pretty much _replaces_ strptime() because none of the
other fields you set make sense in combination with that. (But %s could
be in the middle of a context string describing the rest of the input;
that I didn't make up, it's in the ubuntu one.) So %s is probably
fiddlier than %N. Hmmm...

In general I wince at merging half a solution to a larger problem,
because it increases the surface area of the problem. (People start
using the partial thing, then I break them when I try to fix the rest of
it...) That said, if I'm not doing the full fix now and the partial fix
is there and we need it... Sigh.
Post by enh
Speaking of which, this patch adds %N to output but not to input, so
date can still only _set_ nanoseconds using the @123456789.123456789
syntax.
isn't setting via arbitrary format a toybox extension?
Walter Harms added it to busybox in 2006 (back when I maintained it):

http://lists.busybox.net/pipermail/busybox/2006-February/018203.html
https://git.busybox.net/busybox/commit/?id=c5789a6e234c

I genericized his original idea a bit because "I want to convert data
from one format to another" is awkward to solve otherwise. "This email's
datestamp would be what in unix time"... I used to write little C
programs to do that.

I used to have a button that said "Only XT users know that January 1,
1980 was a tuesday". Toybox says:

$ ./date -D "%m %d %Y" -d "1 1 1980" +%A
Tuesday
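
For comparison, the kind of "little C program" this replaces might look roughly like the sketch below. It is not toybox's implementation; reformat_date is a hypothetical name. It parses with strptime(), calls mktime() to fill in the weekday (which interprets the fields in the local timezone), and prints with strftime().

```c
#define _XOPEN_SOURCE 700   /* for strptime() */
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: parse `in` according to in_fmt, then write it to
 * `out` according to out_fmt. Returns 0 on success, -1 on failure. */
static int reformat_date(const char *in_fmt, const char *in,
                         const char *out_fmt, char *out, size_t outlen)
{
  struct tm tm;
  char *end;

  assert(out != NULL && outlen > 0);

  memset(&tm, 0, sizeof(tm));
  tm.tm_mday = 1;       /* sane default if the format has no day */
  tm.tm_isdst = -1;     /* let mktime() decide about DST */

  end = strptime(in, in_fmt, &tm);
  if (!end || *end) return -1;          /* parse error or trailing junk */

  /* mktime() normalizes the fields and computes tm_wday and tm_yday. */
  if (mktime(&tm) == (time_t)-1) return -1;

  return strftime(out, outlen, out_fmt, &tm) ? 0 : -1;
}
```

Calling reformat_date("%m %d %Y", "1 1 1980", "%A", buf, sizeof(buf)) in the default C locale fills buf with the same weekday the date command above prints.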

If you can say %N in + but not in -D I'm doing it wrong. It's a tiny
thing, but it bothers me.
Post by enh
And you're not checking that "%%N is a literal %N". (Sigh. If
we're going there, may I draw your attention to next_printf() in
lib/lib.c? Example usage in seq and stat.)
okay, i'll switch it over (assuming we actually want %N).
Eh, we probably want it, I just winced at how expensive it was to
implement and it's the wrong place to do it (belongs in libc) and if
we're going there I want to do the full fix (get and set, and handling
%% and %4N) so it isn't inconsistent...

And it needs one of them Focused Review things that have been in short
supply recently. (The easter weekend's helped me to dig out a bit. If I
got a three day weekend every week, keeping up would be so much easier. :)
Post by enh
It's kind of sad "ls" can't show nanoseconds. Or "stat". I understand
why (nanoseconds were added when makefile dependencies started breaking
as computers got faster and multiple build things happened in the same
second; nothing else actually _needed_ them in the filesystem) but it's
still half-assed.
~$ ls -ld --time-style=full-iso /
--time-style is a can of worms (I made a design decision _not_ to
support more than one output format, now there's a straightforward
extension of the existing output format we want to add, but if I use the
generic named selection mechanism for it the pressure to support all the
rest of 'em will never stop. I lose the ability to draw a clear boundary
line.)

But --full-time is a synonym, which I can add _and_ make -ll do it too.
(Repeat -l and it shows nanoseconds. I like having short options for
things, but the letters are all taken here. Still, logically it's a
"long long"... :)

My concern here is maximizing the surface area of "I can write a script
that runs under toybox and under ubuntu as well." Such as the current
test suite stuff we're trying to do. I can write one using --full-time
and it works in both contexts.

The problem with using ls for the touch test code is "ls -l" shows
ownership information also, varying between systems. You can't output
_just_ the info you want without that. We can do the awk/cut trick to
filter it and only look at specific fields (todo: can we have spaces or
tabs in usernames? You can in the realname field...).

tl;dr there's still a call for "date %N".
Post by enh
i'm happy to rewrite the tests instead... i've never used %N personally,
and would just naturally use stat for this kind of thing.
How do you get nanoseconds out of stat? (I'm happy to add a toybox
extension for it, but relying on that breaks TEST_HOST or "make
test_sed" on a non-toybox host.)
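
For reference, the data itself is available from stat(2): POSIX.1-2008 defines st_mtim as a struct timespec with a tv_nsec field. A small sketch follows, with file_mtime as a hypothetical helper name; how a stat command would expose this as a numeric-only format is a separate question that this does not answer.

```c
#define _POSIX_C_SOURCE 200809L   /* for st_mtim */
#include <assert.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: copy a file's modification time, including the
 * nanoseconds part, into *out. Returns 0 on success, -1 on failure. */
static int file_mtime(const char *path, struct timespec *out)
{
  struct stat st;

  assert(path != NULL && out != NULL);

  if (stat(path, &st) != 0) return -1;
  *out = st.st_mtim;    /* tv_sec plus tv_nsec in 0..999999999 */

  return 0;
}
```

Whether tv_nsec is ever nonzero depends on the filesystem's timestamp granularity.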
Post by enh
as you say, the asymmetry is ugly, and the divergence from libc is ugly,
so if busybox added this "just because" maybe we shouldn't have it at all?
It was useful, and still is. (Especially the ability to specify
different input and output formats so you can convert between them.) And
it's already implemented and working. :)

Rob
enh
2017-04-17 16:42:38 UTC
take 2 (switches to next_printf, fixes the %%N case):

tests/date.test | 14 ++++++++++++++
toys/posix/date.c | 50 +++++++++++++++++++++++++++++++++++++++-----------
2 files changed, 53 insertions(+), 11 deletions(-)
Post by Rob Landley
Post by enh
yeah, definitely a mess.
note that i only did this because (a) existing tests were using it and
(b) busybox has it too. i've been assuming (perhaps incorrectly) that
busybox also tried to avoid needlessly implementing GNU extensions
unless they were actually useful to enough people.
As far as I can tell the way people do this is indeed date %N. I've
tried to do this before, but adding extra escapes to both input and
output via an overlay mechanism turns out to be a hard problem to do right.
(Really I want a generic mechanism to wrap arbitrary printf/scanf style
things, but the problem space isn't necessarily set up that way. Can I
chop strptime parsing into multiple strptime calls that _only_ modify
certain fields each time while leaving the others alone? I was reading
the posix page and man page stuff about that at one point, something
said it only modified specific fields other than the ones you
specified... but alas, I got distracted halfway through...)
one of the things i liked in Plan 9:
http://man.cat-v.org/plan_9/2/fmtinstall
Post by Rob Landley
Checking my notes, I was attempting to add %s support to the input path.
Except that %s pretty much _replaces_ strptime() because none of the
other fields you set make sense in combination with that. (But %s could
be in the middle of a context string describing the rest of the input;
that I didn't make up, it's in the ubuntu one.) So %s is probably
fiddlier than %N. Hmmm...
In general I wince at merging half a solution to a larger problem,
because it increases the surface area of the problem. (People start
using the partial thing, then I break them when I try to fix the rest of
it...) That said, if I'm not doing the full fix now and the partial fix
is there and we need it... Sigh.
plus no-one else supports round trips as far as i can tell, so demand
hasn't been that great.
Post by Rob Landley
Post by enh
Speaking of which, this patch adds %N to output but not to input, so
date can still only _set_ nanoseconds using the @123456789.123456789
syntax.
isn't setting via arbitrary format a toybox extension?
Walter Harms added it to busybox in 2006 (back when I maintained it):
http://lists.busybox.net/pipermail/busybox/2006-February/018203.html
https://git.busybox.net/busybox/commit/?id=c5789a6e234c
I genericized his original idea a bit because "I want to convert data
from one format to another" is awkward to solve otherwise. "This email's
datestamp would be what in unix time"... I used to write little C
programs to do that.
I used to have a button that said "Only XT users know that January 1,
1980 was a tuesday". Toybox says:
$ ./date -D "%m %d %Y" -d "1 1 1980" +%A
Tuesday
(given that this question comes up in real life more than "show me the
whole calendar for 1856" or "show me the calendar for October 1856", i
never understood why cal will show you a whole year or a whole month but
not a single day.)
Post by Rob Landley
If you can say %N in + but not in -D I'm doing it wrong. It's a tiny
thing, but it bothers me.
Post by enh
And you're not checking that "%%N is a literal %N". (Sigh. If
we're going there, may I draw your attention to next_printf() in
lib/lib.c? Example usage in seq and stat.)
okay, i'll switch it over (assuming we actually want %N).
Eh, we probably want it, I just winced at how expensive it was to
implement and it's the wrong place to do it (belongs in libc) and if
we're going there I want to do the full fix (get and set, and handling
%% and %4N) so it isn't inconsistent...
And it needs one of them Focused Review things that have been in short
supply recently. (The easter weekend's helped me to dig out a bit. If I
got a three day weekend every week, keeping up would be so much easier. :)
no extra day off here :-( at the other end of the spectrum, the London
office have a four-day weekend...
Post by Rob Landley
Post by enh
It's kind of sad "ls" can't show nanoseconds. Or "stat". I understand
why (nanoseconds were added when makefile dependencies started breaking
as computers got faster and multiple build things happened in the same
second; nothing else actually _needed_ them in the filesystem) but it's
still half-assed.
~$ ls -ld --time-style=full-iso /
--time-style is a can of worms (I made a design decision _not_ to
support more than one output format, now there's a straightforward
extension of the existing output format we want to add, but if I use the
generic named selection mechanism for it the pressure to support all the
rest of 'em will never stop. I lose the ability to draw a clear boundary
line.)
But --full-time is a synonym, which I can add _and_ make -ll do it too.
(Repeat -l and it shows nanoseconds. I like having short options for
things, but the letters are all taken here. Still, logically it's a
"long long"... :)
My concern here is maximizing the surface area of "I can write a script
that runs under toybox and under ubuntu as well." Such as the current
test suite stuff we're trying to do. I can write one using --full-time
and it works in both contexts.
The problem with using ls for the touch test code is "ls -l" shows
ownership information also, varying between systems. You can't output
_just_ the info you want without that. We can do the awk/cut trick to
filter it and only look at specific fields (todo: can we have spaces or
tabs in usernames? You can in the realname field...).
tl;dr there's still a call for "date %N".
Post by enh
i'm happy to rewrite the tests instead... i've never used %N personally,
and would just naturally use stat for this kind of thing.
How do you get nanoseconds out of stat? (I'm happy to add a toybox
extension for it, but relying on that breaks TEST_HOST or "make
test_sed" on a non-toybox host.)
well, the _text_ form includes nanoseconds. but i guess you meant the
numeric-only form, in which case, no, i'm not aware of a way to do that.
Post by Rob Landley
Post by enh
as you say, the asymmetry is ugly, and the divergence from libc is ugly,
so if busybox added this "just because" maybe we shouldn't have it at
all?
It was useful, and still is. (Especially the ability to specify
different input and output formats so you can convert between them.) And
it's already implemented and working. :)
Rob
--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
enh
2017-05-09 15:59:25 UTC
what's the plan? "half an eye is better than no eye" (especially when
no-one else has this particular missing eye), or "rewrite the test to
use --full-time"?

since i personally have never needed %N except for getting this test
to run, i'm happy to do either.
Rob Landley
2017-05-11 15:05:36 UTC
Post by enh
what's the plan? "half an eye is better than no eye" (especially when
no-one else has this particular missing eye), or "rewrite the test to
use --full-time"?
since i personally have never needed %N except for getting this test
to run, i'm happy to do either.
Sigh. %N is a good feature to have but 30 lines of code to add it makes
me wince, and it's one of those "eh, both sides have a strong argument"...

Lemme see what I can clean up here...

(You're right that --full-time is already in, and tests rewritten to use
that would be merged as-is. But that's not a reason _not_ to have %N,
nanoseconds are a thing these days and "date" should support them. Only
supporting them in one direction makes me sad, hwclock should support
them too, which usually involves spinning and re-reading to find the
edge if I recall... Did I already do that? I have a note but which todo
list it would be in... "ls todo*.* */todo*.* */*/todo*.* | wc -l" finds
50 files. "Collate todo lists" is a perennial todo item...)

Rob
Rob Landley
2017-05-19 17:25:51 UTC
Post by enh
what's the plan? "half an eye is better than no eye" (especially when
no-one else has this particular missing eye), or "rewrite the test to
use --full-time"?
since i personally have never needed %N except for getting this test
to run, i'm happy to do either.
I've had the window open for a while, unhappy with the amount of code
added for a seemingly small feature, so I tried to rewrite your commit
to pass through the %xxxN stuff as %xxxld (because that's what
next_printf() does), and worked out that %*N uses completely bespoke
logic to do something nothing else does. (The FSF!)

Which means that using next_printf() is silly, and your first version
would be right... except that didn't handle %%N. So I wrote a simple
version that looped through, found the %N, did the strftime() into
toybuf with length sizeof(toybuf)-10, appended sprintf(blah, "%09u",
TT.nano), and then set blah[width] = 0 to trim it before printing.

And you have tests for %10N and %20N. Inserting zeroes. Which means that
no amount of space I reserve at the end of toybuf is enough.

So, I went to a two step approach where it prints out the strftime data
and then does a printf("%u", TT.nano) except I can't come _up_ with a
%*u syntax that will truncate an integer and give me just the first few
digits. It'll only do that with _strings_, not with numbers.
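
The string-based workaround can be sketched as follows: format the number with %09u first, then use a string precision to keep only the leading digits. This is an illustration, not the code that went into date.c, and format_nano is a hypothetical name. For widths above nine the thread only says zeroes get inserted; padding them on the left is an assumption here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write `nano` into `out` the way a %<width>N escape
 * might. Widths 1..9 keep only the leading digits, width <= 0 means 9,
 * and wider fields are zero-padded on the left (an assumption).
 * Returns the number of characters written, or -1 on error. */
static int format_nano(char *out, size_t outlen, unsigned nano, int width)
{
  char digits[16];

  assert(out != NULL);

  if (nano > 999999999u) return -1;
  if (width <= 0) width = 9;
  if ((size_t)width >= outlen) return -1;   /* need room for the NUL */

  snprintf(digits, sizeof(digits), "%09u", nano);   /* always 9 digits */

  if (width <= 9) {
    /* A precision truncates strings, which it will not do for integers. */
    snprintf(out, outlen, "%.*s", width, digits);
  } else {
    memset(out, '0', (size_t)(width - 9));
    memcpy(out + (width - 9), digits, 10);  /* 9 digits plus the NUL */
  }

  return width;
}
```

Because the caller supplies the buffer and its size, an arbitrarily wide field such as %20N fails cleanly instead of overrunning a fixed reserve at the end of a shared buffer.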

I really, really hate the way the FSF implemented this feature. It is
bespoke crap that does not act like existing printf() acts, requiring
special case code to implement it for NO OBVIOUS REASON.

(Haven't forgotten, just every time I sit down to deal with this I
tackle the code again rather than the email...)

Rob
