Arithmetic evaluation of negative numbers with base prefix

Jeremy Townshend
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64' -DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu' -DCONF_VENDOR='pc' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H   -I.  -I../. -I.././include -I.././lib  -Wdate-time -D_FORTIFY_SOURCE=2 -g -O2 -fdebug-prefix-map=/build/bash-vEMnMR/bash-4.4.18=. -fstack-protector-strong -Wformat -Werror=format-security -Wall -Wno-parentheses -Wno-format-security
uname output: Linux tower 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.4
Patch Level: 19
Release Status: release

Description:
        Unexpected and undocumented behaviour for arithmetic evaluation of negative numbers when prefixed with the optional "base#" (e.g. 10#${i}). The base prefix may be needed if the variable has a decimal integer value but might be zero-padded, otherwise it is at risk of being misinterpreted as an octal.  Where the variable holds a negative value, results are as you would expect (e.g. i=-1; echo $((10#${i})), returns -1) until you subtract (or unary minus) the variable.  This unexpected behaviour occurs even when numbers are used directly (as in the first part of the Repeat-By section to simplify) but in real world examples the number would be hidden in a variable requiring the optional "base#" prefix to ensure correct interpretation of its value.

Repeat-By:
        echo $((10#-1))   # -1 as expected
        echo $((0-10#1))  # -1 as expected
        echo $((0+10#-1)) # -1 as expected
        echo $((0-10#-1)) # -1 UNEXPECTED. Would expect 1.
        echo $((0--1))    # 1 as expected

        # Real world example:
        i=001
        echo $((3-10#${i})) # 2 as expected
        i=$((10#${i}-2)) # i's value decremented by 2 to -1
        echo $((3-10#${i})) # 2 UNEXPECTED. Would expect 4.
        echo $((3+10#${i})) # 2 as expected
        # Certainly wouldn't expect the last two expressions to have the same
        # result.



Re: Arithmetic evaluation of negative numbers with base prefix

Chet Ramey
On 6/14/19 10:19 AM, Jeremy Townshend wrote:

> Bash Version: 4.4
> Patch Level: 19
> Release Status: release
>
> Description:
> Unexpected and undocumented behaviour for arithmetic evaluation of negative numbers when prefixed with the optional "base#" (e.g. 10#${i}). The base prefix may be needed if the variable has a decimal integer value but might be zero-padded, otherwise it is at risk of being misinterpreted as an octal.  Where the variable holds a negative value, results are as you would expect (e.g. i=-1; echo $((10#${i})), returns -1) until you subtract (or unary minus) the variable.  This unexpected behaviour occurs even when numbers are used directly (as in the first part of the Repeat-By section to simplify) but in real world examples the number would be hidden in a variable requiring the optional "base#" prefix to ensure correct interpretation of its value.

I think the issue is that unary minus is an operator, not part of an
integer constant. That affects how expressions are parsed.

Couple that with the shell's arithmetic evaluator being helpful and
treating 10# as identical to 10#0, you can see how the results you
mark as `unexpected' are straightforward results of operator parsing.


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    [hidden email]    http://tiswww.cwru.edu/~chet/


Re: Arithmetic evaluation of negative numbers with base prefix

Ilkka Virta
In reply to this post by Jeremy Townshend
On 14.6. 17:19, Jeremy Townshend wrote:
> echo $((10#-1))   # -1 as expected

Earlier discussion about the same on bug-bash:
https://lists.gnu.org/archive/html/bug-bash/2018-07/msg00015.html

Bash doesn't support the minus (or plus) sign following the 10#.
I think the expression above seems to work in this case because 10# is
treated as a constant number by itself (with a value of 0), and then the
1 is subtracted.

Try also, e.g.:

   $ echo $((10#))
   0

> echo $((0-10#-1)) # -1 UNEXPECTED. Would expect 1.

So this is 0-0-1 = -1
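
The parse described above can be reproduced on any bash version by writing the implicit zero out by hand. This is only a sketch: it uses `10#0` in place of the bare `10#`, on the assumption that a bash which accepts the digitless prefix treats it as the constant 0, as reported in this thread. The variable names r1 to r3 are illustrative.

```shell
#!/usr/bin/env bash
# "0-10#-1" is tokenised as 0 - (10#) - 1, i.e. the "-" after "10#" is a
# binary minus, not a sign. Writing the implicit zero as 10#0 makes the
# same parse explicit without relying on the digitless prefix.
r1=$(( 0 - 10#0 - 1 ))   # how 0-10#-1 was evaluated
r2=$(( 3 - 10#0 - 1 ))   # how 3-10#${i} was evaluated with i=-1
r3=$(( 3 + 10#0 - 1 ))   # how 3+10#${i} was evaluated with i=-1
echo "$r1 $r2 $r3"       # prints: -1 2 2
```

The last two results are equal because the operator in front of `10#` only ever applies to the zero.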

--
Ilkka Virta / [hidden email]


Re: Arithmetic evaluation of negative numbers with base prefix

Jeremy Townshend
In reply to this post by Chet Ramey
Dear Chet

Many thanks for your impressively swift response.  It is enlightening to see
how these expressions are parsed.

For the record, whilst I can now see how they are parsed, it feels
particularly unsatisfactory that the following two expressions yield the same
result when the variable i happens to have unwittingly been decremented below
zero (by bash arithmetic evaluation by the way - not by the output of some
external command):

  echo $((3-10#${i}))
  echo $((3+10#${i}))

As you indicate, this is caused by 10# being parsed as zero.  That silent
assumption of zero effectively then also silently nullifies/swallows the
preceding operator.

Ilkka Virta's email helpfully pointed me to a somewhat related debate that
occurred about 11 months ago.  I agree with your comment in this debate:

  "There would be a good case for rejecting the '10#' because it's missing
  the value."

It is this silently proceeding with a plausible (but undesirable) output in
such cases which is especially concerning.

In the meantime it would seem cautionary to advise against the pitfall of
using base# prefixed to variables (contrary to
mywiki.wooledge.org/ArithmeticExpression) unless you can be confident that
they will never be decremented below zero.
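
One possible way to sidestep the pitfall is to keep the sign outside the base prefix. The following is a minimal sketch, not something proposed in this thread: `dec_value` is a hypothetical helper name, and it assumes the input is an optionally signed string of decimal digits.

```shell
#!/usr/bin/env bash
# dec_value is a hypothetical helper (not part of bash): it prints the
# decimal value of a possibly signed, possibly zero-padded integer string.
dec_value() {
    local s=$1 sign=''
    case $s in
        -*) sign='-'; s=${s#-} ;;
        +*) s=${s#+} ;;
    esac
    # Require at least one decimal digit so a bare "10#" is never evaluated.
    [[ $s =~ ^[0-9]+$ ]] || { echo "dec_value: not a decimal integer: $1" >&2; return 1; }
    # The sign sits in front of the constant, where it is parsed as unary minus.
    echo $(( ${sign}10#${s} ))
}

i=001
i=$(( $(dec_value "$i") - 2 ))      # i is now -1
echo $(( 3 - $(dec_value "$i") ))   # prints 4
echo $(( 3 + $(dec_value "$i") ))   # prints 2
```

With the sign handled separately, the subtraction and the addition no longer give the same result when i is negative.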

At the very least it would be helpful if the manual reflected that 10#
followed by anything other than a digit ([0-9a-zA-Z@_]) is parsed as zero, and
clarified more completely the constraints of "number" for "n" in the "base#"
paragraph.

I cannot find anywhere else in the manual where the word "number", "numeric
value" or "integer" excludes values less than zero without explicitly stating
so.  On the other hand phrases like "[if] ...  number/numeric values less than
zero", "if ...  [not] a number greater than [or equal to] zero" are used
repeatedly.  In those cases "number" clearly doesn't exclude those less than
zero.


Jeremy Townshend



Re: Arithmetic evaluation of negative numbers with base prefix

Greg Wooledge
On Mon, Jun 17, 2019 at 02:30:27PM +0100, Jeremy Townshend wrote:
> In the meantime it would seem cautionary to advise against the pitfall of
> using base# prefixed to variables (contrary to
> mywiki.wooledge.org/ArithmeticExpression) unless you can be confident that
> they will never be decremented below zero.

Fair point.  I've updated <https://mywiki.wooledge.org/ArithmeticExpression>
and <https://mywiki.wooledge.org/BashPitfalls>.


Re: Arithmetic evaluation of negative numbers with base prefix

Ilkka Virta
On 17.6. 18:47, Greg Wooledge wrote:
> On Mon, Jun 17, 2019 at 02:30:27PM +0100, Jeremy Townshend wrote:
>> In the meantime it would seem cautionary to advise against the pitfall of
>> using base# prefixed to variables (contrary to
>> mywiki.wooledge.org/ArithmeticExpression) unless you can be confident that
>> they will never be decremented below zero.
>
> Fair point.  I've updated <https://mywiki.wooledge.org/ArithmeticExpression>
> and <https://mywiki.wooledge.org/BashPitfalls>.

Good!

I still wish this could be fixed to do the useful thing without any
workarounds, given it's what ksh and zsh do. Since this is the second
time it has come up on the list, it appears to be surprising to users,
too.

The <base># prefix is already an extension of the C numeric constant
syntax, so extending it further to include an optional sign wouldn't
seem inappropriate.


I took a look last night and made some sort of a patch. It seems to
work, though I'm not sure if I've missed any corner cases. The behaviour
matches ksh and zsh except for the digitless '10#': I made it an error,
whereas they apparently allow it.

   $ cat test.sh
   echo $(( 10 * 10#-123 ))  # -1230
   echo $(( 10 * 10#-008 ))  #   -80
   echo $(( 10 * 10#1+23 ))  #    10*1 + 23 = 33
   echo $(( 10# ))           #  error

   $ ./bash test.sh
   -1230
   -80
   33
   test.sh: line 5: 10#: no digits in number (error token is "10#")

   $ ksh test.sh
   -1230
   -80
   33
   0


--
Ilkka Virta / [hidden email]

Attachment: expr-allow-sign.c (1K)

Re: Arithmetic evaluation of negative numbers with base prefix

Chet Ramey
On 6/18/19 1:52 AM, Ilkka Virta wrote:

> I still wish this could be fixed to do the useful thing without any
> workarounds, given it's what ksh and zsh do

I'm surprised people keep saying this.

$ ksh93 -c 'echo ${.sh.version}'
Version ABIJM 93v- 2014-09-29
$ ksh93 -c 'echo $(( 10# ))'
ksh93:  10# : arithmetic syntax error
$ ksh93 -c 'echo $(( 10#-4 ))'
ksh93:  10#-4 : arithmetic syntax error




Re: Arithmetic evaluation of negative numbers with base prefix

Greg Wooledge
On Tue, Jun 18, 2019 at 10:27:48AM -0400, Chet Ramey wrote:
> $ ksh93 -c 'echo ${.sh.version}'
> Version ABIJM 93v- 2014-09-29
> $ ksh93 -c 'echo $(( 10# ))'
> ksh93:  10# : arithmetic syntax error

I guess most Linux distributions are not shipping the 2014 version of
ksh93 yet...?

wooledg:~$ ksh -c 'echo $(( 10# ))'
0
wooledg:~$ dpkg -l ksh | tail -1
ii  ksh            93u+20120801-3.4 amd64        Real, AT&T version of the Korn shell
wooledg:~$ ksh -c 'echo ${.sh.version}'
Version AJM 93u+ 2012-08-01

Seems kinda weird to continue calling it "ksh93" if it's being changed,
but I don't make the decisions.


Re: Arithmetic evaluation of negative numbers with base prefix

Ilkka Virta
On 18.6. 18:20, Greg Wooledge wrote:
> On Tue, Jun 18, 2019 at 10:27:48AM -0400, Chet Ramey wrote:
>> $ ksh93 -c 'echo ${.sh.version}'
>> Version ABIJM 93v- 2014-09-29
>> $ ksh93 -c 'echo $(( 10# ))'
>> ksh93:  10# : arithmetic syntax error
>
> I guess most Linux distributions are not shipping the 2014 version of
> ksh93 yet...?

Yeah, I had the one from Debian. I'm not even sure what the current
version of ksh is.

At least the newer versions throw an error instead of silently doing the
unexpected.

> wooledg:~$ ksh -c 'echo $(( 10# ))'
> 0
> wooledg:~$ dpkg -l ksh | tail -1
> ii  ksh            93u+20120801-3.4 amd64        Real, AT&T version of the Korn shell
> wooledg:~$ ksh -c 'echo ${.sh.version}'
> Version AJM 93u+ 2012-08-01
>
> Seems kinda weird to continue calling it "ksh93" if it's being changed,
> but I don't make the decisions.
>


--
Ilkka Virta / [hidden email]


Re: Arithmetic evaluation of negative numbers with base prefix

Chet Ramey
In reply to this post by Jeremy Townshend
On 6/17/19 9:30 AM, Jeremy Townshend wrote:

> Ilkka Virta's email helpfully pointed me to a somewhat related debate that
> occurred about 11 months ago.  I agree with your comment in this debate:
>
>   "There would be a good case for rejecting the '10#' because it's missing
>   the value."

I'll probably do that for bash-5.1. The code is in there and tagged for
later.

>
> It is this silently proceeding with a plausible (but undesirable) output in
> such cases which is especially concerning.

Trying to be helpful rarely works out in every case.

> I cannot find anywhere else in the manual where the word "number", "numeric
> value" or "integer" excludes values less than zero without explicitly stating
> so.  On the other hand phrases like "[if] ...  number/numeric values less than
> zero", "if ...  [not] a number greater than [or equal to] zero" are used
> repeatedly.  In those cases "number" clearly doesn't exclude those less than
> zero.

I'm not sure how relevant that language is to integer constants in
expressions. I could also note that the language describing the base#n
syntax only talks about digits, letters, `@', and `_'.

The bash definition of arithmetic evaluation is taken from C. That includes
integer constants, and, while the base#value syntax clearly extends the C
definition of a constant,  the `-' (and `+', FWIW) is still an operator as
defined by C.
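
To illustrate the point that the sign is an operator rather than part of the constant, here is a small sketch with the minus written in front of the base prefix. The variable names a, b and c are illustrative only, and the spaces between the two minus signs are there to keep them as separate tokens.

```shell
#!/usr/bin/env bash
# The constant is 10#08 (decimal 8, despite the leading zero); the "-" in
# front of it is a unary operator applied to that constant.
a=$(( -10#08 ))       # unary minus on the constant 8
b=$(( 3 - -10#08 ))   # binary minus, then unary minus: 3 - (-8)
c=$(( 3 + -10#08 ))   # 3 + (-8)
echo "$a $b $c"       # prints: -8 11 -5
```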

Chet


Re: Arithmetic evaluation of negative numbers with base prefix

Chet Ramey
In reply to this post by Greg Wooledge
On 6/18/19 11:20 AM, Greg Wooledge wrote:

> Seems kinda weird to continue calling it "ksh93" if it's being changed,
> but I don't make the decisions.

Korn once explained it as the "ksh93 language definition." So there are
multiple implementations of that language.



Re: Arithmetic evaluation of negative numbers with base prefix

Chet Ramey
In reply to this post by Ilkka Virta
On 6/18/19 12:13 PM, Ilkka Virta wrote:

> On 18.6. 18:20, Greg Wooledge wrote:
>> On Tue, Jun 18, 2019 at 10:27:48AM -0400, Chet Ramey wrote:
>>> $ ksh93 -c 'echo ${.sh.version}'
>>> Version ABIJM 93v- 2014-09-29
>>> $ ksh93 -c 'echo $(( 10# ))'
>>> ksh93:  10# : arithmetic syntax error
>>
>> I guess most Linux distributions are not shipping the 2014 version of
>> ksh93 yet...?
>
> Yeah, I had the one from Debian. I'm not even sure what the current version
> of ksh is.

It's unclear. When Korn (and Fowler) left AT&T, and AT&T closed down the
AST software repository and it moved to GitHub, the development status
got murky. There are people committing to that repository, but I don't
believe Korn is involved.
