RFE: Fix the name collision issues and typing issues with namerefs, improve various issues for function libraries

tetsujin

This is relevant to my interests!

So first off, the circular reference problem with "declare -n"
apparently doesn't exist in Korn Shell: If you "typeset -n x=$1", and
$1 is "x", it's not a circular reference. The nameref seems to take
the origin of the name into account when creating the nameref: If $1
is used to provide the name for a nameref within a function, then the
nameref refers to a variable in the scope of the function's caller.
But if the name for the nameref is taken from another parameter, or a
string literal, then the variable is drawn from the scope of the
function itself. It seems a little "black magic" in a way but it has
the benefit of neatly matching some important common use-cases. I
think Bash should either adopt something similar, or provide another
mechanism to exclude function-local variables from nameref
construction.

...And I could be wrong here (as I haven't used namerefs much in bash)
but it seems you can't use a nameref to change the type of a
variable...  For instance if "x" is local to f1, and f1 calls "f2 x",
and f2 declares a nameref that resolves to its argument "x", f2 cannot
then change "x" to an associative array, for instance. The caller has
to declare the variable with the proper type before passing it into
the function as a nameref.
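
A minimal sketch of that workaround, assuming bash 4.3 or later. The names fill_map and __fill_map_ref are made up for illustration, and the prefix on the nameref is only there to make a collision with the caller's names unlikely:

```bash
# The callee fills an associative array that the caller has already typed.
fill_map() {
    # Hypothetical prefixed name, chosen to avoid clashing with the caller.
    local -n __fill_map_ref=$1
    __fill_map_ref[alpha]=1
    __fill_map_ref[beta]=2
}

main() {
    local -A map=()    # the caller fixes the type before the call
    fill_map map
    echo "${map[alpha]} ${map[beta]}"
}

main    # prints: 1 2
```

The function only ever assigns elements; the "declare -A" stays with the caller.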

Regarding return values, there's a couple ways of looking at this:

First, a common way of dealing with the situation is to have functions
emit their results to stdout, and then capture the data with command
substitution. This is a bit problematic because command substitution
occurs in a subshell environment. (This is actually mandated in POSIX,
it seems - which is unfortunate, because there's actually no need for
it to be a subshell environment! Korn Shell offers a "${ cmd; }" form
of command substitution that is evaluated in the main shell environment -
which struck me as odd at first, but it actually makes sense for these
kinds of scenarios, where you want to run something that has a
side-effect, and capture its output.)  This means a function call or
built-in can't create side-effects in the shell environment (setting
shell variables or opening/closing files) AND emit a result on stdout for
capture. It's strictly one or the other.
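
To make the "one or the other" point concrete, here is a small sketch in bash. The names next_id and counter are invented for the example:

```bash
counter=0

# Tries to do both: update shell state AND emit a result on stdout.
next_id() {
    counter=$((counter + 1))
    echo "id-$counter"
}

id=$(next_id)          # the function body runs in a subshell here
echo "$id $counter"    # prints: id-1 0
```

The output is captured, but the increment happened in the subshell and is gone by the time the parent shell looks at counter.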

Second, with namerefs in the picture, one could use them to tell a
function where to store its return value. It doesn't look "functional"
but it gets the job done, right? But again, there's the various
problems with namerefs in Bash presently. Name collisions can lead to
circular references, and the type of the var can't be changed using
the nameref.
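
A sketch of that pattern, assuming bash 4.3 or later. The function name get_version and the nameref name __get_version_out are hypothetical; the long prefixed name is just the usual convention for dodging collisions, not a real fix:

```bash
# The caller passes the NAME of the variable that should receive the result.
get_version() {
    local -n __get_version_out=$1    # nameref to the caller's variable
    __get_version_out="1.2.3"
}

main() {
    local ver
    get_version ver
    echo "version=$ver"
}

main    # prints: version=1.2.3
```

No subshell is involved, so the function could also change other shell state while "returning" its value.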

Third, of course, functions could simply "return" values by
populating a variable in the caller's scope. I don't think it's a
great solution.
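
For reference, a sketch of what that looks like in bash, relying on dynamic scoping. The variable name result is an arbitrary convention picked for this example, which is exactly the weakness: caller and callee have to agree on it out of band:

```bash
# The callee writes to a variable it never declared.
get_greeting() {
    result="hello, $1"
}

main() {
    local result          # dynamic scoping: the callee's assignment lands here
    get_greeting world
    echo "$result"
}

main                      # prints: hello, world
echo "${result-unset}"    # prints: unset (the global was never touched)
```

If main forgot the "local result", the same call would quietly create or overwrite a global instead.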

And finally, with respect to creating Bash "libraries" of functions:
The problem of namespace pollution could be largely solved by
supporting some form of namespaces to contain the declarations that
are needed only locally. (Current versions of ksh have a whole OO
system, meaning a library's footprint in global namespace can be very
small!)

----- Original Message -----
From: "Greg Wooledge" <[hidden email]>
To:"bug-bash" <[hidden email]>
Cc:
Sent:Wed, 14 Jun 2017 08:43:14 -0400
Subject:Re: RFE: Please allow unicode ID chars in identifiers

 On Tue, Jun 13, 2017 at 04:58:59PM -0700, L A Walsh wrote:
 > Forgive me if I'm misremembering, but hasn't Greg argued against
 > the ability to supply "libraries" of re-usable scripts due to
 > the ease with which names could conflict with each other and cause
 > script incompatibilities?

 Namespace collisions are certainly an issue, yes. But my primary
 argument against trying to write "libraries" in bash has always been
 the limitations of bash's functions.

 1) You can't return values from them (making them more like "commands"
 than actual functions).

 2) You can't pass variables by reference. Therefore:

 3) You can't pass the name of a variable in which you'd like to receive
 a value from the function, to work around point 1.

 4) You can't pass an array by name. You have to pass every single array
 element separately, losing the indices in the process.

 "declare -n" looks like it should address point 2, but it doesn't
 do so safely. It only works if you magically choose a name (within
 the function) that the caller does NOT choose (outside the function).
 In fact, the caller's variable name must not match ANY local variable
 of the function, or it breaks.

 See examples at <http://mywiki.wooledge.org/BashProgramming#Functions>.

 If your function is recursive (is its own caller), then you're simply
 doomed -- you can't avoid having the same name used in the caller and
 callee, because they're both the same piece of code. You might be able to
 hack up a global indexed array of return values, and then each instance
 of the recursive function can use its recursion depth as an index into
 this array to retrieve its return value. That's the best I can think of.
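
One way the depth-indexed array idea from the quoted message could look, as a rough sketch only. The array name RET, the function fact and the explicit depth argument are all assumptions made for the example; the original message doesn't spell out an implementation:

```bash
declare -a RET=()    # global array of return values, indexed by recursion depth

fact() {
    local n=$1 depth=${2:-0}
    if (( n <= 1 )); then
        RET[depth]=1
        return
    fi
    fact $((n - 1)) $((depth + 1))
    # The callee left its result one level deeper than this instance.
    RET[depth]=$(( n * RET[depth + 1] ))
}

fact 5
echo "${RET[0]}"    # prints: 120
```

Each instance reads only the slot just below its own depth, so no two instances ever fight over a name.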



Re: RFE: Fix the name collision issues and typing issues with namerefs, improve various issues for function libraries

Greg Wooledge
On Wed, Jun 14, 2017 at 12:04:35PM -0400, [hidden email] wrote:
> So first off, the circular reference problem with "declare -n"
> apparently doesn't exist in Korn Shell: If you "typeset -n x=$1", and
> $1 is "x", it's not a circular reference.

Korn shell actually has two different *kinds* of functions.

If you declare using:  foo() { ...; }

Then you get bash-like behavior.

If you declare using:  function foo { ...; }

Then it works as you described.

$ ksh
$ foo() { typeset -n x="$1"; x=y; }
$ x=global
$ foo x
ksh: typeset: x: invalid self reference
$ function bar { typeset -n x="$1"; x=y; }
$ bar x
$ echo "$x"
y

As far as I know, bash has no equivalent of ksh's "function foo"
declaration.  Bash permits that *syntax* to be used as a function
declaration, but it has exactly the same semantics as "foo()".
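
A quick way to see this in bash: both declaration forms pick up the caller's local variable, because both are dynamically scoped. The function names below are made up for the sketch:

```bash
show_a() { echo "a=$v"; }           # POSIX-style declaration
function show_b { echo "b=$v"; }    # ksh-style declaration, same semantics in bash

caller() {
    local v=local
    show_a
    show_b
}

v=global
caller
# prints:
# a=local
# b=local
```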

> Regarding return values, there's a couple ways of looking at this:
> [...]

You're retracing a lot of the ground I've already covered in the last
few years.  Which is cool, and I'll be happy if you find something
that I missed.


Re: RFE: Fix the name collision issues and typing issues with namerefs, improve various issues for function libraries

tetsujin


Yeah I didn't really consider ksh's "backward compatible" function
syntax in my post because using that form while looking for better
support for function libraries would be a needless handicap. You don't
even get local variables with the backward-compatible syntax, so
there's no hope of getting around issues like nameref collision.

Indeed, much of what I had to say about returning values from
functions was on your site. I included the information mainly to
provide some context to the feature request.



Re: RFE: Fix the name collision issues and typing issues with namerefs, improve various issues for function libraries

tetsujin

I should add the related problem:

$ f() { declare -n x=$1; echo "x=$x"; declare y="local Y"; echo "x=$x"; }
$ y="global Y"
$ f y
x=global Y
x=local Y

In other words, even if you do get a nameref to point to something in
the caller's scope, as soon as you shadow that variable with another
local variable, the nameref points to that one instead.
(As Greg pointed out, the nameref basically fails if the referenced
name is shared with any local variable in the function.)
This is arguably in line with the name of the feature ("nameref", i.e.
referencing the variable by name rather than by specific identity) but
I feel it's much more useful for a variable reference, once
established, to be stable.

To sum up a bit:

- There should be a way to declare a nameref locally in a function
which reliably refers to a variable in the caller's scope - IMO this
is the whole point of having a reference type, to get around the fact
that the language is otherwise pass-by-value.

- A nameref, once established, should refer to a particular name _in a
particular scope_. Shadowing the referenced variable with a new local
should not cause the nameref to switch to the new definition.

- Because Bash can create global namerefs from within a function,
creating a global nameref to a function-local variable should not be
allowed (because the global nameref would outlive the declaration it
references).
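
The last point can be sketched like this, assuming a bash version with namerefs and "declare -g" (4.3 or later). The names make_ref, gref and secret are invented for the example, and the exact behavior may differ between bash releases:

```bash
make_ref() {
    local secret="inside"
    declare -g -n gref=secret    # global nameref, created inside the function
    echo "in function: $gref"
}

secret="global"
make_ref              # prints: in function: inside
echo "after return: $gref"    # prints: after return: global
```

The nameref only stores the name "secret", so once the local is gone the same reference silently resolves to a different variable.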


Re: RFE: Fix the name collision issues and typing issues with namerefs, improve various issues for function libraries

Chet Ramey
On 6/14/17 12:04 PM, [hidden email] wrote:

>
> This is relevant to my interests!
>
> So first off, the circular reference problem with "declare -n"
> apparently doesn't exist in Korn Shell: If you "typeset -n x=$1", and
> $1 is "x", it's not a circular reference. The nameref seems to take
> the origin of the name into account when creating the nameref: If $1
> is used to provide the name for a nameref within a function, then the
> nameref refers to a variable in the scope of the function's caller.
> But if the name for the nameref is taken from another parameter, or a
> string literal, then the variable is drawn from the scope of the
> function itself. It seems a little "black magic" in a way but it has
> the benefit of neatly matching some important common use-cases. I
> think Bash should either adopt something similar, or provide another
> mechanism to exclude function-local variables from nameref
> construction.

I've gotten a suggestion that function-scope circular references should
always be resolved beginning at the previous scope, but I haven't done
anything to implement that yet.

There are a couple of problems that make it less clean to implement the
Korn shell mechanism. First, declare is a shell builtin, which means that
its arguments are expanded before it sees them. x=$1 and x=x both look
the same to declare when it sees them. The second is dynamic scoping,
which makes resolution trickier. The Korn shell uses static scoping, so it
only has to look at the current scope and the global scope (which makes
the x=$1 case even more irregular).
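
The first point is easy to see with an ordinary function standing in for the builtin. This is only an illustration of argument expansion, not of how declare is implemented; show_args and f are hypothetical names:

```bash
# Prints each argument it receives, one per line.
show_args() { printf '%s\n' "$@"; }

f() {
    show_args x="$1"    # name taken from a positional parameter
    show_args x=x       # name written as a literal
}

f x
# prints:
# x=x
# x=x
```

By the time the command runs, nothing is left to tell the two spellings apart.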

--
``The lyf so short, the craft so long to lerne.'' - Chaucer
                 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    [hidden email]    http://cnswww.cns.cwru.edu/~chet/


Re: RFE: Fix the name collision issues and typing issues with namerefs, improve various issues for function libraries

tetsujin
On Sat, 2017-06-17 at 20:23 -0400, Chet Ramey wrote:

> On 6/14/17 12:04 PM, [hidden email] wrote:
> >
> >
> > This is relevant to my interests!
> >
> > So first off, the circular reference problem with "declare -n"
> > apparently doesn't exist in Korn Shell: If you "typeset -n x=$1", and
> > $1 is "x", it's not a circular reference. The nameref seems to take
> > the origin of the name into account when creating the nameref: If $1
> > is used to provide the name for a nameref within a function, then the
> > nameref refers to a variable in the scope of the function's caller.
> > But if the name for the nameref is taken from another parameter, or a
> > string literal, then the variable is drawn from the scope of the
> > function itself. It seems a little "black magic" in a way but it has
> > the benefit of neatly matching some important common use-cases. I
> > think Bash should either adopt something similar, or provide another
> > mechanism to exclude function-local variables from nameref
> > construction.
> I've gotten a suggestion that function-scope circular references should
> always be resolved beginning at the previous scope, but I haven't done
> anything to implement that yet.
>
> There are a couple of problems that make it less clean to implement the
> Korn shell mechanism. First, declare is a shell builtin, which means that
> its arguments are expanded before it sees them. x=$1 and x=x both look
> the same to declare when it sees them. The second is dynamic scoping,
> which makes resolution trickier. The Korn shell uses static scoping, so it
> only has to look at the current scope and the global scope (which makes
> the x=$1 case even more irregular).
>
The functionality is the important thing here, not the specific semantics.
Personally I wouldn't want to emulate the semantics ksh uses to trigger
this behavior anyway. It's the sort of thing that kind of makes sense in
some ways (i.e. the variable name comes from the caller, so the variable
ultimately identified in the nameref also comes from the caller), but the
inconsistency of "typeset -n" behaving differently depending on where
the name comes from is very unappealing to me. I'd much rather see this
behavior activated by an explicit argument to declare, like this:

declare --caller-scope-nameref result_var=$whatever

Or this:

declare -n --scope=caller result_var=$whatever

I think switching the behavior in a clear-cut way like that is much
preferable anyway. The fact that it's also easier to implement than the
"black magic" is a bonus.

This also shouldn't be triggered by circular references. I think that's
just more "black magic", and it's still an incomplete solution to the
problem:

- Caller has locally-declared $x and $y, calls "f x" to modify its $x
- Function "f" internally creates a nameref "declare -n caller_var=$1" but
it also has $x as a local variable, so $caller_var refers to that one.
(It's not a circular reference, but it's not the right variable either.)

My understanding of Bash's variable scoping implementation is a bit
limited at the moment, and I think there are still some gaps in it, but
I'm pretty sure the approach I described would still work.
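
Here is a sketch of the non-circular collision described above, using current bash behavior (4.3 or later). The names store, ref and tmp are invented for the example:

```bash
store() {
    local tmp="scratch"    # happens to share a name with the caller's variable
    local -n ref=$1        # not circular: "ref" and "tmp" are different names
    ref="stored"           # resolves to store's own tmp, not the caller's
}

caller() {
    local tmp="caller value"
    store tmp
    echo "$tmp"
}

caller    # prints: caller value
```

No error or warning is raised; the assignment simply lands on the wrong variable.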

Re: RFE: Fix the name collision issues and typing issues with namerefs, improve various issues for function libraries

Greg Wooledge
On Mon, Jun 19, 2017 at 12:07:31AM -0400, George wrote:
> ultimately identified in the nameref also comes from the caller) but the inconsistency there, of "typeset -n" behaving differently depending on where
> the name comes from, is very unappealing to me. I'd much rather see this behavior activated by an explicit argument to declare, like this:
> declare --caller-scope-nameref result_var=$whatever
> Or this:
> declare -n --scope=caller result_var=$whatever

Or this:
upvar "$1" myvar

~50% joking...


Re: RFE: Fix the name collision issues and typing issues with namerefs, improve various issues for function libraries

tetsujin


As a bonus you'd be able to lure the uninitiated into asking questions
like "What's upvar?"



Re: RFE: Fix the name collision issues and typing issues with namerefs, improve various issues for function libraries

Chet Ramey
On 6/19/17 9:12 AM, [hidden email] wrote:
>
>
> As a bonus you'd be able to lure the uninitiated into asking questions
> like "What's upvar?"

The same old, what's up with you?

