When does assignment by reference occur in MATLAB?

This is a question about the MATLAB language. I am going through the MathWorks "Onramp" tutorial, and I have noticed a strange "assignment by reference" behavior (for lack of a better term) that is counter to my expectations.

v1 = [4 6 1 3 4 9 5];

I think that in the below, it first evaluates the parenthetical expression, which generates a logical array [1 0 1 1 1 0 0], which then indexes v1 to get the result. So far so good.

>> v1(v1 < 5)

ans =

     4     1     3     4

The below is what surprises me. If you run it, you will see that ans (the default result variable, sort of an anonymous variable) acquires the value [4 1 3 4] which is the value of the left-hand side of the statement. I would expect the assignment to only write to ans, but instead it passes through by reference and writes to the referent array v1.

>> v1(v1 < 5) = 1

v1 =

     1     6     1     1     1     9     5

Of course, this is similar to other languages. In print a[3] the syntax means we get the value of a[3], but in a[3] = 1 the syntax means we assign a new value to a[3]. In that sense, the only "new" part is that MATLAB allows more advanced indexing expressions than most languages.

What's confusing here is that MATLAB appears to evaluate the expression both ways. It gets the indexed values and stores them in ans, but then it ignores that and puts the right-hand values into the locations referred to by the index.

I don't see how it could do this without evaluating the expression twice, or doing other magic behind the scenes. I don't feel like I have a grasp on the order/rules of evaluation.



Solution 1:[1]

One way to think about your example

v1(v1 < 5) = 1;

is to consider the effective functional form that MATLAB executes. In this case, this expression is converted by MATLAB into a function call like this:

v1 = subsasgn(v1, substruct('()', {v1 < 5}), 1);

In other words, when MATLAB sees an indexed assignment (i.e. where an indexing expression appears on the left-hand side of an =), internally this gets translated into a function call that both takes in the original value and overwrites it. (MATLAB's "in-place optimisation" means that this is generally efficient and doesn't duplicate memory.) The substruct encapsulates all the details of the form of indexing. This can get quite complicated if you're assigning to a part of a field of a struct or something like that.
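As a rough check of that equivalence, the sketch below applies both forms to copies of the array from the question and compares the results. The variable names a, b and idx are made up for illustration, and this only shows that the two spellings produce the same value for a plain numeric array, not how MATLAB implements the assignment internally.

```matlab
v1 = [4 6 1 3 4 9 5];

% Ordinary indexed assignment on a copy of v1.
a = v1;
a(a < 5) = 1;

% The same operation spelled out as an explicit function call.
b = v1;
idx = substruct('()', {b < 5});   % describes "(mask)" indexing
b = subsasgn(b, idx, 1);

disp(a)                 % 1 6 1 1 1 9 5
disp(isequal(a, b))     % 1 (true): both forms agree
```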

Solution 2:[2]

I believe this assertion is false, and it is what causes the confusion:

It gets the indexed values and stores them in ans

What makes you think that the specific memory needs to be read before it is written to? These are independent operations.

However, what you're saying could make sense when we consider scopes. Let's assume that the interpreter, when encountering parentheses (much like a function), creates its own scope with its own ans variable, which is then discarded after it has done its job. Then we would not see these effects in the enclosing scope.


Next, I'd like to elaborate on what Troy and Edric had said.

To the best of my knowledge, the difference here is how the code is translated into built-in function calls prior to execution.

  • Your first example, v1(v1 < 5), is interpreted as subsref followed by implicit assign to ans. It is equivalent to writing ans = v1(v1 < 5); explicitly. Here ans is the result of the indexed expression.
  • Your second example, v1(v1 < 5) = 1, is interpreted as subsasgn. Here the output will just be the input after the assignment operation.
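One way to see the two cases side by side is the sketch below. It calls subsref explicitly for the read, then sets ans to a sentinel value before the indexed assignment to check whether the assignment touches it. The names mask and r and the sentinel -1 are chosen only for illustration.

```matlab
v1 = [4 6 1 3 4 9 5];
mask = v1 < 5;                      % logical index [1 0 1 1 1 0 0]

% Read: the function-call form of "v1(v1 < 5)".
r = subsref(v1, substruct('()', {mask}));
disp(r)                             % 4 1 3 4

% Write: the indexed assignment does not store anything in ans.
ans = -1;                           % sentinel, set by hand for this check
v1(mask) = 1;
disp(ans)                           % still -1
disp(v1)                            % 1 6 1 1 1 9 5
```

If ans still holds [4 1 3 4] after the assignment in an interactive session, that value is left over from the earlier read statement rather than produced by the assignment.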

Regarding operator precedence, see this documentation page.

Solution 3:[3]

When you write

v1(v1 < 5)

the result of the expression gets assigned to ans. So the above is really a shortcut for

ans = v1(v1 < 5)

And now we can see the difference between this statement and

v1(v1 < 5) = 1

In the first statement, the indexing happens on the right side of the assignment operator. In the second statement, the indexing happens on the left side. The assignment operator is what makes the indexing different. On the left side of the assignment operator you specify where you write something. That indexing is not evaluated to produce a value; only the index itself (here v1 < 5) is computed. On the right side of the assignment operator you specify what gets assigned. It's an expression that is evaluated, resulting in a value (an array of values) that gets assigned to whatever is on the left side.
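A small sketch of that distinction, using the array from the question: indexing on the right produces an independent copy of the selected elements, while indexing on the left names the locations to overwrite. The variable name small is hypothetical.

```matlab
v1 = [4 6 1 3 4 9 5];

% Right-hand side: indexing yields a new value, a copy of the selected elements.
small = v1(v1 < 5);      % [4 1 3 4]
small(:) = 0;            % modifying the copy ...
disp(v1)                 % ... leaves v1 unchanged: 4 6 1 3 4 9 5

% Left-hand side: indexing says which locations receive the new values.
v1(v1 < 5) = small;      % writes four zeros into positions 1, 3, 4 and 5
disp(v1)                 % 0 6 0 0 0 9 5
```

Nothing here behaves like a reference: small never aliases v1, and v1 changes only when it appears on the left of the = sign.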


That the statements above get resolved in the same way as subsasgn or subsref function calls is irrelevant to understanding the basic distinction between the left and right sides of the assignment operator. These two functions are useful primarily if you write your own class, and you need to overload indexing operations. They are definitely not relevant to a novice that might be confused by basic MATLAB syntax.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    Edric
Solution 2    Community
Solution 3    Cris Luengo