Skip to content

Conversation

@0xe
Copy link
Contributor

@0xe 0xe commented Oct 28, 2025

Fixes for #1641

@rbri
Copy link
Collaborator

rbri commented Oct 28, 2025

@0xe wow, this looks really promising - > 300 more passing tests

@0xe 0xe force-pushed the scratch/satish/dstr-pending branch 2 times, most recently from fae67c7 to 88a0b70 Compare October 29, 2025 11:19
@0xe 0xe marked this pull request as ready for review October 29, 2025 11:20
@aardvark179
Copy link
Contributor

That is awesome @0xe, thanks!

Copy link
Collaborator

@gbrail gbrail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very good feedback -- thanks!

Is there a way that DestructuringIterator can be a ScriptableObject (using lambdas if necessary) instead of an IdScriptableObject? We're trying to phase that thing out.

@0xe
Copy link
Contributor Author

0xe commented Nov 3, 2025

This is very good feedback -- thanks!

Is there a way that DestructuringIterator can be a ScriptableObject (using lambdas if necessary) instead of an IdScriptableObject? We're trying to phase that thing out.

Sure. AFK for a week or so, will take a look once I get back.

@rbri
Copy link
Collaborator

rbri commented Nov 8, 2025

@0xe have done a first smoke test by merging this into core-js and run the whole HtmlUnit test suit. Nothing broken - looks really good!

@0xe 0xe force-pushed the scratch/satish/dstr-pending branch from 88a0b70 to 67d1f75 Compare November 10, 2025 04:43
@0xe 0xe requested a review from gbrail November 10, 2025 05:21
@0xe
Copy link
Contributor Author

0xe commented Nov 10, 2025

This is very good feedback -- thanks!

Is there a way that DestructuringIterator can be a ScriptableObject (using lambdas if necessary) instead of an IdScriptableObject? We're trying to phase that thing out.

Thanks, I've made it just a ScriptableObject.

Copy link
Contributor

@aardvark179 aardvark179 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I like the overall effect of this PR, but I'm not sure we should be introducing a new byte code just for this.

Let me try and unwind this from the implementation side:

  1. In the interpreter we want to:
    1. Get the iterator for the array
    2. Get the next method from that
    3. Iterate using the iterator the required number of times to get values, assigning if needed..
  2. DestructuringIterator seems to be a special implementation of an Iterator Record, which would likely be a useful thing to have for implementing all the iterator prototype methods, but I'm not sure is needed here because in the interpreter we can simply hold the method and the iterator object on the stack or in variables.

I was going to say that it looks like we're diving to implementation details too early by doing that in the parser… but tracing through things I see that's all called from the IR factory, and just lives in the parser to confuse simple bears like me. 😀

@0xe
Copy link
Contributor Author

0xe commented Nov 16, 2025

Okay, I like the overall effect of this PR, but I'm not sure we should be introducing a new byte code just for this.

Let me try and unwind this from the implementation side:

  1. In the interpreter we want to:

    1. Get the iterator for the array
    2. Get the next method from that
    3. Iterate using the iterator the required number of times to get values, assigning if needed..
  2. DestructuringIterator seems to be a special implementation of an Iterator Record, which would likely be a useful thing to have for implementing all the iterator prototype methods, but I'm not sure is needed here because in the interpreter we can simply hold the method and the iterator object on the stack or in variables.

I was going to say that it looks like we're diving to implementation details too early by doing that in the parser… but tracing through things I see that's all called from the IR factory, and just lives in the parser to confuse simple bears like me. 😀

I’m working on addressing @aardvark179’s suggestions.

@0xe 0xe force-pushed the scratch/satish/dstr-pending branch from ed5388e to 3c523f1 Compare November 19, 2025 05:42
@0xe 0xe requested a review from aardvark179 November 19, 2025 05:43
@0xe
Copy link
Contributor Author

0xe commented Nov 19, 2025

Okay, I like the overall effect of this PR, but I'm not sure we should be introducing a new byte code just for this.

Let me try and unwind this from the implementation side:

1. In the interpreter we want to:
   
   1. Get the iterator for the array
   2. Get the `next` method from that
   3. Iterate using the iterator the required number of times to get values, assigning if needed..

2. `DestructuringIterator` seems to be a special implementation of an [Iterator Record](https://tc39.es/ecma262/#sec-iterator-records), which would likely be a useful thing to have for implementing all the iterator prototype methods, but I'm not sure is needed here because in the interpreter we can simply hold the method and the iterator object on the stack or in variables.

I was going to say that it looks like we're diving to implementation details too early by doing that in the parser… but tracing through things I see that's all called from the IR factory, and just lives in the parser to confuse simple bears like me. 😀

I've addressed your suggestions by removing the new bytecodes by doing the transformations in the IR. You were right, we didn't really need the new bytecodes or the new wrapper class. The changes which are really only in Parser.java now are simpler. While they are in the Parser, these functions are really called during IR transforms (as you noted above).

Here's how the transforms look:

function f([a, b]) {
    return a + b;
}
FUNCTION f [scope #1: $0 $2 $3 $4 a $5 b $6]
    BLOCK
        EXPR_VOID
            COMMA
                COMMA                      // usual destructuring COMMA node
                    SETVAR $1 = GETVAR $0
                    COMMA [destructuring_names: [a, b]]
                        SETVAR $2 = CALL // call iterator
                            GETELEM
                                GETVAR $1
                                GETPROP
                                    NAME Symbol
                                    STRING "iterator"
                        SETVAR $3 = CALL // next() on the iterator
                            GETPROP
                                GETVAR $2
                                STRING "next"
                        SETVAR $4 = GETPROP // read value ...
                            GETVAR $3
                            STRING "value"
                        SETVAR a = GETVAR $4
                        SETVAR $3 = CALL
                            GETPROP
                                GETVAR $2
                                STRING "next"
                        SETVAR $5 = GETPROP 
                            GETVAR $3
                            STRING "value"
                        SETVAR b = GETVAR $5 
                        HOOK                // IteratorClose: https://tc39.es/ecma262/#sec-iteratorclose 
                            NOT
                                GETPROP
                                    GETVAR $3
                                    STRING "done"  // no need to close if iterator is exhausted
                            HOOK
                                NE
                                    SETVAR $6 = GETPROP
                                        GETVAR $2
                                        STRING "return"
                                    UNDEFINED
                                CALL
                                    GETPROP
                                        GETVAR $6
                                        STRING "call"
                                    GETVAR $2
                                UNDEFINED
                            UNDEFINED
                        GETVAR $1
        RETURN
            ADD
                GETVAR a
                GETVAR b

@0xe
Copy link
Contributor Author

0xe commented Nov 19, 2025

Looking at why java 21 multi-threaded test is failing in CI. It runs successfully for me locally with openjdk 21.0.9: ./gradlew check -Drhino.useThreadSafeObjectsByDefault=true.

@0xe
Copy link
Contributor Author

0xe commented Nov 19, 2025

This task has failed in the multithreaded test:

> Task :rhino-kotlin:decycleMain FAILED
> Execution failed for task ':rhino-kotlin:decycleMain'.
   > A failure occurred while executing de.obqo.decycle.gradle.DecycleWorker
      > Stream closed

I think this is unrelated.

@rbri
Copy link
Collaborator

rbri commented Nov 19, 2025

I think this is unrelated.

Restart the test solves the problem.... :-D

@rbri
Copy link
Collaborator

rbri commented Nov 19, 2025

Smoke test done again with the latest sources - still no smoke!

@aardvark179
Copy link
Contributor

I’ve seen the decycle problem locally at least once. I haven’t debugged it properly to know if it’s a simple race between a few parts of the build or something else.

@andreabergia
Copy link
Contributor

Okay, I like the overall effect of this PR, but I'm not sure we should be introducing a new byte code just for this.

I've addressed your suggestions by removing the new bytecodes by doing the transformations in the IR. You were right, we didn't really need the new bytecodes or the new wrapper class. The changes which are really only in Parser.java now are simpler. While they are in the Parser, these functions are really called during IR transforms (as you noted above).

Here's how the transforms look:

function f([a, b]) {
    return a + b;
}
FUNCTION f [scope #1: $0 $2 $3 $4 a $5 b $6]
    BLOCK
        EXPR_VOID
            COMMA
                COMMA                      // usual destructuring COMMA node
                    SETVAR $1 = GETVAR $0
                    COMMA [destructuring_names: [a, b]]
                        SETVAR $2 = CALL // call iterator
                            GETELEM
                                GETVAR $1
                                GETPROP
                                    NAME Symbol
                                    STRING "iterator"
                        SETVAR $3 = CALL // next() on the iterator
                            GETPROP
                                GETVAR $2
                                STRING "next"
                        SETVAR $4 = GETPROP // read value ...
                            GETVAR $3
                            STRING "value"
                        SETVAR a = GETVAR $4
                        SETVAR $3 = CALL
                            GETPROP
                                GETVAR $2
                                STRING "next"
                        SETVAR $5 = GETPROP 
                            GETVAR $3
                            STRING "value"
                        SETVAR b = GETVAR $5 
                        HOOK                // IteratorClose: https://tc39.es/ecma262/#sec-iteratorclose 
                            NOT
                                GETPROP
                                    GETVAR $3
                                    STRING "done"  // no need to close if iterator is exhausted
                            HOOK
                                NE
                                    SETVAR $6 = GETPROP
                                        GETVAR $2
                                        STRING "return"
                                    UNDEFINED
                                CALL
                                    GETPROP
                                        GETVAR $6
                                        STRING "call"
                                    GETVAR $2
                                UNDEFINED
                            UNDEFINED
                        GETVAR $1
        RETURN
            ADD
                GETVAR a
                GETVAR b

To me, this looks very complex frankly. A call to a function in ScriptRuntime with a new opcode seems preferable, IMHO.

Me and @0xe noticed that V8 has a generic "invokeruntime" opcode; I think that could be an idea to investigate, if we want to keep the number of opcodes in check.

@aardvark179
Copy link
Contributor

aardvark179 commented Nov 20, 2025

So I think there are a few different things to consider here:

  1. Our parse tree is consumed (for better or worse) by other things (e.g. rhino drops a character on single line comments #2151 ) so the output of the parser should expose implementation details that might change.
  2. Byte codes are a relatively scarce resource, so we should think carefully about when they really need to be introduced.
  3. We should be careful not to block off future optimisations too early.

So introducing byte codes for array destructuring seems like a bad move because it is a little user feature and it makes better optimisation harder.

A more general runtime call byte code could carry its weight. As it could be reused elsewhere.

Thinking about all the other iterator methods in the spec might also help in designing this. Standard array iterators are likely a special case because they very fixed behaviour and the close operation can be trivial.

oh, and the parse tree should express the concept of destructuring assignment, and whatever our implementation details are should only be introduce either in the IR factory or by the compilers’ transform routines.

@0xe
Copy link
Contributor Author

0xe commented Nov 21, 2025

So I think there are a few different things to consider here:

  1. Our parse tree is consumed (for better or worse) by other things (e.g. rhino drops a character on single line comments #2151 ) so the output of the parser should expose implementation details that might change.

...

oh, and the parse tree should express the concept of destructuring assignment, and whatever our implementation details are should only be introduce either in the IR factory or by the compilers’ transform routines.

The parse tree itself is reasonably clean. The implementation logic gets added only during IR transforms. For the example above, this is the AST:

0	SCRIPT 0 62
0	  FUNCTION 0 40 functionType=1
9	    NAME 9 1 f
11	    ARRAYLIT 11 6  // signifies destructuring assignment
12	      NAME 1 1 a
15	      NAME 4 1 b
19	    BLOCK 19 21
25	      RETURN 6 13
32	        ADD 7 5
32	          NAME 0 1 a
36	          NAME 4 1 b
  1. Byte codes are a relatively scarce resource, so we should think carefully about when they really need to be introduced.
  2. We should be careful not to block off future optimisations too early.

So introducing byte codes for array destructuring seems like a bad move because it is a little user feature and it makes better optimisation harder.

A more general runtime call byte code could carry its weight. As it could be reused elsewhere.

Looking at the Interpreter among the instructions that call into ScriptRuntime about 5 are just direct calls which can benefit from something like V8's CallRuntime bytecode. A few more manipulate the stack before/after calling but I think they can still be refactored into using such a bytecode.

For example, these are only direct calls to ScriptRuntime:

DoLeaveWith - ScriptRuntime.leaveWith(scope)
DoTypeOfName - ScriptRuntime.typeofName(scope, name)
DoBindName - ScriptRuntime.bind(cx, scope, name)
DoRegExp - ScriptRuntime.wrapRegExp(cx, scope, re)
DoDelPropSuper - ScriptRuntime.throwDeleteOnSuperPropertyNotAllowed()

A few more are single calls but after a ScriptRuntime.wrapNumber. Maybe those can be CallRuntimeAfterUnwrap any DOUBLE_MARK arguments.

However, I think that's starting to sound like a separate PR. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants