Infinite loop/memory exhaustion parsing a regexp named capture with a malformed Unicode escape #3729

@stevenjohnstone

Description

Example input foo.txt:

/(?<\u{21!3}>foo)/ =~ "foo"

causes

./bin/parse foo.txt

to get stuck in pm_named_capture_escape_unicode, spinning the CPU and allocating more memory on every iteration of the loop.

Entering the same code into irb etc. gives similar results.

I've patched locally with

diff --git a/src/prism.c b/src/prism.c
--- a/src/prism.c
+++ b/src/prism.c
@@ -21163,6 +21163,9 @@ pm_named_capture_escape_unicode(pm_parser_t *parser, pm_buffer_t *unescaped, con
         }

         size_t length = pm_strspn_hexadecimal_digit(cursor, end - cursor);
+        if (length == 0) {
+            break;
+        }
         uint32_t value = escape_unicode(parser, cursor, length);

and the immediate problem is fixed. I stopped short of opening a pull request because a diagnostic error may also be required in this case.
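For anyone following along, here is a rough, self-contained sketch of the failure mode and the guard. It is a toy model, not the actual prism code: `span_hex` and `scan_unicode_body` are hypothetical names, the whitespace handling is an assumption, and the `malformed` flag only stands in for whatever diagnostic prism might want to emit. The point is that when the hex-digit span is zero the cursor never advances, so the loop has to bail out explicitly.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pm_strspn_hexadecimal_digit:
 * counts the leading hexadecimal digits in s[0..n). */
static size_t span_hex(const uint8_t *s, ptrdiff_t n) {
    size_t i = 0;
    while ((ptrdiff_t) i < n && isxdigit(s[i])) i++;
    return i;
}

/* Toy model of scanning the body of a \u{...} escape.
 * `cursor` points just past the '{'. Returns the number of hex groups
 * read and sets *malformed if a byte that is neither a hex digit, a
 * space, nor '}' is found. */
static size_t scan_unicode_body(const uint8_t *cursor, const uint8_t *end,
                                bool *malformed) {
    size_t count = 0;
    *malformed = false;

    while (cursor < end && *cursor != '}') {
        /* Assumption: groups may be separated by spaces. */
        if (*cursor == ' ') {
            cursor++;
            continue;
        }

        size_t length = span_hex(cursor, end - cursor);
        if (length == 0) {
            /* Without this guard the cursor would never move past the
             * offending byte and the loop would spin forever. */
            *malformed = true;
            break;
        }

        count++;
        cursor += length;
    }

    return count;
}
```

With the body of the escape from foo.txt, `21!3}`, this reads one group (`21`), hits `!`, sets `malformed` and stops. A real fix would presumably report an error at that point rather than just breaking out.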
