← Back to writeups

TJCTF 2024 — golf-hard

regex below par? note that this challenge has five levels.

nc tjc.tf 31627

We're given a Regex "quiz" with 5 levels. After passing all 5, we get the flag.

Code (py):

1#!/usr/local/bin/python3.11
2
3import regex  # le duh
4import os
5import tomllib
6from tabulate import tabulate
7from importlib import import_module
8from itertools import zip_longest as zp
9
10
11def check_regex(pattern, matches, nonmatches):
12    try:
13        re = regex.compile(pattern, flags=regex.V1)
14    except:
15        print("nope")
16        return False
17
18    for text in matches:
19        if not re.search(text):
20            print(f"whoops, didn't match on {text}")
21            return False
22
23    for text in nonmatches:
24        if re.search(text):
25            print(f"whoops, matched on {text}")
26            return False
27
28    return True
29
30
31def main():
32    for dir in sorted(os.listdir("challenges")):
33        tomlpath = os.sep.join(["challenges", dir, "info.toml"])
34        with open(tomlpath, "rb") as f:
35            info = tomllib.load(f)
36
37        matches = info["matches"]
38        nonmatches = info["nonmatches"]
39        length = info["length"]
40
41        print(info["description"])
42        print(
43            tabulate(
44                zp(matches, nonmatches),
45                [
46                    "Match on all of these:",
47                    "But none of these:    ",
48                ],
49                disable_numparse=True,
50                tablefmt="simple",
51            )
52        )
53        print(f"\nMaximum allowable length is {length}\n")
54
55        # include some test cases that may be inconvenient to display
56        # only some challenges have extra tests
57        # fear not, the intended pattern will remain the same
58        ext = import_module(f"challenges.{dir}.extensions")
59        matches.extend(ext.more_matches())
60        nonmatches.extend(ext.more_nonmatches())
61
62        pattern = input("> ")
63        if len(pattern) > length:
64            print(f"whoops, too long")
65            return
66
67        if not check_regex(pattern, matches, nonmatches):
68            return
69
70    print(open("flag.txt").read())
71
72
73if __name__ == "__main__":
74    main()

Since the server compiles regex using regex instead of re, we can use fancier regular expression extensions like recursion.

We just want to match all words starting with a. This is pretty simple; just use ^ to force the match to start at the start of the string, then look for the first a:

Code:

1^a

Code:

11. "Warmup"
2    This one's pretty straightforward.
3
4Match on all of these:    But none of these:
5------------------------  ------------------------
6ampasimenite              jasmone
7anchorable                decisivenesses
8aconic                    backoff
9antistrophic              whitebark
10abrade                    physogastrism
11arroya                    shavee
12apoplex                   hanoverian
13ayahs                     weatherstripped
14arock                     naturalize
15anconoid                  lophotrichous
16anglophile                nonprecedential
17acalephan                 tjctf
18amaze                     shepherdia
19adelphe                   waymark
20anno                      picnicker
21amiable                   bottleholder
22aliquid                   porule
23achromatin                diagraming
24aircheck                  solifuge
25antigenically             phrenetically
26
27Maximum allowable length is 2
28
29> ^a

The first regex relying on PCRE-style group-insertion shenanigans. We can take some inspiration from this StackOverflow: essentially, we can split the first string of x's into two captured groups, with the second string of xs being the first group and the resulting sum being the second.

Code:

1^(.*)(.*)-\1=\2$

Code:

12. "Subtraction"
2    Think addition, but not.
3
4Match on all of these:                    But none of these:
5----------------------------------------  -------------------------------------------
6xxxxxxxxxxxxx-xxx=xxxxxxxxxx              xxx-xxxx=xxxxxxxxxx
7xxxxxxx-x=xxxxxx                          xxxxxxxxxxxx-xxxxxxxxx=xx
8xxxxxxxxxxxxxxxx-xxxxxxxxxxxxx=xxx        xxxx-xxxxxxxxxxxxx=
9x-x=                                      xxxxxxxxxxxxxxx-xxxxxxx=xxxxxxx
10xxxx-xxxx=                                xxxxxxxxxxx-xxxxxxx=xxxxxxxxxxxxxx
11xx-=xx                                    xxxxxxxxxxxxxx-xxx=xxxxxxxxxxxx
12xxxxxxxxxxx-xxxxx=xxxxxx                  xxxx-xxxx=xxxxxxxxxxxxxxxxxx
13xxxxxxxxxx-xxxxx=xxxxx                    xxxxxxxxxxxxxxx-xxx=xxxxxxxxxxx
14-=                                        xxxxxxxxxxxxx-xxxxxxx=
15xxxxxxxxxxxxxxxxx-xxxxxxxxxxx=xxxxxx      xxxxxxx-xxxxxxxx=xxxxxxxxxxx
16xxxxxxxxxxxxx-xxxxxxxxxxxxx=              x-xxxxxxxxxxxxxxxxxxxx=
17xxxxxxxxxxxxxxxxxx-xxxxxxxxxxxxxxx=xxx    xxxxxxxxxxxxxxx-xxxxxxxxxxxxxxx=xxxxxxxxxx
18xxxxxxxxxxx-xxxxxxxxxxx=                  xxxxx-xxxxxxxxxx=xx
19xxxxxxxxxxxxxxxxx-xxxxx=xxxxxxxxxxxx      xxxxxxxxxxx-xxxxxxxxxxxxxx=xxxxxx
20xxxxxxxxx-xx=xxxxxxx                      xxxxxxxxxxxxx-=xxxxxxxxxxxxxxx
21xxxxxxxxxxxxxxxxxx-xxxxxxxxxxxx=xxxxxx    xxxxxxxx-xxxxxxxxxx=xxxxxxxxxxxxxxxxx
22x-=x                                      xxxxxxx-xxxxxxx=xxxxxxxxxxxxxx
23xxxxxxxx-xxxxx=xxx                        -xxxxxxxxxxxxxxxxxxxx=
24xxxxxxxxxx-xxxxxx=xxxx                    xxxxxxxxxxxxxxxxxxxx-xxxxxxxxxxxxxxxx=x
25xxxxxxxxxxxxxx-xxxxxx=xxxxxxxx            xxxxxxxx-xxxxxxxx=xxxxxxxxxxxxxx
26xxxxxxxxxxxxxxxxxx-xxxxxxxxxxxxx=xxxxx    xxxxxx-xxxxxxxxxxxxxxxxxxx=xxxxxxxx
27xxxxxxxxxxxxxxx-xxxx=xxxxxxxxxxx          xx-xxxxxxxxxxxxxxx=xx
28xxxxxxxxxxxxxxxx-xxxxxxx=xxxxxxxxx        xxxxxxxx-xxxxxxxxxxxxxxxxxxxx=xxxxxxxxxxxxx
29xxxxxxxxxxxxxxxxxxx-x=xxxxxxxxxxxxxxxxxx  xxxxxxxxx-xx=xxxxxxxxxxxxxxxxx
30
31Maximum allowable length is 16
32
33> ^(.*)(.*)-\1=\2$

The first regex relying on PCRE recursion shenanigans. We can again take inspiration from a StackOverflow post for this: ignoring the .NET solution using balancing groups, we can edit the recursive solution to use angled brackets and ignore text between brackets to trim down its length.

Code:

1^(<(?1)*>)+$

Code:

13. "Parity"
2    This one's supposed to be impossible in the general case.
3    The tests will enforce a solution for the general case.
4
5Match on all of these:                            But none of these:
6------------------------------------------------  ------------------------------------------------
7<>                                                <
8<<>>                                              >
9<><>                                              <>>
10<<><>>                                            >><>>
11<<<<>>>>                                          <><><<
12<<<>>><>                                          <<>>><>
13<<><><<>>>                                        <<>>>>><
14<><<<><>>>                                        >><>>>><<
15<<<><><>>>                                        <>><<>><><<<<>>><<
16<<<<<>>><>>>                                      <<>>>>><>><<>><<><<
17<<<<<<>>>>>>                                      >>>>><><><><<<>>><>>
18<<<<<<<>>>>>>>                                    ><><<<><><>>>>>><<><<
19<<<>><><<<>>>>                                    <<<<<><<<><>><>>><>>>><>>>
20<<<<<<<<<<>>>>>>>>>>                              >>>><>>><><><<<<<>>><>><<><>>><><<>
21<<<<<<<<<<>>>>>>>>>><>                            <>><><>><>><<><>><>>><><>>><><>><><>><
22<><<<>><<>>><<<<>><>>>                            <<<><>>>>><<><<><><<<<>>><<<>>>>>>><>>>
23<<><><><<<<<><<>>>>>><>>                          ><<>><<<<><><<<><>><<>><<><>>><>>>>>><<>
24<<<<><>>><><><>><><><><>                          >>><<<<<<<<<<>>>><<<><<><>><<<><><><<<<>>
25<<>><<>><<>><<<<<<<>>>>><>>>                      <<><>><<>><>>>>><>><><<><>>><<<<<><>><<<<><>
26<<<<<<>>>>>><<<<<<<>>><>>>>>                      <>><<<><<><<<<<>>><><<<<<<<><><<>><><><<>>>>
27<<<<>><><>>><<><<>><<<<>>><>>>                    ><><<<>>><<><<><>>><<<<>><>><><<>>><><<><<<<
28<<<><<>><<<>><<<>>><>>>><<<>><>>                  ><><<<<><<<<<>><<><<><>>>><><<><><><>><<<><><
29<><<><><><<<>>><<<>><>>><<<<><>>>>                >>><<>>><><<<<><><><<<<>>>><<<>>><>><<<<<<<<<>
30<<<<<<>>><>><><><>><>><<<>><><<<>>>>              <>><><>>><<>>><>>><>>><<<<>>>><>>><<<<<>><>><>><
31<<<>><><>><><<><><<<>>>><<<>>><<<><<>>>><><<<>>>  <>>>><<<><>><<><<<><<<><>>>>>><>>><>>>>><<<><<<>
32
33Maximum allowable length is 12
34
35> ^(<(?1)*>)+$

Another recursive regex, this time finding palindromes. Because this is a well known / solved problem, we can just use the solution from this StackOverflow answer to pass.

Code:

1^((.)(?1)\2|.?)$

Code:

14. "Topsy Turvy"
2    This one's another famously impossible regex challenge.
3    Have fun solving it!
4
5Match on all of these:     But none of these:
6-------------------------  ------------------------
7i                          cashew
8dad                        hamleteer
9abba                       electromagnetic
10kayak                      zimbabwe
11refer                      xerox
12civic                      epiphanizing
13pullup                     detected
14racecar                    tahinis
15rotator                    foolproof
16detartrated                sailboat
17wassamassaw                racecars
18tattarrattat               redeemer
19satireveritas              woodrow
20dogeeseseegod              bathtub
21neveroddoreven             butterfly
22madaminedenimadam          scarabs
23anutforajaroftuna          alabama
24wasitacaroracatisaw        yamaha
25amanaplanacanalpanama      engage
26satorarepotenetoperarotas  nonattributiveness
27
28Maximum allowable length is 18
29
30> ^((.)(?1)\2|.?)$

Since this isn't exactly a "well known regex problem", we'll have to cook up a solution ourselves. We can leverage PCRE capture groups and recursion to capture the first operand, then "paste" it on the right of the equals for each character in the second. After a bit of fanagling, we can come up with

Code:

1^(x*)\*(x(?2)\1|=)$

and get the flag:

Code:

15. "Multiplication"
2    Seems self-explanatory.
3
4Match on all of these:                       But none of these:
5-------------------------------------------  ------------------------------------------
6xxx*xxxxx=xxxxxxxxxxxxxxx                    x*x=xx
7xxxxx*xxxx=xxxxxxxxxxxxxxxxxxxx              xxxxxxxx*xxxx=xxxxxxxx
8xxxxx*xxx=xxxxxxxxxxxxxxx                    x*xxxxxxxx=xxxxxxx
9xx*xx=xxxx                                   xxxxxxx*xxxxx=xxxxxxxxx
10xx*xxxx=xxxxxxxx                             xxxxxxxx*xxx=xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
11xxx*xxxxxx=xxxxxxxxxxxxxxxxxx                xxxxx*xxxxx=xxxxxxxxxxxxxx
12xxx*xxxxx=xxxxxxxxxxxxxxx                    xxx*xxxxxx=xxxx
13xx*xxxxx=xxxxxxxxxx                          xx*xxxx=xxxxxxxxxx
14xx*xxx=xxxxxx                                xxxx*xxxxx=xx
15xxxxx*xxxxxx=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  xxx*xxx=xxx
16x*xx=xx                                      xxxx*xxxxxxxx=xxxxxxxxxxxxxxxxxxxx
17xxxxxx*xxxx=xxxxxxxxxxxxxxxxxxxxxxxx         *x=x
18x*x=x                                        xxxxxx*xxxxxxx=xxxxxxxxxxxxxxxxxxxxxxxxx
19xxx*xxx=xxxxxxxxx                            xx*xxxxxx=xxxxxxxxx
20xxxx*xxxxxx=xxxxxxxxxxxxxxxxxxxxxxxx         xxxxxxxx*xxxxxx=xxxxxxxxxxxxxxxxxxxxxx
21xxxx*xx=xxxxxxxx                             xxxx*xxxxxxx=xxxxxxxxxxxxxxxxxxxxxxxxxx
22*x=                                          xxxxxxxx*xx=xxxxxxxxxxxxxxx
23xxxxx*xxxxxx=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx  x*xxxxxxx=x
24xxx*xx=xxxxxx                                xx*x=x
25xxxx*xxxxxx=xxxxxxxxxxxxxxxxxxxxxxxx         xx*xxxxx=xxxxxx
26
27Maximum allowable length is 20
28
29> ^(x*)\*(x(?2)\1|=)$
30
31
32From the moment I understood the weakness of my code, it disgusted me. I craved the strength and certainty of ascii. I aspired to the purity of the Blessed Regex.
33
34Your kind cling to your code, as if it will not have bugs and fail you. One day the crude boilerplate you call a script will wither, and you will beg my kind to save you. But I am already saved, for the Regex is immortal...
35
36tjctf{even_in_death_I_serve_the_PCRE_Standard_3ceb7afc}