sid_com February 2016

Unicode property "Space" in Perl 5 and Perl 6

Is the Unicode property \p{Space} a Perl 5 extension?

In Perl 5, \p{Space} matches all whitespace characters:

my $s = "one\ttwo\nthree";
$s =~ s/\p{Space}/*/g;
say $s;

# one*two*three

while in Perl 6 it perhaps matches only a plain space:

my $s = "one\ttwo\nthree";
$s.=subst( /<:Space>/, '*', :g );
say $s;

# one     two
# three


Calle Dybedahl February 2016

It's not really an extension, but it's a shorthand name for another Unicode property, \p{White_Space}. This is documented in detail in the manpage perluniprops.
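The equivalence can be checked empirically. The following is only a rough sketch: the helper name space_mismatches is made up for illustration, and the range is an assumption chosen to end at U+3000 (IDEOGRAPHIC SPACE), the highest code point that currently has White_Space. On a reasonably recent Perl 5 it should report no disagreement between the two spellings.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: count code points in [$from, $to] where
# \p{Space} and \p{White_Space} give different answers.
sub space_mismatches {
    my ($from, $to) = @_;
    my $mismatches = 0;
    for my $cp ($from .. $to) {
        my $ch = chr $cp;
        my $is_space       = $ch =~ /\p{Space}/       ? 1 : 0;
        my $is_white_space = $ch =~ /\p{White_Space}/ ? 1 : 0;
        $mismatches++ if $is_space != $is_white_space;
    }
    return $mismatches;
}

say space_mismatches(0x0000, 0x3000);
```

A brute-force scan is used here instead of reading the property tables, since it only relies on the regex engine itself.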

I have no idea what the Perl6 people are doing here.

Christoph February 2016

Tabs are of general category Control, not a space category. The property you're interested in is actually called White_Space, and that's what you need to use in Perl 6:

say so "\t" ~~ /<:White_Space>/

Several alternative spellings appear to be available as well, including WhiteSpace, WSpace and its lower-case variants, but not WS.
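The category distinction can also be seen from the Perl 5 side. This is a small sketch, not anything from the Perl 6 implementation: the %checks table and the matching_props helper are invented names, and the three properties tested are just the ones relevant to this thread.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical lookup table: property name => precompiled regex.
my %checks = (
    Control         => qr/\p{Control}/,          # General_Category Cc
    Space_Separator => qr/\p{Space_Separator}/,  # General_Category Zs
    White_Space     => qr/\p{White_Space}/,      # binary property
);

# Return the (sorted) names of the properties that match the character.
sub matching_props {
    my ($ch) = @_;
    return sort grep { $ch =~ $checks{$_} } keys %checks;
}

say join ', ', matching_props("\t");   # Control, White_Space
say join ', ', matching_props(" ");    # Space_Separator, White_Space
```

A tab is a control character that happens to have the White_Space property, while a plain space is a space separator that has it too, so only White_Space covers both.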

There is also a built-in rule <ws>, which matches zero or more whitespace characters instead of a single one, and of course \s, which already uses Unicode semantics.

