This package provides a human-readable interface to regular expressions. I've left out a couple of regular expression features that I don't understand (recursion, conditionals, call-backs). If you understand these features, I'd love some help.
Tutorial
I'll follow the example from rex for URL validation.
julia> using RegularExpressions
julia> invalids = raw.((".", "/", " ", "-"));
julia> url_pattern = pattern(
           CONSTANTS.start,
           capture(
               or(
                   kind(:group, "http", of(:maybe, "s")),
                   "ftp"
               ),
               name = "protocol"
           ),
           raw("://"),
           of(:maybe,
               capture(
                   of(:some, one_of(not, short(:space))),
                   name = "username"
               ),
               of(:maybe,
                   raw(":"),
                   capture(
                       of(:none_or_some, one_of(not, short(:space))),
                       name = "password"
                   )
               ),
               raw("@")
           ),
           capture(
               of(:none_or_some,
                   of(:some, one_of(not, invalids...)),
                   of(:none_or_some, raw("-"))
               ),
               of(:some, one_of(not, invalids...)),
               name = "host"
           ),
           capture(
               of(:none_or_some,
                   raw("."),
                   of(:none_or_some,
                       of(:some, one_of(not, invalids...)),
                       of(:none_or_some, raw("-"))
                   ),
                   of(:some, one_of(not, invalids...))
               ),
               name = "domain"
           ),
           raw("."),
           capture(
               between(2, Inf, one_of(not, invalids...)),
               name = "TLD"
           ),
           of(:maybe,
               raw(":"),
               capture(
                   between(2, 5, short(:digit)),
                   name = "port"
               )
           ),
           of(:maybe,
               raw("/"),
               capture(
                   of(:none_or_some, one_of(not, short(:space))),
                   name = "resource"
               )
           ),
           CONSTANTS.stop
       );
julia> goods = (
"http://foo.com/blah_blah",
"http://foo.com/blah_blah/",
"http://foo.com/blah_blah_(wikipedia)",
"http://foo.com/blah_blah_(wikipedia)_(again)",
"http://www.example.com/wpstyle/?p=364",
"https://www.example.com/foo/?bar=baz&inga=42&quux",
"http://✪df.ws/123",
"http://userid:password@example.com:8080",
"http://userid:password@example.com:8080/",
"http://userid@example.com",
"http://userid@example.com/",
"http://userid@example.com:8080",
"http://userid@example.com:8080/",
"http://userid:password@example.com",
"http://userid:password@example.com/",
"http://➡.ws/䨹",
"http://⌘.ws",
"http://⌘.ws/",
"http://foo.com/blah_(wikipedia)#cite-1",
"http://foo.com/blah_(wikipedia)_blah#cite-1",
"http://foo.com/unicode_(✪)_in_parens",
"http://foo.com/(something)?after=parens",
"http://☺.damowmow.com/",
"http://code.google.com/events/#&product=browser",
"http://j.mp",
"ftp://foo.bar/baz",
"http://foo.bar/?q=Test%20URL-encoded%20stuff",
"http://مثال.إختبار",
"http://例子.测试",
"http://-.~_!&'()*+,;=:%40:80%2f::::::@example.com",
"http://1337.net",
"http://a.b-c.de",
"http://223.255.255.254"
);
julia> bads = (
"http://",
"http://.",
"http://..",
"http://../",
"http://?",
"http://??",
"http://??/",
"http://#",
"http://##",
"http://##/",
"http://foo.bar?q=Spaces should be encoded",
"//",
"//a",
"///a",
"///",
"http:///a",
"foo.com",
"rdar://1234",
"h://test",
"http:// shouldfail.com",
":// should fail",
"http://foo.bar/foo(bar)baz quux",
"ftps://foo.bar/",
"http://-error-.invalid/",
"http://-a.b.co",
"http://a.b-.co",
"http://0.0.0.0",
"http://3628126748",
"http://.www.foo.bar/",
"http://www.foo.bar./",
"http://.www.foo.bar./"
);
julia> all(occursin.(url_pattern, goods))
true
julia> any(occursin.(url_pattern, bads))
false
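The tutorial names its captures but never reads them back. The sketch below shows how named groups are retrieved in plain Base Julia. The regex `simple_url` is a hand-written, much simplified stand-in introduced here for illustration; it is not the pattern built above. Since `pattern` appears to return an ordinary `Regex` (the examples print values such as `r"ab"`), the same indexing should presumably work on `url_pattern`, though that is an assumption.

```julia
# Simplified, hypothetical URL regex with three named groups.
simple_url = r"^(?<protocol>https?|ftp)://(?<host>[^/:\s]+)(?::(?<port>\d{2,5}))?"

m = match(simple_url, "http://example.com:8080/")

# Named groups are read by indexing the RegexMatch with the group name.
protocol = m["protocol"]   # "http"
host = m["host"]           # "example.com"
port = m["port"]           # "8080"

# An optional group that took no part in the match comes back as `nothing`.
no_port = match(simple_url, "ftp://foo.bar/baz")["port"]
```

Checking for `nothing` before using an optional capture such as the port avoids errors on URLs that omit it.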
Interface
General
RegularExpressions.pattern — Function.
pattern(them..., options...)
Splat of Regex. Options can be in OPTIONS.
julia> using RegularExpressions
julia> p = pattern("a", "b")
r"ab"
julia> occursin(p, "ab")
true
julia> p = pattern("A", caseless = true)
r"A"i
julia> occursin(p, "a")
true
RegularExpressions.template — Function.
template(them...)
Splat of SubstitutionString. See examples in captured.
RegularExpressions.raw — Function.
raw(it)
Escape punctuation.
julia> using RegularExpressions
julia> p = pattern(raw("1.0"))
r"1\.0"
julia> occursin(p, "v1.0")
true
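To see why the escaping matters, the sketch below compares the escaped and unescaped forms using plain Base regex literals. The literal `r"1\.0"` is the value printed for `pattern(raw("1.0"))` above; the unescaped `r"1.0"` is added here only for contrast.

```julia
unescaped = r"1.0"    # `.` is a wildcard and matches any character
escaped = r"1\.0"     # `\.` matches only a literal dot

wildcard_hit = occursin(unescaped, "v1x0")   # true
escaped_miss = occursin(escaped, "v1x0")     # false
escaped_hit = occursin(escaped, "v1.0")      # true
```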
RegularExpressions.not — Constant.
RegularExpressions.CONSTANTS — Constant.
CONSTANTS
Plain commands.
julia> using RegularExpressions
julia> p = pattern(CONSTANTS.any)
r"."
julia> occursin(p, "a")
true
RegularExpressions.or — Function.
or(them...)
At least one of them.
julia> using RegularExpressions
julia> p = pattern(or("a", "b"))
r"a|b"
julia> occursin(p, "b")
true
RegularExpressions.kind — Function.
kind(a_kind, them...)
A variety of syntaxes: a_kind of them. Access KINDS.
julia> using RegularExpressions
julia> p = pattern(kind(:before, "a"), "b")
r"(?<=a)b"
julia> occursin(p, "ab")
true
RegularExpressions.KINDS — Constant.
KINDS
Access via kind.
Shortcuts
RegularExpressions.short — Function.
short(it)
short(::Not, it)
A short command. Access SHORTS.
julia> using RegularExpressions
julia> p = pattern(short(:space))
r"\s"
julia> occursin(p, " ")
true
julia> p = pattern(short(not, :space))
r"\S"
julia> occursin(p, "a")
true
RegularExpressions.SHORTS — Constant.
SHORTS
Access with short.
RegularExpressions.property — Function.
property([::Not], general, [specific])
A character property. Access PROPERTIES.
julia> using RegularExpressions
julia> p = pattern(property(:seperator))
r"\p{Z}"
julia> occursin(p, " ")
true
julia> p = pattern(property(not, :seperator))
r"\P{Z}"
julia> occursin(p, "a")
true
julia> p = pattern(property(:seperator, :space))
r"\p{Zs}"
julia> occursin(p, " ")
true
julia> p = pattern(property(not, :seperator, :space))
r"\P{Zs}"
julia> occursin(p, "a")
true
RegularExpressions.PROPERTIES — Constant.
PROPERTIES
Access with property.
RegularExpressions.script — Function.
script([::Not], it)
A character from a script.
julia> using RegularExpressions
julia> p = pattern(script(:Han))
r"\p{Han}"
julia> occursin(p, "中")
true
julia> p = pattern(script(not, :Han))
r"\P{Han}"
julia> occursin(p, "a")
true
RegularExpressions.option — Function.
option([::Not]; options...)
Set options inline within a pattern, or unset them with not. Access OPTIONS.
julia> using RegularExpressions
julia> p = pattern(option(caseless = true, ignore_space = true), "a ")
r"(?ix)a "
julia> occursin(p, "A")
true
julia> p = pattern(option(caseless = true), option(not, caseless = true), "a")
r"(?i)(?-i)a"
julia> occursin(p, "A")
false
RegularExpressions.OPTIONS — Constant.
OPTIONS
Access with option.
RegularExpressions.extra — Function.
extra(it)
extra(it, value::Number)
An extra command. Access EXTRAS.
julia> using RegularExpressions
julia> p = pattern(extra(:standard_newline), "a", CONSTANTS.stop)
r"(*ANYCRLF)a$"
julia> occursin(p, "a\n")
true
julia> extra(:limit_match, 1)
"(*LIMIT_MATCH=1)"
RegularExpressions.EXTRAS — Constant.
EXTRAS
Access with extra.
Classes
RegularExpressions.one_of — Function.
one_of([::Not], them...)
Create a character class.
julia> using RegularExpressions
julia> p = pattern(one_of('a', 'b'))
r"[ab]"
julia> occursin(p, "b")
true
julia> p = pattern(one_of(not, 'a', 'b'))
r"[^ab]"
julia> occursin(p, "c")
true
RegularExpressions.class — Function.
class([::Not], it)
Character classes. Access CLASSES.
julia> using RegularExpressions
julia> p = pattern(one_of(class(:space)))
r"[[:space:]]"
julia> occursin(p, " ")
true
julia> p = pattern(one_of(class(not, :space)))
r"[[:^space:]]"
julia> occursin(p, "a")
true
RegularExpressions.CLASSES — Constant.
CLASSES
Access with class.
RegularExpressions.through — Function.
through(start, stop)
A range of characters.
julia> using RegularExpressions
julia> p = pattern(one_of(through('a', 'c')))
r"[a-c]"
julia> occursin(p, "b")
true
Quantifiers
RegularExpressions.GREEDS — Constant.
RegularExpressions.of — Function.
of(quantity::Symbol, them...; greed = :greedy)
of(quantity::Number, them...)
A quantity of them with a certain greed. Access QUANTITIES and GREEDS.
julia> using RegularExpressions
julia> p = pattern(of(:some, "a"))
r"a+"
julia> occursin(p, "aa")
true
julia> p = pattern(of(2, "a"))
r"a{2}"
julia> occursin(p, "aa")
true
RegularExpressions.QUANTITIES — Constant.
QUANTITIES
Access with of.
RegularExpressions.between — Function.
between(low, high, them...; greed = :greedy)
Between low and high of them with a certain greed. Access GREEDS.
julia> using RegularExpressions
julia> p = pattern(between(1, 3, "a"))
r"a{1,3}"
julia> occursin(p, "aa")
true
julia> p = pattern(between(2, Inf, "a"))
r"a{2,}"
julia> occursin(p, "aaa")
true
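Both of and between take a greed keyword, but the examples only show the default, :greedy. As a rough illustration of what greed means in the underlying PCRE syntax, the sketch below uses plain Base regex literals. The other names available in GREEDS, and the exact patterns the package emits for them, are not shown in this document, so nothing here should be read as the package's output.

```julia
# Greedy: take as many repetitions as possible.
greedy = match(r"a+", "aaa").match            # "aaa"

# Lazy: take as few repetitions as possible.
lazy = match(r"a+?", "aaa").match             # "a"

# Possessive: take as many as possible and never give any back, so the
# trailing `a` can never match once `a++` has consumed the whole run.
possessive_fails = match(r"a++a", "aaa") === nothing   # true

# The ordinary greedy quantifier backtracks by one character and succeeds.
backtracking = match(r"a+a", "aaa").match     # "aaa"
```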
Captures
RegularExpressions.capture — Function.
capture(them...; name = nothing)
Capture them with optional name. See examples in captured.
RegularExpressions.captured — Function.
captured(it::AbstractString)
captured(it::Number; relative = false)
Refer to a captured group.
julia> using RegularExpressions
julia> p = pattern(capture("a"), capture("b", name = "second"))
r"(a)(?<second>b)"
julia> t = template(captured("second"), captured(1))
s"\\g<second>\\g<1>"
julia> replace("ab", p => t)
"ba"
julia> p = pattern(captured(1, relative = true), capture("a"))
r"\g<+1>(a)"
julia> occursin(p, "aa")
true