Pattern matching

Pattern matching #

Pattern matching is a traditional feature of functional languages. It pairs well with sum types (which we will see later).

Pattern matching is no longer exclusive to functional languages: several imperative languages now support it to some extent (natively or via libraries). This includes:

  • C# (since 2020),
  • Java (since 2021),
  • PHP (since 2021),
  • Python (since 2021)
  • Rust (since its first release in 2015).

in Haskell #

A pattern is a syntactic expression that can match certain values.

Examples. Here are some basic Haskell patterns:

  • the pattern 2 matches the number 2,
  • the pattern "Unibz" matches the string "Unibz",
  • the pattern True matches the Boolean value True,
  • a variable name (like x, xs or maxValue) matches anything.

Patterns let us define a function on a case-by-case basis.

Example. The function below maps 0 to 0, and every other integer to its successor.

fancySucc :: Integer -> Integer
-- if the argument is 0, then map it to 0
fancySucc 0 = 0
-- otherwise map the argument x to x + 1
fancySucc x = x + 1

Terminology. The function above uses:

  • the pattern 0 on the left-hand side of the first equation fancySucc 0 = 0, and
  • the pattern x on the left-hand side of the second equation fancySucc x = x + 1.

Warning. The first pattern that matches a value takes precedence. So the order of equations matters.

What does the following function do? Can it be simplified?

dummySucc :: Integer -> Integer
dummySucc x = x + 1
dummySucc 0 = 0
dummySucc :: Integer -> Integer
dummySucc x = x + 1
dummySucc 0 = 0     -- this equation cannot be reached

This function maps an integer to its successor.

The second equation (dummySucc 0 = 0) cannot be reached, because the input 0 is already handled by the first pattern.

So the function is equivalent to:

succ :: Integer -> Integer
succ x = x + 1

Patterns can be composed into tuples.

Example. The following function maps a pairs of Boolean values to their conjunction.

and :: (Bool, Bool) -> Bool
-- if both arguments are `True`
and (True,True) = True
-- otherwise
and (x,y) = False

Reminder. Function that map pairs of values (like the one above) are unconventional in Haskell. Their curried form is usually preferred.

Write the function above in curried form.

Curried form:

and :: Bool -> Bool -> Bool
and True True = True
and x y = False

Or equivalently (with infix notation):

and :: Bool -> Bool -> Bool
True `and` True = True
x `and` y = False

What does the following function do?

op :: Bool -> Bool -> Bool
False `op` x = False
True `op` x = x

It is equivalent to the function and above.

Syntax. A same variable name cannot appear twice on the left-hand side of an equation.

Example. The following is not valid syntactically in Haskell

invalidXNor :: Bool -> Bool -> Bool
x `invalidXNor` x = True    -- invalid equation!!!
x `invalidXNor` y = False

The null pattern #

The null pattern _ matches any value (like a variable name), but the value that it matches cannot be reused. It can be convenient to indicate that the value is irrelevant.

Example. Let us reproduce the function op introduced above:

op :: Bool -> Bool -> Bool
False `op` x = False
True `op` x = x

This function can equivalently be written:

op :: Bool -> Bool -> Bool
-- the value of the "second argument" is irrelevant
False `op` _ = False
-- if this equation is reached, then the value of the "first argument" must be `True`
_ `op` x = x

Contrary to a variable, the null pattern can be used several times on the left-hand side of an equation. In such an equation, the values matched by _ can be different.

Example. Let us reproduce the function and introduced above:

and :: Bool -> Bool -> Bool
True `and` True = True
x `and` y = False

This function can equivalently be written:

and :: Bool -> Bool -> Bool
True `and` True = True
-- if this equation is reached, then the values of the arguments is irrelevant
_ `and` _ = False

Case expression #

Syntax. A case expression has the form

case <expression> of
  <pattern 1> -> <expression 1>
  ...
  <pattern n> -> <expression n>

It can be helpful to pattern match the output of a function call.

Example.

-- Maps an address to the corresponding city
extractCity :: String -> String
extractCity address = <some expression here>

-- Maps an address to the corresponding province
getProvince :: String -> String
-- call the function `extractCity` with the input address
getProvince address = case extractCity address of
  -- pattern match the output
  "Bolzano" -> "BZ"
  "Merano" -> "BZ"
  "Trento" -> "TN"
  _ -> "Unknown"

Alias definition #

Syntax. A pattern can also be used in an alias definition.

This also lets us pattern match the output of a function call.

Example.

-- This auxiliary function returns a pair
auxFunction :: Char -> (String, Int)
auxFunction c = <some expression here>

-- This function uses a `where` alias to pattern match the output of the auxiliary function
myFunctionW :: Char -> Bool
myFunctionW c = length s == x
      where (s, x) = auxFunction c

-- Same function with a `let` alias
myFunctionL :: Char -> Bool
myFunctionL c =
         let (s, x) = auxFunction c
         in length s == x

Lambda expression #

Syntax. A pattern can also be used on the left-hand side of a lambda expression.

Example. The two following functions are equivalent:

aFunction :: (Int,Int) -> Int
aFunction (x,y) =  x + y

theSameFunction :: (Int,Int) -> Int
theSameFunction = \(x,y) -> x + y

Reminder. The function above is unconventional in Haskell. Its curried form is usually preferred:

aFunction :: Int -> Int -> Int
aFunction x y = x + y

Complex patterns #

We will see later on that pattern matching extends to lists and composite types.