Semigroups: Our First Algebraic Structure
In the previous chapter, we discussed associativity, a key property of some binary operations. Now, let’s explore our first formal algebraic structure built around such operations: the Semigroup.
Understanding Semigroups Through Types and Operations
Section titled “Understanding Semigroups Through Types and Operations”A semigroup is an algebraic structure (A, •)
where:
A
is a non-empty set (in programming terms, a Type).•
(the “dot” operator, representing our binary operation) is a function that takes two elements from setA
and returns an element also inA
. In programming terms, this is a function with a generic type signature like'a -> 'a -> 'a
.- Recalling from Unit 1, this type signature (due to right-associativity, often related to currying) means the function takes a first argument of type
'a'
, and returns a new function of type'a -> 'a'
. This new function then takes the second argument of type'a'
and produces the final result, also of type'a'
. - Crucially, the operation is closed over the set/type: combining two
'a'
s always yields an'a'
.
- Recalling from Unit 1, this type signature (due to right-associativity, often related to currying) means the function takes a first argument of type
- The operation
•
must satisfy the associative law: for allx, y, z
inA
,(x • y) • z = x • (y • z)
.
Let’s see this in familiar examples:
Numbers with Addition and Multiplication
Section titled “Numbers with Addition and Multiplication”For the Type int
(the set of integers):
-
Binary Operation: Addition, represented by the function
(+)
.- Type:
int -> int -> int
. (Here, our generic'a
is specialized toint
). - This function takes two
int
s and returns anint
.
- Type:
-
Associativity:
(1 + 2) + 3 = 1 + (2 + 3)
holds true. -
Therefore, (
int
,(+)
) forms a semigroup. -
Binary Operation: Multiplication, represented by the function
(*)
.- Type:
int -> int -> int
. - This function takes two
int
s and returns anint
.
- Type:
-
Associativity:
(1 * 2) * 3 = 1 * (2 * 3)
holds true. -
Therefore, (
int
,(*)
) forms a semigroup.
(Note: Integer subtraction (-)
and division (/)
are binary operations on integers, but they are not associative, so (int
, (-)
) and (int
, (/)
) do not form semigroups.)
Strings with Concatenation
Section titled “Strings with Concatenation”For the Type string
(the set of all possible strings):
- Binary Operation: String concatenation, represented by the function
(+)
(in F# for strings).- Type:
string -> string -> string
. - This function takes two
string
s and returns astring
.
- Type:
- Associativity:
("He" + "l") + "lo" = "He" + ("l" + "lo")
holds true. - Therefore, (
string
,(+)
for concatenation) forms a semigroup.
Physical Examples: The Same Pattern
Section titled “Physical Examples: The Same Pattern”LEGO Blocks
Section titled “LEGO Blocks”- Type (Set): The set of all physical LEGO blocks.
- Binary Operation (Function): The act of physically connecting two LEGO blocks. This operation takes two LEGO blocks and results in a new, combined LEGO block. Crucially, the result is still a “LEGO block” – the operation is closed within the type.
- Associativity: As discussed,
(A joined with B) joined with C
is the same asA joined with (B joined with C)
. - Therefore, (LEGO blocks, join operation) forms a semigroup.
USB Devices
Section titled “USB Devices”- Type (Set): The set of all USB devices.
- Binary Operation (Function): The act of connecting two USB devices via a USB hub. This operation takes two USB devices and results in a new, combined USB device configuration. The result is still a “USB device (set),” meaning the operation is closed within the type.
- Associativity: Nesting hubs in different orders yields the same final USB device configuration.
- Therefore, (USB devices, hub connection operation) forms a semigroup.
Why Semigroups Matter
Section titled “Why Semigroups Matter”The real power of recognizing that a certain Type and a binary operation (a function of type 'a -> 'a -> 'a
) form a semigroup is that we gain a guarantee:
- We have a well-defined set of values (our Type).
- We have a binary operation (our Function) that is:
- Closed: It always takes two values of our type and returns a value of the same type.
- Associative: It combines values in a predictable way, meaning the order of grouping for multiple operations doesn’t change the final outcome.
This allows us to reliably chain these operations together in our data transformation pipelines, much like we can confidently connect LEGO blocks in any grouping.
The semigroup structure ensures that for a sequence of operations a • b • c • d ...
, we can compute it as ((a • b) • c) • d
or a • (b • (c • d))
or any other valid grouping, and the result will be identical. This predictability is invaluable for building complex systems, optimizing computations (e.g., parallel processing), and reasoning about our code.