In Swift, structs are value types. What does that mean in practice? Let's consider the following:
struct Recipe {
var name: String
var instructions: String
var image: UIImage
}
let a = Recipe(
name: "Omelette",
instructions: "Break some eggs",
image: UIImage(named: "omelette")!)
var b = a // Copy!
When initializing b
from a
, copying a
, the struct is copied in its entirety, resulting in a copy of each of its stored properties. A full, recursive, copy for any properties that are structs (name
and instructions
), and a quick copy of the reference for reference-type properties (image
, which is an instance of a UIImage
, a class).
Now, a
and b
have their own name
/instruction
s, but share the same underlying UIImage
instance for image
.
b.name = "A different omelette"
print(a.name) // "Omelette"
print(b.name) // "A different omelette"
print(a.image === b.image) // "true"
What about collections? Array
is a struct, and the idea of copying it in its entirety frequently doesn't sound particularly efficient.
let values1 = Array(repeating: 0, count: 1000000)
var values2 = values1 // Is this copying _everything_ ?? 😱😱
Thankfully, that's not the case. Whilst they still follow the rules above, standard library collections also use an optimization pattern called copy on write.
Internally, Array
is backed by a reference-type buffer, so the initialization of values2
just copies this reference. A deep copy of this buffer is only performed when the Array
is actually mutated (and only if the buffer is actually shared).
var values1 = Array(repeating: 0, count: 1000000)
values1.append(1) // No copy here, the buffer isn't shared
var values2 = values1 // values1, values2 share the same buffer
values2.append(2) // Entire buffer copied before appending!
print(values1) // "[0,0,...,1]"
print(values2) // "[0,0,...,2]"
A similar optimisation is implemented for Set
and Dictionary
.
It's important to note that copy on write does not come for free. There's no magic in Swift that delays copying the contents of your struct until a mutation is performed. On a struct copy, all stored properties are copied. But for stored properties that are reference types, copying the reference is very quick).
So how can we implement this ourselves? Let's imagine that we're implementing our own struct to store a collection of Int
s. In practice you'd use an Array<Int>
for this (and you definitely wouldn't just wrap an NSMutableArray
) but bear with me.
struct IntCollection {
/// Private backing store for our values.
private let values = NSMutableArray()
/// Appends a value to this `IntCollection`.
func append(_ value: Int) {
values.add(value)
}
/// Subscript access, similarly to `Array`.
subscript(index: Int) -> Int {
get { return values[index] as! Int }
set { values[index] = newValue }
}
}
Right now, our IntCollection
doesn't even properly implement value semantics. Performing a copy just copies the values
reference, so any copies always share the same underlying NSMutableArray
. This isn't the expected behaviour for a struct; each copy should give us a semantically unique instance to work with.
var collection = IntCollection()
collection.append(1)
print(collection[0]) // "1"
var copy = collection
copy[0] = 999
print(copy[0]) // "999"
print(collection[0]) // Also "999"!
One way to fix this would be to copy our values into a fresh NSMutableArray
on every mutation, however this results in a load of wasteful copy operations. For our copy on write implementation, we want to copy all of our values into a fresh NSMutableArray
, but only if that NSMutableArray
is currently being shared by multiple IntCollection
s. How do we keep track of that sharing?
Thankfully, the standard library gives us isKnownUniquelyReferenced(_:)
, which serves this exact purpose. It'll return true
if it's argument only has a single strong reference. 🧙♂️
Let's modify IntCollection
to take advantage:
struct IntCollection {
/// Private backing store for our values.
private var values = NSMutableArray()
/// Appends a value to our collection.
/// Must now be mutating, as it may modify values.
mutating func append(_ value: Int) {
copyValuesIfNecessary()
values.add(value)
}
/// Subscript semantics, similar to Array.
subscript(index: Int) -> Int {
get { return values[index] as! Int }
set {
copyValuesIfNecessary()
values[index] = newValue
}
}
/// Copies values if it's shared. Must be done before mutation
private mutating func copyValuesIfNecessary() {
if !isKnownUniquelyReferenced(&values) {
values = values.mutableCopy() as! NSMutableArray
}
}
}
Now, we get the best of both worlds. Efficient copies (since only our reference to values
is copied routinely), and the ability to mutate our copies separately (since a full copy of values
will be performed at this point).
var collection = IntCollection()
collection.append(1)
print(collection[0]) // "1"
var copy = collection // A quick operation.
copy[0] = 999 // Full copy performed here.
print(collection[0]) // Still "1".
inout
ParametersThere's one interesting point when passing copy on write types as inout
parameters. If you check out the Swift docs, inout
parameter passing is described as follows:
- When the function is called, the value of the argument is
copied.
- In the body of the function, the copy is modified.
- When the function returns, the copy’s value is assigned to
the original argument.
This would seem to imply that the following performs a full copy.
func mutate(_ inoutCollection: inout IntCollection) {
inoutCollection.append(5) // Does this perform a full copy?
}
var collection = IntCollection()
collection.append(1)
mutate(&collection)
Inside mutate(_:)
, the copy is mutated, and at this point the underlying NSMutableArray
is referenced twice. Once by collection
, and once by inoutCollection
(the copy). Wouldn't copy on write kick in here?
Turns out that the model described in the Apple docs is more of a mental model, rather than a description of the actual implementation. Further docs state:
As an optimization, when the argument is a value stored at a
physical address in memory, the same memory location is used
both inside and outside the function body.
Therefore, the initial copy is skipped in certain cases. No copy, no multiple references, and no full copy of the underlying NSMutableArray
. Neat!
If you test this scenario, printing out when a full copy takes place, you'll find that the call to mutate
will not copy the underlying NSMutableArray
. It's as if our IntCollection
were truly passed by reference.