Happy 2024! In this series of posts I'm going to be running through the implementation of a lightweight SQLite wrapper, from scratch, in Swift.

In the previous post we managed to open a SQLite connection, but what use is a connection to a database if you can't insert values or later retrieve them? Now, we'll focus on values.

Values

So which data types do we want to support? SQLite defines a number of storage classes, which we can map directly to a Swift enum. These are the types that we'll support storing and retrieving directly to / from the database.

public enum Value: Sendable, Equatable {
  case null
  case integer(Int64)
  case real(Double)
  case text(String)
  case blob(Data)
}

...and for a given Value type, we want to support both binding a Value into a statement (effectively interpolating it into the statement), and extracting it from a statement result (for when we've queried some data).

Let's check out binding first. Say we have a query with a placeholder, such as SELECT * from user WHERE id = ? (note that the ? is SQLite syntax - see the documentation for the various placeholders that can be used). We'd like to be able to bind a Value representing the id we're searching for into this statement (which has already been prepared and is represented by a StatementHandle). This mostly involves using the correct sqlite3_ API for the Values type.

public enum Value {

  ...

  /// Binds this `Value` into `statementHandle` at the specified `index`.
  func bind(
    connectionHandle: ConnectionHandle,
    statementHandle: StatementHandle,
    index: Int,
    query: String
  ) throws {
    let bindResult = switch self {
    case .null:
      sqlite3_bind_null(statementHandle, Int32(index))
    case .integer(let int):
      sqlite3_bind_int64(statementHandle, Int32(index), int)
    case .real(let double):
      sqlite3_bind_double(statementHandle, Int32(index), double)
    case .text(let string):
      sqlite3_bind_text(
        statementHandle,
        Int32(index),
        string,
        -1,
        Self.transientDestructorType
      )
    case .blob(let data):
      data.withUnsafeBytes { bytes in
        sqlite3_bind_blob(
          statementHandle,
          Int32(index),
          bytes.baseAddress,
          Int32(bytes.count),
          Self.transientDestructorType
        )
      }
    }
    guard bindResult == SQLITE_OK else {
      throw Error.argumentBind(
        query: query,
        argumentIndex: index,
        value: self,
        description: Error.errorDescription(connectionHandle: connectionHandle)
      )
    }
  }

  private static let transientDestructorType = unsafeBitCast(
    -1,
    to: sqlite3_destructor_type.self
  )
}

I've defined a new Error case to handle errors during binding.

One bit of obscurity here is transientDestructorType, an ugly unsafeBitCast to sqlite3_destructor_type. This is simply a parameter that instructs SQLite how to manage the lifetime of the bound values, and the chosen value (-1) will instruct SQLite to copy the values at the point of binding. More info can be found in the documentation.

Next up is initializing a Value following the result of a statement. Ths is effectively the inverse of the above, and involves asking SQLite for the type of the column at the requested index, and mapping a Value from the enclosed value. Pulling out Datas requires a bit of juggling, but it's mostly mechanical.

public enum Value {

  ...

  /// Initializes a `Value` from the result of an executed `statement`, extracting the value
  /// for `columnIndex`.
  init(
    connectionHandle: ConnectionHandle,
    statementHandle: StatementHandle,
    query: String,
    columnIndex: Int,
    columnName: String
  ) throws {
    switch sqlite3_column_type(statementHandle, Int32(columnIndex)) {
    case SQLITE_NULL:
      self = .null
    case SQLITE_INTEGER:
      self = .integer(sqlite3_column_int64(statementHandle, Int32(columnIndex)))
    case SQLITE_FLOAT:
      self = .real(sqlite3_column_double(statementHandle, Int32(columnIndex)))
    case SQLITE_TEXT:
      guard let textPointer = sqlite3_column_text(statementHandle, Int32(columnIndex)) else {
        throw Error.resultValue(query: query, column: columnName)
      }
      self = .text(String(cString: textPointer))
    case SQLITE_BLOB:
      let byteLength = sqlite3_column_bytes(statementHandle, Int32(columnIndex))
      if byteLength > 0 {
        guard let bytes = sqlite3_column_blob(statementHandle, Int32(columnIndex)) else {
          throw Error.resultValue(query: query, column: columnName)
        }
        self = .blob(Data(bytes: bytes, count: Int(byteLength)))
      } else {
        self = .blob(Data())
      }
    default:
      throw Error.execute(
        query: query,
        description: Error.errorDescription(connectionHandle: connectionHandle)
      )
    }
  }
}

A new Error.resultValue case is used to report errors when extracting Values from results.

Execution

So we now have Values, which we can both bind in to statements, and pull out of results. Let's update Connection.execute to do both.

When fetching values, a statement may return a number of "rows" (e.g. column names paired with values), so let's define a quick typealias for a single result row.

public typealias Row = [String: Value]

Now let's tackle binding arguments into a query. We can update our top-level Connection.execute function to also take a number of Values, and use our APIs added above to bind them into our statement once it's been prepared. Note also that this function now returns a number of Rows - we'll be updating our other execute func in a sec to actually build and return the Rows.

public actor Connection {

  ...

  @discardableResult
  public func execute(_ query: String, _ arguments: Value...) throws -> [Row] {
    // Prepare a statement for `query`, retrieving a `StatementHandle`.
    var statementHandle: StatementHandle?
    let prepareResult = sqlite3_prepare_v3(
      connectionHandle,
      query,
      -1,
      0,
      &statementHandle,
      nil
    )
    guard prepareResult == SQLITE_OK, let statementHandle else {
      throw Error.execute(
        query: query,
        description: Error.errorDescription(connectionHandle: connectionHandle)
      )
    }

    // Ensure the statement is finalized following execution (even if execution or binding 
    // fails).
    defer {
      sqlite3_finalize(statementHandle)
    }

    // Bind all arguments into the statement. Note that the index starts at 1.
    var index = 1
    for argumentValue in arguments {
      try argumentValue.bind(
        connectionHandle: connectionHandle,
        statementHandle: statementHandle,
        index: index,
        query: query
      )
      index += 1
    }

    // Execute the statement.
    return try execute(query: query, statementHandle: statementHandle)
  }
}

And finally, let's update our other execute func (called just above) which actually does the work of executing the statement and pulling out the Rows. This looks much more complex than before, but it's mostly straightforward; the comments outline the flow of logic required to build and return the Rows.

public actor Connection {

  ...

  private func execute(query: String, statementHandle: StatementHandle) throws -> [Row] {
    var rows: [Row] = []
    var cachedColumnNames: [String]?

    // Continuously call `sqlite3_step` until execution is complete, or there's an error.
    while true {
      let stepResult = sqlite3_step(statementHandle)

      // Check for errors.
      guard stepResult == SQLITE_ROW || stepResult == SQLITE_DONE else {
        throw Error.execute(
          query: query,
          description: Error.errorDescription(connectionHandle: connectionHandle)
        )
      }

      // Build a `Row` if one is available.
      if stepResult == SQLITE_ROW {
        // Get the column names for the `Row`s, using cached names if available.
        let columnNames = try {
          if let cachedColumnNames {
            return cachedColumnNames
          } else {
            // Fetch the column names from the DB, but cache them in `cachedColumnNames`.
            let columnCount = Int(sqlite3_column_count(statementHandle))
            let columnNames = try (0 ..< columnCount).map { index in
              guard let columnNamePointer = sqlite3_column_name(statementHandle, Int32(index))
              else {
                throw Error.execute(
                  query: query,
                  description: Error.errorDescription(connectionHandle: connectionHandle)
                )
              }
              return String(cString: columnNamePointer)
            }
            cachedColumnNames = columnNames
            return columnNames
          }
        }()

        // Construct the `Row`, using the initializer on `Value` we added previously.
        rows.append(try (0 ..< columnNames.count).reduce(into: Row()) { row, columnIndex in
          row[columnNames[columnIndex]] = try Value(
            connectionHandle: connectionHandle,
            statementHandle: statementHandle,
            query: query,
            columnIndex: columnIndex,
            columnName: columnNames[columnIndex]
          )
        })
      }

      switch stepResult {
      case SQLITE_DONE:
        // All done, can return the `Row`s we've constructed.
        return rows
      case SQLITE_ROW:
        // More rows to fetch, continue stepping.
        continue
      default:
        throw Error.execute(
          query: query,
          description: Error.errorDescription(connectionHandle: connectionHandle)
        )
      }
    }
  }
}

Since we're matching up values to columns, we need to fetch the column names from the query result. This can be done using both sqlite3_column_count and sqlite3_column_name, and we cache these in case we're fetching a large number of Rows for a single query.

Ergonomic Improvements

We can now write some tests to insert and retrieve values from our DB! Let's add a simple test to ConnectionTests.

final class ConnectionTests: XCTestCase {
  ...

  /// Verifies that values can be bound into queries, before being fetched.
  func testFetchValues() async throws {
    // Given:
    let connection = try await Connection(url: temporaryDatabaseURL())
    try await connection
      .execute("CREATE TABLE test (id INTEGER NOT NULL, info TEXT, num REAL)")

    // When:
    try await connection.execute("INSERT INTO test VALUES (1, ?, ?)", .text("i1"), .null)
    try await connection.execute("INSERT INTO test VALUES (2, ?, ?)", .null, .real(1.2))
    try await connection.execute("INSERT INTO test VALUES (3, ?, ?)", .text("i3"), .real(3.4))
    let rows = try await connection.execute("SELECT * FROM test ORDER BY id ASC")

    // Then:
    // Verify `rows` directly.
    XCTAssertEqual(rows[0], ["id": .integer(1), "info": .text("i1"), "num": .null])
    XCTAssertEqual(rows[1], ["id": .integer(2), "info": .null, "num": .real(1.2)])
    XCTAssertEqual(rows[2], ["id": .integer(3), "info": .text("i3"), "num": .real(3.4)])
  }
}

This passes, however having to use the Value cases directly (.text(...), .real(...), etc) feels a little ugly. Can we do better?

Rather than working with Values directly, that's define a new ValueConvertible protocol for types that can be converted to / from Values.

public protocol ValueConvertible: Sendable {
  var value: Value { get }
  init(value: Value) throws
}

We can then add conformances for some typical types. I've just included Double and String below, however you can imagine extending this to other types (including Int, Data, etc.). I've also added a new Error case for when an unexpected conversion is attempted.

extension Double: ValueConvertible {
  public var value: Value {
    .real(self)
  }

  public init(value: Value) throws {
    switch value {
    case .real(let real):
      self = real
    default:
      throw Error.unexpectedValueType(value: value, expectedTargetType: "Double")
    }
  }
}

extension String: ValueConvertible {
  public var value: Value {
    .text(self)
  }

  public init(value: Value) throws {
    switch value {
    case .text(let text):
      self = text
    default:
      throw Error.unexpectedValueType(value: value, expectedTargetType: "String")
    }
  }
}

Finally, let's add a few helper conformances. Value itself should conform to ValueConvertible, and we can also add support for Optionals that wrap a ValueConvertible type themselves. Additionally, we want an easy way to extract a certain target ValueConvertible from a Value - a new get method is used to support this.

extension Value {
  /// Attempts to convert this `Value` into a `T`.
  func get<T: ValueConvertible>(_: T.Type = T.self) throws -> T {
    try T(value: self)
  }
}


extension Value: ValueConvertible {
  public var value: Value { self }
  public init(value: Value) { self = value }
}

extension Optional: ValueConvertible where Wrapped: ValueConvertible {
  public var value: Value {
    if let self {
      self.value
    } else {
      .null
    }
  }

  public init(value: Value) throws {
    switch value {
    case .null:
      self = nil
    default:
      self = try Wrapped(value: value)
    }
  }
}

A small change to execute is required in order to take ValueConvertibles rather than Values:

public actor Connection {

  ...

  // We now take `ValueConvertible`s rather than `Value`s...
  @discardableResult
  public func execute(_ query: String, _ arguments: any ValueConvertible...) throws -> [Row] {

    ...

    // ...and map each argument to a `Value` when iterating over.
    for argumentValue in arguments.map(\.value) {

        ...
  }
}

Phew! Now we can use those base ValueConvertible types in our tests, allowing us to pass Strings and Doubles directly to execute rather than having to wrap them in Values first.

   /// Verifies that values can be bound into queries, before being fetched.
  func testFetchValues() async throws {
    // Given:
    let connection = try await Connection(url: temporaryDatabaseURL())
    try await connection
      .execute("CREATE TABLE test (id INTEGER NOT NULL, info TEXT, num REAL)")

    // When:
    try await connection.execute("INSERT INTO test VALUES (1, ?, ?)", "i1", Value.null)
    try await connection.execute("INSERT INTO test VALUES (2, ?, ?)", Value.null, 1.2)
    try await connection.execute("INSERT INTO test VALUES (3, ?, ?)", "i3", 3.4)
    let rows = try await connection.execute("SELECT * FROM test ORDER BY id ASC")

    // Then:
    XCTAssertEqual(try rows[0]["id"]?.get(), 1)
    XCTAssertEqual(try rows[1]["id"]?.get(), 2)
    XCTAssertEqual(try rows[2]["id"]?.get(), 3)
    XCTAssertEqual(try rows[0]["info"]?.get(), "i1")
    XCTAssertEqual(try rows[1]["info"]?.get(), String?.none)
    XCTAssertEqual(try rows[2]["info"]?.get(), "i3")
    XCTAssertEqual(try rows[0]["num"]?.get(), Double?.none)
    XCTAssertEqual(try rows[1]["num"]?.get(), 1.2)
    XCTAssertEqual(try rows[2]["num"]?.get(), 3.4)
  }

One thing you may have noticed is that we're still dealing with stringly-typed column names for our returned Rows. We won't try and improve this right now, however one option could be to make execute generic over a container type for the specific row that we're fetching. This container would need some way of allowing string access to its properties (macros would likely help here!), however you could then centralize all of the row building and conversion logic into execute, and know that if it succeeds, you have a nicely typed row instance back that you can use.

Before we finish up, an exercise for the reader. What other types could you add ValueConvertible conformance to? Maybe UUIDs, storing them as data blobs? You're ultimately not limited to SQLite's storage types, and as long as you consider that your way of storing a certain type may not be compatible with someone else's (if you're sharing the DB with another application), you can be creative.

Conclusion

That's probably enough code for today, so let's wrap up. We've done a lot so far, and in theory you could stop here: you have a way to spin up a SQLite connection, execute queries against it, and a way to extract values. This is probably the most lightweight of lightweight wrappers you could think of. The code up to this point is available on Github.

But there's a few nice improvements still to make. Next time we'll be looking at moving from a single Connection to a pool of Connections, unlocking parallel reads (with a write). We'll also be adding support for transactions, allowing for multiple statements to be executed such that you can be confident that they either all succeed, or they get rolled back. See you then!