Making Retcon fast: A cache for every need
Retcon 1.0 was decently fast in modestly-sized repositories, but appallingly slow for anything big. You’d have to watch the app freeze for an agonizing 8.82 seconds to merely rename a commit in the Swift repo. In Retcon 1.4, that same rename now completes in 448 forgettable milliseconds; a 1868% improvement. What’s the secret there?
Well, there’s no single thing. Months of work resulted in numerous, surprisingly diverse optimizations; some expected, some eyebrow-raising. Despite that variety, however, one kind of optimization ended up the most common by far: caches.
Many values, many caches
Retcon now has numerous in-memory caches. Anything that’s slightly expensive to compute is stored in memory, to be reused the next time it’s needed. To make sure these cached values don’t go out of date, they’re either discarded whenever the input data changes, or are stored alongside a fingerprint of the input data, to allow checking validity on retrieval.
One of Retcon’s most important classes is VirtualizedRepository. VR objects mediate access to Git repositories, while transparently layering Retcon’s own data on top. They’re core to allowing Retcon to display and manipulate states that Git itself can’t represent, such as a commit history with a conflict midway through.
Because of this, a VR exposes many different properties, most derived from on-disk data, that get accessed quite heavily. And so these properties are cached: the head commit list is cached as an array, allowing for random access, instead of requiring walking the history one commit at a time through parent lookups. The repository’s tags are cached in a purpose-made map, cachedTagsBySHA1, that makes it very fast to look up a given commit’s tags, instead of needing to filter through the repository’s complete tag list every time. There’s the precisely-named cachedPhysicalSecondaryHeadAncestorsByMergeCommitLineage, a map of the commits that were merged into the current branch, keyed by the identity[1] of their merge commit. Purpose-made, once again. And there’s even a cache of the behind/ahead counts, which are eventually displayed on the pull/push buttons: if your branch has significantly diverged from its upstream, finding the closest shared parent can take a while.
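To make the tag lookup concrete, here’s a minimal sketch of how such a map could be built. It isn’t Retcon’s actual code: the `Tag` type, the `makeTagsBySHA1` helper and the sample SHA-1 strings are all assumptions made up for illustration; only the name `cachedTagsBySHA1` comes from the text above.

```swift
// Hypothetical stand-in for a Git tag; not Retcon's actual type.
struct Tag: Equatable {
    let name: String
    let targetSHA1: String
}

/// Groups tags by the SHA-1 of the commit they point to, so that a commit's
/// tags can be fetched with one dictionary lookup.
func makeTagsBySHA1(_ tags: [Tag]) -> [String: [Tag]] {
    Dictionary(grouping: tags, by: { $0.targetSHA1 })
}

// Placeholder data, purely illustrative.
let allTags = [
    Tag(name: "v1.0", targetSHA1: "a1b2c3"),
    Tag(name: "v1.4", targetSHA1: "d4e5f6"),
    Tag(name: "stable", targetSHA1: "d4e5f6"),
]
let cachedTagsBySHA1 = makeTagsBySHA1(allTags)

// One hash-table access, instead of filtering `allTags` for every commit row.
let tagNames = (cachedTagsBySHA1["d4e5f6"] ?? []).map { $0.name }
```

The map costs one pass over the tag list to build, after which each commit row pays for a single lookup rather than a full scan.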
Many of the VR caches are regenerated from scratch whenever any repository data changes, so that they’re always up to date. But some are handled with more granularity, for performance.
The head commit list cache is updated, not discarded. When a repo changes, the vast majority of the history usually stays the same; so Retcon holds onto the current cache’s tail of unchanged commits, and prepends new and modified commits[2].
The merged commit map is never proactively built, but instead gets populated as data is requested. This is because in Retcon, merge commits are initially displayed collapsed: their child list is only needed once the user toggles them open.
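The head commit list update can be sketched roughly as follows. This is an illustration under assumptions, not Retcon’s implementation: `HeadCommitCache`, its properties and the `parent` closure are hypothetical names, commits are reduced to SHA-1 strings, and the history is treated as a simple first-parent chain.

```swift
/// Hypothetical commit list cache: newest commit first, plus a SHA-1 -> index map.
struct HeadCommitCache {
    private(set) var shas: [String] = []
    private(set) var indicesBySHA1: [String: Int] = [:]

    /// Updates the cache for a new head. `parent` is an assumed stand-in for an
    /// expensive parent lookup; it returns nil for a root commit.
    /// Returns how many commits had to be walked.
    @discardableResult
    mutating func update(head: String, parent: (String) -> String?) -> Int {
        var fresh: [String] = []
        var reusedTail: [String] = []
        var current: String? = head
        while let sha = current {
            if let index = indicesBySHA1[sha] {
                // A commit's SHA-1 covers its parents, so the rest is unchanged.
                reusedTail = Array(shas[index...])
                break
            }
            fresh.append(sha)
            current = parent(sha)
        }
        shas = fresh + reusedTail
        // Rebuilt wholesale here for brevity; a real cache could shift indices instead.
        indicesBySHA1.removeAll(keepingCapacity: true)
        for (index, sha) in shas.enumerated() {
            indicesBySHA1[sha] = index
        }
        return fresh.count
    }
}

// A toy history a <- b <- c, stored as child -> parent.
var parents: [String: String] = ["c": "b", "b": "a"]
var cache = HeadCommitCache()
let firstWalk = cache.update(head: "c") { parents[$0] }

// Rewriting the head commit produces a new SHA-1 ("c2") on the same parent.
parents["c2"] = "b"
let secondWalk = cache.update(head: "c2") { parents[$0] }
```

In this toy run, the first update walks all three commits, while the second stops as soon as it reaches `b`, which is already in the cache.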
Rigorously not measuring text
Caches are useful for more than abstract, internal state. Computing the layout of UI elements can also be expensive, once again inviting caches; Retcon’s best example is the measuring of diff hunk heights.

Three hunks, from two different files.
Now, while generating diff hunks is fast, computing their on-screen height is not. Retcon doesn’t let the text scroll horizontally, instead wrapping it at the window edge; this means that to know the total pixel height of a hunk, the layout system must scan through the entirety of its text, determining line wrapping points, and therefore the number of lines the hunk actually spans[3]. This is a very, very slow process for large amounts of text.
Which brings us back to caches. It’s a no-brainer to cache the output of this expensive measuring process, as a given hunk will often stay the exact same when the diff view is refreshed. Here’s the tricky part, though: what does it mean to be the exact same? If the hunk’s text remains unchanged, but the window becomes wider, then we should redo the measurement—more horizontal space means less wrapping. If one of the hunk’s neighbors changes in content, we ostensibly don’t need to reflow the unchanged hunk itself—unless the neighbor’s line numbers are now different, since these dictate the width of the line number gutter, which is shared by all of a file’s hunks. So many variables to take into account, coming from such distinct sources!
So we need to invalidate our cache based on these diverse factors, and do so with precision: while the height must never go out of date, we also want to avoid costly superfluous invalidations. To achieve this granularity, Retcon makes use of a supporting method, that fetches all the relevant inputs for a given hunk, and mixes them into a single integer:
/// A hash of all the properties that have an impact on this hunk's view height.
func heightDependenciesHash(
    forHunkAtIndex index: Int,
    withParentWidth parentWidth: CGFloat
) -> Int? {
    guard let viewHunk = viewHunks[safeIndex: index]
    else { assertionFailure(); return nil }
    var hasher = Hasher()
    hasher.combine(parentWidth)
    hasher.combine(contextIsDough)
    hasher.combine(effectiveForceDisplayingHunks)
    viewHunk.hunk.combineText(into: &hasher)
    hasher.combine(hunkGutterWidth)
    return hasher.finalize()
}
The method essentially captures the complete context in which a hunk is measured, and represents it as a hash. This allows us to store the hash alongside the cached height, and use it as the cache key: whenever the cache is read back, we first check if the stored hash matches the current output of the method. If the two differ, it means the cached height is stale, and we need to take a fresh measurement.
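The read-back check described above might look something like this. It’s a sketch under assumptions rather than Retcon’s code: `CachedHeight`, `HunkHeightCache`, `measure` and the simplified `dependenciesHash(text:parentWidth:)` function are hypothetical, and `Double` stands in for `CGFloat` to keep the example free of imports.

```swift
/// Hypothetical cache entry: a measured height, plus the hash it was measured under.
struct CachedHeight {
    let dependenciesHash: Int
    let height: Double
}

final class HunkHeightCache {
    private var entries: [Int: CachedHeight] = [:]  // keyed by hunk index
    private(set) var measurementCount = 0

    /// Returns the cached height if its stored hash matches `currentHash`.
    /// Otherwise calls `measure` (a stand-in for the slow text layout pass)
    /// and stores the fresh result alongside the new hash.
    func height(
        forHunkAtIndex index: Int,
        currentHash: Int,
        measure: () -> Double
    ) -> Double {
        if let entry = entries[index], entry.dependenciesHash == currentHash {
            return entry.height
        }
        measurementCount += 1
        let height = measure()
        entries[index] = CachedHeight(dependenciesHash: currentHash, height: height)
        return height
    }
}

/// Simplified stand-in for `heightDependenciesHash`: mixes only text and width.
func dependenciesHash(text: String, parentWidth: Double) -> Int {
    var hasher = Hasher()
    hasher.combine(text)
    hasher.combine(parentWidth)
    return hasher.finalize()
}

let heightCache = HunkHeightCache()
let hunkText = "let x = 1\nlet y = 2"
let narrowHash = dependenciesHash(text: hunkText, parentWidth: 400)
let wideHash = dependenciesHash(text: hunkText, parentWidth: 800)

// The measured values are placeholders, not real layout results.
_ = heightCache.height(forHunkAtIndex: 0, currentHash: narrowHash) { 34 }  // measures
_ = heightCache.height(forHunkAtIndex: 0, currentHash: narrowHash) { 34 }  // cache hit
_ = heightCache.height(forHunkAtIndex: 0, currentHash: wideHash) { 17 }    // stale, re-measures
```

Two things are worth keeping in mind with this design. Swift’s `Hasher` is randomly seeded per process, so these hashes are only meaningful for an in-memory cache, never for one persisted to disk. And since only the hash is compared, a collision could in principle let a stale height through, although the odds are tiny with a 64-bit hash.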
The nice thing about this approach is that it neatly encapsulates all knowledge about hunk height calculation. While this listing of factors is itself duplication (after all, the window width will affect hunk height regardless of what we hash in the method), it does prevent duplicating and scattering that information any further. It’s the one, canonical way of representing hunk cache freshness.
Unrigorously measuring text?
In a slightly different world, we could actually skip measuring most hunks. If a hunk isn’t in view, all we need is an approximation of its height, so that we can properly size the scroll bar’s thumb.
Retcon has a way of producing such an estimate: it maintains estimatedRowHeightToLineCountRatio, a cache of the average ratio between the pixel height of a hunk, and the number of line break characters it contains, based on hunks measured so far. We always know how many line breaks a hunk has, so it’s extremely cheap to get an idea of its height using this value.
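As a rough illustration of the arithmetic, here’s one way such a ratio could be maintained. Everything except the `estimatedRowHeightToLineCountRatio` name is an assumption: the `RowHeightEstimator` type is hypothetical, and it computes the ratio from running totals, which may not be how Retcon averages it.

```swift
/// Hypothetical running average of pixel height per line break,
/// fed by hunks that have been measured exactly.
struct RowHeightEstimator {
    private var totalHeight = 0.0
    private var totalLineBreaks = 0

    /// Records one exactly-measured hunk.
    mutating func record(height: Double, lineBreakCount: Int) {
        totalHeight += height
        totalLineBreaks += lineBreakCount
    }

    /// Pixels per line break so far, or nil before any usable measurement.
    var estimatedRowHeightToLineCountRatio: Double? {
        totalLineBreaks > 0 ? totalHeight / Double(totalLineBreaks) : nil
    }

    /// Cheap height estimate for a hunk that hasn't been measured.
    func estimatedHeight(lineBreakCount: Int) -> Double? {
        estimatedRowHeightToLineCountRatio.map { $0 * Double(lineBreakCount) }
    }
}

var estimator = RowHeightEstimator()
// Placeholder measurements: two 10-line hunks, one of which wrapped more.
estimator.record(height: 170, lineBreakCount: 10)
estimator.record(height: 230, lineBreakCount: 10)

// (170 + 230) / 20 = 20 pixels per line break, so 5 line breaks ≈ 100 pixels.
let estimate = estimator.estimatedHeight(lineBreakCount: 5)
```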
However, this technique turns out to be unusable. AppKit’s NSTableView class, which Retcon uses to display the list of hunks, requires us to provide exact heights, not estimates, when enumerating our hunks[4]. If the values we provide are off by even a few pixels, the view will constantly stutter and jump when scrolled, making for a terrible experience. Oh well.
The long tail
There are many more caches in Retcon. They all bring welcome performance boosts, but most aren’t especially interesting. Here’s one last story for the road.
The welcome window, with its list of recently-opened repositories, could be slow to appear, sometimes stalling for more than a second. The view code would read the system-provided recentDocumentURLs list many times per refresh, with the assumption that the access would be instant; it was not. One cachedRecentDocumentURLs later, and the window now shows up without delay. A welcome improvement!
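The shape of that fix is plain memoization. Here’s a minimal sketch, assuming a hypothetical `CachedValue` wrapper and a `ReadCounter` used only to show how often the slow source gets hit; the string paths are placeholders standing in for the real URLs, and the invalidation trigger is left to the caller.

```swift
/// Hypothetical wrapper: reads a slow source once, then reuses the result
/// until `invalidate()` is called.
final class CachedValue<Value> {
    private let load: () -> Value
    private var cached: Value?

    init(load: @escaping () -> Value) {
        self.load = load
    }

    var value: Value {
        if let cached = cached { return cached }
        let fresh = load()
        cached = fresh
        return fresh
    }

    /// Call whenever the underlying data may have changed.
    func invalidate() {
        cached = nil
    }
}

/// Only here to count how often the slow read actually happens.
final class ReadCounter {
    var count = 0
}

let slowReads = ReadCounter()
let cachedRecentDocumentURLs = CachedValue<[String]> {
    slowReads.count += 1  // stands in for the slow system-provided read
    return ["/repos/swift", "/repos/retcon"]  // placeholder paths
}

// The view code can now read the list as often as it likes per refresh.
for _ in 0..<10 {
    _ = cachedRecentDocumentURLs.value
}
```

After those ten reads the slow source has been consulted once; calling `invalidate()` makes the next read go back to it.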
There’s a lot more to Retcon 1.4 than caches, though; we’ll look at the many other optimizations in future blog posts. Follow along on Mastodon, Bluesky or Atom, and get Retcon now to see firsthand how fast it’s become.
[1] In regular Git, nothing links a rewritten commit to its source commit: they’re entirely unrelated. A commit’s hash completely changes any time it’s even slightly modified. Retcon works around this by assigning each commit a unique identifier called a lineage ID, which it keeps stable across rewrites. This is primarily to allow animating large history changes, such as rebases, to make it much clearer what’s happening.
[2] Would you know it: updating the head commit cache makes use of another cache, cachedPhysicalHeadCommitIndicesBySHA1, that’s maintained solely for this purpose. To determine where the unchanged tail of the new history starts, Retcon needs to walk commits one by one, starting with the head, checking whether each is already contained in the cache. Such a lookup would be very slow if done naively (“for each new commit, iterate over the current cache to see if it’s in there”), so instead we’re using this map, which makes the lookup vastly faster.
[3] The text layout process, handled by Apple’s TextKit 2, also accounts for varying typographic bounds. While lines of only ASCII characters always have the same height, adding a Chinese character or an emoji will use a different font, often making the line taller. So even if we disabled line wrapping, calculating a hunk’s total height would be more complicated than just hunkLinebreakCount × lineHeight.
[4] Retcon’s diff view implements the table view delegate method tableView(_:heightOfRow:), which expects an exact height. In UIKit, tableView(_:estimatedHeightForRowAt:) exists precisely for communicating estimates like ours, but AppKit has no equivalent.