Complex web applications have complex state management solutions. And managing those solutions - and especially their interactions with each other - is hard! Here at Causal, we were running into this problem more and more as our frontend grew and sources of truth proliferated. So we decided to build a solution from the ground up that allows reading from and subscribing to multiple stores with one simple interface. In this blog post, we’ll explain the problem we set out to solve, describe our solution, and look under the hood at how we’ve optimized it to be better than useSelector.

There are countless possible sources of truth for a web application: the URL, local storage, a Redux store, Apollo Client, Zustand, Recoil, and many many more. Different types of data make sense for different stores. For example, the ID of the document a user is viewing probably belongs in the URL, but the user’s auth token does not. A list of the user’s folders fetched with GraphQL probably belongs in the Apollo Client, but (for the most part) ephemeral local state does not ¹.

At Causal, we have five(!) different stores holding pieces of our application’s state. Of course, we didn’t set out to use as many stores as possible! Rather, we introduced each store when none of the existing stores could do what we wanted. While most apps probably do not need five stores, every web app will need at least two stores ²:

The URL
Something to hold complex values (anything that doesn’t make sense in the URL). Redux is the most popular and generic option, but there are many others.

Having an application’s state spread across multiple sources of truth makes life harder when trying to use that data. One possible solution is syncing multiple stores, for example by copying changes from the URL into the Redux store. This approach will always be bug-prone, because there is no longer a single source of truth. In this blog post, we’ll introduce a better solution to this problem, which we call Causal Selectors.

Causal Selectors allow easily accessing and deriving state from multiple different stores. They can be used in React or outside of React, but when used in a React component they do not cause any unnecessary rerenders. And they are simple to write and compose.

This blog post will explain how Causal Selectors work, and how we optimized their performance. But first, we’ll lay the conceptual groundwork for how they work and why they are designed that way.

Selectors

The first key concept is a selector: a function that takes the entire state, and returns a value derived from that state. Selectors were popularized by Redux, but the concept is not specific to Redux. In TypeScript, a selector can be represented as:

type Selector<TState, TResult> = (state: TState) => TResult;

Selectors are useful because they expose a canonical way of fetching a single piece of state. This abstracts the actual state shape, allowing for more structured and maintainable data access. For example, if we have a selector like this:

const selectUser = (state: AppState): User => state.user;

Now any callers can get the user with selectUser(appState). And if we change the AppState shape later, we only need to update the selector - everywhere that uses it will still work.

Selectors are also useful because they are easily composable. For Redux selectors, this is typically done with reselect, but again the concept is not specific to Redux.

Here we’ll use an example relevant to Causal. In Causal, users create models to work with their data, forecasts, and visualizations. When a user logs in, we load the list of models they have access to. Some may be their own models, and some may be shared with them. So let’s imagine we want to fetch the models a user actually created. We could do it as follows:

import { createSelector } from "reselect";

const selectAllModels = (state: AppState): Model[] => state.models;

const selectUserCreatedModels = createSelector(
  selectUser,
  selectAllModels,
  (user, models) => models.filter(model => model.creator_id === user.id)
);

See how we’ve used the selectUser and selectAllModels selectors as inputs to selectUserCreatedModels - this is the beauty of composition! Each selector has one responsibility and can focus on doing that well, using the outputs of other selectors as necessary instead of reinventing the wheel each time.

A selector library like reselect also supports memoization of the result, such that if each input to the selector is strict-equals to the previous input, the function to combine the inputs is skipped and the previous output is returned. Not only is this faster, but returning the same memory references works well with React memoization semantics. This is taken even further with re-reselect which allows caching of an arbitrary number of computed values, with a cache key derived from the inputs.
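The same-inputs-same-output memoization described above can be sketched without any library. This is a minimal illustration under stated assumptions, not how reselect is actually implemented: `createMemoizedSelector` and the sample state shape are hypothetical names introduced only for this example.

```typescript
type Selector<TState, TResult> = (state: TState) => TResult;

// Hypothetical helper illustrating the idea; reselect's real implementation differs.
function createMemoizedSelector<TState, A, B, TResult>(
  selectA: Selector<TState, A>,
  selectB: Selector<TState, B>,
  combine: (a: A, b: B) => TResult,
): Selector<TState, TResult> {
  // Holds the inputs and output of the most recent call, if any.
  let cache: { a: A; b: B; result: TResult } | undefined;
  return (state) => {
    const a = selectA(state);
    const b = selectB(state);
    // Same inputs by strict equality: skip `combine` and reuse the old reference.
    if (cache !== undefined && cache.a === a && cache.b === b) {
      return cache.result;
    }
    cache = { a, b, result: combine(a, b) };
    return cache.result;
  };
}

// Sample state shape, assumed for this example only.
interface User { id: string }
interface Model { id: string; creator_id: string }
interface AppState { user: User; models: Model[]; sidebarOpen: boolean }

let combineCalls = 0;
const selectUserCreatedModels = createMemoizedSelector(
  (state: AppState) => state.user,
  (state: AppState) => state.models,
  (user, models) => {
    combineCalls++;
    return models.filter((model) => model.creator_id === user.id);
  },
);

const state1: AppState = {
  user: { id: "u1" },
  models: [
    { id: "m1", creator_id: "u1" },
    { id: "m2", creator_id: "u2" },
  ],
  sidebarOpen: false,
};
// An unrelated field changes; `user` and `models` keep their references.
const state2: AppState = { ...state1, sidebarOpen: true };

const first = selectUserCreatedModels(state1);
const second = selectUserCreatedModels(state2);
```

Because the second call sees the same `user` and `models` references, it returns the very same array as the first call, so anything comparing results by reference sees no change.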

Finally, selectors tend to be defined globally. This means that this memoization can be shared across multiple consumers (e.g. multiple React component instances).

What is a Causal Selector?

Causal Selectors are selectors that allow pulling data from any store, combining and manipulating that data into an output, and subscribing to changes in that output as the data from the underlying stores changes.

There are two types of Causal Selectors:

Leaf selectors: for each type of store, there is a leaf selector which can pull data from that store
Combining selectors: much like createSelector in the example above, a combining selector takes one or more other Causal Selectors (of any type) as its inputs, and combines those inputs into its own output.

The combining selectors implement the same-inputs-same-output memoization described above.

How are Causal Selectors created?

The best way to see how Causal Selectors come together is to see an example. This is a real example from our source code!

This is an instructive example because it shows how we need to read from multiple different stores just to derive one commonly used value. Here, we’re trying to fetch a single model. But to do that, we need:

The underlying model, as it exists on the server (selectModelWithoutDeltas)
The changes the user has made to the model locally, which have not yet been saved (selectModelDeltas)
The “view” the user is looking at - basically a saved filter (selectView)

To get each of those three values, we may need even more details. For example, selectView needs to know which view the user is looking at, which is stored as a view ID in the URL and read by selectViewId.

By composing these selectors, each one can focus on its responsibility only. For example, selectModel does not care that views are stored in Apollo Client and the view ID is stored in the URL - it just uses the value selectView gives it.

Here’s what it looks like in code:

const selectModel = createCausalSelector(
  selectModelsWithoutDeltas,
  selectModelDeltas,
  selectView,
  (models, deltas, views) => { ... }
);

How do Causal Selectors integrate with React?

So far we’ve only been looking at one piece of the puzzle: computing a value from state across multiple different stores. But for a Causal Selector to be useful in a React app, it also needs to be able to rerender a component when that value changes. Each of the stores we’ve mentioned so far has a hook to do this. For example:

// Redux

import { useSelector } from "react-redux";

const user = useSelector(selectUser);

// Apollo

import { useQuery } from "@apollo/client";

const views = useQuery(ViewsDocument);

// URL (in Next.js)

import { useRouter } from "next/router";

const viewId = useRouter().query.view;

We created a similar hook:

const model = useCausalSelector(selectModel)

To do this, we needed a low-level React API: useSyncExternalStore. Its (simplified) type signature is:

function useSyncExternalStore<Snapshot>(
    subscribe: (onStoreChange: () => void) => () => void,
    getSnapshot: () => Snapshot,
): Snapshot;

To explain each argument a little further:

subscribe expects a function which takes a callback to be called when the store changes, and returns a callback to be called when React wants to unsubscribe from the store’s updates
getSnapshot expects a function which returns the current value from the external store

We won’t dive into the gory details of how useCausalSelector is implemented here, but the key is that it uses useSyncExternalStore.

As for how we implement subscribe: because Causal Selectors are composed from other Causal Selectors, we can subscribe to a single Causal Selector’s changes simply by subscribing to all its children. As for leaf selectors, they use the mechanism specific to the store they’re hooked up to. For example, here is the leaf Redux selector (somewhat simplified to avoid unnecessary details):

function createCausalReduxSelector<T>(
  selector: (state: CausalReduxState) => T
): CausalSelector<T> {
  return ({ redux }) => ({
    getSnapshot: () => selector(redux.getState()),
    subscribeCallbacks: () => [onStoreChanged => redux.subscribe(onStoreChanged)],
  });
}
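For combining selectors, "subscribing to all its children" can be sketched roughly as follows. This is an illustration under assumptions rather than the production code: `MiniStore`, `createMiniStore`, `createLeafSelector`, `createNaiveCausalSelector` and `subscribeTo` are hypothetical names, the `CausalSelector` type takes an extra state parameter so the example is self-contained, and memoization is left out for brevity.

```typescript
type Unsubscribe = () => void;
type SubscribeCallback = (onStoreChanged: () => void) => Unsubscribe;

// Minimal stand-in for a Redux-like store (hypothetical, not the Redux API).
interface MiniStore<S> {
  getState(): S;
  setState(next: S): void;
  subscribe(listener: () => void): Unsubscribe;
}

function createMiniStore<S>(initial: S): MiniStore<S> {
  let state = initial;
  // Entries are wrapped so the same function can be subscribed more than once.
  const entries = new Set<{ listener: () => void }>();
  return {
    getState: () => state,
    setState: (next) => {
      state = next;
      entries.forEach((entry) => entry.listener()); // notify every subscriber
    },
    subscribe: (listener) => {
      const entry = { listener };
      entries.add(entry);
      return () => {
        entries.delete(entry);
      };
    },
  };
}

interface BoundSelector<T> {
  getSnapshot: () => T;
  subscribeCallbacks: () => SubscribeCallback[];
}
type CausalSelector<S, T> = (stores: { redux: MiniStore<S> }) => BoundSelector<T>;

function createLeafSelector<S, T>(selector: (state: S) => T): CausalSelector<S, T> {
  return ({ redux }) => ({
    getSnapshot: () => selector(redux.getState()),
    subscribeCallbacks: () => [(onStoreChanged) => redux.subscribe(onStoreChanged)],
  });
}

// Naive combining selector: its subscriptions are the concatenation of its
// children's subscriptions, and every snapshot re-reads both children.
function createNaiveCausalSelector<S, A, B, T>(
  selectorA: CausalSelector<S, A>,
  selectorB: CausalSelector<S, B>,
  combiner: (a: A, b: B) => T,
): CausalSelector<S, T> {
  return (stores) => {
    const a = selectorA(stores);
    const b = selectorB(stores);
    return {
      getSnapshot: () => combiner(a.getSnapshot(), b.getSnapshot()),
      subscribeCallbacks: () => [...a.subscribeCallbacks(), ...b.subscribeCallbacks()],
    };
  };
}

// Subscribing to a selector means invoking every callback it exposes.
function subscribeTo<T>(bound: BoundSelector<T>, onChange: () => void): Unsubscribe {
  const unsubscribes = bound.subscribeCallbacks().map((callback) => callback(onChange));
  return () => unsubscribes.forEach((unsubscribe) => unsubscribe());
}

const store = createMiniStore({ a: 1, b: 2 });
const selectSum = createNaiveCausalSelector(
  createLeafSelector((state: { a: number; b: number }) => state.a),
  createLeafSelector((state: { a: number; b: number }) => state.b),
  (a, b) => a + b,
);
const boundSum = selectSum({ redux: store });

let notifications = 0;
const unsubscribeSum = subscribeTo(boundSum, () => {
  notifications++;
});
store.setState({ a: 5, b: 2 }); // one store update, one notification per leaf
```

Note that a single store update notifies the subscriber once per leaf selector in this sketch, which is what makes the naive approach expensive as trees grow.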

Initial results

While this change made our React code much easier to reason about and to write, it caused our Interaction to Next Paint (INP) metric to regress significantly. Before the rollout, 50% ³ of sessions were in the good bucket; afterwards, only 30% were.

Performance optimisations

In order to understand the performance optimisations we made, first we need to explain why the naive approach was slow.

One important detail is on how React’s useSyncExternalStore() works. Very briefly, the logic is:

The onStoreChange argument to the subscribe() is called by a store, notifying the useSyncExternalStore() implementation that the external state may have changed.
useSyncExternalStore() calls getSnapshot() and compares that against the previous output using strict equality
If the new output has changed, the component/hook that called useSyncExternalStore() is queued for a rerender.
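The three steps above can be modelled in a few lines. This is a toy model and not React's implementation: `modelUseSyncExternalStore` and `queueRerender` are hypothetical stand-ins, and the hand-rolled store exists only for the example. React compares snapshots with `Object.is`, which behaves like strict equality except for `NaN` and signed zeros, so the sketch does the same.

```typescript
// Toy model of the three steps above. This is not React's implementation:
// `queueRerender` is a hypothetical stand-in for React scheduling a render.
function modelUseSyncExternalStore<Snapshot>(
  subscribe: (onStoreChange: () => void) => () => void,
  getSnapshot: () => Snapshot,
  queueRerender: (snapshot: Snapshot) => void,
): () => void {
  let previous = getSnapshot();
  // Step 1: the store calls this listener whenever it may have changed.
  return subscribe(() => {
    const next = getSnapshot(); // Step 2: take a fresh snapshot...
    if (Object.is(next, previous)) return; // ...and compare it with the last one.
    previous = next;
    queueRerender(next); // Step 3: the output changed, so queue a rerender.
  });
}

// A tiny hand-rolled store, assumed for this example only.
let counterState = { count: 0, unrelated: 0 };
const counterListeners = new Set<() => void>();
const subscribeToCounter = (onStoreChange: () => void) => {
  counterListeners.add(onStoreChange);
  return () => {
    counterListeners.delete(onStoreChange);
  };
};
const updateCounter = (patch: Partial<typeof counterState>): void => {
  counterState = { ...counterState, ...patch };
  counterListeners.forEach((listener) => listener());
};

let snapshotCalls = 0;
const queuedRerenders: number[] = [];
const stopListening = modelUseSyncExternalStore(
  subscribeToCounter,
  () => {
    snapshotCalls++;
    return counterState.count;
  },
  (count) => {
    queuedRerenders.push(count);
  },
);

updateCounter({ unrelated: 1 }); // getSnapshot() runs, but no rerender is queued
updateCounter({ count: 1 }); // getSnapshot() runs and a rerender is queued
```

The important detail for what follows is that `getSnapshot()` runs on every notification, whether or not the output ends up changing.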

It is also worth understanding how Redux store notifications work. Redux was the biggest culprit in our performance issues because a) we keep most of our state in Redux, b) that state changes more often than the state in our other stores, and c) Redux has a fairly blunt approach to notifications: every Redux action triggers a notification on every subscriber.

Finally, recall that our Causal Selector subscription mechanism involves composing subscription callbacks from the leaf selectors (which interface directly with their corresponding store), and merging them up the selector tree.

OK, with that background, we’re ready to step through an example with the naive implementation! Say we have a simple selector composed like so:

Say we have wired up this selector into our component:

function CausalApp() {
  const importantValue = useCausalSelector(rootSelector);
}

Now imagine that a Redux action is emitted, and that the new Redux state will result in different selector outputs for reduxSelector2 and reduxSelector4. Let’s step through the resulting flow:

1. Redux starts iterating through its subscribed listeners and calling onStoreChange() on each.
2. reduxSelector1 bubbles up its notification to the useSyncExternalStore() instance corresponding to the <CausalApp /> component
3. useSyncExternalStore() calls getSnapshot() on the selector. This involves a full traversal of the selector tree. (Note that the same-input-same-output memoization technique doesn’t prevent this traversal - we still need to traverse all the way down to the leaf nodes to extract the inputs from the store, and then traverse back up to the root as we combine the selector outputs at each level. However, it may mean that we skip some computation on our way back up the tree, returning a memoized value rather than calculating a new one).
4. reduxSelector1 didn’t change, so useSyncExternalStore() does nothing
5. Redux continues iterating through its subscribed listeners
6. Repeat steps 2-5 for reduxSelector2. Because this selector has changed output, the overall output from rootSelector has also changed, so useSyncExternalStore() queues a rerender.
7. Repeat for reduxSelector3 and reduxSelector4. Since the latter has also changed outputs, useSyncExternalStore() queues another rerender.
8. React is smart enough to dedupe queued renders for the same component, so <CausalApp /> rerenders once.

Phew! That was a lot of work, and most of it was unnecessary. The lowlights:

there is a full tree traversal for each leaf node. For a tree of size n the number of leaves is O(n), so the whole operation is O(n^2) (and selector trees can get much larger than the 8 nodes shown in this example).
everything after the reduxSelector2 update was unnecessary, because a rerender was already scheduled for <CausalApp />
for larger selector trees, the number of rerenders queued by React can get so large that it triggers the batch to be processed, and another batch started. This new batch can then take new queued updates of the same component, meaning that React's deduping no longer applies, and we end up rerendering the same component multiple times for the same state update.
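To put rough numbers on the first point, the sketch below counts selector visits per Redux action under the naive scheme. The tree shape is an assumption made for illustration (the selector names come from the walkthrough, but how they are wired together here is a guess), and `SelectorNode`, `naiveVisitsPerAction` and `balancedTree` are hypothetical helpers.

```typescript
// Hypothetical tree shape, assumed for illustration.
interface SelectorNode {
  name: string;
  children: SelectorNode[];
}
const leafNode = (name: string): SelectorNode => ({ name, children: [] });

const exampleTree: SelectorNode = {
  name: "rootSelector",
  children: [
    {
      name: "internalSelector1",
      children: [leafNode("reduxSelector1"), leafNode("reduxSelector2")],
    },
    {
      name: "internalSelector2",
      children: [leafNode("reduxSelector3"), leafNode("reduxSelector4")],
    },
  ],
};

const countNodes = (node: SelectorNode): number =>
  1 + node.children.reduce((sum, child) => sum + countNodes(child), 0);

const countLeaves = (node: SelectorNode): number =>
  node.children.length === 0
    ? 1
    : node.children.reduce((sum, child) => sum + countLeaves(child), 0);

// Naive scheme: every leaf notification triggers one getSnapshot() on the
// root, and each getSnapshot() visits every node in the tree.
const naiveVisitsPerAction = (root: SelectorNode): number =>
  countLeaves(root) * countNodes(root);

// A perfectly balanced binary tree with `depth` levels below the root.
const balancedTree = (depth: number, name = "s"): SelectorNode => ({
  name,
  children:
    depth === 0
      ? []
      : [balancedTree(depth - 1, name + "0"), balancedTree(depth - 1, name + "1")],
});

const smallTreeVisits = naiveVisitsPerAction(exampleTree); // 4 leaves * 7 nodes
const largerTreeVisits = naiveVisitsPerAction(balancedTree(3)); // 8 leaves * 15 nodes
```

Under these assumptions, doubling the number of leaves roughly quadruples the visits per action, which is the quadratic growth described above.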

For the record, we had made things worse than just using reselect selectors and react-redux's useSelector(), even for a Causal Selector with only Redux leaf nodes. This is because useSelector() registers the root selector directly with the Redux store, while our approach meant that every leaf selector was registered with the store, and the notifications coming from each leaf selector were hard to deduplicate for a given root selector. So while useSelector() would only get notified once per Redux update, our selectors got notified once per leaf selector. (Note that useSelector() still needs to traverse the entire selector tree to build up the result, even if it hasn’t changed).

One approach to fixing this would have been to emulate useSelector() and have the root selector manage the subscriptions entirely. Something like this:

enum StoreType { Redux, Router, Apollo, ... }

export function createCausalSelector<A, B, T>(
  selectorA: CausalSelector<A>,
  selectorB: CausalSelector<B>,
  combiner: (a: A, b: B) => T,
): CausalSelector<T> {
  // Some function that recursively extracts the store types of the
  // leaf selectors used by this selector.
  // Note that this is only called once, at selector definition time
  // i.e. when the page first loads.
  const storeTypes: Set<StoreType> = extractStoreTypes(selectorA, selectorB);
  return stores => ({
    getSnapshot: ...
    subscribeCallbacks: () => [...storeTypes].map(storeType =>
      subscribeToStore(storeType, stores),
    ),
  });
}

// Returns a subscription callback for the given `storeType`. A subscriber
// calls it with an update notification function, which will be called when
// the store updates; the callback returns an unsubscribe function.
function subscribeToStore(
  storeType: StoreType,
  stores: Stores,
): (onStoreChanged: () => void) => () => void {
  if (storeType === StoreType.Redux) {
    return onStoreChanged => stores.redux.subscribe(onStoreChanged);
  }
  // handle other store types...
}

Something like this would have worked, with performance equivalent to useSelector(). But as we thought about it, we decided we could do better!

The first step was a simple optimisation: only propagate a subscription update notification if the selector output has actually changed. Essentially, this:

function createCausalReduxSelector<T>(
  selector: (state: CausalReduxState) => T
): CausalSelector<T> {
  return ({ redux }) => ({
    getSnapshot: () => selector(redux.getState()),
    subscribeCallbacks:
      () => [onStoreChanged => redux.subscribe(onStoreChanged)],
  });
}

becomes

function createCausalReduxSelector<T>(
  selector: (state: CausalReduxState) => T,
): CausalSelector<T> {
  return ({ redux }) => {
    let previousOutput: T | undefined;
    return {
      getSnapshot: () => selector(redux.getState()),
      subscribeCallbacks: () => [
        onStoreChanged =>
          redux.subscribe(() => {
            const nextOutput = selector(redux.getState());
            if (nextOutput === previousOutput) return; // short-circuit
            previousOutput = nextOutput;
            onStoreChanged();
          }),
      ],
    };
  };
}

Note that this works best if the Redux selectors are very simple state accessor functions; expensive logic should be avoided in these selectors.

This immediately halves the amount of work done in our example above - the notifications for reduxSelector1 and reduxSelector3 don’t propagate, because the outputs of those selectors don’t change. But we can do better!

The next step was to apply this optimisation to all selector types, not just the leaf selectors. It is possible for a selector’s output to remain the same even when its inputs change (a common case is when one of the inputs is a map and the other is a key; if the map has updated because some unrelated value has changed, then the output from the selector will be the same).
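The map-and-key case is easy to see in isolation. In this small sketch, `Model`, `ModelsById` and `pickModel` are hypothetical names chosen for the example.

```typescript
interface Model {
  id: string;
  name: string;
}
type ModelsById = Readonly<Record<string, Model>>;

// A combiner whose inputs are a map and a key.
const pickModel = (models: ModelsById, id: string): Model | undefined => models[id];

const revenueModel: Model = { id: "m1", name: "Revenue" };
const costsModel: Model = { id: "m2", name: "Costs" };
const modelsBefore: ModelsById = { m1: revenueModel, m2: costsModel };

// An unrelated entry is updated immutably: the map is a new object, but the
// `m1` entry keeps its reference.
const modelsAfter: ModelsById = {
  ...modelsBefore,
  m2: { ...costsModel, name: "Operating costs" },
};

const inputChanged = modelsBefore !== modelsAfter;
const outputChanged = pickModel(modelsBefore, "m1") !== pickModel(modelsAfter, "m1");
```

Here the input changes but the output does not, so a combining selector that compares its new output with its previous one can stop the notification from propagating any further.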

Now that we have a cached previousOutput on every selector type, we can use it when we call getSnapshot(), avoiding the additional tree traversal! Continuing the example with the Redux selector, we now have:

function createCausalReduxSelector<T>(
  selector: (state: CausalReduxState) => T,
): CausalSelector<T> {
  return ({ redux }) => {
    let previousOutput: T | undefined;
    return {
      getSnapshot: () => previousOutput ?? selector(redux.getState()),
      subscribeCallbacks: () => [
        onStoreChanged =>
          redux.subscribe(() => {
            const nextOutput = selector(redux.getState());
            if (nextOutput === previousOutput) return; // short-circuit
            previousOutput = nextOutput;
            onStoreChanged();
          }),
      ],
    };
  };
}

Note that the real code has some extra complexity around properly handling undefined outputs (which are valid), and ensuring that the previousOutput is only returned from getSnapshot() when the selector actually has subscribers (because otherwise we can’t be sure that the cached previousOutput is up-to-date). But this is the core idea.
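One way to handle those two details is sketched below. It is an assumption about how this could be done, not the real code: `createCachedLeaf` and `MiniStore` are hypothetical, the cache is wrapped in an object so that a cached `undefined` is distinguishable from "nothing cached", and the cache is only trusted while at least one subscriber keeps it fresh.

```typescript
interface MiniStore<S> {
  getState(): S;
  subscribe(listener: () => void): () => void;
}

// Hypothetical sketch: a leaf that caches its output, treats `undefined` as a
// valid value, and only trusts the cache while it has subscribers.
function createCachedLeaf<S, T>(store: MiniStore<S>, selector: (state: S) => T) {
  // The wrapper object distinguishes "nothing cached" from "cached undefined".
  let cache: { value: T } | undefined;
  const subscribers = new Set<() => void>();
  let unsubscribeFromStore: (() => void) | undefined;

  const onStoreUpdate = (): void => {
    const next = selector(store.getState());
    if (cache !== undefined && cache.value === next) return; // short-circuit
    cache = { value: next };
    subscribers.forEach((notify) => notify());
  };

  return {
    getSnapshot: (): T =>
      // Without subscribers nothing keeps the cache fresh, so recompute.
      cache !== undefined && subscribers.size > 0
        ? cache.value
        : selector(store.getState()),
    subscribe: (onStoreChanged: () => void): (() => void) => {
      if (subscribers.size === 0) {
        cache = { value: selector(store.getState()) }; // prime the cache
        unsubscribeFromStore = store.subscribe(onStoreUpdate);
      }
      subscribers.add(onStoreChanged);
      return () => {
        subscribers.delete(onStoreChanged);
        if (subscribers.size === 0 && unsubscribeFromStore !== undefined) {
          unsubscribeFromStore();
          unsubscribeFromStore = undefined;
          cache = undefined; // drop the cache once nobody keeps it fresh
        }
      };
    },
  };
}

// A tiny hand-rolled store, assumed for this example only.
type UiState = { selectedId: string | undefined; tick: number };
let uiState: UiState = { selectedId: undefined, tick: 0 };
const uiListeners = new Set<() => void>();
const uiStore: MiniStore<UiState> = {
  getState: () => uiState,
  subscribe: (listener) => {
    uiListeners.add(listener);
    return () => {
      uiListeners.delete(listener);
    };
  },
};
const setUiState = (next: UiState): void => {
  uiState = next;
  uiListeners.forEach((listener) => listener());
};

let selectorCalls = 0;
const selectedIdLeaf = createCachedLeaf(uiStore, (state) => {
  selectorCalls++;
  return state.selectedId;
});

let leafNotifications = 0;
const unsubscribeLeaf = selectedIdLeaf.subscribe(() => {
  leafNotifications++;
});
const cachedValue = selectedIdLeaf.getSnapshot(); // served from the cache
```

With a subscriber attached, `getSnapshot()` returns the cached `undefined` without calling the selector again; once the last subscriber leaves, it falls back to reading the store directly.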

This is much more efficient, but we need to be careful. Going back to our <CausalApp /> example, let’s say that another Redux action is fired that updates reduxSelector2 and reduxSelector4 again. In this example the component has already gone through several state update cycles, so the previousOutput cache at each level of the selector tree is populated.

Redux starts iterating through its subscribed listeners and calling onStoreChange() on each.
reduxSelector1 short-circuits because its output hasn’t changed (yay!)
reduxSelector2 does not short circuit, and propagates the notification up the tree
internalSelector1 checks if its snapshot has changed, sees that it has, and propagates
rootSelector does the same. In doing this it takes the previousOutput from internalSelector1, which was just updated and is up-to-date, and the previousOutput from internalSelector2, which hasn’t yet updated and is stale. This is because reduxSelector4 hasn’t been notified of the change yet, so hasn’t had a chance to update its cached value
rootSelector incorrectly uses this stale cached value and calculates an output based on inconsistent state, a mix of old and new state that never actually existed.
React queues a rerender using this inconsistent result. This is bad!

To fix this, we need to ensure that the selectors are processed in the correct order when determining whether to propagate an update, from leaves to root. An example valid ordering that would avoid the above bug would be [reduxSelector1, reduxSelector2, reduxSelector3, reduxSelector4, internalSelector1, internalSelector2, rootSelector]. This is of course a topological sort, which has the property that by the time we calculate a given selector’s snapshot, we know that all of its inputs are already up-to-date.
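A topological order like this can be computed with a depth-first, post-order walk over the dependency graph. The sketch below is a generic illustration rather than the production algorithm: `topologicalOrder` is a hypothetical helper, and the wiring of the example graph is an assumption based on the selector names used in the walkthrough.

```typescript
// Hypothetical helper: `inputsOf` maps each selector name to the names of the
// selectors it reads from.
function topologicalOrder(inputsOf: Map<string, string[]>): string[] {
  const order: string[] = [];
  const visitState = new Map<string, "visiting" | "done">();

  const visit = (name: string): void => {
    const current = visitState.get(name);
    if (current === "done") return;
    if (current === "visiting") throw new Error(`Cycle detected at ${name}`);
    visitState.set(name, "visiting");
    // Post-order: emit every input before the selector that depends on it.
    (inputsOf.get(name) ?? []).forEach((input) => visit(input));
    visitState.set(name, "done");
    order.push(name);
  };

  inputsOf.forEach((_inputs, name) => visit(name));
  return order;
}

// Assumed wiring of the example selectors.
const selectorInputs = new Map<string, string[]>([
  ["rootSelector", ["internalSelector1", "internalSelector2"]],
  ["internalSelector1", ["reduxSelector1", "reduxSelector2"]],
  ["internalSelector2", ["reduxSelector3", "reduxSelector4"]],
  ["reduxSelector1", []],
  ["reduxSelector2", []],
  ["reduxSelector3", []],
  ["reduxSelector4", []],
]);

const processingOrder = topologicalOrder(selectorInputs);
```

The resulting order need not match the one listed above exactly, since a graph usually has several valid topological orders; what matters is that every selector appears after all of its inputs.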

Unfortunately this isn’t trivial to do with Redux, because we don’t control the notification order - Redux is simply iterating over the listeners in whatever order they subscribed to the store. So the next step was to take control of the subscriptions to Redux. Rather than having every Redux selector subscribe to the Redux store individually, we created our own subscription manager that subscribed to Redux, and the selectors subscribe to the manager. So we have at most one subscription to Redux coming from these selectors. Now when there is a Redux update, Redux just notifies the manager, and the manager can perform a topological sort and process selectors in the correct order.

The full benefits of this approach can be seen when we use a more realistic example: a given React component may be subscribed to multiple selectors; a given selector may be subscribed to by multiple React components; and a given selector may appear many times in a given subscriber's dependencies (we have talked about “selector trees” so far, but more accurately they form a DAG):

When processing a Redux update, we actually process this full DAG consisting of all the selectors and the React components subscribed to them, applying the topological sort to the whole graph. This has several nice results:

each selector in the DAG is called at most once, no matter how many of its inputs change
each React component is queued for rerender at most once, no matter how many selectors it is subscribed to
we maintain the earlier optimisation that by the time we know we need to use a selector result (e.g. for rendering), the value is already cached and available in constant time
we also maintain the pruning of DAG branches that have no updates
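The properties above can be seen in a compact sketch of such a manager. Everything here is hypothetical (`createSubscriptionManager`, `SelectorNode`, `MiniStore`, and the untyped `unknown` values), unsubscription is omitted, and the real implementation will differ; the sketch only shows one store subscription driving an ordered, pruned pass over the DAG.

```typescript
interface MiniStore<S> {
  getState(): S;
  subscribe(listener: () => void): () => void;
}

// Untyped values keep the sketch short; real selectors would be generic.
interface SelectorNode<S> {
  name: string;
  inputs: SelectorNode<S>[];
  // Leaves read `state`; combining selectors read `inputValues`.
  compute: (state: S, inputValues: unknown[]) => unknown;
}

function createSubscriptionManager<S>(store: MiniStore<S>) {
  const order: SelectorNode<S>[] = []; // topological order: inputs come first
  const cache = new Map<SelectorNode<S>, unknown>();
  const components = new Map<() => void, SelectorNode<S>[]>();

  const evaluate = (node: SelectorNode<S>): unknown =>
    node.compute(store.getState(), node.inputs.map((input) => cache.get(input)));

  // Registering inputs before the node itself (post-order) keeps `order` sorted.
  const register = (node: SelectorNode<S>): void => {
    if (cache.has(node)) return;
    node.inputs.forEach((input) => register(input));
    cache.set(node, evaluate(node));
    order.push(node);
  };

  // The manager holds the only subscription to the store.
  store.subscribe(() => {
    const changed = new Set<SelectorNode<S>>();
    for (const node of order) {
      const isLeaf = node.inputs.length === 0;
      // Prune: a combining selector whose inputs are all unchanged is skipped.
      if (!isLeaf && !node.inputs.some((input) => changed.has(input))) continue;
      const next = evaluate(node); // runs at most once per node per update
      if (next === cache.get(node)) continue;
      cache.set(node, next);
      changed.add(node);
    }
    // Each component is notified at most once, however many selectors changed.
    components.forEach((selectors, rerender) => {
      if (selectors.some((selector) => changed.has(selector))) rerender();
    });
  });

  return {
    subscribe(rerender: () => void, selectors: SelectorNode<S>[]): void {
      selectors.forEach((selector) => register(selector));
      components.set(rerender, selectors);
    },
    getSnapshot: (node: SelectorNode<S>): unknown => cache.get(node), // constant time
  };
}

// Example wiring, assumed for illustration: two leaves and two combining selectors.
type Counts = { a: number; b: number; unrelated: number };
let counts: Counts = { a: 1, b: 2, unrelated: 0 };
const countListeners = new Set<() => void>();
const countStore: MiniStore<Counts> = {
  getState: () => counts,
  subscribe: (listener) => {
    countListeners.add(listener);
    return () => {
      countListeners.delete(listener);
    };
  },
};
const dispatchCounts = (next: Counts): void => {
  counts = next;
  countListeners.forEach((listener) => listener());
};

const evaluations: string[] = [];
const tracked = (
  name: string,
  inputs: SelectorNode<Counts>[],
  compute: SelectorNode<Counts>["compute"],
): SelectorNode<Counts> => ({
  name,
  inputs,
  compute: (state, values) => {
    evaluations.push(name);
    return compute(state, values);
  },
});
const selectA = tracked("selectA", [], (state) => state.a);
const selectB = tracked("selectB", [], (state) => state.b);
const selectTotal = tracked("selectTotal", [selectA, selectB], (_state, [a, b]) => (a as number) + (b as number));
const selectDouble = tracked("selectDouble", [selectTotal], (_state, [total]) => (total as number) * 2);

const manager = createSubscriptionManager(countStore);
let componentRerenders = 0;
manager.subscribe(() => {
  componentRerenders++;
}, [selectTotal, selectDouble]);

evaluations.length = 0; // ignore the evaluations done while registering
dispatchCounts({ ...counts, a: 10 });
```

In this run the component is subscribed to two selectors that both change, yet it is asked to rerender only once, and each selector is evaluated exactly once.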

The main caveat worth mentioning here is that these subscription managers, and the DAGs they build and update, exist per store. So far we have been discussing Redux, but we have applied the same optimizations to the other store types. However, update notifications for different stores are emitted separately (i.e. there’s no way a notification that the URL has changed could also include a notification that the Redux state has changed). In practice this hasn’t mattered too much: because we support these different sources of truth in a first-class way, the stores tend to change independently of each other, rather than all at once.

And that’s it! One benefit of the Causal Selector framework: none of these optimizations required changing any existing selectors. The API stayed unchanged, even as we cut the amount of work done under the hood tenfold in many cases.

After all these optimizations, our INP numbers are back to their original levels - but our state management is now much easier to use!

Next steps

We still have some components that use the old way of accessing data (useQuery, …). It’ll be interesting to see what the overall performance looks like once the migration is complete.

We’re also working on caching the topological sort, which should speed things up as well.

Our investigations here highlighted that Redux, despite its widespread adoption, doesn’t scale well for large and complicated apps. The more state you have in Redux, the more likely you are to emit an action to update that state, and the more components you have subscribed to that state. The fact that every subscriber gets notified for every action means that there is a quadratic-ish relationship here, fundamentally limiting the scalability. We will be investigating moving state out of Redux into stores that handle this better (in particular, we intend to move state that comes from backend queries to GraphQL and use the Apollo Client to manage that state; Apollo handles subscriptions to subsections of state much better than Redux).

We’re always on the lookout for other optimisations and spend a fair amount of time in the Chrome/React profiler. Performance is extremely important for a productivity tool like Causal given that some of our users are active for multiple hours every day.

We also have various performance challenges to solve on the backend where we’ve built a calculation engine that can handle 100s of millions of cells. If you're interested in working on challenges like this with us, you should consider joining the Causal team — check out our careers page or email lukas@causal.app!

Thanks to Andrew Churchill and Tom McIntyre who did most of the engineering work and wrote most of the post.

¹ Yes it’s possible, no it’s not fun.
² There are ways to get URL state into Redux but it doesn’t work with Next.js Router.
³ We’re not proud of this number, if you’re interested in frontend performance, we’re hiring! Causal is a data-heavy application so improving INP further is a difficult task.

re-re-reselect — Simplifying React state management

Nov 6, 2023

Causal
