State Management in Python and Jupyter

Implementing signal-based state management in Jupyter by mistake.
Apr 10, 2020

Intro

I work at a cool company named Imubit. This company creates AI solutions for refineries.

At the beginning of the year, we decided to form a team to build internal data science and data investigation tools. After a lot of discussion, we landed on building those tools in JupyterLab.

My task was (and still is) to create data visualization tools that let the user query large amounts of data, modify it, and visualize it in charts and heat maps.

The problem

The applications grew bigger and bigger, and some components were used in several different applications. I needed a way to manage and share data between multiple components, across multiple applications.

The light bulb

As a frontend developer I knew what I needed right away: State management.

The idea was simple: I need a single source of truth for all components, one that holds the current user selections and caches the current query results. This source should be able to notify a component when the data changes.

Now, if I were using React, for example, I would probably go for MobX, but sadly, it does not exist for Python.

Solution

My solution was a Proxy-inspired idea: create a class that inherits from dict, and add the option to notify and trigger listeners.

Because we are dealing with Python, I did not want to use an event-based mechanism, and preferred to keep it as synchronous as I can.

This will be easier to debug and use in Python.

We start with a “proxy”-like solution:

    class StateManager(dict):
        def __init__(self, initial_state: dict):
            # The whole application state lives under a single 'state' key
            self['state'] = initial_state

            # Use getattr and setattr to block direct manipulations

From there we need to add the state management. I chose to use

    set_state(*path, value)

and

    get_state(*path)

set_state(*path, value) - iterates over the path and sets the value at that location in self['state'], creating the path if it does not exist.

get_state(*path) - iterates over the path and returns the value if it exists.
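One way these two methods could look is sketched below. This is only an illustration of the path walking described above, not the actual implementation: making `value` a keyword-only argument, the `default` parameter on `get_state`, and the example keys (`filters`, `unit`, `queries`) are all assumptions.

```python
from typing import Any


class StateManager(dict):
    """Minimal sketch: nested state stored under self['state']."""

    def __init__(self, initial_state: dict):
        super().__init__()
        self['state'] = initial_state

    def set_state(self, *path: str, value: Any) -> None:
        # Assumption: `value` is keyword-only because it follows *path.
        if not path:
            raise ValueError("set_state needs at least one path segment")
        node = self['state']
        for key in path[:-1]:
            # Create intermediate dictionaries when the path does not exist yet.
            node = node.setdefault(key, {})
        node[path[-1]] = value

    def get_state(self, *path: str, default: Any = None) -> Any:
        # Assumption: a missing path returns `default` instead of raising.
        node = self['state']
        for key in path:
            if not isinstance(node, dict) or key not in node:
                return default
            node = node[key]
        return node


# Hypothetical keys, for illustration only.
state = StateManager({'filters': {}})
state.set_state('filters', 'unit', value='crude')
state.set_state('queries', 'last', 'rows', value=120)

print(state.get_state('filters', 'unit'))   # crude
print(state.get_state('queries', 'last'))   # {'rows': 120}
print(state.get_state('missing', 'key'))    # None
```

Note that this sketch does not handle a path that runs through an existing non-dict value; a real implementation would have to decide whether to overwrite it or raise.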

Now that we have state caching, we can add the component update mechanism:

    from typing import Any, Callable

    class StateManager(dict):
        def __init__(self, initial_state: dict):
            self['state'] = initial_state
            # Callbacks keyed by the top-level state keyword they listen to
            self.registered_callbacks: dict = {}

        # ….

        def register(self, func: Callable):
            # Get the function name and subscribe to the top-level keyword with that name.
            func_name = func.__name__
            if func_name in self.registered_callbacks:
                self.registered_callbacks[func_name].append(func)
            else:
                self.registered_callbacks[func_name] = [func]

        def subscribe(self, keyword: str, func: Callable):
            # Subscribe func to the given top-level keyword.
            if keyword in self.registered_callbacks:
                self.registered_callbacks[keyword].append(func)
            else:
                self.registered_callbacks[keyword] = [func]

        def notify(self, keyword: str, event_type: str, new_value: Any, old_value: Any):
            """
            Notify listeners of a keyword, called from set_state.
            Each callback receives the following arguments:
            - event_type: the type of change (server response or user selection)
            - new value
            - old value
            """
            if keyword in self.registered_callbacks:
                for fn in self.registered_callbacks[keyword]:
                    fn(event_type, new_value, old_value)
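To show how the pieces fit together, here is a small self-contained sketch of the subscribe/notify flow. It is not the real implementation: `set_state` is simplified to a single top-level keyword, and the `'user_selection'` event type string, the `on_unit_change` callback, and the example keys are hypothetical.

```python
from typing import Any, Callable


class StateManager(dict):
    def __init__(self, initial_state: dict):
        super().__init__()
        self['state'] = initial_state
        self.registered_callbacks: dict = {}

    def subscribe(self, keyword: str, func: Callable) -> None:
        # setdefault collapses the if/else into one line.
        self.registered_callbacks.setdefault(keyword, []).append(func)

    def notify(self, keyword: str, event_type: str,
               new_value: Any, old_value: Any) -> None:
        for fn in self.registered_callbacks.get(keyword, []):
            fn(event_type, new_value, old_value)

    def set_state(self, keyword: str, value: Any,
                  event_type: str = 'user_selection') -> None:
        # Assumption: simplified to one top-level key instead of a full path.
        old_value = self['state'].get(keyword)
        self['state'][keyword] = value
        # Listeners run synchronously, before set_state returns.
        self.notify(keyword, event_type, value, old_value)


events = []


def on_unit_change(event_type, new_value, old_value):
    # Hypothetical listener: a real component would redraw itself here.
    events.append((event_type, new_value, old_value))


state = StateManager({'unit': 'crude'})
state.subscribe('unit', on_unit_change)
state.set_state('unit', 'diesel')
state.set_state('other', 1)  # no listeners for 'other', so nothing is called

print(events)  # [('user_selection', 'diesel', 'crude')]
```

Because the callbacks are plain function calls, an exception in a listener shows up directly in the traceback of the `set_state` call that triggered it, which is what makes the synchronous approach easy to debug.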

Next step: convert it from a dictionary to class members and take advantage of dot notation.