Efficient Spatial Relationships with Z-Order Curves

Dan Miller
Staff Software Engineer

Solving Hard Problems

At Outer Labs, we’re interested in using technology to scale built environments. This is a statement that’s going to invent hard and time-consuming problems for software engineers like myself, who would otherwise just be peacefully going about their lives.

When I think about some of the more intricate challenges we face in this work, I realize they come in two forms:

  1. The typical practice of software engineering: there’s a computer somewhere that doesn’t know what to do, and I do. Normally, this problem is solved by hours sitting at the keyboard, communicating with the computer in a shared language. I build a model of the situation that the computer can understand, then the computer knows what to do and we’re done.
  2. However, for a lot of what we work on, there’s a further complication: I don’t know what to do either. In this state I find myself staring at the computer, the computer staring back at me, and we just quietly feel bad together.

With this in mind, how do you solve hard problems? As far as I can tell, you don’t. You turn hard problems into easy problems and solve those. For example, to break down a hard problem such as understanding a large space with a lot of things in it, you’ve got to turn it into an easier problem like understanding the relationship between smaller spaces.

Big problems are often just smaller problems in disguise.

Searching for the Things

The physical world is filled with things, but you don’t have to take my word for it. If you open up the window and look outside you’ll see some of them. It’s perfectly reasonable then to imagine that at some point someone will have to answer questions like, “Where’s a specific thing?” or, “How close are two things?” These are questions about the spatial relationships of objects and there are a lot of interesting algorithms designed around answering them.

The world we live in has three spatial dimensions, so you can encode an object’s distance from some reference point in each dimension with a number, and that gives you a precise location for a given point in space. This is the Cartesian coordinate system and it turns out to be a pretty good way of describing where something is. You can describe a position as a three-dimensional vector and you’re done.

Now, let’s imagine some fun situations. Say you’ve lost something valuable and have no idea where it is. How do you search for it? Or, let’s say you’ve set out to enforce social distancing and you need to tell everyone who’s closer than six feet apart to distance themselves. How do you efficiently do that given that there are a lot of people out there?

You need to break down the problem. If every time I lost a favorite pair of socks I caught a flight to a random country to start my search of the whole globe, my feet would be cold by the time I found them. Or, in the social distancing case, if I had to break out the ruler to measure the distance between every pair of people on the planet, even if each measurement took me a millisecond, it would still take me roughly a billion years to finish (nearly eight billion people make about 3 × 10^19 pairs), and there’s no way people will be willing to stand still that long.

These problems are too large to solve in practice because the naive approach treats all candidate solutions as equally likely. Instead of doing that, I could view the space around me as structured by distance, adjacency, or whatever makes sense for the problem I’m trying to solve. Structuring the world this way gives me a natural way to search. I can organize all of space by starting with the planet Earth, dividing it into continents, dividing those into nations, and so on until I’m dividing my apartment into rooms. I can start searching in the room I’m in, and if I don’t find my socks, I can backtrack up to the apartment level and search other rooms. So rather than search the world for my socks, I can build a map of likely places the socks might be and then search that, which is much more achievable in a lazy afternoon.

Structuring the way in which a problem is expressed helps you solve it, which is nice.

Binary Trees, Quadtrees, and Octrees

A binary tree is a tree data structure where each node has, at most, two child nodes. Making a series of binary choices (e.g., do I go left down this street or right, do I cook dinner tonight or eat out) is equivalent to traversing a binary tree.

Oftentimes a binary search is used to locate data within an ordered list. If you know nothing about the contents of the list, then every location in the list is a viable candidate for where your data is located, and so you need to visit each one to confirm or deny that the element you’re looking for is there. However, if you know that the list is ordered, then by visiting a single location you can tell whether the item you’re looking for is before your current location (where the smaller elements are) or after it (where the greater elements are). This lets you remove many candidates from consideration at once, which makes for a more efficient way to search.

One thing worth noting in this search is that the problem is recursive. In this case, there is a binary choice: “Do I search the elements less than where I am currently looking, or greater?” And after making that choice, you have the same problem you started with (searching a list for an element) just smaller.

Each decision makes the problem easier and easier, until there’s only one candidate left and the problem is as simple as, “Is this the element I’m looking for?” At a higher level, because the list is ordered, there’s a definition to the concept of “closeness.” Each decision not only eliminates about half of the remaining candidates from consideration, but it also gives you information about where to look next. In this way, a binary tree built over an ordered list is a one-dimensional spatial index.
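The recursion described above can be sketched in a few lines of Go. This is only an illustration under simple assumptions (a slice of ints in ascending order); the names Search and searchRange are hypothetical and chosen for this example.

```go
package main

// Search reports the index of target in the ascending slice xs, or -1
// if it is not there.
func Search(xs []int, target int) int {
	return searchRange(xs, target, 0, len(xs))
}

// searchRange considers only the candidates in xs[lo:hi].
func searchRange(xs []int, target, lo, hi int) int {
	if lo >= hi {
		return -1 // no candidates left
	}
	mid := lo + (hi-lo)/2
	switch {
	case xs[mid] == target:
		return mid
	case xs[mid] < target:
		// Everything up to mid is too small: same problem, right half.
		return searchRange(xs, target, mid+1, hi)
	default:
		// Everything from mid on is too large: same problem, left half.
		return searchRange(xs, target, lo, mid)
	}
}
```

Each call either answers the question or hands a strictly smaller range to the same function, which is the "same problem, just smaller" shape from the text.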

This would model a lot of the spatial problems we have at Outer Labs, if our world were a rope. Unfortunately our real problems are two- or three-dimensional, which is why we more often use data structures called quadtrees and octrees.

A quadtree is a tree where each node has up to four children, while an octree is the same, but with eight. Where a binary tree takes a list and recursively breaks it up into lists that are half as long, a quadtree takes a square and recursively breaks it up into squares that are a quarter of the size of the original, and then a quarter of the size of that, and so on. An octree takes a cube and recursively breaks it up into eight child cubes. These trees are used, much in the same way as a binary tree, to build spatial indices which allow us to quickly and easily associate data with space in two or three dimensions.
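To make the subdivision concrete, here is a minimal Go sketch of one quadtree split. The Square type and its methods are hypothetical names for this example, and the choice to put the X half in the low bit of the child index and the Y half in the high bit is an assumption of the sketch.

```go
package main

// Square is an axis-aligned square; (X, Y) is its minimum corner.
type Square struct {
	X, Y, Size float64
}

// Children splits s into four squares, each a quarter of its area.
// Bit 0 of the index selects the X half, bit 1 selects the Y half.
func (s Square) Children() [4]Square {
	h := s.Size / 2
	var out [4]Square
	for i := range out {
		out[i] = Square{
			X:    s.X + float64(i&1)*h,
			Y:    s.Y + float64(i>>1)*h,
			Size: h,
		}
	}
	return out
}

// ChildIndex reports which of the four children contains (px, py).
func (s Square) ChildIndex(px, py float64) int {
	h := s.Size / 2
	i := 0
	if px >= s.X+h {
		i |= 1
	}
	if py >= s.Y+h {
		i |= 2
	}
	return i
}
```

Calling Children on a child gives the next level down, so repeated ChildIndex calls walk a point toward smaller and smaller squares. An octree version would add a Z field and a third bit.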

Representation of These Data Trees

The typical way of representing a binary tree has the nodes as something like this:

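type BinaryTreeNode[T any] struct {
    value T                  // data stored at this node
    left  *BinaryTreeNode[T] // left child, or nil if there is none
    right *BinaryTreeNode[T] // right child, or nil if there is none
}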

We can represent quadtrees and octrees in much the same way, by storing pointers to four or eight children (either as individual struct members, or by embedding an array). If you’re concerned about memory layout (and who wouldn’t be), you might notice that each node needs to store a pointer per child, so nodes get bigger as the number of children increases. Also, a lot of pointer indirection is bad for the CPU cache, and traversal of your data structure will slow down. For that reason, you may want to allocate the children of each node in contiguous blocks. Then each node only needs to store a pointer to its block of children rather than a pointer to each child individually. You also might need to be able to traverse the other way, from child to parent, and if that’s the case each node will have to store a pointer to its parent as well.
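One way the contiguous layout might look in Go is sketched below. It swaps raw pointers for indices into a single slice (an "arena"), which is a substitution made for this example because a Go slice can move when it grows; all type and method names are hypothetical.

```go
package main

// Node is one octree node stored in a flat slice. Instead of eight child
// pointers it stores the index of the first of eight adjacent children.
type Node struct {
	Value      int
	FirstChild int32 // arena index of child 0, or -1 for a leaf
	Parent     int32 // arena index of the parent, or -1 for the root
}

// Octree owns every node in one contiguous slice.
type Octree struct {
	Nodes []Node
}

// NewOctree returns a tree containing only a root leaf at index 0.
func NewOctree() *Octree {
	return &Octree{Nodes: []Node{{FirstChild: -1, Parent: -1}}}
}

// Split appends eight adjacent children for node i and returns the
// index of the first one.
func (o *Octree) Split(i int32) int32 {
	first := int32(len(o.Nodes))
	for c := 0; c < 8; c++ {
		o.Nodes = append(o.Nodes, Node{FirstChild: -1, Parent: i})
	}
	o.Nodes[i].FirstChild = first
	return first
}

// Child returns the arena index of child c (0..7) of node i, or -1 if
// node i is a leaf.
func (o *Octree) Child(i int32, c int) int32 {
	first := o.Nodes[i].FirstChild
	if first < 0 {
		return -1
	}
	return first + int32(c)
}
```

Each node carries two small integers instead of up to nine pointers, and siblings sit next to each other in memory. The bookkeeping is still there, though, which is what the next paragraphs set out to remove.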

And, depending on the specifics of your problem, keeping these pointers around might be very inefficient and you might start to wonder why you need them at all. Pointers are addresses in memory. They have to be used because we can’t predict beforehand where in memory something will be allocated: we ask the system to allocate a bit of memory for us, it tells us the address where that memory was allocated, and we have to record that address or lose it.

So, if there was a way to construct some sort of address that uniquely identified a node, we wouldn’t need to store tree relationships in the nodes themselves. We could just construct relationships when needed.

Z-Order Curve

Let’s break the following square down using a quadtree. We can break the square down into its first set of four child squares and then number them from zero to three. If you convert the numbers we’ve given each node in this quadtree into binary you may notice:

  • One bit (the leftmost) is set to 1 depending on the child’s position in the Y dimension.
  • The other bit is set to 1 depending on the child’s position in the X dimension.

Dividing and encoding a 2D surface into its children.

This is what you would expect given the decisions that need to be encoded: There are two dimensions and two choices in each dimension. In this way, a quadtree can be seen as two binary trees which have been interwoven together.

As we recurse down into one of the child squares and repeat the process, we can concatenate the bits of the choices we make together, as a record of each choice that we’ve made. This assigns a unique number to every location at a given depth. A 64-bit unsigned integer can store 32 levels of quadtree (two bits per level), which is enough to find my socks on the surface of the earth. Two 64-bit unsigned integers can index individual atoms on the surface of many earths, so probably good enough for our spatial indexing problems.
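The descent described above can be written as a short loop. This Go sketch assumes the space is the unit square [0, 1) × [0, 1) and a fixed depth of at most 32 levels; LocationCode is a hypothetical name for this example.

```go
package main

// LocationCode descends depth levels of a quadtree over the unit square
// toward the point (x, y). At each level it appends two bits to the
// code: the Y choice first, then the X choice.
func LocationCode(x, y float64, depth int) uint64 {
	var code uint64
	minX, minY, size := 0.0, 0.0, 1.0
	for level := 0; level < depth; level++ {
		size /= 2 // side length of the children at this level
		var child uint64
		if y >= minY+size {
			child |= 2 // upper half in Y
			minY += size
		}
		if x >= minX+size {
			child |= 1 // upper half in X
			minX += size
		}
		code = code<<2 | child // record this choice after the earlier ones
	}
	return code
}
```

For example, the point (0.75, 0.25) picks child 01 at the first level and child 11 at the second, so LocationCode(0.75, 0.25, 2) is binary 0111, or 7.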

This ordering of spaces is called the z-order curve, and you can see why: if you trace out each child’s location in the order of their location codes, it makes a lot of fractal-y “Z” shapes. There are a number of benefits to encoding space this way. One of the smallest benefits is that there is a natural traversal order if you want to do a breadth-first search of your whole tree: within each level you are just counting up from 0, and that’s kind of fun.

Breaking a 2D surface down into a series of quadtrees, arriving at an encoded path of “100111”.

If you need to manifest your tree in memory and associate data with each node, you can trade indirection and pointer bookkeeping costs for hashing costs by representing your octree as a hash table with each node’s location code as its key. This pays off especially well if your hashing algorithm can take advantage of the small, fixed-size numeric keys, or if the tree is sparse.
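A hash-table octree of this kind can be sketched with a plain Go map. One detail has to be assumed to make it work: codes at different depths need to stay distinct, so this sketch prefixes every code with a sentinel 1 bit, which leaves room for 21 octree levels in 64 bits. That convention and all names here are assumptions of the example.

```go
package main

// Code is an octree location code: a leading sentinel 1 bit (an
// assumption of this sketch) followed by three bits per level.
type Code uint64

// Root is the code of the whole space: just the sentinel bit.
const Root Code = 1

// Child returns the code of child i (0..7) of c.
func (c Code) Child(i uint64) Code { return c<<3 | Code(i&7) }

// Parent returns the code of the node that contains c.
func (c Code) Parent() Code { return c >> 3 }

// SparseOctree stores data only for the nodes that have any.
type SparseOctree map[Code]string

// Containing returns the data on the deepest stored node that contains
// c, walking up the tree by shifting instead of following pointers.
func (t SparseOctree) Containing(c Code) (string, bool) {
	for ; c >= Root; c = c.Parent() {
		if v, ok := t[c]; ok {
			return v, true
		}
	}
	return "", false
}
```

Only occupied nodes take up memory, and no node stores a pointer to anything: tree[Root.Child(5).Child(2)] = "socks" records a value, and Containing finds it from any code nested inside that region.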

Ease of working with the keys is also valuable:

  • A child can be found from a parent’s location code by left shifting by the number of dimensions of your tree and OR-ing in the child’s index (assuming z-order).
  • Finding a parent is as easy as right shifting by the same number of bits.
  • You can weave the bits of a vector together to form a location code, or unweave a location code’s bits into a Cartesian coordinate. This can be done efficiently with some bit fiddling.
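The three operations in this list can be sketched in Go for the two-dimensional case. The weaving uses the well-known mask-and-shift trick for spreading bits apart; the function names are hypothetical, and putting Y in the higher bit of each pair is an assumption that matches the numbering described earlier.

```go
package main

// spread moves the low 32 bits of v to the even bit positions, leaving
// a zero bit between each pair of neighbours.
func spread(v uint64) uint64 {
	v &= 0x00000000FFFFFFFF
	v = (v | v<<16) & 0x0000FFFF0000FFFF
	v = (v | v<<8) & 0x00FF00FF00FF00FF
	v = (v | v<<4) & 0x0F0F0F0F0F0F0F0F
	v = (v | v<<2) & 0x3333333333333333
	v = (v | v<<1) & 0x5555555555555555
	return v
}

// squash is the inverse of spread: it gathers the even bits of v.
func squash(v uint64) uint64 {
	v &= 0x5555555555555555
	v = (v | v>>1) & 0x3333333333333333
	v = (v | v>>2) & 0x0F0F0F0F0F0F0F0F
	v = (v | v>>4) & 0x00FF00FF00FF00FF
	v = (v | v>>8) & 0x0000FFFF0000FFFF
	v = (v | v>>16) & 0x00000000FFFFFFFF
	return v
}

// Encode weaves x and y into one location code (Y in the higher bit).
func Encode(x, y uint32) uint64 {
	return spread(uint64(x)) | spread(uint64(y))<<1
}

// Decode unweaves a location code back into its x and y coordinates.
func Decode(code uint64) (x, y uint32) {
	return uint32(squash(code)), uint32(squash(code >> 1))
}

// Child appends child index i (0..3) to a parent's code.
func Child(parent, i uint64) uint64 { return parent<<2 | i&3 }

// Parent drops the last two bits to move one level up.
func Parent(code uint64) uint64 { return code >> 2 }
```

With these definitions, Encode(3, 5) returns 39, which is 100111 in binary, and Decode(39) gives back (3, 5). An octree version would spread each coordinate over every third bit and shift by three instead of two.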

If you store these keys in a hash set, or as the keys of a hash table, a lot of spatial operations and queries can be done in constant or linear time, which is pretty good.

The Hard Problems

We had a problem at Outer Labs where we needed to be able to do set operations (union, intersection, etc.) on the results of spatial searches. So we had a metric which could assign a value to a region in space, and we would recursively search that space for features, building an octree of high-value regions. If we had a second metric, we might need to find the intersection with the results of a similar search, which would give us the regions that satisfy both metrics. This is the intersection of two octrees. We might also want the combined region which satisfies either metric, which would be the union. If we want one and not the other, that would be the difference.

Breaking down the problem into these location codes, typically called Morton codes, allows us to do this in linear time. Consider a union operation. If you keep the codes in each octree in sorted lists, codes which represent regions higher in the octree necessarily come before the codes of the regions nested inside them. So once we encounter a particular code, any later code that begins with the same bits as a prefix is already covered and can simply be removed. The remaining codes represent the union of the two octrees. Intersection and difference are more complicated than union, but still easy to implement in this form. Which is nice. Who doesn’t like when things are easy?
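A minimal Go sketch of that union follows. To keep the prefix test obvious, it assumes each code is written as a string of child indices ('0' to '7', one character per level) and that both inputs are already sorted lexicographically. That representation and the name Union are choices made for this example and are not necessarily how the codes are stored in practice.

```go
package main

import "strings"

// Union merges two sorted lists of location codes and drops every code
// whose region is already covered by an ancestor that was kept.
// A parent's code is a prefix of each of its descendants' codes.
func Union(a, b []string) []string {
	out := make([]string, 0, len(a)+len(b))
	i, j := 0, 0
	for i < len(a) || j < len(b) {
		// Standard merge step: take the smaller head of the two lists.
		var next string
		if j >= len(b) || (i < len(a) && a[i] <= b[j]) {
			next = a[i]
			i++
		} else {
			next = b[j]
			j++
		}
		// In sorted order an ancestor comes right before its descendants,
		// so only the last kept code can cover next (or equal it).
		if n := len(out); n > 0 && strings.HasPrefix(next, out[n-1]) {
			continue
		}
		out = append(out, next)
	}
	return out
}
```

Merging ["1", "25", "30"] with ["12", "2", "301"] yields ["1", "2", "30"]: "12" is inside "1", "25" is inside "2", and "301" is inside "30". Each input code is examined once, so the work is linear in the number of codes.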

Now if anyone ever asks you how to find socks, you can tell them to use a z-order curve to build a spatial index of their apartment.

Content originally published by Outer Labs

November 2, 2020
