A Tabular Hash for Fast Owen Scrambling

Summary

The Sobol sequence is effective for Monte Carlo integrations that commonly occur in computer graphics. Its utility is improved when it is coupled with Owen's nested uniform scrambling, commonly called Owen Scrambling. We survey existing implementations of Owen Scrambling and present an original method using tabular hashing. Our implementation uses a fixed-size static lookup table of 2KiB, suitable for scrambling any number of dimensions. It offers a good compromise amongst speed, quality and flexibility.

Introduction

The Sobol sequence is commonly used for Monte Carlo integrations in computer graphics. However, in multiple dimensions, the sequence can display structured artifacts. Joe and Kuo created direction vectors to maximize coverage across dimensions. Faure and Tezuka randomized the rank order of values in the sequence. We call this "shuffling."

Owen introduced a randomization which eliminates structured artifacts and actually improves performance when sampling sufficiently smooth signals. It also provides some guaranteed worst-case bounds on convergence. This method is commonly called Owen Scrambling. In brief, it works by probabilistically swapping elementary intervals of samples in each dimension, preserving sample coverage of those intervals. It has been studied extensively.

Implementing Owen Scrambling naively can be expensive because it requires a tree of random bits with a depth equal to the size of the bit-word to be scrambled. Alternative methods to reduce or eliminate the costs of producing scramble trees have been presented. Laine and Karras developed a hash function for shuffling. Burley noted that this can also be used for scrambling. Vegdhal improved upon the hashing function at the core of Burley's implementation. Matoušek proposed random digit scrambling and random linear scrambling as alternatives to full Owen Scrambling. In base-2, a static random word is applied as the scramble word in each dimension. Tan and Boyle suggested iterating repeatedly through a fixed-size scramble tree. Friedel and Keller proposed using lazy permutation trees.

Christensen et al and Helmer et al generated samples in bulk, merging randomization with calculating the sequence itself.

Our Method

Here, we propose an implementation of Owen Scrambling that uses tabular hashing as opposed to the hashing methods of Laine and Karras, Burley and Vegdhal. With tabular hashing, the bulk of the entropy is generated during coding and it is stored in static tables. At run-time, tabular data are accessed and potentially aggregated in computationally simple ways to produce good-quality hash results. Overall, tabular hashing can be very fast without sacrificing the quality of the randomization.

In our case, we first we generate a set of fixed-depth scramble trees using existing methodology. In our testing, 16 trees of depth 8 were sufficient. These are created during code implementation, so the tables are static and can be stored in a lookup table with a total size of 2KiB.

When calculating a binary word for scrambling, a user-supplied seed value is used to choose the first random tree. When we reach the leaf of that tree, we use the last few random bits from the traversal to select a new tree. In this way, our pre-generated trees can be chained together to produce binary scramble words of arbitrary length. To create a greater diversity of randomization, we use a simple integer multiplication against the user-provided seed to further randomize the final scrambling word.

We present pseudo-code for our scrambling calculation in Algorithm 1. Complete source code for calculating the Owen Scrambled Sobol sequence with state and without is available here.

Algorithm 1: Our Method for Owen Scrambling in Base-2
Require: static 𝑠𝑐𝑟𝑎𝑚𝑏𝑙𝑒𝑡𝑎𝑏𝑙𝑒[16][128], table of 8-bit words
Require: input 𝑠𝑒𝑒𝑑, a user-supplied 32-bit random seed
Require: input 𝑤𝑜𝑟𝑑, 32-bit binary value to scramble
1: 𝑥 ← 𝑠𝑒𝑒𝑑 × 0𝑥6𝐴935𝐶𝐴5
2: 𝑛 ← 𝑠𝑒𝑒𝑑
3: 𝑤 ← 𝑤𝑜𝑟𝑑
4: 𝑏𝑖𝑡𝑠 ← 32
5: while 𝑏𝑖𝑡𝑠 do
6: 𝑏𝑖𝑡𝑠 ← 𝑏𝑖𝑡𝑠 − 8
7: 𝑛 ← 𝑠𝑐𝑟𝑎𝑚𝑏𝑙𝑒𝑡𝑎𝑏𝑙𝑒[𝑛 & 0𝑥𝐹][𝑤 >> 25]
8: 𝑤 ← 𝑤 << 8
9: 𝑥 ← 𝑥 ⊕ (𝑛 << 𝑏𝑖𝑡𝑠)
10: end while
11: return 𝑤𝑜𝑟𝑑 ⊕ 𝑥

Scrambling in different dimensions is achieved by incorporating the dimension into the input seed for the algorithm. In its simplest form, the dimension can be the seed value. This is in contrast to the methods of Burley and Vegdhal which requires additional randomization of the input seed value for the best result.

Interestingly, Tan and Boyle’s method can be achieved by changing line 1 to initialize 𝑥 to zero, and using 𝑠𝑒𝑒𝑑 on line 7 instead of 𝑛 to choose the tree. Speed is improved at the expense of quality. We will use this implementation of the Tan and Boyle method later in our testing.

As an optimization, since practical applications involve words of fixed size, the while loop can be unwrapped to avoid jump instructions that could impede pipelined execution on the CPU.

Our reference implementation of the algorithm is available online.

Testing

In this section, we evaluate different methods of randomizing the Sobol sequence in several different contexts to demonstrate the worth of our method.

First, we show the power spectral density of several different sample generators as a demonstration of their quality. We start with the Stochastic Sobol generator of Helmer et. al., which shows high quality.

Next, we show the power spectral density for the Sobol sequence scrambled with the Burley/Vegdahl implementation of Owen Scrambling. The image on the left shows the result without a strong hash on the input seed value; the image on the right does have a strong hash applied to the seed.

Last, we show the power spectral density for the Sobol sequence scrambled using our tabular hash implementation of Owen Scrambling. The image on the left shows the results of the Tan & Boyle algorithm; the image on the right demonstrates the full implementation of Algorithm 1.

While the power spectral density for our tabular hash method is not as ideal as the results produced by the Stochastic Sobol generator, its quality is respectable. We have confidence that the sequence is not subject to exotic aliasing artifacts.

When testing for quality, we expected the various implementations of Owen Scrambling to produce results of similar quality. This is because these implementations are all expressions of the same mathematical operation. We also expected the Stochastic Sobol generator to perform similarly to Sobol sequences with Owen Scrambling applied by similar logic.

We conducted three render tests, an ambient occlusion test, an area light sampling test, and a full path tracing test. The images below depict the scenes that we tested.

During the ambient occlusion rendering test, our expectations of equivalent quality were largely met. The various generators differed by only a few percent in terms of the RMS errors that they generated. A Sobol sequence using our implementation of Owen Scrambling outperformed the Stochastic Sobol sequence for 51% of the sample counts tested. It outperformed the Burley/Vegdahl method with strong seed hashing for 48% of the tests and for 79% of the tests without the strong seed hash. It outperformed the Tan & Boyle method 50% of the time.

However, the rendering test involving area light sampling resulted in pronounced differences. Our method outperformed the Stochastic Sobol sequence for 91% of the sample counts. It outperformed the Burley/Vegdahl method with strong seed hashing 62% of the time and for 91% of the time without the strong seed hash. It outperformed the Tan & Boyle method 66% of the time. The fact that this rendering test showed severe variances is unexpected and requires more study.

The full path tracing test also met expectations. Our method was better than the Stochastic Sobol sequence 52% of the time, the Burley/Vegdahl method with strong seed hashing 50% of the time, without strong seed hashing 55% of the time and it bested the Tan & Boyle method 55% of the time.

Finally, we tested the generators for their speed. There are several caveats for these results. First, we used an implementation of Stochastic Sobol generation based on Listing 2 of the algorithm from the paper. It's possible that real-world implementations have optimizations that were not present in that algorithm listing. Also, the Burley/Vegdhal implementation can be accelerated in a number of ways. For example, by reversing the direction vectors, one of the two calls that reverse bit-words for each calculation can be eliminated. On the other hand, the tabular hash could be also accelerated by using larger lookup tables, where longer bit-words are returned from the table and therefore fewer iterations are required to calculate the scrambling word. Implementations of these algorithms gives the engineer a lot of opportunities for optimization based on the needs of the application, so these timings should be only used as an approximate reference.

Nonetheless, on a dated Intel Xeon W3670 CPU with a clock rate of 3.20GHz, we observed the following performance:

Stateful Sobol sequence without Owen Scrambling: 152.4 million samples per second
Stochastic Sobol sequence: 90.90 M/sec
Stateful Sobol sequence with Tan & Boyle Owen Scrambling: 67.54 M/sec
Stateful Sobol sequence with our implementation of Owen Scrambling: 51.28 M/sec
Stateful Sobol sequence with Burley/Vegdhal Owen Scrambling: 34.34 M/sec

While the Stochastic Sobol sequence provides both, excellent randomization and performance, its samples must be calculated in bulk. As a result, it is not necessarily the best option when samples must be calculated entirely on demand. The other rapid methods generate samples of lower quality. With that in mind, we believe our method offers a good compromise amongst speed, quality and flexibility.

Results & Conclusions

We've presented an alternative method for calculating Owen Scrambling based on tabular hashing. It is capable of randomizing the Sobol sequence in any dimension using a 2KiB lookup table. We've shown that its quality is comparable to existing methods and that its speed is good. While there are many options available for high quality sample generation, we hope that the computer graphics community finds our new method useful.

Musings of a Curious Engineer

Search This Blog