• Splay trees and optimality

    The splay tree is probably my favourite data structure. Is it useful in practice? Probably not, but its remarkable optimality properties coupled with its bare simplicity are so tantalizing that I’ve fallen in love with splaying. In the rest of this post, I’ll describe the splay tree structure, and present some of my favourite splay tree properties. You will also find an instructive D3 visualization of a splay tree in motion.

    Read more →

  • A uniform sum threshold problem

    Let \(U_1, U_2, \dots\) be an infinite sequence of independent \(\mathrm{Uniform}[0, 1]\) random variables. Let \(N\) be the minimum index for which \[ U_1 + U_2 + \dots + U_N > 1 . \] What is the expected value \(\mathbf{E}\{N\}\)?

    Read more →

  • Generating spherical points without complex operations

    These days, most of everyone’s favourite languages and libraries for scientific computing come ready-equipped with random number generators for most common univariate distributions: the uniform, binomial, normal, geometric, exponential, beta, etc. In my experience, multivariate generation is comparatively hit-or-miss. But in any case, since documentation usually doesn’t specify implementation methods or running time, you usually can’t be sure of the efficiency of one of these functions without personally examining some source code, or being lucky and finding that someone else on StackExchange already did. Thankfully, when in doubt, one can always refer to the excellent and totally free book Non-Uniform Random Variate Generation by my old PhD supervisor Luc Devroye. In fact, it seems this book is even more than free, as stated in this plea posted by the author on the book’s webpage.

    I give anyone the permission, even without asking me, to take these PDF files to a printer, print as many copies as you like, and sell them for profit. If you would like me to advertise the sales points of the hard copies, please let me know. To the libraries: Please do not charge patrons for copying this book. I grant everyone the right to copy at will, for free.

    Read more →

  • Discrete minimax estimation with trees

    This morning, I submitted the final version of my paper Discrete minimax estimation with trees (Devroye and Reddad, 2019), which is to appear in the Electronic Journal of Statistics. I think this paper is conceptually quite interesting, and I’m very happy with the final result, so in this post I’ll describe some of the main ideas present in the work.

    Read more →

  • Some conditioned Galton-Watson trees never grow

    When programmers hear the phrase “random tree,” they most likely think of a random binary search tree, i.e., a binary search tree built from the insertion of a uniformly random permutation of \(n\) keys—denote such a tree by \(\mathrm{BST}_n\). A mathematician might instead think that a ``random tree’’ is more likely to be a uniformly random tree taken from some class, like the set of all ordered binary trees with \(n\) nodes—denote such a tree by \(\mathrm{UNIF}_n\). Of course, neither would be wrong. It should be clear, though, that these two distributions on the space of binary trees are quite different. In particular, most undergraduate students of computer science learn, through the study of comparison-based sorting algorithms, that \[ \mathbf{E}\{\mathrm{height}(\mathrm{BST}_n)\} = \varTheta(\log n) , \] and some will learn that \[ \mathbf{E}\{\mathrm{height}(\mathrm{UNIF}_n)\} = \varTheta(\sqrt{n}) . \] Though random binary search trees might seem more immediately relevant to programmers, uniformly random binary trees are part of a bigger picture which is comparatively more versatile in the probabilistic analysis of algorithms. To this end, we introduce the concept of Galton-Watson trees.

    Read more →