Nedtries

A portable in-place bitwise binary Fredkin trie algorithm which allows for near constant time insertions, deletions, finds, closest fit finds and iteration. Is approx. 50-100% faster than red-black trees and up to 20% faster than O(1) hash tables.

Generate Convert Improve

Install / Use

/learn @ned14/Nedtries

About this skill

Quality Score

0/100

README

<p><b>This algorithm has been ported to modern C++ and can be found at https://github.com/ned14/quickcpplib/blob/master/include/quickcpplib/algorithm/bitwise_trie.hpp. This project has been <b>ARCHIVED</b> and will no longer be maintained. Thanks for all the user support throughout the years.</b></p> <hr /> <div style="text-align: center"> <h1 style="text-decoration: underline">nedtries v1.03 trunk (?)</h1> <h2 style="text-decoration: none;">by Niall Douglas</h2> <p>Web site: <a href="http://www.nedprod.com/programs/portable/nedtries/">http://www.nedprod.com/programs/portable/nedtries/</a></p> <p>API Reference: <a href="https://ned14.github.io/nedtries/nedtrie_8h.html">https://ned14.github.io/nedtries/nedtrie_8h.html</a></p> <hr /></div> <p>master branch CI status: Linux: <a href="https://travis-ci.org/ned14/nedtries"><img src="https://travis-ci.org/ned14/nedtries.svg?branch=master"/></a> Windows: <a href="https://ci.appveyor.com/project/ned14/nedtries"><img src="https://ci.appveyor.com/api/projects/status/wcvgw1lx8in5c36p?svg=true"/></a></p> <p>Enclosed is nedtries, an in-place bitwise binary Fredkin trie algorithm which allows for near constant time insertions, deletions, finds, <span style="text-decoration: underline"> <strong>closest fit finds</strong></span> and iteration. On modern hardware it is approximately 50-100% faster than red-black binary trees, it handily beats even the venerable O(1) hash table for less than 3000 objects and it is barely slower than the hash table for 10000 objects. Past 10000 objects you probably ought to use a hash table though, and if you need <a href="#nfindversuscfind">nearest fit rather than close fit</a> then red-black trees are still optimal.</p> <p>It is licensed under the <a href="http://www.boost.org/LICENSE_1_0.txt" target="_blank">Boost Software License</a> which basically means you can do anything you like with it. Commercial support is available from <a href="http://www.nedproductions.biz/" target="_blank">ned Productions Limited</a>.</p> <p>Its advantages over other algorithms are sizeable:</p> <ol> <li>It has all the advantages of red-black trees such as close-fit finds (i.e. find an item which is similar but not exact to the search term) without anything like the impact on memory bandwidth as red-black trees because it doesn't have to rebalance itself when adding new items (i.e. it scales far better with memory pressure than red-black trees).</li> <li>It doesn't require dynamic memory like hash tables, so it can be used in a bounded environment such as a bootstrapper or a tiny embedded systems kernel. It is also a lot faster than hash tables for less than a few thousand items.</li> <li>Unlike either red-black trees or most hash tables, nedtries can store as many items with identical keys as you like.</li> <li>Its performance is nearly perfectly stable over time and number of contents N with a worst case complexity of O(M) following a mostly linear degradation with increasing M average where 1 <= M <= 8*sizeof(void *). M is a measure of the entropy between differing keys, so where keys are very similar at the bit level M is higher than where keys are very dissimilar. This leads to an unusual complexity where it can be like O(log N) for some distributions of key, and like O(1) for other distributions of key. The scaling graphs below are for completely random keys.</li> <li>Its complexities for find and insert are identical, whereas for deletion it is slightly more constant. Unlike almost any other algorithm, bitwise binary tries have nearly identical real world speeds for ALL its operations rather than being fast at one thing but slow at the others. In other words, if your code equally inserts, deletes and finds with no preference for which then <span style="text-decoration: underline">this algorithm will typically beat all others in the general purpose situation</span>.</li> </ol> <p>Its two primary disadvantages are that (i) it can only key upon a size_t (i.e. the size of a void *), so it cannot make use of an arbitrarily large key like a hash table can (though of course one could hash the large key into a size_t sized key) and (ii) it is lousy at guaranteed nearest fit finds. It also runs fastest when the key is as unique from other keys as possible, so if you wish to replace a red-black tree which has a complex left-right comparison function which cannot be converted into a stable size_t value then you will need to stick with red-black trees. In other words, it is ideal when you are keying on pointer sized keys where each item has a definitive non-changing key.</p> <p>Have a look at the scaling graphs below to decide if your software could benefit. Note for non-random key distributions you may get significantly better or worse performance than shown. If you're interested, read on for how to add them to your software.</p> <div><center> <img alt="Bitwise Trees Scaling" height="25%" src="images/BitwiseTreesScaling.png" width="25%" /><img alt="Red Black Trees Scaling" height="25%" src="images/RedBlackTreesScaling.png" width="25%" /><img alt="Hash Table Scaling" height="25%" src="images/HashTableScaling.png" width="25%" /></center></div> <div><center><img alt="Bitwise Trees Scaling" height="25%" src="images/LogLogBitwiseTreesScaling.png" width="25%" /><img alt="Red Black Trees Scaling" height="25%" src="images/LogLogRedBlackTreesScaling.png" width="25%" /><img alt="Hash Table Scaling" height="25%" src="images/LogLogHashTableScaling.png" width="25%" /></center></div> <h2><a name="breakingchanges">A. Breaking changes since previous versions:</a></h2> <p>In v1.02 a breaking change in what NEDTRIE_NFIND means and does was introduced, and the version was bumped to v1.02 to warn users of the change. The problem in v1.01 and earlier was that I had incorrectly documented what NEDTRIE_NFIND does: I said that it found the nearest item to the search term when this was patently untrue (thanks to smilingthax for reporting this). In fact, NEDTRIE_NFIND used to return a matching item, if there was one, but if there was no matching item, it returned <em>any</em> larger item rather than <em>the</em> next largest item. I do apologise to the users of nedtries for this documentation error, and for the lost productivity it surely must have caused some of you.</p> <p>The good news is that NEDTRIE_NFIND now guarantees to return the next largest item, and it therefore now matches BSD's red-black Nfind. The bad news is that bitwise tries are really not ideal for guaranteed nearest matching, and performance is terrible as you can see via the purple line in the graphs above. If you can put up with non-guaranteed nearest matching, NEDTRIE_CFIND offers much better performance. NEDTRIE_CFIND takes a <em>rounds</em> parameter which indicates how hard the routine should try to return a close item: rounds=0 means to return the first item encountered which is equal or larger to search key, rounds=1 means try one level down, rounds=2 means try two levels down and so on. rounds=INT_MAX means try hardest, and guarantees that any item with a matching key will be found and that if not matching, any item returned will have a very close key (those not necessarily the closest).</p> <p>For a summary of the differences between Nfind and Cfind, <a href="#nfindversuscfind">see this useful table</a>. Note that if you just want any item with a key larger or equal to the search key, NEDTRIE_CFIND(rounds=0|1|2) is extremely swift and has O(1) complexity as shown in the graphs above.</p> <h2><a name="implementation">B. Implementation:</a></h2> <p>The source makes use of C macros on C and C++ templates on C++ - therefore, unlike typical C-macro-based algorithms it is easy to debug and in fact, the improved metadata specified by the templates lets a modern C++ compiler produce 5-15% faster code through PGO guided selective inlining. The code is 100% standard C and C++, so it should run on any platform or architecture though you <em>may</em> need to implement your own nedtriebitscanr() function if you're not using GCC nor MSVC and want to keep performance high. If you are building debug, NEDTRIEDEBUG is by default turned on: this causes a complete state validation check to be performed after each and every change to the trie which tends to be very good at catching bugs early, but can make debug builds a little slow.</p> <p>So what is "an in-place bitwise binary Fredkin trie algorithm" then? Well you ought to start by reading and fully digesting <a href="http://en.wikipedia.org/wiki/Trie" target="_blank">the Wikipedia page on Fredkin tries</a> as what comes won't make much sense otherwise. The Wikipedia page describes a non-inplace trie which uses dynamic memory to store each consecutive non-differing section of a string, and indeed this is how tries are normally described in algorithm theory and classes. nedtries obviously enough selects on individual bits rather than substrings, and it uses an inplace instead of dynamically allocated implementation.</p> <p>Here is how nedtries performs an indexation: firstly, the most significant set bit X is found using nedtriebitscanr() which is no more than one to three CPU cycles on modern processors. This is used to index an array of bins. Each bin X contains a binary tree of items whose keys are (1<<X) <= key < (1<<(X+1)), so what one does is to follow the tree downwards selecting left or right based on whether the next bit downwards is 0 or 1. If an item has children, its key is only guaranteed to be constrained to that of its bin, whereas if an item does not have children then its key is guaranteed to match as closely as possible its position in the tree.</p> <p>If you insert an item, nedtries indexes as far as it can down the existing tree where the new item ought to be and inserts it there. If you remove

Related Skills

node-connect

337.4k

Diagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps

frontend-design

83.2k

Create distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.

openai-whisper-api

337.4k

Transcribe audio via OpenAI Audio Transcriptions API (Whisper).

commit-push-pr

83.2k

Commit, push, and open a PR