README

injest: +> +>> x>> =>>


Clojure's threading macros (the -> and ->> thrushes) are great for navigating into data and transforming sequences. injest's path thread macros +> and +>> are just like -> and ->> but with expanded path navigating abilities similar to get-in.

Transducers are great for performing sequence transformations efficiently. x>> combines the efficiency of transducers with the better ergonomics of +>>. Thread performance can be further extended by automatically parallelizing work with =>>.

injest's macros achieve this by scanning the thread for contiguous transducer-capable forms and comp-ing them together into a single function that then either sequences or parallel-folds the values flowing through the thread.
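To make that concrete with plain clojure.core (a rough illustration of the idea, not injest's actual macroexpansion):

```clojure
;; A classical lazy thread...
(->> (range 10) (map inc) (filter odd?)) ;=> (1 3 5 7 9)

;; ...is, conceptually, rewritten by x>> into one transducing pass:
;; the transducer-capable forms are comp-ed and run via `sequence`.
(sequence (comp (map inc) (filter odd?)) (range 10)) ;=> (1 3 5 7 9)
```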

Getting Started

deps.edn

Place the following in the :deps map of your deps.edn file:

  ...
  net.clojars.john/injest {:mvn/version "0.1.0-beta.9"}
  ...

clj-kondo

Make clj-kondo/Clojure-LSP aware of injest by adding "net.clojars.john/injest" to the :config-paths vector of your .clj-kondo/config.edn file:

{:config-paths ["net.clojars.john/injest"]}

This will automatically import injest's lint definitions in Calva and other IDEs that leverage clj-kondo and/or Clojure-LSP.

Quickstart

To try it in a repl right now with criterium and net.cgrand.xforms, drop this in your shell:

clj -Sdeps \
    '{:deps 
      {net.clojars.john/injest {:mvn/version "0.1.0-beta.9"}
       criterium/criterium {:mvn/version "0.4.6"}
       net.cgrand/xforms {:mvn/version "0.19.2"}}}'

Requiring

Then require the injest macros in your project:

(ns ...
  (:require [injest.path :refer [+> +>> x>> =>>]]
   ...

To just use x>> or =>> with the classical thread behavior, without the additional path thread semantics, you can require in the injest.classical namespace instead of the injest.path namespace:

(ns ...
  (:require [injest.classical :refer [x>> =>>]]
   ...

Having these two :require options lets individuals and organizations adopt, à la carte, the two orthogonal value propositions: improved performance and improved navigation.

Path Threads

injest.path allows for more intuitive path navigation, like you're used to with the (-> m :a :b :c) idiom. We refer to these as path threads.

Ergonomically, path threads provide a semantic superset of the behaviors found in -> and ->>. In other words, there is generally nothing you can do with -> that you can't do with +>. All the thread macros in injest.path have these path thread semantics.

As a replacement for get-in, get and nth

In path threads, naked integers and strings become lookups on the value being passed in, making those tokens useful again in threads. You can index into sequences with integers, like you would with nth, and replace get/get-in for most cases involving access in heterogeneous map nestings:

(let [m {1 (rest ['ignore0 0 1 {"b" [0 1 {:c :res}]}])}]
  (+> m 1 2 "b" 2 :c name)) ;=> "res"

Here, we're looking up 1 in the map, then getting the third element of the sequence returned, then looking up "b" in the returned map, then getting the third element of the returned vector, then looking up :c in the returned map, and then finally calling name on the returned keyword value.

In the above form, you could replace +> with +>>, x>> or =>> and still get the same result: +>> is simply the thread-last version of +>, while x>> and =>> are the transducing and parallelizing versions of +>>.
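Because +> and +>> keep the classical insertion positions, the familiar ->/->> behavior illustrates where the threaded value lands in each form:

```clojure
;; Thread-first inserts the value as the first argument,
;; thread-last inserts it as the last argument:
(-> 10 (- 3))  ;=> 7    ; expands to (- 10 3)
(->> 10 (- 3)) ;=> -7   ; expands to (- 3 10)
```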

Lambda wrapping

Path threads allow you to thread values through anonymous functions, like #(- 10 % 1) or (fn [x] (- 10 x 1)), without having to wrap them in an extra enclosing set of parentheses:

(+> 10 range rest 2 #(- 10 % 1)) ;=> 6

Or, extending our prior example:

(let [m {1 (rest ['ignore0 0 1 {"b" [0 1 {:c :bob}]}])}]
  (x>> m 1 2 "b" 2 :c name #(str "hi " % "!"))) ;=> "hi bob!"

This has the added benefit of conveying to the reader that the author intends for the anonymous function to take only one parameter. In the classical thread syntax, the reader would have to scan all the way to the end of the (#(... form to know whether an extra argument is being passed in. It also prevents people from creating unmaintainable abstractions by threading values into the middle of a literal lambda definition - a common source of errors.

Backwards compatibility

+> and +>> have the same laziness semantics as -> and ->>. So if you find yourself wanting to migrate a path thread away from a transducer/parallel context, back to lazier semantics, while keeping the path navigation, you can simply replace the x>> or =>> macro with the +>> macro we required above. Path navigating will continue to work:

(let [m {1 (rest ['ignore0 0 1 {"b" [0 1 {:c :bob}]}])}]
  (+>> m 1 2 "b" 2 :c name #(str "hi " % "!"))) ;=> "hi bob!"

You can also just use +> and +>> on their own, without the transducifying macros, if you only want the more convenient ergonomics.
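For instance, a +> path thread can stand in for get-in (a minimal sketch; it assumes the injest dependency from Getting Started is on your classpath):

```clojure
(require '[injest.path :refer [+>]])

(def m {:a [{:b {"c" 42}}]})

;; Classical access with get-in:
(get-in m [:a 0 :b "c"]) ;=> 42

;; The same path, threaded - integers index, strings and keywords look up:
(+> m :a 0 :b "c")       ;=> 42
```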

As stated above, you can also require x>> and =>> from injest.classical; then, in the event you want to revert back to ->>, you can do so knowing that no one has added path thread semantics to the thread that would also need converting back to the classical syntax.

x>> Auto Transducification

Why? Well, for one, speed. Observe:

(->> (range 10000000)
     (map inc)
     (filter odd?)
     (mapcat #(do [% (dec %)]))
     (partition-by #(= 0 (mod % 5)))
     (map (partial apply +))
    ;;  (mapv dec)
     (map (partial + 10))
     (map #(do {:temp-value %}))
     (map :temp-value)
     (filter even?)
     (apply +)
     time)

Returns:

"Elapsed time: 8275.319295 msecs"
5000054999994

Whereas:

(x>> (range 10000000)
     (map inc)
     (filter odd?)
     (mapcat #(do [% (dec %)]))
     (partition-by #(= 0 (mod % 5)))
     (map (partial apply +))
    ;;  (mapv dec)
     (map (partial + 10))
     (map #(do {:temp-value %}))
     (map :temp-value)
     (filter even?)
     (apply +)
     time)

Returns:

"Elapsed time: 2913.851103 msecs"
5000054999994

Two to three times the speed with basically the same code. The more transducers you can get lined up contiguously, the less boxing you’ll have in your thread.

Note: These times reflect the execution environment provided by GitHub's browser-based VS Code runtime. My local box performs much better and yours likely will too.

Let's uncomment the (mapv dec) that is currently commented out in both threads above. Because mapv has no transducer arity, items get boxed halfway through the thread. As a result, x>>'s performance degrades slightly.

First, let's see it with ->>:

(->> (range 10000000)
     (map inc)
     (filter odd?)
     (mapcat #(do [% (dec %)]))
     (partition-by #(= 0 (mod % 5)))
     (map (partial apply +))
     (mapv dec)
     (map (partial + 10))
     (map #(do {:temp-value %}))
     (map :temp-value)
     (filter even?)
     (apply +)
     time)
"Elapsed time: 6947.00928 msecs"
44999977000016

Hmm, ->> actually goes faster now, perhaps due to mapv removing some laziness. The more lazy semantics are less predictable in that way.

But now, for x>>:

(x>> (range 10000000)
     (map inc)
     (filter odd?)
     (mapcat #(do [% (dec %)]))
     (partition-by #(= 0 (mod % 5)))
     (map (partial apply +))
     (mapv dec)
     (map (partial + 10))
     (map #(do {:temp-value %}))
     (map :temp-value)
     (filter even?)
     (apply +)
     time)
"Elapsed time: 3706.701192 msecs"
44999977000016

So we lost some speed due to the boxing, but we’re still doing a worthy bit better than the default thread macro. So keep in mind, if you want to maximize performance, try to align your transducers contiguously.
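When the intermediate vector isn't actually needed, one easy fix is swapping mapv for map, which does have a transducer arity and lets every stage fuse into a single pass; the resulting elements are unchanged:

```clojure
;; mapv realizes an eager vector mid-thread, boxing the pipeline;
;; plain map keeps the chain contiguous and fusable by x>>:
(= (->> (range 100) (map inc) (mapv dec) (map inc))
   (->> (range 100) (map inc) (map dec) (map inc)))
;=> true
```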

Note: In addition to improved speed, transducers also provide improved memory efficiency over finite sequences. So x>> may lower your memory usage as well.

Available Transducers

These are the core functions that are available to use as transducers in an x>> thread-last:

| Available transducers |
| --- |
| take-nth, disj!, dissoc!, distinct, keep-indexed, random-sample, map-indexed, map, replace, drop, remove, cat, partition-all, interpose, mapcat, dedupe, drop-while, partition-by, take-while, take, keep, filter, halt-when |

=>> Auto Parallelization

injest provides a parallel version of x>> as well. =>> leverages Clojure's parallel fold reducer in order to execute stateless transducers over a Fork/Join pool. Remaining stateful transducers are comped and threaded just like x>>.

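A minimal usage sketch of =>> (illustrative; whether the parallel fold pays off depends on workload size and core count, and it assumes the injest dependency is on your classpath):

```clojure
(require '[injest.path :refer [=>>]])

;; Same shape as an x>> thread; the stateless map/filter stages are
;; candidates for the parallel fold, and the terminal (apply +) runs
;; as an ordinary threaded form:
(=>> (range 1000000)
     (map inc)
     (filter odd?)
     (apply +))
;=> 250000000000
```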
