Conduit
A streaming data library
Conduit is a framework for dealing with streaming data, such as reading raw bytes from a file, parsing a CSV response body from an HTTP request, or performing an action on all files in a directory tree. It standardizes various interfaces for streams of data, and allows a consistent interface for transforming, manipulating, and consuming that data.
Some of the reasons you'd like to use conduit are:
- Constant memory usage over large data
- Deterministic resource usage (e.g., promptly close file handles)
- Easily combine different data sources (HTTP, files) with data consumers (XML/CSV processors)
Want more motivation on why to use conduit? Check out this presentation on conduit. Feel free to ignore the Yesod section.
NOTE: As of March 2018, this document has been updated to be compatible with version 1.3 of conduit. This is available in Long Term Support (LTS) Haskell version 11 and up. For more information on changes between versions 1.2 and 1.3, see the changelog.
Table of Contents
- Synopsis
- Libraries
- Conduit as a bad list
- Interleaved effects
- Terminology and concepts
- Folds
- Transformations
- Monadic composition
- Primitives
- Evaluation strategy
- Resource allocation
- Chunked data
- ZipSink
- ZipSource
- ZipConduit
- Forced consumption
- FAQs
- More exercises
- Legacy syntax
- Further reading
Synopsis
Here are some basic examples of conduit usage; much more is to follow!

#!/usr/bin/env stack
-- stack script --resolver lts-12.21
import Conduit

main = do
    -- Pure operations: summing numbers.
    print $ runConduitPure $ yieldMany [1..10] .| sumC

    -- Exception safe file access: copy a file.
    writeFile "input.txt" "This is a test." -- create the source file
    runConduitRes $ sourceFileBS "input.txt" .| sinkFile "output.txt" -- actual copying
    readFile "output.txt" >>= putStrLn -- prove that it worked

    -- Perform transformations.
    print $ runConduitPure $ yieldMany [1..10] .| mapC (+ 1) .| sinkList
Libraries
There are a large number of packages relevant to conduit; to find them, search for conduit on the LTS Haskell package list page. In this tutorial, we're going to rely mainly on the conduit library itself, which provides a large number of common functions built in. There is also the conduit-extra library, which adds some common extra support, like GZIP (de)compression.
You can run the examples in this tutorial as Stack scripts.
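As a rough sketch of what conduit-extra adds, the script below compresses a ByteString with gzip and decompresses it again inside a single pipeline. It assumes the Data.Conduit.Zlib module from conduit-extra is available in your snapshot. The name roundTrip is made up for this illustration and is not part of any library.

```haskell
#!/usr/bin/env stack
-- stack script --resolver lts-12.21
import Conduit
import qualified Data.ByteString.Char8 as B8
import Data.Conduit.Zlib (gzip, ungzip)

-- Hypothetical helper: compress, then decompress, then concatenate
-- the resulting chunks back into one strict ByteString.
roundTrip :: B8.ByteString -> IO B8.ByteString
roundTrip input = runConduit
     $ yield input
    .| gzip
    .| ungzip
    .| foldC

main :: IO ()
main = do
    let original = B8.pack "This is a test."
    result <- roundTrip original
    -- Report whether the round trip preserved the bytes.
    print (result == original)
```

In a real program you would more likely place gzip or ungzip between a file source and a file sink; the in-memory round trip is used here only to keep the sketch self-contained.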
Conduit as a bad list
Let's start off by comparing conduit to normal lists. We'll be able to compare and contrast with functions you're already used to working with.
#!/usr/bin/env stack
-- stack script --resolver lts-12.21
{-# LANGUAGE ExtendedDefaultRules #-}
import Conduit

take10List :: IO ()
take10List = print
    $ take 10 [1..]

take10Conduit :: IO ()
take10Conduit = print $ runConduitPure
    $ yieldMany [1..] .| takeC 10 .| sinkList

main :: IO ()
main = do
    putStrLn "List version:"
    take10List
    putStrLn ""
    putStrLn "Conduit version:"
    take10Conduit
Our list function is pretty straightforward: create an infinite list from 1 and ascending, take the first 10 elements, and then print the list. The conduit version does the exact same thing, but:
- In order to convert the [1..] list into a conduit, we use the yieldMany function. (And note that, like lists, conduit has no problem dealing with infinite streams.)
- We're not just doing function composition, and therefore we need to use the .| composition operator. This combines multiple components of a conduit pipeline together.
- Instead of take, we use takeC. The Conduit module provides many functions matching common list functions, but appends a C to disambiguate the names. (If you'd prefer to use a qualified import, check out Data.Conduit.Combinators.)
- To consume all of our results back into a list, we use sinkList.
- We need to explicitly run our conduit pipeline to get a result from it. Since we're running a pure pipeline (no monadic effects), we can use runConduitPure.
- And finally, the data flows from left to right in the conduit composition, as opposed to right to left in normal function composition. There's nothing deep to this; it's just intended to make conduit feel more like common streaming abstractions from other places. For example, notice how similar the code above looks to piping in a Unix shell: ps | grep ghc | wc -l.
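For readers who prefer the qualified-import style mentioned above, here is a minimal sketch of the same pipeline written against Data.Conduit.Combinators, where the functions carry no C suffix. The name take10Qualified is invented for this example.

```haskell
#!/usr/bin/env stack
-- stack script --resolver lts-12.21
import Data.Conduit (runConduitPure, (.|))
import qualified Data.Conduit.Combinators as CC

-- Same pipeline as take10Conduit, but using qualified names.
-- The explicit [Int] result type removes the need for ExtendedDefaultRules.
take10Qualified :: [Int]
take10Qualified = runConduitPure
     $ CC.yieldMany [1..]
    .| CC.take 10
    .| CC.sinkList

main :: IO ()
main = print take10Qualified
```

Which style to use is a matter of taste: the suffixed names from the Conduit module avoid clashes with the Prelude, while the qualified names read closer to their list counterparts.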
Alright, so what we've established is that we can use conduit as a bad, inconvenient version of lists. Don't worry, we'll soon start to see cases where conduit far outshines lists, but we're not quite there yet. Let's build up a slightly more complex pipeline:
#!/usr/bin/env stack
-- stack script --resolver lts-12.21
{-# LANGUAGE ExtendedDefaultRules #-}
import Conduit

complicatedList :: IO ()
complicatedList = print
    $ takeWhile (< 18) $ map (* 2) $ take 10 [1..]

complicatedConduit :: IO ()
complicatedConduit = print $ runConduitPure
     $ yieldMany [1..]
    .| takeC 10
    .| mapC (* 2)
    .| takeWhileC (< 18)
    .| sinkList

main :: IO ()
main = do
    putStrLn "List version:"
    complicatedList
    putStrLn ""
    putStrLn "Conduit version:"
    complicatedConduit
Nothing more magical is going on here; we're just using more
functions. For our last bad-list example, let's move over from a pure
pipeline to one which performs some side effects. Instead of
printing the whole result list, let's use mapM_C to print each
value individually.
#!/usr/bin/env stack
-- stack script --resolver lts-12.21
{-# LANGUAGE ExtendedDefaultRules #-}
import Conduit

complicatedList :: IO ()
complicatedList = mapM_ print
    $ takeWhile (< 18) $ map (* 2) $ take 10 [1..]

complicatedConduit :: IO ()
complicatedConduit = runConduit
     $ yieldMany [1..]
    .| takeC 10
    .| mapC (* 2)
    .| takeWhileC (< 18)
    .| mapM_C print

main :: IO ()
main = do
    putStrLn "List version:"
    complicatedList
    putStrLn ""
    putStrLn "Conduit version:"
    complicatedConduit
For the list version, all we've done is add mapM_ at the
beginning. In the conduit version, we replaced print $ runConduitPure
with runConduit (since we're no longer generating a result to print,
and our pipeline now has effects), and replaced sinkList with
mapM_C print. We're no longer reconstructing a list at the end;
instead, we just stream the values one at a time into the print
function.
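To make the difference between the two runners a little more concrete, here is a small sketch, assuming a pipeline that is polymorphic in its monad. Such a pipeline can be run with runConduitPure to get a plain value, or with runConduit inside IO. The name doubledBelow18 is made up for this illustration.

```haskell
#!/usr/bin/env stack
-- stack script --resolver lts-12.21
import Conduit
import Data.Void (Void)

-- A complete pipeline with no effects of its own, so it works in any monad.
doubledBelow18 :: Monad m => ConduitT () Void m [Int]
doubledBelow18 =
       yieldMany [1..]
    .| takeC 10
    .| mapC (* 2)
    .| takeWhileC (< 18)
    .| sinkList

main :: IO ()
main = do
    -- Run purely: the result is an ordinary list.
    print (runConduitPure doubledBelow18)
    -- Run in IO: the result comes back as a monadic value.
    viaIO <- runConduit doubledBelow18
    print viaIO
```

Once a stage such as mapM_C print is added, the pipeline is tied to IO and only runConduit applies.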
Interleaved effects
Let's make things a bit more difficult for lists. We've played to
their strengths until now, having a pure series of functions composed,
and then only performing effects at the end (either print or mapM_ print). Suppose we have some new function:
magic :: Int -> IO Int
magic x = do
    putStrLn $ "I'm doing magic with " ++ show x
    return $ x * 2
And we want to use this in place of the map (* 2) that we were doing
before. Let's see how the list and conduit versions adapt:
#!/usr/bin/env stack
-- stack script --resolver lts-12.21
{-# LANGUAGE ExtendedDefaultRules #-}
import Conduit

magic :: Int -> IO Int
magic x = do
    putStrLn $ "I'm doing magic with " ++ show x
    return $ x * 2

magicalList :: IO ()
magicalList =
    mapM magic (take 10 [1..]) >>= mapM_ print . takeWhile (< 18)

magicalConduit :: IO ()
magicalConduit = runConduit
     $ yieldMany [1..]
    .| takeC 10
    .| mapMC magic
    .| takeWhileC (< 18)
    .| mapM_C print

main :: IO ()
main = do
    putStrLn "List version:"
    magicalList
    putStrLn ""
    putStrLn "Conduit version:"
    magicalConduit
Notice how different the list version looks: we needed to break out
>>= to allow us to have two different side-effecting actions (mapM magic and mapM_ print). Meanwhile, in conduit, all we did was
replace mapC (* 2) with mapMC magic. This is where we begin to see
the strength of conduit: it allows us to build up large pipelines of
components, and each of those components can be side-effecting!
However, we're not done with the difference yet. Try to guess what the output will be, and then ideally run it on your machine and see if you're correct. For those who won't be running it, here's the output:
List version:
I'm doing magic with 1
I'm doing magic with 2
I'm doing magic with 3
I'm doing magic with 4
I'm doing magic with 5
I'm doing magic with 6
I'm doing magic with 7
I'm doing magic with 8
I'm doing magic with 9
I'm doing magic with 10
2
4
6
8
10
12
14
16
Conduit version:
I'm doing magic with 1
2
I'm doing magic with 2
4
I'm doing magic with 3
6
I'm doing magic with 4
8
I'm doing magic with 5
10
I'm doing magic with 6
12
I'm doing magic with 7
14
I'm doing magic with 8
16
I'm doing magic with 9
In the list version, we apply the magic function to all 10 elements
in the initial list, printing all the output at once and generating a
new list. We then use takeWhile on this new list and exclude the
values 18 and 20. Finally, we print out each element in our new
8-value list. This has a number of downsides:
- We had to force all 10 items of the list into memory at once. For 10 items, not a big deal. But if we were dealing with massive amou
