Conduit is a framework for dealing with streaming data, such as reading raw bytes from a file, parsing a CSV response body from an HTTP request, or performing an action on all files in a directory tree. It standardizes various interfaces for streams of data, and allows a consistent interface for transforming, manipulating, and consuming that data.

Some reasons you might want to use conduit:

Constant memory usage over large data
Deterministic resource usage (e.g., promptly close file handles)
Easily combine different data sources (HTTP, files) with data consumers (XML/CSV processors)

Want more motivation for using conduit? Check out this presentation on conduit. Feel free to ignore the Yesod section.

NOTE As of March 2018, this document has been updated to be compatible with version 1.3 of conduit. This is available in Long Term Support (LTS) Haskell version 11 and up. For more information on changes between versions 1.2 and 1.3, see the changelog.

Synopsis
Libraries
Conduit as a bad list
Interleaved effects
Terminology and concepts
Folds
Transformations
Monadic composition
Primitives
Evaluation strategy
Resource allocation
Chunked data
ZipSink
ZipSource
ZipConduit
Forced consumption
FAQs
More exercises
Legacy syntax
Further reading

Synopsis

Basic examples of conduit usage; much more to follow!

#!/usr/bin/env stack
-- stack script --resolver lts-12.21
import Conduit

main :: IO ()
main = do
    -- Pure operations: summing numbers.
    print $ runConduitPure $ yieldMany [1..10] .| sumC

    -- Exception safe file access: copy a file.
    writeFile "input.txt" "This is a test." -- create the source file
    runConduitRes $ sourceFileBS "input.txt" .| sinkFile "output.txt" -- actual copying
    readFile "output.txt" >>= putStrLn -- prove that it worked

    -- Perform transformations.
    print $ runConduitPure $ yieldMany [1..10] .| mapC (+ 1) .| sinkList

Libraries

There are a large number of packages relevant to conduit; just search for conduit on the LTS Haskell package list page. In this tutorial, we're going to rely mainly on the conduit library itself, which provides a large number of common functions built-in. There is also the conduit-extra library, which adds some common extra support, like GZIP (de)compression.

You can run the examples in this tutorial as Stack scripts.
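
As a small illustration of the GZIP support mentioned above, here is a minimal sketch that compresses and then decompresses a value in memory. It assumes the conduit-extra and bytestring packages are available; gzip and ungzip come from Data.Conduit.Zlib, while roundTrip is a hypothetical helper name used only for this example.

```haskell
import Conduit
import Data.Conduit.Zlib (gzip, ungzip)
import qualified Data.ByteString.Char8 as B8

-- Hypothetical helper: compress the input, decompress it again,
-- and concatenate the resulting chunks back into one ByteString.
roundTrip :: B8.ByteString -> IO B8.ByteString
roundTrip input = runConduit
     $ yield input
    .| gzip
    .| ungzip
    .| foldC

main :: IO ()
main = do
    output <- roundTrip (B8.pack "This is a test.")
    B8.putStrLn output
```

In a real program the source and sink would more likely be files, for example sourceFile and sinkFile run with runConduitRes; the in-memory version just keeps the sketch self-contained.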

Conduit as a bad list

Let's start off by comparing conduit to normal lists. We'll be able to compare and contrast with functions you're already used to working with.

#!/usr/bin/env stack
-- stack script --resolver lts-12.21
{-# LANGUAGE ExtendedDefaultRules #-}
import Conduit

take10List :: IO ()
take10List = print
    $ take 10 [1..]

take10Conduit :: IO ()
take10Conduit = print $ runConduitPure
    $ yieldMany [1..] .| takeC 10 .| sinkList

main :: IO ()
main = do
    putStrLn "List version:"
    take10List
    putStrLn ""
    putStrLn "Conduit version:"
    take10Conduit

Our list function is pretty straightforward: create an infinite list from 1 and ascending, take the first 10 elements, and then print the list. The conduit version does the exact same thing, but:

In order to convert the [1..] list into a conduit, we use the yieldMany function. (And note that, like lists, conduit has no problem dealing with infinite streams.)
We're not just doing function composition, and therefore we need to use the .| composition operator. This combines multiple components of a conduit pipeline together.
Instead of take, we use takeC. The Conduit module provides many functions matching common list functions, but appends a C to disambiguate the names. (If you'd prefer to use a qualified import, check out Data.Conduit.Combinators.)
To consume all of our results back into a list, we use sinkList.
We need to explicitly run our conduit pipeline to get a result from it. Since we're running a pure pipeline (no monadic effects), we can use runConduitPure.
And finally, the data flows from left to right in the conduit composition, as opposed to right to left in normal function composition. There's nothing deep to this; it's just intended to make conduit feel more like common streaming abstractions from other places. For example, notice how similar the code above looks to piping in a Unix shell: ps | grep ghc | wc -l.
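
For readers who prefer the qualified-import style mentioned in the list above, the same pipeline might look like the following sketch. The alias CC and the name take10Qualified are choices made for this example, not anything the library requires.

```haskell
import Data.Conduit (runConduitPure, (.|))
import qualified Data.Conduit.Combinators as CC

-- Same pipeline as take10Conduit, written with qualified names
-- instead of the C-suffixed ones exported by the Conduit module.
take10Qualified :: [Int]
take10Qualified = runConduitPure
    $ CC.yieldMany [1..] .| CC.take 10 .| CC.sinkList

main :: IO ()
main = print take10Qualified
```

The explicit [Int] result type also fixes the element type, so this version does not rely on type defaulting.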

Alright, so what we've established is that we can use conduit as a bad, inconvenient version of lists. Don't worry, we'll soon start to see cases where conduit far outshines lists, but we're not quite there yet. Let's build up a slightly more complex pipeline:

#!/usr/bin/env stack
-- stack script --resolver lts-12.21
{-# LANGUAGE ExtendedDefaultRules #-}
import Conduit

complicatedList :: IO ()
complicatedList = print
    $ takeWhile (< 18) $ map (* 2) $ take 10 [1..]

complicatedConduit :: IO ()
complicatedConduit = print $ runConduitPure
     $ yieldMany [1..]
    .| takeC 10
    .| mapC (* 2)
    .| takeWhileC (< 18)
    .| sinkList

main :: IO ()
main = do
    putStrLn "List version:"
    complicatedList
    putStrLn ""
    putStrLn "Conduit version:"
    complicatedConduit

Nothing more magical is going on; we're just using more functions. For our last bad-list example, let's move over from a pure pipeline to one which performs some side effects. Instead of printing the whole result list, let's use mapM_C to print each value individually.

#!/usr/bin/env stack
-- stack script --resolver lts-12.21
{-# LANGUAGE ExtendedDefaultRules #-}
import Conduit

complicatedList :: IO ()
complicatedList = mapM_ print
    $ takeWhile (< 18) $ map (* 2) $ take 10 [1..]

complicatedConduit :: IO ()
complicatedConduit = runConduit
     $ yieldMany [1..]
    .| takeC 10
    .| mapC (* 2)
    .| takeWhileC (< 18)
    .| mapM_C print

main :: IO ()
main = do
    putStrLn "List version:"
    complicatedList
    putStrLn ""
    putStrLn "Conduit version:"
    complicatedConduit

For the list version, all we've done is add mapM_ at the beginning. In the conduit version, we replaced print $ runConduitPure with runConduit (since we're no longer generating a result to print, and our pipeline now has effects), and replaced sinkList with mapM_C print. We're no longer reconstructing a list at the end; instead, we're just streaming the values one at a time into the print function.
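
To make the relationship between the two runners concrete, here is a sketch in which the shared front of the pipeline is written once and then finished in two ways. The names doubledBelow18, collected, and printed are hypothetical; the point is only that a fragment which is polymorphic in its monad can be run with either runConduitPure or runConduit.

```haskell
import Conduit

-- Hypothetical helper: the shared part of the pipeline. It is
-- polymorphic in the monad m, so it works purely or in IO.
doubledBelow18 :: Monad m => ConduitT () Int m ()
doubledBelow18 =
       yieldMany [1..]
    .| takeC 10
    .| mapC (* 2)
    .| takeWhileC (< 18)

-- Pure: collect the stream into a list.
collected :: [Int]
collected = runConduitPure $ doubledBelow18 .| sinkList

-- Effectful: stream each value into print, building no list.
printed :: IO ()
printed = runConduit $ doubledBelow18 .| mapM_C print

main :: IO ()
main = do
    print collected
    printed
```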

Interleaved effects

Let's make things a bit more difficult for lists. We've played to their strengths until now, having a pure series of functions composed, and then only performing effects at the end (either print or mapM_ print). Suppose we have some new function:

magic :: Int -> IO Int
magic x = do
    putStrLn $ "I'm doing magic with " ++ show x
    return $ x * 2

And we want to use this in place of the map (* 2) that we were doing before. Let's see how the list and conduit versions adapt:

#!/usr/bin/env stack
-- stack script --resolver lts-12.21
{-# LANGUAGE ExtendedDefaultRules #-}
import Conduit

magic :: Int -> IO Int
magic x = do
    putStrLn $ "I'm doing magic with " ++ show x
    return $ x * 2

magicalList :: IO ()
magicalList =
    mapM magic (take 10 [1..]) >>= mapM_ print . takeWhile (< 18)

magicalConduit :: IO ()
magicalConduit = runConduit
     $ yieldMany [1..]
    .| takeC 10
    .| mapMC magic
    .| takeWhileC (< 18)
    .| mapM_C print

main :: IO ()
main = do
    putStrLn "List version:"
    magicalList
    putStrLn ""
    putStrLn "Conduit version:"
    magicalConduit

Notice how different the list version looks: we needed to break out >>= to allow us to have two different side-effecting actions (mapM magic and mapM_ print). Meanwhile, in conduit, all we did was replace mapC (* 2) with mapMC magic. This is where we begin to see the strength of conduit: it allows us to build up large pipelines of components, and each of those components can be side-effecting!

However, we're not done with the difference yet. Try to guess what the output will be, and then ideally run it on your machine and see if you're correct. For those who won't be running it, here's the output:

List version:
I'm doing magic with 1
I'm doing magic with 2
I'm doing magic with 3
I'm doing magic with 4
I'm doing magic with 5
I'm doing magic with 6
I'm doing magic with 7
I'm doing magic with 8
I'm doing magic with 9
I'm doing magic with 10
2
4
6
8
10
12
14
16

Conduit version:
I'm doing magic with 1
2
I'm doing magic with 2
4
I'm doing magic with 3
6
I'm doing magic with 4
8
I'm doing magic with 5
10
I'm doing magic with 6
12
I'm doing magic with 7
14
I'm doing magic with 8
16
I'm doing magic with 9
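
The difference in the output above can also be checked without reading the console. The sketch below swaps magic for a hypothetical countingMagic that increments a counter instead of printing, then reports how many times each version called it along with the values that survived takeWhile. The names countingMagic, listRun, and conduitRun are assumptions made for this example.

```haskell
import Conduit
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for magic: counts calls instead of printing.
countingMagic :: IORef Int -> Int -> IO Int
countingMagic counter x = do
    modifyIORef' counter (+ 1)
    return $ x * 2

-- List version: returns (number of calls, surviving values).
listRun :: IO (Int, [Int])
listRun = do
    counter <- newIORef 0
    doubled <- mapM (countingMagic counter) (take 10 [1..])
    calls <- readIORef counter
    return (calls, takeWhile (< 18) doubled)

-- Conduit version: returns (number of calls, surviving values).
conduitRun :: IO (Int, [Int])
conduitRun = do
    counter <- newIORef 0
    kept <- runConduit
         $ yieldMany [1..]
        .| takeC 10
        .| mapMC (countingMagic counter)
        .| takeWhileC (< 18)
        .| sinkList
    calls <- readIORef counter
    return (calls, kept)

main :: IO ()
main = do
    listRun >>= print     -- (10,[2,4,6,8,10,12,14,16])
    conduitRun >>= print  -- (9,[2,4,6,8,10,12,14,16])
```

Both versions keep the same eight values, but the list version calls the function for all 10 inputs while the conduit version stops after the ninth.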

In the list version, we apply the magic function to all 10 elements in the initial list, printing all the output at once and generating a new list. We then use takeWhile on this new list and exclude the values 18 and 20. Finally, we print out each element in our new 8-value list. This has a number of downsides:

We had to force all 10 items of the list into memory at once. For 10 items, not a big deal. But if we were dealing with massive amou
