Puma: A Ruby Web Server Built For Parallelism

Puma is a simple, fast, multi-threaded, and highly parallel HTTP 1.1 server for Ruby/Rack applications.

Built For Speed & Parallelism

Puma is a server for Rack-powered HTTP applications written in Ruby. It is:

Multi-threaded. Each request is served in a separate thread. This helps you serve more requests per second with less memory use.
Multi-process. "Pre-forks" in cluster mode, using less memory per-process thanks to copy-on-write memory.
Standalone. With SSL support, zero-downtime rolling restarts and a built-in request bufferer, you can deploy Puma without any reverse proxy.
Battle-tested. Our HTTP parser is inherited from Mongrel and has over 15 years of production use. Puma is currently the most popular Ruby webserver, and is the default server for Ruby on Rails.

Originally designed as a server for Rubinius, Puma also works well with Ruby (MRI) and JRuby.

On MRI, there is a Global VM Lock (GVL) that ensures only one thread can run Ruby code at a time. But if you're doing a lot of blocking IO (such as HTTP calls to external APIs like Twitter), Puma still improves MRI's throughput by allowing IO waiting to be done in parallel. Truly parallel Ruby implementations (TruffleRuby, JRuby) don't have this limitation.
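The effect of overlapping IO waits can be seen in a small standalone sketch. It does not use Puma; `fake_io_call` is a hypothetical stand-in for a blocking external call, and `sleep` is assumed to behave like blocking IO in that it releases the GVL while waiting.

```ruby
require "benchmark"

# Hypothetical stand-in for a blocking call such as an HTTP request to an
# external API. `sleep` releases the GVL while it waits.
def fake_io_call(seconds)
  sleep(seconds)
  :done
end

calls = 4
wait = 0.2

# One call after another: the waits add up.
sequential = Benchmark.realtime do
  calls.times { fake_io_call(wait) }
end

# One thread per call: the waits overlap, even on MRI.
threaded = Benchmark.realtime do
  calls.times.map { Thread.new { fake_io_call(wait) } }.each(&:join)
end

puts format("sequential: %.2fs, threaded: %.2fs", sequential, threaded)
```

On MRI the threaded run should finish in roughly the time of a single wait, because the threads spend their time waiting rather than running Ruby code. CPU-bound work would not show the same gain.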

Quick Start

$ gem install puma
$ puma

Without arguments, puma will look for a rackup (.ru) file called config.ru in the working directory.
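For readers who have not written a rackup file before, here is a minimal sketch of the kind of Rack application a config.ru would hand to the server. The name `hello_app` and the response text are illustrative assumptions. In an actual config.ru the file would end with `run hello_app`; the sketch calls the app directly instead so that it runs as plain Ruby.

```ruby
# A Rack application is any object that responds to `call(env)` and returns
# a [status, headers, body] triple. `hello_app` is a hypothetical example.
hello_app = lambda do |env|
  text = "Hello from #{env.fetch('PATH_INFO', '/')}\n"
  [200, { "content-type" => "text/plain" }, [text]]
end

# In an actual config.ru, the last line would be:
#   run hello_app

# Calling the app with a hand-built env shows the response triple.
status, headers, body = hello_app.call({ "REQUEST_METHOD" => "GET", "PATH_INFO" => "/ping" })
puts status
puts headers["content-type"]
puts body.join
```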

SSL Connection Support

Puma will install/compile with support for SSL sockets, assuming OpenSSL development files are installed on the system.

If the system does not have OpenSSL development files installed, Puma will still install/compile, but it will not allow SSL connections.

Frameworks

Rails

Puma is the default server for Rails, included in the generated Gemfile.

Start your server with the rails command:

$ rails server

Many configuration options and Puma features are not available when using rails server. It is recommended that you use Puma's executable instead:

$ bundle exec puma

Sinatra

You can run your Sinatra application with Puma from the command line like this:

$ ruby app.rb -s Puma

To configure Puma with a config file such as puma.rb, however, you need to use the puma executable. To do this, add a rackup file to your Sinatra app:

# config.ru
require './app'
run Sinatra::Application

You can then start your application using:

$ bundle exec puma

Configuration

Puma provides numerous options. Consult puma -h (or puma --help) for a full list of CLI options, or see Puma::DSL or dsl.rb.

You can also find several configuration examples as part of the test suite.

For debugging purposes, you can set the environment variable PUMA_LOG_CONFIG with a value and the loaded configuration will be printed as part of the boot process.

Thread Pool

Puma uses a thread pool. You can set the minimum and maximum number of threads that are available in the pool with the -t (or --threads) flag:

$ puma -t 8:32

Puma will automatically scale the number of threads, from the minimum until it caps out at the maximum, based on how much traffic is present. The current default is 0:16, or 0:5 on MRI. Feel free to experiment, but be careful not to set the maximum number of threads too high, as you may exhaust resources on the system (or cause contention for the Global VM Lock, when using MRI).

Be aware that Puma also creates threads of its own for internal purposes (e.g. handling slow clients). So, even if you specify -t 1:1, expect around 7 threads to be created in your application.

Cluster mode

Puma also offers "cluster mode". Cluster mode forks workers from a master process. Each child process still has its own thread pool. You can tune the number of workers with the -w (or --workers) flag:

$ puma -t 8:32 -w 3

Or with the WEB_CONCURRENCY environment variable:

$ WEB_CONCURRENCY=3 puma -t 8:32

When using a config file, most applications can simply set workers :auto (requires the concurrent-ruby gem) to match the number of worker processes to the available processors:

# config/puma.rb
workers :auto

See workers :auto gotchas.

Note that threads are still used in cluster mode, and the -t thread flag setting is per worker, so -w 2 -t 16:16 will spawn 32 threads in total, with 16 in each worker process.
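The arithmetic can be written out as a small sketch. Both helpers are hypothetical and exist only for illustration; they are not part of Puma, and the result counts request-serving threads only, not the extra threads Puma creates internally.

```ruby
# Hypothetical helper: parses a "min:max" thread setting such as the value
# passed to -t. Not part of Puma.
def parse_threads(setting)
  min, max = setting.split(":").map { |n| Integer(n) }
  [min, max]
end

# Upper bound on request-serving threads across all workers:
# each worker has its own pool, so the per-worker maximum is multiplied.
def max_request_threads(workers:, threads:)
  _min, max = parse_threads(threads)
  workers * max
end

total = max_request_threads(workers: 2, threads: "16:16")
puts total # 2 workers x 16 threads each
```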

If workers is set to :auto, or the WEB_CONCURRENCY environment variable is set to "auto", and the concurrent-ruby gem is available in your application, Puma will set the worker process count to the number of available processors.
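As a rough illustration of that rule, the sketch below resolves a WEB_CONCURRENCY-style value to a worker count. It is not Puma's implementation: `resolve_worker_count` is a hypothetical helper, and the standard library's `Etc.nprocessors` is used as a stand-in for the processor count that Puma obtains through concurrent-ruby, so the two may report different numbers on some systems.

```ruby
require "etc"

# Hypothetical helper: resolves a WEB_CONCURRENCY-style value to a worker
# count. Etc.nprocessors is a standard-library stand-in for the processor
# count Puma gets via concurrent-ruby; this is not Puma's own code.
def resolve_worker_count(value, processors: Etc.nprocessors)
  return processors if value.to_s == "auto"

  Integer(value)
end

puts resolve_worker_count("3")                    # explicit count
puts resolve_worker_count("auto", processors: 8)  # pretend the machine has 8
puts resolve_worker_count(:auto)                  # whatever this machine reports
```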

For an in-depth discussion of the tradeoffs of thread and process count settings, see our docs.

In cluster mode, Puma can "preload" your application. This loads all the application code prior to forking. Preloading reduces total memory usage of your application via an operating system feature called copy-on-write.

If the number of workers is greater than 1 (and --prune-bundler has not been specified), preloading will be enabled by default. Otherwise, you can use the --preload flag from the command line:

$ puma -w 3 --preload

Or, if you're using a configuration file, you can use the preload_app! method:

# config/puma.rb
workers 3
preload_app!

Preloading can’t be used with phased restart, since phased restart kills and restarts workers one-by-one, and preloading copies the code of master into the workers.
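The load-once, fork-many idea behind preloading can be sketched in plain Ruby on a platform that supports `fork`. Nothing here is Puma code: `load_app` and its return value are hypothetical stand-ins for booting an application. The sketch shows only that forked children see state loaded before the fork without loading it again; it does not measure copy-on-write memory savings.

```ruby
# Stand-in for an expensive application boot; the counter records how
# often it runs. The returned hash is hypothetical application state.
load_count = 0
load_app = lambda do
  load_count += 1
  { routes: %w[/ /ping /health] }
end

# "Preload": runs once, in the parent, before any fork.
app = load_app.call

reader, writer = IO.pipe
pids = 3.times.map do |index|
  fork do
    reader.close
    # The child sees the already-loaded app through memory inherited at fork.
    writer.puts "worker #{index}: #{app[:routes].size} routes, loads=#{load_count}"
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end
end

writer.close
pids.each { |pid| Process.wait(pid) }
reports = reader.read.lines.map(&:chomp).sort
reader.close
puts reports
```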

Cluster mode hooks

When using cluster mode, Puma's configuration DSL provides before_fork, before_worker_boot, and after_worker_shutdown hooks to run code when the master process forks, when the child workers are booted, and after each child worker exits, respectively.

It is recommended to use these hooks with preload_app!, otherwise constants loaded by your application (such as Rails) will not be available inside the hooks.

# config/puma.rb
before_fork do
  # Add code to run inside the Puma master process before it forks a worker child.
end

before_worker_boot do
  # Add code to run inside the Puma worker process after forking.
end

after_worker_shutdown do |worker_handle|
  # Add code to run inside the Puma master process after a worker exits.
  # `worker_handle.process_status` can be used to get the `Process::Status`
  # of the exited worker.
end

In addition, there are before_refork and after_refork hooks, which are used only in fork_worker mode, when the worker 0 child process forks a grandchild worker:

before_refork do
  # Used only when fork_worker mode is enabled. Add code to run inside the Puma worker 0
  # child process before it forks a grandchild worker.
end

after_refork do
  # Used only when fork_worker mode is enabled. Add code to run inside the Puma worker 0
  # child process after it forks a grandchild worker.
end

Importantly, note the following considerations when Ruby forks a child process:

File descriptors such as network sockets are copied from the parent to the forked child process. Dual-use of the same sockets by parent and child will result in I/O conflicts such as SocketError, Errno::EPIPE, and EOFError.
Background Ruby threads, including threads used by various third-party gems for connection monitoring, etc., are not copied to the child process. Often this does not cause immediate problems until a third-party connection goes down, at which point there will be no supervisor to reconnect it.

Therefore, we recommend the following:

1. If possible, do not establish any socket connections (HTTP, database connections, etc.) inside Puma's master process when booting.
2. If (1) is not possible, use before_fork and before_refork to disconnect the parent's socket connections when forking, so that they are not accidentally copied to the child process.
3. Use before_worker_boot to restart any background threads on the forked child.
4. Use after_refork to restart any background threads on the parent.
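The point about background threads can be checked with a short standalone sketch, on a platform that supports `fork`. It does not use Puma; the `monitor` thread is a hypothetical stand-in for a gem's connection-monitoring thread.

```ruby
# Hypothetical background thread, standing in for a gem's connection monitor.
monitor = Thread.new { loop { sleep 0.1 } }

reader, writer = IO.pipe

pid = fork do
  reader.close
  # Only the thread that called fork survives in the child, so the
  # inherited Thread object refers to a thread that is no longer running.
  writer.puts(monitor.alive? ? "alive" : "dead")
  writer.close
  exit!(0) # skip at_exit handlers inherited from the parent
end

writer.close
child_report = reader.read.strip
reader.close
Process.wait(pid)

parent_report = monitor.alive? ? "alive" : "dead"
monitor.kill

puts "parent: #{parent_report}, child: #{child_report}"
```

The child reports the thread as dead while the parent still sees it running, which is why any such thread has to be started again in the worker, for example from before_worker_boot.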

Master process lifecycle hooks

Puma's configuration DSL provides master process lifecycle hooks after_booted, before_restart, and after_stopped which may be used to specify code blocks to run on each event:

# config/puma.rb
after_booted do
  # Add code to run in the Puma master process after it boots,
  # and also after a phased restart completes.
end

before_restart do
  # Add code to run in the Puma master process when it receives
  # a restart command but before it restarts.
end

after_stopped do
  # Add code to run in the Puma master process when it receives
  # a stop command but before it shuts down.
end

Error handling

If Puma encounters an error outside of the context of your application, it will respond with a 400/500 and a simple textual error message (see Puma::Server#lowlevel_error or server.rb). You can specify custom behavior for this scenario. For example, you can report the error to your third-party error-tracking service (in this
