Papa
A Python library for creating sockets and launching processes from a stable parent process
Summary
papa is a process kernel. It contains both a client library and a server component for creating sockets and launching processes from a stable parent process.
Dependencies
Papa has no external dependencies, and it never will! It has been tested under the following Python versions:
- 2.6
- 2.7
- 3.2
- 3.3
- 3.4
Installation
$> pip install papa
Purpose
Sometimes you want to be able to start a process and have it survive on its own, but you still want to be able to capture the output. You could daemonize it and pipe the output to files, but that is a pain and lacks flexibility when it comes to handling the output.
Process managers such as circus and supervisor are very good for starting and stopping processes, and for ensuring that they are automatically restarted when they die. However, if you need to restart the process manager, all of their managed processes must be brought down as well. In this day of zero downtime, that is no longer okay.
Papa is a process kernel. It has extremely limited functionality and it has zero external dependencies. If I've done my job right, you should never need to upgrade the papa package. There will probably be a few bug fixes before it is really "done", but the design goal was to create something that did NOT do everything, but only did the bare minimum required. The big process managers can add the remaining features.
Papa has 3 types of things it manages:
- Sockets
- Values
- Processes
Here is what papa does:
- Create sockets and close sockets
- Set, get and clear named values
- Start processes and capture their stdout/stderr
- Allow you to retrieve the stdout/stderr of the processes started by papa
- Pass socket file descriptors and port numbers to processes as they start
Here is what it does NOT do:
- Stop processes
- Send signals to processes
- Restart processes
- Communicate with processes in any way other than to capture their output
Sockets
By managing sockets, papa can manage interprocess communication. Just create a socket in papa and then pass the file descriptor to your process to use it. See the Circus docs for a very good description of why this is so useful.
Papa can create Unix, INET and INET6 sockets. By default it will create an INET TCP socket on an OS-assigned port.
You can pass either the file descriptor (fileno) or the port of a socket to a process by including a pattern like this in the process arguments:
$(socket.my_awesome_socket_name.fileno)
$(socket.my_awesome_socket_name.port)
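To make the effect of these patterns concrete, here is a minimal sketch of the kind of substitution described above. It is an illustration only, not papa's implementation: the `SOCKETS` dict, the fileno and port values in it, and the `expand_socket_patterns` helper are all hypothetical.

```python
import re

# Hypothetical stand-in for the socket details the kernel tracks.
SOCKETS = {"my_awesome_socket_name": {"fileno": 7, "port": 8080}}

# Matches $(socket.<name>.fileno) or $(socket.<name>.port).
# The name itself may contain dots, e.g. "circus.uwsgi".
_PATTERN = re.compile(r"\$\(socket\.(?P<name>.+?)\.(?P<field>fileno|port)\)")


def expand_socket_patterns(arg, sockets):
    """Replace every socket pattern in one argument string."""
    def _replace(match):
        details = sockets[match.group("name")]  # KeyError for an unknown socket
        return str(details[match.group("field")])
    return _PATTERN.sub(_replace, arg)


args = ("--socket", "fd://$(socket.my_awesome_socket_name.fileno)")
expanded = tuple(expand_socket_patterns(a, SOCKETS) for a in args)
```

With the sample data above, the second argument becomes `fd://7`, which is the form a server such as uwsgi expects when it is handed an already-open file descriptor.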
Values
Papa has a very simple name/value pair storage. This works much like environment variables. The values must be text, so if you want to store a complex structure, you will need to encode and decode with something like the JSON module.
The primary purpose of this facility is to store state information for your process that will survive between restarts. For instance, a process manager can store the current state that all of its managed processes are supposed to be in. Then if the process manager is restarted, it can restore its internal state, then go about checking to see if anything on the machine has changed. Are all processes that should be running actually running?
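Because values must be text, a structure such as a process manager's desired state has to be serialized before it is stored. The sketch below shows the round trip with the standard json module. A plain dict stands in for papa's value storage so the example stays self-contained, and the key name `circus.state` and the state layout are made up for illustration.

```python
import json

# Stand-in for papa's name/value storage (assumption: a dict of text values).
value_store = {}

# Hypothetical state a process manager might want to survive a restart.
desired_state = {
    "circus.waitress.1": "running",
    "circus.waitress.2": "stopped",
}

# Encode to text before storing.
value_store["circus.state"] = json.dumps(desired_state, sort_keys=True)

# After a restart, decode the text back into a structure.
restored = json.loads(value_store["circus.state"])
```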
Processes
Processes can be started with or without output management. You can specify a maximum size for output to be cached. Each started process has a management thread in the Papa kernel watching its state and capturing output if necessary.
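The idea of a watcher thread feeding a size-limited output cache can be sketched as follows. This is not papa's code: the `OutputCache` class, the `watch` function and the drop-oldest eviction policy are assumptions chosen for illustration, and an in-memory stream stands in for a real process pipe.

```python
import io
import threading
from collections import deque


class OutputCache:
    """Keep roughly max_bytes of output, dropping the oldest chunks first."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self._chunks = deque()
        self._size = 0
        self._lock = threading.Lock()

    def append(self, chunk):
        with self._lock:
            self._chunks.append(chunk)
            self._size += len(chunk)
            # Evict whole chunks from the front until within the limit,
            # but always keep the newest chunk.
            while self._size > self.max_bytes and len(self._chunks) > 1:
                self._size -= len(self._chunks.popleft())

    def read(self):
        with self._lock:
            return b"".join(self._chunks)


def watch(stream, cache):
    """Read lines until end of stream, caching each one."""
    for line in iter(stream.readline, b""):
        cache.append(line)


# An in-memory stream stands in for a child process's stdout pipe.
stream = io.BytesIO(b"line 1\nline 2\nline 3\n")
cache = OutputCache(max_bytes=14)
watcher = threading.Thread(target=watch, args=(stream, cache))
watcher.start()
watcher.join()
```

In a real setting the stream would be the stdout pipe of a `subprocess.Popen` object, and the thread would keep running for the lifetime of the process.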
A Note on Naming (Namespacing)
Sockets, values and processes all have unique names. A name can only represent one item per class. So you could have an "aack" socket, an "aack" value and an "aack" process, but you cannot have two "aack" processes.
All of the monitoring commands support a final asterisk as a wildcard. So you can get a list of sockets whose names match "uwsgi*" and you would get any socket that starts with "uwsgi".
One good naming scheme is to prefix all names with the name of your own application. So, for instance, the Circus process manager can prefix all names with "circus." and the Supervisor process manager can prefix all names with "supervisor.". If you write your own simple process manager, just prefix it with "tweeter." or "facebooklet." or whatever your project is called.
If you need to have multiple copies of something, put a number after a dot
for each of those as well. For instance, if you are starting 3 waitress
instances in circus, call them circus.waitress.1, circus.waitress.2, and
circus.waitress.3. That way you can query for all processes named circus.*
to see all processes managed by circus, or query for circus.waitress.* to
see all waitress processes managed by circus.
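The naming scheme and the trailing-asterisk wildcard can be sketched in a few lines. The `name_matches` and `select` helpers below are hypothetical and only mimic the matching rule described above; they are not part of papa.

```python
def name_matches(name, pattern):
    """Return True if name matches pattern; a final '*' is a prefix wildcard."""
    if pattern.endswith("*"):
        return name.startswith(pattern[:-1])
    return name == pattern


def select(names, *patterns):
    """Return the names matching any pattern; no patterns selects everything."""
    if not patterns:
        return sorted(names)
    return sorted(n for n in names if any(name_matches(n, p) for p in patterns))


# Three numbered waitress instances plus two other names.
names = ["circus.waitress.%d" % i for i in (1, 2, 3)]
names += ["circus.uwsgi", "supervisor.nginx"]

waitress_only = select(names, "circus.waitress.*")
all_circus = select(names, "circus.*")
```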
Starting the kernel
There are two ways to start the kernel. You can run it as a process, or you can just try to access it from the client library and allow it to autostart. The client library uses a lock to ensure that multiple threads do not start the server at the same time, but there is currently no protection against multiple processes doing so.
By default, the papa kernel process will communicate over port 20202. You can change this by specifying a different port number or a path. By specifying a path, a Unix socket will be used instead.
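The difference between a port and a path comes down to which socket family is used for the connection. The sketch below shows that distinction with the standard socket module. The papa client library handles this for you; the `connect` helper and its parameters are hypothetical and exist only to illustrate the idea.

```python
import socket

DEFAULT_PORT = 20202  # the default kernel port mentioned above


def connect(port_or_path=DEFAULT_PORT, host="localhost", timeout=2.0):
    """Open a stream connection: an int means a TCP port, a str means a Unix socket path."""
    if isinstance(port_or_path, int):
        return socket.create_connection((host, port_or_path), timeout=timeout)
    # AF_UNIX is only available on Unix-like systems.
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(port_or_path)
    except OSError:
        sock.close()
        raise
    return sock
```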
If you are going to be creating papa client instances in many places in your
code, you may want to just call papa.set_default_port or papa.set_default_path
once when your application is starting and then just instantiate the Papa object
with no parameters.
Telnet interface
Papa has been designed so that you can communicate with the process kernel entirely without code. Just start the Papa server, then do this:
telnet localhost 20202
You should get a welcome message and a prompt. Type "help" to get help. Type "help process" to get help on the process command.
The most useful commands from a monitoring standpoint are:
- list sockets
- list processes
- list values
All of these can be used with no arguments, or can be followed by a list of names, including wildcards. For instance, to see all of the values in the circus and supervisor namespaces, do this:
list values circus.* supervisor.*
You can abbreviate every command as short as you like. So "l p" means "list processes" and "h l p" means "help list processes".
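Abbreviation of this kind is usually done by prefix matching. The sketch below shows one way it could work, assuming each word is expanded to the single command word that starts with it. It covers only the commands named in this section, does not handle name arguments, and how the real kernel resolves ambiguous prefixes is not covered here; `complete` and `expand` are hypothetical helpers.

```python
# Only the commands named in this section; the real kernel has more.
SUBCOMMANDS = {"list": ("sockets", "processes", "values")}
TOP_LEVEL = ("help",) + tuple(SUBCOMMANDS)


def complete(word, candidates):
    """Expand a prefix to the single candidate it identifies."""
    matches = [c for c in candidates if c.startswith(word)]
    if len(matches) != 1:
        raise ValueError("unknown or ambiguous word: %r" % word)
    return matches[0]


def expand(line):
    """Expand every abbreviated command word in a line."""
    expanded = []
    candidates = TOP_LEVEL
    for word in line.split():
        full = complete(word, candidates)
        expanded.append(full)
        if full == "help":
            # help is followed by the command to describe
            candidates = tuple(SUBCOMMANDS)
        else:
            candidates = SUBCOMMANDS.get(full, ())
    return " ".join(expanded)
```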
Creating a Connection
You can create either long-lived or short-lived connections to the Papa kernel. If you want to have a long-lived connection, just create a Papa object to connect and close it when done, like this:
import os
import sys

from papa import Papa


class MyObject(object):
    def __init__(self):
        # Open a long-lived connection to the Papa kernel
        self.papa = Papa()

    def start_stuff(self):
        self.papa.make_socket('uwsgi')
        self.papa.make_process('uwsgi', 'env/bin/uwsgi', args=('--ini', 'uwsgi.ini', '--socket', 'fd://$(socket.uwsgi.fileno)'), working_dir='/Users/aackbar/awesome', env=os.environ)
        self.papa.make_process('http_receiver', sys.executable, args=('http.py', '$(socket.uwsgi.port)'), working_dir='/Users/aackbar/awesome', env=os.environ)

    def close(self):
        # Close the connection when done
        self.papa.close()
If you want to just fire off a few commands and leave, it is better to use a
with statement like this:
import os
import sys

from papa import Papa

with Papa() as p:
    print(p.list_sockets())
    print(p.make_socket('uwsgi', port=8080))
    print(p.list_sockets())
    print(p.make_process('uwsgi', 'env/bin/uwsgi', args=('--ini', 'uwsgi.ini', '--socket', 'fd://$(socket.uwsgi.fileno)'), working_dir='/Users/aackbar/awesome', env=os.environ))
    print(p.make_process('http_receiver', sys.executable, args=('http.py', '$(socket.uwsgi.port)'), working_dir='/Users/aackbar/awesome', env=os.environ))
    print(p.list_processes())
This will make a new connection, do a bunch of work, then close the connection.
Socket Commands
There are 3 socket commands.
p.list_sockets(*args)
The list_sockets command takes a list of socket names to get info about. All of
these are valid:
p.list_sockets()
p.list_sockets('circus.*')
p.list_sockets('circus.uwsgi', 'circus.nginx.*', 'circus.logger')
A dict is returned with socket names as keys and socket details as values.
p.make_socket(name, host=None, port=None, family=None, socket_type=None, backlog=None, path=None, umask=None, interface=None, reuseport=None)
All parameters are optional except for the name. To create a standard TCP socket on port 8080, you can do this:
p.make_socket('circus.uwsgi', port=8080)
To make a Unix socket, do this:
p.make_socket('circus.uwsgi', path='/tmp/uwsgi.sock')
A path for a Unix socket must be an absolute path or make_socket will raise a
papa.Error exception.
You can also leave out the path and port to create a standard TCP socket with an OS-assigned port. This is really handy when you do not care what port is used.
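At the socket level, an OS-assigned port is what you get by binding to port 0 and then asking the socket which port it received. The sketch below shows this with the standard socket module. It illustrates the general mechanism rather than papa's own code, and the `open_on_free_port` helper is hypothetical.

```python
import socket


def open_on_free_port(host="127.0.0.1", backlog=5):
    """Bind a TCP socket to a port chosen by the OS and return (socket, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 asks the OS to pick a free port
    sock.listen(backlog)
    port = sock.getsockname()[1]  # the port the OS actually assigned
    return sock, port
```

A process that receives the port through a pattern such as $(socket.NAME.port) never needs to know it in advance.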
If you call make_socket with the name of a socket that already exists, papa
will return the original socket if all parameters match, or raise a papa.Error
exception if some parameters differ.
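The return-or-raise behavior for an existing name can be pictured as a small registry keyed by name. The sketch below is a hypothetical model of that rule, not papa's implementation: `SocketRegistry` stores only the parameters, and `RegistryError` stands in for papa.Error.

```python
class RegistryError(Exception):
    """Stand-in for papa.Error in this sketch."""


class SocketRegistry(object):
    def __init__(self):
        self._sockets = {}

    def make_socket(self, name, **params):
        existing = self._sockets.get(name)
        if existing is None:
            # First request for this name: record it
            self._sockets[name] = dict(params)
            return self._sockets[name]
        if existing != params:
            # Same name, different parameters: refuse
            raise RegistryError(
                "socket %r already exists with different parameters" % name)
        # Same name, same parameters: hand back the original
        return existing


registry = SocketRegistry()
first = registry.make_socket('circus.uwsgi', port=8080)
again = registry.make_socket('circus.uwsgi', port=8080)
```

This makes make_socket safe to call again after a process manager restarts: the same request yields the same socket instead of a second one.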
See the make_socket method of the Papa object for other parameters.
p.remove_sockets(*args)
The remove_sockets command also takes a list of socket names. All of these are
valid:
p.remove_sockets('circus.*')
p.remove_sockets('circus.uwsgi', 'circus.nginx.*', 'circus.logger')
Removing a socket will prevent any future processes from using it, but any processes that were already started using the file descriptor of the socket will continue to use the copy they inherited.
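The reason an already-started process is unaffected is that it holds its own copy of the file descriptor. The sketch below demonstrates the same operating system behavior inside a single process, where `dup()` stands in for the copy a child process inherits and `close()` stands in for removing the socket from the kernel. It is an analogy under those assumptions, not papa's code.

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

inherited = listener.dup()  # stands in for the copy held by a child process
listener.close()            # stands in for removing the socket

# The duplicate still accepts connections after the original is closed.
client = socket.create_connection(("127.0.0.1", port), timeout=2.0)
conn, _ = inherited.accept()
client.sendall(b"ping")
received = conn.recv(4)

for s in (client, conn, inherited):
    s.close()
```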
