PcapDB

A Distributed, Search-Optimized Full Packet Capture System

Security Notice

This project has not been updated since 2019. There are multiple security vulnerabilities in dependency packages.

Without an investment in upgrading these dependencies, this project should be considered for archival use only.

<img src="https://cloud.githubusercontent.com/assets/22897558/22356169/8f6e0b16-e3ec-11e6-8695-2273424c2b06.png" width="200" />

Overview

PcapDB is a distributed, search-optimized open source packet capture system. It was designed to replace expensive, commercial appliances with off-the-shelf hardware and a free, easy-to-manage software system. Captured packets are reorganized during capture by flow (an indefinite-length sequence of packets with the same source/destination IPs, ports, and transport protocol), indexed by flow, and searched (again) by flow. The indexes for the captured packets are relatively tiny (typically less than 1% the size of the captured data).
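
That index-to-capture ratio makes disk sizing easy to estimate. A back-of-envelope sketch, assuming the ~1% upper bound quoted above (the 30 TB capture figure is just an example):

```shell
# Rough index-disk sizing, assuming indexes come in at ~1% of capture
# volume (the upper bound stated in this README). 30 TB is an example.
CAPTURE_GB=30000
INDEX_GB=$(( CAPTURE_GB / 100 ))
echo "${INDEX_GB} GB of index for ${CAPTURE_GB} GB of capture"
```

In practice the ratio varies with traffic mix, so treat this as an upper bound rather than a precise budget.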

For hardware requirements, see HARDWARE.md.

Also see our fact sheet.

DESTDIR

Many things in this file refer to DESTDIR as a pathname prefix. The default, and that used by future pcapdb packages, is /var/pcapdb.

Architectural Overview

A PcapDB installation consists of a Search Head and one or more Capture Nodes. The Search Head can also be a Capture Node, or it can be a VM somewhere else. Wherever it is, the Search Head must be accessible by the Capture Nodes, but there's no need for the Capture Nodes to be visible to the Search Head.

1. Requirements

PcapDB is designed to work on Linux servers only. It was developed on both Redhat Enterprise and Debian systems, but its primary testbed has so far been Redhat based. While it has been verified to work (with packages from non-default repositories) on RHEL 6, a more bleeding edge system (like RHEL/Centos 7, or the latest Debian/Ubuntu LTS) will greatly simplify the process of gathering dependencies.

sys_requirements.md contains a list of the packages required to run and build pcapdb. They are easiest to install on modern Debian based machines.

requirements.txt contains python/pip requirements. They will automatically be installed via 'make install'.

2. Installing

To build and install everything in /var/pcapdb/, run one of:

make install-search-head
make install-capture-node
make install-monolithic
  • As with most Makefiles, you can set the DESTDIR environment variable to specify where to install the system: make install-search-head DESTDIR=/var/mypcaplocation
  • This includes installing in place: make install-capture-node DESTDIR=$(pwd). In this case, PcapDB won't install system scripts for various needed components; you will have to run them manually, see below.
  • If you're behind a proxy, you'll need to specify a proxy connection string using PROXY=host:port as part of the make command.
  • There's a bug in some 1.10.* versions of virtualenv that causes the install to fail. Specify the python3 virtualenv executable using VIRTUALENV=<virtualenv path>

To make your life easier, however, you should first make sure the indexing code builds cleanly by running 'make' in the 'indexer/' directory.

Postgresql may install in a strange location, as noted in the 'indexer/README'. This can cause build failures in certain pip installed packages. Add PATH=$PATH:<pgsql_bin_path> to the end of your 'make install' command to fix this. For me, it is: make install PATH=$PATH:/usr/pgsql-9.4/bin.

3. Setup

After running 'make install', there are a few more steps to perform.

3-0: Setup hugepages (optional)

PcapDB uses 2MB hugepages to manage memory more efficiently. If a capture node has hugepages available, they will be automatically consumed by the capture process.

First, determine how much memory you want to devote to capture. I'd recommend about 70% of available system memory, which should be a minimum of 16G (64G or more are recommended). Then simply divide that amount by 2M to get the number of hugepages.

To enable hugepages, add 'hugepages=<number of pages> hugepagesz=2M' to your /boot/grub/grub.conf and reboot. Depending on your OS flavor (e.g., Debian), you may instead need to add it to the GRUB_CMDLINE_LINUX variable in /etc/default/grub.
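
The arithmetic above can be sketched as follows (64G of capture memory is an example figure; per the guidance above, use roughly 70% of your system's RAM):

```shell
# Compute the number of 2MB hugepages for a given capture memory budget.
# 64 GiB is an example; substitute your own figure (~70% of system RAM).
CAPTURE_MEM_MIB=$(( 64 * 1024 ))
HUGEPAGE_MIB=2
NUM_PAGES=$(( CAPTURE_MEM_MIB / HUGEPAGE_MIB ))
# This is the string to put on the kernel command line:
echo "hugepages=${NUM_PAGES} hugepagesz=2M"
```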

3-1: Post-Install script

The core/bin/post-install.sh script will handle the vast majority of the system setup for you.

  • Only run it once. Running it multiple times can cause issues, and may force you to re-install completely. Moving the old PcapDB install files to another directory can cause problems as well.
  • Run without arguments to get the usage information.
  • Basically, you want to give it arguments based on whether you're setting up a search head (-s), a capture node (-c), or monolithic install (-c -s).
  • You'll also have to give it the search head's IP.
/var/pcapdb/core/bin/post-install.sh [-c] [-s] <search_head_ip>    

This will set up the databases and rabbitmq.

3-2 DESTDIR/etc/pcapdb.cfg

This is the main PcapDB config file. PcapDB will not run at all until you set a few values in here manually:

  • (On capture nodes) The search head db password
  • (On capture nodes) The rabbitmq password
    • Both of the above should be in the search head's pcapdb.cfg file.
  • (On search head) The local mailserver.
    • If you don't have one, I'd start with installing Postfix. It even has selectable install settings that will configure it as a local mailserver for you.
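
On a capture node, the entries in question look roughly like the sketch below. The key names other than search_head_host (which this README mentions later) are illustrative assumptions, not verified option names; the commented template installed at DESTDIR/etc/pcapdb.cfg is the authoritative reference.

```ini
; Illustrative sketch only -- key names besides search_head_host are
; assumptions; consult the installed DESTDIR/etc/pcapdb.cfg template.
search_head_host = 10.0.0.5      ; IP or DNS name of the search head
db_password = <copied from the search head's pcapdb.cfg>
amqp_password = <copied from the search head's pcapdb.cfg>
```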

3-3 Add an admin user (Search Head Only)

You'll need to create an admin user.

sudo su - capture
./bin/python core/manage.py add_user <username> <first_name> <last_name> <email>
  • Usernames must be at least four characters long.
  • This will email you a link to use to set that user's password.
  • (This is why email had to be set up).
  • root@localhost is a reasonable email address, if you need it.
  • Note that manage.py also has a createsuperuser command, which shouldn't be used.

3-4 Add the capture nodes to the postgres pg_hba.conf on the search head.

This is needed if running a separate search head. See below.
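A minimal sketch of such an entry, assuming a capture node at 10.0.1.20 (the database and user fields are left as 'all' here because the actual names depend on what post-install.sh created; tighten them to match your install):

```
# pg_hba.conf on the search head -- hypothetical example entry
# allowing a capture node at 10.0.1.20 to authenticate with md5:
host    all    all    10.0.1.20/32    md5
```

Remember to reload postgres after editing pg_hba.conf.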

3-5 Set up a site.

  • You should be able to login with your admin account.
  • Click 'Admin', 'Capture Sites'.
  • Add a new capture site. The group name can be the same as the site name; the admin group should be different.
  • This adds a grouping of capture nodes to work with.

3-6 Set up a capture node.

  • Click 'Admin', 'Capture Nodes'.
  • Select your site, and add a new capture node by hostname.
  • You have to do this even in monolithic mode.
  • If this fails, check logs/celery.log on the capture node.
  • The capture node must already be able to connect to rabbitmq/celery on the search head for this to work.

3-7 Add capture node permissions

  • To be able to configure the capture node, you must set permissions for it.
  • Click 'Admin', 'Users'.
  • Select your user, and 'Add to Group'. Select the admin group for the site you added.
  • You can now edit the disks and configure capture.

3-8 Configure disks.

Go into the Disks view - Click 'Admin', 'Capture Nodes', and then the 'Disks' button on the node you want to configure.

Index disks

  • You'll need one or two equally sized disks dedicated to indexing. Select the disks from the available 'Devices' table, and click 'Create Index RAID'. It will take a few minutes.
  • If you choose two disks, it will create a RAID 1 of the two disks.

Capture disks

You'll need to set up some groups of disks for capture.

  • Select some number of equally sized disks from the 'Devices' list, and click 'Create RAID5'. This will create a new md device from those disks. (If your disk is an external RAID, you can skip this step.)
  • Select your RAID, and click 'Initialize as Capture Disk'. This will format the RAID and add it to the database. It should appear in the 'Capture Disks' table at the top of the page.
  • Repeat these steps to include as many capture disks as you like. PcapDB balances across them according to their size, so they don't all have to be the same size.
  • You can also add dynamically re-assigned spares that will be used by any of your RAIDs as needed, by clicking a disk and selecting 'Add Spare'.
  • For each of the Capture Disks you added, select and enable them in the 'Capture Disks' table.

Debugging note: Errors from all of this go into logs/celery.log on the capture node.

3-9 Configure Capture

Go into the Capture view - Click 'Admin', 'Capture Nodes', and then the 'Capture' button on the node you want to configure.

  • Select the interface you'd like to enable for capture, and click the red circle. Capture will be enabled on this device on the next capture restart.
  • On capture settings, you can enable PFRing ZC mode, if you have a license and a compatible network interface. All used interfaces must be compatible.
  • The multi-queue mode has been tested, but not well. The queue slider is only used when in PFring-ZC mode.
  • There's a bug in the capture settings; you must put a number (including zero) in the local memory box.

When you're ready, click 'Start'.

Debugging note: Capture runner errors go into logs/django.log on the capture node. For some reason you may get error messages even though the capture runner doesn't appear to be doing anything; as root on the capture node, run supervisorctl restart capture_runner. Logs for the actual capture process are supposed to be in /var/pcapdb/log/capture.log, but sometimes show up in /var/log/messages if syslog didn't get configured correctly.

Things that can, and have, gone wrong

  • If your host doesn't have a host name in DNS, you can set an IP in the 'search_head_host' variable in the pcapdb.cfg file.

3-10 pfring-zc drivers

One more thing. You should install the
