laundry

laundry converts user-supplied possibly dangerous files to more static and safer versions. Use it to reduce the risks of malware spreading via files supplied by external users or systems. The conversions are done with an up-to-date toolchain in a hardened stateless sandbox.

Antivirus products can mitigate the risks of malware, but they are imperfect. They mostly work against mass malware and have their own attack surfaces. laundry provides optional antivirus scans with the open-source ClamAV antivirus engine for an additional level of security.

Features

laundry provides an HTTP API for the conversions below.

| Input | Output | Uses | Purpose |
|--------|--------|-------------|---------|
| doc(x) | pdf | LibreOffice | Removes any embedded macros etc and turns .doc(x) to portable PDF which can be e.g. embedded in HTML. |
| jpeg | jpeg | ImageMagick | Strip away all metadata and extraneous bytes, keep only pixel-by-pixel color data. Conversion performed with intermediate PPM format. |
| pdf | pdf/a | Ghostscript | Clean up a PDF with conversion to PDF/A for archival purposes. Beware the potentially large file sizes. |
| pdf | jpeg | Ghostscript | Converts the first page to jpeg for thumbnails or previews. |
| pdf | text | Ghostscript | Extract plain text from a PDF. Does not perform OCR. |
| png | png | ImageMagick | Strip away all metadata and extraneous bytes, keep only pixel-by-pixel color data. Conversion performed with intermediate PPM format. |
| xls(x) | pdf | LibreOffice | Removes any embedded macros etc and turns .xls(x) to portable PDF which can be e.g. embedded in HTML. |

The laundry HTTP server provides a REST API and an online tool for trying out the conversions and antivirus scans directly from the browser. Optional API-key-based authorization is available.

Conversions are performed in single-use disposable Docker containers. The containers are secured and run under the gVisor runsc runtime, which provides an additional layer of isolation.

The antivirus scan is exposed as an HTTP API. It takes in one file, and the response tells whether any viruses were found in the file. The scans are performed with ClamAV clamdscan from the official ClamAV Docker image. This container is not single-use; instead, it is kept alive for extended periods in order to keep the antivirus signature database up to date.

HTTP API documentation

The examples here use the service address http://192.168.123.123:8080 of the local development environment. See CONTRIBUTING.md for instructions on how to set it up.

Use the HTTP API asynchronously: the provided endpoints can be slow, and processing a large file might take tens of seconds.

Each operation potentially requires hundreds of mebibytes of memory. Limit the number of concurrent requests according to your server constraints.
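One way a client could honour both notes above is to push requests through a small worker pool, so slow conversions do not block the caller and only a fixed number run at once. The sketch below is illustrative only: `run_limited`, `worker` and the default of two concurrent requests are assumptions, not part of laundry.

```python
from concurrent.futures import ThreadPoolExecutor


def run_limited(jobs, worker, max_concurrent=2):
    """Call worker(job) for every job, never running more than max_concurrent at once.

    worker is any callable that performs one request against the service,
    e.g. a function that uploads a single file and returns the result.
    Results are returned in the same order as jobs.
    """
    # The pool size is the concurrency limit: extra jobs wait in the queue.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(worker, jobs))
```

Pick `max_concurrent` from the memory available on the server, since each in-flight operation may need hundreds of mebibytes.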

GET /alive

Endpoint for healthchecks. Invoke it to check whether the service is up and running.

Authorization: No authorization required.

Example request:

curl http://192.168.123.123:8080/alive

Responses:

  • HTTP status 200 with response body yes.
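A deployment script might poll this endpoint until the service answers. The following is a minimal sketch, assuming the caller supplies a `probe` function that performs the GET and returns `(status, body)`; the names `wait_until_alive` and `probe` and the retry defaults are hypothetical.

```python
import time


def wait_until_alive(probe, attempts=5, delay=1.0, sleep=time.sleep):
    """Return True once probe() reports HTTP 200 with body "yes", else False.

    probe is a caller-supplied function that requests GET /alive and returns
    a (status, body) tuple. Connection errors count as "not up yet".
    """
    for attempt in range(attempts):
        try:
            status, body = probe()
            if status == 200 and body.strip() == "yes":
                return True
        except OSError:
            # Covers refused connections and urllib.error.URLError.
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False
```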

GET /auth-test

Endpoint for testing your API key authorization without performing any actual operation.

Authorization: Optional HTTP Basic authentication with the user name laundry-api and your API key as the password. Authorization is required when the server is launched with the -k or --api-key-file option.

Example request:

curl -u "laundry-api:abcd1234" http://192.168.123.123:8080/auth-test

Responses:

  • HTTP status 200 when authorization is successful or when the server is running without authorization.
  • HTTP status 401 for failed authorization with response body access denied.
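For clients that do not use curl, the `-u` option above corresponds to a standard HTTP Basic `Authorization` header. A minimal sketch of building it follows; the helper name `basic_auth_header` is an assumption, while the user name `laundry-api` comes from the description above.

```python
import base64


def basic_auth_header(api_key, user="laundry-api"):
    """Build the HTTP Basic Authorization header for the given API key."""
    # Basic auth is base64("user:password"); here the API key is the password.
    token = base64.b64encode(f"{user}:{api_key}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

Sending a request to GET /auth-test with this header should yield 200 for a valid key and 401 otherwise, as listed above.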

POST /antivirus/scan

Scans the attached file with ClamAV and indicates whether there were any viruses detected. The request must be multipart/form-data and the file in a part named file.

Authorization: Optional HTTP Basic authentication as documented in GET /auth-test.

Example request:

curl -F file=@input.xxx http://192.168.123.123:8080/antivirus/scan

Responses:

  • HTTP status 200 when the file was clean and no viruses were detected.
  • HTTP status 400 when viruses were detected! See response body for detailed response from clamdscan. It includes the virus name.
  • HTTP status 401 for failed authorization. See GET /auth-test for details.
  • HTTP status 500 when the scan cannot be performed. See response body for detailed error message.

Example response when virus detected:

HTTP/1.1 400 Bad Request
Content-Type: text/plain;charset=utf-8

Viruses found! stream: Win.Test.EICAR_HDB-1 FOUND

----------- SCAN SUMMARY -----------
Infected files: 1
Time: 0.006 sec (0 m 0 s)
Start Date: 2022:10:19 07:22:16
End Date:   2022:10:19 07:22:16
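A client could turn the status codes above into a verdict along the following lines. This is a sketch under stated assumptions: `interpret_scan` is a hypothetical helper, and the signature parsing relies only on the `<name> FOUND` pattern visible in the example response, so other clamdscan output may need different handling.

```python
import re


def interpret_scan(status, body=""):
    """Map a POST /antivirus/scan response to a verdict and signature names."""
    if status == 200:
        return {"verdict": "clean", "signatures": []}
    if status == 400:
        # Assumed format, taken from the example: "stream: <signature> FOUND".
        names = re.findall(r"(\S+) FOUND\b", body)
        return {"verdict": "infected", "signatures": names}
    if status == 401:
        return {"verdict": "unauthorized", "signatures": []}
    # 500 and anything unexpected: the scan could not be performed.
    return {"verdict": "error", "signatures": []}
```

Treating every non-200 status as "do not accept the file" is the conservative choice, since a failed scan says nothing about whether the file is clean.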

POST /docx/docx2pdf

Converts the provided .doc or .docx to a PDF. The request must be multipart/form-data and the file in a part named file.

Authorization: Optional HTTP Basic authentication as documented in GET /auth-test.

Example request:

curl -F file=@input.docx --output result.pdf http://192.168.123.123:8080/docx/docx2pdf

Responses:

  • HTTP status 200 when the conversion succeeded. The content-type is application/pdf and the PDF is transferred in response body.
  • HTTP status 401 for failed authorization. See GET /auth-test for details.
  • HTTP status 500 when conversion failed. See server logs for details.
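The same upload can be made without curl. The sketch below builds a multipart/form-data body with a single part named file and posts it using only the Python standard library. The function names, the 120-second timeout and the `application/octet-stream` part type are assumptions chosen for illustration.

```python
import os
import urllib.request
import uuid


def build_multipart(filename, data, field="file"):
    """Return (body, headers) for a one-file multipart/form-data request."""
    boundary = uuid.uuid4().hex
    # Note: filename is not escaped; avoid quotes or line breaks in it.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return head + data + tail, headers


def post_file(url, path, timeout=120):
    """POST the file at path to url and return the raw response body."""
    with open(path, "rb") as handle:
        data = handle.read()
    body, headers = build_multipart(os.path.basename(path), data)
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    # urlopen raises urllib.error.HTTPError for 401 and 500 responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

For example, `post_file("http://192.168.123.123:8080/docx/docx2pdf", "input.docx")` would return the PDF bytes on success. A generous timeout is used because conversions can take tens of seconds.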

POST /xlsx/xlsx2pdf

Converts the provided .xls or .xlsx to a PDF. The request must be multipart/form-data and the file in a part named file.

Authorization: Optional HTTP Basic authentication as documented in GET /auth-test.

Example request:

curl -F file=@input.xlsx --output result.pdf http://192.168.123.123:8080/xlsx/xlsx2pdf

Responses:

  • HTTP status 200 when the conversion succeeded. The content-type is application/pdf and the PDF is transferred in response body.
  • HTTP status 401 for failed authorization. See GET /auth-test for details.
  • HTTP status 500 when conversion failed. See server logs for details.

POST /image/png2png

Cleans up the provided .png keeping only pixel-by-pixel color data. The request must be multipart/form-data and the file in a part named file.

Authorization: Optional HTTP Basic authentication as documented in GET /auth-test.

Example request:

curl -F file=@input.png --output result.png http://192.168.123.123:8080/image/png2png

Responses:

  • HTTP status 200 when the conversion succeeded. The content-type is image/png and the image is transferred in response body.
  • HTTP status 401 for failed authorization. See GET /auth-test for details.
  • HTTP status 500 when cleanup failed. See server logs for details.

POST /image/jpeg2jpeg

Cleans up the provided .jpg or .jpeg keeping only pixel-by-pixel color data. The request must be multipart/form-data and the file in a part named file.

Authorization: Optional HTTP Basic authentication as documented in GET /auth-test.

Example request:

curl -F file=@input.jpeg --output result.jpeg http://192.168.123.123:8080/image/jpeg2jpeg

Responses:

  • HTTP status 200 when the conversion succeeded. The content-type is image/jpeg and the image is transferred in response body.
  • HTTP status 401 for failed authorization. See GET /auth-test for details.
  • HTTP status 500 when cleanup failed. See server logs for details.
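As an optional client-side sanity check, the returned bytes can be compared against the well-known PNG and JPEG file signatures before they are stored. This check is not something laundry itself requires or documents; `looks_like` is a hypothetical helper.

```python
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # fixed 8-byte PNG signature
JPEG_MAGIC = b"\xff\xd8\xff"      # JPEG start-of-image marker plus next marker byte


def looks_like(kind, data):
    """Return True if data starts with the signature for kind ("png" or "jpeg")."""
    if kind == "png":
        return data.startswith(PNG_MAGIC)
    if kind == "jpeg":
        return data.startswith(JPEG_MAGIC)
    raise ValueError(f"unsupported kind: {kind}")
```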

POST /pdf/pdf-preview

Converts the first page of the PDF to jpeg. The request must be multipart/form-data and the file in a part named file.

Authorization: Optional HTTP Basic authentication as documented in GET /auth-test.

Example request:

curl -F file=@input.pdf --output result.jpeg http://192.168.123.123:8080/pdf/pdf-preview

Responses:

  • HTTP status 200 when the conversion succeeded. The content-type is image/jpeg and the image is transferred in response body.
  • HTTP status 401 for failed authorization. See GET /auth-test for details.
  • HTTP status 500 when conversion failed. See server logs or response body for details.

POST /pdf/pdf2txt

Extracts the contents of PDF to plain text. The request must be multipart/form-data and the file in a part named file.

Authorization: Optional HTTP Basic authentication as documented in GET /auth-test.

Example request:

curl -F file=@input.pdf --output result.txt http://192.168.123.123:8080/pdf/pdf2txt

Responses:

  • HTTP status 200 when the extraction succeeded. The content-type is text/plain and the text is transferred in response body.
  • HTTP status 401 for failed authorization. See GET /auth-test for details.
  • HTTP status 500 when extraction failed. See server logs or response body for details.
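The conversion endpoints described so far share one response pattern: 200 with a fixed content type, 401 for failed authorization, and 500 on failure. A client could check responses against that pattern as sketched below. The table is copied from the endpoint descriptions above; the names `EXPECTED_TYPES` and `check_response` and the choice of exception types are assumptions.

```python
EXPECTED_TYPES = {
    "/docx/docx2pdf": "application/pdf",
    "/xlsx/xlsx2pdf": "application/pdf",
    "/image/png2png": "image/png",
    "/image/jpeg2jpeg": "image/jpeg",
    "/pdf/pdf-preview": "image/jpeg",
    "/pdf/pdf2txt": "text/plain",
}


def check_response(path, status, content_type):
    """Validate a conversion response; return its media type or raise."""
    if status == 401:
        raise PermissionError("authorization failed")
    if status != 200:
        raise RuntimeError(f"conversion failed with HTTP status {status}")
    # Drop parameters such as ";charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != EXPECTED_TYPES[path]:
        raise ValueError(f"unexpected content type {media_type!r} for {path}")
    return media_type
```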

POST /pdf/pdf2pdfa

Converts the PDF to safer PDF/A, which is often used for archival purposes. This removes embedded scripts etc, but might also convert custom fonts to images. Thus the result might contain text as images, have large file sizes and be slow to open. The reques
