GitHub - ErrorTzy/unpaper-gpu: A fork of unpaper that uses the OpenCV CUDA module to accelerate processing. ~70x faster than the original unpaper CPU backend

Originally written by Jens Gulden — see AUTHORS for more information. The entire unpaper project is licensed under GNU GPL v2. Some of the individual files are licensed under the MIT or Apache 2.0 licenses. Each file contains an SPDX license header specifying its license. The text of all three licenses is available under LICENSES.

Overview

unpaper is a post-processing tool for scanned sheets of paper, especially for book pages that have been scanned from previously created photocopies. Its main purpose is to make scanned book pages easier to read on screen after conversion to PDF. unpaper can also be useful for improving the quality of scanned pages before optical character recognition (OCR).

unpaper tries to clean scanned images by removing dark edges that appeared through scanning or copying on areas outside the actual page content (e.g. dark areas between the left-hand side and the right-hand side of a double-sided book-page scan).

The program also tries to detect misaligned centering and rotation of pages and will automatically straighten each page by rotating it to the correct angle. This process is called "deskewing".

Key Features

CPU and CUDA backends with auto-selection (--device=cpu|cuda)
Batch processing for large file sequences (--batch, --jobs, --progress)
PDF input/output pipeline (MuPDF) with image extraction + render fallback
GPU-accelerated JPEG/JP2 decode/encode via nvImageCodec (CUDA builds)
Broad image input support via FFmpeg (PNG/JPEG/TIFF/etc., subject to pixel formats)

Note that the automatic processing will sometimes fail. It is always a good idea to check the results of unpaper manually and adjust the parameter settings according to the requirements of the input. Each processing step can also be disabled individually for each sheet.

See the further documentation for notes on the supported file formats.

Dependencies

Base build requirements:

FFmpeg libraries: libavformat, libavcodec, libavutil, libswscale
POSIX threads and the math library

Optional features (auto-detected unless explicitly enabled):

PDF support (-Dpdf=enabled): MuPDF
JBIG2 decode for PDF B&W images (-Djbig2=enabled): jbig2dec
CUDA backend (-Dcuda=enabled): NVIDIA CUDA Toolkit + OpenCV 4.x CUDA

Meson feature options default to auto. Use -Dcuda=disabled, -Dpdf=disabled, or -Djbig2=disabled to force features off.
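
As an illustration of the feature options above, the following sketch assembles the -D arguments for a build with every optional feature forced off. The helper name feature_args and the build directory builddir-cpu in the usage note are hypothetical, chosen only for this example; the option names and the disabled value are the ones listed above.

```shell
# Hypothetical helper (not part of unpaper): print one -D<feature>=<state>
# argument for each feature name given after the state.
feature_args() {
  state=$1
  shift
  for feature in "$@"; do
    printf '%s ' "-D$feature=$state"
  done
  printf '\n'
}

# Print the arguments for a CPU-only build without PDF or JBIG2 support.
feature_args disabled cuda pdf jbig2
```

The printed arguments can then be passed to Meson, for example with meson setup builddir-cpu $(feature_args disabled cuda pdf jbig2).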

CUDA Backend Dependencies

For GPU-accelerated processing (auto-detected or --device=cuda), the following are required:

CUDA Toolkit: Tested with CUDA 12.x and 13.x. The nvcc compiler, CUDA runtime (cudart), and NPP must be available.
OpenCV 4.x with CUDA support: Required for CUDA builds. OpenCV must be built with CUDA support enabled, including cudaarithm, cudaimgproc, and cudawarping. OpenCV provides GPU-accelerated operations including connected-component labeling for the noisefilter.
nvImageCodec (nvimgcodec): Required for GPU JPEG/JP2 decode/encode. JPEG2000 support depends on the nvImageCodec build and available plugins.

Building instructions

unpaper uses the Meson build system, which can be installed using Python's package manager (pip3 or pip):

unpaper$ pip3 install --user 'meson >= 0.57' 'sphinx >= 3.4'
unpaper$ CFLAGS="-march=native" meson setup --buildtype=debugoptimized builddir
unpaper$ meson compile -C builddir

You can pass required optimization flags when creating the meson build directory in the CFLAGS environment variable. Usage of Link-Time Optimizations (Meson option -Db_lto=true) is recommended if available.

Further optimizations such as -ftracer and -ftree-vectorize are thought to work, but their effect has not been evaluated so your mileage may vary.

Tests depend on pytest and pillow, which will be auto-detected by Meson.

Building with PDF Support

To enable PDF input/output, configure with -Dpdf=enabled (MuPDF required):

unpaper$ meson setup builddir-pdf -Dpdf=enabled --buildtype=debugoptimized
unpaper$ meson compile -C builddir-pdf

Optional JBIG2 decode for B&W PDF images:

unpaper$ meson setup builddir-pdf -Dpdf=enabled -Djbig2=enabled \
    --buildtype=debugoptimized

PDF processing is activated when both input and output files are PDFs:

unpaper$ unpaper input.pdf output.pdf
unpaper$ unpaper --pdf-quality=high input.pdf output.pdf
unpaper$ unpaper --pdf-dpi=400 input.pdf output.pdf

Building with CUDA Support

To enable GPU-accelerated processing, configure with -Dcuda=enabled:

unpaper$ meson setup builddir-cuda -Dcuda=enabled --buildtype=debugoptimized
unpaper$ meson compile -C builddir-cuda

The CUDA backend requires:

NVIDIA CUDA Toolkit (nvcc compiler, cudart, NPP)
OpenCV 4.x with CUDA support (cudaarithm, cudaimgproc, cudawarping)
nvImageCodec (nvimgcodec) for GPU JPEG/JP2 decode/encode

By default, unpaper will use CUDA when it is available. Use --device=cpu to force CPU processing, or --device=cuda to force GPU processing.

To check which backends are active at runtime, use --perf:

unpaper$ ./builddir-cuda/unpaper --perf --device=cuda input.pgm output.pgm
# Output includes: perf backends: device=cuda opencv=yes ccl=yes
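
In scripts it can be handy to extract the selected device from that line. The following is a minimal sketch, assuming the perf line has exactly the form quoted above; the function name backend_device is hypothetical. This README does not say whether the line is written to stdout or stderr, so the usage note below merges both streams.

```shell
# Hypothetical helper: read unpaper output on stdin and print the value of
# device= from the "perf backends:" line, if such a line is present.
backend_device() {
  sed -n 's/.*perf backends: device=\([a-z]*\).*/\1/p'
}

# Demonstration on the sample line quoted above.
echo 'perf backends: device=cuda opencv=yes ccl=yes' | backend_device
```

With a real build, the helper could be used as ./builddir-cuda/unpaper --perf input.pgm output.pgm 2>&1 | backend_device to see which device the auto-selection picked.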

Recommended Pixel Formats for CUDA

For best CUDA performance, use these pixel formats:

Image Type	Recommended Format	Notes
Grayscale	GRAY8 (8-bit grayscale)	Full OpenCV CUDA acceleration
Color	RGB24 (24-bit RGB)	Full OpenCV CUDA acceleration

These formats benefit from optimized OpenCV CUDA primitives including cv::cuda::transpose, cv::cuda::flip, and cv::cuda::warpAffine.

Other formats like Y400A (grayscale with alpha) and 1-bit mono (MONOWHITE, MONOBLACK) are supported but use custom CUDA kernels since OpenCV lacks native support for 2-channel and bit-packed images.

Output Formats

For image inputs, the default output remains PNM (PBM/PGM/PPM). In CUDA batch mode with nvImageCodec available, you can write JPEG or JPEG2000 by using .jpg/.jpeg or .jp2 extensions in the output filenames; use --jpeg-quality to control JPEG encoding quality. PDF output requires that both the input and output files are PDFs; use --pdf-quality (and optionally --pdf-dpi for the render fallback) to tune it.

Batch Processing

Batch mode pre-enumerates jobs and processes them in parallel:

unpaper$ unpaper --batch --jobs=4 input%04d.png output%04d.pbm
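
The %04d in these file names is a zero-padded page counter. The sketch below previews the names such a pattern stands for, assuming it follows printf conventions; the helper name expand_pattern and the choice of pages 1 to 3 are made up for this illustration, and the number unpaper starts counting from is not stated here.

```shell
# Hypothetical helper: print the file names a printf-style pattern expands to
# for the page numbers FIRST..LAST.
expand_pattern() {
  pattern=$1
  i=$2
  last=$3
  while [ "$i" -le "$last" ]; do
    # The pattern itself is used as the printf format string, so %04d is
    # replaced by the zero-padded page number.
    printf "$pattern\n" "$i"
    i=$((i + 1))
  done
}

expand_pattern 'input%04d.png' 1 3
# prints:
# input0001.png
# input0002.png
# input0003.png
```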

When using CUDA with JPEG output files, batch mode may be auto-enabled to activate the GPU JPEG pipeline.

FFmpeg automatically selects the pixel format based on the input. To ensure an optimal format, you can pre-convert images:

# Convert to GRAY8 for grayscale scans
ffmpeg -i input.tiff -pix_fmt gray output.pgm

# Convert to RGB24 for color scans
ffmpeg -i input.tiff -pix_fmt rgb24 output.ppm
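
For a whole directory of scans, the grayscale command above can be wrapped in a loop. This is a sketch under assumptions: the function name convert_gray, the directory arguments, and the FFMPEG override variable (useful for a dry run with FFMPEG=echo) are all hypothetical; only the ffmpeg arguments come from the command shown above.

```shell
# Hypothetical helper: convert every .tiff file in SRC_DIR to a GRAY8 .pgm
# file of the same base name in DST_DIR.
convert_gray() {
  src_dir=$1
  dst_dir=$2
  mkdir -p "$dst_dir"
  for f in "$src_dir"/*.tiff; do
    # Skip the unexpanded glob when the directory holds no .tiff files.
    [ -e "$f" ] || continue
    name=$(basename "$f" .tiff)
    # FFMPEG can be set to another command, e.g. echo, to preview the calls.
    "${FFMPEG:-ffmpeg}" -i "$f" -pix_fmt gray "$dst_dir/$name.pgm"
  done
}

# Example call (directory names are placeholders):
# convert_gray scans converted
```

For color scans, the same loop applies with -pix_fmt rgb24 and a .ppm extension.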

Development Hints

The project includes configuration for pre-commit, which is integrated with GitHub Actions CI. If you're using git for development, you can install pre-commit and set up its git hooks with pip install pre-commit && pre-commit install.

Using Sapling with this repository is possible and diffs can be reviewed as a stack.

Further Information

You can find more information on the basic concepts and the image processing in the available documentation.