Alex Schroeder

I have a bunch of directories full of PDF files and want to compress them all to save some space, so I gzip them all. Is there a way to set up Apache to serve them as PDF files? Something like a rewrite from .pdf$ to .pdf.gz but ... transparently? Right now I edit the wiki page and change all the links ending in .pdf to .pdf.gz; when I follow the resulting link, the browser downloads the file and hands it to whatever is used to handle archives, so the PDF is no longer shown by the browser. Something like assigning a magic application/pdf+gzip to the .pdf.gz extension maybe?

5 comments
Alex Schroeder

OK, I think I found a website with the answer I was looking for. The key phrase is in the title of this post: "Serving pre-compressed files using Apache". Sometimes searching for stuff is hard just because you don't know what it's called. 😅

https://feeding.cloud.geek.nz/posts/serving-pre-compressed-files-using/
AddEncoding gzip gz
Options +Multiviews
SetEnv force-no-vary
Header set Cache-Control "private"
<FilesMatch "\.pdf\.gz$">
ForceType application/pdf
</FilesMatch>
Alex Schroeder

And now that I've gone through the directories, it turns out that those PDFs are already pretty compressed! Most of their content is probably compressed image files. I guess I should have known. 😅
https://alexschroeder.ch/view/2024-11-01-gzip
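
A quick way to check this before gzipping a whole directory is to measure what gzip would save without touching the files. The following is only a sketch in POSIX sh; the function name `gzip_savings` is made up for illustration, and it simply compares the byte count of each file with the byte count of its `gzip -9` output.

```shell
# Print the percentage gzip -9 would save for each file given.
# Nothing is modified: the compressed data only goes through a pipe.
# (gzip_savings is a hypothetical helper name, not an existing tool.)
gzip_savings() {
    for f in "$@"; do
        # $(( ... )) strips the padding some wc implementations add
        orig=$(( $(wc -c < "$f") ))
        comp=$(( $(gzip -9 -c "$f" | wc -c) ))
        if [ "$orig" -gt 0 ]; then
            echo "$(( (orig - comp) * 100 / orig ))% $f"
        fi
    done
}

# Example call, assuming PDFs in the current directory:
#   gzip_savings *.pdf
```

PDFs that consist mostly of JPEG or otherwise compressed image streams should show savings close to zero here.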

Alex Schroeder

I guess I could use Ghostscript to reduce PDF file size! Sure, it wouldn’t be a bit-exact copy but perhaps archival copies don’t need to be.

https://alexschroeder.ch/view/2017-09-06_PDF_File_Size

Alex Schroeder

First test with Ghostscript to reduce filesize is disappointing. It also turned text in the original to a badly pixelated image. I wonder why. I had hoped it would only affect images. Now I'm reading https://ghostscript.com/blog/optimizing-pdfs.html and my head is smoking.

Alex Schroeder

OK, based on that blog post, I have the following invocation. This doesn't turn the fonts into images but it does downsample all images to 150dpi if they are at 165dpi or larger (the target resolution of 150 times the threshold of 1.1), and it forces JPEG if the images are uncompressed or PNG (I think).

Let me know if there's anything else you can think of.

function pdf-shrink --description 'Turn images in PDF files to 150dpi JPG'
    # Usage: pdf-shrink input.pdf output.pdf
    # https://ghostscript.com/blog/optimizing-pdfs.html
    gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dNOPAUSE -dQUIET -dBATCH \
        -dConvertCMYKImagesToRGB \
        -dGrayImageFilter=/DCTEncode -dColorImageFilter=/DCTEncode \
        -dAutoFilterGrayImages=false -dAutoFilterColorImages=false \
        -dDownsampleColorImages=true -dDownsampleGrayImages=true -dDownsampleMonoImages=true \
        -dColorImageResolution=150 -dGrayImageResolution=150 -dMonoImageResolution=150 \
        -dColorImageDownsampleThreshold=1.1 -dGrayImageDownsampleThreshold=1.1 -dMonoImageDownsampleThreshold=1.1 \
        -dPreserveHalftoneInfo=false -dPreserveOverprintSettings=false \
        -dTransferFunctionInfo=/Apply -dUCRandBGInfo=/Remove \
        -sOutputFile=$argv[2] $argv[1]
end
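
Since the first Ghostscript test was disappointing, it may be worth checking after each run whether the output is actually smaller than the input. Here is a rough sketch of such a check. It is POSIX sh rather than fish, so it would have to live in a script of its own, and the name `discard_if_not_smaller` is made up for illustration. It never touches the original file; it only deletes the new copy when that copy is empty or not smaller.

```shell
# Compare ORIGINAL and CANDIDATE (the shrunk copy) by size in bytes.
# The original is never modified. The candidate is deleted if it is
# empty or not smaller than the original; otherwise it is kept.
# (discard_if_not_smaller is a hypothetical helper name.)
discard_if_not_smaller() {
    original=$1
    candidate=$2
    old=$(( $(wc -c < "$original") ))
    new=$(( $(wc -c < "$candidate") ))
    if [ "$new" -gt 0 ] && [ "$new" -lt "$old" ]; then
        echo "kept $candidate ($old -> $new bytes)"
    else
        rm -f -- "$candidate"
        echo "discarded $candidate ($new bytes, original has $old)"
    fi
}

# Example call after a conversion from in.pdf to out.pdf:
#   discard_if_not_smaller in.pdf out.pdf
```

A size check says nothing about quality, so the surviving files still need a look to make sure text stayed text.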
