To process a large scanned PDF file, the first step is to split it into smaller files that can be processed individually. This can be done using the following command (or something similar):
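A minimal sketch of such a command, assuming the tool is `pdftoppm` from Poppler (the options discussed here match its flags). The function name `split_pdf` and the names `scan.pdf` and `page` are placeholders for illustration only:

```shell
# Sketch only: pdftoppm (Poppler) is an assumed tool; split_pdf is a hypothetical wrapper.
split_pdf() {
    # $1 = input PDF, $2 = output file prefix, $3 = resolution in DPI (default 75)
    pdftoppm -r "${3:-75}" -gray -png "$1" "$2"
}

# Usage (placeholder file names):
#   split_pdf scan.pdf page
```

Wrapping the call in a function makes it easy to retry with a different resolution, for example `split_pdf scan.pdf page 150`. The output should be one numbered PNG file per page, with names starting with the given prefix.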
Let’s go over the arguments. -r 75 sets the output resolution to 75 DPI. You may need to experiment with this value to find the best setting; 72 DPI is another common default. The smaller you make this number, the lower the pixel resolution of the produced files.
Note that this number may have nothing to do with the actual scan resolution. A scanner tool may simply store the raster data in the PDF file without recording the resolution it was scanned at. The only way to know whether the option is set correctly is to check the produced files. Use a tool such as ImageMagick identify or GIMP to check the actual pixel dimensions of the output images. You can check the page size of the scanned PDF document using pdfinfo, which reports it in points (1/72 inch).
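As a rough check, the pixel size to expect from a given -r value can be worked out from the page size in points. The sketch below assumes whole-number point sizes (shell arithmetic is integer only) and uses a hypothetical helper, `expected_pixels`; the file names in the comments are placeholders. The real output may differ by a pixel depending on how the tool rounds.

```shell
# expected_pixels is a hypothetical helper, not part of any tool.
# A point is 1/72 inch, so pixels = points * DPI / 72 (rounded to nearest).
expected_pixels() {
    # $1 = page dimension in points (integer), $2 = DPI passed to -r
    echo $(( ($1 * $2 + 36) / 72 ))
}

expected_pixels 612 75   # US Letter width at 75 DPI, prints 638
expected_pixels 792 75   # US Letter height at 75 DPI, prints 825

# Compare against the real files (placeholder names):
#   pdfinfo scan.pdf | grep "Page size"
#   identify page-1.png
```

If the dimensions reported by identify are far smaller than the raster data stored in the PDF, the -r value is probably too low and detail is being thrown away.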
-gray specifies that the output be in gray scale. If this is appropriate for your file, it makes the output roughly 1/3 the size of the color version, since each pixel stores one channel instead of three.
-png specifies that you want PNG files as the output instead of the default (PPM). This makes a huge difference in size. For example, a 500 MB PPM file compresses to 7 MB.
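The size figures for -gray and for uncompressed PPM output follow from simple raster arithmetic. The sketch below assumes 8 bits per channel and ignores the small file header; `raw_bytes` is a hypothetical helper and the 638x825 page is only an example size.

```shell
# raw_bytes is a hypothetical helper: uncompressed raster size in bytes,
# assuming 8 bits per channel and no header.
raw_bytes() {
    # $1 = width in pixels, $2 = height in pixels, $3 = channels (3 = color, 1 = gray)
    echo $(( $1 * $2 * $3 ))
}

raw_bytes 638 825 3   # color page, prints 1579050
raw_bytes 638 825 1   # gray scale page, prints 526350
```

The gray version is exactly one third of the color one before compression. How much PNG shrinks either of them depends on the page content, so the saving on a real scan has to be measured on the produced files.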