
February 06, 2008

Scanning Basics

There have been some questions about scanning ephemera and books. I'll be placing a permanent feature at The Toolemera Press sometime in the near future, but for now, here is a quick and dirty review of what works for me. Keep in mind that I am not an imaging expert, not a digital artist, and in fact am somewhat color-blind with lousy eyesight.

  1. Scanner. Not as important as what scan software you use. That said, I'm a fan of Epson scanners, particularly the Perfection line. Not too fast, not too slow and the output is clear, sharp and color accurate. Plus they don't die as fast as the Canon scanners seem to. I do have a tendency to burn out a scanner each year. That doesn't bother me as it presents me with the opportunity to upgrade and let me tell you, the advances in consumer-level scanners have brought the machines into the realm of what I used to see in pro-level scanners just a few years ago. I also like the Opticbook line from Plustek. A bit expensive and Windows only (plus you have to have an Intel processor to use the software) but the only affordable scanner on the market that will scan to within 6 mm of the inside edge of a book. The design of this scanner allows you to scan without having to flatten the book, a nasty thing to do to a spine. As a last thought, for very high-quality scans, there are the Microtek scanners. Expensive as a rule and slow, but amongst the best in output at the price.
  2. Software. The stuff that ships with scanners, as a rule, is bupkis (Yiddish for junk). It's fine for very average work, but falls flat when trying to scan old documents, photographs and books. There are two after-market products that I swear by: Lasersoft Imaging's Silverfast and Hamrick Software's Vuescan. Silverfast is by far the better of the two and the one most professionals will turn to. Vuescan is feature rich but sometimes a little confusing in its interface; however, it's a fraction of the price of Silverfast. As a rule I use Silverfast. The downside is that you have to buy a copy specific to each scanner. Vuescan is a one-time purchase that will work with most scanners on the market.
  3. DPI versus Image Dimensions. This one is always confusing as the two are not the same. First and foremost, I scan in either 175 dpi TIFF or 300 dpi TIFF: 175 dpi for simple black & white and 300 dpi for color and grayscale. (More on that later under Post-Scan Processing.) A 300 dpi scan of a 4"x5" object will produce a digital image of 1200 x 1500 pixels, which on screen appears many times the measured dimensions of the original. A scan set to 4"x5", without specifying dpi, will turn out a 4"x5" scan with a lower dpi. In the first instance, the dpi setting will simply produce a scan with more dots per inch, resulting in a large on-screen scan. And that's the trick... we are talking about a digital representation of a physical object. When viewed on a monitor at full size, the higher the dpi of an image, the larger it will appear. When printed, the higher the dpi, the larger in physical dimensions the print can be (unless you set your printer to print to a given size). More information IN yields more information OUT. In the second instance, scanning to a physical size without setting dpi, the scan software will choose the resolution setting that will produce the desired image dimensions. Resolution is just another term for how much detail can be found in an image. The higher the dpi, the higher the resolution. Basically. There are lots of exceptions to this rule, but who cares? What we care about is getting the scan to look the way we want it to. Leave the real technical stuff to the artists, photographers and designers. At least that is what I do.
  4. Color, Grayscale, Black & White. Color is easy. If you want your image to be in color, choose color. If you want it to be Black & White, choose that setting. If you want a Grayscale output, get confused. What is Grayscale? Originally intended as a means to digitize halftone prints (think magazine images made up of all those little dots), Grayscale is the Swiss Army Knife of scanning. If you must scan without color, but you have images, or you have an original with varying backgrounds (old stained paper), selecting Grayscale will produce a scan that is accurate to the original... but of course without color. Scanning such an image in Black & White will work, but you may pick up lots of junk in the scan from the uneven textures of the background, or you may end up with funny wiggles in the graphics (moiré effects from scanning engravings or halftone prints). TRICK OF THE DAY: Set your Grayscale scan to 400 dpi. This setting, for various arcane reasons known only to Harry Potter, will often reduce or eliminate the moiré pattern and/or smooth the noisy background of an old piece of paper. You can reduce the image dpi or dimensions afterwards with image editing software.
  5. BITS. Don't worry too much about Bits. 48 bit, 24 bit and so on. These numbers refer to the amount of data recorded for each pixel in the scan. That is a very simplistic way of explaining it, but it really doesn't matter that much unless you are going to get into fine-tuning something like Silverfast. The higher the bit count, the larger in KB or MB the end result. The truth is that it's best to experiment with each setting to see what scan looks the best to your eyes and for your purpose. I typically scan at the highest Bit setting and then play around with the image in Photoshop Elements.
  6. Optical Character Recognition (OCR). In the realm of scanning ephemera and books, the only reason for OCR is if you want an end result that is searchable. I rarely, if ever, run OCR software on individual pieces of ephemera. I will run OCR on books if the scan is fairly clean and if I feel there is a need for search within the PDF document. OCR software can be time-consuming to run and it usually increases the size of the PDF considerably. In brief, OCR creates a second, hidden layer of the document. This hidden layer contains text that is searchable by the PDF reading software IF and only IF your PDF software supports text search. Adobe Acrobat Reader does. Some off-brand PDF readers don't. Don't ever bother to try to OCR an 18th Century book as the font will never OCR well, resulting in tons of peculiar search hits.
  7. Scanning Fingers and Wrists. What do you do about all those fingers and wrists that show up in your scans? Sometimes you can't force the scan software to scan precisely to the object dimensions, or you may be scanning a book. If so, forget about the scanner cover. Remove it or raise it out of the way. Get yourself a piece of dense black cloth (no velvet please, leave that to Elvis). The finer the texture the better. Drape the cloth over the item to be scanned and scan away.
  8. Post-Scan Processing. This is where the magic happens. Old documents, photographs and books are never an easy target. You don't have a nice, clean white background. I scan to produce the image that looks closest to the original, at 175 or 300 dpi, with the intent of working on the image in Photoshop Elements. In PE, I can adjust contrast and brightness (always do contrast first, then brightness), run a sharpening filter (unsharp mask for photographs, sharpen for black & white images) and reduce the scan to the format and size that I want.
  9. TIFF v JPEG v GIF v PNG. Gesundheit. In reverse order, PNG is a fairly new format that I just can't cozy up to. It sounds too much like a national political party. GIF is fine for simple online graphics such as a banner or logo. GIF is lousy when it comes to complex colors or grayscale. My favorites are TIFF and JPEG. TIFF retains all the original image quality no matter how many times you save the file, while JPEG will lose image quality each time it is re-saved. But... JPEG files are smaller and more web friendly. So I scan in TIFF and run the image through Photoshop Elements to convert it into a web friendly JPEG. I try to do any image manipulation in TIFF and as little messing around in JPEG as possible.
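
The arithmetic behind items 3 and 5 can be sketched in a few lines of Python. This is only an illustration, not part of the workflow described above: the function names are hypothetical, the 96 pixels-per-inch monitor figure is an assumption (your monitor will differ), and the sizes are for uncompressed image data, so real TIFF files will vary somewhat.

```python
def pixel_dimensions(width_in, height_in, dpi):
    """Pixel width and height of a scan of a width_in x height_in inch original."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(width_px, height_px, bits_per_pixel):
    """Raw image data size in bytes, ignoring file headers and compression."""
    return width_px * height_px * bits_per_pixel // 8


def on_screen_inches(pixels, monitor_ppi=96):
    """Approximate on-screen length at 100% zoom (monitor_ppi is an assumed value)."""
    return pixels / monitor_ppi


# A 4"x5" original scanned at the two resolutions mentioned in item 3.
for dpi in (175, 300):
    w, h = pixel_dimensions(4, 5, dpi)
    print(f"{dpi} dpi: {w} x {h} pixels, "
          f"about {on_screen_inches(w):.1f} x {on_screen_inches(h):.1f} inches on screen")
    # Compare the two color bit depths mentioned in item 5.
    for bits in (24, 48):
        size_mb = uncompressed_bytes(w, h, bits) / 1_000_000
        print(f"  {bits} bit: roughly {size_mb:.1f} MB uncompressed")
```

Under these assumptions, a 4"x5" original at 300 dpi comes to 1200 x 1500 pixels, about 5.4 MB of raw data at 24 bit and double that at 48 bit. That is why the same little card looks so much bigger on screen, and takes so much more disk space, as the dpi and bit settings go up.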

And that is all that I have to say for now on the Artes & Mysteries of Scanning.

Till next
Gary

PS: I'll do something on PDF creation another time. I tend to work in color, so my work flows reflect that process, but I'll add something on background removal and all that B&W stuff.
