By Tony Bandy
From official documents and family snapshots to handwritten letters and more – incorporating scanned images and documents into your genealogy research can be rewarding on many levels. However, the process of digitizing them can often be difficult.
You may find yourself wondering: what format(s) should I use/not use to store old family records? What’s the best choice out there for long-term use and storage?
While opinion does vary quite a bit on this topic, I’d like to share with you four common image and file formats that you might come across in the course of your research: GIF, JPEG, TIFF and PDF.
Let’s take a quick look at these four and examine the pros/cons for each type as it relates to family research and image preservation:
GIF
GIF, otherwise known as the Graphics Interchange Format, was one of the early image formats used on the web and is still widely used today. Files in this format are usually small, but each image is limited to a palette of 256 colors. This makes GIF a poor choice for anything other than simple drawings, sketches, etc.
If you are making a quick scan or converting an image for an icon or a quick share, then GIF may be a reasonable choice, but it should not be considered for long-term storage or detailed work. For more details and history on the GIF, look here.
JPEG or JPG
One of the most common image formats found online (as well as in many digital archives), JPEG, named for the Joint Photographic Experts Group that created it, is widely used in genealogy research today. File sizes are reasonable and most computer software can open and edit this format quite easily.
For most general purposes, this format is perfectly fine for your family research. However, for archival, long-term image storage or high-resolution photos, the JPEG format can be problematic. Here’s why: JPEG uses lossy compression, which permanently discards some image data to reduce file size, and more detail can be lost each time the image is edited and re-saved. So, if you are looking to store archival-quality or high-resolution photos, read on for a better choice. Check here for more details on how the JPEG works.
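To see why discarded data cannot be brought back, here is a toy sketch in Python. It is not the real JPEG algorithm, which is far more sophisticated; it only imitates the idea of rounding values to a coarser scale. The step size of 16 and the pixel values are assumptions made up for illustration.

```python
# Toy illustration of lossy rounding; NOT the actual JPEG codec.
# STEP and the sample pixel values are assumptions chosen for illustration.
STEP = 16

def lossy_round(values, step=STEP):
    """Snap each value to the nearest multiple of `step`, discarding fine detail."""
    return [min(255, round(v / step) * step) for v in values]

original = [12, 13, 15, 201, 203, 207]   # hypothetical grayscale pixel values
saved = lossy_round(original)

print(saved)              # [16, 16, 16, 208, 208, 208]
print(saved == original)  # False: the fine differences are gone
```

Once the values have been rounded, nothing in the saved list tells you whether a 16 started out as 12, 13 or 15. That is the sense in which a lossy format trades detail for a smaller file.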
TIFF
While it’s not as well known as some other image formats, the TIFF, or Tagged Image File Format, is by far the best choice for long-term, archival quality storage of your scanned genealogy images.
Unlike JPEG, scanned images and documents in this format are usually stored uncompressed or with lossless compression, so the full range of data captured from the original document is preserved. Most computer systems can work with the TIFF, but a drawback is that file sizes can be quite large. The Library of Congress has some great additional information and background on this file type.
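To get a feel for how large "quite large" can be, here is a rough back-of-the-envelope sketch in Python. It assumes an uncompressed scan in 24-bit color and counts only the raw pixel data, ignoring file headers and metadata; the page size and scan resolutions are example values of my own, not recommendations.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel=24):
    """Raw pixel data size for a scan, ignoring file headers and metadata."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8

# Hypothetical example: a letter-size page (8.5 x 11 in) scanned in 24-bit color.
for dpi in (300, 600):
    size = uncompressed_bytes(8.5, 11, dpi)
    print(f"{dpi} dpi: about {size / 1_000_000:.1f} MB")
# 300 dpi: about 25.2 MB
# 600 dpi: about 101.0 MB
```

Doubling the scan resolution quadruples the pixel count, which is why a box of letters scanned at 600 dpi can fill a drive much faster than you might expect.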
PDF
While not strictly an image format, PDF, or Portable Document Format, seems to be the current unofficial standard for many of the online document archives for genealogy. You will find that file sizes and quality can vary widely here depending upon the original software that was used to create the PDF.
PDF files can be purely image-based, or they can also contain a searchable text layer produced by OCR (Optical Character Recognition). This is especially handy for scanned genealogy files such as court documents, wills, etc., because it allows you to search these records. For a complete background on the PDF, read more here.
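If you inherit a folder of scans with missing or misleading file extensions, the four formats covered above can usually be told apart by the first few bytes of each file. The sketch below is one minimal way to do that in Python; the function names are my own, and it does not handle every variant (for example, a PDF with stray bytes before its header).

```python
# File signatures ("magic numbers") for the four formats discussed above.
SIGNATURES = [
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF"),   # little-endian byte order
    (b"MM\x00*", "TIFF"),   # big-endian byte order
    (b"%PDF-", "PDF"),
]

def sniff_format(header: bytes) -> str:
    """Return the format name for the leading bytes of a file, or 'unknown'."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"

def sniff_file(path: str) -> str:
    """Read only the first 8 bytes of the file at `path` and identify its format."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(8))

print(sniff_format(b"%PDF-1.7"))                 # PDF
print(sniff_format(b"II*\x00\x08\x00\x00\x00"))  # TIFF
```

Calling sniff_file on each file in a folder would give you a quick inventory of what you actually have before deciding what to convert or re-scan.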
So, which one should you choose?
The answer here really depends both on your technical skills/resources and on what you ultimately want out of your research. Look at it this way: if you are part of an online digital project or archive and are looking to maintain both capability and true fidelity of the images/documents that you have been entrusted with, then TIFF would be the top choice. A RAW file format is another option: this is the data captured straight from the physical device used to take the image, and many DSLRs and high-resolution scanners at home can produce it. For long-term archival storage, no doubt this is the best option out there.
Of course, the same level of care could also be applied to your own personal genealogy research. So, if you have the storage capability, skills and software, then TIFF is an easy choice to make. However, if you are simply looking to quickly share on social media or with family and friends, then JPEG is just fine for images and PDF will work great for documents.
Check out these additional resources for even more information on using these, and other, file formats in your family research and beyond:
- Virginia Tech University Libraries File Format Guide
- Center for Digital Archaeology Image Usage Information
- Library of Congress Recommended Formats
- Archaeology Data Service/Digital Antiquity Guide to Good Practice
Freelance writer, family researcher, and librarian/historian, Tony Bandy can be found at Adventures in History.