The Hans Rausing Endangered Languages Project

Deposit formats - images

Advice for preparing and depositing images

  • Be selective. You should archive photos and images:
    • that have enduring value
    • that are relevant to the linguistic, cultural or community context, and
    • for which you have appropriate metadata.
  • Images should be technically competent; for example, the important or relevant parts must be in focus.
  • Image files should be provided in standard formats such as BMP, JPG (JPEG), or TIFF. If photos were originally shot as JPG (the typical camera setting), then send the original JPG, without further compression or conversion. If files were originally uncompressed, supply them in that form. If you have processed or converted the files, document the changes and provide them as metadata.
  • When shooting, set the camera to its highest quality setting - typically this will result in JPG files between 1 and 2 MB in size. Use only optical zoom, not digital zoom.
  • If scanning photographs or documents, scan at 200 dpi or higher and use high colour settings, even for "black and white" originals. If possible, use a non-compressed format (e.g. BMP). Note that JPG compression can be particularly destructive to text.
  • If photographing documents, set the camera to its highest quality setting, and use a tripod (or at least some prop, rather than hand-holding the camera). Use a shutter delay to further prevent camera movement. Ensure plenty of light falls evenly on the page. Flash can be effective, but avoid bright reflections by not shooting completely square-on. Again, use colour settings even if the original appears to be "black and white".
  • You can retain the original filenames assigned by cameras, such as 102_0243.JPG or DSCN2032.JPG. If you create filenames yourself, keep them simple. Two methods are popular:
    • assign simple filenames sequentially, such as 1.jpg, 2.jpg etc, or t01.jpg, t02.jpg ... t10.jpg etc
    • assign filenames that make image identification and browsing easier, such as smith.jpg, jones.jpg. However, keep them short and simple
  • Guidelines for filenames: avoid spaces, and preferably use only the characters "a"-"z", "-", "_", and a single "." to separate the name root from the extension.
  • Do not make filenames too long, and do not use them as a substitute for metadata or other contextual/caption information - such information must be supplied separately. Think of image filenames as keys or references rather than information-bearers.
  • Provide metadata (which can also include comments, captions or other information) for each image. The metadata fields you use will depend on your project's aims and content. However, document the fields clearly (e.g. don't merely have rows or columns headed "name" or "lg" - explain these somewhere) and use them consistently throughout.
  • Forms of metadata:
    • you might provide a separate metadata file for each image, linked by their filename roots, for example, metadata for 32.jpg is found in 32.txt
    • alternatively, provide one file containing metadata for a set of images. In this case, use a structured format such as tables, spreadsheet, or XML marked-up data, to rigorously maintain the relationship between each image reference and its metadata. Use the image filename as the reference or key (hence the importance of keeping filenames as short and simple as possible)
    • note that some image formats, such as JPG and TIFF, also carry embedded metadata, typically the "EXIF" data about the camera and shooting characteristics
  • Make sure that you have the right to deposit the images. Collect any "protocol" information about sensitivities, restrictions etc. and supply this as part of the metadata
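
The filename and per-image metadata guidelines above can be checked before depositing. The following is a minimal sketch rather than an official archive tool: the function names `is_simple_filename` and `check_deposit` are hypothetical, and the list of image extensions is an assumption. It also assumes digits are acceptable in filenames, since the examples above (t01.jpg, 102_0243.JPG) use them, and that metadata is supplied as one .txt file per image sharing the image's filename root.

```python
import re
from pathlib import Path

# Assumption: digits are allowed in addition to a-z, "-" and "_", because the
# guidelines' own examples (t01.jpg, 102_0243.JPG) use them. Matching is
# case-insensitive so camera-assigned names such as DSCN2032.JPG pass.
SIMPLE_NAME = re.compile(r"[a-z0-9_-]+\.[a-z0-9]+", re.IGNORECASE)

# Assumption: extensions commonly used for the BMP, JPG and TIFF formats.
IMAGE_EXTENSIONS = {".bmp", ".jpg", ".jpeg", ".tif", ".tiff"}


def is_simple_filename(name: str) -> bool:
    """True if the name has no spaces, only simple characters, and exactly one dot."""
    return SIMPLE_NAME.fullmatch(name) is not None


def check_deposit(folder: str) -> list[str]:
    """Return human-readable problems found among the image files in a folder."""
    problems = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        if not is_simple_filename(path.name):
            problems.append(f"{path.name}: filename is not short and simple")
        # Per-image metadata is linked by filename root, e.g. 32.jpg -> 32.txt
        if not path.with_suffix(".txt").exists():
            problems.append(f"{path.name}: no metadata file {path.stem}.txt")
    return problems
```

Calling `check_deposit("my_images")` on a folder of your own (the folder name here is only an example) returns an empty list when every image has a simple filename and a matching metadata file. If you keep metadata for a set of images in a single table or spreadsheet instead, the second check would need to look up each filename in that file.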

Please contact us at archive@hrelp.org if you are unsure whether a particular format is suitable for submission to the archive, or if you are having problems converting your data to a portable format.