Deposit formats
We recommend that all data is as 'portable' as possible:
- Objects (disks etc.) should be labelled
- File names should be platform-independent and have correct extensions
- Files should be in a platform-independent format
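As an illustration only, the Python sketch below shows one way to screen file names before deposit. The rule set is an assumption, not an ELAR requirement: it allows only ASCII letters, digits, '.', '_' and '-', requires an extension, and rejects Windows device names. The helper name `filename_problems` is hypothetical. The sketch cannot tell whether an extension is the correct one for the file's contents.

```python
import re

# Assumed rule set for illustration (not an official ELAR rule):
# only ASCII letters, digits, '_', '-' and '.', and no leading dot.
_SAFE_NAME = re.compile(r"^[A-Za-z0-9_-][A-Za-z0-9._-]*$")

# Device names that Windows reserves, whatever the extension.
_RESERVED = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def filename_problems(name: str) -> list[str]:
    """Return a list of portability problems found in a bare file name."""
    problems = []
    if not _SAFE_NAME.match(name):
        problems.append("characters outside A-Z, a-z, 0-9, '.', '_', '-'")
    # Split on the last dot; both the stem and the extension must be non-empty.
    stem, dot, ext = name.rpartition(".")
    if not dot or not stem or not ext:
        problems.append("missing file extension")
    if name.split(".")[0].upper() in _RESERVED:
        problems.append("name reserved on Windows")
    return problems
```

Calling `filename_problems("my recording.wav")` reports the space as a problem, while a name such as `session01_interview.wav` returns an empty list.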
For more information about data portability, see 'Seven Dimensions of Portability for Language Documentation and Description' (Bird and Simons) or 'Language documentation and archiving, or how to build a better corpus' (Heidi Johnson 2004, in P. Austin (ed.), Language Documentation and Description 2).
If some of your data is not portable (or not yet digital), you will usually be
able to convert it to a portable digital format. For case
studies in converting materials to portable formats, see:
Preferred formats
We can accept a range of formats, with a preference for the
following:
- sound - WAV
- image - BMP, TIFF, JPEG. See full advice about images
- video - MPEG2
- text - plain text, with or without markup
- documents - plain text, PDF or postscript
- structured text - XML, other markup (with description of markup system)
- structured data in commonly available Office formats - ELAR will convert them to archive-suitable formats
- character encoding:
  - preferred encoding is ASCII or Unicode
  - clearly document any other encodings used, e.g. ISO 8859-5
  - discuss with us if you use font substitution to handle non-Roman characters
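The character encoding advice can be checked with a short script. The Python sketch below is a minimal example under stated assumptions: it treats UTF-8 as the Unicode encoding of interest (a UTF-16 file would be reported as "other"), and the function names `classify_encoding` and `to_utf8` are hypothetical. Conversion only works if you already know the source encoding, which is why documenting it matters.

```python
def classify_encoding(data: bytes) -> str:
    """Report whether raw bytes are plain ASCII, valid UTF-8, or something else."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Neither ASCII nor UTF-8: the encoding needs to be documented.
        return "other"


def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known, documented encoding into UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# Example: Cyrillic text stored as ISO 8859-5, then converted to UTF-8.
legacy = "привет".encode("iso-8859-5")
converted = to_utf8(legacy, "iso-8859-5")
```

Here `classify_encoding(legacy)` returns "other", and `classify_encoding(converted)` returns "utf-8".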
Please contact us at archive@hrelp.org if you are unsure whether a particular format is suitable for submission to the archive, or if you are having problems converting your data to a portable format.
This page is not yet complete.