RAW image formats

RAW files contain the most complete information from the camera's sensor and are essential when we want to take advantage of more advanced post-processing techniques.

General information

What's this about RAW files?

Digital cameras capture the image using a light sensor. The data captured by this sensor is then processed and stored in a format that can be read outside the camera. Typically, this format is JPEG. Due to the limitation of JPEG's 8-bit color depth, not all of the data captured by the sensor is represented in the final image, and a lot of information is lost.
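To give a feeling for the numbers, here is a small sketch in Python. It assumes a 12-bit sensor (the bit depth is an assumption for illustration, it is not stated above) and uses a naive requantization that simply drops the lowest bits. A real camera applies a tone curve before writing the JPEG, so this only illustrates how many distinct sensor levels have to share one 8-bit value.

```python
# Assumption: a 12-bit sensor sample; actual bit depth depends on the camera.
SENSOR_BITS = 12
JPEG_BITS = 8

sensor_levels = 2 ** SENSOR_BITS   # distinct values a sensor sample can take
jpeg_levels = 2 ** JPEG_BITS       # distinct values an 8-bit channel can take


def to_8bit(sample):
    """Naive requantization: drop the least significant bits of a sample."""
    return sample >> (SENSOR_BITS - JPEG_BITS)


# Number of different sensor values that collapse into one 8-bit value.
levels_merged = sensor_levels // jpeg_levels

print(sensor_levels, jpeg_levels, levels_merged)
```

With these assumptions, sensor values 16 to 31 all end up as the same 8-bit value, so the differences between them cannot be recovered from the JPEG.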

Though almost all digital cameras store the captured images in JPEG format, more advanced cameras like DSLRs and some high-end point-and-shoot models have the option of storing images in RAW format. The word raw expresses the idea that the data captured by the sensor wasn't modified by the camera's processor. This isn't quite true, as most cameras process the sensor data to some extent before writing the RAW file, some less, some more. Even so, the data in a RAW file is as raw as it gets, and a lot more complete than its JPEG counterpart.

In the bibliography section you can find detailed articles about RAW files and light capture by digital cameras[1][2].

So why not just use RAW?

There are several reasons why RAW files aren't practical for everyday use. First of all, the human eye does not respond to light linearly the way a sensor does, which makes linear RAW images look flat and dark. So it's convenient and necessary to process the RAW data into a more practical format like JPEG or TIFF. If you're confident you won't need to post-process your picture, you can skip one step and just use JPEG.
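To see why linear data looks dark, we can compare how a mid-gray tone is stored with and without gamma encoding. The sketch below is only an illustration: it assumes a scene mid-gray of 18% of full scale and a plain power-law gamma of 2.2, which is a simplification of the curves used in practice.

```python
# Assumptions for illustration: mid-gray at 18% of full scale, gamma 2.2.
MID_GRAY = 0.18
GAMMA = 2.2


def linear_to_8bit(value):
    """Store a linear intensity (0.0 to 1.0) directly as an 8-bit value."""
    return round(value * 255)


def gamma_to_8bit(value, gamma=GAMMA):
    """Apply a power-law gamma curve, then store as an 8-bit value."""
    return round((value ** (1.0 / gamma)) * 255)


linear_code = linear_to_8bit(MID_GRAY)
encoded_code = gamma_to_8bit(MID_GRAY)

print("linear:", linear_code, "gamma encoded:", encoded_code)
```

Stored linearly, the mid-gray lands in the bottom fifth of the 8-bit range, which is why an unprocessed linear image looks dark. After gamma encoding it sits close to the middle of the range.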

Also, a RAW file is a lot larger than its JPEG equivalent, usually more than twice the size. This means fewer photos fit on a single memory card and it takes more time to transfer the image from internal memory, which in turn affects the speed of continuous shooting. To reduce this problem, some manufacturers have implemented lossless compression in their formats.

And that leads us to the major problem with RAW image files, one that's particularly sensitive to Open Source advocates: each manufacturer uses a proprietary format to write RAW data and they're not disclosing the file specifications. This makes it difficult to support new formats and versions, several of which are released each year, and it's even harder when some manufacturers encrypt part of the RAW file data[3]. It also means that if you use RAW files as the equivalent of film negatives and an “old” format stops being supported by the manufacturer, then in a few years, when old programs no longer run on future operating systems, you'll have no way to read your files.

In order to solve this, a working group called OpenRAW tries to persuade camera manufacturers to disclose the information on their RAW formats and to promote the development and adoption of an open, documented raw format. Adobe Systems has developed the Digital Negative format and is promoting it as an open format, although there are some problems to overcome[4].

Technical information

Generally speaking, RAW file formats are proprietary and not documented (with the exception of Adobe's DNG, which is intentionally public), so one might expect each to be completely different from the others. In fact, most of them are (more or less strictly) based on the TIFF format, so we can divide them into two families:

  • TIFF-based formats: CR2, DCR, DNG, MRW, NEF, ORF, PEF, SRF
  • other formats: CRW

By "based on the TIFF format" we mean that they share the overall structure of IFDs and tags - but they are not necessarily "compliant" or "compatible" with TIFF.
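As a rough illustration of the two families, the following Python sketch guesses the container type from the first bytes of a file. The function name is hypothetical. The TIFF check uses the standard TIFF header values. The CIFF check assumes the "HEAPCCDR" signature at byte offset 6, as described in public write-ups of the CRW format, so treat it as an assumption. A result of "unknown" proves nothing, since a RAW format may wrap or alter the TIFF structure.

```python
def sniff_container(first_bytes):
    """Guess the container family from the first bytes of a file.

    Returns "tiff-like", "ciff" or "unknown". This is only a heuristic.
    """
    # Standard TIFF: byte-order mark ("II" or "MM") followed by the number 42.
    if first_bytes[:4] in (b"II*\x00", b"MM\x00*"):
        return "tiff-like"
    # Assumed CIFF layout: byte-order mark, 4-byte header length, signature.
    if first_bytes[:2] in (b"II", b"MM") and first_bytes[6:14] == b"HEAPCCDR":
        return "ciff"
    return "unknown"


def sniff_file(path):
    """Read the first 16 bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_container(handle.read(16))
```

For example, `sniff_container(b"II*\x00\x08\x00\x00\x00")` returns "tiff-like" because the bytes start with a little-endian TIFF header.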

Basic TIFF features

In this section we describe the main parts of TIFF that are useful for understanding TIFF-based formats (if you want to know more about this, see the TIFF page). Please note that the information below is precise enough for the purpose of decoding RAW file formats, but not necessarily from the TIFF perspective.

Header 
The header is a short sequence of bytes that can be found at the very beginning of the file. It contains a "magic cookie", that is a handful of bytes with a special value that "tags" the file as being a TIFF. Furthermore, the header also contains a flag value to specify the byte order convention (hi-lo or lo-hi), the format version and a pointer to the next data chunk.
Tag 
A tag is an elementary piece of information identified by a unique number (the code of the tag) which defines the meaning of the associated information. For instance, the code 256 means that the associated information is the width of the image. The information can be stored in many formats: integers from 8 to 32 bits, floating point, rational (a pair of integers in the form n / d).
Directory (IFD) 
A directory is a set of tags that are related to each other. Each file contains more than one IFD: one describes the main picture, others may describe various thumbnails, EXIF metadata or other kinds of metadata. Among these, the "makernote" is a special, proprietary block present in virtually all RAW file formats.
Raster data 
Raster data is the set of values that define the color of each pixel. There are many different layouts: the most common are by strip (row by row) and by tile (adjacent square blocks). Often raster data is compressed (and the compression scheme is almost always proprietary) and mangled (that is, it's arranged in a more complex way than necessary, probably to make the reverse engineer's life harder).
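The header, tag and directory descriptions above can be sketched in Python as follows. This is a minimal illustration based on the standard TIFF layout (8-byte header, 12-byte directory entries); real RAW files may deviate from it. The sample file is built in memory, its values (width 3008, resolution 300/1) are made up, and all function names are hypothetical.

```python
import struct
from fractions import Fraction


def parse_header(data):
    """Return (struct byte-order prefix, offset of the first IFD)."""
    order = {b"II": "<", b"MM": ">"}.get(data[:2])
    if order is None:
        raise ValueError("no TIFF byte-order mark")
    magic, first_ifd = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("unexpected magic number: %d" % magic)
    return order, first_ifd


def read_ifd(data, order, offset):
    """Return ({tag: (type, count, 4-byte value field)}, next IFD offset)."""
    (n,) = struct.unpack_from(order + "H", data, offset)
    entries = {}
    for i in range(n):
        tag, typ, count, field = struct.unpack_from(
            order + "HHI4s", data, offset + 2 + 12 * i)
        entries[tag] = (typ, count, field)
    (next_ifd,) = struct.unpack_from(order + "I", data, offset + 2 + 12 * n)
    return entries, next_ifd


def decode(data, order, typ, count, field):
    """Decode single SHORT (3), LONG (4) and RATIONAL (5) values only."""
    if count == 1 and typ == 3:
        return struct.unpack(order + "H", field[:2])[0]
    if count == 1 and typ == 4:
        return struct.unpack(order + "I", field)[0]
    if count == 1 and typ == 5:
        # Too big for the 4-byte field: the field holds an offset instead.
        (where,) = struct.unpack(order + "I", field)
        num, den = struct.unpack_from(order + "II", data, where)
        return Fraction(num, den)
    raise NotImplementedError("type %d, count %d" % (typ, count))


def build_sample():
    """Build a tiny little-endian TIFF-like file with one IFD (made-up data)."""
    header = struct.pack("<2sHI", b"II", 42, 8)
    entries = [
        struct.pack("<HHI4s", 256, 4, 1, struct.pack("<I", 3008)),  # width
        struct.pack("<HHI4s", 282, 5, 1, struct.pack("<I", 38)),    # rational
    ]
    ifd = struct.pack("<H", 2) + b"".join(entries) + struct.pack("<I", 0)
    return header + ifd + struct.pack("<II", 300, 1)


sample = build_sample()
order, first_ifd = parse_header(sample)
entries, next_ifd = read_ifd(sample, order, first_ifd)
width = decode(sample, order, *entries[256])
resolution = decode(sample, order, *entries[282])
```

Note how the width fits directly into the 4-byte value field of its entry, while the rational is stored elsewhere in the file and the entry only holds its offset.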

As we said before, the header points to the first IFD. Most IFDs contain a further pointer to another IFD at the same level, or to subIFDs. Together the IFDs form a simple tree.
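Walking this tree can be sketched in Python as shown below. The sketch assumes the standard TIFF directory layout and assumes that child directories are referenced through tag 330 (SubIFDs); RAW formats may use other tags for the same purpose. The sample data and the helper names are made up for illustration.

```python
import struct

SUBIFDS = 330  # assumption: child IFDs are referenced through this tag


def walk_ifds(data, order, offset, depth=0, seen=None):
    """Yield (depth, offset, tag codes) for every IFD reachable from offset."""
    seen = set() if seen is None else seen
    while offset and offset not in seen:   # 0 ends a chain; avoid loops
        seen.add(offset)
        (n,) = struct.unpack_from(order + "H", data, offset)
        tags, children = [], []
        for i in range(n):
            tag, typ, count, field = struct.unpack_from(
                order + "HHI4s", data, offset + 2 + 12 * i)
            tags.append(tag)
            if tag == SUBIFDS:
                (value,) = struct.unpack(order + "I", field)
                if count == 1:
                    children.append(value)   # the field is the offset itself
                else:                        # the field points to an array
                    children.extend(struct.unpack_from(
                        order + "%dI" % count, data, value))
        yield depth, offset, tags
        for child in children:
            yield from walk_ifds(data, order, child, depth + 1, seen)
        # Follow the pointer to the next IFD at the same level.
        (offset,) = struct.unpack_from(order + "I", data, offset + 2 + 12 * n)


def entry(tag, value):
    """Pack one directory entry holding a single LONG value."""
    return struct.pack("<HHI4s", tag, 4, 1, struct.pack("<I", value))


def ifd(entries, next_offset):
    """Pack a directory: entry count, entries, pointer to the next IFD."""
    return (struct.pack("<H", len(entries)) + b"".join(entries)
            + struct.pack("<I", next_offset))


def build_sample():
    """Two chained IFDs plus one subIFD, little-endian (made-up data)."""
    header = struct.pack("<2sHI", b"II", 42, 8)
    ifd0 = ifd([entry(256, 3008), entry(SUBIFDS, 56)], 38)  # bytes 8..37
    ifd1 = ifd([entry(256, 160)], 0)                        # bytes 38..55
    sub = ifd([entry(257, 2000)], 0)                        # bytes 56..73
    return header + ifd0 + ifd1 + sub


sample = build_sample()
(first_ifd,) = struct.unpack_from("<I", sample, 4)
tree = list(walk_ifds(sample, "<", first_ifd))
```

The `seen` set is a precaution: a damaged or deliberately odd file could contain pointers that form a cycle, and without it the walk would never end.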

Existing RAW formats

The following table should list all existing RAW image file formats. If you know of a format that isn't listed, please add it to the table. If you notice an error or know of information that is missing, please feel free to correct or add it.

Ext. | Maker and models | Notes
ARW | Sony (a100) |
CR2 | Canon EOS (1D, 1D MkII, 1DS, 1DS MkII, 20D, 30D, 40D, 5D, 350D/XT, 400D/XTi) | Camera Raw 2 (modified TIFF); Lossless compression; format description[5]
CRW | Canon EOS (10D, 300D, D30, D60), Canon PowerShot (600, A5, A5 Zoom, G1, G2, G3, G5, G6, S45, S50, S70) | Camera RAW (CIFF); Specifications available[6][7]
DCR | Kodak (DSC Pro SLR, DSC Pro 14N, DSC PRO 14nx) | Digital Camera RAW
DNG | Adobe, Hasselblad (503CW), Leica (Digilux-3, R8, R9, M8), Pentax (K10D), Ricoh (Digital-GR), Samsung (GX10, Pro815) | Digital Negative; Open documentation[8]
K25 | Kodak (DC25) |
KDC | Kodak (DC40, DC50, DC120, P850) |
MRW | Minolta (DiMAGE 5, DiMAGE 7, DiMAGE A1, DiMAGE A200, Dynax, Maxxum 7D) | Minolta RAW Format
NEF | Nikon (D1, D1H, D1X, D100, D2H, D2Hs, D2X, D200, D50, Nikon D70, D70s, E5000, E5700, E8800) | Nikon Electronic Format; format varies from model to model; some versions use metadata encryption[3]
ORF | Olympus (C5050Z, E-1, E-10, E-300, C70Z, C7070Z, SP350) | Olympus RAW
PEF | Pentax (istD, istDS) | Pentax Electronic Format; Each model has different format
PTX | Pentax |
SR2 | Sony |
SRF | Sony (DSC-F828, DSC-R1) | Sony RAW File
X3F | Sigma (SD9, SD10) | Foveon sensor, Specification is available[9]

Reading RAW files

We have a page with a Comparison table of RAW software for Linux.

Dave Coffin's dcraw

Digital photography owes much to Dave Coffin, who wrote and maintains dcraw under a free and open source license. With dcraw we can decode virtually any RAW file produced by a digital camera. Not only open source programs benefit from Dave Coffin's work: even proprietary software like Adobe Photoshop and others use his source code.

However, dcraw by itself can only be used on the command line, so if we want to fine-tune the result we need a graphical user interface.
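For scripted use, dcraw can be called from another program. The Python sketch below builds a dcraw command line and runs it if dcraw is installed. The function names and the file name photo.nef are hypothetical. The options used are -w (use the camera's white balance), -T (write TIFF instead of PPM) and -4 (linear 16-bit output); check the manual page of your dcraw version before relying on them.

```python
import shutil
import subprocess


def dcraw_command(raw_path, camera_wb=True, tiff=True, linear16=False):
    """Build the argument list for a dcraw call (nothing is executed here)."""
    cmd = ["dcraw"]
    if camera_wb:
        cmd.append("-w")   # use the white balance recorded by the camera
    if tiff:
        cmd.append("-T")   # write a TIFF file instead of PPM
    if linear16:
        cmd.append("-4")   # linear 16-bit output, no gamma curve applied
    cmd.append(raw_path)
    return cmd


def develop(raw_path, **options):
    """Run dcraw on one file; the output is written next to the input."""
    if shutil.which("dcraw") is None:
        raise RuntimeError("dcraw was not found on the PATH")
    subprocess.run(dcraw_command(raw_path, **options), check=True)


# Hypothetical file name, for illustration only.
print(" ".join(dcraw_command("photo.nef")))
```

Keeping the command construction separate from the execution makes it easy to inspect or log what will be run before any file is touched.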

UFRaw

See the resources page for more information on UFRaw

UFRaw (Unidentified Flying Raw) provides a GUI to process all RAW formats supported by dcraw. It can be used stand-alone to output 8-bit JPEG and TIFF or 16-bit TIFF files. It can also be used as a plug-in for GIMP, where it is limited to the 8-bit color workspace.

Rawstudio

See the resources page for more information on Rawstudio

Rawstudio is another graphical interface for processing raw files. It is still a young project compared with UFRaw but is moving quickly. Its major advantage is that it can process batches from within the GUI (UFRaw has ufraw-batch on the command line), which can speed up the workflow quite a bit.

digiKam plugins

digiKam plugins implement a graphical user interface over dcraw and are used by digiKam itself and also by other KDE image applications like KPhotoAlbum.

libopenraw

libopenraw is an ongoing project to provide a free software implementation of camera RAW file decoding. One of the main reasons is that dcraw is not suited for easy integration into applications, and there is a need for an easy-to-use API on which to build free software digital image processing applications.

It also aims to address features missing from dcraw, such as metadata decoding and easy thumbnail extraction.

dcraw-assist

dcraw-assist is a Kommander-scripted GUI for KDE that executes both dcraw and ImageMagick's convert utility to generate a full-sized JPEG and a web-sized JPEG.

Its chief purpose is to batch process a directory full of correctly exposed (or incorrectly exposed, but uniformly so) images. It also implements the GREYCstoration noise reduction algorithms.

jrawio

jrawio is a Service Provider Implementation for the Java™ ImageIO API. It provides the ability to read images coded in a digital "camera raw" format (such as NEF for Nikon or CRW/CR2 for Canon). Note that jrawio is implemented in 100% pure Java™.

jrawio allows you to read the raster data and the thumbnails, and to extract all the known metadata contained in the image. By default, the current version (1.0) provides the RAW data, that is, the unprocessed data captured by the camera sensor. You will need to process it to obtain a displayable image.

Bibliography and notes

  1. Understanding RAW files, The Luminous Landscape.
  2. Norman Koren, Tonal quality and dynamic range in digital cameras.
  3. RAW storm in a teacup?, Digital Photography Review, 2005.
  4. The RAW Problem, OpenRAW
  5. Understanding What is stored in a Canon RAW .CR2 file, How and Why, Laurent Clevy, 2009
  6. CIFF Specification on Image Data File, Canon, 1997.
  7. Phil Harvey, The Canon RAW (CRW) File Format, 2006
  8. Digital Negative (DNG) Specification, Adobe Systems, 2005.
  9. X3F File Format External Specification, www.x3f.info