Extracting PNG Chunks with Go
2018-02-26 08:27:49
Author: parsiya.net
Yesterday I had to extract some data from hidden chunks in PNG files. I realized the PNG file format is blissfully simple.
I wrote some quick code that parses a PNG file, extracts some information, identifies the chunks, and finally extracts the chunk data. The code has minimal error handling and assumes the chunks are formatted properly. We also do not care about parsing PLTE and tRNS chunks, although we will extract them.
Each chunk has four fields, in this order:
uint32 length in big-endian. This is the length of the data field.
Four-byte chunk type. The chunk type can be anything.
Chunk data: a run of bytes whose length was read in the first field.
Four-byte CRC-32 of the chunk's 2nd and 3rd fields (chunk type and chunk data).
Chunk struct
// Each chunk starts with a uint32 length (big endian), then a 4-byte type,
// then the data, and finally the CRC32 of the chunk type and chunk data.
type Chunk struct {
    Length int    // chunk data length
    CType  string // chunk type
    Data   []byte // chunk data
    Crc32  []byte // CRC32 of chunk type and chunk data
}
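The CRC field is not verified by the code in this post, but checking it is short. Below is a minimal sketch, assuming the standard `hash/crc32` package with the IEEE polynomial; the helper names `chunkCRC` and `crcMatches` are hypothetical and not part of the original tool.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// chunkCRC computes the CRC-32 (IEEE polynomial) over the chunk type
// followed by the chunk data. The length field is not included.
func chunkCRC(ctype string, data []byte) uint32 {
	h := crc32.NewIEEE()
	h.Write([]byte(ctype))
	h.Write(data)
	return h.Sum32()
}

// crcMatches compares a stored four-byte big-endian CRC with the computed one.
func crcMatches(ctype string, data, stored []byte) bool {
	if len(stored) != 4 {
		return false
	}
	return binary.BigEndian.Uint32(stored) == chunkCRC(ctype, data)
}

func main() {
	// IEND carries no data, so its CRC is the same in every PNG: AE 42 60 82.
	stored := []byte{0xAE, 0x42, 0x60, 0x82}
	fmt.Println(crcMatches("IEND", nil, stored)) // true
}
```

Because the struct keeps `CType` and `Data` separately, both have to be fed to the hash in that order.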
The first chunk, IHDR, looks like this:
[Figure: IHDR chunk]
Converting big-endian uint32s to int is straightforward:
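A minimal sketch of such a conversion, assuming the standard `encoding/binary` package. The function name `uInt32ToInt` is hypothetical and the original code may differ.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// uInt32ToInt interprets a four-byte slice as a big-endian uint32
// and returns it as an int.
func uInt32ToInt(buf []byte) (int, error) {
	if len(buf) != 4 {
		return 0, fmt.Errorf("expected 4 bytes, got %d", len(buf))
	}
	return int(binary.BigEndian.Uint32(buf)), nil
}

func main() {
	// The IHDR data field is always 13 bytes long, so its length field is 00 00 00 0D.
	n, err := uInt32ToInt([]byte{0x00, 0x00, 0x00, 0x0D})
	fmt.Println(n, err) // 13 <nil>
}
```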
Note (05-Apr-2020): int is dangerous. On 32-bit systems it's int32 and on 64-bit systems it's int64. So on my machine I am converting a uint32 to an int64 because I am running a 64-bit OS. On a 32-bit machine (e.g., Go playground) int is int32. In retrospect, I should probably have used int32 in the struct or, come to think of it, uint32 could have been a better choice. For more information please see int vs. int.
Trick #1: When reading chunks, I did something I had not done before: I passed in an io.Reader. This lets me pass anything that implements that interface to the method. As each chunk is populated, the reader moves forward and ends up at the start of the next chunk. Note that this assumes the chunks are formatted correctly, and it does not check the CRC32 hash.
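The idea can be sketched as follows. This is an illustration under assumptions, not the original method: `readChunk` and `chunksFromBytes` are hypothetical names, and like the original, the sketch does not verify the CRC or guard against absurdly large length fields.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

type Chunk struct {
	Length int
	CType  string
	Data   []byte
	Crc32  []byte
}

// readChunk reads exactly one chunk from r. When it returns, r is
// positioned at the first byte of the next chunk.
func readChunk(r io.Reader) (*Chunk, error) {
	var length uint32
	if err := binary.Read(r, binary.BigEndian, &length); err != nil {
		return nil, err
	}
	ctype := make([]byte, 4)
	if _, err := io.ReadFull(r, ctype); err != nil {
		return nil, err
	}
	data := make([]byte, length) // trusts the length field, as the post does
	if _, err := io.ReadFull(r, data); err != nil {
		return nil, err
	}
	crc := make([]byte, 4)
	if _, err := io.ReadFull(r, crc); err != nil {
		return nil, err
	}
	return &Chunk{Length: int(length), CType: string(ctype), Data: data, Crc32: crc}, nil
}

// chunksFromBytes wraps raw in a bytes.Reader (one of many io.Reader
// implementations) and reads chunks until the input is exhausted.
func chunksFromBytes(raw []byte) ([]*Chunk, error) {
	r := bytes.NewReader(raw)
	var chunks []*Chunk
	for r.Len() > 0 {
		c, err := readChunk(r)
		if err != nil {
			return chunks, err
		}
		chunks = append(chunks, c)
	}
	return chunks, nil
}

func main() {
	raw := []byte{
		0, 0, 0, 2, 't', 'e', 'S', 't', 0xCA, 0xFE, 0, 0, 0, 0, // made-up chunk, placeholder CRC
		0, 0, 0, 0, 'I', 'E', 'N', 'D', 0xAE, 0x42, 0x60, 0x82, // IEND
	}
	chunks, err := chunksFromBytes(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range chunks {
		fmt.Println(c.CType, c.Length)
	}
}
```

The same `readChunk` would work unchanged with an `*os.File` or a network connection, since it only depends on `io.Reader`.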
type PNG struct {
    Width             int
    Height            int
    BitDepth          int
    ColorType         int
    CompressionMethod int
    FilterMethod      int
    InterlaceMethod   int
    chunks            []*Chunk // Not exported == won't appear in JSON string.
    NumberOfChunks    int
}
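The first seven fields of this struct come from the 13 bytes of IHDR data: two big-endian uint32 values for width and height, followed by five single bytes. A minimal sketch of filling them in, assuming the IHDR data has already been extracted; `parseIHDR` is a hypothetical name and the struct here omits the chunk fields for brevity.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

type PNG struct {
	Width             int
	Height            int
	BitDepth          int
	ColorType         int
	CompressionMethod int
	FilterMethod      int
	InterlaceMethod   int
}

// parseIHDR maps the 13 bytes of IHDR chunk data onto the PNG struct.
func parseIHDR(data []byte) (PNG, error) {
	if len(data) != 13 {
		return PNG{}, fmt.Errorf("IHDR data must be 13 bytes, got %d", len(data))
	}
	return PNG{
		Width:             int(binary.BigEndian.Uint32(data[0:4])),
		Height:            int(binary.BigEndian.Uint32(data[4:8])),
		BitDepth:          int(data[8]),
		ColorType:         int(data[9]),
		CompressionMethod: int(data[10]),
		FilterMethod:      int(data[11]),
		InterlaceMethod:   int(data[12]),
	}, nil
}

func main() {
	// 256x2 image, 8 bits per sample, color type 6 (truecolor with alpha).
	data := []byte{0, 0, 1, 0, 0, 0, 0, 2, 8, 6, 0, 0, 0}
	p, err := parseIHDR(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", p)
}
```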
Trick #2: chunks does not start with a capital letter. It's not exported, so it is left out when we convert the struct to JSON.
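A small demonstration of this behavior with the standard `encoding/json` package. The structs are trimmed-down versions for illustration and `toJSON` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Chunk struct {
	CType string
}

type PNG struct {
	Width          int
	Height         int
	chunks         []*Chunk // unexported: encoding/json skips it
	NumberOfChunks int
}

// toJSON marshals the struct; only exported fields are included.
func toJSON(p PNG) (string, error) {
	b, err := json.Marshal(p)
	return string(b), err
}

func main() {
	p := PNG{
		Width:          2,
		Height:         3,
		chunks:         []*Chunk{{CType: "IHDR"}},
		NumberOfChunks: 1,
	}
	s, err := toJSON(p)
	fmt.Println(s, err) // {"Width":2,"Height":3,"NumberOfChunks":1} <nil>
}
```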
IDAT chunks contain the image data. They are compressed using deflate. If you look at the first IDAT chunk, you will see the zlib magic header. This Stack Overflow answer lists them:
Note that each chunk is not compressed individually. All IDAT chunks need to be extracted, concatenated and decompressed together.
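Concatenating and decompressing the IDAT data can be sketched like this, assuming the standard `compress/zlib` package and `io.ReadAll` (Go 1.16 or later). `inflateIDAT` and `compressForDemo` are hypothetical names; the second exists only to build test input. The output is the filtered scanline data, not finished pixels.

```go
package main

import (
	"bytes"
	"compress/zlib"
	"fmt"
	"io"
)

type Chunk struct {
	CType string
	Data  []byte
}

// inflateIDAT joins the data of all IDAT chunks in file order and
// decompresses the result as one zlib stream.
func inflateIDAT(chunks []*Chunk) ([]byte, error) {
	var buf bytes.Buffer
	for _, c := range chunks {
		if c.CType == "IDAT" {
			buf.Write(c.Data)
		}
	}
	zr, err := zlib.NewReader(&buf)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// compressForDemo builds a zlib stream so the example has something to split.
func compressForDemo(b []byte) []byte {
	var z bytes.Buffer
	zw := zlib.NewWriter(&z)
	zw.Write(b)
	zw.Close()
	return z.Bytes()
}

func main() {
	raw := compressForDemo([]byte("filtered scanlines"))
	// Split one stream across two IDAT chunks, with an unrelated chunk between.
	chunks := []*Chunk{
		{CType: "IDAT", Data: raw[:5]},
		{CType: "tEXt", Data: []byte("ignored")},
		{CType: "IDAT", Data: raw[5:]},
	}
	out, err := inflateIDAT(chunks)
	fmt.Println(string(out), err) // filtered scanlines <nil>
}
```

Decompressing `raw[5:]` on its own would fail, which is the point of the note above: only the first IDAT chunk carries the zlib header.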
In our case, the first IDAT chunk starts with the 78 5E header:
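Rather than matching a fixed list of byte pairs, a header can also be checked against the rule in the zlib specification (RFC 1950): the low nibble of the first byte is 8 for deflate, and the first two bytes read as a big-endian 16-bit number are a multiple of 31. A minimal sketch, with `looksLikeZlib` as a hypothetical helper name:

```go
package main

import "fmt"

// looksLikeZlib reports whether b starts with a plausible zlib header:
// compression method 8 (deflate) and a header checksum divisible by 31.
func looksLikeZlib(b []byte) bool {
	if len(b) < 2 {
		return false
	}
	cmf, flg := b[0], b[1]
	return cmf&0x0F == 8 && (uint16(cmf)<<8|uint16(flg))%31 == 0
}

func main() {
	fmt.Println(looksLikeZlib([]byte{0x78, 0x5E})) // true
	fmt.Println(looksLikeZlib([]byte{0x89, 0x50})) // false (start of the PNG signature)
}
```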
Everything else is straightforward after this.
Operation is pretty simple. The PNG file is passed with -file. The tool displays PNG info such as height and width. The -c flag displays the chunks and their first 20 bytes. Chunks can be saved to file individually. Modifying the program to collect, decompress and store the IDAT chunks is also simple.
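For reference, the two flags mentioned above could be wired up with the standard `flag` package roughly as follows. This is only a sketch: the program name `pngchunks`, the `options` struct, and `parseArgs` are assumptions for illustration, and the real tool has more options (such as saving chunks) that are not shown.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options holds the two flags described in the text.
type options struct {
	file       string
	showChunks bool
}

// parseArgs parses the arguments into options. Using a FlagSet instead of
// the global flag functions keeps the parsing easy to call from tests.
func parseArgs(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("pngchunks", flag.ContinueOnError) // hypothetical name
	fs.StringVar(&o.file, "file", "", "path to the PNG file")
	fs.BoolVar(&o.showChunks, "c", false, "display chunks and their first 20 bytes")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	if o.file == "" {
		return o, fmt.Errorf("-file is required")
	}
	return o, nil
}

func main() {
	o, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("file=%s showChunks=%v\n", o.file, o.showChunks)
}
```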