Before we sink into the details of the DICOM standard file format, let us build a simplified version for our own amusement.
In the previous sections we learned about the binary representation of data. We realized that, given a long byte stream, we cannot tell its meaning without knowing the data types used. Anyone who has studied the problems given above could list the following coding problems:
It is best if the beginner thinks over all the issues listed above and tries to build his or her own toy-DICOM file format alone. To make the task simpler, let me define all the types that will be used in this introductory section:
We now have four different data types: the word, the image, the patient name, and the file identifier. Before introducing symbols for these data types, let us not forget about the interpretation of the words that hold the numeric values of the grey levels. It seems practical to start our private toy-DICOM files with a signal that tells which numeric interpretation is used. Let the byte 00 signal that the place values increase from right to left, and let the byte FF signal the opposite direction. We may also introduce a symbolic notation for the same thing: for human consumption, let 00 and FF be denoted by LittleEndian and BigEndian, respectively. Without introducing further rules, let us agree that we will use the Latin-2 character set only. So far we have seen that a DICOM file may have a so-called header containing information about the whole file. Indeed, the header is a very important part of real DICOM files.
Having agreed on all that, let us mark the start of the patient name with the byte 01, and let its symbol for human consumption be PN. Let the introductory byte for the file identifier be 02, and for the image let it be 03; the corresponding human-readable symbols are UID and Image. There is one more thing to agree on. Let each introductory byte (signaling the beginning of a sequence of special data) be followed by one byte giving the length of the data that follows, counted in the agreed units: bytes for the PN and the UID, and words for the pixels.
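The rule of "introductory byte, then one length byte, then the data" can be sketched in a few lines of Python. This is only an illustration of the agreement above, not part of any real DICOM library; the function name `encode_field` and its `unit_size` parameter are made up for this example.

```python
def encode_field(tag: int, payload: bytes, unit_size: int = 1) -> bytes:
    """Return introductory byte + length byte + payload for one toy field.

    unit_size is 1 for PN and UID (length counted in bytes) and 2 for
    the image (length counted in two-byte words).
    """
    if len(payload) % unit_size != 0:
        raise ValueError("payload is not a whole number of units")
    count = len(payload) // unit_size
    if count > 0xFF:
        raise ValueError("length does not fit into a single byte")
    return bytes([tag, count]) + payload


# Patient name: introductory byte 01, text encoded in Latin-2 (ISO 8859-2).
name = "A" + "e" * 6 + "o" * 4
pn_field = encode_field(0x01, name.encode("iso8859_2"))
print(pn_field.hex(" "))

# Image: introductory byte 03, length counted in words (2 bytes each).
two_pixels = encode_field(0x03, bytes.fromhex("0001000a"), unit_size=2)
print(two_pixels.hex(" "))
```

Because the length occupies a single byte, a field in this toy format can hold at most 255 units, which is one reason real formats use wider length fields.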
Before giving a sample file to decode, let us create a table summarizing what we have said so far about our private file format.
Byte | Symbolic sign (for the human reader) | Explanation | Example of one such unit in hexadecimal form | Interpretation of the example
00 | LittleEndian | The words in the file have to be read from right to left, that is, the place values increase from right to left. This label is the toy format's own: in conventional usage, storing the most significant byte first is called big-endian. | 10 2a | 16*256+42=4096+42=4138
FF | BigEndian | The words in the file have to be read from left to right, that is, the place values increase from left to right. | 10 2a | 16+256*42=16+10752=10768
01 | PN | The beginning of the patient name. Note that the first byte following this sign gives the number of characters in the name; it is not the byte of the first letter of the name. | 010B 41656565656565 6F6F6F6F0300010001 | 01 tells that the patient name follows; 0B tells that the patient name has 0B=11 characters. The next 11 bytes have to be interpreted (as we agreed) according to the Latin-2 character table. Since in Latin-2 41=A, 65=e, 6F=o, the 11-character patient name is "Aeeeeeeoooo". We do not worry about the next byte, 03, because we know the patient name contains only 11 characters.
02 | UID | The beginning of the file identifier. Note that the first byte following this sign gives the number of characters in the identifier; it is not the byte of the first character of the UID. | 02020303 | 02: the UID follows; 02: it has only two bytes: 0303. That is, the file identifier is 33.
03 | Image | The beginning of the part of the file that describes an image. Note that the first byte following this sign is always 19 (hexadecimal), that is 25, because our images are always 5x5 pixels. | 0319 0000 0000 0000 0000 0000 0001 0001 0001 0001 0001 000A 000A 000A 000A 000A 000B 000B 000B 000B 000B 0000 0000 0000 0000 0000 | 03 tells that the image follows. 19 means 25, that is, the image contains 25 pixels whose values are given in words. We do not know how to interpret the words yet, since we have not given the first byte of the file.
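The two readings of the example word 10 2a can be checked with Python's built-in `int.from_bytes`. Note that Python's `"big"` and `"little"` follow the conventional meaning of these words (most significant byte first, and least significant byte first), so the variable names below describe the direction in which the place values increase rather than reusing the labels of the table.

```python
word = bytes.fromhex("102a")

# Place values increase from right to left (first byte is the most
# significant one): the order the toy format marks with the byte 00.
right_to_left = int.from_bytes(word, byteorder="big")

# Place values increase from left to right (first byte is the least
# significant one): the order the toy format marks with the byte FF.
left_to_right = int.from_bytes(word, byteorder="little")

print(right_to_left)   # 16*256 + 42 = 4138
print(left_to_right)   # 16 + 256*42 = 10768
```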
Let us choose the usual method, that is, the one we use when interpreting decimal numbers: let the place values increase from right to left. As a result, the header of our file will be 00, or LittleEndian.
Now, here is the byte-level dump of our toy DICOM file:
00 01 0B 41 65 65 65 65 65 65 6F 6F 6F 6F 02 02
03 03 03 19 00 00 00 00 00 00 00 00 00 00 00 01
00 01 00 01 00 01 00 01 00 0A 00 0A 00 0A 00 0A
00 0A 00 0B 00 0B 00 0B 00 0B 00 0B 00 00 00 00
00 00 00 00 00 00
First, let us translate this dump into a readable format. Note that the first part of the file gives information about the file as a whole; this is why this part will be called the MetaHeader. All the other information, except the image-related part, will be called the Header:
Meta Header:
LittleEndian (lower place values to the right)
Header:
PN (Patient name 11 characters): Aeeeeeeoooo
UID (File identifier 2 characters): 33
Image: (Pixel values 25 words, 5x5 pixels):
0, 0, 0, 0, 0
1, 1, 1, 1, 1
10, 10, 10, 10, 10
11, 11, 11, 11, 11
0, 0, 0, 0, 0
End of the toy DICOM file.
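The translation above can be reproduced with a short parser. The following is a minimal sketch under several assumptions: the function name `parse_toy_dicom` is invented for this example, the image rows are assumed to be 5 pixels wide (our images are always 5x5), the UID is shown by writing each of its bytes as a decimal digit so that the bytes 03 03 read as 33, and the sample bytes use 19 (hexadecimal) as the image length because 0x19 = 25.

```python
def parse_toy_dicom(data: bytes) -> dict:
    """Decode a toy-DICOM byte stream into a dictionary (illustrative only)."""
    # First byte: 00 = place values increase from right to left,
    # FF = place values increase from left to right.
    order = {0x00: "big", 0xFF: "little"}[data[0]]
    result = {"order_byte": data[0]}
    pos = 1
    while pos < len(data):
        tag, count = data[pos], data[pos + 1]
        pos += 2
        if tag == 0x01:    # PN: `count` bytes of Latin-2 text
            result["PN"] = data[pos:pos + count].decode("iso8859_2")
            pos += count
        elif tag == 0x02:  # UID: `count` bytes, shown as digits (assumption)
            result["UID"] = "".join(str(b) for b in data[pos:pos + count])
            pos += count
        elif tag == 0x03:  # Image: `count` two-byte words
            words = [int.from_bytes(data[pos + 2 * i:pos + 2 * i + 2], order)
                     for i in range(count)]
            # Rows of 5 pixels (assumed fixed image width).
            result["Image"] = [words[r:r + 5] for r in range(0, count, 5)]
            pos += 2 * count
        else:
            raise ValueError(f"unknown introductory byte {tag:#04x}")
    return result


sample = bytes.fromhex(
    "00"                                   # order of place values
    + "010b41" + "65" * 6 + "6f" * 4       # PN, 11 characters
    + "02020303"                           # UID, 2 bytes
    + "0319"                               # Image, 0x19 = 25 words
    + "0000" * 5 + "0001" * 5 + "000a" * 5 + "000b" * 5 + "0000" * 5
)

decoded = parse_toy_dicom(sample)
print("PN: ", decoded["PN"])
print("UID:", decoded["UID"])
for row in decoded["Image"]:
    print(row)
```

The parser never needs to guess where a field ends: the length byte after each introductory byte tells it how far to read, which is exactly why the length was added to the format.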
Now, if we have the following grey scale:
Then our toy DICOM file with identifier 33, belonging to the patient Aeeeeeeoooo, stores the following image:
The original document is available at http://549552.cz968.group/tiki-index.php?page=Introduction+to+the+a+simplified+toy-DICOM+file+format+III