Why can't we understand the content of a binary file after compiled?
So, if I have understood everything correctly
Not quite.
It is a binary file and its data is incomprehensible for us humans
Typically a binary file is incomprehensible to both humans and machines, especially when the purpose of the file is unknown. Note that not all binary files are executable files. A lot of binary files are data files that do not contain any machine instructions. That is why file extensions are used when naming files (in some OSes). The .com extension was used by CP/M to denote an executable file. The .exe extension was added by MS-DOS to denote another executable file format. *nixes use the execute attribute to denote which files can be executed, although such a file could be a script as well as machine code.
As already mentioned by others, binary files, which contain numbers, should be viewed with a hex dump program or hex editor and not with a text viewer.
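As a rough illustration of what a hex dump program does, here is a minimal Python sketch. The function name hex_dump and the sample bytes are made up for this example; real tools such as hex editors offer far more.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes as: offset, hex values, printable ASCII."""
    pad = width * 3 - 1  # column width of a full row of hex values
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Printable ASCII is shown as-is; every other byte becomes a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text_part}")
    return "\n".join(lines)


# Illustrative sample bytes only, not taken from any real file.
sample = b"MZ\x90\x00\x03\x00\x00\x00This is text\x00\xff"
print(hex_dump(sample))
```

Each byte is shown as a number, so nothing is lost or mangled the way it would be in a text viewer that tries to interpret the bytes as characters.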
there is an example of the content of the ping.exe program
That file is actually a relocatable program, and not all of the data in that file represents machine code. There is information about the program such as which dynamic libraries it needs, which routines have to be linked, requirements for stack and program & data memory, and the program’s entry point. Address operands in the file could be relative values that need to be calculated to absolute values, or references that need to be resolved.
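To make the "information about the program" point concrete, here is a hedged Python sketch that pulls the entry point out of the header of a Windows .exe (PE format). The function name read_pe_entry_point is invented for this example, and it only checks the two signatures; it does no other validation. The value it returns is an address relative to wherever the loader places the image, not an absolute address.

```python
import struct


def read_pe_entry_point(data: bytes) -> int:
    """Return the entry point (relative to the load address) of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # The DOS header stores the offset of the PE header at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Skip the 4-byte signature and the 20-byte COFF file header.
    optional_header = pe_offset + 4 + 20
    # AddressOfEntryPoint sits 16 bytes into the optional header.
    (entry_rva,) = struct.unpack_from("<I", data, optional_header + 16)
    return entry_rva
```

On a Windows machine one could read the bytes of ping.exe with open(path, "rb") and pass them in; the path is left out here because it depends on the system.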
The “program file” that you’re probably thinking of is called a binary image file or a dump of program memory. Such a file would contain only machine code and data, with all address references properly set for execution.
even if they know Assembly code (the lowest level of machine language.)
Assembly language is not the same as machine language. The typical CPU (excluding high-level language computers) accepts machine code as input, one instruction at a time. The operands are either registers or numeric memory addresses. Assembly language is a higher-level language that can use symbolic labels for instruction locations and variables, and that replaces numeric op-codes with mnemonics. An assembly language program has to be converted to machine language/code before it can actually be executed (typically by utilities called assembler, linker and loader).
The reverse operation, disassembly, can be performed on program files with some success, but symbolic information is lost. Disassembly of a memory dump or program image file is more trial & error, as code and data locations need to be identified manually.
BTW there are people who can read and write (numeric) machine code. Of course this is a lot easier on an 8-bit CPU or microcontroller than on a 32-bit CISC processor with a dozen memory address modes.
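The difference can be sketched with a toy assembler in Python. Everything here is hypothetical: the instruction set (LOAD, ADD, STORE, JMP, HALT), the op-code numbers and the DB directive are invented for illustration and do not correspond to any real CPU. Each op-code and each operand is one byte.

```python
# Hypothetical op-codes for an imaginary 8-bit machine.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "JMP": 0x04, "HALT": 0xFF}


def size_of(fields):
    """Bytes emitted by one parsed line (DB emits a single data byte)."""
    if not fields:
        return 0
    return 1 if fields[0].upper() == "DB" else len(fields)


def assemble(source: str) -> bytes:
    parsed = []
    for line in source.splitlines():
        line = line.split(";")[0].strip()  # drop comments
        if not line:
            continue
        label = None
        if ":" in line:
            label, line = (part.strip() for part in line.split(":", 1))
        parsed.append((label, line.split()))

    # Pass 1: turn every symbolic label into a numeric address.
    labels, address = {}, 0
    for label, fields in parsed:
        if label:
            labels[label] = address
        address += size_of(fields)

    # Pass 2: replace mnemonics and labels with numbers.
    out = bytearray()
    for _, fields in parsed:
        if not fields:
            continue
        mnemonic = fields[0].upper()
        if mnemonic == "DB":
            out.append(int(fields[1], 0))
            continue
        out.append(OPCODES[mnemonic])
        for operand in fields[1:]:
            out.append(labels[operand] if operand in labels else int(operand, 0))
    return bytes(out)


program = """
start:  LOAD a      ; symbolic names, readable by humans
        ADD b
        STORE a
        HALT
a:      DB 5
b:      DB 7
"""
print(assemble(program).hex())  # prints 010702080307ff0507
```

The output contains only numbers: the names start, a and b are gone, which is why the machine code is so much harder to read than the assembly source.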
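A toy disassembler shows both problems. This is only a sketch for an invented 8-bit instruction set (the op-code numbers and names below are hypothetical, not a real CPU), and the code_end parameter stands in for the manual step of deciding where code stops and data begins.

```python
# Hypothetical op-codes for an imaginary 8-bit machine.
MNEMONICS = {0x01: "LOAD", 0x02: "ADD", 0x03: "STORE", 0x04: "JMP", 0xFF: "HALT"}
NO_OPERAND = {"HALT"}


def disassemble(image: bytes, code_end=None):
    """Decode bytes linearly; bytes at or after code_end are treated as data."""
    if code_end is None:
        code_end = len(image)  # no hint given: assume everything is code
    lines, pc = [], 0
    while pc < len(image):
        byte = image[pc]
        name = MNEMONICS.get(byte) if pc < code_end else None
        needs_operand = name is not None and name not in NO_OPERAND
        if name is None or (needs_operand and pc + 1 >= len(image)):
            lines.append(f"{pc:04x}: DB 0x{byte:02x}")
            pc += 1
        elif not needs_operand:
            lines.append(f"{pc:04x}: {name}")
            pc += 1
        else:
            # Operands come back as bare numbers; the labels are gone.
            lines.append(f"{pc:04x}: {name} 0x{image[pc + 1]:02x}")
            pc += 2
    return lines


# Four instructions followed by two data bytes (values 1 and 2).
image = bytes.fromhex("010702080307ff0102")
print("\n".join(disassemble(image)))              # data decoded as "LOAD 0x02"
print("\n".join(disassemble(image, code_end=7)))  # data shown as DB bytes
```

Without the hint, the two data bytes at the end happen to look like a LOAD instruction, so the listing is wrong there. Only someone who has worked out that code ends at address 7 can get a correct listing, and even then the original variable names cannot be recovered.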
Originally posted 2013-11-10 00:10:52.