- cross-posted to:
- programmerhumor@lemmy.ml
Also, do y’all call main() in the if block or do you just put the code you want to run in the if block?
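For anyone unfamiliar with the idiom being asked about, here is a minimal sketch of the "call main() in the if block" variant. The names greet and main are made up for illustration; nothing here comes from the original post.

```python
import sys


def greet(name):
    """Library-style function: importing this file only defines it."""
    return f"Hello, {name}!"


def main(argv):
    """Entry point that is used only when the file is run as a script."""
    name = argv[1] if len(argv) > 1 else "world"
    print(greet(name))
    return 0


if __name__ == "__main__":
    # True for `python thisfile.py`, false for `import thisfile`.
    main(sys.argv)
```

Keeping the body in a main() function rather than directly in the if block means the script logic can also be imported and called (or tested) from elsewhere.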
Can someone explain to me how to compile a C library with “main” and a program with main? How does executing a program actually work? It has an executable flag, but what actually happens in the OS when it encounters a file with an executable flag? How does it know to execute “main”? Is it possible to have a library that can be called and also executed like a program?
Anti Commercial-AI license
Way too long an answer for a lemmy post
Depends on OS. Linux will look at the first bytes of the file, either see (ASCII) #! (called a shebang) or ELF magic, then call the appropriate interpreter with the executable as an argument. When executing e.g. python, it’s going to call /usr/bin/env with parameters python and the file name because the shebang was #!/usr/bin/env python.

Compiled C programs are ELF so it will go through the ELF header, figure out which ld.so to use, then start that so that it will find all the libraries, resolve all dynamic symbols, then do some bookkeeping, and jump to _start.

That is, it doesn’t: main is a C thing.

Absolutely. ld.so is an example of that.. Actually, wait, I’m not so sure any more, I’m getting things mixed up with libdl.so. In any case ld.so is an executable with a file extension that makes it look like a library.

EDIT: It does work. My (GNU) libc spits out version info when executed as an executable.
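The "look at the first bytes" step can be sketched in a few lines. This is only an illustration of the dispatch described above, not the kernel's actual code; classify_executable and shebang_command are hypothetical helper names.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def classify_executable(head):
    """Guess how Linux would treat a file, given its first bytes."""
    if head.startswith(b"#!"):
        return "script"
    if head.startswith(ELF_MAGIC):
        return "elf"
    return "unknown"


def shebang_command(head, path):
    """Approximate the argv built for a '#!' script located at `path`."""
    first_line = head[2:].split(b"\n", 1)[0].decode().strip()
    # Linux passes everything after the interpreter path as ONE argument,
    # hence maxsplit=1 instead of splitting on every space.
    parts = first_line.split(None, 1)
    return parts + [path]


if __name__ == "__main__":
    head = b"#!/usr/bin/env python\nprint('hi')\n"
    print(classify_executable(head), shebang_command(head, "hello.py"))
```

For the shebang from the comment above, this yields /usr/bin/env as the program, with python and the file name as its parameters.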
If you want to start looking at the innards like that I would suggest starting here: Hello world in assembly. Note the absence of a main function: the symbol the kernel actually invokes is _start, and the setup necessary to call a C main is done by libc.so. Don’t try to understand GNU’s libc, it’s full of hysterical raisins; I would suggest musl.

How does that work? There must be something above ld.so, maybe the OS? Because looking at the ELF header, ld.so is a shared library, “Type: DYN (Shared object file)”:

    $ readelf -hl ld.so
    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 03 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - GNU
      ABI Version:                       0
      Type:                              DYN (Shared object file)
      Machine:                           Advanced Micro Devices X86-64
      Version:                           0x1
      Entry point address:               0x1d780
      Start of program headers:          64 (bytes into file)
      Start of section headers:          256264 (bytes into file)
      Flags:                             0x0
      Size of this header:               64 (bytes)
      Size of program headers:           56 (bytes)
      Number of program headers:         11
      Size of section headers:           64 (bytes)
      Number of section headers:         23
      Section header string table index: 22

    Program Headers:
      Type           Offset             VirtAddr           PhysAddr
                     FileSiz            MemSiz              Flags  Align
      LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                     0x0000000000000db8 0x0000000000000db8  R      0x1000
      LOAD           0x0000000000001000 0x0000000000001000 0x0000000000001000
                     0x0000000000029435 0x0000000000029435  R E    0x1000
      LOAD           0x000000000002b000 0x000000000002b000 0x000000000002b000
                     0x000000000000a8c0 0x000000000000a8c0  R      0x1000
      LOAD           0x00000000000362e0 0x00000000000362e0 0x00000000000362e0
                     0x0000000000002e24 0x0000000000003000  RW     0x1000
      DYNAMIC        0x0000000000037e80 0x0000000000037e80 0x0000000000037e80
                     0x0000000000000180 0x0000000000000180  RW     0x8
      NOTE           0x00000000000002a8 0x00000000000002a8 0x00000000000002a8
                     0x0000000000000040 0x0000000000000040  R      0x8
      NOTE           0x00000000000002e8 0x00000000000002e8 0x00000000000002e8
                     0x0000000000000024 0x0000000000000024  R      0x4
      GNU_PROPERTY   0x00000000000002a8 0x00000000000002a8 0x00000000000002a8
                     0x0000000000000040 0x0000000000000040  R      0x8
      GNU_EH_FRAME   0x0000000000031718 0x0000000000031718 0x0000000000031718
                     0x00000000000009b4 0x00000000000009b4  R      0x4
      GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                     0x0000000000000000 0x0000000000000000  RW     0x10
      GNU_RELRO      0x00000000000362e0 0x00000000000362e0 0x00000000000362e0
                     0x0000000000001d20 0x0000000000001d20  R      0x1
The program headers don’t have interpreter information either. Compare that to ls, “Type: EXEC (Executable file)”:

    $ readelf -hl ls
    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              EXEC (Executable file)
      Machine:                           Advanced Micro Devices X86-64
      Version:                           0x1
      Entry point address:               0x40b6e0
      Start of program headers:          64 (bytes into file)
      Start of section headers:          1473672 (bytes into file)
      Flags:                             0x0
      Size of this header:               64 (bytes)
      Size of program headers:           56 (bytes)
      Number of program headers:         14
      Size of section headers:           64 (bytes)
      Number of section headers:         32
      Section header string table index: 31

    Program Headers:
      Type           Offset             VirtAddr           PhysAddr
                     FileSiz            MemSiz              Flags  Align
      PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                     0x0000000000000310 0x0000000000000310  R      0x8
      INTERP         0x00000000000003b4 0x00000000004003b4 0x00000000004003b4
                     0x0000000000000053 0x0000000000000053  R      0x1
      LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                     0x0000000000007570 0x0000000000007570  R      0x1000
      LOAD           0x0000000000008000 0x0000000000408000 0x0000000000408000
                     0x00000000000decb1 0x00000000000decb1  R E    0x1000
      LOAD           0x00000000000e7000 0x00000000004e7000 0x00000000004e7000
                     0x00000000000553a0 0x00000000000553a0  R      0x1000
      LOAD           0x000000000013c9c8 0x000000000053d9c8 0x000000000053d9c8
                     0x000000000000d01c 0x0000000000024748  RW     0x1000
      DYNAMIC        0x0000000000148080 0x0000000000549080 0x0000000000549080
                     0x0000000000000250 0x0000000000000250  RW     0x8
      NOTE           0x0000000000000350 0x0000000000400350 0x0000000000400350
                     0x0000000000000040 0x0000000000000040  R      0x8
      NOTE           0x0000000000000390 0x0000000000400390 0x0000000000400390
                     0x0000000000000024 0x0000000000000024  R      0x4
      NOTE           0x000000000013c380 0x000000000053c380 0x000000000053c380
                     0x0000000000000020 0x0000000000000020  R      0x4
      GNU_PROPERTY   0x0000000000000350 0x0000000000400350 0x0000000000400350
                     0x0000000000000040 0x0000000000000040  R      0x8
      GNU_EH_FRAME   0x0000000000126318 0x0000000000526318 0x0000000000526318
                     0x0000000000002eb4 0x0000000000002eb4  R      0x4
      GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                     0x0000000000000000 0x0000000000000000  RW     0x10
      GNU_RELRO      0x000000000013c9c8 0x000000000053d9c8 0x000000000053d9c8
                     0x000000000000c638 0x000000000000c638  R      0x1
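The difference between the two dumps is the INTERP entry: as far as I understand it, the kernel hands control to a dynamic loader only if the file names one there, and otherwise jumps straight to the file's own entry point. Here is a rough sketch of that lookup; find_interpreter is a hypothetical helper, and it assumes a 64-bit little-endian ELF like the ones shown.

```python
import struct
from typing import Optional

PT_INTERP = 3  # program header type that names the dynamic loader


def find_interpreter(image: bytes) -> Optional[str]:
    """Return the PT_INTERP path stored in an ELF image, or None."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if image[4] != 2 or image[5] != 1:
        raise ValueError("this sketch only handles ELF64, little endian")
    # ELF64 header: e_phoff at byte 32, e_phentsize and e_phnum at byte 54.
    (e_phoff,) = struct.unpack_from("<Q", image, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", image, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # Program header: p_type, p_flags, p_offset, then p_filesz at +32.
        p_type, _p_flags, p_offset = struct.unpack_from("<IIQ", image, base)
        if p_type == PT_INTERP:
            (p_filesz,) = struct.unpack_from("<Q", image, base + 32)
            raw = image[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\x00").decode()
    return None
```

Feeding it the bytes of ls should return the loader path, while ld.so itself should give None, which matches the missing INTERP line in its dump.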
It feels like somewhere in the flow there is the same thing that’s happening in python just more hidden. Python seems to expose it because a file can be a library and an executable at the same time.
Your ld.so contains:

EDIT: …with which I meant, modulo brainfart: My libc.so.6 contains a proper entry address, while other libraries are pointing at 0x0 and coredump when executed. libc.so is a linker script, presumably because GNU compulsively overcomplicates everything.

…I guess that’s enough for the kernel. It might be a Linux-only thing, maybe even unintended, and, well, Linux doesn’t break userspace.
Speaking of, I was playing it a bit fast and loose: _start is merely the default symbol name for the entry label, I’m sure nasm and/or ld have ways to set it to something different.

Btw, ld.so is a symlink to ld-linux-x86-64.so.2, at least on my system. It is a statically linked executable. ld.so is, in simpler words, an interpreter for the ELF format, and you can run it: ld.so --help
Which seems to be contained in the only executable segment of ld.so:

    LOAD 0x0000000000001000 0x0000000000001000 0x0000000000001000 0x0000000000028bb5 0x0000000000028bb5 R E 0x1000

Edit: My understanding of this is quite shallow; the above is a segment that in this case contains the entirety of the .text section.

You don’t. In C everything gets referenced by a symbol during the link stage of compilation. Libraries ultimately get treated like your source code during compilation and all items land in a symbol table. Two items with the same name result in a link failure and compilation aborts. So a library and a program with main is no bueno.
When Linux loads an executable, it basically reads the entry point address from the ELF header and starts executing at that point; the C runtime startup code found there then calls “main”.
Windows behaves mostly the same way, as does macOS. Most RTOSes have their own special way of doing things; on bare metal you’re at the mercy of your CPU vendor. The C standard specifies that “main” is the special symbol we all just happen to use.
There are a lot of other helpful replies in this thread, so I won’t add much, but I did find this reference, which you could read if you have a lot of free time. But I particularly liked reading this summary:
If you want to have a library that can also be a standalone executable, just put the main function in an extra file and don’t compile that file when using the library as a library.
You could also use the preprocessor to do it similar to python but please don’t.
Just use any build tool, and have two targets, one library and one executable:
    LIB_SOURCES = tools.c, stuff.c, more.c
    EXE_SOURCES = main.c, $LIB_SOURCES
Edit: added example
I haven’t done much low level stuff, but I think the ‘main’ function is something the compiler uses to establish an entry point for the compiled binary. The name ‘main’ would not exist in the compiled binary at all, but the function itself would still exist. Executable formats aren’t all the same, so they’ll have different ways of determining where this entry point function is expected to be. You can ‘run’ a binary library file by invoking a function contained therein, which is how DLL files work.
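Calling into a shared library without any main involved can be tried from Python with ctypes. This is a small sketch that assumes a Unix-like system where the C library can be located; the string passed to strlen is arbitrary.

```python
import ctypes
import ctypes.util

# find_library may return None on some systems; CDLL(None) then falls back
# to the symbols already loaded into the Python process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"no main() needed"))
```

The library is mapped into the process and a single exported function is looked up by name and called, which is roughly what a program linking against a DLL or .so does at load time.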