Fixing ptrace(PT_DENY_ATTACH, ...) on Mac OS X 10.5 Leopard

22 Jan 2008, 03:58 PST

Introduction

PT_DENY_ATTACH is a non-standard ptrace() request type that prevents a debugger from attaching to the calling process. Adam Leventhal recently discovered that Leopard extends PT_DENY_ATTACH to prevent introspection into processes using DTrace. I hope Adam will forgive me for quoting him here, but he put it best:

This is antithetical to the notion of systemic tracing, antithetical to the goals of DTrace, and antithetical to the spirit of open source. I'm sure this was inserted under pressure from ISVs, but that makes the pill no easier to swallow.

This article covers disabling PT_DENY_ATTACH for all processes on Mac OS X 10.5. Over the previous few years, I've provided similar hacks for both Mac OS X 10.4 and 10.3.

To be clear: this work-around is a hack, and I hold that the correct fix is the removal of PT_DENY_ATTACH from Mac OS X.

How it Works

In xnu, the sysent array includes function pointers to all system calls. By saving the old function pointer and inserting my own, it's relatively straightforward to insert code in the ptrace(2) path.

However, with Mac OS X 10.4, Apple introduced official KEXT Programming Interfaces, with the intention of providing kernel binary compatibility between major operating system releases. As a part of this effort, the sysent array's symbol cannot be directly resolved from a kernel extension, thus removing the ability to easily override system calls. In 10.4, I was able to work around this with the amusing temp_patch_ptrace() API. This API has disappeared in 10.5.

For Leopard, I decided to find a public symbol that is placed in the data segment, near the sysent array. In the kernel's data segment, nsysent is placed (almost) directly before the sysent array. By examining mach_kernel I can determine the offset to the actual sysent array, and then use this in my kext to patch the actual function. To keep things safe, I added sanity checks to verify that I'd found the real sysent array.

Each sysent structure has the following fields:

struct sysent {
	int16_t		sy_narg;		/* number of arguments */
	int8_t		reserved;		/* unused value */
	int8_t		sy_flags;		/* call flags */
	sy_call_t	*sy_call;		/* implementing function */
	sy_munge_t	*sy_arg_munge32;	/* munge system call arguments for 32-bit processes */
	sy_munge_t	*sy_arg_munge64;	/* munge system call arguments for 64-bit processes */
	int32_t		sy_return_type; /* return type */
	uint16_t	sy_arg_bytes;	/* The size of all arguments for 32-bit system calls, in bytes */
};
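To see where sy_call sits inside each entry, the layout can be modelled in userland. This is a minimal sketch, assuming the 32-bit i386 kernel discussed here, so pointers are modelled as 4-byte integers; the class name Sysent32 is made up for illustration and is not part of xnu.

```python
import ctypes

class Sysent32(ctypes.Structure):
    """Userland model of struct sysent for a 32-bit kernel (assumption)."""
    _fields_ = [
        ("sy_narg", ctypes.c_int16),
        ("reserved", ctypes.c_int8),
        ("sy_flags", ctypes.c_int8),
        ("sy_call", ctypes.c_uint32),         # function pointer, 4 bytes on i386
        ("sy_arg_munge32", ctypes.c_uint32),  # function pointer
        ("sy_arg_munge64", ctypes.c_uint32),  # function pointer
        ("sy_return_type", ctypes.c_int32),
        ("sy_arg_bytes", ctypes.c_uint16),
    ]

ENTRY_SIZE = ctypes.sizeof(Sysent32)      # includes trailing alignment padding
SY_CALL_OFFSET = Sysent32.sy_call.offset  # bytes from entry start to sy_call

print("entry size:", ENTRY_SIZE)
print("sy_call offset:", SY_CALL_OFFSET)
```

Under these assumptions each entry occupies 24 bytes and sy_call begins 4 bytes into the entry, which is the offset that matters when searching the data segment for a known function pointer.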

The "sy_call" field contains a function pointer to the actual implementing function for a given syscall. If we look at the actual sysent table, we'll see that the first entry is "SYS_nosys":

__private_extern__ struct sysent sysent[] = {
    {0, 0, 0, (sy_call_t *)nosys, NULL, NULL, _SYSCALL_RET_INT_T, 0},

To narrow down the haystack, we'll find the address of the nsysent variable, and then search for the nosys function pointer -- as shown above, nosys should be the first entry in the sysent array.

nm /mach_kernel | grep _nsysent
00502780 D _nsysent
nm /mach_kernel | grep T\ _nosys
00388604 T _nosys

Here is a dump of mach_kernel, starting at 0x502780. The first value is 0x01AB, or 427 -- by looking at the kernel headers, we can determine that this is the correct number of syscall entries. 36 bytes after nsysent (at 0x5027a4), we see 0x388604 (in little-endian byte order) -- this is our nosys function pointer. Since sy_call is preceded by 4 bytes of other fields in the sysent structure, the sysent array itself is located 32 bytes after the nsysent variable address. (On PPC, it's directly after.)

otool -d /mach_kernel
00502780        ab 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00502790        00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
005027a0        00 00 00 00 04 86 38 00 00 00 00 00 00 00 00 00
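The arithmetic can be checked against the dump itself. The sketch below is a userland illustration only, not the kext's code: it copies the 48 bytes shown by otool, reads nsysent, searches for the little-endian nosys pointer, and backs up by the 4 bytes of fields that precede sy_call. The constant names are chosen here for readability.

```python
import struct

NSYSENT_ADDR = 0x00502780   # from the nm output
NOSYS_ADDR = 0x00388604     # from the nm output
SY_CALL_OFFSET = 4          # int16_t + int8_t + int8_t precede sy_call

# The 48 bytes shown in the otool dump, starting at NSYSENT_ADDR.
dump = bytes.fromhex(
    "ab010000" + "00" * 12 +            # 00502780
    "00" * 16 +                         # 00502790
    "00000000" + "04863800" + "00" * 8  # 005027a0
)

nsysent = struct.unpack_from("<i", dump, 0)[0]         # little-endian int
ptr_offset = dump.find(struct.pack("<I", NOSYS_ADDR))  # where nosys appears
sysent_addr = NSYSENT_ADDR + ptr_offset - SY_CALL_OFFSET

print("nsysent =", nsysent)
print("nosys pointer at offset", ptr_offset)
print("sysent array at", hex(sysent_addr))
```

The computed address matches the "calculated sysent location" that the kext logs when it loads.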

Once we have the address of the array, we can find the SYS_ptrace entry and substitute our own ptrace wrapper:

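/* Replacement for ptrace(2): swallow PT_DENY_ATTACH, pass everything else through. */
static int our_ptrace (struct proc *p, struct ptrace_args *uap, int *retval)
{
	if (uap->req == PT_DENY_ATTACH) {
		/* Report success to the caller without marking the process. */
		printf("[ptrace] Blocking PT_DENY_ATTACH for pid %d.\n", uap->pid);
		return (0);
	} else {
		return real_ptrace(p, uap, retval);
	}
}

kern_return_t pt_deny_attach_start (kmod_info_t *ki, void *d) {
	...
	/* Save the original handler, then install the wrapper in its slot. */
	real_ptrace = (ptrace_func_t *) _sysent[SYS_ptrace].sy_call;
	_sysent[SYS_ptrace].sy_call = (sy_call_t *) our_ptrace;
	...
}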
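The save-and-replace pattern is easier to experiment with outside the kernel. The following is a toy model, assuming a dictionary in place of the sysent array and a stub in place of the real ptrace; the helper names install_hook and remove_hook are hypothetical. The numeric values 26 (SYS_ptrace) and 31 (PT_DENY_ATTACH) are the ones defined in the Mac OS X headers.

```python
PT_DENY_ATTACH = 31   # request value from <sys/ptrace.h> on Mac OS X
SYS_ptrace = 26       # ptrace's slot in the syscall table

def real_ptrace_stub(req, pid):
    """Stand-in for the kernel's ptrace; it only reports what it was asked."""
    return ("real", req, pid)

# Toy syscall table: slot number -> implementing function.
table = {SYS_ptrace: real_ptrace_stub}

def install_hook(table):
    """Swap in a wrapper and return the saved original for later restore."""
    saved = table[SYS_ptrace]

    def our_ptrace(req, pid):
        if req == PT_DENY_ATTACH:
            return 0               # pretend success, do nothing
        return saved(req, pid)     # everything else goes to the original

    table[SYS_ptrace] = our_ptrace
    return saved

def remove_hook(table, saved):
    """Put the original handler back, as a module unload would."""
    table[SYS_ptrace] = saved
```

Keeping the saved pointer is what makes a clean unload possible: restoring it returns the table to its original state.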

Download

You can download the kext source here (sig).

Buyer beware: This code has only seen limited testing, and your mileage may vary. If something goes wrong, sanity checks should prevent a panic, and the module will fail to load.

If the module loads correctly, you should see the following in your dmesg output:

[ptrace] Found nsysent at 0x502780 (count 427), calculated sysent location 0x5027a0.
[ptrace] Sanity check 0 1 0 3 4 4: sysent sanity check succeeded.
[ptrace] Patching ptrace(PT_DENY_ATTACH, ...).
[ptrace] Blocking PT_DENY_ATTACH for pid 82248.

Note: To access the nsysent symbol, the kext is required to declare a dependency on a specific version of Mac OS X. When updating to a new minor release, it should be sufficient to change the 'com.apple.kernel' version in the kext's Info.plist. I've uploaded a new version of the kext with this change, but I won't provide future updates unless a code change is required.

<key>OSBundleLibraries</key>
<dict>
    <key>com.apple.kernel</key>
    <string>9.2.0</string>
</dict>
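If the version needs to be bumped for several machines, the change can be scripted. This is a sketch using Python's standard plistlib; the sample plist below is a trimmed, hypothetical stand-in for the kext's real Info.plist, and the starting version 9.1.0 is only an example.

```python
import plistlib

# Hypothetical, trimmed Info.plist containing only the dependency entry.
SAMPLE_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>OSBundleLibraries</key>
    <dict>
        <key>com.apple.kernel</key>
        <string>9.1.0</string>
    </dict>
</dict>
</plist>
"""

def set_kernel_dependency(plist_bytes, version):
    """Return a new plist with the com.apple.kernel dependency set to version."""
    info = plistlib.loads(plist_bytes)
    info["OSBundleLibraries"]["com.apple.kernel"] = version
    return plistlib.dumps(info)

updated = set_kernel_dependency(SAMPLE_PLIST, "9.2.0")
print(updated.decode("utf-8"))
```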

Many thanks to Ryan Chapman for noting this issue, and testing the kext with 10.5.2.

SoyLatte: Release 1.0.1

06 Jan 2008, 19:26 PST

Minor Bugfix

This release fixes a name resolution bug reported by Leif Nelson of LLNL.

I tracked this down to this copy/paste bug in resolver code:

error = getaddrinfo(hostname, "domain", &hints, &res);

The service argument should have been NULL.
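For readers unfamiliar with the call: the second argument to getaddrinfo names a service (or port) to attach to each result, and passing NULL asks for addresses only. The sketch below uses Python's socket.getaddrinfo, which wraps the same C function, rather than the SoyLatte resolver code itself. It uses a numeric address with AI_NUMERICHOST so that no DNS traffic is generated.

```python
import socket

def resolve(host, service=None):
    """Return the set of (address, port) pairs that getaddrinfo reports."""
    infos = socket.getaddrinfo(host, service, socket.AF_INET,
                               socket.SOCK_STREAM, 0, socket.AI_NUMERICHOST)
    return {info[4] for info in infos}

# With no service (the equivalent of NULL in C), the port is left as 0.
no_service = resolve("127.0.0.1")
print(no_service)

# Passing a service name such as "domain" would additionally require the
# name to be looked up in the local services database; not executed here.
```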

Download

Binaries, source, build, and contribution instructions are all available from the SoyLatte project page.

Implementing a Better DNS Dead Drop

06 Jan 2008, 19:15 PST

dead drop (n): A dead drop, or dead letter box, is a location used to secretly pass items between two people, without requiring them to meet.

The Original DNS Dead-Drop

Two years ago, I implemented a DNS-based dead-drop, based on an idea presented by Dan Kaminsky in Attacking Distributed Systems: The DNS Case Study.

Using a recursive, caching name server, coupled with a wildcard zone, it's possible to implement double-blind data transfer. In each DNS query, 7 bits are reserved for a number of flags, one of which is the Recursion Desired (RD) flag. If set to 0, the queried DNS server will not attempt to recurse -- it will only provide answers from its cache.

Combine this with a wildcard zone and it's possible to signal bits (RD on), and read them (RD off). To set a bit to 1 the sender issues a query with the RD bit on. The wildcard zone resolves all requests, including this query. The receiver then issues a query for the same hostname, with the RD bit off. If the bit is 1, the query will return a valid record. If the bit is 0, no record will be returned.
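On the wire, the only difference between the sender's and the receiver's query is a single header bit. The sketch below builds both packets by hand with the standard library, following the RFC 1035 header layout, and sends nothing; it is an illustration and not NSDK's code. The hostname and the query ID are placeholders.

```python
import struct

RD_FLAG = 0x0100  # Recursion Desired bit in the 16-bit header flags word

def build_query(hostname, recursion_desired, query_id=0x1234):
    """Build a minimal DNS query for an A record (no network I/O)."""
    flags = RD_FLAG if recursion_desired else 0
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question

# Placeholder hostname under a hypothetical wildcard zone.
set_bit = build_query("apple.wildcard.example.com", recursion_desired=True)
read_bit = build_query("apple.wildcard.example.com", recursion_desired=False)
```

The two packets are identical apart from the flags word, which is also why the RD-off query is easy to fingerprint.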

So, it's easy to signal a single bit, but what if you want to share more than 1 bit of data? This requires both sides to compute a list of records -- one record for every bit of data we wish to send. In my implementation, I chose to do this with a pre-shared word list and initialization vector (IV). Given the same word list and IV, both sender and receiver can independently compute an identical mapping of words to bit positions. The sender can then signal the '1' bits, and the receiver can query all bits.
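One way to derive the shared mapping is a seeded pseudo-random selection from the word list. The sketch below assumes that approach; the actual NSDK derivation may differ. The function names are hypothetical, and a plain Python set stands in for the recursive server's cache.

```python
import random

def bit_hostnames(words, iv, nbits, domain):
    """Map bit positions 0..nbits-1 to hostnames, deterministically."""
    rng = random.Random(iv)                    # same IV -> same sequence
    chosen = rng.sample(sorted(words), nbits)  # sort so input order is irrelevant
    return [f"{word}.{domain}" for word in chosen]

def send(bits, names, cache):
    """Sender: touch the hostname for every 1 bit (modelled as a cache insert)."""
    for bit, name in zip(bits, names):
        if bit:
            cache.add(name)

def receive(names, cache):
    """Receiver: a bit is 1 if its hostname is already cached."""
    return [1 if name in cache else 0 for name in names]

words = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf", "hotel"]
message_bits = [1, 0, 1, 1, 0, 0, 1, 0]

sender_names = bit_hostnames(words, 42, len(message_bits), "wildcard.example.com")
receiver_names = bit_hostnames(reversed(words), 42, len(message_bits),
                               "wildcard.example.com")

cache = set()
send(message_bits, sender_names, cache)
print(receive(receiver_names, cache))
```

Because both sides seed the generator with the same IV and sort the word list first, they arrive at the same hostname for each bit position without ever exchanging the mapping.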

Hiding the Trail: Using TTL to Signal Bits

To avoid suspicion, a good dead-drop mechanism should not appear unusual to an outside observer. The RD flag is unusual, and a signature to detect its use can easily be added to intrusion detection systems. It would be considerably more sneaky to use a signaling mechanism that relied on more normal-appearing DNS queries.

This is where the time-to-live (TTL) value can be used. When returning query results, many recursive DNS servers include a TTL -- the number of seconds before the recursive name server will purge the record from its cache. The TTL begins decrementing as soon as a record is cached. Therefore, newer lookups will have a higher TTL than older lookups. Using this property, it is possible to determine if a record was previously cached, and thus signal bits without relying on the RD flag.
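The TTL comparison can be tried against a simulated cache. This is a sketch of one plausible heuristic, not necessarily the one NSDK uses: it assumes both parties know the zone's configured TTL, and treats any answer noticeably below that value as "previously cached". ToyResolver, was_cached, and the slack parameter are all made up for the example.

```python
class ToyResolver:
    """Simulated caching resolver; time is an explicit integer clock."""

    def __init__(self, zone_ttl):
        self.zone_ttl = zone_ttl
        self.cached_at = {}   # hostname -> time the record was cached

    def query(self, name, now):
        """Return the remaining TTL, caching the record on a miss or expiry."""
        start = self.cached_at.get(name)
        if start is None or now - start >= self.zone_ttl:
            start = self.cached_at[name] = now
        return self.zone_ttl - (now - start)

def was_cached(observed_ttl, zone_ttl, slack=1):
    """Guess whether a record was already cached before our query."""
    return observed_ttl < zone_ttl - slack

resolver = ToyResolver(zone_ttl=3600)
resolver.query("one.wildcard.example.com", now=0)          # sender signals a 1

ttl_one = resolver.query("one.wildcard.example.com", now=60)
ttl_zero = resolver.query("zero.wildcard.example.com", now=60)
print(was_cached(ttl_one, 3600), was_cached(ttl_zero, 3600))
```

Note that in this model reading is destructive: the receiver's own query caches every hostname it asks about, so a second read within the TTL would see all bits as set.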

To communicate, the sender and receiver need to pre-share a word list, an initialization vector (IV), the IP of a recursive nameserver, a wildcard domain, and a communications window (time of day). Here's how the protocol works:

Sender:

Receiver:

Download

You can download a copy of NSDK here. It's written in Python, and depends on the dnspython library.

The implementation is a proof-of-concept -- the TTL heuristic is very simple, and you'll certainly see bit errors in longer messages. I enjoyed the "Bourne Identity" books way too much, and this is all meant in fun.

Usage

nsdk.py [dns IP] [wildcard domain] [word list] [iv] <message>

Each bit of your message requires at least one DNS query. I strongly suggest testing this implementation against name servers and zones that you control.

To send a message:

./nsdk.py 10.0.0.1 wildcard.example.com /usr/share/dict/words 42 "The crow flies at midnight"
Message sent successfully!

To receive the message:

./nsdk.py 10.0.0.1 wildcard.example.com /usr/share/dict/words 42
Read 208 bits from DNS server
The Secret Message: The crow flies at midnight
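The bit count in the output follows from the message length: 26 characters at 8 bits each is 208 bits, and so at least 208 queries. The sketch below shows that conversion, assuming 8-bit ASCII characters sent most-significant bit first; the bit order NSDK actually uses is not stated here.

```python
def to_bits(message):
    """Expand an ASCII message into a list of bits, MSB first per byte."""
    data = message.encode("ascii")
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]

def from_bits(bits):
    """Reassemble a list of bits (MSB first per byte) into an ASCII string."""
    data = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    return data.decode("ascii")

bits = to_bits("The crow flies at midnight")
print(len(bits), "bits")
print(from_bits(bits))
```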