Fix iOS Frameworks (or GTFO?)

13 Jan 2014, 07:48 PST

In response to my post on iOS Static Libraries, a number of people have asked for a radar they can easily submit to follow up with Apple. In the spirit of Fix Radar or GTFO, here's a template you can use to submit a bug asking Apple to fix the broken state of library distribution on iOS.

Summary:
 
The lack of support for frameworks/dylibs on iOS has become the status quo, and has been and continues to be enormously limiting and costly to the iOS development ecosystem, as described here: http://landonf.bikemonkey.org/code/ios/Radar_15800975_iOS_Frameworks.20140112.html and in rdar://15800975
 
Nearly 7 years after the introduction of iOS, it is well past time for Apple to prioritize closing the feature gap between iOS and Mac toolchains. A real framework solution plays a central role in how we as third-party developers can share and deliver common code.
 
Steps to Reproduce:
 
Ship or consume 3rd party libraries on iOS.
 
Expected Results:
 
We can leverage the long-standing functionality of dylibs and frameworks as exists on Mac OS X.
 
Actual Results:
  
- Anyone distributing libraries has had to adopt hackish workarounds to facilitate their use by other developers.
- Anyone shipping resources to be bundled with their library has had to adopt similar work-arounds.
- Reproducibility and debugging information is lost, and common debug info cannot be shared or managed by the library provider.
- The limitations of Xcode and the need for multi-platform building for both iOS+Simulator (and often Mac OS X) force developers to deploy complex, technically incorrect solutions, such as lipo'ing together device and simulator binaries.
- Standard static libraries do not support dylib linker features that are hugely useful when shipping and consuming libraries, such as two-level namespacing, LC_LOAD_DYLIB, etc.
 
Version:
Xcode 5.0.3 5A3005
 
Notes:

iOS Static Libraries Are, Like, Really Bad, And Stuff (Radar 15800975)

12 Jan 2014, 12:33 PST

Introduction

When I first documented static frameworks as a partial workaround for the lack of shared libraries and frameworks on iOS, it was 2008.

Nearly six years later, we still don't have a solution on par with Mac OS X frameworks for distributing libraries, and in my experience, this has introduced unnecessary cost and complexity across the entire ecosystem of iOS development. I decided to sit down and write up my concerns as a bug report (rdar://15800975), and realized that I'd nearly written an essay (and that I was overflowing the 3000-character limit of Apple's broken Radar web UI), and that I may as well turn it into an actual blog post.

The lack of a clean library distribution format has had a significant, but not always obvious, effect on the iOS development community and its norms. I can't help but wonder whether the responsible parties at Apple -- where internal developers aren't subject to the majority of constraints we are -- realize just how much the lack of a clean library distribution mechanism has impacted not just how we share libraries with each other, but also how we write them.

It's been nearly 7 years since the introduction of iPhoneOS. iOS needs real frameworks, and moreover, iOS needs multiple-platform frameworks, with support for bundling Simulator, Device, and Mac binaries -- along with their resources, headers, and related content -- into a single atomic distribution bundle that application developers can drag and drop into their projects.

The Problems with Static Libraries

From the perspective of someone who has spent nearly 13 years on Mac OS X and iOS (and various UNIXes and Mac OS before that), there is a litany of obvious costs and inefficiencies caused entirely by the lack of support for proper library distribution on iOS.

The limitations stand out in stark relief when compared to Mac OS X's existing framework support, or even to other language environments such as Ruby, Python, Java, or Haskell, all of which provide more consistent, comprehensive mechanisms than the iOS toolchain for building, distributing, and declaring dependencies on common libraries.

Targeting Simulator and Device

When targeting iOS, anyone distributing binary static libraries has had to adopt complicated workarounds to facilitate both adoption and usage by developers. If you look at all the common static libraries for iOS -- PLCrashReporter included -- they've been manually lipo'd from iOS/ARM and Simulator/x86 binaries to create a single static library to simplify linking. Xcode doesn't support this, requiring complex use of multiple targets (often duplicated for each platform), custom build scripts, and more complex development processes that increase the cognitive load for any other developers that might want to build the project.

On top of this, such binaries are technically invalid; Mach-O Universal binaries only encode architecture, not platform, and were there ever to be an ARM-based Mac, or an x86-based iOS device, these libraries would fail to link, as they conflate architecture (arm/x86) with platform (ios/simulator). Despite all that, we hack up our projects and ship these lipo'd binaries anyway, as the alternative is increasing the integration complexity for every single user of our library.
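To make the "architecture, not platform" point concrete, here is a minimal sketch in C that walks the header of a universal binary held in memory. The struct layout is redeclared locally (mirroring what Apple's <mach-o/fat.h> describes) so that it builds anywhere; the function name and the SKETCH_ constants are my own inventions for illustration. Each per-slice entry carries a CPU type, CPU subtype, offset, size, and alignment, and nothing that says "iOS" or "Simulator".

```c
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the on-disk universal ("fat") header values, redeclared
 * here so the sketch builds without <mach-o/fat.h>. All header fields are
 * stored big-endian. The SKETCH_ names are illustrative, not Apple's. */
#define SKETCH_FAT_MAGIC      0xcafebabeu
#define SKETCH_CPU_TYPE_X86   7
#define SKETCH_CPU_TYPE_ARM   12
#define SKETCH_FAT_HEADER_LEN 8   /* magic + nfat_arch */
#define SKETCH_FAT_ARCH_LEN   20  /* cputype, cpusubtype, offset, size, align */

static uint32_t sketch_read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Copies the CPU type of each slice into out[] (at most max entries) and
 * returns the number of slices in the header, or -1 if buf does not hold a
 * well-formed fat header. Note what is absent: there is no platform field
 * to read, so a Simulator/x86 slice and a Mac/x86 slice look identical. */
int sketch_fat_list_cputypes(const uint8_t *buf, size_t len,
                             int32_t *out, int max)
{
    if (len < SKETCH_FAT_HEADER_LEN || sketch_read_be32(buf) != SKETCH_FAT_MAGIC)
        return -1;

    uint32_t count = sketch_read_be32(buf + 4);
    if (count > (len - SKETCH_FAT_HEADER_LEN) / SKETCH_FAT_ARCH_LEN)
        return -1;

    for (uint32_t i = 0; i < count && (int)i < max; i++) {
        const uint8_t *arch = buf + SKETCH_FAT_HEADER_LEN
                            + (size_t)i * SKETCH_FAT_ARCH_LEN;
        out[i] = (int32_t)sketch_read_be32(arch); /* cputype is the first field */
    }
    return (int)count;
}
```

Feeding this the first bytes of a lipo'd device+simulator library would report one ARM slice and one x86 slice, which is all the format is able to say about them.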

To make this work, library authors have employed a variety of complex work-arounds, such as using duplicated targets for both iOS and Simulator libraries to allow a single Xcode build to produce a lipo'd binary for both targets, driving xcodebuild via external shell scripts and stitching together the results, and employing a variety of 3rd-party "static framework" target templates that attempt to perform the above.

By comparison, Apple has the benefit of both being able to ship independent SDKs for each platform, and having support for finding and automatically using the appropriate SDK built into Xcode. As such, they're free to ship multiple binaries for each supported platform, and any user can simply pass a linker flag, or add the Apple-supplied libraries or frameworks to the appropriate build phase, and expect them to work.

Library Resources

One of the significant features of frameworks on Mac OS X is the ability to bundle resources. This doesn't just include images, nibs, and other visual data, but also bundled helper tools, XPCServices[1], and additional libraries/frameworks that the framework itself depends on.

On iOS, we can rule out XPC services and helper tools; we're not allowed to spawn subprocesses or bundle XPC services. That restriction arguably made more sense in the era of the 128MB iPhone 1 than it does now, but that is a subject for another blog post.

However, that leaves the other resource types -- compiled xibs, images, audio files, textures, etc -- the distribution and use of which winds up being far more difficult than it needs to be. On Mac OS X, we can use great APIs like +[NSBundle bundleForClass:] to automatically find the bundle in which our framework class is defined, and use that bundle to load associated resources. Mac OS X users of our frameworks only have to drop our framework into their project, and all the resources will be available and in an easily found location.

On iOS, however, anyone shipping resources to be bundled with their library has had to adopt work-arounds. External resources are often provided as another, independent bundle that must be placed in their application bundle by end-users, increasing the number of steps required to integrate a simple library. Everyone has to write their own resource location code -- it's just a few lines of code to replace the functionality of +[NSBundle bundleForClass:] as an NSBundle category, but they're a few lines of code that shouldn't need to be written, and certainly not by every author of a library containing resources.

This increase in effort -- both for your users and for you as an author -- leads to questioning whether you really need to ship a resource with your library, even if it would be the best technical solution. It changes the value equation: library authors deem the effort too great for a simple use case, and instead do more work to either avoid including the resource, programmatically generate it, or in extreme cases, figure out how to bundle the resource into the static library itself, e.g., by inclusion as a byte array in a source file.
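For anyone who hasn't seen the byte-array trick, a minimal sketch in C looks something like the following. The resource contents and every sketch_ name are hypothetical; in practice the array would usually be generated from the real file by a build script rather than typed by hand.

```c
#include <stddef.h>

/* A hypothetical resource compiled directly into the library: the eight
 * bytes of the text {"ok":1}. Because it lives in the object file, it
 * travels with the static library and needs no separate bundle. */
static const unsigned char sketch_default_config[] = {
    0x7b, 0x22, 0x6f, 0x6b, 0x22, 0x3a, 0x31, 0x7d
};

/* Returns a pointer to the embedded bytes and stores their length in *len. */
const unsigned char *sketch_default_config_bytes(size_t *len)
{
    *len = sizeof(sketch_default_config);
    return sketch_default_config;
}
```

It works, but every resource change now means regenerating source and rebuilding the library, which is exactly the kind of extra effort a framework bundle would make unnecessary.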

Meanwhile, when targeting Mac OS X, we've long-since added the resource to the framework bundle and moved on.

Debugging Info and Consistency

One of the great features of dynamic libraries is consistency. Even if two different applications both ship a duplicate copy of the same version of a shared library, the actual library itself will be the same.

This brings with it a number of advantages when it comes to supporting a library in the wild -- we can trust that, should an issue arise in the library, we as library authors know exactly what tools were used to build it, we have the original debugging symbols available (and ideally, we supplied them with the binary). This gives us a level of assurance that allows us to provide better support with less effort when and if things go wrong.

However, when using static libraries, that level of consistency and information sharing is lost. In years past, for example, I saw issues related to a specific linker bug that resulted in improper relocation of Mach-O symbols during final linking of the executable, causing crashes that occurred only in the specific user's application, and could only be reproduced with a specific set of linker input.

If they attempted to reproduce the linker bug with an isolated test case, it would disappear, as the bug itself was dependent on the original linker input. The only way that I could provide assistance was if they sent me their project, including source code, so that I could debug the linker interactions locally. For obvious reasons, most developers cannot send their company's source code to an external developer, and the issue generally disappeared forever if they changed the linker input -- e.g., by adding or removing a file.

I was never able to get a reproduction case, and I was never able to reproduce the issue locally. For a few years, I'd receive sporadic bug reports about the linker issue appearing, and then disappearing, until finally some update to the Xcode toolchain seemed to have solved the issue, and -- through no change that I made -- the issue disappeared.

Consistency facilitates reliability.

However, there are advantages beyond reliability to having consistent versions of your framework used across all applications. One of the other advantages is transparency, and specifically, transparency when investigating failure.

When a static library is linked into a binary, all the symbols are relocated, linked, and new debugging information is generated -- assuming debugging information was available in the first place: the default 'release' target for static libraries strips out the DWARF data, and if you're shipping a commercial library, you may not want to expose the deep innards of your software by providing every user of your library with a full set of DWARF debugging info.

Given that, even if multiple applications use the same exact version of a library, each and every application build generates build-specific debug information, and in modern toolchains (eg, via LTO), may in fact generate code that constructively differs from the library as shipped. As a library author, you are entirely reliant on whatever debug information was preserved by the user, and in performing post-facto analysis of a crash, you cannot perform deep analysis of your library's machine code without also having access to the user's own application binary, along with the DWARF debugging information that contains not only the debug info for your library, but also that of the end-user's application.

That all assumes that you, as a library author, ship debugging symbols. If you're providing a commercial library for which debugging info must not be provided, then there is no reasonable way to perform post-facto debugging of your library after it has been statically linked into the final application.

By comparison, dynamic libraries and frameworks maintain consistency -- any DWARF dSYM that is preserved by the library author will apply equally to any integration of that version of the library. Commercial library vendors can provide debugging information as necessary post-crash, and as opposed to the symbol stripping that occurs as a link-time optimization when using static libraries, the public symbols of the dynamic library will be visible to the application developer, allowing them introspection into failures even in the case where no debugging info is supplied.

Missing Shared Library Features

In the decades since shared libraries were first deployed, a variety of features have been introduced that solve very real problems related to hiding implementation details, versioning, and otherwise presenting a clean interface to external consumers of the library.

Dependent Libraries

The most obvious example is linking of dependent libraries. When you add a framework to your project, that framework already has encoded the libraries it depends on; simply drop the framework in, and no further changes are required to the list of libraries your project itself links to.

With static libraries, however, it's the application developer's responsibility to add all required libraries in addition to the one they actually want to use. Some companies have gone so far as producing custom integration applications that walk users through the process of configuring their Xcode project just to provide that same level of ease-of-use that you'd get for free from frameworks. Other library implementors have switched to dlopen()'ing their dependencies just to avoid having to deal with user support around configuring linked libraries -- given a linker error showing only the undefined symbols, it's rarely obvious to an unfamiliar application developer what library they should link to fix it.
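The dlopen() work-around mentioned above can be sketched roughly as follows, using only the standard dlopen/dlsym/dlclose calls. The helper name is hypothetical, and a real library would also need a sensible fallback for when the dependency is missing.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Looks up `symbol` in the library at `path` at runtime, instead of asking
 * the application developer to add that library to their link phase.
 * Returns NULL if either the library or the symbol is unavailable.
 * On success the handle is deliberately left open, since the returned
 * address is only valid while the library stays loaded. */
void *sketch_optional_symbol(const char *path, const char *symbol)
{
    void *handle = dlopen(path, RTLD_LAZY | RTLD_LOCAL);
    if (handle == NULL)
        return NULL;

    void *addr = dlsym(handle, symbol);
    if (addr == NULL)
        dlclose(handle);
    return addr;
}
```

The cost is that a missing dependency is no longer a link error at build time; it becomes a NULL pointer the library has to handle at runtime, which trades one support problem for another.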

Even if users should know how to do this successfully, it remains a totally unnecessary burden to place on every single consumer of a library, and forces library authors to reconsider adding new system framework dependencies to their project -- even if it would be the best technical choice -- as it will break the build of every project that upgrades to a new version with that dependency, requiring additional configuration on behalf of the application author, and additional support from the library developer.

Two-level Namespacing

However useful automatic dependent library linking may be, there are much more significant (and much less easily worked-around) features not provided by static libraries -- such as two-level namespacing.

Two-level namespacing is a somewhat unique Apple platform feature -- instead of the linker recording just an external reference to a symbol name, it instead records both the library from which the symbol was referenced (first level of the namespace) AND the symbol name (second level of the namespace).

This is a huge win for compatibility and avoiding exposing internal implementation details that may break the application or other libraries. For example, if my framework internally depends on a linked copy of sqlite3 that includes custom build-time features (as PLDatabase actually does), and your application links against /usr/lib/libsqlite3.dylib, there is no symbol conflict. Your application will resolve SQLite symbols in /usr/lib/libsqlite3.dylib, and the library will resolve them in its local copy of libsqlite3.

If you're using static libraries, however, two-level namespacing can't work. Since the static library is included in the main executable, the static library and the main executable share the same references to the same external symbols.
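As a toy model of the difference (this is not how dyld is implemented, and every type and function name here is made up for illustration), consider a resolver over a list of loaded images. A flat lookup takes the first image that defines the name; a two-level lookup only consults the image recorded at link time.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_SYMBOLS 4

/* A loaded image and the (NULL-terminated) symbol names it defines. */
struct sketch_image {
    const char *name;
    const char *symbols[SKETCH_MAX_SYMBOLS];
};

static int sketch_image_defines(const struct sketch_image *img, const char *symbol)
{
    for (size_t i = 0; i < SKETCH_MAX_SYMBOLS && img->symbols[i] != NULL; i++) {
        if (strcmp(img->symbols[i], symbol) == 0)
            return 1;
    }
    return 0;
}

/* Flat namespace: only the symbol name was recorded, so the first image in
 * load order that defines it wins, whoever the caller is. */
const char *sketch_resolve_flat(const struct sketch_image *images, size_t count,
                                const char *symbol)
{
    for (size_t i = 0; i < count; i++) {
        if (sketch_image_defines(&images[i], symbol))
            return images[i].name;
    }
    return NULL;
}

/* Two-level namespace: the reference recorded (library, symbol), so only
 * the named library is consulted. */
const char *sketch_resolve_two_level(const struct sketch_image *images, size_t count,
                                     const char *library, const char *symbol)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(images[i].name, library) == 0)
            return sketch_image_defines(&images[i], symbol) ? images[i].name : NULL;
    }
    return NULL;
}
```

With two images that both define sqlite3_open, the flat lookup always returns whichever loaded first, while the two-level lookup returns the copy each caller was actually linked against.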

Without two-level namespacing, internal library dependencies -- such as libc++/libstdc++ -- are exposed to the enclosing application, causing build breakage, incompatibility between multiple static libraries, incompatibilities with enclosing applications, and depending on the library in question, the introduction of subtle failures or bugs. This requires work-arounds on behalf of the library author -- in libraries such as PLCrashReporter, where a minimal amount of C++ is used internally, this has resulted in our careful avoidance of any features that would require linking the C++ stdlib. This is not an approach that would work for a project making use of more substantial C++ features, and the result is that library authors must either provide two versions of their library, one using libc++, one using libstdc++, or all clients of that library must switch C++ stdlibs in lockstep - even if they neither expose nor rely on C++ APIs externally.

Symbol Visibility

One of the features that is possible to achieve with static libraries is the management of symbol visibility. For example, PLCrashReporter ships with an embedded copy of the protobuf-c library. To avoid conflict with applications and libraries that also use protobuf-c, we rely on symbol visibility to hide the symbols entirely (though, if we had two-level namespaces, we could have avoided the problem in the first place).

To export only a limited set of symbols, we can use linker support for generating MH_OBJECT files from static libraries. This is called "single object pre-link" in Xcode, and uses ld(1)'s '-r' flag. Unfortunately, MH_OBJECT is not used by Xcode's static library templates by default, is seemingly rarely used inside of Apple, and has exhibited a wide variety of bugs. For example, a recent update to the Xcode toolchain introduced a bug in MH_OBJECT linking; when used in combination with -exported_symbols_list, the linker generates an invalid __LINKEDIT vmsize in the resulting Mach-O file (rdar://15042905 -- reported in September 2013).
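On the source side, GCC and Clang also let you annotate visibility per symbol, which complements an exported symbols list. The sketch below uses hypothetical names (the checksum helper merely stands in for an embedded dependency). One caveat worth stating loudly: in a plain static library a hidden symbol can still collide with the application's symbols at final link; as I understand the toolchain, the attribute only takes full effect once the objects have been combined, for example by a pre-link step like the one described above or by building a dynamic library.

```c
#include <stddef.h>

/* Hypothetical convenience macros around the GCC/Clang visibility attribute. */
#define SKETCH_EXPORT __attribute__((visibility("default")))
#define SKETCH_HIDDEN __attribute__((visibility("hidden")))

/* Internal helper, standing in for an embedded third-party dependency.
 * Marked hidden so that it is not part of the library's public surface. */
SKETCH_HIDDEN unsigned int sketch_internal_sum(const unsigned char *data, size_t len)
{
    unsigned int sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

/* The only symbol consumers of the library are meant to see. */
SKETCH_EXPORT unsigned int sketch_public_checksum(const char *text)
{
    size_t len = 0;
    while (text[len] != '\0')
        len++;
    return sketch_internal_sum((const unsigned char *)text, len);
}
```

Compiling with -fvisibility=hidden flips the default, so that only symbols explicitly marked for export remain visible.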

This highlights a major issue with static libraries: Apple doesn't use them the way we do. Things that aren't used largely aren't tested, with a high tendency towards bitrot and regression; unusual linker flags are no exception.

Conclusion

Above, I've listed a litany of issues that I've seen with actually producing and maintaining static libraries for iOS, and their deficiencies compared to a solution, a form of which NeXT was shipping nearly 30 years ago. My list of issues is hardly exhaustive -- I didn't even mention ld's stripping of Objective-C categories, -all_load, and the work-arounds people have employed.

However, all technical issues aside, what's worse are the effects that these technical constraints have had on the culture of code sharing on the platform. The headaches involved in shipping binary libraries have contributed to most people not trying. Instead, we've adopted hacks and workarounds that both create technical debt and greatly constrain the power of the tools we can bring to bear on problems. Since static libraries are so painful, people gravitate towards solutions that, while technically sub-optimal, are eminently pragmatic:

These workarounds have introduced a number of problems for the development community; the majority of my support requests have nothing to do with actually *using* my libraries. Rather, users get stuck trying to integrate them at all -- as a subproject, or trying to embed the source files, or all of the above.

Developers are regularly frustrated by projects that can't easily be integrated via source code drops, or have complicated sets of targets that are required when attempting to embed a subproject -- but if it were not for the limitations of iOS library distribution, the internals of our library build systems need not be exposed at all to developers.

For library authors, all of these integration hacks -- embedding source code, reproducing project configuration, hacking projects into use -- result in builds of the library that are unique. Because such builds do not correspond to a specific known release, they make support and maintenance all the more difficult.

In response to these issues, a variety of cultural shifts have occurred. Tools like CocoaPods automate the extraction of source code from a library project, generate custom Xcode workspaces and project files, and reproduce the build configuration from the library project in the resulting workspace. Through this complex and often fragile mechanism, users can integrate library code in a way that begins to approach the simple atomic integration of a framework, but at the cost of ideally unnecessary user-facing additions to their project's build environment, significant fragility around the process, and not insignificant overhead for library authors themselves.

Outside of CocoaPods, the unnecessary complexity of distribution and packaging libraries for iOS has in no small part resulted in a decrease in the availability of easily integrated and well-designed frameworks. This is no surprise, as a significant, discouraging amount of effort is required to produce something on par with what was easily done with frameworks, and in doing so, one often has to introduce duplicated targets, complex scripts, and other work-arounds that are both time consuming to implement and maintain, and make the project inhospitable to potential contributors.

Apple’s lack of priority on solving the problem of 3rd-party iOS library distribution has taken a real toll on the community’s ability to produce documented, easily integrated, consistently versioned libraries with reasonable effort; most people seemingly don't even try outside of CocoaPods, whereas this was the norm on Mac OS X.

For those of us outside of Apple, limited to only what is permitted by Apple for 3rd-party developers, I believe that this stance has been damaging to the culture of software craftsmanship by introducing significant discouraging costs and inefficiencies when trying to produce consistent, transparent, easily integrated high-quality libraries.

Nearly 7 years after the introduction of iOS, it is well past time for Apple to prioritize closing the gap between iOS and Mac toolchains. A real framework solution is not the only improvement we need to see to the development environment, and certainly not the most important one, but it plays a central role in how we as third-party developers can share and deliver common code.

[1] Technically, only system frameworks can currently embed XPCServices on Mac OS X. Mentioned in rdar://15800873.

Update: On request (and in the spirit of Fix Radar or GTFO), I've posted a template you can use to submit a Radar to Apple about this issue. Please consider gritting your teeth and using Radar Web to submit a bug.

iOS Function Patching

20 Jan 2013, 08:03 PST

On Mac OS X, mach_override is used to implement runtime patching of functions. It essentially works by marking the executable page as writable, and actually inserting a new function prologue into the target function.

On iOS, pages are W^X: a page can be writable, or executable, but it's never allowed to be both. This has required finding inventive solutions, such as trampoline pools to support things such as imp_implementationWithBlock() and libffi's closures.
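The write-then-flip discipline that W^X imposes can be sketched in C with plain mmap and mprotect. This is a desktop POSIX illustration under my own hypothetical function name; iOS itself places further restrictions on making anonymous memory executable, so don't expect this exact sequence to succeed on a device.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Copies `len` bytes into a fresh page while it is writable, then drops
 * write access and requests execute access. The page is never writable and
 * executable at the same time. Nothing is executed here; the page is
 * unmapped again before returning. Returns 0 on success, -1 on failure. */
int sketch_wx_flip(const unsigned char *code, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len > (size_t)page)
        return -1;

    void *mem = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    memcpy(mem, code, len);                                      /* writable phase */
    int rc = mprotect(mem, (size_t)page, PROT_READ | PROT_EXEC); /* executable phase */

    munmap(mem, (size_t)page);
    return rc == 0 ? 0 : -1;
}
```

Patching a function that is already mapped executable is the opposite problem: there is no writable phase to begin with, which is why the trampoline-style solutions allocate new pages rather than modifying existing ones.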

However, the trampoline approach will not work for patching arbitrary OS code; you somehow need to be able to modify the code in place, but there's no way to actually write to a code page.

Last night, Mike Ash and I were jokingly discussing how we could implement this (badly) using memory protections and signal handlers.

Enter libevil:

void (*orig_NSLog)(NSString *fmt, ...) = NULL;
 
void my_NSLog (NSString *fmt, ...) {
    orig_NSLog(@"I'm in your computers, patching your strings ...");
 
    NSString *newFmt = [NSString stringWithFormat: @"[PATCHED]: %@", fmt];
    
    va_list ap;
    va_start(ap, fmt);
    NSLogv(newFmt, ap);
    va_end(ap);
}
 
/* Somewhere in your application's startup code: */
evil_init();
evil_override_ptr(NSLog, my_NSLog, (void **) &orig_NSLog);
NSLog(@"Print a string");

You should not use this code. Seriously.

How it Works

libevil uses VM memory protection and remapping tricks to allow for patching arbitrary functions on iOS/ARM. This is similar in function to mach_override, except that libevil has to work without the ability to write to executable pages.

This is achieved as follows:

The entire binary is remapped so as to 'correctly' handle PC-relative addressing that would otherwise fail. There are still innumerable ways that this code can explode in your face. Remember how I said not to use it?

A fancier implementation would involve performing instruction interpretation from the crashed page, rather than letting the CPU execute from remapped pages. This would involve actually implementing an ARM emulator, which seems like drastic overkill for a massive hack.
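The basic ingredient, trapping an access to a protected page in a signal handler, can be sketched on any POSIX system as below. To be clear, this is not libevil's code: the names are hypothetical, and it only recovers from the fault with siglongjmp, where a patcher would instead have to redirect execution.

```c
#define _DEFAULT_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

static sigjmp_buf sketch_fault_jmp;

/* Fault handler: abandon the faulting access and jump back to the caller. */
static void sketch_on_fault(int signo)
{
    (void)signo;
    siglongjmp(sketch_fault_jmp, 1);
}

/* Maps a page with no access rights, touches it, and reports whether the
 * resulting fault was delivered to our handler.
 * Returns 1 if the fault was trapped, 0 if the read unexpectedly succeeded,
 * and -1 if setup failed. */
int sketch_trap_protected_read(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    volatile unsigned char *mem = mmap(NULL, (size_t)page, PROT_NONE,
                                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    struct sigaction sa, old_segv, old_bus;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = sketch_on_fault;
    sigemptyset(&sa.sa_mask);
    /* Depending on the OS, the fault arrives as SIGSEGV or SIGBUS. */
    if (sigaction(SIGSEGV, &sa, &old_segv) != 0) {
        munmap((void *)mem, (size_t)page);
        return -1;
    }
    if (sigaction(SIGBUS, &sa, &old_bus) != 0) {
        sigaction(SIGSEGV, &old_segv, NULL);
        munmap((void *)mem, (size_t)page);
        return -1;
    }

    volatile int trapped;
    if (sigsetjmp(sketch_fault_jmp, 1) == 0) {
        unsigned char value = mem[0];  /* faults: the page is PROT_NONE */
        (void)value;
        trapped = 0;
    } else {
        trapped = 1;                   /* we arrived here via the handler */
    }

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap((void *)mem, (size_t)page);
    return trapped;
}
```

Everything that makes the real thing scary (figuring out which function was hit, resuming somewhere sensible, PC-relative code) is exactly what this sketch leaves out.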

The Code

The implementation only supports ARM, so you can only test it out on your iOS device. I've posted the code and a sample application on GitHub.