TDT09 Topic 6


Software composition - the evil of libraries

Libraries are essential for building modern software - no programmer would want to create all the functionality of an application from scratch. The earliest libraries were simply collections of object files that were linked into each application as required. However, this meant that the same code (such as the standard C library libc) was copied into every binary, wasting disk space and main memory.

Shared libraries, in the form used on most Unix-like systems today, were introduced in SunOS [1] to reduce this overhead. Most current systems have adopted this technology, e.g. in the form of dynamic-link libraries (DLLs) in Windows. Only one copy of a shared library is required on disk, and its code can be shared in memory between all processes that use it; the library is mapped into a process's address space when the program is executed (rather than being copied into the binary at link time, as with static libraries). Since shared libraries and application binaries are decoupled, this also makes it easier to upgrade libraries (e.g. in case of security problems), which previously would have required re-linking every binary on the system that used the library in question.
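The run-time binding described above can be sketched with Python's standard `ctypes` module, which asks the operating system's dynamic loader to map a shared library into the running process. This is a minimal sketch, not a definitive implementation: it assumes a system where `ctypes` can locate the C library, the fallback name `libc.so.6` is a Linux-specific assumption, and the helper names `load_c_library` and `c_strlen` are hypothetical.

```python
import ctypes
import ctypes.util


def load_c_library() -> ctypes.CDLL:
    """Map the system C library into this process at run time."""
    # find_library returns a platform-specific name or None;
    # "libc.so.6" is an assumed fallback for Linux systems.
    name = ctypes.util.find_library("c") or "libc.so.6"
    return ctypes.CDLL(name)


def c_strlen(data: bytes) -> int:
    """Call the C library's strlen through the dynamically loaded handle."""
    libc = load_c_library()
    # Declare the C signature so ctypes converts arguments correctly.
    libc.strlen.argtypes = [ctypes.c_char_p]
    libc.strlen.restype = ctypes.c_size_t
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"shared library"))  # 14
```

The symbol `strlen` is resolved only when the library is loaded, so replacing the library file on disk changes the code that runs without re-linking the program.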

However, shared libraries come with their own set of problems. A shared library commonly has access to the whole memory space of the application that loads it. A malicious library can therefore secretly send sensitive application data to an attacker on the internet. This has already been exploited, e.g. on Android [4]. Accordingly, methods to isolate shared libraries from the application and from other shared libraries have been proposed [2,3], but they are not yet widely implemented in common operating systems. Some systems, such as Plan 9, rely on static linking only [5] to avoid problems like the "DLL hell" [6], since some of the original restrictions on disk and main memory space are no longer relevant in modern systems - today, the data handled by applications is typically orders of magnitude larger than the program code. There are also research prototypes for hybrid approaches [7,8].
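The lack of isolation can be illustrated with a small sketch in which code from a shared library reads application data through nothing more than a raw address. It assumes the same setup as any `ctypes` program (a locatable C library, with `libc.so.6` as an assumed Linux fallback). The names `library_reads` and `demo` and the sample secret are hypothetical, and the harmless `memcpy` stands in for whatever a malicious library might do.

```python
import ctypes
import ctypes.util

# Load the C library; "libc.so.6" is an assumed fallback for Linux.
libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6")
libc.memcpy.argtypes = [ctypes.c_void_p, ctypes.c_void_p, ctypes.c_size_t]
libc.memcpy.restype = ctypes.c_void_p


def library_reads(address: int, length: int) -> bytes:
    """Let library code copy `length` bytes from a raw address in this process."""
    out = ctypes.create_string_buffer(length)
    # No permission check happens here: the library shares our address space.
    libc.memcpy(out, address, length)
    return out.raw


def demo() -> bytes:
    # Application data the library was never explicitly given access to.
    secret = ctypes.create_string_buffer(b"api-key-1234")
    return library_reads(ctypes.addressof(secret), len(secret.value))


if __name__ == "__main__":
    print(demo())
```

Isolation schemes such as those in [2,3] aim to make exactly this kind of unrestricted read impossible for library code.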

Shared libraries were created out of the need to save resources. The growth of memory and storage capacities has made some of these requirements irrelevant, yet other advantages, such as easy upgradability, remain. However, good management approaches and isolation are required to overcome the compatibility and security problems of shared libraries. A general overview of the linking and loading process can be found in [9].

References

  1. R. Gingell et al. Shared Libraries in SunOS. USENIX Conference Proceedings, pp. 131-147, 1987
  2. W. Qiang, Y. Cao, W. Dai, D. Zou, H. Jin and B. Liu. Libsec: A Hardware Virtualization-Based Isolation for Shared Library. 2017 IEEE 19th International Conference on High Performance Computing and Communications
  3. Nuwan Goonasekera, William Caelli, and Colin Fidge. LibVM: an architecture for shared library sandboxing. Software: Practice and Experience, Volume 45, Issue 12
  4. TrendMicro Inc. Analyzing Xavier: An Information-Stealing Ad Library on Android
  5. 9pio. Why Static?
  6. DLL Hell. Wikipedia
  7. Christian S. Collberg et al. SLINKY: Static Linking Reloaded. USENIX Annual Technical Conference, General Track, 2005
  8. Will Dietz and Vikram Adve. Software multiplexing: share your libraries and statically link them too. Proc. ACM Program. Lang. 2, OOPSLA, Article 154, 2018
  9. John Levine. Linkers and Loaders. Morgan Kaufmann, 1999