
Creating a Multi-Call Linux Binary

Web Doc

Note: This publication is now archived. For reference only.


Published on 09 December 2002



IBM Form #: TIPS0092


Authors: Gregory Geiselhart


    Abstract

    A multi-call binary is an executable, written in C, that performs the action of more than one utility. A prime example of a multi-call binary is the BusyBox package. BusyBox implements a large number of standard Linux utilities (such as the ls and ln commands) in a single executable. This enables specialized Linux distributions to have a reduced size. This tip describes how multi-call binaries are written.

    Contents

    The BusyBox package is one of the best examples of a multi-call binary. This concept allows a single executable file to perform the function of dozens of different utilities that are usually packaged as separate files. Multi-call binaries exploit a number of operating system features so that users need not even know that the programs they are running are all, in fact, the same file.

    There are two ways to invoke BusyBox functions:


    • In the first method, you issue the command busybox followed by the name of the function you want to perform. For example, the command busybox ls performs the directory list function (equivalent to the usual ls command). This method requires no administration, but users must remember that they cannot perform a function simply by issuing the name of a command.

    • The second method is to create a set of symbolic links to the BusyBox executable, each with the name of a function implemented by BusyBox. When BusyBox is run, it checks the name by which it was invoked, and uses that name as the function to be executed. This method does require some administration, as the symbolic links must be maintained, but system users can follow the normal practice of performing a function by issuing the name of the command.


    To illustrate, the following sequence shows the content of a BusyBox /bin directory and the effect of issuing the ls command.

    # pwd
    /bin
    # ls -l l*
    lrwxrwxrwx 1 root root 12 Oct 2 00:11 ln -> /bin/busybox
    lrwxrwxrwx 1 root root 14 Oct 2 00:11 login -> /bin/tinylogin
    lrwxrwxrwx 1 root root 12 Oct 2 00:11 ls -> /bin/busybox

    # ls -lG
    ls: invalid option -- G
    BusyBox v0.60.3 (2002.09.26-00:58+0000) multi-call binary

    Usage: ls [-1AacCdeFilnpLRrSsTtuvwxXhk] [filenames...]

    List directory contents

    In the first output you can also see login, which is a symbolic link to /bin/tinylogin. TinyLogin is a partner program to BusyBox, and performs the functions of programs like login and sulogin. These functions could have been implemented in BusyBox, but for security reasons it is preferred to have a separate executable for login processing.


    This example also shows another feature of the BusyBox utility. In the full GNU implementation of ls, the -G option is valid (it suppresses the display of the group name in the directory list), but the BusyBox version of ls rejects it. To save space, BusyBox does not provide every function of the utilities it replaces. This is appropriate, since the idea is to eliminate unused (or little used) functions in the interests of reducing the executable size.


    So, how does a multi-call binary like BusyBox, when invoked using a symbolic link, know what function to perform? The answer is that a multi-call binary, unlike a normal program, examines the name by which it was invoked.

    The C language is used for most systems programming on UNIX/POSIX systems. Programs written in C always have a main() function, which is the first part of the program to be executed. The main function is written in a particular way, to allow the operating system to pass parameters to it. A typical main() function declaration appears here:

    int main(int argc, char *argv[])

    The parameters passed to the main() function are argc, an integer containing the number of parameters passed by the system to the program, and argv, the list of the parameters passed. By convention (on UNIX/POSIX systems, at least), there will always be at least one parameter passed to the program: the name used to invoke the program. This is usually the command typed by the user at the shell prompt, and is almost always the name of the file (or symbolic link) through which the program was run. In C notation, this value (the first item in the array called argv) is argv[0].


    Most single-call binaries ignore the contents of argv[0], as the program is designed to perform a single task and it is irrelevant what name the system used to invoke the program. Some programs, for security reasons, do make sure that the command issued is correct. This can prevent a malicious user from executing a program they should not have access to.


    A multi-call binary pays attention to this parameter, however, and uses it to determine which function to execute. In the case of BusyBox, if argv[0] is the same as the executable file name, it will use the second item in the parameter list (argv[1]) as the name of the function to be executed. If argv[0] is not the same as the name of the BusyBox executable file, it will attempt to use the contents of argv[0] as the name of the requested function.

     

    Special Notices

    The material included in this document is in DRAFT form and is provided 'as is' without warranty of any kind. IBM is not responsible for the accuracy or completeness of the material, and may update the document at any time. The final, published document may not include any, or all, of the material included herein. Client assumes all risks associated with Client's use of this document.