Advanced Networking Concepts Applied Using Linux on System z: Overview of Virtualization and Networking

Published 27 February 2013

Authors: Lydia Parziale

Abstract

This IBM® Redbooks® publication provides an overview of IBM System z® virtualization and networking when using Linux on System z.

Contents

For a complex environment, server consolidation helps reduce power consumption and cooling needs and reduces data center rack space requirements and server costs. It also helps data centers better manage resources and resiliency. In addition, IBM z/VM® has a powerful cloning mechanism that allows existing servers to be cloned in a few minutes. However, this process increases the administration effort, system controls, and network complexity of your environment, so it is important to ensure that you have an optimal network configuration. z/VM uses virtualization so that administrators can manage resources on the IBM System z® platform.

Developed with hypervisor technology, z/VM provides flexibility, availability, and security capabilities for Linux instances, while it creates an isolated and protected environment for critical applications. The virtual network that z/VM provides for communication among Linux guests offers high throughput and better reliability (fault tolerance).

Typically, z/VM provides three networking options:

  • IBM® HiperSockets™
  • Guest LANs
  • Virtual switches

These three options give Linux on System z guests the ability to communicate over the network. These Linux guests use virtual devices as their own physical network adapters. For complex environments that require outside LAN communication, one of the best choices is virtual switches. A virtual switch allows grouping of several OSA-Express devices to create one logical link for providing fault-tolerance and high-speed connections between the physical OSA devices and the Linux guests.

In general, decisions regarding the best methods for networking are based on reliability, performance, and availability. In this technical document, we cover the preferred method, that is, virtual switches.


Guest LANs and HiperSockets

Guest LANs are virtual networks that are used to connect Linux guests in the same z/VM LPAR. They facilitate the communication between these guests without any additional hardware. Two types of guest LANs are available:
  • QDIO
  • HiperSockets

Although the Guest LAN method is still available and used in some scenarios, do not use it for complex environments because this technology requires a z/VM service machine (TCP/IP), or a Linux guest that is acting as a router, to forward packets to the outside network. For performance purposes and complex networks, a separate hardware device (such as a Cisco or Juniper product) should act as the router and provide such communication.

For complex network environments where there is intense network traffic activity and external connectivity is required, virtual switches are the best choice.


Virtual switches

The virtual switch (VSWITCH) method allows Linux on System z guests to connect over the network. This method is both efficient and secure. It eliminates the need to have a z/VM service machine or a Linux guest act as a router, which reduces the overhead on z/VM. In addition, virtual switches support VLANs (IEEE 802.1Q).

The VSWITCH method requires an OSA-Express card on the System z platform to function. Limitations on either connectivity or data throughput are related to the OSA-Express cards. For more details about the OSA-Express devices, see OSA-Express Implementation Guide, SG24-5948.

Important: An OSA-Express card is a LAN adapter.

The available OSA devices can be verified by using the z/VM operating system command-line interface (CLI). Use the command syntax shown in the following example:

Ready;
q osa
OSA 3080 ATTACHED TO DTCVSW2 3080 DEVTYPE OSA CHPID 1C OSD
OSA 3081 ATTACHED TO DTCVSW2 3081 DEVTYPE OSA CHPID 1C OSD
OSA 3082 ATTACHED TO DTCVSW2 3082 DEVTYPE OSA CHPID 1C OSD
OSA 30A0 ATTACHED TO DTCVSW1 30A0 DEVTYPE OSA CHPID 1E OSD
OSA 30A1 ATTACHED TO DTCVSW1 30A1 DEVTYPE OSA CHPID 1E OSD
OSA 30A2 ATTACHED TO DTCVSW1 30A2 DEVTYPE OSA CHPID 1E OSD

Important: Ensure that you have privilege class B to run the QUERY OSA command.
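
As a minimal sketch, output like the preceding example can be summarized per CHPID with awk. The function name osa_by_chpid is hypothetical, and the parsing assumes the exact line layout shown above (device number in the second field, owner in the fifth field, and the CHPID value after the CHPID keyword). Other z/VM levels might format the response differently.

```shell
# Summarize saved QUERY OSA output read from stdin: one line per CHPID,
# with the owning user ID and the device numbers found on that CHPID.
# Assumption: lines look like the example in this document.
osa_by_chpid() {
    awk '
        $1 == "OSA" && $3 == "ATTACHED" {
            chpid = ""
            for (i = 1; i < NF; i++)
                if ($i == "CHPID") chpid = $(i + 1)
            if (chpid == "") next
            if (chpid in devs) {
                devs[chpid] = devs[chpid] " " $2
            } else {
                order[++n] = chpid      # remember first-seen order
                devs[chpid] = $2
            }
            owner[chpid] = $5
        }
        END {
            for (k = 1; k <= n; k++)
                printf "CHPID %s %s: %s\n", order[k], owner[order[k]], devs[order[k]]
        }
    '
}
```

From a Linux guest with sufficient privileges, this function could be fed by a command such as vmcp q osa | osa_by_chpid. For the example output, it would print one line for CHPID 1C (DTCVSW2) and one for CHPID 1E (DTCVSW1).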

A sample virtual switch configuration is shown in Figure 1.

Figure 1, z/VM virtual switch network

Transport modes

OSA-Express devices allow communication with IP (Layer 3) and non-IP (Layer 2) transport modes. Carefully select the protocol to use if the OSA-Express device is being shared with other LPARs.

Consider the following information when you make a selection:
  • A Layer 2 VSWITCH cannot be shared with both an IBM z/OS® and z/VM TCP/IP stack, but can be shared with another Layer 2 VSWITCH.
  • A Layer 3 VSWITCH can be shared with a z/VM and z/OS TCP/IP stack.
  • OSA ports per z/OS and z/VM TCP/IP stack can be shared.
  • VSWITCH LACP OSA ports must be dedicated.

An equally important point is that the Layer 2 and Layer 3 transport modes differ:
  • Non-IP (Layer 2) mode
    – Is optional for VSWITCH / LAN.
    – Forwards both IP and non-IP protocols (such as IPX, NetBIOS, or SNA).
    – Supports VLAN.
    – Each host has or is assigned a unique MAC address.
  • IP (Layer 3) mode
    – Is the default for VSWITCH / LAN.
    – Forwards only the IP protocol.
    – Supports VLAN.
    – The MAC address is assigned by the OSA device, which means that all guests share a MAC address with the adapter.

More information: For specific information about how to define the virtual switches and NICs, see z/VM Connectivity, SC24-6174.


VLAN aware and unaware

A VSWITCH plays a central role in network communication for Linux guests that run under z/VM. Virtual switches can operate in one of two modes: VLAN aware and VLAN unaware. Carefully select the mode because it changes how the virtual switch handles and processes packets and frames. Depending on the mode that you select, the virtual switch processes or ignores the VLAN tags.
  • VLAN aware. In this mode, the virtual switch reads and handles VLAN tags. The switch port that is connected to the OSA-Express port must be configured as a trunk port (check with your network administrator). The trunk port carries traffic from all VLANs.
  • VLAN unaware. In this mode, the virtual switch ignores VLAN tags. The switch port that is connected to the OSA-Express port must be configured as an access port (check with your network administrator). The access port carries traffic for a single VLAN.

To create a Layer 3 virtual switch (VLAN unaware is the default), run the following command:

DEFINE VSWITCH VSWITCH1 RDEV 3080 30A0

To create a VLAN-aware Layer 2 virtual switch, run the following command:

DEFINE VSWITCH VSWITCH1 RDEV 3080 30A0 ETH VLAN 1

VSWITCH configuration: Although virtual switches can be defined dynamically, you must add the VSWITCH definition to the SYSTEM CONFIG file (this file is the best place) to make it persistent and avoid problems during initial program load (IPL).

After running the command to create a new virtual switch, you can query it by running the following command:

QUERY VSWITCH VSWITCH1

The output from this command is shown in the following example:

QUERY VSWITCH VSWITCH1
VSWITCH SYSTEM VSWITCH1 Type: VSWITCH Connected: 3 Maxconn: INFINITE
PERSISTENT RESTRICTED NONROUTER Accounting: OFF
VLAN Unaware
MAC address: 02-00-00-00-00-01
State: Ready
IPTimeout: 5 QueueStorage: 8
Isolation Status: OFF
RDEV: 3080.P00 VDEV: 3080 Controller: DTCVSW2
RDEV: 30A0.P00 VDEV: 30A0 Controller: DTCVSW1 BACKUP

In this example, VSWITCH1 is using NONROUTER (Layer 3) and is VLAN unaware.
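
The two fields called out here can also be extracted from saved QUERY VSWITCH output. The following is a sketch only: the function name vswitch_mode is hypothetical, and it recognizes just the keywords that appear in this document (NONROUTER for Layer 3, ETHERNET for Layer 2, and the VLAN Aware or VLAN Unaware line). Any other response is reported as unknown.

```shell
# Print "<transport> <vlan-mode>" for QUERY VSWITCH output read from stdin.
# Assumption: only the keywords shown in this document are recognized.
vswitch_mode() {
    awk '
        BEGIN { transport = "unknown"; vlan = "unknown" }
        {
            for (i = 1; i <= NF; i++) {
                if ($i == "NONROUTER") transport = "layer3"
                if ($i == "ETHERNET")  transport = "layer2"
            }
        }
        $1 == "VLAN" && $2 == "Unaware" { vlan = "unaware" }
        $1 == "VLAN" && $2 == "Aware"   { vlan = "aware" }
        END { print transport, vlan }
    '
}
```

From a Linux guest, the input could come from a command such as vmcp query vswitch VSWITCH1 | vswitch_mode, assuming the guest is allowed to issue the query.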

The output of this command when using VLAN aware is shown in the following example:

QUERY VSWITCH VSWITCH1
VSWITCH SYSTEM VSWITCH1 Type: VSWITCH Connected: 0 Maxconn: INFINITE
PERSISTENT RESTRICTED ETHERNET Accounting: OFF
VLAN Aware Default VLAN: 0001 Default Porttype: Access GVRP: Disabled
Native VLAN: 0001 VLAN Counters: OFF
MAC address: 02-00-03-00-00-08
State: Ready
IPTimeout: 5 QueueStorage: 8
Isolation Status: OFF
RDEV: 30A0.P003080.P00 VDEV: 3080 Controller: DTCVSW2

VSWITCH1 in this example is using ETHERNET (Layer 2) and is VLAN aware. After VSWITCH1 is created, authorize the Linux guest to couple to the new virtual switch by running the following command:

SET VSWITCH VSWITCH1 GRANT LNXSU11

In this example, to give LNXSU11 access to VSWITCH1 at z/VM IPL time, the SET VSWITCH command is added to the AUTOLOG1 user ID or the SYSTEM CONFIG file. After running this command, LNXSU11 has the authority to couple to VSWITCH1.

The following command lists all the user IDs that are authorized to couple to VSWITCH1:

q vswitch vswitch1 accesslist

This command produces the following output. You can see that LNXSU11 is authorized and can couple to VSWITCH1.

VSWITCH SYSTEM VSWITCH1 Type: VSWITCH Connected: 3 Maxconn: INFINITE
PERSISTENT RESTRICTED NONROUTER Accounting: OFF
VLAN Unaware
MAC address: 02-00-00-00-00-01
State: Ready
IPTimeout: 5 QueueStorage: 8
Isolation Status: OFF
Authorized userids:
LNXRH56 LNXSU11 SYSTEM TCPIP
RDEV: 3080.P00 VDEV: 3080 Controller: DTCVSW2
RDEV: 30A0.P00 VDEV: 30A0 Controller: DTCVSW1 BACKUP
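
If you want to check the authorization in a script rather than by eye, the access list can be scanned as in the following sketch. The function name vswitch_is_authorized is hypothetical, and the parsing assumes the VLAN-unaware layout shown above, where the user IDs follow the "Authorized userids:" line and the list ends at the first RDEV line.

```shell
# Succeed (exit status 0) if user ID $1 appears in the "Authorized userids:"
# section of QUERY VSWITCH ... ACCESSLIST output read from stdin.
# Assumption: the list ends at the first line that starts with "RDEV:".
vswitch_is_authorized() {
    awk -v id="$1" '
        /Authorized userids:/     { listing = 1; next }
        listing && $1 == "RDEV:"  { listing = 0 }
        listing { for (i = 1; i <= NF; i++) if ($i == id) found = 1 }
        END { exit !found }
    '
}
```

The comparison is case-sensitive, so pass the user ID in uppercase, as CP displays it. A possible use is vmcp q vswitch vswitch1 accesslist | vswitch_is_authorized LNXSU11 && echo authorized.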

To create the virtual NIC device, use the following command:

define nic c200 type qdio dev 3

After the virtual NIC is created, you can couple it to the VSWITCH. While connected to the LNXSU11 server console, run the following command to dynamically couple the previously defined network interface to VSWITCH1:

couple c200 system VSWITCH1

The following output is from the couple command:

NIC C200 is connected to VSWITCH SYSTEM VSWITCH1


Setting the vmcp module to be loaded during boot time

With the s390-tools package installed on Linux on System z, you can issue CP commands from a Linux guest to z/VM by using the vmcp module. By default, this module is not loaded at boot time. To avoid loading the module manually after every boot, enable it to load at boot time as follows:
  • On SUSE Linux

    Add the following line to the /etc/sysconfig/kernel file, and then run the SuSEconfig command:

    MODULES_LOADED_ON_BOOT="vmcp"

  • On Red Hat Enterprise Linux

    Add modprobe vmcp to the /etc/rc.d/rc.local file.

    Now the system will load the vmcp module during boot time.


The module can be loaded for the current session by running the following command:

modprobe vmcp

To verify that the vmcp module is loaded on either SUSE or Red Hat, run the following command, which succeeds only when vmcp is available:

vmcp q userid
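
As an alternative check that does not issue a CP command, the list of loaded modules can be inspected. This is a sketch under the assumption that vmcp is built as a loadable module, as this section describes; a driver that is compiled into the kernel would not be listed in /proc/modules. The function name module_loaded is hypothetical.

```shell
# Succeed (exit status 0) if kernel module $1 is listed as loaded.
# $2 optionally overrides the file to read; it defaults to /proc/modules,
# where the module name is the first field of each line.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}
```

For example, module_loaded vmcp || modprobe vmcp loads the module only when it is not already present.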


Modifying VSWITCH from Layer 3 to Layer 2

In some circumstances, you might need to modify your VSWITCH to accommodate a specific network configuration. To change the VSWITCH from Layer 3 to Layer 2 for VSWITCH1, complete the following steps:
  1. Connect to the Linux guest:

    a. Take down the interface by running the following command:

    /sbin/ifdown eth0

    b. Detach the NIC from the Linux guest by running the following command:

    /sbin/vmcp det nic 1e00

    c. Uncouple the virtual NIC (in our example, c200) from the Layer 3 VSWITCH by running the following command:

    /sbin/vmcp uncouple c200 system VSWITCH1
  2. Connect to z/VM:

    a. Redefine VSWITCH1 to change the transport mode from Layer 3 to Layer 2 by running the following commands:

    DETACH VSWITCH VSWITCH1
    DEFINE VSWITCH VSWITCH1 RDEV 3080 30A0 ETHERNET CONTROLLER *

    Options: 3080 and 30A0 are the OSA device numbers.

    b. Grant the guest authorization to connect to the Layer 2 VSWITCH by running the following command:

    SET VSWITCH VSWITCH1 GRANT LNXSU11

    c. Update the SYSTEM CONFIG file to reflect the new configuration.
  3. Connect again to the Linux guest:

    a. Couple the virtual NIC (c200) to the Layer 2 VSWITCH by running the following command:

    /sbin/vmcp couple c200 system VSWITCH1

    b. Start the interface by running the following command:

    /sbin/ifup eth0
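
Because the order of these steps matters, it can help to print the full command sequence for review before running anything. The following dry-run sketch only echoes the commands from the steps above with your own values substituted. The function name print_l2_migration_plan and its argument order are hypothetical, and nothing is executed on the guest or on z/VM.

```shell
# Dry run: print the commands from the preceding steps in order.
# Arguments (all supplied by the caller):
#   $1 interface, $2 NIC to detach, $3 NIC to recouple, $4 VSWITCH name,
#   $5 guest user ID, remaining arguments: OSA device numbers.
print_l2_migration_plan() (
    iface=$1; det_nic=$2; nic=$3; vsw=$4; guest=$5
    shift 5
    echo "# 1. On the Linux guest"
    echo "/sbin/ifdown $iface"
    echo "/sbin/vmcp det nic $det_nic"
    echo "/sbin/vmcp uncouple $nic system $vsw"
    echo "# 2. On z/VM (then update SYSTEM CONFIG to match)"
    echo "DETACH VSWITCH $vsw"
    echo "DEFINE VSWITCH $vsw RDEV $* ETHERNET CONTROLLER *"
    echo "SET VSWITCH $vsw GRANT $guest"
    echo "# 3. On the Linux guest again"
    echo "/sbin/vmcp couple $nic system $vsw"
    echo "/sbin/ifup $iface"
)
```

With the values used in this document, the call would be print_l2_migration_plan eth0 1e00 c200 VSWITCH1 LNXSU11 3080 30A0.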


The qeth driver

For z/VM on IBM System z10® and later hardware, the qeth device driver communicates with OSA-Express devices by using the queued direct I/O (QDIO) protocol. In addition, when the qeth module is loaded, each device is represented by a directory under the /sys file system.

The qeth directory: The /sys/bus/ccwgroup/drivers/qeth directory is created when the qeth module loads.

The qeth device driver requires three I/O subchannels (channel command word (CCW) devices for read, write, and data) for each OSA-Express CHPID predefined in your input/output control data set (IOCDS).

To define a qeth group device, issue the following command:

echo read_device_id,write_device_id,data_device_id > /sys/bus/ccwgroup/drivers/qeth/group

Table 1 summarizes the qeth definitions.

Table 1, qeth definitions
Name             Description                                                 Example
read_device_id   Must be even                                                c200
write_device_id  Must be the device bus-ID of the read subchannel plus 1     c201
data_device_id   Might be the device bus-ID of the write subchannel plus 1   c202
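
The numbering rule in Table 1 can be expressed as a small helper that derives all three device bus-IDs from the read device number. This is a sketch with two assumptions: the 0.0. prefix matches the bus-IDs used elsewhere in this document, and the data device is taken to be the write device plus 1, which Table 1 describes as possible rather than required. The function name qeth_group_ids is hypothetical.

```shell
# Print the "read,write,data" bus-ID triple for the qeth group file.
# $1 is the read device number in hexadecimal (for example, c200).
# Assumptions: bus-IDs start with 0.0. and the data device is write + 1.
qeth_group_ids() (
    read_no=$((0x$1))
    if [ $((read_no % 2)) -ne 0 ]; then
        echo "read device number must be even: $1" >&2
        exit 1
    fi
    printf '0.0.%04x,0.0.%04x,0.0.%04x\n' \
        "$read_no" "$((read_no + 1))" "$((read_no + 2))"
)
```

The result can then be written to the group file, for example with qeth_group_ids c200 > /sys/bus/ccwgroup/drivers/qeth/group. An odd read device number is rejected with a nonzero exit status.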

A device is activated or deactivated by echoing 1 (online) or 0 (offline) to the online file in the device's directory under /sys/bus/ccwgroup/drivers/qeth, as shown in the following two examples:

echo 1 > /sys/bus/ccwgroup/drivers/qeth/0.0.c200/online
echo 0 > /sys/bus/ccwgroup/drivers/qeth/0.0.c200/online

When qeth is loaded, you can determine which interface name is assigned to the device by running the following command (the output, eth0, follows the command):

cat /sys/bus/ccwgroup/drivers/qeth/0.0.c200/if_name
eth0
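
The online and if_name attributes can be read together to get a one-line status for a device. The following sketch assumes the sysfs layout shown in this section; the function name qeth_status and the optional second argument, which points the function at a different directory for testing, are hypothetical.

```shell
# Print "<bus-id> <online|offline> <interface>" for one qeth group device.
# $1 is the device bus-ID (for example, 0.0.c200).
# $2 optionally overrides the qeth driver directory in sysfs.
qeth_status() (
    dir="${2:-/sys/bus/ccwgroup/drivers/qeth}/$1"
    if [ ! -d "$dir" ]; then
        echo "no such qeth device: $1" >&2
        exit 1
    fi
    if [ "$(cat "$dir/online")" = "1" ]; then state=online; else state=offline; fi
    # The interface name might not be readable in every device state.
    name=$(cat "$dir/if_name" 2>/dev/null) || name=unknown
    echo "$1 $state $name"
)
```

For the device in this document, the call is qeth_status 0.0.c200.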

Optionally, you can set additional parameters and features to meet a specific need. For example, the port number (portno) and the Layer 2 setting can be modified, but these modifications depend on the way your network is set up. If you need additional information, see Linux on System z - Device Drivers, Features, and Commands, SC33-8289.


Special Notices

This material has not been submitted to any formal IBM test and is published AS IS. It has not been the subject of rigorous review. IBM assumes no responsibility for its accuracy or completeness. The use of this information or the implementation of any of these techniques is a client responsibility and depends upon the client's ability to evaluate and integrate them into the client's operational environment.
