Yanyg - SAN Software Engineer

Introduction to the Linux udev Mechanism and Basic Debugging

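A compatibility problem between a cloud platform and a storage array occurred at a customer site. Analysis showed that it was caused by an SPC protocol compliance issue: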

spc5r19 P288: In response to an INQUIRY command received by an incorrect logical unit, the SCSI target device shall return the INQUIRY data with the peripheral qualifier set to the value defined in 6.7.2. The device server or task router (see SAM-5) shall terminate an INQUIRY command with CHECK CONDITION status only if the device server or task router is unable to return the requested INQUIRY data.

Table 1  spc5r19 P291
Qualifier Description
000b An addressed logical unit having the indicated peripheral
  device type is:
  a) accessible to the task router (see SAM-5) contained in the
  SCSI target port that received this INQUIRY command; or
  b) the task router is unable to determine whether or not the
  addressed logical unit is accessible from this SCSI target
  port. This peripheral qualifier value does not indicate that a
  logical unit accessible by this task router is ready for
  access.
001b An addressed logical unit having the indicated device type is
  not accessible, at this time, to the task router (see SAM-5)
  contained in the SCSI target port that received this INQUIRY
  command. However, the task router is capable of accessing the
  addressed logical unit from this SCSI target port.
010b Reserved.
011b An addressed logical unit is not accessible to the task
  router (see SAM-5) contained in the SCSI target port that
  received this INQUIRY command. If the task router sets the
  PERIPHERAL QUALIFIER field to 011b, the task router shall set
  the PERIPHERAL DEVICE TYPE field to 1Fh.
100b to 111b Vendor specific

The storage should have returned PQualifier=011b, DeviceType=1Fh; it actually returned PQualifier=001b, DeviceType=00h.
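
To make the two fields concrete: in standard INQUIRY data the peripheral qualifier occupies the top three bits of byte 0 and the peripheral device type the low five bits, which is also what the kernel's (result[0] >> 5) test relies on. The sketch below splits that byte in shell. The function name decode_inquiry_byte0 is made up for this illustration, and the two sample bytes are simply the values implied by the sentence above.

```shell
#!/bin/bash
# Illustrative helper (hypothetical name): split byte 0 of standard
# INQUIRY data into peripheral qualifier and peripheral device type.
decode_inquiry_byte0() {
    local byte0=$(( $1 ))                # accepts decimal or 0x-prefixed hex
    local pq=$(( (byte0 >> 5) & 0x7 ))   # bits 7..5
    local pdt=$(( byte0 & 0x1f ))        # bits 4..0
    printf 'PQualifier=%d DeviceType=%02Xh\n' "$pq" "$pdt"
}

decode_inquiry_byte0 0x7f   # PQualifier=3 DeviceType=1Fh (what the spec requires)
decode_inquiry_byte0 0x20   # PQualifier=1 DeviceType=00h (what was returned)
```

PQualifier=3 is 011b and PQualifier=1 is 001b in the notation of Table 1.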

The Linux kernel special-cases PQualifier=011b: it takes the exception path and discards the SCSI device instead of adding it. From linux/drivers/scsi/scsi_scan.c:

static int scsi_probe_and_add_lun(struct scsi_target *starget,
                                  u64 lun, int *bflagsp,
                                  struct scsi_device **sdevp,
                                  enum scsi_scan_mode rescan,
                                  void *hostdata)
{
        ...
        /*
         * result contains valid SCSI INQUIRY data.
         */
        if ((result[0] >> 5) == 3) {
                /*
                 * For a Peripheral qualifier 3 (011b), the SCSI
                 * spec says: The device server is not capable of
                 * supporting a physical device on this logical
                 * unit.
                 *
                 * For disks, this implies that there is no
                 * logical disk configured at sdev->lun, but there
                 * is a target id responding.
                 */
                SCSI_LOG_SCAN_BUS(2, sdev_printk(KERN_INFO, sdev, "scsi scan:"
                                   " peripheral qualifier of 3, device not"
                                   " added\n"))
                if (lun == 0) {
                        SCSI_LOG_SCAN_BUS(1, {
                                unsigned char vend[9];
                                unsigned char mod[17];

                                sdev_printk(KERN_INFO, sdev,
                                        "scsi scan: consider passing scsi_mod."
                                        "dev_flags=%s:%s:0x240 or 0x1000240\n",
                                        scsi_inq_str(vend, result, 8, 16),
                                        scsi_inq_str(mod, result, 16, 32));
                        });

                }

                res = SCSI_SCAN_TARGET_PRESENT;
                goto out_free_result;
        }

        ...
}

Releasing a new version takes a long time and affects a wide range of deployments, so the problem was worked around with udev:

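  1. Watch for events matching SUBSYSTEM=="scsi_generic", KERNEL=="sg[0-9]*" and run a script when one arrives;
  2. The script checks whether the ACTION environment variable is add or change, and deletes devices that come from the specific storage array and report PQualifier=001b.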
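
The two steps could be sketched roughly as follows. This is not the script that was deployed: the vendor string "ACME", the script and helper names, and the use of sg_inq --raw (from sg3_utils) to fetch byte 0 of the INQUIRY data are all assumptions made for illustration. The device is removed through the SCSI device's sysfs delete attribute.

```shell
#!/bin/bash
# Sketch only. A rule would invoke it as: sg-workaround.sh $tempnode $kernel
# (sg-workaround.sh is a hypothetical name).
TARGET_VENDOR="ACME"   # placeholder for the affected storage vendor

# Decide whether a device must be deleted.
# $1 = udev ACTION, $2 = vendor string, $3 = byte 0 of the INQUIRY data
should_delete() {
    case "$1" in
        add|change) ;;
        *) return 1 ;;
    esac
    [ "$2" = "$TARGET_VENDOR" ] || return 1
    [ $(( ($3 >> 5) & 0x7 )) -eq 1 ]     # PQualifier == 001b
}

# Read byte 0 of the standard INQUIRY data as a decimal number.
# Assumption: sg3_utils is installed and provides sg_inq.
inquiry_byte0() {
    sg_inq --raw "$1" 2>/dev/null | od -An -tu1 -N1 | tr -d ' '
}

main() {
    local node=$1 kernel=$2
    local sysdev="/sys/class/scsi_generic/$kernel/device"
    local vendor byte0
    vendor=$(sed 's/ *$//' "$sysdev/vendor" 2>/dev/null)  # strip sysfs padding
    byte0=$(inquiry_byte0 "$node")
    [ -n "$byte0" ] || return 0          # INQUIRY failed, leave device alone
    if should_delete "$ACTION" "$vendor" "$byte0"; then
        logger -t sg-workaround -- "deleting $kernel (vendor=$vendor byte0=$byte0)"
        echo 1 > "$sysdev/delete"
    fi
}

# Run only when both arguments from the rule are present.
if [ $# -eq 2 ]; then
    main "$@"
fi
```

Keeping the decision in should_delete, separate from the sysfs and INQUIRY access, makes the policy easy to check by hand before the rule is enabled on a production host.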

1 udev debugging procedure

  • Use info to inspect the device attributes
$ udevadm info -a -x /dev/sg0

Udevadm info starts with the device specified by the devpath and then
walks up the chain of parent devices. It prints for every device
found, all possible attributes in the udev rules key format.
A rule to match, can be composed by the attributes of the device
and the attributes from one single parent device.

  looking at device '/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/scsi_generic/sg0':
    KERNEL=="sg0"
    SUBSYSTEM=="scsi_generic"
    DRIVER==""

  looking at parent device '/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0':
    KERNELS=="0:0:0:0"
    SUBSYSTEMS=="scsi"
    DRIVERS=="sd"
  • Place a custom rules file under /etc/udev/rules.d/ (the file extension must be .rules). For example:
$ cat /etc/udev/rules.d/50-test-sg.rules
SUBSYSTEM=="scsi_generic", KERNEL=="sg[0-9]*", RUN+="/usr/sbin/sg-test.sh $tempnode $kernel"
  • Handle the specific events in the script:
$ cat /usr/sbin/sg-test.sh
#! /bin/bash

export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

[ "$ACTION" = "remove" ] && exit 0

LOGGER="logger -t udev[$(cat /proc/uptime | cut -d' ' -f1)] --"

$LOGGER "Hello debug udev, process here"
  • Load the new rules without rebooting
$ sudo udevadm control --reload
  • Test whether the rule takes effect
$ sudo udevadm test /sys/class/scsi_generic/sg0
...
28133 strings (230213 bytes), 24413 de-duplicated (197990 bytes), 3721 trie nodes used
RUN '/usr/sbin/sg-test.sh $tempnode $kernel' /etc/udev/rules.d/50-test-sg.rules:1
GROUP 6 /lib/udev/rules.d/50-udev-default.rules:63
...
  • Trigger the specific event
$ sudo udevadm trigger --subsystem-match=scsi_generic --verbose
/sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/scsi_generic/sg0
  • Log output when the event is triggered:
Apr 20 19:06:32 pc udev[729227.64]: Hello debug udev, process here
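
While iterating on a rule it can help to see which event actually reached the script. udev passes the event's properties, such as ACTION, SUBSYSTEM, DEVNAME and DEVPATH, as environment variables to programs started via RUN. The variant of sg-test.sh below logs them in one line. The helper name format_event and the log tag are made up for this sketch.

```shell
#!/bin/bash
# Debug variant of sg-test.sh (sketch): log the event that fired the rule.

# Build one log line from the environment udev passes to RUN programs.
# Variables that are unset or empty are shown as "-".
format_event() {
    printf 'action=%s subsystem=%s devname=%s devpath=%s' \
        "${ACTION:--}" "${SUBSYSTEM:--}" "${DEVNAME:--}" "${DEVPATH:--}"
}

# Skip remove events, as the original script does, and do nothing when the
# script is started by hand without ACTION in the environment.
if [ -n "${ACTION:-}" ] && [ "$ACTION" != "remove" ]; then
    logger -t sg-test -- "$(format_event) args=$*"
fi
```

Running it by hand with the variables set, for example ACTION=add DEVNAME=/dev/sg0 /usr/sbin/sg-test.sh, exercises the same code path without waiting for a real event.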

2 udev implementation

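Linux maps device files to the kernel through their major and minor device numbers. The maintenance of device files went through several stages: static maintenance, in-kernel maintenance by devfs, user-space maintenance by sysfs + hotplug, and finally udev. Compared with its predecessors, udev uses less space and offers more flexibility and more convenient device-file maintenance. See the udev talk for more details.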
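
The major/minor mapping can be observed on any device node. The sketch below assumes GNU stat, whose %t and %T format specifiers print the major and minor numbers of a device special file in hexadecimal. The function name dev_numbers is made up for this illustration.

```shell
#!/bin/bash
# Print "major:minor" in decimal for a device node (assumes GNU stat).
dev_numbers() {
    local hex_major hex_minor
    hex_major=$(stat -c '%t' "$1") || return 1
    hex_minor=$(stat -c '%T' "$1") || return 1
    echo "$(( 0x$hex_major )):$(( 0x$hex_minor ))"
}

dev_numbers /dev/null    # /dev/null is character device 1:3 on Linux
```

The same pair is exported by the kernel in the dev attribute of the device's sysfs directory, which is where udev gets the numbers it needs to create the node.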

Its advantages in short: user space, dynamic allocation, smaller, and more convenient. LWN 65197 describes this in more detail:

The Problems:

  1. A static /dev is unwieldy and big. It would be nice to only show the /dev entries for the devices we actually have running in the system.
  2. We are (well, were) running out of major and minor numbers for devices.
  3. Users want a way to name devices in a persistent fashion (i.e. "This disk here, must always be called "boot_disk" no matter where in the scsi tree I put it", or "This USB camera must always be called "camera" no matter if I have other USB scsi devices plugged in or not.")
  4. Userspace programs want to know when devices are created or removed, and what /dev entry is associated with them.

The constraints:

  1. No policy in the kernel!
  2. Follow standards (like the LSB)
  3. must be small so embedded devices will use it.

So, how does devfs stack up to the above problems and constraints:

Problems:

  1. devfs only shows the dev entries for the devices in the system.
  2. devfs does not handle the need for dynamic major/minor numbers
  3. devfs does not provide a way to name devices in a persistent fashion.
  4. devfs does provide a daemon that userspace programs can hook into to listen to see what devices are being created or removed.

Constraints:

  1. devfs forces the devfs naming policy into the kernel. If you don't like this naming scheme, tough.
  2. devfs does not follow the LSB device naming standard.
  3. devfs is small, and embedded devices use it. However it is implemented in non-pagable memory.

2.1 Code implementation

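The code is hosted under Kernel Utils Hotplug. The last update to this repository was in April 2012; it has not changed for a long time and is already very stable (udev development has since continued inside the systemd source tree). The Git repositories are: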

git://git.kernel.org/pub/scm/linux/hotplug/udev.git
https://git.kernel.org/pub/scm/linux/hotplug/udev.git
https://kernel.googlesource.com/pub/scm/linux/hotplug/udev.git

3 References