This article walks through an example iSCSI deployment: device planning and installation, host-side configuration of both iSCSI and multipath, followed by tuning and debugging.

In this article the iSCSI target is a hardware device, an IBM DS3524. I have another article describing how to install a software iSCSI target on Linux, and one on iSCSI client installation.

 

Hardware

In this case, the iSCSI target is an IBM DS3524 with 24 SAS drives (10k rpm, 600GB each) and eight 1Gb host interfaces. Four hosts will use the iSCSI storage: vm1, vm2, adm, and backup.

Storage setup

On the DS3524 target side, two arrays are created.


Array backup: 18 disk drives, RAID6, 8.9TB total.

Two LUNs, both with a 32KB segment size:

         imirror, 300GB, to be used by the adm node
         iraid, 8.6TB, to be used by backup

Array vms: 6 disk drives, RAID10, 1.67TB total.

25 LUNs, 128KB segment size: 7 x 100GB, 14 x 20GB, 4 x 40GB.

LUNs vm1-vm25 are for vm1/vm2 to use.

Two host groups are set up, with all host types set to LNXALUA:

         backupgroup: hosts backup and adm
         vmgroup:     hosts vm1 and vm2

Within a host group, each host can see the other hosts' LUNs, but a LUN should only be mounted on one host at a time.

Name    Thin  Status   Capacity       Accessible by           Source
backup  No    Optimal  8,634.585 GB   Host Group backupgroup  Array backup
mirror  No    Optimal  300.000 GB     Host Group backupgroup  Array backup
vm1     No    Optimal  100.000 GB     Host Group vmgroup      Array vms
vm2     No    Optimal  100.000 GB     Host Group vmgroup      Array vms
vm3     No    Optimal  100.000 GB     Host Group vmgroup      Array vms
vm4     No    Optimal  100.000 GB     Host Group vmgroup      Array vms
vm5     No    Optimal  100.000 GB     Host Group vmgroup      Array vms
vm6     No    Optimal  100.000 GB     Host Group vmgroup      Array vms
vm7     No    Optimal  100.000 GB     Host Group vmgroup      Array vms
vm8     No    Optimal  20.000 GB      Host Group vmgroup      Array vms
...
vm21    No    Optimal  20.000 GB      Host Group vmgroup      Array vms
vm22    No    Optimal  40.000 GB      Host Group vmgroup      Array vms
vm23    No    Optimal  40.000 GB      Host Group vmgroup      Array vms
vm24    No    Optimal  40.000 GB      Host Group vmgroup      Array vms
vm25    No    Optimal  40.000 GB      Host Group vmgroup      Array vms

Network

In this case, the iSCSI network is well isolated from the existing network infrastructure, so we use neither CHAP authentication nor an iSNS server for discovery.
Each host has two NICs (adm's second NIC needs more work); accordingly, four VLANs are created.
All ports use IPv4 with MTU 9000 and flow control enabled (an example interface configuration follows the VLAN list).

Here are the details of the four VLANs:

vlan 390
         vm1 eth1     192.168.130.200
         vm2 eth1     192.168.130.201
         iscsiA port3 192.168.130.1
         iscsiB port4 192.168.130.2
vlan 391
         vm1 eth3     192.168.131.200
         vm2 eth3     192.168.131.201
         iscsiA port4 192.168.131.1
         iscsiB port3 192.168.131.2
vlan 392
         backup eth0  192.168.132.200
         adm    eth3  192.168.132.201
         iscsiA port5 192.168.132.1
         iscsiB port6 192.168.132.2
vlan 393
         backup eth1  192.168.133.200
         adm    eth4  192.168.133.201
         iscsiA port6 192.168.133.1
         iscsiB port5 192.168.133.2
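
On the Linux hosts, the per-interface settings could look roughly like the sketch below. This is a RHEL 6 style ifcfg file with illustrative values for vm1's eth1; adapt the device name and address to your own NICs.

# /etc/sysconfig/network-scripts/ifcfg-eth1 on vm1 (illustrative)
DEVICE=eth1
BOOTPROTO=static
IPADDR=192.168.130.200
NETMASK=255.255.255.0
MTU=9000
ONBOOT=yes

Flow control can be enabled per NIC with, for example, ethtool -A eth1 rx on tx on (run at boot, or via ETHTOOL_OPTS if your network scripts support it). For MTU 9000 to be effective end to end, the switch ports and the DS3524 host ports must have jumbo frames enabled as well.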

Host-side installation

Kernel info

# uname -a
Linux 2.6.32-279.14.1.el6.x86_64 #1 SMP Tue Nov 6 11:21:14 CST 2012 x86_64 x86_64 x86_64 GNU/Linux

Installed multipath packages

# rpm -qa | grep mapper
device-mapper-event-1.02.74-10.el6.x86_64
device-mapper-event-libs-1.02.74-10.el6.x86_64
device-mapper-multipath-0.4.9-56.el6.x86_64
device-mapper-multipath-libs-0.4.9-56.el6.x86_64
device-mapper-1.02.74-10.el6.x86_64
device-mapper-libs-1.02.74-10.el6.x86_64

Install iscsi packages:

# yum install iscsi-initiator-utils 

On the backup host, multipath is configured in /etc/multipath.conf as follows:
devices {
    device {
        vendor               "IBM"
        product              "1746"
        getuid_callout       "/lib/udev/scsi_id --page=0x83 --whitelisted --device=/dev/%n"
        features             "2 pg_init_retries 5"
        hardware_handler     "1 rdac"
        path_selector        "round-robin 0"
        path_grouping_policy group_by_prio
        failback             immediate
        rr_weight            priorities
        no_path_retry        fail
        rr_min_io            1000
        path_checker         rdac
        prio                 rdac
    }
}
blacklist {
    device {
        vendor  "Kingston"
        product "DT*"
    }
    device {
        vendor  "ServeRA"
        product "*"
    }
    devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
    devnode "^hd[a-z]"
}
multipaths {
    multipath {
        wwid  360080e50002dfdb60000699750addb4f
        alias imirror
    }
    multipath {
        wwid  360080e50002dfdb60000699a50addb7c
        alias iraid
    }
...
}
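
After editing /etc/multipath.conf, the maps usually need to be reloaded before the imirror/iraid aliases appear; on RHEL 6 something along these lines should do it (a sketch, not from the original setup notes):

# service multipathd restart
# multipath -r      # rebuild the multipath maps
# multipath -ll     # verify the aliases and paths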

Configuring the iSCSI IQN

The configuration file is /etc/iscsi/initiatorname.iscsi; it stores the initiator name (IQN).

Both iSCSI targets and initiators use the same IQN format. The initiator name, which defaults to the iqn.1994-05.com.redhat prefix, looks like:

InitiatorName=iqn.1994-05.com.redhat:vm1

Note: within the same iSCSI environment, never set two initiators to the same name.
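
To check the current name, or to generate a fresh one, the standard open-iscsi tooling can be used (the -p prefix shown is just the Red Hat convention):

# cat /etc/iscsi/initiatorname.iscsi
# iscsi-iname -p iqn.1994-05.com.redhat    # prints a new random IQN with that prefix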

Configuring Open-iSCSI initiator utilities

The iSCSI initiator configuration file is /etc/iscsi/iscsid.conf. You can use it as is, or tune it according to your environment; I'll cover tuning in a later section. Here is the original configuration:

iscsid.startup = /etc/rc.d/init.d/iscsid force-start
node.startup = automatic
node.leading_login = No
node.session.timeo.replacement_timeout = 120
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 30
node.session.err_timeo.tgt_reset_timeout = 30
node.session.initial_login_retry_max = 8
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.xmit_thread_priority = -20
node.session.iscsi.InitialR2T = No
node.session.iscsi.ImmediateData = Yes
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.conn[0].iscsi.MaxRecvDataSegmentLength = 262144
node.conn[0].iscsi.MaxXmitDataSegmentLength = 0
discovery.sendtargets.iscsi.MaxRecvDataSegmentLength = 32768
node.conn[0].iscsi.HeaderDigest = None
node.session.nr_sessions = 1
node.session.iscsi.FastAbort = Yes

Tuning

For the iSCSI host-side configuration, some parameters have been changed in /etc/iscsi/iscsid.conf.

Original

node.session.timeo.replacement_timeout = 120
node.session.cmds_max = 128
node.session.queue_depth = 32

My tuned settings

node.session.timeo.replacement_timeout = 15
node.session.cmds_max = 1024
node.session.queue_depth = 128
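
Note that iscsid.conf only affects targets discovered after the change; for node records that already exist, the same values can be pushed with iscsiadm -o update, roughly as below, and then the sessions logged out and back in so they take effect.

# iscsiadm -m node -o update -n node.session.timeo.replacement_timeout -v 15
# iscsiadm -m node -o update -n node.session.cmds_max -v 1024
# iscsiadm -m node -o update -n node.session.queue_depth -v 128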

Other tunings

On top of the multipath device, block read-ahead is set to 16584 (the value that tested best):

# blockdev --setra 16584 /dev/<device>

or set it in a udev rule, as in the sketch below.
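
A minimal udev rule for this could look like the following, assuming the multipath alias iraid; the rule file name and match keys are illustrative and may need adjusting for your distribution.

# /etc/udev/rules.d/99-iscsi-readahead.rules (sketch)
ACTION=="add|change", KERNEL=="dm-*", ENV{DM_NAME}=="iraid", RUN+="/sbin/blockdev --setra 16584 /dev/%k"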

On the iSCSI target side, the cache block size is set to 32KB, and the path-fail alert is set to 60 minutes.

Connecting to the iSCSI array

The file /etc/iscsi/initiatorname.iscsi should contain an initiator name for your iSCSI client host. You need to register this initiator name in the iSCSI array's configuration for this specific client host.

Start the iSCSI daemon

Run the following command before discovering targets.

# service iscsid start
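
To have the daemons come back after a reboot, the services can also be enabled at boot time (standard RHEL 6 chkconfig usage):

# chkconfig iscsid on
# chkconfig iscsi on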

Discover available targets

Once the iscsid service is running and the client's initiator name is configured on the iSCSI array, you can discover the available targets with the following commands:

iscsiadm -m discovery -t sendtargets -p 192.168.132.1 
iscsiadm -m discovery -t sendtargets -p 192.168.132.2
iscsiadm -m discovery -t sendtargets -p 192.168.133.1
iscsiadm -m discovery -t sendtargets -p 192.168.133.2

Remove inaccessible targets

Then remove the node records that were discovered but are not on the same VLAN (this is the part of the iSCSI utilities I don't like; it looks like a bug to me). There are other ways to get rid of them (see the iscsiadm alternative after the commands below), but I found this to be a reliable way to do it:

cd /var/lib/iscsi/nodes/iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
rm -rf 192.168.130.1
rm -rf 192.168.130.2,3260,2
rm -rf 192.168.131.1,3260,1
rm -rf 192.168.131.2,3260,2
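
An alternative that avoids touching /var/lib/iscsi directly is to delete the unwanted portals through iscsiadm itself, for example:

# iscsiadm -m node -o delete \
    -T iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9 \
    -p 192.168.130.1

and the same for 192.168.130.2, 192.168.131.1 and 192.168.131.2.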

In the same way, do the equivalent discovery and cleanup on the hosts in the other VLANs.

Start iscsi

Then start iscsi and check whether the iSCSI target devices show up.

# /etc/init.d/iscsi start
# iscsiadm -m node
192.168.130.2:3260,2 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
192.168.131.1:3260,1 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
192.168.130.1:3260,1 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
192.168.131.2:3260,2 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
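
Instead of the init script, the sessions can also be logged in and out directly with iscsiadm (standard open-iscsi usage):

# iscsiadm -m node --loginall=all
# iscsiadm -m node --logoutall=all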

Check session status

You can also check the session status:

# iscsiadm -m session
tcp: [1] 192.168.130.2:3260,2 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
tcp: [2] 192.168.131.1:3260,1 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
tcp: [3] 192.168.130.1:3260,1 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
tcp: [4] 192.168.131.2:3260,2 iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9

Troubleshooting

Run iscsiadm with the -P (print) flag and a verbosity level of 0-3:

iscsiadm -m session -P3
iSCSI Transport Class version 2.0-870
version 6.2.0-873.2.el6
Target: iqn.1992-01.com.lsi:2365.60080e50002e01d0000000004fab11c9
Current Portal: 192.168.130.2:3260,2
Persistent Portal: 192.168.130.2:3260,2
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1994-05.com.redhat:7ee624947bae
Iface IPaddress: 192.168.130.201
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 1
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
*********
Timeouts:
*********
Recovery Timeout: 15
Target Reset Timeout: 30
...

Check multipath devices

Run multipath to check devices and path availability:

# multipath -ll
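
If a path is missing, the multipathd interactive shell can help narrow things down; these are standard multipathd -k commands (output omitted here):

# echo 'show paths' | multipathd -k
# echo 'show maps status' | multipathd -k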

Make the file system

mkfs.xfs -f -L iraid -d su=64k,sw=16 -b size=4096 -s size=4096 /dev/mapper/iraid

Use the _netdev option for iSCSI devices in fstab

/dev/mapper/iraid   /iraid   xfs _netdev         0 0
/dev/mapper/imirror /imirror xfs _netdev,noauto  0 0
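
With _netdev, these file systems are mounted by the netfs service at boot rather than by the early fstab pass, so make sure netfs is enabled; manual mounting works as usual (standard RHEL 6 behaviour):

# chkconfig netfs on
# mount /iraid
# df -h /iraid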

Iozone test

Result on vm2 for iraid:

  File size set to 201326592 KB
  Record Size 32 KB
  Command line used: iozone -s192g -i 0 -i 1 -r32 -j 16 -t 1
  Output is in Kbytes/sec
  Time Resolution = 0.000001 seconds.
  Processor cache size set to 1024 Kbytes.
  Processor cache line size set to 32 bytes.
  File stride size set to 16 * record size.
  Throughput test with 1 process
  Each process writes a 201326592 Kbyte file in 32 Kbyte records
  Children see throughput for  1 initial writers  =  250440.28 KB/sec
  Parent sees throughput for  1 initial writers   =  238661.17 KB/sec
  Min throughput per process                      =  250440.28 KB/sec
  Max throughput per process                      =  250440.28 KB/sec
  Avg throughput per process                      =  250440.28 KB/sec
  Min xfer                                        = 201326592.00 KB
  Children see throughput for  1 rewriters        =  250842.19 KB/sec
  Parent sees throughput for  1 rewriters         =  239207.47 KB/sec
  Min throughput per process                      =  250842.19 KB/sec
  Max throughput per process                      =  250842.19 KB/sec
  Avg throughput per process                      =  250842.19 KB/sec
  Min xfer                                        = 201326592.00 KB
  Children see throughput for  1 readers          =  253003.98 KB/sec
  Parent sees throughput for  1 readers           =  253002.54 KB/sec
  Min throughput per process                      =  253003.98 KB/sec
  Max throughput per process                      =  253003.98 KB/sec
  Avg throughput per process                      =  253003.98 KB/sec
  Min xfer                                        = 201326592.00 KB
  Children see throughput for 1 re-readers        =  250448.03 KB/sec
  Parent sees throughput for 1 re-readers         =  250445.81 KB/sec
  Min throughput per process                      =  250448.03 KB/sec
  Max throughput per process                      =  250448.03 KB/sec
  Avg throughput per process                      =  250448.03 KB/sec
  Min xfer                                        = 201326592.00 KB

 

According to the results above, the NIC channels are pretty much saturated. For smaller file tests, reads could reach 450MB/sec, benefiting from caching in memory. This can be useful for virtual machines.