Symmetrix Configuration Considerations
- Configure enough resources for your workload
- Use resources evenly for best overall performance
- Spread across all available components
- Includes the front end (FE), back end (BE), and disks
- Path management can help FE
- FAST/Optimizer can help BE
- What size system do I need?
- Each resource has a limit of I/Os per second and MBs per second
- Disks
- Back-end controllers (DAs)
- Front-end controllers (Fibre, FICON, GigE)
- SRDF controllers
- Slices (CPU complexes)
- Configure enough components to support workload peaks
- Use those resources as uniformly as possible
- CPU utilization
- As a rule of thumb, keeping utilization to no more than 50-70% is a good limit if response time is critical (see the sketch below)
- A higher utilization can be tolerated if only IOPS or total throughput matters
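A minimal sketch of why that 50-70% ceiling matters, using a generic M/M/1 queueing approximation (an illustrative assumption, not a Symmetrix-specific model); the 5 ms service time is hypothetical:

```python
# Simple M/M/1 approximation: response_time = service_time / (1 - utilization).
# This is a generic queueing model, not a Symmetrix-specific formula.
service_time_ms = 5.0  # hypothetical per-I/O service time

for utilization in (0.30, 0.50, 0.70, 0.85, 0.95):
    response_ms = service_time_ms / (1.0 - utilization)
    print(f"{utilization:.0%} busy -> ~{response_ms:.1f} ms response time")

# ~10 ms at 50% busy but ~100 ms at 95% busy, which is why 50-70% is a
# sensible ceiling when response time is critical, while higher utilization
# is acceptable when only total IOPS/throughput matters.
```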
- Memory considerations
- Ideal to have same-size memory boards and the same amount of memory in each engine
- An imbalance will make little or no difference with OLTP-type workloads
- An imbalance will drive more accesses to the boards or engines with the larger amount of memory, creating a skewed distribution over the hardware resources
- Front-end connections
- Go wide before you go deep
- Use all of the 0 ports across the directors first, and then the 1 ports
- Spread across directors first, then across ports on the same director
- Two active ports on one FA slice generally do not do more I/Os than a single port
- Ratios (random read hit normalized at 1; see the sketch after this list)
- Random read hit: 1
- Random read miss: 1/2
- Random overwrite: 1/2
- Random new write: 1/4
- Worst connection for a host with 8 connections
- All on one director
- Instead do one connection per director
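A rough sketch of how the ratios above can be used when planning front-end connections, assuming they express achievable IOPS relative to a read hit (so the relative cost per I/O is the inverse); the workload mix and the per-slice read-hit ceiling below are hypothetical, not published limits:

```python
# Relative IOPS achievable per I/O type, normalized to random read hit = 1
# (ratios from the list above); relative cost per I/O = 1 / ratio.
RATIO = {"read_hit": 1.0, "read_miss": 0.5, "overwrite": 0.5, "new_write": 0.25}

# Hypothetical workload mix (IOPS per type) and a hypothetical read-hit
# ceiling for one FA slice -- substitute measured numbers for your array.
workload = {"read_hit": 6000, "read_miss": 3000, "overwrite": 1500, "new_write": 500}
slice_read_hit_ceiling = 25000

read_hit_equivalents = sum(iops / RATIO[kind] for kind, iops in workload.items())
slices_needed = read_hit_equivalents / slice_read_hit_ceiling
print(f"~{read_hit_equivalents:.0f} read-hit-equivalent IOPS -> "
      f"at least {slices_needed:.1f} FA slices, spread across directors")
```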
- Disks
- Performance will scale linearly as you add drives
- You can see up to 510 IOPS per drive when benchmarking at 8KB, but 150 IOPS is a reasonable design number for real-world situations (see the sizing sketch below)
- Note that higher IOPS brings higher response times as well, because queues grow
- This holds until some back-end director limit is reached
- With smaller I/O sizes (<32KB), the limit reached is the CPU limit
- With larger I/O sizes (>32KB), we can reach a throughput limit in the plumbing instead
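A quick sizing sketch using the 150 IOPS-per-drive design number from above; the target back-end IOPS figure is hypothetical:

```python
import math

# Rule-of-thumb design number from above: ~150 IOPS per drive for real-world
# planning (8 KB benchmarks may show ~500+, but at much higher response times).
DESIGN_IOPS_PER_DRIVE = 150
BENCHMARK_IOPS_PER_DRIVE = 510

target_backend_iops = 24000  # hypothetical peak back-end IOPS to support

drives_by_design = math.ceil(target_backend_iops / DESIGN_IOPS_PER_DRIVE)
drives_by_benchmark = math.ceil(target_backend_iops / BENCHMARK_IOPS_PER_DRIVE)
print(f"{target_backend_iops} IOPS -> plan for ~{drives_by_design} drives "
      f"(sizing to the benchmark figure would suggest only ~{drives_by_benchmark}, "
      f"but with deep queues and poor response times)")
```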
- Engine Scaling
- Scales nearly linearly, though not quite
- From 1 to 8 engines, it's 6.8x to 7.8x with respect to IOPS (8KB I/O)
- From 1 to 8 engines, it's 4.2x to 7.1x with respect to bandwidth (64KB I/O)
- Scaling from 1 to 8 engines shows the worst numbers; 4 to 8 shows better numbers
- What's the optimum size of a hyper, or number of hypers per disk?
- As a general rule of thumb, fewer, larger hypers will give better overall system performance
- There is system overhead to manage each logical volume, so it makes sense that more logical volumes lead to more overhead
- Frequently legacy hyper size is carried forward because of migration
- Virtual Provisioning decouples the size of the hyper on the physical disk from the size of the LUN presented to the host
- You can create very large hypers for the TDATs and still present small LUNs to the host
- There can be a case of having too few hypers per drive
- Because it could limit concurrency
- Configure a minimum of 4 to 8 hypers per drive (see the sketch below)
- Not an issue with large drives or protections other than R1
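A small sketch of the hypers-per-drive check implied above; the drive size, hyper size, and the upper bound used for "too many" are illustrative assumptions:

```python
# Check the "fewer, larger hypers -- but at least 4 to 8 per drive for
# concurrency" guideline. All sizes here are hypothetical.
drive_gb = 450
hyper_gb = 60

hypers_per_drive = drive_gb // hyper_gb
if hypers_per_drive < 4:
    print(f"Only {hypers_per_drive} hypers per drive -> may limit concurrency")
elif hypers_per_drive > 32:  # illustrative upper bound, not a documented limit
    print(f"{hypers_per_drive} hypers per drive -> extra logical-volume overhead")
else:
    print(f"{hypers_per_drive} hypers per drive fits the rule of thumb")
```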
- What is the optimum queue depth?
- Single threaded (one I/O at a time), the I/O rate is simply the inverse of the service time (see the worked example below)
- For a 5.8ms service time your maximum IOPS is 172.
- Same drive with 128 I/Os queued can get nearly 500 IOPS
- We need 1-4 I/Os queued to the disk to achieve the maximum throughput with reasonable latencies
- Lower queue lengths if response time is CRITICAL
- Higher if total IOPS is more important than response time
- With VP, the LUN could be spread over 1000s of drives
- Queue depth of 32 per VP LUN is probably a reasonable start
- As IOPS go up, response time gets exponentially worse
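A worked example of the single-threaded limit and the queue-depth trade-off, using Little's Law as a simplification; the 5.8 ms service time and the ~500 IOPS at 128 queued I/Os come from the bullets above, while the intermediate data points are illustrative assumptions:

```python
# Single-threaded, the I/O rate is the inverse of the service time.
service_time_ms = 5.8
single_threaded_iops = 1000.0 / service_time_ms
print(f"1 outstanding I/O -> ~{single_threaded_iops:.0f} IOPS")  # ~172 IOPS

# With more I/Os queued, throughput rises but response time rises with it.
# Little's Law: outstanding I/Os = IOPS * response_time.
for queue_depth, iops in [(4, 330), (32, 450), (128, 500)]:  # 330/450 are illustrative
    response_ms = queue_depth / iops * 1000.0
    print(f"queue depth {queue_depth:>3}: {iops} IOPS at ~{response_ms:.0f} ms average")
```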
- What is the optimum number of members in a meta volume?
- 255 maximum supported
- Reasonable sizes for meta member counts are something like 4, 8, 16, 32
- Even numbers are preferred
- Powers of 2 fit nicely into back-end configurations
- Powers of 2 not important for VP thin metas
- Getting enough I/O into a very large meta can be a problem
- A 32-way R5 7+1 meta volume would need at least 256 I/Os queued to have 1 I/O per physical disk (see the arithmetic sketch below)
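The arithmetic behind that example, as a tiny sketch:

```python
# Outstanding I/Os needed to keep one I/O in flight per physical disk
# behind a striped meta (the 32-way R5 7+1 case from the bullet above).
meta_members = 32
disks_per_raid_group = 8  # R5 7+1

min_outstanding = meta_members * disks_per_raid_group
print(f"A {meta_members}-way meta on R5 7+1 spans up to {min_outstanding} disks, "
      f"so it needs at least {min_outstanding} queued I/Os for one per spindle")
```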
- Should I use meta volumes or host-based striping? Or both?
- Avoid too many levels of striping (plaid)
- One large meta volume may outperform several smaller meta volumes that are grouped in a host stripe
- In many cases, host-based striping is preferred over meta volumes
- One reason is that there are more host-side queues for concurrency that the host can manage before the I/O even reaches the array
- However, meta volumes can reduce complexity at the host level
- So it all depends
- A 24-way meta versus a host stripe of 6 x 4-way metas: average read response time was better with the host-based stripe
- Striped or Concatenated Metas?
- In most cases, striped meta volumes will give you better performance than concatenated
- Because they reside on more spindles
- Some exceptions exist where concatenated may be better
- If you don't have enough drives for all the meta members to be on separate drives (wrapping)
- If you plan to re-stripe many meta volumes again at the host-level
- If you are making a very large R5/R6 meta and your workload is largely sequential
- Concatenated meta volumes can be placed on the same RAID group
- Don't place striped meta volumes on the same RAID group (wrapped)
- Virtual Provisioning
- The back end is already striped over the Virtual Provisioning pool, so why re-stripe the thin volume (TDEV)?
- May be performance reasons to have a striped meta on VP
- Device WP "disconnect" between the front end and the back end
- 5874 Q2'10 SR fixes this; a future 5773 SR will as well
- Number of random read requests we can send to a single device
- Single device can have 8 outstanding reads per slice per device (TDEV on FA slice)
- Number of outstanding SRDF/S writes per device
- Single device can have 1 outstanding write per path per device
- If it is important to be able to expand a meta, choose concatenated
- What stripe and I/O size should I choose?
- For most host-based striping, 128KB or 256KB is good
- May want to consider a smaller stripe size for database logs; 64KB or smaller may be advised by a Symmetrix performance guru
- I/O sizes above 64KB or 128KB show little to no performance boost (throughput flattens out), and 256KB may actually decrease throughput, because everything is managed internally in 64KB chunks
- Segregation
- For the best overall system performance, you should not segregate applications/BCVs/Clones onto separate physical disks/DAs or engines
- For the most predictable system performance, you should segregate
- Tiers should not share DA resources, so that one tier cannot consume resources needed by another tier
- What disk drive class should I choose?
- EFDs provide the best response time and the highest IOPS of all drive types
- 15k drives provide roughly 30% better performance than 10k drives (random read miss)
- 15k drives are roughly 56% faster than SATA, and 10k drives roughly 39% faster than SATA (random read miss)
- SATA still does well on sequential reads (single threaded, with larger block sizes): basically good for a single stream, bad with multiple threads because of the resulting disk seeks
- What RAID protection should I choose?
- Read performance is similar across all protection types (the number of drives is what matters)
- The major difference is in random write performance (see the sketch after this list)
- Mirrored: 1 host write = 2 writes
- R5: 1 host write = 2 reads + 2 writes
- R6: 1 host write = 3 reads + 3 writes
- Cost is also a factor
- R5 and R6 have the lowest protection overhead (12.5% for R5 7+1, 25% for R6 6+2)
- R1 has 50% protection overhead
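A sketch that turns the write penalties above into estimated back-end disk load for a given host mix; the read-miss and write rates are hypothetical, and read hits are assumed to be absorbed by cache:

```python
# Back-end disk operations per host write, from the penalties above
# (R1: 2 writes, R5: 2 reads + 2 writes, R6: 3 reads + 3 writes).
BACKEND_OPS_PER_WRITE = {"R1": 2, "R5": 4, "R6": 6}

# Hypothetical host workload: read misses cause one back-end read each,
# read hits are served from cache and ignored here.
host_read_miss_iops = 6000
host_write_iops = 2000

for raid, penalty in BACKEND_OPS_PER_WRITE.items():
    backend_iops = host_read_miss_iops + host_write_iops * penalty
    print(f"{raid}: ~{backend_iops} back-end disk IOPS")
# R1 ~10000, R5 ~14000, R6 ~18000 -- the write penalty is why random-write-heavy
# workloads lean toward mirroring despite its 50% capacity overhead.
```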
- How much cache do I need?
- Easiest method is to utilize the Dynamic Cache Partitioning What If (DCPwi) tool
- Put like devices together in cache partitions
- Start analysis mode and collect DCP stats
- How do I know when I'm getting close to limits?
- Watch for growth trends in your workload with SPA
- Look out for increasing response time (host-based tools like iostat, sar, RMF)
- Monitor utilization metrics in WLA/STP
- Better to be proactive than to wait until you hit the wall
- Any utilization running well over 50% should be considered a possible source of future issues as the workload grows (see the sketch below)
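A minimal sketch of the kind of trend check described above, run against utilization samples exported from whatever tool you use; the component names, sample data, and thresholds are illustrative:

```python
# Flag components whose utilization level or growth trend suggests trouble ahead.
samples = {
    "DA-7A": [32, 35, 38, 44, 49, 55],  # weekly % busy, illustrative
    "FA-8E": [22, 21, 24, 23, 25, 24],
}

for component, busy in samples.items():
    slope = (busy[-1] - busy[0]) / (len(busy) - 1)  # % busy gained per period
    if busy[-1] > 50 or slope > 2:
        print(f"{component}: {busy[-1]}% busy, +{slope:.1f}%/period -> investigate before it hits the wall")
    else:
        print(f"{component}: {busy[-1]}% busy, stable")
```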