Posted by Mr. Phil Sevetson on 30 April 2012 at 8:56
On p. 9, immediately below Fig. 5 in the draft, the following language appears:
"In a DS8800, two thirds of the ranks are configured with a spare and the other two thirds do not have a spare."
This should probably be fixed so that the fractions total 1.000, not 2/3 + 2/3 = 1.333.
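A trivial sanity check of the arithmetic being reported (the "two thirds" figures are taken from the quoted draft text):

```python
# Fractions of ranks as stated in the draft: both given as two thirds.
with_spare = 2 / 3
without_spare = 2 / 3

total = with_spare + without_spare
print(f"{total:.3f}")  # 1.333 -- the fractions of ranks should sum to 1.000
```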
Posted by Hari Shanmugadhasan on 31 December 2012 at 9:12
The performance comparisons are especially interesting for scenarios that lack the current HW combination.
Page 8, "So, the combination of zHPF and SSD raises the throughput from 2.7 MBps all the way to 90 MBps. That is a profoundly big difference."
Compared with the much higher MB/sec achieved by sequential and dynamic prefetch, this makes one suspicious of V9 access paths that use list prefetch and fall into the slow scenarios. Suppressing list prefetch might slash some elapsed times.
Posted by Hari Shanmugadhasan on 31 December 2012 at 9:40
Page 11 makes clever use of DB2 Runstats sampling to get surprising results with "old" HW:
" . . .but even with sampling in the range of 5 - 20%,
the elapsed time remained higher than it was for 99% sampling."
It would appear that previously with List Prefetch:
Less is More . . . elapsed time! ;-)
- with apologies to Mies van der Rohe.
The potential CPU benefits of sampling are not discussed.
Posted by Hari Shanmugadhasan on 3 January 2013 at 11:02
It is fortunate that, by the time this Redpaper came out in July 2012, this particular bug in DB2 10 RUNSTATS TABLESAMPLE SYSTEM had already been closed (in August 2011):
PM42924: UNNEEDED SEQUENTIAL PREFETCH REQUESTS DURING RUNSTATS TABLESAMPLE
Posted by Hari Shanmugadhasan on 4 January 2013 at 8:36
The page 11, figure 8, RUNSTATS TABLESAMPLE 5 data point for 5% sampling with FICON 10K disk looks very close to the V10 Performance Topics page 278, figure 9-11, 5% page sampling bar, at just under 350 elapsed seconds, though that measurement is on a 30 GB table vs. the 32 GB table here.
It's slightly better than the No Sample case, unlike the 99% sample case in this Redpaper. This may be because the No Sample case looks very close to CPU bound.
Posted by Hari Shanmugadhasan on 4 January 2013 at 8:55
Assuming the page 11 TABLESAMPLE 5 test case is 350 elapsed secs, it implies that 5% of the 8 million pages, or 400,000 pages, were read via list prefetch in 350 seconds.
This implies a throughput of roughly 4.4 MB/sec with an average page skip of 19 4K pages, since roughly 1 in 20 pages are read.
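The back-of-the-envelope arithmetic can be checked directly (the page count and elapsed time come from the posts above; 4 KB pages are assumed):

```python
# Rough throughput estimate for the page 11 TABLESAMPLE 5 case.
total_pages = 8_000_000      # ~32 GB table of 4 KB pages
sample_fraction = 0.05       # TABLESAMPLE SYSTEM 5 -> roughly 5% of pages read
elapsed_secs = 350           # estimated from figure 8

pages_read = total_pages * sample_fraction           # 400,000 pages
bytes_read = pages_read * 4096                       # 4 KB per page
mb_per_sec = bytes_read / elapsed_secs / (1024 ** 2)

# Reading 1 in 20 pages means an average skip of 19 pages between reads.
avg_skip = int(1 / sample_fraction) - 1

print(f"{mb_per_sec:.1f} MB/sec, average skip {avg_skip} pages")  # roughly 4.4-4.5 MB/sec
```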
This is not as good as the page 8 test, a different workload with a 17-page skip between list-prefetched pages. But the random 5% page sampling of page 11 is probably more realistic.
Posted by Hari Shanmugadhasan on 4 January 2013 at 9:01
Consider that the RUNSTATS processing of page 11 is probably more CPU bound than some other workloads, so one would expect results to vary from workload to workload.
The V10 Performance Topics page 278, figure 9-11 shows CPU benefits with page sampling even against SAMPLE 5, which samples 5% of rows vs. 5% of pages. This suggests that the sampling algorithm itself has a CPU impact.
Posted by Hari Shanmugadhasan on 4 January 2013 at 9:12
Both sets of RUNSTATS TABLESAMPLE 5 comparisons with other RUNSTATS options suggest that one might consider avoiding an NPI and list prefetch for V9 queries running on "old" DASD when only 5% of pages are of interest.
A tablespace scan could provide better elapsed time, though it would cost more CPU. Results are no doubt highly dependent on a variety of factors, but it is worth considering, and perhaps testing, whether the elapsed time benefit is sufficient to justify the cost.