3.sort? This is just an application of Equation 1 to Similar comments apply to other sequential opera- compute the break-even reference interval. Use the tions (group by, rollup, cube, hash join, index build, DEC TPC-C prices [2] and components in the previ- etc…). In general, sequential operations should ous section. If sort uses to 64KB transfers then there use high-bandwidth disk transfers and they are 16 pages/MB and it gets 80 accesses per second should cache data that they will revisit the data (about 5 MB/s). within a minute. PagesPerMBofRAM = 16 pages/MB AccessesPerSecondPerDisk = 80 access/sec/disk In the limit, for large transfers, sequential access cost degenerates to the cost of the bandwidth. The tech- Using these parameters, Equation 1 yields a break- nology ratio of equation 1 becomes the reciprocal of even reference interval of 26 seconds (= (16/80) x the bandwidth (in megabytes): (2,000/15)). Actually, sort would have to write and TechnologyRatio then read the pages, so that doubles the IO cost and = (PagesPerMB)/(AccessesPerSecond) moves the balance point to 52 seconds. Anticipating = (1E6/TransferSize)/ higher bandwidths and less expensive RAM, we pre- ( DiskBandwidth/TransferSize) dict that this value will slowly grow over time. for purely sequential access = 1E6/DiskBandwidth. (3) Consequently, we recommend the one-minute- This is an interesting result. It gives rise to the as- sequential rule: hash joins, sorts, cubes, and other ymptote in Figure 3 that shows the reference interval sequential operations should use main memory to vs. page size. With current disk technology, the ref- cache data if the algorithm will revisit the data within erence interval asymptotically approaches 40 seconds a minute. as the page size grows. For example, a one-pass sort is known to run at about 5 GB/minute [4]. Such sorts use many disks and lots Reference Inverval vs Page Size of RAM but they use only half the IO bandwidth of a (DRAM vs disk or tape storage) 100,000 two-pass sort (they pass over the data only once). Applying the one-minute-sequential rule, below 5 GB 10,000 a one-pass sort is warranted. Beyond that size, a Tape two-pass sort is warranted. With 5GB of RAM a two- 1,000 robot Reference Interval pass sort can sort 100 terabytes. This covers ALL 100 disk current sorting needs. (minutes) 5 min 10 rule 1 DRAM Needed for a Two-pass Sort 1 min asymtpote Gigabytes Can Sort PetaBytes 0 1.E+12 1E+0 1E+3 1E+6 1E+9 1E+12 Page Size (bytes) DRAM size needed Figure 3: The break-even reference interval for disk 1.E+09 vs. DRAM asymptotically approaches something like one minute for current technology. The asymptote is the product of the technology ratio (which becomes 1e6/bandwidth) and the economic ratio. A later sec- tion discuses the disk-tape tradeoff. Fundamentally, 1.E+06 1.E+06 1.E+09 1.E+12 1.E+15 1.E+18 tape technology is VERY expensive to access. This Size of input file (bytes) encourages very large tape page sizes and very cold Figure 2: A two-pass sort can process 100 terabyte data on tape. The tape asymptote is approached at 10 files with a 5 GB DRAM buffer. The two pass sort GB (tape hardware is described in Table 4). balances the run length against the number of runs to merge in the second pass. If it generates a thou- 1.4. RAID and Tape sand runs of 100 MB each, it can merge them using RAID 0 (striping) spreads IO among disks and so 100 MB of merge buffers in phase 2. This is a 100 makes the transfer size smaller. Otherwise, RAID 0 GB sort. With current technology, use a 1-pass does not perturb this analysis. RAID 1 (mirroring) sort up to 5GB files. For larger files, do a 2-pass slightly decreases the cost of reads and nearly dou-