Need better file servers

I don’t think file servers can ever be fast enough. Witness this onslaught from our cluster, pummeling our FAS960:

 CPU   Total    Net kB/s    Disk kB/s    Tape kB/s Cache Cache  CP  CP Disk
       ops/s    in   out   read  write  read write   age   hit time ty util
100%   11677  2260 341256     26     90     0     0    42  100%   1%  :   1%
100%   11570  2627 338862   1330   1788     0     0    42  100%  13%  T   4%
100%   11520  2446 339475   1551   2470     0     0    42  100%  19%  T   5%
100%   11667  2436 339196     93      0     0     0    42  100%   0%  -   1%
100%   11595  2484 338587   1412   2020     0     0    42  100%  13%  T   4%
100%   11550  2439 340170     19      0     0     0    42  100%   0%  -   0%
100%   11513  2435 338183   1407   2047     0     0    43  100%  13%  T   3%
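
For a sense of scale, here is a rough sketch that averages the "Net kB/s out" column from the samples above and converts it to Gbit/s. The column position and the choice of 1000 bytes per kB are my assumptions, not something sysstat documents here; with 1024 bytes per kB the figure comes out slightly higher. Either way it lands somewhere around 2.7 to 2.8 Gbit/s of outbound traffic.

```python
# Rows copied from the sysstat samples above.
SAMPLES = """\
100%   11677  2260 341256     26     90     0     0    42  100%   1%  :   1%
100%   11570  2627 338862   1330   1788     0     0    42  100%  13%  T   4%
100%   11520  2446 339475   1551   2470     0     0    42  100%  19%  T   5%
100%   11667  2436 339196     93      0     0     0    42  100%   0%  -   1%
100%   11595  2484 338587   1412   2020     0     0    42  100%  13%  T   4%
100%   11550  2439 340170     19      0     0     0    42  100%   0%  -   0%
100%   11513  2435 338183   1407   2047     0     0    43  100%  13%  T   3%
"""

# Assumed column order: CPU, ops/s, net in, net out, disk read, ...
NET_OUT_COL = 3


def mean_net_out_kb(text):
    """Average the 'Net kB/s out' column over all non-empty rows."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    values = [int(row[NET_OUT_COL]) for row in rows]
    return sum(values) / len(values)


def kb_per_s_to_gbit(kb_per_s, bytes_per_kb=1000):
    """Convert kB/s to Gbit/s; bytes_per_kb is an assumption (1000 or 1024)."""
    return kb_per_s * bytes_per_kb * 8 / 1e9


avg_kb = mean_net_out_kb(SAMPLES)
avg_gbit = kb_per_s_to_gbit(avg_kb)
print(f"{avg_kb:.0f} kB/s is roughly {avg_gbit:.2f} Gbit/s")
```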

At least it’s being served out of the cache. There’s actually a host outside the cluster attempting to write a big file into a different volume at the same time, and it’s only managing to move about 2 MB/sec.

You can see that once the flood from the cluster eases off, the writes pick right up:

 CPU   Total    Net kB/s    Disk kB/s    Tape kB/s Cache Cache  CP  CP Disk
       ops/s    in   out   read  write  read write   age   hit time ty util
100%   12019  2163 338672    952   1002     0     0    47  100%   8%  T   2%
100%   11813  2403 337734      5      0     0     0    47  100%   0%  -   0%
100%   11982  2119 337358   1148   1438     0     0    47  100%  12%  T   3%
 28%   10951  2071  3491     95      0     0     0    48   94%   0%  -   2%
 40%   10998 17515  2141   1342   3554     0     0    48   97%  19%  T   3%
 72%    9325 42424  2535  11450  37717     0     0    48   99%  85%  2  78%
 78%    9439 53211  3361  12980  50676     0     0    48   99% 100%  B  98%

So we need two things to happen in the storage world. First, we need scalable storage grids. Second, we need QoS. As long as disks are slow rotational media, they’re never going to keep up with the aggregate interconnect speeds we have at our disposal, so we’re always going to be able to generate more I/O than our storage servers (SAN, NAS, whatever) can muster.

Distributed, grid-like approaches might very well finally give us some more juice to keep up with the I/O demand, but just as in the networking world, there will always be cases where QoS techniques like prioritization are well suited. It seems to me that the QoS approach would be easier and cheaper to implement and take advantage of than storage grids, but what do I know…
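
To make the prioritization idea concrete, here is a minimal sketch of a weighted round-robin dispatcher, one of the simpler QoS techniques borrowed from networking. It is not how any filer actually schedules I/O; the class name `WeightedScheduler`, the client names, and the 3:1 weights are all made up for illustration. The point is only that a lone writer still gets a guaranteed share of each round, however deep the cluster's read queue is.

```python
from collections import deque


class WeightedScheduler:
    """Toy weighted round-robin dispatcher for I/O requests (hypothetical)."""

    def __init__(self, weights):
        # weights maps a client class to how many requests it may
        # have served per round, e.g. {"cluster": 3, "external_writer": 1}.
        self.weights = dict(weights)
        self.queues = {name: deque() for name in self.weights}

    def submit(self, client, request):
        """Queue a request for the given client class."""
        self.queues[client].append(request)

    def dispatch_round(self):
        """Serve up to `weight` queued requests from each class."""
        served = []
        for name, weight in self.weights.items():
            queue = self.queues[name]
            for _ in range(min(weight, len(queue))):
                served.append((name, queue.popleft()))
        return served


# Hypothetical workload: a flood of cluster reads and one slow writer.
sched = WeightedScheduler({"cluster": 3, "external_writer": 1})
for i in range(100):
    sched.submit("cluster", f"read-{i}")
for i in range(5):
    sched.submit("external_writer", f"write-{i}")

first_round = sched.dispatch_round()
print(first_round)
```

With these weights the writer gets one slot in every round of four, so its five writes finish within five rounds instead of waiting behind all one hundred reads.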

Well, I guess I know one thing. I need better storage technology.

