Re: zfs update
Scott,
Capping ARC fixed most of the problem. Steve found the fix at http://wiki.gentoo.org/wiki/ZFS#Adjusting_ARC_memory_usage
What it doesn't really explain is how to compute the value to use. For the pod with 8 GB of RAM I'm using 536,870,912, which the web site says corresponds to 512 MB. It seems to be effective because twelve hours into a "linkdups -r -v" run, top is reporting a load factor of 1.00.
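For anyone following along, the arithmetic is just 512 MB expressed in bytes, and the cap goes into the zfs module options (the paths below are the standard ZFS-on-Linux ones the wiki describes; the conf file name may differ on your distro):

    # 512 MB in bytes: 512 * 1024 * 1024 = 536870912

    # Persistent cap, picked up when the zfs module loads
    # (e.g. in /etc/modprobe.d/zfs.conf):
    options zfs zfs_arc_max=536870912

    # Or adjust it on a running system:
    echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max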
Truly optimizing ZFS performance will almost certainly require Red Hat to formally support it. To do that, they'll have to reconcile the very different memory management schemes used by Solaris and Linux. In the meantime, we'll have to use band-aids like ARC caps.
--Doc
-----Original Message-----
From: Scott Duensing <scott@jaegertech.com>
Reply-to: silug-discuss@silug.org
To: silug-discuss <silug-discuss@silug.org>
Subject: Re: zfs update
Date: Wed, 7 May 2014 00:57:56 -0500
I had issues until I put a cap on the ARC. So far, so good. I use the same system for a VM host and a ZFS file server. It's a little loaded at the moment (maintenance time). Top is reporting a load of 3.91. I'm usually a lot lower.
On Wed, May 7, 2014 at 12:40 AM, Robert G. (Doc) Savage <dsavage@peaknet.net> wrote:
-----Original Message-----
From: Scott Duensing <scott@jaegertech.com>
Reply-to: silug-discuss@silug.org
To: silug-discuss <silug-discuss@silug.org>
Subject: Re: zfs update
Date: Sat, 19 Apr 2014 22:22:08 -0500
It took you a week to move 2.3T of data? That sounds very wrong. I have a similar setup here (10 x 4TB in RAIDZ2) and migrated my 9TB+ of data in just a couple days across GigE.
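Back of the envelope, assuming gigabit moves data at roughly 100 MB/s end to end:

    2.3 TB  ~ 2,300,000 MB / 100 MB/s  ~ 23,000 s  ~ 6.5 hours
    9 TB in 2 days  ~ 9,000,000 MB / 172,800 s  ~ 52 MB/s sustained

Even at half that rate, 2.3 TB should be an overnight job, not a week.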
Scott,
I attribute at least part of the slowness of the first pass to the 4096:512 sector emulation, which the ashift=12 parameter should have fixed on the return pass. There's something -- possibly inadequate caching memory space -- that ran up the load factor on the return pass to over 8 at one point. The pod has only 8 GB of RAM and the smallest/slowest AMD 6000-series CPU. Another possibility is that the 1:5 SATA port expanders are inefficient. The main server with a 2TB ZFS array has a proper server motherboard with dual quad-core 2000-series AMD CPUs and 32 GB of ECC registered RAM.
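For what it's worth, ashift is baked in when a pool is created and can't be changed afterward, so fixing it for the return pass meant recreating the pool with it forced to 12. Roughly like this -- the pool name and disk layout here are placeholders, not my actual ones:

    # Force 4096-byte sectors on a new RAIDZ2 pool:
    zpool create -o ashift=12 pod raidz2 \
        /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2 \
        /dev/disk/by-id/ata-DISK3 /dev/disk/by-id/ata-DISK4 \
        /dev/disk/by-id/ata-DISK5 /dev/disk/by-id/ata-DISK6

    # Confirm the value actually took:
    zdb -C pod | grep ashift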
--Doc
It's going to take hours/days to do anything when top says your load factor is 35+. top is also reporting a spl_kmem_cache process consuming 99.5% of CPU cycles. I'm still digging into this, but a Google search is telling me there's a fundamental conflict between the ways Solaris and Linux manage memory for ZFS filesystems. From what I've read so far, the spl_kmem_cache process consumes extremely large volumes of memory, and that winds up thrashing to/from swap. I'm reading about strange new creatures called ARC and SPL. They seem to say (a) terabyte storage requires relatively large amounts of system RAM even when not doing deduplication, and (b) until fundamental differences between Solaris and Linux memory management are resolved, I will have to throttle the size of SPL caching to values that don't result in swap thrashing. Stay tuned.
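If anyone wants to watch what the ARC is actually doing while this happens, the SPL layer exports kstats that make it easy (these are the standard ZFS-on-Linux paths):

    # Current ARC size and its ceiling, in bytes:
    grep -E '^(size|c_max)' /proc/spl/kstat/zfs/arcstats

    # The module tunable that caps it (0 means "use the built-in default"):
    cat /sys/module/zfs/parameters/zfs_arc_max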
--Doc