Bugzilla – Bug 11978
OnAccess scanning unusable due to bugs/limitations
Last modified: 2017-12-15 12:32:32 EST
We are attempting to configure clamd on Red Hat Enterprise Linux 7 to perform on-access scanning of user-writable filesystems, combined with deeper periodic scans of all local filesystems. This is the /etc/clamd.d/onaccess.conf file we are using: LocalSocket /run/clamd-onaccess/clamd.sock TemporaryDirectory /run/clamd-onaccess OnAccessExcludePath /run/clamd-onaccess CrossFilesystems false LogFacility LOG_DAEMON LogSyslog yes ScanOnAccess yes OnAccessExcludeUID 0 OnAccessExcludeUID 985 OnAccessExtraScanning yes ExtendedDetectionInfo yes OnAccessMountPath /home OnAccessMountPath /tmp OnAccessMountPath /var/tmp To our dismay, we have discovered that there isn't any practical way to do this without clamav essentially DoSing the system. First, OnAccessExcludeUID doesn't appear to do what we thought it did: ignore accesses on files when the uid of the processing accessing the file matches the uid. Rather, OnAccessExcludeUID appears to be an exemption from "OnAccessPrevention true". This means that if we kick off a periodic scan with clamscan, and clamscan descends into one of the OnAccessMountPath filesystems that clamd is watching, every single file that clamscan accesses to scan ALSO triggers clamd to scan the file. And even more hilariously, if clamscan creates temporary files (e.g., to unpack an archive), it defaults to doing so in /tmp, which (in our configuration) triggers clamd on-access scans on all of the temporary files that clamscan creates. Whee! We can avoid the triple-scan on /tmp by telling clamscan to use /run/clamd-onaccess for temporary files. But we cannot avoid the double-scan on any on-access monitored filesystems, because there is no way to tell clamd to refrain from triggering an on-access scan if the uid of the accessing process is 0. This is a showstopper for using clamscan on any system where clamd is performing on-access scanning. So, if the problem is that there is no way to make clamd play nicely with clamscan, perhaps we can use clamdscan instead of clamscan to perform periodic deep filesystem scanning. Except, oops, we can't, because bug 11462: clamd blissfully ignores the CrossFilesystems when scanning via clamdscan. So if we want clamdscan to scan the / (root) filesystem, on any system that has (e.g.) remotely mounted filesystems, it's game over; that system is now unusable. (Not to mention the billions of SELinux avc denials as clamd merrily tries to go stomping into /run, /proc, /sys, et. al. This is really sad. It's clear a lot of work went into the fanotify(7) on-access scanning capabilities on Linux. But due to two nagging little defects in clamd (OnAccessExcludeUID doesn't do what is expected, and clamd ignores the CrossFilesystems setting), there is no way to prevent the on-access scanning capabilities of clamd from DoSing a system that enables them. Taking either of these actions would make on-access scanning on Linux safe to use: 1. Make OnAccessExcludeUID actually exclude on-access scanning from uids. (Create another option, say OnAccessPreventionExcludeUID, to exclude arbitrary uids from being blocked via OnAccessPrevention. 2. Finally fix bug 11462, so it is safe to use clamd to perform all scanning on a system, both on-access scanning and periodic whole-filesystem scanning. Is there any chance of any of these happening any time soon? Also, if we were to provide a patch for #1, is there any chance is would be accepted, and in a reasonable timeframe? (The submitter of bug 11462 provided a patch, but the patch and the bug have been blissfully ignored for over a year, so that doesn't exactly encourage third parties from spending their own time to help on clamav's development.)
Could someone please uncheck the "Security group" restriction on this bug, please? This isn't a security report, and there's nothing about this bug that shouldn't be public. Thanks!
Thanks for your detailed report here. I think this is a duplicate of bug 11963. You can confirm by running clamconf to see if OnAccessExcludeUID comes up as "disabled". I also checked the code to verify, and it seems like onas_fan_checkowner is run on all paths, regardless of whether prevention is on or not, but if you're seeing something different that needs to be addressed. Additionally, I did some work on this recently that I'm hoping might fix up your problems: https://github.com/vrtadmin/clamav-devel/commit/77112702d8a29819f64e9016626c3e6fd58ae679 and https://github.com/vrtadmin/clamav-devel/commit/a20128bb21d7c371d94aa79aede81c2baee6d186 If these don't fix your issues, let me know.
(In reply to Mickey Sola from comment #2) > I think this is a duplicate of bug 11963. You can confirm by running > clamconf to see if OnAccessExcludeUID comes up as "disabled". Bingo: $ ln -s onaccess.conf /etc/clamd.d/clamd.conf $ clamconf -c /etc/clamd.d | grep OnAccessExcludeUID OnAccessExcludeUID disabled > I also checked the code to verify, and it seems like onas_fan_checkowner is > run on all paths, regardless of whether prevention is on or not, but if > you're seeing something different that needs to be addressed. I think the only issue is that we were unwittingly disabling the OnAccessExcludeUID option without realizing it, in our attempt to exclude root from on-access scanning. > Additionally, I did some work on this recently that I'm hoping might fix up > your problems: > > https://github.com/vrtadmin/clamav-devel/commit/ > 77112702d8a29819f64e9016626c3e6fd58ae679 > > and > > https://github.com/vrtadmin/clamav-devel/commit/ > a20128bb21d7c371d94aa79aede81c2baee6d186 > > If these don't fix your issues, let me know. Thanks; I'll rebuild locally with those commits and see if that resolves the issue.
(In reply to James Ralston from comment #3) > Thanks; I'll rebuild locally with those commits and see if that resolves the > issue. Alas, the patches don't quite seem to work: if I exclude -1, the option is still disabled: $ grep OnAccessExcludeUID /etc/clamd.d/clamd.conf OnAccessExcludeUID -1 OnAccessExcludeUID 985 $ clamconf -c /etc/clamd.d | grep OnAccessExcludeUID OnAccessExcludeUID disabled There is clearly code somewhere that is either special-casing or reacting to the value of -1, because other negative values don't show the same behavior: $ grep OnAccessExcludeUID /etc/clamd.d/clamd.conf OnAccessExcludeUID -2 OnAccessExcludeUID 985 $ clamconf -c /etc/clamd.d | grep OnAccessExcludeUID OnAccessExcludeUID = "-2", "985" Any tips on where to look?
That's really strange. At this point I think I just need to add an ExcludeRootUID option to get around clam's opt-parsing shenanigans. Looking through it, it seems there's no real way to have a numbered list of options that contains 0 *or* -1. I hate adding new options, but it's the cleanest solution at this point.
(In reply to Mickey Sola from comment #5) > That's really strange. > > At this point I think I just need to add an ExcludeRootUID option to get > around clam's opt-parsing shenanigans. Looking through it, it seems there's > no real way to have a numbered list of options that contains 0 *or* -1. That aligns with what I'm seeing. Just out of curiosity, I changed the code to treat -2 as excluding uid 0, instead of -1, and that works: $ grep OnAccessExcludeUID /etc/clamd.d/clamd.conf OnAccessExcludeUID -2 OnAccessExcludeUID 985 $ clamconf -c /etc/clamd.d | grep OnAccessExcludeUID OnAccessExcludeUID = "-2", "985" Testing a clamscan run as root on a filesystem that clamd is watching via OnAccessMountPath does the right thing: clamd ignores the access when it determines that the owner of /proc/PID is in the OnAccessExcludeUID (treating -2 as 0). > I hate adding new options, but it's the cleanest solution at this point. Yeah… saying "exclude uid -2 if you want to exclude uid 0" feels really klugy. If the options parser can't easily be made to just accept a value 0 to match uid 0 (instead of disabling the option entirely), a separate OnAccessExcludeRootUID option is probably cleaner. While I lack deep familiarity with the ClamAV code, implementing OnAccessExcludeRootUID looks fairly straightforward. If you think it'll be a while until you can get to this, I can take a crack at it and throw you a patch / merge request, because we really want to get this working in our environment.
Also, just FYI: I realized that on Red Hat Enterprise Linux 7, the targeted SELinux policy silently (as in, dontaudit) blocks clamd's ability to stat /proc/PID, which completely breaks the OnAccessExcludeUID feature. I'll file a Red Hat support request to address that issue, but if you get reports about OnAccessExcludeUID not working on Red Hat (and derivatives thereof), and they're not running into the (uid == 0 || uid == -1) issue, that's likely the culprit.
It's pretty straightforward, working on it now and hoping to have it finished and tested within an hour or so. Might finish a bit sooner or later than that--really depends on where else my attention gets split. And that's really good to know regarding RHEL 7's targeted policy. Do you know offhand if there are similar restrictions in the reference policy?
Commit tested and applied. Also added a bit of logging if the user can't stat /proc based on your suggestion. Please check if this patch fixes things up for you. And thanks again for all your feedback and help here. https://github.com/vrtadmin/clamav-devel/commit/ef48b6af143fe32cd5850f76cc91d8d171fd2591
No problem; we appreciate your sparing the cycles for this issue. I will kick the tires on ef48b6af and report back. The SELinux reference policy: https://github.com/TresysTechnology/refpolicy/wiki …has no mention of antivirus_t anywhere in it, so that appears to be an addition specific to Red Hat.
Hmmm: onas_fan_checkowner() seems to be skipping the stat() on /proc/PID entirely now, even when both OnAccessExcludeUID and OnAccessExcludeRootUID are set. I wonder if there is a subtle precedence flaw in this: if (!(opt = optget (opts, "OnAccessExcludeUID"))->enabled && !(opt_root = optget (opts, "OnAccessExcludeRootUID"))->enabled) …but if there is, I'm not seeing it. I'll throw some debugging statements in there and see if I can figure out what's going on. In any case, this might be easier to follow: opt = optget (opts, "OnAccessExcludeUID"); opt_root = optget (opts, "OnAccessExcludeRootUID"); if (!(opt->enabled || opt_root->enabled)) return 0; Also, "int num_arg" is now an unused variable in onas_fan_checkowner(), no?
Yep, good feedback. Will make those changes. Textbook example of why you don't test and push code 2 minutes before a meeting. Could it be that you aren't seeing the proc because the UID matches that of the clamd user whose pid is now automatically ignored without needing to stat?
(In reply to Mickey Sola from comment #12) > Could it be that you aren't seeing the proc because the UID matches that of > the clamd user whose pid is now automatically ignored without needing to > stat? The problem is that if OnAccessExcludeUID is set, the (!a && !b) test short-circuits, and opt_root is never assigned. I'll have a suggested patch for you in a few minutes (including cleanups to documentation).
Ah duh, that makes complete sense. Great catch. Your suggested changes have already been tested and pushed. https://github.com/vrtadmin/clamav-devel/commit/51be9906ac281b3d17c278a10104a5c17b38ea66
Looks good. Here are the rest of the suggestions: https://github.com/qralston/clamav-devel/commit/f97115343309411ac79f1a67de58ca10159a6409 I didn't realize until I threw some debugging statements in there just how often the process that triggered the access check exits before clamd can stat /proc/<PID>. Fundamentally, this is a limitation of the fanotify(7) API: it really should pass the pid *and* uid of the process that triggered the event, instead of just the pid. But there's no way to work around that, except to enable OnAccessPrevention (which will block the accessing process until clamd can make a decision). Anyway, when the process has already exited before clamd can stat /proc/<PID>, clamd currently performs a scan anyway, even if the user should have been excluded. I think that's probably the most reasonable course of action. But should clamd's behavior here be configurable? I.e., should there be a OnAccessExcludeUIDSkipUnknown option, to control whether clamd skips or scans files where the process that caused the event already exited and can't be tested against OnAccessExcludeUID and OnAccessExcludeRootUID? If the answer is "let's not make it configurable", should the documentation note that neither OnAccessExcludeUID nor OnAccessExcludeRootUID is *guaranteed* to exclude, due to the limitations of the fanotify(7) API?
Hmm, additional configuration is definitely an option at this point. However, we try to be careful about adding additional options...unless we can point to a specific use case/need. I want to hear if these changes have made an impact on your environment at all, and whether you think blocking additional scans will give marked improvement on system performance. If that ends up being the case, then we can definitely add that in (alongside the polling features). I will have to push it to 0.99.4 unless you can provide a patch in the next month or so.
From my testing, when OnAccessExcludeRootUID is in effect, a clamscan process can walk into a filesystem that clamd is watching, and clamd hardly reacts at all. (You have to strace it to see that it's calling stat() on a bunch of /proc/<PID> files; it doesn't peg CPU/memory usage at all.) So from our perspective, OnAccessExcludeRootUID works, and works well. We don't care if OnAccessExcludeUID and OnAccessExcludeRootUID aren't 100% guaranteed to skip scans, because that won't affect our primary use case: kicking off "clamscan --recursive" scans from cron as root. Since in that case there is a single clamscan process that persists for the life of the scan, we know that OnAccessExcludeRootUID will be effective. So, my inclination is to leave clamd's behavior the way it is, and update the documentation to note that OnAccessExcludeUID and OnAccessExcludeRootUID aren't 100% effective due to limitations of Linux. Suggested changes here: https://github.com/qralston/clamav-devel/commit/559dd8946eae785f045f4159c64d3d7098932054 If someone later comes along and makes a case for an OnAccessExcludeUIDSkipUnknown option, then they can make the case for it at that time. But for now, there's little sense in implementing a feature that no one is asking for. Making clamdscan respect CrossFilesystems (bug 11462) would be nice, but the reality is that with OnAccessExcludeRootUID, we can use clamscan to avoid bug 11462. IMHO, a polling feature to detect removable media mounts and mark them with FAN_MARK_MOUNT (bug 11983) would be a better place to spend time than fixing CrossFilesystems. But at any rate, from our perspective, the addition of OnAccessExcludeRootUID, combined with the cleanups above, resolve this bug.
After adding your documentation and logging updates, and given the above, I'm marking this as resolved. Thanks again for all your help/feedback here.