Bugzilla – Bug 12673
Commit e01ba94e36 breaks scanning files larger than 4GiB - 4kiB
Last modified: 2021-06-10 09:15:28 EDT
In ClamaAV-0.103.0 (and .1) files up to a size of 4294963200 (4G-4k) are scanned fine, but everything larger triggers the following error: --- snip --- $ dd if=/dev/zero of=testfile bs=64M count=64 $ ./clamscan/clamscan testfile LibClamAV Error: cli_calloc(): Attempt to allocate 0 bytes. Please report to https://bugzilla.clamav.net LibClamAV Warning: fmap: map header allocation failed LibClamAV Error: CRITICAL: fmap() failed /tmp/testfile: Can't allocate memory ERROR ----------- SCAN SUMMARY ----------- Known viruses: 8665590 Engine version: 0.103.0 Scanned directories: 0 Scanned files: 0 Infected files: 0 Total errors: 1 Data scanned: 0.00 MB Data read: 4096.00 MB (ratio 0.00:1) Time: 12.170 sec (0 m 12 s) Start Date: 2021:02:11 19:29:07 End Date: 2021:02:11 19:29:19 --- snap --- We bisected this down to this commit: https://github.com/Cisco-Talos/clamav-devel/commit/e01ba94e36 We reproduced this on different versions of openSUSE and SUSE Linux Enterprise Server with different versions of gcc and glibc and also verified it with "clean" builds (no patches, only default configure options and CFLAGS, etc.), so we think it is a generic problem.
We are seeing the same exact error with EPEL packages. Nice work with the bisect! This bug report in Red Hat's Bugzilla might be related: https://bugzilla.redhat.com/show_bug.cgi?id=1897909
Created attachment 7741 [details] large file scan patch I made a patch to resolve this issue. Changes in 0.103 to order of operations for creating fmaps and performaing hashes of fmaps resulted errors when scanning files that are 4096M and a different (but related) error when scanning files > 4096M. This is despite the fact that scanning is supposed to be limited to --max-scansize (MaxScanSize) and was also apparently limited to INT_MAX - 2 (aka ~1.999999G) back in 2014 to alleviate reported crashes for a few large file formats. (see https://bugzilla.clamav.net/show_bug.cgi?id=10960) This last limitation was not documented, so I added it to the sample clamd.conf. Anyways, the main issue is that the fmap module was using "unsigned int" and was then enforcing a limitation (verbose error messages) when that a map length exceeded the capapacity of an unsigned int. This commit switches the associated variables over to uint64_t, and while fmaps are still limited to size_t in other places, the fmap module will at least work with files > 4G on 64bit systems. In testing this, I found that the time to hash a file, particularly when hashing a file on an NTFS partition from Linux was really slow because we were hashing in FILEBUFF chunks (about 8K) at a time. Increasing this to 10MB chunks speeds up scanning of large files. Finally, now that hashing is performed immediately when an fmap is created for a file, hashing of files larger than max-scansize was occuring. This commit adds checks to bail out early if the file size exceeds the maximum before creating an fmap. It will alert with the Heuristics.Limits.Exceeded name if the heuristic is enabled. I also fixed CheckFmapFeatures.cmake module for the CMake build that detects if sysconf(_SC_PAGESIZE) is available.
Maybe also worth noting: I am interested in changing clamav's behavior so that incomplete or skipped scans say something other than "OK" in the future, but since I'm planning to include this patch in 0.103.2 next week -- it makes no sense to make that change right now. I really don't know how many systems will break if we started emitting something like "SKIPPED" instead of "OK" for files that are too large to scan. I suspect it will cause problems for a lot of people. I may write to the users list to try to get some user opinions about the "SKIPPED (<reason>)" or maybe "INCOMPLETE (<reason>)" or something scan result for a future version.
Created attachment 7742 [details] large file scan patch 2 Replacement patch, I missed a couple of things.
Merged this fix for inclusion in 0.104: https://github.com/Cisco-Talos/clamav-devel/commit/861153a656bcb5266952630f0a2aaed228883404 We'll also merge this into a branch for inclusion in the upcoming 0.103.2 patch release.
*** Bug 12374 has been marked as a duplicate of this bug. ***
*** Bug 12649 has been marked as a duplicate of this bug. ***
(In reply to Micah Snyder from comment #3) > Maybe also worth noting: I am interested in changing clamav's behavior so > that incomplete or skipped scans say something other than "OK" in the > future, but [...] I suspect it will cause problems for a lot of people. > > I may write to the users list to try to get some user opinions about the > "SKIPPED (<reason>)" or maybe "INCOMPLETE (<reason>)" or something scan > result for a future version. Any news on these plans?