Bug 12673 - Commit e01ba94e36 breaks scanning files larger than 4GiB - 4kiB
Commit e01ba94e36 breaks scanning files larger than 4GiB - 4kiB
Status: RESOLVED FIXED
Product: ClamAV
Classification: ClamAV
Component: libclamav
stable
x86_64 GNU/Linux
: P3 major
: 0.101.0
Assigned To: Micah Snyder
:
: 12374 12649 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2021-02-11 14:42 EST by Reinhard Max
Modified: 2021-06-10 09:15 EDT (History)
6 users (show)

See Also:
QA Contact:


Attachments
large file scan patch (14.97 KB, patch)
2021-03-30 18:56 EDT, Micah Snyder
no flags Details | Diff
large file scan patch 2 (15.77 KB, patch)
2021-03-30 22:06 EDT, Micah Snyder
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Reinhard Max 2021-02-11 14:42:03 EST
In ClamaAV-0.103.0 (and .1) files up to a size of 4294963200 (4G-4k) are scanned fine, but everything larger triggers the following error:

--- snip ---
$ dd if=/dev/zero of=testfile bs=64M count=64
$ ./clamscan/clamscan testfile 
LibClamAV Error: cli_calloc(): Attempt to allocate 0 bytes. Please report to https://bugzilla.clamav.net
LibClamAV Warning: fmap: map header allocation failed
LibClamAV Error: CRITICAL: fmap() failed
/tmp/testfile: Can't allocate memory ERROR

----------- SCAN SUMMARY -----------
Known viruses: 8665590
Engine version: 0.103.0
Scanned directories: 0
Scanned files: 0
Infected files: 0
Total errors: 1
Data scanned: 0.00 MB
Data read: 4096.00 MB (ratio 0.00:1)
Time: 12.170 sec (0 m 12 s)
Start Date: 2021:02:11 19:29:07
End Date:   2021:02:11 19:29:19
--- snap ---

We bisected this down to this commit:

https://github.com/Cisco-Talos/clamav-devel/commit/e01ba94e36

We reproduced this on different versions of openSUSE and SUSE Linux Enterprise Server with different versions of gcc and glibc and also verified it with "clean" builds (no patches, only default configure options and CFLAGS, etc.), so we think it is a generic problem.
Comment 1 Ville-Pekka Vainio 2021-02-17 08:56:29 EST
We are seeing the same exact error with EPEL packages. Nice work with the bisect! This bug report in Red Hat's Bugzilla might be related: https://bugzilla.redhat.com/show_bug.cgi?id=1897909
Comment 2 Micah Snyder 2021-03-30 18:56:52 EDT
Created attachment 7741 [details]
large file scan patch

I made a patch to resolve this issue.

Changes in 0.103 to order of operations for creating fmaps and performaing hashes of fmaps resulted errors when scanning files that are 4096M and a different (but related) error when scanning files > 4096M. This is despite the fact that scanning is supposed to be limited to --max-scansize (MaxScanSize) and was also apparently limited to INT_MAX - 2 (aka ~1.999999G) back in 2014 to alleviate reported crashes for a few large file formats. (see https://bugzilla.clamav.net/show_bug.cgi?id=10960) This last limitation was not documented, so I added it to the sample clamd.conf.  

Anyways, the main issue is that the fmap module was using "unsigned int" and was then enforcing a limitation (verbose error messages) when that a map length exceeded the capapacity of an unsigned int. This commit switches the associated variables over to uint64_t, and while fmaps are still limited to size_t in other places, the fmap module will at least work with files > 4G on 64bit systems.  

In testing this, I found that the time to hash a file, particularly when hashing a file on an NTFS partition from Linux was really slow because we were hashing in FILEBUFF chunks (about 8K) at a time.  Increasing this to 10MB chunks speeds up scanning of large files.  

Finally, now that hashing is performed immediately when an fmap is created for a file, hashing of files larger than max-scansize was occuring. This commit adds checks to bail out early if the file size exceeds the maximum before creating an fmap. It will alert with the Heuristics.Limits.Exceeded name if the heuristic is enabled.  

I also fixed CheckFmapFeatures.cmake module for the CMake build that detects if sysconf(_SC_PAGESIZE) is available.
Comment 3 Micah Snyder 2021-03-30 19:02:36 EDT
Maybe also worth noting: I am interested in changing clamav's behavior so that incomplete or skipped scans say something other than "OK" in the future, but since I'm planning to include this patch in 0.103.2 next week -- it makes no sense to make that change right now.  I really don't know how many systems will break if we started emitting something like "SKIPPED" instead of "OK" for files that are too large to scan.  I suspect it will cause problems for a lot of people.  

I may write to the users list to try to get some user opinions about the "SKIPPED (<reason>)" or maybe "INCOMPLETE (<reason>)" or something scan result for a future version.
Comment 4 Micah Snyder 2021-03-30 22:06:26 EDT
Created attachment 7742 [details]
large file scan patch 2

Replacement patch, I missed a couple of things.
Comment 5 Micah Snyder 2021-03-31 20:58:56 EDT
Merged this fix for inclusion in 0.104: https://github.com/Cisco-Talos/clamav-devel/commit/861153a656bcb5266952630f0a2aaed228883404

We'll also merge this into a branch for inclusion in the upcoming 0.103.2 patch release.
Comment 6 Micah Snyder 2021-04-03 13:25:21 EDT
*** Bug 12374 has been marked as a duplicate of this bug. ***
Comment 7 Micah Snyder 2021-04-03 13:27:04 EDT
*** Bug 12649 has been marked as a duplicate of this bug. ***
Comment 8 Reinhard Max 2021-06-10 09:15:28 EDT
(In reply to Micah Snyder from comment #3)
> Maybe also worth noting: I am interested in changing clamav's behavior so
> that incomplete or skipped scans say something other than "OK" in the
> future, but [...] I suspect it will cause problems for a lot of people.  
> 
> I may write to the users list to try to get some user opinions about the
> "SKIPPED (<reason>)" or maybe "INCOMPLETE (<reason>)" or something scan
> result for a future version.

Any news on these plans?