Bug 12667 - Unreliable Eicar-Test
Unreliable Eicar-Test
Status: RESOLVED FIXED
Product: ClamAV
Classification: ClamAV
Component: clamd
stable
x86_64 GNU/Linux
: P3 major
: 0.101.0
Assigned To: ClamAV team
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2021-01-29 17:01 EST by Hanspeter Gosteli
Modified: 2021-03-20 00:25 EDT (History)
5 users (show)

See Also:
QA Contact:


Attachments
replacement file (28.21 KB, application/octet-stream)
2021-02-05 17:20 EST, aragusa
no flags Details

Note You need to log in before you can comment on or make changes to this bug.
Description Hanspeter Gosteli 2021-01-29 17:01:08 EST
Clamd is not reliably detecting eicar file creation when TCP streaming is enabled under RHEL/Centos7,Centos8,Fedora33. Clamd will not not always log "FOUND" but randomly concludes "OK". Please note for RHEL7 we only have this impacted TCP option available. This Bug apparently degrades On-Access Scanning for all Centos7 / RHEL7.  

Frequency of appearance:
In corporate RHEL7 environment (hardened, security tools) we catch the eicar rarely (~1/10 FOUND). On performance homeoffice stack this happens rarely (<1/1000 OK), almost never when system is idle. This bug may be some race condition and it's difficult to detect as sometimes 1000s of eicar-test are successful and then suddenly many will fail.

Affected Versions:
rel/0.103.0
dev/0.103.1
dev/0.104

[vagrant@centos7 test]$ sudo cat /etc/clamd.d/scan.conf 
LogClean yes
LogSyslog yes
LogVerbose yes
LogFile /var/log/clamav.log
TCPSocket 3310
TCPAddr 127.0.0.1
OnAccessExtraScanning yes
OnAccessIncludePath /home/vagrant/test

[vagrant@centos7 test]$ cat drop_eicar.sh 
#!/bin/bash
sudo sh -c ":> /var/log/clamav.log"
rm eicar.com.*
for i in {1..10}
do
  echo 'K5B!C%@NC[4\CMK54(C^)7PP)7}$RVPNE-FGNAQNEQ-NAGVIVEHF-GRFG-SVYR!$U+U*' | tr '[A-Za-z]' '[N-ZA-Mn-za-m]' > eicar.com.$i
done
sudo cat /var/log/clamav.log | grep FOUND | wc -l
sudo cat /var/log/clamav.log | grep OK | wc -l

[vagrant@centos7 test]$ ./drop_eicar.sh 
11
2
[vagrant@centos7 test]$ ./drop_eicar.sh 
7
2
[vagrant@centos7 test]$ ./drop_eicar.sh 
9
1
[vagrant@centos7 test]$ ./drop_eicar.sh 
8
3
[vagrant@centos7 test]$ ./drop_eicar.sh 
8
1
[vagrant@centos7 test]$ ./drop_eicar.sh 
12
4
Comment 1 Hanspeter Gosteli 2021-01-30 18:15:36 EST
apologies, the reproduction script (drop_eicar.sh) is not working. it would need a sleep between creating the files and checking the log. the printed counts are incomplete, as the scans are still pending at the time.

so far only watching the log for suspicious OKs and creating the file as follows is reproducing the issue at a rate of about 1/1000 on a libvirt/centos7.

echo 'K5B!C%@NC[4\CMK54(C^)7PP)7}$RVPNE-FGNAQNEQ-NAGVIVEHF-GRFG-SVYR!$U+U*' | tr '[A-Za-z]' '[N-ZA-Mn-za-m]' > eicar.com

on the corporate system (rhel7) the eicar files are missed on 3 out of 4 occations.

the log observation is further verified by starting clamonacc with --move= and it does not quarantine the missed eicar-files.

hopefully we can come up with a script that reliably provokes this behaviour. can you please set this bug to public?
Comment 2 Hanspeter Gosteli 2021-01-31 11:35:33 EST
by stressing out all cpu cores we can reliably reproduce failure of the scan, by running the following two commands:

$ stress --cpu <double your cores>

$ echo 'K5B!C%@NC[4\CMK54(C^)7PP)7}$RVPNE-FGNAQNEQ-NAGVIVEHF-GRFG-SVYR!$U+U*' | tr '[A-Za-z]' '[N-ZA-Mn-za-m]' > eicar.com

-> the scan on eicar.com will produce an "OK" and the file will not be quarantined. after stopping stress, normal operation will resume and the successively created eicar files fill be quarantined as designed.
Comment 3 Hanspeter Gosteli 2021-01-31 15:53:16 EST
the issue happens also using LocalSocket and is not limited to TCPSocket. tcpdump reveals, clamonacc is not passing any file payload eicar string to clamd when cpu is stressed.
Comment 4 Micah Snyder 2021-01-31 16:33:59 EST
Hi Hanspeter,

I see that you're clearing the clamd log before you remove the old EICAR files, so perhaps there's a race between clamd scanning the old eicar files and the files being deleted.

If you either A) clear the log after the delete, or B) enable OnAccessPrevention that should eliminate the delete-race condition.  I haven't tested it myself. Please try these out and let me know how it goes.

It's also possible there's a race between clamd scanning the new file when it's first created and your script writing the EICAR contents to the file, but I'm guessing it's the delete-race.

-Micah
Comment 5 Hanspeter Gosteli 2021-01-31 16:54:33 EST
Hi Micah,
thanks for the reply. The initial test was flaky and has been refined down to writing a single eicar-file when the cpu's are maxed out. It seems to reliably reproduce the issue in all my environments.

Please refer to my latest setup and log output:


# OK - Clamonacc (No stress running)

ClamFanotif: attempting to feed consumer queue
ClamInotif: attempting to feed consumer queue
ClamWorker: performing scanning on file '/home/hp/test/eicar.txt.ok'
ClamWorker: handling inotify event ...
ClamWorker: performing (extra) scanning on file '/home/hp/test/eicar.txt.ok'
/home/hp/test/eicar.txt.ok: Eicar-Signature FOUND
traverse_to: Handle opened for 'home' directory.
traverse_to: Handle opened for 'hp' directory.
traverse_to: Handle opened for 'test' directory.
traverse_rename: Failed to rename: /home/hp/test/eicar.txt.ok
	to: /tmp/quarantine/eicar.txt.ok
Error:Invalid cross-device link
traverse_to: Handle opened for 'home' directory.
traverse_to: Handle opened for 'hp' directory.
traverse_to: Handle opened for 'test' directory.
/home/hp/test/eicar.txt.ok: moved to '/tmp/quarantine/eicar.txt.ok'


# NOK - Clamonacc (stress running)

ClamInotif: attempting to feed consumer queue
ClamFanotif: attempting to feed consumer queue
ClamWorker: handling inotify event ...
ClamWorker: performing (extra) scanning on file '/home/hp/test/eicar.txt.nok'
ClamWorker: performing scanning on file '/home/hp/test/eicar.txt.nok'


# Commands used on F33/0.103.1
$ ./clamd -c /etc/clamd.d/scan.conf -F --debug

$ ./clamonacc -c /etc/clamd.d/scan.conf -F --verbose --move=/tmp/quarantine --
stream

$ echo 'K5B!C%@NC[4\CMK54(C^)7PP)7}$RVPNE-FGNAQNEQ-NAGVIVEHF-GRFG-SVYR!$U+U*' | tr '[A-Za-z]' '[N-ZA-Mn-za-m]' > eicar.txt.ok

$ stress --cpu 24 # in parallel to following eicar drop:

$ echo 'K5B!C%@NC[4\CMK54(C^)7PP)7}$RVPNE-FGNAQNEQ-NAGVIVEHF-GRFG-SVYR!$U+U*' | tr '[A-Za-z]' '[N-ZA-Mn-za-m]' > eicar.txt.nok


# /etc/clamd/scan.conf
LogClean yes
LogSyslog yes
LogVerbose yes
TCPSocket 3310
TCPAddr 127.0.0.1
OnAccessExtraScanning yes
OnAccessIncludePath /home/hp/test
OnAccessExcludeRootUID yes

Thanks alot for the support!
Comment 6 aragusa 2021-02-03 17:30:10 EST
Hi Hanspeter,

I am attempting to reproduce this now, and will let you know when I do.

Thanks,

Andy
Comment 7 aragusa 2021-02-03 19:22:24 EST
I am able to reproduce this issue.  I will let you know when I have a patch for you to evaluate.

Thank you for all of the details in this ticket!

Andy
Comment 8 aragusa 2021-02-05 17:20:37 EST
Created attachment 7731 [details]
replacement file

Hi Hanspeter,

When you get a chance, could you try replacing clamonacc/inotif/inotif.c with this one and let me know if this fixes your issue?

Thanks,
Andy
Comment 9 Hanspeter Gosteli 2021-02-08 03:54:42 EST
Goodmorning Andy
I have successfully tested your patch - we're catching 100k/100k file creates on both F33 and RHEL7 (Shared Infra, many security tools stacked on) plus heavy stress.

I find 100011 files in quarantine. The 11 have .001 file ending - but to us this glitch is acceptable and may be handled in a new bugzilla.

Thank you Andy for this outstanding work. I am looking forward to see this fix merged into dev/104.

Have a great day!
Comment 10 Hanspeter Gosteli 2021-03-15 07:52:40 EDT
I just noticed the batch not breaking build-windows when i merged from upstream https://github.com/goshansp/clamav-devel/actions/runs/653880324 ... do we have an ETA when patch merges into dev/0.104 ? Thanks!
Comment 11 Micah Snyder 2021-03-20 00:25:34 EDT
Hi Hanspeter,

We had some trouble with the [internal] test system the last couple days, but it's working o.k. now.  Andy's work finished review a few days ago and just completed a run through the test pipeline.  I just merged it.  

Commit is here in dev/0.104 for the upcoming 0.104 feature release: https://github.com/Cisco-Talos/clamav-devel/commit/3fdf035911544a971a771cdd7f2116fd07c43a7e