Bugzilla – Bug 12292
clamd terminates without any obvious reason during freshclam
Last modified: 2020-02-18 17:15:36 EST
I'm observing on two servers that clamd occasionally stops without any apparent reason. I increased the debugging level for the logs, but nothing interesting shows up. What seems to happen as far as I can tell from the logs: * clamd stops working soon after freshclam * I have a monitoring (via monit) that will notice clamd is not responding any more and will restart. This happens irregularly, but roughly every 2-3 days. What confuses me: I have several other servers with pretty much exactly the same setup, but it only happens on those two. Any pointers on how to better debug this are welcome. Here's some excerpt from the logs, but it doesn't look very helpful: Mar 9 11:28:19 [host] freshclam[22973]: Received signal: wake up Mar 9 11:28:19 [host] freshclam[22973]: ClamAV update process started at Sat Mar 9 11:28:19 2019 Mar 9 11:28:19 [host] freshclam[22973]: Using IPv6 aware code Mar 9 11:28:19 [host] freshclam[22973]: Max retries == 3 Mar 9 11:28:19 [host] freshclam[22973]: Querying current.cvd.clamav.net Mar 9 11:28:20 [host] freshclam[22973]: TTL: 1800 Mar 9 11:28:20 [host] freshclam[22973]: Software version from DNS: 0.101.1 Mar 9 11:28:20 [host] freshclam[22973]: main.cvd version from DNS: 58 Mar 9 11:28:20 [host] freshclam[22973]: main.cld is up to date (version: 58, sigs: 4566249, f-level: 60, builder: sigmgr) Mar 9 11:28:20 [host] freshclam[22973]: daily.cvd version from DNS: 25383 Mar 9 11:28:25 [host] freshclam[22973]: Retrieving http://db.de.clamav.net/daily-25383.cdiff Mar 9 11:28:25 [host] freshclam[22973]: Trying to download http://db.de.clamav.net/daily-25383.cdiff (IP: 2606:4700::6810:db 54) Mar 9 11:28:25 [host] freshclam[22973]: Downloading daily-25383.cdiff [100%] Mar 9 11:28:26 [host] freshclam[22973]: cdiff_apply: Parsed 1658 lines and executed 1658 commands Mar 9 11:28:26 [host] freshclam[684]: Loading signatures from daily.cld Mar 9 11:28:32 [host] freshclam[684]: Properly loaded 1507935 signatures from new daily.cld Mar 9 11:28:32 [host] freshclam[22973]: daily.cld updated (version: 25383, sigs: 1507935, f-level: 63, builder: raynman) Mar 9 11:28:32 [host] freshclam[22973]: Querying daily.25383.102.1.0.2606470000000000000000006810db54.ping.clamav.net Mar 9 11:28:33 [host] freshclam[22973]: bytecode.cvd version from DNS: 328 Mar 9 11:28:33 [host] freshclam[22973]: bytecode.cld is up to date (version: 328, sigs: 94, f-level: 63, builder: neo) Mar 9 11:28:34 [host] freshclam[22973]: Database updated (6074278 signatures) from db.de.clamav.net (IP: 2606:4700::6810:db5 4) Mar 9 11:28:34 [host] freshclam[22973]: Clamd successfully notified about the update. Mar 9 11:28:34 [host] freshclam[22973]: -------------------------------------- Mar 9 11:32:52 [host] /etc/init.d/clamd[1194]: start-stop-daemon: 1 process refused to stop Mar 9 11:32:52 [host] freshclam[22973]: Update process terminated Mar 9 11:34:35 [host] clamd[22965]: Reading databases from /var/lib/clamav
Hanno, Nice to hear from you again. Sorry to hear that you're having occasional clamd crashes. I wonder if there is some sort a race condition in the reload_db() logic after freshclam notifies clamd about an update. If the log terminates just after: https://github.com/Cisco-Talos/clamav-devel/blob/dev/0.102/clamd/server-th.c#L199 ...then the crash must be happening sometime before: https://github.com/Cisco-Talos/clamav-devel/blob/dev/0.102/clamd/server-th.c#L243 Can you tell me what your clamd.conf configuration looks like? Do you have additional features (Such as on-access features) enabled? If it is a threading issue with the database reload this information may help quite a bit. Regards, Micah
clamd.conf: LogTime yes LogSyslog yes LogVerbose yes PidFile /var/run/clamav/clamd.pid DatabaseDirectory /var/lib/clamav LocalSocket /var/run/clamav/clamd.sock User clamav PhishingScanURLs no MaxScanSize 10M MaxFileSize 10M MaxRecursion 2 MaxFiles 100
Thanks Hanno. Nothing particularly interesting in the config then. Very barebones. I suppose it will take a deeper code review, or some luck in testing. What sort of systems are these servers? The output of `clamconf` may help. Please let me know if you learn anything else about the issue. In the meantime, I'll try to find time to examine the reload logic more closely.
More or less standard Gentoo system, but we always install the latest version of clamav (i.e. the system is "stable Gentoo", but we pick "testing clamav package").
Is this still happening? Are you sure there's enough memory in the systems? Have you tried compiling 0.101.4 from the original source?
Yes, it's still happening on an irregular basis, I can confirm at least one incidence after the update to 0.101.4. I have no indication there's a memory problem, no OOM issues reported. This is a Gentoo system, so it's compiled from the original source already.
We have someone else who reported something similar, except on Windows. He observed this issue when running clamd.exe a service and would see it become unresponsive after only 12-ish hours. In addition he is also seeing it free the signature database but continue running (and sometimes scanning) with almost no memory footprint. I wasn't able to reproduce his issue either, and don't know of anyone else observing this type of issue. I'm honestly a little bit stumped right now on this. Please do update the ticket again if you come up with anything else that might indicate ware the problem lies.
Hanno and I are doing a little debugging off-piste. Seems it's happening less often now but still happening. TBC
Do you have any more information about this crash or how to reproduce it?
Will close with "needinfo" since we could never reproduce the issue here, and haven't heard back in some time. Feel free to re-open the ticket if you find a new lead on this issue.