Bug 12292 - clamd terminates without any obvious reason during freshclam
clamd terminates without any obvious reason during freshclam
Status: CLOSED NEEDINFO
Product: ClamAV
Classification: ClamAV
Component: clamd
stable
x86_64 GNU/Linux
: P3 normal
: 0.101.0
Assigned To: ClamAV team
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2019-03-11 05:47 EDT by Hanno Böck
Modified: 2020-02-18 17:15 EST (History)
2 users (show)

See Also:
QA Contact:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Hanno Böck 2019-03-11 05:47:59 EDT
I'm observing on two servers that clamd occasionally stops without any apparent reason. I increased the debugging level for the logs, but nothing interesting shows up.

What seems to happen as far as I can tell from the logs:
* clamd stops working soon after freshclam
* I have a monitoring (via monit) that will notice clamd is not responding any more and will restart.

This happens irregularly, but roughly every 2-3 days.

What confuses me: I have several other servers with pretty much exactly the same setup, but it only happens on those two. Any pointers on how to better debug this are welcome.

Here's some excerpt from the logs, but it doesn't look very helpful:

Mar  9 11:28:19 [host] freshclam[22973]: Received signal: wake up
Mar  9 11:28:19 [host] freshclam[22973]: ClamAV update process started at Sat Mar  9 11:28:19 2019
Mar  9 11:28:19 [host] freshclam[22973]: Using IPv6 aware code
Mar  9 11:28:19 [host] freshclam[22973]: Max retries == 3
Mar  9 11:28:19 [host] freshclam[22973]: Querying current.cvd.clamav.net
Mar  9 11:28:20 [host] freshclam[22973]: TTL: 1800
Mar  9 11:28:20 [host] freshclam[22973]: Software version from DNS: 0.101.1
Mar  9 11:28:20 [host] freshclam[22973]: main.cvd version from DNS: 58
Mar  9 11:28:20 [host] freshclam[22973]: main.cld is up to date (version: 58, sigs: 4566249, f-level: 60, builder: sigmgr)
Mar  9 11:28:20 [host] freshclam[22973]: daily.cvd version from DNS: 25383
Mar  9 11:28:25 [host] freshclam[22973]: Retrieving http://db.de.clamav.net/daily-25383.cdiff
Mar  9 11:28:25 [host] freshclam[22973]: Trying to download http://db.de.clamav.net/daily-25383.cdiff (IP: 2606:4700::6810:db
54)
Mar  9 11:28:25 [host] freshclam[22973]: Downloading daily-25383.cdiff [100%]
Mar  9 11:28:26 [host] freshclam[22973]: cdiff_apply: Parsed 1658 lines and executed 1658 commands
Mar  9 11:28:26 [host] freshclam[684]: Loading signatures from daily.cld
Mar  9 11:28:32 [host] freshclam[684]: Properly loaded 1507935 signatures from new daily.cld
Mar  9 11:28:32 [host] freshclam[22973]: daily.cld updated (version: 25383, sigs: 1507935, f-level: 63, builder: raynman)
Mar  9 11:28:32 [host] freshclam[22973]: Querying daily.25383.102.1.0.2606470000000000000000006810db54.ping.clamav.net
Mar  9 11:28:33 [host] freshclam[22973]: bytecode.cvd version from DNS: 328
Mar  9 11:28:33 [host] freshclam[22973]: bytecode.cld is up to date (version: 328, sigs: 94, f-level: 63, builder: neo)
Mar  9 11:28:34 [host] freshclam[22973]: Database updated (6074278 signatures) from db.de.clamav.net (IP: 2606:4700::6810:db5
4)
Mar  9 11:28:34 [host] freshclam[22973]: Clamd successfully notified about the update.
Mar  9 11:28:34 [host] freshclam[22973]: --------------------------------------
Mar  9 11:32:52 [host] /etc/init.d/clamd[1194]: start-stop-daemon: 1 process refused to stop
Mar  9 11:32:52 [host] freshclam[22973]: Update process terminated
Mar  9 11:34:35 [host] clamd[22965]: Reading databases from /var/lib/clamav
Comment 1 Micah Snyder 2019-03-13 09:55:26 EDT
Hanno,

Nice to hear from you again.  Sorry to hear that you're having occasional clamd crashes. 

I wonder if there is some sort a race condition in the reload_db() logic after freshclam notifies clamd about an update. 

If the log terminates just after:
https://github.com/Cisco-Talos/clamav-devel/blob/dev/0.102/clamd/server-th.c#L199 

...then the crash must be happening sometime before:
https://github.com/Cisco-Talos/clamav-devel/blob/dev/0.102/clamd/server-th.c#L243

Can you tell me what your clamd.conf configuration looks like?  Do you have additional features (Such as on-access features) enabled?  If it is a threading issue with the database reload this information may help quite a bit. 

Regards,
Micah
Comment 2 Hanno Böck 2019-03-13 09:59:05 EDT
clamd.conf:

LogTime yes
LogSyslog yes
LogVerbose yes
PidFile /var/run/clamav/clamd.pid
DatabaseDirectory /var/lib/clamav
LocalSocket /var/run/clamav/clamd.sock
User clamav
PhishingScanURLs no
MaxScanSize 10M
MaxFileSize 10M
MaxRecursion 2
MaxFiles 100
Comment 3 Micah Snyder 2019-03-13 12:20:48 EDT
Thanks Hanno.  Nothing particularly interesting in the config then.  Very barebones.  I suppose it will take a deeper code review, or some luck in testing.  

What sort of systems are these servers?  The output of `clamconf` may help. 

Please let me know if you learn anything else about the issue.  In the meantime, I'll try to find time to examine the reload logic more closely.
Comment 4 Hanno Böck 2019-03-13 12:37:04 EDT
More or less standard Gentoo system, but we always install the latest version of clamav (i.e. the system is "stable Gentoo", but we pick "testing clamav package").
Comment 5 Ged 2019-09-11 05:26:30 EDT
Is this still happening?

Are you sure there's enough memory in the systems?

Have you tried compiling 0.101.4 from the original source?
Comment 6 Hanno Böck 2019-09-11 07:08:10 EDT
Yes, it's still happening on an irregular basis, I can confirm at least one incidence after the update to 0.101.4. I have no indication there's a memory problem, no OOM issues reported.

This is a Gentoo system, so it's compiled from the original source already.
Comment 7 Micah Snyder 2019-09-13 16:47:43 EDT
We have someone else who reported something similar, except on Windows.  He observed this issue when running clamd.exe a service and would see it become unresponsive after only 12-ish hours.  

In addition he is also seeing it free the signature database but continue running (and sometimes scanning) with almost no memory footprint. 

I wasn't able to reproduce his issue either, and don't know of anyone else observing this type of issue.  

I'm honestly a little bit stumped right now on this.  Please do update the ticket again if you come up with anything else that might indicate ware the problem lies.
Comment 8 Ged 2019-09-14 10:38:30 EDT
Hanno and I are doing a little debugging off-piste.

Seems it's happening less often now but still happening.

TBC
Comment 9 Micah Snyder 2019-12-05 15:51:42 EST
Do you have any more information about this crash or how to reproduce it?
Comment 10 Micah Snyder 2020-02-18 17:15:36 EST
Will close with "needinfo" since we could never reproduce the issue here, and haven't heard back in some time. Feel free to re-open the ticket if you find a new lead on this issue.