Bugzilla – Bug 11544
[clamav-users] Curious clamd behavior- .pdb sig recognized w/clamdscan but not via MTA
Last modified: 2020-09-30 13:54:01 EDT
[clamav-users] list posting from Dave McMurtrie (part I):

I created a local pdb database so I can catch phishing attempts when URLs in an email display our domain name but actually link to a malicious URL. In testing, I found something that I don't understand. When I run clamdscan on a test message it correctly detects a spoofed domain in the message. When my MTA connects to the clamd socket and asks it to scan the same exact message, it does not detect it. I ran into a very similar problem before with a gdb database and never did figure it out.

The big difference that I notice in looking at libclamav debug output is that when I ran clamdscan it detects it to be an email message and it calls cli_scanmail():

    LibClamAV debug: in cli_magic_scandesc (reclevel: 0/16)
    LibClamAV debug: Recognized ASCII text
    LibClamAV debug: Matched signature for file type Mail file
    LibClamAV debug: cache_check: 2abdd56b32d91583175dfd071e7019d1 is negative
    LibClamAV debug: Starting cli_scanmail(), recursion = 1

However, when my MTA connects to clamd it does not:

    LibClamAV debug: in cli_magic_scandesc (reclevel: 0/16)
    LibClamAV debug: Recognized ASCII text
    LibClamAV debug: cache_check: 94e3a1ba1c23e73cb98e9a8e8a801479 is positive
    LibClamAV debug: cli_magic_scandesc: returning 0 at line 2791 (no post, no cache)
    LibClamAV debug: in cli_magic_scandesc (reclevel: 0/16)
    LibClamAV debug: Recognized ASCII text
    LibClamAV debug: Matched signature for file type HTML data.UNOFFICIAL
    LibClamAV debug: cache_check: f82c03beb094dd4a77cd3074ce327601 is positive

Oh, this is version: ClamAV 0.99.1/21471/Wed Mar 23 19:48:37 2016

=======================

part II:

Replying to myself here and hoping one of the Clam developers can clue me in. I started to look at the code to figure out why it's not identifying this as type Mail when my MTA asks clamd to scan it, but it does when I manually run clamdscan.
After decoding all the "Mail" types from filetypes_int.h, it appears as though the following matches should identify something as "Mail":

    >From
    Date:
    Delivered-To:
    Delivery-date:
    Envelope-to:
    Message-ID:
    Message-Id:
    Subject:
    To:
    X-Apparently-To:
    X-Envelope-From:
    X-Original-To:
    X-Real-To:
    X-Sieve:
    X-UIDL:

My sample message has several of those headers, but none match when my MTA invokes clamd. Oddly, through dumb luck testing with telnet connecting to my MTA, I seem to have figured out what's going on. clamd appears to match these only if there's a blank line as the first line of data I send. Meaning, if I do this it won't be identified as Mail:

    mail from:dave64@andrew.cmu.edu
    250 2.1.0 dave64@andrew.cmu.edu... Sender ok
    rcpt to:dave64@andrew.cmu.edu
    250 2.1.5 dave64@andrew.cmu.edu... Recipient ok
    data
    354 Enter mail, end with "." on a line by itself
    Date: Thu, 24 Mar 2016 06:41:42 -0400
    ...snipped for brevity...

However, if I do this (note the blank line before the Date: header) it will be identified as Mail and my pdb signature works correctly:

    mail from:dave64@andrew.cmu.edu
    250 2.1.0 dave64@andrew.cmu.edu... Sender ok
    rcpt to:dave64@andrew.cmu.edu
    250 2.1.5 dave64@andrew.cmu.edu... Recipient ok
    data
    354 Enter mail, end with "." on a line by itself

    Date: Thu, 24 Mar 2016 06:41:42 -0400
    ...snipped for brevity...

Given that the SMTP protocol does not require (or even mention) that the first line of the DATA phase will be a CRLF, I'm not sure how ClamAV would ever identify anything as type Mail. Am I doing something wrong here? I assume I must be, because I can't be the only person attempting to use a pdb database to do this.
Created attachment 7105 [details] Raw email message file to test pdb signature
Created attachment 7106 [details] pdb signature file
Environment is:

    sendmail 8.15.2
    mimedefang 2.78
    clamav 0.99.1

clamd is configured to listen on a local socket. Custom mimedefang milter code uses the clamd protocol via the message_contains_virus_clamd() function.
Example of pdb signature working (note the blank line as the first part of the DATA phase of the protocol):

    [root@andrew-mx-t01 ~]# telnet localhost 25
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    220-andrew-mx-t01.andrew.cmu.edu ESMTP Sendmail 8.15.2/8.15.1
    220-Mis-identifing the sender of mail is an abuse of computing facilities.
    220 ESMTP spoken here
    helo cmu.edu
    250 andrew-mx-t01.andrew.cmu.edu Hello localhost [127.0.0.1], pleased to meet you
    mail from:dave64@andrew.cmu.edu
    250 2.1.0 dave64@andrew.cmu.edu... Sender ok
    rcpt to:dave64@andrew.cmu.edu
    250 2.1.5 dave64@andrew.cmu.edu... Recipient ok
    data
    354 Enter mail, end with "." on a line by itself

    MIME-Version: 1.0
    Date: Thu, 24 Mar 2016 06:41:42 -0400
    Message-ID: <CACEq7yO_=SqcCS_RDQCumXD=ETX1odCdpMuj3=Z23cdBqVynFA@mail.gmail.com>
    Subject: test
    From: Dave McMurtrie <dave64@andrew.cmu.edu>
    To: Dave McMurtrie <dave64@andrew.cmu.edu>
    Content-Type: multipart/alternative; boundary=001a113316228f68fb052ec9171e

    --001a113316228f68fb052ec9171e
    Content-Type: text/plain; charset=UTF-8

    go here: www.cmu.edu <http://www.google.com>
    --001a113316228f68fb052ec9171e
    Content-Type: text/html; charset=UTF-8

    <div dir="ltr">go here:<br><br><a href="http://www.google.com">www.cmu.edu</a><br></div>
    --001a113316228f68fb052ec9171e--
    .
    554 5.7.1 Message failed virus scan (Heuristics.Phishing.Email.SpoofedDomain)
Example of pdb signature not working. Note there is no blank line as the first part of the DATA phase (before the headers are sent):

    [root@andrew-mx-t01 ~]# telnet localhost 25
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    220-andrew-mx-t01.andrew.cmu.edu ESMTP Sendmail 8.15.2/8.15.1
    220-Mis-identifing the sender of mail is an abuse of computing facilities.
    220 ESMTP spoken here
    helo cmu.edu
    250 andrew-mx-t01.andrew.cmu.edu Hello localhost [127.0.0.1], pleased to meet you
    mail from:dave64@andrew.cmu.edu
    250 2.1.0 dave64@andrew.cmu.edu... Sender ok
    rcpt to:dave64@andrew.cmu.edu
    250 2.1.5 dave64@andrew.cmu.edu... Recipient ok
    data
    354 Enter mail, end with "." on a line by itself
    MIME-Version: 1.0
    Date: Thu, 24 Mar 2016 06:41:42 -0400
    Message-ID: <CACEq7yO_=SqcCS_RDQCumXD=ETX1odCdpMuj3=Z23cdBqVynFA@mail.gmail.com>
    Subject: test
    From: Dave McMurtrie <dave64@andrew.cmu.edu>
    To: Dave McMurtrie <dave64@andrew.cmu.edu>
    Content-Type: multipart/alternative; boundary=001a113316228f68fb052ec9171e

    --001a113316228f68fb052ec9171e
    Content-Type: text/plain; charset=UTF-8

    go here: www.cmu.edu <http://www.google.com>
    --001a113316228f68fb052ec9171e
    Content-Type: text/html; charset=UTF-8

    <div dir="ltr">go here:<br><br><a href="http://www.google.com">www.cmu.edu</a><br></div>
    --001a113316228f68fb052ec9171e--
    .
    250 2.0.0 u2OKLiRu008482 Message accepted for delivery
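The two transcripts differ only in the blank line sent at the start of the DATA phase. In Internet message format the first empty line ends the header section, so a message that begins with an empty line has no headers at all as far as a MIME parser is concerned. The following is a minimal sketch of that effect using Python's standard email parser. It is not sendmail, mimedefang, or ClamAV code, and the message body is abbreviated from the test message above.

```python
from email import message_from_string

# Abbreviated copy of the test message; the boundary is taken from the transcript.
BOUNDARY = "001a113316228f68fb052ec9171e"
RAW = (
    "MIME-Version: 1.0\n"
    "Subject: test\n"
    f"Content-Type: multipart/alternative; boundary={BOUNDARY}\n"
    "\n"
    f"--{BOUNDARY}\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "\n"
    "go here: www.cmu.edu <http://www.google.com>\n"
    f"--{BOUNDARY}\n"
    "Content-Type: text/html; charset=UTF-8\n"
    "\n"
    '<a href="http://www.google.com">www.cmu.edu</a>\n'
    f"--{BOUNDARY}--\n"
)

# Message as sent without a leading blank line: headers are parsed normally.
normal = message_from_string(RAW)

# Same message with one empty line first: the header section is empty,
# so everything that follows is treated as a plain body.
shifted = message_from_string("\n" + RAW)

print("normal :", normal.is_multipart(), len(normal.get_payload()), "parts")
print("shifted:", shifted.is_multipart(), shifted.keys())
```

With the leading blank line the parser sees no Content-Type header, so the message is not split into MIME parts and the original header lines end up inside the body text.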
clamdscan correctly recognizing the spoofed domain:

    [root@andrew-mx-t01 ~]# clamdscan /var/tmp/phish_test.txt
    /var/tmp/phish_test.txt: Heuristics.Phishing.Email.SpoofedDomain FOUND

    ----------- SCAN SUMMARY -----------
    Infected files: 1
    Time: 0.026 sec (0 m 0 s)

The clamd version:

    [root@andrew-mx-t01 ~]# clamd --version
    ClamAV 0.99.1/21472/Thu Mar 24 10:24:50 2016
(In reply to Dave McMurtrie from comment #5)
> Example of pdb signature not working. Note there is no blank line as the
> first part of the DATA phase (before the headers are sent):
>
> [root@andrew-mx-t01 ~]# telnet localhost 25
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
> 220-andrew-mx-t01.andrew.cmu.edu ESMTP Sendmail 8.15.2/8.15.1
> 220-Mis-identifing the sender of mail is an abuse of computing facilities.
> 220 ESMTP spoken here
> helo cmu.edu
> 250 andrew-mx-t01.andrew.cmu.edu Hello localhost [127.0.0.1], pleased to
> meet you
> mail from:dave64@andrew.cmu.edu
> 250 2.1.0 dave64@andrew.cmu.edu... Sender ok
> rcpt to:dave64@andrew.cmu.edu
> 250 2.1.5 dave64@andrew.cmu.edu... Recipient ok
> data
> 354 Enter mail, end with "." on a line by itself
> MIME-Version: 1.0
> Date: Thu, 24 Mar 2016 06:41:42 -0400

These file typing signatures require the LF, including mime-version:

"1:0,1024:0a(46|66)726f6d3a20{-1024}0a(4d|6d)(49|69)(4d|6d)(45|65)2d(56|76)657273696f6e3a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"

"1:0,1024:0a(46|66)726f6d3a20{-2048}0a(43|63)6f6e74656e742d(54|74)7970653a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"

"1:0,1024:0a(4d|6d)(49|69)(4d|6d)(45|65)2d(56|76)657273696f6e3a20{-2048}0a(43|63)6f6e74656e742d(54|74)7970653a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"

"1:0,1024:0a(4d|6d)6573736167652d(49|69)643a20{-1024}0a(43|63)6f6e74656e742d(54|74)7970653a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"

Mime version sig should probably change...
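To make the hex patterns above easier to read, here is a rough sketch that hand-translates the second and third signatures into Python byte regexes and tries them on the test message headers. It rests on several assumptions: "0a" is a literal LF, "(4d|6d)" style groups are case alternatives, "{-2048}" means "at most 2048 arbitrary bytes", and "0,1024" means the match has to start within the first 1024 bytes. The helper name looks_like_mail is hypothetical and this is not how libclamav implements matching.

```python
import re

# "\nMIME-Version: " ... up to 2048 bytes ... "\nContent-Type: " (third signature)
MIME_THEN_CTYPE = re.compile(
    rb"\n[Mm][Ii][Mm][Ee]-[Vv]ersion: .{0,2048}?\n[Cc]ontent-[Tt]ype: ", re.DOTALL
)
# "\nFrom: " ... up to 2048 bytes ... "\nContent-Type: " (second signature)
FROM_THEN_CTYPE = re.compile(
    rb"\n[Ff]rom: .{0,2048}?\n[Cc]ontent-[Tt]ype: ", re.DOTALL
)

# Header block of the test message, exactly as it starts in the DATA phase.
HEADERS = (
    b"MIME-Version: 1.0\n"
    b"Date: Thu, 24 Mar 2016 06:41:42 -0400\n"
    b"Subject: test\n"
    b"From: Dave McMurtrie <dave64@andrew.cmu.edu>\n"
    b"To: Dave McMurtrie <dave64@andrew.cmu.edu>\n"
    b"Content-Type: multipart/alternative; boundary=001a113316228f68fb052ec9171e\n"
)


def looks_like_mail(pattern, data, max_shift=1024):
    """Return True if the pattern matches and the match starts within max_shift bytes.

    The max_shift rule is an assumed reading of the "0,1024" offset field.
    """
    m = pattern.search(data)
    return m is not None and m.start() <= max_shift


print("MIME-Version sig, no leading LF  :", looks_like_mail(MIME_THEN_CTYPE, HEADERS))
print("MIME-Version sig, with leading LF:", looks_like_mail(MIME_THEN_CTYPE, b"\n" + HEADERS))
print("From sig, no leading LF          :", looks_like_mail(FROM_THEN_CTYPE, HEADERS))
```

Under these assumptions the MIME-Version pattern misses when MIME-Version: is the very first line, because nothing precedes it, and hits once a blank line is sent first. The From: pattern hits either way, since From: sits mid-header and is always preceded by the LF of the previous line.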
To be done in conjunction with other email/MIME/magics issues for 0.99.3.
This is not a bug in ClamAV. The problem lies within mimedefang. mimedefang connects to the clamd socket after the message has already been broken apart into pieces and it sends a SCAN command to clamd telling it to scan the mimedefang "./Work" directory, which contains all of the unpacked MIME parts as files. clamd never sees the raw mail message or the headers that would identify it as a mail message. The single linefeed prior to sending data was a red herring. That just caused mimedefang to fail to unpack the message, leaving it as a raw message inside the Work directory which would allow clamd to properly detect that it was a mail message. For clamd to ever properly classify something as a mail message, mimedefang would have to have it scan the ./INPUTMSG file that it creates, which is the raw email message. Unfortunately, looking at the mimedefang source, it is hard-coded to always scan the ./Work directory. When I modify mimedefang to look at both the Work directory and the INPUTMSG file, clamd is working exactly as expected.
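For illustration, here is a minimal sketch of asking clamd over its local socket to scan both the unpacked Work directory and the raw INPUTMSG file. It is written in Python rather than Perl and is not the actual mimedefang change. The function names are hypothetical, and the socket and spool paths a caller would pass are placeholders. It assumes clamd's newline-delimited SCAN command, one reply line per command of the form "path: OK" or "path: SignatureName FOUND", and that clamd closes the connection after replying.

```python
import socket


def build_scan_command(path):
    """Build a newline-delimited clamd SCAN command for a file or directory."""
    return f"nSCAN {path}\n".encode()


def parse_reply(line):
    """Split a clamd reply line into (path, signature name or None if clean)."""
    path, _, status = line.strip().rpartition(": ")
    if status.endswith(" FOUND"):
        return path, status[: -len(" FOUND")]
    if status == "OK":
        return path, None
    raise RuntimeError(f"unexpected clamd reply: {line!r}")


def clamd_scan(socket_path, target):
    """Send one SCAN command over clamd's Unix socket and parse the reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(build_scan_command(target))
        reply = b""
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # assumed: clamd closes the connection after replying
                break
            reply += chunk
    return parse_reply(reply.decode())


def scan_message_dir(socket_path, spool_dir):
    """Scan the unpacked parts and the raw message; return the first hit, if any."""
    for target in (f"{spool_dir}/Work", f"{spool_dir}/INPUTMSG"):
        _, signature = clamd_scan(socket_path, target)
        if signature is not None:
            return signature
    return None
```

Scanning INPUTMSG in addition to Work gives clamd the raw message with its headers, so it has a chance to type the input as mail, while the unpacked parts are still scanned as before.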
(In reply to Steven Morgan from comment #7)
> (In reply to Dave McMurtrie from comment #5)
> > Example of pdb signature not working. Note there is no blank line as the
> > first part of the DATA phase (before the headers are sent):
> >
> > [root@andrew-mx-t01 ~]# telnet localhost 25
> > Trying 127.0.0.1...
> > Connected to localhost.
> > Escape character is '^]'.
> > 220-andrew-mx-t01.andrew.cmu.edu ESMTP Sendmail 8.15.2/8.15.1
> > 220-Mis-identifing the sender of mail is an abuse of computing facilities.
> > 220 ESMTP spoken here
> > helo cmu.edu
> > 250 andrew-mx-t01.andrew.cmu.edu Hello localhost [127.0.0.1], pleased to
> > meet you
> > mail from:dave64@andrew.cmu.edu
> > 250 2.1.0 dave64@andrew.cmu.edu... Sender ok
> > rcpt to:dave64@andrew.cmu.edu
> > 250 2.1.5 dave64@andrew.cmu.edu... Recipient ok
> > data
> > 354 Enter mail, end with "." on a line by itself
> > MIME-Version: 1.0
> > Date: Thu, 24 Mar 2016 06:41:42 -0400
>
> These file typing signatures require the LF, including mime-version:
>
> "1:0,1024:0a(46|66)726f6d3a20{-1024}0a(4d|6d)(49|69)(4d|6d)(45|65)2d(56|76)657273696f6e3a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"
>
> "1:0,1024:0a(46|66)726f6d3a20{-2048}0a(43|63)6f6e74656e742d(54|74)7970653a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"
>
> "1:0,1024:0a(4d|6d)(49|69)(4d|6d)(45|65)2d(56|76)657273696f6e3a20{-2048}0a(43|63)6f6e74656e742d(54|74)7970653a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"
>
> "1:0,1024:0a(4d|6d)6573736167652d(49|69)643a20{-1024}0a(43|63)6f6e74656e742d(54|74)7970653a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"
>
> Mime version sig should probably change...

I think these are all fine, since every line in an smtp protocol transaction ends with a LF.
Reassigning.
Move to 0.99.4 for batch mail fixes.
Dave, I just stumbled across this. I'm having trouble understanding why this ticket is still open. It looks like you identified that mimedefang was incorrectly sending partial emails to clamd to be scanned rather than whole/raw eml files. May I close this ticket? Micah
Hi Micah, Yes, I'm sorry this is still on your plate. As I explained, this is *not* a bug in ClamAV and should be closed. Thanks! Dave
(In reply to Dave McMurtrie from comment #14)
> Hi Micah,
>
> Yes, I'm sorry this is still on your plate. As I explained, this is *not* a
> bug in ClamAV and should be closed.
>
> Thanks!
>
> Dave

Hi Dave,

Thanks! Will close it. :D

-Micah