Bug 11544 - [clamav-users] Curious clamd behavior- .pdb sig recognized w/clamdscan but not via MTA
Status: CLOSED WONTFIX
Product: ClamAV
Classification: ClamAV
Component: libclamav
Version: stable
Hardware: x86_64 GNU/Linux
Importance: P3 normal
Target Milestone: 0.99.4
Assigned To: Micah Snyder
Reported: 2016-03-24 15:12 EDT by Steven Morgan
Modified: 2020-09-30 13:54 EDT


Attachments
Raw email message file to test pdb signature (628 bytes, text/plain)
2016-03-24 16:13 EDT, Dave McMurtrie
pdb signature file (14 bytes, application/x-aportisdoc)
2016-03-24 16:14 EDT, Dave McMurtrie

Description Steven Morgan 2016-03-24 15:12:53 EDT
[clamav-users] list posting from Dave McMurtrie (part I):

I created a local pdb database so I can catch phishing attempts when
URLs in an email display our domain name but actually link to a
malicious URL.  In testing, I found something that I don't understand.

When I run clamdscan on a test message it correctly detects a spoofed
domain in the message.  When my MTA connects to the clamd socket and
asks it to scan the same exact message, it does not detect it.

I ran into a very similar problem before with a gdb database and never
did figure it out.  The big difference that I notice in looking at
libclamav debug output is that when I ran clamdscan it detects it to be
an email message and it calls cli_scanmail():

LibClamAV debug: in cli_magic_scandesc (reclevel: 0/16)
LibClamAV debug: Recognized ASCII text
LibClamAV debug: Matched signature for file type Mail file
LibClamAV debug: cache_check: 2abdd56b32d91583175dfd071e7019d1 is negative
LibClamAV debug: Starting cli_scanmail(), recursion = 1


However, when my MTA connects to clamd it does not:

LibClamAV debug: in cli_magic_scandesc (reclevel: 0/16)
LibClamAV debug: Recognized ASCII text
LibClamAV debug: cache_check: 94e3a1ba1c23e73cb98e9a8e8a801479 is positive
LibClamAV debug: cli_magic_scandesc: returning 0  at line 2791 (no post, no cache)
LibClamAV debug: in cli_magic_scandesc (reclevel: 0/16)
LibClamAV debug: Recognized ASCII text
LibClamAV debug: Matched signature for file type HTML data.UNOFFICIAL
LibClamAV debug: cache_check: f82c03beb094dd4a77cd3074ce327601 is positive

Oh, this is version: ClamAV 0.99.1/21471/Wed Mar 23 19:48:37 2016

=======================

part II:

Replying to myself here and hoping one of the Clam developers can clue
me in.

I started to look at the code to figure out why it's not identifying
this as type Mail when my MTA asks clamd to scan it, but it does when I
manually run clamdscan.  After decoding all the "Mail" types from
filetypes_int.h, it appears as though the following matches should
identify something as "Mail":

>From
Date:
Delivered-To:
Delivery-date:
Envelope-to:
Message-ID:
Message-Id:
Subject:
To:
X-Apparently-To:
X-Envelope-From:
X-Original-To:
X-Real-To:
X-Sieve:
X-UIDL:

My sample message has several of those headers, but none match when my
MTA invokes clamd.  Oddly, through dumb luck testing with telnet
connecting to my MTA I seem to have figured out what's going on.

clamd appears to only match any of these if there's a blank line as the
first line of data I send.

Meaning, if I do this it won't be identified as Mail:

mail from:dave64@andrew.cmu.edu
250 2.1.0 dave64@andrew.cmu.edu... Sender ok
rcpt to:dave64@andrew.cmu.edu
250 2.1.5 dave64@andrew.cmu.edu... Recipient ok
data
354 Enter mail, end with "." on a line by itself
Date: Thu, 24 Mar 2016 06:41:42 -0400
...snipped for brevity...

However, if I do this it will be identified as Mail and my pdb signature
works correctly:

mail from:dave64@andrew.cmu.edu
250 2.1.0 dave64@andrew.cmu.edu... Sender ok
rcpt to:dave64@andrew.cmu.edu
250 2.1.5 dave64@andrew.cmu.edu... Recipient ok
data
354 Enter mail, end with "." on a line by itself

Date: Thu, 24 Mar 2016 06:41:42 -0400
...snipped for brevity...

Given that the SMTP protocol does not require (or even mention) that the
first line of the DATA phase be a bare CRLF, I'm not sure how ClamAV
would ever identify anything as type Mail.

Am I doing something wrong here?  I assume I must be, because I can't be
the only person attempting to use a pdb database to do this.
Comment 1 Dave McMurtrie 2016-03-24 16:13:36 EDT
Created attachment 7105 [details]
Raw email message file to test pdb signature
Comment 2 Dave McMurtrie 2016-03-24 16:14:20 EDT
Created attachment 7106 [details]
pdb signature file
Comment 3 Dave McMurtrie 2016-03-24 16:17:14 EDT
Environment is:

sendmail 8.15.2
mimedefang 2.78
clamav 0.99.1

clamd is configured to listen on a local socket.  Custom mimedefang milter code uses the clamd protocol via the message_contains_virus_clamd() function.
Comment 4 Dave McMurtrie 2016-03-24 16:21:22 EDT
Example of pdb signature working (note the blank line as the first part of the DATA phase of the protocol):

[root@andrew-mx-t01 ~]# telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220-andrew-mx-t01.andrew.cmu.edu ESMTP Sendmail 8.15.2/8.15.1
220-Mis-identifing the sender of mail is an abuse of computing facilities.
220 ESMTP spoken here
helo cmu.edu
250 andrew-mx-t01.andrew.cmu.edu Hello localhost [127.0.0.1], pleased to meet you
mail from:dave64@andrew.cmu.edu
250 2.1.0 dave64@andrew.cmu.edu... Sender ok
rcpt to:dave64@andrew.cmu.edu
250 2.1.5 dave64@andrew.cmu.edu... Recipient ok
data
354 Enter mail, end with "." on a line by itself

MIME-Version: 1.0
Date: Thu, 24 Mar 2016 06:41:42 -0400
Message-ID: <CACEq7yO_=SqcCS_RDQCumXD=ETX1odCdpMuj3=Z23cdBqVynFA@mail.gmail.com>
Subject: test
From: Dave McMurtrie <dave64@andrew.cmu.edu>
To: Dave McMurtrie <dave64@andrew.cmu.edu>
Content-Type: multipart/alternative; boundary=001a113316228f68fb052ec9171e

--001a113316228f68fb052ec9171e
Content-Type: text/plain; charset=UTF-8

go here:

www.cmu.edu <http://www.google.com>

--001a113316228f68fb052ec9171e
Content-Type: text/html; charset=UTF-8

<div dir="ltr">go here:<br><br><a href="http://www.google.com">www.cmu.edu</a><br></div>

--001a113316228f68fb052ec9171e--
.
554 5.7.1 Message failed virus scan (Heuristics.Phishing.Email.SpoofedDomain)
Comment 5 Dave McMurtrie 2016-03-24 16:23:48 EDT
Example of pdb signature not working.  Note there is no blank line as the first part of the DATA phase (before the headers are sent):

[root@andrew-mx-t01 ~]# telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220-andrew-mx-t01.andrew.cmu.edu ESMTP Sendmail 8.15.2/8.15.1
220-Mis-identifing the sender of mail is an abuse of computing facilities.
220 ESMTP spoken here
helo cmu.edu
250 andrew-mx-t01.andrew.cmu.edu Hello localhost [127.0.0.1], pleased to meet you
mail from:dave64@andrew.cmu.edu
250 2.1.0 dave64@andrew.cmu.edu... Sender ok
rcpt to:dave64@andrew.cmu.edu
250 2.1.5 dave64@andrew.cmu.edu... Recipient ok
data
354 Enter mail, end with "." on a line by itself
MIME-Version: 1.0
Date: Thu, 24 Mar 2016 06:41:42 -0400
Message-ID: <CACEq7yO_=SqcCS_RDQCumXD=ETX1odCdpMuj3=Z23cdBqVynFA@mail.gmail.com>
Subject: test
From: Dave McMurtrie <dave64@andrew.cmu.edu>
To: Dave McMurtrie <dave64@andrew.cmu.edu>
Content-Type: multipart/alternative; boundary=001a113316228f68fb052ec9171e

--001a113316228f68fb052ec9171e
Content-Type: text/plain; charset=UTF-8

go here:

www.cmu.edu <http://www.google.com>

--001a113316228f68fb052ec9171e
Content-Type: text/html; charset=UTF-8

<div dir="ltr">go here:<br><br><a href="http://www.google.com">www.cmu.edu</a><br></div>

--001a113316228f68fb052ec9171e--
.
250 2.0.0 u2OKLiRu008482 Message accepted for delivery
Comment 6 Dave McMurtrie 2016-03-24 16:25:39 EDT
clamdscan correctly recognizing the spoofed domain:

[root@andrew-mx-t01 ~]# clamdscan /var/tmp/phish_test.txt 
/var/tmp/phish_test.txt: Heuristics.Phishing.Email.SpoofedDomain FOUND

----------- SCAN SUMMARY -----------
Infected files: 1
Time: 0.026 sec (0 m 0 s)


The clamd version:

[root@andrew-mx-t01 ~]# clamd --version
ClamAV 0.99.1/21472/Thu Mar 24 10:24:50 2016
Comment 7 Steven Morgan 2016-03-25 11:58:31 EDT
(In reply to Dave McMurtrie from comment #5)
> Example of pdb signature not working.  Note there is no blank line as the
> first part of the DATA phase (before the headers are sent):
> 
> [root@andrew-mx-t01 ~]# telnet localhost 25
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
> 220-andrew-mx-t01.andrew.cmu.edu ESMTP Sendmail 8.15.2/8.15.1
> 220-Mis-identifing the sender of mail is an abuse of computing facilities.
> 220 ESMTP spoken here
> helo cmu.edu
> 250 andrew-mx-t01.andrew.cmu.edu Hello localhost [127.0.0.1], pleased to
> meet you
> mail from:dave64@andrew.cmu.edu
> 250 2.1.0 dave64@andrew.cmu.edu... Sender ok
> rcpt to:dave64@andrew.cmu.edu
> 250 2.1.5 dave64@andrew.cmu.edu... Recipient ok
> data
> 354 Enter mail, end with "." on a line by itself
> MIME-Version: 1.0
> Date: Thu, 24 Mar 2016 06:41:42 -0400

These file typing signatures require the LF, including mime-version:

"1:0,1024:0a(46|66)726f6d3a20{-1024}0a(4d|6d)(49|69)(4d|6d)(45|65)2d(56|76)657273696f6e3a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"

"1:0,1024:0a(46|66)726f6d3a20{-2048}0a(43|63)6f6e74656e742d(54|74)7970653a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"

"1:0,1024:0a(4d|6d)(49|69)(4d|6d)(45|65)2d(56|76)657273696f6e3a20{-2048}0a(43|63)6f6e74656e742d(54|74)7970653a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"

"1:0,1024:0a(4d|6d)6573736167652d(49|69)643a20{-1024}0a(43|63)6f6e74656e742d(54|74)7970653a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"

Mime version sig should probably change...
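
Editorial illustration (not part of the original comment): the leading-LF requirement can be reproduced outside ClamAV with a rough Python sketch. The helper `sig_body_to_regex` is hypothetical; it translates only the subset of signature syntax seen above (hex bytes, `(xx|yy)` alternations, `{-n}` gaps of at most n bytes) into a regular expression, and it ignores the `0,1024` offset field. It is not how libclamav matches file types internally.

```python
import re

# Hex body of the third file typing signature quoted above
# ("\nMIME-Version: " followed within 2048 bytes by "\nContent-Type: ").
MIME_SIG_BODY = (
    "0a(4d|6d)(49|69)(4d|6d)(45|65)2d(56|76)657273696f6e3a20"
    "{-2048}0a(43|63)6f6e74656e742d(54|74)7970653a20"
)

def sig_body_to_regex(body):
    """Translate a small subset of the signature syntax to a bytes regex.

    Hypothetical helper for illustration only.
    """
    token = re.compile(
        r"\{-(\d+)\}"                            # {-n}: gap of n bytes or fewer
        r"|\(([0-9a-f]{2}(?:\|[0-9a-f]{2})*)\)"  # (xx|yy): byte alternatives
        r"|([0-9a-f]{2})"                        # xx: literal byte
    )
    parts = []
    pos = 0
    while pos < len(body):
        m = token.match(body, pos)
        if m is None:
            raise ValueError("unsupported syntax at offset %d" % pos)
        gap, alt, byte = m.groups()
        if gap is not None:
            parts.append(b".{0,%d}" % int(gap))
        elif alt is not None:
            options = [re.escape(bytes.fromhex(h)) for h in alt.split("|")]
            parts.append(b"(?:" + b"|".join(options) + b")")
        else:
            parts.append(re.escape(bytes.fromhex(byte)))
        pos = m.end()
    # DOTALL so that a gap may span line breaks.
    return re.compile(b"".join(parts), re.DOTALL)

pattern = sig_body_to_regex(MIME_SIG_BODY)

# Header block whose very first line is MIME-Version, as in comment 5.
headers = (
    b"MIME-Version: 1.0\n"
    b"Date: Thu, 24 Mar 2016 06:41:42 -0400\n"
    b"Content-Type: text/plain\n"
)

no_leading_lf = pattern.search(headers) is not None
with_leading_lf = pattern.search(b"\n" + headers) is not None
```

Under these assumptions, this particular signature cannot match when `MIME-Version:` is the first line of the buffer, because the pattern starts with the byte `0a`. The other signatures listed above key on different header pairs, so this sketch says nothing about whether they match the sample message.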
Comment 8 Steven Morgan 2016-05-27 12:02:04 EDT
To be done in conjunction with other email/MIME/magics issues for 0.99.3.
Comment 9 Dave McMurtrie 2016-07-20 11:55:52 EDT
This is not a bug in ClamAV.  The problem lies within mimedefang.  mimedefang connects to the clamd socket after the message has already been broken apart into pieces and it sends a SCAN command to clamd telling it to scan the mimedefang "./Work" directory, which contains all of the unpacked MIME parts as files.  clamd never sees the raw mail message or the headers that would identify it as a mail message.

The single linefeed prior to sending data was a red herring.  That just caused mimedefang to fail to unpack the message, leaving it as a raw message inside the Work directory which would allow clamd to properly detect that it was a mail message.

For clamd to ever properly classify something as a mail message, mimedefang would have to have it scan the ./INPUTMSG file that it creates, which is the raw email message.  Unfortunately, looking at the mimedefang source, it is hard-coded to always scan the ./Work directory.

When I modify mimedefang to look at both the Work directory and the INPUTMSG file, clamd is working exactly as expected.
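
Editorial illustration (not part of the original comment): the change described above amounts to asking clamd to scan two paths instead of one. The sketch below is a minimal Python rendering of that idea, not the actual mimedefang patch (mimedefang is written in Perl). It assumes clamd's newline-delimited `nSCAN` command over a local UNIX socket; the socket path and spool paths in the usage comment are placeholders, and the function names are hypothetical.

```python
import socket

def clamd_scan_command(path):
    """Build a newline-delimited clamd SCAN command for an absolute path."""
    return b"nSCAN " + path.encode() + b"\n"

def parse_clamd_reply(reply):
    """Split a reply such as '/path: Name FOUND' into (path, signature).

    The signature is None when the reply does not end in ' FOUND'
    (for example 'OK'; error replies are not distinguished in this sketch).
    """
    path, _, status = reply.strip().rpartition(": ")
    if status.endswith(" FOUND"):
        return path, status[: -len(" FOUND")]
    return path, None

def scan_paths(socket_path, paths):
    """Ask clamd to scan each path; return {path: signature or None}."""
    results = {}
    for path in paths:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
            sock.connect(socket_path)
            sock.sendall(clamd_scan_command(path))
            reply = sock.makefile("r").readline()
        results[path] = parse_clamd_reply(reply)[1]
    return results

# Hypothetical usage; substitute the real socket and spool directory:
# scan_paths("/path/to/clamd.sock",
#            ["/path/to/spool/Work", "/path/to/spool/INPUTMSG"])
```

Scanning INPUTMSG in addition to the Work directory gives clamd the raw message with its headers, which is what the file typing signatures need in order to classify the data as mail.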
Comment 10 Dave McMurtrie 2016-07-20 11:57:43 EDT
(In reply to Steven Morgan from comment #7)
> (In reply to Dave McMurtrie from comment #5)
> > Example of pdb signature not working.  Note there is no blank line as the
> > first part of the DATA phase (before the headers are sent):
> > 
> > [root@andrew-mx-t01 ~]# telnet localhost 25
> > Trying 127.0.0.1...
> > Connected to localhost.
> > Escape character is '^]'.
> > 220-andrew-mx-t01.andrew.cmu.edu ESMTP Sendmail 8.15.2/8.15.1
> > 220-Mis-identifing the sender of mail is an abuse of computing facilities.
> > 220 ESMTP spoken here
> > helo cmu.edu
> > 250 andrew-mx-t01.andrew.cmu.edu Hello localhost [127.0.0.1], pleased to
> > meet you
> > mail from:dave64@andrew.cmu.edu
> > 250 2.1.0 dave64@andrew.cmu.edu... Sender ok
> > rcpt to:dave64@andrew.cmu.edu
> > 250 2.1.5 dave64@andrew.cmu.edu... Recipient ok
> > data
> > 354 Enter mail, end with "." on a line by itself
> > MIME-Version: 1.0
> > Date: Thu, 24 Mar 2016 06:41:42 -0400
> 
> These file typing signatures require the LF, including mime-version:
> 
> "1:0,1024:0a(46|66)726f6d3a20{-1024}0a(4d|6d)(49|69)(4d|6d)(45|65)2d(56|76)657273696f6e3a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"
> 
> "1:0,1024:0a(46|66)726f6d3a20{-2048}0a(43|63)6f6e74656e742d(54|74)7970653a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"
> 
> "1:0,1024:0a(4d|6d)(49|69)(4d|6d)(45|65)2d(56|76)657273696f6e3a20{-2048}0a(43|63)6f6e74656e742d(54|74)7970653a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"
> 
> "1:0,1024:0a(4d|6d)6573736167652d(49|69)643a20{-1024}0a(43|63)6f6e74656e742d(54|74)7970653a20:Mail file:CL_TYPE_ANY:CL_TYPE_MAIL"
> 
> Mime version sig should probably change...

I think these are all fine, since every line in an SMTP protocol transaction ends with an LF.
Comment 11 Steven Morgan 2017-12-15 11:31:11 EST
Reassigning.
Comment 12 Steven Morgan 2018-01-03 15:13:36 EST
Move to 0.99.4 for batch mail fixes.
Comment 13 Micah Snyder 2019-04-19 09:25:45 EDT
Dave,

I just stumbled across this.  I'm having trouble understanding why this ticket is still open.  It looks like you identified that mimedefang was incorrectly sending partial emails to clamd to be scanned rather than whole/raw eml files.  May I close this ticket?

Micah
Comment 14 Dave McMurtrie 2020-09-29 09:05:27 EDT
Hi Micah,

Yes, I'm sorry this is still on your plate.  As I explained, this is *not* a bug in ClamAV and should be closed.

Thanks!

Dave
Comment 15 Micah Snyder 2020-09-30 13:54:01 EDT
(In reply to Dave McMurtrie from comment #14)
> Hi Micah,
> 
> Yes, I'm sorry this is still on your plate.  As I explained, this is *not* a
> bug in ClamAV and should be closed.
> 
> Thanks!
> 
> Dave

Hi Dave,

Thanks!  Will close it. :D

-Micah