Bug 12279 - RAR support status
Status: NEW
Product: ClamAV
Classification: ClamAV
Component: libclamav
Version: stable
Hardware: x86_64 GNU/Linux
Importance: P3 normal
Target Milestone: 0.101.0
Assigned To: ClamAV team
Depends on:
Blocks:
Reported: 2019-02-21 18:13 EST by Sebastian A. Siewior
Modified: 2021-08-29 11:33 EDT

See Also:
QA Contact:


Description Sebastian A. Siewior 2019-02-21 18:13:52 EST
In the beginning, the unrar bits were inside libclamav. Then the Debian maintainer suggested splitting those bits out into their own folder, building them as a library, and adding an interface known as libclamunrar_iface so that the code could be loaded at runtime rather than linked directly into libclamav.
The reason for that is the license of the unrar code, which forbids us (Debian) from putting the software into the main section. With that compromise we are allowed to put clamav into the main section (we repack the source archive and remove the unrar bits from it) and add just the unrar library to the non-free section of the archive.

That was a history lesson :)

I'm bringing this up because almost every recent release has broken that split one way or another. In 0.101.x the Makefile disappeared because it was integrated into libclamav's.

The unrar code has been updated so it looks like you want/need "latest" support for it. Would it work for you to use libarchive
  https://www.libarchive.org/
  https://github.com/libarchive/libarchive
  https://github.com/libarchive/libarchive/wiki/LibarchiveFormats

instead?
I'm asking because this would help us (Debian): we would no longer need to split the unrar pieces out of the archive and maintain the additional unrar package. My understanding is that libarchive supports the rar format, including the v3 format, so there should be no loss in functionality.
Also, it supports other formats besides rar, so you wouldn't need to talk to libbz2, for instance, but could let libarchive handle that.
What do you think?

Sebastian
Comment 1 Micah Snyder 2019-02-22 08:30:53 EST
Hi Sebastian, Scott,

My apologies regarding the Makefile. I wasn't aware that a separate Makefile for libclamunrar_iface was a part of the process for you.  I assumed that you would simply drop the libclamunrar_iface.so from your distribution.  If needed, we can separate it again.

You aren't the first ones to have asked us about switching to libarchive in the past few months.

It appears that libarchive has an API to open an archive using a file descriptor. That alone might make it worth switching to, as the current UnRAR library requires a filepath and thus we either have to have read access to the file (not true in some sandboxed conditions) or we have to dump the fd to a new tempfile and open that.

    int
    archive_read_open_fd(struct archive *a, int fd, size_t block_size)
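
For comparison, the tempfile workaround mentioned above (needed when a library only accepts a file path) might look roughly like the sketch below. It is only an illustration using POSIX calls: `dump_fd_to_tempfile` and `write_all` are hypothetical names, not ClamAV functions, and real code would also have to choose a safe temp directory and clean up after scanning.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t w = write(fd, buf, len);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += w;
        len -= (size_t)w;
    }
    return 0;
}

/*
 * Hypothetical helper (not part of ClamAV): copy everything readable from
 * `fd` into a new temporary file so that a path-only API can open it.
 * `path_template` must be writable and end in "XXXXXX", as mkstemp()
 * requires; on success it holds the name of the new file.
 * Returns the number of bytes copied, or -1 on error.
 */
long long dump_fd_to_tempfile(int fd, char *path_template)
{
    char buf[8192];
    long long total = 0;
    int failed = 0;
    int out;

    assert(path_template != NULL);

    out = mkstemp(path_template);
    if (out < 0)
        return -1;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n == 0)
            break; /* end of input */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            failed = 1;
            break;
        }
        if (write_all(out, buf, (size_t)n) != 0) {
            failed = 1;
            break;
        }
        total += n;
    }

    if (close(out) != 0)
        failed = 1;
    if (failed) {
        unlink(path_template); /* do not leave a partial copy behind */
        return -1;
    }
    return total;
}
```

An fd-based API such as archive_read_open_fd() would make this extra copy, and the disk I/O that comes with it, unnecessary.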

That said, I haven't tested libarchive, so I can't tell you how stable their RAR support is. I do know that RAR v5 support isn't in their latest release. RAR v5 support was only merged last October (https://github.com/libarchive/libarchive/pull/1061), and their latest release was v3.3.3 on Sept 3rd.

We had a stream of requests throughout the last year asking for RAR v5 support, from both the community and internal Cisco users.

Hopefully libarchive gets a release soon with the v5 support, and I'll cross my fingers real hard that v6 doesn't come out any time soon.

It appears that the v5 author, Grzegorz Antoniak (antekone), has been putting a lot of support effort into the v5 code since its integration. That's a really good sign:
https://github.com/libarchive/libarchive/pull/1079
https://github.com/libarchive/libarchive/pull/1084
https://github.com/libarchive/libarchive/pull/1102
https://github.com/libarchive/libarchive/pull/1107
https://github.com/libarchive/libarchive/pull/1125

Anyhow, I'm not opposed to the idea of switching, exactly, but we could not do it until a release that includes v5 support is published. It is also quite time consuming to swap libraries, so it's very difficult for me to prioritize it in order to accommodate Debian's definition of "free".
Comment 2 sergey 2019-10-07 10:33:32 EDT
I see some problems with libclamunrar when dealing with numbered archives: it doesn't detect a virus in a filename.001 RAR archive.
I guess it would be solved by regular updates of libarchive.
Therefore I "vote" for this issue.
Comment 3 Sebastian A. Siewior 2019-10-07 14:12:01 EDT
Now that this popped up in my inbox, I see that libarchive v3.4.0 was released and one of the features is "Read support for RAR 5.0 archives". From reading the initial reply here, the only blocker is resolved, right?
Comment 4 Micah Snyder 2019-10-09 22:44:33 EDT
(In reply to Sebastian A. Siewior from comment #3)
> Now that this popped up in my inbox, I see that libarchive v3.4.0 was
> released and one of the features is "Read support for RAR 5.0 archives".
> From reading the initial reply here, the only blocker here is resolved,
> right?

Ah that's good to see.  Probably?  We probably couldn't replace it wholesale unless we included libarchive source with ClamAV, or else people might lose RAR 5 support if libarchive v3.4 is not available on their system.  Maybe it could be an optional replacement at first?  Something to think about.
Comment 5 Sebastian A. Siewior 2019-10-10 17:11:25 EDT
(In reply to Micah Snyder from comment #4)
> Ah that's good to see.  Probably?  We probably couldn't replace it wholesale
> unless we included libarchive source with ClamAV, or else people might lose
> RAR 5 support if libarchive v3.4 is not available on their system.  Maybe it
> could be an optional replacement at first?  Something to think about.

Would it work to have libarchive support in place and fall back to the old unrar code if it fails? We could fade it out a few years later. I don't think the security team will be very pleased to see the library embedded.
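
A minimal sketch of that "try libarchive first, fall back to unrar" idea, under stated assumptions: the status codes, `struct rar_backend`, and both stub backends are hypothetical stand-ins for illustration and do not correspond to real ClamAV or libarchive symbols. The libarchive stub simply pretends that it cannot handle files with "v2" in their name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; not ClamAV's real return values. */
enum unpack_status { UNPACK_OK = 0, UNPACK_UNSUPPORTED, UNPACK_ERROR };

/* An extraction backend: a name plus a function that tries one archive. */
struct rar_backend {
    const char *name;
    enum unpack_status (*unpack)(const char *path);
};

/*
 * Try each backend in order and stop at the first one that succeeds.
 * On success, *used (if non-NULL) is set to the name of the backend that
 * handled the file. On failure, the status of the last attempt is returned.
 */
enum unpack_status unpack_with_fallback(const struct rar_backend *backends,
                                        size_t count, const char *path,
                                        const char **used)
{
    enum unpack_status last = UNPACK_UNSUPPORTED;

    assert(backends != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        last = backends[i].unpack(path);
        if (last == UNPACK_OK) {
            if (used != NULL)
                *used = backends[i].name;
            return UNPACK_OK;
        }
    }
    return last;
}

/* Stub: stands in for a libarchive-based backend, rejects "v2" files. */
enum unpack_status stub_libarchive_unpack(const char *path)
{
    return strstr(path, "v2") != NULL ? UNPACK_UNSUPPORTED : UNPACK_OK;
}

/* Stub: stands in for the existing UnRAR backend, accepts everything. */
enum unpack_status stub_unrar_unpack(const char *path)
{
    (void)path;
    return UNPACK_OK;
}
```

Listing libarchive first and unrar second in the backend array gives the behaviour described above; fading unrar out later would only mean dropping the second array entry.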
Comment 6 Scott Kitterman 2019-10-10 19:17:34 EDT
libarchive v3.4 is new enough that supporting it will definitely be an issue for more than just Debian.  Archive processors are security sensitive, so I don't think embedded copies are a great idea either.

I like the idea of leaving the current libclamunrar as it is to support systems without libarchive v3.4 and then using libarchive v3.4 when it is available.  That would also give distros that are less averse to code copies than Debian the option to use a statically linked copy as part of their packaging (as recently discussed for libcurl for 0.102).

It could be structured so that the Makefile would try to find libarchive in a sufficient version and, if found, use the new way and, if not, use the existing code.

Scott K
Comment 7 sergey 2019-10-10 19:47:57 EDT
> It could be structured so that the Makefile would try to find libarchive in a sufficient version and, if found, use the new way and, if not, use the existing code.

that would be awesome!

also that would probably solve a bug in libclamunrar, which sometimes misses a signature on a "filename.001" attachment in emails
Comment 8 Orion Poplawski 2019-10-24 21:54:24 EDT
FWIW - Fedora currently strips the RAR support out of clamav as well due to legal issues, so supporting RAR via libarchive would be a big help there as well.
Comment 9 Sebastian A. Siewior 2020-05-24 11:51:54 EDT
Just noticed that the internal unrar code has been updated to the latest version.
Is there any progress on the libarchive side? If not, should I work on it?

Sebastian
Comment 10 Micah Snyder 2020-05-24 20:15:30 EDT
(In reply to Sebastian A. Siewior from comment #9)
> Just noticed that the internal unrar code has been updated to the latest
> version.
> Is there any progress on the libarchive side? If not, should I work on it?
> 
> Sebastian

Hi Sebastian,

No progress on libarchive support. It's not on our task list at present. If you have the free time to work on it, a PR on github would be best. I really hate optional dependencies, and options for competing dependencies. I would prefer to keep things simple. However, I have no clue how well the libarchive RAR support matches UnRAR. If you get something working, it would be very interesting to test. It won't make much sense to drop UnRAR if it reduces efficacy. Being able to use either in that case would be better, as much as I hate the added complexity.

I have build systems on my mind -- so I looked around a bit just now to see if there's an m4 file we could lift for libarchive detection in autotools but came up empty-handed. find_package support is provided for CMake though.

Side note, I've been focusing on my CMake support branch this weekend. I've gone on a bit of a tangent with this work doing autotools changes to match CMake (and to get rid of autoconf warnings). Specifically, I'm building the "shared" app code as a static library now rather than compiling it in separately to each app. Anyways, I digress. Revisiting my question above -- would it be helpful to Debian if I move the Makefile stuff for libclamunrar out of libclamav/Makefile.am into libclamunrar_iface/Makefile.am while I'm fussing around with autotools?

-Micah
Comment 11 Sebastian A. Siewior 2020-05-25 13:24:25 EDT
(In reply to Micah Snyder from comment #10)
> Hi Sebastian,

Hi Micah,

> No progress on libarchive support. It's not on our task list at present. If
> you have the free time to work on it, a PR on github would be best. I really
> hate optional dependencies, and options for competing dependencies. I would
> prefer to keep things simple. However, I have no clue how well the
> libarchive RAR support matches UnRAR.

Based on the history here v3.4+ should be enough.

> If you get something working, it would
> be very interesting to test. It won't make much sense to drop UnRAR if it
> reduces efficacy. Being able to use either in that case would be better, as
> much as I hate the added complexity.

We could keep UnRAR for the case where a good enough libarchive isn't around, for instance. My motivation is that I want to stop providing libclamunrar in Debian because it costs additional cycles each time.

> I have build systems on my mind -- so I looked around a bit just now to see
> if there's an m4 file we could lift for libarchive detection in autotools
> but came up empty-handed. find_package support is provided for CMake though.

The package provides libarchive.pc, so it should be something along the lines of:

+AC_ARG_WITH([libarchive], AS_HELP_STRING([--with-libarchive], [Build with libarchive.]))
+AS_IF([test "x$with_libarchive" = "xyes"],
+       [
+               PKG_CHECK_MODULES([LIBARCHIVE], [libarchive >= 3.4], [AC_DEFINE([HAVE_LIBARCHIVE], [1], [Use LIBARCHIVE])])
+       ])
+AC_SUBST([LIBARCHIVE_LIBS])

untested.
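
On the C side, the HAVE_LIBARCHIVE macro defined by the configure fragment above could select the backend at compile time. The following is only a sketch under assumptions: `rar_backend_name`, `libarchive_count_entries_fd`, and `BACKEND_NOT_BUILT` are hypothetical names, and the libarchive branch merely counts entries instead of scanning them. The libarchive calls themselves are the library's public read API; archive_read_support_format_rar5() needs libarchive 3.4.0 or later, which is what the `>= 3.4` check is for.

```c
#include <assert.h>

/* Hypothetical return code: the libarchive backend was not compiled in. */
#define BACKEND_NOT_BUILT (-2)

#ifdef HAVE_LIBARCHIVE
#include <archive.h>
#include <archive_entry.h>

#define RAR_BACKEND_NAME "libarchive"

/* Count the entries of a RAR archive read from an open file descriptor. */
int libarchive_count_entries_fd(int fd)
{
    struct archive *a = archive_read_new();
    struct archive_entry *entry;
    int count = 0;
    int r;

    if (a == NULL)
        return -1;

    archive_read_support_filter_all(a);
    archive_read_support_format_rar(a);
    archive_read_support_format_rar5(a); /* available since 3.4.0 */

    if (archive_read_open_fd(a, fd, 10240) != ARCHIVE_OK) {
        archive_read_free(a);
        return -1;
    }

    while ((r = archive_read_next_header(a, &entry)) == ARCHIVE_OK) {
        count++;
        archive_read_data_skip(a); /* a scanner would read the data here */
    }

    archive_read_free(a);
    /* Anything other than a clean end of archive is treated as failure. */
    return r == ARCHIVE_EOF ? count : -1;
}
#else
#define RAR_BACKEND_NAME "unrar"

/* Without libarchive, tell the caller to use the existing UnRAR path. */
int libarchive_count_entries_fd(int fd)
{
    (void)fd;
    return BACKEND_NOT_BUILT;
}
#endif

/* Name of the RAR backend that this build was configured with. */
const char *rar_backend_name(void)
{
    return RAR_BACKEND_NAME;
}
```

A build configured without --with-libarchive compiles only the fallback branch, so the existing libclamunrar path stays the default.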

> Side note, I've been focusing on my CMake support branch this weekend. I've
> gone on a bit of a tangent with this work doing autotools changes to
> match CMake (and to get rid of autoconf warnings). Specifically, I'm
> building the "shared" app code as a static library now rather than compiling
> it in separately to each app. Anyways, I digress. Revisiting my question
> above -- would it be helpful to Debian if I move the Makefile stuff for
> libclamunrar out of libclamav/Makefile.am into
> libclamunrar_iface/Makefile.am while I'm fussing around with autotools? 

Currently I copy the Makefile.am file from there to libclamunrar_iface to get it working. So if you move things I would have to update the split script :) But I think I can manage that, and it also makes me want to stare at the libarchive support a little more :) Now that I know that you would accept a PR…

> -Micah

Sebastian
Comment 12 Sebastian A. Siewior 2020-05-26 15:38:25 EDT
(In reply to Sebastian A. Siewior from comment #11)
> > No progress on libarchive support. It's not on our task list at present. If
> > you have the free time to work on it, a PR on github would be best. I really
> > hate optional dependencies, and options for competing dependencies. I would
> > prefer to keep things simple. However, I have no clue how well the
> > libarchive RAR support matches UnRAR.
> 
> Based on the history here v3.4+ should be enough.

I made this:
  https://github.com/Cisco-Talos/clamav-devel/pull/120

It was written very quickly, so I didn't properly integrate it into the build system, etc. I could do that…
I would be interested in testing and in finding out what is missing. As stated in the commit, v2 archives cannot be unpacked; v3 works. I'm going to open a bug against libarchive and hope that they will address it. It would be interesting to know whether you have a cabinet with more files which either pass or fail.
I didn't find the "compressed-size", the CRC, or the compression method of the file. I saw that some parts of it are fed into cli_matchmeta(). Not sure how important this information is.

Sebastian