decodemail (GNU Mailutils Manual)

3.9 `decodemail` – Decode multipart messages

The decodemail utility is a filter program that reads messages from the input mailbox, decodes “textual” parts of each multipart message from a base64- or quoted-printable encoding to an 8-bit or 7-bit transfer encoding, and stores the processed messages in the output mailbox. All messages from the input mailbox are stored in the output, regardless of whether a change was made.
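To make the transformation concrete, here is a minimal Python sketch of the decoding step, using the standard library's email package. It is only an illustration, not how decodemail is implemented, and it does not write an output mailbox. The sample message, the boundary string ‘b1’ and the helper name decoded_text_parts are made up for the example, and the sketch simply treats every part whose main type is ‘text’ as textual.

```python
import base64
from email import message_from_bytes

# Hypothetical sample: a multipart message whose only part is base64-encoded text.
encoded_body = base64.b64encode(b"Concert listings for July 2020\n")
raw = (
    b"MIME-Version: 1.0\n"
    b'Content-Type: multipart/mixed; boundary="b1"\n'
    b"\n"
    b"--b1\n"
    b"Content-Type: text/plain; charset=us-ascii\n"
    b"Content-Transfer-Encoding: base64\n"
    b"\n"
    + encoded_body + b"\n"
    b"--b1--\n"
)


def decoded_text_parts(message):
    """Return the decoded text of every text/* part of a parsed message."""
    texts = []
    for part in message.walk():
        if part.get_content_maintype() != "text":
            continue  # skip the multipart container and non-text parts
        payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        charset = part.get_content_charset() or "us-ascii"
        texts.append(payload.decode(charset, errors="replace"))
    return texts


message = message_from_bytes(raw)
texts = decoded_text_parts(message)
print(texts)
```

Before decoding, a search for the word ‘Concert’ in the raw message finds nothing, because the body is stored as base64 text; after decoding, the word is present in plain form.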

The message parts deemed to be textual are those whose ‘Content-Type’ header matches a predefined, or user-defined, mime type pattern. In addition, encoded pieces of the ‘From:’, ‘To:’, ‘Subject:’, etc., headers are decoded.

For example, decodemail makes this transformation:

Subject: =?utf-8?Q?The=20Baroque=20Enquirer=20|=20July=202020?=
⇒ Subject: The Baroque Enquirer | July 2020
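The same header transformation can be reproduced with Python's standard email.header module, which understands this encoded-word syntax. This is a sketch for illustration only; it says nothing about how decodemail performs the decoding internally.

```python
from email.header import decode_header, make_header

raw_subject = "=?utf-8?Q?The=20Baroque=20Enquirer=20|=20July=202020?="

# decode_header() splits the value into (bytes, charset) chunks;
# make_header() reassembles them into a single Unicode string.
subject = str(make_header(decode_header(raw_subject)))
print(subject)  # prints: The Baroque Enquirer | July 2020
```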

The built-in list of textual content type patterns is:

text/*
application/*shell
application/shellscript
*/x-csrc
*/x-csource
*/x-diff
*/x-patch
*/x-perl
*/x-php
*/x-python
*/x-sh

These strings are matched as shell globbing patterns (see the glob(7) manual page).

More patterns can be added to this list using the mime.text-type configuration statement. See The mime Statement, for a detailed discussion, and the configuration section below for a simple example.

When processing old messages you may encounter ‘Content-Type’ headers whose value contains only a type, with no subtype. To match such headers, use a pattern without the ‘/subtype’ part. For example, ‘text/*’ matches ‘text/plain’ and ‘text/html’, but does not match ‘text’. On the other hand, ‘t*xt’ does not match ‘text/plain’, but does match ‘text’.
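The matching rules above can be tried out with a short Python sketch. It uses fnmatchcase from the standard library as a stand-in for shell globbing; the function name is_textual is hypothetical, and matching case-sensitively is an assumption of the sketch, not a statement about decodemail.

```python
from fnmatch import fnmatchcase

# The built-in pattern list, copied from the manual text above.
BUILTIN_PATTERNS = [
    "text/*",
    "application/*shell",
    "application/shellscript",
    "*/x-csrc",
    "*/x-csource",
    "*/x-diff",
    "*/x-patch",
    "*/x-perl",
    "*/x-php",
    "*/x-python",
    "*/x-sh",
]


def is_textual(content_type, patterns=BUILTIN_PATTERNS):
    """Return True if content_type matches at least one glob pattern."""
    return any(fnmatchcase(content_type, pattern) for pattern in patterns)


print(is_textual("text/plain"))            # True: matches text/*
print(is_textual("application/x-shell"))   # True: matches application/*shell
print(is_textual("image/png"))             # False: no pattern matches
print(is_textual("text"))                  # False: text/* needs a subtype
print(is_textual("text", ["t*xt"]))        # True: pattern without a subtype
print(is_textual("text/plain", ["t*xt"]))  # False: value does not end in "xt"
```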

Optionally, the decoded parts can be converted to another character set. By default, the character set is not changed.
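Conceptually, converting a decoded part to another character set means interpreting its bytes in the original character set and re-encoding the resulting text in the target one. The Python sketch below shows that idea only; the function name recode and the sample word are made up, and nothing here describes decodemail's actual conversion code.

```python
def recode(payload: bytes, source_charset: str, target_charset: str) -> bytes:
    """Re-encode payload from source_charset to target_charset."""
    return payload.decode(source_charset).encode(target_charset)


latin1_bytes = "Händel".encode("iso-8859-1")  # one byte for the a-umlaut
utf8_bytes = recode(latin1_bytes, "iso-8859-1", "utf-8")  # two bytes for it
print(latin1_bytes, utf8_bytes)
```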

3.9.1 Invocation of `decodemail`.

Usually, the utility is invoked as:

decodemail inbox outbox

where inbox and outbox are file names or URLs of the input and output mailboxes, respectively. The input mailbox is opened read-only and will not be modified in any way. In particular, the status of the processed messages will not change. If the output mailbox does not exist, it will be created. If it exists, the messages will be appended to it, preserving any original messages that are already in it. This behavior can be changed using the -t (--truncate) option, described below.

The two mailboxes can be of different types. For example, you can read input from an IMAP server and store it in a local ‘maildir’ mailbox using the following command:

decodemail imap://user@example.com maildir:///var/mail/user

Both arguments can be omitted. If outbox is not supplied, the resulting mailbox will be printed on the standard output in Unix ‘mbox’ format. If inbox is not supplied, the utility will open the system inbox for the current user and use it for input.

A consequence of these rules is that there is no simple way to read the input mailbox from standard input (the input must be seekable). If you need to do this, the normal procedure would be to save what would be standard input in a temporary file and then give that file as decodemail’s input.
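The temporary-file procedure can be sketched in Python as follows. The helper names spool_to_tempfile and decode_stream are hypothetical, and the sketch assumes that decodemail is on the PATH and that the incoming data is in a mailbox format the utility recognizes. It relies only on the behavior described above: with a single argument, the result is printed on the standard output.

```python
import os
import subprocess
import tempfile


def spool_to_tempfile(stream, chunk_size=65536):
    """Copy a non-seekable binary stream to a temporary file; return its path."""
    with tempfile.NamedTemporaryFile(
        prefix="decodemail-", suffix=".mbox", delete=False
    ) as tmp:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            tmp.write(chunk)
        return tmp.name


def decode_stream(stream):
    """Spool stream to disk, run decodemail on it, return the decoded mbox bytes."""
    path = spool_to_tempfile(stream)
    try:
        # No outbox argument: decodemail prints the result on standard output.
        result = subprocess.run(
            ["decodemail", path], stdout=subprocess.PIPE, check=True
        )
        return result.stdout
    finally:
        os.unlink(path)  # always remove the temporary copy
```

A caller would typically pass the binary standard input, for example decode_stream(sys.stdin.buffer).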

The following command line options modify the decodemail behavior:

-c, --charset=charset: Convert all textual parts from their original character set to the specified charset.
-R, --recode: Convert all textual parts from their original character set to the current character set, as specified by the LC_ALL or LANG environment variable.
--no-recode: Do not convert character sets. This is the default.
-t, --truncate: If the output mailbox exists, truncate it before appending new messages.
--no-truncate: Keep the existing messages in the output mailbox intact. This is the default.

Additionally, the common options are also understood (see Options That are Common for All Utilities).
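When decodemail is driven from a script, the options above translate into an argument list in a straightforward way. The following Python sketch builds such a list; the function name decodemail_command and its parameters are hypothetical, only the long option names listed above are used, and whether --charset and --recode may be combined is not stated here, so the sketch does not combine them in its example.

```python
import shlex


def decodemail_command(inbox=None, outbox=None, charset=None,
                       recode=False, truncate=False):
    """Build an argument list for decodemail from the documented options."""
    argv = ["decodemail"]
    if charset is not None:
        argv.append(f"--charset={charset}")
    if recode:
        argv.append("--recode")
    if truncate:
        argv.append("--truncate")
    if inbox is None and outbox is not None:
        # The arguments are positional, so outbox cannot be given alone.
        raise ValueError("outbox requires inbox")
    if inbox is not None:
        argv.append(inbox)
    if outbox is not None:
        argv.append(outbox)
    return argv


command = decodemail_command("inbox", "outbox", charset="utf-8", truncate=True)
print(shlex.join(command))
```

The resulting list can be passed to subprocess.run() as is, without going through a shell.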

3.9.2 Configuration of `decodemail`.

The following common configuration statements affect the behavior of decodemail:

Statement	Reference
mime	See The `mime` Statement.
debug	See Debug Statement.
mailbox	See Mailbox Statement.
locking	See Locking Statement.

Notably, the mime statement can be used to extend the list of types which are decoded. For example, in the file ~/.decodemail (other locations are possible, see Mailutils Configuration File), you could have:

# base64/qp decode these mime types also:
mime {
  text-type "application/x-bibtex";
  text-type "application/x-tex";
}

Since the list of textual mime types is open-ended, with new types being used at any time, we do not attempt to make the built-in list comprehensive.

3.9.3 Purpose and caveats of `decodemail`.

The principal use envisioned for this program is to decode messages in batch, after they are received.

Unfortunately, some mailers prefer to encode messages in their entirety in base64 (or quoted-printable), even when the content is entirely human-readable text. This makes straightforward use of grep or other standard commands impossible. The idea is for decodemail to rectify that, by making the message text readable again.

Besides personal mail, mailing list archives are another place where such decoding can be useful, as they are often searched with standard tools.

It is generally not recommended to run decodemail within a mail reader (which should be able to do the decoding itself), or directly in a terminal (since quite possibly there will be 8-bit output not in the current character set).

Although the output message from decodemail should be entirely equivalent to the input message, apart from the decoding, it is generally not identical. Because decodemail parses the input message and reconstructs it for output, there are usually small differences:

In the envelope ‘From ’ line, multiple spaces are collapsed to one.
A ‘Content-Transfer-Encoding:’ header may be added where not previously present, or its value changed from ‘8bit’ to ‘7bit’, or vice versa. This may happen both for the message as a whole, and for a given mime part. decodemail looks at the actual content of the text and outputs ‘Content-Transfer-Encoding:’ accordingly.
A trailing space is inserted when a long header line is broken to occupy several lines (header wrapping).
```
SomeHeader: 
  someextremelylongvaluethatcannotbebroken
```
The non-tracing headers may be reordered, notably those that are mime-related.
Any material before the first mime part of a mime multipart message is lost. By the standards, nothing should appear there. Typically if it does appear, it is a string such as ‘This is a multi-part message in MIME format.’.
In mime parts, the charset specifications may no longer be quoted (if quoting is not necessary). For example, ‘charset="utf-8"’ becomes ‘charset=utf-8’.
The mime boundary strings will be changed.
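The ‘Content-Transfer-Encoding:’ item in the list above can be pictured with a small Python sketch. It assumes that the choice between ‘7bit’ and ‘8bit’ depends only on whether the decoded text contains bytes above 127; the function name is hypothetical, and the real logic in decodemail may take more into account (the MIME standard also limits line length and forbids NUL bytes for both values).

```python
def transfer_encoding_label(body: bytes) -> str:
    """Return "8bit" if any byte has its high bit set, otherwise "7bit"."""
    return "8bit" if any(byte > 0x7F for byte in body) else "7bit"


print(transfer_encoding_label(b"plain ASCII text\n"))          # 7bit
print(transfer_encoding_label("Händel\n".encode("utf-8")))     # 8bit
```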

If a discrepancy is created that actually affects message parsing or reading, it is most likely a bug; please report it, and include the exact input message needed to reproduce the problem.
