protocols.imap_object¤

Attributes¤

protocols.imap_object.regex_starter module-attribute ¤

regex_starter = '(?<=^|\\s|\\[|\\(|\\{|\\<|\\\'|\\"|`|;|\\>)'

Start of line, or start of document, or start of markup

protocols.imap_object.regex_stopper module-attribute ¤

regex_stopper = '(?=$|\\s|\\]|\\)|\\}|\\>|\\\'|\\"|`|;|\\<)'

End of line, or end of document, or end of markup

protocols.imap_object.end_of_word module-attribute ¤

end_of_word = '(?=$|\\s|\\]|\\)|\\}|\\>|\\\'|\\"|`|;|:|,|\\?|\\!|\\.|\\<)'

End of word, or end of line, or end of document, or end of markup

protocols.imap_object.regex_algebra module-attribute ¤

regex_algebra = '[\\+\\-\\=\\\\±]'

Algebraic signs

protocols.imap_object.IP_PATTERN module-attribute ¤

IP_PATTERN = re.compile(
    "%s%s%s" % (regex_starter, regex_ip, regex_stopper), re.IGNORECASE
)

IPv4 and IPv6 patterns where the whole IP is captured in the first group.

protocols.imap_object.EMAIL_PATTERN module-attribute ¤

EMAIL_PATTERN = re.compile(
    "<?([0-9a-z\\-\\_\\+\\.]+?@[0-9a-z\\-\\_\\+]+(\\.[0-9a-z\\_\\-]{2,})+)>?",
    re.IGNORECASE,
)

Email patterns like <me@mail.com> or me@mail.com, where the whole address is captured in the first group.
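As a minimal sketch of how the first capture group can be used, the snippet below recompiles the same expression locally with the standard `re` module. The helper name `extract_addresses` and the sample text are illustrative assumptions, not part of the module.

```python
import re

# Same expression as EMAIL_PATTERN above, recompiled locally for illustration.
EMAIL_PATTERN = re.compile(
    r"<?([0-9a-z\-\_\+\.]+?@[0-9a-z\-\_\+]+(\.[0-9a-z\_\-]{2,})+)>?",
    re.IGNORECASE,
)


def extract_addresses(text: str) -> list[str]:
    """Hypothetical helper: return every address, without the optional <>."""
    return [match.group(1) for match in EMAIL_PATTERN.finditer(text)]


sample = "From: Jane Doe <jane.doe@mail.example.com>, cc bob@example.org"
addresses = extract_addresses(sample)
print(addresses)
```

Using `finditer` with `group(1)` avoids the tuple results that `findall` would return here, since the pattern contains a second (inner) group for the domain suffix.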

protocols.imap_object.URL_PATTERN module-attribute ¤

URL_PATTERN = re.compile(
    "%s%s%s" % (regex_starter, regex_url, end_of_word), re.IGNORECASE
)

URL patterns like http(s)://domain.ext/page/subpage?q=x&r=0:1#anchor or //domain.ext/page. URL must follow RFC3986, meaning query parameters should be before anchors, if any. Relying on this assumption allows a faster regex parsing.

  • the protocol (ftp, ftps, http, https) is captured as the first group,
  • domain.ext is captured as the second group,
  • /page/etc is the third group, including leading and trailing /,
  • page query parameters ?s=x&r=0, including ?, is the fourth group if the URL declares ...?params#anchor,
  • anchor #anchor is the fifth group, including #, if the URL declares ...?params#anchor.

URLs are captured if they are:

  • alone on their own line,
  • enclosed in {}, [], ()
  • enclosed in whitespaces.

Warning: URLs enclosed in (), [] and {} may retain the closing sign as part of the page name, since () and [] are valid in URL paths and parameters. This pattern will work on plain text only: Markdown, XML, HTML and JSON will need to be parsed ahead.

protocols.imap_object.MEMBERS_PATTERN module-attribute ¤

MEMBERS_PATTERN = re.compile('(?<=[a-z])(\\.)(?=[a-z])', re.IGNORECASE)

Domain patterns without leading protocol like cdn.company.com or class members in object-oriented programming languages like params.cookies.client.
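A short sketch of one possible use: replacing the captured dot with a space so that each member or domain label becomes a separate token. Whether the library substitutes a space or something else is an assumption here; `split_members` is a hypothetical helper.

```python
import re

# Same expression as MEMBERS_PATTERN above, recompiled locally for illustration.
MEMBERS_PATTERN = re.compile(r"(?<=[a-z])(\.)(?=[a-z])", re.IGNORECASE)


def split_members(text: str) -> str:
    """Hypothetical helper: turn letter.letter dots into spaces."""
    return MEMBERS_PATTERN.sub(" ", text)


members = split_members("params.cookies.client")
mixed = split_members("Version 1.2 of cdn.company.com.")
print(members)
print(mixed)
```

Dots between digits (`1.2`) and the sentence-final dot are left alone because both the lookbehind and the lookahead require a letter.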

protocols.imap_object.DATE_PATTERN module-attribute ¤

DATE_PATTERN = re.compile(date_regex, re.IGNORECASE)

Dates like 2022-12-01, 01-12-2022, 01-12-22, 01/12/2022, 01/12/22 where the whole date is captured in the first group, then each group of digits is captured in the order of appearance, in the next 3 groups

protocols.imap_object.TIME_PATTERN module-attribute ¤

TIME_PATTERN = re.compile(time_regex, re.IGNORECASE)

Identify more or less standard time patterns, like:

  • 12h15
  • 12:15
  • 12:15:00
  • 12am
  • 12 am
  • 12 h
  • 12:15:00Z
  • 12:15:00+01
  • 12:15:00 UTC+1
  • 11:27:45+0000
RETURNS DESCRIPTION
0

1- or 2-digits hour,

TYPE: str

1

hour/minutes separator or half-day marker among ["h", ":", "am", "pm"] (case-insensitive)

TYPE: str

2

2-digits minutes, if any, or None

TYPE: str

3

2-digits seconds, if any.

TYPE: str

4

hour marker (h or H), half-day marker (case-insensitive ["am", "pm"]), or time zone marker (case-sensitive ["Z", "UTC"])

TYPE: str

5

1- or 2-digit signed integer timezone shift (relative to UTC).

TYPE: str

Examples:

see https://regex101.com/r/QNtZAK/2

see src/tests/test-patterns.py

protocols.imap_object.DOMAIN_PATTERN module-attribute ¤

DOMAIN_PATTERN = re.compile(
    "from ((?:[a-z0-9\\-_]{0,61}\\.)+[a-z]{2,})", re.IGNORECASE
)

Matches patterns like from (domain.ext) from RFC-822 Received header in emails.
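The following sketch applies the same expression to two made-up Received header values. The header strings and the helper name `sending_domains` are assumptions for illustration only.

```python
import re

# Same expression as DOMAIN_PATTERN above, recompiled locally for illustration.
DOMAIN_PATTERN = re.compile(
    r"from ((?:[a-z0-9\-_]{0,61}\.)+[a-z]{2,})", re.IGNORECASE
)


def sending_domains(received_headers: list[str]) -> list[str]:
    """Hypothetical helper: collect the domain announced after 'from'."""
    domains = []
    for header in received_headers:
        match = DOMAIN_PATTERN.search(header)
        if match:
            domains.append(match.group(1))
    return domains


received = [
    "from mail.example.com (mail.example.com [192.0.2.10]) by mx.example.org with ESMTP",
    "from localhost (unknown [10.0.0.5]) by mail.example.com with SMTP",
]
domains = sending_domains(received)
print(domains)
```

The second header yields nothing because `localhost` has no dot-separated extension, which the pattern requires.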

protocols.imap_object.UID_PATTERN module-attribute ¤

UID_PATTERN = re.compile('UID ([0-9]+)')

Matches email integer UID from IMAP headers.

protocols.imap_object.FLAGS_PATTERN module-attribute ¤

FLAGS_PATTERN = re.compile('FLAGS \\((.*?)\\)')

Matches email flags from IMAP headers.
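A minimal sketch of how UID_PATTERN and FLAGS_PATTERN might be applied together to one line of an IMAP FETCH response. The response string is a made-up example, and `parse_fetch_line` is a hypothetical helper.

```python
import re

# Same expressions as UID_PATTERN and FLAGS_PATTERN above.
UID_PATTERN = re.compile(r"UID ([0-9]+)")
FLAGS_PATTERN = re.compile(r"FLAGS \((.*?)\)")


def parse_fetch_line(line: str) -> tuple[int, list[str]]:
    """Hypothetical helper: return (uid, flags) from a FETCH response line."""
    uid = int(UID_PATTERN.search(line).group(1))
    flags = FLAGS_PATTERN.search(line).group(1).split()
    return uid, flags


# Illustrative response line, not captured from a real server.
line = r"1 (UID 4827 FLAGS (\Seen \Answered $label1) RFC822.SIZE 2048)"
uid, flags = parse_fetch_line(line)
print(uid, flags)
```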

protocols.imap_object.PATH_PATTERN module-attribute ¤

PATH_PATTERN = re.compile('%s%s%s' % (regex_starter, path_regex, end_of_word))

File path pattern like ~/file, /home/file, ./file or C:\windows

protocols.imap_object.PARTIAL_PATH_REGEX module-attribute ¤

PARTIAL_PATH_REGEX = re.compile(
    "%s%s%s" % (regex_starter, partial_path_regex, end_of_word)
)

Partial, invalid path patterns missing the leading root, like home/user/stuff. We start capturing after at least two folder separators (slash or backslash).

Warning

this will collide with date detection, so run it after in the pipeline.

protocols.imap_object.RESOLUTION_PATTERN module-attribute ¤

RESOLUTION_PATTERN = re.compile('\\d+(?:×|x|X)\\d+')

Pixel resolution like 10x20 or 10×20. Units are discarded.

protocols.imap_object.NUMBER_PATTERN module-attribute ¤

NUMBER_PATTERN = re.compile(
    "%s%s%s" % (regex_starter, regex_number, regex_stopper)
)

Signed integers and decimals, fractions and numeric IDs with internal dashes and underscores. Numbers with starting or trailing units are not considered. Lazy decimals (.1 and 1.) are considered.

protocols.imap_object.HASH_PATTERN module-attribute ¤

HASH_PATTERN = re.compile(
    "%s%s%s" % (regex_starter, regex_hash, end_of_word), re.IGNORECASE
)

Cryptographic hexadecimal hashes and fingerprints, of a min length of 8 characters.

protocols.imap_object.MULTIPLE_LINES module-attribute ¤

MULTIPLE_LINES = re.compile('(?: ?[\\t\\r\\n]{2,} ?)+')

Detect runs of 2 or more consecutive newlines and tabs, possibly mixed with spaces.

protocols.imap_object.MULTIPLE_NEWLINES module-attribute ¤

MULTIPLE_NEWLINES = re.compile('(?: ?[\\t\\r\\n]+ ?){2,}')

Detect broken sequences of newlines and spaces.

protocols.imap_object.INTERNAL_NEWLINE module-attribute ¤

INTERNAL_NEWLINE = re.compile('(?<=\\w)[\\n\\t\\r]{1}(?=\\w)')

Detect single newline characters nested inside text. Mostly useful for parsed PDF where line wrapping is quite literal (a newline is used instead of a space).
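As a sketch of how these whitespace patterns could be chained, the snippet below first unwraps single line breaks inside text, then collapses longer runs into one paragraph break. The order of operations and the replacement strings are assumptions, and `unwrap` is a hypothetical helper.

```python
import re

# Same expressions as MULTIPLE_LINES and INTERNAL_NEWLINE above.
MULTIPLE_LINES = re.compile(r"(?: ?[\t\r\n]{2,} ?)+")
INTERNAL_NEWLINE = re.compile(r"(?<=\w)[\n\t\r]{1}(?=\w)")


def unwrap(text: str) -> str:
    """Hypothetical helper: undo literal PDF line wrapping."""
    # A single break between two word characters becomes a space.
    text = INTERNAL_NEWLINE.sub(" ", text)
    # Runs of 2 or more breaks collapse into one paragraph separator.
    return MULTIPLE_LINES.sub("\n\n", text)


raw = "A line wrapped\nin a PDF.\n\n\nNext paragraph."
cleaned = unwrap(raw)
print(repr(cleaned))
```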

protocols.imap_object.EXPOSURE module-attribute ¤

EXPOSURE = re.compile(
    "%s%s%s" % (regex_starter, exposure_regex, end_of_word), flags=re.IGNORECASE
)

Exposure values in EV or IL

protocols.imap_object.PHOTOSPEED module-attribute ¤

PHOTOSPEED = re.compile(
    "%s%s%s" % (regex_starter, photospeed_regex, end_of_word),
    flags=re.IGNORECASE,
)

Exposure values in EV or IL

protocols.imap_object.SENSIBILITY module-attribute ¤

SENSIBILITY = re.compile(
    "%s%s%s" % (regex_starter, sensibility_regex, end_of_word),
    flags=re.IGNORECASE,
)

Photographic sensibility in ISO or ASA

protocols.imap_object.LUMINANCE module-attribute ¤

LUMINANCE = re.compile(
    "%s%s%s" % (regex_starter, luminance_regex, end_of_word),
    flags=re.IGNORECASE,
)

Luminance/radiance in nits or Cd/m²

protocols.imap_object.DIAPHRAGM module-attribute ¤

DIAPHRAGM = re.compile(
    "%s%s" % (regex_starter, diaphragm_regex), flags=re.IGNORECASE
)

Photographic diaphragm aperture values like f/2.8 or f/11

protocols.imap_object.GAIN module-attribute ¤

GAIN = re.compile(
    "%s%s%s" % (regex_starter, gain_regex, end_of_word), flags=re.IGNORECASE
)

Gain, attenuation and PSNR in dB

protocols.imap_object.FILE_SIZE module-attribute ¤

FILE_SIZE = re.compile(
    "%s%s%s" % (regex_starter, filesize_regex, end_of_word), flags=re.IGNORECASE
)

File and memory size in bit, byte, or octet and their multiples

protocols.imap_object.DISTANCE module-attribute ¤

DISTANCE = re.compile(
    "%s%s%s" % (regex_starter, distance_regex, end_of_word), flags=re.IGNORECASE
)

Distance in meter, inch, foot and their multiples

protocols.imap_object.PERCENT module-attribute ¤

PERCENT = re.compile('%s%s%s' % (regex_starter, percent_regex, end_of_word))

Number followed by %

protocols.imap_object.WEIGHT module-attribute ¤

WEIGHT = re.compile(
    "%s%s%s" % (regex_starter, weight_regex, end_of_word), flags=re.IGNORECASE
)

Weight (mass) in British and SI units and their multiples

protocols.imap_object.ANGLE module-attribute ¤

ANGLE = re.compile(
    "%s%s%s" % (regex_starter, angle_regex, end_of_word), flags=re.IGNORECASE
)

Angles in radians, degrees and steradians

protocols.imap_object.TEMPERATURE module-attribute ¤

TEMPERATURE = re.compile(
    "%s%s%s" % (regex_starter, temperature_regex, end_of_word),
    flags=re.IGNORECASE,
)

Temperatures in °C, °F and K

protocols.imap_object.FREQUENCY module-attribute ¤

FREQUENCY = re.compile(
    "%s%s%s" % (regex_starter, frequency_regex, end_of_word),
    flags=re.IGNORECASE,
)

Frequencies in hertz and multiples

protocols.imap_object.TEXT_DATES module-attribute ¤

TEXT_DATES = re.compile(
    "([0-9]{1,2})? (jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec|jan|fév|mar|avr|mai|jui|jui|aou|sep|oct|nov|déc|janvier|février|mars|avril|mai|juin|juillet|août|septembre|octobre|novembre|décembre|january|february|march|april|may|june|july|august|september|october|november|december)\\.?( [0-9]{1,2})?( [0-9]{2,4})(?!\\:)",
    flags=re.IGNORECASE | re.MULTILINE,
)

Find textual date formats:

  • English dates like 01 Jan 20 or 01 Jan. 2020 but avoid capturing adjacent time like 12:08.
  • French dates like 01 Jan 20 or 01 Jan. 2020 but avoid capturing adjacent time like 12:08.
RETURNS DESCRIPTION
0

2 digits (day number or year number, depending on language)

TYPE: str

1

month (full-form or abbreviated)

TYPE: str

2

2 digits (day number or year number, depending on language)

TYPE: str

3

4 digits (full year)

TYPE: str
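The group layout above can be checked with a small sketch. Note that the two trailing groups capture their leading space, so the hypothetical helper `find_text_dates` strips it; the sample sentences are made up.

```python
import re

# Same expression as TEXT_DATES above, recompiled locally for illustration.
TEXT_DATES = re.compile(
    "([0-9]{1,2})? (jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec|jan|fév|mar|avr|mai|jui|jui|aou|sep|oct|nov|déc|janvier|février|mars|avril|mai|juin|juillet|août|septembre|octobre|novembre|décembre|january|february|march|april|may|june|july|august|september|october|november|december)\\.?( [0-9]{1,2})?( [0-9]{2,4})(?!\\:)",
    flags=re.IGNORECASE | re.MULTILINE,
)


def find_text_dates(text: str) -> list[tuple]:
    """Hypothetical helper: return the 4 groups of each match, stripped."""
    return [
        tuple(group.strip() if group else None for group in match.groups())
        for match in TEXT_DATES.finditer(text)
    ]


day_first = find_text_dates("Sent on 01 Jan. 2020 at 12:08")
month_first = find_text_dates("Posted on Jan 01 2020")
print(day_first)
print(month_first)
```

Groups that did not take part in the match come back as `None`, so callers need to handle both the day-first and the month-first layout.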

protocols.imap_object.BASE_64 module-attribute ¤

BASE_64 = re.compile(
    "((?:[A-Za-z0-9+\\/]{4}){64,}(?:[A-Za-z0-9+\\/]{2}==|[A-Za-z0-9+\\/]{3}=)?)"
)

Identifies base64 encoding
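The quantifier `{64,}` applies to 4-character groups, so only blobs of at least 256 base64 characters are matched. The sketch below illustrates that threshold with generated data; the surrounding text is a made-up example.

```python
import base64
import re

# Same expression as BASE_64 above, recompiled locally for illustration.
BASE_64 = re.compile(
    r"((?:[A-Za-z0-9+\/]{4}){64,}(?:[A-Za-z0-9+\/]{2}==|[A-Za-z0-9+\/]{3}=)?)"
)

# 256 bytes encode to 344 base64 characters, above the 256-character minimum.
blob = base64.b64encode(bytes(range(256))).decode("ascii")
long_match = BASE_64.search("attachment: " + blob + " end")

# A short encoded word stays below the minimum and is ignored.
short_match = BASE_64.search("token: aGVsbG8gd29ybGQ=")

print(len(blob), long_match is not None, short_match is not None)
```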

protocols.imap_object.BB_CODE module-attribute ¤

BB_CODE = re.compile('\\[(img|quote)[a-zA-Z0-9 =\\"]*?\\].*?\\[\\/\\1\\]')

Identifies left-over BB code markup [img] and [quote]

protocols.imap_object.MARKUP module-attribute ¤

MARKUP = re.compile('(?:\\[|\\{|\\<)([^\\n\\r]+?)(?:\\]|\\}|\\>)')

Identifies left-over HTML and Markdown markup, like <...>, {...}, [...]

protocols.imap_object.USER module-attribute ¤

USER = re.compile('([\\w\\-\\+\\.]+)?@([\\w\\-\\+\\.]+)|(user\\-?\\d+)')

Identifies user handles or emails

protocols.imap_object.REPEATED_CHARACTERS module-attribute ¤

REPEATED_CHARACTERS = re.compile('(.)\\1{9,}')

Identifies any character followed by 9 or more repetitions of itself (10 or more consecutive occurrences)
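A quick sketch of the threshold: the captured character plus at least 9 back-references means 10 consecutive occurrences are needed. Replacing the run with a single character is an assumption about usage, and `squeeze` is a hypothetical helper.

```python
import re

# Same expression as REPEATED_CHARACTERS above.
REPEATED_CHARACTERS = re.compile(r"(.)\1{9,}")


def squeeze(text: str) -> str:
    """Hypothetical helper: collapse a long run into one character."""
    return REPEATED_CHARACTERS.sub(r"\1", text)


nine = squeeze("-" * 9)  # too short, left untouched
ten = squeeze("Title\n" + "=" * 10 + "\ntext")
print(repr(nine), repr(ten))
```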

protocols.imap_object.UNFINISHED_SENTENCES module-attribute ¤

UNFINISHED_SENTENCES = re.compile('(?<![?!.;:])\\n\\n|\\r\\n')

Identifies sentences ending with 2 newline characters without closing punctuation

protocols.imap_object.MULTIPLE_DOTS module-attribute ¤

MULTIPLE_DOTS = re.compile('\\.{2,}')

Identifies 2 or more consecutive dots

protocols.imap_object.MULTIPLE_DASHES module-attribute ¤

MULTIPLE_DASHES = re.compile('[-~]{1,}')

Identifies runs of one or more dashes or tildes

protocols.imap_object.MULTIPLE_QUESTIONS module-attribute ¤

MULTIPLE_QUESTIONS = re.compile('\\?{1,}')

Identifies runs of one or more question marks

protocols.imap_object.ORDINAL_FR module-attribute ¤

ORDINAL_FR = re.compile('n° ?([0-9]+)')

French ordinal numbers (numéros n°)

protocols.imap_object.FRANCAIS module-attribute ¤

FRANCAIS = re.compile(
    "%s(j|t|s|d|qu|lorsqu|quelqu|jusqu|m|c|n)\\'(?=[aeiouyéèàêâîôûïüäëöh][\\w\\s])"
    % regex_starter,
    flags=re.IGNORECASE,
)

French contractions of pronouns and determiners

protocols.imap_object.DASHES module-attribute ¤

DASHES = re.compile('(?<=\\w)(-|_|=)+(?=\\w)', re.IGNORECASE)

Dashes in the middle of ASCII/Latin compounded words. Will not work if accented or Unicode characters are immediately surrounding the dash.

protocols.imap_object.ALTERNATIVES module-attribute ¤

ALTERNATIVES = re.compile('(?<=[a-z])(\\/)(?=[a-z])', re.IGNORECASE)

Slash-separated word alternatives like and/or mr/mrs

protocols.imap_object.PLURAL_S module-attribute ¤

PLURAL_S = re.compile('(?<=[a-zA-Z]{4,})s?e{0,2}s%s' % end_of_word)

Identify plural form of nouns (French and English), adjectives (French) and third-person present verbs (English) and second-person verbs (French) in -s.

protocols.imap_object.FEMININE_E module-attribute ¤

FEMININE_E = re.compile('(?<=\\w{4,})e{1,2}%s' % end_of_word)

Identify feminine form of adjectives (French) in -e.

protocols.imap_object.DOUBLE_CONSONANTS module-attribute ¤

DOUBLE_CONSONANTS = re.compile(
    "(?<=\\w{2,})([bcfghjklmnpqrstvwxz])\\1", re.IGNORECASE
)

Identify double consonants in the middle of words.

protocols.imap_object.FEMININE_TRICE module-attribute ¤

FEMININE_TRICE = re.compile('(?<=\\w{4,})t(rice|eur|or)%s' % end_of_word)

Identify French feminine nouns in -trice.

protocols.imap_object.ADVERB_MENT module-attribute ¤

ADVERB_MENT = re.compile('(?<=\\w{4,})e?ment%s' % end_of_word)

Identify French adverbs and English nouns ending in -ment

protocols.imap_object.SUBSTANTIVE_TION module-attribute ¤

SUBSTANTIVE_TION = re.compile('(?<=\\w{4,})(t|s)ion%s' % end_of_word)

Identify French and English substantives formed from verbs by adding -tion and -sion

protocols.imap_object.SUBSTANTIVE_AT module-attribute ¤

SUBSTANTIVE_AT = re.compile('(?<=\\w{4,})at%s' % end_of_word)

Identify French and English substantives formed from other nouns by adding -at

protocols.imap_object.PARTICIPLE_ING module-attribute ¤

PARTICIPLE_ING = re.compile('(?<=\\w{4,})ing%s' % end_of_word)

Identify English substantives and present participles formed from verbs by adding -ing

protocols.imap_object.ADJECTIVE_ED module-attribute ¤

ADJECTIVE_ED = re.compile('(?<=\\w{4,})ed%s' % end_of_word)

Identify English adjectives formed from verbs by adding -ed

protocols.imap_object.ADJECTIVE_TIF module-attribute ¤

ADJECTIVE_TIF = re.compile('(?<=\\w{2,})ti(f|v)%s' % end_of_word)

Identify English and French adjectives formed from verbs by adding -tif or -tive

protocols.imap_object.SUBSTANTIVE_Y module-attribute ¤

SUBSTANTIVE_Y = re.compile('(?<=\\w{3,})y%s' % end_of_word)

Identify English substantives ending in -y

protocols.imap_object.VERB_IZ module-attribute ¤

VERB_IZ = re.compile('(?<=\\w{4,})(i|y)z%s' % end_of_word)

Identify American verbs ending in -iz that French and Brits write in -is

protocols.imap_object.STUFF_ER module-attribute ¤

STUFF_ER = re.compile('(?<=\\w{5,})er%s' % end_of_word)

Identify French 1st group verb (infinitive) and English substantives ending in -er

protocols.imap_object.BRITISH_OUR module-attribute ¤

BRITISH_OUR = re.compile('(?<=\\w{3,})our%s' % end_of_word)

Identify British spelling ending in -our (colour, behaviour).

protocols.imap_object.SUBSTANTIVE_ITY module-attribute ¤

SUBSTANTIVE_ITY = re.compile('(?<=\\w{4,})it(y|e)%s' % end_of_word)

Identify substantives in -ity (English) and -ite (French).

protocols.imap_object.SUBSTANTIVE_IST module-attribute ¤

SUBSTANTIVE_IST = re.compile('(?<=\\w{3,})is(t|m)%s' % end_of_word)

Identify substantives in -ist and -ism.

protocols.imap_object.SUBSTANTIVE_IQU module-attribute ¤

SUBSTANTIVE_IQU = re.compile('(?<=\\w{3,})i(qu|c)%s' % end_of_word)

Identify French substantives in -iqu

protocols.imap_object.SUBSTANTIVE_EUR module-attribute ¤

SUBSTANTIVE_EUR = re.compile('(?<=\\w{3,})eur%s' % end_of_word)

Identify French substantives -eur

protocols.imap_object.HYPHENIZED module-attribute ¤

HYPHENIZED = re.compile('(?<=\\w{3,})[-–—]+ *[\\n\\r]{1,2}(?=\\w)')

Detect hyphenated words at the end of a PDF text line.

protocols.imap_object.WAYBACK_RE module-attribute ¤

WAYBACK_RE = re.compile('https?://web\\.archive\\.org/web/[^/]+/(https?://.+)')

Find the canonical URL from web.archive.org (Wayback Machine) URLs
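A minimal sketch of recovering the archived address from a Wayback Machine link. The helper name `canonical_url` and its fallback of returning the input unchanged are assumptions.

```python
import re

# Same expression as WAYBACK_RE above.
WAYBACK_RE = re.compile(r"https?://web\.archive\.org/web/[^/]+/(https?://.+)")


def canonical_url(url: str) -> str:
    """Hypothetical helper: strip the web.archive.org prefix if present."""
    match = WAYBACK_RE.match(url)
    return match.group(1) if match else url


archived = canonical_url(
    "https://web.archive.org/web/20200101000000/https://example.com/page"
)
direct = canonical_url("https://example.com/page")
print(archived, direct)
```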

Classes¤

protocols.imap_object.EMail ¤

EMail(raw_message: list, server)

Bases: connectors.Content

Attributes¤
protocols.imap_object.EMail.urls instance-attribute ¤
urls = []

list[tuple[str]]: List of URLs found in email body.

protocols.imap_object.EMail.ips instance-attribute ¤
ips = []

list[str]: List of IPs found in the server delivery route (in Received headers)

protocols.imap_object.EMail.domains instance-attribute ¤
domains = []

list[str]: List of domains found in the server delivery route (in Received headers)

protocols.imap_object.EMail.server instance-attribute ¤
server: 'Server'

(Server): back-reference to the Server instance from which the current email is extracted.

protocols.imap_object.EMail.msg instance-attribute ¤

Standard Python email object

Functions¤
protocols.imap_object.EMail.has_header ¤
has_header(header: str) -> bool

Check if the case-insensitive header exists in the email headers.

PARAMETER DESCRIPTION
header

the RFC 822 email header.

TYPE: str

RETURNS DESCRIPTION
bool

presence of the header

protocols.imap_object.EMail.get_sender ¤
get_sender() -> list[list, list]

Get the full list of senders of the email, using the From header, splitting their name (if any) apart from their address.

RETURNS DESCRIPTION
list[list, list]

list[0] contains the list of names, rarely used, list[1] is the list of email addresses.

protocols.imap_object.EMail.parse_urls ¤
parse_urls(input: str) -> list[tuple]

Update self.urls with a list of all URLs found in input, split as (domain, page) tuples.

Examples:

Each result in the list is a tuple (domain, page), for example:

  • google.com/index.php is broken into ('google.com', '/index.php')
  • google.com/ is broken into ('google.com', '/')
  • google.com/login.php?id=xxx is broken into ('google.com', '/login.php')
protocols.imap_object.EMail.get_body ¤
get_body(preferencelist=('related', 'html', 'plain')) -> str

Get the body of the email.

PARAMETER DESCRIPTION
preferencelist

sequence of candidate properties in which to pick the email body, by order of priority. If set to "plain", return either the plain-text variant of the email if any, or build one by removing (x)HTML markup from the HTML variant if no plain-text variant is available.

TYPE: tuple | str DEFAULT: ('related', 'html', 'plain')

Note

Emails using quoted-printable transfer encoding but not UTF-8 charset are not handled. This weird combination has been met only in spam messages written in Russian, so far, and should not affect legit emails.

protocols.imap_object.EMail.is_in ¤
is_in(
    query_list: list[str] | str,
    field: str,
    case_sensitive: bool = False,
    mode: str = "any",
) -> bool

Check if any or all of the elements in the query_list is in the email field.

PARAMETER DESCRIPTION
query_list

list of keywords or unique keyword to find in field.

TYPE: list[str] | str

field

any RFC 822 header or "body".

TYPE: str

case_sensitive

True if the search should be case-sensitive. This has no effect if field is a RFC 822 header, it only applies to the email body.

TYPE: bool DEFAULT: False

mode

"any" if any element in query_list should be found in field to return True. "all" if all elements in query_list should be found in field to return True.

TYPE: str DEFAULT: 'any'

RETURNS DESCRIPTION
bool

True if any or all elements (depending on mode) of query_list have been found in field.

protocols.imap_object.EMail.tag ¤
tag(keyword: str)

Add any arbitrary IMAP tag (aka label), standard or not, to the current email.

Warning

In Mozilla Thunderbird, labels/tags need to be configured first in the preferences (by mapping the label string to a color) to properly appear in the GUI. Otherwise, any undefined tag will be identified as “Important” (associated with red), no matter its actual string.

Horde, Roundcube and Nextcloud mail (based on Horde) treat those properly.

protocols.imap_object.EMail.untag ¤
untag(keyword: str)

Remove any arbitrary IMAP tag (aka label), standard or not, from the current email.

protocols.imap_object.EMail.delete ¤
delete()

Delete the current email directly without using the trash bin. It will not be recoverable.

Use EMail.move to move the email to the trash folder to get a last chance at reviewing what will be deleted.

Note

As per the IMAP standard, this only adds the \Deleted flag to the current email. Emails will actually be deleted when the expunge server command is launched, which is done automatically at the end of Server.run_filters.

protocols.imap_object.EMail.spam ¤
spam(spam_folder='INBOX.spam')

Mark the current email as spam, adding Mozilla Thunderbird Junk flag, and move it to the spam/junk folder.

protocols.imap_object.EMail.move ¤
move(folder: str)

Move the current email to the target folder, that will be created recursively if it does not exist. folder will be internally encoded to IMAP-custom UTF-7 with Server.encode_imap_folder.

protocols.imap_object.EMail.mark_as_important ¤
mark_as_important(mode: str)

Flag or unflag an email as important

PARAMETER DESCRIPTION
mode

add to add the \Flagged IMAP tag to the current email, remove to remove it.

TYPE: str

protocols.imap_object.EMail.mark_as_read ¤
mark_as_read(mode: str)

Flag or unflag an email as read (seen).

PARAMETER DESCRIPTION
mode

add to add the \Seen IMAP tag to the current email, remove to remove it.

TYPE: str

protocols.imap_object.EMail.mark_as_answered ¤
mark_as_answered(mode: str)

Flag or unflag an email as answered.

PARAMETER DESCRIPTION
mode

add to add the \Answered IMAP tag to the current email, remove to remove it.

TYPE: str

Note

If you answer programmatically, you need to manually pass the Message-ID of the original email to the In-Reply-To and References headers of the answer to get threaded messages. In-Reply-To gets only the immediate previous email, References gets the whole thread.
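A minimal sketch of the threading headers described in this note, using only the standard library `email` package. The helper `build_reply` and the sample message are assumptions; sending the reply and calling mark_as_answered are left out.

```python
import email
from email.message import EmailMessage, Message


def build_reply(original: Message, body: str) -> EmailMessage:
    """Hypothetical helper: create a reply that threads with the original."""
    reply = EmailMessage()
    reply["Subject"] = "Re: " + original.get("Subject", "")
    reply["To"] = original.get("From", "")
    message_id = original.get("Message-ID", "")
    # In-Reply-To holds only the immediate parent.
    reply["In-Reply-To"] = message_id
    # References holds the whole thread: previous references plus the parent.
    previous = original.get("References", "")
    reply["References"] = (previous + " " + message_id).strip()
    reply.set_content(body)
    return reply


# Illustrative message, not taken from a real mailbox.
raw = (
    "From: Alice <alice@example.com>\n"
    "Subject: Quarterly report\n"
    "Message-ID: <msg2@example.com>\n"
    "References: <msg1@example.com>\n"
    "\n"
    "Body\n"
)
original = email.message_from_string(raw)
reply = build_reply(original, "Thanks!")
print(reply["In-Reply-To"], "|", reply["References"])
```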

protocols.imap_object.EMail.is_read ¤
is_read() -> bool

Check if this email has been opened and read.

protocols.imap_object.EMail.is_unread ¤
is_unread() -> bool

Check if this email has not yet been opened and read.

protocols.imap_object.EMail.is_recent ¤
is_recent() -> bool

Check if this session is the first one to get this email. It doesn’t mean the user read it.

Note

this flag cannot be set by client, only by server. It’s read-only app-wise.

protocols.imap_object.EMail.is_draft ¤
is_draft() -> bool

Check if this email is marked as draft.

protocols.imap_object.EMail.is_answered ¤
is_answered() -> bool

Check if this email has been answered.

protocols.imap_object.EMail.is_important ¤
is_important() -> bool

Check if this email has been flagged as important.

protocols.imap_object.EMail.is_mailing_list ¤
is_mailing_list() -> bool

Check if this email has the typical mailing-list headers.

Warning

The headers checked for hints here are not standard and not systematically used.

protocols.imap_object.EMail.is_newsletter ¤
is_newsletter() -> bool

Check if this email has the typical newsletter headers.

Warning

The headers checked for hints here are not standard and not systematically used.

protocols.imap_object.EMail.spf_pass ¤
spf_pass() -> int

Check if any of the servers listed in the Received email headers is authorized by the DNS SPF rules to send emails on behalf of the email address set in Return-Path.

RETURNS DESCRIPTION
score
  • = 0: neutral result, no explicit success or fail, or server configuration could not be retrieved/interpreted.
  • > 0: success, server is explicitly authorized or SPF rules are deliberately permissive.
  • < 0: fail, server is unauthorized.
  • = 2: explicit success, server is authorized.
  • = -2: explicit fail, server is forbidden, the email is a deliberate spoofing attempt.

TYPE: int

Note

The Return-Path header is set by any proper mail client to the mailbox collecting bounces (notice of undelivered emails), and, while it is optional, the RFC 4408 states that it is the one from which the SPF domain will be inferred. In practice, it is missing only in certain spam messages, so its absence is treated as an explicit fail.

Warning

Emails older than 6 months will at least get a score of 0 and will therefore never fail the SPF check. This is because DNS configuration may have changed since the email was sent, and it could have been valid at the time of sending.

protocols.imap_object.EMail.dkim_pass ¤
dkim_pass() -> int

Check the authenticity of the DKIM signature.

Note

The DKIM signature uses an asymmetric key scheme, where the private key is set on the SMTP server and the public key is set in DNS records of the mailserver. The signature is a cryptographic hash of the email headers (not their content). A valid signature means the private key used to hash headers matches the public key in the DNS records AND the headers have not been tampered with since sending.

RETURNS DESCRIPTION
score
  • = 0: there is no DKIM signature.
  • = 1: the DKIM signature is valid but outdated. This means the public key in DNS records has been updated since the email was sent.
  • = 2: the DKIM signature is valid and up-to-date.
  • = -2: the DKIM signature is invalid. Either the headers have been tampered or the DKIM signature is entirely forged (happens a lot in spam emails).

TYPE: int

Warning

Emails older than 6 months will at least get a score of 0 and will therefore never fail the DKIM check. This is because DNS configuration (public key) may have changed since the email was sent, and it could have been valid at the time of sending.

protocols.imap_object.EMail.arc_pass ¤
arc_pass() -> int

Check the authenticity of the ARC signature.

Note

The ARC signature is still experimental and not widely used. When an email is forwarded, by a user or through a mailing list, its DKIM signature will be invalidated and the email will appear forged/tampered. ARC authenticates the intermediate servers and aims at solving this issue.

RETURNS DESCRIPTION
score
  • = 0: there is no ARC signature,
  • = 2: the ARC signature is valid
  • = -2: the ARC signature is invalid. Typically, it means the signature has been forged.

TYPE: int

protocols.imap_object.EMail.authenticity_score ¤
authenticity_score() -> int

Compute the score of authenticity of the email, summing the results of EMail.spf_pass, EMail.dkim_pass and EMail.arc_pass. The weighting is designed such that one valid check compensates one fail.

RETURNS DESCRIPTION
score
  • == 0: neutral, no explicit authentication is defined on DNS or no rule could be found
  • > 0: explicitly authenticated by at least one method,
  • == 6: maximal authenticity (valid SPF, DKIM and ARC)
  • < 0: spoofed, at least one of SPF or DKIM or ARC failed and

TYPE: int

protocols.imap_object.EMail.is_authentic ¤
is_authentic() -> bool

Helper function for EMail.authenticity_score, checking if at least one authentication method succeeded.

RETURNS DESCRIPTION
bool

True if EMail.authenticity_score returns a score greater than or equal to zero.
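The summing rule can be illustrated with plain integers. This is a sketch of the arithmetic only, assuming the three individual scores are already known; the function names are hypothetical and no DNS lookup is performed.

```python
def authenticity_score_sketch(spf: int, dkim: int, arc: int) -> int:
    """Hypothetical stand-in: sum the three individual check scores."""
    return spf + dkim + arc


def is_authentic_sketch(score: int) -> bool:
    """Hypothetical stand-in: non-negative totals count as authentic."""
    return score >= 0


best = authenticity_score_sketch(spf=2, dkim=2, arc=2)
# One explicit success (+2) compensates one explicit fail (-2).
balanced = authenticity_score_sketch(spf=2, dkim=-2, arc=0)
spoofed = authenticity_score_sketch(spf=-2, dkim=0, arc=0)
print(best, balanced, spoofed, is_authentic_sketch(spoofed))
```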

protocols.imap_object.EMail.age ¤
age() -> timedelta

Compute the age of an email at the time of evaluation

RETURNS DESCRIPTION
timedelta

time difference between current time and sending time of the email

protocols.imap_object.EMail.now ¤
now() -> str

Helper to get access to date/time from within the email object when writing filters

protocols.imap_object.EMail.create_hash ¤
create_hash()

Create a stable hash for the email from persistent headers and metadata.

protocols.imap_object.EMail.query_referenced_emails ¤
query_referenced_emails() -> list[EMail]

Fetch the list of all emails referenced in the present message, aka the whole email thread in which the current email belongs.

The list is sorted from newest to oldest. Queries emails having a Message-ID header matching the ones contained in the References header of the current email.

RETURNS DESCRIPTION
list[EMail]

All emails referenced.

protocols.imap_object.EMail.query_replied_email ¤
query_replied_email() -> EMail

Fetch the email being replied to by the current email.

RETURNS DESCRIPTION
EMail

The email being replied to.

Functions¤

protocols.imap_object.split_url ¤

split_url(url: str) -> tuple[str, str, str, str, str] | None

Split a well-formed URL following RFC3986 into base elements.

RETURNS DESCRIPTION
tuple[str, str, str, str, str] | None

a tuple of (protocol, domain, page, parameters, anchor). Empty/missing fields are initialized with empty strings, so there is no need for individual None checks. If the url input doesn’t match a URL format, return None.