# SARE HTML Ruleset for SpamAssassin - ruleset 0
# Version: 01.03.07
# Created: 2004-03-31
# Modified: 2005-07-02
# Usage instructions, documentation, and change history in 70_sare_html0.cf

#@@# Revision History: Full Revision History stored in 70_sare_html.log
#@@# 01.03.07: July 2 2005
#@@#           Minor score tweaks based on recent mass-checks
#@@#           Added file 0: SARE_HTML_EHTML_OBFU
#@@#           Moved file 0 to 1: SARE_HTML_INV_TAGA

# License: Artistic - see http://www.rulesemporium.com/license.txt
# Current Maintainer: Bob Menschel - RMSA@Menschel.net
# Current Home: http://www.rulesemporium.com/rules/70_sare_html0.cf
#
# Usage: This family of files, 70_sare_html*.cf, contains rules that test HTML strings within emails
#        (except URIs, which are handled in the 70_sare_uri*.cf family of files).
#
# File 0: 70_sare_html0.cf -- These are html rules that hit at least 10 spam and no ham.
#         While SARE cannot guarantee they never will hit ham, they have not hit ham in any SARE mass-check, against tens of thousands of ham.
#         This is a rules file we expect any/all email systems using SpamAssassin to benefit from.
#
# File 1: 70_sare_html1.cf -- These are html rules that meet one of the following criteria:
#         a) Rules that do, or in the past did, hit ham during SARE mass-check tests
#         b) Rules that hit no ham and currently do not hit more than 10 spam in any single mass-check run.
#         If the rules hit ham, they hit at least 10 spam to each 1 ham.
#         If the rules hit ham, they hit fewer than 100 ham.
#         With few exceptions these rules score significantly less than the rules in file 0.
#         Systems which are very sensitive to false positives and/or need to be very careful about resource use may want to exclude this ruleset,
#         pick and choose among its rules, or lower their scores.
#         Systems that use this file 1 should ALSO use file 0.
#
# File 2: 70_sare_html2.cf -- These html rules hit no spam at this time, but they are considered "safe" rules that should never hit ham.
#         These are primarily rules that test for specific html seen only in spam, or similar types of "pretty darn sure" rules.
#         Systems which are very sensitive to SpamAssassin overhead may want to exclude this ruleset file to avoid its overhead,
#         but systems with plenty of resources that want to be aggressive against spam may benefit from this ruleset file.
#
# File 3: 70_sare_html3.cf -- These are html rules that hit a significant amount of ham during SARE mass-check tests.
#         Systems which are very sensitive to false positives or to SA resource usage should NOT install this ruleset.
#
# File 4: 70_sare_html4.cf -- These are html rules that meet one of the following criteria:
#         a) They hit over 100 ham during SARE mass-check tests, but still hit enough spam to be worthwhile to aggressively anti-spam systems.
#         b) They hit no emails at this time, but have been recommended by anti-spam sources.
#         Again, systems which are very sensitive to false positives or to SA resource usage should NOT install this ruleset.
#
# eng:    70_sare_html_eng.cf -- These are html rules which work well within the English language, but are liable to cause false
#         positives in other languages. They include rules which test for letter combinations. Systems that
#         receive ham in languages other than English should NOT use this file.
#
# x30:    70_sare_html_x30.cf -- These are html rules which have been incorporated into SpamAssassin 3.0.x,
#         or which duplicate or greatly overlap 3.0.x rules.
#         Systems which have installed SpamAssassin 3.0.x should therefore NOT use this file.
#
# arc:    70_sare_html_arc.cf -- These are html rules that once were published in other files, but which have since lost all value.
#         They either hit too much ham (without hitting enough spam to make it worthwhile), or they don't hit any spam.
#         SARE regularly runs mass-checks on these rules to see if any of them are worth reviving, but
#         we expect that nobody will be running these rules in any production system.
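#
# Example (an illustrative sketch, not part of the SARE distribution): the advice for file 1 to
#         "pick and choose among its rules, or lower their scores" could be applied from a site's own
#         local.cf. This assumes local.cf is read after the 70_sare_html*.cf files, which is normally
#         the case when they sit in the same site configuration directory. The two rule names are taken
#         from the "Moved 0 to 1" entries in this file; the score value is a made-up placeholder.
#
#           score SARE_HTML_TITLE_SEX 0.500    # assumed value: keep the rule but lower its score
#           score SARE_HTML_IMG_ONLY 0         # a score of 0 disables the rule
#
#         Running "spamassassin --lint" after such a change checks that the configuration still parses.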
#
######## ###################### ##################################################
######## ###################### ##################################################
# Rules renamed or moved
######## ###################### ##################################################

meta SARE_HTML_ALT_WAIT2 0 # archived Feb 26 2005, 01.03.04
meta SARE_HTML_BADOPEN 0 # archived Feb 26 2005, 01.03.04
meta SARE_HTML_BAD_FG_CLR 0 # Moved 3 to 4, Feb 26 2005, 01.03.04
meta SARE_HTML_COLOR_B 0 # Moved 1 to 3, Feb 26 2005, 01.03.04
meta SARE_HTML_COLOR_NWHT3 0 # Moved 3 to 4, Feb 26 2005, 01.03.04
meta SARE_HTML_FONT_INVIS2 0 # Moved 3 to 4, Feb 26 2005, 01.03.04
meta SARE_HTML_FSIZE_1ALL 0 # Moved 1 to 4, Feb 26 2005, 01.03.04
meta SARE_HTML_GIF_DIM 0 # Moved 3 to 4, Feb 26 2005, 01.03.04
meta SARE_HTML_HTML_AFTER 0 # Moved 3 to 4, Feb 26 2005, 01.03.04
meta SARE_HTML_HTML_DBL 0 # Moved 0 to 1, Feb 26 2005, 01.03.04
meta SARE_HTML_HTML_TBL 0 # Moved 0 to 1, Feb 26 2005, 01.03.04
meta SARE_HTML_IMG_ONLY 0 # Moved 0 to 1, Feb 26 2005, 01.03.04
meta SARE_HTML_JVS_HREF 0 # Unarchived to 3, Feb 26 2005, 01.03.04
meta SARE_HTML_MANY_BR10 0 # Moved 1 to 3, Feb 26 2005, 01.03.04
meta SARE_HTML_MANY_BR10 0 # Moved 3 to 4, Feb 26 2005, 01.03.04
meta SARE_HTML_NO_BODY 0 # Moved 1 to 3, Feb 26 2005, 01.03.04
meta SARE_HTML_NO_HTML1 0 # Moved 1 to 3, Feb 26 2005, 01.03.04
meta SARE_HTML_P_JUSTIFY 0 # Moved 3 to 4, Feb 26 2005, 01.03.04
meta SARE_HTML_TITLE_SEX 0 # Moved 0 to 1, Feb 26 2005, 01.03.04
meta SARE_HTML_URI_2SLASH 0 # Moved 3 to 4, Feb 26 2005, 01.03.04
meta SARE_HTML_URI_AXEL 0 # archived Feb 26 2005, 01.03.04
meta SARE_HTML_URI_BADQRY 0 # Moved 0 to 3, Feb 26 2005, 01.03.04
meta SARE_HTML_URI_FORMPHP 0 # Unarchived to 3, Feb 26 2005, 01.03.04
meta SARE_HTML_URI_HREF 0 # Moved 0 to 3, Feb 26 2005, 01.03.04
meta SARE_HTML_URI_MANYP2 0 # archived Feb 26 2005, 01.03.04
meta SARE_HTML_URI_MANYP3 0 # Moved 3 to 4, Feb 26 2005, 01.03.04
meta SARE_HTML_URI_NUMPHP3 0 # Moved 3 to 4, Feb 26 2005, 01.03.04
meta SARE_HTML_URI_OBFU4 0 # archived Feb 26 2005, 01.03.04
meta SARE_HTML_URI_OBFU4a 0 # archived Feb 26 2005, 01.03.04
meta SARE_HTML_URI_PARTID 0 # Moved 0 to 1, Feb 26 2005, 01.03.04
meta SARE_HTML_URI_RID 0 # archived Feb 26 2005, 01.03.04
meta SARE_HTML_USL_MULT 0 # archived Feb 26 2005, 01.03.04
meta SARE_HTML_FONT_EBEF 0 # Moved 0 to 1, Mar 12 2005, 01.03.05
meta SARE_HTML_URI_DEFASP 0 # Moved 0 to 1, Mar 12 2005, 01.03.05
meta SARE_HTML_INV_TAGA 0

######## ###################### ##################################################

rawbody __SARE_HTML_HAS_A eval:html_tag_exists('a')
rawbody __SARE_HTML_HAS_BR eval:html_tag_exists('br')
rawbody __SARE_HTML_HAS_DIV eval:html_tag_exists('div')
rawbody __SARE_HTML_HAS_FONT eval:html_tag_exists('font')
rawbody __SARE_HTML_HAS_IMG eval:html_tag_exists('img')
rawbody __SARE_HTML_HAS_P eval:html_tag_exists('p')
rawbody __SARE_HTML_HAS_PRE eval:html_tag_exists('pre')
rawbody __SARE_HTML_HAS_TITLE eval:html_tag_exists('title')
header __SARE_HTML_HAS_TO exists:To
body __SARE_HTML_HAS_MSG /./

rawbody __SARE_HTML_HBODY m''i
rawbody __SARE_HTML_BEHTML m''i
rawbody __SARE_HTML_BEHTML2 m'^'i
rawbody __SARE_HTML_EFONT m'^'i
rawbody __SARE_HTML_EHEB m'^'i

rawbody __SARE_HTML_CMT_CNTR /
/i
describe SARE_HTML_CMT_MONEY HTML Comment seems to mention money
score SARE_HTML_CMT_MONEY 0.100
#counts SARE_HTML_CMT_MONEY 0s/0h of 98542 corpus (76935s/21607h RM) 05/12/04
#counts SARE_HTML_CMT_MONEY 0s/0h of 29365 corpus (5882s/23483h JH) 08/14/04 TM2 SA3.0-pre2

######## ###################### ##################################################
# Image tag tests
######## ###################### ##################################################

rawbody SARE_HTML_GIF_NUM /\.gif\d{2,}/i
describe SARE_HTML_GIF_NUM HTML contains tracking numbers after .gif
score SARE_HTML_GIF_NUM 0.100
#counts SARE_HTML_GIF_NUM 0s/0h of 98542 corpus (76935s/21607h RM) 05/12/04
#counts SARE_HTML_GIF_NUM 0s/0h of 29365 corpus (5882s/23483h JH) 08/14/04 TM2 SA3.0-pre2

######## ###################### ##################################################
# Paragraphs, breaks, and spacings
######## ###################### ##################################################

######## ###################### ##################################################
# Javascript and object tests
######## ###################### ##################################################

rawbody SARE_HTML_JVS_POPUP /<\/b>.{1,5}){7,8}/i
describe SARE_HTML_USL_B7 Multiple (7-8)
score SARE_HTML_USL_B7 0.100
#counts SARE_HTML_USL_B7 0s/0h of 98542 corpus (76935s/21607h RM) 05/12/04
#counts SARE_HTML_USL_B7 0s/0h of 29365 corpus (5882s/23483h JH) 08/14/04 TM2 SA3.0-pre2

rawbody SARE_HTML_USL_B9 /(<\/b>.{1,5}){9,10}/i
describe SARE_HTML_USL_B9 Multiple (9-10)
score SARE_HTML_USL_B9 0.100
#counts SARE_HTML_USL_B9 0s/0h of 98542 corpus (76935s/21607h RM) 05/12/04
#counts SARE_HTML_USL_B9 0s/0h of 29365 corpus (5882s/23483h JH) 08/14/04 TM2 SA3.0-pre2

# EOF

# SARE HTML Ruleset for SpamAssassin - ruleset 3
# Version: 01.03.07
# Created: 2004-03-31
# Modified: 2005-07-02
# Usage instructions, documentation, and change history in 70_sare_html0.cf

#@@# Revision History: Full Revision History stored in 70_sare_html.log
#@@# 01.03.07: July 2 2005
#@@#           Minor score tweaks based on recent mass-checks
#@@#           Archived from file 3: SARE_HTML_URI_ENC_AT
#@@#           Moved file 3 to 1: SARE_HTML_A_BODY

# License: Artistic - see http://www.rulesemporium.com/license.txt
# Current Maintainer: Bob Menschel - RMSA@Menschel.net
# Current Home: http://www.rulesemporium.com/rules/70_sare_html3.cf
#
######## ###################### ##################################################

rawbody __SARE_HTML_HAS_A eval:html_tag_exists('a')
rawbody __SARE_HTML_HAS_BR eval:html_tag_exists('br')
rawbody __SARE_HTML_HAS_DIV eval:html_tag_exists('div')
rawbody __SARE_HTML_HAS_FONT eval:html_tag_exists('font')
rawbody __SARE_HTML_HAS_IMG eval:html_tag_exists('img')
rawbody __SARE_HTML_HAS_P eval:html_tag_exists('p')
rawbody __SARE_HTML_HAS_PRE eval:html_tag_exists('pre')
rawbody __SARE_HTML_HAS_TITLE eval:html_tag_exists('title')
header __SARE_HTML_HAS_TO To =~ /./
body __SARE_HTML_HAS_MSG /./

rawbody __SARE_HTML_HBODY m''i
rawbody __SARE_HTML_BEHTML m''i
rawbody __SARE_HTML_BEHTML2 m'^'i
rawbody __SARE_HTML_EFONT m'^'i
rawbody __SARE_HTML_EHEB m'^'i

rawbody __SARE_HTML_CMT_CNTR /