| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
 | |
| <HTML
 | |
| ><HEAD
 | |
| ><TITLE
 | |
| >Appendix</TITLE
 | |
| ><META
 | |
| NAME="GENERATOR"
 | |
| CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
 | |
| REL="HOME"
 | |
| TITLE="Privoxy 3.0.12 User Manual"
 | |
| HREF="index.html"><LINK
 | |
| REL="PREVIOUS"
 | |
| TITLE="See Also"
 | |
| HREF="seealso.html"><LINK
 | |
| REL="STYLESHEET"
 | |
| TYPE="text/css"
 | |
| HREF="../p_doc.css"><META
 | |
| HTTP-EQUIV="Content-Type"
 | |
| CONTENT="text/html;
 | |
| charset=ISO-8859-1">
 | |
| <LINK REL="STYLESHEET" TYPE="text/css" HREF="p_doc.css">
 | |
| </head
 | |
| ><BODY
 | |
| CLASS="SECT1"
 | |
| BGCOLOR="#EEEEEE"
 | |
| TEXT="#000000"
 | |
| LINK="#0000FF"
 | |
| VLINK="#840084"
 | |
| ALINK="#0000FF"
 | |
| ><DIV
 | |
| CLASS="NAVHEADER"
 | |
| ><TABLE
 | |
| SUMMARY="Header navigation table"
 | |
| WIDTH="100%"
 | |
| BORDER="0"
 | |
| CELLPADDING="0"
 | |
| CELLSPACING="0"
 | |
| ><TR
 | |
| ><TH
 | |
| COLSPAN="3"
 | |
| ALIGN="center"
 | |
| >Privoxy 3.0.12 User Manual</TH
 | |
| ></TR
 | |
| ><TR
 | |
| ><TD
 | |
| WIDTH="10%"
 | |
| ALIGN="left"
 | |
| VALIGN="bottom"
 | |
| ><A
 | |
| HREF="seealso.html"
 | |
| ACCESSKEY="P"
 | |
| >Prev</A
 | |
| ></TD
 | |
| ><TD
 | |
| WIDTH="80%"
 | |
| ALIGN="center"
 | |
| VALIGN="bottom"
 | |
| ></TD
 | |
| ><TD
 | |
| WIDTH="10%"
 | |
| ALIGN="right"
 | |
| VALIGN="bottom"
 | |
| > </TD
 | |
| ></TR
 | |
| ></TABLE
 | |
| ><HR
 | |
| ALIGN="LEFT"
 | |
| WIDTH="100%"></DIV
 | |
| ><DIV
 | |
| CLASS="SECT1"
 | |
| ><H1
 | |
| CLASS="SECT1"
 | |
| ><A
 | |
| NAME="APPENDIX"
 | |
| >14. Appendix</A
 | |
| ></H1
 | |
| ><DIV
 | |
| CLASS="SECT2"
 | |
| ><H2
 | |
| CLASS="SECT2"
 | |
| ><A
 | |
| NAME="REGEX"
 | |
| >14.1. Regular Expressions</A
 | |
| ></H2
 | |
| ><P
 | |
| > <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > uses Perl-style <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"regular
 | |
|  expressions"</SPAN
 | |
| > in its <A
 | |
| HREF="actions-file.html"
 | |
| >actions
 | |
|  files</A
 | |
| > and <A
 | |
| HREF="filter-file.html"
 | |
| >filter file</A
 | |
| >,
 | |
|  through the <A
 | |
| HREF="http://www.pcre.org/"
 | |
| TARGET="_top"
 | |
| >PCRE</A
 | |
| > and
 | |
|  <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >PCRS</SPAN
 | |
| > libraries.</P
 | |
| ><P
 | |
| > If you are reading this, you probably don't understand what <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"regular
 | |
|  expressions"</SPAN
 | |
| > are, or what they can do. So this will be a very brief
 | |
|  introduction only. A full explanation would require a <A
 | |
| HREF="http://www.oreilly.com/catalog/regex/"
 | |
| TARGET="_top"
 | |
| >book</A
 | |
| > ;-)</P
 | |
| ><P
 | |
| > Regular expressions provide a language to describe patterns that can be
 | |
|  run against strings of characters (letters, numbers, etc.), to see if they
 | |
|  match the string or not. The patterns are themselves (sometimes complex)
 | |
|  strings of literal characters, combined with wild-cards and other special
 | |
|  characters, called meta-characters. The <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"meta-characters"</SPAN
 | |
| > have
 | |
|  special meanings and are used to build complex patterns to be matched against.
 | |
|  Perl Compatible Regular Expressions are an especially convenient
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"dialect"</SPAN
 | |
| > of the regular expression language.</P
 | |
| ><P
 | |
| > To make a simple analogy, we do something similar when we use wild-card
 | |
|  characters when listing files with the <B
 | |
| CLASS="COMMAND"
 | |
| >dir</B
 | |
| > command in DOS. 
 | |
|  <TT
 | |
| CLASS="LITERAL"
 | |
| >*.*</TT
 | |
| > matches all filenames. The <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"special"</SPAN
 | |
| >
 | |
|  character here is the asterisk which matches any and all characters. We can be
 | |
|  more specific and use <TT
 | |
| CLASS="LITERAL"
 | |
| >?</TT
 | |
| > to match just individual
 | |
|  characters. So <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"dir file?.txt"</SPAN
 | |
| > would match
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"file1.txt"</SPAN
 | |
| >, <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"file2.txt"</SPAN
 | |
| >, etc. We are pattern
 | |
|  matching, using a similar technique to <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"regular expressions"</SPAN
 | |
| >!</P
 | |
| ><P
 | |
| > Regular expressions do essentially the same thing, but are much, much more
 | |
|  powerful. There are many more <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"special characters"</SPAN
 | |
| > and ways of 
 | |
|  building complex patterns. Let's look at a few of the common ones,
 | |
|  and then some examples:</P
 | |
| ><P
 | |
| ><P
 | |
| ></P
 | |
| ><TABLE
 | |
| BORDER="0"
 | |
| ><TBODY
 | |
| ><TR
 | |
| ><TD
 | |
| >  <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >.</I
 | |
| ></SPAN
 | |
| > - Matches any single character, e.g. <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"a"</SPAN
 | |
| >,
 | |
|   <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"A"</SPAN
 | |
| >, <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"4"</SPAN
 | |
| >, <SPAN
 | |
| CLASS="QUOTE"
 | |
| >":"</SPAN
 | |
| >, or <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"@"</SPAN
 | |
| >.
 | |
|  </TD
 | |
| ></TR
 | |
| ></TBODY
 | |
| ></TABLE
 | |
| ><P
 | |
| ></P
 | |
| ></P
 | |
| ><P
 | |
| ><P
 | |
| ></P
 | |
| ><TABLE
 | |
| BORDER="0"
 | |
| ><TBODY
 | |
| ><TR
 | |
| ><TD
 | |
| >  <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >?</I
 | |
| ></SPAN
 | |
| > - The preceding character or expression is matched ZERO or ONE
 | |
|   times. Either/or.
 | |
|  </TD
 | |
| ></TR
 | |
| ></TBODY
 | |
| ></TABLE
 | |
| ><P
 | |
| ></P
 | |
| ></P
 | |
| ><P
 | |
| ><P
 | |
| ></P
 | |
| ><TABLE
 | |
| BORDER="0"
 | |
| ><TBODY
 | |
| ><TR
 | |
| ><TD
 | |
| >  <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >+</I
 | |
| ></SPAN
 | |
| > - The preceding character or expression is matched ONE or MORE
 | |
|   times.
 | |
|  </TD
 | |
| ></TR
 | |
| ></TBODY
 | |
| ></TABLE
 | |
| ><P
 | |
| ></P
 | |
| ></P
 | |
| ><P
 | |
| ><P
 | |
| ></P
 | |
| ><TABLE
 | |
| BORDER="0"
 | |
| ><TBODY
 | |
| ><TR
 | |
| ><TD
 | |
| >  <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >*</I
 | |
| ></SPAN
 | |
| > - The preceding character or expression is matched ZERO or MORE
 | |
|   times.
 | |
|  </TD
 | |
| ></TR
 | |
| ></TBODY
 | |
| ></TABLE
 | |
| ><P
 | |
| ></P
 | |
| ></P
 | |
| ><P
 | |
| ><P
 | |
| ></P
 | |
| ><TABLE
 | |
| BORDER="0"
 | |
| ><TBODY
 | |
| ><TR
 | |
| ><TD
 | |
| >  <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >\</I
 | |
| ></SPAN
 | |
| > - The <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"escape"</SPAN
 | |
| > character denotes that
 | |
|   the following character should be taken literally. This is used where one of the 
 | |
|   special characters (e.g. <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"."</SPAN
 | |
| >) needs to be taken literally and
 | |
|   not as a special meta-character. Example: <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"example\.com"</SPAN
 | |
| >, makes 
 | |
|   sure the period is recognized only as a period (and not expanded to its 
 | |
|   meta-character meaning of any single character).
 | |
|  </TD
 | |
| ></TR
 | |
| ></TBODY
 | |
| ></TABLE
 | |
| ><P
 | |
| ></P
 | |
| ></P
 | |
| ><P
 | |
| ><P
 | |
| ></P
 | |
| ><TABLE
 | |
| BORDER="0"
 | |
| ><TBODY
 | |
| ><TR
 | |
| ><TD
 | |
| >  <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >[ ]</I
 | |
| ></SPAN
 | |
| > - Characters enclosed in brackets will be matched if
 | |
|   any of the enclosed characters are encountered. For instance, <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"[0-9]"</SPAN
 | |
| >
 | |
|   matches any numeric digit (zero through nine). As an example, we can combine 
 | |
|   this with <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+"</SPAN
 | |
| > to match any digit one or more times: <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"[0-9]+"</SPAN
 | |
| >.
 | |
|  </TD
 | |
| ></TR
 | |
| ></TBODY
 | |
| ></TABLE
 | |
| ><P
 | |
| ></P
 | |
| ></P
 | |
| ><P
 | |
| ><P
 | |
| ></P
 | |
| ><TABLE
 | |
| BORDER="0"
 | |
| ><TBODY
 | |
| ><TR
 | |
| ><TD
 | |
| >  <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >( )</I
 | |
| ></SPAN
 | |
| > - Parentheses are used to group a sub-expression,
 | |
|   or multiple sub-expressions.
 | |
|  </TD
 | |
| ></TR
 | |
| ></TBODY
 | |
| ></TABLE
 | |
| ><P
 | |
| ></P
 | |
| ></P
 | |
| ><P
 | |
| ><P
 | |
| ></P
 | |
| ><TABLE
 | |
| BORDER="0"
 | |
| ><TBODY
 | |
| ><TR
 | |
| ><TD
 | |
| >  <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >|</I
 | |
| ></SPAN
 | |
| > - The <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"bar"</SPAN
 | |
| > character works like an
 | |
|   <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"or"</SPAN
 | |
| > conditional statement. A match is successful if the
 | |
|   sub-expression on either side of <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"|"</SPAN
 | |
| > matches. As an example:
 | |
|   <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"/(this|that) example/"</SPAN
 | |
| > uses grouping and the bar character 
 | |
|   and would match either <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"this example"</SPAN
 | |
| > or <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"that
 | |
|   example"</SPAN
 | |
| >, and nothing else.
 | |
|  </TD
 | |
| ></TR
 | |
| ></TBODY
 | |
| ></TABLE
 | |
| ><P
 | |
| ></P
 | |
| ></P
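 | |
| ><P
 | |
| > The meta-characters listed above can be tried out with a short script. This is
 | |
|  only a sketch: it uses Python's <TT CLASS="LITERAL">re</TT> module rather than the
 | |
|  PCRE library (the basic constructs shown here behave the same way in both), and
 | |
|  the names <TT CLASS="LITERAL">EXAMPLES</TT> and <TT CLASS="LITERAL">check</TT> are
 | |
|  made up for this illustration.</P
 | |
| ><PRE
 | |
| CLASS="PROGRAMLISTING"
 | |
| >
```python
import re

# Each entry: (pattern, a string that should match, a string that should not).
# Matching is done against the whole string with re.fullmatch.
EXAMPLES = [
    (r"a.c", "a:c", "ac"),                            # .  any single character
    (r"colou?r", "color", "colouur"),                 # ?  zero or one
    (r"ab+", "abbb", "a"),                            # +  one or more
    (r"ab*", "a", "b"),                               # *  zero or more
    (r"example\.com", "example.com", "exampleXcom"),  # \  escapes the period
    (r"[0-9]+", "2187", "21a7"),                      # [ ] set of characters
    (r"(this|that) example", "that example", "other example"),  # ( ) and |
]


def check(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


for pattern, good, bad in EXAMPLES:
    # Prints the pattern, then the result for the good and the bad string.
    print(pattern, check(pattern, good), check(pattern, bad))
```
</PRE
 | |
| ><P
 | |
| > Every line should report a match for the first string and no match for the
 | |
|  second one.</P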
 | |
| ><P
 | |
| > These are just some of the ones you are likely to use when matching URLs with 
 | |
|  <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| >, and are a long way from a definitive
 | |
|  list. This is enough to get us started with a few simple examples which may
 | |
|  be more illuminating:</P
 | |
| ><P
 | |
| > <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| ><TT
 | |
| CLASS="LITERAL"
 | |
| >/.*/banners/.*</TT
 | |
| ></I
 | |
| ></SPAN
 | |
| > - A  simple example
 | |
|  that uses the common combination of <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"."</SPAN
 | |
| > and <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"*"</SPAN
 | |
| > to 
 | |
|  denote any character, zero or more times. In other words, any string at all.
 | |
|  So we start with a literal forward slash, then our regular expression pattern 
 | |
|  (<SPAN
 | |
| CLASS="QUOTE"
 | |
| >".*"</SPAN
 | |
| >) another literal forward slash, the string
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"banners"</SPAN
 | |
| >, another forward slash, and lastly another
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >".*"</SPAN
 | |
| >. We are building 
 | |
|  a directory path here. This will match any file with the path that has a
 | |
|  directory named <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"banners"</SPAN
 | |
| > in it. The <SPAN
 | |
| CLASS="QUOTE"
 | |
| >".*"</SPAN
 | |
| > matches
 | |
|  any characters, and this could conceivably be more forward slashes, so it
 | |
|  might expand into a much longer looking path. For example, this could match:
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"/eye/hate/spammers/banners/annoy_me_please.gif"</SPAN
 | |
| >, or just
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"/ads/banners/annoying.html"</SPAN
 | |
| >, or almost an infinite number of other
 | |
|  possible combinations, just so it has <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"banners"</SPAN
 | |
| > in the path
 | |
|  somewhere.</P
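 | |
| ><P
 | |
| > The pattern above can be checked against a few paths with a short script. This
 | |
|  is a sketch under stated assumptions: it uses Python's
 | |
|  <TT CLASS="LITERAL">re</TT> module instead of PCRE, it matches the pattern against
 | |
|  the whole path, and <TT CLASS="LITERAL">is_banner_path</TT> is a name invented for
 | |
|  this example.</P
 | |
| ><PRE
 | |
| CLASS="PROGRAMLISTING"
 | |
| >
```python
import re

# The pattern discussed above, compiled once.
BANNERS = re.compile(r"/.*/banners/.*")


def is_banner_path(path):
    """Return True if the whole path matches the banners pattern."""
    return BANNERS.fullmatch(path) is not None


print(is_banner_path("/eye/hate/spammers/banners/annoy_me_please.gif"))
print(is_banner_path("/images/logo.gif"))
print(is_banner_path("/my-banners/x.gif"))
```
</PRE
 | |
| ><P
 | |
| > Only the first path contains a directory named exactly
 | |
|  <TT CLASS="LITERAL">banners</TT> below another path component, so only the first
 | |
|  call reports a match.</P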
 | |
| ><P
 | |
| > And now something a little more complex:</P
 | |
| ><P
 | |
| > <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| ><TT
 | |
| CLASS="LITERAL"
 | |
| >/.*/adv((er)?ts?|ertis(ing|ements?))?/</TT
 | |
| ></I
 | |
| ></SPAN
 | |
| > - 
 | |
|  We have several literal forward slashes again (<SPAN
 | |
| CLASS="QUOTE"
 | |
| >"/"</SPAN
 | |
| >), so we are
 | |
|  building another expression that is a file path statement. We have another 
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >".*"</SPAN
 | |
| >, so we are matching against any conceivable sub-path, just so
 | |
|  it matches our expression. The only true literal that <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >must
 | |
|  match</I
 | |
| ></SPAN
 | |
| > our pattern is <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >adv</SPAN
 | |
| >, together with
 | |
|  the forward slashes. What comes after the <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"adv"</SPAN
 | |
| > string is the
 | |
|  interesting part. </P
 | |
| ><P
 | |
| > Remember the <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"?"</SPAN
 | |
| > means the preceding expression (either a
 | |
|  literal character or anything grouped with <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"(...)"</SPAN
 | |
| > in this case)
 | |
|  can exist or not, since this means either zero or one match. So
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"((er)?ts?|ertis(ing|ements?))"</SPAN
 | |
| > is optional, as are the
 | |
|  individual sub-expressions: <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"(er)"</SPAN
 | |
| >,
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"(ing|ements?)"</SPAN
 | |
| >, and the <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"s"</SPAN
 | |
| >. The <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"|"</SPAN
 | |
| >
 | |
|  means <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"or"</SPAN
 | |
| >. We have two of those. For instance, 
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"(ing|ements?)"</SPAN
 | |
| >, can expand to match either <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"ing"</SPAN
 | |
| > 
 | |
|  <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >OR</I
 | |
| ></SPAN
 | |
| > <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"ements?"</SPAN
 | |
| >. What is being done here, is an
 | |
|  attempt at matching as many variations of <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"advertisement"</SPAN
 | |
| >, and 
 | |
|  similar, as possible. So this would expand to match just <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"adv"</SPAN
 | |
| >,
 | |
|  or <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"advert"</SPAN
 | |
| >, or <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"adverts"</SPAN
 | |
| >, or
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"advertising"</SPAN
 | |
| >, or <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"advertisement"</SPAN
 | |
| >, or
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"advertisements"</SPAN
 | |
| >. You get the idea. But it would not match 
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"advertizements"</SPAN
 | |
| > (with a <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"z"</SPAN
 | |
| >). We could fix that by
 | |
|  changing our regular expression to: 
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/"</SPAN
 | |
| >, which would then match
 | |
|  either spelling.</P
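 | |
| ><P
 | |
| > To see which spellings the two expressions accept, they can be run against the
 | |
|  words listed above. This is only a sketch: it uses Python's
 | |
|  <TT CLASS="LITERAL">re</TT> module instead of PCRE, and the
 | |
|  <TT CLASS="LITERAL">/site/</TT> prefix and the helper
 | |
|  <TT CLASS="LITERAL">matches</TT> are made up for the test.</P
 | |
| ><PRE
 | |
| CLASS="PROGRAMLISTING"
 | |
| >
```python
import re

# The original pattern and the variant that also accepts a "z" spelling.
ADV = re.compile(r"/.*/adv((er)?ts?|ertis(ing|ements?))?/")
ADV_SZ = re.compile(r"/.*/adv((er)?ts?|erti(s|z)(ing|ements?))?/")

NAMES = ["adv", "advert", "adverts", "advertising",
         "advertisement", "advertisements", "advertizements"]


def matches(regex, directory):
    """Test a directory name inside a made-up /site/NAME/ path."""
    return regex.fullmatch("/site/" + directory + "/") is not None


for name in NAMES:
    # Columns: directory name, result for ADV, result for ADV_SZ.
    print(name, matches(ADV, name), matches(ADV_SZ, name))
```
</PRE
 | |
| ><P
 | |
| > The first expression rejects only the spelling with a
 | |
|  <TT CLASS="LITERAL">z</TT>; the second accepts every name in the list.</P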
 | |
| ><P
 | |
| > <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| ><TT
 | |
| CLASS="LITERAL"
 | |
| >/.*/advert[0-9]+\.(gif|jpe?g)</TT
 | |
| ></I
 | |
| ></SPAN
 | |
| > - Again 
 | |
|  another path statement with forward slashes. Anything in the square brackets 
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"[ ]"</SPAN
 | |
| > can be matched. This is using <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"0-9"</SPAN
 | |
| > as a
 | |
|  shorthand expression to mean any digit zero through nine. It is the same as
 | |
|  saying <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"0123456789"</SPAN
 | |
| >. So any digit matches. The <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+"</SPAN
 | |
| >
 | |
|  means one or more of the preceding expression must be included. The preceding 
 | |
|  expression here is what is in the square brackets -- in this case, any digit 
 | |
|  zero through nine. Then, at the end, we have a grouping: <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"(gif|jpe?g)"</SPAN
 | |
| >. 
 | |
|  This includes a <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"|"</SPAN
 | |
| >, so this needs to match the expression on
 | |
|  either side of that bar character also. A simple <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"gif"</SPAN
 | |
| > on one side, and the other
 | |
|  side will in turn match either <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"jpeg"</SPAN
 | |
| > or <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"jpg"</SPAN
 | |
| >,
 | |
|  since the <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"?"</SPAN
 | |
| > means the letter <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"e"</SPAN
 | |
| > is optional and
 | |
|  can be matched once or not at all. So we are building an expression here to
 | |
|  match a GIF or JPEG image file. It must include the literal
 | |
|  string <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"advert"</SPAN
 | |
| >, then one or more digits, and a <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"."</SPAN
 | |
| >
 | |
|  (which is now a literal, and not a special character, since it is escaped
 | |
|  with <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"\"</SPAN
 | |
| >), and lastly either <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"gif"</SPAN
 | |
| >, or
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"jpeg"</SPAN
 | |
| >, or <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"jpg"</SPAN
 | |
| >. Some possible matches would
 | |
|  include: <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"//advert1.jpg"</SPAN
 | |
| >,
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"/nasty/ads/advert1234.gif"</SPAN
 | |
| >,
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"/banners/from/hell/advert99.jpg"</SPAN
 | |
| >. It would not match
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"advert1.gif"</SPAN
 | |
| > (no leading slash), or
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"/adverts232.jpg"</SPAN
 | |
| > (the expression does not include an
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"s"</SPAN
 | |
| >), or <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"/advert1.jsp"</SPAN
 | |
| > (<SPAN
 | |
| CLASS="QUOTE"
 | |
| >"jsp"</SPAN
 | |
| > is not
 | |
|  in the expression anywhere).</P
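 | |
| ><P
 | |
| > The matching and non-matching paths named above can be verified with a short
 | |
|  script. As before this is a sketch that uses Python's
 | |
|  <TT CLASS="LITERAL">re</TT> module instead of PCRE and matches against the whole
 | |
|  path; <TT CLASS="LITERAL">is_advert_image</TT> is a hypothetical helper name.</P
 | |
| ><PRE
 | |
| CLASS="PROGRAMLISTING"
 | |
| >
```python
import re

ADVERT_IMAGE = re.compile(r"/.*/advert[0-9]+\.(gif|jpe?g)")

# Paths taken from the paragraph above.
SHOULD_MATCH = [
    "//advert1.jpg",
    "/nasty/ads/advert1234.gif",
    "/banners/from/hell/advert99.jpg",
]
SHOULD_NOT_MATCH = [
    "advert1.gif",       # no leading slash
    "/adverts232.jpg",   # the expression does not include an "s"
    "/advert1.jsp",      # "jsp" is not one of the alternatives
]


def is_advert_image(path):
    """Return True if the whole path matches the advert image pattern."""
    return ADVERT_IMAGE.fullmatch(path) is not None


for path in SHOULD_MATCH + SHOULD_NOT_MATCH:
    print(path, is_advert_image(path))
```
</PRE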
 | |
| ><P
 | |
| > We are barely scratching the surface of regular expressions here so that you
 | |
|  can understand the default <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| >
 | |
|  configuration files, and maybe use this knowledge to customize your own
 | |
|  installation. There is much, much more that can be done with regular
 | |
|  expressions. Now that you know enough to get started, you can learn more on
 | |
|  your own :/</P
 | |
| ><P
 | |
| > More reading on Perl Compatible Regular Expressions: 
 | |
|  <A
 | |
| HREF="http://perldoc.perl.org/perlre.html"
 | |
| TARGET="_top"
 | |
| >http://perldoc.perl.org/perlre.html</A
 | |
| ></P
 | |
| ><P
 | |
| > For information on regular expression based substitutions and their applications
 | |
|  in filters, please see the <A
 | |
| HREF="filter-file.html"
 | |
| >filter file tutorial</A
 | |
| >
 | |
|  in this manual.</P
 | |
| ></DIV
 | |
| ><DIV
 | |
| CLASS="SECT2"
 | |
| ><H2
 | |
| CLASS="SECT2"
 | |
| ><A
 | |
| NAME="AEN5174"
 | |
| >14.2. Privoxy's Internal Pages</A
 | |
| ></H2
 | |
| ><P
 | |
| > Since <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > proxies each requested 
 | |
|  web page, it is easy for <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > to 
 | |
|  trap certain special URLs. In this way, we can talk directly to
 | |
|  <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| >, and see how it is 
 | |
|  configured, see how our rules are being applied, change these 
 | |
|  rules and other configuration options, and even turn
 | |
|  <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy's</SPAN
 | |
| > filtering off, all with 
 | |
|  a web browser.
</P
 | |
| ><P
 | |
| > The URLs listed below are the special ones that allow direct access 
 | |
|  to <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| >. Of course,
 | |
|  <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > must be running to access these. If 
 | |
|  not, you will get a friendly error message. Internet access is not 
 | |
|  necessary either.</P
 | |
| ><P
 | |
| > <P
 | |
| ></P
 | |
| ><UL
 | |
| ><LI
 | |
| ><P
 | |
| >  
 | |
|    Privoxy main page: 
 | |
|   </P
 | |
| ><A
 | |
| NAME="AEN5188"
 | |
| ></A
 | |
| ><BLOCKQUOTE
 | |
| CLASS="BLOCKQUOTE"
 | |
| ><P
 | |
| > 
 | |
|      <A
 | |
| HREF="http://config.privoxy.org/"
 | |
| TARGET="_top"
 | |
| >http://config.privoxy.org/</A
 | |
| >
 | |
|    </P
 | |
| ></BLOCKQUOTE
 | |
| ><P
 | |
| >   There is a shortcut: <A
 | |
| HREF="http://p.p/"
 | |
| TARGET="_top"
 | |
| >http://p.p/</A
 | |
| > (But it
 | |
|    doesn't provide a fall-back to a real page, in case the request is not
 | |
|    sent through <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| >)
 | |
|   </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >  
 | |
|     Show information about the current configuration, including viewing and 
 | |
|     editing of actions files:
 | |
|   </P
 | |
| ><A
 | |
| NAME="AEN5196"
 | |
| ></A
 | |
| ><BLOCKQUOTE
 | |
| CLASS="BLOCKQUOTE"
 | |
| ><P
 | |
| > 
 | |
|     <A
 | |
| HREF="http://config.privoxy.org/show-status"
 | |
| TARGET="_top"
 | |
| >http://config.privoxy.org/show-status</A
 | |
| >
 | |
|    </P
 | |
| ></BLOCKQUOTE
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >  
 | |
|     Show the source code version numbers:
 | |
|   </P
 | |
| ><A
 | |
| NAME="AEN5201"
 | |
| ></A
 | |
| ><BLOCKQUOTE
 | |
| CLASS="BLOCKQUOTE"
 | |
| ><P
 | |
| > 
 | |
|     <A
 | |
| HREF="http://config.privoxy.org/show-version"
 | |
| TARGET="_top"
 | |
| >http://config.privoxy.org/show-version</A
 | |
| >
 | |
|    </P
 | |
| ></BLOCKQUOTE
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >  
 | |
|    Show the browser's request headers:
 | |
|   </P
 | |
| ><A
 | |
| NAME="AEN5206"
 | |
| ></A
 | |
| ><BLOCKQUOTE
 | |
| CLASS="BLOCKQUOTE"
 | |
| ><P
 | |
| > 
 | |
|     <A
 | |
| HREF="http://config.privoxy.org/show-request"
 | |
| TARGET="_top"
 | |
| >http://config.privoxy.org/show-request</A
 | |
| >
 | |
|    </P
 | |
| ></BLOCKQUOTE
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >  
 | |
|    Show which actions apply to a URL and why:
 | |
|   </P
 | |
| ><A
 | |
| NAME="AEN5211"
 | |
| ></A
 | |
| ><BLOCKQUOTE
 | |
| CLASS="BLOCKQUOTE"
 | |
| ><P
 | |
| > 
 | |
|     <A
 | |
| HREF="http://config.privoxy.org/show-url-info"
 | |
| TARGET="_top"
 | |
| >http://config.privoxy.org/show-url-info</A
 | |
| >
 | |
|    </P
 | |
| ></BLOCKQUOTE
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >  
 | |
|    Toggle Privoxy on or off. This feature can be turned off/on in the main 
 | |
|    <TT
 | |
| CLASS="FILENAME"
 | |
| >config</TT
 | |
| > file. When toggled <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"off"</SPAN
 | |
| >, <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"Privoxy"</SPAN
 | |
| >
 | |
|    continues to run, but only as a pass-through proxy, with no actions taking
 | |
|    place:
 | |
|   </P
 | |
| ><A
 | |
| NAME="AEN5219"
 | |
| ></A
 | |
| ><BLOCKQUOTE
 | |
| CLASS="BLOCKQUOTE"
 | |
| ><P
 | |
| > 
 | |
|     <A
 | |
| HREF="http://config.privoxy.org/toggle"
 | |
| TARGET="_top"
 | |
| >http://config.privoxy.org/toggle</A
 | |
| >
 | |
|    </P
 | |
| ></BLOCKQUOTE
 | |
| ><P
 | |
| >   Shortcuts: turn off, then on: 
 | |
|   </P
 | |
| ><A
 | |
| NAME="AEN5223"
 | |
| ></A
 | |
| ><BLOCKQUOTE
 | |
| CLASS="BLOCKQUOTE"
 | |
| ><P
 | |
| > 
 | |
|      <A
 | |
| HREF="http://config.privoxy.org/toggle?set=disable"
 | |
| TARGET="_top"
 | |
| >http://config.privoxy.org/toggle?set=disable</A
 | |
| >
 | |
|    </P
 | |
| ></BLOCKQUOTE
 | |
| ><A
 | |
| NAME="AEN5226"
 | |
| ></A
 | |
| ><BLOCKQUOTE
 | |
| CLASS="BLOCKQUOTE"
 | |
| ><P
 | |
| > 
 | |
|      <A
 | |
| HREF="http://config.privoxy.org/toggle?set=enable"
 | |
| TARGET="_top"
 | |
| >http://config.privoxy.org/toggle?set=enable</A
 | |
| >
 | |
|    </P
 | |
| ></BLOCKQUOTE
 | |
| ></LI
 | |
| ></UL
 | |
| ></P
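 | |
| ><P
 | |
| > The addresses listed above all share one base URL, so a script can assemble
 | |
|  them instead of hard-coding each one. The following is a minimal sketch; the
 | |
|  function name <TT CLASS="LITERAL">internal_url</TT> is made up for this example,
 | |
|  and the code only builds the address without sending a request.</P
 | |
| ><PRE
 | |
| CLASS="PROGRAMLISTING"
 | |
| >
```python
from urllib.parse import urlencode

BASE = "http://config.privoxy.org/"


def internal_url(page="", **params):
    """Build the URL of an internal page, with optional query parameters."""
    url = BASE + page
    if params:
        url += "?" + urlencode(params)
    return url


print(internal_url())
print(internal_url("show-status"))
print(internal_url("toggle", set="disable"))
print(internal_url("toggle", set="enable"))
```
</PRE
 | |
| ><P
 | |
| > A request for any of these addresses is only answered by Privoxy itself if it
 | |
|  is sent through the proxy.</P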
 | |
| ><P
 | |
| > These may be bookmarked for quick reference. See next.
</P
 | |
| ><DIV
 | |
| CLASS="SECT3"
 | |
| ><H3
 | |
| CLASS="SECT3"
 | |
| ><A
 | |
| NAME="BOOKMARKLETS"
 | |
| >14.2.1. Bookmarklets</A
 | |
| ></H3
 | |
| ><P
 | |
| > Below are some <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"bookmarklets"</SPAN
 | |
| > to allow you to easily access a
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"mini"</SPAN
 | |
| > version of some of <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy's</SPAN
 | |
| >
 | |
|  special pages. They are designed for MS Internet Explorer, but should work
 | |
|  equally well in Netscape, Mozilla, and other browsers which support
 | |
|  JavaScript. They are designed to run directly from your bookmarks - not by
 | |
|  clicking the links below (although that should work for testing).</P
 | |
| ><P
 | |
| > To save them, right-click the link and choose <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"Add to Favorites"</SPAN
 | |
| >
 | |
|  (IE) or <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"Add Bookmark"</SPAN
 | |
| > (Netscape). You will get a warning that
 | |
|  the bookmark <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"may not be safe"</SPAN
 | |
| > - just click OK. Then you can run the
 | |
|  Bookmarklet directly from your favorites/bookmarks. For even faster access,
 | |
|  you can put them on the <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"Links"</SPAN
 | |
| > bar (IE) or the <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"Personal
 | |
|  Toolbar"</SPAN
 | |
| > (Netscape), and run them with a single click. </P
 | |
| ><P
 | |
| > <P
 | |
| ></P
 | |
| ><UL
 | |
| ><LI
 | |
| ><P
 | |
| >    <A
 | |
| HREF="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&set=enabled','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());"
 | |
| TARGET="_top"
 | |
| >Privoxy - Enable</A
 | |
| >
 | |
|    </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >    <A
 | |
| HREF="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&set=disabled','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());"
 | |
| TARGET="_top"
 | |
| >Privoxy - Disable</A
 | |
| >
 | |
|    </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >    <A
 | |
| HREF="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y&set=toggle','ijbstatus','width=250,height=100,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());"
 | |
| TARGET="_top"
 | |
| >Privoxy - Toggle Privoxy</A
 | |
| > (Toggles between enabled and disabled)
 | |
|    </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >    <A
 | |
| HREF="javascript:void(window.open('http://config.privoxy.org/toggle?mini=y','ijbstatus','width=250,height=2,resizable=yes,scrollbars=no,toolbar=no,location=no,directories=no,status=no,menubar=no,copyhistory=no').focus());"
 | |
| TARGET="_top"
 | |
| >Privoxy - View Status</A
 | |
| >
 | |
|    </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >    <A
 | |
| HREF="javascript:void(window.open('http://config.privoxy.org/show-url-info?url='+escape(location.href),'Why').focus());"
 | |
| TARGET="_top"
 | |
| >Privoxy - Why?</A
 | |
| >
 | |
|    </P
 | |
| ></LI
 | |
| ></UL
 | |
| ></P
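 | |
| ><P
 | |
| > The last bookmarklet appends the escaped address of the current page to the
 | |
|  <TT CLASS="LITERAL">show-url-info</TT> URL. The same address can be built outside
 | |
|  the browser; the sketch below does so in Python. The name
 | |
|  <TT CLASS="LITERAL">why_url</TT> is hypothetical, and
 | |
|  <TT CLASS="LITERAL">quote</TT> percent-encodes more characters (for instance the
 | |
|  slash) than JavaScript's <TT CLASS="LITERAL">escape</TT> does, which still
 | |
|  yields a valid query value.</P
 | |
| ><PRE
 | |
| CLASS="PROGRAMLISTING"
 | |
| >
```python
from urllib.parse import quote

SHOW_URL_INFO = "http://config.privoxy.org/show-url-info?url="


def why_url(page_url):
    """Return the show-url-info address for the given page URL."""
    # safe="" makes quote() percent-encode reserved characters such as ":" and "/".
    return SHOW_URL_INFO + quote(page_url, safe="")


print(why_url("http://www.example.com/ads/banner.gif"))
```
</PRE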
 | |
| ><P
 | |
| > Credit: The site which gave us the general idea for these bookmarklets is
 | |
|  <A
 | |
| HREF="http://www.bookmarklets.com/"
 | |
| TARGET="_top"
 | |
| >www.bookmarklets.com</A
 | |
| >. They
 | |
|  have more information about bookmarklets. </P
 | |
| ></DIV
 | |
| ></DIV
 | |
| ><DIV
 | |
| CLASS="SECT2"
 | |
| ><H2
 | |
| CLASS="SECT2"
 | |
| ><A
 | |
| NAME="CHAIN"
 | |
| >14.3. Chain of Events</A
 | |
| ></H2
 | |
| ><P
 | |
| > Let's take a quick look at how some of <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy's</SPAN
 | |
| > 
 | |
|  core features are triggered, and the ensuing sequence of events when a web
 | |
|  page is requested by your browser:</P
 | |
| ><P
 | |
| > <P
 | |
| ></P
 | |
| ><UL
 | |
| ><LI
 | |
| ><P
 | |
| >   First, your web browser requests a web page. The browser knows to send 
 | |
|    the request to <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| >, which will in turn 
 | |
|    relay the request to the remote web server after passing the following 
 | |
|    tests: 
 | |
|   </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >   <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > traps any request for its own internal CGI 
 | |
|    pages (e.g. <A
 | |
| HREF="http://p.p/"
 | |
| TARGET="_top"
 | |
| >http://p.p/</A
 | |
| >) and sends the CGI page back to the browser.
 | |
|   </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >   Next, <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > checks to see if the URL 
 | |
|    matches any <A
 | |
| HREF="actions-file.html#BLOCK"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+block"</SPAN
 | |
| ></A
 | |
| > patterns. If
 | |
|    so, the URL is then blocked, and the remote web server will not be contacted.
 | |
|    <A
 | |
| HREF="actions-file.html#HANDLE-AS-IMAGE"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+handle-as-image"</SPAN
 | |
| ></A
 | |
| > 
 | |
|    and 
 | |
|    <A
 | |
| HREF="actions-file.html#HANDLE-AS-EMPTY-DOCUMENT"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+handle-as-empty-document"</SPAN
 | |
| ></A
 | |
| >
 | |
|    are then checked, and if there is no match, an 
 | |
|    HTML <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"BLOCKED"</SPAN
 | |
| > page is sent back to the browser. Otherwise, if
 | |
|    it does match, an image is returned for the former, and an empty text
 | |
|    document for the latter. The type of image would depend on the setting of
 | |
|    <A
 | |
| HREF="actions-file.html#SET-IMAGE-BLOCKER"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+set-image-blocker"</SPAN
 | |
| ></A
 | |
| >
 | |
|    (blank, checkerboard pattern, or an HTTP redirect to an image elsewhere).
 | |
|   </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >   Untrusted URLs are blocked. If URLs are being added to the
 | |
|    <TT
 | |
| CLASS="FILENAME"
 | |
| >trust</TT
 | |
| > file, then that is done.
 | |
|   </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >   If the URL pattern matches the <A
 | |
| HREF="actions-file.html#FAST-REDIRECTS"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+fast-redirects"</SPAN
 | |
| ></A
 | |
| > action,
 | |
|    it is then processed. Unwanted parts of the requested URL are stripped.
 | |
|   </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >   Now the rest of the client browser's request headers are processed. If any
 | |
|    of these match any of the relevant actions (e.g. <A
 | |
| HREF="actions-file.html#HIDE-USER-AGENT"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+hide-user-agent"</SPAN
 | |
| ></A
 | |
| >,
 | |
|    etc.), headers are suppressed or forged as determined by these actions and
 | |
|    their parameters.
 | |
|   </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >   Now the web server starts sending its response back (i.e. typically a web
 | |
|    page).
 | |
|   </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >   First, the server headers are read and processed to determine, among other
 | |
|    things, the MIME type (document type) and encoding. The headers are then
 | |
|    filtered as determined by the 
 | |
|    <A
 | |
| HREF="actions-file.html#CRUNCH-INCOMING-COOKIES"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+crunch-incoming-cookies"</SPAN
 | |
| ></A
 | |
| >,
 | |
|    <A
 | |
| HREF="actions-file.html#SESSION-COOKIES-ONLY"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+session-cookies-only"</SPAN
 | |
| ></A
 | |
| >,
 | |
|    and <A
 | |
| HREF="actions-file.html#DOWNGRADE-HTTP-VERSION"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+downgrade-http-version"</SPAN
 | |
| ></A
 | |
| >
 | |
|    actions.
 | |
|   </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >   If any <A
 | |
| HREF="actions-file.html#FILTER"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+filter"</SPAN
 | |
| ></A
 | |
| > action
 | |
|    or <A
 | |
| HREF="actions-file.html#DEANIMATE-GIFS"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+deanimate-gifs"</SPAN
 | |
| ></A
 | |
| >
 | |
|    action applies (and the document type fits the action), the rest of the page is
 | |
|    read into memory (up to a configurable limit). Then the filter rules (from
 | |
|    <TT
 | |
| CLASS="FILENAME"
 | |
| >default.filter</TT
 | |
| > and any other filter files) are
 | |
|    processed against the buffered content. Filters are applied in the order
 | |
|    they are specified in one of the filter files. Animated GIFs, if present,
 | |
|    are reduced to either the first or last frame, depending on the action
 | |
|    setting. The entire page, which is now filtered, is then sent by
 | |
|    <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > back to your browser. 
 | |
|   </P
 | |
| ><P
 | |
| >   If neither a <A
 | |
| HREF="actions-file.html#FILTER"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+filter"</SPAN
 | |
| ></A
 | |
| > action
 | |
|    or <A
 | |
| HREF="actions-file.html#DEANIMATE-GIFS"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+deanimate-gifs"</SPAN
 | |
| ></A
 | |
| >
 | |
|    matches, then <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > passes the raw data through 
 | |
|    to the client browser as it becomes available.
 | |
|   </P
 | |
| ></LI
 | |
| ><LI
 | |
| ><P
 | |
| >   As the browser receives the now (possibly filtered) page content, it 
 | |
|    reads and then requests any URLs that may be embedded within the page
 | |
|    source, e.g. ad images, stylesheets, JavaScript, other HTML documents (e.g.
 | |
|    frames), sounds, etc. For each of these objects, the browser issues a
 | |
|    separate request (this is easily viewable in <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy's</SPAN
 | |
| >
 | |
|    logs). And each such request is in turn processed just as above. Note that a
 | |
|    complex web page will have many, many such embedded URLs. If these 
 | |
|    secondary requests are to a different server, then quite possibly a very 
 | |
|    differing set of actions is triggered.
 | |
|   </P
 | |
| ></LI
 | |
| ></UL
 | |
| ></P
 | |
| ><P
 | |
| > NOTE: This is a somewhat simplified overview of what happens with each URL
 | |
|  request. For the sake of brevity and simplicity, we have focused on 
 | |
|  <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy's</SPAN
 | |
| > core features only.</P
 | |
| ></DIV
 | |
| ><DIV
 | |
| CLASS="SECT2"
 | |
| ><H2
 | |
| CLASS="SECT2"
 | |
| ><A
 | |
| NAME="ACTIONSANAT"
 | |
| >14.4. Troubleshooting: Anatomy of an Action</A
 | |
| ></H2
 | |
| ><P
 | |
| > The way <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > applies 
 | |
|  <A
 | |
| HREF="actions-file.html#ACTIONS"
 | |
| >actions</A
 | |
| > and <A
 | |
| HREF="actions-file.html#FILTER"
 | |
| >filters</A
 | |
| >
 | |
|  to any given URL can be complex, and it is not always easy to
 | |
|  understand what is happening. Sometimes we need to be able to
 | |
|  <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >see</I
 | |
| ></SPAN
 | |
| > just what <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > is
 | |
|  doing, especially if something <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > is doing
 | |
|  is causing us a problem inadvertently. It can be a little daunting to look at
 | |
|  the actions and filters files themselves, since they tend to be filled with
 | |
|  <A
 | |
| HREF="appendix.html#REGEX"
 | |
| >regular expressions</A
 | |
| > whose consequences are not
 | |
|  always so obvious. </P
 | |
| ><P
 | |
| > One quick test to see if <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > is causing a problem 
 | |
|  or not, is to disable it temporarily. This should be the first troubleshooting 
 | |
|  step. See <A
 | |
| HREF="appendix.html#BOOKMARKLETS"
 | |
| >the Bookmarklets</A
 | |
| > section for a quick 
 | |
|  and easy way to do this (be sure to flush caches afterward!). Looking at the 
 | |
|  logs is a good idea too. (Note that both the toggle feature and logging are 
 | |
|  enabled via <TT
 | |
| CLASS="FILENAME"
 | |
| >config</TT
 | |
| > file settings, and may need to be 
 | |
|  turned <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"on"</SPAN
 | |
| >.)</P
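><P
> As a rough sketch, the relevant lines in the <TT
CLASS="FILENAME"
>config</TT
> file might look like the following. The directive names and the debug level
 shown are assumptions based on a typical 3.0.x installation, so verify them
 against the comments in your own <TT
CLASS="FILENAME"
>config</TT
> file before relying on them:</P
><P
> <TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="SCREEN"
> # Illustrative excerpt only; check your own config file
 toggle               1        # start with Privoxy enabled
 enable-remote-toggle 1        # allow toggling from the web interface
 logfile              logfile  # write log output to this file
 debug                1        # log each request that is handled</PRE
></TD
></TR
></TABLE
></P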
 | |
| ><P
 | |
| > Another easy troubleshooting step, if you have customized your
 | |
|  installation, is to revert to the installed defaults and see if
 | |
|  that helps. There are times the developers get complaints
 | |
|  about one thing or another, and the problem is more related to a customized
 | |
|  configuration issue.</P
 | |
| ><P
 | |
| > <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > also provides the 
 | |
|  <A
 | |
| HREF="http://config.privoxy.org/show-url-info"
 | |
| TARGET="_top"
 | |
| >http://config.privoxy.org/show-url-info</A
 | |
| >
 | |
|  page that can show us very specifically how <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >actions</SPAN
 | |
| >
 | |
|  are being applied to any given URL. This is a big help for troubleshooting.</P
 | |
| ><P
 | |
| > First, enter one URL (or partial URL) at the prompt, and then
 | |
|  <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > will tell us 
 | |
|  how the current configuration will handle it. This will not
 | |
|  help with filtering effects (i.e. the <A
 | |
| HREF="actions-file.html#FILTER"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+filter"</SPAN
 | |
| ></A
 | |
| > action) from
 | |
|  one of the filter files, since filtering is handled very
 | |
|  differently and is not so easy to trap. It also will not tell you about any other
 | |
|  URLs that may be embedded within the URL you are testing. For instance, images
 | |
|  such as ads are expressed as URLs within the raw page source of HTML pages. So
 | |
|  you will only get info for the actual URL that is pasted into the prompt area
 | |
|  -- not any sub-URLs. If you want to know about embedded URLs like ads, you
 | |
|  will have to dig those out of the HTML source. Use your browser's <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"View
 | |
|  Page Source"</SPAN
 | |
| > option for this. Or right click on the ad, and grab the
 | |
|  URL.</P
 | |
| ><P
 | |
| > Let's try an example, <A
 | |
| HREF="http://google.com"
 | |
| TARGET="_top"
 | |
| >google.com</A
 | |
| >, 
 | |
|  and look at it one section at a time in a sample configuration (your real 
 | |
|  configuration may vary):</P
 | |
| ><P
 | |
| > <TABLE
 | |
| BORDER="0"
 | |
| BGCOLOR="#E0E0E0"
 | |
| WIDTH="100%"
 | |
| ><TR
 | |
| ><TD
 | |
| ><PRE
 | |
| CLASS="SCREEN"
 | |
| > Matches for http://www.google.com:
 | |
| 
 | |
|  In file: default.action <SPAN
 | |
| CLASS="GUIBUTTON"
 | |
| >[ View ]</SPAN
 | |
| > <SPAN
 | |
| CLASS="GUIBUTTON"
 | |
| >[ Edit ]</SPAN
 | |
| >
 | |
| 
 | |
|  {+change-x-forwarded-for{block}
 | |
|  +deanimate-gifs {last}
 | |
|  +fast-redirects {check-decoded-url}
 | |
|  +filter {refresh-tags}
 | |
|  +filter {img-reorder}
 | |
|  +filter {banners-by-size}
 | |
|  +filter {webbugs}
 | |
|  +filter {jumping-windows}
 | |
|  +filter {ie-exploits}
 | |
|  +hide-from-header {block}
 | |
|  +hide-referrer {forge}
 | |
|  +session-cookies-only
 | |
|  +set-image-blocker {pattern}
 | |
| /
 | |
|  
 | |
|  { -session-cookies-only }
 | |
|  .google.com
 | |
| 
 | |
|  { -fast-redirects }
 | |
|  .google.com
 | |
| 
 | |
| In file: user.action <SPAN
 | |
| CLASS="GUIBUTTON"
 | |
| >[ View ]</SPAN
 | |
| > <SPAN
 | |
| CLASS="GUIBUTTON"
 | |
| >[ Edit ]</SPAN
 | |
| >
 | |
| (no matches in this file)  </PRE
 | |
| ></TD
 | |
| ></TR
 | |
| ></TABLE
 | |
| ></P
 | |
| ><P
 | |
| > This is telling us how we have defined our 
 | |
|  <A
 | |
| HREF="actions-file.html#ACTIONS"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"actions"</SPAN
 | |
| ></A
 | |
| >, and
 | |
|  which ones match for our test case, <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"google.com"</SPAN
 | |
| >. 
 | |
|  Displayed are all the actions that are available to us. Remember,
 | |
|  the <TT
 | |
| CLASS="LITERAL"
 | |
| >+</TT
 | |
| > sign denotes <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"on"</SPAN
 | |
| >. <TT
 | |
| CLASS="LITERAL"
 | |
| >-</TT
 | |
| >
 | |
|  denotes <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"off"</SPAN
 | |
| >. So some are <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"on"</SPAN
 | |
| > here, but many 
 | |
|  are <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"off"</SPAN
 | |
| >. Each example we try may provide a slightly different
 | |
|  end result, depending on our configuration directives.</P
 | |
| ><P
 | |
| > The first listing
 | |
|   is for our <TT
 | |
| CLASS="FILENAME"
 | |
| >default.action</TT
 | |
| > file. The large, multi-line
 | |
|   listing, is how the actions are set to match for all URLs, i.e. our default
 | |
|   settings. If you look at your <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"actions"</SPAN
 | |
| > file, this would be the
 | |
|   section just below the <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"aliases"</SPAN
 | |
| > section near the top. This
 | |
|   will apply to all URLs as signified by the single forward slash at the end
 | |
|   of the listing -- <SPAN
 | |
| CLASS="QUOTE"
 | |
| >" / "</SPAN
 | |
| >.</P
 | |
| ><P
 | |
| > But we have defined additional actions that would be exceptions to these general
 | |
|  rules, and then we list specific URLs (or patterns) that these exceptions
 | |
|  would apply to. Last match wins. Just below this then are two explicit
 | |
|  matches for <SPAN
 | |
| CLASS="QUOTE"
 | |
| >".google.com"</SPAN
 | |
| >. The first is negating our previous
 | |
|  cookie setting, which was for <A
 | |
| HREF="actions-file.html#SESSION-COOKIES-ONLY"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+session-cookies-only"</SPAN
 | |
| ></A
 | |
| >
 | |
|  (i.e. not persistent). So we will allow persistent cookies for google, at
 | |
|  least that is how it is in this example. The second turns
 | |
|  <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >off</I
 | |
| ></SPAN
 | |
| > any <A
 | |
| HREF="actions-file.html#FAST-REDIRECTS"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+fast-redirects"</SPAN
 | |
| ></A
 | |
| >
 | |
|  action, allowing redirects to take place unmolested. Note that there is a leading
 | |
|  dot here -- <SPAN
 | |
| CLASS="QUOTE"
 | |
| >".google.com"</SPAN
 | |
| >. This will match any hosts and
 | |
|  sub-domains, in the google.com domain also, such as
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"www.google.com"</SPAN
 | |
| > or <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"mail.google.com"</SPAN
 | |
| >. But it would not 
 | |
|  match <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"www.google.de"</SPAN
 | |
| >! So, apparently, we have these two actions
 | |
|  defined as exceptions to the general rules at the top somewhere in the lower
 | |
|  part of our <TT
 | |
| CLASS="FILENAME"
 | |
| >default.action</TT
 | |
| > file, and
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"google.com"</SPAN
 | |
| > is referenced somewhere in these latter sections.</P
 | |
| ><P
 | |
| > Then, for our <TT
 | |
| CLASS="FILENAME"
 | |
| >user.action</TT
 | |
| > file, we have no hits.
 | |
|  So there is nothing google-specific that we might have added to our own, local
 | |
|  configuration. If there were, those actions would overrule any actions from 
 | |
|  previously processed files, such as <TT
 | |
| CLASS="FILENAME"
 | |
| >default.action</TT
 | |
| >.
 | |
|  <TT
 | |
| CLASS="FILENAME"
 | |
| >user.action</TT
 | |
| > typically has the last word. This is the
 | |
|  best place to put hard and fast exceptions.</P
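><P
> To illustrate, a local override might look like the sketch below. It is a
 hypothetical addition, not part of the sample configuration above. Because
 <TT
CLASS="FILENAME"
>user.action</TT
> is read after <TT
CLASS="FILENAME"
>default.action</TT
>, a section like this would turn session-only cookie handling back on for
 google.com, and it would then appear as a match under
 <TT
CLASS="FILENAME"
>user.action</TT
> in the listing:</P
><P
> <TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="SCREEN"
> # user.action (hypothetical local exception)
 { +session-cookies-only }
  .google.com</PRE
></TD
></TR
></TABLE
></P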
 | |
| ><P
 | |
| > And finally we pull it all together in the bottom section and summarize how
 | |
|  <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > is applying all its <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"actions"</SPAN
 | |
| > 
 | |
|  to <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"google.com"</SPAN
 | |
| >:
</P
 | |
| ><P
 | |
| > <TABLE
 | |
| BORDER="0"
 | |
| BGCOLOR="#E0E0E0"
 | |
| WIDTH="100%"
 | |
| ><TR
 | |
| ><TD
 | |
| ><PRE
 | |
| CLASS="SCREEN"
 | |
| >
 Final results:
 | |
|  
 | |
|  -add-header
 | |
|  -block
 | |
|  +change-x-forwarded-for{block} 
 | |
|  -client-header-filter{hide-tor-exit-notation}
 | |
|  -content-type-overwrite
 | |
|  -crunch-client-header
 | |
|  -crunch-if-none-match
 | |
|  -crunch-incoming-cookies
 | |
|  -crunch-outgoing-cookies
 | |
|  -crunch-server-header
 | |
|  +deanimate-gifs {last}
 | |
|  -downgrade-http-version
 | |
|  -fast-redirects
 | |
|  -filter {js-events}
 | |
|  -filter {content-cookies}
 | |
|  -filter {all-popups}
 | |
|  -filter {banners-by-link}
 | |
|  -filter {tiny-textforms}
 | |
|  -filter {frameset-borders}
 | |
|  -filter {demoronizer}
 | |
|  -filter {shockwave-flash}
 | |
|  -filter {quicktime-kioskmode}
 | |
|  -filter {fun}
 | |
|  -filter {crude-parental}
 | |
|  -filter {site-specifics}
 | |
|  -filter {js-annoyances}
 | |
|  -filter {html-annoyances}
 | |
|  +filter {refresh-tags}
 | |
|  -filter {unsolicited-popups}
 | |
|  +filter {img-reorder}
 | |
|  +filter {banners-by-size}
 | |
|  +filter {webbugs}
 | |
|  +filter {jumping-windows}
 | |
|  +filter {ie-exploits}
 | |
|  -filter {google}
 | |
|  -filter {yahoo}
 | |
|  -filter {msn}
 | |
|  -filter {blogspot}
 | |
|  -filter {no-ping}
 | |
|  -force-text-mode
 | |
|  -handle-as-empty-document
 | |
|  -handle-as-image
 | |
|  -hide-accept-language
 | |
|  -hide-content-disposition
 | |
|  +hide-from-header {block}
 | |
|  -hide-if-modified-since
 | |
|  +hide-referrer {forge}
 | |
|  -hide-user-agent
 | |
|  -limit-connect
 | |
|  -overwrite-last-modified
 | |
|  -prevent-compression
 | |
|  -redirect
 | |
|  -server-header-filter{xml-to-html}
 | |
|  -server-header-filter{html-to-xml} 
 | |
|  -session-cookies-only
 | |
|  +set-image-blocker {pattern} </PRE
 | |
| ></TD
 | |
| ></TR
 | |
| ></TABLE
 | |
| ></P
 | |
| ><P
 | |
| > Notice that the only differences from the previous listing are 
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"fast-redirects"</SPAN
 | |
| > and <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"session-cookies-only"</SPAN
 | |
| >,
 | |
|  which are turned off specifically for this site in our configuration, 
 | |
|  and thus show in the <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"Final Results"</SPAN
 | |
| >.</P
 | |
| ><P
 | |
| > Now another example, <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"ad.doubleclick.net"</SPAN
 | |
| >:</P
 | |
| ><P
 | |
| > <TABLE
 | |
| BORDER="0"
 | |
| BGCOLOR="#E0E0E0"
 | |
| WIDTH="100%"
 | |
| ><TR
 | |
| ><TD
 | |
| ><PRE
 | |
| CLASS="SCREEN"
 | |
| >
 { +block{Domains starts with "ad"} }
 | |
|   ad*.
 | |
| 
 | |
|  { +block{Domain contains "ad"} }
 | |
|   .ad.
 | |
| 
 | |
|  { +block{Doubleclick banner server} +handle-as-image }
 | |
|   .[a-vx-z]*.doubleclick.net</PRE
 | |
| ></TD
 | |
| ></TR
 | |
| ></TABLE
 | |
| ></P
 | |
| ><P
 | |
| > We'll just show the interesting part here - the explicit matches. It is 
 | |
|  matched three different times. Two <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+block{}"</SPAN
 | |
| > sections, 
 | |
|  and a <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+block{} +handle-as-image"</SPAN
 | |
| >,
 | |
|  which is the expanded form of one of our aliases that had been defined as: 
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+block-as-image"</SPAN
 | |
| >. (<A
 | |
| HREF="actions-file.html#ALIASES"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"Aliases"</SPAN
 | |
| ></A
 | |
| > are defined in
 | |
|  the first section of the actions file and typically used to combine more 
 | |
|  than one action.)</P
 | |
| ><P
 | |
| > Any one of these would have done the trick and blocked this as an unwanted 
 | |
|  image. This is unnecessarily redundant since the last case effectively 
 | |
|  would also cover the first. No point in taking chances with these guys 
 | |
|  though ;-) Note that if you want an ad or obnoxious 
 | |
|  URL to be invisible, it should be defined as <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"ad.doubleclick.net"</SPAN
 | |
| >
 | |
|  is done here -- as both a <A
 | |
| HREF="actions-file.html#BLOCK"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+block{}"</SPAN
 | |
| ></A
 | |
| >
 | |
|  <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >and</I
 | |
| ></SPAN
 | |
| > an 
 | |
|  <A
 | |
| HREF="actions-file.html#HANDLE-AS-IMAGE"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+handle-as-image"</SPAN
 | |
| ></A
 | |
| >.
 | |
|  The custom alias <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"<TT
 | |
| CLASS="LITERAL"
 | |
| >+block-as-image</TT
 | |
| >"</SPAN
 | |
| > just
 | |
|  simplifies the process and makes it more readable.</P
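><P
> As a sketch, such an alias could be defined in the alias section at the top
 of an actions file roughly as follows. The block reason text is only an
 illustration, and the definition in your own files may differ:</P
><P
> <TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="SCREEN"
> {{alias}}
 # One name that stands for two actions (reason text is illustrative)
 +block-as-image = +block{Blocked image.} +handle-as-image</PRE
></TD
></TR
></TABLE
></P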
 | |
| ><P
 | |
| > One last example. Let's try <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"http://www.example.net/adsl/HOWTO/"</SPAN
 | |
| >.
 | |
|  This one is giving us problems. We are getting a blank page. Hmmm ...</P
 | |
| ><P
 | |
| > <TABLE
 | |
| BORDER="0"
 | |
| BGCOLOR="#E0E0E0"
 | |
| WIDTH="100%"
 | |
| ><TR
 | |
| ><TD
 | |
| ><PRE
 | |
| CLASS="SCREEN"
 | |
| >
 Matches for http://www.example.net/adsl/HOWTO/:
 | |
| 
 | |
|  In file: default.action <SPAN
 | |
| CLASS="GUIBUTTON"
 | |
| >[ View ]</SPAN
 | |
| > <SPAN
 | |
| CLASS="GUIBUTTON"
 | |
| >[ Edit ]</SPAN
 | |
| >
 | |
| 
 | |
|  {-add-header 
 | |
|   -block
 | |
|   +change-x-forwarded-for{block} 
 | |
|   -client-header-filter{hide-tor-exit-notation}
 | |
|   -content-type-overwrite
 | |
|   -crunch-client-header
 | |
|   -crunch-if-none-match
 | |
|   -crunch-incoming-cookies
 | |
|   -crunch-outgoing-cookies
 | |
|   -crunch-server-header
 | |
|   +deanimate-gifs 
 | |
|   -downgrade-http-version 
 | |
|   +fast-redirects {check-decoded-url}
 | |
|   -filter {js-events}
 | |
|   -filter {content-cookies}
 | |
|   -filter {all-popups}
 | |
|   -filter {banners-by-link}
 | |
|   -filter {tiny-textforms}
 | |
|   -filter {frameset-borders}
 | |
|   -filter {demoronizer}
 | |
|   -filter {shockwave-flash}
 | |
|   -filter {quicktime-kioskmode}
 | |
|   -filter {fun}
 | |
|   -filter {crude-parental}
 | |
|   -filter {site-specifics}
 | |
|   -filter {js-annoyances}
 | |
|   -filter {html-annoyances}
 | |
|   +filter {refresh-tags}
 | |
|   -filter {unsolicited-popups}
 | |
|   +filter {img-reorder}
 | |
|   +filter {banners-by-size}
 | |
|   +filter {webbugs}
 | |
|   +filter {jumping-windows}
 | |
|   +filter {ie-exploits}
 | |
|   -filter {google}
 | |
|   -filter {yahoo}
 | |
|   -filter {msn}
 | |
|   -filter {blogspot}
 | |
|   -filter {no-ping}
 | |
|   -force-text-mode
 | |
|   -handle-as-empty-document
 | |
|   -handle-as-image 
 | |
|   -hide-accept-language
 | |
|   -hide-content-disposition  
 | |
|   +hide-from-header{block} 
 | |
|   +hide-referer{forge} 
 | |
|   -hide-user-agent 
 | |
|   -overwrite-last-modified
 | |
|   +prevent-compression 
 | |
|   -redirect
 | |
|   -server-header-filter{xml-to-html}
 | |
|   -server-header-filter{html-to-xml} 
 | |
|   +session-cookies-only 
 | |
|   +set-image-blocker{blank} }
 | |
|    /
 | |
| 
 | |
|  { +block{Path contains "ads".} +handle-as-image }
 | |
|   /ads</PRE
 | |
| ></TD
 | |
| ></TR
 | |
| ></TABLE
 | |
| ></P
 | |
| ><P
 | |
| > Oops, the <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"/adsl/"</SPAN
 | |
| > is matching <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"/ads"</SPAN
 | |
| > in our 
 | |
|  configuration! But we did not want this at all! Now we see why we get the
 | |
|  blank page. It is actually triggering two different actions here, and 
 | |
|  the effects are aggregated so that the URL is blocked, and <SPAN
 | |
| CLASS="APPLICATION"
 | |
| >Privoxy</SPAN
 | |
| > is told 
 | |
|  to treat the block as if it were an image. But this is, of course, all wrong.
 | |
|   We could now add a new action below this (or better in our own
 | |
|   <TT
 | |
| CLASS="FILENAME"
 | |
| >user.action</TT
 | |
| > file) that explicitly
 | |
|   <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >un</I
 | |
| ></SPAN
 | |
| > blocks (
 | |
|   <A
 | |
| HREF="actions-file.html#BLOCK"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"{-block}"</SPAN
 | |
| ></A
 | |
| >) paths with
 | |
|   <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"adsl"</SPAN
 | |
| > in them (remember, last match in the configuration
 | |
|   wins). There are various ways to handle such exceptions. Example:</P
 | |
| ><P
 | |
| > <TABLE
 | |
| BORDER="0"
 | |
| BGCOLOR="#E0E0E0"
 | |
| WIDTH="100%"
 | |
| ><TR
 | |
| ><TD
 | |
| ><PRE
 | |
| CLASS="SCREEN"
 | |
| >
 { -block }
 | |
|   /adsl</PRE
 | |
| ></TD
 | |
| ></TR
 | |
| ></TABLE
 | |
| ></P
 | |
| ><P
 | |
| > Now the page displays ;-) 
 | |
|  Remember to flush your browser's caches when making these kinds of changes to
 | |
|  your configuration to ensure that you get a freshly delivered page! Or, try
 | |
|  using <TT
 | |
| CLASS="LITERAL"
 | |
| >Shift+Reload</TT
 | |
| >.</P
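><P
> One of the other ways to handle this exception, sketched here under the
 assumption that only this one site is affected, is to limit the exception to
 the host in question and to switch off the image handling as well, so that
 the rest of the <SPAN
CLASS="QUOTE"
>"/ads"</SPAN
> rule keeps working everywhere else:</P
><P
> <TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="SCREEN"
> # user.action: unblock the ADSL HOWTO on this host only
 { -block -handle-as-image }
  www.example.net/adsl/</PRE
></TD
></TR
></TABLE
></P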
 | |
| ><P
 | |
| > But now what about a situation where we get no explicit matches like 
 | |
|  we did with:</P
 | |
| ><P
 | |
| > <TABLE
 | |
| BORDER="0"
 | |
| BGCOLOR="#E0E0E0"
 | |
| WIDTH="100%"
 | |
| ><TR
 | |
| ><TD
 | |
| ><PRE
 | |
| CLASS="SCREEN"
 | |
| >
 { +block{Path contains "ads".} +handle-as-image }
 | |
|  /ads</PRE
 | |
| ></TD
 | |
| ></TR
 | |
| ></TABLE
 | |
| ></P
 | |
| ><P
 | |
| > That actually was very helpful and pointed us quickly to where the problem
 | |
|  was. If you don't get this kind of match, then it means one of the default 
 | |
|  rules in the first section of <TT
 | |
| CLASS="FILENAME"
 | |
| >default.action</TT
 | |
| > is causing
 | |
|  the problem. This would require some guesswork, and maybe a little trial and
 | |
|  error to isolate the offending rule. One likely cause would be one of the
 | |
|  <A
 | |
| HREF="actions-file.html#FILTER"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+filter"</SPAN
 | |
| ></A
 | |
| > actions.
 | |
|  These tend to be harder to troubleshoot.
 | |
|  Try adding the URL for the site to one of the aliases that turn off
 | |
|  <A
 | |
| HREF="actions-file.html#FILTER"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+filter"</SPAN
 | |
| ></A
 | |
| >:</P
 | |
| ><P
 | |
| > <TABLE
 | |
| BORDER="0"
 | |
| BGCOLOR="#E0E0E0"
 | |
| WIDTH="100%"
 | |
| ><TR
 | |
| ><TD
 | |
| ><PRE
 | |
| CLASS="SCREEN"
 | |
| >
 { shop }
 | |
|  .quietpc.com
 | |
|  .worldpay.com   # for quietpc.com
 | |
|  .jungle.com
 | |
|  .scan.co.uk
 | |
|  .forbes.com</PRE
 | |
| ></TD
 | |
| ></TR
 | |
| ></TABLE
 | |
| ></P
 | |
| ><P
 | |
| > <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"<TT
 | |
| CLASS="LITERAL"
 | |
| >{ shop }</TT
 | |
| >"</SPAN
 | |
| > is an <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"alias"</SPAN
 | |
| > that expands to 
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"<TT
 | |
| CLASS="LITERAL"
 | |
| >{ -filter -session-cookies-only }</TT
 | |
| >"</SPAN
 | |
| >.
 | |
|  Or you could do your own exception to negate filtering:
</P
 | |
| ><P
 | |
| > <TABLE
 | |
| BORDER="0"
 | |
| BGCOLOR="#E0E0E0"
 | |
| WIDTH="100%"
 | |
| ><TR
 | |
| ><TD
 | |
| ><PRE
 | |
| CLASS="SCREEN"
 | |
| >
 { -filter }
 | |
|  # Disable ALL filter actions for sites in this section
 | |
|  .forbes.com
 | |
|  developer.ibm.com
 | |
|  localhost</PRE
 | |
| ></TD
 | |
| ></TR
 | |
| ></TABLE
 | |
| ></P
 | |
| ><P
 | |
| > This would turn off all filtering for these sites. This is best
 | |
|  put in <TT
 | |
| CLASS="FILENAME"
 | |
| >user.action</TT
 | |
| >, for local site
 | |
|  exceptions. Note that when a simple domain pattern is used by itself (without
 | |
|  the subsequent path portion), all sub-pages within that domain are included 
 | |
|  automatically in the scope of the action.</P
 | |
| ><P
 | |
| > Images that are inexplicably being blocked, may well be hitting the 
 | |
| <A
 | |
| HREF="actions-file.html#FILTER-BANNERS-BY-SIZE"
 | |
| ><SPAN
 | |
| CLASS="QUOTE"
 | |
| >"+filter{banners-by-size}"</SPAN
 | |
| ></A
 | |
| >
 | |
|  rule, which assumes 
 | |
|  that images of certain sizes are ad banners (works well 
 | |
|  <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >most of the time</I
 | |
| ></SPAN
 | |
| >  since these tend to be standardized).</P
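><P
> If that filter turns out to be the culprit, it can be switched off on its
 own rather than disabling all filtering. A minimal sketch, with
 <SPAN
CLASS="QUOTE"
>".example.com"</SPAN
> standing in for the affected site:</P
><P
> <TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="SCREEN"
> { -filter{banners-by-size} }
 # Leave all other filters active for this (placeholder) site
 .example.com</PRE
></TD
></TR
></TABLE
></P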
 | |
| ><P
 | |
| > <SPAN
 | |
| CLASS="QUOTE"
 | |
| >"<TT
 | |
| CLASS="LITERAL"
 | |
| >{ fragile }</TT
 | |
| >"</SPAN
 | |
| > is an alias that disables the
 | |
|  actions that are most likely to cause trouble. This can be used as a
 | |
|  last resort for problem sites. </P
 | |
| ><P
 | |
| > <TABLE
 | |
| BORDER="0"
 | |
| BGCOLOR="#E0E0E0"
 | |
| WIDTH="100%"
 | |
| ><TR
 | |
| ><TD
 | |
| ><PRE
 | |
| CLASS="SCREEN"
 | |
| >
 { fragile }
 | |
|  # Handle with care: easy to break
 | |
|  mail.google.
 | |
|  mybank.example.com</PRE
 | |
| ></TD
 | |
| ></TR
 | |
| ></TABLE
 | |
| ></P
 | |
| ><P
 | |
| > <SPAN
 | |
| CLASS="emphasis"
 | |
| ><I
 | |
| CLASS="EMPHASIS"
 | |
| >Remember to flush caches!</I
 | |
| ></SPAN
 | |
| > Note that the 
 | |
|  <TT
 | |
| CLASS="LITERAL"
 | |
| >mail.google.</TT
 | |
| > reference lacks the TLD portion (e.g. 
 | |
|  <SPAN
 | |
| CLASS="QUOTE"
 | |
| >".com"</SPAN
 | |
| >). This will effectively match any TLD with 
 | |
|  <TT
 | |
| CLASS="LITERAL"
 | |
| >google</TT
 | |
| > in it, such as <TT
 | |
| CLASS="LITERAL"
 | |
| >mail.google.de</TT
 | |
| >, 
 | |
|  just as an example.</P
 | |
| ><P
 | |
| > 
 | |
|  If this still does not work, you will have to go through the remaining
 | |
|  actions one by one to find which one(s) is causing the problem.</P
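><P
> One way to organize that search, shown only as a sketch with a placeholder
 site name, is to add a temporary section to <TT
CLASS="FILENAME"
>user.action</TT
> that disables a single suspect action, reload the page, and then move on to
 the next candidate if nothing changes:</P
><P
> <TABLE
BORDER="0"
BGCOLOR="#E0E0E0"
WIDTH="100%"
><TR
><TD
><PRE
CLASS="SCREEN"
> # Temporary test section: disable one action at a time
 { -fast-redirects }
  .example.com

 # If the problem persists, replace the action above with the next
 # candidate from the "Final results" listing, for example:
 # { -hide-referrer }
 #  .example.com</PRE
></TD
></TR
></TABLE
></P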
 | |
| ></DIV
 | |
| ></DIV
 | |
| ><DIV
 | |
| CLASS="NAVFOOTER"
 | |
| ><HR
 | |
| ALIGN="LEFT"
 | |
| WIDTH="100%"><TABLE
 | |
| SUMMARY="Footer navigation table"
 | |
| WIDTH="100%"
 | |
| BORDER="0"
 | |
| CELLPADDING="0"
 | |
| CELLSPACING="0"
 | |
| ><TR
 | |
| ><TD
 | |
| WIDTH="33%"
 | |
| ALIGN="left"
 | |
| VALIGN="top"
 | |
| ><A
 | |
| HREF="seealso.html"
 | |
| ACCESSKEY="P"
 | |
| >Prev</A
 | |
| ></TD
 | |
| ><TD
 | |
| WIDTH="34%"
 | |
| ALIGN="center"
 | |
| VALIGN="top"
 | |
| ><A
 | |
| HREF="index.html"
 | |
| ACCESSKEY="H"
 | |
| >Home</A
 | |
| ></TD
 | |
| ><TD
 | |
| WIDTH="33%"
 | |
| ALIGN="right"
 | |
| VALIGN="top"
 | |
| > </TD
 | |
| ></TR
 | |
| ><TR
 | |
| ><TD
 | |
| WIDTH="33%"
 | |
| ALIGN="left"
 | |
| VALIGN="top"
 | |
| >See Also</TD
 | |
| ><TD
 | |
| WIDTH="34%"
 | |
| ALIGN="center"
 | |
| VALIGN="top"
 | |
| > </TD
 | |
| ><TD
 | |
| WIDTH="33%"
 | |
| ALIGN="right"
 | |
| VALIGN="top"
 | |
| > </TD
 | |
| ></TR
 | |
| ></TABLE
 | |
| ></DIV
 | |
| ></BODY
 | |
| ></HTML
 | |
| > |