Showing posts with label P3P. Show all posts

Thursday, June 5, 2008

Configuring Tomcat to send P3P Compact Policy in HTTP response header

Some time ago I analyzed P3P compact policy headers and how they interact with IE6+ to enable third-party cookies. The other day I helped someone implement this in Tomcat. It was rather easy, since the Servlet specification has the filter feature, which fits this like a glove.

Here's example code for such a filter, called StaticResponseHeaderFilter, which has been generalized so that it can add any static HTTP header to the response. An important point is that I expect one policy to cover the whole site. I'm not sure whether anyone differentiates the policy for different pages; I guess it could make sense in theory, but it could very easily confuse users.

    package org.sweetxml.webutil;

    import java.io.IOException;

    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;
    import javax.servlet.http.HttpServletResponse;

    /**
     * Adds one static HTTP header, configured through init parameters,
     * to every response that passes through the filter.
     */
    public class StaticResponseHeaderFilter implements Filter {

      private static final String HEADERNAME_INIT_PARAM = "headername";
      private static final String HEADERVALUE_INIT_PARAM = "headervalue";

      private String headername = null;
      private String headervalue = null;

      @Override
      public void init(FilterConfig filterConfig) throws ServletException {
        headername = filterConfig.getInitParameter(HEADERNAME_INIT_PARAM);
        headervalue = filterConfig.getInitParameter(HEADERVALUE_INIT_PARAM);
        // fail at deployment time instead of sending a broken header later
        if (headername == null || headervalue == null) {
          throw new ServletException("Init parameters 'headername' and 'headervalue' are both required");
        }
      }

      @Override
      public void doFilter(ServletRequest request, ServletResponse response, FilterChain filterChain)
          throws IOException, ServletException {

        // Set the HTTP header before passing the request on: once the resource
        // has written enough output the response is committed, and headers set
        // after that point are ignored.
        if (response instanceof HttpServletResponse) {
          ((HttpServletResponse) response).setHeader(headername, headervalue);
        }
        // pass on to other filters or the resource
        filterChain.doFilter(request, response);
      }

      @Override
      public void destroy() {
        // nothing to clean up
      }
    }

At first I had implemented it with a response wrapper, but on reflection I came to believe that it wasn't necessary (my experience with implementing filters is limited, including knowing when the wrapper classes are needed).

It's configured like this in web.xml:

    <filter>
      <filter-name>StaticResponseHeaderFilter</filter-name>
      <filter-class>org.sweetxml.webutil.StaticResponseHeaderFilter</filter-class>
      <init-param>
        <param-name>headername</param-name>
        <param-value>P3P</param-value>
      </init-param>
      <init-param>
        <param-name>headervalue</param-name>
        <param-value>CP="CAO PSA OUR"</param-value>
      </init-param>
    </filter>
    <filter-mapping>
      <filter-name>StaticResponseHeaderFilter</filter-name>
      <url-pattern>/*</url-pattern>
    </filter-mapping>

And a quick check with wget gives:

wget -S http://localhost:8080/p3pdemo/
--23:20:08--  http://localhost:8080/p3pdemo/
Resolving localhost... ::1
Connecting to localhost|::1|:8080... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Server: Apache-Coyote/1.1
  Set-Cookie: JSESSIONID=8BFBF636E28A43E0F292345EAD7F6CB7; Path=/p3pdemo
  P3P: CP="CAO PSA OUR"
  Content-Type: text/html;charset=UTF-8
  Content-Length: 1609
  Date: Thu, 05 Jun 2008 21:20:08 GMT
  Connection: keep-alive
Length: 1609 (1.6K) [text/html]

Very easy indeed and the filter feature was just perfect.

As for implementing it in Apache HTTPD see the Appendix in The Platform for Privacy Preferences 1.0 Deployment Guide (W3C Note 11 February 2002) and for IIS

I can only conclude that the hardest part of implementing P3P is getting the policy right and representative of what the site actually does, not implementing it on the servers.

Sunday, February 24, 2008

W3C draft on a 'policy' to allow webapps to selectively loosen the same-origin restriction

The news feed from W3C has been busy since New Year and I still haven't had time to catch up with it all. One of the news items from a fortnight ago is on Access Control for Cross-site Requests. It's from the Web Application Formats Working Group (a first for me), which has published a W3C Working Draft on Access Control for Cross-site Requests (14 February 2008). The introduction reveals what it's all about:

Web application technologies commonly apply same-origin restrictions to network requests. These restrictions prevent a Web application running from one origin from obtaining data retrieved from another origin, and also limit the amount of unsafe HTTP requests that can be automatically launched toward destinations that differ from the running application's origin.

It sounds truly great if this can be brought up to speed with current use and needs, though I'll admit I'm a bit skeptical about progress on these kinds of specifications, and not least about the implementations that follow. Later it says:

This specification is a building block for other specifications, so-called hosting specifications, which will define the precise model by which this specification is used. Among others, such specifications are likely to include XMLHttpRequest Level 2, XBL 2.0, and HTML 5 (for its server-sent events feature).

Sooooo, this could take some time to come into common use, just like P3P and XHTML 5 (how about just XHTML 1.0 browser support?). Some things just take time, and the requirements seem solid enough.

Friday, October 19, 2007

A minimal P3P Compact policy - a suggestion by Microsoft

Back in July I had a look at P3P compact policies, and in the post An analysis of a P3P compact policy example I worked through the example NOI DSP COR NID PSAo OUR IND. The other day I came across, by coincidence, an entry in the Microsoft Knowledge Base: Session variables are lost if you use FRAMESET in Internet Explorer 6 [HTML] (in Danish: Sessionsvariabler går tabt ved brug af FRAMESET i Internet Explorer 6). This article suggests the minimal policy CAO PSA OUR to declare that no malicious actions are performed with the user's data.

Searching further, I found quite a few articles and postings with plenty of positive comments, which left me with the feeling that this is used as a technical fix to overcome problems with, for example, third-party content in iframes. It's not that I think IE6+ got it right and FF 2.0 got it wrong, but using the P3P compact policy without understanding its semantics surely wasn't what Microsoft had in mind when they chose to make IE 6+ demand P3P compact policies (for third parties), and certainly not what the P3P working group had in mind!

To turn the problem around, I'll make my guess at what the policy CAO PSA OUR means, and thereby give myself and other web administrators the possibility to check whether it fits our websites.

CAO (<contact-and-other/>)
Identified Contact Information and Other Identified Data: access is given to identified online and physical contact information as well as to certain other identified data.
PSA (<pseudo-analysis/>)
Pseudonymous Analysis: Information may be used to create or build a record of a particular individual or computer that is tied to a pseudonymous identifier, without tying identified data (such as name, address, phone number, or email address) to the record. This profile will be used to determine the habits, interests, or other characteristics of individuals for purpose of research, analysis and reporting, but it will not be used to attempt to identify specific individuals. For example, a marketer may wish to understand the interests of visitors to different portions of a Web site.
OUR (<ours/>)
Ourselves and/or entities acting as our agents or entities for whom we are acting as an agent: An agent in this instance is defined as a third party that processes data only on behalf of the service provider for the completion of the stated purposes. (e.g., the service provider and its printing bureau which prints address labels and does nothing further with the information.)

My best shot at the meaning of this is:

My website does collect some PII for my own use; you can probably check online for yourself what's stored about you. My website may use profiling keyed by a pseudonymous identifier.

A 2002 posting on Oreillynet, P3P in IE6 : Frustrating Failure, has a comment with an even more minimal policy and also a comment from one of the authors of the P3P specification.

Monday, August 13, 2007

Privacy Impact Assessment (PIA) - Aliens have a right for P3P as well

The OMB has written Guidance for Implementing the Privacy Provisions of the E-Government Act of 2002 (M-03-22). It's a rather long and terse document that even scares junior spec readers like myself. I ran into it when searching for privacy initiatives. In Attachment A - E-Government Act Section 208 Implementation Guidance, section II. Privacy Impact Assessment, subsection A. Definitions, there's a glossary.

Individual - means a citizen of the United States or an alien lawfully admitted for permanent residence. (Agencies may, consistent with individual practice, choose to extend the protections of the Privacy Act and E-Government Act to businesses, sole proprietors, aliens, etc.)

As English is not my native language, this gave me a great laugh :-). The meaning is less spectacular, though, as can be seen in Wikipedia's entry on the term Alien in U.S. law:

In U.S. law, an alien is a person who owes political allegiance to another country or government and not a native or naturalized citizen of the land where they are found.

The sixth term defines the PIA (my formatting):

Privacy Impact Assessment (PIA) - is an analysis of how information is handled:

  1. to ensure handling conforms to applicable legal, regulatory, and policy requirements regarding privacy,
  2. to determine the risks and effects of collecting, maintaining and disseminating information in identifiable form in an electronic information system, and
  3. to examine and evaluate protections and alternative processes for handling information to mitigate potential privacy risks.

This all sounds fair, though it's a lot of hard work to create and keep up to date. The seventh and last term:

Privacy policy in standardized machine-readable format - means a statement about site privacy practices written in a standard computer language (not English text) that can be read automatically by a web browser.

Here the obvious candidate is P3P. In section IV. Privacy Policies in Machine-Readable Formats the requirements are:

  1. Actions.
    1. Agencies must adopt machine readable technology that alerts users automatically about whether site privacy practices match their personal privacy preferences. Such technology enables users to make an informed choice about whether to conduct business with that site.
    2. OMB encourages agencies to adopt other privacy protective tools that become available as the technology advances.
  2. Reporting Requirement. Agencies must develop a timetable for translating their privacy policies into a standardized machine-readable format. The timetable must include achievable milestones that show the agency’s progress toward implementation over the next year. Agencies must include this timetable in their reports to OMB (see Section VII).

The last-mentioned report is not easy-peasy. In section VII. Reporting Requirements it goes:

Agencies are required to submit an annual report on compliance with this guidance to OMB as part of their annual E-Government Act status report. The first reports are due to OMB by December 15, 2003. All agencies that use information technology systems and conduct electronic information collection activities must complete a report on compliance with this guidance, whether or not they submit budgets to OMB.

Reports must address the following four elements:

  1. Information technology systems or information collections for which PIAs were conducted. Include the mechanism by which the PIA was made publicly available (website, Federal Register, other), whether the PIA was made publicly available in full, summary form or not at all (if in summary form or not at all, explain), and, if made available in conjunction with an ICR or SOR, the publication date.
  2. Persistent tracking technology uses. If persistent tracking technology is authorized, include the need that compels use of the technology, the safeguards instituted to protect the information collected, the agency official approving use of the tracking technology, and the actual privacy policy notification of such use.
  3. Agency achievement of goals for machine readability: Include goals for and progress toward achieving compatibility of privacy policies with machine-readable privacy protection technology.
  4. Contact information. Include the individual(s) (name and title) appointed by the head of the Executive Department or agency to serve as the agency’s principal contact(s) for information technology/web matters and the individual (name and title) primarily responsible for privacy policies.

Whoa, this is done right and with the required documentation. When I get the energy and courage I'll go look for such a policy on a U.S. Government website.

Tuesday, July 17, 2007

An analysis of a P3P compact policy example

Since version 6.0, IE has taken P3P policies into account when deciding whether to accept third-party cookies. This is a quick look at P3P, based on an example.

The Platform for Privacy Preferences (P3P) Project homepage gives the purpose as:

The Platform for Privacy Preferences Project (P3P) enables Websites to express their privacy practices in a standard format that can be retrieved automatically and interpreted easily by user agents. P3P user agents will allow users to be informed of site practices (in both machine- and human-readable formats) and to automate decision-making based on these practices when appropriate. Thus users need not read the privacy policies at every site they visit.

What strikes me as a bad smell is that the P3P work has been suspended in the middle of creating version 1.1. I think there could be three reasons for that:

  • The work is done, it hit spot on and conquered the world
  • The former version
  • It's a dead end and it's not attracting any attention

It's not clear to me exactly which is the case here. It states: "The P3P Specification Working Group took this step as there was insufficient support from current Browser implementers for the implementation of P3P 1.1." This is a fair argument, but it still makes me slightly uneasy, since privacy is on the rise.

I'll go for The Platform for Privacy Preferences 1.0 (P3P1.0) Specification, since the group note is not for me until I realize that everybody is following it. Since I'm not implementing P3P (yet), this seems like the one for me, though a clear primer would probably have been best. I usually don't read a spec from end to end; instead I take some examples and follow them into the spec, sort of attacking it from the side while searching for specific information.

The example

In FDIM and third party cookies I discovered that the Gemius host ndk.hit.gemius.pl sends a compact P3P policy in the header P3P: CP="NOI DSP COR NID PSAo OUR IND", so what does that code mean?

This way of conveying the policy is defined in 2.2.2 HTTP Headers. It is a case of the compact-policy-field defined in 4. Compact Policies. That leaves the real meat: NOI DSP COR NID PSAo OUR IND.
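Pulling the tokens out of such a header value is simple enough to sketch in Java. This is only an illustration under my own assumptions: the class CompactPolicyHeader and its method tokens are hypothetical names, not part of any P3P library, and the parsing is a plain regular expression rather than a full implementation of the header grammar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of any P3P library. */
class CompactPolicyHeader {

    // Matches the CP="..." field of a P3P header value.
    private static final Pattern CP = Pattern.compile("CP\\s*=\\s*\"([^\"]*)\"");

    /** Returns the compact policy tokens, or an empty list if there is no CP field. */
    static List<String> tokens(String headerValue) {
        Matcher m = CP.matcher(headerValue);
        if (!m.find()) {
            return new ArrayList<>();
        }
        String body = m.group(1).trim();
        if (body.isEmpty()) {
            return new ArrayList<>();
        }
        // Tokens are separated by whitespace.
        return Arrays.asList(body.split("\\s+"));
    }

    public static void main(String[] args) {
        // Header value taken from the example above.
        System.out.println(tokens("CP=\"NOI DSP COR NID PSAo OUR IND\""));
    }
}
```

Run as is, it prints the seven tokens of the example as a list, which are then looked up one by one.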


NOI is described in 4.2.1 Compact ACCESS and stands for <nonident/>, but that's just a reference to the full policy for 3.2.5 The ACCESS element, and it means Web site does not collect identified data.


DSP means there are some DISPUTES (4.2.2 Compact DISPUTES), which is really defined in 3.2.6 The DISPUTES element, described by:

A policy SHOULD contain a DISPUTES-GROUP element, which contains one or more DISPUTES elements. These elements describe dispute resolution procedures that may be followed for disputes about a services' privacy practices.

The index page for the host just delivers an empty HTML structure, so there's no reference to the full policy there, but the well-known location /w3c/p3p.xml exists and contains:

<?xml version="1.0" encoding="UTF-8" ?>
<META>
   <POLICY-REFERENCES>
      <POLICY-REF about="/w3c/policy.xml">
         <INCLUDE>/*</INCLUDE>
         <COOKIE-INCLUDE>* * * </COOKIE-INCLUDE>
      </POLICY-REF>
   </POLICY-REFERENCES>
</META>
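To follow the reference programmatically, the about attribute of each POLICY-REF has to be read. Here's a minimal Java sketch using the JDK's DOM parser; the class name PolicyRefReader and the method policyUris are my own hypothetical names, and the sketch assumes the file is small and well-formed and ignores INCLUDE and COOKIE-INCLUDE.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper for illustration; not part of any P3P library. */
class PolicyRefReader {

    /** Returns the value of the about attribute of every POLICY-REF element. */
    static List<String> policyUris(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList refs = doc.getElementsByTagName("POLICY-REF");
            List<String> uris = new ArrayList<>();
            for (int i = 0; i < refs.getLength(); i++) {
                uris.add(((Element) refs.item(i)).getAttribute("about"));
            }
            return uris;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a readable policy reference file", e);
        }
    }

    public static void main(String[] args) {
        // The policy reference file shown above, inlined as a string.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>"
                + "<META><POLICY-REFERENCES>"
                + "<POLICY-REF about=\"/w3c/policy.xml\">"
                + "<INCLUDE>/*</INCLUDE>"
                + "<COOKIE-INCLUDE>* * * </COOKIE-INCLUDE>"
                + "</POLICY-REF>"
                + "</POLICY-REFERENCES></META>";
        System.out.println(policyUris(xml));
    }
}
```

The returned value is a path relative to the host, so a real client would still have to resolve it against the site's base URL before fetching the full policy.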

That's a policy reference pointing to the real privacy policy. The DISPUTES element links to the service, that is, the company website. I'm not sure what's best practice here, but I did expect something more specific, though this usage is quite abstract to me, and I guess no agents will read it anyway.


COR stands for <correct> in 4.2.3 Compact REMEDIES. The real definition is in 3.2.7 The REMEDIES element and has the promising description Errors or wrongful actions arising in connection with the privacy policy will be remedied by the service. There's also the possibility of using <money/>, as a common use case will involve an economic limit.


NID stands for <NON-IDENTIFIABLE> in 4.2.4 Compact NON-IDENTIFIABLE. In 3.3.3 The NON-IDENTIFIABLE element it's described as:

This element signifies that either no data is collected (including Web logs), or that the organization collecting the data will anonymize the data referenced in the enclosing STATEMENT. In order to consider the data "anonymized", there must be no reasonable way for the entity or a third party to attach the collected data to the identity of a natural person. Some types of data are inherently anonymous, such as randomly-generated session IDs. Data which might identify natural people in some circumstances, such as IP addresses, names, or addresses, must have a non-reversible transformation applied in order be considered "anonymized".


PSAo is described in 4.2.5 Compact PURPOSE where the PSA stands for <pseudo-analysis/> and o stands for opt-out. The full description is in 3.3.4 The PURPOSE element:

Pseudonymous Analysis: Information may be used to create or build a record of a particular individual or computer that is tied to a pseudonymous identifier, without tying identified data (such as name, address, phone number, or email address) to the record. This profile will be used to determine the habits, interests, or other characteristics of individuals for purpose of research, analysis and reporting, but it will not be used to attempt to identify specific individuals. For example, a marketer may wish to understand the interests of visitors to different portions of a Web site.

and opt-out means:

Data may be used for this purpose unless the user requests that it not be used in this way. When this value is selected, the service MUST provide clear instructions to users on how to opt-out of this purpose at the opturi. Services SHOULD also provide these instructions or a pointer to these instructions at the point of data collection.
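Splitting a purpose token such as PSAo into its purpose and its consent suffix can be sketched as follows. The class PurposeToken is a hypothetical name of my own. The suffix letters a, i and o are my reading of the compact PURPOSE section (always, opt-in, opt-out), and treating a missing suffix as "always" is likewise an assumption based on that reading.

```java
/** Hypothetical value class for illustration; not part of any P3P library. */
class PurposeToken {
    final String purpose; // three-letter compact token, e.g. "PSA"
    final String consent; // "always", "opt-in" or "opt-out"

    PurposeToken(String purpose, String consent) {
        this.purpose = purpose;
        this.consent = consent;
    }

    static PurposeToken parse(String token) {
        if (token.length() == 3) {
            // Assumption: no suffix means the purpose always applies.
            return new PurposeToken(token, "always");
        }
        if (token.length() != 4) {
            throw new IllegalArgumentException("Unexpected token: " + token);
        }
        String base = token.substring(0, 3);
        switch (token.charAt(3)) {
            case 'a': return new PurposeToken(base, "always");
            case 'i': return new PurposeToken(base, "opt-in");
            case 'o': return new PurposeToken(base, "opt-out");
            default: throw new IllegalArgumentException("Unknown consent suffix in: " + token);
        }
    }

    public static void main(String[] args) {
        PurposeToken t = parse("PSAo");
        System.out.println(t.purpose + " / " + t.consent); // prints "PSA / opt-out"
    }
}
```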


OUR stands for <ours/> in 4.2.6 Compact RECIPIENT. The full description is in 3.3.5 The RECIPIENT element:

Ourselves and/or entities acting as our agents or entities for whom we are acting as an agent: An agent in this instance is defined as a third party that processes data only on behalf of the service provider for the completion of the stated purposes. (e.g., the service provider and its printing bureau which prints address labels and does nothing further with the information.)


The last one, IND, stands for <indefinitely/> in 4.2.7 Compact RETENTION. It's described in 3.3.6 The RETENTION element:

Information is retained for an indeterminate period of time. The absence of a retention policy would be reflected under this option. Where the recipient is a public fora, this is the appropriate retention policy.

An easy overview for compact policies can be found on the p3pwriter website.
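As a recap, the seven tokens resolved in this post fit in a small lookup table. The Java sketch below is just that table plus a lookup method; the class name CompactTokenTable is hypothetical, and the table deliberately covers only the tokens from this example, not the full vocabulary of the specification.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical lookup table for illustration; covers only the tokens from this post. */
class CompactTokenTable {

    private static final Map<String, String> ELEMENTS = new LinkedHashMap<>();
    static {
        ELEMENTS.put("NOI", "<nonident/> (ACCESS)");
        ELEMENTS.put("DSP", "DISPUTES present");
        ELEMENTS.put("COR", "<correct/> (REMEDIES)");
        ELEMENTS.put("NID", "<NON-IDENTIFIABLE/>");
        ELEMENTS.put("PSA", "<pseudo-analysis/> (PURPOSE)");
        ELEMENTS.put("OUR", "<ours/> (RECIPIENT)");
        ELEMENTS.put("IND", "<indefinitely/> (RETENTION)");
    }

    /** Describes a token; a fourth character (such as the o in PSAo) is ignored for the lookup. */
    static String describe(String token) {
        String base = token.length() == 4 ? token.substring(0, 3) : token;
        return ELEMENTS.getOrDefault(base, "unknown token");
    }

    public static void main(String[] args) {
        for (String token : "NOI DSP COR NID PSAo OUR IND".split(" ")) {
            System.out.println(token + " -> " + describe(token));
        }
    }
}
```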

Wednesday, July 11, 2007

Party cookies - first or third-context

Internet Explorer 6+ (IE) has a default privacy policy that blocks third-party cookies in, for example, iframes. How exactly are first and third party defined, and thereby how can this obstacle be overcome?

In the Microsoft Windows XP article Understanding cookies the description is:

A first-party cookie either originates on or is sent to the Web site you are currently viewing. These cookies are commonly used to store information, such as your preferences when visiting that site.

A third-party cookie either originates on or is sent to a Web site different from the one you are currently viewing. Third-party Web sites usually provide some content on the Web site you are viewing. For example, many sites use advertising from third-party Web sites and those third-party Web sites may use cookies. A common use for this type of cookie is to track your Web page use for advertising or other marketing purposes. Third-party cookies can either be persistent or temporary.

The definition of first-party is the Web site you are currently viewing, and read like this it seems that there is just one first party and the rest is third party. But that's not quite true, or rather it depends on how you interpret it, because the most precise definition that I've found is in Privacy in Internet Explorer 6, in the section First and Third-Party Context:

Internet Explorer 6 defines first-party content as content associated with the host domain. Third-party content originates from any other domain. For example, suppose a user visits www.wideworldimporters.com by typing this URL in the address bar, and www.wingtiptoys.com has a banner ad on this page. If these two sites set cookies, the cookies from www.wideworldimporters.com are in a first-party context while the cookies from www.wingtiptoys.com are in a third-party context.

Often commercial Web pages are an amalgamation of first- and third-party content. The Internet Explorer 6 privacy features distinguish between first- and third-party content. The underlying assumption is that users have a different relationship with first parties than with third parties. In fact, users might not be aware of the third party or be given a choice of whether to have a relationship with it. For this reason, default privacy settings for third parties are more stringent than for first parties.

but it's in the associated note that it's spelled out crystal clear:

The URLs www.wideworldimporters.com and toys.wideworldimporters.com both contain the same minimal domain, wideworldimporters.com. Content that shares the same minimal domain as the host domain is considered first-party content. Likewise, cookies set from these domains are considered first-party cookies. Minimal domains must have the same top-level domain (TLD). Some common examples of TLDs are .com, .net, and .org.
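The rule in the note can be sketched in a few lines of Java. This is a naive illustration and not IE's actual algorithm: it assumes the minimal domain is simply the last two labels of the host name, which would give wrong answers for suffixes such as co.uk. The class PartyContext and its methods are hypothetical names.

```java
import java.util.Locale;

/** Naive illustration of the minimal domain idea; not IE's actual algorithm. */
class PartyContext {

    /** Assumption: the minimal domain is the last two labels of the host name. */
    static String minimalDomain(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        String[] labels = h.split("\\.");
        if (labels.length <= 2) {
            return h;
        }
        return labels[labels.length - 2] + "." + labels[labels.length - 1];
    }

    /** Content is first-party when it shares the minimal domain of the page being viewed. */
    static boolean isFirstParty(String pageHost, String contentHost) {
        return minimalDomain(pageHost).equals(minimalDomain(contentHost));
    }

    public static void main(String[] args) {
        // Host names taken from the quoted Microsoft examples.
        System.out.println(isFirstParty("www.wideworldimporters.com", "toys.wideworldimporters.com")); // true
        System.out.println(isFirstParty("www.wideworldimporters.com", "www.wingtiptoys.com"));         // false
    }
}
```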

I hadn't heard the term minimal domain before, but it's intuitive to understand. It matches (sort of) the definition of a domain cookie in RFC 2965 - HTTP State Management Mechanism:

Host names can be specified either as an IP address or a HDN string. Sometimes we compare one host name with another. (Such comparisons SHALL be case-insensitive.) Host A's name domain-matches host B's if

  • their host name strings string-compare equal; or
  • A is a HDN string and has the form NB, where N is a non-empty name string, B has the form .B', and B' is a HDN string. (So, x.y.com domain-matches .Y.com but not Y.com.)
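The two rules translate fairly directly into Java. The sketch below is my own simplified reading of the quoted text: the class DomainMatch is a hypothetical name, and the check that A is a host domain name is reduced to "does not look like an IPv4 address", which is an assumption rather than a full validation.

```java
import java.util.Locale;

/** Simplified sketch of the domain-match rule quoted above; names are hypothetical. */
class DomainMatch {

    /** Returns true if host name a domain-matches host name b. */
    static boolean domainMatches(String a, String b) {
        String hostA = a.toLowerCase(Locale.ROOT); // comparisons are case-insensitive
        String hostB = b.toLowerCase(Locale.ROOT);
        if (hostA.equals(hostB)) {
            return true; // rule 1: the strings compare equal
        }
        // Rule 2: A has the form N + B, where N is non-empty and B starts with a dot.
        return hostB.startsWith(".")
                && hostB.length() > 1
                && hostA.length() > hostB.length()
                && hostA.endsWith(hostB)
                && !looksLikeIpv4(hostA);
    }

    // Assumption: anything that is not a dotted IPv4 address is treated as a host domain name.
    private static boolean looksLikeIpv4(String host) {
        return host.matches("\\d{1,3}(\\.\\d{1,3}){3}");
    }

    public static void main(String[] args) {
        System.out.println(domainMatches("x.y.com", ".Y.com")); // true
        System.out.println(domainMatches("x.y.com", "Y.com"));  // false
    }
}
```

The two calls in main are the examples from the RFC text itself, so they make a handy sanity check of the rule.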

It's fair enough to expect the owner of a domain to take responsibility for all webservers/hosts on that domain, making it a domain of trust. Be aware, though, that the P3P privacy policy is per host (and/or subdomain) and not per minimal domain, as illustrated in the section First-Party vs. Third-Party of spywarewarrior's Internet Privacy w/ IE6 & P3P: A Summary of Findings.

Next I'll have a quick look at P3P.
