Tuesday, July 17, 2007

An analysis of a P3P compact policy example


Since version 6.0, Internet Explorer has taken P3P policies into account when deciding whether to accept third-party cookies. This is a quick look at P3P, based on an example.

The Platform for Privacy Preferences (P3P) Project homepage gives the purpose as:

The Platform for Privacy Preferences Project (P3P) enables Websites to express their privacy practices in a standard format that can be retrieved automatically and interpreted easily by user agents. P3P user agents will allow users to be informed of site practices (in both machine- and human-readable formats) and to automate decision-making based on these practices when appropriate. Thus users need not read the privacy policies at every site they visit.

What strikes me as a bad smell is that the P3P work was suspended while version 1.1 was being created. I think there could be three reasons for that:

  • The work is done, it hit spot on and conquered the world
  • The former version
  • It's a dead end and it's not attracting any attention
It's not clear to me exactly which is the case here. The group states: "The P3P Specification Working Group took this step as there was insufficient support from current Browser implementers for the implementation of P3P 1.1." This is a fair argument, but it still makes me slightly uneasy, since privacy is on the rise.

I'll go for The Platform for Privacy Preferences 1.0 (P3P1.0) Specification, since the group note is not for me until I realize that everybody is following it. Since I'm not implementing P3P (yet), this seems like the one for me, though a clear primer would probably have been best. I usually don't read a spec from end to end. Instead I take some examples and follow those into the spec, attacking it from the side by searching for specific information.

The example

In FDIM and third party cookies I discovered that the Gemius host ndk.hit.gemius.pl sends a compact P3P policy in the header P3P: CP="NOI DSP COR NID PSAo OUR IND". So what does that code mean?

This way of delivering the policy is defined in 2.2.2 HTTP Headers, and the CP="..." part is a compact-policy-field as defined in 4. Compact Policies. That leaves the real meat: NOI DSP COR NID PSAo OUR IND.
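To make the header layout concrete, here is a minimal Python sketch that pulls the tokens out of the CP="..." part of a P3P header value. The function name compact_policy_tokens is my own invention for illustration, and the regular expression is a simplification that only looks for a quoted CP field, not a full implementation of the grammar in the spec.

```python
import re

def compact_policy_tokens(header_value):
    """Return the tokens found inside CP="..." of a P3P header value.

    Hypothetical helper: it only searches for a quoted CP field and
    splits it on whitespace; it does not validate the tokens.
    """
    match = re.search(r'\bCP\s*=\s*"([^"]*)"', header_value)
    if match is None:
        return []
    return match.group(1).split()

# The header value from the example host.
tokens = compact_policy_tokens('CP="NOI DSP COR NID PSAo OUR IND"')
print(tokens)
```

A header value without a CP field simply gives an empty list.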


NOI is described in 4.2.1 Compact ACCESS and stands for <nonident/>. That is just a reference to the full policy syntax in 3.2.5 The ACCESS element, where it means "Web site does not collect identified data."


DSP means there are some DISPUTES (4.2.2 Compact DISPUTES), which is really defined in 3.2.6 The DISPUTES element, described by:

A policy SHOULD contain a DISPUTES-GROUP element, which contains one or more DISPUTES elements. These elements describe dispute resolution procedures that may be followed for disputes about a services' privacy practices.

The index page for the host just delivers an empty HTML structure, so there's no reference to the full policy there, but the well-known location /w3c/p3p.xml contains:

<?xml version="1.0" encoding="UTF-8" ?>
<META>
   <POLICY-REFERENCES>
      <POLICY-REF about="/w3c/policy.xml">
         <INCLUDE>/*</INCLUDE>
         <COOKIE-INCLUDE>* * * </COOKIE-INCLUDE>
      </POLICY-REF>
   </POLICY-REFERENCES>
</META>
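As a rough sketch of how a program could read such a policy reference file, the Python below parses the sample exactly as shown above with the standard library. The function name policy_refs is made up for this example. It also assumes the elements carry no XML namespace, as in the sample; a file that declares one would need namespace-qualified tag names.

```python
import xml.etree.ElementTree as ET

# The policy reference file as shown above (bytes, so the encoding
# declaration is handled by the parser).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" ?>
<META>
   <POLICY-REFERENCES>
      <POLICY-REF about="/w3c/policy.xml">
         <INCLUDE>/*</INCLUDE>
         <COOKIE-INCLUDE>* * * </COOKIE-INCLUDE>
      </POLICY-REF>
   </POLICY-REFERENCES>
</META>"""

def policy_refs(xml_bytes):
    """List each POLICY-REF with its policy URI and include patterns."""
    root = ET.fromstring(xml_bytes)
    refs = []
    for ref in root.iter("POLICY-REF"):
        refs.append({
            "about": ref.get("about"),
            "include": [(e.text or "").strip() for e in ref.findall("INCLUDE")],
            "cookie_include": [(e.text or "").strip()
                               for e in ref.findall("COOKIE-INCLUDE")],
        })
    return refs

print(policy_refs(SAMPLE))
```

The about attribute is the part that tells an agent where the full policy lives, here /w3c/policy.xml.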

That's a policy reference pointing to the real privacy policy. The DISPUTES element there links to the service, the company website. I'm not sure what the best practice is here, but I expected something more specific. This usage is quite abstract to me, and I guess no agents will read it anyway.


COR stands for <correct/> in 4.2.3 Compact REMEDIES. The real definition is in 3.2.7 The REMEDIES element, with the description "Errors or wrongful actions arising in connection with the privacy policy will be remedied by the service." There's also the possibility to use <money/>, as a common use case will involve an economic limit.


NID stands for <NON-IDENTIFIABLE> in 4.2.4 Compact NON-IDENTIFIABLE. In 3.3.3 The NON-IDENTIFIABLE element it's described as:

This element signifies that either no data is collected (including Web logs), or that the organization collecting the data will anonymize the data referenced in the enclosing STATEMENT. In order to consider the data "anonymized", there must be no reasonable way for the entity or a third party to attach the collected data to the identity of a natural person. Some types of data are inherently anonymous, such as randomly-generated session IDs. Data which might identify natural people in some circumstances, such as IP addresses, names, or addresses, must have a non-reversible transformation applied in order be considered "anonymized".


PSAo is described in 4.2.5 Compact PURPOSE where the PSA stands for <pseudo-analysis/> and o stands for opt-out. The full description is in 3.3.4 The PURPOSE element:

Pseudonymous Analysis: Information may be used to create or build a record of a particular individual or computer that is tied to a pseudonymous identifier, without tying identified data (such as name, address, phone number, or email address) to the record. This profile will be used to determine the habits, interests, or other characteristics of individuals for purpose of research, analysis and reporting, but it will not be used to attempt to identify specific individuals. For example, a marketer may wish to understand the interests of visitors to different portions of a Web site.

and opt-out means:

Data may be used for this purpose unless the user requests that it not be used in this way. When this value is selected, the service MUST provide clear instructions to users on how to opt-out of this purpose at the opturi. Services SHOULD also provide these instructions or a pointer to these instructions at the point of data collection.


OUR stands for <ours/> in 4.2.6 Compact RECIPIENT. The full description is in 3.3.5 The RECIPIENT element:

Ourselves and/or entities acting as our agents or entities for whom we are acting as an agent: An agent in this instance is defined as a third party that processes data only on behalf of the service provider for the completion of the stated purposes. (e.g., the service provider and its printing bureau which prints address labels and does nothing further with the information.)


The last one, IND, stands for <indefinitely/> in 4.2.7 Compact RETENTION. That's described in 3.3.6 The RETENTION element:

Information is retained for an indeterminate period of time. The absence of a retention policy would be reflected under this option. Where the recipient is a public fora, this is the appropriate retention policy.

An easy overview for compact policies can be found on the p3pwriter website.
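To sum up the walk-through, here is a small Python sketch that decodes the example policy with a lookup table. The table only covers the seven tokens discussed in this post, not the full vocabulary of the spec, and the names MEANINGS, SUFFIXES and decode are my own. The suffix letters a, i and o map to always, opt-in and opt-out.

```python
# Token -> (compact policy category, element in the full policy).
# Only the tokens from the example are listed here.
MEANINGS = {
    "NOI": ("ACCESS", "<nonident/>"),
    "DSP": ("DISPUTES", "<DISPUTES>"),
    "COR": ("REMEDIES", "<correct/>"),
    "NID": ("NON-IDENTIFIABLE", "<NON-IDENTIFIABLE/>"),
    "PSA": ("PURPOSE", "<pseudo-analysis/>"),
    "OUR": ("RECIPIENT", "<ours/>"),
    "IND": ("RETENTION", "<indefinitely/>"),
}

# Optional fourth character, as seen on PSAo.
SUFFIXES = {"a": "always", "i": "opt-in", "o": "opt-out"}

def decode(compact_policy):
    """Return (token, category, element, requirement) for each token."""
    decoded = []
    for token in compact_policy.split():
        base, suffix = token[:3], token[3:]
        category, element = MEANINGS.get(base, ("unknown", "?"))
        decoded.append((token, category, element, SUFFIXES.get(suffix)))
    return decoded

for row in decode("NOI DSP COR NID PSAo OUR IND"):
    print(row)
```

Tokens that are not in the table come back as unknown instead of raising an error, which seems like the friendlier behaviour for a quick inspection tool.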
