What stops Target telling you're pregnant?

Question: What stops Target from telling that you’re pregnant?
Answer: In many parts of the world, the law!

The recent New York Times feature “How Companies Learn Your Secrets” caused a stir. Investigative reporter Charles Duhigg details conversations he had with data analysts and statisticians about what marketing gold they can divine from shoppers’ buying habits, and how one department store then seemed to shut down the dialogue.

The case in point was the ability to statistically predict pregnancy. Duhigg looked into the enormous business potential for retailers if they could work out, from shifts in buying patterns, that individual customers were in the early stages of pregnancy. One retail analyst said, “We knew that if we could identify them in their second trimester, there’s a good chance we could capture them for years.” Insiders admitted to developing and testing a “pregnancy prediction” score, but it remains unclear to what extent such tools are used in practice with real data.

This is pretty heady stuff, on the leading edge of Big Data analytics. Many seem paralysed, unsure what to do about it.

What kind of problem is this?

Charles Duhigg’s NYT feature ends on a note of resignation, and I get the impression from scanning blog posts on this matter that many people, especially in the largely unregulated United States, feel powerless to do anything about this. Yet they should take heart from existing privacy law. At least in places like Australia with OECD-based data protection legislation, it’s pretty clear to anyone who actually reads the rules that for a department store to work out and record that someone is pregnant is likely to be unlawful.

TL;DR: Don’t give up on privacy!

On my reading of Australian data privacy and health privacy laws (which are similar to typical GDPR-like laws; see my detailed analysis below), we can be sure of the following:

If a department store mines its data on shopping habits, determines that a named woman is likely to be pregnant, and records that prediction in a database, then the store will have collected health information about her and is subject to health privacy legislation in several states (as well as the Sensitive Personal Information clauses of Australia’s federal privacy law).
If the department store has not obtained the customer’s consent to having her pregnancy status determined, then the store will have breached Australian law (specifically HPP 1.1).
If the store takes information originally collected from customers to monitor their shopping habits and uses it to generate new information predicting their pregnancies, then it will have breached HPP 2.2.
If the store has not informed the woman that it has predicted she is pregnant, then it will have breached HPP 1.5.

Many commentators fear that the march of technology outpaces the law, but I for one am more optimistic. For the most part, it seems our current information privacy law actually copes well with the sorts of business activities we find intuitively problematic. I am not a lawyer, but it looks clearly unlawful to me for a department store in Australia to purposefully work out that its customers are pregnant. Technically, just recording that prediction, even without acting upon it, probably counts as a Collection of health information, and as such it needs the consent of the customer (the OAIC calls this use of analytics to synthesise personal information “Collection by Creation”).

The same legal principles apply — with even more force — in Europe. It remains to be seen whether information privacy can be better regulated in the US.

Analysis: A detailed look at how Australia regulates privacy

At state and federal level, Australia has several privacy acts and health records acts. For our purposes here, they’re all much the same, derived from long-standing OECD privacy principles around collection limitation, use & disclosure limitation and transparency, so the following analysis is likely to have parallels in many other countries. I will use the Victorian Health Records Act 2001 (the “Act”) as a model; underlining in the quoted passages is added by me for emphasis.

Personal Information is defined in the Act as:

information or an opinion (including information or an opinion
forming part of a database), whether true or not, and whether
recorded in a material form or not, about an individual whose
identity is apparent, or can reasonably be ascertained
from the information or opinion

At this point, note that the definition is broad and unqualified by such abstract matters as “data ownership”. In the Australian legal system, privacy rights attach to any information whatsoever pertaining to an identifiable individual, whether that information is explicitly collected from the person, or generated automatically by Big Data processing.

Health Information is defined as, amongst other things:

(i) the physical, mental or psychological health
(at any time) of an individual; or
(ii) a disability (at any time) of an individual; or
(iii) an individual’s expressed wishes about the
future provision of health services to him or her

The cornerstones of privacy in OECD-style data protection systems are Collection Limitation and Use Limitation. Here are the opening clauses of Victoria’s Health Privacy Principle HPP 1 – Collection:

1.1 When health information may be collected
An organisation must not collect health information about an
individual unless the information is necessary for one or more
of its functions or activities and at least one of the following
applies –
(a) the individual has consented;
(b) the collection is required, authorised or permitted,
whether expressly or impliedly, by or under law;
(c) the information is necessary to provide a health service …

Note that consent is required in advance of collecting health information, whereas in the case of regular Personal Information, organisations have more latitude to give notice of collection reasonably after the fact.
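
For readers who think more easily in code, the structure of HPP 1.1 can be sketched as a simple check: the collection must be necessary for one of the organisation's functions, and at least one listed ground must apply. This is only an illustration, not legal advice. The class and function names are hypothetical, and only the three grounds quoted above are modelled; the Act lists further grounds that the quotation elides. The values given for the pregnancy score reflect my reading of the scenario, with necessity assumed in the store's favour.

```python
from dataclasses import dataclass


@dataclass
class ProposedCollection:
    """Hypothetical model of a proposed collection of health information."""
    necessary_for_function: bool      # opening words of HPP 1.1
    individual_consented: bool        # HPP 1.1(a)
    authorised_by_law: bool           # HPP 1.1(b)
    needed_for_health_service: bool   # HPP 1.1(c)


def hpp_1_1_permits(c: ProposedCollection) -> bool:
    """Simplified reading: necessity AND at least one of the quoted grounds.

    Grounds beyond (a)-(c) exist in the Act but are not modelled here.
    """
    listed_grounds = (
        c.individual_consented,
        c.authorised_by_law,
        c.needed_for_health_service,
    )
    return c.necessary_for_function and any(listed_grounds)


# A pregnancy prediction score recorded without the customer's consent.
pregnancy_score = ProposedCollection(
    necessary_for_function=True,      # assumption, made in the store's favour
    individual_consented=False,
    authorised_by_law=False,
    needed_for_health_service=False,
)

print(hpp_1_1_permits(pregnancy_score))  # False
```

Even granting necessity, the collection fails on this simplified reading because none of the quoted grounds applies; setting `individual_consented=True` is the only change a department store could realistically make to pass.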

And here are the opening clauses of Health Privacy Principle HPP 2 – Use & Disclosure:

2.1 An organisation may use or disclose health information about
an individual for the primary purpose for which the information was
collected in accordance with HPP 1.1.

2.2 An organisation must not use or disclose health information about
an individual for a purpose (the secondary purpose) other than the
primary purpose for which the information was collected unless
at least one of the following paragraphs applies –
(a) both of the following apply –
(i) the secondary purpose is directly related to the primary purpose; and
(ii) the individual would reasonably expect the organisation to use or
disclose the information for the secondary purpose; or
(b) the individual has consented to the use or disclosure …
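
The two-limb test for secondary purposes in HPP 2.2 can be sketched the same way. Again this is an illustration under stated assumptions rather than a statement of the law: the names are hypothetical, only paragraphs (a) and (b) as quoted are modelled (the Act lists further paragraphs that the quotation elides), and the values assigned to the pregnancy prediction reflect my own reading of the scenario.

```python
from dataclasses import dataclass


@dataclass
class ProposedUse:
    """Hypothetical model of a proposed use or disclosure."""
    is_primary_purpose: bool     # HPP 2.1
    directly_related: bool       # HPP 2.2(a)(i)
    reasonably_expected: bool    # HPP 2.2(a)(ii)
    individual_consented: bool   # HPP 2.2(b)


def hpp_2_permits(u: ProposedUse) -> bool:
    """Simplified reading of HPP 2.1 and the quoted parts of HPP 2.2."""
    if u.is_primary_purpose:
        return True
    # Paragraph (a) needs BOTH limbs; paragraph (b) stands on its own.
    related_and_expected = u.directly_related and u.reasonably_expected
    return related_and_expected or u.individual_consented


# Shopping records reused to predict pregnancy, with no consent.
# The two False limbs under (a) are my assumptions about this scenario.
pregnancy_prediction = ProposedUse(
    is_primary_purpose=False,
    directly_related=False,
    reasonably_expected=False,
    individual_consented=False,
)

print(hpp_2_permits(pregnancy_prediction))  # False
```

Note that a use which is directly related to the primary purpose but not reasonably expected by the individual still fails paragraph (a), because both limbs are required.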

HPP 1 goes on to prescribe how individuals should be kept informed about the collection of health information about them:

How health information is to be collected
1.4 At or before the time (or, if that is not practicable,
as soon as practicable thereafter) an organisation collects
health information about an individual from the individual,
the organisation must take steps that are reasonable in the
circumstances to ensure that the individual is generally aware of –
(a) the identity of the organisation and how to contact it; and
(b) the fact that he or she is able to gain access to the
information; and
(c) the purposes for which the information is collected; and
(d) to whom (or the types of individuals or organisations to which)
the organisation usually discloses information of that kind; and
(e) any law that requires the particular information to be
collected; and
(f) the main consequences (if any) for the individual if all or
part of the information is not provided.

1.5 If an organisation collects health information about an
individual from someone else, it must take any steps that are
reasonable in the circumstances to ensure that the individual
is or has been made aware of the matters listed in HPP 1.4 except
to the extent that making the individual aware of the matters
would pose a serious threat to the life or health of any
individual or would involve the disclosure of information
given in confidence.
