Sunday, 16 February 2014
Grand Ballroom A (Hyatt Regency Chicago)
Sharing sequencing datasets is essential for human genetics. I will present common techniques that were thought to de-identify these datasets, such as removing explicit identifiers, pooling, and data masking. Then, I will survey a range of techniques that bypass these methods and re-identify the “anonymous” samples. Specifically, I will show how publicly available Internet resources can identify participants in genetic studies in some scenarios.