Predicting Social Security Numbers from Public Data

Posted on July 16, 2009 

Social Security numbers (SSNs) were created under the Social Security Act of 1935 as identifiers for accounts tracking individual earnings. Over time, however, they came to be used as sensitive authentication devices and became one of the pieces of information most often sought by identity thieves: knowledge of a person’s name, SSN, and date of birth is often sufficient to impersonate that individual and obtain access to a variety of services, leading to so-called identity theft.

Current public policy on identity theft holds that SSNs should be kept confidential: consumers are urged to protect their SSNs. However, we show that it is possible to predict individual SSNs from publicly available data alone.

By observing issuance patterns in the “Death Master File” (a public database that contains the SSNs of people who have died), we were able to use an individual’s date and state of birth to predict narrow ranges of values likely to contain that individual’s SSN. The predictions are particularly accurate for people born after 1988 (when the SSA initiated the Enumeration at Birth program, through which babies receive SSNs soon after birth) and for people born in less populous states. Since SSNs are predictable from public data, identity theft could occur even without events such as data breaches.
