Linking Data to Identities: Deterministic and Probabilistic Explained
Cross-device tracking is the latest Holy Grail of digital advertising. At stake is the marketer’s ability to recognize the same consumer when he or she interacts with a brand, whether that interaction happens on a laptop, tablet, smartphone or connected TV.
Until recently, marketers had no choice but to view these engagements in silos, essentially treating one consumer as three or four distinct people, one for each device used. What’s the impact of that? Plenty. From the brand’s perspective, basic tasks such as managing frequency, attributing and allocating conversion credit, and sequencing messages all suffer.
But new, more accurate tools and techniques for linking our various digital aliases have recently come to market. The methodologies behind them fall into two buckets: deterministic and probabilistic. Right now, the industry is debating which of the two is the better approach to cross-device measurement and targeting of digital advertising.
Those in the deterministic camp believe that definitive proof of a consumer’s identity is the only way to go. Such solutions rely on known facts about people, typically revealed when they log in to sites such as Facebook or Twitter. Once a consumer logs into the same site from her desktop, mobile phone and tablet, it’s a fairly straightforward process to link all of those devices together to a particular alias.
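To make the deterministic idea concrete, here is a minimal sketch in Python. It assumes a hypothetical feed of (login, device) pairs; the function names, the sample data and the choice to hash the login are illustrative assumptions, not any vendor's actual method.

```python
import hashlib
from collections import defaultdict


def hash_login(login_id: str) -> str:
    """Normalize and hash a login so the raw identifier is never stored."""
    normalized = login_id.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_deterministic_graph(login_events):
    """Group device IDs under the hashed login observed on each device.

    login_events: iterable of (login_id, device_id) pairs (hypothetical feed).
    Returns a dict mapping hashed login -> set of device IDs.
    """
    graph = defaultdict(set)
    for login_id, device_id in login_events:
        graph[hash_login(login_id)].add(device_id)
    return dict(graph)


# Illustrative data only: the same person logs in on three devices.
events = [
    ("jane@example.com", "desktop-1"),
    ("Jane@Example.com ", "phone-7"),
    ("jane@example.com", "tablet-3"),
    ("sam@example.com", "phone-9"),
]
graph = build_deterministic_graph(events)
```

Each link in this graph exists only because a login was actually observed on the device, which is why the match is reliable and also why coverage stops wherever logins stop.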
So why isn’t the whole world embracing deterministic matching? While highly accurate, it struggles to scale. No one — not even behemoths like Google and Facebook — can build deterministic device graphs for every consumer. And that means marketers can’t truly achieve their campaign goals using just deterministic data.
Probabilistic device-linking approaches use data analysis to associate multiple devices with a specific consumer or household. Let’s say a marketer serves an ad to a desktop on a particular residential Wi-Fi network. Later, the marketer sees a mobile device using that same Wi-Fi connection. It’s probable, but not certain, that the device is part of that household. As you can see, this approach delivers more scale, but with less assurance that the linkages are accurate.
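The Wi-Fi example can be sketched as a simple scoring exercise. The Python below is a deliberately simplified illustration: it assumes a hypothetical list of (device, network) sightings and uses the overlap of networks (Jaccard similarity) as the only signal, with an arbitrary threshold. Real probabilistic systems would likely weigh many more signals.

```python
from collections import defaultdict
from itertools import combinations


def link_scores(observations):
    """Score device pairs by how much the networks they were seen on overlap.

    observations: iterable of (device_id, network_id) pairs (hypothetical).
    Returns a dict mapping (device_a, device_b) -> Jaccard similarity in (0, 1].
    Pairs that share no network are omitted.
    """
    networks_by_device = defaultdict(set)
    for device_id, network_id in observations:
        networks_by_device[device_id].add(network_id)

    scores = {}
    for a, b in combinations(sorted(networks_by_device), 2):
        shared = networks_by_device[a] & networks_by_device[b]
        if shared:
            union = networks_by_device[a] | networks_by_device[b]
            scores[(a, b)] = len(shared) / len(union)
    return scores


def probable_links(observations, threshold=0.5):
    """Keep only pairs whose score meets the (assumed) confidence threshold."""
    return {pair for pair, score in link_scores(observations).items()
            if score >= threshold}


# Illustrative sightings: a desktop and a phone share a home network,
# while two phones merely overlap at a cafe.
observations = [
    ("desktop-1", "home-wifi-A"),
    ("phone-7", "home-wifi-A"),
    ("phone-7", "cafe-wifi-B"),
    ("phone-9", "cafe-wifi-B"),
    ("phone-9", "office-wifi-C"),
]
scores = link_scores(observations)
links = probable_links(observations, threshold=0.5)
```

Lowering the threshold links more devices and raises the odds of a wrong match, which is the scale-versus-accuracy trade-off in miniature.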
So which approach is better, deterministic or probabilistic? The smarter way to answer this is to ask yourself: What do you want to accomplish with your campaign, and which data actually gives you the positive marketing returns you seek?
For instance, if you’re selling family vacations or big-ticket items the whole family weighs in on, reaching a household probably makes sense for that type of campaign. But that won’t be the case for all sectors and campaigns. It all goes back to what you want to achieve with your campaign.
And of course, the other important consideration in cross-device targeting is consumer privacy. The level of risk varies widely across the major deterministic and probabilistic methods, depending largely on which identifiers they rely on.
Some identifiers, such as cookies, typically have set expiration dates and consumers have enough familiarity with them that they can control them if they so choose. Device IDs, which are used to reach users in mobile apps, don’t have set expiration dates, but consumers can easily opt out of tracking. Both are common and known to consumers, which means device tracking based on cookies and device IDs won’t feel as invasive to them.
Other identifiers — such as household IDs and carrier IDs — are still emerging, and consumers have low visibility into their use, or how to opt out. Consequently, household and carrier IDs pose more risk.
Marketers should think hard before using some of the other identifiers that consumer-tracking vendors rely on, such as fingerprinting and zombie cookies. Fingerprinting relies on persistent identifiers and gathers new data by methods that are invisible to the consumer and used without consent. Zombie cookies are cookies that are brought back to life after a consumer has deleted them, often in direct violation of that consumer’s choice to opt out. Both carry a considerable risk of angering consumers.
How do you make sure these two identifiers stay out of your campaigns? You need to ask your vendors explicitly how they compile their consumer data sets, and get the answers in writing.
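One way to keep track of those written answers is a simple checklist. The Python sketch below assumes a hypothetical list of identifier names taken from a vendor's disclosure; the tier labels merely restate the groupings described above and are not an industry standard.

```python
# Risk tiers as described in this article (labels are illustrative).
RISK_BY_IDENTIFIER = {
    "cookie": "lower",
    "device_id": "lower",
    "household_id": "higher",
    "carrier_id": "higher",
    "fingerprint": "avoid",
    "zombie_cookie": "avoid",
}


def review_vendor(identifiers):
    """Group a vendor's declared identifiers by risk tier.

    Anything not in the table is reported as "unknown" so it prompts
    a follow-up question rather than being silently accepted.
    """
    report = {"lower": [], "higher": [], "avoid": [], "unknown": []}
    for name in identifiers:
        report[RISK_BY_IDENTIFIER.get(name, "unknown")].append(name)
    return report


# Hypothetical vendor disclosure.
report = review_vendor(["cookie", "carrier_id", "fingerprint", "ip_hash"])
```

Anything that lands in the "avoid" or "unknown" lists is a prompt for another pointed question to the vendor.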
Most brand marketers aren’t interested in getting into the weeds of cross-device tracking and leave it to their vendors to recommend an approach. If we lived in a world with a perfect ID, that might make sense, but we don’t. Between the embarrassing headlines about wireless companies and zombie cookies, increasing consumer awareness of privacy issues, and even the possibility of government regulation in the form of a Consumer Privacy Bill of Rights, every marketer needs a basic understanding of data, its uses and misuses, as well as consumer attitudes and rights. Ignorance can land you on the front page in a bad way and go a long way toward alienating the customers you worked so hard to win.
By investing some time in expanding your understanding and asking specific, tough questions of your vendors, you’ll be more likely to attain the transparency and flexibility you need to use the right data at the right time for the right campaign.
This article was originally published on MediaPost.com