Taking Larger Strides Toward Data Transparency – A Call To Action

There has long been a desire for transparency and specific guidelines around data within the advertising industry. Recently the Advertising Research Foundation partnered with the Coalition for Innovative Media Measurement to propose a data labeling initiative. Still in the early stages of development, the proposal is surrounded by a plethora of open questions about how an initiative such as this can come to fruition.

While we fully support the idea of data labeling and affirm that it is a move in the right direction, the initiative needs to go further. Sure, slap a label on something, but how do we know what that label means or represents? What actual steps should be taken to rid advertising of skepticism and uncertainty? As a group, we need to move toward confidence that data is being sourced and analyzed correctly. Without specific instructions and action items that add up to a standard operating procedure for data transparency, we’re still staring at a problem that will only continue to grow with the current data revolution.

Why we need more help

No one wants to be labeled a ‘bad actor’. Most data providers are quick to tout the accuracy of their data and create noise around why their product is better than the next. Maybe the data is accurate, but there is still a black box around how that data is collected, built and validated. As a result, brands and marketers don’t truly trust the data they receive. They continue to use that data, yet doubt abounds. Advertiser Perceptions recently found that 80% of advertisers use audience insights, but only 33% say they ‘completely trust them’.

In nearly every industry outside of advertising, there is data regulation, with clear delineation of how data is sourced, kept and secured. Take HIPAA, for example, which places strict regulations on what types of medical data can be shared and when personally identifiable information must be withheld. These types of regulations don’t exist for the vast amounts of general, publicly available data. There is no official standard, and this is not acceptable. On top of data labeling, here is the standard that needs to be set and how we all get there together:

  1. Providing clear and accurate reports – Inaccurate reporting often stems from failing to state whether sample size and consumers are identified at the individual level or the household level. In these cases, the baseline of the data collection is left unknown, because data suppliers create false match rates using nothing more than an address and last name, making sweeping assumptions about the household as a whole. Everyone in that household is essentially ‘matched’, but without transparency in the report, simple individual-level qualifiers (age, ethnicity, occupation, gender) can be badly wrong, and the information becomes invalid for whoever consumes it. To avoid this, data providers should be required to disclose the level at which their data is represented and whether that sample is consistent with other information in the database or identity graph.
  2. Describing how the data is built – Data can be collected in any number of ways, so as data providers, we should be sharing the methods we use with clients. This boils down to whether the data is collected in an appropriate manner, whether it is recent and updated regularly, and what the point of collection is (online surveys, transactions, web scraping or phone, and whether consumers were notified that their data was being collected). Again, was this data collected at the household (inferred) level or the individual level?
  3. Clarifying proprietary vs. white labeled data – Let’s look at a scenario. Company A may want to license data from company X and company Y, but company A doesn’t realize that company X is white labeling from company Y and that the data is the same. Without full transparency, company A is potentially left buying the same data twice and wasting time and money. If you’re buying data, you’d like to know that you are dealing with the originator of the data or, at the very least, where that data originated. Companies can resell data, but we often see white labeled data presented as proprietary, which creates a substantial need for more transparency around where data is truly coming from.
  4. Creating a standard sample set – When data providers supply samples, they often try to put their best foot forward. It’s easy to mask what they are doing poorly if they are allowed to choose what goes into the sample and supply only their best data upfront. For this reason, there should be a standard sample set that providers adhere to. If there were standards around what the samples should contain, there would be no way to hide weaknesses, and everyone would get a clear, accurate sample of data to test before committing fully.
  5. Developing data accuracy compliance – It’s difficult to imagine a world where companies fully disclose the names of their sources or are truly 100% transparent. Yet if every other industry has data security and privacy standards, why shouldn’t the advertising industry? In addition to labeling data, we need to develop a compliance checklist that ranks companies by their level of transparency. This would give brands and advertisers the option to be selective about the firms they work with, based on whatever standards they see fit.

This may require giving up a few sources or opening previously closed doors, but if the whole industry were held accountable to this standard, it would become common practice. If you feel strongly about your data, you will be willing to follow these standards and stand together as we take more steps toward full transparency.