Category: Data

What Insurers Can Learn From Tech Giants


The competitive advantage of the world’s leading tech companies resides in the way they use data. Whether you’re searching the web for an answer to an obscure question, or you’re seeing ads for the exact product you’ve been thinking of buying, technology’s ability to anticipate our needs is becoming astoundingly accurate.

Imagine if insurance companies could get this good at knowing who potential customers are, even at predicting a customer’s behavior and lifetime value. If insurers could apply big data and artificial intelligence the way leading tech companies do, they could know right away which individuals are in their target customer segment, well before the bureaucracy of filed ratings and underwriting guidelines.

In the insurance industry, successes are built on the ability to forecast risk. Now the industry has the opportunity to apply advanced data science to create the new frontier of risk forecasting, customer experience design and loss prediction. In fact, it is an imperative. So what is holding insurers back?

Insurance companies have a long history of struggling to predict what are known as unpriceable risks. This small percentage of every insurance company’s book of business ends up having a large impact on the central metric of their profitability: the loss ratio. Whether these unpriceable risks are fraudsters, litigious insureds, or people who generate high-dollar claims, these are the risks that historically have been the most difficult to predict. Since most insurance companies are not leveraging the data and artificial intelligence capabilities needed to identify these risk profiles before the individual becomes a customer, they must take reactionary measures, where possible, such as non-renewal once it’s too late and the loss has occurred.

This is not a new problem, and the industry has tried to solve it in a variety of ways. In the 1990s, insurance companies began to introduce credit-based insurance scores into rating plans. Until then, no single factor had been as highly correlated with loss frequency and severity. Considering an applicant’s credit history was, at the time, the best available piece of information that could help the insurer understand an individual’s behaviors and level of conscientiousness. The hypothesis, which insurers were able to prove in actuarial filings, was that the higher a person’s credit score, the less likely they would be to file a claim.

However, credit-based insurance scoring may not be here to stay. According to the Insurance Information Institute, some have suggested that the use of insurance scores might unfairly discriminate against certain demographic or economically disadvantaged groups. States such as California, Massachusetts, Hawaii and Maryland place restrictions on the use of credit, and more states are considering moving in a similar direction.

In his book The Four: The Hidden DNA of Amazon, Apple, Facebook, and Google, New York University professor and tech thought leader Scott Galloway noted that between 2010 and 2015, there were 13 companies that managed to outperform the S&P 500 each year. And when we take a look at this elite group, we find that the majority of these businesses are algorithmically driven. These companies take in data constantly and use this data in real time to update the user experience. A research report from Accenture found that artificial intelligence has the potential to increase corporate profitability in 16 industries by an average of 38% by the year 2035. Kate Smaje, a senior partner at McKinsey and global co-leader of McKinsey Digital, said, “Data is providing the fuel to power better and faster decisions. High-performing organizations are three times more likely than others to say their data and analytics initiatives have contributed at least 20 percent to EBIT.” It is hard to deny that success in our respective businesses is a function of how well we make use of the data available to us.

Now that we are entering a new era of advanced predictive capabilities, the future can be predicted earlier and with incredible accuracy. So back to the original question: What is holding insurers back from leveraging these advancements?

The first thing holding insurers back is history and culture. The insurance industry has been a slow adopter but a fast follower. Although some insurers are already using data to better understand and segment the risk profile of an individual, wide adoption and significant impact remain elusive. According to McKinsey, for Property and Casualty insurers leveraging big data and AI, “leading insurers can see loss ratios improve three to five points, new business premiums increase 10 to 15 percent, and retention in profitable segments jump 5 to 10 percent.” And the message from every one of the Big Four consulting firms on the biggest opportunities in the insurance industry today is the same: the new normal for insurance companies is to leverage sophisticated big data and to use advanced AI to harness its predictive power.

However, leveraging the power of advanced AI and sophisticated data sets is difficult to do. The guidance and trends are clear here as well: engaging with partners that can deliver immediate value is a way to produce bottom-line results and escape “pilot traps.” This is where InsurTechs and insurance partners like Pinpoint Predictive excel.

Pinpoint’s proprietary, first-party data is rooted in behavioral economics; it correlates with insurance outcomes and also reveals the explainable features underlying risk-related behaviors. Pinpoint’s platform combines behavioral economics with deep learning, providing leading insurers with superior risk-selection scores at the beginning of the customer journey, as well as a deeper understanding of individual-level risks. Whereas traditional risk segmentation puts people into blanket categories that can penalize them for their credit, location, or age, a more individualized approach removes potentially discriminatory categories that may penalize people for their financial status or the neighborhood they live in. And because they draw on such a vast set of data points, behavioral-economic profiles remain relatively stable over time.

It’s the data equivalent of truly knowing a person, understanding their personality, and quantifying their propensity for key risk behaviors.

Armed with this knowledge, insurance companies can get smarter about how to find and keep the right customers. A better understanding of risk informs better decisions in shaping their book of business. They can precisely prioritize prospects that think like their best customers, a vast improvement over targeting prospects with look-alike modeling.

If insurance companies get this good at knowing who potential customers are, they can identify unpriceable risks, such as fraud, litigation, and high-cost claims, at the beginning of the customer journey and use this information to augment the performance of their risk models. They can then tailor a customer’s journey, creating relevant experiences based on more accurate risk predictions and lifetime value.

Over the next decade, most insurance companies will abandon traditional, rudimentary approaches to risk segmentation based on sparse data points. Instead, they will be leveraging deep learning before underwriting a risk to make accurate risk propensity predictions at the beginning of the application process. Pinpoint is at the forefront of driving this transformation. The data modeling predictions that make the most successful tech companies so good at making assessments of who a person is, what they like, and how they buy will be the norm for insurers, not the exception. In this new competitive environment, insurers will be able to directly tie their data strategy to their loss ratios as they get better and better at targeting customers, segmenting risk, and tailoring their customer experience at an individualized level.

Pinpoint is already driving the insurance industry towards this future, and putting the predictive power of big tech in the hands of insurers to help them win. Ready to learn more? Reach out to me at

This piece was originally posted via Pinpoint Predictive.

About Pinpoint Predictive

Pinpoint Predictive brings individualized intelligence to insurers by applying deep learning and behavioral economics to accurately predict the risk propensity of an individual insured without using financial-based credit scores.

Their AI-powered platform has revealed hundreds of millions of dollars in bottom-line value to Top 10 insurers by quantifying key points of risk, including the likelihood of filing a claim, cancellation, fraud, and litigiousness.

By utilizing their proprietary Thinkalike Scores in pre-eligibility and pre-renewal decisions, carriers can leverage the predictive power of deep learning and Pinpoint’s custom database, which represents thousands of sophisticated, privacy-safe data points on approximately 260 million US adults in real time. The result is a significant improvement in carriers’ ability to target their most profitable insureds and an improvement in their loss ratio.

Discover the powerful results of using individualized intelligence to boost profitability at

Why The World’s Leading Data Experts Warn Covid-19 Data is Wrong


And how to make better decisions from the data you see

As states slowly start to ease lockdown restrictions and phase in business reopenings, data is playing a big role in helping policymakers and leaders both form and execute these decisions. But there is one question that must be asked in this process: is the data used to make these decisions correct?

In short, the answer is not always.

Enter Exhibit A: one of the leading models in the early phases of the pandemic in the U.S. went from rising star, consulted daily by White House officials, to sitting in a corner with a dunce cap after concerning inaccuracies came to light. The model from the University of Washington’s Institute for Health Metrics and Evaluation, or IHME, was being used by White House officials, the Centers for Disease Control and Prevention (CDC) and state officials around the country. It formed the basis of NPR’s popular state-by-state peak predictions and was used by many other credible news agencies.

Ruth Etzioni, a biostatistician at the Fred Hutch Cancer Center, said the IHME model makes her cringe. In a STAT article she stated, “That it is being used for policy decisions and its results interpreted wrongly is a travesty unfolding before our eyes.” Epidemiologist Marc Lipsitch of the Harvard T.H. Chan School of Public Health said of the IHME model, “It’s not a model that most of us in the infectious disease epidemiology field think is well suited” for projecting Covid-19 deaths.

The root of concern from data experts was a glaring issue.

The IHME model had predicted that Covid-19 deaths would reach 60,000 by the end of August. This was problematic because deaths in the US had already reached 68,000 by the beginning of May. On May 4th the IHME called a press conference to release their model update with a new prediction of 134,000 deaths by the end of August, more than double the previous estimate.

Yann LeCun, Facebook’s Chief AI Scientist, described IHME’s model in a tweet on May 18 as “pretty much the worst.”

Data scientist Youyang Gu (MIT ’15) built his own Covid-19 model, which is now one of 17 Covid-19 data models linked on the CDC’s site. Early in the pandemic, he repeatedly expressed concerns over the IHME model.

Due to the mounting concerns over its inaccuracies, on May 1 the CDC quietly removed the IHME model from their website. And just like that, one of the leading data sources used by Americans was put on the shelf. The takeaway for Americans: just because we see data does not mean it’s correct, especially in the middle of a pandemic where all we have to go on is a relatively small amount of very new data.

Harvard Professor of Statistics Xiao-Li Meng warned of the consequences of the poor quality of Covid-19 data that is currently available. He argues in his May 14th publication for the Harvard Data Science Review that academic studies on Covid-19, while conducted thoughtfully, are “dangerous” when researchers do not take into account the low quality of most of the Covid-19 data available today. According to him, data quality is of utmost importance:

Building elaborated epidemiological or mathematical models without taking into account the weaknesses of the data generating mechanism is still statistically unprincipled, because data quality fundamentally trumps everything else.

Data is like Transformers — there’s more than meets the eye. We need to understand the “more.”

Sadly, this is not the only data fail since the Covid-19 pandemic arrived in the U.S. In its May 21st article, “How Could the CDC Make That Mistake?”, The Atlantic reported that the CDC and several states, including Pennsylvania, Georgia and Texas, were mixing viral test data with antibody test data, damaging the public’s ability to understand what is happening in any one state. Harvard Professor of Global Health and director of the Harvard Global Health Institute K. T. Li said that blending viral and antibody tests “will drive down your positive rate in a very dramatic way.” As a consequence of this error, some of the metrics that decision makers have depended on for state reopening plans have been wrong, and we do not actually know how our ability to test people who are sick with Covid-19 has improved. The conflating of viral and antibody tests is a clear-cut example of how easy it is to dramatically skew data.

Over the past three months, we have all been consuming data daily in an effort to track this pandemic. So uncovering the inaccuracy of key data we have relied on is nothing short of frustrating. But there is a lesson in all this madness: no data is perfect.

Data quality fundamentally trumps everything else

In my 2017 TEDxProvidence talk, I highlighted the limitations of data. Having loads of data and data scientists does not guarantee our ability to make accurate predictions. Botched predictions for both the 2016 U.S. Presidential Election and the Brexit decision are sobering examples of this. It has happened again with Covid-19 projections, and we will keep seeing the same pattern repeat, because data is imperfect by nature.

So what do we do with all of this?

The takeaway here is that every person should know that data is always flawed. Whether you’re a CEO or just someone who is trying to make sense of what’s going on, we need to understand a few basic principles when looking at data. Cassie Kozyrkov, Head of Decision Intelligence at Google put together a very succinct and helpful list of “dos” and “don’ts” for interpreting Covid-19 data.

A few takeaways to keep in mind as it relates to pandemic data:

There are many different ways to measure what appears to be the same thing. The fact that some states have been lumping viral and antibody tests together and others have not is a problem. Mistakes like this happen when we don’t question how data is being measured.

Never blindly trust data or a model. While no one model is perfect in its ability to predict the future, we use models as a tool to assist with health care and resource planning. In the case of the IHME model, its inaccuracies were concerning enough to discontinue using it for policy decisions. Just like the imperfect data used to make them, data models are imperfect, too.

A better understanding of the subject matter leads to better understanding of the data. It’s a dangerous trap to fall into when we don’t have a deep knowledge of the type of data we’re looking at. Data is more accurately interpreted by those with a deep understanding of the data sources, clinical measures, and the spread of infectious diseases. There are certain areas where we do need to trust experts.

Finally, when it comes to using data to make personal decisions in a pandemic, safety is the most important thing. No amount of data will reveal that frequent hand washing, social distancing and wearing a mask are the wrong choice. As Nassim Taleb, author of The Black Swan, stated, “It’s a situation where you can’t afford to be wrong even once.”

* * *

Shannon Shallcross is Co-Founder and CEO of BetaXAnalytics.

Why Budgeting For Health Care Is Near Impossible


The average millennial will spend between one-half and two-thirds of their lifetime earnings on healthcare. This jaw-dropping estimate, outlined in David Goldhill’s book Catastrophic Care: Why Everything We Think We Know about Health Care Is Wrong, is the perfect picture of how, for Americans, the new normal involves personally budgeting for healthcare expenses. Unfortunately, it’s not an easy task to break healthcare costs down to what comes out of our personal pockets.

Divided equally among each person in the U.S., healthcare’s overall price tag averages out to over $10,000 per person each year—a whopping 18% of U.S. GDP. Since employers provide 48% of the healthcare coverage in the U.S., this burden has fallen heavily on their shoulders, and, as a consequence, they have shared it with employees. The growing popularity of high-deductible health plans and copays means employees are shouldering a larger portion of these healthcare costs, and as such, the average person needs to account for them in their financial planning.

Healthcare Costs: What Employers Pay, What Employees Pay

The Milliman Medical Index estimates medical costs each year as they relate to employer and employee contributions. Based on data from 2018, healthcare for a family of four in the United States costs $28,166. Of this total, $15,788 comes from the employer and the employee contributes on average $7,674, with an additional $4,704 paid by the employee out of pocket for deductibles and copays.

Here’s a snapshot of the cost breakdown for employer-sponsored health insurance:

The 2018 Milliman Medical Index estimates the total cost of healthcare for a family of 4 to be $28,166; $15,788 of this comes from the employer, $7,674 comes from the employee, and an additional $4,704 is paid by the employee in the form of deductibles and copays.

*2019 ACA Out of Pocket Maximums are $7,900/individual and $15,800/family.
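As a quick sanity check on those figures, the arithmetic of the employer/employee split can be worked out directly, using only the 2018 Milliman numbers quoted above:

```python
# 2018 Milliman Medical Index figures for a family of four (from the text)
employer_premium = 15_788   # paid by the employer
employee_premium = 7_674    # employee payroll contributions to premiums
out_of_pocket    = 4_704    # deductibles and copays paid by the employee

total = employer_premium + employee_premium + out_of_pocket
employee_share = employee_premium + out_of_pocket

print(f"Total cost:     ${total:,}")            # $28,166
print(f"Employee pays:  ${employee_share:,}")   # $12,378
print(f"Employee share: {employee_share / total:.0%}")  # 44%
```

In other words, roughly 44 cents of every healthcare dollar for this model family comes out of the employee's paycheck or pocket.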

These estimates are sound breakdowns based on large amounts of employer-sponsored plan data from Milliman. But do they truly inform how an individual can budget for their own healthcare expenses? Unfortunately the answer to this question is not so easy.

What Factors Influence Individual Health Spending

To understand how to budget for individual health expenses, we need to look at the levers that influence healthcare costs. There are several factors that can cause individual health costs to vary widely:

1.      Age and Gender. Not surprisingly, actual health costs can vary greatly based on an individual’s age and gender. The figure below from the Peterson Kaiser Health System Tracker breaks down the American population by age, and then demonstrates each age group’s share of overall health spending.
2.      Individual health status. Chronic illnesses such as diabetes and cancer have a marked impact on someone’s personal healthcare costs. The Centers for Medicare and Medicaid Services (CMS) report that 90% of the nation’s $3.3 trillion in healthcare spending is for people with chronic and mental health conditions.

3.      Geographic area. Differences in the costs of labor, rents and taxes in different geographic regions affect healthcare costs. Furthermore, areas of the country with more technological advances will have higher utilization rates of healthcare, further contributing to cost differences.

4.      Provider variation. A frequently criticized hallmark of the healthcare industry is that provider costs can vary widely depending on where an individual goes to seek treatment. Furthermore, different payment methodologies, pre-negotiated payment rates and capitated rates can affect healthcare costs.

5.      Insurance coverage. Richer health insurance plans tend to have higher utilization rates than budget options with less coverage. In addition, who is paying for the procedure can affect the ultimate cost. For example, what a provider is paid from Medicare (which, as demonstrated in the figure below, provides 14% of all coverage in the U.S.) and what they are paid under an employer-sponsored plan for the same exact procedure could be two different costs.

CMS breaks down the source of health cost coverage in the United States by coverage provided by employers, Medicare, Medicaid, the individual market, and other forms of coverage, while also factoring in the number of uninsured in the U.S.

What Healthcare is Costing and What is Coming Out of Our Personal Pockets Are Two Completely Different Things

In short, the reason budgeting for individual health costs is so challenging is that our system of paying for healthcare masks its true cost. The subsidization in the health insurance market muddies the waters for anyone trying to budget for their own personal healthcare costs. And just in case this wasn’t confusing enough, the rules that govern this system, which determine things like out-of-pocket maximums, change every year, as do insurance rates.

An individual trying to budget for their own expenses can make a best guess by looking at their annual share of healthcare premiums and their average out-of-pocket costs each year. This assumes that their own past health expenses are the best predictor of future expenses. But even this approach is not perfect. Considering that healthcare’s $170-per-person cost in 1960 made up only 5% of US GDP, compared to its current 18% share, the past might not always be the best predictor of the future where healthcare is concerned.
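The best-guess approach described above can be sketched in a few lines. All of the numbers here are hypothetical placeholders, and the 5% medical-inflation adjustment is an assumption; a real estimate would use your own premium contributions and claims history, and even then it inherits the assumption that the past predicts the future:

```python
def estimate_health_budget(annual_premium_share, past_out_of_pocket, medical_inflation=0.05):
    """Best-guess next-year healthcare budget: your premium share plus
    the average of past out-of-pocket costs, adjusted upward for an
    assumed rate of medical cost inflation."""
    avg_out_of_pocket = sum(past_out_of_pocket) / len(past_out_of_pocket)
    return annual_premium_share + avg_out_of_pocket * (1 + medical_inflation)

# Hypothetical example: a $7,674 premium share and three years of
# out-of-pocket costs (deductibles and copays)
budget = estimate_health_budget(7_674, [3_900, 4_400, 5_100])
print(f"${budget:,.0f}")  # $12,364
```

Note that nothing in this calculation can anticipate a new diagnosis, a plan-design change, or a move to a more expensive region, which is exactly the limitation the paragraph above describes.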

* * *

About BetaXAnalytics:

We combine data science with clinical, pharmacy and wellness expertise to guide employers and providers into a data deep-dive that is more comprehensive than any data platform on the market today. BetaXAnalytics uses the power of their health data “for good” to improve the cost and quality of health care. For more insights on using data to drive healthcare, pharmacy and wellbeing decisions, follow BetaXAnalytics on Twitter @betaxanalytics, Facebook @bxanalytics and LinkedIn at BetaXAnalytics.

Analytics For Employers: A Tutorial (Part 1)


It’s been almost 3 years since we started BetaXAnalytics with the goal of using data science to offer strategic guidance to employers and providers on healthcare spending and services. Since opening our doors, we’ve spent a lot of time talking with companies who pay for healthcare for their employees, as well as the brokers and consultants who help to guide these decisions. At the same time, we’ve spent time taking a look at many of the analytic tools that are on the market right now—these are the technology platforms that provide spending transparency to employers and their brokers.

From day 1 when we started these customer interviews, one resounding theme was apparent. The biggest question we heard from employers and their brokers is simply this: Having data is good…but what do you do with it?

Three years later, this is still the most common question we hear. We see this recurring question from employers and their brokers as a symptom of the early-stage maturity of the employer health analytics market. In short, even as more self-insured employers have begun using health data to help manage their spending over the past decade, we haven’t moved far from the starting line.

Anyone who is familiar with the general progression of analytics will recognize the analytic maturity model below.

At its most basic level, health care analytics is often pigeonholed into “counting things.” Counting dollars, counting medications, counting members…and watching these numbers go up and down. So every time we get the question, “What do you do with the data?” this just reaffirms that most employers are still in the dark with respect to using data to drive their benefits strategy. This type of “analytics” examines only the past and gives very little insight into the 4 critical areas of focus for a healthcare purchaser (which we explain in Part 2 of this post).

After seeing many of the analytic tools on the market today, we can confidently assert that the market for using health analytics to control employer healthcare spending is here:

Having data is good…but what do you do with it? This recurring question is a symptom of the low analytic maturity of the current state of employer health analytics—that is to say, we pay for access to data, but it’s rarely actionable. “Analytics” in this stage is synonymous with “counting” and data is hindsight-focused on reporting what has already happened.

The current state of employer analytics is a good start, but it barely scratches the surface of the strategic potential of analytics. Tracking spending is important, but true analytics go far beyond spending: they yield insights into population health, support the design and tracking of programs that target specific conditions and mental health, and even show how well benefits are being communicated to employees. When we move past hindsight analytics to incorporate insight and foresight, we move past counting things and into the realm of strategic benefit planning. This means developing a deeper understanding of who associates are, the benefits that will attract the best talent, and the optimal strategy for funding these next-generation benefits packages. As with so many initiatives that fall under the Human Resources / Human Capital umbrella—talent acquisition and retention, compensation, healthcare, engagement, benefits, wellbeing—the most strategic analytics consider all of these areas. This is the future of analytics—and the future is here.
Image credit: iStockPhoto

Forget Flashy Technology: Here Are 3 Data and Analytics Best Practices Any Company Can Use Right Now


The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill in the next decades.

~Dr. Hal Varian, Chief Economist at Google

Practically everyone is talking about using data and analytics to succeed in business today, yet surprisingly, companies derive only a fraction of the value available in their data when making decisions. The reasons vary across organizations, but often it comes down to budget constraints, talent constraints, or a lack of recognition from leadership that analytics will help the business run better. During an interview in 2009, Google’s Chief Economist Dr. Hal R. Varian predicted, “The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill in the next decades.”

Let’s take a look at some of the highest-performing companies today. Over the past five years, 13 companies have managed to outperform the S&P 500 each year. And when you take a look at this elite group—which includes companies such as Facebook, Amazon, and Google—you find that the majority of these businesses are algorithmically driven. These companies take in data constantly and use it in real time to update the user experience. In their 2012 feature on big data, Andrew McAfee and Erik Brynjolfsson shared findings from their research that “companies in the top third of their industry in the use of data-driven decision making were, on average, 5% more productive and 6% more profitable than their competitors.” It is hard to deny that success in our respective businesses is a function of how well we make use of the data available to us.

So how does Human Resources (HR) fit into this picture? HR may not be the first group you think of when considering who should have a data strategy. However, HR has the weighty responsibility of managing the top expenses of a company: salaries, healthcare, and benefits. The 2018 Milliman Medical Index estimates that the cost of healthcare for a family of four this year will be upwards of $28,166. Yet approximately 20% of employer-sponsored health care spending is wasted each year on unnecessary or preventable costs across the continuum of care. The rise of high-deductible health plans means that decisions made within HR on health plans and benefits weigh heavily on employees’ pocketbooks as well. When we look at HR through the expense-management lens, we see that HR carries the company’s fiduciary responsibility to manage these expenses not just for the bottom line of the employer, but also for the sake of their employees’ wallets.

We often see companies who make the decision to start using data and analytics immediately start shopping for a tool to make use of their data. While this step may be right for some companies, there are a few foundational analytics best-practices that we recommend companies have in place before making any analytic technology investments.

1.      Understand the quality of your data. One of the biggest mistakes we see companies make is assuming that just because a report comes from IT or from a vendor, the data is correct. In reality, the data a company captures is very rarely in “ready-to-use” form. IBM estimates that poor data quality cost American companies $3.1 trillion in 2016 alone. A recent study of 75 executives who assessed their own organizations’ data quality found that only 3% of their companies’ data met basic quality standards. Furthermore, executives who understand that data quality is a fundamental issue within organizations are better positioned to understand how data quality affects their vendor partners as well. Every bit of data we review is a piece of a much larger picture, and understanding the limitations of your company’s data quality helps you make a more accurate assessment of its insights.

2.      Develop your data strategy. Take a step back from day-to-day operations to decide how data can help inform your decisions. This determines what metrics you’re looking at and how often you’re receiving them. Many companies are surprised to find that developing a data strategy often means reducing the number of reports people are looking at. A common assumption is that the more data we’re looking at, the better off we are. In reality, when decision-makers are inundated with extraneous reports, they may miss the valuable messages they need to see. What goals is your division working towards? Which pieces of data most closely track progress toward these goals? A strategic process for looking at data aligns your business goals with a limited number of key metrics that indicate when changes are needed to reset course.

3.      Identify a data “expert” on your team. Given the data quality issues that exist in every organization, it is valuable to identify someone who is intimately aware of the source and limitations of the data your company assesses. This person can answer questions about why particular data might be wrong, whether duplicate records are skewing the data, or how outliers are affecting results. Your data expert can help tell the story of your organization’s data to better frame what actions are needed to meet your operating goals.
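One concrete way a data expert can show “how outliers are affecting results” is with a simple interquartile-range (IQR) rule. The sketch below, on made-up claim amounts, flags extreme values and compares the average with and without them; the 1.5×IQR cutoff is a common convention, not a rule from the text.

```python
# Sketch: flagging outliers with a 1.5 * IQR rule and showing how much
# they skew a simple average. The amounts below are toy values.
import statistics

def iqr_outliers(values):
    """Return values beyond 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lo or v > hi]

amounts = [110, 95, 102, 98, 105, 100, 4000]   # one extreme claim
outliers = iqr_outliers(amounts)
mean_with = statistics.mean(amounts)
mean_without = statistics.mean(v for v in amounts if v not in outliers)
```

Seeing the two means side by side is often all it takes to explain why a headline metric moved.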

Using data to make better business decisions does not need to be cost-prohibitive. Before investing in any data and analytics tools, implement these best practices to lay the groundwork for a sound approach to using the data you already have. They can be used by any company, regardless of size or budget. And the best part is, you can start to use these best practices today.

* * *

Bob Selle has led culture change and organizational design for some of America’s most recognized retailers. He is currently the Chief Human Resource Officer for the Northeast’s premier close-out store, Ocean State Job Lot, leading a transformation that has earned the company recognition as a Forbes Best Midsize Employer two years in a row.

Shannon Shallcross is a data expert who believes that data interpretation holds the key to solving healthcare’s toughest challenges. As the co-founder and CEO of BetaXAnalytics, her company uses the power of data “for good” to improve the cost, transparency and quality of healthcare for employers.

See Bob and Shannon at the Strategic HR Mt. Washington Conference on October 29th, 2018 during their plenary session, Metrics That Matter: Let Numbers Tell a Story.

How Much Personal Data Did I Give Up to Take This Facebook Quiz?


I’m about to reveal a big secret about myself. I love a good Facebook quiz. Whether I’m finding out what I will look like in 20 years or what my leprechaun name is, it’s fun to do these mindless games on Facebook and compare results with friends. If you’ve ever done one of these, you know it’s easy–you click one button to agree to share information about yourself, information in your Facebook profile, and information on your Facebook friends. What could be the harm? We figure, “Of course this information is needed if we’re looking to find the accurate answer to ‘What will my Hollywood movie poster look like?'” It seems harmless, so we trust it.   

The Facebook platform collects massive amounts of data on us, and it does so in a brilliant way. Imagine a stranger knocking on your door and asking you for a list of all your family and friends, along with photos and everything you know about them. No one would ever fall for this. But now that Facebook is such a familiar and popular way to connect with people, it doesn’t feel like a stranger to us. We “trust” Facebook, and we use it to store massive amounts of information about ourselves and the people we know. In fact, we trust it so much that when it comes to its “privacy agreement,” we agree without even reading the terms.

The reason the Facebook/Cambridge Analytica debacle has people angry is that people assumed there was no risk in how their data from Facebook would be used. But in this case, to the shock of the world, Facebook exposed data on 50 million users to a researcher who worked at Cambridge Analytica. And, as another piece of the puzzle, Cambridge Analytica worked for the Trump campaign. So as the public wields pitchforks at Facebook’s door, the first lesson for us all is this:

#1: Any data that we’re publicly sharing will be used.

And once our data is out there, absent restrictions, we have little control over how it is used. Data is valuable to companies, both in utility and in dollars. So when it comes to any platform that collects and stores data on you, you can assume this data will be used in some way or sold to a third party.

#2: So much more of our personal data exists than what we realize.

It’s scary, I know. Data on you and me is everywhere. And if you have watched my talk for TEDxProvidence, you know how the amount of data we’re able to capture has increased exponentially in just the last 15 years. According to Google’s former CEO Eric Schmidt, we now generate as much data every two days as was created from the beginning of time until 2003.

Our data is used by marketers, by election strategists, by grocery stores, and by prescription drug companies. It’s used by every social media platform, and our data is used by their affiliated companies as well. Simply put, most companies are using our personal data in some way.

#3: Not only are most companies using our data, but the most successful companies are built on data. 

There are 13 companies in the S&P 500 that have managed to outperform the index five years in a row. The majority of these companies are “algorithmically driven,” meaning they gather data from their users and update the consumer experience almost automatically. These are companies like Facebook, Amazon and Google. Global business investment in data and analytics will surpass $200 billion a year by 2020. In the future, we will see more and more businesses moving data to the core of their competitive strategy.

What does this mean to us? The time is right for the public to champion a universal code of ethics surrounding our data use.

#4: Our data should be protected by a common code of ethics.

Now that we have a glimpse of what can happen when data is available, unrestricted, in the hands of others, we need a common set of rules to govern data use. DJ Patil, the first Chief Data Scientist for the White House, reminded us that “with great power comes great responsibility” in his February 2018 call to action, “A Code of Ethics for Data Science.” Coincidentally, this post was published over a month before the Facebook/Cambridge Analytica scandal hit the press. The responsibility of using data appropriately weighs heavily on the minds of many within the data science community.

When my partners and I formed our company BetaXAnalytics, our founding principle was to use the power of data “for good” to improve the cost and quality of healthcare in the United States. Because we had deep experience in clinical and pharmacy data science, we knew there was a resounding need for ethical transparency for those who are paying for health services. We wanted to provide the actionable insight our clients need to make decisions regarding healthcare services and care coordination.

Because my company BetaXAnalytics works with healthcare data, the way we protect data is governed by HIPAA, the legislation that ensures both the privacy and safeguarding of people’s health-related information. A large share of our time and resources goes toward maintaining data security and privacy. The data we use is governed by strict contracts with our clients, and we never sell data to third parties.

As a company whose business is built on interpreting health data, we live by the mantra “with great power comes great responsibility.” We hope to see this movement grow both within and outside the data science community to work towards using the powers of data “for good.”

 Shannon Shallcross is Co-Founder and CEO of BetaXAnalytics

How To Make Data-Driven Decisions When You Don’t Have Data


In 1934, T.S. Eliot famously lamented the empty soul of modern work life. Though he wrote “Choruses from ‘The Rock’” over 80 years ago, he hits a nerve in our present-day struggles by asking, “Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?” We have so much data at our fingertips today, but does that mean we are making better decisions? At its core, data analytics is simply using information to make well-informed decisions. The only difference from 80 years ago is that we have more information available and more sophisticated methods to use it.

A question that I get time and time again from managers is, “How do I make data-driven decisions when I don’t have any data?” As a decision maker, it’s incredibly frustrating to feel hampered by a lack of data. Despite the wide availability of information, companies might not put data into the hands of decision makers for a few reasons. Maybe the organization does not have an effective way of capturing data—this happens in companies with older technology in key areas of the business. Or maybe the data they have is too messy—for instance, perhaps they can track customer quotes online, but they have no way of consolidating the 30 different quotes that were actually generated by the same person. In other cases, data is kept sectioned off in certain parts of a company and not shared with the people whose decisions depend on it. Whatever the reason managers feel they lack access to the information needed to make informed decisions, there are a few guidelines you can follow to ensure that you are making the right ones.
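The “30 quotes from the same person” problem above is usually attacked by normalizing identifying fields before grouping. Here is a minimal sketch of that idea; the field names (`name`, `email`) and sample quotes are invented for illustration, and real matching is typically fuzzier than this.

```python
# Sketch: collapsing multiple online quotes that came from the same
# person by normalizing name and email before grouping.

def normalize(quote):
    """Build a grouping key: lowercase, trimmed, single-spaced."""
    name = " ".join(quote["name"].lower().split())
    email = quote["email"].strip().lower()
    return (name, email)

def group_by_customer(quotes):
    """Map each normalized identity to the quotes it generated."""
    groups = {}
    for q in quotes:
        groups.setdefault(normalize(q), []).append(q)
    return groups

quotes = [
    {"name": "Jane Doe",  "email": "JDoe@example.com"},
    {"name": "jane  doe", "email": "jdoe@example.com "},  # same person
    {"name": "Sam Smith", "email": "sam@example.com"},
]
groups = group_by_customer(quotes)   # 3 quotes, 2 actual people
```

Even this crude normalization turns an unusable quote log into a countable list of prospects, which is often enough to start making decisions.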

The key is not to get more data – it’s to get the right data.

It’s important to keep in mind you can have all the data in the world and still not have the information you need. The key is not to get more data – it’s to get the right data. In research from the book Stop Spending, Start Managing, executives reported wasting an average of $7,731 per day—or $2.8 million per year—on wasteful “analytics.” The first step to making sound decisions is to recognize what that “right” data is for your business. Once you identify this, you can cut the time you spend looking at reports significantly, because now you have a strategy. You know exactly what you need to see to make a decision, and you can see through the noise of mountains of data that don’t add value to your decisions.

Executives reported wasting an average of $7,731 per day—or $2.8 million per year—on wasteful “analytics.”

If you don’t have access to the data you need at work, here are some steps you can take:

1.      Identify your business goals.  Here’s your opportunity to start at square one and holistically rethink how your decisions are made. This entails taking a 50,000-foot view of your business to make sure that you’re asking the right questions. We often get in the habit of process, and we repeat patterns of looking at old reports that don’t tell us what we really need to know. If your business unit has always looked at a set group of metrics, it’s easy to get tunnel vision and to see dropping a familiar report as a bad decision. But I recommend taking a step back to ask the following questions before even looking at any data:

·        What are the business objectives for which we are responsible? (In other words, what are our goals?)

·        What are the crucial areas of the business that we need to be tracking?

2.      Identify which data you need to track progress on your goals. What data do you need to see to be able to track progress on these goals and to make sound decisions? In most cases, every business goal you cite has one or multiple metrics that will help you to gauge progress against that goal.

3.      Examine your data access. Identify which of these must-have pieces of data you have access to. For the data you don’t currently have access to, identify how you can get access. This can be as easy as requesting access from another department, or as hard as implementing a way to capture new data.

4.      If needed in the short term, identify proxy data for the information to which you don’t have access. When you can’t access crucial data, is there a proxy measure that would tell you the same thing? For instance, if you have no way today of tracking the number of customers who are calling with a particular complaint, can you poll your front line customer service representatives to identify trends in complaint themes? Finding a short-term proxy for needed data will provide you with some useful information. The proxy is not a perfect solution, but in the short term it’s better than using no information at all.

5.      Start the process of gaining access to the data that you need. As simple as this sounds, if you’re in a situation where you don’t have access to crucial data, the goal is to exit this reality as soon as possible. Whether this means insourcing or outsourcing to gain access to data you need, there’s simply no business case for continuing to manage without the right information.

The guiding principle of how to manage your data is to identify what data aligns with your goals—if you don’t have access to this data today, the best place to be is somewhere on the track to gaining it. Identifying proxy data is a bridge out of an undesirable situation and toward one that puts you on the right path. But it is important not to accept a lack of data within your company simply because it’s “the way it’s always been done.” If you find yourself clamoring for meaningful metrics, creating a process to get this data involves some work–but there are huge rewards for your business in the end.

* * *

BetaXAnalytics is a healthcare data consulting firm that helps payers and providers to maximize their CMS reimbursements and helps employers to reduce their healthcare spending through proven strategies to contain costs. For more insights on using data to drive healthcare, pharmacy and wellbeing decisions, follow BetaXAnalytics on Twitter @betaxanalytics, Facebook @bxanalytics and LinkedIn at BetaXAnalytics.

2 Reasons Why Your Data is Lying to You


Big Da·ta noun

An overused buzzword, which, despite its lofty sound, basically means “lots and lots of data.” A Mount Everest of tangled data. 

The term “Big Data” gets thrown around all too often these days, but anyone who works closely with healthcare data is intimately aware of its shortcomings. From the lack of sharing patient data between providers to inconsistencies in recording it, the more we know about the problem, the more impossible it seems to unlock the powerful potential that lies in healthcare data. But at the heart of the issue, there are two main reasons why people don’t get accurate insights from their data.

Reason #1 Your Data Lies: It’s Dirty

Software expert Hollis Tibbetts, formerly the Global Director of Marketing at Dell, estimated that duplicate data and bad data combined cost the U.S. economy over $3 trillion every year. This staggering number is about two times the national deficit.

Unfortunately, the healthcare industry in particular is a breeding ground for duplicate data. The U.S. Attorney’s office estimated that 14% of healthcare spending is wasted due to dirty data; this includes duplicate and/or incomplete data. With 16% of the U.S. Gross Domestic Product attributed to healthcare spending – or $2.14 trillion in total spending – that would mean that duplicate and dirty data cost the healthcare industry over $300 billion every year. And the sad reality of this issue is that 50% of IT budgets are spent on data rehabilitation.[1]

Larry English, an acclaimed information quality expert and creator of the Total Information Quality Methodology (TIQM), has estimated that 15-20% of a company’s operating budget can be wasted due to dirty data. This number is quantified by the exhaustive effort to extract, manipulate, append and scrub data via SQL, Excel or other means. And this estimate is independent of the fact that 30% of healthcare provider records are inaccurate or missing information due to inconsistent entry of codes and inaccurately transposed metrics or patient identifiers.[2]

Reason #2 Your Data Lies: It’s Interpreted by People Who Do Not Understand It

A study by McKinsey has projected that “by 2018, the U.S. alone may face a 50 percent to 60 percent gap between supply and requisite demand of deep analytic talent.” The shortage is already taking hold across industries, including healthcare, finance, aerospace, insurance, and pharmaceuticals. In April 2014, the consulting firm Accenture surveyed its clients on their big-data strategies, and more than 90 percent said they planned to hire more employees with expertise in data science—most within a year. However, 41 percent of the more than 1,000 survey respondents said a lack of talent was their main hurdle.[3]

Data Scientists are important in the process of data cleansing, appending and analysis because they work with unstructured data. These are the people who write algorithms to extract insights from the mounds of disparate data sources, including e-mails, text notes, photos and other user-generated content. They sort through the mess of dirty (messy, incomplete, and inaccurate) data and neatly append it to uncover the true insights.

All analytics must start with data investigation. Since data is inherently messy, the analysis process must begin with a multi-faceted cleansing process performed by someone who, when working with health data, has deep clinical understanding. This knowledge enables them to identify and appropriately treat negative values, reversals, duplication, adjustments and other data anomalies. This experience also enables them to check for clues throughout the process as to why data may not make sense. For example, thoroughly examining data may reveal that patient IDs are being recycled and patient data inadvertently mixed together. Yes, this happens. Dirty data is not to be trusted…ever.
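To make the cleansing checks above concrete, here is a minimal sketch that flags three of the anomalies mentioned (duplicates, reversals, and stray negative values) in a toy claims table. The field names and the modeling of a reversal as an exactly offsetting amount on the same claim ID are simplifying assumptions, not how any particular claims system works.

```python
# Sketch: flagging duplicate lines, reversals, and unexplained negative
# amounts in a toy claims table. Field names are hypothetical.

def flag_anomalies(claims):
    """Return (row index, anomaly type) pairs found in the claims list."""
    flags = []
    amounts_by_id = {}   # claim_id -> amounts seen so far
    seen = set()
    for i, c in enumerate(claims):
        key = (c["claim_id"], c["amount"])
        if key in seen:
            flags.append((i, "duplicate"))   # exact repeat of a line
        seen.add(key)
        if c["amount"] < 0:
            # negative amount: a reversal of an earlier claim, or an error
            if -c["amount"] in amounts_by_id.get(c["claim_id"], []):
                flags.append((i, "reversal"))
            else:
                flags.append((i, "negative"))
        amounts_by_id.setdefault(c["claim_id"], []).append(c["amount"])
    return flags

claims = [
    {"claim_id": "C1", "amount": 120.0},
    {"claim_id": "C1", "amount": -120.0},  # reversal of the first claim
    {"claim_id": "C2", "amount": 80.0},
    {"claim_id": "C2", "amount": 80.0},    # duplicate line
    {"claim_id": "C3", "amount": -15.0},   # negative with no match
]
flags = flag_anomalies(claims)
```

The point is not the specific rules but that checks like these run before any analysis, so that reversals and duplicates never masquerade as real utilization.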

Bring Truth Out of Data

It is easy to get caught up in the buzz of “Big Data.” You may have a strategy for collecting data…and maybe even an analytics department. But neither of these efforts means your data is telling the truth. If a significant part of your data management strategy is not allocated to 1) scrubbing data and 2) ensuring those who work with the data truly understand it, your data’s actionable insights (read: truth) may still be hiding.

If you liked this post, please share it, click the FOLLOW button above to get more, and comment below!

Shannon Shallcross is the CEO of BetaXAnalytics, a company that leverages data insights to improve clinical outcomes, improve patient well-being and decrease health care costs. They deliver custom tools and data analytics to managed care organizations, providers and employers to reduce costs and improve the quality of healthcare and pharmacy services.

Follow BetaXAnalytics on Twitter @betaxanalytics and LinkedIn at BetaXAnalytics.

[1] Tibbetts, H., 2011. $3 Trillion Problem: Three Best Practices for Today’s Dirty Data Pandemic. [Online] Available at:

[2] A Business Case for Fixing Provider Data Issues: Save Money, Reduce Waste and Improve Member Services: Proactive Provider Data Management. [Online] Available at:

[3] Orihuela, Rodrigo and Dina Bass. Help Wanted: Black Belts in Data. [Online] Available at: