How Data Can Add Value to Your Organization

Businesses of all sizes have unprecedented access to data. In addition to their own stores of website and transaction activity, there is social media data, third-party data, open-source data on markets, companies and people, and many other sources. According to one Forbes article, humanity generates 2.5 quintillion bytes of data each day. While many organizations recognize that there is value in this data, harnessing and understanding it is where they fall short. Here are a few areas where organizations can focus on getting the most value from their data; over the next few months we will dive deeper into real-world examples and case studies.

Performance

"I have great respect for the past. If you don't know where you've come from, you don't know where you're going." - Maya Angelou

Understanding the past helps to ground an organization in how it is performing in the present. Without the context of previous efforts it is impossible to understand how an organization has improved or, worse, how it has declined. What’s more dangerous is not understanding why an organization has improved or declined. Using data proactively not only allows you to benchmark against your previous efforts, it also allows you to understand where improvements need to be made.

Understand the Market

Speaking of benchmarking, with the right data on your market you can also understand how you compare to your competition, where the market is heading and your organization’s position within it. What does this mean for you? A broader understanding of your offerings and what resonates with your customers. Knowing your strengths and weaknesses in your market can help you understand where you may want to focus resources and effort, where to pull back and what gaps exist in the marketplace that your organization can help to fill.

Making Smarter Decisions Faster

One of the great things about the way data is collected and organized today is that it is nearly instantaneous. If you are able to review that data more quickly, or better yet, employ advanced statistical techniques and AI to help review and act upon it instantly, you can improve performance on a personalized, moment-to-moment basis. Don’t get me wrong… this requires a lot of upfront planning, analysis and understanding as well as regular upkeep and management. But predicting sales trends, personalizing the customer experience, analyzing top customers, improving marketing efforts, building a better SaaS product, automating customer service, managing inventory and so many other areas of your organization can all be improved. The upside is worth it for large and small organizations alike.

Improving Process

We touched on this topic just a bit above. Data can be used to improve all sorts of processes. In selecting products, it can be important to use data to choose what inventory is best suited for your market, how many of each item you need, at what price to sell them and which customers have the highest propensity to buy at any given time. It is also important to understand when and which products can be used to upsell or down-sell customers. How often should you email people, and at what time? What features are customers using most often, and which are they barely using? How are people using your product? What will a specific customer want to buy next? There are so many questions data can help answer so that your team can make the most of its resources, optimizing your business for the best customer experience and increased revenue.

Over the next few months we will go over each of these areas in depth with real-world examples from businesses of all sizes. So, how does your organization use data?


Data Policy and Ethics 2020: The Year in Review

After living through 2020 and experiencing the first weeks of 2021, it is no wonder that people across the globe are questioning the ethical standards of everyday organizations and the people in charge of them. Understanding the consequences of the actions of people in power and their followers, or lack thereof, feels daunting after a year of so many tragedies: the murders of countless Black and brown people, the pandemic, one of the hottest years on record, political unrest and economic struggles. Looking back at some of the issues of the past year, we’ve made great strides in the areas of data ethics, but the circumstances have also raised questions. Admittedly I, like most people, was occupied by the ever-changing state of our world, managing work-from-home situations, constant meeting interruptions, and trying to retain some ounce of normalcy. Keeping up with the latest news out of the data science world was not a top priority. Now that we are in a new year we can look back at some of the most impactful changes in data science and data policy, changes whose implications reach from marketing and sales to government, entertainment and every other facet of life well into the future.

The Trump Administration Makes its Mark on AI Policy

The Trump administration has had a tumultuous relationship with ethics over the past four years, and in the first few days of 2021 it has only seemed to double down. The President’s involvement with the riot at the Capitol and his subsequent calls for action call into question the policies and ethics surrounding these decisions. Similarly, decisions made throughout his presidency call into question the ethics of this administration. Despite these actions, Trump made an interesting move in the world of data policy and ethics. At the end of 2020 the President signed an executive order to ensure that agencies of the United States government use AI in an ethical and trustworthy way and remain competitive in the industry. Over the past year, agencies of the US government have hired Chief Data Officers and begun creating data standards, practices and codes of ethics. The Department of Defense led the way with a comprehensive document on how the agency will ensure ethical standards around data integrity and data use.

While these comprehensive plans are exciting developments in a government that has historically been slow to adopt technology, they are not compulsory or cohesive across agencies, and the standards of ethics in this order are vague at best. Unfortunately, President Trump’s executive order does little to remedy this. However, the order does create a registry of models deployed within the government, set up a timeline for creating policy guidance, encourage agencies to hire tech-focused teams and individuals and encourage transparency in AI use throughout the government in areas not involving R&D or national security.

As with all outgoing administrations, there is a chance that this order will get modified or thrown out completely; however, the basis on which this executive order stands feels solid to me. As in most cases involving AI policy, my greatest concern is transparency: allowing people to understand when and how AI might be affecting their lives and giving watchdogs the ability to call out machine learning models and AI structures that could cause harm. This executive order sets up a structure that begins to do just that.

The Facebook and Google Lawsuit Means a Reckoning for the Tech Industry... No Matter the Outcome

After the terrifying events at the Capitol in early January, some of the fastest to act in condemning the actions of the president and his followers were the big tech companies, banning Trump, blocking some of his content and blocking groups that helped organize the riots. One has to wonder how these actions, as well as past half-baked attempts to quell misinformation and hate groups, will affect upcoming litigation against big tech.

Several states and the US federal government have filed suit against Facebook and Google for antitrust violations. While many people understand the lawsuit to be rooted in the traditional idea of a monopoly that raises prices and reduces supply, thus harming consumers, the lawyers in this case are arguing that these companies are not harming consumers economically but wield so much data power that they essentially make it impossible for competitors without the same terabytes of data on each "customer" to enter the market competitively. Additionally, these lawyers are arguing that access to these data stores through software interfaces (APIs) could give large players like Facebook and Google a bargaining chip they can use against smaller companies trying to enter the market.

While this lawsuit could take several years to come to a conclusion, I think it shows that there are some vulnerabilities in the big-tech data-as-a-service model. Since these companies rely on economies of scale (more users equals more data equals a better product for users equals customers who want to buy user data equals more users… and the cycle continues) and this is exactly what the lawyers are targeting, it may mean big tech needs to revise its business model. It may mean that this data can no longer be considered part of a company’s secret sauce. If the lawsuit fails, it may just mean that companies have to come to terms with the fact that the legal system that once shielded them, or at the very least looked away, may no longer be as favorable as it once was. Companies may begin to scrutinize their own systems and innovate. Or perhaps it will take another big lawsuit to shake up big tech. We probably will not see the conclusion of this case and all its implications until 2022.

Data Science Implications for Climate Change Policy

In other areas, we are seeing the consequences of our actions in real time. In 2020 we witnessed fires in Australia and California, an overly active hurricane season and one of the warmest years on record. Weather and climate are producing extremes, and these outliers are making an already difficult forecasting problem even harder. Adversarial training is a machine learning methodology that has helped make understanding climate change easier.

What is adversarial training? It isn’t new. My first encounter with this technique was in the book Zero History, in which the character Pep wears a shirt that “...compels erasure. That which the camera sees, bearing the sigil, it deletes from the recalled image.” This was brought to life by Belgian researchers (I’m not sure whether or not they read the book) who created patterns that fool facial recognition and other image recognition software into “ignoring” that part of the image. Another public incident was the Tesla self-driving software vulnerability in which simply adding a small piece of electrical tape to a 35 mile per hour sign caused it to be read as an 80 mile per hour sign. Generally, the technique works by giving a machine learning model input meant to deceive it. While it is often used as a hacking tool, in this case it was used to strengthen climate forecasting models and increase accuracy.
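
To make the idea concrete, here is a minimal sketch of crafting an adversarial input against an image classifier: nudge each pixel slightly in the direction that increases the model’s error. The tiny model and random tensors below are hypothetical stand-ins for illustration, not any of the systems mentioned above.

```python
# A minimal adversarial-example sketch (FGSM-style), assuming a stand-in
# classifier and a random "image". Illustrative only.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in classifier
model.eval()

image = torch.rand(1, 1, 28, 28, requires_grad=True)  # stand-in input image
true_label = torch.tensor([3])                         # stand-in ground-truth class
epsilon = 0.1                                          # how large a nudge to allow

loss = nn.functional.cross_entropy(model(image), true_label)
loss.backward()

# Shift every pixel a tiny amount in the direction that increases the loss:
# a change that is nearly invisible to a person but can flip the prediction.
adversarial_image = (image + epsilon * image.grad.sign()).clamp(0, 1)

print("before:", model(image).argmax(dim=1).item())
print("after: ", model(adversarial_image).argmax(dim=1).item())
```

The perturbation is tiny by design, which is exactly the weakness the tape-on-a-sign trick exploited.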

Researchers at the US Department of Energy’s National Renewable Energy Laboratory used this technique by creating competing models that learn to distinguish real from fake inputs, producing more realistic, higher-resolution results. This is particularly important for climate work, which often lacks high-resolution data for different scenarios. Additionally, the computational methods traditionally used are slow and cumbersome, requiring computationally dense physics equations and additional data storage, which costs organizations money and time. This improved efficiency means high-resolution climate data will be available at a global scale as well as in smaller regional views. As these high-resolution scenarios become more widely available and an administration more favorable to climate action takes office, climate change policy will become a high-priority item for government and industry alike.
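
For readers curious about the mechanics, here is a heavily simplified sketch of the competing-models idea: a generator learns to turn coarse fields into plausible high-resolution ones while a discriminator learns to tell generated fields from real ones. It uses tiny networks and random stand-in data; it illustrates the general adversarial setup, not the NREL team’s actual models.

```python
# GAN-style adversarial training for super-resolution, heavily simplified.
# All "climate" data here is random noise used purely for illustration.
import torch
import torch.nn as nn

generator = nn.Sequential(                      # coarse 8x8 field -> fine 16x16 field
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)
discriminator = nn.Sequential(                  # fine 16x16 field -> real/fake score
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(16 * 8 * 8, 1),
)
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(100):
    coarse = torch.rand(32, 1, 8, 8)            # stand-in low-resolution fields
    real_fine = torch.rand(32, 1, 16, 16)       # stand-in high-resolution fields
    fake_fine = generator(coarse)

    # Discriminator step: score real fields as 1 and generated fields as 0.
    d_loss = bce(discriminator(real_fine), torch.ones(32, 1)) + \
             bce(discriminator(fake_fine.detach()), torch.zeros(32, 1))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: produce fields the discriminator accepts as real.
    g_loss = bce(discriminator(fake_fine), torch.ones(32, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```

The two networks improve against each other, which is where the extra realism in the downscaled output comes from.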

2020 required all businesses and organizations to adapt to new and ever-changing market circumstances, embracing technology in order to weather a storm that will carry on into the upcoming year. As organizations continue to look for ways to become more efficient and stay competitive in this new environment, I think we will see many more advancements and a democratization of data science as companies of varying sizes and industries embrace the technology and begin to use it in different ways. With this use will come policy changes and a new focus on, and need for, data ethics. I’m cautiously optimistic about what we might see in 2021 and beyond. As we have seen in the past year, when there is a void in policy and procedures, the systems we rely on can quickly be misused or dismantled entirely. We’ve seen these systems have far-reaching effects in our everyday lives, becoming more than just data policy and having lasting effects on human rights. I strongly advise Biden and his new administration to think carefully about the human impact as they review and create new AI and data ethics policies: our climate, our social interactions, the work of our government agencies and so much more could be at stake.


Health Organizations and Data Science: Still a Contentious Relationship?

Healthcare is an area that many people agree lives in two worlds: one of cutting-edge scientific studies and another of old-school paper filing and dated documentation systems. Data science and analytics have long been a part of healthcare systems in this country, but many in leadership believe they have not yet been integrated to their fullest potential. Why might this be? With healthcare workers often among the most educated and technically savvy professionals, why are they missing out on this golden age of data?

The healthcare industry lags behind others in non-medical innovation.

It is difficult to change how a large and often siloed organization works from the inside. When it comes to data analytics and procedures, the Brookings Institution reported that 56% of hospitals have no strategies or plans in place. Some of the reasons the healthcare industry lags behind other institutions have to do with its unique position with respect to federal policy, institutionalized practices and history; others are as ingrained as human behavior, from both an institutional and a patient perspective. Implementing sound data practices will be a journey and should be done intentionally for the best impact on patients and healthcare providers.

Healthcare is a very human-centric industry.

People are wary of letting machines make life decisions for them beyond which movie to watch on Netflix. Rightfully so. Bias in machine learning and artificial intelligence has been shown to amplify some of the racist and sexist decisions of our past. However, when implemented correctly, machine learning in healthcare has proven to be more effective and accurate than physicians’ diagnoses. And while I don’t condone nixing the doctors, data can be a great way to supplement a healthcare provider’s arsenal. It may take some time for patients to get used to receiving routine diagnoses from a machine, but paired with healthcare providers’ experience, this data can be invaluable.

Policy can drive many decisions in the healthcare space.

Policy is the metadata of the healthcare space: it is never going away, and it constantly changes how people are able to give and receive care. Policies drive how and when data can be used, what treatments are available to which patients and so much more. It is understandable that many healthcare organizations are intimidated by having to keep this data up to date and relevant in how it interacts with their databases. But procedures can be built around policy changes and how they affect data, and logs built into databases can make it easy to find when policy changes went into effect and how they have affected the performance and goals of healthcare organizations.

The healthcare industry has ample opportunity to use data analysis to its advantage. Finding when and how to implement these changes will be key for organizations to stay ahead of the curve and provide the best possible care to their patients.


The 1% Perspective in Data Science

Data science is often looked at as a game that the big companies have mastered and smaller companies can only watch from the sidelines: a game that is too expensive, too difficult and too time-consuming for smaller players to excel at. This has led to numerous articles and opinion pieces saying that the majority of data science projects never make it into production, that data science doesn’t provide value or that big data is dead. However, I believe there is a bigger question to ask: why are so many companies struggling to utilize their data and find value in it?

I believe the answer is more often tied to the availability of data and the communication between departments. I have seen large government agencies fail at data because it was so siloed between departments that availability and consistency suffered. On the other hand, I’ve seen tiny nonprofits succeed because only one or two people were in charge of the data, so it stayed accessible and consistent.

The 1% problem in data science is often not a question of resources, it is a question of accessibility.

A company doesn't necessarily need a large budget or a huge data science team to start getting insights from its data. With the right tools and/or the right partner, you can begin getting data insights and acting on those insights quickly. Often the most valuable part of working with a data partner is getting your data into an easy-to-read, accessible format.
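
As a small illustration of that first step, here is a sketch of pulling two hypothetical CSV exports into one clean, readable summary table. The file names and columns (sales.csv, customers.csv, order_date and so on) are assumptions for the example, not a specific vendor’s schema.

```python
# A minimal "make the data accessible" sketch using hypothetical CSV exports.
import pandas as pd

sales = pd.read_csv("sales.csv")          # columns: order_id, customer_id, order_date, amount
customers = pd.read_csv("customers.csv")  # columns: customer_id, region, segment

# Basic cleanup: consistent column names, no duplicate orders, usable dates.
sales.columns = sales.columns.str.strip().str.lower()
sales = sales.drop_duplicates(subset="order_id")
sales["order_date"] = pd.to_datetime(sales["order_date"])
sales["month"] = sales["order_date"].dt.to_period("M").astype(str)

# One tidy, joined table that non-technical teammates can actually read.
report = (
    sales.merge(customers, on="customer_id", how="left")
         .groupby(["region", "month"], as_index=False)["amount"]
         .sum()
         .rename(columns={"amount": "monthly_revenue"})
)
report.to_csv("monthly_revenue_by_region.csv", index=False)
```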

Once data processes have been created and executed, the next question is what to do with that information. Small companies often have a few people who have been deeply entrenched in the business for years. This institutional knowledge is a great starting point for deciding what questions to ask of your data, starting with what the team “knows” to be true. This can lead to deeper insights into years of observation: when those observations hold true, when they don’t and, most importantly, whether the data shows new trends emerging and long-held company truths changing and evolving.

Many small and startup companies feel that being data-driven is not for them. Not because they don’t think it is helpful (91% of companies say they are accelerating investment in data and analytics, according to a NewVantage study) but because they think it is impossible given their budget, resource and time constraints. Focusing on making your data accessible and answering one question at a time will get your organization well on its way to being data-driven without the huge costs and headaches.


Data & Marketing

Data-driven marketing is the approach of optimizing brand communications based on customer information. Data-driven marketers use customer data to predict customers’ needs, desires and future behaviors. Such insight helps develop personalized marketing strategies for the highest possible return on investment (ROI).

For example:

  • Data helps to gain better clarity about the target audience. Information about customers allows marketers to gain a laser-sharp understanding of their target audience. Insights from the CRM, for example, can further increase a marketer’s ability to predict customer behavior. The result? Marketing campaigns that reach customers with the right message at the right time.

  • Data offers the ability to build stronger connections with potential customers. With data, marketers can build much better connections with their audience. What’s more, they can do so at scale. As Tom Benton, the CEO of the Data and Marketing Association, points out in his Forbes article: “The sheer amount of data from a near-infinite combination of media, devices, platforms and channels allows marketers the opportunity to deliver 1-to-1 customer experiences at a massive scale. If these are leveraged adeptly, a business with a million customers can deliver an experience just as tailored as a business with a dozen customers.” For example, real-time campaign data could help a marketer adjust to match a customer’s engagement. As a result, they could deliver a campaign that is personalized to meet the expectations of each individual customer.

  • Data uncovers the best channels for promotion. Data can reveal not only a target audience’s preferences, it can also suggest what channels a brand should use to engage that audience now and in the future. Such insight, in turn, could help a brand position the message where its target audience is or is going to be.

  • Data enables personalization. Today’s customers are skeptical about the generic marketing messages they receive. One study revealed that 74% of customers feel frustrated by seeing irrelevant content from brands, and 79% won’t consider an offer unless a brand personalizes it to their previous interactions. So, to engage customers, marketers must focus on personalizing their experience. Here’s how data helps to achieve it. First, it delivers a holistic view of the target audience. It helps identify potential customers’ triggers and pain points. Individual customer information can then enrich a brand’s communication with that person (a minimal propensity-scoring sketch follows this list). And does it work? Studies have shown that businesses that use personalization deliver 5x to 8x higher ROI from their marketing efforts, which suggests the payoff from focusing on data first is huge. In fact, advantages of the data-driven approach include greater customer loyalty, more new customers, increased customer satisfaction and more.

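Here is the propensity-scoring sketch referenced above: a simple model trained on past campaign responses from a hypothetical CRM export, used to rank customers by how likely they are to respond next time. The file name and columns (crm_export.csv, recency_days and so on) are illustrative assumptions, not a specific platform’s schema.

```python
# A minimal propensity-to-respond sketch on a hypothetical CRM export.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

crm = pd.read_csv("crm_export.csv")  # recency_days, num_purchases, avg_order_value, responded
features = crm[["recency_days", "num_purchases", "avg_order_value"]]
target = crm["responded"]            # 1 if the customer responded to a past campaign

X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.2, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("holdout accuracy:", model.score(X_test, y_test))

# Score everyone, then hand marketing the customers most likely to respond.
crm["propensity"] = model.predict_proba(features)[:, 1]
top_targets = crm.sort_values("propensity", ascending=False).head(100)
print(top_targets[["recency_days", "num_purchases", "propensity"]].head())
```
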
Data-driven marketing is an important tool in the marketer’s toolbox and allows marketing teams to create meaningful relationships with customers. These are just a few of the ways data can work in the hands of data-driven marketing teams.


3 Steps to Prevent Bias in AI

The murder of George Floyd brought to the world stage the long-standing bias against and unethical treatment of Black people and minority groups in America and around the world. Governments, individuals and organizations collectively began to stop and consider how they can repair the relationships between themselves and the people they have oppressed. And while there is a long, LONG way to go, I am so proud that society has finally begun this work in earnest: not just giving lip service and bandwagoning the latest social trend, but truly considering how real change can be made.

In the tech sector this kind of self-reflection is just as important. Are we building biased, racist machines? The short answer is, unfortunately, yes. As technologists we create algorithms that determine everything from the minute aspects of people’s lives to the life-altering, and many organizations do not have processes in place to ensure that they have taken this kind of algorithmic bias into account. Taking steps to prevent bias in current and future AI systems is just one way we can ensure equity for all, now and into the future.

Here are just a few ways your organization can begin the process of ensuring best practices in preventing algorithmic bias:

Understand the problem. No, REALLY understand the problem

Step back and look at the question being framed and ask whether it truly needs an automated solution. How has this question been addressed in the past, and what issues arose then? You likely won’t fix every issue simply through automation, particularly if you are using training data based on the problematic decisions of the past. When researchers created an algorithm to assess recidivism in prisons, the automated system both reproduced and exacerbated racial and socioeconomic bias. The system was more likely to assign higher recidivism risk to Black and Hispanic prisoners despite research showing that race has no bearing on recidivism. Additionally, the algorithm was less likely to recommend women for good behavior and parole. The system that was supposed to create fair and equitable judgement was built on the unfair and biased human decisions of the past. In order to create a fair and balanced system, technologists need to weigh multiple political, historical and economic factors. This should not be the job of technologists alone; business and leadership roles need to play a part in the process.

Make sure your data set is representative

Although this appears easy, representation needs to be thoughtfully considered. Does the audience (both the intended and unintended audiences) have representation in your training data set at the correct proportions? Even with a representative dataset, models can have a greater error rate for some groups. A large technology company’s facial recognition system was trained with nearly 50-50 male to female participants, using faces and skin tones that covered the entire Fitzpatrick scale in reasonable proportion to the population. However, after testing, the company saw that even with a representative dataset the algorithm was much more likely to misclassify women with darker skin tones. Really consider whether your dataset is representative, and remember that testing for these unforeseen biases is key.
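
One simple way to do that testing is to break evaluation results down by group rather than relying on a single overall score. A minimal sketch, using made-up predictions and a hypothetical "group" column:

```python
# Disaggregated evaluation: accuracy per subgroup instead of one overall number.
import pandas as pd

results = pd.DataFrame({
    "group":      ["A", "A", "A", "B", "B", "B", "B", "B"],  # e.g. a demographic attribute
    "label":      [1, 0, 1, 1, 1, 0, 1, 0],
    "prediction": [1, 0, 1, 0, 0, 0, 1, 1],
})

per_group = (
    results.assign(correct=results["label"] == results["prediction"])
           .groupby("group")["correct"]
           .mean()
           .rename("accuracy")
)
print(per_group)  # a healthy overall score can hide a much worse rate for one group
```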

Have the hard discussions 

Talk about the historical inequities that have been seen in your industry in years past. Consider testing for bias to see if the training set is unrepresentative or the algorithm is inadvertently recreating the poor decisions of the past. Talk with your data team about how bias can be mitigated, create processes to help inform and regulate data use within your company and have the hard discussions about where your blind spots are and which groups they affect. These conversations should not be left to the technology teams alone; they should be a part of the company culture at all levels.

Bias is everywhere and no one is immune to it. The systems that drive our businesses, governments and organizations are just as fallible as the people who build them. However, all organizations should strive to take a broad view of our world into consideration. Doing so will not only promote practices that are equitable and beneficial to society but will also be good for business!
