Big data is an industry growing at enormous speed: the amount of information stored in IT systems doubles roughly every two years, and worldwide servers are estimated to hold 44 zettabytes of information, enough to fill a stack of iPads reaching the moon more than 6 times over. This makes it one of the most exciting times to be in data analysis and related fields, but there are undeniably significant challenges ahead as well. Many stem from the fact that these are largely uncharted waters, where the industry is figuring itself out as it goes along. There is potential for world-changing innovation, so long as we don’t get side-tracked by these 6 challenges along the way.
1. Skills Gap
Although it may seem that the industry has an overabundance of data analysts, there is in fact a looming skills gap, not only among analysts but among data scientists and data engineers as well.
This is because data handling tools are evolving quickly and professionals are struggling to keep up. In addition, although big data analysis is becoming ever more commonplace, it is not unusual to find an old-fashioned company culture that resists the innovation data brings. This attitude can slow down the acquisition of relevant skill sets by company members. The solution is to invest in analytics education, so that employees at all levels understand how big data can improve their work, and not to be afraid of bringing new recruits on board to help with this.
2. Growing Data Issues
It may sound obvious, but one of the biggest challenges in big data is how to store it all properly. As we mentioned above, the amount of data being stored has seen explosive growth, with no signs of slowing, and all of it takes up physical space in data centers and databases.
Most of the ‘raw’ data is unstructured and unorganized, being a mixture of documents, videos, audio, text files, emails, and other sources. This means that not only is there a lot of it, but it is difficult to search and filter through to extract substantial insights. In short, it is not fit for purpose.
However, this can be seen as the world’s greatest opportunity, ready to be mined by anybody with the right skills and ambition. Lots of technologies are available to do the work of cleaning and organizing data, like NoSQL databases, Hadoop, Spark, various Business Intelligence applications, as well as artificial intelligence and machine learning.
3. Quality and Cost
It’s no secret that much of the data out there is currently low quality, messy and probably not 100% accurate. And poor data leads to poor decision-making. This wouldn’t be as much of a challenge if it weren’t for the fact that bringing it all into a consistent and, most importantly, usable format can be complex and tricky, and requires investment. However, once that work has been done, the return on investment can be high.
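As a rough illustration of what "bringing data into a consistent format" can involve, here is a minimal Python sketch. The records, field names and date formats are invented for the example; real data will have its own quirks and will usually need far more rules than this.

```python
from datetime import datetime

# Hypothetical raw records: same fields, inconsistent formatting.
RAW_RECORDS = [
    {"email": " Alice@Example.com ", "signup": "2021-03-04", "spend": "120.50"},
    {"email": "bob@example.com", "signup": "04/03/2021", "spend": "n/a"},
    {"email": "", "signup": "2021-03-05", "spend": "15"},
]

# Assumed date formats present in the data (ISO and day/month/year).
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(value):
    """Return an ISO date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def parse_amount(value):
    """Return the amount as a float, or None if it is not a number."""
    try:
        return float(value)
    except ValueError:
        return None


def clean(record):
    """Normalize one record; return None if it has no usable email."""
    email = record["email"].strip().lower()
    if not email:
        return None
    return {
        "email": email,
        "signup": parse_date(record["signup"]),
        "spend": parse_amount(record["spend"]),
    }


# Keep only the records that survive cleaning.
cleaned = [c for c in (clean(r) for r in RAW_RECORDS) if c is not None]
```

Values that cannot be parsed are kept as `None` rather than guessed, so the gaps stay visible to whoever analyzes the data later.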
With some sensible analysis, cost-effective strategies can be devised to help decide what data needs to be analyzed first, and what can wait.
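One simple way to decide what gets analyzed first is to rank data sets by expected value per unit of cleanup effort. The sketch below assumes you can put rough numbers on both; the data set names, scores and the value-over-cost formula are all hypothetical, and many teams will prefer a richer scoring model.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    business_value: float  # assumed estimate on a 1-10 scale
    cleanup_cost: float    # assumed effort in person-days

    @property
    def priority(self):
        # Higher value and lower cost both push a data set up the list.
        return self.business_value / self.cleanup_cost


def rank(datasets):
    """Return data sets ordered from highest to lowest priority."""
    return sorted(datasets, key=lambda d: d.priority, reverse=True)


# Hypothetical backlog, deliberately listed out of priority order.
BACKLOG = [
    Dataset("archived_video", business_value=2, cleanup_cost=8),
    Dataset("sales_transactions", business_value=9, cleanup_cost=3),
    Dataset("support_emails", business_value=6, cleanup_cost=6),
]

ordered = [d.name for d in rank(BACKLOG)]
```

With these made-up numbers, the sales data comes first and the archived video can wait.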
4. Data Security
With enormous amounts of personal and public information of all qualities being stored in data centers in a still largely unstructured manner, it is unsurprising that hackers and others seeking to breach security protocols consider it fruit ripe for the picking.
Indeed, securing huge data sets is one of the big challenges of big data. Companies often make the mistake of spending so much time and money on understanding, analyzing and storing their data that they neglect to implement security until much later, sometimes only after a breach has already occurred.
5. Data Governance
When companies move into big data analytics as part of their business journey, it can be difficult to transition to a healthy data governance mindset, which essentially boils down to the exercise of authority and control over data assets within an organization. Without a data governance strategy and controls in place, much of the benefit of broader, deeper data access is prone to being lost.
The point is that data can and should be treated as a product and an asset from the start. This means it should have leadership in place, a budget, and a clear understanding of ownership. Many companies fall into the trap of thinking that the IT department owns the data, when it actually belongs to the whole organization.
Identifying and managing these governance issues will make it easier to provide self-service access for employees, and increase the ROI from big data.
6. Insight Generation
It should be clear by now that without the right people and tools, big data does not automatically equal big insights. Insights are still largely generated manually today, although ever more work is being put into automatic insight generation. This means you still need people who know what they are doing and can put 2 and 2 together to make a sound business decision. And for an insight to be worthwhile, it needs to have a plan of action attached to it.
For example, if your data analysis tells you that people on social media are liking an image of a celebrity in a certain pair of shoes, the next step is to advertise that pair of shoes to that audience in as close to real-time as you can, and perhaps offer a discount to incentivize purchases.
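The shoe example can be sketched as a simple rule: count likes per product, and once a product passes a threshold, plan a discounted ad for the people who liked it. Everything here is an assumption made for illustration, including the threshold, the discount, the event format and the function name; a production system would work on a live event stream rather than a list.

```python
from collections import Counter

LIKE_THRESHOLD = 3     # assumed: likes needed before acting
DISCOUNT_PERCENT = 10  # assumed: incentive offered in the ad


def plan_campaigns(like_events, threshold=LIKE_THRESHOLD):
    """Turn a list of (user_id, product_id) likes into campaign plans."""
    counts = Counter(product for _, product in like_events)
    campaigns = []
    for product, likes in counts.items():
        if likes >= threshold:
            # Target exactly the users who liked this product.
            audience = sorted({u for u, p in like_events if p == product})
            campaigns.append({
                "product": product,
                "audience": audience,
                "discount_percent": DISCOUNT_PERCENT,
            })
    return campaigns


# Hypothetical like events collected from social media.
EVENTS = [
    ("u1", "red_sneakers"),
    ("u2", "red_sneakers"),
    ("u3", "red_sneakers"),
    ("u1", "blue_boots"),
]

campaigns = plan_campaigns(EVENTS)
```

The point of the sketch is the shape of the output: each insight arrives with an audience and an action attached, rather than as a bare statistic.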
It sounds simple, but many companies are still lagging behind, failing to join together all the parts of their data journey.
Where you have a well-defined and well-understood process, employing a mix of business intelligence analysts, statisticians, and data scientists with machine learning expertise, you can create actionable insights that have the ability to truly revolutionize prospects for your organization.