Tuesday, August 23, 2016

Top 10 Big Data Facts That You Discuss At The Second Meeting

I was talking about Big Data to PhD students and early career researchers at the University of Melbourne on Friday, 19 August 2016. Here are the insights that I shared.

If you are confused after attending your first public lecture or vendor presentation on Big Data and are not sure how much of it relates to you, or if you are still so puzzled that you have thousands of questions, you are not alone. I have met many new Big Data followers who have the same feeling. Based on that experience, here are my top 10 points that new followers can make themselves aware of before that follow-up meeting. Ideally, these are the points that any independent expert would discuss with you in the second meeting.

1. There is nothing in the definition
Big Data is often described by its characteristics. Initially it was described in terms of Volume (the quantity of data), Variety (the types of data) and Velocity (the speed of data generation). As time passed, different vendors and practitioners started to add more characteristics, such as Variability (inconsistency of data), Complexity (difficulty in linking the available data) and Veracity (quality of data). New followers have often found this a bit confusing as they tried to relate their dataset to each of the above characteristics.

However, the reality is that you do not need every one of the above characteristics in your data in order to manage it with a Big Data framework. You should simply be able to relate to some of the characteristics and see the necessity, if any, for your organisation to use a Big Data framework.

2. There is plenty of hype and it may not apply to you
There is plenty of hype around Big Data, and merely having a Big Data framework in your organisation does not solve all of your data management and analysis problems. A very valid question to ask at the very start is: do we have a problem at the scale of Big Data in the current project? If the answer is a definite No, then move on. It is not the end of the world if you do not implement an organisational Big Data framework. However, that does not mean you can ignore the importance of proper data management with the help of traditional frameworks.

3. Be realistic
You have to be realistic about what you can get out of your Big Data framework and process. For example, predictive analysis is very popular among Big Data devotees. But the output of any predictive analysis depends on context, e.g. access to all the necessary data, including external datasets, or the skill level of the analysts in the team. You may have to allow some time before you get the best outcome from your Big Data investment. The lesson is to keep your expectations in check and to be patient.

4. You need to understand your business domain
To get the best out of your Big Data exercise you need to understand your business, its organisational practices and its vision well. You will also need to make your best effort to understand your competitors and peers. As a leader, at the bare minimum, you have to have a comprehensive list of your organisation’s data repositories, sources and targets. Once you know what you already have, you can concentrate on what else you may require to move your business forward in line with your organisation’s vision. Very often you will be pleasantly surprised by the insights in the data that you already have.

5. Creativity should and will always prevail
To leverage the huge potential of Big Data you need to be creative. You need creative minds in the team, which in turn will drive your analysis and future directions. You need to invest in people and retain the right people in the team, who over time are more likely to produce critical insights from your data.

It is very popular to hire an external consultant for a short period of time to find insights from your data while the existing employees do their routine work. This approach is not always productive. My observation is that whenever an existing employee, who has possibly worked for the company for a few years, is given the opportunity to work on finding insights from data, it has been faster and more productive.

6. You don’t have to tell good or bad stories, you have to tell stories out of your data
Often organisations, either themselves or through external consultants, try to find insights from their data in the form of either positive or negative stories. This approach mostly tells you the stories that you intended to find. It is a biased approach to start with.

In particular, when an organisation is not doing so well, it is often so focused on what is going wrong that it does not see what is working and misses any potential bright spot. The bottom line is that you simply need to tell stories out of your Big Data; they don’t have to be either positive or negative.

7. You don’t have to analyse the whole dataset at any given time
“I have just learned about all the external data sources on top of my own organisation’s datasets. How is it possible to store, analyse and find insights out of them all? Even though the Big Data presentation promised to deliver, will the benefit outweigh the cost for my organisation?”

These are some of the most common thoughts that new followers have right after an overwhelming Big Data presentation. To get you out of your misery, I can categorically assure you that you do not have to analyse each and every dataset that you have access to every time you need some insights. It has always been about identifying a possible subset of your data for a given insight that you have in mind, and then letting the analysis of that small subset lead you to other datasets, if necessary.
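
As a rough illustration of this idea, here is a minimal Python sketch. Everything in it is made up for the example: the sales records, the column names and the helper functions `load_subset` and `average_revenue` are hypothetical and only stand in for whatever data and question you actually have. It simply picks the rows that matter for one question and analyses those, leaving the rest of the data alone.

```python
import csv
import io
from statistics import mean

# Hypothetical sales records standing in for one of many available datasets.
RAW = """region,month,revenue
north,2016-06,120
south,2016-06,95
north,2016-07,130
south,2016-07,90
east,2016-07,60
"""


def load_subset(text, region):
    """Return only the rows relevant to the question at hand."""
    reader = csv.DictReader(io.StringIO(text))
    return [row for row in reader if row["region"] == region]


def average_revenue(rows):
    """Average the revenue column of the selected rows."""
    return mean(float(row["revenue"]) for row in rows)


# The question here is only about the north region, so only that subset is analysed.
north = load_subset(RAW, "north")
north_avg = average_revenue(north)
print(len(north), north_avg)
```

If the result of this small analysis raises a new question, for example about a different region or an external dataset, that is the point at which you would bring in more data.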

8. Their commercial product is not the solution to your Big Data problem
If you were at a vendor presentation listening to a pitch about their products, it is almost certain that you were told their products have solutions to all of your Big Data issues. That may not always be true. The solutions to your problems lie within. You have to identify the issues and detail a vision to overcome them, independent of any technology. Then, and only then, should you identify a tool that can help you achieve your goals. This approach will not only solve your Big Data problems with minimum effort but will also get you the best out of your investment.

9. Big Data is not about checking your business health against Social Media sentiment
This is one of the top misconceptions floating around Big Data. It is mainly fuelled by the examples that speakers often use in their presentations while describing Big Data. Social media data should only be seen as a very small component of your Big Data framework. By all means, social media sentiment will provide you with useful insights, but it should not be the benchmark, and Big Data does not encourage you to run your business based on social media sentiment. In fact, it is risky for your business decision making to depend solely on social media sentiment analysis, as it can often be biased.

10. Data management or storage itself is a huge task
Finding interesting insights from data is the most exciting takeaway from Big Data lectures, and rightly so. In the process, data storage and management are often overlooked, and new Big Data devotees pay less attention to them. But it is very important to realise that before finding any insights in the available data, you have to have plans for data storage and management. The reason is simple: if you do not have a suitable place for your data, i.e. if you cannot store, access and secure your data, you will not be able to run any analysis on it.

Knowing the above facts will help you identify the right solution to your Big Data problem. And remember, Big Data is not limited to the outward-facing channels of your organisation, such as sales or marketing. It can also be useful for understanding your internal resources, such as employee satisfaction and needs. So Happy Exploring Big Data!
