Big data presents both opportunities and security challenges for organizations. It allows for increased operational efficiency through customized marketing and improved decision making. However, aggregating large amounts of sensitive data in one place makes it a valuable target. Organizations must properly classify and control big data to ensure regulatory compliance and prevent data theft while also taking advantage of security benefits like improved fraud detection. Developing a holistic approach that identifies, classifies, and manages data according to policy is important for addressing the challenges of big data.
Big data security
Network Security, July 2012

Colin Tankard, Digital Pathways

The term big data has come into use recently to refer to the ever-increasing amount of information that organisations are storing, processing and analysing, owing to the growing number of information sources in use. According to research conducted by IDC, there were 1.8 zettabytes (1.8 trillion gigabytes) of information created and replicated in 2011 alone, and that amount is doubling every two years.1 Within the next decade, the amount of information managed by enterprise datacentres will grow by 50 times, whereas the number of IT professionals will expand by just 1.5 times.

There are many advantages to be gained through harnessing big data (see Figure 1), of which the most compelling is increased operational efficiency. According to the McKinsey Global Institute, companies embracing big data are able to outperform their peers.2 It estimates that a retailer that properly harnesses big data has the potential to increase its operating margins by more than 60%, gaining market share over its competitors through the use of detailed customer data. McKinsey states that the prime advantages of big data analysis are:

• Creating transparency by making relevant data more accessible, such as by integrating data from R&D, engineering and manufacturing departments to enable concurrent engineering, cutting time to market and improving quality.
• Enabling experimentation to discover needs, expose variability and improve performance by collecting more accurate and detailed performance data. For example, such data can be used to analyse variability in performance in order to understand its root causes, so that performance can be managed at higher levels.
• Segmenting populations to customise actions, so that products and services can be better tailored to meet actual needs. For example, consumer goods and services companies can use big data analysis techniques to better target promotions and advertising.
• Replacing or supporting human decision-making with automated algorithms to improve decisions and minimise risks by unearthing valuable insights that would otherwise remain hidden. McKinsey provides the example of a retailer using big data analytics to automatically fine-tune inventories in response to real-time sales.
• Innovating new business models, products and services. For example, a manufacturer can use data obtained from actual use of its products to improve the development of the next generation of products.

Figure 1: What new opportunities does big data present?

Beyond commercial organisations, big data presents a number of other opportunities, such as improving threat detection capabilities for governments. The US Department of Homeland Security states that there has been an explosion of data recently, helped by the expanding use of the Internet and social networks, that can be mined to help defend against growing threats from foreign countries, terrorists, online hacktivists and criminal elements, both in the real world and in cyberspace. It states that the Arab Spring revolutions in the Middle East could have been predicted by monitoring what people were searching for and how they were communicating online. By analysing big data, governments will be better able to understand the various threats that they face, the likely vectors of attack and the actors that might perpetrate them.

Security issues with big data

One of the key security issues involved with big data aggregation and analysis is that organisations collect and process a great deal of sensitive information regarding customers and employees, as well as intellectual property, trade secrets and financial information. As organisations look to gain value from such information, they are increasingly
seeking to aggregate data from a wider range of stores and applications to provide more context and so increase the value of the data, for example to build a clearer picture of customer preferences in order to target them better. However, centralising data in one place also makes it a valuable target for attackers: a successful breach could expose huge swathes of information, undermining trust in the organisation and damaging its reputation. This makes it essential that big data stores are properly controlled and protected.

Another potential problem relates to regulatory compliance, especially with data protection laws. Such laws are more stringent in some jurisdictions than others, particularly with regard to where data can be stored or processed. Organisations need to consider carefully the legal ramifications of where they store and process data, to ensure that they remain in compliance with the regulations that they face.

However, there are also security advantages to big data projects. When centralising data stores, organisations should first classify the information and apply appropriate controls to it, such as imposing retention periods as specified by the regulations that they face. This will allow organisations to weed out data that has little value or that no longer needs to be kept, so that it can be disposed of and is no longer available for theft or subject to litigation demanding presentation of records. Another security advantage is that large swathes of data can be mined for security events, such as malware, spear-phishing attempts or fraud, such as account takeovers.
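Mining aggregated data for fraud such as account takeovers can be illustrated with a simple heuristic. The sketch below is not any specific vendor's method: the event data, the login format and the `max_ips` threshold are all illustrative assumptions. It flags accounts that authenticate from an unusual number of distinct addresses:

```python
from collections import defaultdict

# Hypothetical login events as (account, source_ip) pairs. In practice
# these would be mined from the aggregated big data store.
EVENTS = [
    ("alice", "10.0.0.1"),
    ("alice", "10.0.0.1"),
    ("bob", "10.0.0.2"),
    ("bob", "198.51.100.7"),
    ("bob", "203.0.113.9"),
]

def flag_takeovers(events, max_ips=2):
    """Flag accounts seen from more distinct IPs than the threshold allows."""
    seen = defaultdict(set)
    for account, ip in events:
        seen[account].add(ip)
    return [acct for acct, ips in seen.items() if len(ips) > max_ips]
```

A production system would weight such signals with geolocation, timing and device data rather than a single fixed threshold.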
The security advantages of big data for protecting information are illustrated in Figure 2, which shows the views of more than 180 IT security practitioners surveyed by technology vendor Varonis during the Infosecurity exhibition in London in April 2012.3
Developing a holistic approach

For most organisations, the volume of big data generated and stored can be a major challenge, with searching such vast amounts of data, most of which is unstructured, often taking weeks or more using traditional tools. MeriTalk, an online community for the US Government IT community, recently surveyed 151 federal IT professionals regarding big data and found that nine out of ten see challenges on the path to harnessing it.4 When asked what they have in place today compared with what they believe will be needed for successful big data management, respondents stated that they had, on average, 49% of the data storage and access technology that they will need, 46% of the computational power and 44% of the personnel. The most significant challenges that they see in managing such large amounts of data are shown in Figure 3.

Figure 2: Use of big data for managing structured and unstructured information. Source: Varonis.
Figure 3: Most significant challenges in managing large volumes of information. Source: The big data gap, MeriTalk, May 2012.

Prior to the start of any big data management project, organisations need to locate and identify all of the data sources in their network: where they originate, who created them and who can access them. This should be an enterprise-wide effort, with input from security and risk managers, as well as legal and policy teams, that involves locating and indexing data. This also needs to be a continuous process, so that not just existing data is uncovered, but also new data as it is created throughout the network. The next step is to classify the data that has been discovered according to its sensitivity and business criticality.
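The locate-and-index step described above might be sketched as follows. The inventory columns and CSV output are illustrative assumptions; a real enterprise-wide effort would also record ownership, access rights and the application each store belongs to:

```python
import csv
import os
from datetime import datetime, timezone

def build_inventory(roots, out_path):
    """Walk the given directory trees and index every file found.

    Writes a CSV with one row per file: path, size and last-modified
    time, which later classification passes can work from.
    """
    with open(out_path, "w", newline="") as fh:
        writer = csv.writer(fh)
        writer.writerow(["path", "bytes", "modified_utc"])
        for root in roots:
            for dirpath, _dirnames, filenames in os.walk(root):
                for name in filenames:
                    full = os.path.join(dirpath, name)
                    stat = os.stat(full)
                    modified = datetime.fromtimestamp(
                        stat.st_mtime, tz=timezone.utc
                    ).isoformat()
                    writer.writerow([full, stat.st_size, modified])
```

Running such a job on a schedule, rather than once, reflects the point that discovery needs to be continuous as new data is created.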
However, data classification can be a complex, long and arduous process, a factor that has been a significant struggle for many organisations attempting to implement technologies that rely on data classification, such as data leakage prevention systems. Organisations also need to take into account the industry standards and government regulations to which they must adhere, ensuring that records are retained and archived for the time periods specified and that data is protected according to the guidelines contained in some standards (such as PCI DSS, which specifies that payment cardholder data must be held in a secure manner). To ease the classification process, organisations should look for automated database and network discovery tools, which can be used to scan networks to identify all data assets.

As they go through the data classification process, organisations should also look to develop or update policies regarding data handling, such as defining what types of data must be stored and for how long, where they should be stored and how data will be accessed when needed. Enforcement of such policies will prevent users from creating their own data stores outside the control of the IT department.

Data warehouses are popular technologies for managing large volumes of data. However, most rely on a relational format for storing data, which works fine for structured data, but less so for unstructured data. And unstructured data makes up a high proportion of the data contained in big data stores, as information is increasingly drawn from a wide range of sources beyond traditional enterprise applications. For example, relational databases are good at handling discrete packets of information, such as credit card numbers and employee identifiers, but are less able to handle content such as video or emails, which do not necessarily conform to a rigid structure.
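As a sketch of what the automated classification tools mentioned above do, the following scans text for strings that look like valid payment card numbers, the kind of content PCI DSS brings into scope. The labels ("restricted", "internal") and the pattern are illustrative assumptions, not any product's rule set:

```python
import re

def luhn_valid(number: str) -> bool:
    """Luhn checksum used by payment card numbers."""
    digits = [int(d) for d in number][::-1]
    total = 0
    for i, d in enumerate(digits):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

# 13-16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def classify(text: str) -> str:
    """Label a document 'restricted' if it appears to hold cardholder data."""
    for match in CARD_PATTERN.finditer(text):
        candidate = re.sub(r"[ -]", "", match.group())
        if luhn_valid(candidate):
            return "restricted"  # would fall under PCI DSS controls
    return "internal"
```

Real tools combine many such detectors (national ID formats, keywords, file fingerprints) and record the result against the data inventory.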
An alternative for organisations looking to get a handle on big data is to use an open source software framework that supports data-intensive distributed applications and can work with thousands of systems in a network and petabytes of data. Currently, Hadoop is one of the most popular such choices among organisations. Hadoop is particularly suited to storing the large amounts of unstructured data contained in big data stores and provides a large set of tools and technologies that can aid organisations in tackling the problems involved in analysing massive swathes of information, including enterprise search, log analysis and data mining. Such capabilities are critical to allowing data to be retrieved quickly across structured and unstructured sources. According to a recent survey undertaken by InformationWeek among 431 respondents involved with information management technologies, there are a number of factors driving interest in the use of Hadoop or other NoSQL data management and processing platforms, as shown in Figure 4.5

Figure 4: Factors driving use of Hadoop and other NoSQL platforms. Source: How Hadoop tames enterprises' big data, InformationWeek, February 2012.

Big data security controls

Research firm Forrester recommends that, in order to provide better control over big data sets, controls should be moved closer to the data store and the data itself, rather than being placed at the edge of the network, so as to provide a more effective line of defence. It also states that separate silos of data control and protection, such as archiving, data leakage prevention and access controls, should be brought together.
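The programming model behind Hadoop's log-analysis capability mentioned above is MapReduce. The following pure-Python sketch simulates that model in-process for a toy job that counts HTTP status codes; it uses no Hadoop APIs, and the log format is an assumption:

```python
from collections import defaultdict
from itertools import chain

# Sample web-server log lines; in a real cluster these would be
# distributed across HDFS blocks and processed in parallel.
LOG = [
    "10.0.0.1 GET /login 200",
    "10.0.0.2 GET /login 401",
    "10.0.0.2 GET /login 401",
    "10.0.0.1 GET /index 200",
]

def map_fn(line):
    """Map step: emit a (status_code, 1) pair for each log line."""
    parts = line.split()
    yield parts[3], 1

def reduce_fn(key, values):
    """Reduce step: sum the counts for one status code."""
    return key, sum(values)

def run_job(records):
    """Shuffle intermediate pairs by key, then reduce each group."""
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map_fn(r) for r in records):
        groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())
```

In Hadoop proper, the map and reduce functions run on many nodes and the framework performs the shuffle, which is what makes the model scale to petabytes.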
In terms of access controls, they should be granular enough to ensure that only those authorised to access data can do so, in order to prevent sensitive information from being compromised. Controls should also be set using the principle of least privilege, especially for those with greater access rights, such as administrators. Products such as those from Vormetric bring together data encryption with its related policy management and key storage elements, and link access control to the data itself. Companies can therefore decide who can view the data or, in the case of an administrator, allow physical access to it, while ensuring that any attempt to read the data yields nothing useful, because the process would not be permitted to decrypt it. Such an approach is highly effective in any multi-silo environment where any form of electronic data is stored.

To ensure that access controls are effective, they should be continuously monitored and modified as employees change roles in the organisation, so that they do not accumulate excessive rights and privileges that could be abused. This can be done using technologies already deployed in many organisations, such as database activity monitoring tools, the capabilities of which are being expanded by many vendors to deal with unstructured data in big data environments. Other useful tools include Security Information and Event Management (SIEM) technologies, which gather log information from a wide variety of applications on the network.
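The least-privilege pattern described above, in which an administrator can manage data but cannot read it, can be sketched as follows. The roles, permission names and toy XOR "cipher" are illustrative assumptions, not Vormetric's implementation; a real product would use a proper cipher such as AES with managed keys:

```python
from dataclasses import dataclass

# Hypothetical role-to-permission mapping illustrating least privilege:
# administrators may back up and configure storage, but hold no
# 'decrypt' permission, so plaintext is never exposed to them.
PERMISSIONS = {
    "analyst": {"read", "decrypt"},
    "administrator": {"read", "backup", "configure"},
}

@dataclass
class Record:
    ciphertext: bytes

def toy_cipher(data: bytes) -> bytes:
    # Stand-in for a real cipher; XOR is its own inverse and NOT secure.
    return bytes(b ^ 0x5A for b in data)

def access(role: str, record: Record, action: str) -> bytes:
    """Return plaintext only to roles holding the 'decrypt' permission."""
    granted = PERMISSIONS.get(role, set())
    if action not in granted:
        raise PermissionError(f"{role} may not {action}")
    if action == "decrypt":
        return toy_cipher(record.ciphertext)
    return record.ciphertext  # e.g. an admin backup sees only ciphertext
```

The point of the design is that the policy decision travels with the data: even a role with full physical access gets only ciphertext unless policy grants decryption.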
To make SIEM tools more effective and manageable, many vendors, such as AlienVault, are expanding their solutions to provide capabilities called Network Analysis and Visibility (NAV), which capture and analyse network traffic to look for potential attacks and malicious insider abuse, and which are highly scalable across large networks. NAV tools provide useful add-ons to SIEM tools, such as metadata analysis, packet capture analysis and flow analysis. In the case of AlienVault, further steps have been taken to link the analysed data and make proactive decisions to prevent or stop a breach.

Ensuring that data is archived as required and disposed of when no longer needed is another important security consideration, so that the organisation is not managing overly large volumes of data and the risk of sensitive data being breached is reduced. This risk can also be reduced through the use of techniques that make sensitive data unreadable, such as encryption, tokenisation and data masking, so that only those with the keys to unlock the data can do so. This is a much easier task once data has been properly classified, but it is important that the legal department be involved in the development of policies related to data retention and disposal, to ensure that they are in compliance with the requirements of industry standards and government regulations.

Conclusions

As data volumes continue to expand, taking in an ever wider range of data sources, much of which is in unstructured form, organisations are increasingly looking to extract value from that data to uncover the opportunities for the business that it contains. However, traditional data storage and analysis tools are not, on their own, up to the task of processing and analysing the information the data contains, owing not just to the volume of data, but also to the unstructured, ad hoc nature of much of the content.
In addition, the centralised nature of big data stores creates new security challenges to which organisations must respond, requiring that controls are placed around the data itself, rather than around the applications and systems that store it.

About the author

Colin Tankard is managing director of data security company Digital Pathways, a specialist in the design, implementation and management of systems that ensure the security of all data, whether at rest within the network, on mobile devices or in storage, or in transit across public or private networks.

Resources

'Big data: harnessing a game-changing asset'. Economist Intelligence Unit (EIU), 2011. Accessed June 2012. www.sas.com/resources/asset/SAS_BigData_final.pdf.

References

1. 'The 2011 IDC digital universe'. IDC, 2011. Accessed June 2012. www.emc.com/collateral/about/news/idc-emc-digital-universe-2011-infographic.pdf.
2. 'Big data: the next frontier for innovation, competition and productivity'. McKinsey Global Institute, 2011. Accessed June 2012. www.mckinsey.com/Insights/MGI/Research/Technology_and_Innovation/Big_data_The_next_frontier_for_innovation.
3. 'Big data and infosecurity'. Varonis, 2012. Accessed June 2012. http://blog.varonis.com/big-data-security/.
4. 'Big data gap'. MeriTalk, 2012. Accessed June 2012. www.meritalk.com/bigdatagap.
5. Kajeepeta, Sreedhar. 'Strategy: Hadoop and big data'. InformationWeek, 2012. Accessed June 2012. http://reports.informationweek.com/abstract/81/8670/Business-Intelligence-and-Information-Management/strategy-hadoop-and-big-data*.html.