Big Data 3/6

in #education, 8 years ago (edited)

Welcome to the third entry in the Big Data series. I hope you will subscribe, interact, and Resteem. Thanks, @javf1016 :)


First Entry: https://steemit.com/business/@javf1016/big-data-1-10
Second Entry: https://steemit.com/education/@javf1016/big-data-2-10

II. CONTEXT

Many companies try to save money when implementing security in projects, data management, and virtually every other process, but over time that mistake can cause everything that has been built to collapse from one moment to the next. Security in Big Data depends on the employees, the systems, and the external agents.

The International Conference of Data Protection and Privacy Authorities calls on all parties using Big Data to: [2]

• Respect the principle of specification of purpose.

• Limit the amount of information collected and stored at a level that is necessary for the intended purpose.

• Obtain, where appropriate, the data subject's consent in relation to the use of personal information for analysis and profiling purposes.

• Be transparent about what information is collected, how it is processed, for what purposes it is used, and whether it is transferred to third parties.

• Give people appropriate access to the data that has been collected about them and to the information and decisions that have been derived from that data.

• Provide people, where appropriate, with access to information on key inputs and decision-making criteria (algorithms) that have been used as a basis for profile development.

• Conduct a privacy impact assessment, especially when Big Data analysis involves novel or unexpected uses of personal data.

• Develop and use Big Data technologies in accordance with the principles of Privacy by Design.

• Consider when anonymous data will improve privacy protection. Anonymization can help mitigate the privacy risks associated with Big Data analysis, but only if the anonymization is properly designed and managed.

• Be very careful, and act in compliance with applicable data protection legislation, when sharing or publishing data sets that are pseudonymized or otherwise indirectly identifiable.

• Demonstrate that decisions regarding the use of Big Data are fair, transparent, and accountable. When data is used for profiling purposes, both the profiles and the algorithms on which they are based require continuous assessment.
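
Two of the points above, limiting the information that is stored and replacing direct identifiers with pseudonyms, can be illustrated with a minimal Python sketch. The field names (`user_id`, `age_range`, `page_visited`) and the functions `pseudonymize` and `minimize` are hypothetical and only serve the example. Note that a keyed hash is pseudonymization, not anonymization: the data remains indirectly identifiable to whoever holds the key.

```python
import hashlib
import hmac

# Hypothetical field names: the only ones assumed necessary for the purpose.
NEEDED_FIELDS = {"user_id", "age_range", "page_visited"}


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256).

    Whoever holds the key can recompute the mapping, so the key
    must be protected and stored apart from the data.
    """
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def minimize(record: dict, secret_key: bytes) -> dict:
    """Keep only the needed fields and pseudonymize the direct identifier."""
    kept = {k: v for k, v in record.items() if k in NEEDED_FIELDS}
    if "user_id" in kept:
        kept["user_id"] = pseudonymize(str(kept["user_id"]), secret_key)
    return kept
```

Calling `minimize` on a record that also contains an e-mail address, for instance, would drop the e-mail field and return the record with a 64-character hexadecimal pseudonym in place of the original `user_id`.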

Below we look at some of the current challenges of protecting the information collected in Big Data, based on the needs raised at the conference. These challenges directly affect current practice and can be improved in the near future; they are grouped here for better understanding.

Protection and anonymization: Access to a population's behavioral data, images, and videos means that we must protect the integrity of this information and provide security guarantees for the people, properties, systems, and environments involved in Big Data. Because data is obtained from different environments (web, applications, IoT), each one has to develop measures adapted to its specific needs, so that the growth of information linked to the identity of the source is not affected. In recent years several companies have suffered data loss or leaks (Fig. 3). The sectors that are considered safe (banking, energy, financial, telecommunications, government) are where this situation is most visible, and if we analyzed each sector we would find that at present none is safe from criminal minds.

Fig. 3. Information on data lost or leaked in recent years. [3]

Investment and skill: As has been observed, the lack of investment in security is a very important factor. Something that should be mandatory is left out of most projects, either to save costs or through lack of knowledge. A sound corporate approach creates areas specialized in security in every sense, with specialized professionals on the payroll. Human talent is not the only need: security-focused tools are also very useful, yet the same pattern appears. Either there is no investment in the tools, or there is no investment in training employees to use them. Acquiring tools without taking full advantage of them is not enough. Every tool on the market has its advantages and disadvantages, but without trained personnel to monitor and maintain them, they lead nowhere. In those cases they only generate losses for the company and keep open the gaps that allow data loss, leakage, and mismanagement. The analysis that was the ultimate goal would then rest on flawed data, bringing even more losses in the future.

Wrong data: Some people, trying to stay secure, hide their true data or deliberately enter valuable information incorrectly to conceal their identity on the network. They do not realize that the network is advancing rapidly: with every visit to a page and every photo or video viewed, information about their behavior is collected. Companies that study this data in order to promote products or improve the quality of their content are affected by the wrong data when they run the campaigns they have developed, which generates losses and, in the worst case, can end the company.

The security problem can be addressed by collecting all the data, pre-analyzing it according to criteria we have defined, and presenting a result that we can interpret. This brings us to what we call security visualization, which in a nutshell means viewing data differently: graphically, as shown in Fig. 4.

Fig. 4. Description of the process used in security visualization.

For context, suppose that in a company thousands of connections are made to the database every second, all from the internal network, and that at some point external connections appear and query the database. These connections could be said to be camouflaged among the internal sessions. Security visualization, however, lets us observe all connections in real time in graphic format, with the suspicious ones highlighted. This small example shows that a camouflaged attack can no longer be as effective as before. The question raised by an example like the database one is: how do we feed these visualizations? The answer is not complicated. Every time we, as system administrators or users, perform operations on the different applications, those operations are recorded in logs. We can also rely on network devices such as firewalls and routers, and if that is not enough we have intrusion detectors and sniffers, and we can even use the native facilities of the operating system, as shown in Fig. 5.

Fig. 5. Operating system log.
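
The database example can be sketched in a few lines of Python: given a list of connections, pick out the ones whose source address lies outside the internal network so that a visualization can highlight them. This is only an illustration under stated assumptions. The internal ranges used here are the standard private IPv4 blocks, and the `source_ip` key and function names are hypothetical; a real deployment would use its own subnets and connection records.

```python
import ipaddress

# Assumed internal ranges (the RFC 1918 private IPv4 blocks);
# a real company would list its own subnets here.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_internal(address: str) -> bool:
    """Return True if the address belongs to one of the internal networks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in INTERNAL_NETWORKS)


def suspicious_connections(connections):
    """Return the connections whose source is outside the internal networks.

    Each connection is a dict with a hypothetical "source_ip" key.
    """
    return [c for c in connections if not is_internal(c["source_ip"])]
```

A dashboard could then draw the connections returned by `suspicious_connections` in a different color, which is the "highlighting" described in the example.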

By analyzing all the generated logs and making the corresponding adjustments, we obtain data that is easier to analyze in security visualizations.
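
As a rough sketch of what "making the corresponding adjustments" to logs might look like, the Python snippet below turns raw log lines into structured records and counts events per source address, a form that is easy to plot. The log line format is invented for this example and does not correspond to any particular firewall or operating system; real logs would need their own pattern.

```python
import re
from collections import Counter

# Hypothetical log line format, invented for this example:
#   <timestamp> <device> <action> src=<ip> dst=<ip> port=<number>
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) (?P<device>\S+) (?P<action>\S+) "
    r"src=(?P<src>\S+) dst=(?P<dst>\S+) port=(?P<port>\d+)$"
)


def parse_line(line: str):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["port"] = int(record["port"])
    return record


def count_by_source(lines):
    """Count parsed events per source address, skipping unparseable lines."""
    records = (parse_line(line) for line in lines)
    return Counter(r["src"] for r in records if r is not None)
```

The resulting counter can be fed directly to a bar chart or a time-series tool, which is one simple way of turning logs into a security visualization.
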
Computer security has not yet exploited the enormous potential of Big Data analysis. There is a stigma that engineers prefer to review data directly on the system, but given the gigantic volume at which Big Data grows in a company, there comes a moment when the classic manual review of the generated data is impossible. Nor could it be done in real time: however efficient we may be, human capacity will never be enough to analyze and interpret millions of records per second.
Other specific examples that could be detected:

Fraud detection
Irregular transactions
Zero Day
Botnets
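
As one illustration of the "irregular transactions" case, the sketch below flags amounts that lie far from the typical value. It uses the modified z-score based on the median absolute deviation (MAD), a common statistical technique chosen here for the example; it is not a method taken from the cited sources, and the threshold of 3.5 is a conventional default that would need tuning on real data.

```python
import statistics


def irregular_transactions(amounts, threshold=3.5):
    """Flag amounts far from the median using the modified z-score.

    The score is 0.6745 * |amount - median| / MAD, where MAD is the
    median absolute deviation. If MAD is zero (more than half of the
    values are identical) the score is undefined, and this sketch
    simply returns an empty list.
    """
    median = statistics.median(amounts)
    mad = statistics.median(abs(a - median) for a in amounts)
    if mad == 0:
        return []
    return [a for a in amounts if 0.6745 * abs(a - median) / mad > threshold]
```

With a series of small purchases and one very large transfer, only the large transfer would exceed the threshold. Real fraud detection combines many more signals (time, location, account history), so this is a starting point rather than a detector.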

We simply add layers of analysis to the company's internal detection and control mechanisms, gaining the ability to detect threats during normal operations. Current solutions are based on centralized systems; since the processing is done in a distributed way, the security of the nodes and communication channels becomes fundamental.
As usually happens in technology, targeted attacks develop along with it. Something to keep in mind is that security usually comes much later and cannot be fully controlled: analyzing a technology before its release does not give certainty that it will never fail or that it will be invulnerable. [4]

Glossary
Firewall, part of a system or network that is responsible for blocking unauthorized access.
Router, device that forwards traffic between networks, operating at the network level.
Sniffers, programs or devices that capture everything that moves through a network.
Zero day, attack based on malicious code that exploits a vulnerability still unknown to the vendor and to ordinary users.
Botnets, networks of infected computers or servers that are controlled remotely through programs (bots) that run automatically.

Bibliography
[3] Paul C. Zikopoulos; "Harness the Power of Big Data: The IBM Big Data Platform"; Publisher: IBM Corporation; 2012.
[4] Accenture; "Big Success with Big Data"; 2014 report; Web document (PDF); https://www.accenture.com/us-en/_acnmedia/Accenture/Conversion-Assets/DotCom/Documents/Global/PDF/Industries_14/Accenture-Big-Data-POV.pdf
