A New Message Authentication Approach of Moving Unstructured Data Security in Cloud Environment
Authors- Rashid Hussain, Akanksha Singh
Author’s details
1Department of Mathematics and Computer Science, Umaru Musa Yar’adua University, Katsina, Nigeria
2Department of Computer Science and Engineering, Banasthali Vidyapeeth, Rajasthan
Copy for Cite this Article- Rashid Hussain, Akanksha Singh, “A New Message Authentication Approach of moving Unstructured Data Security in Cloud Environment”, International Journal of Science, Engineering and Technology, Volume 5 Issue 1: 2017, pp. 34- 40.
Abstract
In the digital world, data plays a major role in sharing information in cloud environments. Data falls into two categories: structured and unstructured. Unstructured data is widely used and is distributed in nature, which leaves it exposed to anomalous activity while it moves across a cloud network. To provide the required authentication and message integrity together with confidentiality, this paper develops a new compact version of HMAC. The aim is to protect moving data by implementing this compact HMAC algorithm in a cloud environment.
Keywords- Unstructured Data, Cryptography algorithms, Secured Communications, MAC, HMAC, Stream Ciphers, LFSR, cloud environment.
Introduction
Data plays a major role in today's digital world. It falls into two categories: structured and unstructured. Structured data follows RDBMS rules and has a pre-defined data model; unstructured data does not follow RDBMS rules, cannot be easily defined, and has no pre-defined data model. Unstructured data is growing very fast across the digital world from sources such as sensors, social sites, calls and bank transactions [2].
Data grows at very high speed every day, and the problems associated with it grow as well. According to the International Data Corporation (IDC), about 80 per cent of the data in the digital world is unstructured and 20 per cent is structured. This rapid growth from many sources creates risks to privacy, integrity and authenticity. A major cause is the Internet of Things (IoT), one of the largest generators of unstructured data [1]. In other words, data volumes are increasing so rapidly that processing them has become very difficult [6].
Unstructured data has to be captured from multiple heterogeneous sources. IBM reported in January 2012 that 2.5 quintillion bytes of data are created every day and that 90 per cent of the data in the world at that time had been produced in the preceding two years. This presents new challenges, including how to store and organize high-volume data efficiently and how to provide privacy, integrity and authenticity for complex data while it is being communicated in a cloud environment [7]. Cryptography is the best way to keep complex information secure from intruders, both in transit and at rest. Cryptography is the science of using mathematics to encrypt and decrypt data [9]. Cryptanalysis is the analysis and breaking of secure communication; it involves an interesting combination of analytical reasoning, mathematical tools, pattern finding, patience, determination and luck. Cryptanalysts are also referred to as attackers, even when they work in a legitimate, controlled setting [10]. Cryptography has been used for a very long time; the Roman ruler Julius Caesar is the first famous person known to have used it, in his military campaigns. Until recently it was used mainly by the military, but since the computer has become a common tool, cryptography is used and needed by everyone.
Sections 2 and 3 briefly describe the security issues and challenges of moving unstructured data. Sections 4, 5 and 6 discuss the cryptographic algorithms that secure moving unstructured data. Section 7 discusses the linear feedback shift register, Section 8 presents the proposed algorithm (a compact version of HMAC), and Sections 9 and 10 give the conclusion and future scope of this research work.
Unstructured Data Security Issues
Unstructured data raises the following security issues [3]:
- Unstructured data is stored in distributed form in the cloud network architecture. It is partitioned horizontally or vertically, replicated and distributed among multiple processing nodes, so it needs to be processed securely.
- On social sites, unstructured data changes continuously, so the changing data must be captured for processing.
- Unstructured data varies across heterogeneous sources, so queries must be written to handle varying data.
- Because of the huge amount of unstructured data in a cloud environment, it is more feasible to move the code than to move the data between multiple processing nodes.
- Because unstructured data storage in a cloud environment is distributed, it is difficult to find the exact location of the data among the available processing nodes.
- Unstructured data is captured from various logs, social media and similar sources, so it must be determined who has the right to access the data, at what time and from which location.
Challenges of Unstructured Data
Unstructured data poses some major challenges that affect the privacy, integrity and authenticity of complex moving unstructured data. These challenges are [8]:
Privacy and Security
This is the most important issue with unstructured data. It is sensitive and has conceptual, technical and legal significance. When a user's complex information is processed together with external data, new facts about that user are generated. This information adds value to the organization's business and therefore needs to be protected.
Data Access and Sharing of Information
If unstructured data is to be used to make accurate decisions on time, it must be available in a timely manner. Sharing clients' complex data with other parties must also be done securely.
Storage and Processing Issues
The storage currently available is not sufficient for the huge amount of unstructured data being produced by organizations and social media. Sometimes this data needs to be uploaded to the cloud. Zettabytes of data take a long time to upload, and because unstructured data changes rapidly, it is difficult to update the changed data in real time. Data transferred from storage to processing must maintain its integrity. It is therefore mandatory to build the indexes at the beginning when processing large amounts of data.
MAC
In cryptography, a message authentication code (MAC) is a short piece of information used to authenticate a message. A MAC algorithm, sometimes called a keyed (cryptographic) hash function, accepts a secret key and an arbitrary-length message to be authenticated as input and outputs a MAC (sometimes known as a tag). The MAC value protects both a message's data integrity and its authenticity by allowing verifiers to detect any changes to the original message content. To provide privacy, integrity and authenticity to moving unstructured data in a cloud environment, we implemented the HMAC algorithm [2] to generate a cipher text that can be transferred safely without concern about data loss. MAC (M, K) is a one-way transformation of the message M and a secret key K that is shared with the verifier. Both M and MAC (M, K) are sent to the verifier so that any change to the original message can be detected. On receiving these values, the verifier computes its own value MAC (M, K) from the received M and the value of K known to it.
If the MAC computed at the receiving end equals the MAC computed at the sender's end, the verifier decides that the message is authentic and unaltered. Even if an attacker who does not know K obtains MACs for messages M of the attacker's choosing, it should be infeasible to guess the MAC value for any new message not interrogated before. We also incorporate the features of challenge-response interrogation to prevent illegal access.
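The generate-and-verify exchange described above can be sketched as follows. This is a minimal illustration, assuming HMAC-SHA-256 from the Python standard library as the MAC; the key, the message and the function names `make_tag` and `verify_tag` are hypothetical and are not taken from the paper.

```python
import hashlib
import hmac


def make_tag(key: bytes, message: bytes) -> bytes:
    """Sender side: compute MAC(M, K), here with HMAC-SHA-256."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Verifier side: recompute the MAC from the received M and compare."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    # compare_digest compares in constant time to avoid timing leaks.
    return hmac.compare_digest(expected, tag)


shared_key = b"example shared secret key"  # hypothetical key K
message = b"unstructured record #1"        # hypothetical message M
tag = make_tag(shared_key, message)        # sent together with M
```

The verifier accepts only when the message, the key and the tag all match; changing any one of them causes `verify_tag` to return `False`.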
The challenge-response system consists of an interrogated component, a channel and an interrogator [11]. The interrogated component takes three inputs: the secret key, a random challenge C received from the interrogator, and the message M to be authenticated. The role of the interrogated component is to transmit the public key in encrypted form to the interrogator, where it is decrypted using the system decryption key and used along with the message and the challenge to generate the cipher text, which is decoded in the interrogator.
The interrogated component transmits the message to be authenticated to the interrogator in the form of cipher text. The generated cipher text is unique, because the key used to generate the challenge response is random and varies from component to component.
This is a useful security feature, since the randomness of the generated key prevents replay attacks. In other words, a challenge-response system is a common authentication technique in which an individual is prompted (the challenge) to provide some private information (the response). Most security systems that rely on smart cards are based on challenge-response. A user is given a code (the challenge), which he or she enters into the smart card. The smart card then displays a new code (the response) that the user can present to log in.
Fig 1.1: MAC Operations [11]
In moving unstructured data security, challenge-response authentication is a family of protocols in which one party presents a question (“challenge”) and another party must provide a valid answer (“response”) to be authenticated. Bluetooth technology is one example in which challenge-response authentication is used. When one Bluetooth-enabled device tries to connect to another, the receiving device sends a piece of data, known as the challenge, to the initiating device. On receiving it, the initiating device returns its response to the receiving device along with the challenge received [12].
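A minimal sketch of such a challenge-response round is given below. It assumes that the response is an HMAC-SHA-256 tag computed over the challenge concatenated with the message; this choice, the 8-byte challenge length and the function names are illustrative assumptions, not the Bluetooth procedure or the scheme of [11].

```python
import hashlib
import hmac
import secrets


def new_challenge(n_bytes: int = 8) -> bytes:
    """Interrogator: draw a fresh random challenge C."""
    return secrets.token_bytes(n_bytes)


def respond(key: bytes, challenge: bytes, message: bytes) -> bytes:
    """Interrogated component: bind the response to both C and M."""
    return hmac.new(key, challenge + message, hashlib.sha256).digest()


def check(key: bytes, challenge: bytes, message: bytes, response: bytes) -> bool:
    """Interrogator: recompute the expected response and compare."""
    return hmac.compare_digest(respond(key, challenge, message), response)
```

Because each round uses a fresh challenge, a response recorded in one round does not verify under the challenge of a later round, which is how replay is prevented.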
Authentication through Challenge response
Authenticity and message integrity are important components of moving data security in message communication. Authentication confirms that the message was sent by the original source, and message integrity verifies that the message was not altered, for example by a cloned source.
We therefore use challenge-response authentication, in which one party presents a question (“challenge”) and another party must provide a valid answer (“response”) to be authenticated. The simplest example of a challenge-response protocol is password security, where the challenge is asking for the password and the valid response is the original password.
Clearly, an adversary who can eavesdrop on a password authentication can then authenticate itself in the same way. One solution is to issue multiple passwords, each marked with an identifier. The verifier can pick any of the identifiers, and the party being authenticated must supply the correct password for that identifier. Assuming that the passwords are chosen independently, an adversary who intercepts one challenge-response message pair has no better chance of responding correctly to a different challenge than an adversary who has intercepted nothing.
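The identifier-based variant can be sketched as below. The table layout, the identifier strings and the function names are hypothetical; the sketch only illustrates that a response intercepted for one identifier does not answer a challenge for another.

```python
import hmac
import secrets


def make_table(n: int = 5) -> dict:
    """Issue n independent random passwords, each marked with an identifier."""
    return {f"id-{i}": secrets.token_hex(8) for i in range(n)}


def pick_challenge(table: dict) -> str:
    """Verifier: choose one identifier at random as the challenge."""
    return secrets.choice(sorted(table))


def check_response(table: dict, identifier: str, response: str) -> bool:
    """Verifier: accept only the password issued for that identifier."""
    return hmac.compare_digest(table[identifier], response)
```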
For example, in earlier years, when other communications security methods were unavailable, the U.S. military used the AKAC-1553 TRIAD numeral cipher to provide secure communications. TRIAD includes a list of three-letter challenge codes, from which the verifier is supposed to choose at random, and random three-letter responses to them. For added security, each set of codes is valid only for a particular time period, ordinarily 24 hours [18].
HMAC – Introduction
In cryptography, a Hash-based Message Authentication Code (HMAC) is a specific method for calculating a message authentication code (MAC) that involves a cryptographic hash function in combination with a secret key [12]. As with any MAC, it may be used to verify both the data integrity and the authenticity of a message simultaneously. Any cryptographic hash function, such as MD5 or SHA-1, may be used in the calculation of an HMAC; the resulting MAC algorithm is termed HMAC-MD5 or HMAC-SHA1 accordingly. The strength of the HMAC algorithm depends on the strength of the underlying hash function, on the size and quality of the key, and on the length of the hash output. An iterative hash function breaks a message into blocks of a fixed size and iterates over them with a compression function. For example, MD5 and SHA-1 operate on 512-bit blocks. The size of the HMAC output is the same as that of the underlying hash function (128 bits for MD5 and 160 bits for SHA-1), although it can be truncated if desired.
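The block and output sizes quoted above can be checked with a short sketch using the Python standard library. The key, the message and the helper name `sizes` are illustrative; MD5 and SHA-1 appear only because the text names them, and `hashlib` may refuse MD5 on some restricted (for example FIPS-mode) builds.

```python
import hashlib
import hmac

key = b"demo key"      # hypothetical key
msg = b"demo message"  # hypothetical message


def sizes(hash_name: str) -> tuple:
    """Return (block size in bits, HMAC output size in bits) for a hash."""
    block_bits = hashlib.new(hash_name).block_size * 8
    tag_bits = len(hmac.new(key, msg, hash_name).digest()) * 8
    return block_bits, tag_bits


# Truncation: keep only the leftmost 12 bytes (96 bits) of HMAC-SHA1.
truncated = hmac.new(key, msg, "sha1").digest()[:12]
```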
HMAC also provides a way to check the integrity of information transmitted over, or stored in, an unreliable medium, which is a prime necessity in the world of unstructured moving data. Mechanisms that provide such integrity checks based on a secret key are usually called message authentication codes (MACs). Typically, message authentication codes are used between two parties that share a secret key in order to authenticate the information transmitted between them. The HMAC standard defines a MAC that uses a cryptographic hash function in conjunction with a secret key [13]; it shall be used in combination with an approved cryptographic hash function. The HMAC algorithm uses the secret key for both the calculation and the verification of message authentication codes. The main design objectives of HMAC are:
Fig 1.2: Diagram of non-linear feedback shift register [13].
- To use available hash functions without modification, in particular hash functions whose code is freely and widely available and which perform well in software.
- To have a well-understood cryptographic analysis of the strength of the authentication mechanism, based on reasonable assumptions about the underlying hash function.
- To use and handle keys in a simple way.
- To allow easy replacement of the underlying hash function in case improved hash functions are developed in future.
HMAC Parameters
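This section presents the parameters of the HMAC algorithm as applied to moving unstructured data [16]:
H: An approved hash function.
K: Secret key shared between the originator and the intended receiver(s).
ipad: The byte 00110110 (0x36) repeated b/8 times, where b is the block size of H in bits.
opad: The byte 01011100 (0x5C) repeated b/8 times.
t: The number of bytes of MAC.
text: The data on which the HMAC is calculated.
$$\mathrm{MAC}(\mathit{text})_t = \mathrm{HMAC}(K, \mathit{text})_t = H\big((K \oplus \mathit{opad}) \,\|\, H((K \oplus \mathit{ipad}) \,\|\, \mathit{text})\big)_t$$
Here $\oplus$ is bitwise XOR, $\|$ is concatenation, K is padded with zeros to b bits before the XOR, and the subscript t denotes truncation to the leftmost t bytes.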
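The formula above can be traced step by step in code. The sketch below is an illustration, assuming SHA-256 as the approved hash function H; the function name `hmac_manual` is hypothetical. It follows the standard HMAC key preparation (a key longer than one block is hashed first, then the key is zero-padded to the block size), and its output can be compared against the standard-library `hmac` module.

```python
import hashlib
import hmac
from typing import Optional


def hmac_manual(key: bytes, text: bytes, hash_name: str = "sha256",
                t: Optional[int] = None) -> bytes:
    """Compute H((K xor opad) || H((K xor ipad) || text)), truncated to t bytes."""
    block_bytes = hashlib.new(hash_name).block_size  # b/8
    if len(key) > block_bytes:
        # Keys longer than one block are first hashed.
        key = hashlib.new(hash_name, key).digest()
    key = key.ljust(block_bytes, b"\x00")            # zero-pad K to b bits
    k_ipad = bytes(k ^ 0x36 for k in key)            # K xor ipad
    k_opad = bytes(k ^ 0x5C for k in key)            # K xor opad
    inner = hashlib.new(hash_name, k_ipad + text).digest()
    outer = hashlib.new(hash_name, k_opad + inner).digest()
    return outer if t is None else outer[:t]         # leftmost t bytes
```

For example, `hmac_manual(b"key", b"text")` returns the same bytes as `hmac.new(b"key", b"text", hashlib.sha256).digest()`.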
Stream Ciphers
In cryptography, a stream cipher is a symmetric-key cipher in which plaintext digits are combined with a pseudorandom cipher digit stream (key stream), typically by an Exclusive-OR (XOR) operation. In a stream cipher, the plaintext digits are encrypted one at a time, and the transformation of successive digits varies during the encryption. In practice, the digits are typically single bits or bytes.
Stream ciphers represent a different approach to symmetric encryption from block ciphers. Block ciphers operate on large blocks of digits with a fixed, unvarying transformation. This distinction is not always clear-cut: in some modes of operation, a block cipher primitive is used in such a way that it acts effectively as a stream cipher. An example is a block cipher in cipher-feedback (CFB) mode, which behaves as a self-synchronizing stream cipher. Stream ciphers typically execute at higher speed than block ciphers and have lower hardware complexity [17]. However, stream ciphers can be susceptible to serious security problems if used incorrectly; in particular, the same starting state must never be used more than once. The key forms a seed from which a pseudorandom key stream is generated. The transmitter XORs the key stream with the clear text stream to produce the cipher text stream. The receiver, which has the same seed key, can generate the same key stream and recover the clear text.
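The XOR construction, and the reason a starting state must not be reused, can be illustrated with a toy cipher. The key stream generator below (SHA-256 over key, nonce and a counter) is an assumption chosen only to keep the sketch self-contained; it is not the generator used in the paper and is not a vetted cipher.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy key stream: SHA-256(key || nonce || counter), block after block."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR the data with the key stream (same operation)."""
    ks = keystream(key, nonce, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))
```

Applying `xor_stream` twice with the same key and nonce returns the original data. If two messages are encrypted with the same key and nonce, the XOR of the two cipher texts equals the XOR of the two plaintexts, so the key stream cancels out and information about both messages leaks.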
Security
For a stream cipher to be secure its key stream must have a large period and it must be impossible to recover the cipher’s key or internal state from the key stream. Cryptographers also demand that the cipher’s key stream be free of even subtle biases that would let attackers distinguish a stream from random noise, and free of detectable relationships between key streams that correspond to related keys. This should be true for all keys and true even if the attacker can know or choose some plaintext or cipher text.
As with other attacks in cryptography, stream cipher attacks can be certificational, meaning they are not necessarily practical ways to break the cipher but indicate that the cipher might have other weaknesses. Securely using a synchronous stream cipher requires that the same key stream never be reused; that generally means a different nonce or key must be supplied to each invocation of the cipher. Application designers must also recognize that most stream ciphers provide only privacy, not authenticity: encrypted messages may still have been modified in transit [16]. Short periods for stream ciphers have also been a practical concern.
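The lack of authenticity can be seen in a small sketch: an attacker who knows or guesses the plaintext can change it by XORing a chosen difference into the cipher text, without knowing the key stream. The message contents below are hypothetical, and a random one-time key stream stands in for a real stream cipher.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


plaintext = b"PAY 100"
key_stream = os.urandom(len(plaintext))      # known only to sender and receiver
ciphertext = xor_bytes(plaintext, key_stream)

# The attacker never sees key_stream, only the cipher text in transit.
delta = xor_bytes(b"PAY 100", b"PAY 900")    # chosen difference
tampered = xor_bytes(ciphertext, delta)

# The receiver decrypts the tampered cipher text without noticing anything.
recovered = xor_bytes(tampered, key_stream)
```

The receiver obtains `b"PAY 900"`, which is why a MAC such as HMAC is needed alongside the stream cipher.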
Linear Feedback Shift Register
A linear feedback shift register is a shift register whose input is a linear function of its previous state. The only linear functions of single bits are XOR and XNOR (inverse XOR); thus it is a shift register whose input is driven by the Exclusive-OR (XOR) of some bits of the overall shift register value. The initial value of the linear feedback shift register (LFSR) is called the seed, and because the operation of the register is deterministic, the stream of values it produces is completely determined by its current (or previous) state. Likewise, because the register has a finite number of possible states, it must eventually enter a repeating cycle. However, an LFSR with a well-chosen feedback function can produce a sequence of bits that appears random and has a very long cycle [15].
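A small register makes the seed, the feedback and the repeating cycle concrete. The sketch below is a right-shifting (Fibonacci-style) LFSR; the 4-bit width and the tap positions (0, 1) are illustrative choices, not parameters from the paper. With these taps, every nonzero seed runs through all 15 nonzero states before repeating.

```python
def lfsr_step(state: int, taps: tuple, width: int) -> int:
    """One step: XOR the tapped bits, shift right, insert the result at the MSB."""
    bit = 0
    for t in taps:
        bit ^= (state >> t) & 1
    return (state >> 1) | (bit << (width - 1))


def lfsr_bits(seed: int, taps: tuple, width: int, n: int) -> list:
    """Return the first n output bits (the bit shifted out at each step)."""
    state, out = seed, []
    for _ in range(n):
        out.append(state & 1)
        state = lfsr_step(state, taps, width)
    return out


def lfsr_period(seed: int, taps: tuple, width: int) -> int:
    """Count the steps until the register returns to its seed."""
    state = seed
    for count in range(1, (1 << width) + 1):
        state = lfsr_step(state, taps, width)
        if state == seed:
            return count
    raise ValueError("seed is not on a cycle (is tap 0 missing?)")
```

The all-zero state maps to itself, so the seed must be nonzero for the register to produce a useful sequence.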
Applications of LFSR include generating pseudo-random numbers, pseudo-noise sequences, fast digital counters, and whitening sequences. Both hardware and software implementations of LFSR are common. Linear feedback shift registers (LFSRs) are popular components in stream ciphers as they can be implemented cheaply in hardware, and their properties are well-understood.
Encryption by LFSR
- Apply the linear feedback polynomial using XOR gates to generate one output bit per iteration.
- Right-shift the contents of the shift register.
- Insert the output bit at the most significant bit position [6].
FIGURE 4.2 [15]
Decryption by LFSR
- Apply the modified linear feedback polynomial using XOR gates to generate one output bit per iteration.
- Left-shift the contents of the shift register.
- Insert the output bit at the least significant bit position [6].
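The two procedures above are inverses of each other, which a short sketch can confirm. The forward step applies the feedback taps, shifts right and inserts at the most significant bit; the reverse step applies the correspondingly shifted taps, shifts left and inserts at the least significant bit. The 4-bit width and the taps (0, 1) are illustrative assumptions, and the reverse step as written requires tap 0 to be among the taps.

```python
def forward_step(state: int, taps: tuple, width: int) -> int:
    """Feedback bit from the taps, right shift, insert at the MSB."""
    bit = 0
    for t in taps:
        bit ^= (state >> t) & 1
    return (state >> 1) | (bit << (width - 1))


def reverse_step(state: int, taps: tuple, width: int) -> int:
    """Undo forward_step: recover the lost LSB, left shift, insert at the LSB."""
    # The MSB holds the old feedback bit; XOR out the other tapped bits,
    # which now sit one position lower, to recover the old bit 0.
    bit = (state >> (width - 1)) & 1
    for t in taps:
        if t != 0:
            bit ^= (state >> (t - 1)) & 1
    mask = (1 << width) - 1
    return ((state << 1) & mask) | bit
```

Running `reverse_step` on the output of `forward_step` returns the original register contents for every 4-bit state.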
FIGURE 4.3 [15]
Proposed Algorithm
To provide security for moving unstructured data, this section presents the parameters of the proposed algorithm with their descriptions and a data flow diagram [16].
m (96 bits): Message bits.
c (64 bits): Challenge bits.
iv: Initial value for the LFSR.
K (80 bits): Secret key shared between the originator and the intended receiver(s).
ipad: The byte 00110110 (0x36) repeated b/8 times.
opad: The byte 01011100 (0x5C) repeated b/8 times.
H (96 bits): Output cipher text.
||: Concatenation operator.
FIGURE 5.1 [16]
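The parameter sizes listed above can be exercised with a stand-in construction. The sketch below is not the proposed LFSR-based compact HMAC, whose data flow is given only in the figure; it substitutes standard HMAC-SHA-256 over c || m, truncated to 96 bits, purely to show how an 80-bit key, a 64-bit challenge and a 96-bit message combine into a 96-bit output. The function name `compact_tag` and the ordering c || m are assumptions.

```python
import hashlib
import hmac

KEY_BYTES = 80 // 8        # K: 80 bits
CHALLENGE_BYTES = 64 // 8  # c: 64 bits
MESSAGE_BYTES = 96 // 8    # m: 96 bits
TAG_BYTES = 96 // 8        # H: 96 bits


def compact_tag(key: bytes, challenge: bytes, message: bytes) -> bytes:
    """Stand-in only: truncated HMAC-SHA-256 over challenge || message."""
    if (len(key), len(challenge), len(message)) != (
            KEY_BYTES, CHALLENGE_BYTES, MESSAGE_BYTES):
        raise ValueError("expected an 80-bit key, 64-bit challenge, 96-bit message")
    full = hmac.new(key, challenge + message, hashlib.sha256).digest()
    return full[:TAG_BYTES]  # keep the leftmost 96 bits
```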
Conclusion
In this research paper, a message authentication algorithm based on the concepts of network security in an unstructured cloud environment is implemented, following an IEEE research paper. The paper proposes an algorithm that implements a compact version of HMAC (hash-based message authentication code) to add integrity, authentication and confidentiality to moving unstructured data. Using the proposed algorithm, we developed pseudocode in which a stream cipher is used to achieve the result, addressing several security problems in the cloud environment.
The pseudocode secures moving unstructured data in a cloud environment by using the compact version of the HMAC algorithm, and it can be realized easily.
Future Scope
In future, this research work will be extended with honeypot technology to enhance the security of moving unstructured data in cloud environments.
References
[1] International Data Corporation, 2017-01-06, http://www.idc.com.
[2] http://www.datasciencecentral.com/profiles/blogs/structured-vs-unstructured-data-the-rise-of-data-anarchy, 2012.
[3] M. Paryasto, A. Alamsyath, B. Raharjdo and Kuspriyanto, “Big Data Security Management Issues”, IEEE Information and Communication Technology (ICoICT), 2014 2nd International Conference on, Bandung, 2014, pp. 59-63.
[4] S. Kaisler, F. Armour, J. A. Espinosa and W. Money, “Big Data: Issues and Challenges Moving Forward”, System Sciences (HICSS), 2013 46th Hawaii International Conference on, Wailea, Maui, HI, 2013, pp. 995-1004.
[5] Computer Security Division, Information Technology Laboratory, “Guide for Conducting Risk Assessments”, Technical report, National Institute of Standards and Technology, 2012.
[6] Yuri Demchenko, Paola Grosso, Cees de Laat, Peter Membrey, “Addressing Big Data Issues in Scientific Data Infrastructure”, IEEE, 2014.
[7] Rongxing Lu, Hui Zhu, Ximeng Liu, Joseph K. Liu and Jun Shao, “Toward Efficient and Privacy-Preserving Computing in Big Data Era”, IEEE Network, July/August 2014.
[8] Avita Katal, Mohammad Wazid, R. H. Goudar, Department of CSE, Graphic Era University, “Big Data: Issues, Challenges, Tools and Good Practices”, IEEE Conference, 2013.
[9] Anjula Gupta, Navpreet Kaur Walia, “Cryptography Algorithms: A Review”, IJEDR, Volume 2, Issue 2, ISSN: 2321-9939, 2014.
[10] S. Suresh Babu, “A symmetric cryptographic model for authentication and confidentiality using Hilbert matrix”, Ph.D. thesis, Andhra University, Visakhapatnam, 2010.
[11] Benjamin Arazi, “Message Authentication In Computationally Constrained Environment”, IEEE Transactions on Mobile Computing, pp. 1-7, July 2009.
[12] National Institute of Standards and Technology, “The Keyed-Hash Message Authentication Code (HMAC),” FIPS PUB 198, Information Technology Laboratory, pp. 1-32 2002.
[13] H. Krawczyk, M. Bellare, and R. Canetti, “HMAC: Keyed-Hashing for Message Authentication,” IETF RFC 2104, 1997.
[14] J. Bhasker “A VHDL Primer,” PEARSON Prentice hall, pp. 40-190, 2007.
[15] H. Krawczyk, “LFSR-Based Hashing and Authentication,” Proc. Ann. Int’l Cryptology Conf. (CRYPTO 94), pp. 129-139, 1994.
[16] William Stallings “Cryptography and Network Security,” PEARSON Prentice hall, pp. 97-157, 2006.
[17] I. Vajda and L. Buttyan, “Lightweight Authentication Protocols for Low-Cost RFID Tags,” Proc. Second Workshop Security Ubiquitous Computing (Ubicomp ’03), Oct. 2003.
[18] William Stallings, “Computer Security: Principles and Practice”, 1st Edition, pp. 175-190, Sep 2016.
International Journal of Science, Engineering and Technology