In a cloud computing environment, the infrastructure is not owned by its users, so its security and integrity must be protected and verified from time to time. In a Hadoop-based scalable computing setup, malfunctioning nodes generate wrong output at run time. To detect such nodes, we create a collaborative network between the worker nodes (i.e., the data nodes of Hadoop) and the master node (i.e., the name node of Hadoop) with the help of a trusted heartbeat framework (THF). We propose procedures to register a node and to alter the status of a node based on the reputation provided by other co-worker nodes.
Outsourcing computation to the cloud can reduce a company's IT expenditure. Still, most companies are unwilling to do so because of security concerns with cloud computing environments and services. As per survey [
In a public cloud infrastructure, malfunctioning nodes may infringe the security requirements specified by the service consumer. They may produce malicious outputs, which may violate the privacy and integrity of the computation. This may result in disclosure of users' confidential data and in profiling of users' behaviors (and preferences) for privacy analysis. Moreover, software flaws, bugs and misconfigurations can lead to incorrect results or unintended information leakage.
Malicious or tampered nodes may eavesdrop on the communication between other nodes in order to disclose confidential data, enforce malicious privacy profiling [
We consider a cloud system that takes a user task, distributes it among computing nodes, and gathers the output, as shown in
The rest of the paper is organized as follows. In Section 2, we discuss background and related work on heartbeats and the TPM. In Section 3, we propose the Trusted Heartbeat infrastructure. Section 4 shows the usage model of the proposed framework, followed by the conclusion and references.
Apache Hadoop [
Heartbeat [
information to automatically add or subtract resources from their pool. HDFS replicates file blocks for fault tolerance. An application can specify the number of replicas of a file at the time it is created. The NameNode makes all decisions concerning block replication. Each DataNode periodically sends heartbeat messages to its NameNode, so the latter can identify a loss of connectivity when it stops receiving these messages. The NameNode marks such a node as a dead DataNode (not responding to heartbeats) and stops sending requests to it. Data stored on such a node is no longer available to a client (
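The dead-node rule described here can be illustrated with a minimal Python sketch. It is not Hadoop's implementation: the class name `HeartbeatMonitor`, the `timeout_s` parameter and the explicit `now` timestamps are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat time of each DataNode (illustrative only)."""
    timeout_s: float  # hypothetical: silence longer than this marks a node dead
    last_seen: dict = field(default_factory=dict)

    def record(self, node_id: str, now: float) -> None:
        # Called whenever a heartbeat message arrives from node_id.
        self.last_seen[node_id] = now

    def dead_nodes(self, now: float) -> list:
        # Nodes whose last heartbeat is older than the timeout.
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout_s)

    def live_nodes(self, now: float) -> list:
        # Only these nodes should keep receiving requests.
        return sorted(n for n, t in self.last_seen.items()
                      if now - t <= self.timeout_s)
```

Passing `now` explicitly, rather than reading a clock inside the class, keeps the sketch deterministic and easy to test.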
The Trusted Computing Group consortium has developed specifications for the Trusted Platform Module (TPM). The TPM is a special-purpose microcontroller on a motherboard. By incorporating a physical facility for secure generation and storage of cryptographic keys, the TPM becomes the core component for creating an interoperable "trusted computing" environment. The capabilities that every TPM provides include hashing with the SHA-1 algorithm, random number generation, asymmetric key generation, and encryption and decryption with the RSA algorithm. Following in
Integrity verification of the software components to support mitigation of security concerns related to cloud computing infrastructure. Although it does not provide absolute assurance, trusted computing makes attacks harder by operating at the hardware level. With a correct implementation, an attacker would need physical access to the hardware in order to subvert the TPM [
There have been many attempts to enhance fault tolerance and trust-based mechanisms to preserve the integrity of cloud systems in an open distributed environment [
Key Name | Purpose |
---|---|
Endorsement Key (EK) | An RSA key pair installed by the TPM manufacturer to uniquely identify the TPM. |
Storage Root Key (SRK) | A non-transferable key generated by the platform owner to serve as the root key in the hierarchy of keys associated with the TPM. |
Attestation Identity Key (AIK) | Used for attestation and identification of a TPM (i.e. activated mode). A trusted third party can create an identity certificate by signing the public part of the AIK. |
Signing Key | Used by the system to sign messages. |
Storage Key | Used to encrypt and decrypt other keys (using RSA). |
Identity Key | Used for operations that require the TPM identity. |
Binding Key | Used for Unbind operations to decrypt data. |
We propose a scheme to determine whether a particular VM is trustworthy. Only attested and trusted VMs can get tasks and collaborate in the network. A negative reputation is assigned if a node does not generate output (or produces malicious output).
In this framework, we assume that TPM communication cannot be tampered with and that storage is not exposed. Since the main purpose of the TPM is to repel most attacks on the software, we presume that the trusted platform can assess every software module loaded on the platform in terms of its hash code [
The trust manager binds the evidence generated in accordance with TC's notation as trusted data. Moreover, users can obtain this information to assess the security properties of a worker node at any time during the entire processing cycle. The trust & reputation collector collects these properties of the nodes and stores them with their corresponding values. These values are kept in the trust storage for future score calculation. The following are the three main procedures of our proposed system.
(a) Initial Node Registration
Node registration takes place when a data node first joins the network. It verifies the genuineness of the TPM and exchanges keys for sealing and binding operations. The genuineness of the TPM is established through its public EK.
Every time a worker node initiates a connection request to the master node, the master node executes an initial attestation procedure. The verifier has collected all the properties of each node whose information is kept in the trust storage. Therefore, only a registered node with allowed properties is added to the list of the task manager (for completing tasks). TC credentials and public session keys are stored in the trust storage.
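As an illustration of the registration step, the sketch below admits a node only if its public EK is recognized and its reported properties are allowed. It is a simplification under stated assumptions: the class `TrustStorage`, its fields, and the use of a SHA-256 fingerprint allow-list in place of real EK certificate validation are all hypothetical and not taken from the paper.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class TrustStorage:
    """Hypothetical stand-in for the trust storage on the master node."""
    known_ek_fingerprints: set  # assumed allow-list of genuine EK fingerprints
    allowed_properties: set     # assumed set of property values the master permits
    registered: dict = field(default_factory=dict)

    def register(self, node_id: str, ek_public: bytes,
                 session_key_public: bytes, properties: set) -> bool:
        # Step 1: check the TPM's genuineness via its public EK.
        fingerprint = hashlib.sha256(ek_public).hexdigest()
        if fingerprint not in self.known_ek_fingerprints:
            return False
        # Step 2: admit the node only if every reported property is allowed.
        if not properties <= self.allowed_properties:
            return False
        # Step 3: keep the credentials and public session key for later checks.
        self.registered[node_id] = {
            "ek_fingerprint": fingerprint,
            "session_key_public": session_key_public,
            "properties": set(properties),
        }
        return True

    def task_manager_list(self) -> list:
        # Only registered nodes are eligible to receive tasks.
        return sorted(self.registered)
```

A production system would validate the EK certificate chain issued by the TPM manufacturer rather than compare fingerprints.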
In the Trusted Heartbeat framework, every node (N) is identified by its own unique AIK, and the Master (M) acts as the Privacy-CA defined by the TC infrastructure [
From time to time, the collector and the trust storage update the nonce information and initiate the attestation procedure by invoking the TPM Quote instruction of the TPM with a fresh nonce. As shown in
(b) Verification of Heartbeats
The verifier of the trust and reputation collector is invoked each time a heartbeat message with an attestation request reaches the master node. It looks up in the cache the nonce value received through the most recent heartbeat message (last_nonce). If the verifier does not find that nonce value, it invalidates the connection request made through the heartbeat message. Once the worker node sends a heartbeat message with a valid new_nonce, it can continue to communicate with the master and get tasks. The trust verifier can verify the received signature and the quote of PCR values using TPM_Verify [
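The nonce check and quote verification in procedure (b) can be sketched as follows. This is a hedged illustration rather than the paper's protocol: a real TPM quote is an RSA signature made with the AIK, whereas this sketch substitutes an HMAC with a per-node key so that it runs with the Python standard library alone. The names `HeartbeatVerifier`, `make_quote`, `node_keys` and `expected_pcr` are assumptions.

```python
import hashlib
import hmac
import secrets


def make_quote(key: bytes, nonce: bytes, pcr: bytes) -> bytes:
    # Worker-side stand-in for the TPM quote: keyed digest over nonce || PCR.
    return hmac.new(key, nonce + pcr, hashlib.sha256).digest()


class HeartbeatVerifier:
    """Illustrative verifier; HMAC stands in for an AIK-signed TPM quote."""

    def __init__(self, node_keys: dict, expected_pcr: bytes):
        self.node_keys = node_keys        # hypothetical per-node verification keys
        self.expected_pcr = expected_pcr  # PCR digest of a known-good platform
        self.nonce_cache = {}             # node_id -> most recently issued nonce

    def issue_nonce(self, node_id: str) -> bytes:
        nonce = secrets.token_bytes(20)
        self.nonce_cache[node_id] = nonce
        return nonce

    def verify(self, node_id: str, nonce: bytes, pcr: bytes,
               signature: bytes) -> bool:
        key = self.node_keys.get(node_id)
        cached = self.nonce_cache.get(node_id)
        # Reject if the node is unknown or the nonce is not the cached one.
        if key is None or cached is None:
            return False
        if not hmac.compare_digest(cached, nonce):
            return False
        # Reject if the quote does not cover this nonce and these PCR values.
        if not hmac.compare_digest(make_quote(key, nonce, pcr), signature):
            return False
        # A nonce is single-use; the node needs a fresh one next time.
        del self.nonce_cache[node_id]
        # Finally, the reported PCR values must match the known-good state.
        return hmac.compare_digest(pcr, self.expected_pcr)
```

Deleting the nonce after a successful signature check means a replayed heartbeat fails at the cache lookup, which mirrors the invalidation described in the text.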
(c) Reputation based detection
Reputations are gathered with each heartbeat message received from the master. If a node's calculated reputation is lower than a predefined threshold, the master node unregisters that node or marks it as lost. The threshold value can be computed from the number of nodes and the previously stored information available in the trust storage. The black list contains all such failed nodes. Similarly, the gray list contains nodes that have suffered some decrements in reputation and are therefore probable failures (
are collected in the same cluster only, detecting a failed or malicious node is faster than collecting reputations from all the nodes, as depicted in the reputation-based decision procedure.
A worker node can receive its reputation or penalties through heartbeat messages. The master node increases the reputation of a worker node each time it gets a heartbeat message with hash values. The trust & reputation based detector has an upper bound on the maximum reputation; after a node reaches that value, the initialization process begins. However, once a node enters the gray list, it starts receiving penalties if it does not reply.
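The reputation bookkeeping of procedure (c) can be sketched as below. The numeric values, the one-point reward and penalty, and the interpretation of "initialization" as resetting the score to zero are assumptions made for illustration; the paper does not fix them. `ReputationDetector` and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ReputationDetector:
    """Illustrative reputation bookkeeping; all numeric values are assumptions."""
    threshold: int = -3    # a score below this black-lists the node
    upper_bound: int = 10  # reaching this restarts the count (assumed reading)
    scores: dict = field(default_factory=dict)
    gray_list: set = field(default_factory=set)
    black_list: set = field(default_factory=set)

    def on_heartbeat(self, node_id: str) -> None:
        # A valid heartbeat earns one reputation point.
        if node_id in self.black_list:
            return
        score = self.scores.get(node_id, 0) + 1
        self.scores[node_id] = 0 if score >= self.upper_bound else score

    def on_missed_heartbeat(self, node_id: str) -> None:
        # A missed or invalid heartbeat costs one point and gray-lists the node.
        if node_id in self.black_list:
            return
        self.scores[node_id] = self.scores.get(node_id, 0) - 1
        self.gray_list.add(node_id)
        # Falling below the threshold moves the node to the black list.
        if self.scores[node_id] < self.threshold:
            self.gray_list.discard(node_id)
            self.black_list.add(node_id)
```

In this simplified reading, the first penalty already places a node on the gray list; a stricter variant could start penalizing only after the node has been gray-listed.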
In this paper, we propose the Trusted Heartbeat framework, which creates a collaborative network among virtual machines. With remote attestation and heartbeat messages, a master node can determine the exact status (working or malfunctioning) of its nodes. The proposed framework identifies genuine worker nodes using trusted computing facilities. The heartbeat interval is a very important parameter in our system. The trust and reputation based detector helps Hadoop-like distributed systems detect malicious nodes quickly. This framework shows how common messages can be used to establish trust among all the corresponding nodes in a distributed environment.
Dipen Contractor, Dhiren Patel, Shreya Patel (2016) Trusted Heartbeat Framework for Cloud Computing. Journal of Information Security, 07, 103-111. doi: 10.4236/jis.2016.73007