Research Article  Open Access
D. Vinodha, E. A. Mary Anita, "Discrete Integrity Assuring SliceBased Secured Data Aggregation Scheme for Wireless Sensor Network (DIASSDAS)", Wireless Communications and Mobile Computing, vol. 2021, Article ID 8824220, 17 pages, 2021. https://doi.org/10.1155/2021/8824220
Discrete Integrity Assuring SliceBased Secured Data Aggregation Scheme for Wireless Sensor Network (DIASSDAS)
Abstract
In a wireless sensor network, data privacy with a minimum network bandwidth usage is addressed using homomorphicbased data aggregation schemes. Most of the schemes which ensure the endtoend privacy provide collective integrity verification of aggregated data at the receiver end. The presence of corrupted values affects the integrity of the aggregated data and results in the rejection of the whole data by the base station (BS) thereby leading to the wastage of bandwidth and other resources of energy constraint wireless sensor network. In this paper, we propose a secured data aggregation scheme by slicing the data generated by each sensor node deployed in layered topology and enabling en route aggregation. Novel encoding of data and hash slices based on child order is proposed to enable concatenationbased additive aggregation and smooth extraction of slices from the aggregate by the BS. Elliptic curvebased homomorphic encryption is adopted to ensure endtoend confidentiality. To the best of our knowledge, the proposed scheme is the first which facilitates the BS to perform nodewise integrity verification, filter out only the corrupted portion, and implement dynamic query over the received data. Communication and computationbased performance analysis shows the efficiency of the proposed scheme for varied network sizes. The scheme can resist eavesdropping attack, node compromising attack, replay attack, malleability attack, selective dropping attack, and collusion attack.
1. Introduction
Wireless sensor network(WSN) has started its origin with the SOSUS (SOund SUrveillance System) developed by the U.S. Navy for detecting the Soviet submarines during the end of World War II by employing acoustic sensor on the SOFAR channel [1]. Nowadays, wireless sensor networks have been extensively used in many applications and play a vital role in monitoring remote places which are remarkably inaccessible by humans and require continuous scanning. Surveillance is done by collecting information of measurable factors like temperature, humidity, dissolved gases, and current and water turbulence of environment using tiny sensing devices scattered at different distances. These miniature devices though have become the boon of today’s technological development, starve from resource constraints like less storage, less processing power, and low power backup. Hence, the data transmission which is the major cause of energy depletion in wireless sensor network attracted the attention of research. Efficient routing protocols [2–4] and mobility sinks are proposed to minimize the transmission. Sensor nodes usually deployed in dense lead to redundancy in the data being measured. This plays a major role in the consumption of communication bandwidth. Hence, data aggregation which reduces the communication by aggregating redundant data becomes vital in wireless sensor network. Also, sensor nodes mostly deployed in remote hostile areas are liable to maninthemiddle attack [5] due to the lack of physical protection and attention and untrustworthy transmission channel. Hence, there is a need of privacypreserving schemes. The privacy in data aggregation is preserved either by incorporating encryption schemes or by nonencryption schemes like synapsis diffusion [6] and iterative filtration technique [7].
Our survey [8] on existing secured data aggregation schemes reveals that the number of communication path between nodes and the number of hops along the path are determined by the network model adopted for organizing the nodes. Network model which organizes the nodes as tree suffers from a high probability of communication loss due to the existence of the single communication path with multiple hops [9, 10]. Existence of a single path also prevents the nodes from sharing their role dynamically. In [11], sensor nodes are organized as clusters that enable dynamic sharing of aggregator role within a cluster. But like tree topology, it also suffers from a high probability of communication loss due to the existence of a single path. The problem of both tree and cluster topology is addressed in [12] by organizing the nodes as layers where the nodes are grouped into different layers based on the hop distance from the base station.
Privacy in data aggregation involves 2 conflicting factors. (1) Mechanism for keeping the network free from danger (privacy) requires additional data transmission. (2) Data aggregation is aimed at reducing the data communication. Many privacypreserving secure data aggregation protocols have been developed to meet these two conflicting objectives simultaneously. These schemes can be broadly classified into two, based on how security is achieved either by encryption or by nonencryption [6, 7, 13]. Schemes like [11, 14, 15] employ encryption to conceal the data and prevent intruders from tapping information by overhearing the wireless channel. The scheme [15] verifies the authentication using IDbased aggregated signature. Concealment of data can be in two different levels either between the hops or between the source end and destination end. Concealment between the hops needs the encrypted data to be unwrapped at every hop for aggregation [16]. Unwrapping of encrypted packet opens a hole for the intruders to get access to the data by compromising the aggregator node. To strengthen the security, the actual data is hidden from intruders by slicing it into different pieces in [17]. These pieces are encrypted and grouped into two sets and forwarded to the aggregator of disjoint trees where they get decrypted and aggregated. Girao et al. [18] proposed a new scheme to conceal the data between the source end and the destination end by applying aggregation function over the encrypted data instead of the plain data using privacy homomorphism (PHM). The PHMbased schemes can again be classified based on keys as symmetric PHM and asymmetric PHMbased schemes. The authors in [11, 19] employed the same key for both encryption and decryption. This reduces the communication and computation cost with less message expansion. But sharing of secret key among all the nodes makes the system more vulnerable to attack. To address this vulnerability, different keys are used for encryption and decryption in [20–22]. The key used for decryption is kept secret at the base station, and the key used for encryption is shared among all the nodes. The scheme proposed in [14] employs asymmetric PHM and aggregates cipher text of multiple applications generated using different keys. Privacy homomorphismbased encryption which ensures the confidentiality is vulnerable to malleability attack, and hence, the encrypted data can be falsified by intruders sitting on the way to the BS. This problem is addressed by verifying the integrity of the aggregated data collectively at the BS. In [11], the BS performs collective integrity verification by employing privacy homomorphism and structuring the data as a complex number comprised of the encrypted data and the privacy factor. Another solution using collective integrity verification at the BS proposed in [9] referred to as ASAHS is made scalable by making all the active nodes increment an array element whose index corresponds to the range of value to be contributed by them. The array allows the BS to perform any operation over the sensed data. Since the aggregated values are the representation of only the different range of sensed data, the BS cannot know the exact sensed values. This made this scheme not suitable for applications where the accuracy is at most important. Though the collective integrity of the aggregated value is verified by the BS, it requires the BS to share a secret cipher text with each leaf node. Boudia et al., in [23], proposed a secure additive homomorphicbased aggregation scheme referred to as SASPKE using symmetric state full key which is shared between the BS and sensor nodes. For secure sharing of this state full key, it employs an asymmetric key using elliptic curve cryptosystem (ECC). This scheme ensures endtoend concealment, collective data integrity, and authentication by using homomorphic encryption. For authentication, hashbased message authentication code (MAC) is generated using the state full key which can be unique for each message. Chen et al., in [22], referred to as RCDA, proposed an encodingbased secured data aggregation to support dynamic query by recovering nodewise data. But the integrity verification is done in collective manner at the BS. Since the PH schemes allow the operations over the encrypted data without decryption, the malicious node can perform any malfunction as if it is a valid node. This malleability attack is addressed in MRCDA [24] by verifying the integrity at every intermediate aggregator node with the help of the MAC generated using pairwise shared symmetric key, secret key, and random function. The endtoend integrity is verified by the BS with the help of MAC generated using another key pair shared between the BS and all nodes. But continuous compromising of nodes violates the integrity which leads to the failure of collective integrity verification. Li et al. in [10] verified the integrity of homomorphicbased aggregated data in BS as well as in the forwarding path. To reduce the impact of compromising node attack as in [24], integrity is ensured by aggregator node and monitoring nodes by employing a randomly located subMAC. Detection of false data by checking the integrity at intermediate nodes [10, 24] prevents the invalid data from reaching the BS.
In all these above discussed schemes, failure of collective integrity verification leads to the rejection of whole aggregated value including the valid data. This results in the wastage of bandwidth. Ozdemir and Yang in [25] proposed another solution referred to as IPHCDA which has improved the granularity level of integrity verification by allowing the BS to verify the integrity at the cluster level. In our proposed scheme, we further improved this granularity level by making the BS to do integrity verification at individual sensor node level and separate the corrupted data from the valid one. Hence, the bandwidth consumption problem of collective integrity verification due to the rejection of valid data along with invalid data is addressed in the proposed scheme.
Discrete integrity assuring slicebased secured data aggregation scheme for wireless sensor network is proposed in this paper. The major contributions of the proposed scheme are as follows:(1)The scheme supports concatenationbased additive aggregation of sliced messages at every hop en route to the BS by adopting additive Elliptic CurveElGamal (ECEG) based homomorphic encryption mechanism. To enable this, a novelty is added by creating the sliced messages using child orderbased encoding mechanism(2)The proposed scheme allows smooth extraction of nodewise sliced messages from the aggregate. The BS is made to perform integrity verification at individual sensor node level and filter out only the data of corrupted node(3)The scheme allows the BS to implement dynamic query like sum, mean, max, and min using a single round of data collection. The scheme is made secure against eavesdropping attack, node compromising attack, replay attack, malleability attack, selective dropping attack, and collusion attack by adopting slicebased homomorphic encryption mechanism
The rest of the paper is organized as follows. Section 2 discusses the related work. Section 3 describes the role of hashing function and comparisons of hashing functions. Section 4 first describes the network model. Then, TinyMD5 and ECElGamal cryptosystem are reviewed. Section 5 details our proposed scheme for discrete integrity assuring slicebased secured data aggregation. Security analysis is discussed in Section 6. Section 7 presents performance and comparative analysis. Finally, concluding notes are made in Section 8.
2. Related Work
In [26], the authors have proposed two schemes based on slicing for secure data aggregation. In Scheme I, for the network with nodes, each data is sliced into pieces and confusion is added by mixing the slices of nodes. Mixing is done at the cost of transmissions. Originality of the slices is validated by attaching a hash with each slice, and confidentiality is ensured using the symmetric advanced encryption standard. Homomorphic encryption is adopted to ensure endtoend confidentiality. But the scheme supports only additive aggregation of slices. In Scheme II, the communication cost due to homomorphism is reduced by introducing a noise data while mixing the slices which strengthens the security. Attack by compromising the aggregator node is not addressed in both the schemes. Liu et al. proposed a slicebased secured data aggregation scheme [27] by reducing the number of slices using Euclideanbased decomposition. Different aggregation function is allowed by adding mark bits along with the data slices. But the support of multiple functions using only one query is not possible. In addition, the data integrity is not verified at all levels. In [28], the authors proposed a secure data aggregation to support multiple functions in a single query by representing the data as number of occurrences of values using homomorphic encryption. The authors in [29] proposed clusterbased secure data aggregation using symmetric key encryption. The authenticity of cluster head is ensured before the transmission using hash value. The BS is enabled to detect the malicious node by analysing the timestamp retrieved from the aggregate. But the scheme requires separate symmetric key for each node participating in the data collection round. The authors in [17] proposed a secured data aggregation scheme using elliptic curvebased homomorphic encryption, and integrity of the data is verified collectively at all levels both at CH and at BS by employing two different MAC, one is by using pairwise symmetric key and another one is by using homomorphism. Another elliptic curvebased scheme [22] allows the BS to verify the integrity of the aggregate and to extract nodewise data. This helps the BS to apply any mathematical functions over the data in a single query. But the absence of nodewise integrity verification at the BS results in the rejection of whole received values including the valid data.
By considering the above discussed issues, we propose a scheme based on slicing which can ensure endtoend confidentiality with reduced number of slices and allow the BS to reject only the corrupted data instead of the whole received values by verifying nodewise integrity. The scheme also supports the BS to compute multiple functions (like sum, mean, max, and min) in a single query.
3. Preliminaries
Three major security elements which are vital in wireless communication are confidentiality, integrity, and authentication. Confidentiality can be achieved by applying encryption mechanisms whereas the originality and authentication of the received data can be verified by generating message digest using hash functions with secret key. It makes it difficult for the adversaries to introduce changes in the original data without doing the required modification in the digest. Hence, hashing functions play a vital role in security schemes.
Most of the existing hash algorithm’s design has taken the form of MerkleDamgard construction. The message of arbitrary length is split into blocks of predefined size, and from each block, a fixed size digest is generated by applying mathematical functions [30]. This process is repeated for all the blocks in cyclic manner. MD2 and MD4 are 128bit hash functions and were proven to be in danger of preimage and collision attack. MD5 can generate 128bit message digest in 4 rounds in oppose to MD4 which has taken 3 rounds. The use of unique additive constant in each round makes MD5 to generate an avalanche effect at a faster rate compared to MD4 which uses the same constant for all rounds. The authors in [31] proposed the modified version of MD5 called TinyMD5 by reducing the message digest from 128 bits to 32 bits and made it energy efficient hashing function and suitable for energy starving wireless sensor network. This is adapted in our scheme to generate the hash value for verifying whether the received message is intact. In TinyMD5, the major difference lies in the initial buffer value of four 8bit buffers, size of operational data, and constants used for removing the duplication in the generated hash value. Though the weakness of message digest hash functions with respect to collision attack is proven and made the cryptographers suggest the SHA hash function series, the size of message digest of SHA series which is more than 128 bits leads to communication overhead for the energy constraint wireless sensor network. The comparative analysis of some of the existing hash functions is given in Table 1.
 
MDc: MD construction. 
4. Specification of DIASSDA Method
In this section, first, the network model which illustrates the deployment of nodes around the base station is discussed. Then, the review of TinyMD5 and Elliptic CurveElGamal cryptosystem is presented.
4.1. Network Model
We consider a WSN with a ringbased layered topology. The nodes are organized as layers such that each layer comprises nodes with the same hop distance from the BS. Layer 1 includes all the nodes which can be reached from the BS in one hop. Layer 2 includes nodes which are at two hop distance from the BS. Similarly, layer includes all the nodes which are hops away from the BS. Refer to Figure 1.
The layers which form the rings of nodes around the BS enable multihop communication [12]. Also, the existence of multipath between the nodes and the BS in the layered architecture allows dynamic sharing of aggregator role among the nodes [32]. This prevents one node from getting overloaded by playing the role of aggregator as in cluster and hierarchical network models. It also minimizes the probability of communication loss. To facilitate the propagation of packets from source layer to the BS through multiple paths, each node at layer stores information about the neighbour nodes in the next layer towards the BS.
4.2. TinyMD5
In wireless sensor network, the data in aggregated form moving towards the BS are viable for attack en route. The adversaries may introduce unauthorized modification and affect the integrity of the data. Hence, to verify the integrity of the message, hash value is sent along with the data. We consider an algorithm called TinyMD5 proposed by Bok et al. [31] for generating tiny hash value with minimized computation. The scheme generates an intermediate 128bit hash value from the original data using the normal MD5. Then, this 128bit intermediate hash value is reduced to 32bit lightweight hash value using block processing method. Since the TinyMD5 is a oneway hash function, the plain text cannot be recovered using decryption. The generation of 32bit hash using TinyMD5 is illustrated in Figure 2.
The four 8bit registers A, B, C, and D hold the final output of TinyMD5.
4.3. ECElGamal Cryptosystem
Mykletun et al. [21] proposed an Elliptic CurveElGamal cryptosystem, a transformed version of the original ElGamal cryptosystem. It has the ability of providing the same level of privacy accorded by RSA but with smaller key size and same stubbornness against security attack. Hence, the elliptic curve cryptosystem is considered as the suitable one for bandwidth constraint wireless sensor network. The stubbornness of this system against security attack is based on elliptic curve discrete logarithm problem that is given the multiplier and the product finding the multiplicand is computationally difficult when both the multiplier and the product represent a point over an elliptic curve. It maps the plain text to the points in the elliptic curve independent of the transformation needed for generating the cipher text. The mapping is done using homomorphic function that is . In this cryptosystem, the private and public key parameters are generated as follows:(1)The elliptic curve over the finite field with as prime number and the order of , i.e., , with a large prime factor is constructed(2)The random private key is taken from and kept secret in the BS(3)The public key parameters , where is the generator, is selected such that , , are stored in all sensors during deployment
Encryption and decryption are done as follows:(1)Encryptionwhere .(2)Decryptionwhere rmap is for reverse mapping point to the scalar value.
5. Working Principle
Before deployment, the BS generates the private key and public key parameters using the ECElGamal cryptosystem [21]. The random private key is taken from the curve and kept secret in BS. The public key parameters are stored in all sensors during the deployment along with the node ID which is generated by the BS. To construct the BS centred layered topology, the BS broadcasts Build_Layer message which carries the node ID and layer number. The layer number for the BS is fixed as zero. Each node which receives this message from the BS marks its layer number as 1 and broadcast a new Build_Layer message. The layer number and neighbours of node are fixed as follows.(1)If the node receives the build message for the first time and its layer number is not yet fixed, set its layer number as and mark as its neighbour in the parent layer(2)If the layer number is already fixed and , then is marked as its parent neighbour. If , then is marked as its child neighbour(3)The node which could not find any child neighbour forms the last layer and broadcasts Build_Complete message which carries the node ID and layer number to its parent neighbour. Each node which receives this message stores and rebroadcasts it towards the BS
In each sensor node, the parent neighbours are grouped into sets such that each set has “” parents. This ensures multiple paths between the source node and the BS. In this parentchild relationship, the children are grouped such that each group has 2 children and within a group each child is assigned a number called child order . This child order is either 1 or 2. It is kept secret in the parent as well as in the child. It helps to identify suitable parent for routing the sliced aggregated message to BS. When there is data that need to be sent to BS, the sensor node generates the hash value of the raw data and performs slicing of both raw and hashed data. Then, the sliced messages carrying the piece of raw and hash values along with additional information required for helping the BS to match the received slices are created. Next, the sliced messages are encoded and encrypted using the public key shared with the BS. The nodes en route to BS perform grouping of the received messages according to the child order and aggregate them. The communication overhead caused by slicing is minimized by aggregation at every hop. Thus, the whole procedure is composed of 5 steps: slicing, encryption and aggregation, decryption of aggregated message, data recovery, and integrity verification. The first 2 steps occur in the sensor whereas the last 3 steps are carried out by the BS. The definitions of the symbols used in the explanations are given in Table 2. The detailed explanations of these 5 steps are as follows.

5.1. Slicing
The node at layer generates hash value of the sensed data using TinyMD5. It slices both the sensed data and its hash into pieces. The number of slices () at node is chosen in random from {} where “” refers the maximum number of slices. For each slice, sliced message (SM) is formed by appending sliced data piece with one sliced hash, (node ID generated by the BS during node deployment), number of slices (), and the nonce . The node ID and nonce help the BS to sync the received slices correctly. The slicing reduces the size of the message. The SMs are encoded, encrypted, and forwarded to the random parent neighbour in the upper layer which is chosen from the parent set based on the child order. This child order is used for the encoding. The SMs are generated and encoded as follows:(1)The hash value is computed such that where is TinyMD5 hash function and is the data sensed by node at layer (2)The sensed data and its hash value are split into slices of equal length such that , . The value of remains 1 at one hop layer. For the remaining layer, is chosen in random from (3)Encoded sliced message (ESM) for each slice is created as follows:(a)Let and (b)While (), compute and increment by 1(c) where is sliced message length in bits and is the child order of node for the ^{th} parent chosen in random from the parent set. Repeat step (b) by initializing x by 1 and b by 0. Increment β by (t ∗ x ∗ (g_{i} − 1 )) if the group order (g_{i}) is not 1.(d),
5.2. Encryption and Aggregation
Encoded SMs are encrypted using the public key as follows.
5.2.1. Aggregation
Let the received cipher texts at node in layer be where “” is the number of packets received for aggregation. The received cipher texts are grouped into sets , each with 2 cipher text based on the child order assigned to each child node. Each set is aggregated with corresponding where to . The number of 0’s used in the encoding of SM enables smooth aggregation of cipher texts in each set as it is computed based on child order. For each group , the aggregated cipher text is created as follows.(1)Initialize , i.e., (, )(2) for each in
Node forwards to the parent in the next upper layer. The parent is chosen based on the child order which is used to encode .
5.3. Decryption of the Aggregated Data
The received aggregated cipher text is decrypted by the BS using the secret key. The decrypted message .
5.4. Data Recovery (DR)
The BS extracts individual SMs from the decrypted aggregated message using the message length () as follows.
where to .
5.5. Integrity Verification
The BS decodes the recovered message and extracts the sliced sensed value , sliced hash value , number of slices (), node ID , and nonce from each recovered message using the index position. For the sliced sensed value , the index position is from 0 to . Here, refers to the length of sensed value. The index position of hash value is from to . The node ID is extracted from the index position to . The sliced messages are mapped with their corresponding sliced part using the node ID and . After the availability of all the SMs, the extracted and mapped sliced sensed values are merged to form the received sensed data. The hash value of the received sensed data is calculated by the BS using . The received partial hash values are merged and compared with the calculated hash value. If both are the same, integrity is ensured. Otherwise, the corresponding sensed value of a particular node is dropped by the BS. Nodewise integrity verification by the BS is given as follows.(1)For each recovered message , to
Extract(a)Sliced sensed value (b)Sliced hash value (c)(d)(2)For all the extracted node ID, repeat the following:(a)If the number of recovered (i)Merge all of node (ii)Merge all of node (b)Compute (c)If Integrity verification for node is successful. The data of is accepted. Else The data of is rejected.
5.6. Illustrative Example
To demonstrate the working principle of our scheme, we have taken a simple WSN deployed in ring topology with 8 sensor nodes such that nodes are in layer 3, are in layer 2, and are in layer 1 (refer to Figure 3). Let the data sensed at each node and their corresponding hash value be , , , , , , , and . To make the explanation simple, here we omitted node ID, number of slices, and time stamp in the sliced message and set the number of slices to 2. The data slices generated at node from the data are . The hash is sliced into . The encoded sliced messages at are and , i.e., where ; hence, and where . Let decimal digits. Hence, . The encoded SMs of other nodes are , , , , , and .
The encoded SMs at layer 3 are encrypted and forwarded to the parent node based on the child order. The parents of node are . The child order of is 1 and 2, respectively. Cipher text of , i.e., , is forwarded to and to .
To demonstrate the aggregation process, consider the cipher texts received at . These cipher texts are grouped into 2 sets based on child order such that each group includes cipher texts with child order 1 and 2. Hence, the groups at are and . The encoded SMs of are , and , . The aggregated cipher text of is and its corresponding encoded SM .
In the same way, the encoded SM of aggregate at , , , , and .
The BS extracts nodewise sliced messages from by using where to . The SMs are matched by comparing the node ID, and the order is determined using time stamp. For example, the extracted SMs of are . From the SMs, the data slices and hash slices are retrieved and merged to form a final data value 71 and hash value 75. Now, the BS generates new hash using the retrieved data value and compares it with the received hash. If both are equal, the data is accepted. Hence, nodewise data integrity is verified.
6. Attack Model/Security Analysis
In this section, we provide various attack models of wireless sensor network and the ability of our proposed scheme to afford shield against them.
6.1. Eavesdropping Attack
Since the encrypted sliced data packets (ESMs) can be decrypted using the private key known only to BS, the intruder cannot access the sliced data. Even if the intruder manages to break the keys, the partial sliced information in the ESM prevents the intruder from getting the actual sensed data. In addition, to know the sensed information, the intruder has to find the synchronized matching ESMs by compromising multiple nodes. Since the ESMs are taking random path, this is highly not possible. Thus, the slicing and encryption provide double protection against eavesdropping attack and ensure confidentiality.
6.1.1. Double Protection through Encryption and Slicing
From the two cipher text components , where , for the intruder, it is computationally difficult to get “” given the generator and . This difficulty is based on elliptic curve discrete log problem.
6.2. Node Compromising Attack
Double protection of the sensed data offers strong resistance against intermediate node compromising attack. By compromising the intermediate node, the attacker can only get the ciphered data which can be deciphered only by the BS. If the key is broken by cryptanalysis, the attacker can access only the partial sliced data, because the aggregated message at the compromised node contains only the combination of sliced messages (SMs) as follows.where to .where to .
Each SM contains only the sliced part of the data sensed by node .
6.3. Selective Dropping Attack
The intermediate compromised node may corrupt the data aggregation by selectively dropping packets. With the help of the node ID and , the BS can obtain the number of SMs of particular node and is enabled to detect selective dropping of the sliced packet as follows
If number of SMs of
Detects selective dropping attack
Else
No selective dropping attack
Ability of the BS to derive information about the number of slices from the node ID, allow it to detect the number of packets missed during the transmission.
6.4. Collusion Attack
Collusion attack allows the raw data to be extracted in collaborated manner. In our proposed scheme, the data is travelling in the form of multiple slices and each slice is taking different routes. The path taken depends on the parent node chosen in random. Because of this randomness, finding synchronized sliced packets by compromising multiple nodes and imposing collusion attack becomes difficult.
6.5. Forgery Attack/Replay Attack
The replay attack which falsifies the duplicate sensitive aggregation values like SUM, COUNT, and AVG can be detected and filtered out by the BS. This can be done by counting the number of SMs with the same node ID () and verifying whether the count is equal to . The node ID and nonce provide unique identification for each packet. Forgery attack is possible in 2 ways: (i) by not compromising node and (ii) by compromising node. Since the information is sliced into different packets and each sliced packet carries the hashed information of the whole message, the forged packet without compromising the node can be easily detected by the BS.
6.6. Malleability Attack
The cryptosystem using privacy homomorphism is intrinsically malleable which paves way for the intruder to modify the cipher text without knowing the original data and this cipher text can be decrypted to related raw text. The slicing and nodewise integrity verification at the BS adopted in our proposed system help to remove the malleable data smoothly without affecting the pure data. Illegal manipulation done by the malicious antagonists over the aggregated packet of compromised node affects only the slices of data merged in the aggregated packet. Since the packets carry only the aggregation of slices, it is difficult to make the manipulation to imitate the valid data and this cannot affect the entire aggregate value as in MRCDA and other systems. Also, nodewise integrity verification filters out only the affected slices and their corresponding pairs, not the entire received aggregate as in other existing systems. This eliminates the problem of collective verification where the failure of integrity verification leads to the rejection of the whole aggregated data including both modified and unmodified data which results in wastage of bandwidth.
6.7. NodeSpecific Retransmission of Damaged Data
In our proposed scheme, each sliced message carries the ID of the source node () generated by the message, i.e., . The BS derives information about the number of slices generated by each node from the ID. Once all the slices of the original message are available, the BS performs integrity verification as specified in Section 5.5. In the event of any deviation in integrity verification or missing of any slices, the BS can make request for retransmission of invalid data from the source node whose ID is dug out from the received message.
6.8. Security Proof
The security proof is presented by analysing the possible ways of ESM distribution and the probability of the intruder finding the corresponding siblings of the same node.
Slicing and shuffling which take place when the packets travel towards the BS have a significant impact on the distribution of sibling packets which in turn affects the probability of finding all sibling packets.
6.8.1. Possible Ways of Packet Distribution at Each Layer
Case 1. Under the condition such that packet and neighbours are distinguishable without exclusion that more than one packet can be placed in the same neighbour and no neighbours is empty if .
Possible ways for distributing packets into neighbours present in the layer can be specified using the “Stirling number of the second method” [33] as follows.
Case 2. Under the condition such that packet and neighbours are distinguishable without exclusion that more than one packet can be placed in the same neighbour and no neighbour is empty if .Hence, the possible way of packet distribution at layer is is the possible way for distributing packets to neighbours of node at layer .
6.8.2. Possible Ways of Packet Distribution in All the Layers
The possible distribution of sibling packets is the product of possible distribution of packets in each layer which grows cubically with the number of neighbours and exponentially with the number of sibling packets. The increase in number of levels also has a significant impact on the distribution.
6.8.3. Probability of Getting All Sibling Packets of the Same Node
The sibling packets generated from the same node are travelling towards the BS by taking different paths. Hence, the possible distribution of packets increases exponentially with the number of layers crossed. The probability that the intruder gets all the sibling packets of the same node is calculated using the relative position of the siblings in a given instance. This can be categorized into 3 cases.
Case 1. All siblings are available in one node at any layer positioned between the source layer (sl) and the BS.where refers to the total number of nodes between the source layer and BS.
Case 2. All siblings are available in different nodes at the same layer positioned between the source layer and the BS.where nsbl is the total number of siblings.
Case 3. All siblings are available in different nodes at different layers positioned between the source layer and the BS.The above mathematical model shows that the probability of getting all sibling packets is inversely proportional to the number of nodes participating in data gathering. Results obtained by substituting real time values for the above three cases are given in Figure 4 and Table 3. The results show that the probability reaches almost zero with the distribution of sibling nodes in different nodes at different layers. This makes it highly difficult for the intruder to compromise multiple nodes and acquire all the related sibling packets of the same node.
7. Performance and Comparative Analysis
In this section, we analyse the performance of our proposed scheme by measuring the energyconsuming factors which include the communication and the computation overhead.
7.1. Communication Cost
First, the communication cost is analysed by measuring the impact of topology on the number of hops required for the packets to reach the destination from the origin and then by measuring message size. We consider ring topology, cluster topology, and hierarchical cluster topology for comparison.
Topology of sensor networks with widespread monitoring area spans over many layers around the BS, and the number of hops experienced by the packets is high in the schemes with no data aggregation. Our proposed slicing scheme over the sensor network with ring topology reduces the number of hops by merging the sliced packets at every hop. Hence, the total hops taken by all the packets are less. Though the number of hops in the hierarchical clusterbased topology is comparatively less, the number of hops increases in if the network monitoring area spans for wide range. In clusterbased schemes, it is in . Table 4 shows the hop count based on network topology.

The comparative analysis of communication cost based on network topology with varying sizes is presented in Figure 5. Though the cluster topology experiences less communication overhead compared to the other 2 topologies, it requires a dynamic change in the role of cluster head for preventing the node from getting overloaded [34]. This leads to variation in the distance between dynamic CH and the cluster members close to the border of the cluster. This adds to the communication cost, as the power required to transmit the data is directly proportional to the distance between sender and receiver [35].
Hence, the dynamic change of CH in cluster topology leads to more power consumption. This problem is minimized in sensor network with ring topology where the hop distance between the nodes is not dynamic.
To analyse the impact of different deployments on number of hops, a wireless network with 1000 nodes is considered. The number of levels or layers is kept as 9 for both hierarchicalbased cluster topology and ring topology. Different testing scenarios are created by introducing randomness in the number of nodes per layer in ring topology and number of clusters per level in hierarchicalbased cluster topology. Testing results are given in Figure 6. Results obtained by increasing the number of levels in the network with 1000 nodes are given in Figure 7. From the results presented in Figures 6 and 7, it is observed that in our scheme, increase in the layer and change in deployment do not have impact on the overall number of hops. Hence, nodes can be deployed freely. There is no restriction on the distribution of nodes as in hierarchicalbased cluster topology whose communication cost is directly proportional to the cluster size and the hierarchical level.
In cluster topology, the number of clusters does not emit any impact on the number of hops required to reach the BS. Though it shows relaxation on the number of clusters and cluster size, there is a limitation on the distance between the cluster head and BS for the smooth and direct communication. Since the maximum hops to reach BS are restricted to two, the network cannot scale vertically. Hence, indirectly, condition is imposed on the deployment of nodes and on the assignment of cluster head role [34] to the cluster members. Ring topology, in our proposed scheme, allows the network to scale vertically and horizontally around the BS. Existence of multipath communication allows dynamic sharing of aggregator role, and hence, nodes are prevented from overloading. Though communication cost is slightly more, the bandwidth consumption due to the retransmission of both affected and unaffected packets which are discarded by collective verification in the existing schemes is eliminated by dropping only the corrupted packets.
Now, we analyse the communication cost by measuring the message size. The cipher text of our proposed scheme generated using the Elliptic CurveElGamal cryptosystem has two components, and . Both refer to two different points on the elliptic curve.
A point on an elliptic curve over is represented by using a pair of coordinates each of size . Using point compression mechanism, it can be represented only by using one coordinate with one or more additional bits [21]. Hence, the size of one cipher text is . From the analysis in [21], the ECEG cryptosystem achieves the same 1024bit security level as that of three other elliptic curve schemes [36], Elliptic CurveOkamoto Uchiyama, Elliptic Curve NaccacheStern, and Elliptic CurvePaillier using only 163bit modulus. Hence, the size of cipher text in ECEG is smaller compared to that in the other cryptosystem. We consider the schemes based on elliptic curve cryptosystem which ensures integrity as suitable candidates for comparative analysis of bandwidth consumption. The candidate schemes are MRCDA, ASAHS, RCDAHOMO, and IPHCDA. By considering the security level offered, 163bit modulation is adopted for schemes based on ECEG. 1024bit modulation is adopted for ASAHS and IPHCDA. While analysing the data size, the concatenationbased additive aggregation adapted in the proposed scheme as well as in RCDA and MRCDA has an impact on the data size. The value range of data that can be sensed decreases with an increase in cluster size and layers. Table 5 gives the formula for computing the size of cipher text and communication cost based on hops and message size in bits.
Simulated results of communication cost for varying network sizes are given in Figure 8. For simulation, we kept the slicing size s as two for our scheme. ASAHS is implemented by keeping the vector size as 5, and for IPHCDA, the region size is kept to the minimum of two. While comparing the total packet size which includes size of cipher text and additional information needed for integrity verification, the message size in our scheme is 3 times less than that of ASAHS and IPHCDA. In ASAHS, the communication cost based on hops and message size grows with vector size, and in IPHCDA, it grows with the number of regions.
While comparing RCDA and MRCDA, the communication cost of our scheme is slightly higher. But it becomes acceptable while considering the problem of collective integrity verification. The presence of a single malicious node in a cluster leads to the rejection of the aggregate received from that particular cluster in collective integrity verificationbased schemes and results in bandwidth wastage. To analyse this bandwidth wastage, a network of size 150 nodes is considered. When the malicious nodes spread among multiple clusters and affect the integrity, for every unit increase of infected clusters, the bandwidth wastage increases by 9.6 in RCDA. But in our proposed scheme, the bandwidth wastage increases by 2.64 for every increase in infected nodes by 5. This is illustrated in Figure 9, and it shows that the impact of collective integrity verification on communication cost is more compared to that of nodewise integrity verification.
The saving in bandwidth by nodewise integrity verification (SBW_NIV) is computed as follows.
where BWW_CIV stands for bandwidth wastage due to the rejection of aggregate in collective integrity verification and BWW_NIV refers to the bandwidth wastage due to rejection of false data in nodewise integrity verification. Overall saving in bandwidth in our proposed scheme (OBW_NIV) is computed by finding the difference between the additional bandwidth consumption due to slicing (ABW_S) and SBW_NIV.
While comparing ABW_S with SBW_NIV, the difference, i.e., the overall saving in bandwidth, increases with the number of affected nodes/clusters in our proposed scheme compared to RCDA as shown in Figure 10. Hence, the bandwidth wastage is comparatively reduced in our proposed scheme by filtering only the corrupted part of the data. Added to this, the scalability problem of RCDA which prevents the network from spanning more than two hop distance from the BS is addressed in the proposed scheme by supporting the network to scale with more than 2 hop distance from the BS.
7.2. Computation Cost
Computation is another energyconsuming factor in wireless sensor network. We measure the computation cost of our proposed method by counting the elementary operations (+, −, , pow, mod, and comparison) and other operations like hashing, map, counter generation, additional encryption and decryption required for message generation, aggregation, data recovery, and decryption.
7.2.1. Message Generation
To generate a message, each node has to perform slicing and hashing which costs and map function requires one point multiplication to map the plain text to a point on elliptic curve. The encryption requires 2 point multiplication and 1 point addition operation for each slice. This represents the total of point additions and point multiplications for the whole network.
7.2.2. Aggregation
At every hop towards the BS, the parent node aggregates the received sliced packets using 2 elliptic curve additions which help the sliced message to be embedded in the aggregated message without mixing up with other values and enable nodewise data recovery. Since each node splits the data into , the total cost for aggregation is .
7.2.3. Data Recovery and Decryption
To decipher the aggregated message received from the nodes at the layer next to BS, single multiplication and addition is required. According to this, the total cost for deciphering the aggregated message at BS is .
The different computations incurred at various phases are shown in Table 6, and the comparison of computation cost of candidate schemes based on the results obtained by replacing the variables in Table 6 by realtime values with varying network sizes is presented in Figure 11. The comparison shows that like in other secured data aggregation mechanism, the overall computation cost of our scheme increases with the number of nodes. It is comparatively higher than RCDA and lesser than ASAHS.
 
map: mapping function to map plain value on to curve; rmap: reverse mapping function to get plain value from curve; hg: hash generation; mg: MAC generation; cg: counter generation; pa: point addition; pm: point multiplication; se: symmetric encryption; : number of regions; : number of CH communicated with BS; : vector size; sm: scalar multiplication; child: number of children of each aggregator; ga: scalar addition using generator; sa: scalar addition; gm: scalar multiplication using generator; ss: scalar subtraction; : number of sensor nodes; : cluster size; de: decryption; cmp: comparison; : number of slices; sd: scalar division. 
This additional cost incurred becomes reasonable by enabling the BS to verify nodewise integrity in contrast to the collective integrity verification practiced in RCDA and ASAHS which leads to the rejection of the entire message including the uncorrupted data by the BS which results in bandwidth wastage. This bandwidth wastage is eliminated by dropping only the corrupted data instead of the entire message.
7.3. Comparative Analysis
We pick out the five schemes RCDA, SASPKE, ASAHS, IPHCDA, and MRCDA as the most appropriate aggregation schemes for comparison, because in all these schemes, endtoend confidentiality of the sensed data is ensured by enabling the intermediate node to do aggregation on the encrypted data using asymmetric homomorphic encryption. Also, they facilitate the BS to extract individual node data, apply dynamic queries, and ensure the originality of the received data in collective manner which results in the rejection of the whole received data when integrity verification test signs negative. Rejection of the whole received data results in wastage of bandwidth. This is addressed in the proposed scheme by rejecting only the corrupted data by safely filtering out defected data from the received data. In our proposed scheme, the ability of the BS to extract nodewise raw data from the received aggregated packets and knowledge of BS about the number of sliced packets sent by each node help to address the attacks like node compromising attack, replay attack, malleability attack, selective dropping attack, and collusion attack. To the best of our knowledge, all these attacks are not addressed by any one of the single system in the comparison list. Though our system exhibits performance slightly inferior to the existing system, security against these attacks is comparatively strengthened by adopting slicingbased encryption with randomness in the path taken by the packets to reach BS from the source node. This makes it suitable for the system where security plays a vital role. The comparison of our proposed system with existing systems based on quality measures is given in Table 7.

8. Conclusion
The scheme proposed in the paper is a novel slicebased secured data aggregation for wireless sensor network deployed in an insecure attack prone environment. Our scheme merges the benefits of layered topology, aggregation, and slicing. It also enables the BS to perform nodewise integrity verification by safely extracting all the slices from aggregated data and obtain individual node data by merging all slices. In addition, the BS is allowed to apply any query over the data. Elliptic curvebased encryption and slicing provide double protection and make our system secure against eavesdropping attack, node compromising attack, replay attack, malleability attack, selective dropping attack, and collusion attack. Though slicing results in a slighter compromise in the number of hops compared to cluster topologybased schemes, considering the problem of bandwidth wastage in collective integritybased schemes due to rejection of the whole data, our scheme provides better affordable solutions by filtering out only corrupted data thereby improving the efficiency of the network in terms of bandwidth utilization and providing WSN end user with exact data of individual node instead of aggregate message.
Data Availability
The data used for performance analysis are specified in the article itself.
Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this article.
References
 https://dosits.org/galleries/technologygallery/locatingobjectsbylisteningtotheirsounds/soundsurveillancesystemsosus/.
 B. Bhushan and G. Sahoo, “E2SR2: an acknowledgementbased mobile sink routing protocol with rechargeable sensors for wireless sensor networks,” Wireless Networks, vol. 25, no. 5, pp. 2697–2721, 2019. View at: Publisher Site  Google Scholar
 T. Khurana, S. Singh, and N. Goyal, “An evaluation of adhoc routing protocols for wireless sensor networks,” International Journal of Advanced Research in Computer Science and Electronics Engineering, vol. 5, no. 1, 2012. View at: Google Scholar
 P. Maratha, K. Gupta, and A. K. Luhach, “Improved fault‐tolerant optimal route reconstruction approach for energy consumed areas in wireless sensor networks,” IET Wireless Sensor Systems, vol. 10, no. 3, pp. 112–116, 2019. View at: Google Scholar
 B. Bhushan, G. Sahoo, and A. K. Rai, “Maninthemiddle attack in wireless and computer networking — a review,” in 2017 3rd International Conference on Advances in Computing,Communication & Automation (ICACCA) (Fall), pp. 1–6, Dehradun, 2017. View at: Publisher Site  Google Scholar
 S. Roy, M. Conti, S. Setia, and S. Jajodia, “Secure data aggregation in wireless sensor networks: filtering out the attacker’s impact,” IEEE Transactions on Information Forensics and Security, vol. 9, no. 4, pp. 681–694, 2014. View at: Publisher Site  Google Scholar
 M. Rezvani, A. Ignjatovic, E. Bertino, and S. Jha, “Secure data aggregation technique for wireless sensor networks in the presence of collusion attacks,” IEEE Transactions on Dependable and Secure Computing, vol. 12, no. 1, pp. 98–110, 2015. View at: Publisher Site  Google Scholar
 D. Vinodha and E. A. MaryAnita, “Secure data aggregation techniques for wireless sensor networks: a review,” Archives of Computational Methods in Engineering, vol. 26, no. 4, pp. 1007–1027, 2019. View at: Publisher Site  Google Scholar
 A. Viejo, Q. Wu, and J. DomingoFerrer, “Asymmetric homomorphisms for secure aggregation in heterogeneous scenarios,” Information Fusion, vol. 13, no. 4, pp. 285–295, 2012. View at: Publisher Site  Google Scholar
 X. Li, D. Chen, C. Li, and L. Wang, “Secure data aggregation with fully homomorphic encryption in largescale wireless sensor networks,” Sensors, vol. 15, pp. 15952–15973, 2015. View at: Publisher Site  Google Scholar
 X. Zhao, J. Zhu, X. Liang, S. Jiang, and Q. Chen, “Lightweight and integrity‐protecting oriented data aggregation scheme for wireless sensor networks,” IET Information Security, vol. 11, no. 2, pp. 82–88, 2017. View at: Publisher Site  Google Scholar
 K. Zhang, Q. Han, Z. Cai, and G. Yin, “RiPPAS: a ringbased privacypreserving aggregation scheme in wireless sensor networks,” Sensors, vol. 17, no. 2, p. 300, 2017. View at: Publisher Site  Google Scholar
 D. Wu and R. W. BoranYang, “Scalable privacypreserving big data aggregation mechanism,” Digital Communications and Networks, vol. 2, no. 3, pp. 122–129, 2016. View at: Publisher Site  Google Scholar
 Y.H. Lin, S.Y. Chang, and H.M. Sun, “CDAMA: concealed data aggregation scheme for multiple applications in wireless sensor networks,” IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 7, pp. 1471–1483, 2013. View at: Publisher Site  Google Scholar
 L. Shen, J. Ma, X. Liu, F. Wei, and M. Miao, “A secure and efficient IDbased aggregate signature scheme for wireless sensor networks,” IEEE Internet of Things Journal, vol. 4, no. 2, pp. 546–554, 2017. View at: Publisher Site  Google Scholar
 Y.K. Kim, H. Lee, M. Yoon, and J.W. Chang, “Hilbertcurve based data aggregation scheme to enforce data privacy and data integrity for wireless sensor networks,” International Journal of Distributed Sensor Networks, vol. 9, no. 6, 2013. View at: Publisher Site  Google Scholar
 W. He, H. Nguyen, X. Liuy, K. Nahrstedt, and T. Abdelzaher, “iPDA: an integrityprotecting private data aggregation scheme for wireless sensor networks,” in MILCOM 20082008 IEEE Military Communications Conference, pp. 1–7, San Diego, CA, USA, 2008. View at: Google Scholar
 J. Girao, D. Westhoff, and M. Schneider, “CDA: concealed data aggregation for reverse multicast traffic in wireless sensor networks,” in IEEE International Conference on Communications, 2005. ICC 2005, pp. 3044–3049, Seoul, Korea, 2005. View at: Publisher Site  Google Scholar
 D. Westhoff, J. Girao, and M. Acharya, “Concealed data aggregation for reverse multicast traffic in sensor networks: encryption, key distribution, and routing adaptation,” IEEE Transactions on Mobile Computing, vol. 5, no. 10, pp. 1417–1431, 2006. View at: Publisher Site  Google Scholar
 X. Liu, X. Zhang, J. Yu, and C. Fu, “Query privacy preserving for data aggregation in wireless sensor networks,” Wireless Communications and Mobile Computing, vol. 2020, Article ID 9754973, 10 pages, 2020. View at: Publisher Site  Google Scholar
 E. Mykletun, J. Girao, and D. Westhoff, “Public key based cryptoschemes for data concealment in wireless sensor networks,” in 2006 IEEE International Conference on Communications, pp. 2288–2295, Istanbul, Turkey, 2006. View at: Publisher Site  Google Scholar
 C. Chen, Y. Lin, Y. Lin, and H. Sun, “RCDA: recoverable concealed data aggregation for data integrity in wireless sensor networks,” IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 4, pp. 727–734, 2012. View at: Publisher Site  Google Scholar
 O. R. M. Boudia, S. M. Senouci, and M. Feham, “A novel secure aggregation scheme for wireless sensor networks using stateful public key cryptography,” Ad Hoc Networks, vol. 32, pp. 98–113, 2015. View at: Publisher Site  Google Scholar
 K. Parmar and D. C. Jinwala, “Malleability resilient concealed data aggregation in wireless sensor networks,” Wireless Personal Communications, vol. 87, no. 3, pp. 971–993, 2016. View at: Publisher Site  Google Scholar
 S. Ozdemir and X. Yang, “Integrity protecting hierarchical concealed data aggregation for wireless sensor networks,” Computer Networks, vol. 55, no. 8, pp. 1735–1746, 2011. View at: Publisher Site  Google Scholar
 Y. Pu, J. Luo, C. Hu et al., “Two secure privacypreserving data aggregation schemes for IoT,” Wireless Communications and Mobile Computing, vol. 2019, Article ID 3985232, 11 pages, 2019. View at: Publisher Site  Google Scholar
 X. Liu, J. Yu, X. Zhang, Q. Zhang, and C. Fu, “Energyefficient privacypreserving data aggregation protocols based on slicing,” EURASIP Journal on Wireless Communications and Networking, vol. 2020, no. 1, 2020. View at: Publisher Site  Google Scholar
 P. Zhang, J. Wang, K. Guo, F. Wu, and G. Min, “Multifunctional secure data aggregation schemes for WSNs,” Ad Hoc Networks, vol. 69, pp. 86–99, 2018. View at: Publisher Site  Google Scholar
 N. Goyal, M. Dave, and A. K. Verma, “SAPDA: secure authentication with protected data aggregation scheme for improving QoS in scalable and survivable UWSNs,” Wireless Personal Communication, vol. 113, no. 1, pp. 1–15, 2020. View at: Publisher Site  Google Scholar
 R. C. Merkle, “Advances in cryptology — CRYPTO 89 proceedings,” in Lecture Notes in Computer Science vol. 435, G. Brassard, Ed., pp. 218–238, Springer, Verlag, 1989. View at: Publisher Site  Google Scholar
 K. Bok, Y. Lee, J. Park, and J. Yoo, “An energyefficient secure scheme in wireless sensor networks,” Journal of Sensors, vol. 2016, Article ID 1321079, 11 pages, 2016. View at: Publisher Site  Google Scholar
 A. P. Singh, A. K. Luhach, X.Z. Gao, S. Kumar, and D. S. Roy, “Evolution of wireless sensor network design from technology centric to user centric: an architectural perspective,” International Journal of Distributed Sensor Networks, vol. 16, no. 8, 2020. View at: Publisher Site  Google Scholar
 https://en.wikipedia.org/wiki/Stirling_numbers_of_the_second_kind.
 B. Bhushan and G. Sahoo, “ISFCBLS intelligent and secured fuzzy clustering algorithm using balanced load subcluster formation in WSN environment,” Wireless Personal Communication, vol. 111, no. 3, pp. 1667–1694, 2020. View at: Publisher Site  Google Scholar
 Q. Wang, M. Hempstead, and W. Yang, “A realistic power consumption model for wireless sensor network devices,” in 2006 3rd Annual IEEE Communications Society on Sensor and Ad Hoc Communications and Networks, Reston, VA, USA, 2006. View at: Publisher Site  Google Scholar
 P. Pascal and T. Okamoto, “Trapdooring discrete logarithms on elliptic curves over rings,” in Advances in Cryptology — ASIACRYPT 2000, pp. 573–584, Springer, Verlag Berlin Heidelberg, 2000. View at: Google Scholar
Copyright
Copyright © 2021 D. Vinodha and E. A. Mary Anita. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.