Journal of Information Security
Vol. 3 No. 2 (2012), Article ID: 18783, 5 pages DOI:10.4236/jis.2012.32011
A Robust Method to Detect Hidden Data from Digital Images
1Department of Computer Science, Faculty of Science, Northern Borders University, Arar, KSA
2Department of Computer Science, Faculty of Science, South Valley University, Qena, Egypt
Email: romanyf@aun.edu.eg, wafs_73@yahoo.com, amalrashed2011@hotmail.com
Received November 29, 2011; revised December 30, 2011; accepted January 28, 2012
Keywords: Data Hiding; Steganography; Steganalysis; Attack; Color Image Hiding; Stego-Image
ABSTRACT
Recently, numerous novel algorithms have been proposed in the fields of steganography and visual cryptography with the goals of improving security, reliability, and efficiency. Steganography detection is a technique for determining whether secret messages are hidden in images. The performance of a steganalysis system is mainly determined by the feature extraction method and the architecture of the classifier. In this paper, we present a new method, Visual Pixel Detection (VPD), for extracting data from color or grayscale images, because the human eye can recognize the hidden information in an image after this detection is applied. The experimental results show that the proposed method performs better on test images than the existing methods in attacking Steghide, OutGuess and F5.
1. Introduction
Steganography is the art, science, or practice in which messages, images, or files are hidden inside other messages, images, or files. The concept of steganography is not a new one; it dates back many millennia when messages used to be hidden on things of everyday use such as watermarks on letters, carvings on bottom sides of tables, and other objects. The more recent use of this concept emerged with the dawn of the digital world. Experiments have shown that data can be hidden in many ways inside different types of digital files. The main benefit of steganography is that the payload is not expected by the investigators who get to examine the computer data. The person sending the hidden data and the person meant to receive the data are the only ones who know about it; but to everyone else, the object containing the hidden data just seems like an everyday normal object. A classification of information hiding techniques is described in [1].
Generally, steganographic methods proposed in the past few years can be categorized into two types. The methods of the first type employ the spatial domain of a host image to hide secret data. In other words, secret data are directly embedded into the pixels of the host image [2-5].
Steganographic methods of the second type employ the transformed domain of a host image to hide secret data [6-10]. Transformation functions like the discrete cosine transform (DCT) or discrete wavelet transform (DWT) are first exploited to transform the pixel values in the spatial domain to coefficients in the frequency domain. Then the secret data are embedded in the coefficients.
Recently, the scheme proposed in [11] hid large amounts of data in the pixels of a true color image, which used 8 bits to represent each color component of a color pixel. The secret image can be both a grayscale and a true color image. However, the quality of the extracted color secret image is not good in terms of the peak signal-to-noise ratio (PSNR) value and visual observation. In this paper, we propose a steganographic method which hides a color or a grayscale image in a true color image.
The detection of hidden data presents a big challenge to investigators and individuals looking for hidden data. For images only, there are hundreds of billions of images on the web and looking through all of them would be a very time consuming and computationally challenging task; let alone the other types of files that data can possibly be hidden in. Even if someone manages to go through all the current images on the web, what if some new algorithm for hiding data in images emerges? Is the application used to scan the images for hidden data suitable for and capable of uncovering the hidden data? And is it feasible to go back and rescan all the images all over again with the same or other software updated to detect the hidden data by the new algorithm? The answer to the above questions is that it’s close to impossible to be able to accurately scan or attempt to detect hidden data on such a wide scope of suspect images. It is somewhat easier for investigators to scan for hidden data on a smaller scale such as an image of a hard drive, but they are still faced with the same software inaccuracy and the possibility of encountering unknown data-hiding algorithms.
In our proposed method, the local color information of an image is preserved well because the color quantization technique used is image-dependent and adaptive [12]. The extracted color secret image is perceptually almost identical to the original image, and its PSNR value is very high. Moreover, the hiding capacity of the host image and the quality of the stego-image in the proposed method are also superior to those of the scheme in [11].
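For reference, the PSNR value mentioned above is conventionally computed from the mean squared error between two images of equal size. The following is a minimal sketch, assuming 8-bit samples (peak value 255) flattened into plain lists; the function name `psnr` is illustrative and is not taken from this paper or from [11].

```python
import math


def psnr(original, distorted, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if not original or len(original) != len(distorted):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


# Every sample differs by exactly 1, so the MSE is 1.
value = psnr([100, 100, 100, 100], [101, 99, 101, 99])
```

A higher value means the two images are closer; identical images give an infinite PSNR under this definition.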
The remainder of this paper is organized as follows. In Section 2, a brief description of the scheme related to this paper will be given. In Section 3, we shall present our proposed method. The overall algorithm for the proposed method will be provided in Section 4. Finally, the experiments and conclusions shall be given in Sections 5 and 6.
2. Color Quantization
Color quantization is the process that drastically reduces the number of colors used in a digital color image by approximating the original pixels with their nearest representative colors. The true color image is usually quantized to reduce the size of the image to be stored or transmitted. This means the 2^24 colors of a true color image have to be greatly reduced to a limited number of representative colors, which is called a color table (palette).
Typically, a color table consists of 256 entries where each entry represents a color containing red, green, and blue components. In general, a palette-based image mainly consists of a color table and some image data, which contain indices corresponding to the color table entries.
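The mapping from true color pixels to color table indices can be sketched as follows. This is a minimal illustration, assuming a fixed, hand-picked palette and squared RGB distance as the closeness measure; it does not reproduce the adaptive quantization of [12], and the names `nearest_palette_index`, `quantize` and `reconstruct` are hypothetical.

```python
def nearest_palette_index(pixel, palette):
    """Return the index of the palette color closest to `pixel` (squared RGB distance)."""
    r, g, b = pixel
    return min(
        range(len(palette)),
        key=lambda i: (palette[i][0] - r) ** 2
        + (palette[i][1] - g) ** 2
        + (palette[i][2] - b) ** 2,
    )


def quantize(pixels, palette):
    """Map each RGB pixel to a palette index (the image data of a palette-based image)."""
    return [nearest_palette_index(p, palette) for p in pixels]


def reconstruct(indices, palette):
    """Rebuild approximate RGB pixels from the indices and the color table."""
    return [palette[i] for i in indices]


# Assumed 5-entry palette; a typical color table would have 256 entries.
palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
pixels = [(250, 10, 5), (3, 2, 1), (200, 220, 240), (10, 240, 30)]
indices = quantize(pixels, palette)        # [1, 0, 4, 2]
approx = reconstruct(indices, palette)
```

Only the palette and the index list need to be stored, which is where the size reduction comes from.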
Bitmap images consist of matrices of numbered points with two dimensions for grayscale and three dimensions for RGB color images. The grayscale images, also called intensity images, contain numbered values at these points, called pixels, between 0 for black and 255 for white, which can be represented as 8-bit binary strings (2^8 = 256). The numbers in between represent gradient gray values between black and white. RGB (Red, Green and Blue) images are actually three two-dimensional image layers, a red, a green and a blue layer, that are combined to produce the full color image. Each layer of a color image also contains values from 0 for black to 255 for the lightest shade of the color. The RGB color scheme is referred to as an additive scheme because adding the effective value of the three layers usually produces a lighter color at that pixel.
For instance, if all three values that comprise a pixel are 0, i.e. (0, 0, 0) for (red, green, blue), they produce the color black. If the three are 255, i.e. (255, 255, 255), they add to produce a much lighter shade, white. The product of these three layers can produce over 16 million different colors and is called 24-bit color because 256^3 = (2^8)^3 = 2^24 = 16,777,216, in which three 8-bit binary strings represent pixel colors. They are intensity, color, red layer, green layer and blue layer, from left to right. Notice that the vertical white line that appears in the center of the intensity and color images is lighter than that in the RGB layers and that the blackish colors appear black in all images. This demonstrates the additive nature of RGB color images with respect to intensity.
Messages are hidden in the least significant bits of the 8-bit binary strings representing the color numbers; hence the abbreviated name for this method is "LSB" steganography. Each character in a message has a binary representation under the ASCII (American Standard Code for Information Interchange) character system, which, in its 8-bit extended form, assigns characters integer values between 0 and 255. This system represents a way to express all necessary single-character letters, numbers, punctuation marks, symbols, etc. for general communication purposes. The pixel is capable of representing 2^24 or 16,777,216 color values. Figure 1 shows the case in which the lower 2 bits of each color channel are used to hide data.
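Hiding data in the lower 2 bits of each color channel can be sketched as below. This is a minimal illustration under stated assumptions: one pixel carries six payload bits, split red, green, blue from most to least significant; the bit ordering and the names `embed_2bits` and `extract_2bits` are choices made for the example, not details given in the paper.

```python
def embed_2bits(pixel, bits6):
    """Hide six payload bits in one RGB pixel, two bits in each channel's lower 2 bits."""
    if not 0 <= bits6 < 64:
        raise ValueError("payload must fit in 6 bits")
    chunks = [(bits6 >> 4) & 0b11, (bits6 >> 2) & 0b11, bits6 & 0b11]
    # Clear the lower 2 bits of each channel, then write the payload chunk.
    return tuple((c & 0b11111100) | chunk for c, chunk in zip(pixel, chunks))


def extract_2bits(pixel):
    """Recover the six payload bits from the lower 2 bits of each channel."""
    r, g, b = pixel
    return ((r & 0b11) << 4) | ((g & 0b11) << 2) | (b & 0b11)


cover = (200, 117, 54)
stego = embed_2bits(cover, 0b101101)   # (202, 119, 53)
payload = extract_2bits(stego)         # 45, i.e. 0b101101
```

Each channel changes by at most 3 out of 255, which is why the modification is hard to see with the naked eye.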
3. The Proposed Scheme
In this section, we present the proposed VPD method for extracting a color or a grayscale secret image hidden in a true color host image.
The proposed method is described below.
3.1. Requirements of Hiding Information Digitally
There are many different protocols and embedding techniques that enable us to hide data in a given object. However, all of the protocols and techniques must satisfy a number of requirements so that steganography can be applied correctly. The following is a list of main requirements that steganography techniques must satisfy:
Figure 1. Illustration of pixel values in which the lower 2 bits of each color channel are used to hide data.
● The integrity of the hidden information after it has been embedded inside the stego object must be correct. The secret message must not change in any way, such as additional information being added, loss of information or changes to the secret information after it has been hidden. If secret information is changed during steganography, it would defeat the whole point of the process.
● The stego object must remain unchanged or almost unchanged to the naked eye. If the stego object changes significantly and can be noticed, a third party may see that information is being hidden and therefore could attempt to extract or to destroy it.
● In watermarking, changes in the stego object must have no effect on the watermark. Imagine if you had an illegal copy of an image that you would like to manipulate in various ways. These manipulations can be simple processes such as resizing, trimming or rotating the image. The watermark inside the image must survive these manipulations, otherwise the attackers can very easily remove the watermark and the point of steganography will be broken.
● Finally, we always assume that the attacker knows that there is hidden information inside the stego object.
3.2. Embedding and Detecting a Mark
Figure 2 shows a simple representation of the generic embedding and decoding process in steganography. In this example, a secret image is being embedded inside a cover image to produce the stego image. The first step in embedding and hiding information is to pass both the secret message and the cover message into the encoder. Inside the encoder, one or several protocols will be implemented to embed the secret information into the cover message. The type of protocol will depend on what information you are trying to embed and what you are embedding it in. For example, you will use an image protocol to embed information inside images.
4. The Overall Proposed System Is Designed to Extract the Hidden Data
Hiding information in digital media requires alterations of the media properties, which may introduce some form of degradation or unusual characteristics. The degradation, at times, may become perceptible [1]. These characteristics may act as signatures that broadcast the existence of the embedded message and the steganography tools used, thus defeating the purpose of steganography, which is hiding the existence of a message.
The passive attacks of steganalysis involve the detection of these characteristics and signatures. Manipulating digital media in an effort to disable or remove the embedded messages is a simpler task than detecting the messages. Any image can be manipulated with the intent of destroying some hidden information whether an embedded message exists or not. Detecting the existence of a hidden message will save time in the activity to disable or remove messages by guiding the analyst to process only the media that contain hidden information [2].
Figure 2. Generic processes of encoding and decoding.
4.1. Visual Pixel Detection
Some steganographic algorithms embed data in the media by replacing some bits of the media with bits of the data. For images, these algorithms replace bits of the pixels with bits of the data to be hidden, or change the image palette depending on the value of the data. In most cases these changes cannot be perceived by the human eye, but they affect the relation between pixels by increasing the difference between neighboring pixels at the same level.
Pixels within the same portion or object of an image usually have the same color as, or a color close to, their neighboring pixels. By measuring the differences between pixels in the same portion and evaluating the relation between them, we can represent the image by these differences or relations.
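One simple way to measure the relation between a pixel and its neighbors is the mean absolute difference over the 4-neighborhood. The sketch below assumes a grayscale image stored as a list of rows; the paper does not specify its exact measure, so this metric and the name `mean_neighbor_difference` are assumptions for illustration.

```python
def mean_neighbor_difference(gray, x, y):
    """Mean absolute difference between pixel (x, y) and its 4-neighborhood."""
    height, width = len(gray), len(gray[0])
    candidates = ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
    diffs = [
        abs(gray[y][x] - gray[ny][nx])
        for nx, ny in candidates
        if 0 <= nx < width and 0 <= ny < height  # skip neighbors outside the image
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


# A flat region versus the same region with small perturbations around the center.
smooth = [[100, 100, 100], [100, 100, 100], [100, 100, 100]]
noisy = [[100, 101, 100], [103, 100, 102], [100, 101, 100]]

flat_score = mean_neighbor_difference(smooth, 1, 1)   # 0.0
noisy_score = mean_neighbor_difference(noisy, 1, 1)   # 1.75
```

In a smooth portion of a clean image the score stays near zero, while small perturbations in the low bits raise it.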
An image with no noise in its portions looks good to the user: most object edges are visible, and the user can distinguish the details of the image. Some images have a poorly visible pattern but still show a number of edges and clear details. When the image contains noise or hidden data, the view is not clear: most edges are destroyed and the details become unclear, to a degree that depends on the amount of noise or hidden data and on the algorithm used to hide the data.
Visual pixel detection uses the relation between the pixels in an image. It handles the BMP and GIF image file formats only. It works as a filter whose treatment of each pixel depends on the pixel's own information and on its relation to neighboring pixels. The filter can cut 1, 2, 3, 4 or 5 bits from the image pixels and then re-draw the image based on these bits.
4.2. Visual Pixel Detection Algorithm
Step1: Load Images.
Step2: Get color of pixels (Red, Green, Blue).
Step3: Calculate the relation between pixels in the same portion.
Step4: Compare the value of each pixel with its neighboring pixels (4-neighborhood, 8-neighborhood).
Step5: Cut one bit from the pixels, then re-draw the image based on these bits and on the relation between each pixel and the pixels in its neighborhood.
Step6: Repeat Step5, changing the number of bits cut from the pixels from 2 to 5, and re-draw the images.
Step7: End.
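The bit-cutting part of Steps 5 and 6 can be sketched as follows. This is one possible reading of the steps, assuming that "cutting" k bits means keeping only the k least significant bits of each 8-bit channel value and stretching them to the full 0-255 range for display; the neighborhood relation of Steps 3 and 4 is not modeled here, and the names `low_bits_view` and `vpd_views` are hypothetical.

```python
def low_bits_view(channel_values, k):
    """Keep the k least significant bits of each 8-bit value, stretched to 0..255."""
    if not 1 <= k <= 8:
        raise ValueError("k must be between 1 and 8")
    mask = (1 << k) - 1
    return [(v & mask) * 255 // mask for v in channel_values]


def vpd_views(channel_values, max_bits=5):
    """Produce one re-drawn view per bit count, from 1 up to max_bits (Steps 5 and 6)."""
    return {k: low_bits_view(channel_values, k) for k in range(1, max_bits + 1)}


samples = [200, 201, 54, 255]
views = vpd_views(samples)
one_bit = views[1]    # [0, 255, 0, 255]
two_bits = views[2]   # [0, 85, 170, 255]
```

Under this reading, data embedded in the lowest bit shows up as noise-like texture in the 1-bit view, while views built from more bits retain progressively more of the visible image structure.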
The output images help the user see the amount of noise caused by hidden data. When the original image pixels are changed in 1 bit, the output images of the filter are as follows:
1) When the filter cuts 1 bit from the suspected image pixels, the output image is destroyed in proportion to the amount of hidden data.
2) When the filter cuts 2 bits from the pixels, the output image is approximately clear (with some noise).
3) When the filter cuts 3, 4 or 5 bits from the pixels, the output images are clear or have little noise.
If the image has hidden information in two bits (bits 1-2), the output images from the filter will be:
● For 1 and 2 bits, the images are not clear.
● For 3, 4 and 5 bits, the images are approximately clear.
This filter is very sensitive to any change in the first five bits of the pixels in any BMP or GIF image. The number of color levels does not affect the filter results: the filter works with 256-color, 16-bit high color and true color BMP images. The filter also gives good results on GIF images, because the GIF file format has a maximum of 256 colors.
5. Experimental Results
To evaluate the effectiveness of the proposed method, we apply VPD to three different types of secret image hiding: hiding a color secret image, hiding a palette-based 256-color secret image, and hiding a grayscale image in a true color image. As shown in Figures 3 and 4, the VPD method detects the stego objects in the first two types, but it fails in the third type.
The proposed method is also compared with Fridrich's method [13], Y. Q. Shi's method [14] and Z. M. He's method [15]. The averaged results of four independent runs for different steganographic embedding techniques are shown in Tables 1-3.
Figure 3. Original image with secret image.
Figure 4. After using visual pixel detection.
Table 1. Detection accuracy for OutGuess.
Table 2. Detection accuracy for F5.
Table 3. Detection accuracy for Steghide.
The proposed method outperforms Fridrich's method and Shi's method in attacking all three types of steganographic embedding techniques. Experimental results show that Fridrich's and He's methods perform well for F5, while Shi's and He's methods perform well for OutGuess and Steghide. In contrast, the proposed method performs the best in attacking all three types of steganographic embedding techniques. Overall, the proposed method provides good performance on steganalysis and outperforms the compared methods.
6. Conclusion
It is very difficult to extract a secret message from an image because of the numerous methods of hiding data in images. In this paper, we apply the VPD method to different types of secret images to extract the hidden information. Some of these types need the stego-image and the stego-key, others need the stego-image only, and the paper has extracted secret data hidden by different tools. The final conclusion is that it is very difficult to apply one general method for extraction, because different techniques such as encryption (block and stream), colors and filters can be used for embedding, and each one needs a special way to solve it (find the hidden information). In future work, we will also try to select features using the localized generalization error model to reduce the system complexity.
REFERENCES
- C.-C. Thien and J.-C. Lin, “A Simple and High-Hiding Capacity Method for Hiding Digit-by-Digit Data in Images Based on Modulus Function,” Pattern Recognition, Vol. 36, No. 12, 2003, pp. 2875-2881. doi:10.1016/S0031-3203(03)00221-8
- C.-K. Chan and L. M. Cheng, “Hiding Data in Images by Simple LSB Substitution,” Pattern Recognition, Vol. 37, No. 3, 2004, pp. 469-474. doi:10.1016/j.patcog.2003.08.007
- C.-C. Chang, J.-Y. Hsiao and C.-S. Chan, “Finding Optimal Least-Significant Bit Substitution in Image Hiding by Dynamic Programming Strategy,” Pattern Recognition, Vol. 36, No. 7, 2003, pp. 1583-1595. doi:10.1016/S0031-3203(02)00289-3
- C.-C. Thien and J.-C. Lin, “A Simple and High-Hiding Capacity Method for Hiding Digit-by-Digit Data in Images Based on Modulus Function,” Pattern Recognition, Vol. 36, No. 12, 2003, pp. 2875-2881. doi:10.1016/S0031-3203(03)00221-8
- R.-Z. Wang, C.-F. Lin and J.-C. Lin, “Image Hiding by Optimal LSB Substitution and Genetic Algorithm,” Pattern Recognition, Vol. 34, No. 3, 2001, pp. 671-683. doi:10.1016/S0031-3203(00)00015-7
- C.-C. Chang, T.-S. Chen and L.-Z. Chung, “A Steganographic Method Based upon JPEG and Quantization Table Modification,” Information Sciences, Vol. 141, No. 1-2, 2002, pp. 123-138. doi:10.1016/S0020-0255(01)00194-3
- M. Iwata, K. Miyake and A. Shiozaki, “Digital Steganography Utilizing Features of JPEG Images,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, Vol. E87-A, No. 4, 2004, pp. 929-936.
- H. Noda, J. Spaulding, M. N. Shirazi and E. Kawaguchi, “Application of Bit-Plane Decomposition Steganography to JPEG2000 Encoded Images,” IEEE Signal Processing Letters, Vol. 9, No. 12, 2002, pp. 410-413. doi:10.1109/LSP.2002.806056
- T. Liu and Z.-D. Qiu, “A DWT-Based Color Image Steganography Scheme,” Proceedings of the International Conference on Signal Processing, Beijing, 26-30 August 2002, pp. 1568-1571. doi:10.1109/ICOSP.2002.1180096
- R. R. Ni and Q. Q. Ruan, “Embedding Information into Color Images Using Wavelet,” Proceedings of the International Conference on Computers, Communications, Control and Power Engineering of the IEEE TENCON, Beijing, 28-31 October 2002, pp. 598-601. doi:10.1109/TENCON.2002.1181346
- M.-H. Lin, Y.-C. Hu and C.-C. Chang, “Both Color and Gray Scale Secret Images Hiding in a Color Image,” International Journal of Pattern Recognition and Artificial Intelligence, Vol. 16, No. 6, 2002, pp. 697-713. doi:10.1142/S0218001402001903
- W.-S. Kim and R.-H. Park, “Color Image Palette Construction Based on the HSI Color System for Minimizing the Reconstruction Error,” Proceedings of the International Conference on Image Processing of the IEEE, Lausanne, 16-19 September 1996, pp. 1041-1044. doi:10.1109/ICIP.1996.561017
- J. Fridrich, “Feature-Based Steganalysis for JPEG Images and Its Implications for Future Design of Steganographic Schemes,” Proceedings of the 6th International Conference on Information Hiding, Toronto, 23-25 May 2004, pp. 67-81. doi:10.1007/978-3-540-30114-1_6
- Y. Q. Shi, C. C. Chen and W. Chen, “A Markov Process Based Approach to Effective Attacking JPEG Steganography,” Proceedings of the 8th International Conference on Information Hiding, Alexandria, 10-12 July 2006, pp. 249-264.
- Z.-M. He, W. W. Y. Ng, P. P. K. Chan and D. S. Yeung, “Steganography Detection Using Localized Generalization Error Model,” Proceedings of the International Conference on Systems Man and Cybernetics of the IEEE SMC, Istanbul, 10-13 October 2010, pp. 1544-1549. doi:10.1109/ICSMC.2010.5642331