Monitoring techniques are a key technology for examining the conditions in various scenarios, e.g., structural conditions, weather conditions, and disasters. To understand such scenarios, it is important to extract their features appropriately from observation data. This paper proposes a monitoring method that allows sound environments to be expressed as a sound pattern. To this end, the concept of synesthesia is exploited. Specifically, the keys, tones, and pitches of the monitored sound are expressed using the three elements of color, namely hue, saturation, and brightness, respectively. Based on a previous synesthesia experiment, it is assumed that the hue, saturation, and brightness can be detected from the chromagram, sonogram, and sound spectrogram, respectively. The sound pattern can then be drawn using color, yielding a “painted sound map.” The usefulness of the proposed monitoring technique is verified using environmental sound data observed at a galleria.
Recently, the analysis of large data sets, so-called “big data,” has allowed a variety of information to be extracted, and this information can help create certain services. Further, monitoring techniques can be useful for determining the phenomena that initially generated the recorded data. Monitoring techniques can thus be regarded as techniques that identify the conditions of a monitored environment by analyzing the data observed within it. For example, in the case of structural monitoring, which is known as building health monitoring, deterioration and damage to buildings can be checked using findings obtained through the analysis of sensor data, e.g., data acquired from acceleration sensors and cameras [
Various methods for understanding sound environments have been proposed to date. However, almost all researchers have focused on topics related to environmental sound recognition (ESR) [
This study proposes an unconventional method that allows the analysis of sound environments using color, where the color rules are based on the concept of synesthesia [
For application of the proposed method, environmental sounds are first collected using a microphone array (
using colors, based on the knowledge of synesthesia.
Synesthesia is a phenomenon in which one kind of sensory stimulation is expressed as another sensation [
Keys, tones, and pitches are assumed to be detected by the chromagram, sonogram, and sound spectrogram, respectively. Hence, the hue score is calculated using the key histogram yielded by the chromagram. Similarly, the saturation and brightness scores are calculated using the frequency-band histograms produced by the sonogram and sound spectrogram, respectively, with a clustering method then being applied to the environmental-sound spectrogram. Similar frequency components are categorized so that the frequency component dispersion of the environmental sounds is clarified. This dispersion information is then used to calculate the histogram with respect to the spectrogram frequency elements.
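As a rough illustration of the key histogram mentioned above, the following Python sketch counts, for each frame of a chromagram, the pitch class with the largest energy. This counting rule, the 12-bin frame layout, and the names `PITCH_CLASSES` and `key_histogram` are assumptions made for this example only; the calculation actually used in the paper may differ.

```python
# Hypothetical sketch: build a key histogram from a chromagram by counting
# the dominant pitch class of every frame (an assumed rule, not the paper's).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_histogram(chromagram):
    """Return {pitch class: number of frames in which it is dominant}.

    chromagram: iterable of frames, each a sequence of 12 non-negative
    chroma energies ordered as in PITCH_CLASSES.
    """
    counts = [0] * 12
    for frame in chromagram:
        if len(frame) != 12:
            raise ValueError("each frame must contain 12 chroma bins")
        if max(frame) <= 0:
            continue  # silent frame: no dominant pitch class
        dominant = max(range(12), key=lambda i: frame[i])
        counts[dominant] += 1
    return dict(zip(PITCH_CLASSES, counts))


if __name__ == "__main__":
    frames = [[0.0] * 12 for _ in range(3)]
    frames[0][0] = 1.0   # C dominates the first frame
    frames[1][9] = 2.0   # A dominates the second frame
    frames[2][9] = 0.5   # A dominates the third frame
    print(key_histogram(frames))
```

Silent frames are skipped so that they do not bias the histogram toward the first pitch class.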
Below, the proposed method of sound environment analysis is presented in detail. The sound data
The environmental sound chromagram is calculated using the MATLAB chroma toolbox [
Subsequently, a histogram showing the key information indicated in the chromagram is calculated. For example,
The sonogram can be calculated using the MATLAB ma toolbox [
The frequency-band histogram of the sonogram is computed [
A spectrogram can also be calculated.
environmental sound shown in
Each exemplar centroid frequency obtained by the IAP is classified into a low-, medium-, or high-frequency group; then, the frequency-group histogram can be acquired. The brightness score is obtained from the histogram in a similar manner to the case of the hue score. However, when the 8-bit binary code is obtained, the threshold determining “1” or “0” values is set to zero. Therefore, a low score indicates that the dominant frequency of the environmental sound is low, while a higher score indicates that the environmental sound consists of various frequencies.
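The grouping step described above can be sketched in Python as follows. The band edges `LOW_MAX_HZ` and `MEDIUM_MAX_HZ` and both function names are hypothetical values chosen for illustration, since the boundaries between the low-, medium-, and high-frequency groups are not stated here. The sketch covers only the histogram, not the subsequent score calculation.

```python
# Hypothetical band edges; the boundaries used in the paper are not given here.
LOW_MAX_HZ = 500.0      # assumed upper edge of the low-frequency group
MEDIUM_MAX_HZ = 2000.0  # assumed upper edge of the medium-frequency group


def frequency_group(freq_hz, low_max=LOW_MAX_HZ, medium_max=MEDIUM_MAX_HZ):
    """Classify one exemplar centroid frequency as 'low', 'medium', or 'high'."""
    if freq_hz < 0:
        raise ValueError("frequency must be non-negative")
    if freq_hz <= low_max:
        return "low"
    if freq_hz <= medium_max:
        return "medium"
    return "high"


def frequency_group_histogram(centroids_hz, low_max=LOW_MAX_HZ,
                              medium_max=MEDIUM_MAX_HZ):
    """Count how many exemplar centroid frequencies fall into each group."""
    histogram = {"low": 0, "medium": 0, "high": 0}
    for freq_hz in centroids_hz:
        histogram[frequency_group(freq_hz, low_max, medium_max)] += 1
    return histogram


if __name__ == "__main__":
    print(frequency_group_histogram([120.0, 480.0, 900.0, 3500.0]))
```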
The hue, saturation, and brightness scores are used to draw the painted sound map, where the hue-saturation-brightness color model obtained using these three scores is converted to a red-green-blue (RGB) color model.
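The color-model conversion can be illustrated with Python's standard `colorsys` module. The sketch below assumes that each of the three scores lies in the range 0 to 255 and is mapped linearly onto the unit interval; this range and the function name `scores_to_rgb` are assumptions made for the example, not details confirmed by the paper.

```python
import colorsys


def scores_to_rgb(hue_score, saturation_score, brightness_score, max_score=255):
    """Convert hue, saturation, and brightness scores to an 8-bit (R, G, B) tuple.

    Each score is assumed to lie in [0, max_score] and is scaled to [0, 1]
    before the standard HSV-to-RGB conversion is applied.
    """
    scores = (hue_score, saturation_score, brightness_score)
    for score in scores:
        if not 0 <= score <= max_score:
            raise ValueError("scores must lie in [0, max_score]")
    h, s, v = (score / max_score for score in scores)
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return tuple(round(channel * 255) for channel in (r, g, b))


if __name__ == "__main__":
    # One colored cell of a painted sound map for a single observation.
    print(scores_to_rgb(0, 255, 255))
```

With this mapping, a zero brightness score always yields black and a zero saturation score yields a shade of gray, regardless of the hue score.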
In this section, the efficacy of the painted sound map method is demonstrated using environmental sounds observed in the sound environment shown in
The sound environment shown in
lar to those shown in
From the above results, it can be concluded that the proposed painted sound map, drawn using the hue, saturation, and brightness scores, is effective for sound environment analysis. In particular, the approach is useful for visually detecting sound environment conditions and their variations, and the snapshots provided by the painted sound maps work effectively for this purpose.
This paper has proposed a method of monitoring sound environments based on computational auditory scene analysis. The proposed visualization technique allows sound environment conditions to be determined and represented using colors.
In future work, the proposed monitoring technique can be applied to monitoring the structural condition of aging buildings.
The author thanks Dr. Sashima and Dr. Kurumatani for helpful discussions. This work was partly supported by a JSPS KAKENHI Grant (Number 16H02911).
Kawamoto, M. (2017) Sound-Environment Monitoring Method Based on Computational Auditory Scene Analysis. Journal of Signal and Information Processing, 8, 65-77. https://doi.org/10.4236/jsip.2017.82005
The method proposed in this paper employs a message exchange clustering algorithm based on an affinity propagation (AP) method [
where
In the AP method, the exemplar is the data point k satisfying the following inequality:
Then, the exemplar satisfying condition (4) can be altered by the preference
Here, it should be noted that the
That is, based on the
This means that the
Method | VRC | Number of exemplars | Running time [s] |
---|---|---|---|
Original AP method | 32.93 | 4.7 | 0.0078 |
Adaptive AP method | 35.99 | 5.5 | 0.2146 |
Proposed IAP method | 38.47 | 9.6 | 0.1013 |
Further, X denotes all data in a cluster consisting of data Xk’’. The parameters
In this subsection, the proposed IAP method is compared with the original and adaptive AP methods, using the 2D random data points
Here, k denotes the number of clusters and SSB is the overall between-cluster variance, which is essentially the variance of all the cluster centroids from the grand centroid in the dataset, defined as
Here,
where
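For readers who wish to reproduce a VRC value, the following Python sketch computes the variance ratio criterion in its standard (Calinski-Harabasz) form, that is, the between-cluster sum of squares divided by (k - 1) over the within-cluster sum of squares divided by (N - k). The function name and the plain-list data layout are choices made for this example, and it is assumed that the definition used in this paper matches the standard one.

```python
def variance_ratio_criterion(points, labels):
    """Compute the VRC for points (sequences of floats) with cluster labels."""
    n = len(points)
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more points than clusters")
    dim = len(points[0])

    def centroid(pts):
        return [sum(p[d] for p in pts) / len(pts) for d in range(dim)]

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    grand_centroid = centroid(points)
    ssb = 0.0  # between-cluster sum of squares
    ssw = 0.0  # within-cluster sum of squares
    for pts in clusters.values():
        c = centroid(pts)
        ssb += len(pts) * squared_distance(c, grand_centroid)
        ssw += sum(squared_distance(p, c) for p in pts)
    if ssw == 0.0:
        raise ValueError("within-cluster variance is zero; VRC is undefined")
    return (ssb / (k - 1)) / (ssw / (n - k))


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
    print(variance_ratio_criterion(data, [0, 0, 1, 1]))
```

A larger value indicates clusters that are compact and well separated, which is how the VRC column of the comparison table is read.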