J. Biomedical Science and Engineering, 2008, 1, 157-162
Published Online November 2008 in SciRes. http://www.srpublishing.org/journal/jbise JBiSE
Identification of essential language areas by
combination of fMRI from different tasks using
probabilistic independent component analysis
Yanmei Tie1, Ralph O. Suarez, Stephen Whalen, Isaiah H. Norton, Alexandra J. Golby
1Departments of Neurosurgery and Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA. Correspondence should be
addressed to Alexandra J. Golby (agolby@bwh.harvard.edu), Tel.: 617-525-6776.
Received August 5, 2008; revised October 14, 2008; accepted October 14, 2008
Functional magnetic resonance imaging (fMRI)
has been used to lateralize and localize lan-
guage areas for pre-operative planning pur-
poses. To identify the essential language areas
from this kind of observation method, we pro-
pose an analysis strategy to combine fMRI data
from two different tasks using probabilistic in-
dependent component analysis (PICA). The
assumption is that the independent compo-
nents separated by PICA identify the networks
activated by both tasks. The results from a
study of twelve normal subjects showed that a
language-specific component was consistently
identified, with the participating networks separated
into different components. Compared with
a model-based method, PICA’s ability to capture
the neural networks whose temporal activity
may deviate from the task timing suggests that
PICA may be more appropriate for analyzing
language fMRI data with complex event-related
paradigms, and may be particularly helpful for
patient studies. This proposed strategy has the
potential to improve the correlation between
fMRI and invasive techniques which can dem-
onstrate essential areas and which remain the
clinical gold standard.
Keywords: fMRI, probabilistic independent
component analysis (PICA), language map-
ping, event-related paradigm
The purpose of pre-surgical language mapping is to lat-
eralize and localize critical language areas for neurosur-
gical planning when the patient’s lesion is located in or
close to language areas. In addition to invasive language
mapping techniques (e.g., intracarotid amytal test (IAT),
and intra-operative electric cortical stimulation (ECS)),
pre-operative functional magnetic resonance imaging
(fMRI) based on language tasks has been used to determine
the language-dominant hemisphere [1] and provide
spatial relationships between brain lesions and language
areas [2]. Although fMRI has the advantages of
non-invasiveness, pre-operative availability, repeatability,
and lower time and cost, it has several shortcomings for
language mapping applications [3]. First, compared to
the conventional inhibition methods that are able to
demonstrate essential areas, fMRI is an observation
method which thus demonstrates numerous areas in-
volved in the language tasks, but cannot demonstrate the
necessity of those areas in language function. Second,
language fMRI generally uses silent tasks due to the
motion artifact resulting from vocalizing responses,
which complicates comparisons with the clinical gold
standard tests that use overt responses.
To try to address these problems, we applied a
data-driven method, probabilistic independent compo-
nent analysis (PICA) [4], to fMRI data from two lan-
guage tasks. First, we propose an analysis strategy to
examine activations during two different tasks in an ef-
fort to identify the essential language areas. Second, we
investigate the performance of PICA in extracting lan-
guage-related components from vocalized language
fMRI data that are contaminated by motion artifact and
background noise.
Independent component analysis (ICA) has been ap-
plied to fMRI data to extract statistically independent
features [5, 6]. It has been shown that ICA can be used
as a complementary tool to the conventional general
linear model (GLM) method in improving the sensitivity
and specificity of fMRI language mapping [7]. PICA is
an extension of the classical noise-free ICA model [5],
assuming that the data are confounded by additive Gaus-
sian noise [4]. It was proposed to address the overfitting
problem and make the statistical significance testing
feasible for the analysis of fMRI data. PICA has been
applied in several studies, including the investigation of
the neural dynamics of default-mode networks and event
segmentation in music [8-11].
SciRes Copyright © 2008
158 Y. Tie / J. Biomedical Science and Engineering 1 (2008) 157-162
2.1. Subjects and Image Acquisition
The protocol was approved by the Partner’s Institutional
Review Board. Twelve right-handed, native English-speaking
healthy subjects participated (7 men, 5 women,
mean age = 30.6 ± 6.5 years, range 20-43 years). One
right-handed patient with focal cortical dysplasia (female,
35 years old) was also studied. All subjects provided
written informed consent.
MR images were obtained using a 3.0 Tesla scanner
(Signa scanner, GE Medical System, Milwaukee, WI,
USA). A single-shot gradient-echo echo-planar imaging
(EPI) sequence was used to acquire blood-oxygen-level dependent
(BOLD) functional images (TR = 1000 ms, TE = 29 ms,
flip angle = 68°, FOV = 24 cm, dimension = 64 × 64 ×
16, slice gap = 0 mm, voxel size = 3.75 × 3.75 × 5 mm³)
using a quadrature head coil. In each image volume, 16
axial slices were acquired using an ascending interleaved
scanning sequence. A volumetric T1-weighted magnetization
prepared rapid gradient echo (MPRAGE) acquisition
(dimension = 256 × 256 × 124, voxel size = 1 × 1 ×
1.3 mm³) was performed to provide a high-resolution
anatomic reference frame for subsequent overlay of
functional activation maps.
The patient’s functional images were acquired using
different parameters (TR = 2000 ms, TE = 40 ms, flip
angle = 90°, FOV = 24 cm, dimension = 128 × 128 × 28,
slice gap = 0 mm, voxel size = 1.88 × 1.88 × 4 mm³,
ascending interleaved scanning sequence).
High-resolution T2-weighted gradient-echo MR images
(dimension = 512 × 512 × 91, voxel size = 0.5 × 0.5 ×
1.5 mm³) were acquired to provide background structural
images for the patient’s activation maps.
2.2. Behavioral Paradigm
Subjects performed two language tasks with vocalized
responses: antonym-generation (AG), and noun catego-
rization (NC). In the AG task, subjects were asked to
speak the antonym of a word stimulus presented visually
through MR-compatible video goggles (Resonance
Technology, Los Angeles, CA, USA). In the NC task,
subjects were asked to state whether the word stimulus
referred to either a living (“alive”, e.g., a dog) or
non-living (“not alive”, e.g., a chair) object. Subject vo-
calizations were transmitted by an intercom system
(Avotec Inc., Stuart, FL, USA) to an investigator in the
MRI scanner control room who counted the number of
incorrect or omitted responses in order to verify satis-
factory task performance. Subjects were instructed to
verbalize responses with minimal movement of their
head, jaw, or lips. During the time period between visual
stimuli, subjects were asked to relax and look at a cross-
hair shown in the center of the visual field.
The language tasks were implemented as a
rapid-presentation, event-related fMRI paradigm with a
jittered inter-stimulus-interval (ISI = 8.3 ± 5.1 sec), last-
ing 7 min 20 sec (including a 10-sec pre-stimulus period
acquired to allow stabilization of the BOLD signal, ex-
cluded from analysis). A total of 50 word stimuli were
delivered during each task, and each word was shown
for 2 sec. The fMRI paradigms of the patient study
lasted 5 min 20 sec (including a 10-sec pre-stimulus period),
delivering 34 words in the AG task and 39 words
in the NC task. The order and exact timing for delivery
of word stimuli were based on a stochastic design intended
to maximize the statistical significance of the
fMRI paradigm and to minimize subjects’ expectation and
habituation effects. Stimulus event scheduling was performed
using the Optseq2 software package (NMR Center,
Massachusetts General Hospital, MA, USA). Stimulus
paradigms were implemented using the Presentation
software package (Version 9.70, Neurobehavioral Systems
Inc., Davis, CA, USA).
3.1. Concatenation of Two Tasks and Data
fMRI data from the two tasks were concatenated in time by
appending the NC task data to the end of the AG task data
(Figure 1). The total data set of each subject thus comprised
860 volumes (430 volumes for each task). We then used
the Statistical Parametric Mapping software package
(SPM2, Wellcome Department of Cognitive Neurology,
London, UK) to perform motion correction by realigning
the fMRI images to the first functional image.
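The volume count follows from the paradigm timing (7 min 20 sec per task, a 10-sec pre-stimulus period, TR = 1000 ms), and the concatenation itself is a simple stacking along the time axis. The following is a minimal sketch of both steps, assuming each run is held as a list of volumes; the function names and the placeholder volumes are illustrative only and are not part of SPM2 or FSL.

```python
def volumes_per_run(run_seconds, prestim_seconds, tr_seconds):
    """Number of analyzed volumes after discarding the pre-stimulus period."""
    return int((run_seconds - prestim_seconds) / tr_seconds)


def concatenate_runs(ag_run, nc_run):
    """Append the NC run to the end of the AG run along the time axis."""
    return list(ag_run) + list(nc_run)


# Healthy-subject paradigm: 7 min 20 s per task, 10 s pre-stimulus, TR = 1 s.
n_per_task = volumes_per_run(7 * 60 + 20, 10, 1.0)

# Placeholder "volumes": each short list stands in for a 3-D image.
ag = [[0.0, 0.0] for _ in range(n_per_task)]
nc = [[1.0, 1.0] for _ in range(n_per_task)]
combined = concatenate_runs(ag, nc)

print(n_per_task, len(combined))  # prints: 430 860
```

With these inputs the sketch reproduces the 430 volumes per task and 860 volumes per subject stated above.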
The data were then sent to the Multivariate Explora-
tory Linear Optimized Decomposition into Independent
Components (MELODIC, Version 3.05) module of
FMRIB’s Software Library (FSL, Version 4.0, Oxford
Center for Functional Magnetic Resonance Imaging of
the Brain, University of Oxford, Oxford, UK) for PICA
analysis [4]. Before the PICA procedure, the following
steps were applied to the input data file: masking of
non-brain voxels by an intensity thresholding at 10%;
high-pass temporal filtering to remove low-frequency
drifts with cut-off period of 128 sec; and voxel-wise
de-meaning and variance normalization of the data.
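As an illustration of the last pre-processing step, the sketch below de-means a single voxel time series and scales it to unit variance. It is a simplified stand-in: the use of the sample variance and the handling of constant voxels are assumptions, and MELODIC's actual implementation may differ in detail.

```python
import math


def normalize_voxel(series):
    """De-mean a voxel time series and scale it to unit variance."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    # Sample variance (n - 1 denominator) is an assumption of this sketch.
    var = sum(c * c for c in centered) / (n - 1)
    if var == 0.0:
        return centered  # constant voxel: nothing to scale, all zeros
    std = math.sqrt(var)
    return [c / std for c in centered]


# Hypothetical raw intensities for one voxel over five volumes.
voxel = [100.0, 102.0, 98.0, 101.0, 99.0]
z = normalize_voxel(voxel)
```

After this step every voxel contributes a time series with zero mean and unit variance, so that voxels with high raw signal intensity do not dominate the decomposition.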
3.2. Probabilistic ICA of fMRI Data
We applied the PICA technique proposed by Beckmann
and Smith [4] to analyze the concatenated fMRI data.
The fMRI signal (X) is assumed to be generated from a
linear mixing process of the independent non-Gaussian
sources (S) by a mixing matrix (A), and corrupted by
additive Gaussian noise (η):
X = AS + η. (1)
In the PICA model (1), X is a p × n matrix holding the fMRI
data as p volumes (p = 860 in this study) of n voxels each,
S is a q × n matrix denoting q non-Gaussian
sources (i.e., independent components, ICs), and A is a p
× q mixing matrix.
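To make the dimensions in model (1) concrete, the following sketch draws a small synthetic data set from X = AS + η. It only simulates the generative model and does not perform the PICA estimation; the Laplace-distributed sources, the matrix sizes, and the function names are assumptions chosen for illustration.

```python
import random


def matmul(a, b):
    """Multiply matrix a (rows x k) by matrix b (k x cols), as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def simulate_pica(p, q, n, noise_std, seed=0):
    """Draw X = A S + eta with Laplace-distributed (non-Gaussian) sources."""
    rng = random.Random(seed)
    # S: q spatial sources over n voxels (a random sign times an exponential
    # draw gives a Laplace distribution).
    S = [[rng.choice((-1, 1)) * rng.expovariate(1.0) for _ in range(n)]
         for _ in range(q)]
    # A: p x q mixing matrix; column j is the time course of source j.
    A = [[rng.gauss(0.0, 1.0) for _ in range(q)] for _ in range(p)]
    # eta: additive Gaussian noise with the same shape as X.
    eta = [[rng.gauss(0.0, noise_std) for _ in range(n)] for _ in range(p)]
    AS = matmul(A, S)
    X = [[AS[i][j] + eta[i][j] for j in range(n)] for i in range(p)]
    return X, A, S


X, A, S = simulate_pica(p=8, q=2, n=5, noise_std=0.1)
print(len(X), len(X[0]))  # p rows (volumes) by n columns (voxels)
```

Each column of A plays the role of a component time course and each row of S the role of a component spatial map, which is how the two are interpreted in the remainder of the analysis.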
First, the number of ICs (q) was estimated using the
Laplace approximation to the Bayesian evidence of the
model order [4]. Between 95 and 166 components were
estimated for each subject (mean ± STD = 122 ± 20, across
subjects). The fMRI data were then reduced to q dimensions
by principal component analysis (PCA) and decomposed
into q spatially independent components by
the fastICA algorithm [12]. A de-mixing matrix W was
found to generate:
Ŝ = WX, (2)
where Ŝ is a good approximation to the sources.
Figure 1. Concatenation of two tasks fMRI data. Red bars: word
stimuli onsets.
Figure 2. Spatial maps of (A) “language COI”, and (B) “motor
COI” of an example subject estimated by the PICA
analysis (posterior probability thresholded at p > 0.5).
Then the estimated component maps were divided by
the standard deviation of the Gaussian noise to generate
z-score maps, which were subsequently thresholded at a posterior
probability p > 0.5 by fitting a Gaussian/Gamma mixture
model to the histogram of intensity values [4].
3.3. Identification of Components of Interest
Table 1. Ranking results of COIs based on the IC time courses’
average power in the frequency range below 0.1 Hz.
Subject   Language COI   Motor COI
1         4th            3rd
2         8th            4th
3         1st            2nd
4         11th           10th
5         14th           9th
6         5th            4th
7         5th            6th
8         1st            11th
9         6th            5th
10        23rd           25th
11        3rd            8th
12        1st            9th
Figure 3. Time courses of “language COI” (red) and “motor COI”
(blue) averaged across subjects. The green plot is the expected
HRF model.
We used a two-step process to identify the components
of interest (COIs) from the PICA output of each subject’s
data. First, the power spectral density of each
component’s time course was estimated by periodogram
spectral estimation. The average power at frequencies
below 0.1 Hz was then calculated, and the components
were ranked in descending order of this value. This
frequency range was selected based on the power spectrum
of the expected hemodynamic response function (HRF),
which was concentrated in the low-frequency
range (< 0.1 Hz). Next, the components’ spatial maps
were visually inspected in the order determined by the
previous step to identify the components with activation
in the putative language areas and other areas of interest.
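The spectral ranking step can be sketched as follows. This is an illustrative implementation only: it uses a naive discrete Fourier transform, includes the zero-frequency bin (which is zero after de-meaning), and ranks two synthetic sinusoids rather than real component time courses. The function names are hypothetical, and the exact periodogram settings used in the study are not specified in the text.

```python
import cmath
import math


def periodogram(x, fs):
    """Periodogram of a real series x sampled at fs Hz, at frequencies >= 0."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coef = sum(xc[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coef) ** 2 / (n * fs))
    return freqs, power


def low_freq_power(x, fs, cutoff_hz=0.1):
    """Average periodogram power at frequencies below cutoff_hz."""
    freqs, power = periodogram(x, fs)
    band = [p for f, p in zip(freqs, power) if f < cutoff_hz]
    return sum(band) / len(band)


def rank_components(time_courses, fs, cutoff_hz=0.1):
    """Return component indices sorted by descending low-frequency power."""
    scores = [low_freq_power(tc, fs, cutoff_hz) for tc in time_courses]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


fs = 1.0  # TR = 1 s in the healthy-subject data, i.e. 1 Hz sampling
t = range(200)
slow = [math.sin(2 * math.pi * 0.05 * s) for s in t]  # 0.05 Hz: inside the band
fast = [math.sin(2 * math.pi * 0.30 * s) for s in t]  # 0.30 Hz: outside the band
order = rank_components([fast, slow], fs)
print(order)  # the slow component (index 1) is ranked first
```

The ranking only orders the components for the subsequent visual inspection; it does not by itself decide which components are COIs.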
3.4. GLM Analysis of fMRI Data
For comparison purposes, the pre-processed data were
submitted to SPM2 for conventional GLM analysis. The
basis functions consisted of the canonical HRF model
with its temporal and dispersion derivatives. Run-specific
responses were modeled in an event-related design [13]
by convolving a series of Dirac delta functions, each
representing a stimulus event onset, with the basis
functions. After GLM estimation, the t maps were fitted with a
Gaussian/Gamma mixture model and thresholded at a
posterior probability p > 0.5 in order to be comparable with
the PICA results.
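The construction of an event-related regressor can be illustrated with a short sketch that convolves a stick function at the stimulus onsets with a double-gamma HRF. The HRF parameters (response shape 6, undershoot shape 16, undershoot ratio 1/6) are values commonly quoted for the SPM canonical HRF and are treated here as assumptions; the temporal and dispersion derivatives are omitted, and the onsets are hypothetical rather than the Optseq2 schedule used in the study.

```python
import math


def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, ratio=1.0 / 6.0):
    """Double-gamma HRF value at time t (seconds); parameter values are assumed."""
    if t <= 0.0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    under = t ** (under_shape - 1) * math.exp(-t) / math.gamma(under_shape)
    return peak - ratio * under


def event_regressor(onsets_s, n_volumes, tr_s, hrf_length_s=32.0):
    """Convolve a stick (delta) function at the onsets with the HRF."""
    sticks = [0.0] * n_volumes
    for onset in onsets_s:
        idx = int(round(onset / tr_s))
        if 0 <= idx < n_volumes:
            sticks[idx] += 1.0
    kernel = [double_gamma_hrf(k * tr_s) for k in range(int(hrf_length_s / tr_s))]
    regressor = [0.0] * n_volumes
    for i, s in enumerate(sticks):
        if s == 0.0:
            continue
        for k, h in enumerate(kernel):
            if i + k < n_volumes:
                regressor[i + k] += s * h
    return regressor


# Hypothetical onsets (seconds); the study's actual schedule came from Optseq2.
onsets = [5.0, 14.0, 20.0, 33.0]
reg = event_regressor(onsets, n_volumes=60, tr_s=1.0)
```

A regressor built this way encodes the assumption that every stimulus evokes the same response shape at a fixed latency, which is the assumption the data-driven PICA approach does not require.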
Figure 4. Temporal correlation coefficients between the identified
COIs’ time courses and the expected HRF model for each subject
(red bars: language COIs; blue bars: motor COIs).
Figure 5. Spatial maps of GLM analysis of the same example
subject as in Fig. 2 (posterior probability thresholded at p
> 0.5).
4.1. Spatial Maps of Identified Language and
Motor COIs
For each subject, among an average of 122 components
estimated by PICA, one component was consistently
identified with activations in the left frontal and tempo-
ral lobes, primarily in the putative language areas (infe-
rior frontal gyrus, superior temporal gyrus, and su-
pramarginal gyrus), and pre-motor area (PMA, middle
frontal gyrus). This component was named “language
COI”. Figure 2A shows the spatial maps of the language
COI of an example subject (activations in the putative
language areas and PMA are highlighted by red circles).
Another component was identified with bilateral activa-
tions mainly in the primary face motor areas (precentral
gyrus), which was named “motor COI”. Figure 2B
shows the spatial maps of the motor COI of the same subject.
The ranking results of the language and motor COIs
based on the average power of their time courses are
listed in Table 1 for each subject. In all but one subject,
both COIs ranked within the first 15 ICs; in subject 10,
the language COI ranked 23rd and the motor COI ranked
25th.
4.2. Time Courses of Identified COIs
Figure 6. Spatial maps of (A) “language COI”, and (B)
“motor COI” of the patient estimated by the PICA analysis
(posterior probability thresholded at p > 0.5).
Figure 3 shows the time courses of the language (red)
and motor (blue) COIs averaged across all subjects. The
expected HRF model is also shown (green).
Figure 4 shows the results of temporal correlation
analysis between the identified COIs’ time courses and
the expected HRF model for each subject. In 7 of the 12
subjects, the time course of the motor COI correlated
more closely with the expected HRF model than did that
of the language COI. The correlation coefficients
were 0.30 ± 0.17 (mean ± STD, across subjects) for
the language COIs, and 0.38 ± 0.25 for the motor COIs.
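The temporal correlation used here is the Pearson correlation coefficient between a component time course and the expected HRF model. A minimal sketch is given below; the short series are made-up values chosen only to show that a time course with the same shape but a delayed onset yields a lower coefficient than one that is locked to the model.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical model time course and two hypothetical component time courses.
model = [0.0, 0.2, 1.0, 0.6, 0.1, 0.0, 0.3, 1.0, 0.5, 0.0]
locked = [2.0 * m + 1.0 for m in model]   # scaled and offset copy of the model
shifted = model[-2:] + model[:-2]         # same shape, delayed by 2 samples

r_locked = pearson_r(model, locked)
r_shifted = pearson_r(model, shifted)
```

Because the coefficient is insensitive to scale and offset but sensitive to timing, a low value for a language COI is compatible with a genuine response whose latency deviates from the task timing.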
4.3. Comparison of PICA and GLM Results
Figure 5 shows the spatial maps of the GLM results of
the same example subject. Compared with the PICA
maps, the GLM maps showed activation patterns very
similar to those of the motor COI (Figure 2B), with bilateral
activations mainly in the primary face motor areas,
and only weak activations in the putative
language areas.
4.4. Results of Patient Data
Figure 7. Two noise components showing signals from (A)
the brain edge; and (B) the ventricle system.
The spatial maps of the language and motor COIs resulting
from PICA analysis of the patient data are shown
in Figure 6. The “language COI” (Figure 6A) showed
significant activations in the left inferior frontal gyrus
and left PMA. The “motor COI” (Figure 6B) showed
bilateral activations in the primary face motor areas.
Temporal correlation analysis indicated that the language
COI correlated poorly with the task timing (r =
0.15, p < 0.01), while the motor COI correlated more strongly
with the task timing (r = 0.45, p < 0.0001). Using the
same pre-processed data, GLM-generated maps (not
shown) revealed activations mainly in the bilateral motor
area, but almost no activations in the putative language
areas.
Figure 7 shows two noise components indicating sig-
nals from the edge of the brain (due to motion artifact,
Figure 7A) and the ventricle system (Figure 7B).
In this study, we propose a new analysis strategy using
PICA and combining fMRI data from different language
tasks to better identify the essential language areas for
pre-operative language mapping. A component was consistently
identified in the putative language areas, suggesting
that the underlying system was essential to both
tasks. Compared with the conventional GLM method,
PICA performed better in extracting language
activations and in separating noise into different
components.
It is clinically useful to demonstrate brain regions
critical for language function so that they may be
avoided during resection [3]. Whereas clinical tests
that block neural activity (IAT and ECS) are able to
demonstrate the necessity of particular areas for language
function, fMRI maps may include
non-language-specific areas that participate in the task
[7]. To try to demonstrate language-specific areas, we
concatenated the fMRI data from two different tasks in
time, and applied PICA to estimate the spatially independent
components from the data. The underlying assumption
is that the networks revealed by PICA are activated
by both tasks and are therefore more likely to be
essential language areas. The results of all subjects
consistently identified a network in the putative language areas
and the ipsilateral PMA. Another network was found in
the bilateral face motor areas, indicating the participation
of these areas in overt language production.
The temporal correlation analysis on the time courses
of the “language COIs” showed a relatively low correla-
tion with the expected HRF model, while the “motor
COIs” correlated more closely with the HRF model.
This indicated that the temporal profile of the language
activation may differ from the task timing, while the
motor activation followed the task very well. Since the
GLM method is based on the expected HRF model, and
therefore able to identify voxels whose time courses
correlate highly with the paradigm, this may explain the
observation that the GLM identified activation patterns
very similar to the “motor COIs”, but did not show
significant activations in the language areas. PICA’s ability
to capture the neural networks whose temporal activities
may deviate from the time course of the paradigm
suggests that the data-driven method may be well-suited to
analysis of complex event-related language fMRI. The
data-driven method could be particularly useful in ana-
lyzing patient data, since the lesions adjacent to the lan-
guage areas may result in alterations in the BOLD re-
sponse [14], and patients may have difficulty with task
performance causing altered timing of the cognitive
process relative to the model.
Identifying components of interest among a large number
of separated components remains a practical challenge
for the ICA technique, and methods have been
proposed based on spatial, temporal, and spectral criteria
[5-7, 15]. In this study, we ranked the ICs based on their
time courses’ average power in the frequency range of
less than 0.1 Hz. This frequency range was chosen based
on the power spectrum of the expected HRF model. The
language and motor COIs were among the highly ranked
ICs, which supports the effectiveness of this selection
criterion. Other highly ranked ICs included default-mode
networks, as shown in [8], and low-frequency
head motion artifact.
Vocalized event-related language paradigms offer an
advantage of more closely simulating natural language
performance. However, the motion artifact resulting
from vocalizing responses may lead to contamination in
the statistical maps. PICA has the ability to separate sig-
nals from motion artifact, noise, and physiological ef-
fects, into different components, and therefore may be
particularly applicable to vocalized language fMRI data.
This proposed combination strategy can be extended
to fMRI data from multiple tasks. To improve the effec-
tiveness of this strategy, the different tasks should be
selected in an optimal way. In this study, the AG task
focused on both the receptive and expressive aspects of
language function, and the NC task was more involved
in the receptive aspect. Future work will be directed at
the optimal combination of task paradigms, as well as
validation of this approach against invasive testing in
patients.
We propose a new analysis strategy to identify essential
language areas by combining fMRI data from two dif-
ferent tasks. We applied a data-driven method, PICA,
based on the assumption that the separated spatially in-
dependent networks were activated by both tasks. The
results show that using this approach, the language
component was consistently identified and separated
from the participating networks. This approach com-
pares favorably with GLM for complex event-related
language paradigms, and may be particularly helpful for
patient studies for pre-operative language mapping.
This study is supported by NIH K08 NS048063, NIH-NCRR U41
RR019703, and The Brain Science Foundation.
[1] J.R. Binder, S.J. Swanson, T.A. Hammeke, G.L. Morris, W.M.
Mueller, M. Fischer, S. Benbadis, J.A. Frost, S.M. Rao, and V.M.
Haughton, (1996) “Determination of language dominance using
functional MRI: a comparison with the Wada test,” Neurology,
vol. 46, 978-984.
[2] C. Stippich, N. Rapps, J. Dreyhaupt, A. Durst, B. Kress, E. Nenning,
V.M. Tronnier, and K. Sartor, (2007) “Localizing and lateralizing
language in patients with brain tumors: feasibility of routine
preoperative functional MR imaging in 81 consecutive
patients,” Radiology, vol. 243, 828-836.
[3] S. Tharin and A. Golby, (2007) “Functional brain mapping and its
applications to neurosurgery,” Neurosurgery, vol. 60, 185-201.
[4] C.F. Beckmann and S.M. Smith, (2004) “Probabilistic independ-
ent component analysis for functional magnetic resonance imag-
ing,” IEEE Trans. Med. Imaging, vol. 23, 137-152.
[5] M.J. McKeown, S. Makeig, G.G. Brown, T.P. Jung, S.S. Kinder-
mann, A.J. Bell, and T.J. Sejnowski, (1998) “Analysis of fMRI
data by blind separation into independent spatial components,”
Hum. Brain Mapp., vol. 6, 160-188.
[6] V.D. Calhoun, T. Adali, G.D. Pearlson, and J.J. Pekar, (2001) “A
method for making group inferences from functional MRI data
using independent component analysis,” Hum. Brain Mapp., vol.
14, 140-151.
[7] Y. Tie, S. Whalen, R.O. Suarez, and A.J. Golby, (2008) “Group
independent component analysis of language fMRI from word
generation tasks,” Neuroimage, vol. 42, 1214-1225.
[8] M.D. Greicius, G. Srivastava, A.L. Reiss, and V. Menon, (2004)
“Default-mode network activity distinguishes Alzheimer’s disease
from healthy aging: Evidence from functional MRI,” Proc. Natl.
Acad. Sci. U.S.A., vol. 101, 4637-4642.
[9] C.F. Beckmann, M. DeLuca, J.T. Devlin, and S.M. Smith, (2005)
“Investigations into resting-state connectivity using independent
component analysis,” Phil. Trans. R. Soc. B, vol. 360, 1001-1013.
[10] M. Fukunaga, S.G. Horovitz, P. Van Gelderen, J.A. de Zwart, J.M.
Jansma, V.N. Ikonomidou, R. Chu, R.H.R. Deckers, D.A. Leo-
pold, and J.H. Duyn, (2006) “Large-amplitude, spatially corre-
lated fluctuations in BOLD fMRI signals during extended rest and
early sleep stages,” Magn. Reson. Imaging, vol. 24, 979-992.
[11] D. Sridharan, D.J. Levitin, C.H. Chafe, J. Berger, and V. Menon,
(2007) “Neural dynamics of event segmentation in music: Converging
evidence for dissociable ventral and dorsal networks,”
Neuron, vol. 55, 521-532.
[12] A. Hyvarinen, (1999) “Fast and robust fixed-point algorithms for
independent component analysis,” IEEE Trans. Neural. Netw., vol.
10, 626-634.
[13] K.J. Friston, P. Fletcher, O. Josephs, A. Holmes, M.D. Rugg, and
R. Turner, (1998) “Event-related fMRI: characterizing differential
responses,” Neuroimage, vol. 7, 30-40.
[14] A.I. Holodny, M. Schulder, W.C. Liu, J. Wolko, J.A. Maldjian,
and A.J. Kalnin, (2000) “The effect of brain tumors on BOLD
functional MR imaging activation in the adjacent motor cortex:
implications for image-guided neurosurgery,” AJNR Am. J. Neu-
roradiol., vol. 21, 1415-1422.
[15] C.H. Moritz, B.P. Rogers, and M.E. Meyerand, (2003) “Power
spectrum ranked independent component analysis of a periodic
fMRI complex motor paradigm,” Hum. Brain Mapp, vol. 18,