Identification of essential language areas by combination of fMRI from different tasks using probabilistic independent component analysis

doi:10.4236/jbise.2008.13026

J. Biomedical Science and Engineering, 2008, 1, 157-162

Published Online November 2008 in SciRes. http://www.srpublishing.org/journal/jbise JBiSE

Yanmei Tie1, Ralph O. Suarez, Stephen Whalen, Isaiah H. Norton, Alexandra J. Golby

1Departments of Neurosurgery and Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA. Correspondence should be addressed to Alexandra J. Golby (agolby@bwh.harvard.edu), Tel.: 617-525-6776.

Received August 5, 2008; revised October 14, 2008; accepted October 14, 2008

ABSTRACT

Functional magnetic resonance imaging (fMRI)

has been used to lateralize and localize lan-

guage areas for pre-operative planning pur-

poses. To identify the essential language areas

from this kind of observation method, we pro-

pose an analysis strategy to combine fMRI data

from two different tasks using probabilistic in-

dependent component analysis (PICA). The

assumption is that the independent compo-

nents separated by PICA identify the networks

activated by both tasks. The results from a

study of twelve normal subjects showed that a

language-specific component was consistently

identified, with the participating networks separated into different components. Compared with

a model-based method, PICA’s ability to capture

the neural networks whose temporal activity

may deviate from the task timing suggests that

PICA may be more appropriate for analyzing

language fMRI data with complex event-related

paradigms, and may be particularly helpful for

patient studies. This proposed strategy has the

potential to improve the correlation between

fMRI and invasive techniques which can dem-

onstrate essential areas and which remain the

clinical gold standard.

Keywords: fMRI, probabilistic independent

component analysis (PICA), language map-

ping, event-related paradigm

1. INTRODUCTION

The purpose of pre-surgical language mapping is to lat-

eralize and localize critical language areas for neurosur-

gical planning when the patient’s lesion is located in or

close to language areas. In addition to invasive language

mapping techniques (e.g., intracarotid amytal test (IAT),

and intra-operative electric cortical stimulation (ECS)),

pre-operative functional magnetic resonance imaging

(fMRI) based on language tasks has been used to determine the language-dominant hemisphere [1] and provide

spatial relationships between brain lesions and language

areas [2]. Although fMRI has the advantages of

non-invasiveness, pre-operative availability, repeatability,

and less time and cost, it has several shortcomings for

language mapping applications [3]. First, compared to

the conventional inhibition methods that are able to

demonstrate essential areas, fMRI is an observation

method which thus demonstrates numerous areas in-

volved in the language tasks, but cannot demonstrate the

necessity of those areas in language function. Second,

language fMRI generally uses silent tasks due to the

motion artifact resulting from vocalizing responses,

which complicates comparisons with the clinical gold

standard tests that use overt responses.

To try to address these problems, we applied a

data-driven method, probabilistic independent compo-

nent analysis (PICA) [4], to fMRI data from two lan-

guage tasks. First, we propose an analysis strategy to

examine activations during two different tasks in an ef-

fort to identify the essential language areas. Second, we

investigate the performance of PICA in extracting lan-

guage-related components from vocalized language

fMRI data that are contaminated by motion artifact and

background noise.

Independent component analysis (ICA) has been ap-

plied to fMRI data to extract statistically independent

features [5, 6]. It has been shown that ICA can be used

as a complementary tool to the conventional general

linear model (GLM) method in improving the sensitivity

and specificity of fMRI language mapping [7]. PICA is

an extension of the classical noise-free ICA model [5],

assuming that the data are confounded by additive Gaus-

sian noise [4]. It was proposed to address the overfitting

problem and make the statistical significance testing

feasible for the analysis of fMRI data. PICA has been

applied in several studies, including the investigation of

the neural dynamics of default-mode networks and event

segmentation in music [8-11].

2. MATERIALS

158 Y. Tie / J. Biomedical Science and Engineering 1 (2008) 157-162

2.1. Subjects and Image Acquisition

The protocol was approved by the Partner’s Institutional

Review Board. Twelve right-handed native English

speaking healthy subjects participated (7 men, 5 women,

mean age = 30.6 ± 6.5 years, range 20-43 years). One

right-handed patient with focal cortical dysplasia (female,

35 years old) was also studied. All subjects provided

written informed consent.

MR images were obtained using a 3.0 Tesla scanner

(Signa scanner, GE Medical System, Milwaukee, WI,

USA). A single-shot gradient-echo echo-planar imaging

(EPI) was used to acquire blood-oxygen-level dependent

(BOLD) functional images (TR = 1000 ms, TE = 29 ms,

flip angle = 68°, FOV = 24 cm, dimension = 64 x 64 x

16, slice gap = 0 mm, voxel size = 3.75 x 3.75 x 5 mm3)

using a quadrature head coil. In each image volume, 16

axial slices were acquired using ascending interleaved

scanning sequence. A volumetric T1-weighted magneti-

zation prepared rapid gradient echo (MPRAGE) acquisi-

tion (dimension = 256 x 256 x 124, voxel size = 1 x 1 x

1.3 mm3) was performed to provide a high-resolution

anatomic reference frame for subsequent overlay of

functional activation maps.

The patient’s functional images were acquired using

different parameters (TR = 2000 ms, TE = 40 ms, flip

angle = 90°, FOV = 24 cm, dimension = 128 x 128 x 28,

slice gap = 0 mm, voxel size = 1.88 x 1.88 x 4 mm3,

ascending interleaved scanning sequence).

High-resolution T2-weighted gradient-echo MR images

(dimension = 512 x 512 x 91, voxel size = 0.5 x 0.5 x

1.5 mm3) were acquired to provide background struc-

tural images for the patient’s activation maps.

2.2. Behavioral Paradigm

Subjects performed two language tasks with vocalized

responses: antonym-generation (AG), and noun catego-

rization (NC). In the AG task, subjects were asked to

speak the antonym of a word stimulus presented visually

through MR-compatible video goggles (Resonance

Technology, Los Angeles, CA, USA). In the NC task,

subjects were asked to state whether the word stimulus

referred to either a living (“alive”, e.g., a dog) or

non-living (“not alive”, e.g., a chair) object. Subject vo-

calizations were transmitted by an intercom system

(Avotec Inc., Stuart, FL, USA) to an investigator in the

MRI scanner control room who counted the number of

incorrect or omitted responses in order to verify satis-

factory task performance. Subjects were instructed to

verbalize responses with minimal movement of their

head, jaw, or lips. During the time period between visual

stimuli, subjects were asked to relax and look at a cross-

hair shown in the center of the visual field.

The language tasks were implemented as a

rapid-presentation, event-related fMRI paradigm with a

jittered inter-stimulus-interval (ISI = 8.3 ± 5.1 sec), last-

ing 7 min 20 sec (including a 10-sec pre-stimulus period

acquired to allow stabilization of the BOLD signal, ex-

cluded from analysis). A total of 50 word stimuli were

delivered during each task, and each word was shown

for 2 sec. The fMRI paradigms of the patient study lasted 5 min 20 sec (including a 10-sec pre-stimulus period), delivering 34 words in the AG task and 39 words in the NC task. The order and exact timing for delivery

of word stimuli was based on a stochastic design in-

tended to maximize the statistical significance of the

fMRI paradigm, and minimize subject’s expectation and

habituation effects. Stimulus event scheduling was performed using the Optseq2 software package (NMR Center, Massachusetts General Hospital, MA, USA). Stimulus paradigms were implemented using the Presentation software package (Version 9.70, Neurobehavioral Systems Inc., Davis, CA, USA).

3. METHODS

3.1. Concatenation of Two Tasks and Data

Pre-processing

FMRI data from two tasks were concatenated in time by

putting the NC task data at the end of the AG task data

(Figure 1). Thus the total data set of each subject was

860 volumes (430 volumes for each task). Then we used

the Statistical Parametric Mapping software package

(SPM2, Wellcome Department of Cognitive Neurology,

London, UK) to perform motion correction by realigning

the fMRI images to the first functional image.
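The volume counts and the concatenation step can be illustrated with a minimal NumPy sketch. It assumes each run is already arranged as a (volumes × voxels) array; the voxel count, the random data, and the function names are hypothetical and only stand in for real images.

```python
import numpy as np

TR_SEC = 1.0             # repetition time reported for the healthy subjects
RUN_SEC = 7 * 60 + 20    # each task run lasted 7 min 20 sec
PRESTIM_SEC = 10         # pre-stimulus period excluded from analysis


def volumes_per_task(run_sec=RUN_SEC, prestim_sec=PRESTIM_SEC, tr_sec=TR_SEC):
    """Number of analysed volumes in one task run."""
    return int(round((run_sec - prestim_sec) / tr_sec))


def concatenate_tasks(ag, nc):
    """Append the NC run to the end of the AG run along the time axis (axis 0)."""
    if ag.shape[1:] != nc.shape[1:]:
        raise ValueError("both runs must share the same voxel dimensions")
    return np.concatenate([ag, nc], axis=0)


n_vol = volumes_per_task()   # (440 s - 10 s) / 1 s per volume
n_voxels = 500               # hypothetical voxel count for the illustration
rng = np.random.default_rng(0)
ag_run = rng.standard_normal((n_vol, n_voxels))   # stand-in for AG task data
nc_run = rng.standard_normal((n_vol, n_voxels))   # stand-in for NC task data
combined = concatenate_tasks(ag_run, nc_run)
print(n_vol, combined.shape)
```

With the reported timing this gives 430 volumes per task and 860 volumes in the concatenated data set, in agreement with the figures stated above.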

The data were then sent to the Multivariate Explora-

tory Linear Optimized Decomposition into Independent

Components (MELODIC, Version 3.05) module of

FMRIB’s Software Library (FSL, Version 4.0, Oxford

Center for Functional Magnetic Resonance Imaging of

the Brain, University of Oxford, Oxford, UK) for PICA

analysis [4]. Before the PICA procedure, the following

steps were applied to the input data file: masking of

non-brain voxels by an intensity thresholding at 10%;

high-pass temporal filtering to remove low-frequency

drifts with cut-off period of 128 sec; and voxel-wise

de-meaning and variance normalization of the data.
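The three pre-processing steps can be sketched in NumPy as follows. This is an illustration under stated assumptions, not a reproduction of MELODIC's internals: the 10% mask is taken relative to the maximum mean intensity, the high-pass filter is a swapped-in discrete cosine regression with a 128 sec cut-off rather than FSL's own filter, and variance normalization is a plain division by each voxel's standard deviation. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def brain_mask(data, fraction=0.10):
    """Keep voxels whose mean intensity exceeds `fraction` of the maximum mean."""
    mean_img = data.mean(axis=0)
    return mean_img > fraction * mean_img.max()


def highpass_dct(data, tr_sec, cutoff_sec=128.0):
    """Remove slow drifts by regressing out low-frequency cosine regressors."""
    n = data.shape[0]
    # Cosine regressors whose period is longer than the cut-off (plus a constant).
    k = int(np.floor(2.0 * n * tr_sec / cutoff_sec)) + 1
    t = np.arange(n)
    basis = np.column_stack(
        [np.cos(np.pi * (2 * t + 1) * j / (2 * n)) for j in range(k)]
    )
    beta, *_ = np.linalg.lstsq(basis, data, rcond=None)
    return data - basis @ beta


def normalise(data):
    """Voxel-wise de-meaning and scaling to unit variance."""
    centred = data - data.mean(axis=0)
    std = centred.std(axis=0)
    std[std == 0] = 1.0   # avoid division by zero for constant voxels
    return centred / std


# Synthetic example: 15 "brain" voxels with a slow drift, 5 background voxels.
rng = np.random.default_rng(0)
n_vol, n_vox = 860, 20
t = np.arange(n_vol)
raw = 1000.0 + rng.standard_normal((n_vol, n_vox)) + 0.05 * t[:, None]
raw[:, :5] = 10.0 + rng.standard_normal((n_vol, 5))
mask = brain_mask(raw)
clean = normalise(highpass_dct(raw[:, mask], tr_sec=1.0))
print(mask.sum(), clean.shape)
```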

3.2. Probabilistic ICA of fMRI Data

We applied the PICA technique proposed by Beckmann

and Smith [4] to analyze the concatenated fMRI data.

The fMRI signal (X) is assumed to be generated from a

linear mixing process of the independent non-Gaussian

sources (S) by a mixing matrix (A), and corrupted by

additive Gaussian noise (η):

X = AS + η. (1)

In the PICA model (1), X is a p × n matrix denoting p

volumes (p = 860 volumes for this study) of n voxels

fMRI data, S is a q × n matrix denoting q non-Gaussian

sources (i.e., independent components, ICs), and A is a p

× q mixing matrix.
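To make the dimensions in model (1) concrete, the following sketch simulates data from it. Only p = 860 comes from this study; the number of voxels n, the number of sources q, the Laplace source distribution, and the noise level are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, q = 860, 2000, 5   # p from the study; n and q are hypothetical

S = rng.laplace(size=(q, n))              # q x n non-Gaussian sources (spatial maps)
A = rng.standard_normal((p, q))           # p x q mixing matrix (time courses)
eta = 0.5 * rng.standard_normal((p, n))   # p x n additive Gaussian noise

X = A @ S + eta                           # equation (1)
print(X.shape)
```

Each column of A plays the role of one component's time course, and each row of S is the corresponding spatial map.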

Figure 1. Concatenation of two tasks fMRI data. Red bars: word stimuli onsets.

Figure 2. Spatial maps of (A) “language COI”, and (B) “motor COI” of an example subject estimated by the PICA analysis (posterior probability thresholded at p > 0.5).

First, the number of ICs (q) was estimated using the Laplace approximation to the Bayesian evidence of the model order [4]. There were 95 ~ 166 components estimated for each subject (mean ± STD = 122 ± 20, across subjects). The fMRI data were then reduced to q dimensions by principal component analysis (PCA) and decomposed into q spatially independent components by the fastICA algorithm [12]. A de-mixing matrix W was found to generate:

Ŝ = WX, (2)

where Ŝ is a good approximation to the sources.
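The PCA reduction and the fixed-point ICA iteration leading to equation (2) can be sketched in plain NumPy. This is a simplified illustration, not the MELODIC implementation: q is supplied by hand rather than estimated from the Bayesian evidence, the noise model is ignored during unmixing, a tanh nonlinearity with symmetric decorrelation is assumed, and the data are centred before unmixing. All names and the synthetic data are hypothetical.

```python
import numpy as np


def whiten(X, q):
    """PCA-reduce the p x n data to q dimensions and whiten them."""
    Xc = X - X.mean(axis=1, keepdims=True)   # centre each row across voxels
    n = Xc.shape[1]
    cov = Xc @ Xc.T / n                      # p x p covariance
    vals, vecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    vals, vecs = vals[-q:], vecs[:, -q:]     # keep the q largest
    K = (vecs / np.sqrt(vals)).T             # q x p whitening matrix
    return K @ Xc, K, Xc


def sym_decorrelate(W):
    """Symmetric orthogonalisation: W <- (W W^T)^(-1/2) W."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return (vecs / np.sqrt(vals)) @ vecs.T @ W


def fastica(Z, n_iter=200, tol=1e-6, seed=0):
    """Symmetric fixed-point ICA on whitened data Z (q x n)."""
    q, n = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((q, q)))
    for _ in range(n_iter):
        g = np.tanh(W @ Z)
        # Fixed-point update: E{z g(w^T z)} - E{g'(w^T z)} w, for all rows at once.
        W_new = g @ Z.T / n - (1.0 - g ** 2).mean(axis=1)[:, None] * W
        W_new = sym_decorrelate(W_new)
        change = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if change < tol:
            break
    return W


# Synthetic example with hypothetical sizes.
rng = np.random.default_rng(1)
p, n, q = 40, 3000, 4
S_true = rng.laplace(size=(q, n))
A_true = rng.standard_normal((p, q))
X = A_true @ S_true + 0.1 * rng.standard_normal((p, n))

Z, K, Xc = whiten(X, q)
W = fastica(Z) @ K    # q x p de-mixing matrix, as in equation (2)
S_hat = W @ Xc        # estimated spatial maps
print(S_hat.shape)
```

Because the data are whitened and the rotation is kept orthogonal, the estimated maps are uncorrelated with unit variance; their order and sign are arbitrary, as in any ICA decomposition.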

The estimated component maps were then divided by the standard deviation of the Gaussian noise to generate z-score maps, which were subsequently thresholded at a posterior probability p > 0.5 by fitting a Gaussian/Gamma mixture model to the histogram of intensity values [4].
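The thresholding rule can be illustrated with a two-class mixture in which a standard Gaussian models background voxels and a Gamma density models activated voxels. The mixture weights and Gamma parameters below are hypothetical fixed values chosen for illustration; in practice they are fitted to the histogram of each z-score map, and the fitted model may contain further terms.

```python
import math


def gaussian_pdf(z, mu=0.0, sigma=1.0):
    """Density of the background (null) class."""
    return math.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def gamma_pdf(z, shape, scale):
    """Density of the activation class; zero for non-positive z."""
    if z <= 0.0:
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(z) - z / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def activation_posterior(z, w_null=0.9, w_act=0.1, shape=4.0, scale=1.0):
    """Posterior probability that a voxel with z-score z belongs to the activation class."""
    null = w_null * gaussian_pdf(z)
    act = w_act * gamma_pdf(z, shape, scale)
    return act / (null + act)


for z in (0.5, 2.0, 6.0):
    keep = activation_posterior(z) > 0.5   # the p > 0.5 rule used in this study
    print(z, keep)
```

A voxel is retained when the activation class is more probable than the background class, which is what the p > 0.5 threshold expresses.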

3.3. Identification of Components of Interest

Table 1. Ranking results of COIs based on the IC time courses’ average power in the frequency range of less than 0.1 Hz.

Subject   Language COI   Motor COI
1         4th            3rd
2         8th            4th
3         1st            2nd
4         11th           10th
5         14th           9th
6         5th            4th
7         5th            6th
8         1st            11th
9         6th            5th
10        23rd           25th
11        3rd            8th
12        1st            9th

Figure 3. Time courses of “language COI” (red) and “motor COI” (blue) averaged across subjects. The green plot is the expected HRF model.

We used a two-step process to identify the components of interest (COIs) from the PICA output of each subject’s data. First, the power spectral density of each component’s time course was estimated by periodogram spectral estimation. Then the average power at frequencies below 0.1 Hz was calculated for each component, and the components were ranked by this value in descending order. This frequency range was selected based on the power spectrum of the expected hemodynamic response function (HRF), which lies in the low-frequency range (< 0.1 Hz). Next, the components’ spatial maps were visually inspected in the order determined by the previous step to identify the components with activation in the putative language areas and other areas of interest.
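The spectral ranking step can be sketched as follows, assuming a periodogram computed with the FFT and a sampling interval equal to the TR (1 sec for the healthy subjects). The function names and the three synthetic time courses are hypothetical; whether the time courses were rescaled before ranking is not stated in the text, so no rescaling is applied here.

```python
import numpy as np


def low_freq_power(time_course, tr_sec, f_max=0.1):
    """Average periodogram power at frequencies below f_max (excluding 0 Hz)."""
    x = np.asarray(time_course, dtype=float)
    x = x - x.mean()
    freqs = np.fft.rfftfreq(x.size, d=tr_sec)
    psd = np.abs(np.fft.rfft(x)) ** 2 / x.size
    band = (freqs > 0) & (freqs < f_max)
    return psd[band].mean()


def rank_components(time_courses, tr_sec, f_max=0.1):
    """Component indices sorted by low-frequency power, highest first."""
    power = np.array([low_freq_power(tc, tr_sec, f_max) for tc in time_courses])
    return np.argsort(-power)


tr = 1.0
t = np.arange(860) * tr
rng = np.random.default_rng(0)
slow = np.sin(2 * np.pi * 0.03 * t)           # power concentrated below 0.1 Hz
fast = np.sin(2 * np.pi * 0.30 * t)           # power concentrated above 0.1 Hz
noise = 0.5 * rng.standard_normal(t.size)     # broadband noise
order = rank_components([slow, fast, noise], tr)
print(order)
```

The slowly varying time course is ranked first and the fast oscillation last, so visual inspection would begin with the component most likely to follow a hemodynamic response.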

3.4. GLM Analysis of fMRI Data

For comparison purposes the pre-processed data were

submitted to SPM2 for conventional GLM analysis. The

basis function consisted of the canonical HRF model

with temporal and dispersion derivatives. Run-specific

responses were modeled in an event-related design [13]

by convolving a series of Dirac delta functions, each

representing a stimulus event onset, with the basis func-

tion. After GLM, the t maps were fitted to a Gaus-

sian/Gamma mixture model and thresholded at a poste-

rior probability p > 0.5 in order to be comparable with

the PICA results.
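The construction of an event-related regressor can be sketched as below. This is a simplified illustration rather than the SPM2 procedure: it uses a double-gamma HRF with commonly quoted default shape parameters (assumed here), omits the temporal and dispersion derivatives, samples the HRF at the TR instead of at a finer time grid, and fits a single voxel by ordinary least squares. The onset times are hypothetical.

```python
import math
import numpy as np


def canonical_hrf(tr_sec, duration_sec=32.0):
    """Double-gamma HRF sampled at the TR (assumed shape parameters 6 and 16)."""
    t = np.arange(0.0, duration_sec, tr_sec)

    def gamma_pdf(t, shape):
        out = np.zeros_like(t)
        pos = t > 0
        out[pos] = np.exp((shape - 1) * np.log(t[pos]) - t[pos] - math.lgamma(shape))
        return out

    hrf = gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0
    return hrf / hrf.sum()


def event_regressor(onsets_sec, n_scans, tr_sec):
    """Convolve a train of unit impulses at the stimulus onsets with the HRF."""
    sticks = np.zeros(n_scans)
    idx = np.round(np.asarray(onsets_sec) / tr_sec).astype(int)
    sticks[idx[idx < n_scans]] = 1.0
    return np.convolve(sticks, canonical_hrf(tr_sec))[:n_scans]


def fit_glm(y, regressor):
    """Ordinary least-squares fit of [regressor, constant] to one time course."""
    design = np.column_stack([regressor, np.ones_like(regressor)])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


onsets = [5.0, 20.0, 33.0, 47.0]            # hypothetical stimulus onsets in seconds
reg = event_regressor(onsets, n_scans=60, tr_sec=1.0)
y = 2.0 * reg + 5.0                         # noise-free synthetic voxel
beta = fit_glm(y, reg)
print(beta)
```

A voxel is detected by this approach only to the extent that its time course resembles the modelled regressor.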

4. RESULTS

Figure 4. Temporal correlation coefficients between the identified COIs’ time courses and the expected HRF model for each subject (red bars: language COIs; blue bars: motor COIs).

Figure 5. Spatial maps of GLM analysis of the same example subject as in Fig. 2 (posterior probability thresholded at p > 0.5).

4.1. Spatial Maps of Identified Language and

Motor COIs

For each subject, among an average of 122 components

estimated by PICA, one component was consistently

identified with activations in the left frontal and tempo-

ral lobes, primarily in the putative language areas (infe-

rior frontal gyrus, superior temporal gyrus, and su-

pramarginal gyrus), and pre-motor area (PMA, middle

frontal gyrus). This component was named “language

COI”. Figure 2A shows the spatial maps of the language

COI of an example subject (activations in the putative

language areas and PMA are highlighted by red circles).

Another component was identified with bilateral activa-

tions mainly in the primary face motor areas (precentral

gyrus), which was named “motor COI”. Figure 2B

shows the spatial maps of the motor COI of the same

subject.

The ranking results of the language and motor COIs based on the average power of their time courses are listed in Table 1 for each subject. These COIs ranked within the first 15 ICs for every subject except one (#10), whose language COI ranked 23rd and whose motor COI ranked 25th.
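The statement about the ranks can be checked directly from the values in Table 1. The short sketch below transcribes the twelve pairs of ranks as integers; the variable names are hypothetical.

```python
# Ranks transcribed from Table 1, subjects 1 to 12.
language_rank = [4, 8, 1, 11, 14, 5, 5, 1, 6, 23, 3, 1]
motor_rank = [3, 4, 2, 10, 9, 4, 6, 11, 5, 25, 8, 9]

# Subjects for whom either COI falls outside the first 15 ICs.
outside_top15 = [
    subject
    for subject, (lang, motor) in enumerate(zip(language_rank, motor_rank), start=1)
    if max(lang, motor) > 15
]
print(outside_top15)
```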

4.2. Time Courses of Identified COIs

Figure 6. Spatial maps of (A) “language COI”, and (B) “motor COI” of the patient estimated by the PICA analysis (posterior probability thresholded at p > 0.5).

Figure 3 shows the time courses of the language (red)

and motor (blue) COIs averaged across all subjects. The

expected HRF model is also shown (green).

Figure 4 shows the results of the temporal correlation analysis between the identified COIs’ time courses and the expected HRF model for each subject. In 7 subjects, the time course of the motor COI correlated more closely with the expected HRF model than that of the language COI. The correlation coefficients were 0.30 ± 0.17 (mean ± STD, across subjects) for the language COIs, and 0.38 ± 0.25 for the motor COIs.
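The temporal correlation used here is taken to be the Pearson correlation coefficient between a component's time course and the modelled response; that reading is an assumption, as the text does not name the coefficient. A minimal sketch with hypothetical synthetic signals:

```python
import numpy as np


def temporal_correlation(time_course, model):
    """Pearson correlation between a component time course and the model."""
    a = np.asarray(time_course, dtype=float)
    b = np.asarray(model, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))


t = np.arange(860)
model = np.sin(2 * np.pi * 0.02 * t)        # stand-in for the expected response
rng = np.random.default_rng(0)
noisy = model + rng.standard_normal(t.size)

r_same = temporal_correlation(3.0 * model + 2.0, model)   # scaled copy of the model
r_noisy = temporal_correlation(noisy, model)              # model buried in noise
print(round(r_same, 3), r_same > r_noisy)
```

The coefficient is insensitive to scaling and offset, so it reflects only how closely the shape of a time course follows the model.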

4.3. Comparison of PICA and GLM Results

Figure 5 shows the spatial maps of the GLM results of

the same example subject. Compared with the PICA

maps, the GLM maps identified activation patterns very similar to those of the motor COI (Figure 2B), with bilateral activations mainly in the primary face motor areas.

The GLM maps showed weak activations in the putative

language areas.

4.4. Results of Patient Data

Figure 7. Two noise components showing signals from (A)

the brain edge; and (B) the ventricle system.

The spatial maps of the language and motor COIs resulting from PICA analysis of the patient data are shown in Figure 6. The “language COI” (Figure 6A) showed significant activations in the left inferior frontal gyrus and left PMA. The “motor COI” (Figure 6B) showed bilateral activations in the primary face motor areas. Temporal correlation analysis indicated that the language COI correlated poorly with the task timing (r = 0.15, p < 0.01), while the motor COI correlated more strongly with the task timing (r = 0.45, p < 0.0001). Using the same pre-processed data, GLM-generated maps (not shown) revealed activations mainly in the bilateral motor area, but almost no activations in the putative language areas.

Figure 7 shows two noise components indicating sig-

nals from the edge of the brain (due to motion artifact,

Figure 7A) and the ventricle system (Figure 7B).

5. DISCUSSION

In this study, we propose a new analysis strategy using PICA and combining fMRI data from different language tasks to better identify the essential language areas for pre-operative language mapping. A component was consistently identified in the putative language areas, suggesting that the underlying system was essential to both tasks. Compared with the conventional GLM method, PICA performed better in extracting language activations and in separating noise into different components.

It is clinically useful to demonstrate brain regions critical for language function so that they may be avoided during resection [3]. Whereas clinical tests which block neural activity (IAT and ECS) are able to demonstrate the necessity of particular areas for language function, fMRI maps may include non-language-specific areas that participate in the task [7]. To try to demonstrate language-specific areas, we concatenated the fMRI data from two different tasks in time, and applied PICA to estimate the spatially independent components from the data. The underlying assumption is that the networks revealed by PICA are activated by both tasks and are therefore more likely to be essential language areas. The results of all subjects consistently identified a network in the putative language areas and the ipsilateral PMA. Another network was found in the bilateral face motor areas, indicating the participation of these areas in overt language production.

The temporal correlation analysis on the time courses of the “language COIs” showed a relatively low correlation with the expected HRF model, while the “motor COIs” correlated more closely with the HRF model. This indicated that the temporal profile of the language activation may differ from the task timing, while the motor activation followed the task very well. Since the GLM method is based on the expected HRF model, it identifies voxels whose time courses correlate highly with the paradigm; this may explain the observation that the GLM identified activation patterns very similar to the “motor COIs”, but did not show significant activations in the language areas. PICA’s ability to capture the neural networks whose temporal activities may deviate from the time course of the paradigm suggests that the data-driven method may be well-suited to the analysis of complex event-related language fMRI. The data-driven method could be particularly useful in analyzing patient data, since lesions adjacent to the language areas may result in alterations in the BOLD response [14], and patients may have difficulty with task performance, causing altered timing of the cognitive process relative to the model.

Identifying components of interest from a large number of separated components remains a practical challenge for the ICA technique, and methods have been proposed based on spatial, temporal, and spectral criteria [5-7, 15]. In this study, we ranked the ICs based on their time courses’ average power in the frequency range of less than 0.1 Hz. This frequency range was chosen based on the power spectrum of the expected HRF model. The language and motor COIs were among the highly ranked ICs, which supported the effectiveness of this selection criterion. Among other highly ranked ICs were default-mode networks as shown in [8], and low-frequency head motion artifact.

Vocalized event-related language paradigms offer an

advantage of more closely simulating natural language

performance. However, the motion artifact resulting

from vocalizing responses may lead to contamination in

the statistical maps. PICA has the ability to separate sig-

nals from motion artifact, noise, and physiological ef-

fects, into different components, and therefore may be

particularly applicable to vocalized language fMRI data.

This proposed combination strategy can be extended

to fMRI data from multiple tasks. To improve the effec-

tiveness of this strategy, the different tasks should be

selected in an optimal way. In this study, the AG task

focused on both the receptive and expressive aspects of

language function, and the NC task was more involved

in the receptive aspect. Future work will be directed at

the optimal combination of task paradigms, as well as

validation of this approach against invasive testing in

patients.

6. CONCLUSION

We propose a new analysis strategy to identify essential

language areas by combining fMRI data from two dif-

ferent tasks. We applied a data-driven method, PICA,

based on the assumption that the separated spatially in-

dependent networks were activated by both tasks. The

results show that using this approach, the language

component was consistently identified and separated

from the participating networks. This approach com-

pares favorably with GLM for complex event-related

language paradigms, and may be particularly helpful for

patient studies for pre-operative language mapping.

ACKNOWLEDGEMENT

This study is supported by NIH K08 NS048063, NIH-NCRR U41

RR019703, and The Brain Science Foundation.

REFERENCES

[1] J.R. Binder, S.J. Swanson, T.A. Hammeke, G.L. Morris, W.M.

Mueller, M. Fischer, S. Benbadis, J.A. Frost, S.M. Rao, and V.M.

Haughton, (1996) “Determination of language dominance using

functional MRI: a comparison with the Wada test,” Neurology,

vol. 46, 978-984.

[2] C. Stippich, N. Rapps, J. Dreyhaupt, A. Durst, B. Kress, E. Nennig, V.M. Tronnier, and K. Sartor, (2007) “Localizing and lateralizing language in patients with brain tumors: feasibility of routine preoperative functional MR imaging in 81 consecutive patients,” Radiology, vol. 243, 828-836.

[3] S. Tharin and A. Golby, (2007) “Functional brain mapping and its

applications to neurosurgery,” Neurosurgery, vol. 60, 185-201.

[4] C.F. Beckmann and S.M. Smith, (2004) “Probabilistic independ-

ent component analysis for functional magnetic resonance imag-

ing,” IEEE Trans. Med. Imaging, vol. 23, 137-152.

[5] M.J. McKeown, S. Makeig, G.G. Brown, T.P. Jung, S.S. Kinder-

mann, A.J. Bell, and T.J. Sejnowski, (1998) “Analysis of fMRI

data by blind separation into independent spatial components,”

Hum. Brain Mapp., vol. 6, 160-188.

[6] V.D. Calhoun, T. Adali, G.D. Pearlson, and J.J. Pekar, (2001) “A

method for making group inferences from functional MRI data

using independent component analysis,” Hum. Brain Mapp., vol.

14, 140-151.

[7] Y. Tie, S. Whalen, R.O. Suarez, and A.J. Golby, (2008) “Group

independent component analysis of language fMRI from word

generation tasks,” Neuroimage, vol. 42, 1214-1225.

[8] M.D. Greicius, G. Srivastava, A.L. Reiss, and V. Menon, (2004) “Default-mode network activity distinguishes Alzheimer’s disease from healthy aging: Evidence from functional MRI,” Proc. Natl. Acad. Sci. U.S.A., vol. 101, 4637-4642.

[9] C.F. Beckmann, M. DeLuca, J.T. Devlin, and S.M. Smith, (2005)

“Investigations into resting-state connectivity using independent

component analysis,” Phil. Trans. R. Soc. B, vol. 360, 1001-1013.

[10] M. Fukunaga, S.G. Horovitz, P. Van Gelderen, J.A. de Zwart, J.M.

Jansma, V.N. Ikonomidou, R. Chu, R.H.R. Deckers, D.A. Leo-

pold, and J.H. Duyn, (2006) “Large-amplitude, spatially corre-

lated fluctuations in BOLD fMRI signals during extended rest and

early sleep stages,” Magn. Reson. Imaging, vol. 24, 979-992.

[11] D. Sridharan, D.J. Levitin, C.H. Chafe, J. Berger, and V. Menon, (2007) “Neural dynamics of event segmentation in music: Converging evidence for dissociable ventral and dorsal networks,” Neuron, vol. 55, 521-532.

[12] A. Hyvarinen, (1999) “Fast and robust fixed-point algorithms for

independent component analysis,” IEEE Trans. Neural. Netw., vol.

10, 626-634.

[13] K.J. Friston, P. Fletcher, O. Josephs, A. Holmes, M.D. Rugg, and

R. Turner, (1998) “Event-related fMRI: characterizing differential

responses,” Neuroimage, vol. 7, 30-40.

[14] A.I. Holodny, M. Schulder, W.C. Liu, J. Wolko, J.A. Maldjian,

and A.J. Kalnin, (2000) “The effect of brain tumors on BOLD

functional MR imaging activation in the adjacent motor cortex:

implications for image-guided neurosurgery,” AJNR Am. J. Neu-

roradiol., vol. 21, 1415-1422.

[15] C.H. Moritz, B.P. Rogers, and M.E. Meyerand, (2003) “Power

spectrum ranked independent component analysis of a periodic

fMRI complex motor paradigm,” Hum. Brain Mapp., vol. 18,

111-122.