Neurosynth: Frequently Asked Questions


What is Neurosynth?
Neurosynth is a platform for automatically synthesizing the results of many different neuroimaging studies.
How do I get more help?
This page consolidates questions about the platform in general. There are additional FAQs on most of the main pages on the site; look for the "FAQs" tab on the 'Analyses', 'Locations', and 'Decoder' pages. Beyond that, you're welcome to email the mailing list, or (for bug reports, feature requests, etc.) open an issue on GitHub. Since this project is currently unfunded, and resources are limited, we can't guarantee a rapid response, but we'll do our best.


How are the meta-analysis images generated?
A full explanation is in our Nature Methods paper. Briefly, the processing stream involves the following steps:
  • Activation coordinates are extracted from published neuroimaging articles using an automated parser.
  • The full text of all articles is parsed, and each article is 'tagged' with a set of terms that occur at a high frequency in that article.
  • A list of several thousand terms that occur at high frequency in 20 or more studies is generated.
  • For each term of interest (e.g., 'emotion', 'language', etc.), the entire database of coordinates is divided into two sets: those that occur in articles containing the term, and those that don't.
  • A giant meta-analysis is performed comparing the coordinates reported for studies with and without the term of interest. In addition to producing statistical inference maps (i.e., z and p value maps), we also compute posterior probability maps, which display the likelihood of a given term being used in a study if activation is observed at a particular voxel.
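The steps above can be sketched for a single term and a single voxel as follows. Note that this is a toy illustration with invented data and variable names, not the actual Neurosynth pipeline:

```python
# Toy sketch of the term-based split and voxel-level comparison described
# above (invented data; not the real Neurosynth code).
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)
n_studies = 1000
uses_term = rng.random(n_studies) < 0.2        # does 'emotion' occur at high frequency?

# Toy assumption: studies using the term activate this voxel 50% of the
# time; other studies only 10% of the time.
activates_voxel = np.where(uses_term,
                           rng.random(n_studies) < 0.5,
                           rng.random(n_studies) < 0.1)

# 2x2 contingency table: term use x voxel activation
table = np.array([
    [np.sum(uses_term & activates_voxel),  np.sum(uses_term & ~activates_voxel)],
    [np.sum(~uses_term & activates_voxel), np.sum(~uses_term & ~activates_voxel)],
])
chi2, p, _, _ = chi2_contingency(table)        # statistical inference map value

# Posterior probability map value: P(term | activation), via Bayes' rule
p_term_given_act = (activates_voxel[uses_term].mean() * uses_term.mean()
                    / activates_voxel.mean())
```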
Doesn't the coordinate extraction process have errors?
Yes. Our parser extracts coordinates from HTML articles, which presents some difficulties, because different publishers and journals format tables very differently. Moreover, many articles have idiosyncratic formatting or errors in the table HTML that can prevent data extraction. What this means is that the quality of any individual extracted study in the database is inevitably lower than it would be in a manually curated database; the results look as sensible as they do largely because of the sheer volume of data we're aggregating across.
I found an error in the coordinate data for a paper. How can I fix it?
Unfortunately, we don't have any way for users to contribute to the manual verification and validation of coordinates right now. We realize that this is less than ideal, and we'd love to move to a crowd-sourced model where other researchers can help validate and extend the database via a web interface. One such effort is brainspell, where you can manually annotate coordinates and articles in the Neurosynth database; we eventually plan to integrate the annotations on brainspell into Neurosynth. In the meantime, if you'd like to help, you can email us with the details of any problems you identify. It's particularly helpful if you can identify systematic problems with the parser (e.g., that it consistently misses the third column in a particular kind of table in ScienceDirect articles); we're unlikely to address errors on a case-by-case basis unless they're part of a broader pattern. Of course, if you'd like to dive in and help write a better parser, that's even better--all of the Automated Coordinate Extraction (ACE) code is available on GitHub.
How come my study isn't in the database?
For technical reasons, the database currently includes only a subset of published neuroimaging studies. Specifically, to be included, a study has to (a) be in a journal that we currently cover (each new journal or publisher requires us to write a custom filter, which isn't trivial); (b) be available online in HTML form (we currently don't work with PDFs, and some journals only have HTML versions for more recent articles); and (c) contain at least one activation coordinate that's detectable by our parser. Aside from these criteria, there are a small number of studies that present special problems for various reasons (< 2% of the total). But rest assured that we don't deliberately exclude any study from the database, so if there's a study you're interested in that isn't represented, odds are it just isn't covered yet. The database is continually growing as we have time to write filters for new journals, so check back from time to time.
How do you deal with the fact that different studies report coordinates in different stereotactic spaces?
The results you see incorporate an automated correction for differences in stereotactic space. Specifically, we use a simple algorithm to determine whether a study is reporting results in MNI or Talairach space, and we convert Talairach results to MNI using the Lancaster et al (2007) transform. This doesn't work perfectly, but we can correctly detect the space being used about 80% of the time. In our manuscript, we report some supplemental analyses validating this approach, but there's no question that the results you see on Neurosynth are spatially noisy. We don't recommend drawing any strong conclusions about fine-grained localization based on anything you find on this website.
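Mechanically, a space conversion like this is just a matter of multiplying each coordinate by a 4x4 affine matrix. The sketch below illustrates only the mechanics: the matrix shown is an invented placeholder, not the actual Lancaster et al. (2007) coefficients, which you should take from the published paper for any real conversion:

```python
# Mechanics of a Talairach -> MNI conversion. WARNING: this 4x4 matrix is an
# invented placeholder for illustration; substitute the published Lancaster
# et al. (2007) coefficients for real use.
import numpy as np

TAL2MNI_PLACEHOLDER = np.array([
    [ 1.06,  0.00, -0.03, -1.1],
    [ 0.01,  1.04,  0.09, -1.4],
    [-0.01, -0.09,  1.11,  3.6],
    [ 0.00,  0.00,  0.00,  1.0],
])

def apply_affine(coords, affine):
    """Apply a 4x4 affine to an (N, 3) array of x/y/z coordinates."""
    coords = np.atleast_2d(np.asarray(coords, dtype=float))
    homogeneous = np.column_stack([coords, np.ones(len(coords))])
    return (homogeneous @ affine.T)[:, :3]

mni = apply_affine([[-42, 18, 24]], TAL2MNI_PLACEHOLDER)
```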
How do you distinguish between activations and deactivations when extracting coordinates from published articles?
We don't. The parser is dumb: it treats anything that looks like a coordinate in a table as a relevant activation, with no understanding of what that activation means--so deactivations are included right alongside activations.
Are individual words or phrases really a good proxy for cognitive processes? Can you really say that studies that use the term 'pain' at a certain frequency are about pain?
Depends. There's no question that the 'lexical assumption' we make will fail in many individual cases. For instance, many studies might use the word 'pain' in the context of emotional rather than physical pain, which might partly explain why we see a fair amount of amygdala activation in the meta-analysis map for pain (though much of that probably reflects the fact that pain is a salient and affectively negative signal). But ultimately it's an empirical question, and for the most part, the answer seems to be that, at least for relatively broad domains (e.g., working memory, language, emotion, etc.), individual terms are a reasonably good substitute for much more careful coding of studies. If you compare the maps on this website with those that have been previously published in much more careful analyses (see our manuscript for examples), we think the results converge reasonably closely. So while we certainly don't think the current lexical approach is ideal, we feel pretty good about using words as a proxy for processes. The main limitation right now is a lack of specificity (i.e., narrower constructs like disgust are difficult to isolate from broader ones like negative emotion).
Isn't selection bias likely to play a role here? If everyone thinks that the amygdala is involved in emotion, isn't it likely that an automated meta-analysis will simply capture that bias?
Yes, but we don't see this as any more of a concern for this type of analysis than for others. There's no question that biases in the primary literature skew the results of automated meta-analyses. We don't doubt that there's a general tendency towards confirmation bias, so that researchers are more likely than they should be to report that the amygdala is involved in emotion, that the left IFG is involved in speech production, that S2 is involved in tactile sensation, and so on. But it's important to recognize that this isn't really a limitation of our particular approach, but a much more general problem. We all know, for example, that researchers are much more likely to look for and report emotion-related activation in the amygdala than in, say, motor cortex. Confirmation bias is unquestionably a huge problem for an automated meta-analysis, but we think it's just as big a problem for individual fMRI studies or other kinds of meta-analyses.
How do you decide which terms to keep?

We're not very principled about this. Every time the database expands, we generate a list of all possible terms that meet some minimum cut-off for number of studies. Then, we manually go through the list of several thousand terms and keep only the ones that seem to us to carry psychological or anatomical content (e.g., 'emotion' and 'language' would make the cut; 'activation' and 'necessary' wouldn't). It's entirely possible (likely, in fact) that we miss some terms that other people would include, or include some terms that don't have any interesting content.


Why do images take a while to load?

There are three reasons why images may take a while to display in the interactive viewer when you first load a page. First, the image files themselves are fairly large (typically around 1 MB each), and most pages on Neurosynth display multiple images simultaneously. This means that your browser is usually receiving 1–3 MB of data on each request. Needless to say, the images can't be rendered until they've been received, so users on slower connections may have to wait a bit.

Second, there's some overhead incurred by reading the binary Nifti image into memory in JavaScript; on most browsers, this adds another second or two of delay.

Lastly, some images are only generated on demand--for example, if you're the first user to visit a particular brain location, you'll enjoy the privilege of waiting for the coactivation image for that voxel to be generated anew. This will typically take only a second or two, but can take as long as a minute in rare cases (specifically, when too many users are accessing Neurosynth and we need to spin up another background worker to handle the load).

When I enter coordinates manually, they're rounded to different numbers!
Because the overlays have a resolution of 2mm, when you enter a set of coordinates manually, they'll be rounded to the nearest point we have data for. There are no available data at odd coordinates (e.g., [-11, 35, 27]).
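If you want to reproduce the rounding yourself, snapping a coordinate to a 2 mm grid looks something like the sketch below. This is our illustration, not the viewer's actual code, and ties at odd coordinates resolve here via Python's round-half-to-even rule, which may differ from what the viewer does:

```python
# Illustrative sketch: snap manually entered MNI coordinates to a 2 mm grid.
# Ties at odd coordinates resolve via Python's round-half-to-even; the
# viewer's exact tie-breaking rule may differ.
def snap_to_grid(coords, voxel_size=2):
    return [int(round(c / voxel_size)) * voxel_size for c in coords]

print(snap_to_grid([-11, 35, 27]))   # -> [-12, 36, 28]: every value becomes even
```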
Something broke!
Please report bugs (and also feature requests) on the GitHub repository.
What happened to the 'Custom Analysis' interface?

Custom online analyses are no longer supported on the website. This functionality was experimental, and we've decided to take it offline for now. The precipitating reasons for this were that (a) the interface was buggy, (b) the results were insensible much (most?) of the time, and (c) what was intended to be purely an exploratory, experimental tool was being used to generate questionable results that users were, in our view, reading far too much into (and in some cases, reporting in publications).

We're continuing to develop a more comprehensive and robust suite of tools for online meta-analysis, but for the time being, if you want to conduct custom analyses using Neurosynth data, you'll have to use the Python core tools (which contain all of the same functionality and more, but require some basic Python proficiency to use).


How are the images thresholded?

The images you see are thresholded to correct for multiple comparisons. We use a false discovery rate (FDR) criterion of .01, meaning that, in theory, and on average, you can expect about 1% of the voxels you see activated in any given map to be false positives (though the actual proportion will vary and is impossible to determine). That said, we haven't done any extensive investigation to determine how well these thresholds are calibrated with gold-standard approaches based on permutation, so it's possible the thresholds might be somewhat more liberal or conservative than the nominal value suggests.
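For reference, the general idea behind an FDR threshold is the Benjamini-Hochberg procedure, sketched below. This is our own illustration of the principle, not Neurosynth's actual thresholding code:

```python
# Sketch of the Benjamini-Hochberg FDR procedure (an illustration of the
# general idea, not Neurosynth's actual implementation).
import numpy as np

def fdr_threshold(p_values, q=0.01):
    """Largest p-value cutoff satisfying the BH criterion; None if nothing survives."""
    p = np.sort(np.asarray(p_values))
    n = len(p)
    below = p <= q * np.arange(1, n + 1) / n
    if not below.any():
        return None
    return p[np.nonzero(below)[0].max()]

p_vals = [0.0001, 0.0005, 0.002, 0.008, 0.04, 0.3, 0.9]
cutoff = fdr_threshold(p_vals, q=0.01)   # 0.002: the three smallest p-values survive
```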

What format are the downloadable images in?

When you click the download link below the map you're currently viewing, you'll start downloading a 3D image corresponding to the map you currently have loaded. The downloaded images are in NIFTI format, and the files are gzipped to conserve space. All major neuroimaging software packages should be able to read these images in without any problems. The images are all nominally in MNI152 2mm space (the default space in SPM and FSL), though there's a bit more to it than that, because technically we don't account very well for stereotactic differences between studies in the underlying database (we convert Talairach to MNI, but it's imperfect, and we don't account for more subtle differences between, e.g., FSL and SPM templates). For a more detailed explanation, see the paper.

Note that the downloaded images are not dynamically adjusted to reflect the viewing options you currently have set in your browser. For instance, if you've adjusted the settings to only display negative activations at a threshold of z = -7 or lower, clicking the download link won't give you an image with only extremely strong negative activations--it'll give you the original (FDR-corrected) image. Of course, you can easily recreate what you're seeing in your browser by adjusting the thresholds correspondingly in your off-line viewer.
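For example, recreating a "negative activations only" view is a one-liner once the downloaded image is loaded into an array. The array below is a made-up stand-in for the real voxel data you'd get from your analysis package:

```python
# Recreate a 'show only strong negative activations' view offline. `z_map`
# is a made-up stand-in for the voxel array you'd get after loading the
# downloaded NIfTI file in your analysis package of choice.
import numpy as np

z_map = np.array([-9.1, -2.3, 0.0, 4.8, -7.0])      # toy voxel values
negative_only = np.where(z_map <= -7, z_map, 0.0)   # keep z <= -7, zero the rest
```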

What do the 'uniformity test' and 'association test' descriptions mean in the feature image names?

Short answer:

  • uniformity test map: z-scores from a one-way chi-square test of whether the proportion of studies that report activation at a given voxel differs from the rate that would be expected if activations were uniformly distributed throughout gray matter.
  • association test map: z-scores from a two-way chi-square test for the presence of a non-zero association between term use and voxel activation.

Long answer: The uniformity test and association test maps are statistical inference maps; they display z-scores for two different kinds of analyses. The uniformity test map can be interpreted in roughly the same way as most standard whole-brain fMRI analyses: it displays the degree to which each voxel is consistently activated in studies that use a given term. For instance, the fact that the uniformity test map for the term 'emotion' displays high z-scores in the amygdala implies that studies that use the word emotion a lot tend to consistently report activation in the amygdala--at least, more consistently than one would expect if activation were uniformly distributed throughout gray matter. Note that, unlike most meta-analysis packages (e.g., ALE or MKDA), z-scores aren't generated through permutation, but using a chi-square test. (We use the chi-square test solely for pragmatic reasons: we generate thousands of maps at a time, so it's not computationally feasible to run thousands of permutations for each one.)

The association test maps provide somewhat different (and, in our view, typically more useful) information. Whereas the uniformity test maps tell you about the consistency of activation for a given term, the association test maps tell you whether activation in a region occurs more consistently for studies that mention the current term than for studies that don't. So for instance, the fact that the amygdala shows a large positive z-score in the association test map for emotion implies that studies whose abstracts include the word 'emotion' are more likely to report amygdala activation than studies whose abstracts don't include the word 'emotion'. That's important, because it controls for base rate differences between regions. That is, some regions (e.g., dorsal medial frontal cortex and lateral PFC) play a very broad role in cognition, and hence tend to be consistently activated for many different terms, despite lacking selectivity. The association test maps thus let you make slightly more confident claims that a given region is involved in a particular process, and isn't involved in just about every task.
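A toy numerical example (with entirely made-up counts) shows why this matters: a hypothetical region that activates in 60% of all studies looks highly "consistent" no matter which term you pick, yet shows zero association with the term:

```python
# Made-up counts illustrating base rates: 200 studies use the term, 800
# don't, and a hypothetical region activates in 60% of each group.
import math
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[120,  80],     # term studies: active / not active
                  [480, 320]])    # non-term studies: active / not active

# Uniformity-style question: is the overall activation rate (600/1000)
# higher than a uniform baseline (say, 5%)? Normal-approximation z-score:
p0, p_hat, n = 0.05, 600 / 1000, 1000
z_uniformity = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)   # enormous

# Association question: does activation depend on term use? Here it
# doesn't, because both groups activate at exactly the same 60% rate.
chi2, p_assoc, _, _ = chi2_contingency(table)                # chi2 = 0, p = 1
```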

What happened to the "forward inference" and "reverse inference" maps that used to be on the website?
We renamed the forward and reverse inference maps; they're now referred to as the "uniformity test" and "association test" maps, respectively (see previous question). The latter names more accurately capture what these maps actually mean. It was a mistake on our part to have used the forward and reverse inference labels; those labels should properly be reserved for probabilistic maps generated via a Bayesian estimation analysis, rather than for z-scores resulting from frequentist inferential tests. The two sets of maps are related but different. "Real" forward and reverse inference maps (to the degree that such a thing is possible to obtain using this kind of data) can be generated using the Neurosynth core tools. Be warned though that they're more difficult to interpret and use correctly, which is why we don't show them here.
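For the curious, the "real" reverse inference quantity is the posterior probability P(term | activation) obtained from Bayes' rule. Here's a minimal sketch with invented numbers; we believe the posterior probability maps assume a uniform prior over term use, but treat the specifics here as illustrative rather than a description of the core tools' implementation:

```python
# Minimal sketch of the 'real' reverse inference quantity: the posterior
# probability P(term | activation) via Bayes' rule. All numbers are invented.
def posterior_term_given_activation(p_act_given_term, p_term, p_act_given_not_term):
    p_act = p_act_given_term * p_term + p_act_given_not_term * (1 - p_term)
    return p_act_given_term * p_term / p_act

# With a uniform prior on term use (P(term) = 0.5):
post = posterior_term_given_activation(0.5, 0.5, 0.1)   # 0.25 / 0.30 ≈ 0.83
```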


How do I use the Neurosynth code to generate my own meta-analysis images?
Documentation for the Neurosynth core tools is admittedly sparse at the moment. That said, the examples folder in the GitHub repository contains a number of worked examples that illustrate various uses of the code. Probably the best place to start is with the demo Jupyter notebook, which can be viewed online.