Metaanalysis
A metaanalysis is a statistical analysis that combines the results of multiple scientific studies.
The basic tenet behind metaanalyses is that there is a common truth behind all conceptually similar scientific studies, but which has been measured with a certain error within individual studies. The aim then is to use approaches from statistics to derive a pooled estimate closest to the unknown common truth based on how this error is perceived. In essence, all existing methods yield a weighted average from the results of the individual studies and what differs is the manner in which these weights are allocated and also the manner in which the uncertainty is computed around the point estimate thus generated. In addition to providing an estimate of the unknown common truth, metaanalysis has the capacity to contrast results from different studies and identify patterns among study results, sources of disagreement among those results, or other interesting relationships that may come to light in the context of multiple studies.^{[1]}
A key benefit of this approach is the aggregation of information leading to a higher statistical power and more robust point estimate than is possible from the measure derived from any individual study. However, in performing a metaanalysis, an investigator must make choices which can affect the results, including deciding how to search for studies, selecting studies based on a set of objective criteria, dealing with incomplete data, analyzing the data, and accounting for or choosing not to account for publication bias.^{[2]}
Metaanalyses are often, but not always, important components of a systematic review procedure. For instance, a metaanalysis may be conducted on several clinical trials of a medical treatment, in an effort to obtain a better understanding of how well the treatment works. Here it is convenient to follow the terminology used by the Cochrane Collaboration,^{[3]} and use "metaanalysis" to refer to statistical methods of combining evidence, leaving other aspects of 'research synthesis' or 'evidence synthesis', such as combining information from qualitative studies, for the more general context of systematic reviews.
History
The historical roots of metaanalysis can be traced back to 17th-century studies of astronomy,^{[4]} while a paper published in 1904 by the statistician Karl Pearson in the British Medical Journal,^{[5]} which collated data from several studies of typhoid inoculation, is seen as the first time a metaanalytic approach was used to aggregate the outcomes of multiple clinical studies.^{[6]}^{[7]} The first metaanalysis of all conceptually identical experiments concerning a particular research issue, conducted by independent researchers, has been identified as the 1940 book-length publication Extrasensory Perception After Sixty Years, authored by Duke University psychologists J. G. Pratt, J. B. Rhine, and associates.^{[8]} This encompassed a review of 145 reports on ESP experiments published from 1882 to 1939, and included an estimate of the influence of unpublished papers on the overall effect (the file-drawer problem). Although metaanalysis is widely used in epidemiology and evidence-based medicine today, a metaanalysis of a medical treatment was not published until 1955. In the 1970s, more sophisticated analytical techniques were introduced in educational research, starting with the work of Gene V. Glass, Frank L. Schmidt and John E. Hunter.
The term "metaanalysis" was coined by Gene V. Glass,^{[9]} who was the first modern statistician to formalize its use. He states: "my major interest currently is in what we have come to call ...the metaanalysis of research. The term is a bit grand, but it is precise and apt ... Metaanalysis refers to the analysis of analyses". Although this led to him being widely recognized as the modern founder of the method, the methodology behind what he termed "metaanalysis" predates his work by several decades.^{[10]}^{[11]} The statistical theory surrounding metaanalysis was greatly advanced by the work of Nambury S. Raju, Larry V. Hedges, Harris Cooper, Ingram Olkin, John E. Hunter, Jacob Cohen, Thomas C. Chalmers, Robert Rosenthal, Frank L. Schmidt, and Douglas G. Bonett.
Advantages
Conceptually, a metaanalysis uses a statistical approach to combine the results from multiple studies in an effort to increase power (over individual studies), improve estimates of the size of the effect and/or to resolve uncertainty when reports disagree. A metaanalysis is a statistical overview of the results from one or more systematic reviews. Basically, it produces a weighted average of the included study results and this approach has several advantages:
 Results can be generalized to a larger population.
 The precision and accuracy of estimates can be improved as more data are used. This, in turn, may increase the statistical power to detect an effect.
 Inconsistency of results across studies can be quantified and analyzed. For instance, does inconsistency arise from sampling error, or are study results (partially) influenced by between-study heterogeneity?
 Hypothesis testing can be applied to summary estimates.
 Moderators can be included to explain variation between studies.
 The presence of publication bias can be investigated.
Problems
A metaanalysis of several small studies does not predict the results of a single large study.^{[12]} Some have argued that a weakness of the method is that sources of bias are not controlled by the method: a good metaanalysis cannot correct for poor design or bias in the original studies.^{[13]} This would mean that only methodologically sound studies should be included in a metaanalysis, a practice called 'best evidence synthesis'.^{[13]} Other metaanalysts would include weaker studies and add a study-level predictor variable that reflects the methodological quality of the studies, in order to examine the effect of study quality on the effect size.^{[14]} However, others have argued that a better approach is to preserve information about the variance in the study sample, casting as wide a net as possible, and that methodological selection criteria introduce unwanted subjectivity, defeating the purpose of the approach.^{[15]}
Publication bias: the file drawer problem
Another potential pitfall is the reliance on the available body of published studies, which may create exaggerated outcomes due to publication bias, as studies which show negative or non-significant results are less likely to be published. For example, pharmaceutical companies have been known to hide negative studies, and researchers may have overlooked unpublished studies such as dissertation studies or conference abstracts that did not reach publication. This is not easily solved, as one cannot know how many studies have gone unreported.^{[16]}
This file drawer problem (characterized by negative or non-significant results being tucked away in a cabinet) can result in a biased distribution of effect sizes, thus creating a serious base rate fallacy in which the significance of the published studies is overestimated, as other studies were either not submitted for publication or were rejected. This should be seriously considered when interpreting the outcomes of a metaanalysis.^{[16]}^{[17]}
The distribution of effect sizes can be visualized with a funnel plot which (in its most common version) is a scatter plot of standard error versus the effect size. It makes use of the fact that the smaller studies (thus larger standard errors) have more scatter of the magnitude of effect (being less precise) while the larger studies have less scatter and form the tip of the funnel. If many negative studies were not published, the remaining positive studies give rise to a funnel plot in which the base is skewed to one side (asymmetry of the funnel plot). In contrast, when there is no publication bias, the effect of the smaller studies has no reason to be skewed to one side and so a symmetric funnel plot results. This also means that if no publication bias is present, there would be no relationship between standard error and effect size.^{[18]} A negative or positive relation between standard error and effect size would imply that smaller studies that found effects in one direction only were more likely to be published and/or to be submitted for publication.
Apart from the visual funnel plot, statistical methods for detecting publication bias have also been proposed. These are controversial because they typically have low power for detection of bias, but may also produce false positives under some circumstances.^{[19]} For instance, small study effects (biased smaller studies), wherein methodological differences between smaller and larger studies exist, may cause asymmetry in effect sizes that resembles publication bias. However, small study effects may be just as problematic for the interpretation of metaanalyses, and the onus is on metaanalytic authors to investigate potential sources of bias.
A Tandem Method for analyzing publication bias has been suggested for cutting down false positive error problems.^{[20]} This Tandem Method consists of three stages. Firstly, one calculates Orwin's fail-safe N, to check how many studies should be added in order to reduce the test statistic to a trivial size. If this number of studies is larger than the number of studies used in the metaanalysis, it is a sign that there is no publication bias, as in that case one needs a lot of studies to reduce the effect size. Secondly, one can do an Egger's regression test, which tests whether the funnel plot is symmetrical. As mentioned before, a symmetrical funnel plot is a sign that there is no publication bias, as the effect size and sample size are not dependent. Thirdly, one can do the trim-and-fill method, which imputes data if the funnel plot is asymmetrical.
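The first two stages can be illustrated with a short sketch. It is not taken from the cited source: it assumes the commonly used form of Orwin's fail-safe N, in which the missing studies are given a mean effect of zero by default, and the usual formulation of Egger's test as an ordinary least-squares regression of the standardized effect (effect divided by standard error) on precision (one over standard error), where an intercept far from zero suggests funnel plot asymmetry. The function names and the example numbers are hypothetical.

```python
import math

def orwin_failsafe_n(k, mean_effect, trivial_effect, missing_effect=0.0):
    """Number of missing studies (with mean effect `missing_effect`) needed
    to pull the mean effect of k studies down to `trivial_effect`."""
    if trivial_effect == missing_effect:
        raise ValueError("trivial_effect must differ from missing_effect")
    return k * (mean_effect - trivial_effect) / (trivial_effect - missing_effect)

def egger_regression(effects, std_errors):
    """Regress effect/SE on 1/SE by ordinary least squares.
    Returns (intercept, slope, standard error of the intercept)."""
    n = len(effects)
    if n < 3 or n != len(std_errors):
        raise ValueError("need at least three studies with matching inputs")
    x = [1.0 / s for s in std_errors]                    # precision
    y = [e / s for e, s in zip(effects, std_errors)]     # standardized effect
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    resid_var = sum(r * r for r in residuals) / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, slope, se_intercept

# Hypothetical data: ten studies with a mean effect of 0.5, trivial size 0.1.
failsafe_n = orwin_failsafe_n(k=10, mean_effect=0.5, trivial_effect=0.1)

ses = [0.05, 0.1, 0.2, 0.4]
# Same effect in every study: no relationship between effect and standard error.
symmetric = egger_regression([0.3, 0.3, 0.3, 0.3], ses)
# Effects grow with the standard error: small studies report larger effects.
asymmetric = egger_regression([0.35, 0.4, 0.5, 0.7], ses)

print(failsafe_n, symmetric[0], asymmetric[0])
```

In the asymmetric example the intercept moves away from zero while the slope still recovers the underlying effect, which is the pattern the regression test looks for. A formal test would compare the intercept with its standard error against a t distribution with n - 2 degrees of freedom.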
The problem of publication bias is not trivial as it is suggested that 25% of metaanalyses in the psychological sciences may have suffered from publication bias.^{[20]} However, low power of existing tests and problems with the visual appearance of the funnel plot remain an issue, and estimates of publication bias may remain lower than what truly exists.
Most discussions of publication bias focus on journal practices favoring publication of statistically significant findings. However, questionable research practices, such as reworking statistical models until significance is achieved, may also favor statistically significant findings in support of researchers' hypotheses.^{[21]}^{[22]}
It is not uncommon for studies to omit effects that do not reach statistical significance. For example, they may simply say that the groups did not show statistically significant differences, without reporting any other information (e.g. a statistic or p-value). Exclusion of these studies would lead to a situation similar to publication bias, but their inclusion (assuming null effects) would also bias the metaanalysis. A new method, MetaNSUE, has been shown to allow researchers to include these studies without bias.^{[23]}
Other weaknesses are that it has not been determined whether the statistically most accurate method for combining results is the fixed, IVhet, random or quality effect model, though criticism of the random effects model is mounting because of the perception that the new random effects (used in metaanalysis) are essentially formal devices to facilitate smoothing or shrinkage, and prediction may be impossible or ill-advised.^{[24]} The main problem with the random effects approach is that it uses the classic statistical thought of generating a "compromise estimator" that makes the weights close to the naturally weighted estimator if heterogeneity across studies is large but close to the inverse variance weighted estimator if the between-study heterogeneity is small. However, what has been ignored is the distinction between the model we choose to analyze a given dataset and the mechanism by which the data came into being.^{[25]} A random effect can be present in either of these roles, but the two roles are quite distinct. There is no reason to think the analysis model and data-generation mechanism (model) are similar in form, but many subfields of statistics have developed the habit of assuming, for theory and simulations, that the data-generation mechanism (model) is identical to the analysis model we choose (or would like others to choose). As a hypothesized mechanism for producing the data, the random effect model for metaanalysis is implausible, and it is more appropriate to think of this model as a superficial description and something we choose as an analytical tool. This choice may nevertheless not work for metaanalysis, because the study effects are a fixed feature of the respective metaanalysis and the probability distribution is only a descriptive tool.^{[25]}
Problems arising from agenda-driven bias
The most severe fault in metaanalysis^{[26]} often occurs when the person or persons doing the metaanalysis have an economic, social, or political agenda such as the passage or defeat of legislation. People with these types of agendas may be more likely to abuse metaanalysis due to personal bias. For example, researchers favorable to the author's agenda are likely to have their studies cherry-picked, while those not favorable will be ignored or labeled as "not credible". In addition, the favored authors may themselves be biased or paid to produce results that support their overall political, social, or economic goals, for instance by selecting small favorable data sets and not incorporating larger unfavorable data sets. The influence of such biases on the results of a metaanalysis is possible because the methodology of metaanalysis is highly malleable.^{[27]}
A 2011 study done to disclose possible conflicts of interest in underlying research studies used for medical metaanalyses reviewed 29 metaanalyses and found that conflicts of interest in the studies underlying the metaanalyses were rarely disclosed. The 29 metaanalyses included 11 from general medicine journals, 15 from specialty medicine journals, and three from the Cochrane Database of Systematic Reviews. The 29 metaanalyses reviewed a total of 509 randomized controlled trials (RCTs). Of these, 318 RCTs reported funding sources, with 219 (69%) receiving funding from industry. Of the 509 RCTs, 132 reported author conflict of interest disclosures, with 91 studies (69%) disclosing one or more authors having financial ties to industry. The information was, however, seldom reflected in the metaanalyses. Only two (7%) reported RCT funding sources and none reported RCT author-industry ties. The authors concluded "without acknowledgment of COI due to industry funding or author industry financial ties from RCTs included in metaanalyses, readers' understanding and appraisal of the evidence from the metaanalysis may be compromised."^{[28]}
For example, in 1998, a US federal judge found that the United States Environmental Protection Agency had abused the metaanalysis process to produce a study claiming cancer risks to non-smokers from environmental tobacco smoke (ETS) with the intent to influence policy makers to pass smoke-free workplace laws. The judge found that:
EPA's study selection is disturbing. First, there is evidence in the record supporting the accusation that EPA "cherry picked" its data. Without criteria for pooling studies into a metaanalysis, the court cannot determine whether the exclusion of studies likely to disprove EPA's a priori hypothesis was coincidence or intentional. Second, EPA's excluding nearly half of the available studies directly conflicts with EPA's purported purpose for analyzing the epidemiological studies and conflicts with EPA's Risk Assessment Guidelines. See ETS Risk Assessment at 429 ("These data should also be examined in the interest of weighing all the available evidence, as recommended by EPA's carcinogen risk assessment guidelines (U.S. EPA, 1986a) (emphasis added)). Third, EPA's selective use of data conflicts with the Radon Research Act. The Act states EPA's program shall "gather data and information on all aspects of indoor air quality" (Radon Research Act § 403(a)(1)) (emphasis added).^{[29]}
As a result of the abuse, the court vacated Chapters 1–6 of and the Appendices to EPA's "Respiratory Health Effects of Passive Smoking: Lung Cancer and other Disorders".^{[29]}
Steps in a metaanalysis
A metaanalysis is usually preceded by a systematic review, as this allows identification and critical appraisal of all the relevant evidence (thereby limiting the risk of bias in summary estimates). The general steps are then as follows:
 Formulation of the research question, e.g. using the PICO model (Population, Intervention, Comparison, Outcome).
 Search of literature
 Selection of studies ('incorporation criteria')
 Based on quality criteria, e.g. the requirement of randomization and blinding in a clinical trial
 Selection of specific studies on a well-specified subject, e.g. the treatment of breast cancer.
 Decide whether unpublished studies are included to avoid publication bias (file drawer problem)
 Decide which dependent variables or summary measures are allowed. For instance, when considering a metaanalysis of published (aggregate) data:
 Differences (discrete data)
 Means (continuous data)

Hedges' g is a popular summary measure for continuous data that is standardized in order to eliminate scale differences, but it incorporates an index of variation between groups:
 $\delta = \frac{\mu_t - \mu_c}{\sigma}$, in which $\mu_t$ is the treatment mean, $\mu_c$ is the control mean, and $\sigma^2$ the pooled variance.
 Selection of a metaanalysis model, e.g. fixed effect or random effects metaanalysis.
 Examine sources of between-study heterogeneity, e.g. using subgroup analysis or metaregression.
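The standardized summary measure for continuous data mentioned in the steps above can be sketched as follows. This is a minimal illustration, assuming that each study reports group means, standard deviations and sample sizes. The small-sample correction factor 1 - 3/(4·df - 1) is a widely used approximation that is not given in the text above, and the function name and numbers are hypothetical.

```python
import math

def standardized_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c,
                                 small_sample_correction=True):
    """Difference between treatment and control means in pooled-SD units."""
    df = n_t + n_c - 2
    if df <= 0:
        raise ValueError("need more than two participants in total")
    # Pooled variance: per-group variances weighted by their degrees of freedom.
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df
    d = (mean_t - mean_c) / math.sqrt(pooled_var)
    if small_sample_correction:
        # Assumed approximation to the small-sample bias correction.
        d *= 1.0 - 3.0 / (4.0 * df - 1.0)
    return d

# Hypothetical study: 20 participants per arm, common SD of 4.
d_raw = standardized_mean_difference(12.0, 4.0, 20, 10.0, 4.0, 20,
                                     small_sample_correction=False)
g = standardized_mean_difference(12.0, 4.0, 20, 10.0, 4.0, 20)
print(d_raw, g)
```

Because the result is expressed in standard deviation units, studies that measured the same outcome on different scales can be pooled; the correction shrinks the estimate slightly and matters mainly for small studies.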
Formal guidance for the conduct and reporting of metaanalyses is provided by the Cochrane Handbook.
For reporting guidelines, see the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement.^{[30]}
Methods and assumptions
Approaches
In general, two types of evidence can be distinguished when performing a metaanalysis: individual participant data (IPD), and aggregate data (AD). The aggregate data can be direct or indirect.
AD is more commonly available (e.g. from the literature) and typically represents summary estimates such as odds ratios or relative risks. This can be directly synthesized across conceptually similar studies using several approaches (see below). On the other hand, indirect aggregate data measures the effect of two treatments that were each compared against a similar control group in a metaanalysis. For example, if treatment A and treatment B were directly compared vs placebo in separate metaanalyses, we can use these two pooled results to get an estimate of the effects of A vs B in an indirect comparison as effect A vs Placebo minus effect B vs Placebo.
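The indirect comparison described above can be sketched in a few lines. The sketch assumes that both pooled effects are on an additive scale (for example log odds ratios) and that the two metaanalyses are independent, so that their variances add; the function name and the numbers are hypothetical.

```python
import math

def indirect_comparison(effect_a, se_a, effect_b, se_b, z=1.96):
    """Effect of A vs B from pooled A-vs-placebo and B-vs-placebo effects."""
    diff = effect_a - effect_b
    # Assumes independent estimates: the variances of the two effects add.
    se = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, se, (diff - z * se, diff + z * se)

# Hypothetical pooled log odds ratios versus placebo.
diff_ab, se_ab, ci_ab = indirect_comparison(-0.5, 0.10, -0.2, 0.15)
print(diff_ab, se_ab, ci_ab)
```

The standard error of the indirect estimate is larger than that of either direct estimate, which is one reason indirect evidence is usually less precise than a head-to-head comparison of the same size.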
IPD evidence represents raw data as collected by the study centers. This distinction has raised the need for different metaanalytic methods when evidence synthesis is desired, and has led to the development of one-stage and two-stage methods.^{[31]} In one-stage methods the IPD from all studies are modeled simultaneously whilst accounting for the clustering of participants within studies. Two-stage methods first compute summary statistics for AD from each study and then calculate overall statistics as a weighted average of the study statistics. By reducing IPD to AD, two-stage methods can also be applied when IPD is available; this makes them an appealing choice when performing a metaanalysis. Although it is conventionally believed that one-stage and two-stage methods yield similar results, recent studies have shown that they may occasionally lead to different conclusions.^{[32]}^{[33]}
Statistical models for aggregate data
Direct evidence: Models incorporating study effects only
Fixed effects model
The fixed effect model provides a weighted average of a series of study estimates. The inverse of the estimates' variance is commonly used as study weight, so that larger studies tend to contribute more than smaller studies to the weighted average. Consequently, when studies within a metaanalysis are dominated by a very large study, the findings from smaller studies are practically ignored.^{[34]} Most importantly, the fixed effect model assumes that all included studies investigate the same population, use the same variable and outcome definitions, etc. This assumption is typically unrealistic as research is often prone to several sources of heterogeneity; e.g. treatment effects may differ according to locale, dosage levels and study conditions.
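The inverse-variance weighting described above can be sketched as follows. This is a minimal illustration with hypothetical effect sizes and variances, not a replacement for an established metaanalysis package.

```python
import math

def fixed_effect_pool(effects, variances):
    """Inverse-variance weighted average of study estimates.
    Returns (pooled estimate, standard error, relative weights)."""
    if not effects or len(effects) != len(variances):
        raise ValueError("effects and variances must be non-empty and equal in length")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)
    return pooled, se, [w / total for w in weights]

# Hypothetical data: one large study (variance 0.01) and two smaller ones.
fe_pooled, fe_se, fe_weights = fixed_effect_pool([0.10, 0.30, 0.50],
                                                 [0.01, 0.04, 0.04])
print(fe_pooled, fe_se, fe_weights)
```

In this example the large study receives two thirds of the total weight, so the pooled estimate (0.2) sits much closer to its result than to the unweighted mean of the three studies (0.3).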
Random effects model
A common model used to synthesize heterogeneous research is the random effects model of metaanalysis. This is simply the weighted average of the effect sizes of a group of studies. The weight that is applied in this process of weighted averaging with a random effects metaanalysis is achieved in two steps:^{[35]}
 Step 1: Inverse variance weighting
 Step 2: Unweighting of this inverse variance weighting by applying a random effects variance component (REVC) that is simply derived from the extent of variability of the effect sizes of the underlying studies.
This means that the greater this variability in effect sizes (otherwise known as heterogeneity), the greater the unweighting and this can reach a point when the random effects metaanalysis result becomes simply the unweighted average effect size across the studies. At the other extreme, when all effect sizes are similar (or variability does not exceed sampling error), no REVC is applied and the random effects metaanalysis defaults to simply a fixed effect metaanalysis (only inverse variance weighting).
The extent of this reversal is solely dependent on two factors:^{[36]}
 Heterogeneity of precision
 Heterogeneity of effect size
Since neither of these factors automatically indicates a faulty larger study or more reliable smaller studies, the redistribution of weights under this model will not bear a relationship to what these studies actually might offer. Indeed, it has been demonstrated that redistribution of weights is simply in one direction, from larger to smaller studies, as heterogeneity increases until eventually all studies have equal weight and no more redistribution is possible.^{[36]} Another issue with the random effects model is that the most commonly used confidence intervals generally do not retain their coverage probability above the specified nominal level and thus substantially underestimate the statistical error and are potentially overconfident in their conclusions.^{[37]}^{[38]} Several fixes have been suggested^{[39]}^{[40]} but the debate continues.^{[38]}^{[41]} A further concern is that the average treatment effect can sometimes be even less conservative compared to the fixed effect model^{[42]} and therefore misleading in practice. One interpretational fix that has been suggested is to create a prediction interval around the random effects estimate to portray the range of possible effects in practice.^{[43]} However, an assumption behind the calculation of such a prediction interval is that trials are considered more or less homogeneous entities and that included patient populations and comparator treatments should be considered exchangeable,^{[44]} and this is usually unattainable in practice.
The most widely used method to estimate between-studies variance (REVC) is the DerSimonian-Laird (DL) approach.^{[45]} Several advanced iterative (and computationally expensive) techniques for computing the between-studies variance exist (such as maximum likelihood, profile likelihood and restricted maximum likelihood methods), and random effects models using these methods can be run in Stata with the metaan command.^{[46]} The metaan command must be distinguished from the classic metan (single "a") command in Stata that uses the DL estimator. These advanced methods have also been implemented in a free and easy-to-use Microsoft Excel add-on, MetaEasy.^{[47]}^{[48]} However, a comparison between these advanced methods and the DL method of computing the between-studies variance demonstrated that there is little to gain and DL is quite adequate in most scenarios.^{[49]}^{[50]}
However, most metaanalyses include between 2 and 4 studies, and such a sample is more often than not inadequate to accurately estimate heterogeneity. Thus it appears that in small metaanalyses an incorrect zero between-study variance estimate is obtained, leading to a false homogeneity assumption. Overall, it appears that heterogeneity is being consistently underestimated in metaanalyses, and sensitivity analyses in which high heterogeneity levels are assumed could be informative.^{[51]} The random effects models and software packages mentioned above relate to study-aggregate metaanalyses, and researchers wishing to conduct individual patient data (IPD) metaanalyses need to consider mixed-effects modelling approaches.^{[52]}
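The two-step weighting and the DL estimate of the between-study variance discussed in this section can be sketched as follows. The sketch uses the standard method-of-moments form of the DL estimator, which truncates negative estimates at zero; the function name and the data are hypothetical.

```python
import math

def dersimonian_laird(effects, variances):
    """Random effects pooling with the DerSimonian-Laird variance estimate.
    Returns (pooled estimate, standard error, tau squared, relative weights)."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching inputs")
    # Step 1: inverse variance weights and the fixed effect estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures variability of effects beyond sampling error.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Step 2: "unweight" by adding tau squared to every study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    sw_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sw_re
    return pooled, math.sqrt(1.0 / sw_re), tau2, [wi / sw_re for wi in w_re]

variances = [0.01, 0.04, 0.04]          # one large study, two small ones
homogeneous = dersimonian_laird([0.2, 0.2, 0.2], variances)
heterogeneous = dersimonian_laird([0.0, 0.8, -0.6], variances)
print(homogeneous[2], heterogeneous[2], heterogeneous[3])
```

With identical effects the variance component is zero and the result coincides with the fixed effect analysis. With widely scattered effects the relative weight of the large study falls from two thirds towards one third, illustrating the shift of weight from larger to smaller studies described above.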
IVhet model
Doi & Barendregt, working in collaboration with Khan, Thalib and Williams (from the University of Queensland, University of Southern Queensland and Kuwait University), have created an inverse variance quasi-likelihood based alternative (IVhet) to the random effects (RE) model, for which details are available online.^{[53]} This was incorporated into MetaXL version 2.0,^{[54]} a free Microsoft Excel add-in for metaanalysis produced by Epigear International Pty Ltd, and made available on 5 April 2014. The authors state that a clear advantage of this model is that it resolves the two main problems of the random effects model. The first advantage of the IVhet model is that coverage remains at the nominal (usually 95%) level for the confidence interval, unlike the random effects model, which drops in coverage with increasing heterogeneity.^{[37]}^{[38]} The second advantage is that the IVhet model maintains the inverse variance weights of individual studies, unlike the RE model, which gives small studies more weight (and therefore larger studies less) with increasing heterogeneity. When heterogeneity becomes large, the individual study weights under the RE model become equal and thus the RE model returns an arithmetic mean rather than a weighted average. This side-effect of the RE model does not occur with the IVhet model, which thus differs from the RE model estimate in two respects:^{[53]} pooled estimates will favor larger trials (as opposed to penalizing larger trials in the RE model) and will have a confidence interval that remains within the nominal coverage under uncertainty (heterogeneity). Doi & Barendregt suggest that while the RE model provides an alternative method of pooling the study data, their simulation results^{[55]} demonstrate that using a more specified probability model with untenable assumptions, as with the RE model, does not necessarily provide better results.
The latter study also reports that the IVhet model resolves the problems related to underestimation of the statistical error, poor coverage of the confidence interval and increased MSE seen with the random effects model, and the authors conclude that researchers should henceforth abandon use of the random effects model in metaanalysis. While their data are compelling, the ramifications (in terms of the magnitude of spuriously positive results within the Cochrane database) are huge, and thus accepting this conclusion requires careful independent confirmation. The availability of free software (MetaXL)^{[54]} that runs the IVhet model (and all other models for comparison) facilitates this for the research community.
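The two properties claimed for the IVhet model can be illustrated with a short sketch. It rests on an assumption about the estimator that is not spelled out in the text above: the point estimate keeps the fixed effect inverse variance weights, while the variance of the pooled estimate is inflated by a between-study variance component, here taken from the DL estimator. The function name and data are hypothetical, and MetaXL should be treated as the reference implementation.

```python
import math

def ivhet_pool(effects, variances):
    """Sketch of an inverse variance heterogeneity (IVhet) style estimate.
    Returns (pooled estimate, standard error, tau squared)."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching inputs")
    w = [1.0 / v for v in variances]
    sw = sum(w)
    rel = [wi / sw for wi in w]                  # weights stay inverse variance
    pooled = sum(ri * yi for ri, yi in zip(rel, effects))
    # Between-study variance (DerSimonian-Laird form, truncated at zero).
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Assumed variance: each study contributes rel^2 * (v + tau2).
    var = sum(ri ** 2 * (v + tau2) for ri, v in zip(rel, variances))
    return pooled, math.sqrt(var), tau2

variances = [0.01, 0.04, 0.04]
ivhet_homog = ivhet_pool([0.2, 0.2, 0.2], variances)
ivhet_heter = ivhet_pool([0.0, 0.8, -0.6], variances)
print(ivhet_homog, ivhet_heter)
```

Under this formulation the point estimate always equals the fixed effect estimate, so larger trials keep their influence, and the standard error reduces to the fixed effect standard error when the between-study variance is zero and widens as heterogeneity grows.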
Direct evidence: Models incorporating additional information
Quality effects model
Doi and Thalib originally introduced the quality effects model.^{[56]} They^{[57]} introduced a new approach to adjustment for inter-study variability by incorporating the contribution of variance due to a relevant component (quality) in addition to the contribution of variance due to random error that is used in any fixed effects metaanalysis model to generate weights for each study. The strength of the quality effects metaanalysis is that it allows available methodological evidence to be used over subjective random effects, and thereby helps to close the damaging gap which has opened up between methodology and statistics in clinical research. To do this, a synthetic bias variance is computed based on quality information to adjust inverse variance weights, and the quality-adjusted weight of the ith study is introduced.^{[56]} These adjusted weights are then used in metaanalysis. In other words, if study i is of good quality and other studies are of poor quality, a proportion of their quality-adjusted weights is mathematically redistributed to study i, giving it more weight towards the overall effect size. As studies become increasingly similar in terms of quality, redistribution becomes progressively less and ceases when all studies are of equal quality (in the case of equal quality, the quality effects model defaults to the IVhet model; see previous section). A recent evaluation of the quality effects model (with some updates) demonstrates that despite the subjectivity of quality assessment, the performance (MSE and true variance under simulation) is superior to that achievable with the random effects model.^{[58]}^{[59]} This model thus replaces the untenable interpretations that abound in the literature, and software is available to explore this method further.^{[54]}
Indirect evidence: Network metaanalysis methods
Indirect comparison metaanalysis methods (also called network metaanalyses, in particular when multiple treatments are assessed simultaneously) generally use two main methodologies. The first is the Bucher method,^{[60]} which is a single or repeated comparison of a closed loop of three treatments such that one of them is common to the two studies and forms the node where the loop begins and ends. Therefore, multiple two-by-two comparisons (3-treatment loops) are needed to compare multiple treatments. This methodology requires that trials with more than two arms have only two arms selected, as independent pairwise comparisons are required. The alternative methodology uses complex statistical modelling to include the multiple-arm trials and comparisons simultaneously between all competing treatments. These have been executed using Bayesian methods, mixed linear models and metaregression approaches.
Bayesian framework
Specifying a Bayesian network metaanalysis model involves writing a directed acyclic graph (DAG) model for general-purpose Markov chain Monte Carlo (MCMC) software such as WinBUGS.^{[61]} In addition, prior distributions have to be specified for a number of the parameters, and the data have to be supplied in a specific format.^{[61]} Together, the DAG, priors, and data form a Bayesian hierarchical model. To complicate matters further, because of the nature of MCMC estimation, overdispersed starting values have to be chosen for a number of independent chains so that convergence can be assessed.^{[62]} Currently, there is no software that automatically generates such models, although there are some tools to aid in the process. The complexity of the Bayesian approach has limited usage of this methodology. Methodology for automation of this method has been suggested^{[63]} but requires that arm-level outcome data are available, and this is usually unavailable. Great claims are sometimes made for the inherent ability of the Bayesian framework to handle network metaanalysis and its greater flexibility. However, this choice of implementation of framework for inference, Bayesian or frequentist, may be less important than other choices regarding the modeling of effects^{[64]} (see discussion on models above).
Frequentist multivariate framework
On the other hand, the frequentist multivariate methods involve approximations and assumptions that are not stated explicitly or verified when the methods are applied (see discussion on metaanalysis models above). For example, the mvmeta package for Stata enables network metaanalysis in a frequentist framework.^{[65]} However, if there is no common comparator in the network, this has to be handled by augmenting the dataset with fictional arms with high variance, which is not very objective and requires a decision as to what constitutes a sufficiently high variance.^{[66]} The other issue is the use of the random-effects model in both this frequentist framework and the Bayesian framework. Senn advises analysts to be cautious about interpreting the 'random effects' analysis, since only one random effect is allowed for but one could envisage many.^{[64]} Senn goes on to say that it is rather naïve, even in the case where only two treatments are being compared, to assume that random-effects analysis accounts for all uncertainty about the way effects can vary from trial to trial. Newer models of metaanalysis such as those discussed above would help alleviate this situation and have been implemented in the next framework.
Generalized pairwise modelling framework
An approach that has been tried since the late 1990s is the implementation of the multiple three-treatment closed-loop analysis. This has not been popular because the process rapidly becomes overwhelming as network complexity increases. Development in this area was then abandoned in favor of the Bayesian and multivariate frequentist methods which emerged as alternatives. More recently, automation of the three-treatment closed-loop method has been developed for complex networks by some researchers^{[53]} as a way to make this methodology available to the mainstream research community. This proposal does restrict each trial to two interventions, but also introduces a workaround for multiple-arm trials: a different fixed control node can be selected in different runs. It also utilizes robust metaanalysis methods so that many of the problems highlighted above are avoided. Further research around this framework is required to determine whether it is indeed superior to the Bayesian or multivariate frequentist frameworks. Researchers willing to try this out have access to this framework through free software.^{[54]}
Applications in modern science
Modern statistical metaanalysis does more than just combine the effect sizes of a set of studies using a weighted average. It can test if the outcomes of studies show more variation than the variation that is expected because of the sampling of different numbers of research participants. Additionally, study characteristics such as measurement instrument used, population sampled, or aspects of the studies' design can be coded and used to reduce variance of the estimator (see statistical models above). Thus some methodological weaknesses in studies can be corrected statistically. Other uses of metaanalytic methods include the development of clinical prediction models, where metaanalysis may be used to combine data from different research centers,^{[67]} or even to aggregate existing prediction models.^{[68]}
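The test for excess variation mentioned above is commonly carried out with Cochran's Q statistic. The sketch below is one possible illustration in Python, using hypothetical effect sizes and variances; the helper name `cochran_q` is an assumption, and the I² summary it also returns is a widely used companion measure that is not discussed in the text above.

```python
def cochran_q(effects, variances):
    """Cochran's Q, its degrees of freedom, and the I^2 proportion."""
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Weighted squared deviations of each study from the pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # Share of total variation attributed to heterogeneity, floored at zero
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i2

# Hypothetical study results: three effect estimates with equal variances
q, df, i2 = cochran_q([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(f"Q = {q:.2f} on {df} degrees of freedom, I^2 = {i2:.0%}")
```

Under the hypothesis that all studies estimate the same effect, Q is approximately chi-squared distributed with k − 1 degrees of freedom (k being the number of studies), so a value of Q well above its degrees of freedom points to more variation than sampling error alone would produce.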
Metaanalysis can be done with single-subject designs as well as group research designs. This is important because much research has been done with single-subject research designs. Considerable dispute exists over the most appropriate metaanalytic technique for single-subject research.^{[69]}
Metaanalysis leads to a shift of emphasis from single studies to multiple studies. It emphasizes the practical importance of the effect size instead of the statistical significance of individual studies. This shift in thinking has been termed "metaanalytic thinking". The results of a metaanalysis are often shown in a forest plot.
Results from studies are combined using different approaches. One approach frequently used in metaanalysis in health care research is the inverse variance method. The average effect size across all studies is computed as a weighted mean, whereby the weights are equal to the inverse variance of each study's effect estimator. Larger studies and studies with less random variation are therefore given greater weight than smaller studies. Other common approaches include the Mantel–Haenszel method^{[70]} and the Peto method.^{[71]}
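A minimal Python sketch of the inverse variance method follows. It assumes a common-effect (fixed-effect) setting in which each study supplies an effect estimate and its variance; the function name `inverse_variance_pool` and the two hypothetical studies are illustrative only.

```python
import math

def inverse_variance_pool(effects, variances, z=1.96):
    """Pooled effect as a weighted mean with weights 1 / variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)  # standard error of the pooled effect
    return pooled, se, (pooled - z * se, pooled + z * se)

# Hypothetical data: a precise study (variance 0.01) and a noisier one (0.04)
pooled, se, ci = inverse_variance_pool([0.2, 0.4], [0.01, 0.04])
print(f"pooled = {pooled:.3f}, SE = {se:.4f}, "
      f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

In this example the more precise study receives four times the weight of the noisier one, so the pooled estimate lies much closer to 0.2 than to 0.4.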
Seed-based d mapping (formerly signed differential mapping, SDM) is a statistical technique for metaanalyzing studies on differences in brain activity or structure that used neuroimaging techniques such as fMRI, VBM or PET.
Different high-throughput techniques such as microarrays have been used to understand gene expression. MicroRNA expression profiles have been used to identify differentially expressed microRNAs in a particular cell or tissue type or disease condition, or to check the effect of a treatment. A metaanalysis of such expression profiles was performed to derive novel conclusions and to validate the known findings.^{[72]}
See also
 Estimation statistics
 Newcastle–Ottawa scale
 Reporting bias
 Review journal
 Secondary research
 Study heterogeneity
 Systematic review
 Galbraith plot
 Data aggregation
References
 ^ Greenland S, O'Rourke K: MetaAnalysis. Page 652 in Modern Epidemiology, 3rd ed. Edited by Rothman KJ, Greenland S, Lash T. Lippincott Williams and Wilkins; 2008.
 ^ Walker, E.; Hernandez, A. V.; Kattan, M. W. (1 June 2008). "Metaanalysis: Its strengths and limitations". Cleveland Clinic Journal of Medicine. 75 (6): 431–439. doi:10.3949/ccjm.75.6.431.
 ^ Glossary at Cochrane Collaboration
 ^ Plackett, R. L. (1958). "Studies in the History of Probability and Statistics: VII. The Principle of the Arithmetic Mean". Biometrika. 45 (1–2): 133. doi:10.1093/biomet/45.1-2.130. Retrieved 29 May 2016.
 ^ Pearson K (1904). "Report on certain enteric fever inoculation statistics". BMJ. 2 (2288): 1243–1246. doi:10.1136/bmj.2.2288.1243. PMC 2355479 . PMID 20761760.
 ^ Nordmann AJ, Kasenda B, Briel M (Mar 9, 2012). "Metaanalyses: what they can and cannot do". Swiss Medical Weekly. 142: w13518. doi:10.4414/smw.2012.13518. PMID 22407741.
 ^ O'Rourke K (2007-12-01). "An historical perspective on metaanalysis: dealing quantitatively with varying study results". J R Soc Med. 100 (12): 579–582. doi:10.1258/jrsm.100.12.579. PMC 2121629. PMID 18065712.
 ^ Pratt JG, Rhine JB, Smith BM, Stuart CE, Greenwood JA. Extra-Sensory Perception after Sixty Years: A Critical Appraisal of the Research in Extra-Sensory Perception. New York: Henry Holt, 1940.
 ^ Glass G. V (1976). "Primary, secondary, and metaanalysis of research". Educational Researcher. 5 (10): 3–8. doi:10.3102/0013189X005010003.
 ^ Cochran WG (1937). "Problems Arising in the Analysis of a Series of Similar Experiments". Journal of the Royal Statistical Society. 4: 102–118. doi:10.2307/2984123.
 ^ Cochran WG, Carroll SP (1953). "A Sampling Investigation of the Efficiency of Weighting Inversely as the Estimated Variance". Biometrics. 9: 447–459. doi:10.2307/3001436.
 ^ LeLorier J, Grégoire G, Benhaddad A, Lapierre J, Derderian F (1997). "Discrepancies between MetaAnalyses and Subsequent Large Randomized, Controlled Trials". New England Journal of Medicine. 337 (8): 536–542. doi:10.1056/NEJM199708213370806. PMID 9262498.
 ^ ^{a} ^{b} Slavin RE (1986). "BestEvidence Synthesis: An Alternative to MetaAnalytic and Traditional Reviews". Educational Researcher. 15 (9): 5–9. doi:10.3102/0013189X015009005.
 ^ Hunter, Schmidt, & Jackson, John E. (1982). Metaanalysis: Cumulating research findings across studies. Beverly Hills, California: Sage.
 ^ Glass, McGaw, & Smith (1981). Metaanalysis in social research. Beverly Hills, CA: Sage.
 ^ ^{a} ^{b} Rosenthal R (1979). "The 'File Drawer Problem' and the Tolerance for Null Results". Psychological Bulletin. 86 (3): 638–641. doi:10.1037/0033-2909.86.3.638.
 ^ Hunter, John E; Schmidt, Frank L (1990). Methods of MetaAnalysis: Correcting Error and Bias in Research Findings. Newbury Park, California; London; New Delhi: SAGE Publications.
 ^ Light & Pillemer (1984). Summing up: The science of reviewing research. Cambridge, MA: Harvard University Press.
 ^ Ioannidis JP, Trikalinos TA (2007). "The appropriateness of asymmetry tests for publication bias in metaanalyses: a large survey". CMAJ. 176 (8): 1091–6. doi:10.1503/cmaj.060410. PMC 1839799 . PMID 17420491.
 ^ ^{a} ^{b} Ferguson CJ, Brannick MT (2012). "Publication bias in psychological science: prevalence, methods for identifying and controlling, and implications for the use of metaanalyses". Psychol Methods. 17 (1): 120–8. doi:10.1037/a0024445. PMID 21787082.
 ^ Simmons JP, Nelson LD, Simonsohn U (2011). "Falsepositive psychology: undisclosed flexibility in data collection and analysis allows presenting anything as significant". Psychol Sci. 22 (11): 1359–66. doi:10.1177/0956797611417632. PMID 22006061.
 ^ LeBel, E.; Peters, K. (2011). "Fearing the future of empirical psychology: Bem's (2011) evidence of psi as a case study of deficiencies in modal research practice" (PDF). Review of General Psychology. 15 (4): 371–379. doi:10.1037/a0025172.
 ^ Radua, J.; Schmidt, A.; Borgwardt, S.; Heinz, A.; Schlagenhauf, F.; McGuire, P.; FusarPoli, P. (2015). "Ventral Striatal Activation During Reward Processing in Psychosis: A Neurofunctional MetaAnalysis". JAMA Psychiatry. 72: 1243–1251. doi:10.1001/jamapsychiatry.2015.2196. PMID 26558708.
 ^ Hodges, Jim, and Clayton, Murray K. (2011). Random Effects: Old and New. Statistical Science XX: XX–XX. URL http://www.biostat.umn.edu/~hodges/HodgesClaytonREONsubToStatSci (archived 24 May 2011 at the Wayback Machine).
 ^ ^{a} ^{b} Hodges JS. Random effects old and new. In Hodges JS. Richly parameterized linear models: additive, time series, and spatial models using random effects. USA: CRC Press, 2013: 285–302.
 ^ H. Sabhan
 ^ Stegenga J (2011). "Is metaanalysis the platinum standard of evidence?". Stud Hist Philos Biol Biomed Sci. 42 (4): 497–507. doi:10.1016/j.shpsc.2011.07.003. PMID 22035723.
 ^ "How Well Do MetaAnalyses Disclose Conflicts of Interests in Underlying Research Studies | The Cochrane Collaboration". Cochrane.org. Retrieved 2012-01-13.
 ^ ^{a} ^{b} "The Osteen Decision". The United States District Court for the Middle District of North Carolina. 1998-07-17. Retrieved 2017-03-18.
 ^ "The PRISMA statement". Prisma-statement.org. 2012-02-02. Retrieved 2012-02-02.
 ^ Debray, Thomas P. A.; Moons, Karel G. M.; van Valkenhoef, Gert; Efthimiou, Orestis; Hummel, Noemi; Groenwold, Rolf H. H.; Reitsma, Johannes B.; on behalf of the GetReal methods review group (2015-12-01). "Get real in individual participant data (IPD) metaanalysis: a review of the methodology". Research Synthesis Methods. 6 (4): 293–309. doi:10.1002/jrsm.1160. ISSN 1759-2887.
 ^ Debray TP, Moons KG, AboZaid GM, Koffijberg H, Riley RD (2013). "Individual participant data metaanalysis for a binary outcome: one-stage or two-stage?". PLoS ONE. 8 (4): e60650. doi:10.1371/journal.pone.0060650. PMC 3621872. PMID 23585842.
 ^ Burke, Danielle L.; Ensor, Joie; Riley, Richard D. (2017-02-28). "Metaanalysis using individual participant data: one-stage and two-stage approaches, and why they may differ". Statistics in Medicine. 36 (5): 855–875. doi:10.1002/sim.7141. ISSN 1097-0258.
 ^ Helfenstein U (2002). "Data and models determine treatment proposals--an illustration from metaanalysis". Postgrad Med J. 78 (917): 131–4. doi:10.1136/pmj.78.917.131. PMC 1742301. PMID 11884693.
 ^ Senn S (2007). "Trying to be precise about vagueness". Stat Med. 26: 1417–30. doi:10.1002/sim.2639.
 ^ ^{a} ^{b} Al Khalaf MM, Thalib L, Doi SA (2011). "Combining heterogenous studies using the randomeffects model is a mistake and leads to inconclusive metaanalyses" (PDF). Journal of Clinical Epidemiology. 64: 119–23. doi:10.1016/j.jclinepi.2010.01.009.
 ^ ^{a} ^{b} Brockwell S.E.; Gordon I.R. (2001). "A comparison of statistical methods for metaanalysis". Statistics in Medicine. 20: 825–840. doi:10.1002/sim.650.
 ^ ^{a} ^{b} ^{c} Noma H (Dec 2011). "Confidence intervals for a randomeffects metaanalysis based on Bartletttype corrections". Stat Med. 30 (28): 3304–12. doi:10.1002/sim.4350.
 ^ Brockwell SE, Gordon IR (2007). "A simple method for inference on an overall effect in metaanalysis". Statistics in Medicine. 26: 4531–4543. doi:10.1002/sim.2883.
 ^ Sidik K, Jonkman JN (2002). "A simple confidence interval for metaanalysis". Statistics in Medicine. 21: 3153–3159. doi:10.1002/sim.1262.
 ^ Jackson D, Bowden J (2009). "A reevaluation of the 'quantile approximation method' for random effects metaanalysis". Stat Med. 28 (2): 338–48. doi:10.1002/sim.3487. PMC 2991773 . PMID 19016302.
 ^ Poole C, Greenland S (Sep 1999). "Randomeffects metaanalyses are not always conservative". Am J Epidemiol. 150 (5): 469–75. doi:10.1093/oxfordjournals.aje.a010035.
 ^ Riley RD, Higgins JP, Deeks JJ (2011). "Interpretation of random effects metaanalyses". British Medical Journal. 342: d549. doi:10.1136/bmj.d549.
 ^ Kriston L (2013). "Dealing with clinical heterogeneity in metaanalysis. Assumptions, methods, interpretation". Int J Methods Psychiatr Res. 22 (1): 1–15. doi:10.1002/mpr.1377. PMID 23494781.
 ^ DerSimonian R, Laird N (1986). "Metaanalysis in clinical trials". Control Clin Trials. 7 (3): 177–88. doi:10.1016/0197-2456(86)90046-2. PMID 3802833.
 ^ metaan:Randomeffects metaanalysis, Stata Journal 2010
 ^ MetaEasy:A MetaAnalysis AddIn for Microsoft Excel, Journal of Statistical Software 2009
 ^ Developer's website
 ^ Kontopantelis E, Reeves D (2012). "Performance of statistical methods for metaanalysis when true study effects are non-normally distributed: A simulation study". Statistical Methods in Medical Research. 21 (4): 409–26. doi:10.1177/0962280210392008. PMID 21148194.
 ^ Kontopantelis E, Reeves D (2012). "Performance of statistical methods for metaanalysis when true study effects are non-normally distributed: a comparison between DerSimonian-Laird and restricted maximum likelihood". SMMR. 21 (6): 657–9. doi:10.1177/0962280211413451. PMID 23171971.
 ^ Kontopantelis E, Springate DA, Reeves D (2013). Friede, Tim, ed. "A Re-Analysis of the Cochrane Library Data: The Dangers of Unobserved Heterogeneity in MetaAnalyses". PLoS ONE. 8 (7): e69930. doi:10.1371/journal.pone.0069930. PMC 3724681. PMID 23922860.
 ^ A short guide and a forest plot command (ipdforest) for onestage metaanalysis, Stata Journal 2012
 ^ ^{a} ^{b} ^{c} MetaXL User Guide
 ^ ^{a} ^{b} ^{c} ^{d} MetaXL software page
 ^ Doi SA, Barendregt JJ, Khan S, Thalib L, Williams GM (2015). "Advances in the Metaanalysis of heterogeneous clinical trials I: The inverse variance heterogeneity model". Contemp Clin Trials. doi:10.1016/j.cct.2015.05.009. PMID 26003435.
 ^ ^{a} ^{b} Doi SA, Thalib L (2008). "A qualityeffects model for metaanalysis". Epidemiology. 19 (1): 94–100. doi:10.1097/EDE.0b013e31815c24e7. PMID 18090860.
 ^ Doi SA, Barendregt JJ, Mozurkewich EL (2011). "Metaanalysis of heterogeneous clinical trials: an empirical example". Contemp Clin Trials. 32 (2): 288–98.
 ^ Doi SA, Barendregt JJ, Williams GM, Khan S, Thalib L (2015). "Simulation Comparison of the Quality Effects and Random Effects Methods of Metaanalysis". Epidemiology. 26: e42–4. doi:10.1097/EDE.0000000000000289. PMID 25872162.
 ^ Doi SA, Barendregt JJ, Khan S, Thalib L, Williams GM (2015). "Advances in the metaanalysis of heterogeneous clinical trials II: The quality effects model". Contemp Clin Trials. doi:10.1016/j.cct.2015.05.010. PMID 26003432.
 ^ Bucher H. C.; Guyatt G. H.; Griffith L. E.; Walter S. D. (1997). "The results of direct and indirect treatment comparisons in metaanalysis of randomized controlled trials". J Clin Epidemiol. 50 (6): 683–691. doi:10.1016/s0895-4356(97)00049-8.
 ^ ^{a} ^{b} Valkenhoef G.; Lu G.; Brock B.; Hillege H.; Ades A. E.; Welton N. J. (2012). "Automating network meta‐analysis". Research Synthesis Methods. 3 (4): 285–299.
 ^ Brooks SP, Gelman A (1998). "General methods for monitoring convergence of iterative simulations" (PDF). Journal of Computational and Graphical Statistics. 7 (4): 434–455. doi:10.1080/10618600.1998.10474787.
 ^ van Valkenhoef G, Lu G, de Brock B, Hillege H, Ades AE, Welton NJ. Automating network metaanalysis. Res Synth Methods. 2012 Dec;3(4):285–99.
 ^ ^{a} ^{b} Senn S, Gavini F, Magrez D, Scheen A (Apr 2013). "Issues in performing a network metaanalysis". Stat Methods Med Res. 22 (2): 169–89.
 ^ White IR (2011). "Multivariate randomeffects metaregression: updates to mvmeta". The Stata Journal. 11 (2): 255–270.
 ^ van Valkenhoef G, Lu G, de Brock B, Hillege H, Ades AE, Welton NJ. Automating network metaanalysis. Res Synth Methods. 2012 Dec;3(4):285–99.
 ^ Debray TP, Moons KG, Ahmed I, Koffijberg H, Riley RD (2013). "A framework for developing, implementing, and evaluating clinical prediction models in an individual participant data metaanalysis". Statistics in Medicine. 32 (18): 3158–80. doi:10.1002/sim.5732. PMID 23307585.
 ^ Debray TP, Koffijberg H, Vergouwe Y, Moons KG, Steyerberg EW (2012). "Aggregating published prediction models with individual participant data: a comparison of different approaches". Statistics in Medicine. 31 (23): 2697–2712. doi:10.1002/sim.5412. PMID 22733546.
 ^ Van den Noortgate W, Onghena P (2007). "Aggregating SingleCase Results". The Behavior Analyst Today. 8 (2): 196–209. doi:10.1037/h0100613.
 ^ Mantel N, Haenszel W (1959). "Statistical aspects of the analysis of data from the retrospective analysis of disease". Journal of the National Cancer Institute. 22 (4): 719–748. doi:10.1093/jnci/22.4.719. PMID 13655060.
 ^ "9.4.4.2 Peto odds ratio method". Cochrane Handbook for Systematic Reviews of Interventions v 5.1.0. March 2011.
 ^ Bargaje R, Hariharan M, Scaria V, Pillai B (2010). "Consensus miRNA expression profiles derived from interplatform normalization of microarray data". RNA. 16 (1): 16–25. doi:10.1261/rna.1688110. PMC 2802026 . PMID 19948767.
Further reading
 Cooper, H. & Hedges, L.V. (1994). The Handbook of Research Synthesis. New York: Russell Sage.
 Cornell, J. E. & Mulrow, C. D. (1999). Metaanalysis. In: H. J. Adèr & G. J. Mellenbergh (Eds). Research Methodology in the social, behavioral and life sciences (pp. 285–323). London: Sage.
 Normand SL (1999). "Tutorial in Biostatistics. MetaAnalysis: Formulating, Evaluating, Combining, and Reporting". Statistics in Medicine. 18 (3): 321–359. doi:10.1002/(SICI)1097-0258(19990215)18:3<321::AID-SIM28>3.0.CO;2-P. PMID 10070677.
 Sutton, A.J., Jones, D.R., Abrams, K.R., Sheldon, T.A., & Song, F. (2000). Methods for Metaanalysis in Medical Research. London: John Wiley. ISBN 0471490660
 Higgins JPT, Green S (editors). Cochrane Handbook for Systematic Reviews of Interventions Version 5.0.1 [updated September 2008]. The Cochrane Collaboration, 2008. Available from www.cochrane-handbook.org
 Thompson SG, Pocock SJ (2 November 1991). "Can metaanalysis be trusted?" (PDF). The Lancet. 338 (8775): 1127–1130. doi:10.1016/0140-6736(91)91975-Z. PMID 1682553. Retrieved 17 June 2011. Explores two contrasting views: does metaanalysis provide "objective, quantitative methods for combining evidence from separate but similar studies" or merely "statistical tricks which make unjustified assumptions in producing oversimplified generalisations out of a complex of disparate studies"?
 Wilson, D. B., & Lipsey, M. W. (2001). Practical metaanalysis. Thousand Oaks: Sage publications. ISBN 0761921680
 O'Rourke, K. (2007) Just the history from the combining of information: investigating and synthesizing what is possibly common in clinical observations or studies via likelihood. Oxford: University of Oxford, Department of Statistics. Gives technical background material and details on the "An historical perspective on metaanalysis" paper cited in the references.
 Owen, A. B. (2009). "Karl Pearson's metaanalysis revisited". Annals of Statistics, 37 (6B), 3867–3892. Supplementary report.
 Ellis, Paul D. (2010). The Essential Guide to Effect Sizes: An Introduction to Statistical Power, MetaAnalysis and the Interpretation of Research Results. United Kingdom: Cambridge University Press. ISBN 0521142466
 Bonett DG, Price RM (2015). "Varying coefficient metaanalysis methods for odds ratios and risk ratios". Psychol Methods. 20 (3): 394–406. doi:10.1037/met0000032. PMID 25751513.
 Bonett DG, Price RM (2014). "Metaanalysis methods for risk differences". Br J Math Stat Psychol. 67 (3): 371–87. doi:10.1111/bmsp.12024. PMID 23962020.
 Bonett DG (2010). "Varying coefficient metaanalytic methods for alpha reliability". Psychol Methods. 15 (4): 368–85. doi:10.1037/a0020142. PMID 20853952.
 Bonett DG (2009). "Metaanalytic interval estimation for standardized and unstandardized mean differences". Psychol Methods. 14 (3): 225–38. doi:10.1037/a0016619. PMID 19719359.
 Bonett DG (2008). "Metaanalytic interval estimation for bivariate correlations". Psychol Methods. 13 (3): 173–81. doi:10.1037/a0012868. PMID 18778150.
External links
 Cochrane Handbook for Systematic Reviews of Interventions
 MetaAnalysis at 25 (Gene V Glass)
 Preferred Reporting Items for Systematic Reviews and MetaAnalyses (PRISMA) Statement, "an evidence-based minimum set of items for reporting in systematic reviews and metaanalyses."