glmulti tutorials: lme4 and exclusions

Following several requests, I have written short tutorials explaining how to use glmulti 1.x with the latest versions of the lme4 package (mixed models) and how to limit candidate models to a subset of all possible interactions (in a more robust way than the built-in exclude argument). I hope they will help. See the glmulti page.

New article on optimal movement

My first article on optimal behavior in patchy habitats has just been published online in the Journal of Mathematical Biology. It is available in open access from the journal:

How optimal foragers should respond to habitat changes: a reanalysis of the Marginal Value Theorem

Vincent Calcagno, Ludovic Mailleret, Éric Wajnberg, Frédéric Grognard

This is the first of a series of articles (which we informally call the “Charnov series”) where we revisit the celebrated Marginal Value Theorem from Charnov (1976), proving and expanding the theoretical predictions. In particular, we work out the adaptive consequences of habitat heterogeneity (i.e. resource distribution) on individual fitness and on the optimal rate of movement.

CSEE meeting in Kelowna, next month

Unfortunately I won’t be able to attend the meeting, but Renato Henriques-Silva will be presenting the PhD research he is doing in the Peres-Neto lab at UQAM, Montreal. He is modelling the evolution of flexible dispersal strategies in a network-metapopulation context. Go see his talk if you are lucky enough to be in Kelowna!

“The structure of spatial networks influences dispersal evolution”, Tuesday 14th, 11:15am, room EME1101

The benefits of rejection, continued

One result that attracts considerable attention in our article on Submission Flows is the one presented in Figure 4A, indicating that resubmitted articles were, in a given year in a given journal, significantly more cited a few years (3-5) later. Given the notoriously wild distribution of citation count data, and the thousands of factors that influence the citation success of published articles, I honestly did not expect to find any effect of submission history alone. In the article we focused on establishing the statistical significance of this effect in a robust way, but we did not quantify it (Figure 4A presents the raw data, but one needs to control for publication year and journal). So how different are they?

The density distribution of citation counts for all sampled articles is shown as a histogram. The difference, controlling for publication year*journal, between resubmissions and first-intents (red bars) is represented on the same scale.

As is visible in the Figure, most articles in our sample had been cited between zero and 50 times by July 2011 (even though a few articles had as many as 8000 citations). Comparing articles that were resubmitted with those that were first-intent submissions to a journal, it can be seen that resubmissions were less likely to receive no citation at all (about 30% less likely; first red bar) or just one citation (second red bar). In contrast, they were more likely to be cited 3-5 times, 6-10 times, or 11-50 times, which spans most of the range. All these effects combine to produce a global increase in citation count for resubmissions. Interestingly, the few very highly cited articles (51 citations and above) showed no trend or even a reversed trend, but they are much rarer, so the numbers are less reliable. This would suggest that very highly cited papers obey different rules than “normal” papers. It is indeed likely that the effect of submission history varies with journal and/or field. In particular, under the hypothesis that the review process improves manuscripts, the effect should be smaller for top journals, since many of their rejections are made without any review, in which case no effect is expected. This latter point applies globally, by the way: since in our study we could not discriminate resubmissions following review from resubmissions following editorial rejection, the dataset is likely to underestimate the difference between the two classes of submission histories. A stronger difference would be expected for resubmissions following actual review(s).

Another factor that tends to minimize the difference observed in the above Figure is that resubmissions occurring between journals from different journal communities (as determined from network analysis) were less cited than those between journals of the same community (Figure 4B in the article). To look at this difference we use only journals that are connected in the network and assigned to one of the 7 major clusters (this excludes the top multidisciplinary journals that are, by definition, not well assigned to a specific cluster), so the dataset is smaller than above. The difference between within- and between-cluster resubmissions is shown in the left figure below.

As can be seen, citation counts are consistently shifted to higher values for resubmissions within a cluster of journals (field) compared to those between fields. The latter were more likely to receive 0 to 5 citations, but less likely to receive 6-100 citations.

The data are not the same as in the top figure (some journals were excluded), so we cannot directly compare them. For comparison, we can contrast the two types of resubmissions with first-intents, using the same set of journals. This is shown in the right figure above. Clearly, using only resubmissions within fields (the vast majority of resubmissions) and omitting journals not well assigned in the network has reinforced the difference between resubmissions and first-intents: looking at red bars, resubmissions were less likely to have <=5 citations and more likely to have >6. Resubmissions were, in particular, about 50% less likely not to be cited at all. In contrast (green bars), resubmissions between fields showed the opposite pattern and were LESS cited than first-intent submissions.

A little bit of technique: it is not advisable to try to fit simple parametric models (e.g. ANOVA or mixed GLM) since, even when log-transformed, citation counts have ugly distributions and homoscedasticity is a utopia. One can instead use an exact permutation procedure that removes the effect of submission history while controlling for year (3 levels), journal (923 levels), and the interaction of the two. The difference between the observed test statistic and the center of the null (permuted) distribution for this statistic gives a reliable estimate of effect size. In the article we tested for a shift in location with such a permutation procedure, using as test statistics the difference in mean log-transformed counts (but these still have a very skewed distribution, so the mean does not tell the whole story) and Wilcoxon’s rank-based statistic (which is more robust to the long tail of the distributions). To visualize the difference, it is better to use the actual density distribution of citation counts (i.e. the histogram) as a test statistic, and compare the distributions of the two submission histories. This is what is shown in the figure.
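For readers who want to try the idea, here is a minimal sketch of a stratified permutation test of this kind. It is not the code used in the article. It uses random (Monte Carlo) permutations rather than an exact enumeration, it uses the difference in mean log(1 + citations) as the test statistic, and all function names and the toy data are made up for illustration. Submission-history labels are shuffled only within each year x journal cell, so the null distribution keeps the year, journal and interaction structure intact.

```python
import math
import random
from collections import defaultdict


def mean_log_diff(counts, resubmitted):
    """Mean log(1 + citations) of resubmissions minus that of first-intents."""
    res = [math.log1p(c) for c, r in zip(counts, resubmitted) if r]
    first = [math.log1p(c) for c, r in zip(counts, resubmitted) if not r]
    return sum(res) / len(res) - sum(first) / len(first)


def stratified_permutation_test(counts, resubmitted, strata, n_perm=2000, seed=0):
    """Shuffle submission-history labels within each stratum (year x journal).

    Returns (effect size, two-sided Monte Carlo p-value), where the effect
    size is the observed statistic minus the center of the null distribution.
    """
    rng = random.Random(seed)
    # Group article indices by stratum so labels never cross year/journal cells.
    groups = defaultdict(list)
    for i, s in enumerate(strata):
        groups[s].append(i)

    observed = mean_log_diff(counts, resubmitted)
    null = []
    for _ in range(n_perm):
        permuted = list(resubmitted)
        for idx in groups.values():
            labels = [resubmitted[i] for i in idx]
            rng.shuffle(labels)
            for i, lab in zip(idx, labels):
                permuted[i] = lab
        null.append(mean_log_diff(counts, permuted))

    null_center = sum(null) / len(null)
    # Count permutations at least as far from the null center as the observation.
    extreme = sum(abs(x - null_center) >= abs(observed - null_center) for x in null)
    p_value = (extreme + 1) / (n_perm + 1)
    return observed - null_center, p_value


# Toy data (invented): two year x journal cells with six articles each.
counts = [0, 1, 0, 2, 5, 8, 12, 7, 3, 9, 1, 0]
resubmitted = [False, False, False, False, True, True,
               True, True, False, True, False, False]
strata = [("2006", "A")] * 6 + [("2007", "B")] * 6

effect, p_value = stratified_permutation_test(counts, resubmitted, strata)
print(f"effect size (log scale): {effect:.2f}, p-value: {p_value:.3f}")
```

The same skeleton works with other test statistics: swapping `mean_log_diff` for a rank-based statistic, or for the vector of histogram bin frequencies, only changes the function that is evaluated on each permuted labelling.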

More about submission flows

Here you can obtain the raw data file containing the submission history (which journal was tried before) of more than 80,000 articles published in years 2006 to 2008 in some 900 scientific journals indexed by ISI. This forms the basis of the resubmission network we have studied in the article Flows of research manuscripts among scientific journals reveal hidden submission patterns.
Use the following contact form; it will send an email and an automatic reply will bring the file to you (a gzipped .csv; see the README header for instructions).

Upcoming article in Science: publication web project!

The publication web project I started back in 2008, as a postdoc at McGill, finally made it to adulthood! The article is to be released today in Science Express:

Flows of research manuscripts among scientific journals reveal hidden submission patterns
by Calcagno V, Demoinet E, Gollner K, Guidi L, Ruths D and de Mazancourt C.

Thank you very much to the 100,000 scientific authors or so who bothered to respond to my emails! From the 220,000 emails I had sent, I gathered some 80,000 usable replies, each containing the submission history of one published article. From these we were able to construct a resubmission network describing the manuscript flows connecting over 1,000 scientific journals from 16 subject categories, including Nature, Science and PNAS. The analysis of this new type of network revealed intriguing patterns about the submission strategy of authors and the effect of submission history on the post-publication impact of research articles.

SEE THE ARTICLE ON SCIENCE EXPRESS [access pdf for free] (works only from WITHIN the post!)

Learn more about the results in the Science Podcast

Read reports, with comments from experts, in The Scientist Magazine and in Nature News.