I won’t be able to attend the meeting, unfortunately, but Renato Henriques-Silva will be presenting the PhD research he is doing in the Peres-Neto lab at UQAM, Montreal. He is modelling the evolution of flexible dispersal strategies in a network-metapopulation context. Go see his talk if you are lucky enough to be in Kelowna!

CSEE meeting in Kelowna, next month
April 12, 2013
Outreach articles on submission flows
February 28, 2013
In two forthcoming articles, we explain and discuss (in French!) our results on manuscript submission flows. One has just been published online in the latest issue of Medecine/Sciences; the other is to appear soon in Virologie, with a closer look at selected journals from this field.


Stat course in Sophia: February 11-14
February 7, 2013
The course will take place at the Molecular & Cellular Physiology Institute (IPMC) in Sophia-Antipolis, room B07 (IPMC2 – R-1).
To prepare for the course, especially if you have no familiarity with R, consult one of these tutorials:
- http://www.r-tutor.com/r-introduction
- R tutorial by Denis Puthier

Going to Montreal: December-January
November 6, 2012
I’ll be back in Montreal next December and January, spending the end of 2012 with (hopefully) plenty of snow!
I’ll give a seminar for the CAMBAM, at McGill University, on Wednesday, December 5, 4pm, McIntyre building, room 1101. I’ll present our results on manuscript submission flows, a project for which I received funding from CAMBAM as a postdoc.

The benefits of rejection, continued
October 23, 2012
One result that has attracted considerable attention in our article on Submission Flows is the one presented in Figure 4A, indicating that resubmitted articles were, in a given year and a given journal, significantly more cited a few years (3-5) later. Given the notoriously wild distribution of citation count data, and the thousands of factors that influence the citational success of published articles, I honestly did not expect to find any effect of submission history alone. In the article we focused on establishing the statistical significance of this effect in a robust way, but we did not quantify it (Figure 4A presents the raw data, but one needs to control for publication year and journal). So how different are the two classes of articles?

The density distribution of citation counts for all sampled articles is shown as a histogram. The difference, controlling for publication year*journal, between resubmissions and first-intents (red bars) is represented on the same scale.
As is visible in the Figure, most articles in our sample had been cited between zero and 50 times by July 2011 (even though a few articles had as many as 8000 citations). Comparing articles that were resubmitted with those that were first-intent submissions to a journal, it can be seen that resubmissions were less likely to receive no citation at all (about 30% less likely; first red bar) or just one citation (second red bar). In contrast, they were more likely to be cited 3-5 times, 6-10 times, or 11-50 times, which spans most of the range. All these effects combine to produce a global increase in citation count for resubmissions. Interestingly, the few very highly cited articles (51 citations and above) showed no trend or even a reversed trend, but they are much rarer, so the numbers are less reliable. This would suggest that very highly cited papers obey different rules than “normal” papers. It is indeed likely that the effect of submission history varies with journal and/or field. In particular, under the hypothesis that the review process improves manuscripts, the effect should be smaller for top journals, since many of their rejections are made without any review, in which case no effect is expected. This latter point applies globally, by the way: since in our study we could not discriminate resubmissions following review from resubmissions following editorial rejection, the dataset is likely to underestimate the difference between the two classes of submission histories. A stronger difference would be expected for resubmissions following actual review(s).
Another factor that tends to minimize the difference observed in the above Figure is that resubmissions occurring between journals from different journal communities (as determined from network analysis) were less cited than those between journals of the same community (Figure 4B in the article). To look at this difference we use only journals that are connected in the network and assigned to one of the 7 major clusters (this excludes the top multidisciplinary journals, which are, by definition, not well assigned to a specific cluster), so the dataset is smaller than above. The difference between within-cluster and between-cluster resubmissions is shown in the left figure below.
- Resubmissions were less cited when occurring between journals from different journal communities (fields)
- The difference between first-intents and resubmissions within fields (red bars) is even more pronounced. Resubmissions between journal fields (green) show an opposite pattern.
As can be seen, citation counts are consistently shifted to higher values for resubmissions within a cluster of journals (field) compared to those between fields. The latter were more likely to receive 0 to 5 citations, but less likely to receive 6-100 citations.
The data are not the same as in the top figure (some journals were excluded), so we cannot directly compare the two. For comparison, we can contrast each of the two types of resubmissions with first-intents, using the same set of journals. This is shown in the right figure above. Clearly, using only resubmissions within fields (the vast majority of resubmissions) and omitting journals not well assigned in the network reinforces the difference between resubmissions and first-intents: looking at the red bars, resubmissions were less likely to have 5 citations or fewer and more likely to have 6 or more. Resubmissions were, in particular, about 50% less likely not to be cited at all. In contrast (green bars), resubmissions between fields showed the opposite pattern and were LESS cited than first-intent submissions.
A little bit of technique: It is not advisable to try to fit simple parametric models (e.g. ANOVA or mixed GLM) since, even when log-transformed, citation counts have ugly distributions and homoscedasticity is a utopia. Instead, one can use an exact permutation procedure that removes the effect of submission history while controlling for year (3 levels), journal (923 levels), and the interaction of the two. The difference between the observed test statistic and the center of the null (permuted) distribution of this statistic gives a reliable estimate of effect size. In the article we tested for a shift in location with such a permutation procedure, using as test statistics the difference in mean log-transformed counts (but log counts still have a very skewed distribution, so the mean does not tell the whole story) and Wilcoxon’s rank-based statistic (which is more robust to the long tail of the distributions). To visualize the difference, it is better to use the actual density distribution of citation counts (i.e. the histogram) as a test statistic, and compare the distributions of the two submission histories. This is what is shown in the figure.
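For readers who want to try this kind of analysis, here is a minimal sketch of a permutation test stratified by year and journal. It is not the code used in the article: it draws random permutations (a Monte Carlo approximation) rather than running an exact procedure, and the record fields (`year`, `journal`, `citations`, `resubmitted`) as well as all function names are hypothetical, chosen for illustration only.

```python
import math
import random
from collections import defaultdict


def mean_log_diff(records, labels):
    """Mean log(1 + citations): resubmissions minus first-intents."""
    resub = [math.log1p(r["citations"]) for r, lab in zip(records, labels) if lab]
    first = [math.log1p(r["citations"]) for r, lab in zip(records, labels) if not lab]
    return sum(resub) / len(resub) - sum(first) / len(first)


def bin_share_diff(records, labels, lo, hi):
    """Share of articles with lo <= citations <= hi: resubmissions minus first-intents."""
    def share(flag):
        counts = [r["citations"] for r, lab in zip(records, labels) if lab == flag]
        return sum(lo <= c <= hi for c in counts) / len(counts)
    return share(True) - share(False)


def permutation_test(records, statistic=mean_log_diff, n_perm=2000, seed=0):
    """Shuffle submission-history labels within each (year, journal) stratum.

    Shuffling only inside strata keeps year, journal and their interaction
    fixed, so the null distribution reflects submission history alone.
    """
    rng = random.Random(seed)
    labels = [bool(r["resubmitted"]) for r in records]

    # Group article indices by stratum (hypothetical field names).
    strata = defaultdict(list)
    for i, r in enumerate(records):
        strata[(r["year"], r["journal"])].append(i)

    observed = statistic(records, labels)
    null = []
    for _ in range(n_perm):
        permuted = list(labels)
        for idx in strata.values():
            shuffled = [labels[i] for i in idx]
            rng.shuffle(shuffled)
            for i, lab in zip(idx, shuffled):
                permuted[i] = lab
        null.append(statistic(records, permuted))

    center = sum(null) / n_perm
    # Two-sided Monte Carlo p-value, counting the observed value itself.
    extreme = sum(abs(s - center) >= abs(observed - center) for s in null)
    return {
        "observed": observed,
        "null_center": center,
        "effect_size": observed - center,  # observed minus center of the null
        "p_value": (extreme + 1) / (n_perm + 1),
    }
```

Calling `permutation_test(records)` tests for a shift in mean log-transformed counts. To look at one bar of the histogram instead, pass a bin-specific statistic, for example `permutation_test(records, statistic=lambda r, l: bin_share_diff(r, l, 0, 0))` for the share of uncited articles. The bin edges are up to you; the ones used for the figures are not reproduced here.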
More about submission flows
October 19, 2012
Here you can obtain the raw data file containing the submission history (which journal was tried before) of more than 80,000 articles published in the years 2006 to 2008 in some 900 scientific journals indexed by ISI. These data form the basis of the resubmission network we studied in the article Flows of research manuscripts among scientific journals reveal hidden submission patterns.
Use the following contact form; it will send an email and an automatic reply will bring the file to you (a gzipped .csv; see the README header for instructions).




