BACKGROUND OF THE STUDY
For the purposes of this study, the term scholarly research (or research) is defined as an original individual or collaborative inquiry conducted with the purpose of developing generalizable knowledge via a systematic and reproducible process of exploration of empirical data. Scholarly publication (or publication) is defined as a publicly available codification of the generalizable knowledge produced as a result of the inquiry, in the form of an entry in a specialized peer-reviewed journal. Co-authorship means a formal manifestation of intellectual collaboration in scholarly research. Ideally, it involves the participation of two or more authors in the production of a scholarly publication. Co-authorship tendency is understood as a researcher’s most frequent way of co-authoring publications. Research productivity is defined as a ratio of outputs to inputs of scholarly research. Existing approaches to measuring research productivity relate the quality or quantity of research outputs (or a combination of them) to time, as a proxy for research inputs.[2,3] This study uses the number of scholarly publications (specifically, articles) within a fixed period as a measure of productivity. Each collaborative article is counted as a whole article towards the productivity of each of the contributing authors.
Review of the literature
In recent decades, there has been a growing global trend of co-authorship in scientific publication.[4,5,6,7] The most widespread explanation of this trend is that co-authorship has an advantage over an independent solo publication strategy because collaborative work increases research productivity in terms of both quality and quantity of publications. The assumption is that productivity rises because co-authorship creates various benefits at a relatively low cost.[8,9] Some research, such as that of Pravdic and Oluic-Vukovic, also shows that the frequency of co-authorship increases with productivity, because researchers at all levels of productivity are interested in collaborating with highly productive authors. Recently, co-authorship has become a preferred form of research production for many researchers because their reputation, promotion and salaries increasingly depend on their productivity.
Some prior studies hypothesized potential benefits of research collaboration, which may underlie the relationship between co-authorship and productivity. One benefit is the opportunity for collaborating researchers to expand their narrow conceptual and methodological expertise with the expertise of their collaborators.[11,12,13] Another benefit is that collaboration allows researchers to pool limited funds and to improve access to costly equipment.[11,13] The third benefit is that it increases the number of studies that can be undertaken, owing to the efficiency arising from the division of labor, and that it provides an opportunity for rigorous peer review, which improves article quality and increases the probability that an author’s work will be accepted for publication in a journal. An important benefit of collaboration is that it allows a researcher to gain from a partner tacit information about techniques, which is otherwise inaccessible.[11,12] In addition, co-authorship creates companionship and a sense of mutual responsibility, which helps researchers to sustain motivation.
To inform public policy on whether to and how to encourage co-authorship to increase research productivity the following questions need to be answered: (1) is there strong empirical evidence that co-authorship increases research productivity? (2) How exactly does co-authorship increase research productivity? Does it matter how one co-authors in terms of effects on productivity? (3) How does the context of research and co-authorship affect productivity?
At this stage, most research is focused on the first question. Some studies have demonstrated that co-authoring researchers publish more articles over the course of their career.[16,17,10] Other studies have revealed that co-authored articles are of higher quality: they are more likely to be accepted for publication, to be published in leading journals, and to be cited by other researchers over a longer period.[18,19,9,20]
Even though some statistical studies have confirmed the link between co-authorship and productivity, Bozeman and Lee noted that there are not enough studies to claim the presence of the association with confidence. More studies must be conducted to test the association in various disciplines. Duque et al. pointed out that most studies used data from countries in Europe and North America and that, hence, the conclusions of the studies may not hold for countries with lower levels of science and technology capacity. The majority of studies do not take into consideration the possible interaction effects with other variables.[21,8,9] These variables include age, rank, and status of the researchers, grant support or contractual organization of collaboration, gender, marital status, citizenship, perceived discrimination, job satisfaction, intellectual ability of the researcher, and actual form of co-authorship.
Regarding the second question relevant to public policy, the dominant view is that the mechanism linking co-authorship and productivity can be explained by the concept of social capital. Social capital is defined as “the sum of the resources, actual or virtual, that accrue to an individual or a group by possessing a durable network of more or less institutionalised relationships of mutual acquaintance and recognition.” The difference in productivity between co-authoring and non-co-authoring researchers can be attributed to the presence or absence of social capital that is generated in a collaborating group of authors. A methodological approach associated with research on social capital is social network analysis, which views behavior as a result of complex relations among individuals and explores it by reconstructing and then analyzing social networks consisting of individuals and the relationships among them.
The concept of social capital and social network analysis have been applied in many previous studies of co-authorship and research productivity.[1,23,24,25,26,27] These studies have reproduced social networks of researchers and the co-authorship relations among them in various disciplines and have established a strong positive link between different characteristics of social networks representing co-authorships and author productivity. In addition, these studies showed a relation between publication productivity and an author’s position in relation to other researchers in the networks, which might indicate that the form of co-authorship with others does affect productivity.
Some recent studies have explored the relationship between different forms of co-authorship and research productivity more directly. Rumsey-Wairepo used dominant theories in social network analysis to identify several types of co-authorship network structures, which could be interpreted as forms of co-authorship. She applied social network analysis to higher education research in the US to explore to what extent each of the structures was present in the field and how they compared in their relationship with productivity. Hill used publication data from tenured faculty in a computer science department at a U.S. university to determine the relationship between different co-authorship network structures and research productivity. While not directly concerned with research productivity in terms of article publication, some studies in management, most notably by Lee Fleming et al., use patent data to analyze the relationship between co-authorship network structures and creativity.
Research on co-authorship has yet to provide answers regarding the third question that might be of interest for policy makers, i.e. how the effect of co-authorship on productivity is shaped by the context of research. Most importantly, because prior research has been based on data from countries with advanced science and technology capacity, findings might not be appropriate for policy makers to reference in countries that have not yet achieved a similar capacity. This circumstance calls for more country-level case studies of co-authorship and research productivity, as well as for statistical studies, which use country as a control variable.
Given the state of research on co-authorship and productivity, this study intends to contribute to the field by exploring the relationship between co-authorship forms and research productivity. To account for the fact that some researchers may use different forms of co-authorship in different publications, we assume that some forms occur more frequently in the portfolio of an individual author. We refer to such frequently used forms as co-authorship tendencies.
To address the criticism that prior studies gave little attention to co-authorship in countries outside Europe and North America, this study uses data from Russia. It also uses publications from cardiology to contribute to the understanding of co-authorship in the underexplored biomedical research field.
The study is organized around the following research questions:
To what extent are various co-authorship tendencies present in cardiologic research in Russia?
How does research productivity differ among authors with various co-authorship tendencies?
Assuming that a particular co-authorship tendency is associated with a specific configuration of a co-authorship network, this study classifies co-authorship tendencies using the classification of co-authorship network structures developed by Rumsey-Wairepo. Her classification uses ideas, methods, and measures from social network theory and analysis to differentiate among various types of social structures.
Two competing views on the type of social structure that creates social capital were proposed by Burt and Coleman. Coleman argued that social capital is bonding in nature and arises primarily from cohesive networks. Cohesion facilitates trust and cooperation between individuals. Social capital arises in cohesive communities through such processes as establishing obligations, expectations and trustworthiness, creating channels for information, and setting norms backed by effective sanctions.
A cohesive network is characterized by several social network analysis measures developed by Burt. First, it has high density, i.e. it may have redundant relations among individuals, whereby individuals are connected to others who are also connected to one another. Second, it is characterized by high mean strength of ties, i.e. within the network a pair of individuals tends to have multiple contacts with each other. Third, a cohesive network includes a small number of participating individuals. Finally, a cohesive network is also characterized by high constraint, a complex social network measure, developed by Burt, which takes into consideration all the preceding measures and, essentially, assesses the extent to which individuals in cohesive networks constrain each other.
In the case of co-authorship networks, cohesion might exist in small groups of co-authors, who regularly write papers with one another. Greater research productivity may arise in such groups because members of the group know each other’s strengths and weaknesses, can effectively distribute responsibilities, and trust the quality of work which saves time and effort. In addition, informal norms of communication that might exist among co-authors prevent free-riding and encourage mutually beneficial behavior, such as citing each other’s work in publications. Finally, repetitive co-writing creates clear and efficient communication among the researchers, making clarification of ideas and approaches more effective.
Burt suggested an alternative to Coleman’s explanation of the source of social capital. He argued that social capital is bridging in nature and is associated with the social structure full of structural holes, existing when the ties within a network are weak and many potential new contacts can be established. Social capital is accumulated by individuals serving as brokers, connecting otherwise unconnected individuals. The advantages created by bridging social capital include better access to novel information and to diverse human and financial capital of other people, having greater visibility and being able to produce more innovative solutions. Burt also noted that weaker ties do not require investment to sustain the level of closeness necessary for cohesion.
In terms of social network measures, a network with structural holes is characterized by a lower density and a lower mean strength of co-authorship ties. It typically includes many participants, i.e. it is large. Finally, a network with structural holes has a high value of efficiency. This is another one of Burt’s complex measures that uses size, density and mean strength of ties to assess redundancy and number of holes in a network.
In the case of co-authorship networks, a researcher demonstrating an egocentric (i.e. individual) network of structural holes would have an extended network of contacts with whom he/she co-authors only once. The co-authors would be expected to be very different in their conceptual and methodological backgrounds, depending on their field and line of inquiry. Such authors could be more productive because they would generate more novel ideas due to working in interdisciplinary fields. They might also have higher visibility because co-authors would publish and cite collaborative papers in diverse journals, thus increasing each other’s visibility and the chances that the work would be cited by people outside the network. They would also have better access to resources.
Building on the ideas of Coleman and Burt, and utilizing the summary measures of constraint and efficiency, Rumsey-Wairepo developed the following classification of co-authorship structures: (1) isolate structure, which is characteristic of a researcher who tends to publish alone; (2) dyadic structure, which is common for researchers working in exclusive pairs; (3) cohesive structure, which is characterized by a high value of constraint and a low value of efficiency; (4) structural holes structure, which is characterized by low constraint and high efficiency; (5) independent structure, which is characteristic of a configuration of ties with low values for both measures; (6) complex structure, which is characteristic of tie configurations with high values of both measures; (7) middle structure, which is essentially the middle ground, where the configuration is average in both efficiency and constraint.
In this study, to classify types of co-authorship tendencies, we use Rumsey-Wairepo’s system of classification with two minor modifications. Specifically, to make labels for tendencies more parallel, we use the labels “bonding” and “bridging” for cohesive and structural holes tendencies respectively. The labels will correspond to two types of social capital created by the tendencies. In addition to that, the label “complex” is replaced with the label “combination,” which better describes the essence of the tendency. Figure 1 summarizes the adapted visual representation of the classification of co-authorship tendencies based on Rumsey-Wairepo. Table 1 provides a lay description of the tendencies.
Justification for the Choice of Russian Cardiology
Russia represents an interesting case for the analysis for several reasons. Russia is not a scientifically-advanced country, but it is recognized as scientifically-proficient with advanced scientific expertise in many disciplines at the basic and applied level. In addition, it has developed research publishing. In 2007, Russia was ninth in the world in the number of published research journals. Russia’s high level of research activity implies large co-authorship networks and the potential for manifestation of a variety of co-authorship tendencies. Due to availability of recognized Russian journals it is expected that many researchers publish domestically rather than submit their publications abroad. The Russian scientific enterprise is still relatively closed and underfunded at the individual level, and in many fields Russian researchers do not actively submit their articles to journals outside the country.
The second rationale for the choice of Russia was the high probability that the co-authorship networks in the country would be relatively complete and representative of the pool of researchers in the country. In the case of other scientifically proficient countries, such as India or China, the fear was that a large share of talented researchers would prefer submitting their articles to journals abroad. As a result, if only national journals in these countries were examined, the productivity of those researchers, who publish both domestically and abroad, would be underestimated.
Cardiology was chosen as the field of analysis because it could provide useful insight into co-authorship and productivity in the underexplored field of biomedical sciences. Prior research has shown that scientists in different fields have different preferences for co-authorship and vary in the average number of co-authoring contributors per article and in the frequency of participation in international collaborations.[33,34,35,36] In addition, there is prior evidence that medical fields of research are relatively closed in Russia due to nationally specific methodological approaches and lack of knowledge of the English language among the researchers. Based on this evidence, it seemed reasonable to assume that a social network of cardiologic research in Russia would be relatively complete and would generate accurate estimates of productivity.
The study was conducted in two phases. During the first phase, social network analysis was used to reproduce the co-authorship network in cardiologic research in Russia and to identify co-authorship tendencies of each of the authors. In the second phase, statistical analysis was conducted to determine the distribution of researchers across types of co-authorship tendencies and to examine the relationship between co-authorship and productivity in general, as well as to explore whether there is any difference in productivity between researchers with different co-authorship tendencies.
Social Network Analysis
The methodology for identifying co-authorship tendencies was largely adopted from Rumsey-Wairepo (2006). Study-specific details are provided below.
Three journals were chosen for inclusion in social network analysis to represent the field of cardiology in Russia: Cardiology (Kardiologiya), Cardiovascular Therapy and Prevention (Kardiovaskylyarnaya Terapiya i Profilaktika), and Russian Journal of Cardiology (Russkii Zhurnal Kardiologiyi). These three journals were selected because they were listed as key journals for the publication of research articles on the website of the Russian Scientific Society of Cardiologists (Vserossiiskoe Nauchnoe Obshestvo Kardiologov), http://www.cardiosite.ru.
The authors and co-authorships for inclusion in the network were selected using an advanced search in ISI Web of Science. The search returned all scholarly articles from the three journals for a period of six years (2004-2009) that indicated Russia as the country of origin. The search results were imported into a Microsoft Access database, where several queries were run to determine each author’s productivity (count of articles) and to produce an edge list, i.e., a table recording the absence or presence of co-authorship ties (each representing an instance of co-authorship) between all possible pairs of authors in the database.
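The two query results described above (article counts per author and a table of co-authorship ties) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Access queries used in the study: the function name `productivity_and_ties`, the input layout (one list of author names per article), and the sample author names are all hypothetical. Each article is counted as a whole article for every contributing author, and each unordered pair of co-authors on an article is recorded as one instance of a tie.

```python
from collections import Counter
from itertools import combinations


def productivity_and_ties(articles):
    """Return (article count per author, co-authorship count per author pair)."""
    productivity = Counter()
    ties = Counter()
    for authors in articles:
        unique = sorted(set(authors))  # guard against a name listed twice
        productivity.update(unique)    # whole counting: one article per author
        ties.update(combinations(unique, 2))  # every unordered pair on the article
    return productivity, ties


# Hypothetical records: one list of author names per article.
sample_articles = [
    ["Ivanov", "Petrov"],
    ["Ivanov", "Petrov", "Sidorov"],
    ["Orlov"],
]
sample_productivity, sample_ties = productivity_and_ties(sample_articles)
```

Pairs that never appear as keys in `sample_ties` correspond to the absence of a tie, and an author such as the hypothetical "Orlov", who has a publication but no ties, would be an isolate.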
The resulting edge list was imported into UCINet 5.0 (34), a software package for the analysis of network structures. The UCINet 5.0 analysis was aimed at calculating size, as well as the measures of constraint and efficiency, which were used in the identification of co-authorship tendencies.
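To make the three measures concrete, the sketch below computes them for one author from an undirected, unweighted edge list. It is an assumption-laden illustration rather than a reproduction of the UCINet routine, whose exact options are not reported here. Size is taken as the number of distinct co-authors n; efficiency is effective size divided by n, with effective size computed as n - 2t/n, where t is the number of ties among the co-authors (a common simplification for binary, undirected ties); constraint is the sum over co-authors j of (p_ij + sum over q of p_iq * p_qj) squared, where p_ij is the share of i's ties that goes to j. The function names `build_adjacency` and `ego_measures` are hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Turn an undirected edge list into a dict mapping author -> set of co-authors."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def ego_measures(adjacency, ego):
    """Size, efficiency and constraint of one author's network (binary ties)."""
    alters = adjacency.get(ego, set())
    n = len(alters)
    if n == 0:
        # An isolate has no ties, so the two ratio measures are undefined.
        return {"size": 0, "efficiency": None, "constraint": None}

    # Number of ties among the ego's co-authors (each tie counted once).
    ties_among_alters = sum(
        1 for a in alters for b in adjacency[a] if b in alters
    ) // 2
    effective_size = n - 2 * ties_among_alters / n
    efficiency = effective_size / n

    def proportion(i, j):
        # Share of i's ties that goes to j (equal shares for binary ties).
        return 1 / len(adjacency[i]) if j in adjacency[i] else 0.0

    constraint = 0.0
    for j in alters:
        # Indirect investment in j through every other co-author q.
        indirect = sum(
            proportion(ego, q) * proportion(q, j) for q in alters if q != j
        )
        constraint += (proportion(ego, j) + indirect) ** 2
    return {"size": n, "efficiency": efficiency, "constraint": constraint}
```

In a closed triangle of three mutual co-authors, each author has efficiency 0.5 and constraint 1.125 under these formulas, whereas the center of a three-pointed star has efficiency 1 and constraint 1/3, which matches the contrast between cohesive and structural-holes networks described earlier.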
To categorize co-authorship network structures into the seven hypothesized tendencies, the UCINet 5.0 output was exported to Excel. An additional Excel file (database) was created to keep the results of the analysis. First, all isolates and dyads were identified based on size and were moved from the analysis database into the results database. Second, a list of authors with the middle co-authorship tendency was generated by (1) trichotomizing the ranges of constraint and efficiency with the “percentile” formula in Excel and (2) filtering for all records with values of constraint and efficiency falling in the second third of the corresponding ranges. The resulting list was then copied to the results database, and all records of authors with the middle co-authorship tendency were removed from the analysis database.
Finally, to determine which of the remaining authors had independent, combination, bonding or bridging co-authorship tendencies, the following procedures were run in Excel. First, the middle of the range for constraint and for efficiency was calculated using the median formula in Excel. Second, the values of constraint and efficiency were recoded using a logical formula in Excel into “Low” or “High” depending on whether they fell below or above the middle of the range. The records in the analysis database were then sorted into the remaining four tendencies using filters on the constraint and efficiency columns. The filter setting “Equals Low” for both measures corresponded to the independent co-authorship tendency. The setting “Equals High” for both measures corresponded to the combination co-authorship tendency. The settings “Equals Low” for efficiency and “Equals High” for constraint corresponded to the bonding co-authorship tendency, while the reversed filter settings were used to identify authors with the bridging tendency. The lists of authors produced by filtering were then copied to the results database. The results database was used in the statistical analysis.
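The spreadsheet steps above can be summarized as a short Python sketch. Several details are assumptions made only for illustration and are not stated in the text: isolates are taken to have size 0 and dyads size 1; the tertile cut points are computed on the records left after isolates and dyads are removed; the medians are computed on the records left after the middle group is removed; and a value equal to the median is coded as “High”. The function name `classify_tendencies` and the sample records are hypothetical. `statistics.quantiles` with `method="inclusive"` is used as a stand-in for the Excel percentile formula.

```python
from statistics import median, quantiles


def classify_tendencies(records):
    """records: iterable of (author, size, constraint, efficiency) tuples."""
    labels = {}
    analysis = []
    # Step 1: isolates and dyads are identified from network size alone.
    for author, size, constraint, efficiency in records:
        if size == 0:
            labels[author] = "isolate"
        elif size == 1:
            labels[author] = "dyadic"
        else:
            analysis.append((author, constraint, efficiency))
    if len(analysis) < 2:
        return labels  # too few records to compute cut points; rest unclassified

    # Step 2: "middle" = both measures inside the second third of their range.
    c_low, c_high = quantiles([c for _, c, _ in analysis], n=3, method="inclusive")
    e_low, e_high = quantiles([e for _, _, e in analysis], n=3, method="inclusive")
    remaining = []
    for author, c, e in analysis:
        if c_low <= c <= c_high and e_low <= e <= e_high:
            labels[author] = "middle"
        else:
            remaining.append((author, c, e))
    if not remaining:
        return labels

    # Step 3: median split of the remaining records into Low / High.
    c_mid = median([c for _, c, _ in remaining])
    e_mid = median([e for _, _, e in remaining])
    names = {
        (False, False): "independent",  # low constraint, low efficiency
        (True, True): "combination",    # high constraint, high efficiency
        (True, False): "bonding",       # high constraint, low efficiency
        (False, True): "bridging",      # low constraint, high efficiency
    }
    for author, c, e in remaining:
        labels[author] = names[(c >= c_mid, e >= e_mid)]
    return labels


# Hypothetical records: (author, size, constraint, efficiency).
sample_records = [
    ("A", 3, 0.9, 0.2), ("B", 6, 0.1, 0.9), ("C", 4, 0.1, 0.1),
    ("D", 5, 0.9, 0.9), ("E", 4, 0.5, 0.5), ("F", 4, 0.5, 0.5),
    ("G", 0, None, None), ("H", 1, 1.0, 1.0),
]
sample_labels = classify_tendencies(sample_records)
```

Because the cut points depend on which records are still in the analysis set at each step, a different order of operations or a different treatment of ties at the median would change some assignments.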
Statistical analysis included descriptive and inferential analysis with SPSS 16.0 software. The primary descriptive statistic of interest was the number of authors with each of the proposed co-authorship tendencies. This statistic provided information on the distribution of researchers across types of tendencies to answer the first research question.
Inferential analysis was intended to answer the second research question. The original plan was to use the classical analysis of variance (ANOVA) with post hoc multiple comparisons. The dependent variable in the ANOVA model would be the number of publications by an individual researcher (measure of research productivity). The main independent (or treatment) variable would be types of co-authorship tendencies identified during Phase I. The types of tendencies were recoded into integers prior to statistical analysis.
The data obtained in Phase I did not satisfy the assumptions of parametric analysis. Specifically, it failed to meet both the normality and homogeneity of variance assumptions. Several data transformations (square root, cube root, logarithmic, inverse, and sine) were attempted to achieve either normality or greater homogeneity. The data was found to be insensitive to the transformations. Given the nature of the data, the inferential strategy chosen as an alternative to classical ANOVA was resampling, specifically bootstrap ANOVA (35, 36, 37). The procedure does not make normality and homogeneity of variance assumptions about the underlying population. Instead, it approximates the actual population by resampling with replacement from the original sample and makes inferences based on this approximated population.
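The resampling idea described above can be illustrated for a single pairwise comparison of group means. The sketch below is not the SPSS bootstrap ANOVA used in the study: it draws resamples with replacement from each group and reports a simple percentile interval for the difference in means, which is a plainer technique than a bias-corrected and accelerated interval. The function name `bootstrap_mean_difference`, its default arguments, and the two groups of article counts are hypothetical.

```python
import random
from statistics import mean


def bootstrap_mean_difference(group_a, group_b, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    lower = diffs[int(n_resamples * alpha / 2)]
    upper = diffs[int(n_resamples * (1 - alpha / 2)) - 1]
    return lower, upper


# Hypothetical article counts for two groups of authors.
group_one = [5, 6, 7, 5, 6, 8]
group_two = [1, 1, 2, 1, 1, 1]
interval = bootstrap_mean_difference(group_one, group_two, n_resamples=2000, seed=1)
```

If the resulting interval excludes zero, the difference in mean productivity between the two groups would be treated as significant at the chosen level; with several groups, the comparison would be repeated for each pair, with some control for the family-wise error rate.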
RESULTS AND DISCUSSION
General characteristics of the sample
A total of 1,241 records (articles) was extracted from ISI Web of Science for the period 2004-2009. Seven hundred forty-seven articles (60%) were from the journal Kardiologiya, 319 (26%) from Cardiovascular Therapy and Prevention, and 175 (14%) from the Russian Journal of Cardiology. In terms of year of publication (Table 2), the largest number of articles (262 or 21%) was from 2008, while the smallest number of articles (118 or 10%) was from 2006.
The total number of authors who published in the three journals during the period 2004-2009 was 2,666. Table 3 shows how many authors contributed to all three, two, or only one of the journals in the sample. As the Table shows, most of the authors (85%) contributed to only one journal. Only 43 individuals (1%) contributed to all three journals.
The three most productive authors in the sample published 36 articles each. The next ten most productive authors contributed from 19 to 32 articles. Most of the authors (1,790 or 67%) published only once in the three journals during the period 2004-2009. The mean number of articles published by an individual author in the three journals during 2004-2009 was 1.95 with a standard deviation of 2.58. Median productivity for all authors was equal to 1.
Results Pertaining to Research Question 1
Table 4 presents the distribution of authors across the hypothesized types of co-authorship tendencies. As can be seen from the Table, the most typical tendency among the authors publishing in the Russian cardiologic journals was the middle ground tendency, which was used by 26 percent of the authors. The independent, combination, and bridging tendencies were well and relatively equally represented (18 to 21%). The bonding, dyadic, and isolate tendencies were less represented and were characteristic for 231 (9%), 135 (5%), and 39 (1%) researchers respectively.
Results Pertaining to Research Question 2
Prior to implementing a formal inferential test to assess the difference in research productivity among researchers with different co-authorship tendencies, the data was analyzed descriptively. Table 5 lists the identifiers for the 25 most productive authors in the network alongside their tendencies and number of publications. The average number of publications for the top 25 authors is 21 articles. All but two of these authors used the bridging co-authorship tendency, which indicates that this tendency is probably the most productive on average.
The fact that the bridging tendency is associated with the highest level of research productivity compared with other types of tendencies is most evident in Table 6. Authors demonstrating bridging co-authorship tendency had an average productivity of five articles, while the average for all other tendencies is one article.
As indicated above, the data did not meet the assumptions of normality and homogeneity of variances required to justify inferential analysis of the relationship between the level of productivity and co-authorship tendency with classical ANOVA. Since ANOVA is widely considered to be robust to violations of these assumptions (38), the results of the tests of assumptions are presented below to justify the use of bootstrap ANOVA.
Figure 2 presents a Normal Q-Q plot of the quantiles of the residuals against the quantiles of the normal distribution. Figure 3 presents a histogram of the residuals compared with the normal curve. Both graphs show that the distribution of the residuals does not fit the theoretical normal distribution, providing evidence that the normality assumption is violated. The Kolmogorov-Smirnov test, appropriate for large datasets, also rejected the null hypothesis that the residuals were normally distributed (D(2,666) = 0.32, p<0.01). The Descriptive Statistics command produced a skewness value of 7.27 and a kurtosis value of 82.26 for the residuals. The positive values indicate a high degree of right skewness and a leptokurtic distribution. The data also failed to meet the assumption of homogeneity of variance, which was tested using Levene’s test (W(6; 2,659) = 153.44; p<0.01).
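For readers who want to see how skewness and kurtosis statistics of this kind are obtained, the sketch below implements the bias-adjusted sample formulas commonly used by statistical packages and spreadsheets. That SPSS 16.0 used exactly these formulas is an assumption here, not something the text confirms; the function name `sample_skewness_kurtosis` and the sample values are hypothetical. Kurtosis is reported as excess kurtosis, so a normal distribution would give a value near zero.

```python
from math import sqrt


def sample_skewness_kurtosis(values):
    """Bias-adjusted sample skewness and excess kurtosis."""
    n = len(values)
    if n < 4:
        raise ValueError("at least four observations are required")
    m = sum(values) / n
    s = sqrt(sum((x - m) ** 2 for x in values) / (n - 1))  # sample std. deviation
    z3 = sum(((x - m) / s) ** 3 for x in values)  # sum of cubed z-scores
    z4 = sum(((x - m) / s) ** 4 for x in values)  # sum of fourth-power z-scores
    skewness = n / ((n - 1) * (n - 2)) * z3
    kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
                - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return skewness, kurtosis


# Hypothetical article counts: most authors publish once, one publishes a lot.
sample_counts = [1, 1, 1, 1, 1, 2, 2, 3, 36]
sample_skew, sample_kurt = sample_skewness_kurtosis(sample_counts)
```

A distribution in which most authors publish once and a few publish heavily, as in the hypothetical `sample_counts`, yields positive values for both statistics, which is the pattern of right skewness and heavy tails reported above.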
Bootstrap ANOVA was conducted in SPSS 16.0 as an alternative to classical ANOVA. The bootstrap was based on 5,000 resamples. The seed was set at 2,000,000. A ninety-five percent bias-corrected and accelerated (BCa) interval was used to correct for bias and skewness. The mean differences for the bootstrap procedure were obtained from and compared with Tamhane’s T2 test, which automatically controls the family-wise error rate at 0.05.
As Table 7 shows, for 12 comparisons the bootstrap confidence interval did not include Ma-Mb=0. This implies that for these twelve comparisons, whose confidence intervals are italicized in the Table, a significant difference in mean productivity was found. Figure 4 summarizes the results of the analysis by presenting the ranking of tendencies in terms of their relationship with productivity. According to this ranking, bridging is associated with the highest levels of productivity. This tendency is followed by the middle tendency. All other tendencies are associated with much lower levels of productivity than the middle tendency. In addition, the independent and combination tendencies are more productive than the bonding tendency.
One important finding of the study is that researchers in Russian cardiology have low average productivity and that they are not normally distributed in terms of their productivity level. One explanation of the low average productivity is that article publication is not important for a career in Russian cardiology. Another explanation is that the key journals cover two types of contributors: highly productive researchers and rarely publishing practitioners. There might also be “noise” in the sample, produced by individuals, who are not actual researchers (including graduate students and laboratory personnel).
Another key finding of the study is that all hypothesized co-authorship tendencies are present in the field of Russian cardiologic research, although to a different extent. The most common form of co-authorship among the authors publishing in the Russian cardiologic journals was the combination tendency. The independent, bridging, and middle tendencies were well and relatively equally represented. The isolate, dyadic, and bonding tendencies were less represented, with the bonding tendency being most underutilized. The pattern of occurrence of tendencies indicates that, overall, forms of co-authorship based on bridging are more common in Russian cardiology than individual publication or forms of co-authorship generating bonding social capital. The combination tendency, which is most common in Russian cardiologic research, does create the bonding capital, but only in combination with bridging capital.
The identified distribution could be explained in two ways. First, co-authorship based on bridging may be chosen by researchers because of the nature of their training and the dominant culture in the field; in other words, researchers favor co-authorship based on bridging because they know no alternative or because the alternative is not an option. Second, co-authorship based on bridging might be more productive and preferred by researchers in pursuit of a greater number of publications.
It is important to mention that maintaining contact with co-authors makes bonding costly. In the case of bonding, much effort and money are expended on maintaining contact with co-authors. In view of the low frequency of bonding tendency and its lowest position in the productivity ranking, it is possible that the cost of bonding is more prohibitive than the cost of bridging in Russia. One possible explanation comes to mind: a researcher using bridging could invest in attending one conference and get many new contacts out of it, thus cutting some costs. Costs cannot be cut in such a way in the case of bonding.
The dominant position of the combination tendency in the ranking, based on usage, and the second position of the middle tendency in the ranking of productivity effects indicate that the best and most preferred form of co-authorship for cardiologic researchers in Russia is to combine both bonding and bridging. In such a combination a researcher invests efforts and financial resources to maintain close contact with an established group of researchers, possibly conducting research of common interest, and, simultaneously, tries to publish with people from other groups to expose himself/herself to novel ideas and to increase visibility.
Several implications for future research follow from the study. First, more studies need to be conducted to explore the relationship between forms of co-authorship and research productivity. So far, only a few studies have addressed the relationship directly, and their conclusions need to be confirmed by similar studies contextualized in a variety of research fields and in countries with different levels of science and technology capacity.
Second, to the extent possible, future studies should attempt to control for potential confounding variables, such as the amount of available funding, the gender of the researcher, their rank and experience, as well as others indicated in the background section. Evidently, the use of ISI Web of Science as a source of data for compiling the study sample does not allow sufficient information about the confounding variables to be collected. Hence, different sources of data might be utilized. Alternatively, the confounding variables could be accounted for indirectly. For example, a mixed-method approach could be utilized, whereby a quantitative analysis would be supplemented with an interview, survey or document analysis to gain a better understanding of the journals, the research field and the research activity in the field, including information about the extent of practitioners’ participation in research, the types of articles published in the journals, and the costs involved in different types of research collaboration.
Finally, subsequent studies of co-authorship and productivity in biomedical fields might want to address one of the limitations of this study: its failure to differentiate between types of research articles. After this study was completed, one of the experts in the field of biomedical science brought to the author’s attention that there are many types of cardiological articles, such as reports on randomized clinical trials, clinical investigations, case reports, laboratory studies carried out in animals, or retrospective studies of medical records. It is possible that the relationship between a co-authorship form and research productivity is moderated by an important covariate, the type of cardiological article. It was not feasible to control for the type of article in this study because the author lacks the training in the field needed to classify articles qualitatively and because the ISI Web of Science bibliometric record contains no information about the type of article. Nevertheless, the reported results might stimulate interest and provide background for subsequent studies, which could address this issue by involving an expert to determine article type.