Measuring the value of a scientific work is important to researchers, institutions, decision makers, and countries. Researchers weigh a number of journal quality measures when choosing where to publish their work. Information professionals also consider these measures in the source development process. In addition, the quality of the journals in which scientific papers are published is a critical factor in university rankings. Many methods are currently used to determine journal value, and they are affected by all these considerations. The journal impact factor (JIF) is the best-known criterion for determining the relative importance of journals. In addition, the eigenfactor (EF) is a metric used for evaluating scientific outputs. The JIF is the metric most widely used both in source collection and by authors selecting publication venues for their scientific output. JIF is calculated by simply counting the raw number of citations. Another metric, the article influence score (AIS), determines the impact of an article within five years of its publication and is based on the EF value (Journal Citation Reports, 2016).
However, these quantitative metrics used in scientific circles to gauge the value of published works do not always yield an accurate evaluation. JIF and AIS were not initially designed to measure article quality. The original purpose of the JIF was to help librarians select journals (San Francisco, 2012). JIF, in fact, determines about how many citations “an average article” published in a given journal receives within a specific period of time. Therefore, evaluating the quality of an article by considering only its JIF value is inappropriate. Besides, the distribution of article citations is skewed. A few articles published in journals are cited much more than the average and most others receive few or no citations (long tail theory) (Andersen, 2007). As such, even if the impact factor of a journal is known, the number of citations for an article published in that journal cannot be easily estimated based on the normal distribution theory and the average number of citations. Moreover, the citation culture varies in different disciplines. For instance, the JIF value of a journal in the social sciences is always lower than that of a journal in the physical sciences because books are the dominant form of publication in the social sciences. If we simply consider the number of citations, the citations from each journal are treated equally, and the prestige of the journal from which the citation is made is not considered. In addition, the unique citation culture of the fields is also ignored in these evaluations. In our consideration of popularity and literary prestige, popularity indicates the number of citations made by authors of other studies and prestige indicates which citations were made by more prestigious authors and journals (Ding and Cronin, 2011). Based on these definitions, the popularity of a source cannot be expected to necessarily be equal to its prestige. 
Therefore, a citation from a more visible study that received many citations does not have the same weight as a citation from a study with lower scientific visibility (number of citations) (Bollen, Rodrigues, and Van de Sompel, 2006; Maslov and Redner, 2008).
In calculating popularity, JIF measures how many citations, on average, a journal’s articles received in a given year. This makes it a journal-level metric that normalizes the number of citations (Sjimagojr, 2016). Nevertheless, for this calculation, citations from a journal with a higher impact factor are considered to be equal to those from a journal with a lower impact factor. The problem of prestige being ignored is for the most part solved by the EF metric, which uses computation to bring to light the connections within a citation network. This technique was inspired by Google’s PageRank algorithm and offers a more in-depth calculation for the evaluation of sources. Besides the number of citations of publications, EF considers the sources of these citations. Thus, we can calculate a more accurate value by taking into account both article popularity and the quality of the citation sources. Furthermore, the EF calculation is normalized by field, since each field has its own distinct citation culture (Bergstrom, West and Wiseman, 2008). To make an article-level evaluation, we obtain the AIS value by dividing the EF value by the number of articles published in the journal (JCR, 2016). In fact, AIS is similar to the JIF metric in being a ratio of the journal citation impact to the number of articles in that journal over a five-year period (JCR, 2016). Moreover, as AIS is calculated based on the whole JCR citation network, citations from frequently cited journals carry a different weight. Therefore, we can use AIS as a metric for interdisciplinary comparisons.
Few studies in the literature have separately treated popularity—based on the raw number of citations—and prestige—based on the weighted number of citations. Nor has the difference between popularity and prestige been mentioned in many studies. Only a few researchers have used these two methods to evaluate authors, publications, and journals.
The increased weighting of citations from prestigious sources was initially suggested by Kochen (1974) and Pinski and Narin (1976). Pinski and Narin (1976) proposed a model for calculating the prestige of journals by considering the prestige of the journal making the citation. The objective in these studies was to develop a weighted citation-based metric for measuring the relative impacts of scientific journals. According to this metric, citations from a prestigious journal and those from others cannot have equal weight. Later, two methods were proposed for calculating the prestige of scientific products. Bollen, Rodrigues, and Van de Sompel (2006) suggested the use of the PageRank algorithm to evaluate the quality (prestige) of products, and Bergstrom (2007) and Bergstrom, West, and Wiseman (2008) proposed EF, as described above.
Before the use of EF value-based studies in the literature, researchers calculated the prestige of journals by using the PageRank algorithm, which provided the groundwork for calculating EF. Bollen, Rodrigues, and Van de Sompel (2006) compared journals in terms of their weighted PageRank and JIF values, and the ranking results yielded very big differences. Only journals such as Nature, Science, and The New England Journal of Medicine appeared in both rankings. According to that study, while popular journals received high JIF values, they had lower PageRank values. Habibzadeh and Yadolahie (2008) stated that the best way to measure journal quality is the weighted JIF calculation and they conducted a study using the JIF values of citation sources. According to this study, if the journal in which the citation was made has a high JIF value, a higher value is assigned to the citation. In another study on information retrieval, Ding and Cronin (2011) analyzed how many citations came from highly cited studies.
After the EF calculation was introduced, a number of studies compared this metric with others. In a comparison of EF, total citations, and two-year JIF values for 165 pharmaceutical journals, Davis (2008) found a strong correlation between two of the metrics (Spearman’s rho = 0.84). Interestingly, the author also found a strong correlation between EF and the total number of citations (Spearman’s rho = 0.95). According to this study, popularity and prestige measures in this field point to the same results. Franceschet (2010) took basic bibliometric values into consideration to measure popularity and EF values to measure prestige. This author investigated the overlap of five-year IF and EF values in the social and physical science journals in the JCR (Journal Citation Reports). Although there was a strong statistical correlation between prestige and popularity in both fields, there were significant differences in some cases, especially in the physical sciences. Based on the results, the author divided journals into four categories with respect to their prestige and popularity.
We also see higher similarity rates between metrics in studies that have included the AIS metric. Arendt (2010) compared JIF and AIS values and found a significant correlation between them (0.89). The author investigated the variability of the AIS metric with respect to discipline. A correlation between the two metrics, however, does not necessarily mean that there is a cause-and-effect relationship. Nevertheless, similar findings were made regarding the correlation between these metrics in other comparisons. Saad (2007) compared EF, AIS, and journal h-index values in journals published between the years 1989–2004, and found a high correlation (0.90) between the metrics. Rousseau and STIMULATE 8 GROUP (2009) also compared JIF with four other metrics (SCImago Journal Rank Indicator (SJR), EF, AIS and journal h-index) in 77 journals on an annual basis. Although each metric had been calculated in a different way and was used in a different database, strong correlations were found between these four metrics and the JIF metric in the Web of Science (WoS) scientific citation index, and also among these four metrics themselves.
Apart from studies of journal and author rankings, a few studies have focused on articles and used the PageRank algorithm. Chen, Xie, Maslov, and Redner (2007) examined articles published in the Physical Review journal between the years 1893–2003. While the authors found a strong correlation between PageRank and total number of citations (0.91), many articles had not been included in the PageRank ranking despite the study results showing that they had received many citations. In a study conducted by Ma, Guan, and Zhao (2008) on molecular chemistry and molecular biology articles indexed in the WoS between 2000–2005, the authors again found a strong correlation between PageRank and the total number of citations (Spearman’s rho = 0.98).
In contrast to the tendency for authors to evaluate prestige and popularity on the basis of journals, in our study we performed an article-level evaluation. Based on the fact that the weight of each citation would differ, we measured the quality of the cited articles by ranking them based on an evaluation of both their popularity and prestige. We evaluated articles published in the Journal of the American Society for Information Science and Technology (JASIST) between the years 2010–2015, analyzed the citations of these articles, and calculated their rankings based on their JIF, EF, and AIS values.
Study Objective and Hypothesis
In this study, each citation was recognized as having a different weight, and we sought to answer the question “How much do relevance rankings based on raw citation data (JIF) coincide with those determined by normalizing the EF and AIS fields?” For this purpose, articles published in JASIST between the years 2010–2015 were weighted by also considering the value and field of the citing source, as in the PageRank algorithm. We ranked the articles based on their popularity (JIF) and journal-level prestige (EF), and then compared these two rankings. Next, we made a list of the AIS values obtained from the EF values to analyze the article-level prestige values. In these rankings, we assigned each article a new value that considered the JIF, AIS, and EF values of the journals that had published the studies citing that article. Thus, these journal-level metrics of popularity and prestige were converted to an article-level metric.
Our study hypothesis is as follows: “relevance ranking that takes popularity into consideration and normalizes according to the fields is more successful than relevance ranking based only on the raw number of citations.”
Study Scope and Limitations
We examined 1,417 articles published in the JASIST journal between the years 2010–2015 and 15,370 (13,965 unique) studies that cited these articles. Since the title of the Journal of the American Society for Information Science and Technology was changed to the Journal of the Association for Information Science and Technology in 2014, we performed our searches accordingly.
We downloaded and examined all of the articles in the related years and the sources citing these articles in .txt and .xls formats (no restriction was placed on publication type). To obtain the data to be used in our calculations, we searched the WoS database of the Institute for Scientific Information (ISI) in the first week of February 2016.
From the Journal Citation Reports (JCR), we obtained the JIF and EF values of the journals, including the sources citing the articles, from JASIST issues published between the years 2010–2015. To complete the data set, we also looked up on the WoS website the EF values of journals that were not included in the report. Then, we collected their AIS values from the eigenfactor.org website. We did not include conferences, proceedings, or books in our data set, since they fall outside the scope of the JIF and EF calculations.
If a citation source of a related article was not indexed in the WoS, we referred to it as an external source (Akbulut, 2015); such sources are not presented in the WoS-related records ranking. Since external source data are outside the scope of this study, we did not include them in our calculations.
Study Method and Data Collection Techniques
We created a data set of all the articles in JASIST journal issues published between the years 2010–2015 and the sources of the articles’ citations. We listed the cited publications and downloaded them in .xls format. After data collection, we cleaned and integrated the data. Since some of the information on the cited publications is indexed manually, mistakes are inevitably made in the indexing stage, and quantitative studies based on the analysis of incorrect data may be misleading. For instance, a given publication may be recorded as two different publications (see Figure 1). Due care must be exercised in the integration stage to ensure that the big picture is accurately interpreted. To improve accuracy, we identified potential duplicate records by running a similarity algorithm (similarity algorithm codes may be found at https://goo.gl/ZktZA2) on the data exported from the WoS and on the integrated data (the similarity rate was set at 85%). We examined the potential duplicate records and consolidated those that were the same.
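The duplicate screening described above can be sketched roughly as follows. This is a minimal illustration and not the script linked in the text: it assumes that the similarity measure is the ratio returned by Python's standard difflib.SequenceMatcher, that records are compared by title, and that the 85% figure acts as a flagging threshold. The record titles are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(title.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] (assumed measure: difflib ratio)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def potential_duplicates(titles, threshold=0.85):
    """Return index pairs whose similarity meets the threshold.

    The pairs are only candidates; they still need manual inspection
    before records are consolidated.
    """
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(titles), 2)
        if similarity(a, b) >= threshold
    ]


# Hypothetical records: the first two differ only by an indexing typo.
records = [
    "Science overlay maps: a new tool for research policy and library management",
    "Science overlay maps: a new tool for research policy and library managment",
    "Sentiment strength detection in short informal text",
]
pairs = potential_duplicates(records)
```

The pairwise comparison is quadratic in the number of records, which is acceptable for a data set of this size but would need blocking or indexing for much larger exports.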
The downloaded data set included the year, authors, title, and source information. We repeated the same operation for each year and created a single Excel file. In addition to this information, we added “ID” and “Entity ID” domains to this file, as shown in Figure 2. For example, of the total 1,417 articles published in JASIST journal between the years 2010–2015, the 35th article (entity id) received a total of three citations, and the 36th and 37th articles each received a total of five citations.
In addition to these domains, we collected the JIF, EF, and AIS values of the journal in which each study was published in order to calculate the new ranking. As noted above, we obtained the JIF and EF values from the JCR, a product of Clarivate Analytics, which is a source for evaluating and comparing scientific journals. For the journals for which these values were not available, we created a list from related values we collected from the WoS and EF (http://www.eigenfactor.org) websites. We also obtained the AIS values from the EF website.
For articles published in JASIST between the years 2010–2015, we created new values based on the JIF, AIS, and EF values of the journals that had cited those articles. In other words, we converted the JIF, AIS and EF values calculated for the journals into article-level values.
To total the JIF, EF, and AIS values for each source, we removed all duplications from the source domains of the data set shown in Figure 2 and created a new list. Then, we compared this list with a list downloaded from JCR (2014) and filled in the missing information using a written script, thereby obtaining a new list containing JIF, EF, and AIS values in addition to the domain information (see Figure 3).
We downloaded these sources and their respective values from the JCR and, using a second script, added them to the updated list. Thus, we gathered into a single list the information and JIF, AIS, and EF values of all of the publications that had cited articles in JASIST issues published between the years 2010–2015. For example, in Figure 3, there were five citations for article number 37 published in JASIST in 2013, and three of these were studies published in The Library Trends journal. The other two were in Computers in Human Behavior and Information and Management journals. The JIF, EF, and AIS values of these three journals and the number of citations made by each of them must be included in order to perform the new calculation. Therefore, we wrote a third script, in which we multiplied the JIF, EF and AIS values of the articles citing each entity (for example 37th article published in 2013) by the number of repetitions of that article and took the average to designate a new value for each entity (this value is then used in ranking the related articles). A new calculation was then performed by taking into account the prestige of the sources rather than just their raw citation values, to obtain the new value (Figure 4).
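The averaging step of the third script can be sketched as follows. This is one reading of the description above, not the authors' script: each citing journal's value is multiplied by the number of citations it contributes, and the sum is divided by the article's total citation count. The function name and the journal metric values are hypothetical and are not taken from the JCR.

```python
from collections import Counter

# Hypothetical journal-level values (NOT taken from the JCR).
JOURNAL_METRICS = {
    "Library Trends": {"JIF": 0.5, "EF": 0.001, "AIS": 0.2},
    "Computers in Human Behavior": {"JIF": 2.5, "EF": 0.02, "AIS": 0.7},
    "Information and Management": {"JIF": 2.0, "EF": 0.005, "AIS": 0.9},
}


def article_level_values(citing_journals, journal_metrics):
    """Average each journal metric over an article's citations.

    Each citing journal's value is multiplied by the number of citations
    it contributes, and the sum is divided by the total citation count.
    """
    counts = Counter(citing_journals)
    total = sum(counts.values())
    values = {}
    for metric in ("JIF", "EF", "AIS"):
        weighted_sum = sum(
            journal_metrics[journal][metric] * n for journal, n in counts.items()
        )
        values[metric] = weighted_sum / total
    return values


# Article 37 in the example: five citations, three from the same journal.
citations_37 = ["Library Trends"] * 3 + [
    "Computers in Human Behavior",
    "Information and Management",
]
values_37 = article_level_values(citations_37, JOURNAL_METRICS)
```

Citations from sources without JCR values (the external sources excluded from the study) would have to be filtered out before calling this function, since the lookup assumes every citing journal has all three metrics.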
Next, we ranked the cited articles according to their JIF, EF, and AIS values and compared these lists. For instance, in Figure 3, while article number 37, which was published in JASIST in 2013, is ranked 668th in the JIF-based calculation, it is ranked 568th in the calculation that included the fields with the EF values. It is ranked 579th in the AIS ranking, which is calculated based on the EF value.
With respect to the AIS values, those lower than 0.1 were treated as 0. Also, since there are many negative values, we took into consideration the absolute value when examining the rankings.
The JIF value is a journal-level metric: the ratio of the number of citations received in one year by the articles a journal published over the previous two years to the number of articles published in that journal over those two years (Web of Science, 2016). While the JIF value takes into consideration only the number of citations, the EF values are normalized. The AIS value is obtained by dividing EF by the number of articles published in the journal, and its average value is 1. A journal AIS value higher than 1 indicates that the journal has a high average impact and a value lower than 1 indicates that the journal has a low average impact (Cornell University Library, 2016). In this study, we created and compared relevance rankings of the same sources with respect to these metrics. In the first stage, we compared the JIF and EF rankings and evaluated their similarities. Then, we compared the ranking based on the EF values of the citations for articles published in JASIST between the years 2010–2015 with the ranking based on the AIS values. This comparison served as a kind of study checksum.
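The two definitions in this paragraph can be written compactly as below. The notation is ours and is only a sketch of the standard JCR definitions as we understand them: C_y(t) denotes the citations received in year y by items published in year t, N_t the number of citable items published in year t, and X the journal's share of all articles published over the five-year window.

```latex
\[
\mathrm{JIF}_{y} = \frac{C_{y}(y-1) + C_{y}(y-2)}{N_{y-1} + N_{y-2}}
\]

\[
\mathrm{AIS} = \frac{0.01 \times \mathrm{EF}}{X},
\qquad
X = \frac{\text{articles published by the journal over five years}}
         {\text{articles published by all journals over five years}}
\]
```

Dividing by the article share X rather than by a raw article count is what gives AIS a mean of 1 across journals, which is why values above or below 1 can be read as above-average or below-average influence.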
Similarities of the Lists and their Rate of Overlap
The JIF, EF, and AIS rankings comprise the output of the third script in which the values of the articles are calculated. Using this script, we created lists comprising the first 50 records of each ranking based on their JIF, EF, and AIS values. We considered 50 records from each to be sufficient for detecting similarities and differences between the rankings. As shown in the comparison of JIF and EF rankings in Table 1, the rate of overlap of these lists is 26% (see https://goo.gl/KjULxC), which means that the number of articles that are on both lists is 13. Table 1 presents information about the 13 records on both lists. A careful examination shows that the top-ranked records on each list are also among the top 50 on the other list. For example, whereas the study titled “Last but not Least: Additional Positional Effects on Citation and Readership in arXiv” published in JASIST in 2010 is ranked first in JIF ranking, it is ranked 11th in EF ranking. The reason why the JIF-based ranking was so high for this article, which had received only three citations, was that the citations were made by sources from which many other citations are made, such as Nature, JASIST, and an ISSI (International Society for Scientometrics and Informetrics) proceedings book. The EF value dominated the ranking slightly more since it does not include self-citation. Yet, when the status of the journal in which the article was published was considered, just two citations propelled this article to 11th-place ranking since both citations were from articles in a prestigious journal.
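The overlap rate reported above is simply the share of records that appear in both top-k lists. A minimal sketch, with a hypothetical function name and toy article IDs, is:

```python
def top_k_overlap(ranking_a, ranking_b, k=50):
    """Return the shared items and the overlap rate of two top-k lists."""
    shared = set(ranking_a[:k]) & set(ranking_b[:k])
    return shared, len(shared) / k


# Hypothetical rankings of article IDs, best first (toy data, k = 5).
jif_ranking = [12, 7, 3, 44, 9, 21, 5]
ef_ranking = [7, 30, 12, 18, 2, 3, 44]
shared, rate = top_k_overlap(jif_ranking, ef_ranking, k=5)
```

With k = 50, 13 shared records correspond to 13 / 50 = 26%, the rate reported for the JIF and EF lists.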
Comparison of JIF and EF rankings
When we examine the rankings created by taking JIF and EF values into consideration, the problematic aspects of the JIF metric become evident. Outliers, especially for articles with a low number of citations, yield misleading results. We examined all records (1417 records) on a yearly basis in order to determine the difference between the two rankings. Specifically, we looked at the values for which there was the biggest difference between the JIF and EF rankings. To do so, since studies with a low number of citations do not yield precise results and since making generalizations would lead to inaccuracies, we considered only those values with a high number of citations in a year.
Figure 6 shows studies with the highest EF and the lowest JIF annual values and those with the highest JIF and lowest EF annual values. In the variation column, negative values indicate articles that are top-ranked in the raw-citation calculation but low-ranked under the field-normalized EF and AIS values. When we examine the citations for these studies, there is a higher possibility of the sleeping beauty or citation classics effect. The positive values, in contrast, indicate sources that are top-ranked in calculations based on normalized values but low-ranked when based on pure citation values, for which the possibility of the sleeping beauty or citation classics effect is lower. This effect is moderated in the new ranking.
For example, the variation value for the first-ranked study, titled “Sentiment Strength Detection in Short Informal Text”, is 328. In other words, it is top-ranked in the ranking that is citation-field-normalized, but is bottom-ranked in the raw-citation-based ranking. The second-ranked study, titled “Science Overlay Maps: A New Tool for Research Policy and Library Management”, has a variation value of −11 and is top-ranked in the raw-citation-based ranking, but is bottom-ranked in the citation-field-normalized ranking. Therefore, the sources indirectly impacted by the sleeping beauty or citation classics effect are moderated, and as such, they are not top-ranked as they are in the raw-citation-based (JIF) ranking.
The differences in the rankings may be due to the phenomenon in which the citation distortion is intensified, which is referred to in the literature as the Matthew effect. The Matthew effect, which indicates an accumulated advantage, is used in sociology for situations in which the rich get richer and the poor get poorer. In bibliometrics, the Matthew effect refers to the phenomenon whereby if two scientists have conducted studies of a similar nature, the well-known one will receive more citations than the lesser known scientist (Merton, 1968, 1988; Smucker, 2008). When we consider the citation tendencies of authors, the fact that an article has received many citations is usually an indication that it will continue to receive many future citations. As is frequently seen in the literature, some studies receive many citations due to their historical significance even if they are not directly related to the study in which the citation is made. This may lead to problems in citation-based measurements (Wang, 2014). In a calculation such as that for EF, where citation-field normalization is performed, the Matthew effect is moderated. Thus, these types of articles can be prevented from being ranked higher than they deserve.
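The variation values discussed above can be thought of as differences between an article's positions in the two rankings. The sketch below assumes the sign convention implied by the text (JIF rank minus EF rank, so that a negative value means the article ranks higher on raw citations than on normalized values); the function names and scores are hypothetical.

```python
def rank_positions(scores):
    """Map each article ID to its 1-based rank (highest score ranked first)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {article: position for position, article in enumerate(ordered, start=1)}


def variation(jif_scores, ef_scores):
    """Return JIF rank minus EF rank per article (sign convention assumed)."""
    jif_rank = rank_positions(jif_scores)
    ef_rank = rank_positions(ef_scores)
    return {article: jif_rank[article] - ef_rank[article] for article in jif_rank}


# Hypothetical article-level scores for four articles.
jif_scores = {"A": 3.1, "B": 2.0, "C": 1.5, "D": 0.4}
ef_scores = {"A": 0.002, "B": 0.010, "C": 0.001, "D": 0.020}
diffs = variation(jif_scores, ef_scores)
```

In this toy data, article D is last on raw citations but first after normalization, so it receives the largest positive variation, while article A moves in the opposite direction.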
The sleeping beauty or citation classics effect, which is caused by the Matthew effect, is presented in Table 2. Since our data set comprises publications which are a maximum of five years old and which have received a maximum of 157 citations, the possibility of their having been affected by the sleeping beauty or citation classics effect is very low. In any case, Figure 6 shows the potential for the sleeping beauty or citation classics effect.
Comparison of EF and AIS Rankings
After comparing the rankings based on the JIF and EF values, we compared the rankings based on the EF values with those based on AIS values. Our objective was to crosscheck our results by comparing EF values, in which the status of the journals in which the articles are published is taken into consideration, with the AIS values also obtained from this metric.
Our comparison of the first 50 EF and AIS records showed that the rate of overlap of these two lists is 22%. The top-ranked studies of both lists are also among the top 50 of the other list. In addition, the top four studies in the EF ranking are also among the top 50 in the AIS ranking.
When we examine these three lists together, we see that all of the studies that ranked in the top 50 in both the EF and AIS rankings (11 studies) are also among the top 50 in the JIF ranking. The EF–AIS comparison demonstrates less similarity than does the comparison of the JIF–EF rankings.
Figure 7 shows a Venn diagram of the JIF, AIS, and EF metrics, which we examined in detail. We plotted the diagram for the top 50 records of the rankings based on all three metrics. As we can also see here, the intersection of the JIF–EF ranking list (26%) is bigger than that of the AIS and EF (22%) for the top 50 records in each category.
As already noted, AIS measures the average impact of an article in a journal within the five years of its initial publication; we obtain its value by dividing the EF value of the journal by the number of articles in the journal. This measurement is approximately similar to the five-year JIF calculation (Web of Science, 2012). In other words, AIS expresses the per-article influence of the journal over a period of five years. Therefore, it is not surprising to see that the list based on the JIF and EF values is similar to the list based on EF and AIS values.
Since our data set contains outliers and it is not normally distributed, it is difficult to identify similarities in the scatter plots based on raw values. We performed a logarithmic transformation to more easily observe the similarity rate and to see outliers as a little closer to each other. To facilitate interpretation, we also used the respective absolute values for the negative values in the data set to ensure that the x and y axes of the compared graphics comprised the same values.
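The transformation described above can be sketched as follows. The base-10 logarithm and the handling of zeros are assumptions of this sketch, since the text specifies neither; pairs containing a zero are skipped because the logarithm of zero is undefined. The values are hypothetical.

```python
import math


def log_pairs(xs, ys):
    """Return (log10|x|, log10|y|) for each pair, skipping pairs with a zero.

    Skipping zeros is an assumption of this sketch; log10(0) is undefined.
    """
    return [
        (math.log10(abs(x)), math.log10(abs(y)))
        for x, y in zip(xs, ys)
        if x != 0 and y != 0
    ]


# Hypothetical article-level values with an outlier, a negative value, and a zero.
jif_values = [1.0, 100.0, -10.0, 0.0]
ef_values = [0.01, 0.1, 1000.0, 5.0]
points = log_pairs(jif_values, ef_values)
```

On the log scale, values that differ by orders of magnitude sit only a few units apart, which is what pulls the outliers closer to the rest of the points in the scatter plots.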
The points representing articles in Figure 8 are more scattered than those in Figure 9. This means that the JIF and EF values are closer to each other than are the EF and AIS values in the whole data set, as is the case in the sample in which we examined the top 50 records.
Conclusion and Recommendations
Based on the idea that the weight of each citation differs, in our study we considered the weighted number of citations rather than the raw number of citations. The potential for journal citations is higher in some scientific fields than in others. For instance, the potential for a journal publishing in the basic science field to receive citations is higher than that of one that publishes in the field of clinical sciences. In this case, evaluating the quality or distinction of a journal based only on the number of citations per paper can be misleading. Since citation-field normalization is performed in EF calculations, this misleading effect is reduced. We weighted the scores of journals calculated with this method by their association with each citation. In other words, we determined the weight of a citation by an article-level calculation of EF, which is based on a journal-level calculation method inspired by the PageRank algorithm. Then, we crosschecked the results by comparing the AIS metric based on the EF value. Due to the interdisciplinary nature of information sciences, the ranking changed significantly upon normalization, yet the minimum overlap rate is 22% for rankings of the top 50 records.
We did not confirm our hypothesis that “relevance ranking that takes popularity into consideration and normalizes the citation fields is more successful than relevance rankings based on the raw number of citations.” This is because even though we normalized the EF and the EF-based AIS metrics, the calculation results for the AIS metric are basically similar to the five-year JIF value (Web of Science, 2016). Our findings showed that rankings based on JIF and EF values are even more similar.
The data set used in our study is limited to the sources indexed in the WoS, and it is possible that sources from journals not indexed in the WoS may change the ranking. However, this effect would likely be very slight. It is obvious that journals that do not meet the criteria for being indexed in the WoS would not have much impact on prestige calculation either.
Our findings show that the ranking created with EF values inspired by the PageRank algorithm is more appropriate for articles published in issues of the JASIST journal (since the mentioned articles are interdisciplinary). We found that sources having the Matthew effect, citation classics, and sleeping beauty properties are moderated in the EF ranking. The ranking of EF calculation at the article level represented reality more accurately. In other words, it is a much more appropriate metric for this field. On the other hand, the exclusion of all self-citations by journals in the calculations of the EF score may also cause misleading results. In future work, these two rankings should be compared by including journal self-citations in the calculations. Since there will be more citations if older publications are considered in studies, we anticipate the possibility that they would more accurately represent the citation universe.