    3.3 - References and Citations Completeness

    Reference and citation lists in the ADS are not complete. There are several sources of incompleteness of citation lists:

    • The ADS doesn't have the cited article in the database. This happens, for instance, for most papers appearing in mathematics, chemistry, and geophysics journals.

    • Our reference resolver program couldn't interpret the reference. This may be due to errors or incompleteness in the reference, unusual formatting of it, or simply limitations in our program's abilities.

    • We do not have the reference list for the citing paper. This happens for older articles and for articles in journals and conference proceedings that do not supply us with reference lists.

    We are constantly adding to our references by extracting reference lists from scanned articles, improving our reference recognition capabilities, and adding new records to our databases. This is an ongoing effort, so reference and citation lists will change over time.

    We currently get reference lists from the journals published by APS, IoP, Springer, Elsevier, and Nature, as well as from A&A and MNRAS. The APS and IoP reference lists go back to volume 1 of their journals, while coverage for other journals is not complete.

    An additional source of citation information is the metadata feed that we have received from CrossRef since 2008.

    We accept user-submitted citations. If you would like to submit missing citations, please follow the instructions given in our FAQ.

    3.4 - References and Citations From ArXiv Preprints

    As of March 2005, the references from arXiv preprints are integrated in the ADS. When we retrieve the metadata for the nightly update of arXiv preprints, we also process the source data to retrieve the references contained in it, either from the (La)TeX source or from the PDF version of the preprints. Next, the retrieved references are parsed to resolve each of them into a match with an existing record in the ADS. This may not succeed, for various reasons. Since there is no uniformity in how reference sections are formatted, and since the coverage of journal articles outside of astronomy and physics in the ADS is incomplete, there is always the possibility that a preprint reference section is partially or completely missing. When the procedure is successful, the references found in the full text of a preprint are extracted, parsed, and identified as a list of ADS records.
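    The parse-and-resolve step described above can be illustrated with a small sketch. This is not the ADS reference resolver; the pattern, the key layout, and the names REF_PATTERN, parse_reference, and resolve_references are all assumptions made for illustration. It handles only one toy reference shape and shows the two failure modes mentioned in the text: a reference that cannot be interpreted, and a reference to an article that is not in the database.

```python
import re
from typing import Dict, List, Optional, Tuple

# Toy pattern for references shaped like "Surname, A. 2003, ApJ, 591, 499".
# Real reference sections have no uniform format; this is illustrative only.
REF_PATTERN = re.compile(
    r"^(?P<author>[^,]+),.*?(?P<year>\d{4}),\s*(?P<journal>[^,]+),"
    r"\s*(?P<volume>\d+),\s*(?P<page>\d+)\s*$"
)

# Hypothetical lookup key: (year, journal, volume, page).
Key = Tuple[str, str, str, str]


def parse_reference(text: str) -> Optional[Key]:
    """Turn a reference string into a lookup key, or None if it cannot be parsed."""
    match = REF_PATTERN.match(text.strip())
    if match is None:
        return None
    return (
        match["year"],
        match["journal"].strip().lower(),
        match["volume"],
        match["page"],
    )


def resolve_references(
    raw_references: List[str], index: Dict[Key, str]
) -> Tuple[List[str], List[str]]:
    """Split references into resolved record identifiers and unresolved strings."""
    resolved: List[str] = []
    unresolved: List[str] = []
    for text in raw_references:
        key = parse_reference(text)
        record = index.get(key) if key is not None else None
        if record is None:
            # Either the string was not understood or no matching record exists.
            unresolved.append(text)
        else:
            resolved.append(record)
    return resolved, unresolved


# Hypothetical index holding a single known record.
index = {("2003", "apj", "591", "499"): "record-A"}
raw = [
    "Smith, J. 2003, ApJ, 591, 499",             # parses and matches a record
    "Doe et al., in preparation",                # cannot be parsed
    "Jones, K. 1999, J. Chem. Phys., 110, 100",  # parses, but no record exists
]
resolved, unresolved = resolve_references(raw, index)
```

    In this sketch the unresolved list is what makes a reference list incomplete: the second entry fails on formatting, the third on database coverage.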

    An obvious concern is whether references now contribute twice to the ADS citation lists after the associated journal article gets added to the ADS. This should not be the case, since we perform extensive matching between the preprints and published articles in order to identify them as unique records. All preprints for which we find a matching journal article with associated references will not contribute to the citation count. Whenever a (journal) article is published for which we have the preprint in our system, we will replace the preprint references with the references from the paper, if they are available. If they are not, we continue to use the references from the preprint but attribute them to the published paper.
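    The rule in the preceding paragraph can be written down as a short sketch. The Record class and the citing_references function are hypothetical names used only for illustration, not the ADS data model; the sketch assumes the preprint has already been matched (or not) to a published article.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Record:
    """Hypothetical stand-in for an ADS record; not the real ADS schema."""
    identifier: str
    references: List[str] = field(default_factory=list)


def citing_references(
    preprint: Record, published: Optional[Record]
) -> Tuple[str, List[str]]:
    """Return the single (citing identifier, references) pair that counts.

    Exactly one pair is returned per preprint/article pair, so the same
    work never contributes to citation lists twice.
    """
    if published is None:
        # No matching journal article yet: the preprint cites on its own.
        return preprint.identifier, list(preprint.references)
    if published.references:
        # Article references are available: they replace the preprint's.
        return published.identifier, list(published.references)
    # No article references: keep the preprint's references, but attribute
    # them to the published paper.
    return published.identifier, list(preprint.references)


preprint = Record("preprint-1", ["record-A", "record-B"])
article_with_refs = Record("article-1", ["record-A", "record-B", "record-C"])
article_without_refs = Record("article-1")

unmatched = citing_references(preprint, None)
replaced = citing_references(preprint, article_with_refs)
attributed = citing_references(preprint, article_without_refs)
```

    The three calls correspond to the three cases in the text: an unmatched preprint, a published article that supplies its own reference list, and a published article that inherits the preprint's references.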

    This should minimize instances of double counting of citations. However, in cases where we cannot find the match accurately, we would appreciate feedback from users informing us of the match so that we can eliminate any double counting. The best way to do this is to update the publication information for the preprint in question in the arXiv database. The ADS will then automatically be notified of the change through our daily update from arXiv.

    We have given this a lot of thought, and these are the main reasons why we decided to go ahead with this addition:

    • The arXiv preprints have been an integral part of the ADS for quite some time now. Besides indexing the abstracts, author, and title information, we now also attempt to extract the references from the preprints and resolve them. This way, the preprints will participate fully in the ADS's reference/citation system.

    • For the ADS, the links between articles are the primary purpose of the reference/citation system. The actual citation count, while interesting, is a secondary by-product of the primary goal, which is to allow scientists to easily find and access those articles that will aid their research.

    • With about 75% of all refereed astronomy papers appearing first in the arXiv, and well over 90% of the highly cited ones (99 of the top 100 ApJ papers from 2003 were preprinted, according to our recent study), the preprints have become an integral part of astronomy (and physics) research.

    • The use of the preprints has a major, positive effect on research. The effective latency between publication and citation has shortened by about six months, without taking the citations to the preprints themselves into account. This has the effect of increasing the rate of discovery; we believe we have an obligation to support this change. For more information on this subject, please see our own study on the effect of use and access on citations.

