Page 136 - Becoming Metric Wise
                                                    Publication and Citation Analysis

              Proof. Pick any document $d_0$ in $D$ and recall that $C(d_0)$ is the set of all
              references of $d_0$ and $C^{-1}(d_0)$ the set of all documents that cite $d_0$. Then $C_1$
              is the set of all documents which either cite $d_0$ or are cited by $d_0$. By the
              requirement of weak connectedness, $C_1$ is not empty unless $D$ is equal to
              the singleton $\{d_0\}$, in which case the theorem is proved. So we can proceed
              and form $C_2$. By Theorem 1 we know that $d_0$ belongs to $C_2$.
                 To show that $\bigcup_{j=0}^{N-1} C_j$ is equal to $D$ we suppose that some $d$ in $D$
              does not belong to $\bigcup_{j=0}^{N-1} C_j$. This assumption leads to a contradiction:
              as there is a path, necessarily finite, joining $d$ to $d_0$, there is a number
              $j \le N - 1$ such that $d \in C_j$.

                 This theorem yields an algorithm for obtaining all the documents in a
              given collection, provided the collection is reasonably homogeneous, so
              that its citation graph is weakly connected. Moreover, if D is a large com-
              puter file, then the algorithm provides a procedure for exploring the core
              of a topic (take d 0 to be a core document) and moving further and further
              towards the boundaries. This method is known as “cycling.” A mathe-
              matical discussion of cycling can be found in (Cummings & Fox, 1973;
              Garner, 1967).
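
The cycling procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the collection is given as a dictionary mapping each document to the set of documents it cites, the starting set is taken to be $C_0 = \{d_0\}$, and the names `cycle`, `references` and `cited_by` are hypothetical, not taken from the cited literature.

```python
def cycle(d0, references):
    """Expand outward from d0 along citation links in both directions.

    references: dict mapping a document to the set of documents it cites
    (assumed format). Returns (layers, reached), where layers[j] plays the
    role of C_j and reached is the union of all layers.
    """
    # Build the inverse relation: cited_by[d] = documents that cite d.
    cited_by = {}
    for doc, refs in references.items():
        cited_by.setdefault(doc, set())
        for r in refs:
            cited_by.setdefault(r, set()).add(doc)

    layers = [{d0}]   # C_0 = {d0} (assumption)
    reached = {d0}
    while True:
        # C_{j+1}: everything citing, or cited by, a member of C_j.
        nxt = set()
        for d in layers[-1]:
            nxt |= references.get(d, set()) | cited_by.get(d, set())
        if nxt <= reached:   # no new documents: the cycle is exhausted
            break
        reached |= nxt
        layers.append(nxt)
    return layers, reached


# Toy collection: a cites b, b cites c, d cites c (weakly connected).
refs = {"a": {"b"}, "b": {"c"}, "d": {"c"}, "c": set()}
layers, reached = cycle("a", refs)
```

In this toy collection the starting document "a" reappears in the second layer, and the union of the layers is the whole collection, as expected for a weakly connected citation graph.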
                 The following results provide useful insight into the structure of a
              citation

              Theorem 3 (Kochen, 1974, p. 21): The average number of references per
              document times the number of documents in a collection under investigation is
              equal to the average number of citations to a reference item in the collection times
              the total number of different references.
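
As an informal check of the statement, the following Python sketch builds a small 0/1 citation matrix and evaluates both sides of the identity. The toy data and the names `theorem3_sides`, `toy`, `n` and `p` are illustrative assumptions; `p` is taken to be the number of different references, that is, the number of columns of the matrix. Exact fractions are used so that the comparison does not depend on floating-point rounding.

```python
from fractions import Fraction


def theorem3_sides(references):
    """Return both sides of the identity plus the two averages.

    references: dict mapping each source document to the set of items it
    cites (assumed format; at least one reference must exist overall).
    """
    sources = sorted(references)
    # Columns: only items cited at least once by the collection.
    cited = sorted({r for refs in references.values() for r in refs})
    # c[i][j] = 1 if source i cites reference j, else 0.
    c = [[1 if r in references[s] else 0 for r in cited] for s in sources]
    n, p = len(sources), len(cited)

    # Average number of references per document: row sums divided by n.
    avg_refs = Fraction(sum(sum(row) for row in c), n)
    # Average number of citations per reference: column sums divided by p.
    avg_cites = Fraction(sum(c[i][j] for j in range(p) for i in range(n)), p)
    return avg_refs * n, avg_cites * p, avg_refs, avg_cites


toy = {
    "d1": {"r1", "r2"},
    "d2": {"r2"},
    "d3": {"r1", "r2", "r3"},
    "d4": {"r3"},
}
lhs, rhs, avg_refs, avg_cites = theorem3_sides(toy)
```

Both sides equal the total number of ones in the matrix, which is why summing by rows or by columns gives the same result.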


              Proof. Let $C$ be the citation matrix of the collection under investigation.
              Then $c_{ij} = 1$ if document $d_i$ cites reference $r_j$ and $c_{ij} = 0$ if it does
              not; note that the index $i$ refers to citing document $i$, while the index $j$
              refers to cited document $j$. The columns of this citation matrix contain
              only those documents that are cited at least once by the documents in the
              collection. On the one hand, if there are $n$ source documents, the average
              number of references per document is:

              $$\frac{1}{n} \sum_{i=1}^{n} \left( \sum_{j=1}^{p} c_{ij} \right) \tag{5.4}$$