Patent Classifications as Indicators of Intellectual Organization

Journal of the American Society for Information Science & Technology (forthcoming)

Loet Leydesdorff

Amsterdam School of Communications Research (ASCoR), University of Amsterdam

Kloveniersburgwal 48, 1012 CX Amsterdam, The Netherlands

http://www.leydesdorff.net

Abstract

Using the 138,751 patents filed in 2006 under the Patent Cooperation Treaty, co-

classification analysis is pursued on the basis of three- and four-digit codes in the

International Patent Classification (IPC, 8th edition). The co-classifications among the

patents enable us to analyze and visualize the relations among technologies at different

levels of aggregation. The hypothesis that classifications might be considered as the

organizers of patents into classes, and that therefore co-classification patterns—more than

co-citation patterns—might be useful for mapping, is not corroborated. The

classifications hang weakly together, even at the four-digit level; at the country level,

more specificity can be made visible. However, countries are not the appropriate units of

analysis because patent portfolios are largely similar in many advanced countries in terms

of the classes attributed. Instead of classes, one may wish to explore the mapping of title

words as a better approach to visualize the intellectual organization of patents.

Keywords: patent, classification, indicator, map, WIPO, IPC

1. Introduction

Soon after his original publication in Science about using citation indexing in scientific

literature (Garfield, 1955), Garfield (1957) published a less-known paper in the Journal

of the Patent Office Society entitled “Breaking the subject index barrier—a citation index

for chemical patents.” Based on Shepard’s citation index in law (Adair, 1955), Garfield

generalized the notion of citation indexing as a tool for mapping an associative network

as distinct from a hierarchically organized index. As he stated (at p. 472):

Keeping the user in mind, the conscientious indexer will translate the terminology and

phraseology of the author into a standardized and more usable form. However, the

indexer is, of necessity, primarily guided by the subject content which authors provide.

The indexer also faced with a practical economic barrier cannot index with the

almost infinite depth to be found in the Citation Index. The Citation Index breaks this

“barrier” by presenting subject matter in bibliographical arrays which are neither

alphabetical nor classified but associative.

Unlike the imposed categories of classificatory indices, associations by citation are

generated by the authors and inventors themselves. The network structure emerges

as a property from the aggregates of individual actions. Since this network is

dynamic, it can be expected to develop self-organizing properties. Order—such as

the grouping into disciplines and specialties—is the result of interaction between

top-down and bottom-up dynamics (Kauffman 1993; Cilliers 1998).

Although the self-organizing dynamics of the communication can be expected to prevail

in the case of scientific literature (Kuhn, 1962; Price, 1965; Luhmann, 1990; Leydesdorff,

1995), the situation for patents is very different. Patents are regulated by law; legislation

is (predominantly) organized at the level of nation states. Patent citations and

classifications are often added by the patent examiners. Can patent citations be used to

map the underlying science base of patents (Jaffe & Trajtenberg, 2002; Leydesdorff,

2004; Porter & Cunningham, 2005; Sampat, 2006)?

Citations are added to the front pages of the patents by the examiners to varying extents

(Cockburn et al., 2002), and the procedures differ among national, regional, and

worldwide operating offices (Criscuolo, 2004). Patent citations can also be expected to

fulfill other functions such as protecting an industrial portfolio, etc. (Mogee & Kolar,

1999). Because of these legal and economic functions, patent citations have been

considered from perspectives different from the mapping of their knowledge base. For

example, patent citations have been used for measuring the economic value of patents

(Trajtenberg, 1990; Hall et al., 2002; Sapsalis et al., 2006). This has additionally been

done at the level of innovation systems or even companies (Engelsman & Van Raan,

1993, 1994; Breschi et al., 2003; Leten et al., 2007). In summary, the different contexts

of innovations (the market value, the legal status, and the knowledge base) provide patent

citations with a variety of meanings.

In order to focus on the science base of patents, some scholars have analyzed the so-

called “non-patent literary references” (NPLR) among the patent citations (Narin &

Olivastro, 1988 and 1992; Grupp & Schmoch, 1999; Leydesdorff, 2004). In a recent

study, Leydesdorff & Zhou (2007) examined the disciplinary background of NPLR using

the journals being cited in the case of nanotechnology patents. We found in these case

studies that NPLR tend to be dominated by references to general science and professional

journals. These journals cannot easily be identified with specific journal categories or

disciplines (Guan & He, forthcoming).

In other words, contrary to Garfield’s (1957) idea, we may have to pay more attention to

the indexing and the expected indexer effects (Healey et al., 1986). In the case of patents,

indexing is part of a search process by the examiner which may lead to the addition of

citations, but in any case to the careful selection of primary and secondary categories for

the disclosure. The primary classes may be seen as defined areas of interest; the class title

gives an indication of the content of the class (WIPO, 2006, at p. 10). The secondary

classes can be viewed either as exploratory classifications (Verspagen, 2005) or as

reinforcing the primary classification (Breschi et al., 2000). The subclass title indicates as

precisely as possible the content of the subclass (WIPO, 2006, at p. 10). The examiners

thus add intellectual content to the patent structure (Larkey, 1999). The two processes of

application and evaluation are more interwoven than in the science system. Furthermore,

one would expect different networks to emerge from the different constraints of

procedures in the cases of national, regional, and worldwide applications.

2. Various patent databases

The database of the U.S. Patent and Trademark Office (USPTO) contains all the data since

1790. Patents are retrievable from this date as image files, and after 1976 also as full text.

The html-format allows us to study them in considerable detail (Leydesdorff, 2004). The

European Patent Office (EPO) was established as a transnational patent office in 1973.

This database is also online, but in a less accessible pdf-format. Furthermore, 135 nations

(as of December 31, 2006) were signatories of the Patent Cooperation Treaty (PCT) of

1970, which mandated the World Intellectual Property Organization (WIPO) in Geneva

to administer fee-based services (since 1978). The WIPO database is well organized and available in

html-format. The various services enable users in member countries to file international

applications for patents. Like the European Union, several world regions have established

regional patent offices.

The various offices provide applicants with a number of choices which imply different

procedures and timelines. For example, the USPTO operates on the basis of “first to

invent,” while the EPO uses “first to file” as a criterion. If the applicant wishes to protect

an invention in countries outside the country of its origin, s/he can file for a patent in each

country in which protection is desired, file with a regional office (e.g., EPO), or file an

international application under the Patent Cooperation Treaty procedure (OECD, 2005:

54 ff.). Various factors (e.g., the costs of patenting, the time taken to grant patents,

differences in rules regarding the scope of patents, etc.) influence the decision on whether

to follow one procedure or another.

This variety of (partially overlapping) procedures complicates the use of patent statistics.

In the USPTO the inventor and his/her attorney are obliged to provide a list of references

describing the state of the art—the so-called “duty of candor” (Michel & Bettels, 2001).

EPO examiners, and not inventors or applicants, add the large majority of patent citations

(Criscuolo & Verspagen, 2005, at p. 10). While in the USPTO applications are examined

automatically, the EPO considers an application as a request for a “patentability search

report.” These reports contain citations from patents and non-patent documents that have

either been suggested by the inventor or added by the patent examiner (Criscuolo, 2004,

at pp. 92f.). Applications at the WIPO for intellectual property protection under the

regime of the PCT protocol require an International Search Report (ISR) and a written

opinion by the examiner about the patentability of the invention (OECD, 2005, at p. 57).

Although the initial investigations are usually carried out by the receiving offices, the

international extension can be expected to lead to a further streamlining of the patent

citations with reference to their economic value and legal protection against possible

litigation in court.

From the perspective of information science and technology, patent classifications

provide us with the outcomes of major investments of the patent offices to organize the

patents intellectually. The International Patent Classification system (IPC8) is currently

(since January 1, 2006) in its eighth edition, using a 12-digit code containing 70,000

categories (WIPO, 2006 and 2007). The European classification system (ECLA) builds

on this IPC and extends the number of categories to 134,000. The USPTO has its own

classification system, which has been used extensively in economic research (Jaffe &

Trajtenberg, 2002). This classification scheme currently employs up to 430 classes and

140,000 subclasses.

Attempts to map patents for the purpose of analyzing economic activities in terms of the

technologies involved have been moderately successful. The OECD has for this purpose

defined “triadic patent families,” which counteract “home advantage effects” in the

various databases (Criscuolo, 2006). A patent is a member of a triadic patent family if and only

if it is filed at the European Patent Office (EPO) and the Japanese Patent Office (JPO), and is

granted by the US Patent & Trademark Office (USPTO) (Eurostat, 2006). However, this

institutional integration of the different databases into a single set of files that can then

also be used for normalization has hitherto remained problematic. The recent integration

of the various databases (OECD, WIPO, EPO, etc.) into the framework of PATSTAT

cannot be expected to resolve the problem because this relational database is developed at

the institutional level (Magnani & Montesi, 2007).

Several research teams have invested in generating concordance tables between the

International Standard Industrial Classification (ISIC) and the International Patent

Classifications (IPC) (Evenson & Puttnam, 1988; Verspagen et al., 1994; Verspagen,

1997). In a recent validation study, Schmoch et al. (2003, at p. 58) concluded that the

correlations between these tables are low. The authors cite Grupp et al. (1996, at p. 272)

that “industries do not represent homogeneous technologies.” From the perspective of

industrial economics, the IPC itself is mostly taken as a given because patents are

considered as input indicators. The purpose of this study is to open this black box and to

consider patents as outputs of the R&D system. How do the classifications map the

intellectual organization of patents?

3. Classifications

While scientific publications are organized in terms of journals, the primary system for

organizing patents intellectually is classifications. In a recent review of methodologies in

patent statistics, Dibiaggio & Nesta (2005) argued that technology classes could be the

most appropriate unit of analysis for exploiting the information contained in the patent

databases. Co-classification analysis has been explored in the study of scientific literature,

but in scientometrics co-classification analysis has been less successful than co-citation

analysis because the classifications are imposed, while the citations are not (Todorov,

1988; Tijssen, 1992b; Spasser, 1997). From a methodological perspective, co-occurrence

data (co-words, co-classifications, co-citations, etc.) can equally be used for the mapping

(Leydesdorff, 1987; Tijssen, 1992a; Leydesdorff & Vaughan, 2006).

In other words, it seems worthwhile to investigate whether co-classification analysis at

the level of the database can provide us with an angle for the mapping of the intellectual

organization of the patents. From the perspective of information analysis, the PCT

database of the World Intellectual Property Organization in Geneva has the advantage

that all records are gathered according to a common standard. Furthermore, one can

expect that mainly patents of a certain economic and technological value will be extended

for protection beyond the domestic market. Thus, mapping these patents can be expected

to show the fields of technological specialization of each country. The disadvantage

remains that patents under the PCT regime are a specific subset of the total set of patents

in the world. Some countries may use this route more than others. However, the PCT

procedure is increasingly used for patent applications. The number of applications

increased from around 24,000 in 1991 to 110,000 in 2002 (OECD, 2005, at p. 7).

Currently, more than 135,000 applications are registered yearly (WIPO, 2007).

From the perspective of my research question, the problem that the PCT set is a specific

subset is ameliorated by the high level of codification within this set because of the

intensive development of the IPC by the WIPO. If one were unable to retrieve structure in

this relatively well-organized set, then the more fuzzy sets would be even more difficult

to analyze. The national, regional, and international procedures for an application under

the PCT regime take approximately 30 months, but after this period the patent is

designated for all the countries indicated (Figure 1). This delay is sometimes considered

as an advantage because it provides the applicant with more time to decide whether or not

to seek a national or regional patent.

Figure 1: Timeline for PCT procedures (Source: OECD, 2005, at p. 57)

Patent co-classifications were already mentioned in the OECD Manual of 1994 as a

potential indicator of linkages among technologies (OECD, 1994, at p. 52). However, the

emphasis in the literature has been on patent citations because of (1) the analogy with

citations in the scientific literature, and (2) the interest in patent citations as indicators of

economic value (Breschi & Lissoni, 2004). Hall et al. (2002) grouped the classes of the

U.S. classification into six technological categories and 36 subcategories, but this

grouping was not used intensively in further research for mapping the knowledge structure

of patents. Breschi et al. (2002, 2003) have explored the mapping of firm portfolios and

their technological coherence in terms of patent classifications. Using the cosine and

other measures of similarity, these authors concluded that relevant measures of the

technological proximity of the classes can be retrieved from the database (Ejermo, 2005).

4. Methods and materials

4.1. Data

Because the WIPO database of the PCT applications is fully accessible online, has

worldwide coverage, and is carefully indexed using the latest version of the IPC, I

decided to download one year of data, that is, the 2006 data, from this database. The

downloads were done in the second week of January 2007. The dataset was fixed at that

date at 138,741 patents with a publication date in 2006. Actually, 138,751 patents were

retrieved, but the difference of ten patents is negligible given the large numbers.

It was decided to use publication dates instead of application dates because the

applications are brought online in a moving process. Thus, the number of patents using

application dates as the search code varies from day to day. For example, on 18 February

2007, 73,506 patents with application dates in 2006 were available, while this number

increased to 76,237 one week later, on February 25. Even the number of patents with

application dates in 2005 changed in this week from 129,841 to 130,066.

The patents were downloaded and brought under the control of relational database

management using dedicated software routines. Table 1 provides the descriptive statistics.

                          N   N / patent
patents             138,751
inventors           365,699   2.63
applicants          473,367   3.41
classifications     325,393   2.35
designations     13,847,717   99.80 }
regional            415,729    3.0  } 102.8
countries               225   (includes regions)

Table 1: descriptive statistics of the data

Using the addresses of the inventors for the attributions, the distributions of inventors and

applicants over various countries are shown in Figure 2. These distributions exhibit the

well-known skewed shape of a Lotka distribution (fitted here as a power law). The fit is almost perfect (r > 0.99)

for the inventors.
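The curves fitted in Figure 2 have the form y = a·x^b. As a minimal sketch of how such a fit can be obtained, the following assumes an ordinary least-squares regression of log(y) on log(rank). Whether the curves in Figure 2 were estimated in exactly this way is an assumption, and the counts used below are invented for illustration only.

```python
import math

def fit_power_law(counts):
    """Fit y = a * rank**b by least squares on the log-log scale.

    `counts` are positive frequencies; they are ranked from large to small.
    Returns (a, b, r), where r is the signed Pearson correlation between
    log(rank) and log(y).
    """
    ranked = sorted(counts, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(y) for y in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                         # exponent of the power law
    a = math.exp(mean_y - b * mean_x)     # scale factor
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r

# Hypothetical data generated from y = 1000 * rank**-1.5 (not the WIPO counts).
demo_counts = [1000 * rank ** -1.5 for rank in range(1, 21)]
a, b, r = fit_power_law(demo_counts)
```

The function returns the signed correlation, which is negative for a decaying distribution; its absolute value is the quantity comparable to the reported r > 0.99.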

[Figure 2: chart of the numbers of inventors and applicants per country (vertical axis from 0 to 200,000), in the order United States of America, Japan, Germany, France, United Kingdom, Republic of Korea, Netherlands, Canada, China, Italy, Switzerland, Sweden, Israel, Australia, Finland, India, Austria, Denmark, Belgium, Spain. Fitted curves: inventors, y = 135330x^-1.3633 (r > 0.99); applicants, y = 181609x^-1.3689 (r > 0.95).]

Figure 2: Distribution over major patenting countries (N patents > 1000)

Relatively small countries like Korea and the Netherlands are more important

contributors to the database than Italy and China. In larger countries, domestic patenting

may play a more important role than in smaller ones. Within the EU, European patents

increasingly replace domestic patenting. For example, only 2,152 of the 17,095 patents

with a German inventor are patented in Germany itself. Note that Russia is not a major

player in this system.

As summarized in Table 1, each patent has on average 2.6 inventors and 3.4 applicants.

However, 334,737 inventors (about 92%) are also co-applicants; only 138,630 applicants are

non-inventors. This is approximately one per patent. Figure 2 shows that the practices of

co-application by inventors vary among countries. In the case of South Korea, for

example, mostly inventors seem to apply.

The number of classifications per patent is on average 2.4. The number of designations is

of the order of 100. Further analysis of these co-designations may be interesting from the

perspective of industrial strategies and spillovers. As noted, the classifications are very

detailed, using up to 12 digits. The main classes are contained in a four-“digit”

categorization (WIPO, 2006). Table 2 provides class A01 and its four-digit extensions as

an example. In the 2006 data, 121 main categories (at the three-digit level) were included

with elaboration into 623 categories at the four-digit level.

A01   AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
A01B  Soil working in agriculture or forestry; Parts, details, or accessories...
A01C  Planting; Sowing; Fertilising
A01D  Harvesting; Mowing
A01F  Processing of harvested produce; Hay or straw presses; Devices for...
A01G  Horticulture; Cultivation of vegetables, flowers, rice, fruit, vines,...
A01H  New plants or processes for obtaining them; Plant reproduction by...
A01J  Manufacture of dairy products
A01K  Animal husbandry; Care of birds, fishes, insects; Fishing; Rearing or...
A01L  Shoeing of animals
A01M  Catching, trapping or scaring of animals; Apparatus for the destruction...
A01N  Preservation of bodies of humans or animals or plants or parts thereof;...

Table 2: The first category (“A01”) of the IPC with its sub-classifications as an example
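The two levels of aggregation used in this study correspond to the first three and the first four characters of an IPC symbol. The following is a minimal sketch of that truncation; the function name `ipc_levels` and the example symbols are illustrative assumptions, not part of the routines used for the study.

```python
def ipc_levels(symbol):
    """Return the three- and four-character prefixes of an IPC symbol.

    For example, a full symbol such as "A01B 1/00" belongs to class "A01"
    (three-digit level) and to subclass "A01B" (four-digit level).
    """
    compact = symbol.replace(" ", "").upper()
    if len(compact) < 4:
        raise ValueError(f"IPC symbol too short: {symbol!r}")
    return compact[:3], compact[:4]

# Illustrative symbols only; they are not taken from the 2006 dataset.
codes = ["A01B 1/00", "A01N 25/02", "B82B 3/00"]
classes = sorted({ipc_levels(code)[0] for code in codes})
subclasses = sorted({ipc_levels(code)[1] for code in codes})
```

Collapsing to sets before counting means that a patent with several symbols in the same class contributes that class only once, which is one possible convention among others.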

From the perspective of visualization, the 121 categories at the three-digit level are

optimal, since the screen becomes unreadable with larger numbers. The user could then

be enabled to zoom into the four-digit level. However, in such a hierarchical approach

one would lose the lateral links provided by the co-classification analysis. In this study, I

first focus on the network of co-classifications at the three-digit level and subsequently at

the four-digit level, and then analyze the level of detail available at the level of the

nations participating in the database. The analytical insights may help us to understand

which approach is more feasible and meaningful.

One four-digit category was added to the patents by hand, using the search facility of the

European Patent Office at http://ep.espacenet.com/advancedSearch?locale=en_EP. The

EPO recently made a substantial investment by developing the code “Y01N” as an

additional tag to the existing database for the nano-categories (Scheu et al., 2006;

Hullmann, 2006).1 The tag is relevant because of the interest in policy circles in the

evaluation of the current efforts to stimulate nanotechnology (Braun & Meyer, 2007).

Since the EPO database can also be searched for PCT applications, 762 records could be

matched with this tag.2 Thus, at the four-digit level, I work with 624 categories.

4.2. Methods

Both at the three-digit and the four-digit level, a matrix was constructed with the patents

as the units of analysis and the classification codes as column variables. The analysis

focuses on the relations among the variables. For that purpose, I use factor analysis

(Varimax Rotation in SPSS) and visualization techniques from social network analysis

after normalization of the variables using the cosine. The factor analysis informs us about

the structure in the matrix, while the cosine-normalized matrices allow for the

visualization and the study of centrality measures (Leydesdorff, 2007). Whenever a

submatrix is extracted (for example, for a country on the criterion of the institutional

address of the inventor), these same analytical techniques are used.

1 The class is further subdivided into Y01N2 for Bio-nanotechnology, Y01N4 for Nanotechnology for information processing, storage and transmission, Y01N6 for Nanotechnology for materials and surface science, Y01N8 for Nanotechnology for interacting, sensing or actuating, and Y01N10 for Nano-optics.

2 In the previous organization the field “B82B: Nano-structures: Manufacture and treatment thereof” corresponded to the special class CL/977 which was added to the USPTO. This category matched only 275 patents in 2006 (Leydesdorff & Zhou, 2007).

Of the 138,751 patents, only 135,536 patents contained valid classifications. As noted,

the number of classifications in this data was 121 and 624 at the level of three and four

digits, respectively. From these matrices, subsets were extracted for 126 countries.

Additionally, an aggregated file of countries versus classification codes was generated for

further analysis. The latter file enables us to study the distributions of classifications over

countries or, vice versa, the distributions of countries over classifications. For example,

one can raise the question how patents with the IPC-classification “B82B: Nano-

structures: Manufacture and treatment thereof” or the ECLA-tag for nanotechnology

“Y01N” are distributed across the countries.
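The construction described here (patents as rows, classification codes as columns, and an aggregated file of countries versus codes) can be sketched as follows. This is an illustration under stated assumptions: the records are invented, the matrix is assumed to be binary, and the names `build_matrix`, `fill_ratio`, and `country_profiles` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (patent id, inventor country, IPC subclass codes).
RECORDS = [
    ("P1", "DE", ["A61K", "C07D"]),
    ("P2", "DE", ["A61K"]),
    ("P3", "CN", ["H01L", "B82B"]),
    ("P4", "CN", []),  # no valid classification, so it is dropped
]

def build_matrix(records, digits=4):
    """Return column labels and a sparse binary matrix.

    The matrix is stored as {patent id: set of column indices}, which
    avoids materializing the many zero cells.
    """
    valid = [(pid, {code[:digits] for code in codes})
             for pid, _, codes in records if codes]
    columns = sorted(set().union(*(codes for _, codes in valid)))
    index = {label: j for j, label in enumerate(columns)}
    rows = {pid: {index[c] for c in codes} for pid, codes in valid}
    return columns, rows

def fill_ratio(columns, rows):
    """Share of non-zero cells in the patents x classes matrix."""
    return sum(len(cols) for cols in rows.values()) / (len(rows) * len(columns))

def country_profiles(records, digits=4):
    """Aggregate into countries x classes counts (one count per patent)."""
    profiles = defaultdict(Counter)
    for _, country, codes in records:
        profiles[country].update({code[:digits] for code in codes})
    return profiles

columns, rows = build_matrix(RECORDS, digits=3)
```

With `digits=3` the same records yield the three-digit matrix; `fill_ratio` gives the share of non-zero cells for whichever matrix is passed in.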

The two basic matrices of patents versus classification codes are extremely sparse: in the

case of three-digit classes only 1.14% of the cells have a non-zero value and in the four-

digit case only 0.25%. Because of the large numbers of zeros the cosine between the

variable vectors is a better measure of similarity than the Pearson correlation coefficient

(Ahlgren et al., 2003). Unlike the latter, the cosine normalizes with reference not to the

arithmetic mean, but to the geometric mean. Using the cosine values, one can span a

vector space which can be used for visualization purposes (Salton & McGill, 1983;

Breschi et al., 2003).
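The difference between the two measures can be made concrete. The cosine of two vectors x and y is Σxy / √(Σx² · Σy²); for binary class vectors this is the number of co-classified patents divided by the geometric mean of the two class sizes. The sketch below uses invented toy vectors and shows that padding both vectors with additional zeros (patents belonging to neither class) leaves the cosine unchanged but shifts the Pearson correlation.

```python
import math

def cosine(x, y):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0

def pearson(x, y):
    """Pearson correlation: the cosine of the mean-centred vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sy = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

# Two hypothetical class columns over four patents.
x = [1, 1, 0, 0]
y = [1, 0, 1, 0]

# The same columns after adding 96 patents that carry neither class.
x_padded = x + [0] * 96
y_padded = y + [0] * 96

print(cosine(x, y), cosine(x_padded, y_padded))    # identical values
print(pearson(x, y), pearson(x_padded, y_padded))  # the value changes
```

In a matrix where about 99% of the cells are zero, the Pearson value is therefore driven largely by the shared zeros, whereas the cosine depends only on the non-zero entries.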

The freeware program Pajek (available for academic purposes at

http://vlado.fmf.uni-lj.si/pub/networks/pajek/) is used for the visualizations of the cosine-normalized patterns;

UCINet and SPSS are used for the statistical analysis. For each country, a cosine-

normalized input-file for Pajek is brought online at

http://www.leydesdorff.net/wipo06/index.htm. These files allow users to freely choose a

visualization routine for a specific country, since Pajek files can be exported into other

formats. I provide examples below using the available visualization routines with the

purpose of conveying the analytical argument. The algorithm of Fruchterman & Reingold

(1991) is used in this version for most of the visualizations because, unlike that of

Kamada & Kawai (1989), this algorithm keeps the isolates meaningfully visible.

Within the visualizations, the size of the nodes is set proportionally to the logarithm of

the number of patents in the category. The lines are made proportional to the cosine

values in five equal steps of 0.2. A threshold is set at the cosine ≥ 0.05 in order to weed

out incidental variation. In most cases, grey shades are added using the k-core algorithm

for the partitioning in order to support the readability of the visualizations.
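The visualization settings described above can be sketched as a small preprocessing step. The sketch rests on several assumptions that the text does not settle: the natural logarithm is used for the node sizes, the five steps are taken as the intervals (0, 0.2], (0.2, 0.4], and so on, and the helper names `line_step` and `prepare_network` are hypothetical. The output follows the basic Pajek network layout of a `*Vertices` section followed by an `*Edges` section.

```python
import math

def line_step(cos_value):
    """Map a cosine in (0, 1] onto one of five line widths (steps of 0.2)."""
    return min(5, max(1, math.ceil(cos_value / 0.2)))

def prepare_network(labels, patent_counts, cosines, threshold=0.05):
    """Apply the threshold and derive node sizes and line widths.

    `cosines` maps 1-based vertex pairs (i, j) to cosine values.
    Returns (node sizes, retained edges, text in Pajek network layout).
    """
    sizes = {label: math.log(n) for label, n in zip(labels, patent_counts)}
    edges = [(i, j, value, line_step(value))
             for (i, j), value in sorted(cosines.items())
             if value >= threshold]
    net = [f"*Vertices {len(labels)}"]
    net += [f'{k} "{label}"' for k, label in enumerate(labels, start=1)]
    net.append("*Edges")
    net += [f"{i} {j} {value:.3f}" for i, j, value, _ in edges]
    return sizes, edges, "\n".join(net)

# Invented example values, not taken from the 2006 data.
labels = ["A61", "C07", "H01"]
counts = [100, 50, 10]
cosines = {(1, 2): 0.62, (1, 3): 0.03, (2, 3): 0.05}
sizes, edges, net_text = prepare_network(labels, counts, cosines)
```

In this toy input the pair (1, 3) falls below the threshold and is dropped, while the pair (2, 3) sits exactly at 0.05 and is retained.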

5. Results

5.1. The 3-digit level

Let us first inspect the overall picture for the 121 patent classifications at the three-digit

level and the 135,536 patents. Figure 3 shows that the set is not well connected. Fifty-six

of the 121 categories are not connected to others at the level of cosine ≥ 0.05; only 13 are

more strongly connected (as k-cores); 52 classes are connected in weak graphs. I added

labels to the 13 categories which form more strongly connected k-cores. Upon visual

inspection, it seems that the well-connected sets represent chemical industries and

biotechnological applications in agriculture.
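The distinction between isolates, weakly connected classes, and k-cores can be illustrated with a small core decomposition. The sketch below uses the standard peeling procedure (repeatedly removing a node of minimum remaining degree) on an invented graph; which core level was counted as "more strongly connected" in Figure 3 is not specified here, so no cut-off is assumed.

```python
def core_numbers(adjacency):
    """Return the core number of every node of an undirected graph.

    `adjacency` maps each node to the set of its neighbours. A node has
    core number k if it belongs to the k-core but not to the (k+1)-core.
    """
    degree = {v: len(neighbours) for v, neighbours in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel off a node with the smallest remaining degree.
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adjacency[v]:
            if u in remaining:
                degree[u] -= 1
    return core

# Invented graph: a triangle A-B-C, a pendant D attached to A, an isolate E.
graph = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
    "E": set(),
}
cores = core_numbers(graph)
```

In this example the isolate receives core number 0, the pendant node 1, and the triangle 2, which mirrors the three groups distinguished in the text.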

Figure 3: 121 patent classifications at the 3-digit level; N = 135,536; cosine ≥ 0.05; 2D-

visualization based on the algorithm of Fruchterman & Reingold (1991).

In other words, the co-classifications do not reveal a clear structure. Unlike in the case of

scientific citations, one would not expect structure-generating mechanisms like the

Matthew effect (Merton, 1968) or preferential attachment (Barabási, 2002; cf. Leydesdorff

& Bensman, 2006) to operate in these data. On the contrary, one would expect the index to

enable the patent offices to distribute the patents over categories. The number of patents

per class is intentionally kept down, and a new subclass can be formed to accommodate

overflow (Larkey, 1999). Note also that the analysis considers all PCT applications

for a certain time period; this would be equivalent to an analysis of all scientific

publications for a given time period. Would one observe much structure in such an

exercise?

A next question is whether countries show specific profiles within the larger set. I explore

this below using Germany and China as examples. Germany is one of the largest

contributors to the PCT applications, while China is in ninth position (Figure 2). Germany,

of course, has a mature industrial structure, while the Chinese system has been booming

during the last decade or so. The German patent set of 17,095 patents covers all 121

patent categories (Figure 4); the Chinese one, based on 3,084 patents, contains 109 of

these categories (Figure 5). Table 3 compares the two sets with the global set in terms of

network statistics.

three digits; cosine > 0.05         Global set   Germany     China
                                    (N = 121)    (N = 121)   (N = 109)
Density                             0.008        0.008       0.020
% Degree centralization             4.31         2.56        9.33
% Closeness centralization (note 3) n.a.         n.a.        n.a.
% Betweenness centralization        0.78         0.90        21.28
Clustering coefficient              0.198        0.089       0.233

Table 3: Network statistics of the cosine-normalized matrices for the German, Chinese, and global sets of patents classified at the three-digit level.

3 Closeness centralization cannot be computed since the networks are not weakly connected.
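Two of the statistics in Table 3 have simple closed forms for an undirected, unweighted graph: density is 2m / (n(n − 1)) for m lines among n nodes, and Freeman's degree centralization is Σ(d_max − d_i) / ((n − 1)(n − 2)). The sketch below implements these textbook definitions on invented graphs. It is an assumption that the values in Table 3 were computed in this way on the thresholded networks; multiply the centralization by 100 to express it as a percentage.

```python
def density(adjacency):
    """Share of all possible lines that are present (undirected graph)."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    m = sum(len(neighbours) for neighbours in adjacency.values()) / 2
    return 2 * m / (n * (n - 1))

def degree_centralization(adjacency):
    """Freeman degree centralization: 1.0 for a star, 0.0 for a ring."""
    n = len(adjacency)
    if n < 3:
        return 0.0
    degrees = [len(neighbours) for neighbours in adjacency.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))

# Invented reference graphs.
star = {"hub": {"a", "b", "c", "d"},
        "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}
ring = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
```

A star is the most centralized configuration and a ring the least, so these two graphs bracket the range within which the values of Table 3 can be read.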

Figure 4: Co-classifications of 17,095 German patents at the 3-digit level; cosine ≥

0.05; 2D-visualization based on the algorithm of Fruchterman & Reingold (1991).

Figure 5: Co-classifications of 3,084 Chinese patents at the 3-digit level; cosine ≥ 0.05;

2D-visualization based on the algorithm of Fruchterman & Reingold (1991).

The structure in the German set is of the same order as for the complete set, while there is

much more structure in the Chinese set. In Figure 5, 82 of the 109 classifications are

connected into a weak component. All three sets (the global one and the two for Germany

and China, respectively) have in common a core group of “medical or veterinary science;

hygiene,” biochemistry, and organic chemistry.

Factor analysis of the underlying matrices of patents versus classes confirms that there

are no pronounced eigenvectors: more than 55 eigenvectors explain more than an average

variable; none of the eigenvectors explains more than 2% of the common variance

(Leydesdorff & Hellsten, 2005). The Chinese network is a bit more pronounced than the

German one or the one at the global level.4 In other words, the networks are very flat and

the categories are not obviously informative.

The lack of organization in the data suggests taking a closer look at betweenness

centrality as another measure for connectedness and coherence in the profiles (Freeman,

1997; Breschi et al., 2003; Leydesdorff, 2007). Table 4 provides the top 20 categories in

terms of the percentage of betweenness centrality for the two countries.5

Germany
  #      %   class
  1    7.5   medical or veterinary science; hygiene
  2    6.9   engineering elements or units; general measures for producing and...
  3    6.8   measuring; testing
  4    5.9   physical or chemical processes or apparatus in general
  5    5.8   vehicles in general
  6    3.5   dyes; paints; polishes; natural resins; adhesives; miscellaneous...
  7    3.5   working of plastics; working of substances in a plastic state in general
  8    3.3   basic electric elements
  9    2.7   furniture; domestic articles or appliances; coffee mills; spice mills;...
 10    2.6   layered products
 11    2.5   conveying; packing; storing; handling thin or filamentary material
 12    2.3   organic macromolecular compounds; their preparation or chemical...
 13    1.7   machine tools; metal-working not otherwise provided for
 14    1.7   ammunition; blasting
 15    1.6   coating metallic material; coating material with metallic material;...
 16    1.5   electric techniques not otherwise provided for
 17    1.3   building
 18    1.3   agriculture; forestry; animal husbandry; hunting; trapping; fishing
 19    1.2   spraying or atomising in general; applying liquids or other fluent...
 20    1.1   optics

China
  #      %   class
  1   13.2   measuring; testing
  2   13.1   medical or veterinary science; hygiene
  3   10.1   furniture; domestic articles or appliances; coffee mills; spice mills;...
  4    8.9   physical or chemical processes or apparatus in general
  5    8.5   basic electric elements
  6    7.9   engineering elements or units; general measures for producing and...
  7    5.7   organic macromolecular compounds; their preparation or chemical...
  8    5.4   computing; calculating; counting
  9    4.8   layered products
 10    4.6   vehicles in general
 11    3.8   foods or foodstuffs; their treatment, not covered by other classes
 12    3.8   conveying; packing; storing; handling thin or filamentary material
 13    3.0   agriculture; forestry; animal husbandry; hunting; trapping; fishing
 14    2.7   cements; concrete; artificial stone; ceramics; refractories
 15    2.7   treatment of textiles or the like; laundering; flexible materials not...
 16    2.7   electric techniques not otherwise provided for
 17    2.4   electric communication technique
 18    2.3   dyes; paints; polishes; natural resins; adhesives; miscellaneous...
 19    2.3   hand cutting tools; cutting; severing
 20    2.2   optics

Table 4: Top 20 classes at the three-digit level and the percentages of betweenness centrality for Germany and China, respectively.

4 The first eigenvector explains 1.7% of the common variance in the Chinese case, versus 1.2% for both the German and the total set.

5 The betweenness centrality is calculated from the cosine-normalized matrix, but before a threshold is set. If the matrix is not normalized, betweenness centrality is often overshadowed by degree centrality, since a “star” in a network is also “between” many nodes (Leydesdorff, forthcoming).

These results confirm the impression that the Chinese system is more integrated than the

German one in terms of these measures. Although one can observe differences in the

ranking, these differences are not obviously informative. The similarities are also

considerable: the two tables have fourteen of the twenty categories in common.
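The betweenness percentages of Table 4 can be approximated along the following lines. This is a minimal sketch in pure Python, not the Pajek routine used for the study. It assumes that an edge exists wherever two classes co-occur in at least one patent (which holds for any non-zero cosine before a threshold is set), that edge weights are ignored, and that the percentage is Freeman's betweenness normalized by the (n − 1)(n − 2)/2 pairs of other nodes. The function names and the toy network are illustrative only.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.
    adj maps each node to the set of its neighbours."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths.
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was visited from both ends.
    return {v: b / 2 for v, b in score.items()}

def betweenness_percent(adj):
    """Betweenness as a percentage of the maximum possible value."""
    n = len(adj)
    max_pairs = (n - 1) * (n - 2) / 2
    raw = betweenness(adj)
    return {v: (100 * b / max_pairs if max_pairs else 0.0) for v, b in raw.items()}

# Toy network (hypothetical): one class linked to three otherwise unrelated ones.
toy = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}
percentages = betweenness_percent(toy)
```

In the toy star, the central node lies on every shortest path between the other nodes, which is the situation footnote 5 describes for an unnormalized matrix.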

5.2 The 4-digit level

At the 4-digit level, the lack of structure is less obvious, but still considerable at the level

of the aggregated set. 115 categories are not connected at the 0.05 level for the cosine.

There are a few dense clusters in traditional industries (fertilizers, chemistry, etc.).
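How a count such as "115 categories are not connected at the 0.05 level" can arise is sketched below. The sketch assumes binary occurrence (a class is either attributed to a patent or not), in which case the cosine between two classes reduces to their co-occurrence count divided by the square root of the product of their frequencies. The function names and the four toy patents are hypothetical.

```python
from itertools import combinations
from math import sqrt

def cosine_links(patents, threshold=0.05):
    """patents: list of sets of class codes.
    Returns class frequencies and the links with cosine >= threshold."""
    freq, co = {}, {}
    for classes in patents:
        for c in classes:
            freq[c] = freq.get(c, 0) + 1
        for a, b in combinations(sorted(classes), 2):
            co[(a, b)] = co.get((a, b), 0) + 1
    links = {}
    for (a, b), n_ab in co.items():
        cos = n_ab / sqrt(freq[a] * freq[b])
        if cos >= threshold:
            links[(a, b)] = cos
    return freq, links

def isolates(freq, links):
    """Classes that occur in the set but have no link above the threshold."""
    linked = {c for pair in links for c in pair}
    return sorted(set(freq) - linked)

# Toy data (hypothetical): four patents with their four-digit codes.
toy_patents = [{"A61K", "C07D"}, {"A61K", "C07D"}, {"A61K"}, {"G06F"}]
freq, links = cosine_links(toy_patents)
unconnected = isolates(freq, links)
```

Raising the threshold removes links and therefore increases the number of isolates, which is one reason why the maps are sensitive to this parameter.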

Figure 6: 624 patent categories versus 135,536 patents; 115 classes are not connected at

cosine ≥ 0.05; visualization based on the algorithm of Fruchterman & Reingold (1991).

At this level of fine-tuning, the German set appears as more integrated than the Chinese

one. 560 of the 624 categories are used by the German set, as against 412 by the Chinese

set; and 501 of the categories are related with a threshold of cosine ≥ 0.05 as against 349

in the Chinese case. However, in terms of structural properties, the two matrices (and the

one for the global set) are again very flat. In the German case, for example, the first factor

explains 0.32% of the common variance with an eigenvalue of 1.988, and 284 factors

have an eigenvalue larger than one. Table 5 provides the network statistics in a format

similar to that of Table 3 above.
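The relation between the eigenvalue and the percentage of variance quoted for the German case can be checked with a line of arithmetic. The sketch assumes a principal-component extraction from a correlation matrix, where each standardized variable contributes one unit of variance, so that the share of a component is its eigenvalue divided by the number of variables. The text does not spell out this computation, so the function below is an illustration, not the author's procedure.

```python
def percent_of_variance(eigenvalue, n_variables):
    """Share of the total variance carried by one component, in percent.
    With standardized variables the total variance equals n_variables."""
    return 100.0 * eigenvalue / n_variables

# Eigenvalue of the first factor as reported for the German set.
share_624 = percent_of_variance(1.988, 624)  # all 624 four-digit categories
share_560 = percent_of_variance(1.988, 560)  # only the 560 used by Germany
print(f"{share_624:.2f}% versus {share_560:.3f}%")
```

Under this assumption, the reported 0.32% corresponds to a division by all 624 categories rather than by the 560 categories in use for Germany. The same reasoning explains the cut-off of eigenvalue larger than one: such a factor carries more variance than a single original variable.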

| four digits; cosine > 0.05 | Global set (N = 624) | Germany (N = 560) | China (N = 412) |
|---|---|---|---|
| Density | 0.003 | 0.005 | 0.007 |
| Degree centralization (%) | 1.43 | 1.83 | 4.18 |
| Closeness centralization (%) | n.a. | n.a. | n.a. |
| Betweenness centralization (%) | 10.20 | 8.76 | 9.74 |
| Clustering coefficient | 0.345 | 0.215 | 0.356 |

Table 5: Network statistics of the cosine-normalized matrices for the German, Chinese, and global sets of patents classified at the four-digit level.
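Two of the statistics in Table 5 have simple closed forms, which the following sketch computes for an undirected network after thresholding. It assumes the standard definitions (density as the share of realized links; Freeman's degree centralization relative to a star), which may differ in detail from the Pajek output. The connectivity check is included because closeness centralization is presumably reported as "n.a." for networks that are not connected. All names and the toy network are hypothetical.

```python
def network_stats(adj):
    """adj maps each node to the set of its neighbours (undirected, n >= 3).
    Returns (density, degree centralization in percent)."""
    n = len(adj)
    degrees = [len(neighbours) for neighbours in adj.values()]
    edges = sum(degrees) / 2
    density = 2 * edges / (n * (n - 1))
    d_max = max(degrees)
    # Freeman: sum of differences to the maximum, relative to a star graph.
    centralization = sum(d_max - d for d in degrees) / ((n - 1) * (n - 2))
    return density, 100 * centralization

def is_connected(adj):
    """True if every node can be reached from an arbitrary start node."""
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)

# Toy network (hypothetical): a small star plus a separate pair of classes.
toy = {
    "a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"},
    "e": {"f"}, "f": {"e"},
}
density, degree_centralization = network_stats(toy)
connected = is_connected(toy)
```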

Figure 7: 501 of the 560 patent classes are related at cosine ≥ 0.05 in the case of 17,095

patents with an inventor in Germany. The (k = 10) core set is labeled; 2D-visualization

based on the algorithm of Fruchterman & Reingold (1991).

This many categories cannot possibly be displayed meaningfully on a single screen, but

the algorithms available in Pajek (and other visualization programs) enable us to filter out

interesting subsets. In Figure 7, the set of nodes with the highest number of links among

them (k = 10) is displayed as an example. A similar exercise can be performed with the

Chinese set.
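The filtering step can be illustrated with a k-core extraction: nodes with fewer than k links are removed repeatedly until every remaining node has at least k links within the remaining set. The sketch below is a plain-Python illustration of that idea, assuming an unweighted network after thresholding. It is not the Pajek implementation, and the toy network is hypothetical.

```python
def k_core(adj, k):
    """Return the sub-network in which every node keeps at least k neighbours.
    adj maps each node to the set of its neighbours (undirected)."""
    core = {v: set(neighbours) for v, neighbours in adj.items()}
    changed = True
    while changed:
        changed = False
        # Collect the nodes that currently fall below the required degree.
        for v in [v for v, neighbours in core.items() if len(neighbours) < k]:
            for w in core.pop(v):
                if w in core:
                    core[w].discard(v)
            changed = True
    return core

# Toy network (hypothetical): a triangle with one pendant node.
toy = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
core_2 = k_core(toy, 2)
core_3 = k_core(toy, 3)
```

With k = 10, as in Figure 7, only classes with at least ten links to other classes of the same core would remain.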

Figure 8 shows a similar map for the much smaller Czech Republic. The structure of the

core of this map is informative about the technological make-up of the country.

Figure 8: 20 patent classes in three clusters among the 113 which are listed for the Czech

Republic; N of patents = 132; cosine ≥ 0.05. (Visualization based on the algorithm of

Kamada & Kawai, 1989.)

In summary, and not surprisingly, the visualizations are more informative at the four-digit

level: the countries are specific in terms of their portfolios. Table 6 compares Germany

and China, analogously to Table 4, in terms of the percentage of betweenness centrality

for the top 20 patent categories. Only three of the 20 categories match. Note that the

added category “nanotechnology” (Y01N) ranks sixth in this list for

Germany.

| Rank | Germany | % | China | % |
|---|---|---|---|---|
| 1 | layered products, i.e. products built-up of strata of flat or non-flat,... | 9.6 | electric digital data processing | 10.6 |
| 2 | spraying apparatus; atomising apparatus; nozzles | 8.9 | separation | 9.8 |
| 3 | cleaning in general; prevention of fouling in general | 5.7 | semiconductor devices; electric solid state devices not otherwise... | 7.9 |
| 4 | other working of metal; combined operations; universal machine tools | 5.1 | investigating or analysing materials by determining their chemical or... | 7.8 |
| 5 | mixing, e.g. dissolving, emulsifying, dispersing | 4.9 | preparations for medical, dental, or toilet purposes | 5.1 |
| 6 | nanotechnology | 4.7 | containers for storage or transport of articles or materials, e.g.... | 5.0 |
| 7 | lime; magnesia; slag; cements; compositions thereof, e.g. mortars,... | 4.5 | layered products, i.e. products built-up of strata of flat or non-flat,... | 4.4 |
| 8 | domestic plumbing installations for fresh water or waste water; sinks | 4.1 | diagnosis; surgery; identification | 3.9 |
| 9 | chemical or physical processes, e.g. catalysis, colloid chemistry;... | 4.1 | kitchen equipment; coffee mills; spice mills; apparatus for making... | 3.7 |
| 10 | gas-turbine plants; air intakes for jet-propulsion plants; controlling... | 3.8 | devices for fastening or securing constructional elements or machine... | 3.6 |
| 11 | macromolecular compounds obtained by reactions only involving... | 3.8 | air-conditioning; air-humidification; ventilation; use of air currents... | 3.6 |
| 12 | compounds of the metals beryllium, magnesium, aluminium, calcium,... | 3.8 | pictorial communication, e.g. television | 3.1 |
| 13 | processes for applying liquids or other fluent materials to surfaces,... | 3.7 | methods or apparatus for sterilising materials or objects in general;... | 2.9 |
| 14 | printing, duplicating, marking, or copying processes; colour printing | 3.4 | processes or means, e.g. batteries, for the direct conversion of... | 2.8 |
| 15 | making textile fabrics, e.g. from fibres or filamentary material;... | 3.4 | household or table equipment | 2.5 |
| 16 | abrasive or related blasting with particulate material | 3.2 | foods, foodstuffs, or non-alcoholic beverages, not covered by... | 2.5 |
| 17 | removal or treatment of combustion products or combustion residues;... | 3.1 | printed circuits; casings or constructional details of electric... | 2.5 |
| 18 | books; book covers; loose leaves; printed matter of special format or... | 3.1 | macromolecular compounds obtained otherwise than by reactions only... | 2.4 |
| 19 | launching, hauling-out, or dry-docking of vessels; life-saving in... | 3.1 | chemical or physical processes, e.g. catalysis, colloid chemistry;... | 2.4 |
| 20 | filling with liquids or semiliquids, or emptying, of bottles, jars,... | 3.0 | working-up; general processes of compounding; after-treatment not... | 2.3 |

Table 6: Top 20 classes at the four-digit level and the percentages of betweenness centrality for Germany and China, respectively.

5.3 Countries as units of analysis

Are countries the appropriate unit of analysis? No, they are not. The data is generated

bottom-up because of the national patent laws, but the PCT provides a protocol for

streamlining the process and the IPC is developed as a classification at the level of the set.

Some countries (e.g., Western European ones) may be very similar in terms of their

patent portfolios.

Using this data, the common practice in evolutionary economics (e.g., Lundvall, 1992;

Nelson, 1993) of using national systems as units for comparison can be investigated

empirically (Foray & Lundvall, 1996; Leydesdorff, 2006a). When the patent data is

aggregated at the level of countries, a matrix is generated with a single communality that

explains 31.4% of the variance; factor 2 explains only 4.3% of the variance, and factor 3

less than 3.1%. The first communality corresponds with a large group of 78 countries (out

of 126) which form a k-core in the network. If the threshold is set at cosine ≥ 0.05, only

three countries are removed from the core set. At the level of cosine ≥ 0.5, 70 relatively

advanced countries still form this dense core (Figure 9).

Figure 9: 70 countries form a dense core in terms of IPC classifications of their patent

portfolios at the level of cosine ≥ 0.5.

Similarly, if one analyzes the four-digit codes used as variables in the underlying matrix,

one obtains a first principal component that explains 65.4% of the common variance. 426

IPC categories are still connected at the level of cosine ≥ 0.5. In other words, most

countries are very similar in terms of their patent portfolios and most patent classes are

very similar in their distributions over the countries. In terms of patent classification,

globalization has already taken place. The construction of the IPC at the global level may

reinforce this abstraction from the national and institutional origins of the patent

applications.
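Why countries of very different size end up in one dense core can be seen from the cosine itself, which compares the shape of two portfolios and ignores their volume. The sketch below assumes that each country is represented by a vector of patent counts per IPC category. The country labels and the counts are invented for illustration.

```python
from math import sqrt

def cosine(u, v):
    """Cosine between two sparse count vectors given as dicts."""
    dot = sum(count * v.get(category, 0) for category, count in u.items())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical portfolios: counts of patents per four-digit category.
large_country = {"A61K": 30, "G06F": 20, "H01L": 10}
small_country = {"A61K": 3, "G06F": 2, "H01L": 1}
niche_country = {"A01B": 5}

same_profile = cosine(large_country, small_country)
no_overlap = cosine(large_country, niche_country)
```

A country with a tenth of the patents but the same distribution over categories is indistinguishable from the large one on this measure, so that similar portfolios produce high cosines regardless of national output.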

5.4 Mapping emerging technologies

The data-matrix used for the mapping of the four-digit classification codes at the world

level (used for drawing Figure 6 above) enables us to select a specific class and to search

for its relevant environment. (This file is available online at

http://www.leydesdorff.net/wipo06/world.zip.) This application of the instrument can be

made policy relevant.

Using the recently added class for nanotechnology Y01N, for example, Figure 10 can be

generated as its k = 1 environment at a threshold level of cosine ≥ 0.05. Because the units

of analysis are now again the patents themselves, the matrix is extremely sparse. The

network layer spanned by the IPC is thin and has no pronounced structure.
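Selecting the k = 1 environment of a class amounts to keeping the class, its direct neighbours, and the links among them. The following sketch assumes that the thresholded links are available as pairs of class codes with their cosine values. The neighbouring codes and the cosine values below are invented and are not taken from the dataset.

```python
def neighbourhood(links, focal):
    """links: {(class_a, class_b): cosine} after thresholding.
    Returns the focal class with its direct neighbours and the links among them."""
    nodes = {focal}
    for a, b in links:
        if focal in (a, b):
            nodes.update((a, b))
    sub_links = {
        pair: weight for pair, weight in links.items()
        if pair[0] in nodes and pair[1] in nodes
    }
    return nodes, sub_links

# Hypothetical links at cosine >= 0.05.
toy_links = {
    ("B82B", "Y01N"): 0.30,
    ("H01L", "Y01N"): 0.10,
    ("B82B", "H01L"): 0.20,
    ("G06F", "H01L"): 0.40,
}
nodes, sub_links = neighbourhood(toy_links, "Y01N")
```

In this toy case G06F is linked only to a neighbour of Y01N and therefore falls outside the k = 1 environment.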

Figure 10: k = 1 neighborhood of class Y01N; N = 762; cosine ≥ 0.05. (Visualization

based on the algorithm of Kamada & Kawai, 1989.)

| Country | N | Country | N |
|---|---|---|---|
| USA | 330 | Austria | 4 |
| Japan | 120 | Australia | 4 |
| Germany | 88 | India | 4 |
| France | 46 | Denmark | 3 |
| United Kingdom | 34 | Greece | 3 |
| South Korea | 23 | Norway | 3 |
| Netherlands | 21 | Poland | 3 |
| Switzerland | 15 | Russia | 3 |
| Italy | 15 | Brazil | 2 |
| Canada | 13 | New Zealand | 2 |
| China | 11 | Turkey | 2 |
| Israel | 7 | Belarus | 1 |
| Sweden | 7 | Czech Republic | 1 |
| Belgium | 6 | Hong Kong | 1 |
| Spain | 6 | FYR Macedonia | 1 |
| Singapore | 6 | Mexico | 1 |
| Finland | 5 | Romania | 1 |
| Ireland | 5 | Taiwan | 1 |
| | | South Africa | 1 |

Table 7: The distribution of patents over (37) countries for the category “nanotechnology” (Y01N) using the WIPO dataset 2006 (762 patents; 799 addresses).

Table 7 lists the 37 countries which exhibit activity in this class using inventor addresses.

Thus, the indicator can be made policy relevant. The relatively strong position of small

countries like the Netherlands, Switzerland, and South Korea is again notable. Hullmann

(2007, at p. 745) lists estimated public funding for these countries in 2004. The counts in

Table 7 correlate with these funding estimates at r = 0.97; p = 0.01 (N = 31; 6 cases missing).
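A correlation of this kind, with cases dropped when one of the two values is missing, can be computed as sketched below. The sketch uses Pearson's r with pairwise deletion, which is an assumption about the procedure. The numbers are invented placeholders and do not reproduce the funding estimates or the counts of Table 7.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation over the cases where both values are present.
    Returns (r, number of complete cases)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy), n

# Placeholder values: patent counts and funding for five hypothetical countries;
# None marks a country without a funding estimate.
patent_counts = [40, 25, 10, 5, 2]
funding = [80.0, 50.0, None, 10.0, 4.0]
r, n_cases = pearson(patent_counts, funding)
```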

6. Conclusions

The major difference between the organization of scientific literature into journals which

maintain and reproduce aggregated citation relations and the organization of patents into

classes is a consequence of the role of the examiner. The examiner imposes additional

citations and classifications for the purpose of use, while the journal structures emerge

from the aggregated citation data in a self-organizing mode. From this perspective, patent

classifications can be compared with the subject categories which the Institute for

Scientific Information (of Thomson) attributes to journals (Leydesdorff & Rafols, in

preparation). These categories are assigned by the ISI staff on the basis of a number of

criteria, among which are the journal’s title, its citation patterns, etc. (McVeigh, personal

communication, 9 March 2006). The classifications, however, match poorly with

classifications derived from the database itself on the basis of analysis of the principal

components of the networks generated by citations among them (Boyack et al., 2005;

Leydesdorff, 2006b).

In the case of patents, classifications are attributed less arbitrarily. The patent offices

make major investments in developing classification systems. Because of the depth of the

classification system in terms of number of digits, one is able to zoom in or out of the

system using a hierarchical structure. This is convenient for human understanding, but

it provides a thin layer for reflection on the underlying dynamics. The evolving database

is captured in a dendrogram. The associative relations within the dendrogram can be made

visible using co-classification analysis and provide us with a geometrical window on the

complexity of the data. However, this is not an eigenstructure of the data, nor can one

reveal the eigendynamics in the data by using these indicators. In other words, the status

of these indicators is different from that of science indicators.

In the design of this study, the focus was on co-classifications as an alternative to co-

citations because of the noted heterogeneity of functions of citations in the case of patent

literature. In a follow-up study, a systematic comparison of co-classification patterns with

citation patterns would be desirable using the USPTO set (because of the unification of

citation formats within this set). One could then consider the classes as equivalents to

journals and analyze the corresponding equivalent of an aggregated journal-journal

citation matrix. This may work for statistical reasons despite the lack of retrievable

structure in the co-classification patterns themselves (Leydesdorff & Rafols, in

preparation).

The analysis taught us further that nations—which are the intuitive units of analysis

because of national patent legislation—are not (or perhaps, no longer) the appropriate

units of analysis for patent portfolios. The major structure at the global level seems to be the

one between “haves” and “have-nots,” or in other words, between countries included and

those excluded from this technological realm. A majority of countries are included. The

database is more apt for the analysis of how technologies are distributed among them in

terms of the patent classifications. But even here, there seem to be no general rules of

thumb, since the networks are sparse and can be expected therefore to remain highly

sensitive to the parameters of the model, like the thresholds chosen, etc.

In summary, the results are a bit disappointing given the relatively well-organized dataset,

and one should not expect better results from using more mixed sets like those currently

under preparation by the OECD and the various patent offices. However useful from the

perspective of management and policy making, “Tech Mining” (Porter & Cunningham,

2005) on the basis of institutionally composed databases can be expected to generate

more fuzzy sets.

7. Discussion

The above conclusion may seem negative. However, this contribution is part of a

discourse about the quality of various indicators for mapping. In this study, I analyzed

(co)classifications because citations are a mixed bag in the case of patents more than in

the case of scientific literature. In addition to citations and classifications, however, the

patents as textual units also contain other textual elements, such as titles, abstracts, and

full texts (Callon et al., 1983, 1986; Mogoutov et al., 2007). Words are less codified than

citations (Leydesdorff, 1989), but in this case they may nevertheless be the best

indicators of meaning that are available for the mapping (Leydesdorff & Hellsten, 2006).

Let me illustrate this by using the patent portfolio of China.

Figure 11: 139 words occurring more than twenty times in 3,084 titles of Chinese

patents; cosine ≥ 0.05; visualization based on the algorithm of Kamada & Kawai (1989).

Figure 11 shows the cosine relations between 139 words that occur more than twenty

times in 3,084 Chinese patents (used for the construction of Figure 5; see footnote 6). The picture

reveals the focus on communication, computing, and networking in the Chinese patent

portfolio. An analogous picture using the German patent portfolio (not shown here)

exhibits the dominance of manufacturing. The contexts in which central words like

“Methods,” “Devices,” and “Apparatuses” are provided with meaning are very different.

For pragmatic reasons, these visualizations are limited to approximately 150 nodes on a

single screen, but in terms of the statistics there are no limitations of this kind

(Leydesdorff & Hellsten, 2006). Furthermore, the classifications enable us to delineate

meaningful subsets whose contents can be analyzed further by using co-word (or

citation!) analysis.
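The construction of a co-word map such as Figure 11 can be sketched as follows: titles are split into words, stopwords are removed, infrequent words are dropped, and the cosine is computed between the remaining words across titles. The sketch assumes that a word is counted once per title and uses a very short stand-in for the USPTO stopword list. The titles, the names, and the low frequency cut-off in the example are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations
from math import sqrt

# Stand-in for the USPTO stopword list mentioned in footnote 6.
STOPWORDS = {"a", "an", "and", "for", "of", "the", "thereof", "with"}

def word_cosines(titles, more_than=20, threshold=0.05):
    """Cosine between words that occur in more than `more_than` titles."""
    docs = [set(re.findall(r"[a-z]+", t.lower())) - STOPWORDS for t in titles]
    freq = Counter(word for doc in docs for word in doc)
    keep = {word for word, n in freq.items() if n > more_than}
    co = Counter()
    for doc in docs:
        co.update(combinations(sorted(doc & keep), 2))
    result = {}
    for (a, b), n_ab in co.items():
        cos = n_ab / sqrt(freq[a] * freq[b])
        if cos >= threshold:
            result[(a, b)] = cos
    return result

# Invented titles; the cut-off is lowered to suit the tiny sample.
toy_titles = [
    "Method and device for data transmission",
    "Device for wireless data transmission",
    "Method for preparing a compound",
]
relations = word_cosines(toy_titles, more_than=1)
```

The resulting dictionary of word pairs can be exported to a network program for layout, in the same way as the class-based matrices.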

Acknowledgements

I am grateful to Paola Criscuolo, Diana Lucio-Arias, Andrea Scharnhorst, Wilfred

Dolfsma, David Gick, Martin Meyer, and Thomas Gurney for advice and help in collecting

the data. Three anonymous referees provided valuable comments and suggestions.

References

Adair, W. C. (1955). Citation indexes for scientific literature. American Documentation, 6, 31–32.

Ahlgren, P., Jarneving, B., & Rousseau, R. (2003). Requirements for a Cocitation Similarity Measure, with Special Reference to Pearson’s Correlation Coefficient. Journal of the American Society for Information Science and Technology, 54(6), 550-560.

Barabási, A.-L. (2002). Linked: The New Science of Networks. Cambridge, MA: Perseus Publishing.

6 The patents contain 4,0354 words which occur 18,1756 times. I used the stopword list of the USPTO available at http://ftp.uspto.gov/patft/help/stopword.htm.

Boyack, K. W., Klavans, R., & Börner, K. (2005). Mapping the Backbone of Science. Scientometrics, 64(3), 351-374.

Braun, T, & Meyer, M. (Eds.) (2007). The Mechanism of Research on Nanostructures. Budapest: Akadémiai Kiadó.

Breschi, S., Lissoni, F., & Malerba, F. (2002). The Empirical Assessment of Firms’ Technological Coherence: Data and Methodology. In: The Economics and Management of Technological Diversification, ed. by J. Cantwell, A. Gambardella, and O. Granstrand, Routledge Studies in the Modern World Economy. London: Routledge.

Breschi, S., Lissoni, F., & Malerba, F. (2003). Knowledge-relatedness in firm technological diversification. Research Policy, 32(1), 69-87.

Breschi, S., & Lissoni, F. (2004). Knowledge Networks from Patent Data. In H. F. Moed, W. Glänzel & U. Schmoch (Eds.), Handbook of Quantitative Science and Technology Research (pp. 613-643). Dordrecht, etc.: Kluwer Academic Publishers.

Callon, M., Courtial, J.-P., Turner, W. A., & Bauin, S. (1983). From Translations to Problematic Networks: An Introduction to Co-word Analysis. Social Science Information, 22, 191-235.

Callon, M., Law, J., & Rip, A. (Eds.). (1986). Mapping the Dynamics of Science and Technology. London: Macmillan.

Cilliers, P. (1998). Complexity and Post-Modernism. London: Routledge.

Cockburn, I. M., Kortum, S. S., & Stern, S. (2002). Are All Patent Examiners Equal? The Impact of Examiner Characteristics. Cambridge, MA: NBER; Working Paper 8980. Retrieved November 13, 2007, at http://www.nber.org/papers/w8980 .

Criscuolo, P. (2004). R&D Internationalisation and Knowledge Transfer. University of Maastricht, Maastricht.

Criscuolo, P. (2006). The ‘home advantage’ effect and patent families. A comparison of OECD triadic patents, the USPTO and EPO. Scientometrics, 66(1), 23-41.

Criscuolo, P., & Verspagen, B. (2005). Does it Matter where Patent Citations Come From?: Inventor Versus Examiner Citations in European Patents (Working Paper 05.06). Eindhoven: Eindhoven Centre for Innovation Studies.

Dibiaggio, L., & Nesta, L. (2005). Patents statistics, knowledge specialisation and the organisation of competencies. Revue d’économie industrielle. Nr. 110, 103-126.

Ejermo, O. (2005). Technological Diversity and Jacobs’ Externality Hypothesis Revisited. Growth and Change, 36(2), 167-195.

Engelsman, E. C., & Van Raan, A. F. J. (1993). International comparison of technological activities and specializations: a patent-based monitoring system. Technology Analysis & Strategic Management, 5(2), 113-136.

Engelsman, E. C., & van Raan, A. F. J. (1994). A patent-based cartography of technology. Research Policy, 23(1), 1-26.

Eurostat (2006). Triadic patent families. Eurostat Metadata in SDDS format: Summary Methodology. Retrieved November 13, 2007, at http://europa.eu.int/estatref/info/sdds/en/pat/pat_triadic_sm.htm

Evenson, R., & Puttnam, J. (1988). The Yale-Canada patent flow concordance. Yale University, Economic Growth Centre Working Paper.

Foray, D., & Lundvall, B.-A. (1996). The Knowledge-Based Economy: From the Economics of Knowledge to the Learning Economy. In Employment and Growth in the Knowledge-based Economy (pp. 11-32). Paris: OECD.

Freeman, L. C. (1977). A Set of Measures of Centrality Based on Betweenness. Sociometry, 40(1), 35-41.

Fruchterman, T., & Reingold, E. (1991). Graph drawing by force-directed placement. Software: Practice and Experience, 21, 1129-1166.

Garfield, E. (1955). Citation Indexes for Science. Science, 122(3159), 108-111.

Garfield, E. (1957). Breaking the subject index barrier—a citation index for chemical patents. Journal of the Patent Office Society, 39(8), 583–595.

Grupp, H., Münt, G., & Schmoch, U. (1996). Assessing Different Types of Patent Data for Describing High-Technology Export Performance. In Innovation, Patents and Technological Strategies (pp. 271-287). Paris: OECD.

Grupp, H., & Schmoch, U. (1999). Patent statistics in the age of globalisation: new legal procedures, new analytical methods, new economic interpretation. Research Policy, 28, 377-396.

Guan, J. C., & He, Y. (forthcoming). Networks of scientific journals: exploration of Chinese patent data. Scientometrics.

Hall, B. H., Jaffe, A. B., & Trajtenberg, M. (2002). The NBER Patent-Citations Data File: Lessons, Insights, and Methodological Tools. In A. B. Jaffe & M. Trajtenberg (Eds.), Patents, Citations, & Innovations (pp. 403-459). Cambridge, MA/London: The MIT Press.

Healey, P., Rothman, H., & Koch, P. K. (1986). An Experiment in Science Mapping for Research Planning. Research Policy 15, 179-184.

Hullmann, A. (2006). Who is winning the global nanorace? Nature Nanotechnology, 1(2), 81-83.

Hullmann, A. (2007). Measuring and assessing the development of nanotechnology. Scientometrics, 70(3), 739-758.

Jaffe, A. B., & Trajtenberg, M. (2002). Patents, Citations, and Innovations: A Window on the Knowledge Economy. Cambridge, MA/London: MIT Press.

Kamada, T., & Kawai, S. (1989). An algorithm for drawing general undirected graphs. Information Processing Letters, 31(1), 7-15.

Kauffman, S. A. (1993). The Origins of Order: Self-Organization and Selection in Evolution. New York: Oxford University Press.

Kuhn, T. S. (1962). The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Larkey, L. (1999). A patent search and classification system. Proceedings of the fourth ACM conference on Digital libraries, 179-187.

Leten, B., Belderbos, R., & Van Looy, B. Technological diversification, coherence and performance of firms. Leuven / Eindhoven: KU Leuven: Department of Managerial Economics, Strategy and Innovation (MSI No. 0706). Retrieved November 13, 2007, at http://www.econ.kuleuven.be/fetew/pdf_publicaties/MSI_0706.pdf

Leydesdorff, L. (1987). Various methods for the Mapping of Science. Scientometrics 11, 291-320.

Leydesdorff, L. (1989). Words and Co-Words as Indicators of Intellectual Organization. Research Policy, 18, 209-223.

Leydesdorff, L. (1995). The Challenge of Scientometrics: the development, measurement, and self-organization of scientific communications. Leiden: DSWO Press, Leiden University. Retrieved November 13, 2007, at http://www.universal-publishers.com/book.php?method=ISBN&book=1581126816.

Leydesdorff, L. (2004). The University-Industry Knowledge Relationship: Analyzing Patents and the Science Base of Technologies. Journal of the American Society for Information Science & Technology, 55(11), 991-1001.

Leydesdorff, L. (2006a). The Knowledge-Based Economy: Modeled, Measured, Simulated. Boca Raton, FL: Universal Publishers.

Leydesdorff, L. (2006b). Can Scientific Journals be Classified in Terms of Aggregated Journal-Journal Citation Relations using the Journal Citation Reports? Journal of the American Society for Information Science & Technology, 57(5), 601-613.

Leydesdorff, L. (2007). “Betweenness Centrality” as an Indicator of the “Interdisciplinarity” of Scientific Journals. Journal of the American Society for Information Science and Technology, 58(9), 1303-1309.

Leydesdorff, L., & Bensman, S. J. (2006). Classification and Powerlaws: The logarithmic transformation. Journal of the American Society for Information Science and Technology, 57(11), 1470-1486.

Leydesdorff, L., & Hellsten, I. (2005). Metaphors and Diaphors in Science Communication: Mapping the Case of ‘Stem-Cell Research’. Science Communication, 27(1), 64-99.

Leydesdorff, L., & Hellsten, I. (2006). Measuring the Meaning of Words in Contexts: An automated analysis of controversies about 'Monarch butterflies,' 'Frankenfoods,' and 'stem cells.' Scientometrics, 67(2), 231-258.

Leydesdorff, L., & Rafols, I. (forthcoming). A Global Map of Science Based on the ISI Subject Categories. Retrieved November 13, 2007, at http://www.leydesdorff.net/map06/texts/index.htm .

Leydesdorff, L., & Vaughan, L. (2006). Co-occurrence Matrices and their Applications in Information Science: Extending ACA to the Web Environment. Journal of the American Society for Information Science and Technology, 57(12), 1616-1628.

Leydesdorff, L., & Zhou, P. (2007). Nanotechnology as a Field of Science: Its Delineation in Terms of Journals and Patents. Scientometrics, 70(3), 693-713.

Luhmann, N. (1990). Die Wissenschaft der Gesellschaft. Frankfurt a.M.: Suhrkamp.

Lundvall, B.-Å. (Ed.). (1992). National Systems of Innovation. London: Pinter.

Magnani, M., & Montesi, D. (2007). Integration of Patent and Company Databases. 11th International Database Engineering and Applications Symposium, 2007. IDEAS 2007, 163-171. Retrieved November 13, 2007, at http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/proceedings/&toc=comp/proceedings/ideas/2007/2947/00/2947toc.xml&DOI=10.1109/IDEAS.2007.32 .

Merton, R. K. (1968). The Matthew Effect in science. Science, 159, 56-63.

Michel, J., & Bettels, B. (2001). Patent citation analysis. A closer look at the basic input data from patent search reports. Scientometrics, 51(1), 185-201.

Mogee, M. E., & Kolar, R. G. (1999). Patent co-citation analysis of Eli Lilly & Co. patents. Exp. Opin. Ther. Patents 9 (3), 291-305.

Mogoutov, A., Cambrosio, A., Keating, P., & Mustar, P. (2007). Biomedical Innovation at the Laboratory, Clinical and Commercial Interface: Mapping research projects, publications and patents in the field of microarrays. 6th International Conference of the Triple Helix of University-Industry-Government Relations, Singapore, 16-18 May 2007.

Narin, F., & Olivastro, D. (1988). Technology Indicators Based on Patents and Patent Citations. In A. F. J. v. Raan (Ed.), Handbook of Quantitative Studies of Science and Technology (pp. 465-507). Amsterdam: Elsevier.

Narin, F., & Olivastro, D. (1992). Status Report: Linkage between technology and science. Research Policy, 21, 237-249.

Nelson, R. R. (Ed.). (1993). National Innovation Systems: A comparative analysis. New York: Oxford University Press.

Nesta, L., & Saviotti, P. (2005). Coherence of the Knowledge Base and the Firm's Innovative Performance: Evidence from the U.S. Pharmaceutical Industry. The Journal of Industrial Economics, 53(1), 123-142.

OECD. (1994). The measurement of scientific and technological activities: Using patent data as science and technology indicators (Vol. OCDE/GD(94)114). Paris: OECD. Retrieved November 13, 2007, at http://www.oecd.org/dataoecd/33/62/2095942.pdf.

OECD. (2005). Compendium of Patent Statistics. Paris: OECD. Retrieved November 13, 2007, at http://www.oecd.org/dataoecd/60/24/8208325.pdf.

Porter, A. L., & Cunningham, S. W. (2005). Tech Mining: Exploiting New Technologies for Competitive Advantage. Hoboken, NJ: Wiley.

Price, D. J. de Solla (1965). Networks of scientific papers. Science, 149, 510-515.

Salton, G., & McGill, M. J. (1983). Introduction to Modern Information Retrieval. Auckland, etc.: McGraw-Hill.

Sampat, B. N. (2006). Patenting and U.S. academic research in the 20th century: The world before and after Bayh-Dole. Research Policy, 35, 772-789.

Sapsalis, E., Van Pottelsberghe de la Potterie, B., & Navon, R. (2006). Academic vs. industry patenting: An in-depth analysis of what determines patent value. Research Policy, 35(10), 1631-1645.

Scheu, M., Veefkind, V., Verbandt, Y., Galan, E. M., Absalom, R., & Förster, W. (2006). Mapping nanotechnology patents: The EPO approach. World Patent Information, 28, 204-211.

Schmoch, U., Laville, F., Patel, P., & Frietsch, R. (2003). Linking Technology Areas to Industrial Sectors. Final Report to the European Commission, DG Research.

Spasser, M. A. (1997). Mapping the terrain of pharmacy: Co-classification analysis of the International Pharmaceutical Abstracts database. Scientometrics, 39(1), 77-97.

Tijssen, R. J. W. (1992a). Cartography of Science: scientometric mapping with multidimensional scaling methods. Leiden: DSWO Press, Leiden University.

Tijssen, R. J. W. (1992b). A quantitative assessment of interdisciplinary structures in science and technology: coclassification analysis of energy research. Research Policy, 21(1), 27-44.

Todorov, R. (1988). Representing a scientific field: A bibliometric approach. Scientometrics, 15(5-6), 593-605.

Trajtenberg, M. (1990). A Penny for Your Quotes: Patent Citations and the Value of Innovations. The RAND Journal of Economics, 21(1), 172-187.

Verspagen, B. (1997). Measuring Intersectoral Technology Spillovers: Estimates from the European and US Patent Office Databases. Economic Systems Research, 9(1), 47-65.

Verspagen, B. (2005). Mapping Technological Trajectories as Patent Citation Networks: A Study on the History of Fuel Cell Research. MERIT, Maastricht Economic Research Institute on Innovation and Technology; University Library, Universiteit Maastricht.

Verspagen, B., Van Moergastel, T., & Slabbers, M. (1994). MERIT Concordance Table: IPC-ISIC (rev. 2). Maastricht: MERIT.

WIPO (1970). Patent Cooperation Treaty. Geneva: WIPO. Retrieved November 13, 2007, at http://www.wipo.int/pct/en/texts/articles/atoc.htm

WIPO (2006). International Patent Classification, Eighth Edition, Guide. Geneva: WIPO. Retrieved November 13, 2007, at http://www.wipo.int/classifications/ipc/en/other/guide/guide_ipc8.pdf

WIPO (2007). WIPO Patent Report: Statistics on Worldwide Patent Activity (2007 Edition). Geneva: WIPO. Retrieved November 13, 2007, at http://www.wipo.int/ipstats/en/statistics/patents/patent_report_2007.html
