2 Analytical Approach
2.1 Patent Search
Lai et al. [35] developed a disambiguated database of granted U.S. patents (1975–2008).
Using a software text search tool (available at the Harvard Dataverse Network Patent
Project [35]) designed to interrogate the U.S. Patent and Trademark Office (USPTO) database
(http://patft.uspto.gov/), we searched patent abstracts for energy keyword search
terms [18, 20, 36]. We focused our search on technology areas for which search
ontologies were already developed and therefore omit other possible areas of important
carbon-reduction innovations, such as carbon capture and sequestration (CCS). We
also experimented with searching only patent titles and with searching the entire patent
body for our keywords, and found that searching abstracts provided the best balance of
inclusiveness and accuracy. The resultant lists were cleaned using text analysis functions and by hand to
ensure their accuracy. Patents were organized and analyzed by application date to best
position the patent within the appropriate inventive context [37] because there is a
variable and increasing time lag between patent application and issuance [38].
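The abstract search and the grouping by application date described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Harvard Dataverse tool itself: the keyword lists, the record fields (`abstract`, `application_year`, `grant_year`), and the toy patents are all hypothetical, and the actual search terms are those of [18, 20, 36].

```python
import re
from collections import Counter

# Hypothetical keyword lists, for illustration only.
KEYWORDS = {
    "solar": ["photovoltaic", "solar cell"],
    "wind": ["wind turbine"],
}

def compile_patterns(keywords):
    """Build one case-insensitive, word-bounded regex per technology area."""
    return {
        area: re.compile(
            r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
            re.IGNORECASE,
        )
        for area, terms in keywords.items()
    }

def classify_abstract(abstract, patterns):
    """Return the sorted technology areas whose keywords appear in the abstract."""
    return sorted(area for area, pat in patterns.items() if pat.search(abstract))

def count_by_application_year(patents, patterns):
    """Count matched patents per (area, application year), ignoring grant year."""
    counts = Counter()
    for patent in patents:
        for area in classify_abstract(patent["abstract"], patterns):
            counts[(area, patent["application_year"])] += 1
    return counts

# Toy records invented for this sketch; they are not real patents.
PATENTS = [
    {"id": "A", "abstract": "A photovoltaic module with improved efficiency.",
     "application_year": 1998, "grant_year": 2001},
    {"id": "B", "abstract": "Blade pitch control for a wind turbine.",
     "application_year": 2004, "grant_year": 2008},
    {"id": "C", "abstract": "A method for brewing coffee.",
     "application_year": 2004, "grant_year": 2006},
]

patterns = compile_patterns(KEYWORDS)
counts = count_by_application_year(PATENTS, patterns)
```

Keying the counts on the application year rather than the grant year reflects the choice made in the text: the lag between application and issuance varies, so the application date is the better proxy for when the invention occurred.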
Our keyword-based search of energy patent titles and abstracts, while effective at
identifying patents related to energy technologies, favors patents focused on later-stage
innovations that include mention of the production of energy or fuels. For example,
when comparing patents that the U.S. Department of Energy (DOE) funded (which
presumably are part of the energy knowledge space) with our keyword-search-driven
energy patent list, we observe an overlap of less than 3%. Presumably, this is because the
DOE funds very early-stage research (such as algal strains or materials chemistry) that
does not lend itself solely to energy generation or fuel applications and may not
uniformly include applications to energy in the language of their patent applications.
However, while limited to later-stage technologies, our energy list is accurate and
includes only energy patents. The keyword search methodology was accurate
(on average, 3.0% of initial patent results were determined not to be energy patents
after manual inspection) except in the nuclear (20.7%) and wind (34.6%) technology areas.
The errors in the nuclear and wind technology areas were by and large a result of patents that
included homographic keywords (e.g., “nuclear” in reference to genetic techniques and
“turbine” in reference to aeronautic applications, respectively). However, we
acknowledge that we are likely omitting many earlier-stage energy technologies because
of the keyword selection. We intend to develop a more sophisticated ontology that
pairs a more inclusive keyword set with patent classification codes to preserve
accuracy.
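The accuracy checks discussed in this paragraph reduce to simple set and ratio arithmetic, sketched below. Everything here is illustrative: the function names, the record fields, and the class labels `WIND_GEN` and `AERO` are hypothetical placeholders rather than real USPTO classification codes, and the denominator used for the overlap (the size of the reference set) is an assumption, since the text does not specify it.

```python
def false_positive_rate(reviewed_hits):
    """Share of keyword hits judged not to be energy patents on manual review."""
    if not reviewed_hits:
        raise ValueError("no reviewed hits")
    misses = sum(1 for hit in reviewed_hits if not hit["is_energy"])
    return misses / len(reviewed_hits)

def overlap_share(reference_ids, keyword_ids):
    """Fraction of a reference set (e.g., DOE-funded patents) that the
    keyword search also found. Denominator choice is an assumption."""
    reference = set(reference_ids)
    if not reference:
        raise ValueError("empty reference set")
    return len(reference & set(keyword_ids)) / len(reference)

def filter_by_class(hits, allowed_classes):
    """Keep only keyword hits whose classification code is in the allowed set."""
    return [hit for hit in hits if hit["class_code"] in allowed_classes]

# Toy keyword hits for the wind area; labels and class codes are invented.
WIND_HITS = [
    {"id": "W1", "class_code": "WIND_GEN", "is_energy": True},
    {"id": "W2", "class_code": "AERO", "is_energy": False},
    {"id": "W3", "class_code": "WIND_GEN", "is_energy": True},
    {"id": "W4", "class_code": "AERO", "is_energy": False},
]

rate_before = false_positive_rate(WIND_HITS)
filtered = filter_by_class(WIND_HITS, {"WIND_GEN"})
rate_after = false_positive_rate(filtered)
doe_overlap = overlap_share({"D1", "D2", "D3", "W1"}, [h["id"] for h in WIND_HITS])
```

In this toy case the class filter removes the homographic "turbine" hits, which is the effect the planned ontology is intended to achieve. Whether classification codes separate such cases this cleanly in practice is not established here.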
Because energy technologies draw heavily upon preceding innovations (e.g., solar
draws upon semiconductor manufacturing), this improved ontology will allow interesting
investigations of knowledge flow [39] and the organic development of new technology
areas [37]. Although this investigation would be interesting, it is beyond the scope of this
manuscript. All analyses include only energy patents and exclude other patents that the
included inventors may have produced. This artificial boundary around the energy patents
creates rigid and deterministic barriers for our investigation and limits the determination
of the role of the entire inventive career and collaborations therein but is necessary for