Descriptive Analysis of the Drug Name Lexicon

Bruce L. Lambert*, Ken-yu Chang, Swu-Jane Lin

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


The complexity of the drug use process is managed in part by developing systematic nomenclature for drugs. This nomenclature is cataloged in a variety of drug information databases. Answers to simple questions about the whole population of brand and generic drug names, however, are not easily obtained. This paper provides a descriptive analysis of the drug name lexicon, with a primary (though not exclusive) emphasis on drugs marketed in the United States. Using the techniques of computational lexicography, one large database of trademark names (the US Patent and Trademark database) and one large database of nonproprietary names (the USP Dictionary of USAN and International Drug Names) were analyzed. Results describe a variety of distributional characteristics of drug names, including the number of characters per name, the number of syllables per name, and the number of words per name. Distributions of pairwise similarity and distance scores for a large sample of names are provided, as are lists of the 25 most common initial and terminal bigrams and trigrams. The information should be of interest to trademark attorneys, patient safety advocates, regulators, and students of drug nomenclature.
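The distributional measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which similarity or distance measures were used, so Levenshtein edit distance and a bigram Dice coefficient are assumed here as common choices, and the function names (`name_stats`, `edge_ngrams`, `edit_distance`, `bigram_dice`) and the short `SAMPLE_NAMES` list are hypothetical, chosen only for illustration. Syllable counting is omitted because it needs a pronunciation resource.

```python
from collections import Counter
from itertools import combinations

# Illustrative sample only; not drawn from the databases analyzed in the paper.
SAMPLE_NAMES = ["cefazolin", "cefaclor", "amoxicillin",
                "ampicillin", "atenolol", "metoprolol"]


def name_stats(name):
    """Return the character count (spaces included) and word count of one name."""
    return {"chars": len(name), "words": len(name.split())}


def edge_ngrams(names, n, position="initial"):
    """Count the first or last n letters of each name; names shorter than n are skipped."""
    counts = Counter()
    for name in names:
        letters = name.lower().replace(" ", "")
        if len(letters) < n:
            continue
        counts[letters[:n] if position == "initial" else letters[-n:]] += 1
    return counts


def edit_distance(a, b):
    """Levenshtein distance (assumed measure), computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def bigram_dice(a, b):
    """Dice coefficient over sets of character bigrams (assumed measure)."""
    grams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    grams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not grams_a and not grams_b:
        return 1.0 if a == b else 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


if __name__ == "__main__":
    print(edge_ngrams(SAMPLE_NAMES, 3, "initial").most_common(25))
    print(edge_ngrams(SAMPLE_NAMES, 3, "terminal").most_common(25))
    for x, y in combinations(SAMPLE_NAMES, 2):
        print(x, y, edit_distance(x, y), round(bigram_dice(x, y), 2))
```

Run over a full name list, `most_common(25)` would yield a table of the same shape as the top-25 bigram and trigram lists the abstract mentions, and the pairwise loop would give the raw scores from which a similarity or distance distribution could be drawn.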

Original language: English (US)
Pages (from-to): 163-172
Number of pages: 10
Journal: Therapeutic Innovation & Regulatory Science
Issue number: 1
State: Published - Jan 1 2000


  • Description
  • Drug nomenclature
  • Generic
  • Medication errors
  • Similarity
  • Trademark

ASJC Scopus subject areas

  • Pharmacology, Toxicology and Pharmaceutics (miscellaneous)
  • Public Health, Environmental and Occupational Health
  • Pharmacology (medical)

