USUGM 2014 - Zhengwei Peng (Merck): In-depth analysis of patent molecular spaces
1
The challenges related to
in-depth analysis of patent molecular spaces
in support of drug discovery projects
Zhengwei Peng
09/26/2014, 2014 ChemAxon UGM, Boston
2
Outline
• Quick introduction
• Need for in-depth analysis of patent molecular space
• Current state, major challenges, and the desired future state
• Potential implications for the wider community
• Pharma scientists
• Pharma patent attorneys
• Patent offices
• Patent content providers
• A pipe dream
3
Claimed chemical space by a chemical patent
The size of a patent Markush claim space could be large (10^12 – 10^30).
However, commercial solutions do exist to encode Markush claims into
searchable form(s) and perform structure searches (SSS and Exact)
against these vast virtual compound spaces.
[Figure labels: Exemplified Molecules; Broadest Claim; Encode into computer-searchable format(s)]
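As a rough illustration of where numbers of this magnitude come from: the size of a Markush space is approximately the product of the number of alternatives at each variable position. The sketch below assumes the R-groups vary independently and uses made-up counts (`example_counts` is hypothetical, not taken from any patent). It ignores nested definitions and duplicate structures, so it gives only an upper-bound estimate.

```python
from math import log10, prod


def markush_space_size(r_group_counts):
    """Upper bound on the number of compounds, assuming independent R-groups."""
    return prod(r_group_counts.values())


# Hypothetical number of alternatives at each substituent position.
example_counts = {"R1": 500, "R2": 500, "R3": 1000, "R4": 200, "R5": 100}

size = markush_space_size(example_counts)
print(f"{size:.1e} compounds (order of magnitude: 10^{int(log10(size))})")
```

Even modest lists of alternatives at five positions already push the space past 10^12, which is why exhaustive enumeration is impractical and dedicated Markush search engines are used instead.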
4
A sample patent Markush claim
[Figure: Markush structure with a core and substituent positions R1, R2, R3, R4, R5]
CAS & TR (Thomson Reuters) are the two dominant
Markush content providers.
100-500 analysts are required to
encode published chemical
patents, which means a time delay and
a subscription cost ($$) for customers.
5
The importance of in-depth patent analysis
Viagra / Levitra
The need for a good defense
The power of a good analysis
6
Major questions asked by project chemists
• Is my lead compound patentable, and do we have freedom to operate (FTO)?
  • This requires a comprehensive prior art search (CAS/TR databases)
  • Patent attorneys provide the analysis and opinion on patentability and FTO
• What is the competitive landscape in this area (disease, protein target, and compound)?
  • Literature and patent searches done by project members & expert searchers
• I want an in-depth analysis of patents or patent applications pertinent to my project
  • exemplified molecules
  • assay data associated with SAR
  • patent Markush claim(s)
• I want an in-depth analysis of the strength of my draft patent application
  • Does the draft Markush claim cover all the exemplified molecules?
  • Did I provide sufficient information to cover all embodiments of the invention?
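The question of whether a draft Markush claim covers every exemplified molecule can be sketched with a toy model. The code below is an illustration only. It assumes each molecule is already decomposed into a core label plus one substituent label per position, which in practice is the hard part and needs a cheminformatics toolkit. All names here (`MarkushClaim`, `Example`, `coverage_problems`, the labels) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MarkushClaim:
    core: str
    allowed: dict  # position -> set of allowed substituent labels


@dataclass
class Example:
    name: str
    core: str
    substituents: dict  # position -> substituent label


def coverage_problems(claim, example):
    """Return reasons the example falls outside the claim (empty list = covered)."""
    if example.core != claim.core:
        return [f"core {example.core!r} differs from claimed core {claim.core!r}"]
    problems = []
    for position, allowed in claim.allowed.items():
        label = example.substituents.get(position)
        if label not in allowed:
            problems.append(f"{position}={label!r} not in claim")
    return problems


# Hypothetical draft claim and exemplified molecules.
claim = MarkushClaim("core-A", {"R1": {"H", "methyl"}, "R2": {"F", "Cl"}})
examples = [
    Example("Ex-1", "core-A", {"R1": "methyl", "R2": "F"}),
    Example("Ex-2", "core-A", {"R1": "ethyl", "R2": "Cl"}),
]

report = {ex.name: coverage_problems(claim, ex) for ex in examples}
for name, problems in report.items():
    print(name, "covered" if not problems else problems)
```

A report of this kind would flag Ex-2, since ethyl is not among the R1 alternatives, prompting the drafter either to broaden R1 or to drop the example.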
7
The current state, challenges,
and the desired future states
[Diagram labels: New patents and patent applications; Encoding & publish ~monthly (time & $); Markush patent content; Expert search tool; Search expert; Patent attorney; Internal patent store; Selective download & encoding anytime needed; Search & browse tool]
• Time delay by the content provider
• No direct access to expert search tool for project chemists
8
Tools and workflow to enable project teams to
perform in-depth analysis of patent Markush
[Diagram labels:
Internal patent store: patent files (xml/pdf);
Selected patents relevant to a discovery project (10-100 per project);
Doc 2 Str extraction: auto structure recognition & extraction;
Curation GUI tool: manual review & edit to ensure high quality;
Markush patent content;
API;
Search & browse tool: Search, Enumerate, Compare]
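Two of the operations named on this slide, Enumerate and Compare, can be illustrated with the same kind of toy model: a Markush object reduced to a core label plus a set of substituent labels per position. This is a sketch under that simplifying assumption, not a description of any vendor's API; the function names and labels are hypothetical. Substructure search is not shown because it needs a real chemistry toolkit.

```python
from itertools import islice, product


def enumerate_space(core, r_groups):
    """Lazily yield (core, ((position, label), ...)) for every combination."""
    positions = sorted(r_groups)
    choices = [sorted(r_groups[p]) for p in positions]
    for combo in product(*choices):
        yield core, tuple(zip(positions, combo))


def overlap(core_a, groups_a, core_b, groups_b):
    """Shared sub-space of two claims, or None if they cannot overlap."""
    if core_a != core_b or set(groups_a) != set(groups_b):
        return None
    shared = {p: groups_a[p] & groups_b[p] for p in groups_a}
    # One empty position means no compound satisfies both claims.
    return shared if all(shared.values()) else None


# Hypothetical R-group definitions for two patents on the same core.
ours = {"R1": {"H", "methyl", "ethyl"}, "R2": {"F", "Cl"}}
theirs = {"R1": {"methyl", "ethyl", "propyl"}, "R2": {"Cl", "Br"}}

shared = overlap("core-A", ours, "core-A", theirs)
first_two = list(islice(enumerate_space("core-A", shared), 2))
print(shared)
print(first_two)
```

Enumeration is written as a generator so that only a slice of the space is ever materialized, and the comparison works on the R-group definitions rather than on enumerated compounds.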
9
An early prototype by ChemAxon
D2S: Chemical terms inside the
patent document are recognized
and converted to live
molecular structures.
Molecular structures can be
dragged and dropped to
construct the searchable
patent Markush object.
10
Implications for the wider community
[Diagram labels: 1) Patent preparation stage; 2) Patent approval stage; 3) Patent publication stage; 4) Content distribution stage; Patent attorney; Patent Office; Curation GUI tool; Markush search & analysis tool]
Win-Win:
• Tool potentially useful for patent attorney to prepare and prosecute
patent applications.
• Faster/cheaper content publication (good for content providers as
well as content users)
• Timely data access by scientists
• Higher efficiency & lower cost, benefiting society in general…
Objections heard so far:
• Naïve, a pipe dream…
• The need for obfuscation is always here..
• Migrating from legacy formats to the new one is hard & costly..
• ..
11
Summary
• To better serve drug discovery projects, in-depth analysis of patent
molecular spaces in a timely manner is highly desirable to
• understand the implication of newly published external patents and
patent applications on the project
• assist the project chemist to understand the scope of the published
art to help identify productive areas
• ChemAxon’s suite of tools (D2S, Chem-Curator, Patent Markush Search
Platform, etc.) is well positioned as the potential solution for the
wider community
• a newer & better (?) search engine one can license
• content neutral
• once adopted by the wider community as a standard, it has the
potential to solve a long standing challenge and yield significant
benefit to the drug discovery community.
• Pharma companies have an active role in shaping the future
12
Acknowledgements
• ChemAxon collaborators
  • Tim Dudgeon, David Deng, Doug Drake, and Arpad Figyelmesi
• Merck colleagues
  • Chris Brofft and Wendy Cornell (MRL-IT)
  • Kenrick Vidale (Legal-IP Group)
  • Michael Altman and Nicolas Zorn (MRL-SC)
  • Chris Waller, Frank Brown, and Emma Parmee (Management sponsorship
    and support)
13
Supplemental material
Backup slides
14
References
• Lynch et al., (1996) The Sheffield Generic Structures Project: a retrospective review. J. Chem. Inf.
Comput. Sci., 36, 930-936.
• The MARPAT patent Markush searching system of CAS: Ebe et al. (1991) The Chemical Abstracts Service
generic chemical (Markush) structure storage and retrieval capabilities. J. Chem. Inf. Comput. Sci. 31, 31-36.
(CAS web link: http://www.cas.org )
• The Merged Markush Service (MMS) of Thomson Reuters: Benichou et al. (1997) Handling genericity in
chemical structures using the Markush DARC software. J. Chem. Inf. Comput. Sci. 37, 43-53.
• The ChemAxon Markush search technology enables structure searches (SSS and Exact) into vast patent
Markush spaces: http://www.chemaxon.com/products/Markush-ip/
• ChemAxon Instant JChem interfaces with the MMS search service from Thomson Reuters. This provides a
more interactive and user-friendly GUI to the existing STN search user interface offered by Thomson
Reuters. TR Markush patent content was converted to ChemAxon readable format automatically => 80%
concordance in search results between two search engines.
• ChemProspector (http://infochem.de/news/projectdisplay.shtml?chemprospector.shtml), a project funded by
the German government to build software to automatically extract the core and R Groups from chemical
patents and store them in a searchable database. Conclusion: it is very hard to automatically construct
high-quality Markush space definitions given the diverse ways patent claim sections are crafted.
15
• SciFinder also now interfaces with the CAS MARPAT search service.
• Barnard et al.,(2009) Towards in-house searching of Markush structures from patents. World Patent Inf. 31,
97-103.
• Downs & Barnard (2011) Chemical patent information systems. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 1,
727-741.
• Cosgrove et al. from AstraZeneca (2012) A System for Encoding and Searching Markush Structures. J.
Chem. Inf. Model. 52, 1936-1947.