“It’s sort of the ideal problem for machine learning: you know the answer, but not the formula you want to apply,” says Aron Cohen, a theoretical chemist who has long worked on DFT and who is now at DeepMind. (Nature 600, 371 (2021))
Discovery of RRx-001, a Myc and CD47 Downregulating Small Molecule with Tumor Targeted Cytotoxicity and Healthy Tissue Cytoprotective Properties in Clinical Development J. Med. Chem. 2021(64)7261
New Substructure Filters for Removal of Pan Assay Interference Compounds (PAINS) from Screening Libraries and for Their Exclusion in Bioassays J. Med. Chem. 2010(53)2719
from rdkit.Chem import FilterCatalog  # import needed by this snippet
# Create the NIH filter
p2 = FilterCatalog.FilterCatalogParams()
p2.AddCatalog(FilterCatalog.FilterCatalogParams.FilterCatalogs.NIH)
f2 = FilterCatalog.FilterCatalog(p2)
# Retrieve the substructures flagged by the filter
matches = f2.GetMatches(RRx_001_mol)
for match in matches:
    print(match.GetProp("description"))
    print(match.GetProp("Reference"))
    print(match.GetProp("Scope"))
    print("------------")
"""
alpha_halo_carbonyl
Jadhav A, et al. Quantitative Analyses of Aggregation, Autofluorescence, and Reactivity Artifacts in a Screen for Inhibitors of a Thiol Protease. J Med Chem 53 (2009) 37D51. doi:10.1021/jm901070c.
annotate compounds with problematic functional groups
------------
non_ring_ketal
Jadhav A, et al. Quantitative Analyses of Aggregation, Autofluorescence, and Reactivity Artifacts in a Screen for Inhibitors of a Thiol Protease. J Med Chem 53 (2009) 37D51. doi:10.1021/jm901070c.
annotate compounds with problematic functional groups
------------
primary_halide_sulfate
Jadhav A, et al. Quantitative Analyses of Aggregation, Autofluorescence, and Reactivity Artifacts in a Screen for Inhibitors of a Thiol Protease. J Med Chem 53 (2009) 37D51. doi:10.1021/jm901070c.
annotate compounds with problematic functional groups
------------
"""
Quantitative Analyses of Aggregation, Autofluorescence, and Reactivity Artifacts in a Screen for Inhibitors of a Thiol Protease J. Med. Chem. 2010(53)37
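from rdkit.Chem import Draw  # import needed by this snippet
# List of descriptions, one per match
descriptions = [match.GetProp("description") for match in matches]
# List of atom-index lists, one per match
atom_nums_list = []
for match in matches:
    # Each atomPairs entry is (query atom index, molecule atom index); keep the latter
    atom_nums = [x[1] for x in match.GetFilterMatches(RRx_001_mol)[0].atomPairs]
    atom_nums_list.append(atom_nums)
# Draw the molecule once per match, highlighting the flagged atoms
Draw.MolsToGridImage([RRx_001_mol for _ in range(len(matches))],
                     highlightAtomLists=atom_nums_list,
                     legends=descriptions)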
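If you would rather keep the filter results as data than print them, the two loops above can be folded into one small helper. This is only a sketch: `summarize_matches` is a hypothetical name of mine, not part of RDKit, and it assumes each entry behaves like the objects returned by `GetMatches` above (that is, it offers `GetProp` and `GetFilterMatches`, and each filter match has `atomPairs`).

```python
def summarize_matches(matches, mol, keys=("description", "Reference", "Scope")):
    """Collect filter-catalog hits into a list of plain dictionaries.

    Hypothetical helper: it relies only on the calls used in the snippets
    above (GetProp, GetFilterMatches, atomPairs).
    """
    rows = []
    for match in matches:
        # Copy the requested properties of this catalog entry
        row = {key: match.GetProp(key) for key in keys}
        filter_matches = match.GetFilterMatches(mol)
        if filter_matches:
            # atomPairs holds (query atom index, molecule atom index) pairs;
            # keep the molecule-side index, as the drawing code does
            row["atom_indices"] = [pair[1] for pair in filter_matches[0].atomPairs]
        else:
            row["atom_indices"] = []
        rows.append(row)
    return rows
```

Calling it as `summarize_matches(f2.GetMatches(RRx_001_mol), RRx_001_mol)` should give one dictionary per flagged substructure, which is easy to pass on to something like a pandas DataFrame.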
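This one is also a deep-learning-based method; DECIMER stands for Deep lEarning for Chemical ImagE Recognition.
The software is one part of the DECIMER project, which tackles the problem of Optical Chemical Structure Recognition (OCSR) with the goal "let's develop automated open-source software using the latest artificial intelligence techniques!" (project website).
Recently some journals, such as the Journal of Medicinal Chemistry, have started to provide files listing the SMILES of chemical structures in the Supplementary Information, but in most of the literature chemical structures are still shown only as images. In patents, moreover, the published documents are in PDF or image formats, and structures are sometimes deliberately drawn ambiguously for commercial reasons.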
Reference ①: C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich.
Going deeper with convolutions. In Proc. of CVPR, 2015. arXiv:1409.4842v1
Finally, careful handling of the structure and intermediate losses (using both end-to-end structure gradients and providing intermediate loss gradients in recycling) is important to achieving full accuracy. This is likely due to pushing the network towards having a concrete representation of the structure as appears to be present in the trajectories of intermediate structure predictions.
(Supplementary Information, p. 51; emphasis added)
Well, that's a relief (?)
2. Transformer-XL
The next term is Transformer-XL. XL stands for "extra long".
Reference ②: Z. Dai, Z. Yang, Y. Yang, J. Carbonell, Q. Le, and R. Salakhutdinov.
Transformer-XL: Attentive language models beyond a fixed-length context. arXiv:1901.02860v3
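To get a feel for where the "extra long" comes from, here is a toy sketch of the segment-level recurrence idea described in the Transformer-XL paper: the sequence is processed one fixed-length segment at a time, and a cache from earlier segments is made available as extra context. Please treat it as an illustration under heavy simplification. The function name and parameters are hypothetical, and it caches raw tokens, whereas the actual model caches hidden states per layer and also uses relative positional encodings.

```python
def segment_contexts(tokens, seg_len, mem_len):
    """Yield (memory, segment) pairs in processing order.

    Toy illustration only: 'memory' stands in for the cached states that
    the current segment could attend to in addition to itself.
    """
    memory = []
    for start in range(0, len(tokens), seg_len):
        segment = tokens[start:start + seg_len]
        yield list(memory), segment
        # Keep only the most recent mem_len items for the next segment
        memory = (memory + segment)[-mem_len:] if mem_len > 0 else []


for memory, segment in segment_contexts(list(range(6)), seg_len=2, mem_len=3):
    print("memory:", memory, "segment:", segment)
```

With `mem_len=0` every segment sees only itself, which corresponds to the fixed-length context that the paper's title refers to going beyond.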
Reference ⑤: H. Wang, Y. Zhu, B. Green, H. Adam, A. Yuille, and L.-C. Chen.
Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation. arXiv:2003.07853v2
Reference ⑥: Z. Huang, X. Wang, Y. Wei, L. Huang, H. Shi, W. Liu, and T. Huang.
CCNet: Criss-Cross Attention for Semantic Segmentation. arXiv:1811.11721v2