#6 A new similarity-based read-across algorithm for the prediction of small datasets: Case studies with nano-toxicity data



How to Cite

Roy, K. #6 A New Similarity-Based Read-across Algorithm for the Prediction of Small Datasets: Case Studies With Nano-Toxicity Data. J Pharm Chem 2022, 8.


Nanotechnology is an important area of science that has developed in the 21st century and continues to advance. A range of modern technologies is now used to produce nanomaterials and nanoparticles, which are applied across many industrial and societal sectors. Because of their indiscriminate use, nanomaterials are often disposed of improperly, adversely affecting the environment. Owing to their small particle size, they can cross the plasma membrane with ease and may therefore cause toxicity. Regulatory agencies work continuously to assess the risks associated with nanoparticles and nanomaterials, and they rely largely on computational toxicity prediction to avoid the complexities of laboratory experimentation. Quantitative structure-activity relationship (QSAR) modeling and read-across are the approaches most often used to fill data gaps and to support risk and hazard assessment. In the present communication, we discuss a new similarity-based read-across algorithm for predicting the toxicity (or, more generally, the biological activity) of untested compounds from their structural analogues. The algorithm uses three similarity estimation techniques: Euclidean distance-based similarity, Gaussian kernel function similarity, and Laplacian kernel function similarity. The new algorithm was validated against three published nanotoxicity datasets. The quality of the predictions depends on the choice of the distance threshold, the similarity threshold, and the number of most similar training compounds. In this work, the best predictions were obtained with a distance threshold of 0.4–0.5, a similarity threshold of 0.00–0.05, and 2–5 most similar training compounds. After the toxicity of the test set compounds was predicted, the external validation metrics Q²ext_F1, Q²ext_F2, and RMSEp were calculated. The computed metric values support the efficiency of the new read-across method and the accuracy of the predictions generated by the proposed algorithm.
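The read-across idea described above can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the kernel forms (Gaussian exp(-d²/2σ²) and Laplacian exp(-d/σ)), the 1/(1+d) mapping used for the Euclidean distance-based similarity, the similarity-weighted average used as the prediction, and all function and parameter names (`read_across_predict`, `sigma`, `sim_threshold`, `distance_threshold`, `k`). It is not the published implementation.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def euclidean_similarity(a, b):
    """Assumed mapping of distance to a similarity in (0, 1]: 1 / (1 + d)."""
    return 1.0 / (1.0 + euclidean_distance(a, b))


def gaussian_similarity(a, b, sigma=1.0):
    """Gaussian kernel similarity, exp(-d^2 / (2 * sigma^2)); sigma is assumed."""
    d = euclidean_distance(a, b)
    return math.exp(-(d * d) / (2.0 * sigma ** 2))


def laplacian_similarity(a, b, sigma=1.0):
    """Laplacian kernel similarity, exp(-d / sigma); sigma is assumed."""
    return math.exp(-euclidean_distance(a, b) / sigma)


def read_across_predict(query, train_X, train_y, similarity,
                        sim_threshold=0.0, distance_threshold=None, k=3):
    """Predict the response of `query` from its most similar training compounds.

    Hypothetical scheme: keep training compounds that pass the thresholds,
    take the k most similar, and return their similarity-weighted mean.
    Returns None when no training compound qualifies.
    """
    candidates = []
    for x, y in zip(train_X, train_y):
        if distance_threshold is not None:
            if euclidean_distance(query, x) > distance_threshold:
                continue  # too far from the query compound
        s = similarity(query, x)
        if s >= sim_threshold:
            candidates.append((s, y))

    # Most similar compounds first, then truncate to the k nearest analogues.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    neighbours = candidates[:k]

    total = sum(s for s, _ in neighbours)
    if total == 0:
        return None
    return sum(s * y for s, y in neighbours) / total
```

For example, `read_across_predict(q, X_train, y_train, laplacian_similarity, sim_threshold=0.05, k=3)` would use at most three analogues whose Laplacian similarity to `q` is at least 0.05. Descriptors are assumed to be scaled beforehand, since distance thresholds such as 0.4–0.5 are only meaningful on a common scale.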
A Java-based computer program (available at https://sites.google.com/jadavpuruniversity.in/dtc-lab-software/home) implementing the proposed algorithm has also been developed; it predicts the toxicity of untested nanoparticles (NPs) once structural information on their chemical analogues is provided. The new algorithm and the program can be used for data gap filling, for prioritizing existing and new NPs, and for the risk assessment of NPs.
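The external validation metrics named in the abstract can be computed as in the sketch below. It uses the commonly cited definitions of Q²F1 (reference: training set mean), Q²F2 (reference: test set mean), and RMSEp; the abstract itself does not spell them out, so treat the formulas as assumed standard forms. The function name `external_validation` and its argument names are hypothetical.

```python
import math


def external_validation(y_obs, y_pred, y_train_mean):
    """Return Q2_F1, Q2_F2 and RMSEp for a test set.

    y_obs        -- observed test set responses
    y_pred       -- predicted test set responses (same order as y_obs)
    y_train_mean -- mean observed response of the training set
    """
    n = len(y_obs)
    # Sum of squared prediction errors over the test set.
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))

    test_mean = sum(y_obs) / n
    # Reference sums of squares: deviations from training mean and test mean.
    ss_train_ref = sum((o - y_train_mean) ** 2 for o in y_obs)
    ss_test_ref = sum((o - test_mean) ** 2 for o in y_obs)

    return {
        "Q2_F1": 1.0 - press / ss_train_ref,
        "Q2_F2": 1.0 - press / ss_test_ref,
        "RMSEp": math.sqrt(press / n),
    }
```

Values of Q²F1 and Q²F2 close to 1 and a low RMSEp indicate good external predictivity; the sketch assumes the test set responses are not all identical, otherwise the reference sums of squares are zero.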


Chatterjee M, Banerjee A, De P, Gajewicz-Skretna A, Roy K. (2022) A novel quantitative read-across tool designed purposefully to fill the existing gaps in nanosafety data. Environ Sci: Nano

DOI: https://doi.org/10.1039/D1EN00725D

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Copyright (c) 2022 Journal of Pharmaceutical Chemistry