Registration Dossier


Please be aware that this old REACH registration data factsheet is no longer maintained; it remains frozen as of 19th May 2023.

ECHA has released the new ECHA CHEM database, which now contains all REACH registration data. Details on the transition of ECHA's published data to ECHA CHEM are available from ECHA.


Physical & Chemical properties

Water solubility


Administrative data

Endpoint:
water solubility
Type of information:
calculation (if not (Q)SAR)
Adequacy of study:
supporting study
Study period:
09/10/2020
Reliability:
2 (reliable with restrictions)
Rationale for reliability incl. deficiencies:
accepted calculation method
Justification for type of information:
1. SOFTWARE: The Estimation Programs Interface (EPI) Suite™

2. MODEL (incl. version number): WSKOW v1.42

3. SMILES OR OTHER IDENTIFIERS USED AS INPUT FOR THE MODEL:
SMILES : CCCCCCON=O
CAS no.: 638-51-7

4. SCIENTIFIC VALIDITY OF THE EPI Suite MODEL: It is a screening tool, intended for applications such as quickly screening chemicals for release potential and "binning" chemicals by priority for future work.
Overall, the MCI methodology is somewhat more accurate than the Log Kow methodology, although both methods yield good results. If the training datasets are combined into one dataset of 516 compounds (69 having no corrections plus 447 with corrections), the MCI methodology has an r2, standard deviation and average deviation of 0.916, 0.330 and 0.263, respectively, versus 0.86, 0.429 and 0.321 for the Log Kow methodology.
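The three statistics quoted above can be sketched as follows. This is only an illustration: the function name `fit_statistics` is hypothetical, and it assumes that r2 is the coefficient of determination, that the standard deviation is taken over the residuals with n - 1 degrees of freedom, and that the average deviation is the mean absolute residual. The source does not state which definitions were actually used.

```python
import math


def fit_statistics(observed, predicted):
    """Return (r2, standard deviation, average deviation) for a set of
    observed vs. predicted values. The definitions are assumptions."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_observed = sum(observed) / n

    ss_res = sum(r * r for r in residuals)                    # residual sum of squares
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)  # total sum of squares

    r2 = 1.0 - ss_res / ss_tot
    std_dev = math.sqrt(ss_res / (n - 1))   # assumed: n - 1 in the denominator
    avg_dev = sum(abs(r) for r in residuals) / n
    return r2, std_dev, avg_dev


# Hypothetical values, not taken from the training set
r2, std_dev, avg_dev = fit_statistics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```
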


5. APPLICABILITY DOMAIN: EPI Suite™ cannot be used for all chemical substances. The intended application domain is organic chemicals.
Appendix D lists (for each correction factor) the maximum number of instances of that factor in any of the 447 training set compounds (the minimum number of instances is of course zero, since not all compounds had every fragment).  The minimum and maximum values for molecular weight are the following:

Training Set Molecular Weights:
Minimum MW:  32.04
Maximum MW:  665.02
Average MW:  224.4

Validation Molecular Weights:
Minimum MW:  73.14
Maximum MW:  504.12
Average MW:  277.8

Currently there is no universally accepted definition of model domain.  However, users may wish to consider the possibility that log Koc estimates are less accurate for compounds outside the MW range of the training set compounds, and/or that have more instances of a given fragment than the maximum for all training set compounds.  It is also possible that a compound may have a functional group(s) or other structural features not represented in the training set, and for which no fragment coefficient or correction factor was developed.  These points should be taken into consideration when interpreting model results.
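The domain considerations above could be turned into a simple screening check, sketched below. The function `domain_warnings`, the fragment names and the maximum counts are hypothetical placeholders (the real per-fragment maxima are in Appendix D, which is not reproduced here); only the molecular weight range comes from the training set figures quoted above.

```python
# Molecular weight range of the training set, as quoted above
TRAINING_MW_RANGE = (32.04, 665.02)


def domain_warnings(mol_weight, fragment_counts, max_training_counts):
    """Return a list of reasons why an estimate may be less reliable.

    fragment_counts:      {fragment name: instances in the query compound}
    max_training_counts:  {fragment name: maximum instances in any
                           training set compound} (hypothetical input)
    """
    warnings = []
    low, high = TRAINING_MW_RANGE
    if not (low <= mol_weight <= high):
        warnings.append(f"MW {mol_weight} is outside the training range {low}-{high}")

    for fragment, count in fragment_counts.items():
        if fragment not in max_training_counts:
            warnings.append(f"fragment '{fragment}' is not represented in the training set")
        elif count > max_training_counts[fragment]:
            warnings.append(f"fragment '{fragment}' occurs {count} times, above the training maximum")
    return warnings


# Hypothetical inputs for illustration only
inside = domain_warnings(131.17, {"fragment_A": 1}, {"fragment_A": 2})
outside = domain_warnings(700.0, {"fragment_B": 1}, {"fragment_A": 2})
```

An empty list means none of the three checks fired; it does not by itself establish that the compound is within the model domain.
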

Data source

Reference
Reference Type:
other: Software
Title:
Unnamed
Year:
2020
Report date:
2020

Materials and methods

Test guideline
Qualifier:
no guideline followed
Principles of method if other than guideline:
PCKOCWIN (version 2) estimates Koc with two separate estimation methodologies:
(1) estimation using first-order Molecular Connectivity Index (MCI)
(2) estimation using log Kow (octanol-water partition coefficient)
Collection of Koc Data: The major source of experimental Koc data was Schuurmann et al. (2006), which includes a compilation of selected Koc values for 571 compounds. Data from the original PCKOCWIN regression (that included a total of 389 compounds) were also used. A new data source was the USDA Pesticide Properties Database (data for more than 150 compounds were collected). Koc values for a few additional compounds were collected from references in the Environmental Fate Data Base system (Howard et al., 1982, 1986).
For compounds appearing in more than one of the various compilations, a single Koc value was selected. In general, values from the Schuurmann et al. (2006) compilation were used. An average value was used for a few compounds.
After some additional quality control to ensure that the reported Koc values were experimental rather than estimated, selected Koc values for 674 compounds were used to train and validate the updated regression from these sources:
(1) Schuurmann et al. (2006) compilation:  453 compounds
(2)  Original PCKOCWIN regression:  85 compounds
(3) USDA Pesticide Properties Database :  85 compounds
(4)  Average from multiple sources:  21 compounds
(5)  Miscellaneous sources:  30 compounds  (Nguyen et al, 2005; Sabljic et al, 1995; Baker et al, 1997; VonOepen et al, 1991; Kaune, 1998; Gawlik et al, 1998; HSDB, 2008)
The 674 compounds were eventually divided into a training set of 516 compounds and a validation set of 158 compounds.  The training set was divided further into a dataset of 69 non-polar organics and 447 polar organics (same as previously described in Meylan et al, 1992).  For the current model development, the non-polar dataset is designated as compounds having "No Correction Factors" while the polar compounds are designated as compounds "Having Correction factors".
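As a quick consistency check, the compound counts given above can be added up. The dictionary keys are just labels chosen for this sketch; the numbers are those quoted in the text.

```python
# Compound counts per data source, as listed above
source_counts = {
    "Schuurmann et al. (2006) compilation": 453,
    "Original PCKOCWIN regression": 85,
    "USDA Pesticide Properties Database": 85,
    "Average from multiple sources": 21,
    "Miscellaneous sources": 30,
}

# Split of the full dataset, as described above
training_counts = {"no correction factors": 69, "having correction factors": 447}
validation_count = 158

total_compounds = sum(source_counts.values())
training_total = sum(training_counts.values())

# Both routes should arrive at the same overall total
counts_consistent = (total_compounds == training_total + validation_count)
```
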
Estimation Using Molecular Connectivity Index: PCKOCWIN (version 1) estimated Koc solely with a QSAR utilizing Molecular Connectivity Index (MCI).  This QSAR estimation methodology is described completely in a journal article (Meylan et al, 1992) and in a report prepared for the US EPA (SRC, 1991).  PCKOCWIN (version 2) utilizes the same methodology, but the QSAR has been re-regressed using a larger database of experimental Koc values that includes many new chemicals and structure types.
QSAR Derivation: The same methodology as described in (Meylan et al, 1992) was used to develop the QSAR equations utilizing Molecular Connectivity Index (MCI).  Two separate regressions were performed.  The first regression related log Koc of non-polar compounds to the first-order MCI.  As noted above, non-polar compounds are now designated as "compounds having no correction factors", which simply means the MCI descriptor alone can adequately predict the Koc.  Measured log Koc values were fit to a simple linear equation of the form: log Koc = a × MCI + b, where a and b are the coefficients fit by least-squares analysis.  The 69 compounds used for this regression are listed in Appendix E.
The second regression included the 447 compounds having correction factors; these compounds are listed in Appendix F.  The correction factor descriptors are listed in Appendix D.  Correction factors are specific chemical classes or structural fragments.  The regression coefficients were derived via multiple linear regression of the correction descriptors to the residual error of the prediction from the non-polar equation.
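The two-stage fit described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the PCKOCWIN implementation: all function names and data are hypothetical, and the second stage is assumed to be an ordinary least-squares fit without an intercept, solved through the normal equations.

```python
def fit_line(x, y):
    """Stage 1: least-squares fit of y = a * x + b (log Koc against MCI)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return solution


def fit_corrections(descriptor_counts, residuals):
    """Stage 2: regress residuals on correction descriptors (no intercept)
    by solving the normal equations (X^T X) c = X^T r."""
    m = len(descriptor_counts[0])
    xtx = [[sum(row[i] * row[j] for row in descriptor_counts) for j in range(m)]
           for i in range(m)]
    xtr = [sum(row[i] * r for row, r in zip(descriptor_counts, residuals))
           for i in range(m)]
    return solve_linear_system(xtx, xtr)


# Hypothetical data: compounds without correction factors
a, b = fit_line([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 3.0])

# Hypothetical data: compounds with two correction descriptors each
polar_mci = [2.0, 3.0, 4.0, 5.0]
polar_counts = [[1, 0], [0, 1], [1, 1], [2, 0]]
polar_log_koc = [1.0, 2.0, 1.5, 1.5]

# Residual = measured log Koc minus the stage 1 (non-polar) prediction
residuals = [y - (a * x + b) for x, y in zip(polar_mci, polar_log_koc)]
corrections = fit_corrections(polar_counts, residuals)
```

Under this sketch, an estimate for a new compound would be a × MCI + b plus the sum of each correction coefficient multiplied by the number of times its descriptor occurs.
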
GLP compliance:
no

Test material

Reference
Name:
Unnamed
Type:
Constituent
Test material form:
liquid

Results and discussion

Water solubility
Key result
Water solubility:
ca. 249.4 mg/L
Conc. based on:
test mat.
Temp.:
25 °C

Applicant's summary and conclusion

Conclusions:
The estimated water solubility of the test item is 249.4 mg/L at 25 °C.
Executive summary:

The estimated water solubility of the test item is 249.4 mg/L at 25 °C (EPI Suite™, WSKOW v1.42).
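For comparison with values reported on a molar basis, the estimate can be converted from mg/L to mol/L. The sketch below is an illustration only: the molecular formula C6H13NO2 is derived here from the input SMILES (CCCCCCON=O), it is not stated in the dossier, and the atomic masses are standard rounded values.

```python
import math

# Standard atomic masses (g/mol), rounded
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Assumption: formula read off the SMILES CCCCCCON=O
FORMULA = {"C": 6, "H": 13, "N": 1, "O": 2}


def molar_mass(formula):
    """Sum of atomic masses weighted by atom count, in g/mol."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def mg_per_l_to_mol_per_l(mg_per_l, mol_weight):
    """Convert a mass concentration in mg/L to a molar concentration in mol/L."""
    return mg_per_l / 1000.0 / mol_weight


mw = molar_mass(FORMULA)                               # about 131.2 g/mol
molar_solubility = mg_per_l_to_mol_per_l(249.4, mw)    # about 1.9e-3 mol/L
log_molar_solubility = math.log10(molar_solubility)
```
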