To avoid iterative calculation of the Henderson-Hasselbalch equation at many different pH values, a faster search for the isoelectric point can be used. This should speed up the algorithm ~100-fold.
Older, but still useful, information about the implementation is also available.
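As an illustration of how a pH scan can be avoided, here is a minimal sketch that assumes bisection is used as the search method. The pKa values, function names, and tolerance are hypothetical placeholders, not the values used by the original implementation.

```python
# Illustrative pKa values (hypothetical placeholders, not an optimized set).
POSITIVE_PKA = {"Nterm": 9.0, "K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"Cterm": 2.0, "D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}


def net_charge(sequence, ph):
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    counts = {"Nterm": 1, "Cterm": 1}
    for residue in sequence:
        counts[residue] = counts.get(residue, 0) + 1
    positive = sum(counts.get(group, 0) / (1.0 + 10 ** (ph - pka))
                   for group, pka in POSITIVE_PKA.items())
    negative = sum(counts.get(group, 0) / (1.0 + 10 ** (pka - ph))
                   for group, pka in NEGATIVE_PKA.items())
    return positive - negative


def isoelectric_point(sequence, low=0.0, high=14.0, tolerance=0.01):
    """Bisect on pH: the net charge decreases monotonically as pH rises."""
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged, so the pI lies above mid
        else:
            high = mid  # neutral or negative, so the pI lies at or below mid
    return (low + high) / 2.0


print(isoelectric_point("ACDEFGHIK"))
```

With these assumed settings, a linear scan over pH 0 to 14 in 0.01 steps would need 1400 charge evaluations, whereas bisection needs 11 halvings to reach the same resolution. That is roughly the order of speed-up quoted above.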
As mentioned in the "Theory" section, there are many pKa estimates based on different experiments. On the other hand, one can try to obtain pKa values computationally. Here I present an example of how to do this in order to obtain more accurate isoelectric point predictions. For that, protein dataset(s) with experimentally determined isoelectric points are needed. For proteins there are at least two such datasets: PIP-DB and SWISS-2DPAGE (for more details see the "Datasets" section).
Brute force attack:
Checking all possible combinations is not tractable: even 9 variables (charged amino acid pKa values), each sampled over a range of 3 pH units (±1.5 pH units around the average pKa for the given amino acid) with 0.01 precision, give 300^9 = 1.9683 × 10^22 possibilities. Far too many to compute.
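The size of this search space can be checked with a few lines of Python. The variable names are illustrative only.

```python
ph_range = 3.0     # ±1.5 pH units around the average pKa
precision = 0.01   # grid spacing in pH units
n_variables = 9    # number of pKa values being optimized

values_per_variable = round(ph_range / precision)    # 300 grid points each
combinations = values_per_variable ** n_variables    # 300 ** 9
print(f"{combinations:.4e}")  # prints 1.9683e+22
```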
Basinhopping optimization using truncated Newton algorithm:
This produces suboptimal results, but in a much more reasonable time: fewer than a few dozen iterations, with pKa values optimized to high precision.
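In a nutshell, the basinhopping algorithm is an iterative search procedure in which each cycle consists of a random perturbation of the coordinates, a local minimization, and acceptance or rejection of the new coordinates based on the minimized function value.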
As an initial seed, previously published pKa values were used. To limit the search space, the truncated Newton algorithm was used with bounds of ±2 pH units for each pKa (e.g. if the starting point for Cys pKa was 8.5, the solution was allowed in the interval [6.5, 10.5]).
For more details on how these algorithms work, go here.
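The procedure described above can be sketched with SciPy's `basinhopping` and the bounded `TNC` local minimizer. This is a minimal sketch under several assumptions, not the original implementation. The seed pKa values are placeholders. The sequences are made up. The "experimental" pI values are synthetic, generated from a hidden pKa set. The RMSD objective and the `BoundedStep` helper (which keeps each perturbed start point inside the bounds) are illustrative choices.

```python
import numpy as np
from scipy.optimize import basinhopping

# Titratable groups: both termini plus seven charged side chains (9 variables).
GROUPS = ["Nterm", "Cterm", "C", "D", "E", "H", "K", "R", "Y"]
SIGNS = np.array([1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, -1.0])
# Hypothetical starting values standing in for a previously published pKa set.
SEED_PKA = np.array([9.0, 2.0, 8.5, 3.9, 4.1, 6.0, 10.5, 12.5, 10.1])
# Toy sequences (made up); a real run would use PIP-DB or SWISS-2DPAGE entries.
SEQUENCES = ["MKDEHCYRK", "DDEEKHAAG", "KKRRHGCYA", "ACDEFGHIKLY", "GHHHKDYCE"]
COUNTS = np.array([[1, 1] + [seq.count(aa) for aa in GROUPS[2:]]
                   for seq in SEQUENCES], dtype=float)


def predict_pi(pka, n_steps=45):
    """Bisect on pH for every sequence at once until the net charge is ~0."""
    low = np.zeros(len(COUNTS))
    high = np.full(len(COUNTS), 14.0)
    for _ in range(n_steps):
        mid = (low + high) / 2.0
        charged = 1.0 / (1.0 + 10.0 ** (SIGNS * (mid[:, None] - pka)))
        net = (COUNTS * SIGNS * charged).sum(axis=1)
        low = np.where(net > 0, mid, low)
        high = np.where(net > 0, high, mid)
    return (low + high) / 2.0


# Synthetic "experimental" pI values generated from a hidden pKa set, so the
# example needs no real measurements.
HIDDEN_PKA = SEED_PKA + np.array([-0.4, 0.3, 0.2, -0.3, 0.25,
                                  0.4, -0.35, -0.2, 0.3])
TARGET_PI = predict_pi(HIDDEN_PKA)


def rmsd(pka):
    """Root-mean-square deviation between predicted and target pI values."""
    return float(np.sqrt(np.mean((predict_pi(pka) - TARGET_PI) ** 2)))


# Each pKa may move at most 2 pH units from its seed, e.g. Cys: [6.5, 10.5].
bounds = [(float(p) - 2.0, float(p) + 2.0) for p in SEED_PKA]


class BoundedStep:
    """Random perturbation that keeps each new start point inside the bounds."""

    def __init__(self, stepsize=0.5, seed=0):
        self.stepsize = stepsize  # basinhopping may adapt this attribute
        self.rng = np.random.default_rng(seed)

    def __call__(self, x):
        step = self.rng.uniform(-self.stepsize, self.stepsize, size=x.shape)
        return np.clip(x + step, SEED_PKA - 2.0, SEED_PKA + 2.0)


result = basinhopping(rmsd, SEED_PKA, niter=5, take_step=BoundedStep(),
                      minimizer_kwargs={"method": "TNC", "bounds": bounds})
print("RMSD with seed pKa:     ", rmsd(SEED_PKA))
print("RMSD after optimization:", result.fun)
```

Here `niter=5` only keeps the demonstration fast. The acceptance step inside `basinhopping` is random, so the optimized values can differ between runs.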