molpx._bmutils.get_good_starting_point

molpx._bmutils.get_good_starting_point(cl, geom_samples, cl_order=None, strategy='smallest_Rgyr')

Given a pyemma clustering object and a list of geometries, return the index of the clustercenter that is best suited to start a minimally diffusing path.

Parameters:
  • cl (pyemma.coordinates clustering object) –
  • geom_samples (list of mdtraj.Trajectory objects corresponding to each clustercenter in cl) –
  • cl_order (None or iterable of integers) – The order of the list geom_samples may or may not match the order of the clustercenters in cl. Very often, geom_samples is sorted in ascending order of a given coordinate while the clustercenters in cl are not. cl_order represents this reordering: geom_samples[cl_order] reproduces the order of the clustercenters, so that geom_samples[cl_order][i] contains the geometries sampled for the i-th clustercenter.
  • strategy (str, default is 'smallest_Rgyr') –
    Which property gets optimized
    • smallest_Rgyr: look for the geometries with the smallest radius of gyration (mdtraj.compute_rg), regardless of the population
    • most_pop: look for the clustercenter that’s most populated, regardless of the associated geometries
    • most_pop_x_smallest_Rgyr: Mix both criteria. Weight Rgyr values with population to avoid highly compact but rarely populated structures
    • bimodal_compact: assume the distribution of clustercenters is bimodal, then locate its centers and choose the one with smaller Rgyr
    • bimodal_open: assume the distribution of clustercenters is bimodal, then locate its centers and choose the one with larger Rgyr
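
The reordering described for cl_order can be illustrated with a minimal sketch. It uses short strings as hypothetical stand-ins for the mdtraj.Trajectory objects, and the values in cl_order are made up for the example. Because a plain Python list does not accept a list of indices, the expression geom_samples[cl_order] is spelled out as a comprehension here.

```python
# Hypothetical stand-ins: short strings instead of mdtraj.Trajectory objects.
# geom_samples is sorted along some coordinate, not in clustercenter order.
geom_samples = ["geoms_low", "geoms_mid", "geoms_high"]

# Assumed reordering for this toy case (not taken from any real clustering).
cl_order = [2, 0, 1]

# Equivalent of geom_samples[cl_order] for a plain Python list.
geoms_in_cl_order = [geom_samples[ii] for ii in cl_order]

# geoms_in_cl_order[i] now holds the geometries sampled for the i-th clustercenter.
for ii, igeoms in enumerate(geoms_in_cl_order):
    print(ii, igeoms)
```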
Returns:

start_idx – The mdtraj.Trajectory in geom_samples[start_idx] best satisfies the strategy criterion

Return type:

int, index of the list geom_samples
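
To make the meaning of the returned index concrete, the following is a rough sketch of a 'smallest_Rgyr'-style selection. It is not molpx's implementation. It makes three assumptions: frames are plain lists of (x, y, z) tuples rather than mdtraj.Trajectory objects, all atoms have equal mass, and each entry of geom_samples is scored by the mean radius of gyration over its frames. The helper names radius_of_gyration and pick_smallest_rgyr are hypothetical.

```python
import math


def radius_of_gyration(frame):
    """Rgyr of one frame, given as a list of (x, y, z) tuples; equal masses assumed."""
    n_atoms = len(frame)
    center = [sum(atom[k] for atom in frame) / n_atoms for k in range(3)]
    mean_sq_dist = sum(
        sum((atom[k] - center[k]) ** 2 for k in range(3)) for atom in frame
    ) / n_atoms
    return math.sqrt(mean_sq_dist)


def pick_smallest_rgyr(geom_samples):
    """Index of the entry whose frames have the smallest mean Rgyr (assumed criterion)."""
    mean_rgyr = [
        sum(radius_of_gyration(frame) for frame in frames) / len(frames)
        for frames in geom_samples
    ]
    return min(range(len(mean_rgyr)), key=mean_rgyr.__getitem__)


# Toy data: one two-atom frame per entry, with increasing or decreasing extension.
extended = [[(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]]
compact = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
medium = [[(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]]

geom_samples = [extended, compact, medium]
start_idx = pick_smallest_rgyr(geom_samples)
print(start_idx)
```

With real data, mdtraj.compute_rg would take the place of the hand-written radius_of_gyration helper.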