molpx.visualize

The core functionality is to link two interactive figures, fig1 and fig2, inside an IPython/Jupyter notebook, so that an action in fig1 (e.g. a mouse click or moving a slider) triggers an event in fig2 (e.g. a frame update or a moved point) and vice versa. Usually, these two figures contain representations of:

  • molecules: an nglview widget showing one (or more) molecular structure(s) associated with a particular value of the coordinate(s), and
  • projected coordinates: a matplotlib figure showing the projected coordinates (e.g. TICs, PCs or any other), \(\{Y_0, ..., Y_N\}\), either as a 2D histogram, \(PDF(Y_i, Y_j)\), or as trajectory views, \(\{Y_0(t), ..., Y_N(t)\}\)

You are strongly encouraged to check nglview’s documentation, since its functionalities extend beyond the scope of this package and the molecular visualization universe is rich and complex (unlike this module).
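
The 2D-histogram view of the projected coordinates relates to a free energy surface through F = -ln PDF(Y_i, Y_j), in units of kT. The sketch below illustrates that relation in plain Python on synthetic data. It is not molpx's implementation: the function name free_energy_2d, the binning scheme and the zero-shift are assumptions made for illustration, and molpx's own binning and units may differ.

```python
import math
import random


def free_energy_2d(y_i, y_j, nbins=10):
    """Histogram two projected coordinates and return F = -ln(P) per bin (in kT).

    Hypothetical helper for illustration only, not part of molpx.
    """
    lo_i, hi_i = min(y_i), max(y_i)
    lo_j, hi_j = min(y_j), max(y_j)
    counts = [[0] * nbins for _ in range(nbins)]
    for a, b in zip(y_i, y_j):
        # Map each value to a bin index; the maximum value falls into the last bin.
        ia = min(int((a - lo_i) / (hi_i - lo_i) * nbins), nbins - 1)
        ib = min(int((b - lo_j) / (hi_j - lo_j) * nbins), nbins - 1)
        counts[ia][ib] += 1
    total = len(y_i)
    # Bins that were never visited get an infinite free energy.
    fes = [[-math.log(c / total) if c > 0 else math.inf for c in row]
           for row in counts]
    # Shift the surface so that the most populated bin sits at zero.
    f_min = min(min(row) for row in fes)
    return [[f - f_min for f in row] for row in fes]


# Synthetic "projected trajectory" with two uncorrelated coordinates.
random.seed(0)
y0 = [random.gauss(0.0, 1.0) for _ in range(5000)]
y1 = [random.gauss(0.0, 1.0) for _ in range(5000)]
fes = free_energy_2d(y0, y1, nbins=10)
```

A larger nbins gives a finer grid but leaves more bins empty for the same amount of data.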

The methods offered by this module are:

  • molpx.visualize.FES(MD_trajectories, MD_top, ...) – Return a molecular visualization widget connected with a free energy plot.
  • molpx.visualize.correlations(correlation_input, ...) – Provide a visual and textual representation of the linear correlations between projected coordinates (PCA, TICA) and original features.
  • molpx.visualize.sample(positions, geom, ax) – Visualize the geometries in geom according to the data in positions on an existing matplotlib axes ax.
  • molpx.visualize.traj(MD_trajectories, ...[, ...]) – Link one or many projected trajectories, [Y_0(t), Y_1(t)...], with the MD_trajectories that originated them.

molpx.visualize.FES(MD_trajectories, MD_top, projected_trajectory, proj_idxs=[0, 1], nbins=100, n_sample=100, axlabel='proj', n_overlays=1)

Return a molecular visualization widget connected with a free energy plot.

Parameters:
  • MD_trajectories (str, or list of str, with the filename(s) of the molecular dynamics (MD) trajectories) –

    Any file format that mdtraj can read (.xtc, .dcd, etc.) is accepted.

    Alternatively, a single mdtraj.Trajectory object or a list of them can be given as input.

  • MD_top (str to topology filename or directly an mdtraj.Topology object) –
  • projected_trajectory (str to a filename or numpy ndarray of shape (n_frames, n_dims)) – Time series with the projection(s) to be explored. If these have been computed externally, you can provide .npy filenames or readable ASCII files (.dat, .txt, etc.). NOTE: molpx assumes that there is no time column.
  • proj_idxs (int, list or ndarray) – Selection of projection indices (zero-indexed) to visualize.
  • nbins (int, default is 100) – The number of bins per axis to use in the histogram (FES)
  • n_sample (int, default is 100) – The number of geometries that will be used to represent the FES. The higher the number, the higher the spatial resolution of the “click”-action.
  • axlabel (str, default is 'proj') – Format of the labels in the FES plot
  • n_overlays (int, default is 1) – The number of structures that will be simultaneously displayed as overlays for every sampled point of the FES. This parameter can seriously slow down the method; it is currently limited to a maximum value of 50.
Returns:

  • ax – pylab.Axis object
  • iwd – nglview.NGLWidget
  • data_sample – numpy ndarray of shape (n, n_sample) with the position of the dots in the plot
  • geoms – mdtraj.Trajectory object with the n_sample geometries shown by the nglwidget
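
A call to FES could be wrapped as sketched below, using only the signature and the four return values documented above. The wrapper name explore_fes is hypothetical, and molpx is imported inside the function so that the definition itself does not require molpx to be installed. Calling it requires molpx, its dependencies and real input files, ideally inside a Jupyter notebook.

```python
def explore_fes(traj_files, top_file, proj_file, proj_idxs=(0, 1)):
    """Open a FES plot linked to an nglview widget.

    Hypothetical convenience wrapper; the keyword values below are the
    defaults listed in the documented signature of molpx.visualize.FES.
    """
    # Lazy import: defining this helper does not require molpx.
    import molpx

    ax, iwd, data_sample, geoms = molpx.visualize.FES(
        traj_files,          # filename(s) or mdtraj.Trajectory object(s)
        top_file,            # topology filename or mdtraj.Topology
        proj_file,           # .npy/ASCII filename or ndarray (n_frames, n_dims)
        proj_idxs=list(proj_idxs),
        nbins=100,
        n_sample=100,
        axlabel='proj',
        n_overlays=1,
    )
    return ax, iwd, data_sample, geoms
```

In a notebook, the returned iwd would then typically be displayed in a cell so that clicks on the FES plot update the shown structure.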

molpx.visualize.correlations(correlation_input, geoms=None, proj_idxs=None, feat_name=None, widget=None, proj_color_list=None, n_feats=1, verbose=False, featurizer=None)

Provide a visual and textual representation of the linear correlations between projected coordinates (PCA, TICA) and original features.

Parameters:
  • correlation_input (anything) – Something that could, in principle, be a pyemma.coordinates.transformer, like a TICA or PCA object, or directly a correlation matrix with a row for each feature and a column for each projection, very much like the feature_TIC_correlation attribute of the TICA object of pyemma.
  • geoms (None or mdtraj.Trajectory, default is None) – The values of the most correlated features will be returned for the geometries in this object. If widget is left at its default, None, correlations will create a new widget and try to show the most correlated features on top of it.
  • widget (None or nglview widget) –

    Provide an already existing widget to visualize the correlations on top of. This is only for expert use, because no checks are done to see if correlation_input and the geometry contained in the widget actually match. Use with caution.

    When geoms and widget are provided simultaneously, three things happen:
      • no new widget will be instantiated
      • the display of features will be on top of whatever geometry widget contains
      • the value of the features is computed for the geometry of geoms

    Use with caution and clean bookkeeping!

  • proj_color_list (list, default is None) – Projection-specific list of colors for the representations. The default None yields blue. In principle, the list can contain one color for each projection (i.e. as many colors as len(proj_idxs)), but if the list is too short, the remaining projections default to its last color. This way, proj_color_list=['black'] will paint everything black regardless of len(proj_idxs).
  • proj_idxs (None, int, or iterable of ints, default is None) – The indices of the projections for which the most correlated feature will be returned. If None, it defaults to the dimension of the correlation_input object.
  • feat_name (None or str, default is None) – The prefix to prepend to the labels of the most correlated features. If left as None, the feature description found in correlation_input will be used (if available).
  • n_feats (int, default is 1) – Number of most correlated features (argmax of the correlation) to return for each projection.
  • featurizer (optional featurizer, default is None) – In case correlation_input does not have a data_producer.featurizer attribute, the user can provide one here.
  • verbose (bool, default is False) – Print to standard output.
Returns:

most_corr_idxs, most_corr_vals, most_corr_labels, most_corr_feats, most_corr_atom_idxs, lines, widget, lines
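
The selection of the n_feats most correlated features per projection, and the color fallback described for proj_color_list, can be sketched in plain Python as follows. This is a minimal illustration, not molpx's code: the function names are hypothetical, and ranking by the absolute value of the correlation is an assumption.

```python
def most_correlated(corr, proj_idxs, n_feats=1):
    """Return, per projection, the indices of the n_feats most correlated features.

    corr[i][j] is the correlation of feature i with projection j.
    Assumption: features are ranked by |correlation|.
    """
    result = {}
    for j in proj_idxs:
        column = [abs(row[j]) for row in corr]
        ranked = sorted(range(len(column)), key=lambda i: column[i], reverse=True)
        result[j] = ranked[:n_feats]
    return result


def colors_for_projections(proj_idxs, proj_color_list=None):
    """Assign one color per projection, reusing the last color if the list is short."""
    if proj_color_list is None:
        proj_color_list = ['blue']  # documented default
    last = len(proj_color_list) - 1
    return [proj_color_list[min(k, last)] for k in range(len(proj_idxs))]


# Toy correlation matrix: 3 features (rows) x 2 projections (columns).
corr = [
    [0.10, -0.80],
    [0.95, 0.20],
    [-0.30, 0.05],
]
top = most_correlated(corr, proj_idxs=[0, 1], n_feats=2)
colors = colors_for_projections([0, 1, 2], ['red', 'green'])
```

Here feature 1 dominates projection 0 and feature 0 dominates projection 1, while the third projection reuses 'green' because the color list only has two entries.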

molpx.visualize.sample(positions, geom, ax, plot_path=False, clear_lines=True, n_smooth=0, widget=None, superpose=True, projection=None, n_feats=1, **link_ax2wdg_kwargs)

Visualize the geometries in geom according to the data in positions on an existing matplotlib axes ax

Use this method when the array of positions, the geometries, the axes (and the widget, optionally) have already been generated elsewhere.

Parameters:
  • positions (numpy ndarray of shape (n_frames, 2)) – Contains the position associated with each frame in geom, in that order
  • geom (mdtraj.Trajectory object or a list thereof) – The geometries associated with the positions. Hence, all of them must have the same number of frames, n_frames
  • ax (matplotlib.pyplot.Axes object) – The axes to be linked with the nglview widget
  • plot_path (bool, default is False) – whether to draw a line connecting the positions in positions
  • clear_lines (bool, default is True) – whether to clear all the lines that were previously drawn in ax
  • n_smooth (int, default is 0) – if n_smooth > 0, the shown geometries and paths will be smoothed out by 2*n_smooth frames. See bmutils.smooth_geom for more information
  • widget (None or existing nglview widget) – you can provide an already instantiated nglview widget here (advanced use)
  • superpose (boolean, default is True) – The geometries in geom may or may not be oriented, depending on where they were generated. Since this method is mostly for visualization purposes, the default behaviour is to orient them all to maximally overlap with the first frame (of the first mdtraj.Trajectory object, in case geom is a list)
  • projection (object that generated the projection, default is None) –

    The projected coordinates may come from a variety of sources. When working with pyemma, a number of objects might have generated this projection, such as a pyemma.coordinates.transform.TICA or a pyemma.coordinates.transform.PCA object.

    Expert use. Pass this object along ONLY if the positions have been generated using projection_paths, so that looking at linear correlations makes sense. The features that are most correlated with the projections will then be plotted for the sample, allowing the user to establish a visual connection between the projected coordinate and the original features (distances, angles, contacts, etc.)

  • n_feats (int, default is 1) – If a projection is passed along, the first n_feats features that correlate most with the projected trajectories will be represented, both as trajectories (feat vs. t) and in the nglwidget. If projection is None, n_feats will be ignored.
  • link_ax2wdg_kwargs (dictionary of named arguments, optional) – named arguments for the function _link_ax_w_pos_2_nglwidget, which is the one that internally provides the interactivity. Non-expert users can safely ignore this option.
Returns:

iwd

Return type:

nglview.NGLWidget
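
To give an idea of what smoothing "by 2*n frames" could mean for the positions, the sketch below applies a centered running average in plain Python. This is an assumption made for illustration only: the actual behaviour is defined by bmutils.smooth_geom, which may use a different window or edge handling. The function name running_average is hypothetical.

```python
def running_average(positions, n_smooth):
    """Average each frame with its n_smooth neighbours on either side.

    Assumption: a centered window of 2*n_smooth + 1 frames, with the first
    and last n_smooth frames dropped (so the output is 2*n_smooth frames shorter).
    """
    if n_smooth <= 0:
        return [list(p) for p in positions]
    window = 2 * n_smooth + 1
    smoothed = []
    for center in range(n_smooth, len(positions) - n_smooth):
        chunk = positions[center - n_smooth:center + n_smooth + 1]
        # zip(*chunk) iterates over the coordinate columns of the window.
        smoothed.append([sum(col) / window for col in zip(*chunk)])
    return smoothed


# Five frames of 2D positions, shape (n_frames, 2).
positions = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
smooth = running_average(positions, n_smooth=1)
```

With n_smooth=1, the five input frames yield three smoothed frames.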

molpx.visualize.traj(MD_trajectories, MD_top, projected_trajectories, active_traj=0, max_frames=2000, stride=1, proj_stride=1, proj_idxs=[0, 1], proj_labels='proj', plot_FES=False, panel_height=1, sharey_traj=True, dt=1.0, tunits='frames', traj_selection=None, projection=None, n_feats=1)

Link one or many projected trajectories, [Y_0(t), Y_1(t)...], with the MD_trajectories that originated them.

Optionally plot also the resulting FES.

Parameters:
  • MD_trajectories (str, or list of str, with the filename(s) of the molecular dynamics (MD) trajectories) –

    Any file format that mdtraj can read (.xtc, .dcd, etc.) is accepted.

    Alternatively, a single mdtraj.Trajectory object or a list of them can be given as input.

  • MD_top (str to topology filename or directly mdtraj.Topology object) –
  • projected_trajectories (str to a filename or numpy ndarray of shape (n_frames, n_dims)) – Time series with the projection(s) to be explored. If these have been computed externally, you can provide .npy filenames or readable ASCII files (.dat, .txt, etc.). NOTE: molpx assumes that there is no time column.
  • active_traj (int, default 0) – Index of the trajectory that will be responsive. (zero-indexing)
  • max_frames (int, default is 2000) – If the trajectory is longer than this, stride to this length (in frames)
  • stride (int, default is 1) – Stride value in case of large datasets. If MD_trajectories and projected_trajectories are already in memory (and not on disk), the striding can also be done before calling this method.
  • proj_stride (int, default is 1) – Stride value that was used in the projected_trajectories relative to the MD_trajectories. If the original MD_trajectories were stored every 5 ps but the projected trajectories were stored every 50 ps, proj_stride = 10 has to be provided; otherwise an exception will be thrown informing the user that the MD_trajectories and the projected_trajectories have different numbers of frames.
  • proj_idxs (iterable of ints, default is [0,1]) – Indices of the projected coordinates to use in the various representations
  • proj_labels (either str or list of str) – The projection plots will use this parameter to label their y-axes. If a str is provided, it will be the base name, proj_labels='%s_%u'%(proj_labels,ii), for each projection. If a list is provided, the list will be used. If there are not enough labels, the module will complain
  • plot_FES (bool, default is False) – Plot (and interactively link) the FES as well
  • panel_height (int, default is 1) – Height, in inches, of each panel in the trajectory subplots
  • sharey_traj (bool, default is True) – Force the panels of each projection to have the same y-axes across trajectories (Note: not across coordinates)
  • dt (float, default is 1.0) – Physical time-unit equivalent to one frame of the projected_trajectories
  • tunits (str, default is 'frames') – Name of the physical time unit provided in dt
  • traj_selection (None, int, iterable of ints, default is None) – Don’t plot all trajectories but only a few of them. The default None implies that all trajectories will be plotted. Note: the data used for the FES will always include all trajectories, regardless of this value
  • projection (object that generated the projection, default is None) –

    The projected coordinates may come from a variety of sources. When working with pyemma, a number of objects might have generated this projection, such as a pyemma.coordinates.transform.TICA or a pyemma.coordinates.transform.PCA object.

    Pass this object along, and the features that are most correlated with the projections will be plotted for the active trajectory, allowing the user to establish a visual connection between the projected coordinate and the original features (distances, angles, contacts, etc.)

  • n_feats (int, default is 1) – If a projection is passed along, the first n_feats features that correlate most with the projected trajectories will be represented, both as trajectories (feat vs. t) and in the nglwidget. If projection is None, n_feats will be ignored.
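
The frame bookkeeping behind proj_stride and max_frames can be sketched as follows. This is an illustration in plain Python under stated assumptions, not molpx's implementation: the function names are hypothetical, striding is assumed to behave like slicing with [::stride], and the stride derived from max_frames is assumed to be the smallest integer that brings the trajectory to at most max_frames frames.

```python
def check_frame_counts(n_md_frames, n_proj_frames, proj_stride=1):
    """Check that MD frames, once strided, match the projected frames.

    Assumption: striding keeps frames 0, proj_stride, 2*proj_stride, ...
    """
    n_kept = len(range(0, n_md_frames, proj_stride))
    if n_kept != n_proj_frames:
        raise ValueError(
            "MD trajectory has %u frames after striding by %u, "
            "but the projected trajectory has %u"
            % (n_kept, proj_stride, n_proj_frames)
        )
    return n_kept


def stride_for_max_frames(n_frames, max_frames):
    """Smallest integer stride that leaves at most max_frames frames."""
    # Integer ceiling division, never smaller than 1.
    return max(1, (n_frames + max_frames - 1) // max_frames)


# MD frames saved every 5 ps, projections saved every 50 ps -> proj_stride = 10.
proj_stride = 50 // 5
n_kept = check_frame_counts(1000, 100, proj_stride=proj_stride)

# A 10000-frame trajectory with max_frames=2000 would be strided by 5.
display_stride = stride_for_max_frames(10000, 2000)
```

Leaving proj_stride at 1 in the example above would raise the ValueError, which mirrors the exception mentioned in the parameter description.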
Returns:

  • ax, fig, iwd, geoms – corresponding to _plt.gca(), _plt.gcf(), widget, geoms
  • ax – pylab.Axis object
  • fig – pylab.Figure object
  • iwd – nglview.NGLWidget
  • geoms – mdtraj.Trajectory object with the geometries shown by the nglwidget
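
The labelling rule given for proj_labels and the role of dt can be sketched in plain Python as shown below. The helper names are hypothetical, and the way a list of labels is consumed and the type of the error are assumptions; only the '%s_%u' pattern and the meaning of dt come from the parameter descriptions above.

```python
def make_proj_labels(proj_labels, proj_idxs):
    """Build one y-axis label per projection index."""
    if isinstance(proj_labels, str):
        # A single string acts as base name: 'proj' -> 'proj_0', 'proj_1', ...
        return ['%s_%u' % (proj_labels, ii) for ii in proj_idxs]
    if len(proj_labels) < len(proj_idxs):
        # Assumption: too few labels is reported as a ValueError.
        raise ValueError("not enough labels for the requested projections")
    return list(proj_labels)


def time_axis(n_frames, dt=1.0):
    """Physical time of each frame, given that one frame corresponds to dt."""
    return [frame * dt for frame in range(n_frames)]


labels = make_proj_labels('proj', [0, 1])
custom = make_proj_labels(['TIC 1', 'TIC 2'], [0, 1])
times = time_axis(3, dt=0.5)  # e.g. dt=0.5 with tunits='ns'
```

With the defaults dt=1.0 and tunits='frames', the time axis simply counts frames.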