Running unfold.py — A worked example

This is a worked example of running unfold.py on a twisted bilayer graphene sample.

This example uses these input data files (132MB).

If that link breaks, please let me know, and in the meanwhile you can try this zip with everything but the dynamical matrices. (85.6KB) (permalink)

Preparation

Running the script requires setting some paths. I like to create a file <rsp2-dir>/env and source it before calling scripts:

#!/usr/bin/env bash

if [ "${BASH_SOURCE[0]}" -ef "$0" ]
then
    echo >&2 "Hey, you should source this script, not execute it!"
    exit 1
fi

# note: this env var is not strictly needed, it's just used by the script I included with these inputs
export RSP2=$(dirname $(readlink -f "${BASH_SOURCE[0]}"))

# Path changes required to use rsp2
export PATH=$RSP2/target/release:$PATH
export PYTHONPATH=$RSP2/src/python:$PYTHONPATH
export PYTHONPATH=$RSP2/scripts:$PYTHONPATH

. $RSP2/venv/bin/activate

Structures

In the input are two structures, initial.structure and final.structure.

initial.structure is the structure prior to relaxation (after optimizing the lattice parameter).
final.structure is the fully relaxed structure. This relaxation has broken the translational symmetry within each layer.

There are also (large!) .npz files containing dynamical matrices for final.structure at a couple of high-symmetry points.

We will only be using initial.structure during the unfolding process; everything to do involving final.structure has already been done (namely, computing the dynamical matrices), and it is really only included for demonstrative purposes.

Why only the initial structure is used during unfolding:

unfold.py currently relies on a "perfect supercell approximation." To put it briefly, the unfolding process involves computing the expectation value of some translation operators <ψ| T(r) |ψ>, where |ψ> will be a sort of Bloch function field representation of the normal mode projected onto one layer.

When the normal modes are expressed as a sum of delta functions at each atom, these expectation values will be zero unless the translation is a perfect translational symmetry of the single layer, preventing the recovery of any meaningful information from the unfolding process. This is obviously a problem for the relaxed structure where these symmetries have been broken!

As such, an approximation is made: when constructing the Bloch functions for each normal mode, we will assume each atom vibrates around its unrelaxed position. Mind that this has no impact on the frequencies of the modes (which have already been computed); it only impacts the coefficients of each frequency being unfolded onto each point in the single-layer FBZ.

Structure format

The structures are in a directory-like format that is specific to rsp2. They include a POSCAR (VASP crystal structure format), and a meta.json file with some metadata. Please look at this meta.json file!

Things to pay attention to:

The CLI interface of unfold.py is limited to planar materials. The plane normal must be the z direction. (For 3D materials, you'll need to use unfold_lib directly, rendering this tutorial somewhat irrelevant.)
meta.json has layers (zero-based indices) and masses (AMU) arrays with values for each atom. You need these!
It also has layer-sc-matrices. This is important as well! For each layer, it contains two things:
- matrix is the integer matrix C such that S = C A, where the rows of A are the graphene unit cell vectors for one layer, and the rows of S are the TBLG unit cell vectors.
- repeats is ignored by the script and can be omitted.

Things you DON'T need to worry about:

The order of the atoms doesn't matter. (as long as the dynamical matrix uses the same ordering)
frac-bonds.json contains the bond graph. unfold.py does not read this file. (only rsp2 needs it)

Diagonalizing

unfold.py works best on LARGE structures. The sample structure is a bit on the small side, so we will want to unfold using data at multiple qpoints in the supercell. The archive includes dynamical matrices at Gamma, K, and M. These are .npz files containing the dynamical matrices in the same units used by phonopy. (which is.... whatever you get from the standard definition of the dynamical matrix when your mass is in AMU, distances are Angstrom, and forces are eV/Angstrom).

We now need to compute normal modes. Take a look at unfold-gmk which shows how to invoke the script on these dynamical matrices to compute data about modes.

[ lampam @ 14:27:01 ] (master •2) ~/rsp2-clone/work/run
$ . ~/rsp2-clone/env
(venv) [ lampam @ 14:27:01 ] (master •2) ~/rsp2-clone/work/run
$ ./unfold-gmk 397-1-a

   Compiling autocfg v1.0.1
   Compiling libc v0.2.43
   Compiling cfg-if v1.0.0
   Compiling const_fn v0.4.4
   Compiling lazy_static v1.4.0
   Compiling rayon-core v1.9.0
   Compiling num-traits v0.2.5
   Compiling scopeguard v1.1.0
   Compiling num-complex v0.2.1
   Compiling either v1.5.0
   Compiling slice-of-array v0.2.1
   Compiling rsp2-assert-close v0.1.0 (/home/lampam/rsp2-clone/src/util/assert-close)
   Compiling rsp2-util-macros v0.1.0 (/home/lampam/rsp2-clone/src/util/macros)
   Compiling crossbeam-utils v0.8.1
   Compiling memoffset v0.6.1
   Compiling rayon v1.5.0
   Compiling rand v0.4.2
   Compiling num_cpus v1.8.0
   Compiling atty v0.2.11
   Compiling rand v0.3.22
   Compiling rsp2-soa-ops v0.1.0 (/home/lampam/rsp2-clone/src/util/soa-ops)
   Compiling crossbeam-epoch v0.9.1
   Compiling crossbeam-channel v0.5.0
   Compiling crossbeam-deque v0.8.0
   Compiling rsp2-array-types v0.1.0 (/home/lampam/rsp2-clone/src/util/array-types)
   Compiling rsp2c-unfold v0.1.0 (/home/lampam/rsp2-clone/scripts/unfold_lib)
    Finished release [optimized] target(s) in 17.50s

    Finished release [optimized] target(s) in 0.05s
--probs not supplied. Will compute by unfolding eigensols.
--eigensols not supplied. Will diagonalize dynmat.
Layer 0: Unfolded  4764 of 4764 eigenvectors
Magnitude summary:
[(-999, 2288), (-18, 7), (-17, 60), (-16, 458), (-15, 4163), (-14, 26682), (-13, 51072), (-12, 58532), (-11, 97449), (-10, 188771), (-9, 285341), (-8, 299513), (-7, 251359), (-6, 215089), (-5, 184223), (-4, 118367), (-3, 68067), (-2, 33980), (-1, 5887)]
Probs matrix density: 46.37%
--mode-data not supplied. Computing from eigensols.
    Finished release [optimized] target(s) in 0.05s
--probs not supplied. Will compute by unfolding eigensols.
--eigensols not supplied. Will diagonalize dynmat.
Layer 0: Unfolded  4764 of 4764 eigenvectors
Magnitude summary:
[(-999, 2897), (-18, 5), (-17, 30), (-16, 303), (-15, 4059), (-14, 29144), (-13, 58151), (-12, 71743), (-11, 125829), (-10, 230696), (-9, 303672), (-8, 297887), (-7, 242335), (-6, 183517), (-5, 147131), (-4, 104522), (-3, 61340), (-2, 20808), (-1, 7239)]
Probs matrix density: 40.55%
--mode-data not supplied. Computing from eigensols.
    Finished release [optimized] target(s) in 0.04s
--probs not supplied. Will compute by unfolding eigensols.
--eigensols not supplied. Will diagonalize dynmat.
Layer 0: Unfolded  4764 of 4764 eigenvectors
Magnitude summary:
[(-999, 2882), (-18, 4), (-17, 31), (-16, 351), (-15, 4641), (-14, 33304), (-13, 74899), (-12, 100712), (-11, 157312), (-10, 252650), (-9, 306061), (-8, 281357), (-7, 219797), (-6, 158445), (-5, 126672), (-4, 93534), (-3, 54166), (-2, 16954), (-1, 7536)]
Probs matrix density: 35.8%
--mode-data not supplied. Computing from eigensols.

This script used the flags --write-mode-data and --write-probs to write files like mode-data-*.npz and probs-*.npz for each Q point.

mode-data-*.npz has some scalar data for each mode, like their frequencies. You can read it with numpy.load.
probs-*.npz is a sparse matrix providing the squared absolute value (i.e. probability) of each normal mode mapping onto each image of the supercell Q point in the single layer FBZ. You can read it with scipy.sparse.load_npz.

Plotting

Because we have data from multiple qpoints, we need to create a "multi qpoint file" to tell the unfolding script how to use all of the data in a single plot. I included multi-qpoint-gmk.yaml in the archive (please take a quick look at it) so this step is already done.

There seems to be a problem right now with numba on arch linux and I had to uninstall it from the venv:

(venv) [ lampam @ 14:27:01 ] (master •2) ~/rsp2-clone/work/run
$ python3 -m pip uninstall numba

After that I could perform the plotting. There's a LOT of command line args you can supply. Here's what I used for following PDF:

$ (cd 397-1-a && python3 $RSP2/scripts/unfold.py initial.structure --plot-path=GKMG --multi-qpoint multi-qpoint-gmk.yaml --verbose --debug --plot-exponent=0.60 --plot-coalesce-threshold=0.5 --decimate-x=10  --plot-truncate 1e-3 --plot-coalesce sum --plot-color uniform:b --plot-using-size=both --plot-ylim=-50:1800 --plot-style ../landscape.mplrc --write-plot band-gmk.pdf)

Output band plot

The density of data in the plot is a bit low because even the images of Gamma, M, and K are not really enough to cover our high symmetry path that well. Normally I would use more points for such a small structure, in particular favoring those that are known to have an image directly on the high symmetry path.

Sidebar: Different ways to break up the computation

In this example, we essentially broke the process up into two runs; one that wrote mode-data.npz and probs.npz, and one that used them to generate a plot. This is a convenient way to break up the computation because it aligns with the requirements of multi-qpoint.yaml. But there are many other intermediate files you can save using --write-THING and read using --THING, in case you find that some part of the computation takes long. (e.g. in unfold-gmk, all three runs compute the permutations for translation from scratch, but if this took long, you could add --write-perms perms.npy to the first run and --perms perms.npy to the others)

Alternatively, if data from a single Q-point is sufficient, you could even run the whole thing from start to finish in one command! For instance:

(cd 397-1-a && python3 $RSP2/scripts/unfold.py initial.structure --plot-path=GKMG --qpoint '0 0 0' --dynmat dynmat-gamma.npz --verbose --debug --plot-exponent=0.60 --plot-coalesce-threshold=0.5 --decimate-x=10  --plot-truncate 1e-3 --plot-coalesce sum --plot-color uniform:b --plot-using-size=both --plot-ylim=-50:1800 --plot-style ../landscape.mplrc --write-plot band-gmk.pdf)

(this is similar to the above command, but with --multi-qpoint multi-qpoint-gmk.yaml replaced with --qpoint '0 0 0' --dynmat dynmat-gamma.npz)

ExpHP/running-unfold.md