 +====== Documentation ======
 +
 +===== Introduction =====
 +
 +FRESHS is intended to bring together the many rare event sampling methods which
 +are appearing in the literature, making them easy to test against each other or to
 +use for practical purposes together with efficient simulation codes. **FRESHS does
 +not require recompilation of your simulation program.**
 +
 +
 +FRESHS is organised to run in parallel, with small trajectory fragments being
 +farmed out to multiple instances of your chosen simulation program. Communications
 +take place using sockets (between the FRESHS server and clients) and using
 +named pipes (between the FRESHS clients and simulation programs) (Figure 1). If
 +your OS or filesystem does not support pipes, it is possible to communicate with the
 +simulation programs by writing to and reading from ordinary files instead. This,
 +however, imposes a performance penalty. NFS filesystems are documented to
 +have problems with named pipes.
 +
 +FRESHS is designed as a harness system, so it should be easy to wrap around any
 +given simulation program without changing the FRESHS code itself. Application-specific code is contained in the "harness" directory. Please refer to [[freshs:harnessScripts|harness scripts]] for examples.
 +
 +<box left blue|>{{:manual:freshs_architecture_slide.png?direct&600 |}}</box|Figure 1: The structure of the FRESHS system. Clients can be on the same machine as the server or on different machines. Each client instance communicates with its own instance of the MD simulation program, run as a “subprocess”. The simulation program can itself be parallel.>
 +
 +
 +
 +===== Server description =====
 +
 +Normally you will run one server (the program that actually knows the sampling
 +algorithm) and many clients (the programs which accept simulation jobs, run
 +them, then report back to the server).
 +
 +Always start the server first (see also the [[freshs:quickstart|quickstart]] page):
 +
 +<code bash>
 +cd server
 +python main_server.py -c server-sample.conf
 +</code>
 +
 +There are a few options for the server; the most important argument is the
 +configuration file.
 +
 +The configuration file is broken up into sections; it has the structure:
 +
 +<code>
 +[General]
 +an_input_variable = 1
 +another_variable = 2
 +
 +[Another_section]
 +another_variable = False
 +</code>
 +Depending on the algorithm in use, different sections will be read or ignored.
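This INI-style layout matches what Python's standard ''configparser'' module reads. Since FRESHS itself is written in Python, its behaviour should be similar, but treat the following as a sketch (the section and option names are only illustrative):

<code python>
import configparser

# Parse an INI-style configuration like the one above
# (illustrative section and option names).
cfg = configparser.ConfigParser()
cfg.read_string("""
[General]
an_input_variable = 1
another_variable = 2

[Another_section]
another_variable = False
""")

value = cfg.getint("General", "an_input_variable")              # 1
flag = cfg.getboolean("Another_section", "another_variable")    # False
</code>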
 +
 +==== Configuration file options: general server options ====
 +
 +<code bash>
 +[general]
 +# the port to listen on
 +listenport = 10000
 +# algorithm name, e.g. perm_ffs, ffs, spres
 +algo_name = perm_ffs
 +# warn or disconnect if a client has not been seen for this many seconds
 +check_alive = 3600
 +# if this is set to 1, clients will be kicked if they do not report
 +# anything during the check_alive interval
 +kick_absent_clients = 0
 +# interval for periodic info and check
 +# (clients are checked for timeout in this interval)
 +t_infocheck = 15
 +# use ghost runs for precalculation
 +use_ghosts = 0
 +# Directory structure
 +folder_out  = OUTPUT
 +folder_conf = USERCONF
 +folder_db   = DB
 +folder_log  = LOG
 +# User-defined message string for the clients in the form of a python dict
 +# Do not use curly brackets. Use double quotes.
 +# example: "servername": "default-server", "pressure": 50
 +user_msg = "servername": "default-server"
 +
 +</code>
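The ''user_msg'' value is a dict body with its curly brackets stripped. One way a client-side script could reconstruct the dict (a sketch, not necessarily how FRESHS parses it internally):

<code python>
import json

# user_msg as it appears in the config file: a dict body without braces,
# using double quotes (illustrative value).
user_msg = '"servername": "default-server", "pressure": 50'

# Re-adding the braces yields valid JSON, which maps onto a Python dict.
msg = json.loads("{" + user_msg + "}")
print(msg["servername"])   # default-server
</code>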
 +====Configuration file options: “spres_control”====
 +
 +This heading groups together options specific to the SPRES algorithm. This section
 +will not be read unless the option “algo_name” in the section “General” is set to
 +“spres”.
 +
 +<code bash>
 +
 +[spres_control]
 +# The SPRES algorithm in its ‘vanilla’ flavour does not monitor reaction fluxes. If a flux across the hyperplane defined by the reaction coordinate value B is desired, then the value ‘test_absorb_at_B_every’ must be set non-zero, typically 1. This is because the reaction flux is defined as the rate of first crossings of B; so trajectories must be interrupted and checked for crossing after every timestep if a reaction flux is desired. See also the entry ‘test_absorb_at_B_after_bin’ in connection with this setting.
 +test_absorb_at_B_every =     0
 +# If the variable ‘test_absorb_at_B_every’ is set non-zero then this can lead to expensive repeated evaluations of the reaction coordinate at very short intervals, even far from the hyperplane B. Setting ‘test_absorb_at_B_after_bin’ greater than zero will turn off this repeated evaluation for bins below the index specified. You should have some understanding of the timescales of your system before setting this variable.
 +test_absorb_at_B_after_bin = 0
 +# sets the length of trajectory fragments in the SPRES algorithm
 +tau                    =   100
 +# sets the maximum time covered during an SPRES run
 +max_epoch              =  1000
 +# maximum number of shots made from a given bin at a given timestep in a SPRES calculation. A high value gives maximum flexibility to the importance sampling component of the algorithm to achieve an even number of transitions over phase-space/time; however in practice it is sometimes better to have a bound on the cost of your computation, which is achieved by setting this value relatively low. If your system is not achieving good coverage of transitions between bins, it is often better to decrease the bin size (see the section on Hypersurfaces) rather than to increase the ceiling of shots per bin.
 +max_shots_per_bin      =  10
 +# sets the typical sampling density of interface crossings per timestep ‘tau’. Logically, this cannot be greater than ‘max_shots_per_bin’.
 +target_forward         =     2
 +# use several databases to store the information
 +use_multDB             =       0
 +
 +</code>
 +====Configuration file options: “ffs_control”====
 +
 +=== Basic options ===
 +
 +<code bash>
 +
 +[ffs_control]
 +# require a fixed number of points on the interface.
 +# If 0, only the number of runs which reached the interface is used
 +require_runs = 1
 +# minimum points per interface to proceed
 +min_success = 2
 +# If this option is enabled, the clients must support max_steps
 +parallel_escape = 1
 +# if parallel_escape is 1, the following number of simulation steps
 +# are used for the escape run
 +escape_steps = 10000
 +
 +</code>
 +
 +
 +=== Advanced options ===
 +
 +<code bash>
 +# configpoints must have at least this number of different origin
 +# points when tracing back to first interface
 +
 +# Use the min_origin options if you find that your system is getting
 +# stuck at particular interfaces; this could be due to all configs
 +# at that interface being in the same kinetic trap.
 +#
 +# Increasing the diversity of configurations that you have,
 +# by using this option, might help.
 +min_origin = 5
 +
 +# fraction of successful different traces from interface to interface
 +min_origin_decay = 0.3
 +
 +# try to increase the number of points this many times
 +min_origin_increase_count = 2
 +
 +# maximum ghost point transfers in a row between real runs
 +# (decrease this number, if you use ghosts and the server slows down on
 +# an interface change)
 +max_ghosts_between = 3
 +
 +
 +</code>
 +
 +====Configuration file options: section “Hypersurfaces”====
 +
 +This heading selects values of the reaction coordinate which define hypersurfaces
 +in the phase space of the system. The reaction coordinate is calculated from the
 +configuration (potentially including velocities) in a way which is inherently system-dependent; example means of doing this are found in the various ‘harness’ scripts.
 +
 +^ Option ^ Default ^ Description ^
 +| lambdacount | 2 | This variable sets the number of non-boundary interfaces. The total number of interfaces will be equal to this value plus 2. |
 +| borderA | 0.0 | This variable sets the reaction coordinate value which defines the lowest boundary interface (be it open or absorbing) of the system. RC values λ ≤ A are considered to be in ‘region A’ of the phase space. |
 +| borderB | 10.0 | This variable sets the reaction coordinate value which defines the highest boundary interface (be it open or absorbing) of the system. RC values λ > B are considered to be in ‘region B’ of the phase space. |
 +
 +===lambdas===
 +<code>
 +lambda1 = 1.0
 +lambda2 = 2.0
 +</code>
 +
 +The variables ‘lambda1’, ‘lambda2’, etc. set the positions of the non-boundary
 +interfaces. RC values lambda1 < λ ≤ lambda2 are considered to be in the
 +‘λ1 − λ2’ region.
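Putting the borders and lambdas together, classifying a configuration by its RC value can be sketched as follows (the helper function and its name are purely illustrative, not part of FRESHS; the half-open-interval convention follows the text above):

<code python>
# Classify a reaction-coordinate value against the interfaces defined
# in the [Hypersurfaces] section (illustrative helper only).
def rc_region(rc, borderA, lambdas, borderB):
    if rc <= borderA:
        return "A"                  # lambda <= A: region A
    for i, lam in enumerate(lambdas, start=1):
        if rc <= lam:
            return "lambda%d" % i   # between previous interface and lambda_i
    if rc <= borderB:
        return "pre-B"              # past the last lambda, not yet past B
    return "B"                      # lambda > B: region B

print(rc_region(1.5, 0.0, [1.0, 2.0], 10.0))   # lambda2
</code>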
 +
 +====Configuration file options: section “Runs_per_interface”====
 +
 +<code>
 +borderA = 10
 +borderB = 15
 +lambda1 = 20
 +lambda2 = 20
 +.
 +.
 +</code>
 +
 +
 +This section is read for both SPRES and FFS, but has different meanings. In SPRES,
 +each entry sets the initial number of runs per interface: the number of shots made from
 +within the region λN − λN+1 the first time that this region is found to contain
 +a configuration. After the first shots from the region, the importance sampling
 +component of the SPRES algorithm will then adjust this number up or down.
 +In FFS, each entry sets the total number of shots which are required to reach
 +the interface λN before beginning to make shots which themselves begin from this
 +interface.
 +
 +
 +==== Configuration file options: Automatic, optimized interface placement options ====
 +
 +<code bash>
 +
 +[auto_interfaces]
 +# use automatic interface placement
 +auto_interfaces = 0
 +# minimal distance between interfaces
 +auto_mindist = 0.001
 +# order parameter is integer
 +auto_lambda_is_int = 0
 +# maximum number of calculation steps for exploring runs
 +auto_max_steps = 10000
 +# number of trials
 +auto_trials = 20
 +# number of runs per newly placed interface (M_0-counter)
 +auto_runs = 30
 +# minimum fraction of desired points on last interface before starting explorer runs
 +auto_min_points = 0.95
 +# minimum acceptable estimated flux
 +auto_flux_min = 0.3
 +# maximum acceptable estimated flux
 +auto_flux_max = 0.6
 +# moveunit for the trial interface method
 +auto_moveunit = 0.25
 +# use exploring scouts method. Clients must support this by returning max(reaction_coordinate).
 +auto_histo = 0
 +
 +</code>
 +
 +===== Client description =====
 +
 +You can start as many clients as you want by repeating the following command:
 +
 +<code bash>
 +cd client
 +python main_client.py -c client-sample.conf
 +</code>
 +
 +The executable and the harness script for the simulation are specified in the
 +configuration file:
 +
 +==== Configuration file options: general client options ====
 +
 +<code bash>
 +[general]
 +# the host to connect to
 +host = localhost
 +# the port to listen on
 +port = 10000
 +# ssh tunnel
 +ssh_tunnel = 0
 +# ssh tunnelcommand
 +ssh_tunnelcommand = ssh -N -L 10000:localhost:10000 tunneluser@tunnelhost
 +# refuse to accept new jobs after timeout(hours)
 +timeout = 0.0
 +# the executable which should be called (no quotes!)
 +executable = /​usr/​bin/​espresso
 +# the harness to use,
 +# location of the harness dir where the 'job_script' is located (no quotes!)
 +harness = ../harnesses/espresso
 +# give number of lines to expect in first line of input fifo to simulation
 +nlines_in = 0
 +# set niceness of executable, 0 = disable
 +nice_job = 0
 +</code>
 +
 +The client program has been kept as simple and lightweight as possible. Its
 +function is to start a simulation program, relay initial coordinates from the server
 +program to the simulation program, then collect outputs and return them to the
 +server.
 + 
 +When you are running clients over the network for the first time, it is easy to check that you have the correct address and port for the server: first start the server, then go to the client machine and type:
 +
 +''echo -n "hello" | netcat mycomputer.uni.edu 10000''
 +
 +where “mycomputer.uni.edu” is the address of the machine running the server
 +program and “10000” is the port. If you have the right address, then the server
 +should interpret the “hello” as a client trying to connect, and print something to
 +the screen. If nothing happens, check the address of the server machine using ''ifconfig''.
 +
 +If you are sure that the address is right, try running it on a few different ports. If
 +you still can’t get through then you have some kind of firewall problem and should
 +speak to your network administrator.
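If ''netcat'' is not available on the client machine, the same reachability test can be sketched in a few lines of Python (the function name here is just for illustration):

<code python>
import socket

# Return True if a TCP server accepts connections at host:port.
# This is purely a connectivity probe; the FRESHS server will log the
# stray connection, just as it does for the netcat test.
def server_reachable(host, port, timeout=5.0):
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(b"hello")
        return True
    except OSError:
        return False

print(server_reachable("localhost", 10000))
</code>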
 +
 +=====Simulation Programs=====
 +FRESHS is intended to be easy to use with any simulation program. The interfaces between FRESHS and each simulation program are contained in the directory **harnesses**. Each subdirectory (e.g. **harnesses/espresso**, **harnesses/lammps**) contains a script **job_script** plus any optional data files it needs.
 +
 +The script is called by the client program based on the information passed to it from the server. Input and output filenames passed to the scripts are actually **FIFOs** (also called ‘named pipes’). These can be treated just like files, with the exception that they do not support seeking or rewinding - data must be read from or written to
 +them strictly sequentially, just like rolling tennis balls down a drainpipe.
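The round trip through a named pipe can be sketched as follows (a standalone illustration for POSIX systems, not FRESHS code; the data line is invented):

<code python>
import os
import tempfile
import threading

# Create a named pipe and pass one "configuration" through it, the way a
# FRESHS client feeds data to a simulation program.
fifo = os.path.join(tempfile.mkdtemp(), "in_fifo")
os.mkfifo(fifo)

def writer():
    # Opening a FIFO for writing blocks until a reader appears.
    with open(fifo, "w") as f:
        f.write("0.0 1.0 2.0\n")

t = threading.Thread(target=writer)
t.start()
with open(fifo) as f:        # reads strictly sequentially; no seeking
    line = f.read()
t.join()
os.remove(fifo)
print(line.strip())          # 0.0 1.0 2.0
</code>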
 +
 +==== job_script arguments ====
 +
 +Here is a detailed specification of the arguments that the job_script should process (see also [[freshs:harnessscripts|example harness scripts]]):
 +
 +^ option ^ value description ^
 +| -tmpdir             | temporary folder for the simulation run |
 +| -initial_config     | initial configuration file |
 +| -in_fifoname        | input fifoname |
 +| -back_fifoname      | output fifoname |
 +| -metadata_fifoname  | metadata fifoname |
 +| -halt_steps         | use halt steps |
 +| -check_rc_every     | check the reaction coordinate after this number of simulation steps |
 +| -A                  | the border of the initial state A in terms of the reaction coordinate |
 +| -B                  | the border of the final state B in terms of the reaction coordinate |
 +| -random_points      | configuration point to start the simulation from |
 +| -seed               | a random seed, which is also written back to the database for reproducing simulation runs |
 +| -next_interface     | next interface location in terms of the reaction coordinate |
 +| -act_lambda         | actual lambda value |
 +| -jobtype            | jobtype of the simulation, e.g. FFS escape flux or SPRES run |
 +| -rp_id              | id of the runpoint |
 +| -max_steps          | maximum number of steps to calculate. If this is set, the calculation should be interrupted after this number of steps |
 +| -clientname         | name of the client |
 +| -timestamp          | timestamp of the simulation |
 +| -uuid               | uuid of the simulation run |
 +
 +==== job_script argument parsing ====
 +
 +=== BASH ===
 +
 +This method loops over the arguments and sets the variables directly using eval (use with caution). The IFS must be set if values containing spaces are expected.
 +
 +<code bash>
 +#!/bin/bash
 +OLDIFS=$IFS
 +IFS=$'\n'
 +
 +varmod=0
 +for el in $*;do
 +    if [ $varmod == 0 ];then
 +        # strip the leading dash to obtain the variable name
 +        varset=`echo $el | sed s/^-//g`
 +        varmod=1
 +    else
 +        eval "$varset=\"$el\""
 +        varmod=0
 +    fi
 +done
 +
 +IFS=$OLDIFS
 +</code>
 +
 +=== TCL ===
 +
 +This script loops over all (argument,​value) pairs and sets the variables accordingly.
 +
 +<code tcl>
 +foreach {option value} $argv {
 +  switch -glob -- $option {
 +    -argument1          {set argument1 $value }
 +    -argument2          {set argument2 $value }
 +    default             {puts "Additional unused parameter $option"}
 +  }
 +}
 +</code>
 +
 +
 +=== Python ===
 +This code builds a dictionary called **optval**, which contains all argument values indexed by their option names.
 +Alternatively, an option parser like argparse could be used.
 +
 +<code python>
 +import sys
 +import re
 +
 +optval = {}
 +for eli in range(len(sys.argv)):
 +    if (eli+1) % 2 == 0:
 +        # strip the leading dash from the option name
 +        optval[re.sub('-', '', sys.argv[eli], 1)] = sys.argv[eli+1]
 +</code>
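The argparse variant can be sketched as follows, covering a few of the arguments from the table above (extend the list as needed; ''parse_known_args'' tolerates any extra options the server might send):

<code python>
import argparse

# Declare a few of the job_script arguments; unknown options are collected
# in a separate list rather than treated as errors.
parser = argparse.ArgumentParser()
parser.add_argument('-tmpdir')
parser.add_argument('-seed')
parser.add_argument('-uuid')

args, unknown = parser.parse_known_args(
    ['-tmpdir', '/tmp/run1', '-seed', '42', '-uuid', 'abc'])
print(args.seed)   # 42
</code>

In a real harness script the argument list would come from ''sys.argv[1:]'' rather than a literal list.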
 +
 +
 +
 +===== Postprocessing =====
 +
 +==== Averaging Multiple Runs [NON TRIVIAL] ====
 +
 +The FFS and flatPERM-FFS algorithms continue sampling at each interface until a fixed number of shots progress to the next interface. This type of trial produces successes according to a negative binomial distribution:​ for a single trial, the success probability is estimated simply as the proportion of successful shots; however for multiple trials (multiple independent FFS/​FFS-flatPERM calculations) the correct averaging to determine the success probability is N_success / (<​N_fails>​ + N_success). ​ The number of failures must be saved and averaged in order to then find the expected success rate.
 +
 +When combining multiple independent simulations,​ the structure of the interfaces must be the same for each calculation,​ and the shot databases post-processed to get the correct averages. ​ Automatic interface placement is therefore contraindicated in this case.  Only the probability of success (the total statistical weight) needs to be corrected for this effect; the proportional statistical weights of individual trajectories relative to each other remain correct.
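The averaging rule above can be sketched numerically (the per-run counts here are invented for illustration):

<code python>
# Combine N independent FFS runs at one interface. Each run stops after a
# fixed number of successes (negative binomial sampling), so the correct
# estimator is N_success / (<N_fails> + N_success), not the mean of the
# per-run success ratios.
n_success = 50                     # fixed number of successes per run
fails_per_run = [150, 90, 210]     # recorded failures of three runs

mean_fails = sum(fails_per_run) / len(fails_per_run)
p_success = n_success / (mean_fails + n_success)
print(round(p_success, 3))   # 0.25
</code>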
 +
 +==== Using the SQLITE Database of Trajectory Fragments ====
 +
 +A database is the standard tool used in business applications for managing large quantities
 +of data. It has some advantages over storing data directly on the filesystem, as is
 +more commonly done for simulation applications: it is platform-independent, and it provides various convenient tools for searching through the data.
 +
 +The final result of a FRESHS calculation will be a lot of log files, some algorithm-dependent data
 +about reaction rates and statistical weights, and a large sqlite database. To look at
 +the database directly, the free tool “sqlitebrowser” is recommended. A Firefox plugin
 +also exists for this purpose; the choice of interface is up to the user. To process
 +the database in an organized way, a couple of python scripts are provided with
 +FRESHS (see the directory //scripts//). Looking at these scripts should make it possible for you to write your own. You can work with SQL either natively, through a GUI, or through the python bindings, as is done in the example scripts.
 +
 +
  
freshs/documentation.txt · Last modified: 2016/03/03 13:21 (external edit)