Command Line Arguments

In addition to the Compile-Time Variables, we make use of the JSL::Argument class, allowing easy integration of command-line arguments for a number of properties.

Arguments & Triggers

| Variable Name | Type | Trigger | Purpose | Default Value |
|---|---|---|---|---|
| ConfigFile | std::string | config | Read first. If set to anything other than the default, all arguments are read from the named configuration file instead of the command line. | "__null_location__" |
| ConfigDelimiter | char | config-delim | The delimiter character used to parse the config file. | ' ' |
| RandomSeed | int | random-seed | The value passed to srand() at startup for reproducible randomness. | 0 |
| DataSource | std::string | data | The directory for the stellar data lists. | "../../Data/ShuffledData" |
| OutputDirectory | std::string | output | The directory for the output data (created if it doesn't already exist). | "Output" |
| StartVectorLocation | std::string | restart | The directory searched for a valid savefile for vector initialisation (if left at the default, a random vector is used). | "__null_location__" |
| GradLim | double | gradlim | The maximum value of \(\nabla\mathcal{L}\) which will be considered 'converged'. | 0 |
| MaxSteps | int | max-steps | The maximum number of epochs the optimiser may use before exiting. | 1000 |
| SaveSteps | int | save-steps | The number of steps between saving locations. | 1 |
| SaveAllSteps | bool | unique-temp-save | If true, the temporary, raw vectors are saved uniquely. Setting this to false is recommended, to avoid generating huge amounts of data. | 0 |
| Minibatches | int | minibatch | The maximum number of batches used per epoch in the SGD prescription. | 64 |
| HarnessSlowDown | double | harness-slow | The factor by which step sizes are reduced when the harness is active. | 10 |
| HarnessRelease | int | harness-release | The number of full epochs over which the step size recovers from the HarnessSlowDown. | 5 |

Usage

Arguments can be passed to the code in one of two ways. Either through the command-line interface:

> ./theia -minibatch 16 -gradlim 0.1 -harness-release 12

Or through a configuration file, written as:

//configuration.txt
minibatch 16
gradlim 0.1
data ../new/data/location
random-seed 299

And launched as:

> ./theia -config configuration.txt

Note that if a configuration file is attached, all other command line arguments will be ignored.

CommandArgs Holder Class

Once loaded, the command-line arguments are stored within a central CommandArgs object:

class CommandArgs

Public Functions

inline CommandArgs()

Default constructor. Does nothing, as the arguments self-initialise.

inline void ReadArguments(int argc, char *argv[], int ProcessRank)

Initialise the command-line arguments, and check if a configuration file is requested. Note that there is no checking for repeat arguments or multiply defined trigger strings, so multiple assignments are perfectly possible.

Parameters
  • argc – The system-provided command argument count

  • argv – The system-provided list of command arguments

  • ProcessRank – The MPI-provided ID of the current process

Returns

Nothing. The object is initialised in place from the provided parameters.

Public Members

Argument<int> RandomSeed = Argument<int>(0, "random-seed")

The value passed to srand() for reproducible randomness.

Argument<std::string> StartVectorLocation = Argument<std::string>("__null_location__", "restart")

The directory to search in for a valid savefile configuration for relaunch.

Argument<double> GradLim = Argument<double>(0, "gradlim")

The maximum value of \(\nabla\mathcal{L}\) which will be considered ‘converged’.

Argument<int> MaxSteps = Argument<int>(1000, "max-steps")

The maximum number of epochs the optimiser may use before exiting.

Argument<int> SaveSteps = Argument<int>(1, "save-steps")

The number of steps between saving locations.

Argument<bool> SaveAllSteps = Argument<bool>(false, "unique-temp-save")

If true, the temporary, raw vectors are saved uniquely. Setting this to false is recommended, to avoid generating huge amounts of data.

Argument<int> Minibatches = Argument<int>(64, "minibatch")

The maximum number of batches used per epoch in the SGD prescription.

Argument<double> HarnessSlowDown = Argument<double>(10, "harness-slow")

The factor by which step sizes are reduced when the harness is active.

Argument<int> HarnessRelease = Argument<int>(5, "harness-release")

The number of full epochs over which the step size recovers from the HarnessSlowDown.

Argument<std::string> DataSource = Argument<std::string>("../../Data/ShuffledData", "data")

The directory for the stellar data lists.

Argument<std::string> OutputDirectory = Argument<std::string>("Output", "output")

The directory for the output data (created if it doesn’t already exist).

std::vector<JSL::ArgumentInterface*> argPointers = {&RandomSeed, &StartVectorLocation, &GradLim, &MaxSteps, &DataSource, &OutputDirectory, &SaveSteps, &SaveAllSteps, &Minibatches, &HarnessRelease, &HarnessSlowDown}

A list of pointers to every argument, so that the (heterogeneous) set can be looped over easily during assignment.