ExternalProcess (biojava-legacy 1.9.5 API)
Utility class to execute an external process and to handle
the STDOUT, STDERR and STDIN streams
in multiple threads managed by a thread pool.
This class is intended for applications that call an external program many
times, e.g. in a loop, and that need high performance throughput, i.e.
the program's input and output should not be written to disk. The Java
Runtime.exec(java.lang.String) methods require the application to read and
write the external program's input and output streams in multiple threads.
Otherwise the calling application may block. However, creating new
threads for each call is expensive. On Linux systems there is also the
problem that each Java thread is represented by a single process and the
number of processes is limited on Linux. Because the Java garbage collector
does not free the Thread objects properly, an application
might run out of threads (indicated by an OutOfMemoryError)
after multiple iterations. Therefore, the
ExternalProcess class uses a
thread pool.
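The idea of reusing pooled threads to drain a child process's streams can be sketched with the plain JDK. This is a minimal illustration, not the ExternalProcess implementation: the class PooledStreamDrain, its POOL field and the methods drain, drainToString and run are hypothetical names, and UTF-8 decoding is an assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of biojava.
final class PooledStreamDrain {
    // One shared pool that is reused for every call: one thread for the
    // child's standard output and one for its error output.
    static final ExecutorService POOL = Executors.newFixedThreadPool(2, r -> {
        Thread t = new Thread(r, "stream-drain");
        t.setDaemon(true); // do not keep the JVM alive for idle pool threads
        return t;
    });

    // Copies everything from 'in' to 'out' on a pooled thread.
    static Future<?> drain(InputStream in, Writer out) {
        return POOL.submit(() -> {
            try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
                char[] buf = new char[4096];
                int n;
                while ((n = reader.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
            }
            return null;
        });
    }

    // Convenience wrapper: drains 'in' completely and returns it as a string.
    static String drainToString(InputStream in) {
        StringWriter out = new StringWriter();
        try {
            drain(in, out).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return out.toString();
    }

    // Starts the process, drains both output streams concurrently so the
    // child cannot block on a full pipe, and returns the exit code.
    static int run(ProcessBuilder builder, Writer out, Writer err)
            throws IOException, InterruptedException, ExecutionException {
        Process process = builder.start();
        process.getOutputStream().close(); // this sketch sends no input
        Future<?> stdout = drain(process.getInputStream(), out);
        Future<?> stderr = drain(process.getErrorStream(), err);
        int exitCode = process.waitFor();
        stdout.get();
        stderr.get();
        return exitCode;
    }
}
```

With a pool of two threads, this sketch only supports one running process at a time. Concurrent callers would need a larger pool, because both streams of a process must be drained simultaneously.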
The simplest way to use this class is by calling the static methods
execute(String) and
execute(String, String, StringWriter, StringWriter). However, these
methods are not thread safe and no configuration is possible. In the former
case the program's input, output and error output are redirected to STDIN,
STDOUT and STDERR of the calling program. In the
latter case input is provided as a string, and output and error output are
written to StringWriter objects. The environment, i.e.
the current working directory and the environment variables, is inherited
from the calling process. In both cases, a static thread pool of size
THREAD_POOL_SIZE is used. The command that should be executed is
provided as a string argument.
In scenarios where the environment has to be changed, the program input
is generated just in time, or the program's output is parsed just in time,
the use of an explicit instance of the ExternalProcess class
is recommended. This instance can be initialized with a custom thread pool.
Otherwise a SimpleThreadPool of size 3 is used.
The input and output are managed by multithreaded
input handler and
output handler objects.
There are four predefined handlers that read the program's input from a
Reader object or an InputStream object and
write the program's output to a Writer object or an
OutputStream object. These classes are called:
ReaderInputHandler,
SimpleInputHandler,
WriterOutputHandler and
SimpleOutputHandler. If no handlers are
specified, the input and output are redirected to the standard streams of
the calling process.
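How a handler might feed a Reader into the external program's input stream can be sketched as follows. The class ReaderFeed is a hypothetical stand-in and not the ReaderInputHandler source. That a handler is a Runnable, that it encodes characters as UTF-8, and that it closes the target stream at the end are all assumptions of this sketch.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical input-side handler, not part of biojava.
final class ReaderFeed implements Runnable {
    private final Reader source;       // where the program's input comes from
    private final OutputStream target; // e.g. Process.getOutputStream()

    ReaderFeed(Reader source, OutputStream target) {
        this.source = source;
        this.target = target;
    }

    @Override
    public void run() {
        // Closing the writer closes 'target', which signals end-of-input
        // to the external program.
        try (Writer writer = new OutputStreamWriter(target, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            while ((n = source.read(buf)) != -1) {
                writer.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the sketch is a Runnable, it could be submitted to a thread pool alongside the tasks that read the program's output, so that writing input never blocks reading output.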
Before one of the methods execute() or
execute(Properties) is called, the commands property should be set.
One may include placeholders of the form
%PARAM% within the commands. If a
Properties object is passed to the
execute(Properties) method, the placeholders are replaced by the
corresponding property values. Therefore, the Properties object
must contain a key named PARAM (the key is matched case-insensitively). The
environment for calling the external program can be configured using the
properties workingDirectory and
environmentProperties.
Finally, the sleepTime property can be
increased in case the output handlers are not able to capture the
program's entire output within the given time. The default value is
SLEEP_TIME (in milliseconds).
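
The %PARAM% placeholder replacement described above can be sketched as follows. This is an illustration and not the ExternalProcess code: the class CommandTemplate and its resolve method are hypothetical, the set of characters allowed in a placeholder name is an assumption, and leaving unknown placeholders untouched is a choice made for this sketch only.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of biojava.
final class CommandTemplate {
    // Assumption: placeholder names consist of letters, digits and underscores.
    private static final Pattern PLACEHOLDER = Pattern.compile("%([A-Za-z0-9_]+)%");

    // Replaces each %NAME% in 'commands' with the value of the property
    // whose key equals NAME, ignoring case.
    static String resolve(String commands, Properties params) {
        Matcher matcher = PLACEHOLDER.matcher(commands);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String value = lookupIgnoreCase(params, matcher.group(1));
            // Sketch-only choice: keep placeholders that have no matching key.
            String replacement = (value != null) ? value : matcher.group();
            // quoteReplacement prevents '$' and '\' in values from being
            // interpreted as group references.
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }

    private static String lookupIgnoreCase(Properties params, String name) {
        for (String key : params.stringPropertyNames()) {
            if (key.equalsIgnoreCase(name)) {
                return params.getProperty(key);
            }
        }
        return null;
    }
}
```

For example, with a property param=input.fasta, the hypothetical command string "mytool -i %PARAM%" would resolve to "mytool -i input.fasta".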