Architecture and topology
A JPPF grid is made of three different types of components that communicate together:
From this picture, we can see that the server plays a central role, and its interactions with the nodes define a master / worker architecture, where the server (i.e. master) distributes the work to the nodes (i.e. workers).
This also represents the most common topology in a JPPF grid, where each client is connected to a single server, and many nodes are attached to the same server. As with any such architecture, this one is facing the risk of a single point of failure. To mitigate this risk, JPPF provides the ability to connect multiple servers together in a peer-to-peer network and additional connectivity options for clients and nodes, as illustrated in this figure:
Note how some of the clients are connected to multiple servers, providing failover as well as load balancing capabilities. In addition, and not visible in the previous figure, the nodes have a failover mechanism that will enable them to attach to a different server, should the one they are attached to fail or die.
The connection between two servers is directional: if server A is connected to server B then A will see B as a client, and B will see A as a node. This relationship can be made bi-directional by also connecting B to A. Note that in this scenario, each server taken locally still acts as a master in a master/worker paradigm.
In short, we can say that the single point of failure issue is addressed by a combination of redundancy and dynamic reconfiguration of the grid topology.
A task is the smallest unit of work that can be handled in the grid. From the JPPF perspective, it is considered atomic.
A job is a logical grouping of tasks that are submitted together, and may define a common service level agreement (SLA) with the JPPF grid. The SLA can have a significant influence on how the job's work will be distributed in the grid, by specifying a number of behavioral characteristics:
This chart shows the different steps involved in the execution of a job, and where each of them takes place with regards to the grid component boundaries.
It also shows that the main source of parallelism is provided by the load balancer, whose role is to split each job into multiple subsets that can be executed on multiple nodes in parallel. There are other sources of parallelism at different levels, and we will describe them in the next sections.
We illustrate this in the following picture:
This communication model has a number of important implications:
Messages all have the same structure, and are made of one or more blocks of data (in fact blocks of bytes), each preceded by its block size. Each block of data represents a serialized object graph. Thus, each message can be represented generically as follows:
The actual message format is different for each type of communication
channel, and may also differ depending on whether it is a request or
response message:
Job data channel
A job data request is composed of the following elements:
To read the full message, JPPF has to first read the header and obtain the number of tasks in the job.
The response will be in a very similar format, except that it doesn't have a data provider: being read-only, no change to its content is expected, which removes the need to send it in the response. Thus the response can be represented as:
Class loader channel
A class loader message, either request or response, is always made of a single serialized object. Therefore, the message structure is always as follows:
Single client, multiple concurrent jobs
A single client may submit multiple jobs in parallel. This differs from the single client/single job scenario in that the jobs must be submitted in non-blocking mode, and their results are retrieved asynchronously. An other difference is that the client must establish multiple connections to the server to enable parallelism, and not just asynchronous submission. When multiple non-blocking jobs are submitted over a single connection, only one at a time will be submitted, and the others will be queued on the client side. The only parallelism is in the submission of the jobs, but not in their execution. To enable parallel execution of multiple jobs, it is necessary to configure a pool of connections for the client. The size of the pool determines the number of jobs that can be processed in parallel by the server.
Multiple clients
In this configuration, the parallelism occurs naturally, by letting the different clients work concurrently.
Mixed local and remote execution
Clients have the ability to execute jobs locally, within the same process, rather than remotely on the server. They may also use both capabilities at the same time, in which case a load-balancing mechanism will provide an additional source of parallelism.
Number of connected clients
The number of connected clients, or more accurately, client connections, has a direct influence on how many jobs can be processed by the grid at any one time. The relationship is defined as: maximum number of parallel jobs = total number of client connections
Number of attached nodes
This determines the maximum number of jobs that can be executed on the grid nodes. With regards to the previous point, we can redefine it as: maximum number of parallel jobs = min(total number of client connections, total number of nodes)
Load balancing
This is the mechanism that splits the jobs into multiple subsets of their tasks, and distributes these subsets over the available nodes. Given the synchronous nature of the server to node connectins, a node is available only when it is not already executing a job subset. The load balancing also computes how many tasks will be sent to each node, in a way that can be static, dynamic, or even user-defined.
Job SLA
The job Service Level Agreement is used to filter out nodes in which the user does not want to see a job executed. This can be done by specifying an execution policy (rules-based filtering) for the job, or by configuring the maximum nuumber of nodes a job can run on (grid partitioning).
Parallel I/O
Each server maintains internally a pool of threads dedicated to network I/O. The size of this pool determines how many nodes the server can communicate with in parallel, at any given time. Furthermore, as communication with the nodes is non-blocking, this pool of I/O threads is part of a mechanism that achieves a preemptive multitasking of the network I/O. This means that, even if you have a limited number of I/O threads, the overall result will be as if the server were communicating with all nodes in parallel.
The pool size may be adjusted either statically or dynamically to account for the actual number of processors available to the node, and for the tasks' resource usage profile (i.e. I/O bound tasks versus CPU bound tasks).
- clients are entry points to the grid and enable developers to submit work via the client APIs
- servers are the components that receive work from the clients, dispatch it to the nodes, receive the results fom the nodes, and send these results back to the clients
- nodes perform the actual work execution
From this picture, we can see that the server plays a central role, and its interactions with the nodes define a master / worker architecture, where the server (i.e. master) distributes the work to the nodes (i.e. workers).
This also represents the most common topology in a JPPF grid, where each client is connected to a single server, and many nodes are attached to the same server. As with any such architecture, this one is facing the risk of a single point of failure. To mitigate this risk, JPPF provides the ability to connect multiple servers together in a peer-to-peer network and additional connectivity options for clients and nodes, as illustrated in this figure:
Note how some of the clients are connected to multiple servers, providing failover as well as load balancing capabilities. In addition, and not visible in the previous figure, the nodes have a failover mechanism that will enable them to attach to a different server, should the one they are attached to fail or die.
The connection between two servers is directional: if server A is connected to server B then A will see B as a client, and B will see A as a node. This relationship can be made bi-directional by also connecting B to A. Note that in this scenario, each server taken locally still acts as a master in a master/worker paradigm.
In short, we can say that the single point of failure issue is addressed by a combination of redundancy and dynamic reconfiguration of the grid topology.
Work distribution
To understand how the work is distributed in a JPPF grid, and what role is played by each component, we will start by defining the two units of work that JPPF handles.A task is the smallest unit of work that can be handled in the grid. From the JPPF perspective, it is considered atomic.
A job is a logical grouping of tasks that are submitted together, and may define a common service level agreement (SLA) with the JPPF grid. The SLA can have a significant influence on how the job's work will be distributed in the grid, by specifying a number of behavioral characteristics:
- rule-based filtering of nodes, specifying which nodes the work can be distributed to (aka execution policies)
- maximum number of nodes the work can be distributed to
- job priority
- start and expiration schedule
- user-defined metadata which can be used by the load balancer
This chart shows the different steps involved in the execution of a job, and where each of them takes place with regards to the grid component boundaries.
It also shows that the main source of parallelism is provided by the load balancer, whose role is to split each job into multiple subsets that can be executed on multiple nodes in parallel. There are other sources of parallelism at different levels, and we will describe them in the next sections.
Networking considerations
Two channels per connection
Each connection between a server and any other component is in fact a grouping of two network channels:- one channel is used to transport job data
- the other channel is used by the JPPF distributed class loader, that allows Java classes to be deployed on-demand where they are needed, completely transparently from a developer's perspective.
Synchronous networking
In JPPF, all network communications are synchronous and follow a protocol based on a request/response paradigm. The attribution of requester vs. responder role depends on which components communicate and through which channel.We illustrate this in the following picture:
This communication model has a number of important implications:
- nodes can only process one job at a time; however they can execute multiple tasks in parallel
- in the same way, a single client / server connection can only process one job at a time; however, each client can be connected multiple times to the same server, or multiple times to many servers
- in the case of a server-to-server communication, only one job can be processed at a time, since a server attaches to another server in exactly the same way as a node.
Protocol
JPPF components communicate by exchanging messages. As described in the previous section, each JPPF transaction will be made of a request message, followed by a response message.Messages all have the same structure, and are made of one or more blocks of data (in fact blocks of bytes), each preceded by its block size. Each block of data represents a serialized object graph. Thus, each message can be represented generically as follows:
|
|
|
|
|
Job data channel
A job data request is composed of the following elements:
- a header, which is an object representing information about the job, including the number of tasks in the job, the job SLA, job metadata, and additional information required internally by the JPPF components.
- a data provider, which is a read-only container for data shared among all the tasks
- the tasks themselves
size |
|
vider size |
data |
|
|
|
|
|
The response will be in a very similar format, except that it doesn't have a data provider: being read-only, no change to its content is expected, which removes the need to send it in the response. Thus the response can be represented as:
size |
|
|
|
|
|
|
Class loader channel
A class loader message, either request or response, is always made of a single serialized object. Therefore, the message structure is always as follows:
|
|
Sources of parallelism
At the client level
There are three ways JPPF clients can provide parallel processing, which may be used individually or in any combination:Single client, multiple concurrent jobs
A single client may submit multiple jobs in parallel. This differs from the single client/single job scenario in that the jobs must be submitted in non-blocking mode, and their results are retrieved asynchronously. An other difference is that the client must establish multiple connections to the server to enable parallelism, and not just asynchronous submission. When multiple non-blocking jobs are submitted over a single connection, only one at a time will be submitted, and the others will be queued on the client side. The only parallelism is in the submission of the jobs, but not in their execution. To enable parallel execution of multiple jobs, it is necessary to configure a pool of connections for the client. The size of the pool determines the number of jobs that can be processed in parallel by the server.
Multiple clients
In this configuration, the parallelism occurs naturally, by letting the different clients work concurrently.
Mixed local and remote execution
Clients have the ability to execute jobs locally, within the same process, rather than remotely on the server. They may also use both capabilities at the same time, in which case a load-balancing mechanism will provide an additional source of parallelism.
At the server level
The server has a number of factors that determine what can be parallelized and how much:Number of connected clients
The number of connected clients, or more accurately, client connections, has a direct influence on how many jobs can be processed by the grid at any one time. The relationship is defined as: maximum number of parallel jobs = total number of client connections
Number of attached nodes
This determines the maximum number of jobs that can be executed on the grid nodes. With regards to the previous point, we can redefine it as: maximum number of parallel jobs = min(total number of client connections, total number of nodes)
Load balancing
This is the mechanism that splits the jobs into multiple subsets of their tasks, and distributes these subsets over the available nodes. Given the synchronous nature of the server to node connectins, a node is available only when it is not already executing a job subset. The load balancing also computes how many tasks will be sent to each node, in a way that can be static, dynamic, or even user-defined.
Job SLA
The job Service Level Agreement is used to filter out nodes in which the user does not want to see a job executed. This can be done by specifying an execution policy (rules-based filtering) for the job, or by configuring the maximum nuumber of nodes a job can run on (grid partitioning).
Parallel I/O
Each server maintains internally a pool of threads dedicated to network I/O. The size of this pool determines how many nodes the server can communicate with in parallel, at any given time. Furthermore, as communication with the nodes is non-blocking, this pool of I/O threads is part of a mechanism that achieves a preemptive multitasking of the network I/O. This means that, even if you have a limited number of I/O threads, the overall result will be as if the server were communicating with all nodes in parallel.
At the node level
To execute tasks, each node uses a pool of threads that are called "processing threads". The size of the pool determines the maximum number of tasks a single node can execute in parallel.The pool size may be adjusted either statically or dynamically to account for the actual number of processors available to the node, and for the tasks' resource usage profile (i.e. I/O bound tasks versus CPU bound tasks).
Just regurgitation of what is on the JPPF site
ReplyDelete