Monday, March 6, 2017

Local and Global Variables

  1. Local Variables
  2. Global Variables
  3. Import Attribute
  4. Import-File Attribute
  5. Array Variables
  6. Mel Variable Properties


Local Variables

In the previous chapter, we said that in order to run something in parallel, we just pel it with the <!> construct. Actually, things are not that simple. For example, can we use local variables from the parent task? Let's take a look at this serial program:
/* VERSION 1 */
#define MAX 100
main () {
    int store[MAX];
    int counter;

    /* Producer first */
    counter = MAX;
    while ( --counter )
     store[counter] = Produce();

    /* Consumer next */
    counter = MAX;
    while ( --counter ) 
        Consume(store[counter]);
}
This contrived example uses a local variable, counter. It is used first in the Producer code to load the array store with Produced items. It is then used again in the Consumer code to unload the store array to Consume.

Let's change the Producer and Consumer codes to run in parallel, and see how the local variable counter behaves.
/* VERSION 2 */
#define MAX 100
main () {
    int counter = MAX;
    <mel buffer=counter/10> int store;

    <!> {    /* Producer task */
        while ( --counter ) 
            <?>store = Produce();
    }

    <!> {    /* Consumer task */
        while ( --counter ) 
            Consume(<?>store);
    }
}
First we transform the store array into a mel variable so that it can be shared between the Producer task and Consumer task. To increase concurrency, we give it a buffer with an arbitrary size of 10% of the number of iterations maintained by counter. Then we consecutively pel the code blocks for Producer and Consumer. The main task then ends. The whole program will exit at a later time when the Producer and Consumer tasks have also ended.

The item of interest here is the local variable counter. It appears in both pelled tasks. Unlike the serial VERSION 1 example, the value of counter when it gets to the Consumer task is still MAX as originally set by main. The reduction of counter in the Producer task (via the statement --counter) is done separately in that task and does not impact the counter in the main task. In fact, there are three distinct local counter variables in the VERSION 2 example, one for main, one for Producer, and one for Consumer.

During compile time, the NERWous C translator will analyze any block of code that is under the pel <!> construct. Any local variables from the pelling code (which is main here) that are re-used in the pelled code will be duplicated in the pelled tasks. Each duplicated local variable starts with the value that the corresponding variable in the pelling code has at the time of the pelling. This is why Consumer has its own counter variable, initialized to MAX just like the one in Producer.
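The copy-at-pel behavior can be mimicked in plain C, as a rough sketch. This is not NERWous C and not the code the translator generates; the names TaskSnapshot, snapshot_at_pel and run_countdown are hypothetical, and the "tasks" here run one after the other rather than in parallel. The point is only that each task decrements its own copy of counter.

```c
#include <assert.h>

#define MAX 100

/* Hypothetical stand-in for the private copy a pelled task receives. */
typedef struct {
    int counter;   /* the task's own copy of the parent's local variable */
} TaskSnapshot;

/* Copy the parent's local by value, as it stands at "pel time". */
TaskSnapshot snapshot_at_pel(int parent_counter) {
    TaskSnapshot snap = { parent_counter };
    return snap;
}

/* Run the countdown loop on the task's private copy.
 * Returns the number of loop iterations performed. */
int run_countdown(TaskSnapshot *task) {
    /* The while (--counter) idiom only terminates cleanly
     * when the starting value is positive. */
    assert(task->counter > 0);
    int iterations = 0;
    while (--task->counter)
        ++iterations;
    return iterations;
}
```

Taking two snapshots from the same parent variable and running run_countdown on the first leaves both the parent variable and the second snapshot untouched, which mirrors the three distinct counter variables of VERSION 2.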

Let's look at another version:
/* VERSION 3 */
#define MAX 100
main () {
    int counter = MAX;
    int numtasks = 0;
    <mel buffer=counter/10> int store;

    <!> {    /* Producer task */
        while ( --counter )
            <?>store = Produce();
    }
    ++numtasks;

    <!> {    /* Consumer task */
        printf ("This is task [%d]", numtasks);
        while ( --counter )
            Consume(<?>store);
    }
    ++numtasks;

    printf ("Waiting for %d pelled tasks to terminate", numtasks);
}
Like counter, numtasks is also a local variable of main. Unlike counter though, numtasks is not used in the Producer pelled code block. Thus, the NERWous C translator will not generate behind-the-scenes code at compile time to duplicate this variable for the Producer task. On the other hand, the Consumer pelled code block does use numtasks. The NERWous C translator can detect this use during compilation, and will generate code to create a separate numtasks local variable for the Consumer task. It also initializes this local variable with the value of numtasks from the main task at the time of the pelling (which is 1 in the example above).

To support local variables, the compile-time NERWous C translator needs to be able to:
  1. re-create any local variables from the pelling task to the pelled task if they also appear in the pelled code as right-hand-side values,
  2. initialize these created local variables with the values of the eponymous local variables from the pelling task at the time of pelling.
An implementation of the NERWous C translator may initialize all duplicated local variables instead of just the ones used on the right-hand side. Such an implementation may make sense in a single-server multi-threaded environment where local variables in a stack can be copied in bulk from one threaded task to another threaded task. For tasks that are distributed across separate nodes, especially over a wide area network, the cost of transferring data to initialize unneeded left-hand-side local variables may impact performance. Thus, the requirement for the NERWous C translator is to initialize only right-hand-side local variables.
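To make the trade-off concrete, here is a small C sketch comparing a bulk copy of the parent's locals with a selective capture. Everything in it is an assumption made for illustration: the ParentFrame layout, the extra scratch array (which does not appear in VERSION 3), and the capture structs are hypothetical, not the translator's actual output.

```c
#include <stddef.h>

#define MAX 100

/* Hypothetical picture of main's locals. The scratch array is an
 * invented example of a large local that no pelled block reads. */
typedef struct {
    int counter;
    int numtasks;
    double scratch[1024];
} ParentFrame;

/* One capture record per pelled block, holding only what it reads. */
typedef struct { int counter; } ProducerCapture;
typedef struct { int counter; int numtasks; } ConsumerCapture;

ProducerCapture capture_producer(const ParentFrame *frame) {
    ProducerCapture c = { frame->counter };
    return c;
}

ConsumerCapture capture_consumer(const ParentFrame *frame) {
    ConsumerCapture c = { frame->counter, frame->numtasks };
    return c;
}

/* Bytes moved when the whole frame is copied in bulk. */
size_t bulk_copy_bytes(void) { return sizeof(ParentFrame); }

/* Bytes moved when only the Consumer's captured locals are copied. */
size_t selective_copy_bytes(void) { return sizeof(ConsumerCapture); }
```

On a single machine the bulk copy is one cheap memory copy, but across a network the difference between the two sizes is data that must be serialized and sent for every pelled task.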

Let's rewrite the above example, replacing code blocks by functions:
/* VERSION 4 */
#define MAX 100
main () {
    int counter = MAX;
    int numtasks = 0;
    <mel buffer=counter/10> int store;

    /* Producer task */
    <!>Producer (store, counter);
    ++numtasks;

    /* Consumer task */
    <!>Consumer (store, counter, numtasks);
    ++numtasks;

    printf ("Waiting for %d pelled tasks to terminate", numtasks);
}

void Producer (mel int store, int counter) {
    while ( --counter ) 
        <?>store = Produce();
}

void Consumer (mel int store, int counter, int numtasks) {
    printf ("This is task [%d]", numtasks);
    while ( --counter ) 
        Consume(<?>store);
}

With the Producer and Consumer tasks encapsulated in functions, whatever local variables from the main task are needed have to be explicitly passed as function arguments. Again, the values of the arguments are the values of the local variables of the pelling task at the time of pelling. In the example above, counter has the value of MAX for both Producer and Consumer tasks, and numtasks has the value of 1 on entry of the Consumer task.


Global Variables

In the previous examples, we looked at local variables. In the following examples, let's consider global variables with pel statements. We start with a serial C program that contains three global variables:
/* VERSION 5 */
#define MAX 100
int Counter;
int Item;
int Version = 5;
main () {
    int store[MAX];

    /* Producer first */
    Counter = MAX;
    while ( --Counter ) {
        Item = Produce ();
        store[Counter] = Item;
        REPORT ("Producing");
    }

    /* Consumer next */
    Counter = MAX;
    while ( --Counter ) {
        Item = store[Counter];
        Consume(Item);
        REPORT ("Consuming");
    }
}

void REPORT (const char *action) {
    printf ("Version [%d]: Action [%s], Counter [%d] Item [%d]\n", Version, action, Counter, Item);
}
The three global variables, Version, Counter and Item, are used so that we don't need to pass them to the REPORT function as arguments. (This may not be a good way to write a computer program, but we need something simplistic to discuss the complex issue of global variables.)

Now let's rewrite the Producer and Consumer code blocks to run in parallel:
/* VERSION 6 - BUGGY */
#define MAX 100
int Version = 6;
int Counter = MAX;
int Item;
main () {
    <mel buffer=Counter/10> int store;

    /* Producer task */
    <!> while ( --Counter ) {
       Item = Produce ();
       <?>store = Item;
       REPORT ("Producing");
    }

    /* Consumer task */
    <!> while ( --Counter ) {
       Item = <?>store;
       Consume(Item);
       REPORT ("Consuming");
    }
}

void REPORT (const char *action) {
    printf ("Version [%d]: Action [%s], Counter [%d] Item [%d]\n", Version, action, Counter, Item);
}
Like local variables, the global variables are automatically made available in the pelled tasks. The above example contains 3 sets of Version, Counter and Item. The first set is in the main task, the second in the Producer task, and the third in the Consumer task. Each set is independent of the others, and is used for localized global access within a task only.

In NERWous C, a global variable is different from a memory element (mel) variable. A global variable is globally accessible in the compile-time scope, while a mel variable is globally accessible in the run-time scope. A compile-time scope means that a global variable can be accessed anywhere in the program: it is declared once and any block of code or function can refer to it with the help of the language compiler. A run-time scope means that a mel variable can be shared by different tasks that are forked (or in NERWous C parlance, pelled) during the running of the program. The shared access is provided by the CHAOS runtime environment. In the example above, the mel variable store is shared between the Producer task and the Consumer task. Note that store is declared as a local variable and not as a global variable for the compile-time scope.

This is how the NERWous C translator handles global variables at compile time:
  1. The translator collects the types and names of all global variables in the pelling task.
     
  2. When the translator first sees a variable in a pelled task and this variable is not a local variable, the translator automatically creates a global variable in the pelled task environment, using the type of the eponymous global variable collected from the pelling task.
     
  3. While the translator automatically re-creates global variables for pelled tasks, the translator does not automatically initialize them, unless the global variables are included in the import or import-file attribute.
The reason the translator does not do automatic initialization of re-created global variables for pelled tasks, while local variables are both re-created and initialized, is due to performance.

It can be hard to detect which global variables a pelled task will use, since global variables can appear in functions called deep in the execution thread starting from the first level of the pelled task. This differs from local variables, which only appear in the first level of a pelled code block and are therefore easier to detect. Automatically initializing all global variables at run time for every pelled task, even if the task does not use all of them, will likely hurt the performance of a concurrent NERWous C program, especially when the tasks are distributed over a wide area network and some global variables contain huge data.

The NERWous C translator will automatically create global variables in pelled tasks as they are needed, but initializes them only on demand, via the import or import-file attribute. In the example above, neither of these attributes is used, resulting in the program being "buggy". For example, the global variable Counter is re-created in the Producer and Consumer tasks, but neither of these copies is initialized. Thus the statement:
while ( --Counter ) {
will not behave as intended because Counter is uninitialized.

The import feature allows the pelling task to send data to the pelled task to initialize global variables. There is no export feature for the pelled task to write its changes back to the global variables in the pelling task. For this, the global variables must be declared as mel variables.


Import Attribute

Let's fix the above "buggy" example by making use of the import attribute to the pel <!> statement:
/* VERSION 7 */
#define MAX 100
int Version = 7;
int Counter = MAX;
int Item;
main () {
    <mel buffer=Counter/10> int store;

    /* Producer task */
    <! import="Version,Counter"> while ( --Counter ) {
       Item = Produce ();
       <?>store = Item;
       REPORT ("Producing");
    }

    /* Consumer task */
    <! import="Version,Counter"> while ( --Counter ) {
       Item = <?>store;
       Consume(Item);
       REPORT ("Consuming");
    }
}

void REPORT (const char *action) {
    printf ("Version [%d]: Action [%s], Counter [%d] Item [%d]\n", Version, action, Counter, Item);
}
The import attribute to the pel construct <!> instructs the NERWous C translator to generate behind-the-scene code to re-create the specified global variables (i.e. Version and Counter) in the pelled task, and to also initialize them with the current values of the same-named variables from the pelling task.
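As a rough model of what the import attribute does, the following C sketch treats each task's globals as a record and copies over only the fields named in an import mask. The TaskGlobals struct, the G_* flags and pel_with_import are hypothetical names invented for this sketch; NERWous C takes a list of variable names, not a bit mask, and the real mechanism lives in the translator and the runtime.

```c
/* Hypothetical bit flags, one per global variable in the example. */
enum {
    G_VERSION = 1 << 0,
    G_COUNTER = 1 << 1,
    G_ITEM    = 1 << 2
};

/* One task's private set of the three globals. */
typedef struct {
    int Version;
    int Counter;
    int Item;
    unsigned initialized;   /* which fields hold a meaningful value */
} TaskGlobals;

/* Build the pelled task's globals: every field exists, but only the
 * imported ones receive the parent's current value. */
TaskGlobals pel_with_import(const TaskGlobals *parent, unsigned import_mask) {
    /* Zero-fill keeps this C sketch well-defined; in the text, the
     * globals that are not imported are simply left uninitialized. */
    TaskGlobals child = { 0, 0, 0, 0u };
    if (import_mask & G_VERSION) child.Version = parent->Version;
    if (import_mask & G_COUNTER) child.Counter = parent->Counter;
    if (import_mask & G_ITEM)    child.Item    = parent->Item;
    child.initialized = import_mask & parent->initialized;
    return child;
}
```

Calling pel_with_import with G_VERSION | G_COUNTER corresponds to import="Version,Counter". Because the function returns a separate record, later changes to the child's Counter never reach the parent, which matches the one-way nature of the import.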

The global variable Item is not in the import list and is not created at the time of the pel statement. Only when the NERWous C translator first encounters Item in the Producer code block (or Consumer code block) does it create a global variable for Item localized to the pelled task. That means that the Item created for the Producer task is different from the one created for the Consumer task, and both of them are different from the one declared in the main task. The only thing these different global variables share is the type (which is int) that the NERWous C translator knows from parsing the whole program.

Global variables, like Item, that are created on an as-needed basis will not be initialized. The current values in the pelling task are not transferred to the pelled task. This is beneficial, not detrimental. In the example above, there is no need to transfer the value of Item over from the pelling task, only to have the pelled task override it with the result of Produce() or the value of the mel store. Via the import attribute, the programmer can specify only the global variables whose values need to be transferred over.

What happens if the programmer forgets a global variable in the import list, such as in this code fragment:
<! import="Counter">
Since Version is not imported, it is created uninitialized on an as-needed basis. In the above example, this will be when the function REPORT is invoked. The printf will display the indeterminate value of the uninitialized Version, instead of the static value 7 from the main task.

The import and import-file (to be introduced next) attributes support one-way transfer of data from the pelling task to the pelled task. Any changes that the pelled task makes to its copy of the global variables stay with the pelled task and are not copied back to the pelling task. If this update is desired, then the global variables should be declared as mel global variables whose read/write access is shared among multiple tasks, including the pelling task.


Import-File Attribute

The import attribute is fine when we have a few global variables to transfer over. When the list grows long, it becomes hard to maintain. In this case, it may be better to put this list into a separate file and use the import-file attribute to the <!> pel statement.

In this import file, procon.h, we introduce the global variables to be imported with the C language extern:
/* procon.h */
extern int Version;
extern int Counter;
Let's modify our example to use procon.h:
/* VERSION 8 */
#define MAX 100
int Version = 8;
int Counter = MAX;
int Item;
main () {
    <mel buffer=Counter/10> int store;

    /* Producer task */
    <! import-file="procon.h"> while ( --Counter ) {
       Item = Produce ();
       <?>store = Item;
       REPORT ("Producing");
    }

    /* Consumer task */
    <! import-file="procon.h"> while ( --Counter ) {
       Item = <?>store;
       Consume(Item);
       REPORT ("Consuming");
    }
}

void REPORT (const char *action) {
    printf ("Version [%d]: Action [%s], Counter [%d] Item [%d]\n", Version, action, Counter, Item);
}
To find the location of procon.h the NERWous C translator follows the same resolution rule used by the C preprocessor for handling:
#include "procon.h"

The same file inclusion works similarly when the pelled tasks are encapsulated as functions:
/* VERSION 9 */
#define MAX 100
int Version = 9;
int Counter = MAX;
int Item;
main () {
    <mel buffer=Counter/10> int store;

    /* Producer task */
    <! import-file="procon.h"> 
        Producer (store);

    /* Consumer task */
    <! import-file="procon.h"> 
        Consumer (store);
}

void Producer (mel int store) {
    while ( --Counter ) {
        Item = Produce ();
        <?>store = Item;
        REPORT ("Producing");
    }
}

void Consumer (mel int store) {
    while ( --Counter ) {
       Item = <?>store;
       Consume(Item);
       REPORT ("Consuming");
    }
}

void REPORT (const char *action) {
    printf ("Version [%d]: Action [%s], Counter [%d] Item [%d]\n", Version, action, Counter, Item);
}
For both the Producer and Consumer tasks, their localized versions of the global variables Counter and Version are created and initialized with the values from the main task right at the pel <!> statement. However, they are not put to use until the execution flows into the Producer or Consumer function.

We can import multiple files with the import-file attribute:
<! import-file="file1.h, file2.h">

We can mix the import-file attribute with the import attribute:
<! import-file="file1.h, file2.h" import="var1, var2">
The imported files usually contain groups of global variables that are commonly imported by many tasks, while the individual imported global variables are specific to a certain task.
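Both attributes take a comma-separated list. As a minimal sketch of how such a list could be broken into individual names, the C function below splits on commas and trims surrounding blanks. The function name split_import_list and the NAME_LEN limit are assumptions for illustration; the source does not describe how the translator actually parses these attributes.

```c
#include <ctype.h>
#include <string.h>

#define NAME_LEN 32   /* hypothetical cap on one entry's length */

/* Split a comma-separated attribute value into trimmed entries.
 * Stores at most max_names entries and returns how many were stored. */
int split_import_list(const char *list, char names[][NAME_LEN], int max_names) {
    int count = 0;
    const char *p = list;
    while (*p != '\0' && count < max_names) {
        /* Skip separators and leading blanks. */
        while (*p == ',' || isspace((unsigned char)*p))
            ++p;
        if (*p == '\0')
            break;
        /* Find the end of this entry. */
        const char *end = p;
        while (*end != '\0' && *end != ',')
            ++end;
        /* Trim trailing blanks. */
        const char *last = end;
        while (last > p && isspace((unsigned char)last[-1]))
            --last;
        size_t len = (size_t)(last - p);
        if (len >= NAME_LEN)
            len = NAME_LEN - 1;   /* truncate overly long entries */
        memcpy(names[count], p, len);
        names[count][len] = '\0';
        ++count;
        p = end;
    }
    return count;
}
```

Applied to the value "file1.h, file2.h" it yields the two file names; applied to "var1, var2" it yields the two variable names. A translator could then process the files first and the individual variables second, or the other way around, since the text does not state an order.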


Array Variables

The previous examples use simple data types for the global variables. Now let's explore the use of arrays as variables, either local or global. If the whole array is to be imported to the pelled task, then an array variable behaves no differently than a simple-typed variable. What is interesting is that the import facility allows the transfer of a subset of the array to a pelled task.

Let's modify the above example again:
/* VERSION 10 */
#define MAX 100
#define BUNCH 5
int Version = 10;
int Counter;
int Item;
main () {
    int store[MAX];

    /* Producer task - run in serial */
    for (Counter=0; Counter<MAX; ++Counter) {
       Item = Produce ();
       store[Counter] = Item;
       REPORT ("Producing");
    }  

    /* Consumer task - run in parallel */
    for (int i=0; i<MAX; i += BUNCH)
    <! import="Version, store[i] ... store[i+BUNCH-1]"> {
       for (Counter=i; Counter<i+BUNCH; ++Counter) {
          Item = store[Counter];
          Consume (Item);
          REPORT ("Consuming");
       }
    }
}

void REPORT (const char *action) {
    printf ("Version [%d]: Action [%s], Counter [%d] Item [%d]\n", Version, action, Counter, Item);
}
First, the mel variable store is converted to a local array variable. The Producer then seeds this array with a sequential for loop. Once the array is seeded, it will be consumed in parallel, BUNCH items at a time. The main task uses the LIST (...) construct:
store[i] ... store[i+BUNCH-1]
to pass BUNCH items for the pelled task to use.
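The effect of importing a slice can be approximated in plain C by copying just that range into a buffer owned by the task. This is a sketch under assumptions: import_slice and consume_bunch are hypothetical helpers, and the private buffer is indexed from 0, whereas the NERWous C example keeps the original indices store[i] through store[i+BUNCH-1] inside the pelled task.

```c
#include <assert.h>
#include <string.h>

#define MAX 100
#define BUNCH 5

/* Copy store[first .. first+count-1] into a task-private buffer,
 * standing in for the transfer done by the import attribute. */
void import_slice(int *task_buf, const int *store, int first, int count) {
    assert(first >= 0 && count >= 0 && first + count <= MAX);
    memcpy(task_buf, store + first, (size_t)count * sizeof *task_buf);
}

/* Stand-in for the Consume() loop: add up the items in the private copy. */
int consume_bunch(const int *task_buf, int count) {
    int total = 0;
    for (int k = 0; k < count; ++k)
        total += task_buf[k];
    return total;
}
```

Only BUNCH integers are moved per task instead of all MAX, and once the copy is made, later changes to the parent's store are not seen by the task.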

Note the use of the local variable i in these three consecutive lines:
    for (int i=0; i<MAX; i += BUNCH)
    <! import="Version, store[i] ... store[i+BUNCH-1]"> {
       for (Counter=i; Counter<i+BUNCH; ++Counter) {
The i variables appearing in the outer for statement and the import attribute are the same, and reside in the main task. On the other hand, the i used in the inner for loop belongs to the corresponding pelled task, and each pelled task has its own local version of i. Although these copies reside in different tasks than the main task, each gets initialized with the value of the i variable of the main task at the time of the task creation.

We "cheat" a little bit in the previous example by making MAX divisible by BUNCH so that we don't have to worry about the last iteration not having a full BUNCH of store elements. Let's update the pelling code to handle this boundary condition:
    /* Consumer task - run in parallel */
    for (int i=0; i<MAX; i += BUNCH) {
        int upperbound = i+BUNCH-1;
        if ( upperbound >= MAX ) upperbound = MAX-1;
        
        <! import="Version, store[i ... upperbound]"> {
          for (Counter=i; Counter<=upperbound; ++Counter) {
              Item = store[Counter];
              Consume (Item);
              REPORT ("Consuming");
          }
        }
    }
Like the i variable, upperbound is also a local variable. This allows it to be automatically duplicated to the pelled task and be used there without much ado.
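The clamping arithmetic is easy to check in isolation. The two helpers below are plain C and hypothetical (bunch_upperbound and bunch_count are not part of NERWous C); they reproduce the upperbound computation from the loop above and count how many pelled tasks the outer loop would create.

```c
#include <assert.h>

/* Last index of the bunch that starts at i, clamped to the array end. */
int bunch_upperbound(int i, int bunch, int max) {
    assert(bunch > 0 && i >= 0 && i < max);
    int upperbound = i + bunch - 1;
    if (upperbound >= max)
        upperbound = max - 1;
    return upperbound;
}

/* Number of pelled tasks created by: for (i = 0; i < max; i += bunch) */
int bunch_count(int bunch, int max) {
    assert(bunch > 0);
    int tasks = 0;
    for (int i = 0; i < max; i += bunch)
        ++tasks;
    return tasks;
}
```

With MAX at 100 and BUNCH at 5, every bunch is full and the clamp never triggers. With a BUNCH of 7, the last bunch starts at index 98 and is clamped to end at index 99, so it carries only two items.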


Mel Variable Properties

The properties of a mel variable are pieces of information about that mel variable that are cached locally in the requesting task so that they don't have to be constantly retrieved from the remote mel location. The mel properties of a mel variable have the same compile-time local or global scope as that mel variable:
<mel> int store1;
main () {
    <mel> int store2;
    printf ("In [main], store1 buffer is [%d]", store1<buffersize>);
    printf ("In [main], store2 buffer is [%d]", store2<buffersize>);
 
    <!> {
        printf ("In pelled task, store1 buffer is [%d]", store1<buffersize>);
        printf ("In pelled task, store2 buffer is [%d]", store2<buffersize>);
    }
}
The store2 variable is declared as a local variable, and so are its properties. The properties of store2 are therefore automatically transferred from the main task to the pelled task. The printf statements for store2 will show the same buffer size.

Interestingly, both printf statements for store1 also show the same buffer size, even though store1 is a global variable. Its properties are thus also global, and we have said that global variables are not initialized automatically in the pelled task. That statement is correct for regular variables, but not for mel properties.

When store1 is created in the statement:
<mel> int store1;
its properties are initialized with the current information of the newly created mel entity. The NERWous C translator, during compile time, sees that the main task makes use of the store1 variable, and so it adds an implicit import operation of store1 into main. This import is executed at run-time when main is created.

When main pels the inline task, the properties cache that main currently maintains for both store1 and store2 will be passed to the inline task. Let's expand the above example:
<mel> int store1;
main () {
    <mel> int store2;
    printf ("In [main], store1 buffer is [%d]", store1<buffersize>);
    printf ("In [main], store2 buffer is [%d]", store2<buffersize>);
    <rebuffer buffer+10> store1;
    <rebuffer buffer+10> store2;

    <!> {
        printf ("In pelled task, store1 buffer is [%d]", store1<buffersize>);
        printf ("In pelled task, store2 buffer is [%d]", store2<buffersize>);
    }
}
The printf statements now show different values. While those from main show the original buffer sizes of the mel variables, the ones from the pelled task show the buffer sizes incremented by 10 due to the rebuffer operations. These values are what main has in its properties caches when it pels the inline task.
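The hand-off of the properties cache can be pictured with a small C model. It is only a sketch: MelProps, PropsCache, rebuffer_cached and pel_copy_cache are hypothetical names, the cache is reduced to a single buffersize field, and the remote mel that a real rebuffer operation would also update is left out.

```c
/* Locally cached metadata for one mel variable (reduced to one field). */
typedef struct {
    int buffersize;
} MelProps;

/* The properties cache a task keeps for the two mels in the example. */
typedef struct {
    MelProps store1;   /* global mel */
    MelProps store2;   /* local mel  */
} PropsCache;

/* Model of a rebuffer operation as seen from the caller's cache:
 * the cached buffer size grows by delta. */
void rebuffer_cached(MelProps *cached, int delta) {
    cached->buffersize += delta;
}

/* At pel time the parent's whole cache is copied to the new task,
 * for global and local mels alike. */
PropsCache pel_copy_cache(const PropsCache *parent) {
    return *parent;
}
```

If main reads the buffer sizes, applies rebuffer_cached with a delta of 10 to both entries, and then calls pel_copy_cache, the copy handed to the pelled task holds the enlarged sizes for store1 and store2 alike, with no import attribute involved.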

The reason for mel properties to be treated the same for both local and global scopes with automatic initialization is that the information on how to access the remote mel must be passed to the tasks so that they can do mel operations such as reads and writes. The rest of the properties thus piggyback on that. If programmers had to explicitly specify the import attribute for global mel variables every time they made use of the pel <!> statement, a NERWous C program would become unwieldy.

How about performance? Unlike regular variables which contain data, mel properties only contain metadata (i.e. the properties). The real data of a mel is stored centrally in the cel that hosts that mel. The local properties cache of each mel variable is quite manageable, and can be transferred to a pelled task without undue impact on performance.

