Software Transactional Memory Library
        
            - General information
- Configurations
- XL compiler
- Platforms
- Functions
- Statistics
General information
        The software transactional memory (STM) library code provides
        several configurations and policies for implementing transactional
        memory (TM). A transaction performs a set of reads and writes to shared
        memory. The TM system, under certain conditions, guarantees that each
        transaction appears as if all of its reads and writes are performed
        atomically together in relation to other transactions.
        For the current implementation, it is assumed that shared locations
        are either always accessed inside transactions or always outside
        transactions, and that transactions include only revocable operations
        without side effects, and hence can be safely undone and retried.
        The STM implementations use metadata to synchronize access to shared
        locations accessed transactionally. Each shared location is associated
        with a metadata entry. A metadata entry serves as a version number
        tracking updates of associated shared locations, and as a lock
        protecting updates of these shared locations.
        In general, at the beginning of a transaction, the thread sets
        private per-thread transactional status data and in some configurations
        read some global data. For transactional reads, the thread reads and
        checks the metadata corresponding to the target location and then reads
        the target location, and in some configurations, checks the consistency
        of the read set. For transactional writes, the thread records the
        target address and the value to be written, and in some configurations
        acquires the corresponding metadata lock. At the end of a transaction,
        the thread acquires the metadata locks corresponding to its write set
        if not already acquired, validates the consistency of its read set,
        writes the values to the addresses in its write set, releases the
        metadata locks, and finally resets its private transactional data.
        The main design configuration issues are:
 
         
            - Read set inconsistency may lead to failures such as segmentation
            faults, division by zero and infinite loops. Under
            configurations that allow read set inconsistency, the STM
            system attempts to catch and recover from these failures.
            However, these attempts are not guaranteed to catch every
            possible failure. For example, the STM library cannot detect
            and recover from an infinite loop that does not generate any
            calls to the STM code. In some cases all possible failure types
            that may arise due to inconsistent reads can be anticipated,
            and it may be advantageous to forego read set consistency, in
            order to avoid the costs of frequent read set validations that
            are needed to guarantee read set consistency.  
- If read set consistency is to be guaranteed, there are multiple
            configurations. One is to check the metadata of each read in
            the read set after every read, which leads to a quadratic cost.
            The advantage of this choice is that it is scalable, as in
            general disjoint transactions do not interfere with each
            other.
- Another configuration employs a global version number to enable
            fast validation without checking each individual metadata entry
            corresponding to the reads in the transaction's read set unless
            a change in the global version number is detected. However,
            this requires each writing transactions to increment the global
            version number and hence this may limit concurrency, even among
            non-conflicting transactions.
- For programs with transactions with small read sets, the cost of
            full validation after every read may not be high, while
            incrementing a global version may be expensive relative to the
            small transactions and limits concurrency, and thus it may be a
            better choice to use full validation.
- For programs with transactions with large read sets, the cost of
            updating the global version number may be small relative to the
            large transactions, and the low frequency of global version number
            updates may not pose significant limitations on concurrency, while
            the quadratic cost of full validation after every read can be very
            high. Thus, in such cases it may be better to use a configuration
            with a global version number.
- Another design issue is when to acquire the metadata locks for the
            write set. Acquiring locks at encounter time enables fast
            determination for reads whether the location has been written by
            the same transaction or not without searching the write set, but
            increases the time where locks are held and thus increases the
            chances of inter-transaction conflicts. Acquiring locks at the end
            of the transaction minimizes the time where locks are held, but
            requires other mechanisms for detecting read after write cases
            using Bloom filters for fast search of write sets.  
Configurations
        The default implementation uses the following policy and configuration
        options:
        
            - Consistent read sets
- Global version number
- Write set lock acquisition at the end of the transaction
- 1M metadata data entries
- 8 byte conflict unit
- 512-bit write set Bloom filters
In order to use different policies and configurations, add EXTRA_FLAGS
        settings to the make command. For example, to use an encounter-time
        acquire policy for transactional writes use the command:
            make all EXTRA_FLAGS="-DENCOUNTER_ACQUIRE"
        
        Some of the main configuration flags are:
        
            - NOINC_VALIDATION: To allow inconsistent reads and use signals to catch failures.
- NOGLOBAL_VERSION: To avoid the use of a global version number.
- ENCOUNTER_ACQUIRE: To acquire metadata locks on encountering transactional writes instead of at the end of the transaction
- LOG_2_NUM_ORECS=<integer>: Log 2 of the number of metadata entries. Default 20.
- LOG_2_BLOOM_FILTER_BITS=<integer>: Log 2 of the write set Bloom filter size. Default 9.
- LOG_2_BLOCKS_SIZE=<integer>: Log 2 of the conflict unit block size. Default 3.
- MAX_TXNS=<integer>: Maximum number of static transactions in the programs. Used only for statistics. Default 64.
XL compiler
        
        The STM implementations support a low-level interface. The IBM XL
        C/C++ compiler for Transactional Memory available from IBM AlphaWorks
        (http://www.alphaworks.ibm.com/tech/xlcstm/) uses this
        interface. Therefore, STM libraries built from this code release can
        be linked with user programs using that compiler without need to
        instrument individual reads and writes inside transactions. Programs
        need to use a high-level interface as specified in
        http://www.alphaworks.ibm.com/tech/xlcstm/. Note that the IBM XL C/C++
        compiler for Transactional Memory runs on AIX systems only at this
        point, while the STM code runs on a variety of systems as noted below.
        
        
        It is possible for other compilers to use this STM code, if they use
        the same low level interface described below.
        
        Platforms
        The STM code runs on several platforms including AIX-PowerPC,
        Linux-PowerPC, and Linux-X86, and Linux-X86_64. The code may run on Solaris-SPARC
        but it has not been tested.
        Functions
        The low-level STM interface is defined in the file stm.h. 
        The main functions and macros are:
        
            - The macros STM_BEGIN and STM_END are to be used at the beginning and
            end of transactions. 
- stm_thr_init takes no arguments and returns a pointer to the thread's
            private transactional descriptor. This function should be called once
            before a thread starts using transactions.
- stm_thr_retire takes no arguments and has no return value. This
            function should be called once when a thread is no longer to use
            transactions. A thread may call stm_thr_init after it calls stm_thr_retire
            in order to resume using transactions.
- stm_desc takes no arguments and returns a pointer to the per-thread
            private transactional descriptor that is passed as an argument to most
            other STM functions as described below.  
- stm_read_* functions, e.g., stm_read_int, stm_read_float, etc. These
            functions take two arguments: a pointer to a location of the specified type
            and a pointer to the per-thread private transactional descriptor.  
- stm_write_* functions, e.g., stm_write_int, stm_write_double, etc.
            These functions take three arguments: a pointer to the target location, the
            value to be written, and a pointer to the per-thread private transactional
            descriptor.  
- stm_malloc, stm_free are similar to the standard malloc and free
            functions but take an additional argument, a pointer to the thread's
            transactional descriptor.  
Statistics
        The STM code can collect runtime statistics related to the inherent
        transactional characteristics of the program independent of the STM
        implementation, such as transaction sizes and frequencies, as well as the
        interaction of the program with the STM implementation specifics, such as Bloom
        filter matches, metadata locks acquired.
        In order to build an STM library that collects statistics use:
        
            make all STATS=on
        
        Note that the performance of the STM library is significantly lower when
        collecting statistics is enabled.
        Runtime statistics are categorized by static transactions. Each static
        transaction is identified by the file name and line number where it starts.
        Aggregate statistics are also generated.
        In order to generate statistics files, the program needs to call the
        function stm_stats_out(). The program may call this function multiple times.
        Note that if the function is called by a thread while other threads are
        actively using transactions, then the statistics may be inconsistent as
        statistics of other threads may change while taking a snapshot of the
        statistics. Typically, calls to stm_stats_out should be at stable points of the
        programs such as at the end where only the main thread is active. In such
        cases, the statistics should be consistent.
        The output files take the form
        stm_stats_tag<xx>_txn_<filename>_<lineno>.out for statistics
        per static transaction and stm_stats_tag<xx>_all.out for aggregate
        statistics. The tag indicates the number of times stm_stats_out has been
        called. The file name and line number indicate the associated static
        transaction. For example, on the second call to stm_stats_out, the runtime
        statistics for the static transaction starting at line 100 in the file foo.c
        will be included in the file stm_stats_tag02_foo.c_100.out.
        A list of statistics is in the file stats.h. The following are some of the
        main statistics collected by the STM code:
        
            - READ_ONLY_COMMITS: Number of committed transactions with no writes
- READ_WRITE_COMMITS: Number of committed transactions with writes
- TOTAL_COMMITS: Total number of successfully committed transactions
- TOTAL_RETRIES: Total number of retried to transactions
- AVG_RETRIES_PER_TXN: Average number retries per committed
            transaction
- READ_SET_SIZES: Total number of items in read sets of committed
            transactions excluding duplicates
- WRITE_SET_SIZES: Total number of items in write sets of committed
            transactions excluding duplicates
- AVG_READ_SET_SIZE: Average number of transactional reads to unique
            locations per transaction
- AVG_WRITE_SET_SIZE: Average number of transactional writes to unique
            locations per transaction
- PCT_DUPLICATE_WRITES: Percent of transactional writes that are to
            locations already written in the same transaction
- NUM_SILENT_WRITES: Number of writes with new value the same as the
            current value
- PCT_SILENT_WRITES: Percent of transactional writes with values the same
            as the current values
- DUPLICATE_WRITES: Number of writes to locations already written in the
            same transaction
- DUPLICATE_READS: Number of reads of locations already read in the same
            transaction
- PCT_DUPLICATE_READS: Percent of transactional reads that are to
            locations already read in the same transaction
- READ_AFTER_WRITE_MATCHES: Number of read after write of the same
            location in the same transaction
- PCT_READ_AFTER_WRITE: Percent of transactional reads that are to
            locations already written by the same transaction
- NUM_MALLOCS: Number of calls to malloc inside transactions
- NUM_FREES: Number of calls to free inside transactions
- NUM_FREE_PRIVATE: Number of calls to free of blocks allocated in the
            same transaction
- READ_LIST_MAX_SIZE: Maximum number of items in a transaction read
            list
- WRITE_LIST_MAX_SIZE: Maximum number of items in a transaction write
            list
- READ_SET_MAX_SIZE: Maximum number of unique items in a transaction read
            set
- WRITE_SET_MAX_SIZE: Maximum number of unique items in a transaction
            write set
- MAX_NESTING: Maximum level of nesting