Concurrency - Multithreading

Overview

Concurrency is the ability to run several programs or several parts of a program in parallel. If a time consuming task can be performed asynchronously or in parallel, this improve the throughput and the interactivity of the program.
A modern computer has several CPU’s or several cores within one CPU.  Taking advantage of these multi-cores can allow a high-volume application to perform.


Process and Thread

A process runs independently and isolated of other processes. It cannot directly access shared data in other processes. The resources of the process, e.g. memory and CPU time, are allocated to it via the operating system.
A thread is a so called lightweight process. It has its own call stack, but can access shared data of other threads in the same process. Every thread has its own memory cache. If a thread reads shared data it stores this data in its own memory cache. A thread can re-read the shared data.
A Java application runs by default in one process. Within a Java application you work with several threads to achieve parallel processing or asynchronous behavior.

Concurrency issues

Threads have their own call stack, but can also access shared data. Therefore you have two basic problems, visibility and access problems.

A visibility problem occurs if thread A reads shared data which is later changed by thread B and thread A is unaware of this change.
An access problem can occur if several thread access and change the same shared data at the same time.


Immutable Objects

An object is considered immutable if its state cannot change after it is constructed.
Immutable objects are particularly useful in concurrent applications. Since they cannot change state, they cannot be corrupted by thread interference or observed in an inconsistent state.

The simplest way to avoid problems with concurrency is to share only immutable data between threads. Immutable data is data which cannot changed.

To make a class immutable, the Class should satisfy the following:
  • all fields are final
  • the class is declared as final
  • the this reference is not allowed to escape during construction
  • Any fields which refer to mutable data objects are
  1. private
  2. have no setter method
  3. they are never directly returned of otherwise exposed to a caller
  4. if they are changed internally in the class this change is not visible and has no effect outside of the class.
An immutable class may have some mutable data which it uses to manages its state but from the outside this class nor any attribute of this class can get changed.

Some famous immutable classes in the Standard API
  • Java.lang.String
  • The wrapper classes for the primitive types: java.lang.Integer, java.lang.Byte, java.lang.Character, java.lang.Short, java.lang.Boolean, java.lang.Long, java.lang.Double
  • Most enum classes are immutable, but this in fact depends on the concrete case.
  • java.math.BigInteger and java.math.BigDecimal
  • java.io.File

HashMap, Hastable and ConcurrentHashMap

HashTable and HashMap both stores data with key value pairs.
Hash functions accept a key and return an output unique only to that specific key. It maps an object (of some type) to an integer.
This is known as hashing, which is the concept that an input and an output have a one-to-one correspondence to map information.

Hash functions return a unique address in memory for that data.
They are designed to optimize searching, insertion, and deletion.
Hash collisions are when a hash function returns the same output for two distinct inputs.
All hash functions have this problem.
This is often accommodated for by having the hash tables be very large.

Hashes are important for associative arrays and database indexing.

Maps are naturally one of the most widely style of Java collection.
HashMap is not a thread-safe implementation, while Hashtable does provide thread-safety by synchronizing operations.
Even though Hashtable is thread safe, it is not very efficient. Another fully synchronized Map, Collections.synchronizedMap, does not exhibit great efficiency either.
If we want thread-safety with high throughput under high concurrency, these implementations aren’t the way to go.

To solve the problem, the Java Collections Framework introduced ConcurrentMap in Java 1.5.
ConcurrentMap is an extension of the Map interface. It provides thread-safety and memory-consistent atomic operations.
ConcurrentMap guarantees memory consistency on key/value operations in a multi-threading environment.
ConcurrentHashMap is the ready to use ConcurrentMap implementation.

Blocking Concurrency Algorithms

A blocking concurrency algorithm is an algorithm which either:
  • Performs the action requested by the thread
  • or
  • Blocks the thread until the action can be performed safely
Many types of algorithms and concurrent data structures are blocking. For instance, the different implementations of the java.util.concurrent.BlockingQueue interface are all blocking data structures. If a thread attempts to insert an element into a BlockingQueue and the queue does not have space, the inserting thread is blocked (suspended) until the BlockingQueue has space for the new element.

Non-blocking Concurrency Algorithms

Non-blocking applications are written in a way that threads never block – whenever a thread would have to block on I/O (e.g. reading/writing from/to a socket), it instead gets notified when new data is available.
A non-blocking concurrency algorithm is an algorithm which either:
  • Performs the action requested by the thread
  • or
  • Notifies the requesting thread that the action could not be performed.
Java contains several non-blocking data structures too. The AtomicBoolean, AtomicInteger, AtomicLong and AtomicReference are all examples of non-blocking data structures. These Classes have the method compareAndSet() which atomically sets the value to the given updated value if the current value == the expected value.

public class MyLock {
    private AtomicBoolean locked = new AtomicBoolean(false);
    private AtomicInteger sharedInt = new AtomicInteger(2);

    public boolean lock() {
        return locked.compareAndSet(false, true);
    }
 
    public void setSharedInt (int expectedValue, int pNewValue) {
    sharedInt.compareAndSet(expectedValue, pNewValue);
    }
}


ConcurrentHashMap is a data structure allowing concurrent access to the map. Part of the map called Segment (internal data structure) is only getting locked while adding or updating the map. So ConcurrentHashMap allows concurrent threads to read the value without locking at all. This data structure was introduced to improve performance.

Volatile variables

Java volatile variables are variables that are always read directly from main memory. When a new value is assigned to a volatile variable the value is always written immediately to main memory. This guarantees that the latest value of a volatile variable is always visible to other threads running on other CPUs.

 

 

No comments:

Post a Comment