Thursday, February 25, 2016

Query Execution of PostgreSQL

1. A connection from an application program to the PostgreSQL server has to be established.
The application program transmits a query to the server and waits to receive the results sent back by the server.

2. The parser stage checks the query transmitted by the application program for correct syntax and creates a query tree.

3. The rewrite system takes the query tree created by the parser stage and looks for any rules (stored in the system catalogs) to apply to the query tree.
It performs the transformations given in the rule bodies.

(Syntactic and Semantic Checks, Statement Transformation and View Merging).

4. The planner/optimizer takes the (rewritten) query tree and creates a query plan that will be the input to the executor.

Optimizer creates all possible paths leading to the same result.
For example if there is an index on a relation to be scanned, there are two paths for the scan.
One possibility is a simple sequential scan and the other possibility is to use the index.
Next the cost for the execution of each path is estimated and the cheapest path is chosen.
The cheapest path is expanded into a complete plan that the executor can use.

5. The executor recursively steps through the plan tree and retrieves rows in the way represented by the plan.
The executor makes use of the storage system while scanning relations, performs sorts and joins, evaluates qualifications and finally hands back the rows derived.


Tuesday, February 23, 2016

Utility Processes



Background writer − Writes dirty data blocks to disk 
WAL writer − Flushes write-ahead log to disk 
Checkpointer process − Automatically performs a checkpoint based on config parameters 

Autovacuum −It performs automatic operation which includes Auto vacuuming and Analyze. In other words, helps to recover free space for reuse 

Logging collector (log writer) − Routes log messages to syslog, eventlog, or log files 
Stats collector − Collects usage statistics by relation and block

Archiver − Archives write-ahead log files from WAL segments area to the Archived WAL area

Shared Buffers

Shared Buffers is the area needed for the Query Processing.

Shared Buffers are needed to ensure that the server processes handling user queries, never need to wait for a write to occur.

Buffer being modified is called a Dirty Buffer.

Any Query connecting to the Database requires a buffer area to process the User Request.

Shared Buffers is the memory area that gives the buffer place to Queries connecting to the Database.

Data from the Database’s Data files is fetched to the Memory.

Postmaster & Shared Memory



From the diagram, we can see Postmaster.

This is the process which starts when you start your database instance. When a postmaster starts the shared memory for the server is allocated to the Postmaster.

It Listens for any and all calls to the database.  When it receives a call from a client, it creates a back-end process (postgres server) to match it, using 1-1 correspondence.
Once the process is created, it links the client and postgres process so that they no longer have to communicate through the postmaster

Also, you can see that the shared memory contains: Shared Buffers, WAL Buffers and Process Array.

Shared buffers are used for reading and writing of the data from the data files.
When you read or write the content to a table and you commit the same, it goes to the data files. It all happens thru the shared buffers area.

Similarly, the transactions will go thru the WAL buffers area. WAL stands for write ahead log also called as transaction log in Postgres.

So when you write it goes to the WAL buffers and then goes to the WAL segments.
Process array stores the data of the different processes and the activity going thru these processes.

Monday, February 22, 2016

Architecture

  
 Image Courtesy : EDB 





We usually find several images for Postgres Architecture on google but, I feel this image is comfortable than any other.
I got this from a PPT shared by EDB.

Postgres Intro



     Postgres follows Process based architecture.

A PostgreSQL session consists of server process which manages the database files, accepts the connections to the database from the client application, and performs database actions on behalf of the clients. 

In other words, the server itself is a process which is running on a database server machine and, each client which connects with the server is a separate process. 

The server is also known as a Postmaster Process which also acts as a supervisor process for all the processes running on the server.

Whenever you start a server and start your database instance it will start Postmaster and several utility processes and memory.

Whenever a user connects with a server, it is a separate process so this separate process will take care of all the user requests. This separate process is started by the Postmaster whenever a user requests for a connection.