Physical Storage Structures

Physical storage structures are accessible only by an operating system. In this article, I will familiarize you with how data are stored and retrieved from the operating system (physical) perspective. If you are interested in the logical perspective, please read the previous article “Logical Storage Structures“.

Physical storage structures

Physical storage structures are a set of physical files storing data. I touched this topic already in the Architecture category in the article database files. Let’s recap what are the main 4  files and further in the article I will deep dive into every single one of them:

  • Data Files – contain tables, indexes, …
  • Temporary Data Files – store data in the temporary tablespace; cannot be read by other applications (belong to Data File category)
  • Control Files – have information about redo log files (names, locations), store log sequence number, checkpoint information, …
  • Online Redo Log Files – store changes made to data

All these files are managed by the “Database Instance“.

How database files are managed and stored

Not only the Database Instance is involved in managing the data files but also some other mechanisms implemented by Oracle. Oracle usually uses a combination of them (not only one).

  • Oracle ASM (Automatic Storage Management) – Oracle ASM smartly manages the disk space, data distribution, and optimizes the performance for I/O
  • Operating System File System – the most common is LVM (Logical Volume Manager)
  • Cluster File System
  • Raw Files – feature with direct I/O support (therefore used for large buffers; bypassing the buffer cache)

Data Files

All data are stored in data files and every database system has to have at least one data file (however, generally it is many).  Just to recap what was mentioned in the article about Logical Storage Structures – data files contain tablespaces, in tablespaces are segments, where extents are stored and finally, the smallest pieces are data blocks.

Permanent and Temporary data files

All files persistent (=permanent files) to a database are stored in permanent tablespaces. Similarly, temporary data files are stored in temporary tablespaces. Temporary data files are existing only for the duration of a session. They have a few characteristics (compared to permanent data files) such as:

  • they are always created with NOLOGGING option (they cannot be restored in case of database instance crashes
  • they are always in read-write mode (you can never make them read-only)

You can check some additional details about temporary data files in the following views

SELECT * FROM dba_temp_files;
-- or 
SELECT * FROM v$tempfile;

There are two “modes” of data files (both permanent and temporary).

  • Online
    • are accessible
    • an application can read and write data from online data files
    • you cannot physically modify the files (rename, move, …)
  • Offline
    • are not accessible
    • allows you to do operations with files (corruption investigation, renaming, backing up, …)

Data File Structure

Data files are divided into multiple parts. Each of the parts serves it’s own purpose when it comes to data storing.

  • Data file header
    • stores general and key information about data inside such as
      • System Change Number (SCN)
      • Size
      • Unique identifier of the data file in a database
      • Unique identifier of the data file in a tablespace
  • Used block
  • Free block
    • never used – free and never used
    • previously used – but still free to be reused

Control Files

Oracle database requires control files to operate the database smoothly. In spite of an option to have multiple copies of those files, each database can have only one control file.

Control files help Oracle to locate data files, redo log files, store metadata information, database name and DBID (database ID), tablespaces, … and many more.

As I said earlier, there is an option to have multiple instances for control files and that is to avoid a single point of failure; clustering. If a control file is damaged or missing, the database instance cannot start or even worse – it will crash. However, having multiple copies prevents this situation.

Oracle reads and writes control files directly from PGA (Program Global Area), unlike with other data blocks that are in SGA (Shared Global Area).

Online Redo Log Files

These are the most important files in the database, especially when it comes to recovery. Oracle database has to include at least two (one being used and the other one being archived), but usually has multiple, files. It might happen that Oracle instance fails and to prevent data loss Oracle maintains redo log files. These files contain every single transaction executed in the database; even not committed. Redo log files are used exclusively for recovery only, however they can be browser as well manually in order to check the activity inside the database. To browse it, you can use the tool called Oracle LogMiner.

Oracle can operate in two modes:

  • Single Instance – only one redo thread is accessed
  • Multiple Instance (Oracle RAC – Real Application Cluster) – multiple instances are concurrently accessing a database with its own redo thread assigned

Redo Log switching

I briefly discussed how the redo log operates in the article Redo Log. I will now elaborate on how redo log stores data, switches between files and archives.

As I said, there is always one active redo log file (=current) and the rest of the files are being archived.  To write transaction details into the redo log file Oracle uses a process called Log Writer (LGWR). Every time the current redo log file is full (there might be other situations when Oracle stops writing to one file and needs to continue with another; the file being full is the most common scenario though) and Oracle needs to switch to another one – we called this situation “log switch“.  You can also force switching to occur at regular intervals.

See the picture below to understand how the redo switch is being done.

A full redo log file can be reused if the database is in NOARCHIVELOG mode. It is available right after the CKPT (checkpoint process) finishes the writing, called checkpointing, by the process DBWR (database writer).

If, on the other hand, the database is in archive log mode, the redo log file is available only after it is checkpointed and archived.

Usually, the database is configured to operate with multiple copies of redo log files; clustering and high availability reason again. The log writer saves transaction details into multiple redo log files at the same time and ideally distributed across multiple disks.

Archived redo log files

Archived log files are offline files and meant to serve as a recovery or/and updating a standby database.

Structure of redo log files

Redo logs contain all necessary data to identify and possibly restore all transactions successfully. They include:

  • SCN number
  • The timestamp of the change
  • Transaction ID
  • SCN and timestamp of the committed transaction (if ever committed)
  • Operation type
  • Modified data segment name and type