System Programming and Operating System

Topic 1: What is System Programming?

What is a System?

A system is a collection of components working together to achieve a goal.

Example: College is a system.

It consists of various components like departments, classrooms, faculties, and students.


What is Programming?

Programming is the art of designing and implementing programs to perform specific tasks.

In a college system, what is a program?

A lecture can be considered as a program because it has input and output.

Input: The information that the teacher is delivering.

Output: The knowledge that the student has received.


What is Software?

Software is a collection of many programs designed to perform specific tasks or operations.

Types of Software:

  • System Software: Assists in the effective execution of general user programs and manages hardware.
  • Examples: Operating System, Assembler, Compiler

  • Application Software: Developed for specific goals or user tasks.
  • Examples: Media Player, Adobe Reader


What is a System Program?

System programs are programs required for the effective execution of general user programs on a computer system.

System programming is therefore the art of designing and implementing system programs.

Topic 2: Need of System Programming

System programming is essential for the smooth functioning of any computer system.

What is the Need of System Programming?

System programming is required to design and develop software that manages and controls the hardware components of a computer system. It creates a bridge between the hardware and application software, enabling the entire system to function efficiently and reliably.

Reasons Why System Programming is Needed:

  • Hardware Control:
    • System programs directly interact with the hardware.
    • They enable effective communication between hardware components and the user.
  • Efficient Resource Management:
    • System software manages CPU, memory, I/O devices, and other resources.
    • Ensures fair and optimal usage of system resources.
  • Platform for Application Software:
    • Provides the necessary environment for application programs to run.
    • Without system software (like OS), application software cannot function.
  • Automation of Tasks:
    • Automates hardware-related tasks such as booting, loading programs, and memory allocation.
  • Security and Protection:
    • Helps implement security features to protect system resources and user data from unauthorized access.
  • Performance Optimization:
    • Allows efficient execution of programs by optimizing memory, CPU scheduling, and I/O operations.
  • Error Detection and Handling:
    • Provides mechanisms for detecting, reporting, and handling system-level errors.
  • Support for Development Tools:
    • Compilers, assemblers, loaders, linkers, and debuggers are all system programs that support application development.

Examples of System Programs:

  • Operating Systems (Windows, Linux, macOS)
  • Compilers
  • Assemblers
  • Loaders
  • Linkers
  • Device Drivers

Topic 3: Software Hierarchy

Topic 4: Difference between Application Software and System Software


Feature | Application Software | System Software
Definition | Application software is designed to perform specific tasks for the user, such as document editing or media playback. | System software is developed to manage and control the computer system's hardware and provide a platform for running application software.
Purpose | Helps users accomplish a particular task like creating a presentation or editing a video. | Operates the computer hardware and provides essential services for application software to function.
User Interaction | Directly used by end-users for personal or professional tasks. | Runs in the background without direct user interaction.
Dependency | Depends on system software to run. | Can run independently and supports application software.
Examples | MS Word, Adobe Reader, VLC Media Player. | Windows OS, Linux, Compiler, Assembler.
Installation | Installed as per the user’s needs based on tasks to perform. | Usually comes pre-installed with the system.
Execution | Runs only when the user manually launches it. | Starts automatically during system boot and runs in the background.

Topic 5: Evolution of components of Systems Programming:

System programming refers to the development of software that provides services to the computer hardware and application software. As computing needs evolved, various components of system programming were introduced to make systems more efficient, modular, secure, and user-friendly.

1. Text Editors

Purpose: Used to write and edit source code (program instructions).

Evolution:

  • First-generation editors like ed, ex in Unix were command-line based.
  • Then came full-screen editors like vi, nano, Notepad.
  • Now we have smart editors and IDEs (VS Code, IntelliJ, Eclipse) that support syntax highlighting, code completion, linting, version control integration, and more.

Importance: Acts as the programmer's workspace.


2. Assembler

Purpose: Converts assembly language (low-level, human-readable) into machine code (binary instructions understood by CPU).

Pass Structure:

  • Pass 1: Builds symbol table, handles labels, and assigns addresses.
  • Pass 2: Converts mnemonics to binary, resolves addresses using symbol/literal tables.

Evolution:

  • Early assemblers were simple.
  • Later assemblers included features like macros, conditional assembly, and optimization.

Examples: NASM, MASM, TASM.


3. Macro Processor

Purpose: Allows code reuse by enabling macro definitions — blocks of code that can be reused with arguments.

Evolution:

  • Early assemblers supported inline macros.
  • Later languages added pre-processors (like C preprocessor with #define, #include, etc.).

Benefits:

  • Reduces redundancy.
  • Makes code modular and easier to maintain.
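
Macro expansion can be illustrated with a minimal sketch. The macro table, the INCR macro, and the &REG/&VAR parameter notation below are assumptions made for illustration; they are not the syntax of any particular assembler.

```python
# Hypothetical macro table: name -> (formal parameters, body lines).
MACROS = {
    "INCR": (["&REG", "&VAR"], [
        "MOVER &REG, &VAR",
        "ADD &REG, ='1'",
        "MOVEM &REG, &VAR",
    ]),
}

def expand(line):
    """Return the list of source lines that replace one input line."""
    parts = line.split(None, 1)
    if not parts or parts[0] not in MACROS:
        return [line]                       # not a macro call: keep unchanged
    params, body = MACROS[parts[0]]
    args = [a.strip() for a in parts[1].split(",")] if len(parts) > 1 else []
    if len(args) != len(params):
        raise ValueError(f"{parts[0]} expects {len(params)} arguments")
    expanded = []
    for body_line in body:
        # Substitute each formal parameter with the actual argument.
        for formal, actual in zip(params, args):
            body_line = body_line.replace(formal, actual)
        expanded.append(body_line)
    return expanded
```

Calling expand("INCR AREG, COUNT") yields three ordinary statements, so the programmer writes the increment logic once and reuses it with different arguments.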

4. Compiler

Purpose: Translates high-level language (like C, Java) into machine code or, as with Java, into bytecode for a virtual machine.

Phases:

  • Lexical analysis → Syntax analysis → Semantic analysis → Optimization → Code generation (linking is a separate step performed by the linker)

Evolution:

  • Early compilers were monolithic and platform-dependent.
  • Modern compilers are modular, support cross-compilation, and generate optimized code for multiple architectures.

Examples: GCC (GNU Compiler Collection), Clang, Java Compiler (javac).


5. Interpreter

Purpose: Directly executes source code line-by-line without compiling.

Comparison:

  • Interpreted programs usually run slower than compiled ones, but interpreters are ideal for testing and rapid development.

Used in:

  • Scripting languages like Python, JavaScript.
  • Education and prototyping where simplicity is more important than performance.

Modern Interpreters: Often use Just-In-Time (JIT) compilation for speed (e.g., V8 engine for JavaScript).


6. Loader

Purpose: Loads compiled machine code into memory and starts execution.

Types:

  • Absolute Loader: Loads code at fixed memory location.
  • Relocating Loader: Adjusts memory addresses during load.
  • Dynamic Loader: Loads shared libraries during program execution.

Use: When you click on an app icon, the loader gets invoked in the background.
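
The idea behind a relocating loader can be sketched as follows. This is a simplified illustration: the (opcode, register, address) tuple format and the set of relocatable instruction indexes are assumptions, not the layout of any real object file.

```python
def relocate(code, relocatable, load_address, assumed_origin=0):
    """Return a copy of `code` adjusted for loading at `load_address`.

    code           -- list of (opcode, register, address) tuples
    relocatable    -- indexes of instructions whose address field must move
    assumed_origin -- address the program was originally translated for
    """
    offset = load_address - assumed_origin
    loaded = []
    for i, (op, reg, addr) in enumerate(code):
        if i in relocatable:
            addr += offset                  # shift address-sensitive fields only
        loaded.append((op, reg, addr))
    return loaded
```

An absolute loader would skip the adjustment entirely and copy the code to the fixed address it was translated for.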


7. Linker

Purpose: Links multiple object files (compiled units) and libraries to form a single executable.

Responsibilities:

  • Address resolution.
  • Handling external references (like function calls across files).
  • Static vs. Dynamic linking.

Examples: ld (Linux), MSVC Linker (Windows).


8. Debugger

Purpose: Helps identify and fix errors in source code.

Features:

  • Breakpoints – pause execution at specific lines.
  • Step-by-step execution.
  • Variable inspection, memory dumps.

Evolution:

  • Started as command-line tools.
  • Modern debuggers are GUI-based with advanced visualization (e.g., Eclipse, Visual Studio).

9. Device Drivers

Purpose: Acts as interface between hardware and operating system/application software.

Types:

  • Kernel-mode drivers: For low-level hardware operations.
  • User-mode drivers: For peripheral devices like printers and cameras.

Evolution:

  • Initially written in assembly.
  • Now written in C/C++ with OS-specific APIs.

10. Operating System

Purpose: Manages computer hardware and software resources and provides common services for programs.

Functions:

  • Process management
  • Memory management
  • File system management
  • Device and I/O control
  • Security and user interface

Evolution:

  • Batch systems → Time-sharing systems → Multitasking OS
  • Modern OS: Windows, Linux, macOS, Android

Topic 6: Elements of Assembly Language Statements

Assembly Language Overview

Assembly language is a low-level programming language that provides a symbolic and human-readable representation of a computer's machine code.

Key Characteristics:

Low-Level Language

  • Assembly language is very close to machine (binary) language but easier to read and write.
  • It directly controls hardware and performs operations at the instruction level.

Machine Dependent

  • Assembly language is specific to a processor architecture (e.g., Intel 8086, ARM, MIPS).
  • A program written for one CPU won’t work on another without modification.

Human-Friendly Compared to Binary

  • Writing programs in binary (like 10100011) is complex and error-prone.
  • Assembly uses mnemonics like MOVEM, ADD, MOVER which are easier to remember and understand.

This explanation is based on the hypothetical assembly language presented in the textbook “Systems Programming” by D.M. Dhamdhere. This simplified language model is used for understanding the internal working of assemblers and system software.


Hypothetical Assembly Language Overview

This assembly language supports a limited set of CPU registers and instructions. It is commonly used in system programming studies (e.g., in the book by D.M. Dhamdhere).

Supported CPU Registers

  • AREG
  • BREG
  • CREG
  • DREG

Supported Operations (Instructions)

  1. STOP
  2. ADD
  3. SUB
  4. MULT
  5. MOVER
  6. MOVEM
  7. COMP
  8. BC
  9. DIV
  10. READ
  11. PRINT

Instruction Format Rules

  • First operand is a CPU register (e.g., AREG); for BC, a condition code (e.g., LT or ANY) takes the register's place.
  • Second operand is always a memory operand (e.g., a variable or label).
  • READ and PRINT use only a memory operand (no register).
  • STOP has no operands.

Machine Opcode Table (MOT)

The Machine Opcode Table (MOT) holds details for each instruction:

  • Mnemonic form of the opcode (e.g., ADD, SUB)
  • Machine code (numeric value) associated with the opcode
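
A MOT can be modelled as a simple lookup table. In the sketch below the numeric codes are an assumption: they are assigned 00 to 10 in the order the instructions are listed above.

```python
# Assumed numbering: 00-10 in the order the instructions are listed.
MOT = {
    "STOP": 0, "ADD": 1, "SUB": 2, "MULT": 3, "MOVER": 4, "MOVEM": 5,
    "COMP": 6, "BC": 7, "DIV": 8, "READ": 9, "PRINT": 10,
}

def lookup_opcode(mnemonic):
    """Return the two-digit machine code for a mnemonic, or None if unknown."""
    code = MOT.get(mnemonic.upper())
    return None if code is None else f"{code:02d}"
```

A None result tells the assembler that the word is not a machine instruction, so it must be a label, a declaration, or a directive.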

Assembly language programming Terms:

1. Location Counter (LC)

A pointer that tracks the memory address of the next instruction. The assembler uses the LC to assign addresses to instructions and data elements in the source program.

2. Literals

Literals are constant values directly used in instructions.

Example: MOVER AREG, =5 → Here, =5 is a literal constant.

3. Symbols

Symbols are user-defined names representing variables, labels, or memory locations. They enhance code readability and are stored in the Symbol Table during assembly.

Example: X DC 10 (Variable)

4. Procedures

Procedures, also called subroutines or functions, are reusable blocks of code that perform specific tasks. They promote modular programming and reduce redundancy.

Example:
CALL ADDITION

Topic 7: Assembly Language statements

Assembly Language Statements (as per Dhamdhere)

Assembly language statements in a typical hypothetical system (as described by Dhamdhere) are generally classified into the following categories:

1. Imperative Statements

These are instructions that generate machine code and are directly executed by the processor.

  • Examples: ADD AREG, X; MOVER BREG, Y; DIV CREG, Z
  • They perform data movement, arithmetic, logical, and control operations.

2. Declarative Statements

Used to define constants, reserve memory space, and declare variables. These do not translate to machine instructions.

  • Examples: X DC 5, Y DS 1
  • DC: Define Constant (e.g., DC 5 reserves memory and initializes with 5)
  • DS: Define Storage (e.g., DS 1 reserves one word of memory)

3. Assembler Directives

These instructions are for the assembler, not the CPU. They control the assembly process and symbol handling.

  • Examples: START 100, END, ORIGIN 205, EQU, LTORG
  • START: Specifies the starting address of the program
  • END: Marks the end of the source code
  • ORIGIN: Changes the value of the location counter
  • EQU: Equates a symbol with a value
  • LTORG: Assigns addresses to literal constants

General Format of Assembly Language Statement

LABEL    OPCODE    OPERAND1, OPERAND2    ; Comment

Example:

LOOP     ADD       AREG, NUM             ; Add NUM to AREG
         MOVEM     AREG, RESULT          ; Move result to memory

Notes:

  • The assembler parses each statement into fields: label, opcode, operands, and comment.
  • Statements are stored and translated into machine code during the pass(es) of the assembler.
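
Splitting a statement into its fields might look like the sketch below. It assumes that a first word which is not a known opcode or directive is a label; KNOWN_OPCODES and parse_statement are illustrative names, not part of any real assembler.

```python
KNOWN_OPCODES = {
    "STOP", "ADD", "SUB", "MULT", "MOVER", "MOVEM", "COMP", "BC", "DIV",
    "READ", "PRINT", "DC", "DS", "START", "END", "ORIGIN", "EQU", "LTORG",
}

def _pop_word(text):
    """Split off the first whitespace-delimited word of `text`."""
    parts = text.split(None, 1)
    if not parts:
        return "", ""
    return parts[0], parts[1] if len(parts) > 1 else ""

def parse_statement(line):
    """Split one source line into (label, opcode, operands, comment)."""
    code, _, comment = line.partition(";")  # everything after ';' is comment
    word, rest = _pop_word(code)
    label = None
    if word and word.upper() not in KNOWN_OPCODES:
        label = word                        # first field is not an opcode
        word, rest = _pop_word(rest)
    operands = [p.strip() for p in rest.split(",") if p.strip()]
    return label, word.upper() or None, operands, comment.strip()
```

For the first example line above, parse_statement returns the label LOOP, the opcode ADD, the operands AREG and NUM, and the comment text.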

Topic 8: Benefits of Assembly Language

Benefits of Assembly Language Statements

  1. Close to Hardware: Assembly language gives precise control over hardware components, making it ideal for systems programming.
  2. Faster Execution: Carefully written assembly can execute faster than equivalent high-level code because there is minimal abstraction between the program and the hardware.
  3. Optimized Performance: Developers can fine-tune code for speed, memory usage, and performance, especially in critical system tasks.
  4. Better Understanding of System Architecture: Working with registers, memory addresses, and opcodes helps understand how a system functions internally.
  5. Reusability with Procedures: Assembly language allows modular programming through procedures (subroutines), promoting code reuse and structure.
  6. Simplified Debugging: The symbolic nature of labels and mnemonics in assembly statements makes debugging easier compared to binary machine code.
  7. Essential for OS and Embedded Development: Critical parts of operating systems, device drivers, and embedded systems rely on assembly for direct hardware interaction.
  8. Precise Resource Management: Enables direct allocation and management of system resources like CPU registers and memory.
  9. Structured through Statement Types: Use of Imperative, Declarative, and Directive statements helps organize and manage program logic clearly.

Topic 9: Pass Structure of Assembler

Pass Structure of an Assembler

An assembler typically operates in two passes to convert assembly language into machine code.

Pass 1: Analysis Phase

Pass 1 of the assembler performs the following tasks:

  • Assigns Addresses:
    Maintains the Location Counter (LC) to assign memory addresses to each instruction and data item.
  • Builds Symbol Table:
    Collects all labels and variable names with their corresponding addresses.
    Adds them to the Symbol Table (SYMTAB).
  • Processes Assembler Directives:
    Handles instructions like START, END, ORIGIN, EQU, and LTORG.
  • Handles Literals:
    Detects and stores literals like =5 or ='A' in the Literal Table (LITTAB).
  • Generates Intermediate Code (IC):
    Produces an intermediate representation of the source code for Pass 2 to use.

Pass 2: Synthesis Phase

Pass 2 performs actual machine code generation:

  • Generates Machine Code:
    Converts intermediate code from Pass 1 into actual object code (binary or hex).
  • Uses Symbol & Literal Tables:
    Looks up operands in the Symbol Table and Literal Table to get memory addresses.
  • Handles Address Resolution:
    Calculates and resolves final memory addresses using data from tables.
  • Generates Final Object Program:
    Produces the object file (often with a .obj extension) that a linker or loader then prepares for execution.

Topic 10: Design of Two Pass Assembler

Design of Two-Pass Assembler

A Two-Pass Assembler is used to convert assembly language into machine code in two distinct phases:

Pass 1: Analysis Phase

  • Initialize: Location Counter (LC), Symbol Table (SYMTAB), Literal Table (LITTAB), and Intermediate Code (IC) buffer are initialized.
  • Assign Addresses: Each instruction and data label is assigned a memory address using the LC.
  • Build Symbol Table: Labels and symbols with their addresses are added to SYMTAB.
  • Detect Literals: All literals (e.g., ='5') are stored in LITTAB.
  • Process Directives: Handles pseudo-instructions like START, END, ORIGIN, LTORG, EQU.
  • Generate Intermediate Code: A simplified code version with symbolic references for further processing in Pass 2.

Pass 2: Synthesis Phase

  • Read Intermediate Code: Parse IC line by line for translation.
  • Resolve Symbols: Use SYMTAB and LITTAB to get actual addresses for operands.
  • Translate Opcodes: Use Machine Opcode Table (MOT) to convert symbolic opcodes into binary/machine code.
  • Generate Object Code: Write the final machine code to an object file (e.g., .obj).
  • Handle Address Modifications: For ORIGIN/EQU and other address adjustments.

Data Structures Used

  • SYMTAB: Stores symbols and corresponding memory locations.
  • LITTAB: Stores literals and their resolved addresses.
  • MOT (Machine Opcode Table): Contains symbolic opcodes and their binary equivalents.
  • POT (Pseudo Opcode Table): Stores directives like START, END, etc.

Advantages of Two-Pass Assembler

  • Efficient handling of forward references.
  • Structured code generation.
  • Modular and easy to implement.
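
Why two passes handle forward references can be seen in a deliberately small sketch. It assumes every statement occupies one word and takes at most one symbolic operand; the tuple format and function name are illustrative only.

```python
def two_pass(program, start=0):
    """program: list of (label, opcode, operand_symbol) tuples, one word each."""
    # Pass 1: record the address of every label.
    symtab = {}
    for offset, (label, _, _) in enumerate(program):
        if label:
            symtab[label] = start + offset
    # Pass 2: every symbol is now known, even one defined after its use.
    resolved = []
    for offset, (_, opcode, operand) in enumerate(program):
        address = symtab[operand] if operand else None
        resolved.append((start + offset, opcode, address))
    return symtab, resolved
```

In a program where MOVER refers to X before X is declared, a single pass would not yet know the address of X when it reaches MOVER; the second pass finds it in the symbol table.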

Topic 11: Processing of Declaration Statements

In assembly language, declaration statements are used to define and reserve memory for data elements. These statements are processed during Pass 1 of the assembler.

Types of Declaration Statements

  • DS (Define Storage): Reserves memory without initialization.
  • DC (Define Constant): Reserves memory and initializes it with a constant value.

Steps Involved in Processing

  1. Location Counter Update: The assembler uses the Location Counter (LC) to track where memory should be reserved.
  2. Symbol Table Entry: The label or variable is added to the Symbol Table (SYMTAB) with its assigned address.
  3. Size Determination: The size of storage to be reserved is computed based on the declaration type and data type.
  4. IC Generation: Intermediate Code (IC) entry is created for each declaration statement for use in Pass 2.

Example

  VALUE   DS    1
  NUMBER  DC    '5'
  

Explanation:

  • VALUE DS 1 → Reserves 1 word of memory; no initialization.
  • NUMBER DC '5' → Reserves memory and initializes it with 5.

Assembler Tables Used

  • Symbol Table (SYMTAB): Stores label names and corresponding addresses.
  • Literal Table (if literal is used): Stores literal values and addresses (not applicable here directly).

Final Notes

Declaration statements are critical for memory management and must be processed correctly during Pass 1 to ensure proper allocation and code generation in Pass 2.
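
The Pass 1 handling of DS and DC can be sketched as below. The sketch assumes a constant occupies one word and leaves out intermediate code generation; process_declaration is a hypothetical helper name.

```python
def process_declaration(label, opcode, operand, lc, symtab):
    """Handle one DS/DC statement in Pass 1 and return the updated LC."""
    symtab[label] = lc                      # the label names the current address
    if opcode == "DS":
        size = int(operand)                 # DS n reserves n words
    elif opcode == "DC":
        size = 1                            # assumption: a constant occupies one word
    else:
        raise ValueError(f"not a declaration: {opcode}")
    return lc + size
```

Starting from an assumed LC of 100, processing VALUE DS 1 and then NUMBER DC '5' places VALUE at 100 and NUMBER at 101, leaving the LC at 102.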

Topic 12: Processing of Assembler Directives

Assembler directives are instructions to the assembler itself. They do not generate machine code but guide the assembler during program translation. These are processed during Pass 1 of the assembler.

Common Assembler Directives

  • START: Specifies the starting address of the program.
  • END: Marks the end of the source code.
  • ORIGIN: Changes the location counter to a specified address.
  • EQU: Equates a label to a constant value or another label.
  • LTORG: Directs the assembler to allocate memory for literals at that point.

Steps Involved in Processing

  1. Update Location Counter: Directives like START and ORIGIN modify the value of the LC (Location Counter).
  2. Symbol Table Updates: EQU directives update the Symbol Table with the computed value.
  3. Literal Table Allocation: LTORG assigns addresses to literals collected so far.
  4. Directive Execution: Instructions are interpreted and executed internally without generating object code.

Example

  START 100
  A      EQU   5
  B      DS    1
  C      DC    '2'
  LTORG
  END
  

Explanation:

  • START 100 → Location counter starts at 100.
  • A EQU 5 → Symbol A is assigned constant value 5.
  • LTORG → Assigns memory addresses to any literals found before this directive.

Assembler Tables Used

  • Symbol Table (SYMTAB): Stores label names and their addresses/values.
  • Literal Table (LITTAB): Stores literal constants and their assigned addresses.

Conclusion

Assembler directives help manage memory layout, symbol resolution, and literal storage efficiently. Though they do not produce machine code, they are essential for correct assembly of the program.
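
How directives affect the LC and the symbol table can be sketched as follows. This is a simplification: operands are assumed to be either a decimal constant or an already defined symbol, and literal allocation for LTORG and END is left out. The function names are illustrative.

```python
def evaluate(operand, symtab):
    """Operand is either a decimal constant or a symbol already in SYMTAB."""
    return int(operand) if operand.isdigit() else symtab[operand]

def process_directive(label, opcode, operand, lc, symtab):
    """Apply one directive in Pass 1 and return the (possibly changed) LC."""
    if opcode in ("START", "ORIGIN"):
        return evaluate(operand, symtab)    # both set the LC to a new value
    if opcode == "EQU":
        symtab[label] = evaluate(operand, symtab)
        return lc                           # EQU reserves no memory
    if opcode in ("LTORG", "END"):
        return lc                           # literal allocation omitted in this sketch
    raise ValueError(f"not a directive: {opcode}")
```

None of the branches emits object code; each one only changes the assembler's own bookkeeping.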

Topic 13: Processing of Imperative Statements

Imperative statements in assembly language are actual executable instructions. These are responsible for performing operations like moving data, performing arithmetic, or controlling program flow. Pass 1 assigns each one an address and records it in the intermediate code; Pass 2 generates its machine code.

Common Imperative Statements

  • MOVER – Move data from memory to register
  • MOVEM – Move data from register to memory
  • ADD, SUB, MULT, DIV – Arithmetic operations
  • COMP – Compare register and memory value
  • BC – Branch conditionally
  • READ, PRINT – Input/output operations
  • STOP – End program execution

Steps in Processing Imperative Statements

  1. Identify Instruction Type: Using the Machine Opcode Table (MOT).
  2. Extract Operands: Separate register and memory components.
  3. Resolve Operands: Get memory address or constant value from the Symbol Table or Literal Table.
  4. Generate Machine Code: Combine opcode, register code, and memory address to form object code.

Example

  MOVER AREG, NUM
  ADD   BREG, COUNT
  MOVEM AREG, RESULT
  STOP
  

Explanation:

  • MOVER AREG, NUM → Move value at NUM to AREG
  • ADD BREG, COUNT → Add value at COUNT to BREG
  • MOVEM AREG, RESULT → Move AREG to RESULT
  • STOP → End execution

Assembler Tables Used

  • Machine Opcode Table (MOT): Used to map opcodes to machine codes.
  • Register Table: Used to get binary codes for registers.
  • Symbol Table (SYMTAB): Used to get memory address of labels.

Conclusion

Imperative statements are core executable commands in assembly language. The assembler converts them into machine-level instructions by using predefined tables during Pass 2.
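
Translating a single imperative statement can be sketched as below. The opcode numbering (00 to 10 in listing order), the register codes, and the symbol addresses used in the usage note are assumptions for illustration; BC, whose first operand is a condition code, is not handled.

```python
MOT = {
    "STOP": 0, "ADD": 1, "SUB": 2, "MULT": 3, "MOVER": 4, "MOVEM": 5,
    "COMP": 6, "BC": 7, "DIV": 8, "READ": 9, "PRINT": 10,
}
REGISTERS = {"AREG": 1, "BREG": 2, "CREG": 3, "DREG": 4}   # assumed codes

def encode(opcode, operands, symtab):
    """Return 'opcode register address' for one imperative statement."""
    code = MOT[opcode]
    reg, addr = 0, 0                        # defaults used by STOP
    if opcode in ("READ", "PRINT"):
        addr = symtab[operands[0]]          # memory operand only
    elif opcode != "STOP":
        reg = REGISTERS[operands[0]]        # first operand: register
        addr = symtab[operands[1]]          # second operand: memory symbol
    return f"{code:02d} {reg:02d} {addr:03d}"
```

With a hypothetical symbol table that places NUM at 110, encode("MOVER", ["AREG", "NUM"], symtab) produces the string 04 01 110.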

Topic 14: Advanced Assembler Directives

Assembler directives do not generate machine instructions but give instructions to the assembler itself.

Directive | Purpose
EQU | Defines a constant
ORIGIN | Sets memory location counter
LTORG | Allocates memory for literals
START | Starting address of the program
END | End of program, triggers literal allocation

Example

  START 100
  TEN EQU 10
  MOVER AREG, ='5'
  ORIGIN 200
  NUM DC 5
  LTORG
  END
  

Topic 15: Pass I of two pass Assembler

Working of Pass I Assembler

In Pass I, the assembler scans the source program to analyze its structure, collect necessary information, and prepare tables for Pass II.


Purpose of Pass I

- To scan the source code line by line.
- To assign addresses to instructions and data.
- To build the Symbol Table (SYMTAB) for all labels/symbols.
- To build the Literal Table (LITTAB) for all literals.
- To record assembler directives (like START, ORIGIN, EQU, LTORG, END).
- To generate the Intermediate Code (IC) that will be used in Pass II.


Steps in Pass I Working

1. Initialization

Read the START directive to initialize the Location Counter (LC).
Example:
START 200 → Sets LC = 200


2. Scanning Each Line

For each line of source code:

Check for Label
If there’s a label, enter it into SYMTAB with the current LC.

Check the Mnemonic
If it’s a machine instruction → Increase LC by instruction length.
If it’s a declarative statement (e.g., DS, DC) → Allocate memory.
If it’s an assembler directive (e.g., ORIGIN, EQU, LTORG) → Update tables and LC accordingly.

Check for Literals
If a literal appears (e.g., ='5'), store it temporarily in LITTAB without assigning an address yet.


3. Processing Literals

When LTORG or END is encountered:
- Assign addresses to all unassigned literals in LITTAB.
- Record in the Pool Table (POOLTAB) the LITTAB index at which the next literal pool begins.
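
Literal handling with LITTAB and POOLTAB can be sketched as follows. The sketch assumes one word per literal, a LITTAB held as a list of [literal, address] pairs, and a POOLTAB that starts as [0]; the function names are illustrative.

```python
def note_literal(literal, littab, pool_start):
    """Record a literal (address still unknown) unless the open pool has it.

    Returns the literal's index in LITTAB.
    """
    for i in range(pool_start, len(littab)):
        if littab[i][0] == literal:
            return i                        # already pending in this pool
    littab.append([literal, None])
    return len(littab) - 1

def close_pool(littab, pooltab, lc):
    """On LTORG or END: give each pending literal an address, return new LC."""
    start = pooltab[-1]                     # first literal of the open pool
    for entry in littab[start:]:
        entry[1] = lc
        lc += 1                             # assumption: one word per literal
    if len(littab) > start:
        pooltab.append(len(littab))         # the next pool begins after these
    return lc
```

Because the search in note_literal starts at the current pool, a literal that reappears after an LTORG gets a fresh entry and a fresh address in the next pool.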


4. Updating Location Counter

After processing each statement, increment LC based on:
- Type of instruction.
- Operand sizes.
- Data definitions.


5. Creating Intermediate Code (IC)

For each instruction, generate an intermediate representation:
- OpCode class (Imperative Statement, Declarative Statement, or Assembler Directive)
- Register codes, constants, symbol references.

Example of IC format:
(IS, 04) (1) (S, 05) → Imperative Statement 04 (MOVER) with register 1 (AREG) and symbol index 05.


Output of Pass I

At the end of Pass I, we have:
- SYMTAB → All labels with addresses.
- LITTAB → All literals with addresses.
- POOLTAB → Indicating literal pools.
- Intermediate Code (IC) → For Pass II to generate machine code.


Pass I of Assembler - Example

Source Program

START 200
MOVER AREG, A
ADD BREG, ='5'
A DS 1
LTORG
B DC '2'
END

Pass I Working

During Pass I, the assembler:
- Initializes LC using START directive.
- Builds Symbol Table (SYMTAB) with labels and addresses.
- Builds Literal Table (LITTAB) with literals and addresses.
- Creates Pool Table (POOLTAB) for literal pools.
- Generates Intermediate Code (IC) for Pass II.


Location Counter (LC) Values for Each Line

Line No. | Statement | Address (LC) | Remarks
1 | START 200 | -- | Sets LC = 200; no memory is allocated
2 | MOVER AREG, A | 200 | 1-word instruction; LC becomes 201
3 | ADD BREG, ='5' | 201 | 1-word instruction; literal ='5' entered in LITTAB; LC becomes 202
4 | A DS 1 | 202 | Reserves 1 word for variable A; LC becomes 203
5 | LTORG | 203 | Literal ='5' is allocated at 203; LC becomes 204
6 | B DC '2' | 204 | Allocates the constant at address 204; LC becomes 205
7 | END | -- | End of program (LC = 205); no literals are pending

SYMTAB (Symbol Table)

Index | Symbol | Address
0 | A | 202
1 | B | 204

LITTAB (Literal Table)

Index | Literal | Address
0 | ='5' | 203

POOLTAB (Pool Table)

Pool No | Start Index
1 | 0

Intermediate Code (IC)

(AD, 01) (C, 200)     ; START 200
(IS, 04) (1) (S, 0)   ; MOVER AREG, A
(IS, 01) (2) (L, 0)   ; ADD BREG, ='5'
(DL, 01) (C, 1)       ; A DS 1
(DL, 02) (C, 2)       ; B DC '2'
(AD, 02)              ; END

Topic 16: Intermediate Code Forms

Topic 17: Pass II Assembler

Pass II Working (With IC)

Source Code Line | LC | Intermediate Code (IC) | Machine Code
START 200 | -- | (AD, 01) (C, 200) | -- (No machine code for START)
MOVER AREG, A | 200 | (IS, 04) (1) (S, 0) | 04 01 202
ADD BREG, ='5' | 201 | (IS, 01) (2) (L, 0) | 01 02 203
A DS 1 | 202 | (DL, 01) (C, 1) | -- (Memory reserved, no code)
LTORG | 203 | -- (Literal assigned) | 00 00 05
B DC '2' | 204 | (DL, 02) (C, 2) | 00 00 02
END | -- | (AD, 02) | -- (No machine code for END)
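
The core loop of Pass II can be sketched as below. This is a simplified illustration: the IC tuple layout is an assumption, the symbol and literal tables are reduced to lists of addresses, and the data words produced for DC statements and literals are left out.

```python
def pass_two(ic, symtab, littab):
    """Translate IC entries into 'opcode register address' strings.

    ic     -- list of (lc, (class, code), register, operand) tuples
    symtab -- list of addresses, indexed by symbol number
    littab -- list of addresses, indexed by literal number
    """
    output = []
    for lc, (cls, code), reg, operand in ic:
        if cls != "IS":
            continue                        # only imperative statements are encoded here
        if operand is None:
            address = 0                     # e.g. STOP has no memory operand
        else:
            kind, index = operand           # ("S", n) symbol or ("L", n) literal
            address = symtab[index] if kind == "S" else littab[index]
        output.append((lc, f"{code:02d} {reg:02d} {address:03d}"))
    return output
```

For a hypothetical program starting at 100 whose only symbol sits at 102 and whose only literal sits at 103, the entry (100, ("IS", 4), 1, ("S", 0)) becomes 04 01 102.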