Linux Interview Essentials

Linux Internals interview questions and answers cover topics such as Linux internals basics, system calls and the kernel, the Virtual File System, processes, IPC mechanisms, synchronization, threads, TCP/IP (basics, addressing, and brief protocol working details), socket programming basics, process management, and memory management. The Linux Internals interview questions and answers PDF is prepared by our mentors to help our students with last-minute preparation.

  • Topic-wise Questions

    Linux Internals Basics

    What is a memory leak? Explain with details.

    When you allocate memory dynamically and fail to de-allocate it, the result is a memory leak. The memory is of course reclaimed when the process terminates, but think about an embedded system running 24x7 that keeps allocating memory without freeing it: eventually the process heap runs out, which crashes the system. Such an issue is called a memory leak, so it is very important to de-allocate memory once its use is complete. A minimal sketch of a leak and its fix is shown below.
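
    A minimal C sketch, for illustration, of a leaking function and its fixed counterpart (the 1024-byte size is arbitrary):

    #include <stdlib.h>

    void leaky(void)
    {
        char *buf = malloc(1024);      /* heap allocation */
        if (buf == NULL)
            return;
        /* ... use buf ... */
    }                                  /* returns without free(buf): leaks 1024 bytes per call */

    void fixed(void)
    {
        char *buf = malloc(1024);
        if (buf == NULL)
            return;
        /* ... use buf ... */
        free(buf);                     /* de-allocate once usage is complete */
    }

    Running such a program under valgrind (valgrind ./a.out) reports the unfreed blocks from leaky().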

    What are the common errors in stack segment? Explain with details

    Stack Overflow
    When a process exhausts its stack limit and tries to access memory outside the stack, the result is a stack overflow. On some Linux systems this error is also reported as a segmentation fault.
    E.g. calling a recursive function infinitely.

    Stack Smashing
    Writing beyond the bounds of a local (stack) buffer corrupts adjacent stack memory; compilers insert guard values and abort with a "stack smashing detected" error when they notice the corruption.
    E.g. int arr[5]; arr[100] = 0;

    What are the common errors in code segment? Explain with details

    The code segment in Linux is read-only memory, so trying to change any value inside it leads to the error called a segmentation fault. All instructions plus constant values are stored in this segment. The following are considered constants.

    • char *str = "Hello"; here "Hello" is a string constant
    • int x = 10; here 10 is an integer constant
    • float f = 2.5; here 2.5 is a double constant
    • const int i = 1; here i is a constant variable (some compilers place constant variables in the code segment)
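
    A short C sketch of the typical failure: modifying a string literal, which lives in the read-only segment, usually kills the process with SIGSEGV (illustrative; strictly, the behavior is undefined by the C standard):

    #include <stdio.h>

    int main(void)
    {
        char *str = "Hello";   /* "Hello" is placed in the read-only code/rodata segment */
        str[0] = 'h';          /* write to read-only memory: typically a segmentation fault */
        printf("%s\n", str);   /* never reached on most Linux systems */
        return 0;
    }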

    What are the differences between static linking and dynamic linking?

    Static linking | Dynamic linking
    Linking happens at compile time. | Linking happens at run time.
    Binary size is larger, as the entire library is added to it. | Binary size is smaller, because the library is loaded and linked separately.
    Loading time is less; only the binary needs to be loaded. | Loading time is more, as the library must be loaded and linked at run time.
    Memory usage is high: each time you start the application, the library is duplicated in memory. | Memory usage is less, as the dynamically loaded library is shared.

    Differences between library calls and system calls. How do they work?

    Library calls | System calls
    Switch to the library, which is present in user space. | Switch to kernel space, because the call is implemented there.
    Arguments are passed to the library on the stack. | Arguments are passed in CPU registers.
    Faster execution, because there is no switching between spaces. | Slower execution, due to the delay of switching from user to kernel space.
    Not portable; a library dependency is always present. | Portable: the kernel interface is the same for any Linux distribution.
    Simple and easy to use. | A little more complex to use and understand.
    We can create and add libraries dynamically at any time. | Adding a new system call is difficult; the kernel must be recompiled afterwards.

    What is multi-tasking? How is it achieved in an OS?

    Multi-tasking is doing multiple jobs concurrently, even if you have only one CPU.
    The OS achieves multi-tasking by switching between processes frequently. The switching is based on the OS's scheduling policy; many scheduling policies are available and are configured in the OS depending on the application:

    • First Come First Serve
    • Round Robin
    • Pre-emptive
    • Earliest deadline first
    • Rate monotonic

    What is an operating system (OS)? List parts of an OS.

    An OS is software that interfaces between user applications and hardware. The OS helps manage resources efficiently (e.g. dividing memory and CPU among all applications) so that multi-tasking is achieved in the system. Major parts of an OS are given below:

    • Process management
    • Memory management
    • IPC
    • Network management
    • VFS

    System Calls and Kernel

    Explain Synchronous and Asynchronous methods of communication with details.

    Synchronous | Asynchronous
    Sender and receiver must always be in sync. | Sender and receiver need not be in sync.
    Receiver has to wait on the transfer, as data might come at any time. | Sender can send data at any time, at any speed.
    Receiver may have to wait unnecessarily for a long time. | Receiver need not wait: whenever there is data to read, it gets a notification.
    While waiting, the receiver cannot do any other job. | Unnecessary waiting is avoided.

    What are the drawbacks of system calls?

    • Switching between spaces (user space to kernel space) adds delay to execution.
    • Compared to library calls, usage is difficult.
    • System calls present in the latest kernels may not exist in old kernels, so it is difficult to write portable code.
    • If you want to add a new system call, the kernel must be recompiled afterwards, which needs more time and the entire source code.

    Is it always necessary to implement system calls as soft interrupts?

    Essentially yes: a system call must make a controlled switch into kernel space, and there is no way to do that from ordinary user-mode code. Traditionally the switch is a software interrupt/trap (e.g. INT 0x80 on x86); modern CPUs also provide dedicated fast system-call instructions (sysenter/syscall) that have the same controlled-transfer effect.

    What is system call? How is it implemented in Linux?

    A system call is the programmatic way in which a computer program requests a service from the kernel of the operating system it is executed on. This may include hardware-related services (for example, accessing a hard disk drive), creation and execution of new processes, and communication with integral kernel services such as process scheduling. System calls provide an essential interface between a process and the operating system.

    Implementing system calls requires a control transfer from user space to kernel space, which involves some architecture-specific feature. A typical way to implement this is a software interrupt or trap: interrupts transfer control to the operating system kernel, so the software simply needs to set up a register with the needed system call number and execute the software interrupt.
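
    A small C illustration of the same service requested through the libc wrapper and through the raw system-call interface (Linux, using the syscall(2) helper):

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>

    int main(void)
    {
        printf("getpid()            : %d\n", getpid());              /* libc wrapper */
        printf("syscall(SYS_getpid) : %ld\n", syscall(SYS_getpid));  /* raw trap into the kernel */
        return 0;
    }

    Both lines print the same PID; the wrapper merely hides the register setup and the trap.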

    What is an exception in a system? How does the OS handle it? Explain

    Exceptions belong to a special type of software interrupts. They are generated by the processor itself whenever some unexpected critical event occurs. For instance, a page fault exception (interrupt 14) is triggered when the processor attempts to access a page, which is marked as not-present. The exception handler can then reload the page from disk (virtual memory) and restart the instruction which generated the exception. Three types of exceptions can be generated by the processor: faults, traps and aborts.
    Eg: Faults – divide by zero, segmentation fault, bus error, etc.
    Traps – debug, breakpoint, and overflow.
    Aborts – double fault, machine check.

    What are the types of interrupts? Explain the differences.

    Interrupts can be divided into two categories: hardware interrupts and software interrupts.
    An interrupt generated by a hardware change (e.g. plugging a USB device into your PC) is called a hardware interrupt; these are used to handle asynchronous hardware changes or data manipulation. An interrupt generated by software/an instruction is called a software interrupt, e.g. INT 0x80, sysenter; these are used to implement system calls. All the system calls you study as part of a Linux internals course are soft interrupts. In both cases, execution jumps to an ISR.

    Another categorization of interrupts is maskable and non-maskable interrupts.
    Assume you are executing inside an ISR and another interrupt arrives. By default the current ISR is stopped and the new ISR starts executing. This is a problem if the ISR is doing an important job (e.g. a critical real-time task like airbag control). There is an option to change this behavior by masking interrupts, but it is not possible for all interrupts: interrupts that can be masked or blocked are called maskable interrupts, and those that cannot be masked are non-maskable interrupts.

    What is the connection between Real time OS (RTOS) & Embedded OS (EOS)? Explain.

    How real time systems (RTS) design is connected with kernel designs? Explain.

    What are the pros & cons of “monolithic” and “micro” kernels?

    Monolithic kernel | Micro kernel
    Kernel size is large, because the kernel and its subsystems are compiled as a single binary | Kernel size is small, because kernel subsystems run as separate binaries
    Difficult to extend or fix bugs | Easily extensible; bugs are easier to fix
    The entire source code must be compiled at once | Subsystems can be compiled independently
    Faster: everything runs as a single binary, so communication between services is fast | Slower, due to message passing between services
    No crash recovery | Easier to recover from a crash
    A bug in any subsystem can bring down the whole kernel | Better fault isolation, so often considered more robust
    Eg: Linux, traditional Unix | Eg: Minix, QNX (MacOS and Windows NT are often cited, though strictly they are hybrid designs)

    What is a “Monolithic” and “Micro” kernels?

    A monolithic kernel runs the entire operating system — the kernel plus all its subsystems (file systems, drivers, network stack) — in kernel space as one single binary. A micro kernel keeps only a minimal core (scheduling, basic IPC, low-level memory management) in kernel space and runs the other subsystems as separate server processes that communicate by message passing. See the comparison table in the previous question for the trade-offs.

    What is "kernel?" Explain the difference between privilege mode and user mode.

    An operating system (OS) is a software package that communicates directly with the computer hardware, and all your applications run on top of it; the kernel is the part of the operating system that communicates directly with the hardware. Though each operating system has a kernel, it is buried behind a lot of other software and most users don't even know it exists. In summary, it is the "core" part of the OS.

    • Kernel Mode /privilege mode:
      In Kernel mode, the executing code has complete and unrestricted access to the underlying hardware. It can execute any CPU instruction and reference any memory address. Kernel mode is generally reserved for the lowest-level, most trusted functions of the operating system. Crashes in kernel mode are catastrophic; they will halt the entire PC.
    • User Mode:
      In user mode, the executing code has no ability to directly access hardware or reference arbitrary memory. Code running in user mode must delegate to system calls to access hardware or memory. Due to the protection afforded by this isolation, crashes in user mode are usually recoverable. Most of the code running on your computer executes in user mode.

    Virtual File System (VFS)

    What is the importance of Virtual File System? Explain with an example

    VFS allows Linux to support many, often very different, file systems, each presenting a common software interface to the VFS. All of the details of the Linux file systems are translated by software so that all file systems appear identical to the rest of the Linux kernel and to programs running in the system. Linux's Virtual File system layer allows you to transparently mount the many different file systems at the same time.

    Eg:
    write(1, buf, len);       /* writing to stdout -> terminal */
    write(fd, buf, len);      /* writing to a regular file -> hard disk */
    write(sockfd, buf, len);  /* writing to a socket -> network */
    write(devfd, buf, len);   /* writing to a device file -> peripheral interface */

    In all these cases we use the same system call on different types of files; it is the VFS that identifies where to pass the data. In other words, it is only with the help of the VFS that "everything is a file" is possible in Linux.

    What is i-node number? Why is it important?

    An i-node is an entry in the i-node table containing information (the meta-data) about a regular file or directory. It is a data structure on a traditional Unix-style file system such as ext3 or ext4; the name evolved from "index node". An i-node number is unique within a file system, which is important because it lets the system locate any file unambiguously.
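
    A small C sketch that prints a file's i-node number via stat(2) (the same number that ls -i shows):

    #include <stdio.h>
    #include <sys/stat.h>

    int main(int argc, char *argv[])
    {
        struct stat sb;
        if (argc < 2 || stat(argv[1], &sb) == -1) {
            perror("stat");
            return 1;
        }
        printf("%s: inode %lu\n", argv[1], (unsigned long)sb.st_ino);
        return 0;
    }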

    What is file-system? Describe the Linux File System.

    A filesystem is the methods and data structures that an operating system uses to keep track of files on a disk or partition; that is, the way the files are organized on the disk. The word is also used to refer to a partition or disk that is used to store the files, or to the type of the filesystem.

    Processes

    It's always better to use the wait() system call over sleep() in order for the parent to wait till the child dies. Why is it so? Explain it with respect to sync/async communication and process states.

    With sleep() we can only wait for a fixed time, whereas wait() blocks exactly until the child changes state. sleep() merely pauses the parent; wait() also cleans up the child's resources, avoiding a zombie. sleep() is a fixed, time-based wait, so the parent either wakes too early (the child is still running) or too late (the child has sat dead in the process table); wait() is event-driven, unblocking the parent at exactly the moment the child dies. A sketch follows.
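
    A minimal C sketch of the wait() pattern (the 2-second child delay and exit code 42 are arbitrary):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();
        if (pid == 0) {               /* child */
            sleep(2);                 /* pretend to do some work */
            exit(42);
        }
        int status = 0;
        waitpid(pid, &status, 0);     /* blocks until the child dies, then reaps it */
        if (WIFEXITED(status))
            printf("child exited with %d\n", WEXITSTATUS(status));
        return 0;
    }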

    Explain about Copy-On-Write (COW) optimization strategy. What are its benefits?

    • Copy-on-write (called COW) is an optimization strategy
    • When multiple separate process use same copy of the same information it is not necessary to re-create it
    • Instead they can all be given pointers to the same resource, thereby effectively using the resources
    • However, when a local copy has been modified (i.e. write) ,the COW has to replicate the copy, has no other option

    What is the difference between fork() and exec()?

    fork() creates a new process with a new PID. exec() replaces the current process image with a new program and preserves the PID. A short sketch follows.
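
    A minimal C sketch combining the two (running ls -l in the child is an arbitrary choice):

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();                           /* new process, new PID */
        if (pid == 0) {
            execlp("ls", "ls", "-l", (char *)NULL);   /* replaces the child's image; PID unchanged */
            perror("execlp");                         /* reached only if exec fails */
            return 1;
        }
        wait(NULL);
        printf("parent %d is still the original program\n", getpid());
        return 0;
    }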

    What is a blocking call?

    A function/system call that moves your process/thread to the waiting/sleeping state.
    Eg: wait(), pthread_mutex_lock(), sem_wait(), etc.

    What is OS scheduler? What is context switching?

    In a multiprogramming OS, process scheduling is the activity of the process manager that removes the running process from the CPU and selects another process on the basis of a particular strategy.

    Context switch is the process of storing and restoring the state (more specifically, the execution context) of a process or thread so that execution can be resumed from the same point at a later time. This enables multiple processes to share a single CPU.

    What is importance of PCB/TCB? How the Linux manages processes using that?

    A Process Control Block (PCB, also called task controlling block, process table entry, task structure, or switch frame) is a data structure in the operating system kernel containing the information needed to manage a particular process. It is like a database the kernel maintains to keep track of all processes currently in the system; once a process finishes executing, its PCB is deleted. It maintains a lot of information about a process, some of which is given below.

    • Process state: State may enter into new, ready, running, waiting, dead depending on CPU scheduling.
    • Process ID: a unique identification number for each process in the operating system.
    • Program counter: a pointer to the address of the next instruction to be executed for this process.
    • CPU registers: the contents of the CPU's registers, which must be saved when the process leaves the running state and restored when it runs again.
    • CPU scheduling information: indicates the information of a process with which it uses the CPU time through scheduling.

    What is daemon process?

    A daemon is a type of program on Linux operating systems that runs in the background, without user interaction, rather than under the direct control of a user.

    What is zombie process? How can you create a zombie?

    • A zombie process is a process that has terminated but has not been cleaned up (reaped) yet
    • It is the responsibility of the parent process to clean up its zombie children by calling wait()
    • If the parent does not clean up its children, they stay around in the system as zombies
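
    A minimal C sketch that produces a zombie for 60 seconds (an arbitrary window in which to observe it):

    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        if (fork() == 0)
            exit(0);     /* child terminates immediately */
        sleep(60);       /* parent never calls wait(): child shows as <defunct> in ps */
        return 0;
    }

    While this runs, ps -el | grep defunct shows the zombie; it disappears once the parent exits and init reaps it.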

    What is orphan process? How can you create an orphan process?

    • An orphan process is a process whose parent process has finished or terminated, though it remains running itself.
    • Orphaned children are immediately "adopted" by the init process (PID 1) in Linux.
    • init automatically cleans up its children.
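
    A minimal C sketch that orphans the child (the 5-second sleep just keeps it alive long enough to check its new parent):

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        if (fork() > 0)
            exit(0);                                    /* parent exits first */
        sleep(5);                                       /* child is now an orphan */
        printf("new parent (PPID): %d\n", getppid());   /* typically 1, or a per-user reaper */
        return 0;
    }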

    How does the fork() system call work? Explain.

    The fork() system call creates a new process. After creating the process, fork() returns two different values so that parent and child can be differentiated: it returns the new child's PID to the parent and 0 to the child. So, by checking the return value of fork(), we can separate the parent and child code paths.

    What is a parent process and a child process? How will you create child process?

    The process which creates a new process is called the parent process. To create a child process we can use the fork() system call; fork() creates a child process which is a duplicate of the parent process, with a new process ID.

    List the various ids related to a process

    • PID process id
    • PPID parent process id
    • GID group id
    • UID user id
    • SID session id

    Can I say that if I execute "./a.out" at my command prompt, the process is running? If not, what state is it in? When exactly can I say it is running?

    When you run ./a.out in a terminal, the program is loaded into main memory but has not yet started running. It is in a state called the ready state, where the scheduler keeps all processes that are ready to run. According to priority and the scheduling policy, the process then moves from the ready state to the running state, where the CPU picks up instructions from its code segment to execute. Only at that point can you say it is actually running.

    What are the various states of the process? Explain

    As a process executes it changes state according to its circumstances. Linux processes have the following states:

    • Running - The process is either running (it is the current process in the system) or it is ready to run (it is waiting to be assigned to one of the system's CPUs).
    • Waiting - The process is waiting for an event or for a resource. Linux differentiates between two types of waiting process; interruptible and uninterruptible. Interruptible waiting processes can be interrupted by signals whereas uninterruptible waiting processes are waiting directly on hardware conditions and cannot be interrupted under any circumstances.
    • Stopped - The process has been stopped, usually by receiving a signal. A process that is being debugged can be in a stopped state.
    • Zombie - This is a halted process which, for some reason, still has a task_struct data structure in the task vector. It is what it sounds like, a dead process.

    What are the differences between a program & process?

    • A program is a passive entity, such as file containing a list of instructions stored on a disk
    • Process is an active entity, with a program counter specifying the next instruction to execute and a set of associated resources.
    • A program becomes a process when an executable file is loaded into main memory

    IPC

    What is the difference between communication and synchronization problems?

    These are two different classifications of IPC. In communication IPC, data is transferred across processes; in synchronization IPC, no data is transferred — it is used to synchronize processes while they access shared data.

    Explain process semaphores in Linux.

    Process semaphores are used to synchronize multiple processes in a system.
    In Linux, two standards are available to implement process semaphores: POSIX and System V.

    Explain “kill” command with respect to signal handling

    The kill command is used to send a particular signal to a process:
    kill -<signal> <pid>, e.g. kill -9 1234 sends SIGKILL to process 1234. With no signal specified, kill sends SIGTERM.

    List all default dispositions that a process can have with respect to a signal.

    • Terminate
    • Core dump
    • Ignore
    • Stop/pause
    • Continue

    Where and how callback functions are used? Explain with an example

    In Linux, callback functions are used with signals (as one example). If we want to change the behavior of a signal, we write a handler function and register it with the kernel. While registering, we pass a function pointer to our handler, which acts as the callback: the kernel uses that pointer to call our handler when the signal is delivered. A sketch follows.
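
    A minimal C sketch registering a SIGINT callback via sigaction(2) (the handler name and the demo loop are arbitrary):

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static void on_sigint(int signo)        /* our callback */
    {
        /* NB: printf is not async-signal-safe; acceptable only in a demo */
        printf("caught signal %d\n", signo);
    }

    int main(void)
    {
        struct sigaction sa = { 0 };
        sa.sa_handler = on_sigint;          /* function pointer handed to the kernel */
        sigaction(SIGINT, &sa, NULL);
        for (;;)
            pause();                        /* kernel invokes on_sigint() on Ctrl-C */
    }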

    What is a callback function? How is it implemented using function pointers in C?

    In computer programming, a callback is a reference to executable code, or a piece of executable code, that is passed as an argument (a function pointer in C) to other code. This allows a lower-level software layer to call a subroutine (or function) defined in a higher-level layer.

    What are the differences between normal pointers & function pointers in C?

    Function pointer | Normal pointer
    Holds the address of a function | Holds the address of an object (data)
    Points to an address in the code segment | Points to an address in the stack/heap/data segment
    Dereference to execute the function | Dereference to get the value at the address
    Pointer arithmetic is not valid | Pointer arithmetic is valid
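
    A tiny C illustration of the two kinds of dereference (the names are arbitrary):

    #include <stdio.h>

    static int add(int a, int b) { return a + b; }

    int main(void)
    {
        int x = 5;
        int *p = &x;                      /* normal pointer: address of data on the stack */
        int (*op)(int, int) = add;        /* function pointer: address in the code segment */
        printf("%d %d\n", *p, op(2, 3));  /* dereference: load a value vs. call a function */
        return 0;
    }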

    Shared memory is the fastest IPC. Why is it so?

    After attaching a shared memory segment to a process, we can access the memory through an ordinary pointer, so no system calls are needed to read from or write to it. Since the kernel is not involved in each read/write, it is the fastest IPC.
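
    A minimal POSIX shared-memory sketch (the name /demo_shm and 4096-byte size are arbitrary; error handling omitted; link with -lrt on older glibc):

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, 4096);
        char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        strcpy(p, "hello via shared memory");   /* plain pointer access: no read/write syscall */
        printf("%s\n", p);
        munmap(p, 4096);
        close(fd);
        shm_unlink("/demo_shm");
        return 0;
    }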

    What are the advantages & disadvantages of shared memory? Explain

    Pros | Cons
    Any number of processes can communicate at the same time | Manual synchronization is necessary, failing which there will be race conditions
    Kernel is not involved in the data transfer | Separate system calls are required to set up and manage the shared memory
    Fastest form of IPC | Complex implementation
    Size of the memory is customizable |

    What are the advantages & disadvantages of FIFO? Explain

    Pros | Cons
    Naturally synchronized | Small buffer size (4K)
    Simple to use and create | Only two processes can communicate
    Unrelated processes can communicate | One-directional communication
    No extra system calls required to communicate (just read/write) | Kernel is involved
    Works like a normal file |

    What are the advantages & disadvantages of Pipes? Explain

    Pros | Cons
    Naturally synchronized | Small buffer size (4K)
    Simple to use and create | Only related processes can communicate
    No extra system calls required to communicate (just read/write) | Only two processes can communicate
     | One-directional communication
     | Kernel is involved

    What is IPC (inter process communication)? List different types of IPC possible in Linux.

    IPC is a mechanism provided by the kernel for communication between two or more processes in a system.

    • Pipes
    • FIFO
    • Signal
    • Shared memory
    • Message queue
    • Sockets

    Synchronization

    What are the differences between Binary semaphore and Mutex?

    Both a binary semaphore and a mutex can achieve the same purpose, but they are meant to be used differently: a mutex is a locking mechanism and a semaphore is a signaling mechanism (as described in the next question). In particular, a mutex can only be unlocked by the thread that locked it, whereas a semaphore can be signaled (posted) by any other thread.

    What are the differences between Mutex & Semaphore?

    Although mutexes and semaphores have some similarities in their implementation, they should always be used differently.

    • A mutex is locking mechanism used to synchronize access to a resource. Only one task (can be a thread or process) can acquire the mutex. It means there is ownership associated with mutex, and only the owner can release the lock (mutex).
    • Semaphore is signaling mechanism (“I am done, you can carry on” kind of signal). For example, if you are listening songs (assume it as one task) on your mobile and at the same time your friend calls you, an interrupt is triggered upon which an interrupt service routine (ISR) signals the call processing task to wake up.

    Explain working details of Semaphore

    We use semaphores when we have multiple resources and multiple threads. A semaphore acts as a counter of the resources available to threads. Before accessing a resource, a thread performs a wait operation, which decrements the counter by one; if the counter is already 0, the thread blocks until the counter becomes positive. When a thread is done with a resource, it releases the resource and performs a post operation to increment the counter. A sketch follows.
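
    A minimal POSIX-semaphore sketch in C (3 resources and 5 threads are arbitrary numbers; compile with -pthread):

    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    static sem_t slots;               /* counts free resources */

    static void *worker(void *arg)
    {
        sem_wait(&slots);             /* counter--, blocks while counter == 0 */
        printf("thread %ld using a resource\n", (long)arg);
        sem_post(&slots);             /* counter++, wakes one waiter */
        return NULL;
    }

    int main(void)
    {
        pthread_t t[5];
        sem_init(&slots, 0, 3);       /* 3 identical resources */
        for (long i = 0; i < 5; i++)
            pthread_create(&t[i], NULL, worker, (void *)i);
        for (int i = 0; i < 5; i++)
            pthread_join(t[i], NULL);
        sem_destroy(&slots);
        return 0;
    }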

    Explain working details of Mutex

    This is a synchronization mechanism to avoid race conditions. The idea is to create a critical section by locking and unlocking around a piece of code, making sure that only one thread executes that portion of code at a time while the others wait for the lock. A sketch follows.
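
    A minimal C mutex sketch; without the lock the final count is usually wrong due to the race (iteration counts are arbitrary; compile with -pthread):

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static long counter;

    static void *bump(void *arg)
    {
        for (int i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);      /* enter the critical section */
            counter++;                      /* only one thread at a time runs this */
            pthread_mutex_unlock(&lock);    /* leave the critical section */
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, bump, NULL);
        pthread_create(&b, NULL, bump, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("counter = %ld\n", counter); /* always 200000 with the mutex */
        return 0;
    }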

    Explain the following definitions

    • Race condition
      The situation where multiple threads/processes try to access the same resource at the same time is called a race condition. It can corrupt the data in that memory and produce unexpected output, because the threads/processes run concurrently and a switch between them can happen at any point in time.
    • Critical section
      A section or piece of code doing some important job that should be executed by only one thread/process at a time is called a critical section.
      We can create a critical section using a mutex or a semaphore.
    • Atomicity
      One cause of race conditions is a thread/process switch landing in the middle of an operation. One solution is to use atomic variables/operations: an atomic operation is guaranteed to complete as a single, uninterruptible unit, so no other thread ever observes it half-done.
    • Mutual exclusion
      A synchronization mechanism to avoid race conditions. The idea is to create a critical section by locking and unlocking a piece of code, so that only one thread executes that portion while the others wait for the lock.
    • Priority inversion
      A situation where a high-priority task waits for a mutex/semaphore held by a low-priority process, and the low-priority process is pre-empted by a medium-priority process — so the high-priority task is effectively blocked by the medium-priority one.
    • Deadlock
      A situation where a process or thread waits for a mutex/semaphore that will never be released, so it waits forever (e.g. two threads each holding one lock while waiting for the other's).

    What is the difference between synchronization and scheduling?

    In synchronization we make processes wait for a resource using semaphores; there, something like shared memory is the common resource processes contend for. In scheduling, the scheduler likewise arbitrates between processes, but the CPU itself is the contended resource.

    Explain the problem of synchronization in a multi-tasking environment.

    • Deadlock:
      A situation where a process or thread waits for a mutex/semaphore that will never be released, so it waits forever.
    • Starvation:
      Starvation occurs when the scheduler (i.e. the operating system) persistently refuses to give a particular thread some resource (generally the CPU). If there are too many high-priority threads, a lower-priority thread may be starved. This has negative impacts, particularly when the lower-priority thread holds a lock on some resource, which can lead to priority inversion.
    • Priority inversion:
      A situation where a high-priority task waits for a mutex/semaphore held by a low-priority process, and the low-priority process is pre-empted by a medium-priority process.

    Threads

    How to pass data to a thread and get it back?

    To pass arguments to a thread, use the fourth argument of pthread_create(); if multiple arguments need to be passed, put them in a structure and pass the structure's address.
    To return a value from a thread, use the second argument of pthread_join(): pass the address of a pointer that will receive whatever the thread returned. A sketch follows.
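
    A minimal C sketch of both directions (struct and function names are arbitrary; compile with -pthread):

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>

    struct args { int a, b; };

    static void *adder(void *arg)
    {
        struct args *in = arg;              /* data passed in via pthread_create */
        int *sum = malloc(sizeof *sum);     /* must outlive the thread */
        *sum = in->a + in->b;
        return sum;                         /* collected by pthread_join */
    }

    int main(void)
    {
        struct args in = { 2, 3 };
        pthread_t t;
        void *ret;
        pthread_create(&t, NULL, adder, &in);   /* 4th argument carries the data in */
        pthread_join(t, &ret);                  /* 2nd argument receives the return value */
        printf("sum = %d\n", *(int *)ret);
        free(ret);
        return 0;
    }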

    What are joinable and detached threads? Explain with an example

    A joinable thread, like a process, is not automatically cleaned up by GNU/Linux when it terminates: the thread's exit state hangs around in the system (much like a zombie process) until another thread calls pthread_join to obtain its return value. Only then are its resources released.
    A detached thread, in contrast, is cleaned up automatically when it terminates.

    How does the kernel schedule threads?

    Threads under Linux are implemented as processes that share resources; the scheduler does not differentiate between a thread and a process. Threads on Linux are kernel threads (in the sense of being managed by the kernel), so to the scheduler, threads and processes are the same.
    Eg: try the command ps -eLf (it shows thread info as well).

    How is concurrency achieved in threads?

    The Linux kernel and the pthreads library work together to administer threads. The kernel does the context switching, scheduling, memory management, cache management, etc.; some administration is also done at user level. The kernel's time-slicing rules take processes (and process priorities) into consideration, but each thread within a process is itself a schedulable entity.

    What are the advantages/disadvantages of writing a server using multiple threads instead of multiple processes?

    Processes | Threads
    Easy to create. | More complex to create (a thread library is needed).
    If clients want to share data among them, IPC is required. | Sharing data among clients is easy.
    Memory overhead is more. | Memory overhead is less.
    No dependency on a thread library. | Proper synchronization is required; chances of deadlock.

    What is a thread (pthread)? Why use threads instead of processes?

    Threads, like processes, are a mechanism to allow a program to do more than one thing at a time. As with processes, threads appear to run concurrently; each thread may be executing a different part of the program at any given time. Threads can be used to achieve concurrency, similar to processes, while consuming fewer resources than a process.

    TCP/IP and Socket programming Basics

    What are the differences between Hub / Switch / Router?

    • Hub
      A network hub is designed to connect computers to each other with no real understanding of what it is transferring. When a hub receives a packet of data from a connected device, it broadcasts that data packet to all other connected devices regardless of which one ends up being the final destination. Working at layer 1 (Refer OSI model)
    • Switch
      A network switch also connects computers to each other, like a hub. Where the switch differs from a hub is in the way it handles packets of data. When a switch receives a packet of data, it determines what computer or device the packet is intended for and sends it to that computer only. Traditional switching operates at layer 2, but layer 3 switches are also available.
    • Router
      A network router is quite different from a switch or hub since its primary function is to route data packets to other networks, instead of just the local computers. A router is quite common to find in homes and businesses since it allows your network to communicate with other networks including the Internet. Operates at layer 3

    What other models do you know, besides TCP/IP, that are derived from OSI?

    • AppleTalk
    • IPX
    • SNA
    • UMTS

    What is the fundamental difference between OSI model & TCP/IP model?

    OSI model is a reference model, which mentions about functionality of each of the layers. TCP/IP model is a derived model from OSI, which practically implements the OSI functionalities.

    [Diagram: differences between the OSI and TCP/IP models]

    Explain TCP/IP layers in detail

    [Diagram: TCP/IP layers — Application, Transport, Internet, and Network Access (Link)]

    Explain OSI layers in detail

    [Diagram: OSI layers — Physical, Data Link, Network, Transport, Session, Presentation, and Application]

    What is difference between LAN, MAN and WAN?

    LAN is a private network used in small offices or homes, usually within a 1 km range, with a high-speed data transfer rate and full-time service connectivity at low cost. WAN covers a large geographical area, for example a country or a continent; its data transfer rate is usually low compared to LAN, but it is compatible with a variety of access lines and has advanced security. MAN covers an area bigger than a LAN, within a city or town, and serves as an ISP for larger LANs. It uses optical fibre or wireless infrastructure to link the LANs, thereby providing high-speed regional resource sharing.

    TCP/IP addressing

    Briefly explain how routing works?

    Routing is the process of forwarding IP packets from one network to another using a routing table. A router is a device that joins networks together and routes traffic between them.
    A routing table is a set of rules, often viewed in table format, used to determine where data packets travelling over an Internet Protocol (IP) network will be directed. All IP-enabled devices, including routers and switches, use routing tables.

    Given an application you need to choose between TCP and UDP. Compare them with respect to speed, cost and reliability

    Explain various fields of UDP header

    [Diagram: UDP header — source port, destination port, length, and checksum (16 bits each)]

    Explain various fields of IP header

    [Diagram: IP header]

    Version:
    IPv4 or IPv6.
    Source and destination IP address:
    These fields store the source and destination addresses respectively. The size of these fields depends on the version (IPv4 – 32 bits, IPv6 – 128 bits).

    Explain various fields of TCP header

    [Diagram: TCP header]

    Source Port: 16 bits
    The source port number.

    Destination Port: 16 bits
    The destination port number.

    Sequence Number: 32 bits
    The sequence number of the first data octet in this segment (except when SYN is present). If SYN is present the sequence number is the initial sequence number (ISN) and the first data octet is ISN+1.

    Acknowledgment Number: 32 bits
    If the ACK control bit is set this field contains the value of the next sequence number the sender of the segment is expecting to receive. Once a connection is established this is always sent.

    Data Offset (header length): 4 bits
    The number of 32-bit words in the TCP header; this indicates where the data begins. The TCP header (even one including options) is an integral number of 32-bit words long.

    Control Bits: 6 bits (from left to right):
    URG: Urgent Pointer field significant
    ACK: Acknowledgment field significant
    PSH: Push Function
    RST: Reset the connection
    SYN: Synchronize sequence numbers
    FIN: No more data from sender

    What is the difference between TCP and UDP

    TCP | UDP
    Connection oriented | Connectionless
    Reliable delivery | Unreliable delivery
    In-order delivery guaranteed | No ordering guarantees
    Three-way handshake | No notion of a "connection"
    More network bandwidth | Less network bandwidth

    Explain TCP three-way handshake steps in detail

    Before the sending device and the receiving device start the exchange of data, both devices need to be synchronized. During the TCP initialization process, the sending device and the receiving device exchange a few control packets for synchronization purposes. This exchange is known as Three-way handshake.
    [Diagram: TCP three-way handshake]

    Step 1. Device A (Client) sends a TCP segment with SYN = 1, ACK = 0, ISN (Initial Sequence Number) = 2000. Refer TCP header
    An Initial Sequence Number (ISN) is a random Sequence Number, allocated for the first packet in a new TCP connection. The Active Open device (Device A) sends a segment with the SYN flag set to 1, ACK flag set to 0 and an Initial Sequence Number 2000 (For Example), which marks the beginning of the sequence numbers for data that device A will transmit. SYN is short for SYNchronize. SYN flag announces an attempt to open a connection.

    Step 2. Device B (Server) receives Device A's TCP segment and returns a TCP segment with SYN = 1, ACK = 1, ISN = 5000 (Device B's Initial Sequence Number), Acknowledgment Number = 2001 (2000 + 1, the next sequence number Device B expects from Device A). Refer to the TCP header.

    Step 3. Device A sends a TCP segment to Device B that acknowledges receipt of Device B's ISN, with flags set as SYN = 0, ACK = 1, Sequence Number = 2001, Acknowledgment Number = 5001 (5000 + 1, the next sequence number Device A expects from Device B).

    This handshaking technique is referred to as TCP Three-way handshake or SYN, SYN-ACK, ACK.
    After the Three-way handshake, the connection is open and the participant computers start sending data using the agreed sequence and acknowledge numbers.

    Explain importance of following Linux Networking commands

    • ifconfig
      To get the IP address as well as the MAC address of the system's interfaces.
    • traceroute
      To get the route each packet takes from source to destination.
    • nslookup / host
      To convert a domain name to an IP address.
    • netstat
      Shows the current network status of applications using the network.
    • ping
      To check connectivity between two systems, using the ICMP protocol.

    Explain differences between different types of ports.

    • Well known ports
      Well known ports are used by system or processes run by root or with specific privileges. The port numbers range from 0 to 1023.
    • System ports
      Same as well-known ports
    • User ports /Registered ports
      The registered port numbers range from 1024-49151. Such ports are used by programs run by users in the system.
    • Dynamic / Private ports
      Private ports are not assigned for any specific purpose. Their range is 49152–65535.

    What are IPv4 and IPv6 addresses? Explain their differences

    An IP address is a binary number, but it is usually written as text for human readers. For example, a 32-bit numeric address (IPv4) is written in decimal as four numbers separated by periods, each from 0 to 255; e.g. 1.160.10.240 could be an IP address.
    IPv6 addresses are 128-bit IP addresses written in hexadecimal and separated by colons. An example IPv6 address could be written like this: 3ffe:1900:4545:3:200:f8ff:fe21:67cf
    Advantages of IPv6 over IPv4:

    • IPv6 simplifies the router's task compared to IPv4.
    • IPv6 is more compatible with mobile networks than IPv4.
    • IPv6 allows for bigger payloads than what is allowed in IPv4.
    • (Note: IPv6 adoption is still far smaller than IPv4's, which carries most traffic.)

    What is a netmask? Explain its significance

    A netmask is a 32-bit mask used to divide an IP address into subnets and specify the network's available hosts. In every subnet, two addresses are always reserved: the host part of all zeros is the network address, and the host part of all ones is the broadcast address. For example, with netmask 255.255.255.0 on the 192.168.1.0 network, 192.168.1.0 is the network address and 192.168.1.255 is the broadcast address, so neither can be assigned to a host.

    What are various classes of IP address?

    Class | Theoretical address range | Binary start
    A | 0.0.0.0 to 127.255.255.255 | 0
    B | 128.0.0.0 to 191.255.255.255 | 10
    C | 192.0.0.0 to 223.255.255.255 | 110
    D | 224.0.0.0 to 239.255.255.255 | 1110

    What is port number? What is its significance?

    A port number is the logical address of each application or process that uses a network or the Internet to communicate. A port number uniquely identifies a network-based application on a computer. Each application/program is allocated a 16-bit integer port number.
    Port numbers are mainly used in TCP and UDP based networks, with an available range of 0-65,535 for assigning port numbers. Although an application can change its port number, some commonly used Internet/network services are allocated with global port numbers such as Port Number 80 for HTTP, 23 for Telnet and 25 for SMTP.

    What is an IP address? What is its significance?

    An IP address is an identifier/address for a computer or device on a TCP/IP network; networks using the TCP/IP protocol route messages based on the IP address of the destination.
    The format of an IP address is a 32-bit numeric address written as four numbers separated by periods, each from 0 to 255. For example, 192.168.32.12 could be an IP address. An IP address has two parts: a network part and a host part.

    What is a MAC address? What is its significance?

    A media access control address (MAC address), also called physical address, is a unique identifier assigned to network interfaces for communications on the physical network segment. MAC addresses are used as a network address for most network technologies, including Ethernet and Wi-Fi. Logically, MAC addresses are used in the media access control protocol sublayer of the OSI reference model.
    MAC addresses are most often assigned by the manufacturer of a network interface controller (NIC) and are stored in its hardware.

    TCP/IP protocols Brief working details

    DHCP

    DHCP (Dynamic Host Configuration Protocol) is a communications protocol that network administrators use to centrally manage and automate the network configuration of devices attaching to an Internet Protocol (IP) network. DHCP allows devices needing an IP address to request one when they are starting up, for example, rather than an address preassigned and manually configured on each device. With DHCP, if a device is moved from place to place, it will be assigned a new address in each location.

    DNS

    The domain name system (DNS) is the way that Internet domain names are located and translated into Internet Protocol addresses. A domain name is a meaningful and easy-to-remember "handle" for an Internet address. Because maintaining a central list of domain name/IP address correspondences would be impractical, the lists of domain names and IP addresses are distributed throughout the Internet in a hierarchy of authority. There is probably a DNS server within close geographic proximity to your access provider that maps the domain names in your Internet requests or forwards them to other servers in the Internet.

    ARP & RARP

    The address resolution protocol (ARP) is a protocol used by the Internet Protocol (IP) [RFC826], specifically IPv4, to map IP network addresses to the hardware addresses used by a data link protocol. The protocol operates below the network layer as a part of the interface between the OSI network and OSI link layer.
    The Reverse Address Resolution Protocol (RARP) is an obsolete computer networking protocol used by a client computer to request its Internet Protocol (IPv4) address from a computer network, when all it has available is its link layer or hardware address, such as a MAC address.

    ICMP

    ICMP (Internet Control Message Protocol) is an error-reporting protocol that network devices like routers use to generate error messages to the source IP address when network problems prevent delivery of IP packets. ICMP creates and sends messages to the source IP address indicating that a gateway, router, service, or host cannot be reached for packet delivery. Any IP network device has the capability to send, receive, or process ICMP messages. ICMP is not a transport protocol that sends data between systems.

    SNMP

    Simple Network Management Protocol (SNMP) is an Internet-standard protocol for collecting and organizing information about managed devices on IP networks and for modifying that information to change device behavior. Devices that typically support SNMP include routers, switches, servers, workstations, printers, modem racks and more.
    SNMP is widely used in network management systems to monitor network-attached devices for conditions that warrant administrative attention. SNMP exposes management data in the form of variables on the managed systems, which describe the system configuration. These variables can then be queried (and sometimes set) by managing applications.

    Socket Programming

    What is a RAW socket? Explain its importance

    With a raw socket, an application can send and receive packets below the kernel's normal transport-layer processing: the TCP/UDP handling is bypassed and the application sees the packet, headers included, much as it arrived from the network.
    If you want to implement your own protocol on top of IP (or even directly on the link layer), you can use raw sockets. Tools like ping and tcpdump work this way.

    What is the importance of select () system call? Explain it in the context of synchronous I/O

    select() is a system call that helps write a concurrent server in one process. It is used to examine the status of the file descriptors of open input/output channels, similar to polling in operating systems. A select loop uses the select system call to sleep until a condition occurs on a file descriptor (e.g. data is available for reading) or a timeout occurs. By examining select's return parameters, the loop finds out which file descriptor has changed and executes the appropriate code to read from or write to it. A sketch follows.
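
    A minimal C sketch watching a single descriptor (stdin and the 5-second timeout are arbitrary; a server would add its socket fds to the set):

    #include <stdio.h>
    #include <sys/select.h>
    #include <unistd.h>

    int main(void)
    {
        fd_set rfds;
        struct timeval tv = { 5, 0 };        /* 5-second timeout */

        FD_ZERO(&rfds);
        FD_SET(STDIN_FILENO, &rfds);         /* watch stdin for readability */

        int n = select(STDIN_FILENO + 1, &rfds, NULL, NULL, &tv);
        if (n > 0 && FD_ISSET(STDIN_FILENO, &rfds))
            printf("stdin is readable\n");   /* a server would accept()/read() here */
        else if (n == 0)
            printf("timed out, no data\n");
        return 0;
    }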

    Define iterative server and concurrent server. What are their advantages and disadvantages?

    Iterative server | Concurrent server
    Pros: | Pros:
    Simple | Concurrent access
    Reduced network overhead | Clients can run longer, since no one is waiting for completion
    Less CPU intensive | Only one listener for many clients
    Higher single-threaded transaction throughput |
    Cons: | Cons:
    Severely limits concurrent access | Increased network overhead
    Server is locked while dealing with one client | More CPU and resource intensive

    What is the difference between send() and sendto()?

    send() is used on a connected (TCP) socket, where we only pass the data. sendto() is used for UDP, where we pass the data as well as the destination details (IP and port).

    What is control connection and what is data connection? Explain with respect to socket() and accept() system calls.

    In the server program of a TCP/IP connection we end up with two kinds of socket fds, as sketched below.

    1. From the socket() system call we get an fd, which we bind to an IP address and port number. This socket fd is called the control (listening) socket; we use it to accept connections from multiple clients. With one control socket we can accept any number of clients.
    2. To send/receive data with a client we need another fd, which we get from the accept() system call; this is called the data socket.
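
    A bare-bones C sketch of the two sockets (port 5000 is an arbitrary choice; error handling omitted):

    #include <netinet/in.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        int ctrl = socket(AF_INET, SOCK_STREAM, 0);    /* control (listening) socket */
        struct sockaddr_in addr = { 0 };
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(5000);
        bind(ctrl, (struct sockaddr *)&addr, sizeof addr);
        listen(ctrl, 5);

        int data = accept(ctrl, NULL, NULL);           /* data socket: one per client */
        char buf[128];
        ssize_t n = read(data, buf, sizeof buf - 1);
        if (n > 0) { buf[n] = '\0'; printf("got: %s\n", buf); }
        close(data);                                   /* ctrl stays open for more accepts */
        close(ctrl);
        return 0;
    }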

    What is host byte order and network byte order? Explain

    Machines have different byte orders (little endian vs. big endian), which would create undesired issues on the network. To ensure consistency, network byte order (big endian) is used as the standard. Several helper functions are available for the conversion:
    uint16_t htons(uint16_t host_short);
    uint16_t ntohs(uint16_t network_short);
    uint32_t htonl(uint32_t host_long);
    uint32_t ntohl(uint32_t network_long);
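
    A short C example of the conversions (port 5000 is arbitrary):

    #include <arpa/inet.h>
    #include <stdio.h>

    int main(void)
    {
        uint16_t port = 5000;
        uint16_t net  = htons(port);     /* host -> network (big endian) */
        printf("host: 0x%04x  network: 0x%04x  back: %u\n",
               port, net, ntohs(net));   /* ntohs restores the host-order value */
        return 0;
    }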

    Is bind() mandatory on the client side? Justify your answer.

    bind() is optional on the client side. When the client issues a connect() request, a local address and an ephemeral port are bound automatically.

    Explain TCP client and server system calls and three-way handshake

    The three-way handshake happens between the client's connect() call and the server's accept() returning: connect() sends the SYN, the server's kernel replies SYN-ACK on the listening socket, and the client's ACK completes the connection, which accept() then hands to the server. See the three-way handshake steps described earlier.

    Explain system calls used in TCP client and server.

    [Diagram: system calls in a TCP client and server — server: socket, bind, listen, accept, read/write, close; client: socket, connect, write/read, close]

    What is a socket? How it is different compared to other system calls?

    A socket is just a logical endpoint for communication; it exists at the transport layer. You can send and receive data on a socket. A socket is bound to a protocol, machine, and port.
    Using sockets we can communicate between processes on the same system as well as between processes on different devices over a network.

    Application specific

    List protocols that don't use IP addresses

    DHCP, ICMP, IGRP, ARP etc

    I type www.google.com in Chrome. Explain the various protocols that help make browsing happen

    • The browser extracts the domain name from the URL.
    • The browser queries DNS for the IP address of the URL. If neither the browser nor the OS has a cached copy of the IP address, a request is sent to the system's configured DNS server.
    • If that DNS server has the address for the domain, it returns it; otherwise, it forwards the query to the DNS server it is configured to defer to.
    • The web browser then assembles an HTTP request, which consists of a header and optional content.
    • This HTTP request is sent to the web server host (via the TCP/IP/MAC layer protocols) as some number of packets, each routed the same way as the earlier DNS query. (The packets have sequence numbers that allow them to be reassembled in order even if they take different paths.)
    • On the routing side, a set of routing protocols (RIP, OSPF, BGP, etc.) routes the packets from the source (your machine) to the web server, following complex routing algorithms.
    • Once the request arrives at the web server, it generates a response (this may be a static page served as-is, or a more dynamic response generated in any number of ways). The web server software sends the generated page back to the client in a similar way.

    Memory Management

    What is swapping? Explain with details

    Swapping is a mechanism in which a process can be swapped temporarily out of main memory to a backing store, and then brought back into memory for continued execution. The backing store is usually a hard disk drive or other secondary storage that is fast to access and large enough to accommodate copies of all memory images for all users; it must be capable of providing direct access to these memory images.

    The major time-consuming part of swapping is transfer time, which is directly proportional to the amount of memory swapped. Assume the user process is 100 KB in size and the backing store is a standard hard disk with a transfer rate of 1 MB per second. The actual transfer of the 100 KB process to or from memory then takes 100 KB / 1000 KB per second = 1/10 second, i.e. 100 milliseconds each way.

    What is position dependent code and position independent code in Linux?

    The code within a dynamic executable is typically position-dependent, and is tied to a fixed address in memory. Shared objects, on the other hand, can be loaded at different addresses in different processes. Position-independent code is not tied to a specific address. This independence allows the code to execute efficiently at a different address in each process that uses the code. Position independent code is recommended for the creation of shared objects.

    What is DMA cache coherency? How to solve?

    If a CPU has a cache and external memory, the data the DMA controller accesses (stored in RAM) may not be the up-to-date data held in the cache.
    Solutions

    • Cache-coherent systems:
      External writes are signaled to the cache controller which performs a cache invalidation for incoming DMA transfers or cache flush for outgoing DMA transfers (done by hardware)
    • Non-coherent systems:
      OS ensures that the cache lines are flushed before an outgoing DMA transfer is started and invalidated before a memory range affected by an incoming DMA transfer is accessed. The OS makes sure that the memory range is not accessed by any running threads in the meantime.

    What is cache memory?

    A CPU cache is a cache used by the central processing unit (CPU) of a computer to reduce the average time to access data from the main memory. The cache is a smaller, faster memory which stores copies of the data from frequently used main memory locations. Most CPUs have different independent caches, including instruction and data caches, where the data cache is usually organized as a hierarchy of more cache levels (L1, L2, etc.).

    When the processor needs to read from or write to a location in main memory, it first checks whether a copy of that data is in the cache. If so, the processor immediately reads from or writes to the cache, which is much faster than reading from or writing to main memory. Most modern desktop and server CPUs have at least three independent caches: an instruction cache to speed up executable instruction fetch, a data cache to speed up data fetch and store, and a translation look-aside buffer (TLB) used to speed up virtual-to-physical address translation for both executable instructions and data.

    The data cache is usually organized as a hierarchy of cache levels (L1, L2, etc.). The TLB, however, is part of the memory management unit (MMU) and is not directly one of the CPU caches.

    Explain differences between memory mapped I/O and port mapped I/O

    • Port-mapped I/O:
      The devices are programmed to occupy a range in the I/O address space. Each device is on a different I/O port. The I/O ports are accessed through special processor instructions, and actual physical access is accomplished through special hardware circuitry. This I/O method is also called isolated I/O because the memory space is isolated from the I/O space, thus the entire memory address space is available for application use.

      [Diagram: port-mapped I/O]

    • Memory mapped I/O:
      In memory-mapped I/O, the device address is part of the system memory address space. Any machine instruction that is encoded to transfer data between a memory location and the processor, or between two memory locations, can potentially be used to access the I/O device. The I/O device is treated as if it were another memory location. Because the I/O address space occupies a range of the system memory address space, that region of memory is not available for applications to use.

      [Diagram: memory-mapped I/O]

    What is DMA?

    Direct memory access (DMA) chips or controllers solve the problem of the processor having to copy every byte itself: they allow a device to access memory directly, without involving the processor. The processor sets up the DMA controller before a data transfer operation begins, but is bypassed during the transfer itself, regardless of whether it is a read or write operation.

    The transfer speed depends on the transfer speed of the I/O device, the speed of the memory device, and the speed of the DMA controller. In essence, the DMA controller provides an alternative data path between the I/O device and the main memory. The processor sets up the transfer operation by specifying the source address, the destination memory address, and the length of the transfer to the DMA controller.

    [Diagram: Direct Memory Access]

    What is TLB? Explain

    For faster access, page table entries are cached in a dedicated CPU cache called the TLB (Translation Lookaside Buffer).

    • Only a limited number of entries fit in it.
    • If a page entry is available in the TLB (a hit), the physical address is obtained directly (within one cycle).
    • If a page entry is not available in the TLB (a miss), the page table in main memory is used to map to the physical address (which takes more cycles than a TLB hit).

    [Diagram: TLB]

    What is paging and page table? What are the advantages?

    If programs accessed physical memory directly, we would face three problems:

    • Not enough physical memory.
    • Holes in the address space (fragmentation).
    • No security (all programs could access the same memory).

    These problems are solved using virtual memory:

    • Each program gets its own virtual address space.
    • Each virtual address space is mapped separately onto physical memory.
    • We can even move pages to disk if we run out of memory (swapping).

    What is paging?

    • Virtual memory is divided into small chunks called pages.
    • Similarly, physical memory is divided into frames.
    • Virtual memory and physical memory are mapped to each other using a page table.

    [Diagram: page table]

    Explain the concept of Virtual memory

    What are the differences between volatile, non-volatile, and hybrid memory?

    Non-volatile memory is typically used for long-term persistent storage (e.g. ROM), generally written once. The most widely used form of primary storage today is a volatile form of random access memory (RAM), meaning that when the system powers down, anything contained in RAM is lost. Some memories have both properties: we can store and erase data any number of times, and it survives power-off. Eg: Flash, EPROM, etc.

    What is logical address and physical address? Explain the differences

    An address generated by the CPU is a logical address, whereas an address actually seen by the memory unit is a physical address. A logical address is also known as a virtual address. Virtual and physical addresses are the same in compile-time and load-time address-binding schemes; they differ in the execution-time address-binding scheme.

    The set of all logical addresses generated by a program is referred to as a logical address space. The set of all physical addresses corresponding to these logical addresses is referred to as a physical address space. The run-time mapping from virtual to physical address is done by the memory management unit (MMU) which is a hardware device.
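
    On Linux this MMU mapping can be inspected from user space through /proc/self/pagemap, which holds one 64-bit entry per virtual page (bit 63 = page present, bits 0-54 = physical frame number). Below is a hedged sketch that translates a variable's logical address into a physical one; on recent kernels the frame number is hidden (reads as zero) without root.

      #include <stdio.h>
      #include <stdint.h>
      #include <fcntl.h>
      #include <unistd.h>

      int main(void)
      {
          int target = 42;                       /* any page-backed object */
          uintptr_t vaddr = (uintptr_t)&target;
          long pagesize = sysconf(_SC_PAGESIZE);
          uint64_t entry;

          int fd = open("/proc/self/pagemap", O_RDONLY);
          if (fd < 0) { perror("pagemap"); return 1; }

          /* One 8-byte entry per virtual page of this process. */
          off_t offset = (vaddr / pagesize) * sizeof(entry);
          if (pread(fd, &entry, sizeof(entry), offset) != sizeof(entry)) {
              perror("pread");
              return 1;
          }
          close(fd);

          if (entry & (1ULL << 63)) {            /* bit 63: present in RAM */
              uint64_t pfn = entry & ((1ULL << 55) - 1);
              printf("logical  0x%jx\n", (uintmax_t)vaddr);
              printf("physical 0x%jx\n",
                     (uintmax_t)(pfn * pagesize + vaddr % pagesize));
          } else {
              printf("page not present\n");
          }
          return 0;
      }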

    What is MMU? Explain its significance

    A memory management unit (MMU) is a computer hardware component that handles all memory and caching operations associated with the processor. In other words, the MMU is responsible for all aspects of memory management. It is usually integrated into the processor, although in some systems it occupies a separate IC (integrated circuit) chip.
    The work of the MMU can be divided into three major categories:

    • Hardware memory management, which oversees and regulates the processor's use of RAM (random access memory) and cache memory.
    • OS (operating system) memory management, which ensures the availability of adequate memory resources for the objects and data structures of each running program at all times.
    • Application memory management, which allocates each individual program's required memory, and then recycles freed-up memory space when the operation concludes.

    Process Management

    What specific situations do you need to take care of while building an RTS?

    • Reliability
    • Predictability
    • Performance
    • Compactness
    • Scalability
    • User control over OS Policies
    • Responsiveness
      • Fast task switch
      • Fast interrupt response

    What is real time scheduling?

    Rate monotonic and earliest deadline first are the classic real-time scheduling algorithms.

    1. Rate monotonic
      Rate-monotonic scheduling (RMS) is a scheduling algorithm used in real-time operating systems (RTOS) with a static-priority scheduling class. Static priorities are assigned according to the cycle duration of the job: the smaller the period, the higher the priority.
    2. Earliest deadline first
      Earliest deadline first (EDF) or least time to go is a dynamic scheduling algorithm used in real-time operating systems. A priority queue will be maintained and will be searched for the process closest to its deadline. This process is the next to be scheduled for execution.
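
    Linux itself ships an EDF implementation as the SCHED_DEADLINE scheduling policy. There is no glibc wrapper for the underlying sched_setattr() system call, so it is invoked through syscall(); the sketch below is hedged, with illustrative runtime/deadline/period values (2 ms of CPU every 10 ms).

      #include <stdio.h>
      #include <stdint.h>
      #include <unistd.h>
      #include <sys/syscall.h>

      #ifndef SCHED_DEADLINE
      #define SCHED_DEADLINE 6        /* value from the kernel uapi headers */
      #endif

      /* Layout per the sched_setattr(2) man page. */
      struct sched_attr {
          uint32_t size;
          uint32_t sched_policy;
          uint64_t sched_flags;
          int32_t  sched_nice;
          uint32_t sched_priority;
          uint64_t sched_runtime;     /* ns of CPU time per period */
          uint64_t sched_deadline;    /* ns, relative deadline     */
          uint64_t sched_period;      /* ns, activation period     */
      };

      int main(void)
      {
          struct sched_attr attr = {
              .size           = sizeof(attr),
              .sched_policy   = SCHED_DEADLINE,
              .sched_runtime  =  2 * 1000 * 1000,   /*  2 ms */
              .sched_deadline = 10 * 1000 * 1000,   /* 10 ms */
              .sched_period   = 10 * 1000 * 1000,   /* 10 ms */
          };

          /* Requires privilege (CAP_SYS_NICE). */
          if (syscall(SYS_sched_setattr, 0, &attr, 0) < 0) {
              perror("sched_setattr");
              return 1;
          }
          /* From here the kernel schedules this task earliest-deadline-first. */
          puts("running under SCHED_DEADLINE (EDF)");
          return 0;
      }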

    Is Linux a RTOS? Explain your justification

    Linux is not an RTOS, but RTLinux is an RTOS version of Linux. RTLinux is a hard real-time microkernel that runs the entire Linux operating system as a fully pre-emptive process.
    Ref Qn 6

    Explain differences between General Purpose Operating System (GPOS), Real Time Operating System (RTOS) and Embedded Operating System (EOS)

    • GPOS
      A General Purpose Operating System (GPOS) is software that provides an interface for the user to access various applications. It runs on general-purpose hardware in commonly used systems such as desktop PCs (e.g. Windows 10).
    • RTOS
      A Real Time Operating System (RTOS) is software that operates under time constraints: it must deliver a predictable response to user applications within stringent time expectations (e.g. robotics). It can run on either general-purpose or purpose-designed hardware.
    • EOS
      An Embedded OS (EOS) is software that runs on specific hardware and runs a specific set of applications. Along with targeting specific hardware, an EOS faces resource constraints (e.g. size), so it should be customizable and configurable to meet the embedded system's needs.

    What is a real-time systems (RTS)? What are the different types of RTS?

    Real Time Operating System (RTOS) is designed to provide a predictable (normally described as deterministic) execution pattern. This is particularly of interest to embedded systems as embedded systems often have real time requirements. A real time requirement is one that specifies that the embedded system must respond to a certain event within a strictly defined time (the deadline). A guarantee to meet real time requirements can only be made if the behavior of the operating system's scheduler can be predicted.

    Soft-RTOS
    Soft real-time systems can miss some deadlines, but performance eventually degrades if too many are missed. Examples: multimedia streaming, computer games.
    Hard-RTOS
    Hard real-time means every deadline must absolutely be met. Very few systems have this requirement. Examples include nuclear systems, some medical applications such as pacemakers, a large number of defense applications, avionics, etc. In life-critical situations a hard real-time system must be used.

    Briefly explain the following algorithms:

    • FCFS (First Come First Served):
      Processes are dispatched strictly in the order they enter the ready queue, and the running process keeps the CPU until it finishes or blocks (non-pre-emptive). It is simple, but short jobs can be stuck waiting behind long ones.
    • Round Robin:
      1. Round Robin – time slice based
        In RR, time slices are assigned to each process in equal portions and in circular order, handling all processes without priority (also known as cyclic executive). Round-robin scheduling is simple, easy to implement, and starvation-free (see the simulation sketch after this list).
      2. Round Robin – priority based
        It works the same as time-sliced RR, but each process also carries a priority; a higher priority means a larger time slice.
    • Priority based:
      1. Rate monotonic
        Rate-monotonic scheduling (RMS) assigns static priorities according to the cycle duration of the job: the smaller the period, the higher the priority (see above).
      2. Earliest deadline first
        Earliest deadline first (EDF) maintains a priority queue that is searched for the process closest to its deadline; that process is scheduled next (see above).
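
    To make the time-slice mechanics concrete, here is a hedged toy simulation of time-sliced round robin; the burst times and quantum are made-up values.

      #include <stdio.h>

      #define NPROC   4
      #define QUANTUM 3                        /* time slice, arbitrary ticks */

      int main(void)
      {
          int remaining[NPROC] = {5, 8, 2, 6}; /* hypothetical CPU bursts */
          int done = 0, clock = 0;

          while (done < NPROC) {
              for (int p = 0; p < NPROC; p++) {          /* circular order */
                  if (remaining[p] == 0)
                      continue;
                  int slice = remaining[p] < QUANTUM ? remaining[p] : QUANTUM;
                  clock += slice;
                  remaining[p] -= slice;
                  printf("t=%2d  P%d ran %d tick(s)%s\n", clock, p, slice,
                         remaining[p] == 0 ? "  -> finished" : "");
                  if (remaining[p] == 0)
                      done++;
              }
          }
          return 0;
      }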

    What is pre-emptive and non-pre-emptive scheduling algorithms?

    • Non pre-emptive scheduling: the CPU is released only when the currently executing process gives it up voluntarily, i.e. it terminates or blocks (see the sketch after this list).
    • Pre-emptive scheduling: the operating system can take the CPU away from the running process in order to favor another (higher-priority) process.
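
    The "gives up the CPU voluntarily" case can be expressed directly: sched_yield() moves the calling thread to the back of the run queue for its priority so another runnable process can be dispatched. A trivial sketch:

      #include <stdio.h>
      #include <sched.h>

      int main(void)
      {
          for (int i = 0; i < 3; i++) {
              printf("did a unit of work (%d)\n", i);
              sched_yield();        /* voluntarily relinquish the CPU */
          }
          return 0;
      }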

    During which process states CPU scheduling decisions are made? Explain

    CPU scheduling decisions take place under one of four conditions:

    • When a process switches from the running state to the waiting state, such as for an I/O request or invocation of the wait() system call.
    • When a process switches from the running state to the ready state, for example in response to an interrupt.
    • When a process switches from the waiting state to the ready state, say at completion of I/O or a return from wait().
    • When a process terminates.

    Explain context switching in detail.

    Context switching is the procedure of storing the state of an active process so that the CPU can start executing a new one. For example, process A, with its address space and stack, is currently being executed by the CPU when a higher-priority process B becomes runnable; the CPU must remember the current state of process A so that it can suspend A, execute B, and later resume A where it left off. The steps involved are:

    • Suspending the progression of one process and storing the CPU's state (i.e., the context) for that process somewhere in memory.
    • Retrieving the context of the next process from memory and restoring it in the CPU's registers.
    • Returning to the location indicated by the program counter (i.e., returning to the line of code at which the process was interrupted) in order to resume the process.
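
    The effect is visible from user space: getrusage() reports how many times a process was switched out voluntarily (it blocked or yielded) versus involuntarily (it was pre-empted). A small sketch; the exact counts depend on system load.

      #include <stdio.h>
      #include <sched.h>
      #include <sys/resource.h>

      int main(void)
      {
          for (int i = 0; i < 100; i++)
              sched_yield();                 /* provoke voluntary switches */

          struct rusage ru;
          if (getrusage(RUSAGE_SELF, &ru) == 0) {
              printf("voluntary   context switches: %ld\n", ru.ru_nvcsw);
              printf("involuntary context switches: %ld\n", ru.ru_nivcsw);
          }
          return 0;
      }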

    What is scheduling?

    The act of determining which process in the ready state should be moved to the running state is known as Process Scheduling.
    The prime aim of the process scheduling system is to keep the CPU busy all the time and to deliver minimum response time for all programs. To achieve this, the scheduler must apply appropriate rules for switching processes in and out of the CPU.
