Interview Questions on Database

A database is simply an organized collection of related data, typically stored on disk, and accessible by possibly many concurrent users  . A common and powerful method for organising data for computerisation is the relational data model.

Here we have gathered the interview questions can be asked on Database.

  1. What is normalization?
  2. What are pros and cons of normalized database design?
  3. What are pros and cons of denormalized database design?
  4. What is difference between a where clause and a having clause ?
  5. What is difference between primary key and a unique key ?
  6. What are cursors in data base?explain the types of it ?
  7. What are triggers? How to invoke a trigger on demand?
  8. What is a join and its types ?
  9. What is self-join?
  10. Define candidate key, alternate key, and composite key.
  11. What is lock escalation ?
  12. What is constraints and its types?
  13. What is Index and its types ?
  14. What is deadlock and live lock in database ?
  15. How to resolve deadlock ?
  16. What is Blocking in database?

 

Q1.What is normalization? 

Normalization is the process of reorganizing data in a database so that it meets two basic requirements:

  1. There is no redundancy of data (all data is stored in only one place), and
  2. Data dependencies are logical (all related data items are stored together).

Normalization is important for many reasons, but chiefly because it allows databases to take up as little disk space as possible, resulting in increased performance.

 

Q2.What are pros and cons of normalized database design?

Normalized databases fare very well under conditions where the applications are write-intensive and the write-load is more than the read-load. 

This is because of the following reasons:

  • Normalized tables are usually smaller and have a smaller foot-print because the data is divided vertically among many tables. This allows them to perform better as they are small enough to get fit into the buffer.
  • The updates are very fast because the data to be updated is located at a single place and there are no duplicates.
  • Similarly the inserts are very fast because the data has to be inserted at a single place and does not have to be duplicated.
  • The selects are fast in cases where data has to be fetched from a single table, because normally normalized tables are small enough to get fit into the buffer.
  • Because the data is not duplicated so there is less need for heavy duty group by or distinct queries.

Although there seems to be much in favor of normalized tables, with all the pros outlined above, but the main cause of concern with fully normalized tables is that normalized data means joins between tables. And this joining means that read operations have to suffer because indexing strategies do not go well with table joins.

 

Q3.What are pros and cons of denormalized database design?

Denormalized databases fair well under heavy read-load and when the application is read intensive.

This is because of the following reasons:

  • The data is present in the same table so there is no need for any joins, hence the selects are very fast.
  • A single table with all the required data allows much more efficient index usage. If the columns are indexed properly, then results can be filtered and sorted by utilizing the same index. While in the case of a normalized table, since the data would be spread out in different tables, this would not be possible.

Although for reasons mentioned above selects can be very fast on denormalized tables, but because the data is duplicated, the updates and inserts become complex and costly.

 

Q4.What is difference between a where clause and a having clause ?

Both where and having clause are filters. The main difference is that over what it is filtering.

  1. WHERE clause is used for filtering rows and it applies on each and every row, while HAVING clause is used to filter groups .
  2. The WHERE clause specifies the criteria in which individual records must meet the criteria. It can be used without the GROUP BY clause. The HAVING clause cannot be used without the GROUP BY clause.
  3. The WHERE clause selects rows before grouping. The HAVING clause selects rows after grouping.
  4. The WHERE clause cannot contain aggregate functions. The HAVING clause can contain aggregate functions.
  5. The WHERE clause can be used any type of sql statements but The HAVING clause can be used with only group functions.

 

Q5.What is difference between primary key and a unique key ?

  1. A primary key is sort of a unique key identifier that uniquely identifies a row within a database table, while a unique key identifies all possible rows that exist in a table and not just the currently existing rows.
  2. A primary key is used to identify a record in a database table, whereas a unique key is used to prevent duplicate values in a column with the exception of a null entry.
  3. A primary key creates a clustered unique index by default while a unique key is a unique non-clustered index in a database table by default.
  4. A primary key cannot accept NULL values in a database table whereas a unique key can accept only one NULL value in the table.
  5. There can only be one and only one primary key on a table, however, there can be multiple unique keys for a table in a database system.



Q6.What are cursors in data base?explain the types of it ?

Cursor is a temporary work area created in the system memory when a SQL statement is executed. A cursor contains information on a select statement and the rows of data accessed by it.

This temporary work area is used to store the data retrieved from the database, and manipulate this data. A cursor can hold more than one row, but can process only one row at a time. The set of rows the cursor holds is called the active set.

There are two types of cursors in PL/SQL  :

  1. Implicit cursors.
  2. Explicit cursors.

Both implicit and explicit cursors have the same functionality, but they differ in the way they are accessed.

Implicit cursors :

These are created by default when DML statements like, INSERT, UPDATE, and DELETE statements are executed. They are also created when a SELECT statement that returns just one row is executed.

When you execute DML statements like DELETE, INSERT, UPDATE and SELECT statements, implicit statements are created to process these statements. Oracle provides few attributes called as implicit cursor attributes to check the status of DML operations. The cursor attributes available are %FOUND, %NOTFOUND, %ROWCOUNT, and %ISOPEN

Explicit Cursors :

They must be created when you are executing a SELECT statement that returns more than one row. Even though the cursor stores multiple records, only one record can be processed at a time, which is called as current row. When you fetch a row the current row position moves to next row.

An explicit cursor is defined in the declaration section of the PL/SQL Block. It is created on a SELECT Statement which returns more than one row. We can provide a suitable name for the cursor.

 

Q7.What are triggers? How to invoke a trigger on demand?

Triggers are special kind of stored procedures that gets executed automatically when an INSERT, UPDATE or DELETE operation takes place.

Triggers cannot be invoked on demand. They get executed automatically when defined event occurred.

 

Q8.What is a join and its types ?

Joins can be simply defined as the combining or merging the related tuples from the two different relations into a single type. We can classify joins basically into two types

  1. INNER JOINS
    • These joins are the one that has the tuples that satisfy some conditions and rest are discarded. Further they are classified as
  2. THETA JOIN
    • They have tuples from different relations if and only if they satisfy the theta condition, here the comparison operators (≤, ≥, ˂, ˃, =, ̚ )for theta join.
  3. EQUI JOIN
    • This join uses only the equality operator then it is called as equi join.
  4. NATURAL JOIN
    •  It does not utilize any of the comparison operator. Here the condition is that the attributes should have same name and domain.
    • There has to be at least one common attribute between two relations.
    • It forms the Cartesian product of two arguments, performs selection forming equality on those attributes that appear in both relations and eliminates the duplicate attributes. 
  5. OUTER JOINS
    •  These have all the tuples from either or both the relations. Further they are classified as
  6. LEFT JOIN or LEFT OUTER JOIN
    •  The result set of a Left Outer Join includes all the rows from the left table specified in the LEFT OUTER clause, not just the ones in which the joined columns match. When a row in the left table has no matching rows in the right table, the associated result set row contains null values for all select list columns coming from the right table.
  7. RIGHT JOIN or RIGHT OUTER JOIN
    •  A Right Outer Join is the reverse of a Left Outer Join. All rows from the right table are returned. Null values are returned for the left table any time a right table row has no matching row in the left table.
  8. FULL JOIN or FULL OUTER JOIN
    • A Full Outer Join returns all rows in both the Left and Right tables. Any time a row has no match in the other table, the select list columns from the other table contain null values. When there is a match between the tables, the entire result set row contains data values from the base tables

 

Q9.What is self-join?

A self-join is a join in which a table is joined with itself (which is also called Unary relationships), especially when the table has a FOREIGN KEY which references its own PRIMARY KEY.

A self-join, also known as an inner join, is a structured query language (SQL) statement where a queried table is joined to itself. The self-join statement is necessary when two sets of data, within the same table, are compared.

 

Q10.Define candidate key, alternate key, and composite key.

Candidate key is a set of fields that uniquely identify rows in a table. You can choose only one of the candidate keys as primary key. For example mobile number and email address can uniquely identify each row in customers table. They both are candidate keys. But you can choose only one of these as primary key.

Composite key is a primary key that has more than one column in it. E.g. mobile and email together identify a customer in a table.

Alternate key is a candidate key that has not been used as primary key. E.g. both email and mobile can identify customer uniquely. They are both candidate keys. You made email primary key. So the other key mobile is the alternate key.

 

Q11.What is lock escalation ?

If many locks of the same granularity are held during a transaction, the Database Engine automatically upgrades these locks into a table lock. This process of converting many page-, row-, or index-level locks into one table lock is called lock escalation.

The escalation threshold is the boundary at which the database system applies the lock escalation. Escalation thresholds are determined dynamically by the system and require no configuration.

 

Q12.What is constraints and its types?

Database constraints are restrictions on the contents of the database or on database operations. It is a condition specified on a database schema that restricts the data to be inserted in an instance of the database.

Types of Constraints
  • Domain Constraints
  • Tuple Uniqueness Constraints
  • Key Constraints
  • Single Value Constraints
  • Integrity Rule 1 (Entity Integrity Rule or Constraint)
  • Integrity Rule 2 (Referential Integrity Rule or Constraint)
  • General Constraints

 

Q13.What is Index and its types ?

Indexing is a data structure technique to efficiently retrieve records from the database files based on some attributes on which the indexing has been done. Indexing in database systems is similar to what we see in books.

Indexing is defined based on its indexing attributes. Indexing can be of the following types

  • Primary Index − Primary index is defined on an ordered data file. The data file is ordered on a key field. The key field is generally the primary key of the relation.
  • Secondary Index − Secondary index may be generated from a field which is a candidate key and has a unique value in every record, or a non-key with duplicate values.
  • Clustering Index − Clustering index is defined on an ordered data file. The data file is ordered on a non-key field.

 

Q14.What is deadlock and live lock in database ?

In a database, a deadlock is an unwanted situation in which two or more transactions are waiting indefinitely for one another to give up locks. Deadlock is said to be one of the most feared complications in DBMS as it brings the whole system to a Halt.

A livelock is similar to a deadlock, except that the states of the processes involved in the livelock constantly change with regard to one another, none progressing. Livelock is a special case of resource starvation; the general definition only states that a specific process is not progressing.

 

Q15.How to resolve deadlock ?

Deadlock Prevention

Deadlock Prevention ensures that the system never enters a deadlock state. 

Following are the requirements to free the deadlock

1. No Mutual Exclusion : No Mutual Exclusion means removing all the resources that are sharable. 

2. No Hold and Wait : Removing hold and wait condition can be done if a process acquires all the resources that are needed before starting out. 

3. Allow Preemption : Allowing preemption is as good as removing mutual exclusion. The only need is to restore the state of the resource for the preempted process rather than letting it in at the same time as the preemptor.

4. Removing Circular Wait : The circular wait can be removed only if the resources are maintained in a hierarchy and process can hold the resources in increasing the order of precedence.

Deadlock Avoidance
  • Deadlock Avoidance helps in avoiding the rolling back conflicting transactions.
  • It is not good approach to abort a transaction when a deadlock occurs.
  • Rather deadlock avoidance should be used to detect any deadlock situation in advance.

 

Q16.What is Blocking in database?

Database blocking occurs when a connection to the SQL server locks one or more records, and a second connection to the SQL server requires a conflicting lock type on the record, or records, locked by the first connection.

This results in the second connection waiting until the first connection releases its locks. A connection waits, by default, an unlimited amount of time for the blocking lock to cease.

One connection can block another connection, regardless of whether they emanate from the same application or separate applications on different client computers.