Netezza Interview Questions & Answers

1. What are the environment variables that are required to connect to Netezza?
Ans :

The environment variables required are: NZ_HOST, NZ_DATABASE, NZ_USER, NZ_PASSWORD
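
As a rough illustration, a client script could check that these four variables are set before it tries to connect. This is a minimal Python sketch; only the variable names come from the answer above, while the function name `read_nz_settings` and its error handling are assumptions made for the example.

```python
import os

# The four names come from the answer above; the rest is illustrative.
REQUIRED_VARS = ("NZ_HOST", "NZ_DATABASE", "NZ_USER", "NZ_PASSWORD")


def read_nz_settings(environ=None):
    """Return the four connection settings, or raise if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Calling `read_nz_settings()` with no argument reads the real process environment; passing a dictionary makes the check easy to try out without changing your shell.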

2. What are the different states of Netezza?
Ans :

Online: Normal or usual state.
Stopped: Netezza will shut down after completing current queries; no new queries are allowed.
Offline: Waits for completion of current queries; new or queued queries receive an error.
Paused: Same as Offline, but no error is displayed. Typically occurs during Netezza boot or startup.
Down: Just plain down, could be due to Netezza server problem or user initiated.

3. Which constraints on a table are enforced?
Ans :

The only constraint Netezza enforces is NOT NULL. Primary key, foreign key and unique constraints can be declared, but Netezza does not enforce them.

4. Can you insert duplicate rows in a Netezza table?
Ans :

Yes. Because Netezza does not enforce primary key or unique constraints, you can insert duplicate rows.

5. How does the NOT NULL specification on a column improve Netezza performance?
Ans :

Specifying NOT NULL on each column in a table results in better performance. Netezza tracks NULL values at the row header level, so having NULL values means storing references to them in the header. If all columns are NOT NULL, then there is no record header.

6. How can the FPGA help improve query performance?
Ans :

While reading data from the disk, the Field Programmable Gate Array (FPGA) on each SPU filters out unwanted data. This process of data elimination removes IO bottlenecks and frees up downstream components such as the CPU, memory and network from processing extra data.

7. What is a snippet?
Ans :

A snippet is a small unit of work that is carried out on an SPU.

8. What are zonemaps?
Ans :

An extent is the smallest unit of disk allocation on an SPU. Zonemaps are internal mapping structures over the extents that take advantage of the internal ordering of data to eliminate extents that do not need to be scanned, so scans transparently skip rows that a query cannot reference. Zonemaps are created automatically for columns of supported data types (for example integer, date and timestamp columns) and contain the minimum and maximum values for every extent.
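
The idea of skipping extents by their minimum and maximum values can be sketched in a few lines of Python. This is only a toy model, not how Netezza stores zonemaps: the extent size of four rows and the function names `build_zonemap` and `extents_to_scan` are assumptions chosen to keep the example small.

```python
EXTENT_ROWS = 4  # hypothetical, tiny extent size so the example is easy to follow


def build_zonemap(values, extent_rows=EXTENT_ROWS):
    """Split one column into extents and record (min, max) for each extent."""
    extents = [values[i:i + extent_rows] for i in range(0, len(values), extent_rows)]
    return [(min(extent), max(extent)) for extent in extents]


def extents_to_scan(zonemap, low, high):
    """Return the indexes of extents whose [min, max] range overlaps [low, high]."""
    return [i for i, (mn, mx) in enumerate(zonemap) if mx >= low and mn <= high]


# Ordered data: each extent covers a narrow range, so most extents are skipped.
ordered = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print(build_zonemap(ordered))                          # [(1, 4), (5, 8), (9, 12)]
print(extents_to_scan(build_zonemap(ordered), 6, 7))   # [1]

# Unordered data: every extent covers a wide range, so nothing can be skipped.
shuffled = [5, 1, 9, 3, 2, 8, 4, 7]
print(extents_to_scan(build_zonemap(shuffled), 6, 7))  # [0, 1]
```

The second case shows why the answer stresses the ordering of data: the same filter reads every extent when the values are not stored in order.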

9. How are zonemaps created and updated?
Ans :

Zonemaps are created and refreshed on every SPU when you run generate statistics, nzload, insert or update operations, or nzreclaim.

10. What are generate statistics and generate express statistics, and what is the difference between them?
Ans :

Generate statistics gathers statistics about each table column (proportion of duplicate values, minimum values, maximum values, null values and unique values) and updates the system catalog tables.
The difference between ‘generate statistics’ and ‘generate express statistics’ lies in how column uniqueness is calculated. ‘Generate express statistics’ estimates dispersion values from a sample of the rows in the table, whereas ‘generate statistics’ uses all the rows in the table.

11. What is the use of creating materialized views?
Ans :

A materialized view reduces the width (number of columns) of data being scanned in the base table by creating a thin version (fewer columns) of the base table that contains a small subset of frequently queried columns.

12. What is the distribution of materialized views?
Ans :

A materialized view has the same distribution key as the base table.

13. What are the limitations of materialized views?
Ans :

You cannot insert into, update, delete from or truncate a materialized view. Any changes to the base table are reflected in the materialized view.
You can specify only one base table in the from clause.
The base table can’t be an external table, system table or temporary table.
You cannot use a where clause in the materialized view.
Expressions are not allowed as columns.

14. What are the best practices of creating materialized views?
Ans :

Create materialized views with few columns which are frequently queried.
Specify order by clause on the most restrictive columns (columns used in where clause).
Periodically or manually refresh the materialized views.

15. What are the partitioning methods available in Netezza?
Ans :

There are two partitioning methods available in Netezza:
Random partitioning: Distributes the data randomly.
Hash partitioning: Distributes the data by hashing the values of the specified columns.
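
The difference between the two methods can be illustrated with a small Python sketch. It assumes four data slices and uses `zlib.crc32` as a stand-in hash; Netezza's actual hash function is internal, and the names `hash_slice` and `random_slice` are made up for the example.

```python
import random
import zlib

NUM_SLICES = 4  # hypothetical number of data slices


def hash_slice(key, num_slices=NUM_SLICES):
    """Hash partitioning: the same key value always lands on the same slice."""
    return zlib.crc32(str(key).encode("utf-8")) % num_slices


def random_slice(rng, num_slices=NUM_SLICES):
    """Random partitioning: the slice does not depend on the row's values."""
    return rng.randrange(num_slices)


rng = random.Random(0)
for customer_id in (101, 102, 101):
    print(customer_id, "hash ->", hash_slice(customer_id),
          "random ->", random_slice(rng))
```

With hash partitioning both rows for customer 101 are placed on the same slice, which is what makes joins on the distribution key local; with random partitioning there is no such guarantee.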

16. How many columns can you specify in the distribute on clause?
Ans :

You can specify up to four columns in the distribution clause.

17. If you did not specify any distribute on clause while creating a table, what distribution does Netezza use?
Ans :

Netezza distributes the data on the first column, using hash partitioning.

18. Can you update the columns used in distribution clause?
Ans :

No. A column that is used in the distribution clause cannot be updated.

19. What data types are most suited for the columns specified in distribution clause?
Ans :

Integer

20. How do you redistribute a table?
Ans :

Use Create Table As (CTAS) to redistribute the data in a table. While creating the new table specify the distribute on clause to distribute the data on the new columns.

21. If you did not specify any distribution clause, how will Create Table As (CTAS) distribute the rows?
Ans :

CTAS inherits the distribution from the original table.

22. How do you check whether the rows in a table are equally distributed across all SPUs?
Ans :

To check the distribution of rows, run the following query:
SELECT datasliceid, COUNT(*) FROM <table name> GROUP BY datasliceid;
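
Once you have the per-slice counts from this query, a simple way to judge skew is to compare the largest slice with the average. The Python sketch below assumes the result has been fetched as a list of `(datasliceid, count)` pairs; the function name `skew_ratio` and the sample numbers are invented for illustration.

```python
def skew_ratio(slice_counts):
    """Return the largest per-slice row count divided by the average count.

    A value of 1.0 means the rows are spread perfectly evenly. Slices that
    hold no rows do not appear in the GROUP BY result, so they are not
    counted here.
    """
    counts = [count for _, count in slice_counts]
    average = sum(counts) / len(counts)
    return max(counts) / average


even = [(1, 250), (2, 250), (3, 250), (4, 250)]
skewed = [(1, 700), (2, 100), (3, 100), (4, 100)]
print(skew_ratio(even))    # 1.0
print(skew_ratio(skewed))  # 2.8
```

In the skewed case one slice holds 2.8 times the average number of rows, so the SPU that owns it does far more work than the others on every scan.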

23. What is a collocated join?
Ans :

When you join tables that are distributed on the same key and use those key columns in the join condition, each SPU works independently of the others, because all the data it needs is available locally. This type of join is called a collocated join.
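
The reason no data has to move can be seen in a small Python model. It is a sketch under stated assumptions, not Netezza's implementation: four slices, `zlib.crc32` as a stand-in hash, the join key in the first position of every row, and made-up helper names.

```python
import zlib

NUM_SLICES = 4  # hypothetical number of data slices


def slice_for(key):
    """Stand-in for the internal hash: equal keys map to the same slice."""
    return zlib.crc32(str(key).encode("utf-8")) % NUM_SLICES


def distribute(rows):
    """Place each row on a slice according to its key (first field)."""
    slices = [[] for _ in range(NUM_SLICES)]
    for row in rows:
        slices[slice_for(row[0])].append(row)
    return slices


def collocated_join(left_slices, right_slices):
    """Join slice by slice; a row is only compared with rows on its own slice."""
    result = []
    for left_rows, right_rows in zip(left_slices, right_slices):
        for left_row in left_rows:
            for right_row in right_rows:
                if left_row[0] == right_row[0]:
                    result.append(left_row + right_row[1:])
    return result


customers = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
orders = [(1, 9.5), (3, 20.0), (3, 4.25)]
joined = collocated_join(distribute(customers), distribute(orders))
print(sorted(joined))  # [(1, 'Ann', 9.5), (3, 'Cy', 4.25), (3, 'Cy', 20.0)]
```

Because both tables are placed by the same key, matching rows always share a slice, and the slice-local joins together give the same result as joining the full tables.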

24. When does Netezza redistribute a table, and when does it broadcast a table?
Ans :

Whenever a collocated join is not possible, Netezza either redistributes or broadcasts a table. When the table is small, Netezza broadcasts it; otherwise, Netezza redistributes it.

25. How do you remove logically deleted records?
Ans :

Whenever you delete a row in a table, it is not physically deleted. It is logically deleted by flagging the deletexid field of the row. The nzreclaim utility is used to remove the logically deleted records.

26. What is nzload?
Ans :

The nzload utility is used to load data from a file into a table. It loads bulk data quickly and, at the same time, rejects erroneous records.

27. What are the ways to unload data from a table into a file?
Ans :

Create an external table.
Use nzsql utility with -o option.

28. What are the different ways to load data from a file into a table?
Ans :

Use nzload to load the data from a file into a table.
Create an external table and then load the original table using the external table.

29. How does Netezza update a row in a table?
Ans :

Netezza logically deletes the original row by flagging the deletexid column with the current transaction id and inserts a new row with the updated values.
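
The delete-then-insert behaviour, together with the later physical cleanup, can be modelled in Python as follows. This is a simplified sketch: the `RowVersion` class, the function names, and the use of 0 to mean "not deleted" are assumptions for the example, and `createxid` is included only to show which transaction wrote each version.

```python
from dataclasses import dataclass


@dataclass
class RowVersion:
    values: dict
    createxid: int      # transaction that inserted this version
    deletexid: int = 0  # 0 here means the version is still live (assumption)


def update_rows(table, predicate, new_values, xid):
    """Flag matching live versions with xid and append updated copies."""
    live_matches = [r for r in table if r.deletexid == 0 and predicate(r.values)]
    for row in live_matches:
        row.deletexid = xid  # logical delete of the old version
        table.append(RowVersion({**row.values, **new_values}, createxid=xid))


def reclaim(table):
    """Return only live versions, mimicking the physical removal of flagged rows."""
    return [r for r in table if r.deletexid == 0]


table = [RowVersion({"id": 1, "city": "Pune"}, createxid=10)]
update_rows(table, lambda v: v["id"] == 1, {"city": "Delhi"}, xid=11)
print(len(table))           # 2
print(len(reclaim(table)))  # 1
```

After the update the table holds both versions of the row, which is why space is only given back when the logically deleted records are reclaimed.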
