Apache Storm Interview Questions
Q. What is Apache Storm?
Apache Storm is a free and open source distributed realtime computation system. Apache Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Apache Storm is simple and can be used with any programming language.
Q. What are the different types of nodes on a Storm cluster?
There are two kinds of nodes on a Storm cluster: the master node and the worker nodes. The master node runs a daemon called “Nimbus” that is similar to Hadoop’s “JobTracker”. Nimbus is responsible for distributing code around the cluster, assigning tasks to machines, and monitoring for failures. Each worker node runs a daemon called the “Supervisor”. The supervisor listens for work assigned to its machine and starts and stops worker processes as necessary based on what Nimbus has assigned to it. Each worker process executes a subset of a topology; a running topology consists of many worker processes spread across many machines.
Q. What are Topologies in Apache Storm?
A topology is a graph of computation. Each node in a topology contains processing logic, and links between nodes indicate how data should be passed around between nodes. To do realtime computation on Storm, we need to create “topologies”. Since topology definitions are just Thrift structs, and Nimbus is a Thrift service, you can create and submit topologies using any programming language.
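The idea of a topology as a graph of computation can be illustrated with a small sketch. This is plain Python, not the Storm API: the Topology class, its add_node and run methods, and the node names are all hypothetical and exist only to show nodes holding logic and links carrying tuples downstream.

```python
from collections import defaultdict


class Topology:
    """Toy stand-in for a topology: named nodes plus directed links (not Storm's API)."""

    def __init__(self):
        self.logic = {}                  # node name -> processing function
        self.links = defaultdict(list)   # node name -> downstream node names

    def add_node(self, name, fn, inputs=()):
        # Register the node's logic and link it to each upstream node.
        self.logic[name] = fn
        for upstream in inputs:
            self.links[upstream].append(name)

    def run(self, source, tuples):
        """Push each tuple from `source` through the graph; return what each node emitted."""
        emitted = defaultdict(list)
        queue = [(source, t) for t in tuples]
        while queue:
            node, tup = queue.pop(0)
            for out in self.logic[node](tup):
                emitted[node].append(out)
                queue.extend((child, out) for child in self.links[node])
        return emitted


topology = Topology()
topology.add_node("sentences", lambda t: [t])                        # plays the role of a source
topology.add_node("split", lambda t: t.split(), inputs=["sentences"])
topology.add_node("upper", lambda t: [t.upper()], inputs=["split"])

result = topology.run("sentences", ["storm is fast", "storm scales"])
print(result["upper"])  # ['STORM', 'IS', 'FAST', 'STORM', 'SCALES']
```

A real topology runs its nodes in parallel across many worker processes; this sketch runs everything in one loop purely to show the graph structure.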
Q. What are Streams in Apache Storm?
The core abstraction in Storm is the “stream”. A stream is an unbounded sequence of tuples. Storm provides the primitives for transforming a stream into a new stream in a distributed and reliable way.
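As a rough analogy, a stream behaves like a generator that never ends, and each transformation produces a new stream from an existing one. The sketch below uses plain Python generators, not Storm itself; sensor_stream, only_high, and to_alert are made-up names for illustration, and nothing here is distributed or fault-tolerant.

```python
import itertools


def sensor_stream():
    """An unbounded stream: yields (sensor_id, reading) tuples forever."""
    for n in itertools.count():
        yield ("sensor-%d" % (n % 2), n * 10)


def only_high(stream, threshold):
    """Transform one stream into a new stream by filtering tuples."""
    return (tup for tup in stream if tup[1] >= threshold)


def to_alert(stream):
    """Transform one stream into a new stream by mapping tuples."""
    return (("ALERT", sensor, value) for sensor, value in stream)


alerts = to_alert(only_high(sensor_stream(), threshold=20))

# The stream never ends, so only a finite prefix can be inspected.
first_three = list(itertools.islice(alerts, 3))
print(first_three)
# [('ALERT', 'sensor-0', 20), ('ALERT', 'sensor-1', 30), ('ALERT', 'sensor-0', 40)]
```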
Q. What are Spouts in Apache Storm?
A spout is a source of streams in a topology. Generally spouts will read tuples from an external source and emit them into the topology.
Spouts can emit more than one stream. To do so, declare multiple streams using the declareStream method of OutputFieldsDeclarer and specify the stream to emit to when using the emit method on SpoutOutputCollector.
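The declare-then-emit pattern for multiple streams can be sketched as follows. This is a simplified Python imitation, not Storm code: ToyDeclarer and ToyCollector are hypothetical stand-ins for OutputFieldsDeclarer and SpoutOutputCollector, LogSpout and its stream names are invented, and next_tuple takes the input line as an argument only to keep the example self-contained.

```python
class ToyDeclarer:
    """Stand-in for OutputFieldsDeclarer: records declared stream ids and their fields."""

    def __init__(self):
        self.streams = {}

    def declare_stream(self, stream_id, fields):
        self.streams[stream_id] = tuple(fields)


class ToyCollector:
    """Stand-in for SpoutOutputCollector: stores emitted tuples per stream id."""

    def __init__(self, declarer):
        self.declarer = declarer
        self.emitted = {stream_id: [] for stream_id in declarer.streams}

    def emit(self, stream_id, values):
        fields = self.declarer.streams[stream_id]  # KeyError if the stream was never declared
        if len(values) != len(fields):
            raise ValueError("expected fields %r, got %r" % (fields, values))
        self.emitted[stream_id].append(tuple(values))


class LogSpout:
    """Hypothetical spout that routes log lines to an 'errors' or an 'info' stream."""

    def declare_output_fields(self, declarer):
        declarer.declare_stream("errors", ["line"])
        declarer.declare_stream("info", ["line"])

    def next_tuple(self, line, collector):
        # Name the target stream explicitly when emitting.
        stream_id = "errors" if line.startswith("ERROR") else "info"
        collector.emit(stream_id, [line])


spout = LogSpout()
declarer = ToyDeclarer()
spout.declare_output_fields(declarer)
collector = ToyCollector(declarer)
for line in ["ERROR disk full", "started worker", "ERROR timeout"]:
    spout.next_tuple(line, collector)
print(collector.emitted)
```

Downstream components can then subscribe to one stream or the other by its id, so error handling and normal processing follow separate paths.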
Q. What is the difference between Apache Storm and Kafka?
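Apache Storm:
- It is a computation system that processes data as it flows from input streams to output streams.
- Storm does not depend on Kafka; it can read from any source for which a spout exists.
- It was created by Nathan Marz at BackType and open-sourced after Twitter acquired the company.
- Storm can be used with any programming language.
Kafka:
- It is a message broker that can handle large volumes of messages.
- Kafka has traditionally depended on ZooKeeper for coordination.
- Kafka was created at LinkedIn.
- It also has clients for many languages, but Java is recommended.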
Q. What are the features of Apache Storm?
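Storm is easy to operate, and its configuration options make it easy to deploy and use.
It is fast: a benchmark clocked it at over a million tuples processed per second per node.
It is fault-tolerant: it detects failed workers and restarts them automatically.
It guarantees that each tuple is processed at least once, so a tuple may be processed more than once after a failure. Exactly-once semantics are available through the Trident API.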
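Why data can end up being processed more than once is easiest to see with a replay loop. The sketch below is a minimal Python model of replay-on-failure, not Storm's actual acking mechanism; process_at_least_once, flaky_handler, and the simulated failure are all assumptions made for illustration.

```python
def process_at_least_once(messages, handler, max_attempts=3):
    """Replay each message until the handler succeeds or attempts run out.

    Returns every (message, attempt) pair handed to the handler, so that
    duplicate deliveries caused by replays are visible.
    """
    deliveries = []
    for message in messages:
        for attempt in range(1, max_attempts + 1):
            deliveries.append((message, attempt))
            try:
                handler(message, attempt)
                break       # success: treated as an ack
            except RuntimeError:
                continue    # failure: treated as a fail, so the message is replayed
    return deliveries


def flaky_handler(message, attempt):
    # Hypothetical failure: message "b" fails on its first delivery only.
    if message == "b" and attempt == 1:
        raise RuntimeError("simulated worker failure")


deliveries = process_at_least_once(["a", "b", "c"], flaky_handler)
print(deliveries)  # [('a', 1), ('b', 1), ('b', 2), ('c', 1)]
```

Message "b" is delivered twice, which is why downstream logic in an at-least-once system should either tolerate duplicates or be idempotent.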
Q. Name the different stream groupings in Apache Storm.
The different stream groupings in Apache Storm are:
- Shuffle grouping
- Fields grouping
- Global grouping
- All grouping
- None grouping
- Direct grouping
- Local or shuffle grouping
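A grouping decides which task or tasks of the receiving component get each tuple. The sketch below models four of the groupings as plain Python functions that return target task indices. It is not Storm's implementation: the function signatures are hypothetical, and CRC32 is used only as a stand-in for whatever hash Storm applies to the grouping fields.

```python
import random
import zlib


def shuffle_grouping(tup, num_tasks, rng):
    """Pick a random task so that load is spread evenly."""
    return [rng.randrange(num_tasks)]


def fields_grouping(tup, num_tasks, key_index=0):
    """Hash the grouping field so equal values always reach the same task."""
    key = str(tup[key_index]).encode("utf-8")
    return [zlib.crc32(key) % num_tasks]


def all_grouping(tup, num_tasks):
    """Replicate the tuple to every task."""
    return list(range(num_tasks))


def global_grouping(tup, num_tasks):
    """Send the whole stream to a single task (here, the lowest task index)."""
    return [0]


words = [("storm",), ("kafka",), ("storm",)]
targets = [fields_grouping(w, num_tasks=4) for w in words]
print(targets[0] == targets[2])  # True: both "storm" tuples go to the same task
```

Fields grouping is the usual choice when a task keeps per-key state, such as a running word count, because every tuple with the same key must arrive at the same task.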
Q. Name the two types of nodes in the cluster architecture.
Nimbus (master node)
Supervisor (worker node)
Q. How do you use an Apache Storm tuple?
A tuple does not require additional attributes for fields grouping.
It can reduce the number of bytes shipped.
It can also preserve the advantages of the beans pattern.
Q. What are the two modes of a Storm cluster?
Local mode simulates a Storm cluster in a single process. It lets us adjust parameters and see how our topology runs under different Storm configurations.
In production mode, the topology is submitted to a Storm cluster, which is composed of many processes, usually running on different machines.
Q. What is combinerAggregator?
A CombinerAggregator is used to combine a set of tuples into a single field.
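In Storm's Trident API, a CombinerAggregator is defined by three methods: init, combine, and zero. The sketch below mirrors that shape in plain Python to show how a set of tuples collapses into one value. The Count class and the aggregate helpers are illustrative stand-ins written for this example, not Trident code.

```python
from functools import reduce


class Count:
    """Python stand-in shaped like a combiner aggregator that counts tuples."""

    def init(self, tup):
        return 1                # each tuple contributes one to the count

    def combine(self, val1, val2):
        return val1 + val2      # merge two partial counts

    def zero(self):
        return 0                # result when there are no tuples


def aggregate(aggregator, tuples):
    """Collapse a set of tuples into a single value."""
    partials = (aggregator.init(t) for t in tuples)
    return reduce(aggregator.combine, partials, aggregator.zero())


def aggregate_partitioned(aggregator, partitions):
    """Combine each partition locally, then combine the partial results."""
    partials = [aggregate(aggregator, p) for p in partitions]
    return reduce(aggregator.combine, partials, aggregator.zero())


partitions = [[("a",), ("b",)], [], [("c",)]]
total = aggregate_partitioned(Count(), partitions)
print(total)  # 3
```

Because combine merges partial results, each partition can be reduced locally first and only the partial values need to be sent over the network.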
Q. How is storm application beneficial in financial services?
Storm can be helpful by:
Q. What are the components of Storm?
Nimbus – Distributes code across the cluster, assigns tasks to worker machines, and monitors for failures.
ZooKeeper – Coordinates communication between Nimbus and the Supervisors.
Supervisor – Runs on each worker node, communicates with Nimbus through ZooKeeper, and starts and stops worker processes as assigned.