We have seen Twitter analysis using Hive in many places. Here I am going to present my way of analyzing the sentiment of tweets using Apache Pig.

What do we want to do?
We want to analyze tweets to check whether they contain positive or negative emotions. A tweet reflects the emotions of the person at the moment he or she posted it, for example "got this Job done...Hurrahh.." or "xyz Movie sucks!!!! worst movie I ever saw....".

What do we need?
We need to fetch tweets from Twitter into HDFS so that we can do our analysis with the Hadoop ecosystem (Apache Pig here).

How will we do it?
We will use Apache Flume to fetch tweets from Twitter into HDFS; the Flume version I am using here is apache flume-1.4.0. Then we will do some text analysis on the tweets posted by Twitter users to check whether they contain positive or negative emotions...
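To make the idea of "positive or negative emotions" concrete before any Hadoop tooling is involved, here is a minimal shell sketch of a word-list check on the two example tweets. It is only an illustration and not the Pig analysis this post works toward: the word lists and the function names count_matches and classify_tweet are made up for this example, and a real analysis would need a proper sentiment dictionary.

```shell
#!/usr/bin/env bash
# Naive word-list sentiment check (illustration only).
# ASSUMPTION: these tiny word lists are hypothetical, not a real dictionary.
POSITIVE_WORDS="hurrahh great good love awesome"
NEGATIVE_WORDS="sucks worst bad hate awful"

# Count how many words of the tweet ($1) appear in the word list ($2).
count_matches() {
  local tweet_words list_word count=0
  # Lowercase the tweet and put each word on its own line,
  # treating every run of non-letters as a separator.
  tweet_words=$(printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -cs '[:alpha:]' '\n')
  for list_word in $2; do
    # grep -c counts matching lines; -x requires a whole-line match,
    # -F treats the word as a fixed string.
    count=$(( count + $(printf '%s\n' "$tweet_words" | grep -cxF -- "$list_word") ))
  done
  printf '%s\n' "$count"
}

# Print "positive", "negative" or "neutral" for the tweet in $1.
classify_tweet() {
  local pos neg
  pos=$(count_matches "$1" "$POSITIVE_WORDS")
  neg=$(count_matches "$1" "$NEGATIVE_WORDS")
  if [ "$pos" -gt "$neg" ]; then
    echo positive
  elif [ "$neg" -gt "$pos" ]; then
    echo negative
  else
    echo neutral
  fi
}

classify_tweet "got this Job done...Hurrahh.."                   # prints: positive
classify_tweet "xyz Movie sucks!!!! worst movie I ever saw...."  # prints: negative
```

A tweet with no listed words, or with equally many positive and negative ones, comes out as neutral in this sketch.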
We are using Apache Flume to fetch Twitter data and store it in HDFS, so let's get started.

Download Apache Flume. In your Unix terminal, type this command:

    wget http://apache.mirrors.hoobly.com/flume/1.4.0/apache-flume-1.4.0-bin.tar.gz

Create a directory named "flume-ng" in your /usr/lib folder:

    sudo mkdir /usr/lib/flume-ng

Now copy the Flume tar file you downloaded to the /usr/lib/flume-ng directory you just created:

    sudo cp -r apache-flume-1.4.0-bin.tar.gz /usr/lib/flume-ng/

Check that the tar file was copied to your flume-ng directory:

    ls /usr/lib/flume-ng/

Next, untar the file in the flume-ng directory. First change from your home directory to /usr/lib/flume-ng/:

    cd /usr/lib/flume-ng/

and now untar the file:

    sudo tar -xvf /usr/lib/flume-ng/apache-flume-1.4.0-bin.tar.gz

...
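The download and unpack steps above can also be collected into one script. This is a sketch under a few assumptions of mine, not part of the original walkthrough: DRY_RUN, run and install_flume are hypothetical names introduced here, the script uses mkdir -p so that re-running it does not fail on an existing directory, and it uses tar's -C option instead of a separate cd. By default it only prints the commands; set DRY_RUN=0 to actually execute them.

```shell
#!/usr/bin/env bash
# Sketch: the Flume download and unpack steps from this post in one script.
# ASSUMPTION: DRY_RUN is a switch invented for this example. It defaults
# to 1, so the script only prints the commands. Use DRY_RUN=0 to run them.
FLUME_VERSION="1.4.0"
FLUME_TARBALL="apache-flume-${FLUME_VERSION}-bin.tar.gz"
FLUME_URL="http://apache.mirrors.hoobly.com/flume/${FLUME_VERSION}/${FLUME_TARBALL}"
FLUME_DIR="/usr/lib/flume-ng"

# Print the command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Run the steps in order and stop at the first one that fails.
install_flume() {
  run wget "$FLUME_URL" || return 1
  run sudo mkdir -p "$FLUME_DIR" || return 1
  run sudo cp "$FLUME_TARBALL" "$FLUME_DIR/" || return 1
  run ls "$FLUME_DIR/" || return 1
  # -C makes tar extract into FLUME_DIR, so no cd is needed first.
  run sudo tar -xvf "$FLUME_DIR/$FLUME_TARBALL" -C "$FLUME_DIR" || return 1
}

install_flume
```

Reviewing the printed commands first is a cheap way to catch a wrong path or version before anything is written under /usr/lib.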