Posts

Twitter sentiment analysis in Hadoop using Apache Pig

We have seen Twitter analysis using Hive in many places; here I am going to present my way of analyzing tweet sentiments using Apache Pig.

What do we want to do? We want to analyse tweets to check whether they contain positive or negative emotions. A tweet reflects a person's emotions at the moment he or she posted it, like "got this Job done...Hurrahh.." or "xyz Movie sucks!!!! worst movie I ever saw....".

What do we need? We need to fetch tweets from Twitter into HDFS so that we can do our analysis using the Hadoop ecosystem (Apache Pig here).

How will we do it? We will use Apache Flume to fetch tweets from Twitter into HDFS; the Flume version I am using here is apache-flume-1.4.0. Then we will do some text analysis on the tweets posted by Twitter users to check whether they contain positive or negative emotions. We will use Apache Pig for this purpose (the version I have used is apache-pig-0.11.0), and we will write ...
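The word-level scoring described above can be sketched in Pig Latin roughly as follows. This is only an illustration, not the post's actual script: the file names `tweets.txt` and `dictionary.txt` are placeholders I have assumed, with one tweet per line and tab-separated word/rating pairs respectively.

```
-- Hedged sketch: score each tweet by summing per-word sentiment ratings.
-- 'tweets.txt' and 'dictionary.txt' are assumed placeholder inputs.
tweets = LOAD 'tweets.txt' AS (text:chararray);

-- split each tweet into lowercase words, keeping the original tweet text
words  = FOREACH tweets GENERATE text, FLATTEN(TOKENIZE(LOWER(text))) AS word;

-- dictionary of word -> rating, e.g. "sucks<TAB>-5", "hurrah<TAB>4"
dict   = LOAD 'dictionary.txt' AS (word:chararray, rating:int);

-- left outer join so words missing from the dictionary are kept with a null rating
rated  = JOIN words BY word LEFT OUTER, dict BY word;

-- SUM ignores nulls, so unrated words contribute nothing to the score
scored = FOREACH (GROUP rated BY words::text)
         GENERATE group AS tweet, SUM(rated.dict::rating) AS sentiment;

DUMP scored;
```

A tweet whose summed rating comes out positive would be classed as carrying positive emotion, and negative likewise.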
Recent posts

Apache Flume to fetch Twitter data

We are using Apache Flume to fetch Twitter data and store it into HDFS. So let's get started; the Flume version I am using here is apache-flume-1.4.0.

Download Apache Flume - on your unix terminal, type this command:
wget http://apache.mirrors.hoobly.com/flume/1.4.0/apache-flume-1.4.0-bin.tar.gz

Create the directory "flume-ng" - create a directory in your /usr/lib folder with this command:
sudo mkdir /usr/lib/flume-ng

Now copy the Flume tar file you downloaded into the /usr/lib/flume-ng directory you just created. The command is:
sudo cp -r apache-flume-1.4.0-bin.tar.gz /usr/lib/flume-ng/

Check that your tar file was copied to the flume-ng directory:
ls /usr/lib/flume-ng/

Untar the tar file in the flume-ng directory. First change your directory from your home directory to /usr/lib/flume-ng/:
cd /usr/lib/flume-ng/
and now untar the file with:
sudo tar -xvf /usr/lib/flume-ng/apache-flume-1.4.0-bin.tar.gz
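Once Flume is unpacked, fetching tweets requires a Flume agent configuration wiring a Twitter source to an HDFS sink. The sketch below is an assumed example, not the post's actual config: the agent name `TwitterAgent`, the HDFS path, and the credential placeholders are all mine, and you would need keys from your own Twitter developer app.

```
# Hedged sketch of a flume.conf for Twitter ingestion (names are assumptions).
TwitterAgent.sources  = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks    = HDFS

# Twitter source bundled with Flume 1.4; fill in your own app credentials
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sources.Twitter.consumerKey = <your-consumer-key>
TwitterAgent.sources.Twitter.consumerSecret = <your-consumer-secret>
TwitterAgent.sources.Twitter.accessToken = <your-access-token>
TwitterAgent.sources.Twitter.accessTokenSecret = <your-access-token-secret>

# HDFS sink; the path is a placeholder for wherever you want tweets to land
TwitterAgent.sinks.HDFS.type = hdfs
TwitterAgent.sinks.HDFS.channel = MemChannel
TwitterAgent.sinks.HDFS.hdfs.path = hdfs://localhost:9000/user/flume/tweets
TwitterAgent.sinks.HDFS.hdfs.fileType = DataStream

# simple in-memory channel between source and sink
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10000
```

The agent would then be started with something like bin/flume-ng agent -n TwitterAgent -f flume.conf from the Flume directory.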

Apache Hadoop pseudo-distributed cluster on an Ubuntu virtual machine

Hi, here I am going to show you how to set up a pseudo-distributed (single node) Hadoop cluster on an Ubuntu VM.

Prerequisites - an understanding of Hadoop, VMware Player, and Ubuntu.

Things you need - VMware Player: a simple Google search will take you to the VMware web site, which gives you info and a download link for the latest version of the player; download and install it. An Ubuntu VM image: again, Google will help you here (the version I used is ubuntu-14.04). And of course a laptop :-).

Setting up the VM - after installing VMware Player and extracting Ubuntu to a directory of your choice, double-click the VMware icon on your desktop, click on "Open a Virtual Machine", and go to the directory where you extracted Ubuntu; you will find a ubuntu.vmx file there, so double-click on it and then play the VM. (You can edit the VM's settings later if you want to.)

Updating Ubuntu - go to the terminal and give this command:
sudo apt-get update