Apache Hadoop Command Set
  What types?
  What are they?
  What do they do?
  Environment
  Configuration

Hadoop commands – What types?
  User commands
  Administration commands
  Generic options for all commands
  Configuration options
  Environment variables, e.g. HADOOP_PREFIX
  Aliases, e.g. hls = hadoop fs -ls


Hadoop commands – What are they?
  User commands
    archive – save files to a har archive
    distcp – copy files or directories recursively
    fs – file system commands
      cat – copy file contents to stdout
      chgrp – change the group associated with a file
      chmod – change file permissions
      chown – change file ownership
      copyFromLocal – like put, but the source must be a local file reference
      copyToLocal – like get, but the destination must be a local file reference


Hadoop commands – What are they?
  User commands
    fs – file system commands
      count – count directories, files and bytes
      cp – copy files
      du – size of files and directories
      dus – display a summary of file lengths
      expunge – empty trash
      get – copy files to the local file system
      getmerge – like get, but merges the source files into one local file
      ls – file listing
      lsr – recursive ls

Hadoop commands – What are they?
  User commands
    fs – file system commands
      mkdir – make directory
      moveFromLocal – like put, but deletes the local source after the copy
      mv – move from source to destination
      put – copy from the local file system to the destination file system
      rm – remove a file
      rmr – recursive delete
      setrep – change file replication factor
      stat – return file stat information

Hadoop commands – What are they?
  User commands
    fs – file system commands
      tail – display end of file
      test – check file existence / type
      text – output file as text
      touchz – create zero length file
    fsck – HDFS file system check
    fetchdt – get delegation token from name node
    jar – run jar file
    job – manage MapReduce jobs
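The fs commands listed above can be combined into a short round trip: make a directory, put a file, list it, look at its tail, and get it back. This is only a sketch. The function name hdfs_roundtrip, its two arguments and the /tmp output location are hypothetical and chosen for illustration.

```shell
# Sketch only: hdfs_roundtrip and its arguments are made-up names.
# The ( ... ) body runs in a subshell, so the variables stay local.
hdfs_roundtrip() (
    local_file=$1                      # file on the local file system
    hdfs_dir=$2                        # target directory in HDFS
    name=$(basename "$local_file")

    hadoop fs -mkdir "$hdfs_dir" &&                    # make directory
    hadoop fs -put "$local_file" "$hdfs_dir/" &&       # local -> HDFS
    hadoop fs -ls "$hdfs_dir" &&                       # file listing
    hadoop fs -tail "$hdfs_dir/$name" &&               # display end of file
    hadoop fs -get "$hdfs_dir/$name" "/tmp/${name}.copy"   # HDFS -> local
)
```

Example call, assuming the paths exist and the user may write to them: hdfs_roundtrip /etc/hosts /user/demo/in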

Hadoop commands – What are they?
  User commands
    pipes – run a pipes job
    queue – interact with and view the job queue
    version – get Hadoop version
    CLASSNAME – run class named CLASSNAME
    classpath – print the class path

Hadoop commands – What are they?
  Administration commands
    balancer – run cluster balancing
    daemonlog – get/set daemon log level
    datanode – run HDFS data node
    dfsadmin – run dfsadmin client
    mradmin – run MapReduce admin client
    jobtracker – run MapReduce job tracker node
    namenode – run the name node
    secondarynamenode – run secondary name node
    tasktracker – run task tracker node
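As a rough illustration of how the version, dfsadmin and fsck commands fit together, the sketch below runs a quick look at cluster state. The function name cluster_check is hypothetical, and the choice of checks is an assumption rather than a recommended procedure.

```shell
# Sketch only: cluster_check is a made-up name for illustration.
# Each step runs only if the previous one succeeded.
cluster_check() (
    hadoop version &&                  # which Hadoop release is on the PATH
    hadoop dfsadmin -report &&         # capacity and data node summary
    hadoop dfsadmin -safemode get &&   # is the name node in safe mode?
    hadoop fsck /                      # HDFS file system check from the root
)
```

Run it as the user that owns the HDFS daemons; dfsadmin generally needs superuser rights on the file system.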

Hadoop Environment
  See the .bashrc for environment set up

  ##export HADOOP_HOME=/usr/local/hadoop ## deprecated
  export HADOOP_PREFIX=/usr/local/hadoop
  export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386

  unalias hfs &> /dev/null
  alias hfs="hadoop fs"
  unalias hls &> /dev/null ; alias hls="hfs -ls"
  unalias hup1 &> /dev/null ; alias hup1="cd $HADOOP_PREFIX/bin ; ./"
  unalias hup2 &> /dev/null ; alias hup2="cd $HADOOP_PREFIX/bin ; ./"
  unalias hdwn1 &> /dev/null ; alias hdwn1="cd $HADOOP_PREFIX/bin ; ./"
  unalias hdwn2 &> /dev/null ; alias hdwn2="cd $HADOOP_PREFIX/bin ; ./"

  # if using LZO compression then add entry here for viewing
  # LZO compressed files

  ##PATH=$PATH:$HADOOP_HOME/bin ## deprecated
  PATH=$PATH:$HADOOP_PREFIX/bin
  PATH=$PATH:$JAVA_HOME/bin
  export PATH
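Two details of a set up like this are worth a sketch. Each new shell that reads the .bashrc appends to PATH again, and bash does not expand aliases in non-interactive shells unless expand_aliases is set, so hfs and hls are not available inside scripts. The variant below guards the PATH append and uses shell functions instead of aliases. The helper name path_append is hypothetical, and the default prefix simply repeats the value used above.

```shell
# Sketch only: path_append is a made-up helper name.
# Append a directory to PATH unless it is already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                   # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
}

HADOOP_PREFIX=${HADOOP_PREFIX:-/usr/local/hadoop}
path_append "$HADOOP_PREFIX/bin"
export HADOOP_PREFIX PATH

# Functions work in scripts as well as at the prompt.
unalias hfs hls 2>/dev/null || true    # drop aliases of the same name first
hfs() { hadoop fs "$@"; }
hls() { hfs -ls "$@"; }
```

With this in place, hls /user behaves like hadoop fs -ls /user, and sourcing the file repeatedly leaves PATH unchanged.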

Hadoop Configuration
  Configuration files under $HADOOP_PREFIX/conf
  Initial set up in
    core-site.xml
    hdfs-site.xml
    mapred-site.xml
  Example from core-site.xml

    <property>
      <name>hadoop.tmp.dir</name>
      <value>/app/hadoop/tmp</value>
      <description>A base for other temporary directories.</description>
    </property>
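To check which value a site file actually sets, a property can be pulled out with a few lines of awk. This is a rough sketch, not a real XML parser: it assumes each <name> and <value> element sits on its own line, as in the example above, and the function name get_property is hypothetical.

```shell
# Sketch only: get_property is a made-up name, and this is not an XML parser.
# Usage: get_property FILE NAME
# Prints the <value> of the first property whose <name> equals NAME.
get_property() {
    awk -v want="$2" '
        /<name>/  { n = $0; gsub(/.*<name>|<\/name>.*/, "", n) }
        /<value>/ { v = $0; gsub(/.*<value>|<\/value>.*/, "", v)
                    if (n == want) { print v; exit } }
    ' "$1"
}
```

Example call: get_property "$HADOOP_PREFIX/conf/core-site.xml" hadoop.tmp.dir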

