Problem about piglatin - Collection of common programming errors


  • Fraz
    java pig piglatin
    I am new to the Java world, but I am trying to write a user-defined function (UDF) for Pig Latin. The following is the relevant code:
    public class time extends EvalFunc<String> {
      public String exec(Tuple input) throws IOException {
        if ((input == null) || (input.size() == 0))
          return null;
        try {
          String time = (String) input.get(0);
          DateFormat df = new SimpleDateFormat("hh:mm:ss.000");
          Date date = df.parse(time);
          String timeOfDay = getTimeOfDay(date);
          return timeOfDay;
        } catch (IOException e) {
          throw e;
        }
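    One thing that stands out in the excerpt is that `DateFormat.parse` declares the checked `ParseException`, while the `try` block only catches `IOException`. The sketch below shows one way the parsing step could be written so that it fits a method declared `throws IOException`. It is plain Java with no Pig dependency; the class name `TimeOfDaySketch`, the method `classify`, the body of `getTimeOfDay`, and the 24-hour pattern `HH:mm:ss.SSS` are all assumptions made for illustration, since the excerpt shows neither the helper nor the input data.

```java
import java.io.IOException;
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

final class TimeOfDaySketch {

    // Hypothetical stand-in for the getTimeOfDay helper that the excerpt
    // calls but does not show; the three buckets are an assumption.
    static String getTimeOfDay(Date date) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(date);
        int hour = cal.get(Calendar.HOUR_OF_DAY);
        if (hour < 12) {
            return "morning";
        }
        if (hour < 18) {
            return "afternoon";
        }
        return "evening";
    }

    // Mirrors the body of exec(Tuple) from the excerpt, taking the string
    // directly instead of reading it from a Pig Tuple.
    static String classify(String time) throws IOException {
        if (time == null) {
            return null;
        }
        // Assumption: input is a 24-hour time such as "14:30:00.000".
        DateFormat df = new SimpleDateFormat("HH:mm:ss.SSS");
        try {
            Date date = df.parse(time);
            return getTimeOfDay(date);
        } catch (ParseException e) {
            // parse() throws ParseException, not IOException, so it is
            // wrapped here to match the declared throws clause.
            throw new IOException("Cannot parse time: " + time, e);
        }
    }
}
```

    Inside a real UDF the same `try`/`catch` would sit in `exec`, with the string taken from `input.get(0)` as in the excerpt.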

  • Simon Guo
    hadoop amazon-web-services pig piglatin amazon-emr
    I am developing an application that tries to read log files stored in S3 buckets and parse them using Elastic MapReduce. Currently the log file has the following format:
    -------------------------------
    COLOR=Black Date=1349719200 PID=23898 Program=Java EOE
    -------------------------------
    COLOR=White Date=1349719234 PID=23828 Program=Python EOE
    So I tried to load the file into my Pig script, but the built-in Pig loader doesn't seem to be able to load my data, so I have to create my own UDF. Since I am p
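    Whatever form the custom loader takes, its core is splitting the text into records that end at `EOE` and turning each `KEY=VALUE` token into a field. The sketch below shows only that parsing step in plain Java, with no Pig or Hadoop dependency. The class name `EoeRecordParser` is hypothetical, and the code assumes that fields are separated by whitespace and that values contain no spaces, which holds for the sample shown but is not confirmed by the excerpt.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EoeRecordParser {

    // Splits the input into records terminated by the EOE marker.
    // Each record becomes an ordered map of KEY -> VALUE.
    static List<Map<String, String>> parse(String text) {
        List<Map<String, String>> records = new ArrayList<>();
        Map<String, String> current = new LinkedHashMap<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.equals("EOE")) {
                // End of record: keep it only if it has any fields.
                if (!current.isEmpty()) {
                    records.add(current);
                }
                current = new LinkedHashMap<>();
                continue;
            }
            int eq = token.indexOf('=');
            if (eq > 0) {
                current.put(token.substring(0, eq), token.substring(eq + 1));
            }
            // Tokens without '=' (such as the dashed separator) are skipped.
        }
        return records;
    }
}
```

    A loader built on Pig's `LoadFunc` would still need to turn each map into a tuple and deal with records that span input splits; neither is covered here.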

  • user1158351
    hadoop hbase pig hdfs piglatin
    File content:
    one,1
    two,2
    three,3
    File location: hdfs:/hbasetest.txt
    Table in HBase: create 'mydata', 'mycf'
    Pig script:
    A = LOAD '/hbasetest.txt' USING PigStorage(',') as (strdata:chararray, intdata:long);
    STORE A INTO 'hbase://mydata' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('mycf:intdata');
    And I am getting the following error on the console:
    2012-03-13 16:26:22,170 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
    2012-03-13 16:26:22,170 [ma

  • shanks_roux
    java hadoop user-defined-functions pig piglatin
    I had a specific filtering problem (described here: Pig - How to manipulate and compare dates?), so, as I was advised, I decided to write my own filtering UDF. Here is the code:
    import java.io.IOException;
    import org.apache.pig.FilterFunc;
    import org.apache.pig.data.Tuple;
    import org.joda.time.*;
    import org.joda.time.format.*;
    public class DateCloseEnough extends FilterFunc {
      int nbmois;
      /** @param nbMois: if the number of months between two dates is inferior to this variable, then we consider that these
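    The comparison described in the Javadoc comment (two dates count as close when fewer than `nbMois` months separate them) can be sketched independently of Pig. The example below uses `java.time` rather than the Joda-Time classes imported in the excerpt, purely so that it runs without extra libraries. The class name `DateCloseEnoughSketch`, the method `closeEnough`, and the use of `YearMonth` as the date type are assumptions, since the excerpt is cut off before the date format or the `exec` body is shown.

```java
import java.time.YearMonth;
import java.time.temporal.ChronoUnit;

final class DateCloseEnoughSketch {

    // Threshold in months, named after the nbmois field in the excerpt.
    private final int nbMois;

    DateCloseEnoughSketch(int nbMois) {
        this.nbMois = nbMois;
    }

    // Returns true when the two months are strictly fewer than nbMois
    // months apart, regardless of which one comes first.
    boolean closeEnough(YearMonth first, YearMonth second) {
        long months = Math.abs(ChronoUnit.MONTHS.between(first, second));
        return months < nbMois;
    }
}
```

    In a `FilterFunc`, the same check would be the return value of `exec`, after the two dates have been read from the input tuple and parsed.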

  • shanks_roux
    hadoop pig bigdata piglatin
    Let me explain the problem. I have this line of code:
    u = FOREACH persons GENERATE FLATTEN($0#'experiences') as j; dump u;
    which produces this output:
    ([id#1,date_begin#12 2012,description#blabla,date_end#04 2013],[id#2,date_begin#02 2011,description#blabla2,date_end#04 2013])
    ([id#1,date_begin#12 2011,description#blabla3,date_end#04 2012],[id#2,date_begin#02 2010,description#blabla4,date_end#04 2011])
    Then, when I do this:
    p = foreach u generate j#'id', j#'description'; dump p;
    I have this output:
    (1,

  • Matt S.
    pig dump piglatin grunt verbosity
    I'm working with Pig Latin, using grunt, and every time I 'dump' something, my console gets clobbered with blah, blah, blah non-info. Is there a way to suppress all that?
    grunt> A = LOAD 'testingData' USING PigStorage(':'); dump A;
    2013-05-06 19:42:04,146 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
    2013-05-06 19:42:04,147 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimi

  • hadinbe
    windows hadoop pig piglatin
    I am trying to run PigUnit tests on a Windows 7 machine before running the actual Pig script on an Ubuntu cluster, and I am starting to think that my understanding of "withouthadoop" is not correct. Do I need to install Hadoop to run a PigUnit test locally on a Windows 7 machine?
    I installed: Eclipse Juno & Ant, Cygwin
    I set up:
    JAVA_HOME=C:\Program Files\Java\jdk1.6.0_39
    PIG_HOME=C:\Users\john.doe\Java\eclipse\pig
    PIG_CLASSPATH=%PIG_HOME%\bin
    I created using Eclipse's Ant builder jar-all and pigunit-jar:p