{"id":2080,"date":"2022-08-30T15:21:53","date_gmt":"2022-08-30T15:21:53","guid":{"rendered":"https:\/\/unknownerror.org\/index.php\/2013\/12\/25\/problem-about-piglatin-collection-of-common-programming-errors\/"},"modified":"2022-08-30T15:21:53","modified_gmt":"2022-08-30T15:21:53","slug":"problem-about-piglatin-collection-of-common-programming-errors","status":"publish","type":"post","link":"https:\/\/unknownerror.org\/index.php\/2022\/08\/30\/problem-about-piglatin-collection-of-common-programming-errors\/","title":{"rendered":"problem about piglatin-Collection of common programming errors"},"content":{"rendered":"<ul>\n<li>\n<img decoding=\"async\" src=\"http:\/\/www.gravatar.com\/avatar\/fc74acf73e1261de1f71cc5987258771?s=32&amp;d=identicon&amp;r=PG\" \/><br \/>\nFraz<br \/>\njava pig piglatin<br \/>\nI am new into java world.. But basically I am trying to write a user-defined-function in pig-latin.The following is the relevant code.public class time extends EvalFunc&lt;String&gt;{public String exec(Tuple input) throws IOException {if ((input == null) || (input.size() == 0))return null;try{String time = (String) input.get(0) ;DateFormat df = new SimpleDateFormat(&#8220;hh:mm:ss.000&#8221;);Date date = df.parse(time);String timeOfDay = getTimeOfDay(date);return timeOfDay;} catch (IOException e) {throw e;}<\/li>\n<li>\n<img decoding=\"async\" src=\"http:\/\/www.gravatar.com\/avatar\/7441ce9c7d90dd843baf24862b9f1cb4?s=32&amp;d=identicon&amp;r=PG\" \/><br \/>\nSimon Guo<br \/>\nhadoop amazon-web-services pig piglatin amazon-emr<br \/>\nI am developing an application that tries to read log files stored in S3 buckets and parse them using Elastic MapReduce. 
Currently the log file has the following format: &#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;- COLOR=Black Date=1349719200 PID=23898 Program=Java EOE &#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;- COLOR=White Date=1349719234 PID=23828 Program=Python EOE So I tried to load the file into my Pig script, but the built-in Pig loader doesn&#8217;t seem to be able to load my data, so I have to create my own UDF. Since I am p<\/li>\n<li>\n<img decoding=\"async\" src=\"http:\/\/www.gravatar.com\/avatar\/06288dded18d1b525738107e8d1347a6?s=32&amp;d=identicon&amp;r=PG\" \/><br \/>\nuser1158351<br \/>\nhadoop hbase pig hdfs piglatin<br \/>\nFile content: one,1 two,2 three,3 File location: hdfs:\/hbasetest.txt Table in HBase: create &#8216;mydata&#8217;, &#8216;mycf&#8217; Pig script: A = LOAD &#8216;\/hbasetest.txt&#8217; USING PigStorage(&#8216;,&#8217;) as (strdata:chararray, intdata:long); STORE A INTO &#8216;hbase:\/\/mydata&#8217; USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(&#8216;mycf:intdata&#8217;); And I am getting the following error on the console: 2012-03-13 16:26:22,170 [main] INFO org.apache.pig.tools.pigstats.ScriptState &#8211; Pig features used in the script: UNKNOWN 2012-03-13 16:26:22,170 [ma<\/li>\n<li>\n<img decoding=\"async\" src=\"http:\/\/www.gravatar.com\/avatar\/872fcc62ee2d86d1c293c72bbf28b97d?s=32&amp;d=identicon&amp;r=PG\" \/><br \/>\nshanks_roux<br \/>\njava hadoop user-defined-functions pig piglatin<br \/>\nI had a specific filtering problem (described here: Pig &#8211; How to manipulate and compare dates?), so, as I was advised, I decided to write my own filtering UDF. 
Here is the code: import java.io.IOException; import org.apache.pig.FilterFunc; import org.apache.pig.data.Tuple; import org.joda.time.*; import org.joda.time.format.*; public class DateCloseEnough extends FilterFunc { int nbmois; \/** @param nbMois: if the number of months between two dates is less than this variable, then we consider that these<\/li>\n<li>\n<img decoding=\"async\" src=\"http:\/\/www.gravatar.com\/avatar\/872fcc62ee2d86d1c293c72bbf28b97d?s=32&amp;d=identicon&amp;r=PG\" \/><br \/>\nshanks_roux<br \/>\nhadoop pig bigdata piglatin<br \/>\nLet me explain the problem. I have this line of code: u = FOREACH persons GENERATE FLATTEN($0#&#8217;experiences&#8217;) as j; dump u; which produces this output: ([id#1,date_begin#12 2012,description#blabla,date_end#04 2013],[id#2,date_begin#02 2011,description#blabla2,date_end#04 2013]) ([id#1,date_begin#12 2011,description#blabla3,date_end#04 2012],[id#2,date_begin#02 2010,description#blabla4,date_end#04 2011]) Then, when I do this: p = foreach u generate j#&#8217;id&#8217;, j#&#8217;description&#8217;; dump p; I have this output: (1,<\/li>\n<li>\n<img decoding=\"async\" src=\"http:\/\/www.gravatar.com\/avatar\/6afd8e2c0b36e3eabb037ff9dd687bb3?s=32&amp;d=identicon&amp;r=PG\" \/><br \/>\nMatt S.<br \/>\npig dump piglatin grunt verbosity<br \/>\nI&#8217;m working with Pig Latin, using grunt, and every time I &#8216;dump&#8217; something, my console gets clobbered with non-informational log output. Is there a way to suppress all that? 
grunt&gt; A = LOAD &#8216;testingData&#8217; USING PigStorage(&#8216;:&#8217;); dump A; 2013-05-06 19:42:04,146 [main] INFO org.apache.pig.tools.pigstats.ScriptState &#8211; Pig features used in the script: UNKNOWN 2013-05-06 19:42:04,147 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler &#8211; File concatenation threshold: 100 optimi<\/li>\n<li>\n<img decoding=\"async\" src=\"http:\/\/i.stack.imgur.com\/rnpOq.png?s=32&amp;g=1\" \/><br \/>\nhadinbe<br \/>\nwindows hadoop pig piglatin<br \/>\nI am trying to run PigUnit tests on a Windows 7 machine before running the actual Pig script on an Ubuntu cluster, and I am starting to think that my understanding of &#8220;withouthadoop&#8221; is not correct. Do I need to install Hadoop to locally run a PigUnit test on a Windows 7 machine? I installed: Eclipse Juno &amp; Ant, Cygwin. I set up: JAVA_HOME=C:\\Program Files\\Java\\jdk1.6.0_39 PIG_HOME=C:\\Users\\john.doe\\Java\\eclipse\\pig PIG_CLASSPATH=%PIG_HOME%\\bin I created, using Eclipse&#8217;s Ant builder, jar-all and pigunit-jar:p<\/li>\n<\/ul>\n<p id=\"rop\"><small>Originally posted 2013-12-25 10:52:20. <\/small><\/p>","protected":false},"excerpt":{"rendered":"<p>Fraz java pig piglatin I am new into java world.. 
But basically I am trying to write a user-defined-function in pig-latin.The following is the relevant code.public class time extends EvalFunc&lt;String&gt;{public String exec(Tuple input) throws IOException {if ((input == null) || (input.size() == 0))return null;try{String time = (String) input.get(0) ;DateFormat df = new SimpleDateFormat(&#8220;hh:mm:ss.000&#8221;);Date date = [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-2080","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/2080","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/comments?post=2080"}],"version-history":[{"count":0,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/2080\/revisions"}],"wp:attachment":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/media?parent=2080"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/categories?post=2080"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/tags?post=2080"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}