Hive uses octal ASCII codes to define delimiters
Problem description:
Today, while running a job with Azkaban, I ran into the following problem:
14-11-2021 15:50:00 CST analysis INFO - MismatchedTokenException(24!=347)
14-11-2021 15:50:00 CST analysis INFO - at org.antlr.runtime.BaseRecognizer.recoverFromMismatchedToken(BaseRecognizer.java:617)
14-11-2021 15:50:00 CST analysis INFO - at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:115)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.ql.parse.HiveParser.cteStatement(HiveParser.java:36027)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.ql.parse.HiveParser.withClause(HiveParser.java:35886)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:35700)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2284)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1333)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:208)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:77)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:70)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:468)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1317)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1457)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1237)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1227)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:787)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
14-11-2021 15:50:00 CST analysis INFO - at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
14-11-2021 15:50:00 CST analysis INFO - at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
14-11-2021 15:50:00 CST analysis INFO - at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
14-11-2021 15:50:00 CST analysis INFO - at java.lang.reflect.Method.invoke(Method.java:498)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.util.RunJar.run(RunJar.java:244)
14-11-2021 15:50:00 CST analysis INFO - at org.apache.hadoop.util.RunJar.main(RunJar.java:158)
14-11-2021 15:50:00 CST analysis INFO - FAILED: ParseException line 1:75 mismatched input 'Nov' expecting ) near 'Sun' in statement
14-11-2021 15:50:01 CST analysis INFO - Process completed unsuccessfully in 52 seconds.
14-11-2021 15:50:01 CST analysis ERROR - Job run failed!
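One of the jobs processes data and inserts the result into a table created in advance, but the insert failed.
The table was created with this statement: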
create table user_info(active_num string,`date` string)
row format delimited fields terminated by 't' ;
The cause is the delimiter: in Hive, a delimiter is specified by its octal ASCII code.
The codes are listed in the table below:
Octal | Hex | Decimal | Character | Octal | Hex | Decimal | Character |
---|---|---|---|---|---|---|---|
00 | 00 | 0 | nul | 100 | 40 | 64 | @ |
01 | 01 | 1 | soh | 101 | 41 | 65 | A |
02 | 02 | 2 | stx | 102 | 42 | 66 | B |
03 | 03 | 3 | etx | 103 | 43 | 67 | C |
04 | 04 | 4 | eot | 104 | 44 | 68 | D |
05 | 05 | 5 | enq | 105 | 45 | 69 | E |
06 | 06 | 6 | ack | 106 | 46 | 70 | F |
07 | 07 | 7 | bel | 107 | 47 | 71 | G |
10 | 08 | 8 | bs | 110 | 48 | 72 | H |
11 | 09 | 9 | ht | 111 | 49 | 73 | I |
12 | 0a | 10 | nl | 112 | 4a | 74 | J |
13 | 0b | 11 | vt | 113 | 4b | 75 | K |
14 | 0c | 12 | ff | 114 | 4c | 76 | L |
15 | 0d | 13 | cr | 115 | 4d | 77 | M |
16 | 0e | 14 | so | 116 | 4e | 78 | N |
17 | 0f | 15 | si | 117 | 4f | 79 | O |
20 | 10 | 16 | dle | 120 | 50 | 80 | P |
21 | 11 | 17 | dc1 | 121 | 51 | 81 | Q |
22 | 12 | 18 | dc2 | 122 | 52 | 82 | R |
23 | 13 | 19 | dc3 | 123 | 53 | 83 | S |
24 | 14 | 20 | dc4 | 124 | 54 | 84 | T |
25 | 15 | 21 | nak | 125 | 55 | 85 | U |
26 | 16 | 22 | syn | 126 | 56 | 86 | V |
27 | 17 | 23 | etb | 127 | 57 | 87 | W |
30 | 18 | 24 | can | 130 | 58 | 88 | X |
31 | 19 | 25 | em | 131 | 59 | 89 | Y |
32 | 1a | 26 | sub | 132 | 5a | 90 | Z |
33 | 1b | 27 | esc | 133 | 5b | 91 | [ |
34 | 1c | 28 | fs | 134 | 5c | 92 | \ |
35 | 1d | 29 | gs | 135 | 5d | 93 | ] |
36 | 1e | 30 | rs | 136 | 5e | 94 | ^ |
37 | 1f | 31 | us | 137 | 5f | 95 | _ |
40 | 20 | 32 | sp | 140 | 60 | 96 | ` |
41 | 21 | 33 | ! | 141 | 61 | 97 | a |
42 | 22 | 34 | " | 142 | 62 | 98 | b |
43 | 23 | 35 | # | 143 | 63 | 99 | c |
44 | 24 | 36 | $ | 144 | 64 | 100 | d |
45 | 25 | 37 | % | 145 | 65 | 101 | e |
46 | 26 | 38 | & | 146 | 66 | 102 | f |
47 | 27 | 39 | ' | 147 | 67 | 103 | g |
50 | 28 | 40 | ( | 150 | 68 | 104 | h |
51 | 29 | 41 | ) | 151 | 69 | 105 | i |
52 | 2a | 42 | * | 152 | 6a | 106 | j |
53 | 2b | 43 | + | 153 | 6b | 107 | k |
54 | 2c | 44 | , | 154 | 6c | 108 | l |
55 | 2d | 45 | - | 155 | 6d | 109 | m |
56 | 2e | 46 | . | 156 | 6e | 110 | n |
57 | 2f | 47 | / | 157 | 6f | 111 | o |
60 | 30 | 48 | 0 | 160 | 70 | 112 | p |
61 | 31 | 49 | 1 | 161 | 71 | 113 | q |
62 | 32 | 50 | 2 | 162 | 72 | 114 | r |
63 | 33 | 51 | 3 | 163 | 73 | 115 | s |
64 | 34 | 52 | 4 | 164 | 74 | 116 | t |
65 | 35 | 53 | 5 | 165 | 75 | 117 | u |
66 | 36 | 54 | 6 | 166 | 76 | 118 | v |
67 | 37 | 55 | 7 | 167 | 77 | 119 | w |
70 | 38 | 56 | 8 | 170 | 78 | 120 | x |
71 | 39 | 57 | 9 | 171 | 79 | 121 | y |
72 | 3a | 58 | : | 172 | 7a | 122 | z |
73 | 3b | 59 | ; | 173 | 7b | 123 | { |
74 | 3c | 60 | < | 174 | 7c | 124 | \| |
75 | 3d | 61 | = | 175 | 7d | 125 | } |
76 | 3e | 62 | > | 176 | 7e | 126 | ~ |
77 | 3f | 63 | ? | 177 | 7f | 127 | del |
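If the character you need is not in the table, you can convert its code to octal with a base converter:
https://www.bejson.com/convert/jinzhi/
For example, looking up the horizontal tab (ht) in the table gives the octal code 11.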
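The same lookup can also be done locally instead of with an online converter. The following is a minimal Python sketch, not part of the original job. The helper names `ascii_codes` and `hive_octal_escape` are hypothetical, and the escape form (a backslash followed by three octal digits) is an assumption about how a Hive string literal expects an octal code to be written.

```python
def ascii_codes(ch):
    """Return the octal, hex, and decimal codes of one ASCII character."""
    code = ord(ch)  # raises TypeError unless ch is a single character
    if code > 0x7F:
        raise ValueError("not an ASCII character: %r" % ch)
    return {
        "octal": format(code, "o"),
        "hex": format(code, "02x"),
        "decimal": code,
    }


def hive_octal_escape(ch):
    """Build a backslash + three-octal-digit escape for one ASCII character.

    Assumption: Hive reads this form inside a quoted string as one character.
    """
    return "\\" + format(ascii_codes(ch)["decimal"], "03o")


if __name__ == "__main__":
    print(ascii_codes("\t"))        # codes of the horizontal tab
    print(hive_octal_escape("\t"))  # escape text to place between the quotes
```

For the horizontal tab this gives octal 11, hex 09, and decimal 9, which matches the `ht` row of the table.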
Recreate the table:
create table user_info(active_num string,`date` string) row format delimited fields terminated by '\011';
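To see what this delimiter means for the data, here is a minimal Python sketch. It assumes a row of `user_info` is stored as one text line whose two fields are separated by the character with octal code 11. The sample values and the helper names `encode_row` and `decode_row` are made up for illustration, and this is not how Hive itself reads the file.

```python
# Octal 11 is the horizontal tab, the same character Python writes as "\t".
FIELD_DELIMITER = chr(0o11)


def encode_row(active_num, date):
    """Join the two column values into one delimited text line."""
    return FIELD_DELIMITER.join([active_num, date])


def decode_row(line):
    """Split one text line back into its column values."""
    return line.rstrip("\n").split(FIELD_DELIMITER)


if __name__ == "__main__":
    # Hypothetical sample values, not taken from the original job.
    row = encode_row("1024", "2021-11-14")
    print(repr(row))
    print(decode_row(row))
```

A line only splits into the two expected columns if the file uses the same delimiter character as the one declared in the table definition.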
A little progress every day.